
CN112581203A - Providing explanatory product recommendations in a session - Google Patents


Info

Publication number
CN112581203A
CN112581203A
Authority
CN
China
Prior art keywords
product
question
candidate
recommendation
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910941459.4A
Other languages
Chinese (zh)
Inventor
吴先超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC filed Critical Microsoft Technology Licensing LLC
Priority to CN201910941459.4A priority Critical patent/CN112581203A/en
Priority to PCT/US2020/038298 priority patent/WO2021066903A1/en
Publication of CN112581203A publication Critical patent/CN112581203A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides methods and apparatus for providing explanatory product recommendations in a session. At least one question associated with the product recommendation may be provided. An answer to the at least one question may be received. Whether at least one recommended product exists may be determined based at least on the at least one question and the answer. A reason for the recommendation of the at least one recommended product may be generated in response to determining that the at least one recommended product exists. A response may be provided that includes product information for the at least one recommended product and the reason for the recommendation.

Description

Providing explanatory product recommendations in a session
Background
Artificial Intelligence (AI) chat robots are becoming more popular and are finding application in more and more scenarios. A chat robot is designed to simulate human conversation and can chat with users through text, voice, images, and the like. In general, the chat robot can identify the content of a message input by a user or apply natural language processing to the message, and in turn provide a response to the message to the user.
Disclosure of Invention
This summary is provided to introduce a selection of concepts that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments of the present disclosure propose methods and apparatus for providing explanatory product recommendations in a session. At least one question associated with the product recommendation may be provided. An answer to the at least one question may be received. Whether at least one recommended product exists may be determined based at least on the at least one question and the answer. A reason for the recommendation of the at least one recommended product may be generated in response to determining that the at least one recommended product exists. A response may be provided that includes product information for the at least one recommended product and the reason for the recommendation.
It should be noted that one or more of the above aspects include features that are specifically pointed out in the following detailed description and claims. The following description and the annexed drawings set forth in detail certain illustrative features of the one or more aspects. These features are indicative of but a few of the various ways in which the principles of various aspects may be employed and the present disclosure is intended to include all such aspects and their equivalents.
Drawings
The disclosed aspects will hereinafter be described in conjunction with the appended drawings, which are provided to illustrate, but not to limit, the disclosed aspects.
Figure 1 illustrates an exemplary network architecture for deploying a chat robot, according to an embodiment.
Fig. 2 illustrates an exemplary chat robot system, in accordance with embodiments.
Fig. 3 illustrates a mapping between a set of candidate products and a set of candidate questions, according to an embodiment.
FIG. 4 shows an exemplary overall process for providing explanatory product recommendations, according to an embodiment.
FIG. 5 illustrates an exemplary specific process for providing explanatory product recommendations, according to an embodiment.
FIG. 6 illustrates an exemplary process for training a recommendation reason generation model, according to an embodiment.
FIG. 7 illustrates an exemplary process for generating a reason for recommendation, according to an embodiment.
Fig. 8-12 illustrate exemplary chat windows according to embodiments.
FIG. 13 shows a flowchart of an exemplary method for providing an explanatory product recommendation in a session, according to an embodiment.
FIG. 14 illustrates an exemplary apparatus for providing an explanatory product recommendation in a session, according to an embodiment.
FIG. 15 shows an exemplary apparatus for providing an explanatory product recommendation in a session, according to an embodiment.
Detailed Description
The present disclosure will now be discussed with reference to various exemplary embodiments. It is to be understood that the discussion of these embodiments is merely intended to enable those skilled in the art to better understand and thereby practice the embodiments of the present disclosure, and does not suggest any limitation on the scope of the present disclosure.
In general, a chat robot can conduct an automatic chat in a conversation with a user. As used herein, a "conversation" may refer to a time-continuous conversation between two chat participants, and may include messages and responses in the conversation. The "message" may refer to any information entered by the user, such as a query from the user, a user's answer to a question of the chat robot, a user's opinion, and the like. The term "message" and the term "query" may also be used interchangeably. A "response" may refer to any information provided by the chat robot, such as answers to the user's questions by the chat robot, comments by the chat robot, questions posed by the chat robot, and so forth.
In some application scenarios, the chat robot may provide product recommendations to the user in a conversation with the user. Herein, a product may include goods, services, and the like. However, providing product recommendations through a chat robot faces a number of challenges. In one aspect, a large-scale annotated corpus needs to be prepared for training a machine learning model used to capture a user's intent or need expressed in natural language. The user's intent or need indicates the user's preference regarding the product attributes of the recommended product. Product attributes may include various parameters, configurations, characteristics, etc. of the product. In another aspect, there is an information asymmetry between the products that the chat robot can provide and the needs of the user. For example, the user does not know which products the chat robot can offer or how to find the desired products. The user needs to include keywords describing product attributes in messages sent to the chat robot; however, the chat robot may not be able to use these keywords to efficiently find recommended products, or products corresponding to these keywords may not exist at all. In yet another aspect, the chat robot may take a long time to collect user needs step by step, as users may chat with the chat robot in an inefficient and uninformative manner.
Embodiments of the present disclosure provide explanatory product recommendations in an efficient and accurate manner during a chat robot's conversation with a user. The chat robot may provide explanatory product recommendations based on a learning-to-explain (LTE) architecture proposed by embodiments of the present disclosure. In multiple rounds of a session with the user, the LTE architecture may dynamically provide a series of questions associated with product recommendations, collect the user's answers to the questions, and learn new knowledge from at least those answers in order to determine which product to recommend and to give a recommendation reason explaining why the product is recommended. The LTE architecture can narrow a large number of candidate products down to a recommended product through a relatively short session.
The questions provided to the user may be directed to various product attributes. Optionally, the question may be attached with options indicating different product attributes, so that the user can directly select the desired option in the answer. Alternatively, if the user answers a question or sends a message using a natural language sentence, the natural language sentence may be parsed to identify the product attributes desired by the user.
The LTE architecture may perform product ranking on multiple candidate products each time an answer to the current question is received from the user. Candidate products that include the user-selected attribute will be ranked higher. Through the product ranking, each candidate product will have a corresponding expected probability, which indicates the likelihood that the candidate product is the one the user desires after that round of conversation. The expected probability of a candidate product may be calculated for each round of conversation. Thus, as the session progresses, the expected probability of each candidate product is continually updated. Optionally, a weight may also be calculated for each candidate product in the product ranking, which may be mapped to an expected probability. Since there is a particular mapping between the weight of a candidate product and its expected probability, the two may be used interchangeably herein, or collectively referred to as a probability weight.
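The per-round update can be sketched concretely. The disclosure does not fix a particular weighting formula, so the Python sketch below uses an assumed multiplicative boost and hypothetical product names purely for illustration: candidates carrying the user-selected attribute are boosted, then weights are normalized into expected probabilities.

```python
def update_weights(weights, products, selected_attr, boost=4.0):
    """One round of product re-ranking: boost candidates that carry the
    user-selected attribute, then normalize weights into expected
    probabilities. The multiplicative boost is an assumption for
    illustration; the disclosure does not specify a formula."""
    new_weights = {name: w * (boost if selected_attr in products[name] else 1.0)
                   for name, w in weights.items()}
    total = sum(new_weights.values())
    return {name: w / total for name, w in new_weights.items()}

# Hypothetical candidates and their attribute sets
products = {
    "earphone_X": {"in-ear", "wireless"},
    "earphone_Y": {"over-the-ear", "wireless"},
}
weights = {name: 1.0 for name in products}
probs = update_weights(weights, products, "in-ear")
# earphone_X now carries most of the probability mass
```

Repeating this update after every answer is what makes the expected probabilities converge toward the products the user actually wants.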
After product-ranking the candidate products, if no candidate product satisfies a predetermined condition, the LTE architecture may further perform question ranking on a plurality of candidate questions. The predetermined condition may include, for example, the expected probability of a candidate product being above a threshold, etc. These candidate questions may be predetermined directly or indirectly based on the attributes of the plurality of candidate products. For example, the plurality of candidate questions may be ranked by considering previously provided questions, the user's answers, the product ranking of the candidate products, and the like. The LTE architecture may perform question ranking through entropy-based ranking and/or policy-based reinforcement learning ranking. Through question ranking, each candidate question will have a corresponding information gain. The candidate question with, for example, the largest information gain may be selected from the ranked candidate questions as the next question to provide to the user. Here, the maximum information gain indicates that the candidate question has the highest degree of discrimination, so that the range of candidate products to be considered subsequently can be narrowed down as quickly as possible. Optionally, a weight may also be calculated for each candidate question in the question ranking, which may be mapped to an information gain. Since there is a specific mapping relationship between the weight and the information gain of a candidate question, the two may be used interchangeably herein.
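The entropy-based ranking can be illustrated with a small information-gain computation. The sketch below implements the standard information-gain formula over the current expected probabilities and each product's reference answer to a question; the products, questions, and uniform prior are hypothetical, and this is a simplified stand-in for the disclosure's ranking procedure rather than its exact method.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(product_probs, reference_answers):
    """Expected reduction in entropy over candidate products from asking
    one candidate question.

    product_probs: {product: current expected probability}
    reference_answers: {product: that product's reference answer to this question}
    """
    h_before = entropy(product_probs.values())
    # Probability mass that each possible answer would receive
    answer_mass = {}
    for product, p in product_probs.items():
        ans = reference_answers[product]
        answer_mass[ans] = answer_mass.get(ans, 0.0) + p
    # Expected entropy remaining after observing the user's answer
    h_after = 0.0
    for ans, mass in answer_mass.items():
        conditional = [p / mass for product, p in product_probs.items()
                       if reference_answers[product] == ans]
        h_after += mass * entropy(conditional)
    return h_before - h_after

uniform = {"p1": 0.25, "p2": 0.25, "p3": 0.25, "p4": 0.25}
q_even = {"p1": "yes", "p2": "yes", "p3": "no", "p4": "no"}   # splits 2/2
q_skew = {"p1": "yes", "p2": "yes", "p3": "yes", "p4": "no"}  # splits 3/1
gain_even = information_gain(uniform, q_even)
gain_skew = information_gain(uniform, q_skew)
# The evenly-splitting question discriminates more and would rank higher
```

A question that splits the remaining candidates evenly has the highest gain, which is why selecting by maximum information gain narrows the candidate range fastest.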
If at least one candidate product satisfies the predetermined condition after the candidate products are product-ranked, the LTE architecture may determine that candidate product as a recommended product and provide product information about the recommended product to the user. The product information may include any information about the product, such as name, number, price, etc. Further, the LTE architecture may determine a recommendation reason for the recommended product. In one aspect, the LTE architecture may select, from among the questions previously provided to the user, at least one question and the corresponding user answer that contributed most, in order to generate the recommendation reason. For example, the question that produced the greatest increase in the recommended product's expected probability may be selected from the historical questions provided to the user, and the recommendation reason may be generated with reference to the selected question and the user's answer. In another aspect, the LTE architecture may also generate the recommendation reason through a pre-trained recommendation reason generation model.
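The first, selection-based approach can be sketched as follows. The conversation history below is invented for illustration, and the template sentence is merely one hedged way to phrase the reason; the disclosure also describes a trained generation model as an alternative.

```python
def most_contributing_question(history):
    """Pick the history entry whose answer lifted the recommended
    product's expected probability the most, and phrase a simple
    template-based recommendation reason from it.

    history: list of (question, answer, prob_before, prob_after) tuples
    tracking the recommended product's expected probability per round."""
    question, answer, _, _ = max(history, key=lambda h: h[3] - h[2])
    return f"Recommended because you answered '{answer}' to '{question}'."

# Hypothetical per-round history for the eventually recommended product
history = [
    ("Who is the gift for?", "my mother", 0.10, 0.15),
    ("In-ear or over-the-ear?", "in-ear", 0.15, 0.60),
]
reason = most_contributing_question(history)
```

The second entry produced the larger probability jump (0.45 vs. 0.05), so it is the one cited in the generated reason.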
In one aspect, embodiments of the present disclosure may use a list of the candidate products' product attributes to generate questions in advance, and may identify user-selected attributes directly from the user's answers or by comparing the user's answers with the list of product attributes. Thus, embodiments of the present disclosure do not require a large-scale annotated corpus for capturing user intent or needs. In another aspect, since embodiments of the present disclosure can provide the user with questions to which options are attached, and the user can answer a question by simply selecting an option, the chat robot can effectively guide the conversation, avoiding problems caused by information asymmetry between available products and user needs and improving product recommendation efficiency. In yet another aspect, the LTE architecture according to embodiments of the present disclosure can learn new knowledge in a session in real time and guide the next round of the session accordingly, so that user intent or needs can be accurately understood and user engagement, interest, etc. can be effectively improved. In yet another aspect, embodiments of the present disclosure may provide the user with a recommendation reason for a recommended product, which may enhance the user's attention to and confidence in the recommended product and thereby increase the likelihood that the user orders the recommended product.
The explanatory product recommendations according to embodiments of the disclosure can be applied in a variety of scenarios. In some scenarios, embodiments of the present disclosure may be used for merchandise recommendations, e.g., gift recommendations, and the like. In some scenarios, embodiments of the present disclosure may be used for service recommendations, such as task-oriented hotel, restaurant reservation recommendations, and the like.
Fig. 1 illustrates an exemplary network architecture 100 for deploying a chat robot, according to an embodiment.
In fig. 1, a network 110 is applied to interconnect between a terminal device 120 and a chat robot server 130.
Network 110 may be any type of network capable of interconnecting network entities. The network 110 may be a single network or a combination of networks. In terms of coverage, the network 110 may be a Local Area Network (LAN), a Wide Area Network (WAN), or the like. In terms of a carrier medium, the network 110 may be a wired network, a wireless network, or the like. In terms of data switching technology, the network 110 may be a circuit switched network, a packet switched network, or the like.
Terminal device 120 may be any type of electronic computing device capable of connecting to network 110, accessing a server or website on network 110, processing data or signals, and so forth. For example, the terminal device 120 may be a desktop computer, a laptop computer, a tablet computer, a smart phone, an AI terminal, and the like. Although only one terminal device is shown in fig. 1, it should be understood that a different number of terminal devices may be connected to network 110.
In one embodiment, terminal device 120 may be used by a user. The terminal device 120 can include a chat bot client 122 that can provide automated chat services for users. In some cases, chat robot client 122 may interact with chat robot server 130. For example, the chat robot client 122 may transmit a message input by the user to the chat robot server 130 and receive a response associated with the message from the chat robot server 130. However, it should be understood that in other cases, instead of interacting with the chat robot server 130, the chat robot client 122 may also generate responses to user-entered messages locally.
The chat robot server 130 may be connected to or contain a chat robot database 140. Chat robot database 140 may include information that may be used by chat robot server 130 to generate responses. In one embodiment, the chat robot server 130 may also be connected to a product database 150. The product database 150 may include various product information about a plurality of candidate products, such as product names, product attributes, and the like. The product information may be, for example, pre-supplied by a product provider or captured from a network. Although product database 150 is shown as being separate from chat robot database 140, product database 150 may also be included in chat robot database 140. When one or more candidate products are determined to be recommended products, product information associated with the recommended products may be provided to the user.
It should be understood that all of the network entities shown in fig. 1 are exemplary, and that any other network entities may be involved in the network architecture 100 depending on the particular application requirements.
Fig. 2 illustrates an exemplary chat bot system 200 in accordance with embodiments.
The chat bot system 200 can include a User Interface (UI) 210 for presenting chat windows. The chat window can be used by the chat robot to interact with the user.
The chat robot system 200 may include a core processing module 220. The core processing module 220 is configured to provide processing capabilities during operation of the chat robot through cooperation with other modules of the chat robot system 200.
The core processing module 220 may obtain messages entered by the user in the chat window, which are stored in the message queue 232. The message may take various multimedia forms such as text, voice, image, video, etc.
The core processing module 220 may process the messages in the message queue 232 in a first-in-first-out manner. The core processing module 220 may call processing units in an Application Program Interface (API) module 240 to process various forms of messages. The API module 240 may include a text processing unit 242, a voice processing unit 244, an image processing unit 246, and the like.
For text messages, the text processing unit 242 may perform text understanding on the text message and the core processing module 220 may further determine a text response.
For voice messages, the voice processing unit 244 may perform voice-to-text conversion on the voice message to obtain a text statement, the text processing unit 242 may perform text understanding on the obtained text statement, and the core processing module 220 may further determine a text response. If it is determined that the response is provided in speech, the speech processing unit 244 may perform text-to-speech conversion on the text response to generate a corresponding speech response.
For image messages, the image processing unit 246 may perform image recognition on the image message to generate corresponding text, and the core processing module 220 may further determine a text response. In some cases, the image processing unit 246 may also be used to obtain an image response based on the text response.
Furthermore, although not shown in fig. 2, API module 240 may also include any other processing unit. For example, the API module 240 may include a video processing unit for cooperating with the core processing module 220 to process video messages and determine responses.
The core processing module 220 may determine the response through the database 250. The database 250 may include various information accessible by the core processing module 220 for determining a response.
Database 250 may include a pure chat index set 251. Pure chat index set 251 may include index items that are prepared for free chat between the chat bot and the user, and may be established using data from, for example, a social network.
Database 250 may include at least one set of candidate products 252. Candidate product set 252 may include a list of candidate products, attributes for each candidate product, and the like. In one embodiment, a set of candidate products may be established for each product category, the set including a plurality of candidate products belonging to the category. Product categories may be set based on different levels or criteria, and candidate products are classified into corresponding one or more categories. For example, the "gift" category may include candidate products such as chocolate, jewelry, flowers, and artwork; the "electronic goods" category may include candidate products such as cell phones, computers, televisions, and headphones; and so on. In addition, the candidate products in the candidate product set 252 may also include candidate products of a specific brand and model, such as cell phone Z of brand XX and model YY, hotel K, and so on. Taking the candidate product "cell phone Z" as an example, its attributes may include, for example, a 4G+5G mobile network, a 6-inch screen, a fingerprint recognition function, and so on. The various information included in the set of candidate products 252 may be, for example, pre-supplied by a product provider or crawled from a network.
The database 250 may include a set of candidate questions 253. The set of candidate questions 253 may include a list of candidate questions determined based at least on attributes of candidate products in the set of candidate products 252. Each candidate question may be directed to one or more product attributes, such that when a user provides an answer to the candidate question, the user's desired attributes may be determined based on the answer and identified which candidate products have the desired attributes. In one embodiment, each candidate question may also be appended with one or more options corresponding to the product attributes for which the candidate question is directed.
In one embodiment, the set of candidate products 252 and the set of candidate questions 253 may be linked together by product attributes. Fig. 3 shows a mapping 300 between a set of candidate products and a set of candidate questions, according to an embodiment. The candidate product set P_i includes a plurality of candidate products {p^i_1, p^i_2, ..., p^i_M}, where i denotes the category of these candidate products and M is the number of candidate products. The candidate product set P_i also includes a plurality of attributes for each candidate product. For example, the attributes of candidate product p^i_m are {a^i_{m,1}, a^i_{m,2}, ..., a^i_{m,N}}, where N is the number of attributes. The candidate question set Q_i includes a plurality of candidate questions {q^i_1, q^i_2, ..., q^i_N}, where i denotes the product category that the candidate question set Q_i is directed to and N is the number of candidate questions. The candidate question set Q_i may also include, for each candidate question, reference answers for the different candidate products. For example, for candidate question q^i_n, the reference answer of candidate product p^i_1 is d^i_{1,n}, the reference answer of candidate product p^i_2 is d^i_{2,n}, and so on. Thus, the matrix formed by the attributes of the candidate products in Fig. 3 may also be referred to as the reference answer matrix D_i. The reference answer matrix D_i links the candidate product set P_i and the candidate question set Q_i together. Each element of D_i can be represented as d^i_{m,n}, where m is the candidate product index and n is the candidate question index. It should be appreciated that although the candidate question set is shown in Fig. 3 as having N candidate questions, the candidate question set may have any number of candidate questions. Although each candidate product is shown in Fig. 3 as having N attributes, different candidate products may have different numbers of attributes.
It should be appreciated that embodiments of the present disclosure are not limited to any particular manner of constructing candidate questions based on the attributes of the candidate products. As an example, assume that candidate product 1 is headphone X, which includes the attribute "in-ear", and candidate product 2 is headphone Y, which includes the attribute "over-the-ear". A candidate question "Do you like in-ear headphones or over-the-ear headphones?" may then be constructed. For this candidate question, the reference answer for candidate product 1 is "in-ear" and the reference answer for candidate product 2 is "over-the-ear". If the user's answer is "in-ear", then candidate product 1 may be determined to have the attribute selected by the user, and accordingly, candidate product 1 may be ranked higher than candidate product 2.
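The headphone example above can be written down concretely. The following sketch, using the hypothetical products from the example, shows one way to represent the reference answer matrix D (rows indexed by product, columns by question) and to filter candidates whose reference answer matches the user's answer.

```python
# Hypothetical candidate sets from the headphone example
candidate_products = ["headphone_X", "headphone_Y"]
candidate_questions = ["Do you like in-ear headphones or over-the-ear headphones?"]

# Reference answer matrix D: D[m][n] is candidate product m's
# reference answer to candidate question n.
D = [
    ["in-ear"],        # headphone_X
    ["over-the-ear"],  # headphone_Y
]

def products_matching(answer, question_idx):
    """Candidates whose reference answer to the given question matches
    the user's answer, i.e. those that carry the selected attribute."""
    return [p for m, p in enumerate(candidate_products)
            if D[m][question_idx] == answer]

matching = products_matching("in-ear", 0)
```

With the user answer "in-ear", only headphone_X matches, so it would be ranked above headphone_Y in the next product ranking.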
Database 250 may include session records 254. The session record 254 can include historical questions associated with product recommendations that the chat robot provided and corresponding historical answers from the user in a conversation with the user.
The database 250 may include a candidate product evaluation state 255. The candidate product evaluation state 255 may include the expected probability or weight evaluated for each candidate product after each round of conversation.
The chat bot system 200 can include a set of modules 260, the set of modules 260 being a set of functional modules that can be operated by the core processing module 220 to generate or obtain a response.
The set of modules 260 may include a product ranking module 261. Each time an answer to the current question is received from the user, the product ranking module 261 may recalculate the expected probability or weight of each candidate product and rank the candidate products accordingly. The calculated expected probabilities or weights may be used to update the candidate product evaluation state. When one or more candidate products satisfy a predetermined condition, the one or more candidate products may be determined as recommended products to be provided to the user.
The set of modules 260 may include a question ranking module 262. Before providing the next question, the question ranking module 262 may calculate a weight for each candidate question and rank the candidate questions accordingly. The question ranking may be based at least on the results of the product ranking. For example, the question ranking may take into account the current expected probability or weight of each candidate product included in the current candidate product evaluation state. In addition, the question ranking may also consider the session record and the like. The highest-ranked candidate question may be selected, or one candidate question may be randomly selected from among a plurality of highest-ranked candidate questions, as the next question to be presented to the user.
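The final selection step described above (take the top-ranked question, or pick randomly among ties) can be sketched in a few lines. The question names and weights below are hypothetical placeholders for the output of a ranking pass.

```python
import random

def pick_next_question(question_weights, rng=None):
    """Select the next question to present: the highest-weighted
    candidate, breaking ties by a random choice among the top-ranked."""
    rng = rng or random.Random()
    best = max(question_weights.values())
    top = [q for q, w in question_weights.items() if w == best]
    return rng.choice(top)

# Hypothetical weights produced by a question-ranking pass
ranked = {"q_brand": 0.2, "q_type": 0.9, "q_color": 0.9}
next_question = pick_next_question(ranked)  # q_type or q_color
```

Random tie-breaking keeps repeated sessions from always following the identical question sequence when several questions are equally informative.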
It should be appreciated that although the product ranking module 261 and the question ranking module 262 are shown as separate modules, the two modules may be combined so that both product ranking and question ranking can be achieved by performing a unified process.
The set of modules 260 may include a recommendation reason generation module 263. The recommendation reason generation module 263 may generate a recommendation reason for the determined recommended product in various ways. In one approach, the recommendation reason generation module 263 may select, from the previously provided questions, the question that produced the greatest increase in the recommended product's expected probability, and generate the recommendation reason with reference to the selected question and the user's answer. In another approach, the recommendation reason generation module 263 may generate the recommendation reason through a pre-trained recommendation reason generation model.
The set of modules 260 may include a statement parsing module 264. When a user answers a question or sends a message using a natural language sentence, the sentence parsing module 264 may parse the natural language sentence to identify the product attributes desired by the user.
The set of modules 260 may include a response providing module 265. The response providing module 265 may be configured to provide or communicate a response to the user's message. In some embodiments, the response provided by the response providing module 265 may include product information, a reason for recommendation, etc. for the determined recommended product.
The core processing module 220 may provide the determined response to a response queue or response cache 234. For example, the response cache 234 may ensure that a sequence of responses can be displayed with proper timing. If two or more responses are determined by the core processing module 220 for one message, a time-delay setting for the responses may be necessary. For example, if the user enters the message "Did you eat breakfast?", two responses may be determined, e.g., a first response "Yes, I ate bread" and a second response "How about you? Are you still hungry?". In this case, the chat robot can ensure, through the response cache 234, that the first response is provided to the user immediately. Further, the chat robot may ensure that the second response is provided with a time delay of, for example, 1 or 2 seconds, so that the second response will be provided to the user 1 or 2 seconds after the first response. Thus, the response cache 234 may manage the responses to be sent and the appropriate timing for each response.
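The delayed delivery of multi-part responses can be sketched minimally. The function names, the delay value, and the injectable `sleep`/`send` callbacks below are illustrative assumptions, not the patent's implementation.

```python
import time

def send_responses(responses, send, delay_seconds=1.0, sleep=time.sleep):
    """Deliver a multi-part response: the first part immediately, each
    subsequent part after a short delay, preserving order."""
    for i, response in enumerate(responses):
        if i > 0:
            sleep(delay_seconds)
        send(response)

sent = []
send_responses(["Yes, I ate bread", "How about you? Are you still hungry?"],
               sent.append, delay_seconds=1.0, sleep=lambda s: None)  # no real wait here
```

Injecting the sleep function keeps the pacing logic testable while a production cache would use real timers.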
The responses in the response queue or response cache 234 may be further communicated to the UI 210 so that the responses may be displayed to the user in a chat window.
It should be understood that all of the elements shown in chat bot system 200 in fig. 2 are exemplary, and that any of the elements shown may be omitted and any other elements may be involved in chat bot system 200 depending on the particular application requirements.
FIG. 4 shows an exemplary overall process 400 for providing explanatory product recommendations, according to an embodiment.
At 410, a message from a user may be received. The message may indicate the user's intent to obtain a recommendation for a product. For example, the message may be "recommend a gift for me".
At 420, a product category to which a product to be recommended belongs may be determined based at least on the message received at 410. For example, when the received message is "recommend a gift for me", the product category may be determined to be "gift". For example, when the received message is "recommend me electronic goods", the product category may be determined to be "electronic goods".
At 430, multiple rounds of conversation with the user may be conducted based on the LTE architecture proposed by embodiments of the present disclosure. In the multi-turn conversation, questions associated with product recommendations under the determined product category may be dynamically provided, the user's answers to the questions collected, and a recommended product and a recommendation reason determined. This will be discussed in more detail below in conjunction with FIG. 5.
At 450, the product information of the recommended product and the recommendation reason may be provided to the user in a response.
FIG. 5 illustrates an exemplary specific process 500 for providing explanatory product recommendations, according to an embodiment. Process 500 illustrates an exemplary operating procedure of the LTE architecture according to an embodiment of the disclosure. Prior to performing process 500, the user's intent to obtain a recommendation for a product has been determined, and the product category has been determined. Accordingly, process 500 will be performed for providing product recommendations under that product category. It should be appreciated that the process 500 may be performed iteratively until a recommended product is determined.
At 502, a question associated with the product recommendation may be provided to the user. The question may be specific to a product attribute. Optionally, a plurality of options may be provided to the user along with the question so that the user may select an answer from the options.
At 504, an answer to the provided question may be received from the user.
Where the question is accompanied by options, the user's answer may be a direct selection of one or more of the attached options. Accordingly, the selections made by the user among these options may be identified at 506 in order to determine the product attributes selected by the user. For example, if the question is attached with three options and the user answers with the index "2" of the second option, or with an expression associated with the content of the second option, it may be determined that the product attribute indicated by the second option is desired by the user.
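The option-matching step can be sketched as below; matching free text by substring is a simplification of the expression-association logic described above, and the option lists are illustrative.

```python
def resolve_answer(answer, options):
    """Map a user's answer to 1-based option indices: either a bare index
    like "2", or free text containing an option's wording."""
    answer = answer.strip().lower()
    if answer.isdigit() and 1 <= int(answer) <= len(options):
        return [int(answer)]
    # Fall back to matching option wording inside the free-text answer.
    return [i + 1 for i, opt in enumerate(options) if opt.lower() in answer]

picked = resolve_answer("2", ["indoor", "outdoor", "both"])
picked_text = resolve_answer("I prefer outdoor activities", ["indoor", "outdoor", "both"])
```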
In the case of a question with no attached options, the user's answer may take the form of a natural language sentence. Parsing may be performed on the natural language sentence at 508. The parsing may employ any existing intent-slot parsing technique to detect slots and corresponding values from the natural language sentence. These values may be used as keywords to retrieve relevant questions and corresponding answers from the set of candidate questions. The retrieved relevant questions may include the question provided to the user as well as other questions. For example, suppose the question provided to the user is "Do you like to stay in a quiet place?" and the user's answer is the natural language sentence "I like quiet, but I also like running". The keywords "quiet" and "running" can then be detected from the natural language sentence. "Quiet" may be an answer to the provided question "Do you like to stay in a quiet place?", and "running" may be an answer to another related question "What sports do you like?". Thus, at least two answers of the user to two related questions are obtained from the natural language sentence.
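The keyword-to-question retrieval can be sketched as below; the keyword index and its entries are hypothetical stand-ins for the candidate question set, and the tokenizer is a deliberate simplification of intent-slot parsing.

```python
# Hypothetical keyword index mapping slot values detected in a natural-language
# answer to (related question, implied option) pairs.
KEYWORD_INDEX = {
    "quiet": ("Do you like to stay in a quiet place?", "yes"),
    "running": ("What sports do you like?", "running"),
}

def answers_from_sentence(sentence):
    """Detect keywords in the user's sentence and return <question, answer> pairs."""
    tokens = sentence.lower().replace(",", " ").split()
    pairs = []
    for token in tokens:
        if token in KEYWORD_INDEX:
            pairs.append(KEYWORD_INDEX[token])
    return pairs

pairs = answers_from_sentence("I like quiet but I also like running")
```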
At 510, a < question, answer > pair may be added to session record 512. The question may be a provided question or a related question and the answer may be the content of the option selected by the user or a corresponding answer to the related question. The questions and answers included in session record 512 may be continually updated as the session progresses.
At 514, product ranking may be performed on the plurality of candidate products 516 based at least on the provided questions and the user's answers. By product ranking, the expected probability for each candidate product may be updated and the calculated expected probability included into the candidate product evaluation state 518. In one embodiment, the candidate product evaluation state 518 may take the form of a vector having dimensions corresponding to the number of candidate products, with each dimension having a value corresponding to the expected probability of one candidate product. Further, as previously described, the weight of the candidate product may also be used in place of the expected probability, such that the weight of each candidate product is included in the candidate product evaluation state 518.
At 520, it may be determined whether there is at least one candidate product of the plurality of candidate products that satisfies a predetermined condition based on the result of the product ranking. The predetermined condition may indicate whether a candidate product meets the condition determined to be the recommended product. For example, the predetermined condition may be that the expected probability or weight of the candidate product is above a threshold, where the threshold may be empirically set in advance.
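The check at 520 can be sketched as follows, assuming the candidate product evaluation state is held as a plain list of expected probabilities; the product names and the 0.8 threshold are illustrative assumptions.

```python
def find_recommendations(expected_probabilities, product_names, threshold=0.8):
    """Return the candidate products whose expected probability exceeds the threshold."""
    return [name for name, p in zip(product_names, expected_probabilities) if p > threshold]

hits = find_recommendations([0.05, 0.9, 0.05], ["scarf", "fishing rod", "mug"])
```

An empty result would mean no candidate satisfies the predetermined condition, in which case the process continues with question ranking.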
If it is determined at 520 that no candidate product satisfies the predetermined condition, the process 500 may provide a further question to the user. At 522, question ranking may be performed on the plurality of candidate questions 524 to determine the next question to provide to the user. The question ranking may be based at least on the result of the product ranking, e.g., the current expected probability or weight of each candidate product included in the current candidate product evaluation state. The question ranking may also be based on the historical questions and historical answers in the conversation record 512, and so on. In the question ranking, a weight or an information gain may be calculated for each candidate question, and the candidate questions may be ranked according to the weight or the information gain. The question ranking may be performed in various ways, e.g., entropy-based ranking 522-1, policy-based reinforcement learning ranking 522-2, etc., as will be discussed in more detail below.
At 526, the next question to be provided to the user may be selected based on the weights of the candidate questions. For example, the highest-ranked candidate question may be selected as the next question, or one candidate question may be randomly selected from among several top-ranked candidate questions as the next question, and so on.
After the next question is selected, the process 500 iteratively returns to 502 to provide the selected next question to the user and then perform subsequent steps.
If it is determined at 520 that there is at least one candidate product that satisfies the predetermined condition, then the at least one candidate product may be determined at 528 to be a recommended product to be provided to the user.
At 530, a reason for the recommendation of the recommended product may be determined. The reason for the recommendation may be determined in various ways.
In one approach, the recommendation reason may be generated by selecting, from the questions previously provided to the user, at least one question and corresponding user answer that contributed the most. For example, the question that produced the greatest expected probability lift for the recommended product may be selected from the historical questions provided to the user, and the recommendation reason may be generated with reference to the selected question and the user's answer. The recommendation reason may be constructed from the selected question and answer according to various predefined rules. The recommendation reason may include a simple repetition of at least one of the selected question and answer. The recommendation reason may include a transformed expression of at least one of the selected question and answer. For example, the recommendation reason may include the expression "likes fast food" transformed from the answer "I like KFC hamburgers". The recommendation reason may include a generalized expression of at least one of the selected question and answer. For example, the content of the question and answer may be semantically summarized by any natural language processing technique. The recommendation reason may further include words or phrases that are commonly used in free chat, to make the sentence expression more natural. For example, expressions such as "The reason I give the above recommendation is …", "Considering …, I decided to recommend …", and the like may be added to the recommendation reason.
In another approach, a recommendation reason generation model may be trained in advance to generate the recommendation reason based on at least one of the attributes indicated by the historical answers in the session record, the attributes of the recommended product, the description of the recommended product, and so on, as discussed in more detail below.
At 532, a response may be provided to the user including product information for the recommended product and a reason for the recommendation. The product information may be extracted from, for example, the product database 150 of FIG. 1 or the candidate product set 252 of FIG. 2.
It should be understood that all of the process steps included in the process 500 of FIG. 5, and their order, are exemplary, and any addition, deletion, or substitution of process steps in the process 500 may be made according to the requirements of a particular application. For example, in one embodiment, when the chat robot receives a natural language message sent by the user and the message is not directed to any question associated with product recommendation provided by the chat robot, parsing of the natural language message may be performed directly at 508 to determine corresponding relevant questions and corresponding answers, and subsequent processing may then be performed. Further, in one embodiment, other criteria for determining the recommended product may be defined or added at 520. For example, a threshold on the number of questions presented to the user may be predefined. If it is determined in the process 500 that more than the threshold number of questions have been provided to the user, a recommended product may be determined directly at 528, which may be, e.g., the candidate product currently having the highest expected probability or weight.
A method of selecting questions through entropy-based ranking will be discussed below. This method may be used to rank the candidate questions and, in turn, select the next question to be provided to the user from among the candidate questions.
The next question may be selected such that, regardless of what answer the user gives to the question, the question can eliminate as many candidate products as possible that are unlikely to be the recommended product. The next question may be selected to be one that can divide the candidate products into subsets of similar size or similar weight. For example, where the answer to the question is a binary answer, such as "yes" or "no", the candidate products may be divided into a subset whose reference answer is "yes" and a subset whose reference answer is "no", the two subsets having equal or similar numbers of candidate products, or equal or similar cumulative weights. Where the answer to a question has more than two options, e.g., "cheap", "medium", and "expensive", the candidate products may be divided into three subsets of similar size or similar weight whose reference answers are "cheap", "medium", and "expensive", respectively. In one embodiment, after the product attributes currently selected by the user have been determined through one round of conversation, those candidate products that meet the currently selected attributes may be used as the candidate product set to be referenced when determining the question for the next round. By continuously performing the above process, the size of the candidate product set used for determining the next question can be continuously reduced.
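The balanced-split criterion can be illustrated by scoring a question with the entropy of the weighted partition its reference answers induce over the candidate products; this is a simplified sketch of the idea, not the disclosure's exact formula, and all data values are illustrative.

```python
import math

def split_balance(weights, reference_answers):
    """Entropy of the weighted partition of candidate products induced by one
    question's reference answers; higher means a more balanced split."""
    totals = {}
    for w, answer in zip(weights, reference_answers):
        totals[answer] = totals.get(answer, 0.0) + w
    z = sum(totals.values())
    return -sum((v / z) * math.log(v / z) for v in totals.values() if v > 0)

# A question splitting four equally weighted products 2/2 scores higher
# than one splitting them 3/1.
balanced = split_balance([1, 1, 1, 1], ["yes", "yes", "no", "no"])
skewed = split_balance([1, 1, 1, 1], ["yes", "yes", "yes", "no"])
```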
In the entropy-based ranking, for product category i, each candidate product c_m (1 ≤ m ≤ M) may initially be assigned a prior probability weight w(c_m). The weight may be set with reference to the search frequency of the candidate product on a search engine, or the click count or order frequency of the candidate product on an e-commerce website. Then, w may be normalized as:

    w̃(c_m) = w(c_m) / Σ_{m'=1}^{M} w(c_{m'})    (1)
For a candidate product c_m, its contribution to the selection of candidate question q_n may be calculated. First, a smoothed distribution over the options l of question q_n is formed:

    p_{m,n}(l) = [p(c_m | q_n = l) + α · I(D^i_{m,n} = l)] / Σ_{l'} [p(c_m | q_n = l') + α · I(D^i_{m,n} = l')]    (2)

where p(c_m | q_n = l) denotes the probability, in the historical data, that users who selected option l of candidate question q_n finally selected candidate product c_m. The historical data may be obtained by collecting usage information of a large number of users, so that p(c_m | q_n = l) reflects the historical usage information of a large number of users. I(·) is an indicator function which returns 1 when D^i_{m,n} = l holds and returns 0 otherwise, where D^i_{m,n} = l indicates that option l is candidate product c_m's reference answer to candidate question q_n. The parameter α is used to balance the historical usage information against the reference answers in the reference answer matrix D^i. For example, when α is 0, only the historical usage information is considered and the reference answers are ignored; when α is set to a very large value, the reference answers dominate and the historical usage information is ignored. Furthermore, α may also be extended to a time decay function α(t), where t represents time.

In one embodiment, the negative Shannon entropy of the multivariate Bernoulli distribution over the options may be used to calculate the parameter M_{m,n}:

    M_{m,n} = Σ_l p_{m,n}(l) · log p_{m,n}(l)    (3)

The weight w(q_n) of the candidate question q_n may then be calculated as:

    w(q_n) = −Σ_{m=1}^{M} w̃(c_m) · M_{m,n}    (4)
The above process may be performed for each candidate question in the candidate question set Q^i so as to calculate a weight for each candidate question. The candidate questions may be ranked based on the weights, and the next question q* to be provided to the user may be selected from the ranked candidate questions.

Based on the user's answer a to the question q*, the weight of each candidate product c_m may be updated as:

    w(c_m) ← w̃(c_m) · [p(c_m | q* = a) + α · I(D^i_{m,*} = a)]    (5)

In the case that the user's answer a does not match any of the options of the question q*, the indicator term I(·) will be 0. In this case, completely discarding a candidate product can be avoided by setting the return value of I(·) to a very small value, e.g., 0.01, instead of 0.
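A minimal sketch of this ranking-and-update loop is given below, under the assumption that each option's probability combines a historical selection frequency with an α bonus for reference-answer matches, and that a small floor value keeps non-matching candidates alive; all product names, answers, and parameter values are illustrative.

```python
import math

def normalize(weights):
    z = sum(weights.values())
    return {m: w / z for m, w in weights.items()}

def option_distribution(product, question, options, history, reference, alpha):
    # Smoothed distribution over the question's options for one candidate
    # product: historical selection probability plus an alpha bonus for the
    # product's reference answer.
    raw = {l: history.get((product, question, l), 0.0)
              + alpha * (1.0 if reference[(product, question)] == l else 0.0)
           for l in options}
    z = sum(raw.values()) or 1.0
    return {l: v / z for l, v in raw.items()}

def question_weight(question, options, weights, history, reference, alpha):
    # Weighted negative-entropy score used to rank candidate questions.
    total = 0.0
    for m, w in weights.items():
        p = option_distribution(m, question, options, history, reference, alpha)
        total += w * sum(v * math.log(v) for v in p.values() if v > 0)
    return -total

def update_weights(weights, question, answer, history, reference, alpha, floor=0.01):
    # Multiplicative update from the user's answer; the small floor keeps a
    # non-matching candidate from being discarded entirely.
    new = {m: w * (history.get((m, question, answer), 0.0)
                   + alpha * (1.0 if reference[(m, question)] == answer else floor))
           for m, w in weights.items()}
    return normalize(new)

# Toy session: two candidates, one binary question, no historical usage data.
reference = {("fishing_rod", "outdoor"): "yes", ("board_game", "outdoor"): "no"}
weights = normalize({"fishing_rod": 1.0, "board_game": 1.0})
weights = update_weights(weights, "outdoor", "yes",
                         history={}, reference=reference, alpha=1.0)
```

After the user answers "yes" to the outdoor question, almost all of the normalized weight shifts to the candidate whose reference answer matches.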
The calculated weight for each candidate product may be added to the candidate product evaluation state. A candidate product may be determined to be a recommended product if it meets a predetermined condition, such as the weight or expected probability of the candidate product exceeding a threshold.
If no candidate product satisfies the predetermined condition, the above entropy-based ranking may be performed again iteratively. For example, equation (1) may again be used to normalize w(c_m) to w̃(c_m), and subsequent processing may then continue.
The current stage and the next stage may be denoted by subscripts t and t+1, respectively, giving the candidate product weight w̃_t(c_m) used at the current stage and the candidate product weight w̃_{t+1}(c_m) to be used at the next stage following the current stage. Then, the weight lift or expected probability lift Δ_t(c_m) of a candidate product c_m caused by the current question q_t provided to the user and the user's answer a may be calculated as:

    Δ_t(c_m) = w̃_{t+1}(c_m) − w̃_t(c_m)    (6)
In one embodiment, after a candidate product is determined to be a recommended product, questions and corresponding user responses that cause the greatest weight increase or the greatest expected probability increase for the recommended product may be selected to generate a reason for recommendation for the recommended product. For example, assume that the first question results in a change in the expected probability of the recommended product from 0 to 0.2, the second question results in a change in the expected probability from 0.2 to 0.6, and the third question results in a change in the expected probability from 0.6 to 0.9, where the expected probability of 0.9 exceeds the threshold of 0.8. Since the expected probability due to the first question is raised to 0.2, the expected probability due to the second question is raised to 0.4, and the expected probability due to the third question is raised to 0.3, the second question and the corresponding user answer that cause the greatest expected probability rise (i.e., 0.4) may be selected to generate the recommendation reason. As described above, the recommendation reason may be generated by the recommendation reason generation model.
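The selection of the most-contributing question from a trajectory of expected probabilities, as in the example above, can be sketched as follows; the 1-based question indexing is an assumption for readability.

```python
def best_explaining_question(probability_trajectory):
    """Given the recommended product's expected probability before any question
    and after each question, return the 1-based index of the question that
    produced the largest lift."""
    lifts = [b - a for a, b in zip(probability_trajectory, probability_trajectory[1:])]
    return max(range(len(lifts)), key=lambda i: lifts[i]) + 1

# Trajectory 0 -> 0.2 -> 0.6 -> 0.9: the second question's lift (0.4) is largest.
chosen = best_explaining_question([0.0, 0.2, 0.6, 0.9])
```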
By performing the entropy-based ranking described above, in one aspect, the weights of the candidate questions may be continually updated during the session to select the next question to be provided to the user, and in another aspect, the candidate product weights or desired probabilities may be continually updated so that recommended products may be determined.
It should be understood that all of the above equations are exemplary, are merely used to illustrate exemplary processes, and embodiments of the present disclosure are not limited to any of the specific equations described above. For example, instead of equations (3) and (4), the weight of candidate question q_n may be calculated by the following procedure.

After p_{m,n}(l) is calculated by equation (2), an importance g_n(l) of option l to candidate question q_n may be calculated as:

    g_n(l) = Σ_{m=1}^{M} w̃(c_m) · p_{m,n}(l)    (7)

Then, g_n(l) may be used to calculate the weight w(q_n) of question q_n:

    w(q_n) = −Σ_l g_n(l) · log g_n(l)    (8)

Furthermore, it should be understood that in the above description of the entropy-based ranking process, the category index i of most variables, other than D^i, has been omitted for simplicity.
A method of selecting questions through policy-based reinforcement learning ranking will be discussed below. This method may be used to rank the candidate questions and, in turn, select the next question to be provided to the user from among the candidate questions.
Policy-based reinforcement learning algorithms can be used to predict a particular entity (e.g., a celebrity, etc.) under the constraint of allowing the user to answer single-attribute questions. A single-attribute question may refer to a question intended to obtain a binary answer (e.g., yes, no, etc.). Embodiments of the present disclosure adapt such an algorithm to the multi-option questions used for explanatory product recommendation. For example, the algorithm may be adapted to a task-oriented scenario of questions attached with multiple options. The user may select any subset of these options, which can be regarded as an intersection of several single-attribute questions. Further, when the user expresses a requirement in a natural language sentence that involves answers to multiple questions, an intersection of multiple single-attribute questions or combined questions can be achieved in a single turn of conversation with the user.
The question ranking can be formalized as a finite Markov Decision Process (MDP) represented by the five-tuple ⟨S, A, P, R, γ⟩. S is a continuous candidate product evaluation state space, with each state s in S representing a vector storing the expected probabilities of the candidate products. A = {q_1, q_2, …, q_n} is the set of candidate questions. P(S_{t+1} = s' | S_t = s, A_t = q) is the state transition probability matrix. R(s, q) represents a feedback function or feedback network. γ ∈ [0, 1] is the discount factor used for converting long-term return values. In the policy-based reinforcement learning algorithm, at each time step t, the chat robot may act according to a policy function or policy network π_θ(q_t | s) to provide a candidate question q_t in the current candidate product evaluation state s. After the candidate question q_t is provided and the user's answer to q_t is received, a feedback score r_{t+1} may be generated and the candidate product evaluation state s updated to s'. The quadruple ⟨s, q_t, r_{t+1}, s'⟩ is used as an episode in the reinforcement learning process. The long-term feedback R_t at time step t may be defined as:

    R_t = Σ_{k=0}^{∞} γ^k · r_{t+k+1}    (9)
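For a finite episode, the long-term feedback reduces to a truncated discounted sum, which can be sketched as:

```python
def discounted_return(rewards, gamma):
    """Discounted sum of future feedback scores over a finite episode:
    sum over k of gamma**k * r_{t+k+1}."""
    total = 0.0
    for k, r in enumerate(rewards):
        total += (gamma ** k) * r
    return total

R0 = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```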
The candidate product evaluation state s_t may keep track of the confidence, e.g., the expected probability, of each candidate product c_m at time step t. For example, s_t = (s_{t,1}, s_{t,2}, …, s_{t,M}), with Σ_m s_{t,m} = 1. Here, s_{t,m} indicates the confidence that the user desires product c_m at time step t. Initially, similarly to the entropy-based ranking described above, s_0 may adopt the prior expected probabilities of the candidate products.

Given the candidate product set C = {c_m} and the candidate question set Q = {q_n}, the normalized confidence of the user's answer over the multiple options of each question q_n may be calculated. That is, the transition of the candidate product evaluation state may be defined as:

    s_{t+1} = (s_t ⊙ β) / ‖s_t ⊙ β‖₁    (10)

Here, ⊙ is the element-wise product operator, and β depends on the user's answer x_t to the question q_t selected at time step t, where q_t has index n_t in the candidate question set Q = {q_n}. When the user's answer to the current question q_t selects an option l, β_m = p(c_m | q_{n_t} = l) may be defined, where p(·) adopts a definition similar to that in equation (2). In this way, based on the user's answer {l} to the question q_t at time step t, the confidence s_{t,m} of candidate product c_m is updated to s_{t+1,m}.
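The state transition can be sketched as an element-wise product with the answer-dependent factor followed by L1 renormalization; the numeric values below are illustrative.

```python
def transition(state, beta):
    """Element-wise update of the candidate product evaluation state by the
    answer-dependent factor beta, followed by L1 renormalization so the
    state remains a probability distribution."""
    raw = [s * b for s, b in zip(state, beta)]
    z = sum(raw)
    return [v / z for v in raw]

# Three candidates; the user's answer strongly supports the first and third.
s_next = transition([0.25, 0.25, 0.5], [0.9, 0.1, 0.5])
```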
In order to enable the policy-based reinforcement learning algorithm to take the previously provided questions as preconditions for determining the next question, and to use the user's historical selections for ranking the candidate questions, embodiments of the present disclosure propose a neural-network-based LTE feedback network which takes the quadruple ⟨s_t, q_t, x_t, s_{t+1}⟩ as input and outputs the feedback r_{t+1} of the next step. The LTE feedback network employs a multi-layer perceptron (MLP) with a sigmoid output in order to learn appropriate immediate feedback during training. Table 1 below shows an exemplary training process.

[Table 1: exemplary training process, provided as an algorithm listing (steps 1.1–1.21) in the original document.]

TABLE 1
In the procedure shown in Table 1, the question q_t and the corresponding answer x_t are embedded, and the resulting embedding vectors are concatenated with s_t for training the feedback network R under, e.g., a squared error loss function. The feedback network is further used to train the policy function, which ranks the candidate questions and selects the next question. The policy function may be trained using the REINFORCE algorithm under, e.g., a cross-entropy loss function.
A value network V may be used to score how good the current state s_t is. The value network can estimate how beneficial it is for the current state itself to be selected in the episode. The value network may use, e.g., a squared error loss function, and employ the cumulative feedback r'_{t+1} as the reference score. After updating, the difference between r'_{t+1} and the new estimated score v_{t+1} is further used to update R and π_θ, respectively.
Four loops are included in Table 1. The first loop runs from 1.3 to 1.21 and limits the number of iterations to Z. The second loop runs from 1.5 to 1.9 and applies the policy function to select questions and update the candidate product evaluation state. In one embodiment, a candidate question may be restricted to being selected and used only once during a session. The results obtained in the second loop are stored in S_1 for use in subsequent steps. The third loop runs from 1.11 to 1.13 and applies the feedback network R to obtain immediate feedback. The fourth loop runs from 1.14 to 1.21 and updates the parameters in the policy function and the feedback network by sampling mini-batches of data from the episode memory. In one embodiment, the policy function and the feedback network may employ MLPs with, e.g., 3 hidden layers, and may be optimized with ADAM-based algorithms.
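A forward-pass sketch of such a feedback network is shown below, with three hidden layers and a sigmoid output. The layer sizes, the random (untrained) weights, and the concrete input layout are assumptions for illustration only; in training, the weights would be fit under a squared-error loss.

```python
import math
import random

random.seed(0)

def dense(x, out_dim):
    # Randomly initialized weights stand in for learned parameters.
    return [sum(random.gauss(0.0, 1.0 / math.sqrt(len(x))) * xi for xi in x)
            for _ in range(out_dim)]

def feedback_network(x, hidden=(16, 16, 16)):
    """Forward pass of an MLP with three hidden layers and a sigmoid output,
    a stand-in for the feedback network R."""
    h = x
    for size in hidden:
        h = [math.tanh(v) for v in dense(h, size)]
    logit = dense(h, 1)[0]
    return 1.0 / (1.0 + math.exp(-logit))

# Input: s_t concatenated with hypothetical embeddings of (q_t, x_t) and s_{t+1}.
x = [0.2, 0.8] + [1.0, 0.0, 0.0] + [0.0, 1.0] + [0.1, 0.9]
r_next = feedback_network(x)
```

The sigmoid output keeps the immediate feedback in (0, 1), matching its use as a learned feedback score.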
At the stage of applying the policy-based reinforcement learning ranking, the lift in expected probability at step t+1, caused by the current question q_t and the user's answer, may be obtained for each candidate product c_m as:

    Δ_t(c_m) = s_{t+1,m} − s_{t,m}    (11)
Thus, a recommended product may be determined by comparing the expected probability of each candidate product with the predetermined condition.
It should be understood that all formulas, variables, parameters, etc. referred to in the discussion above regarding entropy-based ranking and policy-based reinforcement learning ranking are exemplary. Any form of deletion, substitution, or addition of these formulas, variables, parameters, etc. may be made according to the specific application requirements. Embodiments of the disclosure are not limited to any of the details discussed above.
FIG. 6 illustrates an exemplary process 600 for training a recommendation reason generation model according to an embodiment.
Training data for training the recommendation reason generation model may be collected from a network, such as an e-commerce website. A set of products 610 for generating training data may first be identified or designated. For each product in the product set 610, category information 640 for the product may be obtained from the website that provides the product. In general, the category information of a product may include a series of categories at different levels. For example, the product "salmon" may correspond to multiple categories at different levels, such as "food", "seafood", and "fish".
In some cases, after a user purchases a product from an e-commerce website, the user may provide a review for the product, and the review may contain, in natural language, explanatory reasons for purchasing the product. Accordingly, reviews 620 for the products in the product set 610 may be collected. Optionally, the process 600 may perform filtering on the reviews 620 to obtain filtered reviews 622. For example, sentiment analysis may be performed on the reviews to filter out negative reviews and retain positive reviews. Further, e.g., predefined expression patterns may be employed to detect the validity of a review and filter out invalid reviews that include too few words or too many repeated characters. It should be understood that the term "review" referred to hereinafter may broadly refer to either or both of the reviews 620 and the filtered reviews 622. The process 600 may extract attribute information 650 of a product from the reviews 620 or the filtered reviews 622. For example, from the review "These shoes are amazing! Super soft, shock absorbing, and very light", attributes of the product "shoes" can be extracted, such as "soft", "shock absorbing", "light", and so on.
Typically, a description of the product, e.g., its characteristics, parameters, etc., may also be provided on the e-commerce website. These descriptions typically explicitly include various attributes of the product expressed in natural language. Thus, descriptions 630 for the products in the product collection 610 may be collected. Optionally, the process 600 may perform summarization of the description 630 of the product to obtain a product description summary 632. In some cases, the description of the product may be long, so only the main content of the description may be used for subsequent training. The summarization of the description may be performed using existing unsupervised text ranking algorithms. It should be understood that the term "description of a product" referred to below may refer broadly to either or both of the product description 630 and the product description summary 632. The process 600 may extract attribute information 650 for a product from the description 630 or the product description summary 632 for the product.
Through the data collection process described above, a training data set 660 in the form of <attribute + description, review> pairs may be formed. Each <attribute + description, review> data pair is associated with a particular product. The recommendation reason generation model 670 may be trained with the training data set 660, where the "attribute + description" in the training data is used as the input of the model and the "review" is used as the output of the model. The recommendation reason generation model 670 may employ a transformer architecture, in which both the encoding part and the decoding part may employ a self-attention mechanism with positional encoding for sequence-dependent learning. In training, the encoding part may process the attributes and description of a product, while the decoding part may process previous reviews of the product by different purchasers. The trained recommendation reason generation model 670 may generate a recommendation reason in natural language that is similar to a review.
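The assembly of training pairs from collected attributes, descriptions, and reviews can be sketched as follows; the word-count filter and the catalog entry are illustrative, not the disclosure's sentiment-analysis pipeline.

```python
def build_training_pairs(products):
    """Assemble <attribute + description, review> pairs, keeping only reviews
    that pass a simple validity check."""
    pairs = []
    for p in products:
        source = " ".join(p["attributes"]) + " " + p["description"]
        for review in p["reviews"]:
            if len(review.split()) < 3:  # too few words: treat as invalid
                continue
            pairs.append((source, review))
    return pairs

catalog = [{
    "attributes": ["soft", "shock-absorbing", "light"],
    "description": "Lightweight running shoe with cushioned sole.",
    "reviews": ["Super soft, shock absorbing, and very light.", "ok"],
}]
training_pairs = build_training_pairs(catalog)
```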
It should be understood that all of the steps in the process 600 are exemplary, and the process 600 may be modified in any manner according to the requirements and design of a particular application. For example, when only a small number of reviews can be collected from the network for a certain product, the description of the product may be used instead of reviews for constructing explanatory reasons. In this case, the training data may take the form of <attribute, description>, whereby the "attribute" may be used as the input and the "description" as the output of the model when training the recommendation reason generation model.
Fig. 7 illustrates an exemplary process 700 for generating a reason for recommendation, according to an embodiment. As previously described, after determining the recommended product, a reason for the recommendation of the recommended product may be further determined for presentation to the user in a response. A recommendation reason generation model 710 is used in process 700 to generate a recommendation reason for recommending a product, which model 710 may be pre-trained by process 600 of FIG. 6.
Historical questions provided by the chat robot and historical answers provided by the user in the session may be included in the session record 702. The user's historical answers may indicate product attributes 704 selected or desired by the user. The user-selected attributes 704 for the recommended product may be provided as input to the recommendation reason generation model 710.
Attributes 706 of the recommended product may be obtained, for example, by process 600 in FIG. 6, and provided as input to the recommendation reason generation model 710. Further, a description 708 of the recommended product may also be obtained, for example, by process 600 in FIG. 6, and provided as input to a recommendation reason generation model 710.
The recommendation reason generation model 710 may generate a recommendation reason 720 for the recommended product based on at least one of: the user-selected attributes 704, the attributes 706 of the recommended product, and the description 708 of the recommended product. Optionally, the user-selected attribute indicated by the answer to the question that produced the greatest expected probability increase for the recommended product may be given a higher weight when generating the recommendation reason.
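The weighting described above could, for example, be approximated at the input level by repeating the attribute indicated by the highest-boost answer. The following sketch assumes a simple token-repetition heuristic; the function name and parameters are illustrative and not taken from the disclosure:

```python
def assemble_reason_input(user_attrs, product_attrs, description,
                          boosted_attr=None, boost=2):
    """Assemble the input sequence for a recommendation reason generation model.

    The attribute indicated by the answer to the question with the greatest
    expected probability increase (`boosted_attr`) is repeated `boost` times,
    a simple way to give it higher weight in the serialized input.
    """
    tokens = []
    for a in user_attrs:
        # Repeat the boosted attribute; keep a single copy of the others.
        tokens.extend([a] * (boost if a == boosted_attr else 1))
    tokens.extend(product_attrs)
    tokens.append(description)
    return " ; ".join(tokens)
```

In practice the weighting might instead be applied inside the model (e.g., via attention biases); token repetition is only one cheap approximation.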
Fig. 8 illustrates an exemplary chat window 800, according to an embodiment.
Upon receiving the message "help me find a gift" provided by the user, the chat robot may determine that the user wants to obtain a product recommendation, and that the product category is "gift". The chat robot may then provide a plurality of questions to the user and receive answers from the user through multiple rounds of conversation. These questions may be dynamically determined in turn in accordance with the embodiments of the present disclosure discussed above. Each question has attached options, and the user's answer accordingly comprises an explicit selection of an option. Finally, the chat robot provides the user with the response "Based on your choice of 'riverside' for question 6, I recommend you buy a fishing rod". The response includes the product information "fishing rod" of the recommended product, where the recommended product may be determined according to the embodiments of the present disclosure discussed above. The response also includes the recommendation reason "based on your choice of 'riverside' for question 6", where the recommendation reason may be generated according to the embodiments of the present disclosure discussed above, e.g., based on the question 6 and corresponding answer that produced the greatest expected probability increase for the recommended product.
Fig. 9 illustrates an exemplary chat window 900, according to an embodiment.
Upon receiving the message "help me recommend electronic goods as a gift" provided by the user, the chat robot may determine that the user wants to obtain a product recommendation, and that the product categories are "gift" and "electronic goods". The chat robot may then provide a plurality of questions to the user and receive answers from the user through multiple rounds of conversation. Finally, the chat robot provides the user with the response "Considering that you like 'running', I recommend you buy a smart band". The response includes the product information "smart band" of the recommended product, and the recommendation reason "considering that you like 'running'". The recommended product and the recommendation reason may be determined according to the embodiments of the present disclosure discussed above, where the recommendation reason may be generated based on the question 5 and corresponding answer that produced the greatest expected probability increase for the recommended product.
As can be seen by comparing fig. 8 with fig. 9, the chat robot can dynamically determine the next question based at least on the user's answer to each question.
Fig. 10 illustrates an exemplary chat window 1000, according to an embodiment. In the session of FIG. 10, the chat robot may provide the user with product recommendations relating to a hotel reservation service. Questions 1 and 2 have attached options for the user to select. Questions 3 through 6 have no attached options, and the product attributes desired by the user may be determined by parsing the user's answers, which are natural language sentences, e.g., at step 508 of FIG. 5. Further, related questions and corresponding answers that correspond to the user's answers may be determined through the parsing, for use in determining the next question.
Fig. 11 illustrates an exemplary chat window 1100, according to an embodiment. In the session of FIG. 11, the chat robot may provide the user with product recommendations relating to gifts. The first and second questions have attached options, but the user's answers are in the form of natural language sentences. The user's answers may be recognized, e.g., by step 506 of FIG. 5, to determine the options selected by the user, or parsed, e.g., by step 508 of FIG. 5, to determine the product attributes desired by the user and to determine related questions and corresponding answers. Two recommendation reasons are included in the response provided by the chat robot. The first recommendation reason, "I recommended it based on the keywords 'quiet' and 'yoga/reading'", may be generated based on the first and second questions and corresponding answers, which produced the greatest expected probability increase for the recommended product. The second recommendation reason, "I hope the gifts help her enjoy her life in a quieter environment", may be generated by the recommendation reason generation model.
Fig. 12 illustrates an exemplary chat window 1200, according to an embodiment. In the session of FIG. 12, the user has actively sent a message in a natural language sentence: "I can tell you that she likes quiet places, music, and yoga". The message may be parsed, e.g., by step 508 of FIG. 5, to determine a set of related questions and corresponding answers, such as the question "Does she like quiet?" and the corresponding answer "likes quiet", and the question "What does she like?" and the corresponding answers "music", "yoga", etc. The above "quiet", "music", "yoga", etc. may be regarded as product attributes desired by the user, and may be further used for performing the subsequent product ranking, question ranking, etc.
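A minimal sketch of the parsing illustrated by FIG. 12 is a keyword lookup that maps terms found in the user's message to related question/answer pairs. A production system would use a trained semantic parser (e.g., step 508 of FIG. 5); the keyword-to-question mapping below is a stand-in assumption for illustration:

```python
def parse_answer(message, keyword_to_question):
    """Map keywords found in a natural language answer to (question, answer) pairs.

    `keyword_to_question` maps each attribute keyword (e.g., "quiet") to the
    related question it answers. Each matched keyword becomes the answer.
    """
    pairs = []
    for keyword, question in keyword_to_question.items():
        if keyword in message:
            pairs.append((question, keyword))
    return pairs
```

The extracted answers ("quiet", "music", "yoga", ...) can then feed the product ranking and question ranking steps as user-desired attributes.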
FIG. 13 shows a flowchart of an exemplary method 1300 for providing an explanatory product recommendation in a session, according to an embodiment.
At 1310, at least one question associated with the product recommendation may be provided.
At 1320, an answer to the at least one question may be received.
At 1330, it may be determined whether at least one recommended product is present based at least on the at least one question and the answer.
At 1340, a reason for the recommendation of the at least one recommended product may be generated in response to determining that the at least one recommended product exists.
At 1350, a response may be provided that includes product information for the at least one recommended product and the reason for the recommendation.
In one embodiment, the determining whether at least one recommended product exists may include: performing product ranking on a plurality of candidate products based at least on the at least one question and the answer; and determining whether at least one candidate product satisfying a predetermined condition exists among the plurality of candidate products based on a result of the product ranking.
In one embodiment, the performing product ranking may include: updating a candidate product evaluation state based at least on the at least one question and the answer, the candidate product evaluation state including an expected probability for each of the plurality of candidate products.
In one embodiment, the predetermined condition may include: the expected probability of the candidate product is above a threshold.
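One plausible realization of the candidate product evaluation state update and the threshold condition above is a Bayesian renormalization. The disclosure does not fix a specific formula, so the update rule and likelihood structure below are assumptions for illustration:

```python
def update_state(probs, likelihoods):
    """Update the candidate product evaluation state after a question/answer.

    `probs` maps each candidate product to its current expected probability;
    `likelihoods` maps each product to P(observed answer | product), e.g.,
    derived from attribute matching. Applies Bayes' rule and renormalizes.
    """
    posterior = {p: probs[p] * likelihoods.get(p, 0.0) for p in probs}
    total = sum(posterior.values())
    if total == 0:
        return dict(probs)  # uninformative answer: keep the state unchanged
    return {p: v / total for p, v in posterior.items()}

def recommended(probs, threshold=0.8):
    """Candidate products whose expected probability is above the threshold."""
    return [p for p, v in probs.items() if v > threshold]
```

With each received answer, `update_state` sharpens the distribution over candidates; once `recommended` returns a non-empty list, a recommended product exists under the predetermined condition.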
In one embodiment, the method 1300 may further include: adding the at least one question and the answer to a session record for the session, the session record including historical questions and historical answers associated with the product recommendation in the session.
In one embodiment, the generating the recommendation reason may include: determining an expected probability increase that each historical question in the session record produced for the at least one recommended product; selecting the historical question that produced the greatest expected probability increase; and generating the recommendation reason based at least on the selected historical question and the corresponding answer.
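The selection of the historical question with the greatest expected probability increase might be sketched as follows, assuming the session record stores, for each question, the recommended product's expected probability before and after the answer (an illustrative structure, not from the disclosure):

```python
def best_history_question(history):
    """Pick the historical question/answer that produced the greatest
    expected probability increase for the recommended product.

    `history` is a list of (question, answer, prob_before, prob_after)
    tuples recorded for the session.
    """
    # The boost of each entry is the after-minus-before probability delta.
    best = max(history, key=lambda h: h[3] - h[2])
    question, answer, _, _ = best
    return question, answer
```

The returned question and answer can then be cited verbatim in the recommendation reason (e.g., "based on your choice of 'riverside' for question 6").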
In one embodiment, the generating the reason for the recommendation may include: generating, by a recommendation reason generation model, the recommendation reason based on at least one of the attribute selected by the historical answer, the attribute of the recommended product, and the description of the recommended product.
In one embodiment, the method 1300 may further include: in response to determining that the at least one recommended product does not exist, performing question ranking for a plurality of candidate questions based at least on a result of the product ranking; and selecting a next question to be presented based on a result of the question ranking.
In one embodiment, the question ranking may be performed by entropy-based ranking or by policy-based reinforcement learning ranking.
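Entropy-based question ranking can be illustrated as choosing the question with the greatest expected reduction in entropy (information gain) over the candidate product distribution. The `answer_model` structure below, which maps each candidate question to per-product answer likelihoods, is an assumption for this sketch:

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def rank_questions(product_probs, answer_model):
    """Rank candidate questions by expected information gain.

    `product_probs` maps products to expected probabilities; `answer_model`
    maps each question to {answer: {product: P(answer | product)}}.
    """
    current = entropy(product_probs.values())
    scores = {}
    for q, answers in answer_model.items():
        expected = 0.0
        for ans, likelihood in answers.items():
            # Joint P(answer, product), marginal P(answer), posterior over products.
            joint = {p: product_probs[p] * likelihood.get(p, 0.0)
                     for p in product_probs}
            p_ans = sum(joint.values())
            if p_ans == 0:
                continue
            posterior = [v / p_ans for v in joint.values()]
            expected += p_ans * entropy(posterior)
        scores[q] = current - expected  # expected entropy reduction
    return sorted(scores, key=scores.get, reverse=True)
```

A question whose answers perfectly separate the candidates scores the full current entropy, while an uninformative question scores zero, so the ranking naturally prefers discriminative questions.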
In one embodiment, the plurality of candidate questions may be predetermined based at least on attributes of the plurality of candidate products.
In one embodiment, the at least one question may include one or more options and the answer may include a selection for the one or more options.
In one embodiment, the answer may include a natural language sentence. The determining whether at least one recommended product exists may include: determining one or more relevant questions and corresponding answers corresponding to the natural language sentence; and determining whether the at least one recommended product is present based at least on the one or more related questions and the corresponding answer.
It should be understood that the method 1300 may also include any steps/processes for providing an explanatory product recommendation in a session in accordance with embodiments of the present disclosure described above.
Fig. 14 illustrates an exemplary apparatus 1400 for providing an explanatory product recommendation in a session, according to an embodiment.
The apparatus 1400 may include: a question providing module 1410 for providing at least one question associated with the product recommendation; an answer receiving module 1420 to receive an answer to the at least one question; a recommended product determination module 1430 for determining whether at least one recommended product exists based at least on the at least one question and the answer; a reason for recommendation generation module 1440 for, in response to determining that the at least one recommended product exists, generating a reason for recommendation for the at least one recommended product; and a response providing module 1450 for providing a response including the product information of the at least one recommended product and the reason for the recommendation.
In one embodiment, the recommended product determination module 1430 may be configured to: performing product ranking on a plurality of candidate products based at least on the at least one question and the answer; and determining whether at least one candidate product satisfying a predetermined condition exists among the plurality of candidate products based on a result of the product ranking. The performing product ranking may include: updating a candidate product evaluation state based at least on the at least one question and the answer, the candidate product evaluation state including an expected probability for each of the plurality of candidate products.
In one embodiment, the apparatus 1400 may further include: a session record adding module to add the at least one question and the answer to a session record of the session, the session record including historical questions and historical answers associated with the product recommendation in the session.
In one embodiment, the recommendation reason generation module 1440 may be configured to: determining an expected probability increase that each historical question in the session record produced for the at least one recommended product; selecting the historical question that produced the greatest expected probability increase; and generating the recommendation reason based at least on the selected historical question and the corresponding answer.
In one embodiment, the recommendation reason generation module 1440 may be configured to: generating, by a recommendation reason generation model, the recommendation reason based on at least one of the attribute selected by the historical answer, the attribute of the recommended product, and the description of the recommended product.
In one embodiment, the apparatus 1400 may further include a question selection module to: in response to determining that the at least one recommended product does not exist, performing a question ordering for a plurality of candidate questions based at least on a result of the product ordering; and selecting a next question to be presented based on the result of the question ranking.
Furthermore, the apparatus 1400 may also include any other modules configured to provide an explanatory product recommendation in a session in accordance with embodiments of the present disclosure described above.
FIG. 15 shows an exemplary apparatus 1500 for providing an explanatory product recommendation in a session, according to an embodiment.
The apparatus 1500 may include at least one processor 1510 and a memory 1520 storing computer-executable instructions. When executing the computer-executable instructions, processor 1510 may: providing at least one question associated with a product recommendation; receiving an answer to the at least one question; determining whether at least one recommended product exists based at least on the at least one question and the answer; in response to determining that the at least one recommended product exists, generating a reason for recommendation for the at least one recommended product; and providing a response including the product information of the at least one recommended product and the reason for the recommendation. Further, the processor 1510 may also perform any other process for providing an explanatory product recommendation in a session in accordance with embodiments of the present disclosure as described above.
Embodiments of the present disclosure may be embodied in non-transitory computer readable media. The non-transitory computer-readable medium may include instructions that, when executed, cause one or more processors to perform any of the operations of the method for providing an explanatory product recommendation in a session in accordance with embodiments of the present disclosure as described above.
It should be understood that all operations in the methods described above are exemplary only, and the present disclosure is not limited to any operations in the methods or the order of the operations, but rather should encompass all other equivalent variations under the same or similar concepts.
It should also be understood that all of the modules in the above described apparatus may be implemented in various ways. These modules may be implemented as hardware, software, or a combination thereof. In addition, any of these modules may be further divided functionally into sub-modules or combined together.
The processor has been described in connection with various apparatus and methods. These processors may be implemented using electronic hardware, computer software, or any combination thereof. Whether such processors are implemented as hardware or software depends upon the particular application and the overall design constraints imposed on the system. By way of example, the processor, any portion of the processor, or any combination of processors presented in this disclosure may be implemented as a microprocessor, microcontroller, Digital Signal Processor (DSP), Field Programmable Gate Array (FPGA), Programmable Logic Device (PLD), state machine, gated logic, discrete hardware circuits, and other suitable processing components configured to perform the various functions described in this disclosure. The functionality of a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented as software executed by a microprocessor, microcontroller, DSP, or other suitable platform.
Software should be viewed broadly as representing instructions, instruction sets, code segments, program code, programs, subroutines, software modules, applications, software packages, routines, objects, threads of execution, procedures, functions, and the like. The software may reside in a computer readable medium. The computer readable medium may include, for example, memory, which may be, for example, a magnetic storage device (e.g., hard disk, floppy disk, magnetic strip), an optical disk, a smart card, a flash memory device, a Random Access Memory (RAM), a Read Only Memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), a register, or a removable disk. Although the memory is shown as being separate from the processor in aspects presented in this disclosure, the memory may be located internal to the processor (e.g., a cache or a register).
The above description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalents to the elements of the various aspects described herein that are known or later come to be known to those of ordinary skill in the art are intended to be encompassed by the present claims.

Claims (20)

1. A method for providing an explanatory product recommendation in a session, comprising:
providing at least one question associated with a product recommendation;
receiving an answer to the at least one question;
determining whether at least one recommended product exists based at least on the at least one question and the answer;
in response to determining that the at least one recommended product exists, generating a reason for recommendation for the at least one recommended product; and
providing a response including product information for the at least one recommended product and the reason for the recommendation.
2. The method of claim 1, wherein the determining whether at least one recommended product exists comprises:
performing product ranking on a plurality of candidate products based at least on the at least one question and the answer; and
determining whether there is at least one candidate product among the plurality of candidate products that satisfies a predetermined condition based on a result of the product ranking.
3. The method of claim 2, wherein the performing product ranking comprises:
updating a candidate product evaluation state based at least on the at least one question and the answer, the candidate product evaluation state including an expected probability for each of the plurality of candidate products.
4. The method of claim 3, wherein the predetermined condition comprises:
the expected probability of the candidate product is above a threshold.
5. The method of claim 1, further comprising:
adding the at least one question and the answer to a session record for the session, the session record including historical questions and historical answers associated with the product recommendation in the session.
6. The method of claim 5, wherein the generating a recommendation reason comprises:
determining an expected probability increase that each historical question in the session record produces for the at least one recommended product;
selecting a historical question that produces the greatest expected probability increase; and
generating the recommendation reason based at least on the selected historical question and the corresponding answer.
7. The method of claim 5, wherein the generating a recommendation reason comprises:
generating, by a recommendation reason generation model, the recommendation reason based on at least one of the attribute selected by the historical answer, the attribute of the recommended product, and the description of the recommended product.
8. The method of claim 2, further comprising:
in response to determining that the at least one recommended product does not exist, performing question ranking for a plurality of candidate questions based at least on a result of the product ranking; and
selecting a next question to be presented based on the results of the question ranking.
9. The method of claim 8, wherein the question ranking is performed by entropy-based ranking or policy-based reinforcement learning ranking.
10. The method of claim 8, wherein the plurality of candidate questions are predetermined based at least on attributes of the plurality of candidate products.
11. The method of claim 1, wherein the at least one question includes one or more options and the answer includes a selection for the one or more options.
12. The method of claim 1, wherein the answer comprises a natural language sentence, and the determining whether at least one recommended product exists comprises:
determining one or more relevant questions and corresponding answers corresponding to the natural language sentence; and
determining whether the at least one recommended product is present based at least on the one or more related questions and corresponding answers.
13. An apparatus for providing an explanatory product recommendation in a session, comprising:
a question providing module for providing at least one question associated with the product recommendation;
an answer receiving module for receiving an answer to the at least one question;
a recommended product determination module to determine whether at least one recommended product exists based at least on the at least one question and the answer;
a reason for recommendation generation module to generate a reason for recommendation for the at least one recommended product in response to determining that the at least one recommended product exists; and
a response providing module for providing a response comprising the product information of the at least one recommended product and the recommendation reason.
14. The apparatus of claim 13, wherein the recommended product determination module is to:
performing product ranking on a plurality of candidate products based at least on the at least one question and the answer; and
determining whether there is at least one candidate product among the plurality of candidate products that satisfies a predetermined condition based on a result of the product ranking.
15. The apparatus of claim 14, wherein the performing product ranking comprises:
updating a candidate product evaluation state based at least on the at least one question and the answer, the candidate product evaluation state including an expected probability for each of the plurality of candidate products.
16. The apparatus of claim 13, further comprising:
a session record adding module to add the at least one question and the answer to a session record of the session, the session record including historical questions and historical answers associated with the product recommendation in the session.
17. The apparatus of claim 16, wherein the referral reason generation module is to:
determining an expected probability increase that each historical question in the session record produces for the at least one recommended product;
selecting a historical question that produces the greatest expected probability increase; and
generating the recommendation reason based at least on the selected historical question and the corresponding answer.
18. The apparatus of claim 16, wherein the referral reason generation module is to:
generating, by a recommendation reason generation model, the recommendation reason based on at least one of the attribute selected by the historical answer, the attribute of the recommended product, and the description of the recommended product.
19. The apparatus of claim 14, further comprising a question selection module to:
in response to determining that the at least one recommended product does not exist, performing question ranking for a plurality of candidate questions based at least on a result of the product ranking; and
selecting a next question to be presented based on the results of the question ranking.
20. An apparatus for providing an explanatory product recommendation in a session, comprising:
at least one processor; and
a memory storing computer-executable instructions that, when executed, cause the at least one processor to:
providing at least one question associated with the product recommendation,
receiving an answer to the at least one question,
determining whether at least one recommended product exists based at least on the at least one question and the answer,
in response to determining that the at least one recommended product exists, generating a reason for recommendation for the at least one recommended product, an
Providing a response including product information for the at least one recommended product and the reason for the recommendation.
CN201910941459.4A 2019-09-30 2019-09-30 Providing explanatory product recommendations in a session Pending CN112581203A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910941459.4A CN112581203A (en) 2019-09-30 2019-09-30 Providing explanatory product recommendations in a session
PCT/US2020/038298 WO2021066903A1 (en) 2019-09-30 2020-06-18 Providing explainable product recommendation in a session

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910941459.4A CN112581203A (en) 2019-09-30 2019-09-30 Providing explanatory product recommendations in a session

Publications (1)

Publication Number Publication Date
CN112581203A true CN112581203A (en) 2021-03-30

Family

ID=71465471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910941459.4A Pending CN112581203A (en) 2019-09-30 2019-09-30 Providing explanatory product recommendations in a session

Country Status (2)

Country Link
CN (1) CN112581203A (en)
WO (1) WO2021066903A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113468420A (en) * 2021-06-29 2021-10-01 杭州摸象大数据科技有限公司 Method and system for recommending products

Families Citing this family (3)

Publication number Priority date Publication date Assignee Title
CN113822742B (en) * 2021-09-18 2023-05-12 电子科技大学 Recommendation method based on self-attention mechanism
CN113850643B (en) * 2021-09-18 2024-09-17 中国平安财产保险股份有限公司 Product recommendation method and device, electronic equipment and readable storage medium
CN114862527A (en) * 2022-06-17 2022-08-05 阿里巴巴(中国)有限公司 Object recommendation method and device

Citations (4)

Publication number Priority date Publication date Assignee Title
US20090132905A1 (en) * 2005-04-01 2009-05-21 Masaaki Hoshino Information processing system, method, and program
CN106651544A (en) * 2015-12-31 2017-05-10 Tcl集团股份有限公司 Conversational recommendation system for minimum user interaction
US20180005293A1 (en) * 2016-06-30 2018-01-04 International Business Machines Corporation Platform for enabling personalized recommendations using intelligent dialog
CN110175227A (en) * 2019-05-10 2019-08-27 神思电子技术股份有限公司 A kind of dialogue auxiliary system based on form a team study and level reasoning

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
EP3610425A4 (en) * 2017-05-26 2020-11-04 Microsoft Technology Licensing, LLC Providing product recommendation in automated chatting
CN110088747A (en) * 2017-09-18 2019-08-02 微软技术许可有限责任公司 Body-building assists chat robots

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
US20090132905A1 (en) * 2005-04-01 2009-05-21 Masaaki Hoshino Information processing system, method, and program
CN106651544A (en) * 2015-12-31 2017-05-10 Tcl集团股份有限公司 Conversational recommendation system for minimum user interaction
US20180005293A1 (en) * 2016-06-30 2018-01-04 International Business Machines Corporation Platform for enabling personalized recommendations using intelligent dialog
CN110175227A (en) * 2019-05-10 2019-08-27 神思电子技术股份有限公司 A kind of dialogue auxiliary system based on form a team study and level reasoning

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN113468420A (en) * 2021-06-29 2021-10-01 杭州摸象大数据科技有限公司 Method and system for recommending products
CN113468420B (en) * 2021-06-29 2024-04-05 杭州摸象大数据科技有限公司 Product recommendation method and system

Also Published As

Publication number Publication date
WO2021066903A1 (en) 2021-04-08

Similar Documents

Publication Publication Date Title
CN111602147B (en) Machine learning model based on non-local neural network
US11249774B2 (en) Realtime bandwidth-based communication for assistant systems
US20220156465A1 (en) Dialog Session Override Policies for Assistant Systems
CN110121706B (en) Providing responses in a conversation
US9892414B1 (en) Method, medium, and system for responding to customer requests with state tracking
US11810337B2 (en) Providing emotional care in a session
US20190124020A1 (en) Chatbot Skills Systems And Methods
WO2021093821A1 (en) Intelligent assistant evaluation and recommendation methods, system, terminal, and readable storage medium
US11729120B2 (en) Generating responses in automated chatting
US10885529B2 (en) Automated upsells in customer conversations
CN112581203A (en) Providing explanatory product recommendations in a session
JP7488871B2 (en) Dialogue recommendation method, device, electronic device, storage medium, and computer program
US10891539B1 (en) Evaluating content on social media networks
CN110209897A (en) Intelligent dialogue method, apparatus, storage medium and equipment
US11886473B2 (en) Intent identification for agent matching by assistant systems
US20240185103A1 (en) Reinforcement Learning-Based Semantic Method and System
CN113641807B (en) Training method, device, equipment and storage medium of dialogue recommendation model
WO2024187713A1 (en) 5g message pushing method and apparatus based on multi-round session intention recognition
CN113407677A (en) Method, apparatus, device and storage medium for evaluating quality of consultation session
US10963743B2 (en) Machine learning with small data sets
US20220075960A1 (en) Interactive Communication System with Natural Language Adaptive Components
CN113392213A (en) Event extraction method, electronic device and storage device
US10031970B1 (en) Search engine optimization in social question and answer systems
CN118159961A (en) Dialogue processor based on hybrid converter
CN113360769A (en) Information query method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination