Embodiment
The embodiments of the present application are described in detail below. It should be noted that the embodiments described herein are for illustration only and do not limit the present application.
For the display of a data object, selecting suitable display keywords is crucial. A seller user's data object can be displayed to a search user only when it is relevant to the query information the search user enters, and the search user will click the displayed data object, giving the seller user a transaction opportunity, only when the displayed data object matches the search user's intent. Good relevance between the query information entered by the search user and the data object is therefore the main criterion for judging the recommendation effect.
Fig. 2 is a schematic diagram of the operating environment of the present application. A display keyword recommendation system 1 is provided with a recommendation library 2, which stores historical data objects for which display keywords have already been recommended in the offline mode, together with the corresponding display keywords; each historical data object is stored in association with its display keywords. The display keyword recommendation system 1 may consist of one or more servers. Preferably, it consists of multiple servers, for example an online server 11 and an offline server 12, which perform the online steps and the offline steps respectively; for example, the historical data objects and corresponding display keywords stored in the recommendation library 2 are obtained by the offline server 12 in the offline mode. In this way, the offline steps performed by the offline server 12 do not affect the online steps performed by the online server 11, so online processing is faster and more efficient. The recommendation library 2 may be located on the online server 11 or on a separate server.
Fig. 3 is a flowchart of Embodiment 1 of the display keyword recommendation method for a data object of the present application. The implementation of Embodiment 1 is described in detail below with reference to Fig. 2 and Fig. 3. The method of Embodiment 1 comprises the following steps:
Step 101: the online server 11 receives a current data object for which display keywords are to be recommended. For example, the online server 11 may receive a data object entered by a maintainer of the recommendation system through an interface device such as a keyboard or touch screen.
Step 102: the online server 11 determines whether the recommendation library 2 contains a related historical data object that is related to the current data object. If it does, step 103 is performed; if it does not, step 104 is performed.
In step 102, if the content of the current data object comprises a single word, whether the recommendation library contains a related historical data object can be determined by checking whether the library contains a historical data object that includes that word. If the content comprises multiple words, the determination can be made from the text similarity between the current data object and the data objects in the recommendation library, or by clustering. Specific implementations are described later.
Step 103: the online server 11 obtains from the recommendation library 2 the display keywords corresponding to the related historical data object, and determines the display keywords corresponding to the current data object based on them.
Step 104: the online server 11 obtains the display keywords corresponding to the current data object from the query log in the online mode.
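The branch between steps 102, 103, and 104 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the in-memory dictionary standing in for the recommendation library, the function names, and the toy shared-word relatedness test are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for recommendation library 2:
# historical data object -> display keywords obtained offline.
RecommendationLibrary = Dict[str, List[str]]


def recommend_display_keywords(
    current_object: str,
    library: RecommendationLibrary,
    find_related: Callable[[str, RecommendationLibrary], Optional[str]],
    online_fallback: Callable[[str], List[str]],
) -> List[str]:
    """Use the library when a related historical object exists, else go online."""
    related = find_related(current_object, library)   # step 102
    if related is not None:
        return list(library[related])                 # step 103
    return online_fallback(current_object)            # step 104


def find_by_shared_word(current_object: str, library: RecommendationLibrary) -> Optional[str]:
    """Toy relatedness test (assumption): first historical object sharing any word."""
    words = set(current_object.lower().split())
    for historical in library:
        if words & set(historical.lower().split()):
            return historical
    return None


library = {"nokia mobile phone": ["nokia", "mobile phone"]}
fallback = lambda obj: ["<online:" + obj + ">"]  # placeholder for the online path

hit = recommend_display_keywords("red mobile phone", library, find_by_shared_word, fallback)
miss = recommend_display_keywords("garden hose", library, find_by_shared_word, fallback)
```

Passing the relatedness test and the online fallback as parameters keeps the sketch neutral about which of the step 102 and step 104 variants described later is used.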
In the method of Embodiment 1, as many display keywords as possible are expanded in advance, in the offline mode, for the historical data objects in the recommendation library. These display keywords cover as many relevant query words in the query log as possible, so their coverage of the query log is high. When display keywords are recommended for the current data object, the related historical data object is found directly in the recommendation library, and display keywords are recommended for the current data object based on those of the related historical data object. In this way, the historical data objects are expanded offline in advance against the query log to obtain representative and comparatively comprehensive display keywords, and the current data object is associated with a historical data object. Each lookup of display keywords therefore responds immediately without traversing the full query log, while coverage of the query log improves and the probability of missing useful query words decreases.
Moreover, because the display keywords in the recommendation library are obtained in advance in the offline mode, determining the display keywords for the current data object from those of the related historical data object is fast and requires little data processing, which fully satisfies the immediacy requirement.
If the recommendation library contains no related historical data object, the display keywords corresponding to the current data object are obtained from the query log in the online mode, which still satisfies the immediacy requirement of the display keyword recommendation service.
Thus, by combining online and offline processing, the method of the present application balances coverage of the query log against the immediacy requirement of display keyword recommendation.
The method of the present application is preferably applied in a pay-for-performance system. With this method, the recommended display keywords increase the chance that the data object is displayed.
The embodiment shown in Fig. 3 may further include an offline step 105. In step 105, the offline server 12 obtains the display keywords corresponding to the current data object from the query log in the offline mode, and adds the current data object and its display keywords to the recommendation library. To distinguish them, the offline step is drawn with dashed lines in Fig. 3 and the online steps with solid lines.
The offline step 105 may be performed periodically, for example once every fixed period (for example, every two days). The data objects received by the online server 11 may be sent to the offline server 12, which periodically obtains the corresponding display keywords in the offline mode.
In the above embodiment, the display keywords corresponding to the current data object are obtained from the query log in the offline mode, and the current data object and its display keywords are added to the recommendation library. This accumulates data in the recommendation library for use in subsequent recommendation.
The specific implementation of each step in the above embodiment is described in detail below, starting with step 103.
To let the recommended display keywords meet the needs of different seller users (for example, some seller users want their data objects to receive more clicks, while others care about the price of the display keywords), and thus to customize and personalize display keyword recommendation, step 103 may determine the display keywords corresponding to the current data object based on both the display keywords of the related historical data object and a policy library.
The policy library may include at least one of the following scoring sub-strategy plug-ins: a plug-in for a strategy based on click-through rate (CTR), a plug-in for a strategy based on display keyword price, a plug-in for a strategy based on relevance, and a plug-in for a strategy based on the number of users who have purchased the display keyword. Of course, the strategy plug-ins in the policy library are not limited to those listed; other strategy plug-ins can be added, for example ones based on competition intensity or release time period.
In addition, to make the display keywords recommended for the current data object (for example, product information) more accurate and more likely to get the product information displayed, step 103 may treat the display keywords of the related historical data object obtained from the recommendation library as candidate display keywords and then select the final display keywords from among them.
Specifically, step 103 may comprise:
Step 1031: obtain from the recommendation library the display keywords corresponding to the related historical data object as first candidate display keywords.
Step 1032: obtain a score for each first candidate display keyword based on at least one dynamically loaded scoring sub-strategy plug-in.
Step 1033: sort the first candidate display keywords by score.
Step 1034: select the top M first candidate display keywords in the sorted order as the display keywords corresponding to the current data object, where M is a natural number greater than 1.
Steps 1031 to 1034 score and sort the first candidate display keywords selected from the recommendation library, and select the M highest-ranked, highest-quality candidates as the display keywords corresponding to the current data object.
In the embodiments of the present application, a strategy is a rule or method for scoring or selection. Strategies can be defined in advance and loaded dynamically as plug-ins, and these plug-ins can be included in the policy library. When the display keyword recommendation system needs a plug-in, it can, for example, invoke it through the libdl library in a Linux server environment, so that the strategy plug-in is loaded dynamically in the form of a dynamic library. Strategies can also be combined. Specifically, step 1032 may comprise: obtaining, based on at least one dynamically loaded scoring sub-strategy plug-in, the score of each first candidate display keyword under each scoring sub-strategy; and obtaining, from those scores and the weight of each scoring sub-strategy, a composite score that serves as the score of the first candidate display keyword.
The composite score can be expressed as formula (1):
$$\mathrm{score} = F(\mathrm{score}_1, \mathrm{score}_2, \ldots, \mathrm{score}_n) \tag{1}$$
In formula (1), $\mathrm{score}$ is the composite score of a first candidate display keyword; $\mathrm{score}_1, \mathrm{score}_2, \ldots, \mathrm{score}_n$ are its scores under the individual scoring sub-strategies; and $F$ is the computation that combines those scores with their weights. Different scoring sub-strategies can have different weights for different seller users. For example, for seller users who prefer that their data objects be clicked more easily, the weight of the CTR-based strategy can be set higher.
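Steps 1032 to 1034 can be illustrated with the sketch below. Formula (1) leaves $F$ unspecified, so the sketch assumes that $F$ is a plain weighted sum; the plug-in names, the toy CTR and price tables, and the weight values are hypothetical and only serve the illustration.

```python
from typing import Callable, Dict, List

ScoreFn = Callable[[str], float]


def composite_score(keyword: str, plugins: Dict[str, ScoreFn], weights: Dict[str, float]) -> float:
    """Assumed form of F in formula (1): weighted sum of the sub-strategy scores."""
    return sum(weights[name] * fn(keyword) for name, fn in plugins.items())


def top_m(candidates: List[str], plugins: Dict[str, ScoreFn],
          weights: Dict[str, float], m: int) -> List[str]:
    """Steps 1033-1034: sort by composite score (descending) and keep the first M."""
    ranked = sorted(candidates,
                    key=lambda k: composite_score(k, plugins, weights),
                    reverse=True)
    return ranked[:m]


# Toy per-keyword statistics (made up for the example).
ctr = {"mp3 player": 0.08, "red mp3": 0.02, "player": 0.05}
# Assumed normalisation: a higher price score means a cheaper keyword.
price = {"mp3 player": 0.3, "red mp3": 0.9, "player": 0.5}

plugins = {"ctr": lambda k: ctr[k], "price": lambda k: price[k]}

# A seller user who favours clicks versus one who weighs price equally.
click_seller = top_m(list(ctr), plugins, {"ctr": 10.0, "price": 0.1}, m=2)
price_seller = top_m(list(ctr), plugins, {"ctr": 1.0, "price": 1.0}, m=2)
```

Changing only the weights changes which keywords are selected, which is the personalization effect the text attributes to per-seller weights.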
In addition, the score may also take the number of displays $pv$ into account, as shown in formula (2):
In formula (2), $pv$ is the number of times the data object has been displayed for a given display keyword; $w_{pv}$ is the weight of $pv$; $\mathrm{break}$ is a smoothing factor, which can be set to a constant on the order of several hundred; $w_i$ is the weight of the $i$-th scoring sub-strategy; and $\mathrm{score}_i$ is the score under the $i$-th scoring sub-strategy.
In the embodiments of the present application, display keywords are selected using various strategies, each with a corresponding weight, so that display keywords can be recommended to different users according to different strategies. This makes display keyword recommendation more flexible, accommodates the preferences of different seller users, and achieves personalized and customized recommendation.
Step 104 is described in detail below.
In the embodiments of the present application, step 104 can be implemented using an online approach from the prior art, or in the manner described below.
In the embodiments of the present application, the terms "online" and "offline" are first of all concepts in a temporal sense. In the field of information processing, some processing has an immediacy requirement, that is, the result must be obtained within a short time; such processing is called the online mode. Other processing has no immediacy requirement and can take a relatively long time; such processing is called the offline mode. In addition, "online" and "offline" can refer to processing that differs in importance and in service deployment. An offline system only generates data and has no direct effect on how users use the website, which greatly reduces risk and safeguards website stability.
To cover the query log as completely as possible, traversing the full query log to obtain the display keywords corresponding to the current data object would of course give the best result. However, the full query log is very large; obtaining display keywords from it would incur excessive time overhead, would be hard to implement, and would make data processing too slow. For this reason, although the online mode in the embodiments of the present application also uses the query log, it recommends display keywords in a way that requires less data processing than the offline mode, so that processing speed is maintained while the query log is still covered.
Analysis shows that when query information contains words identical to those in a data object, the query information and the data object can be considered similar to some degree, and the degree of similarity is proportional to the importance and the number of the identical words.
Based on this analysis, in the embodiments of the present application, query information and the core meaning words corresponding to each piece of query information can be established in advance from the query log. A core meaning word is a head word or head phrase: the words remaining in the query information after modifiers are removed, which represent the search user's main query intent. For example, for the query information "red MP3 player", the core meaning word "MP3 player" reflects the main query intent, while "red" is a modifier. The pieces of query information and their core meaning words are organized as an inverted index dictionary, which contains mappings from the core meaning words of the query information in the query log to the pieces of query information. For example, the query information can be represented as $Q = \{[W_{i,1}, W_{i,2}, \ldots, W_{i,L}]_i\}$, where $W_{i,1}, W_{i,2}, \ldots, W_{i,L}$ are the 1st, 2nd, ..., $L$-th core meaning words of the $i$-th piece of query information $[W_{i,1}, W_{i,2}, \ldots, W_{i,L}]_i$. Specifically, a natural language processing (NLP) tool can be used to analyze the query text in the query log and annotate its core meaning words, and the inverted index dictionary is built from these core meaning words. The core meaning words may be extracted from only the title of the current data object, or from the text of the entire data object.
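The inverted index dictionary described above can be sketched as a mapping from each core meaning word to the set of query information that contains it. The sketch assumes the core meaning words have already been annotated (the NLP tool is outside its scope); the function name and the sample queries are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_inverted_index(
    annotated_queries: Iterable[Tuple[str, List[str]]]
) -> Dict[str, Set[str]]:
    """Map each core meaning word to every piece of query information annotated with it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for query, core_words in annotated_queries:
        for word in core_words:
            index[word].add(query)
    return dict(index)


# (query information, core meaning words) pairs; the annotations are assumed
# to come from an NLP tool run over the query log.
annotated = [
    ("red mp3 player", ["mp3 player"]),
    ("nokia mobile phone", ["mobile", "mobile phone"]),
    ("cheap mobile phone", ["mobile phone"]),
]

index = build_inverted_index(annotated)
```

One core meaning word can map to several pieces of query information, which is why each dictionary value is a set rather than a single entry.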
Based on the inverted index dictionary established in advance, obtaining the display keywords corresponding to the current data object from the query log in the online mode in step 104 may comprise the following steps 1041 to 1046, all performed online:
Step 1041: extract the core meaning words of the current data object. The core meaning words may be extracted from the title of the current data object or from the text of the entire data object, and the extraction can be performed with an NLP tool.
Step 1042: search the inverted index dictionary, established in advance from the query log, for the core meaning words of the current data object, and obtain the set of query information corresponding to those core meaning words.
For example, the inverted index dictionary may contain the following mapping:
| Core meaning words | Query information $Q_1$ |
| $W_{11}$ = mobile; $W_{12}$ = mobile phone | Nokia; Mobile; Phone |
In this mapping, the query information $Q_1$ contains three query words, "Nokia", "mobile", and "phone", and the core meaning words obtained with the NLP tool are "mobile" and "mobile phone".
For example, if the core meaning words of the current data object obtained with the NLP tool include "mobile", searching the inverted index dictionary retrieves the core meaning words $W_{11}$ and $W_{12}$, and from them the query information $Q_1$ corresponding to these two core meaning words.
If multiple pieces of query information are retrieved, they form a set of query information. Because each piece of query information contains several query words, the set of query information can also be regarded as a set of query words.
Step 1043: select second candidate display keywords from the set of query information corresponding to the core meaning words of the current data object, based on a dynamically loaded plug-in for a simplified screening strategy. For example, the three query words "Nokia", "mobile", and "phone" in the query information $Q_1$ above can serve as second candidate display keywords.
Step 1044: obtain a score for each second candidate display keyword based on a dynamically loaded plug-in for a simplified scoring strategy.
Step 1045: sort the second candidate display keywords by score.
Step 1046: select the top M second candidate display keywords in the sorted order as the display keywords corresponding to the current data object.
Step 1043 uses a simplified screening strategy to select the second candidate display keywords. A simplified screening strategy is one that is simpler, and requires less data processing, than the screening strategy used in the offline mode. For example, if 1000 pieces of query information are obtained from the inverted index dictionary, the simplified screening strategy may select 10 of them, and the query words in those 10 pieces of query information are used as second candidate display keywords.
The simplified scoring strategy used in step 1044 is one that is simpler, and requires less data processing, than the scoring strategy used in the offline mode. For example, a simplified scoring strategy may use a single strategy only, rather than a combination of several strategies.
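Steps 1042 to 1046 can be sketched as below, taking the core meaning words from step 1041 as input. The text does not say which simplified screening or scoring strategy is used, so this sketch assumes both are driven by query frequency: screening keeps the most frequent queries, and the single scoring strategy sums the frequency of the kept queries that contain each word. All names and numbers are hypothetical.

```python
from typing import Dict, List, Set


def online_keywords(
    core_words: List[str],
    index: Dict[str, Set[str]],
    query_freq: Dict[str, int],
    max_queries: int,
    m: int,
) -> List[str]:
    """Online expansion with simplified screening and a single scoring strategy."""
    # Step 1042: collect the query information mapped from the core meaning words.
    queries: Set[str] = set()
    for word in core_words:
        queries |= index.get(word, set())

    # Step 1043 (assumed simplified screening): keep only the most frequent queries.
    kept = sorted(queries, key=lambda q: (-query_freq.get(q, 0), q))[:max_queries]

    # Step 1044 (assumed single strategy): score each query word by the summed
    # frequency of the kept queries that contain it.
    scores: Dict[str, int] = {}
    for q in kept:
        for token in q.split():
            scores[token] = scores.get(token, 0) + query_freq.get(q, 0)

    # Steps 1045-1046: sort by score (ties broken alphabetically), keep the top M.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:m]


index = {
    "mobile": {"nokia mobile phone"},
    "mobile phone": {"nokia mobile phone", "cheap mobile phone"},
}
query_freq = {"nokia mobile phone": 50, "cheap mobile phone": 20}

result = online_keywords(["mobile phone"], index, query_freq, max_queries=2, m=3)
```

The `max_queries` cap is what bounds the online workload: lowering it trades coverage of the query log for response time.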
The offline step 105 is described in detail below. It can be implemented by the following steps 1051 to 1056, all performed offline:
Step 1051: extract the core meaning words of the current data object.
Step 1052: search the inverted index dictionary for the core meaning words of the current data object, and obtain the set of query information corresponding to those core meaning words.
Step 1053: select third candidate display keywords from the set of query information corresponding to the core meaning words of the current data object, based on a dynamically loaded plug-in for an offline screening strategy.
Step 1054: obtain, based on dynamically loaded plug-ins for different offline scoring strategies, the score of each third candidate display keyword under each offline scoring strategy.
Step 1055: under each offline scoring strategy, sort the third candidate display keywords by score.
Step 1056: under each offline scoring strategy, select the top P third candidate display keywords in the sorted order as the display keywords corresponding to the current data object, where P is a natural number greater than 1.
Steps 1051 to 1056 differ from steps 1041 to 1046 as follows: the former use screening and scoring strategies that require more data processing and draw on more of the query information in the inverted index dictionary, while the latter use screening and scoring strategies that require less data processing and draw on less of that query information.
In addition, step 1051 may also obtain synonyms or near-synonyms of the core meaning words of the current data object. Step 1052 may then search the inverted index dictionary for both the core meaning words and their synonyms or near-synonyms, and obtain the set of query information corresponding to all of them. Step 1053 may select the third candidate display keywords from that set, based on the dynamically loaded plug-in for the offline screening strategy.
Obtaining synonyms or near-synonyms of the core meaning words allows more third candidate display keywords to be retrieved, reduces the probability of missing useful candidates, and improves coverage of the query log.
Of course, the online steps 1041 to 1043 could also obtain more second candidate display keywords by using synonyms or near-synonyms of the core meaning words. However, given the immediacy requirement of the online steps, the step of obtaining synonyms or near-synonyms is preferably omitted there.
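The effect of the synonym expansion in steps 1051 and 1052 can be sketched as follows. The synonym table is a hypothetical input (the text does not say how synonyms or near-synonyms are obtained), and the function names are assumptions.

```python
from typing import Dict, List, Set


def expand_core_words(core_words: List[str], synonyms: Dict[str, List[str]]) -> List[str]:
    """Step 1051 variant: append synonyms or near-synonyms, preserving order, no duplicates."""
    expanded: List[str] = []
    for word in core_words:
        for term in [word, *synonyms.get(word, [])]:
            if term not in expanded:
                expanded.append(term)
    return expanded


def collect_queries(terms: List[str], index: Dict[str, Set[str]]) -> Set[str]:
    """Step 1052: union of the query information mapped from every search term."""
    result: Set[str] = set()
    for term in terms:
        result |= index.get(term, set())
    return result


# Hypothetical synonym table and inverted index dictionary.
synonyms = {"mobile phone": ["cell phone"]}
index = {
    "mobile phone": {"nokia mobile phone"},
    "cell phone": {"cheap cell phone"},
}

without_synonyms = collect_queries(["mobile phone"], index)
with_synonyms = collect_queries(expand_core_words(["mobile phone"], synonyms), index)
```

In this toy data the expanded search retrieves one additional piece of query information, which is the coverage gain the text describes; the cost is one extra index lookup per synonym.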
In the embodiments of the present application, the sequence of extracting the core meaning words of the current data object, obtaining from the inverted index dictionary the set of query information corresponding to those core meaning words, and selecting candidate display keywords from that set is called the "expansion process". "Expansion" mainly means extending the meaning or form of expression of the core meaning words in the data object to generate candidate display keywords that better represent the data object. The expansion process yields fewer candidate display keywords than there are query words in the full query log. Because these candidates are obtained by searching the inverted index dictionary, they have local text relevance to the current data object (that is, the candidate display keywords share some identical or similar words with the current data object). This ensures that the display keywords selected from these candidates are well correlated with the data object and better reflect the search or purchase intent of the search user.
Both the online mode of step 104 and the offline mode of step 105 involve the expansion process. Because the online mode has a strict immediacy requirement, it performs a comparatively simplified expansion. The offline mode has a loose immediacy requirement and performs a fuller, more complex expansion, so the display keywords obtained offline have higher coverage of the query log.
The offline screening strategy in step 1053 can be more complex than the simplified screening strategy in step 1043, so that more third candidate display keywords are selected.
The offline scoring strategy in step 1054 can likewise be more complex than the simplified scoring strategy in step 1044, so that the scores better reflect the priorities of different seller users; for example, an offline scoring strategy may emphasize CTR.
In addition, the value of P can be greater than M; that is, the offline mode can select more display keywords from the third candidate display keywords as final display keywords to add to the recommendation library.
The display keywords obtained offline under the different offline strategies, together with the corresponding historical data objects, can be added to the recommendation library. The display keywords and corresponding historical data objects under the various strategies in the recommendation library can be as shown in Table 1.
Table 1
In Table 1, strategy 1, strategy 2, ..., strategy n refer to the different offline scoring strategies. The data in the recommendation library can be organized as key-value pairs, where the key is the historical data object and the value is the display keywords corresponding to that data object. Organizing the data as key-value pairs allows faster lookup.
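The key-value organization described above can be sketched with a nested dictionary keyed first by historical data object and then by offline scoring strategy. The nesting, the helper names, and the sample entries are assumptions for illustration; the source only states that the key is the historical data object and the value is its display keywords.

```python
from typing import Dict, List

# historical data object -> offline scoring strategy -> display keywords
Library = Dict[str, Dict[str, List[str]]]


def add_offline_result(library: Library, data_object: str,
                       strategy: str, keywords: List[str]) -> None:
    """Store the display keywords obtained offline under one scoring strategy."""
    library.setdefault(data_object, {})[strategy] = list(keywords)


def lookup(library: Library, data_object: str, strategy: str) -> List[str]:
    """Constant-time lookup by key; returns an empty list when nothing is stored."""
    return library.get(data_object, {}).get(strategy, [])


library: Library = {}
add_offline_result(library, "nokia mobile phone", "strategy 1", ["nokia", "mobile phone"])
add_offline_result(library, "nokia mobile phone", "strategy 2", ["phone", "nokia phone"])

by_strategy_1 = lookup(library, "nokia mobile phone", "strategy 1")
missing = lookup(library, "garden hose", "strategy 1")
```

A hash-based dictionary gives average constant-time lookup by key, which matches the stated motivation of faster querying.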
With the offline step described, looking back at step 103 gives a clearer understanding of the technical solution of the present application. The offline step adds data objects and their display keywords under different strategies to the recommendation library, so that the library accumulates data. In subsequent recommendation, after a data object is received, steps 102 and 103 can obtain display keywords quickly and conveniently by querying the recommendation library.
Step 102 is described below.
In step 102, whether the recommendation library contains a related data object that is related to the current data object can be determined by evaluating text similarity.
Specifically, step 102 may comprise: obtaining the category of the current data object; evaluating the text similarity between the current data object and the data objects in the recommendation library that belong to the same category as the current data object; if any similarity is greater than a first predetermined threshold, determining that the recommendation library contains a related historical data object that is related to the current data object; and if no similarity is greater than the first predetermined threshold, determining that the recommendation library contains no such related historical data object. The first predetermined threshold can be set flexibly according to data processing needs.
In the embodiments of the present application, the data object is text data, and a text similar to the current data object can be searched for among the existing texts of the same category in the recommendation library, which can be expressed as formula (3):
$$\mathrm{offer}_{\mathrm{candidate}} = \arg\max_{o_c \in O_c} \mathrm{sim}(o_c, q_c) \tag{3}$$
In formula (3), $\mathrm{sim}(o_c, q_c)$ is the text similarity measure between a historical data object $o_c$ in the recommendation library and the current data object $q_c$; the subscript $c$ denotes the category to which they belong; $O_c$ is the set of all data objects in that category for which display keywords have been recommended; and $\mathrm{offer}_{\mathrm{candidate}}$ is the data object in the recommendation library most similar to the current data object $q_c$. $\mathrm{sim}(o_c, q_c)$ can be computed with the vector space model method or the Simhash method. $\arg\max$ denotes the data object that yields the maximum similarity.
After $\mathrm{offer}_{\mathrm{candidate}}$ is obtained, it can be determined whether the text similarity between $\mathrm{offer}_{\mathrm{candidate}}$ and the current data object is greater than the first predetermined threshold. If it is, the recommendation library is determined to contain a historical data object related to the current data object, and $\mathrm{offer}_{\mathrm{candidate}}$ is taken as that related historical data object. If the similarity is less than or equal to the first predetermined threshold, the recommendation library is determined to contain no historical data object related to the current data object.
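The within-category search and threshold test of step 102 can be sketched as follows. To stay short and self-contained, the sketch uses word-level Jaccard similarity as a stand-in for $\mathrm{sim}$; this is an assumption, since the text names the vector space model and Simhash methods instead. The category table and threshold values are likewise hypothetical.

```python
from typing import Callable, Dict, List, Optional


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity (assumption): shared words divided by all distinct words."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def find_related(
    current: str,
    category: str,
    library_by_category: Dict[str, List[str]],
    threshold: float,
    sim: Callable[[str, str], float] = jaccard,
) -> Optional[str]:
    """Return the most similar same-category object if it clears the threshold."""
    candidates = library_by_category.get(category, [])
    if not candidates:
        return None
    # arg max over the historical data objects of the same category.
    best = max(candidates, key=lambda o: sim(o, current))
    # Strictly greater than the first predetermined threshold, as in the text.
    return best if sim(best, current) > threshold else None


library_by_category = {"phones": ["nokia mobile phone", "phone case leather"]}

related = find_related("red nokia mobile phone", "phones", library_by_category, threshold=0.5)
unrelated = find_related("red nokia mobile phone", "phones", library_by_category, threshold=0.8)
```

Restricting the search to one category keeps the number of similarity computations small, which matters because this check runs on the online path.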
(i) Vector space model (VSM) method
Specifically, VSM can be used to analyze the data objects in the recommendation library and the current data object to obtain feature vectors. Comparing the feature vector of the current data object with the feature vectors of the historical data objects in the recommendation library gives the text similarity between the current data object and those historical data objects.
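A minimal sketch of the VSM comparison is given below. The text does not specify the term weighting or the vector comparison, so the sketch assumes raw term-frequency vectors compared by cosine similarity, which is one common choice; the function names are illustrative.

```python
import math
from collections import Counter


def tf_vector(text: str) -> Counter:
    """Feature vector (assumption): raw term frequencies of whitespace-separated words."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two sparse term vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


current = tf_vector("red mobile phone")
historical = tf_vector("nokia mobile phone")

similarity = cosine(current, historical)  # two shared words out of three each
```

A production system would more likely weight terms (for example with TF-IDF) so that rare, informative words count more than common ones; that refinement is omitted here.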
(ii) Simhash method
Specifically, the Simhash method can be used: a hash algorithm maps the current data object and each historical data object in the recommendation library to a bit string of a fixed length, which serves as a fingerprint. The text similarity between the current data object and a historical data object in the recommendation library is then obtained from the distance between their fingerprints. This can be expressed by formulas (4) and (5):
$$fp = [b_1, b_2, \ldots, b_N] \tag{4}$$
$$\mathrm{sim}(fp_1, fp_2) = \mathrm{hamming\_distance}(fp_1, fp_2) \tag{5}$$
Formula (4) is the fingerprint expression of a text: $b_1, b_2, \ldots, b_N$ are its $N$ bits, with $b_i = 0$ or $1$ for $1 \le i \le N$, and $fp$ is a bit string of length $N$. Formula (5) gives the similarity between the fingerprints $fp_1$ and $fp_2$ of two texts, where $\mathrm{hamming\_distance}(fp_1, fp_2)$ is the Hamming distance between $fp_1$ and $fp_2$. A smaller Hamming distance indicates that the two texts are more similar.
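Formulas (4) and (5) can be illustrated with the sketch below. It follows the usual unweighted Simhash construction and assumes MD5 (truncated to 64 bits) as the per-token hash purely for reproducibility; the source does not name a hash function or a fingerprint length, so both are assumptions.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Fingerprint of formula (4), returned as an integer of `bits` bits (bits <= 64)."""
    counts = [0] * bits
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash of the token
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:  # bit b_i is 1 when the positive votes win
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(fp1: int, fp2: int) -> int:
    """Formula (5): number of bit positions in which two fingerprints differ."""
    return bin(fp1 ^ fp2).count("1")


fp_a = simhash("nokia mobile phone")
fp_b = simhash("nokia mobile phone black")
distance = hamming_distance(fp_a, fp_b)  # between 0 and 64
```

Because formula (5) is a distance, a relatedness test built on it would compare against a maximum distance, or first convert the distance into a similarity such as `1 - distance / bits`.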
In step 102, the determination can also be made by clustering. Specifically, step 102 may comprise: evaluating the similarity between the current data object and the center historical data object of each class in the recommendation library; if the maximum similarity between the current data object and the center historical data objects is greater than or equal to a second predetermined threshold, determining that the recommendation library contains a related historical data object that is related to the current data object, taking the class of the center historical data object with the maximum similarity as the class of the current data object, and taking the historical data object in that class with the maximum similarity to the current data object as the related historical data object; and if the maximum similarity between the current data object and the center historical data objects is less than the second predetermined threshold, determining that the recommendation library contains no related historical data object. The second predetermined threshold can be set flexibly according to data processing needs.
Specifically, by clustering the historical data objects in the recommendation library, similar historical data objects can be grouped into one category, yielding n categories $Z_1, Z_2, \ldots, Z_n$. $Z_i$ ($1 \le i \le n$) is one of these categories and contains a plurality of historical data objects with similar features. The category can be expressed as $\{o_{i,1}, o_{i,2}, \ldots, o_{i,m}\}$, where $o_{i,1}, o_{i,2}, \ldots, o_{i,m}$ are the historical data objects belonging to category $Z_i$, and the center historical data object of the category can be denoted $o_{i,c}$. For an input current data object, the similarity between the current data object and the center historical data object of each category in the recommendation library can be determined. If the maximum of the similarities between the current data object and the center historical data objects (for example, the similarity between the current data object and the center historical data object of category $Z_j$ is the greatest) is greater than or equal to the second predetermined threshold, it is determined that the recommendation library contains a historical data object related to the current data object. Moreover, the category of the current data object can be determined as $Z_j$, and the historical data object in category $Z_j$ with the greatest similarity to the current data object is taken as the relevant historical data object.
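The cluster-based judgment described above can be sketched as follows. The layout of `categories`, the function names, and the Jaccard similarity over whitespace-separated tokens are all hypothetical choices for illustration; any similarity measure (for example one derived from feature vectors or fingerprints) could be passed in instead.

```python
def jaccard(a, b):
    """Illustrative similarity: Jaccard overlap of whitespace-separated tokens."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def find_relevant_object(current, categories, similarity, threshold):
    """Return (category, relevant historical object), or (None, None).

    categories maps a category name Z_i to a dict with the center object
    o_{i,c} under "center" and the member objects under "members".
    """
    if not categories:
        return None, None
    # Compare only against the center object of each category.
    best = max(categories,
               key=lambda z: similarity(current, categories[z]["center"]))
    if similarity(current, categories[best]["center"]) < threshold:
        return None, None  # no relevant historical data object
    # Within the chosen category, pick the most similar member.
    relevant = max(categories[best]["members"],
                   key=lambda o: similarity(current, o))
    return best, relevant
```

With this arrangement the current data object is compared against n center objects plus the members of a single category, rather than against every historical data object in the recommendation library.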
With the clustering approach, the historical data object in the category of the current data object that has the greatest similarity to the current data object is taken as the relevant historical data object, and the display keywords of that relevant historical data object are recommended as candidate display keywords. The amount of data to be processed is small and the speed is high; and because the relevant historical data object has the greatest similarity to the current data object, the recommended candidate display keywords are more accurate.
It should be noted that the feature vector, fingerprint, or category of each data object in the recommendation library can be obtained in advance offline and stored in the recommendation library. When the current data object is received, only the feature vector, fingerprint, or category of the current data object needs to be calculated; the feature vectors, fingerprints, or categories of all data objects in the recommendation library need not be calculated online, which improves the speed of data processing.
Fig. 4 shows a flowchart of embodiment two of the display keyword recommendation method for data objects of the present application, comprising:
Step 201: the online server receives a current data object for which display keywords are to be recommended.
Step 202: the online server preprocesses the received current data object. The preprocessing may include word segmentation, stemming, and similar processing of the data object.
Step 203: the online server obtains the feature vector, fingerprint, category, etc. of the current data object, using a method similar to that described above for step 102.
Step 204: the online server determines whether the recommendation library contains a relevant historical data object related to the current data object. If so, step 205 is executed; if not, step 206 is executed.
In this embodiment, the recommendation library may include a first dictionary and a second dictionary. The first dictionary stores the historical data objects for which display keywords have been recommended under the various strategies established offline, together with the corresponding display keywords; the key is the historical data object and the value is the corresponding display keywords. The first dictionary may include multiple sub-dictionaries, each of which may contain the historical data objects for which display keywords have been recommended under one strategy and the corresponding display keywords. The second dictionary stores the feature vector, fingerprint, category, etc. of each historical data object; the key in the second dictionary is the historical data object, and the value is its feature vector, fingerprint, category, etc.
In step 204, whether the first dictionary contains a relevant historical data object related to the current data object can be determined based on the feature vector or fingerprint of the current data object and the feature vectors or fingerprints of the historical data objects in the second dictionary. Alternatively, the determination can be made by the clustering approach described above.
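The two dictionaries and the fingerprint-based judgment of step 204 can be sketched as follows. The object identifiers, strategy names, the `max_distance` cut-off on Hamming distance, and the merging of keywords across sub-dictionaries are hypothetical and chosen only for illustration.

```python
def hamming_distance(fp1, fp2):
    """Number of differing bits between two integer fingerprints."""
    return bin(fp1 ^ fp2).count("1")


# First dictionary: one sub-dictionary per offline strategy,
# key = historical data object, value = its display keywords.
first_dictionary = {
    "strategy_a": {"obj-1": ["mp3 player", "music player"],
                   "obj-2": ["running shoes"]},
    "strategy_b": {"obj-1": ["cheap mp3"]},
}

# Second dictionary: key = historical data object, value = precomputed features.
second_dictionary = {
    "obj-1": {"fingerprint": 0b10110010},
    "obj-2": {"fingerprint": 0b01001101},
}


def find_relevant(current_fp, second_dictionary, max_distance):
    """Return the closest historical object, or None if none is close enough."""
    best_id, best_dist = None, None
    for obj_id, features in second_dictionary.items():
        dist = hamming_distance(current_fp, features["fingerprint"])
        if best_dist is None or dist < best_dist:
            best_id, best_dist = obj_id, dist
    if best_id is not None and best_dist <= max_distance:
        return best_id
    return None


def candidate_keywords(obj_id, first_dictionary):
    """Collect the object's keywords from every sub-dictionary, without duplicates."""
    keywords = []
    for sub_dictionary in first_dictionary.values():
        for keyword in sub_dictionary.get(obj_id, []):
            if keyword not in keywords:
                keywords.append(keyword)
    return keywords
```

Keeping the features in a separate second dictionary means the judgment in step 204 never has to touch the keyword lists until a relevant object has actually been found.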
Step 205: the online server obtains from the first dictionary the display keywords corresponding to the relevant historical data object as first candidate display keywords, obtains scores for the first candidate display keywords based on dynamically loaded plug-ins of the scoring sub-strategies, sorts the first candidate display keywords by score, and then selects the top M first candidate display keywords in the sorted order as the display keywords corresponding to the current data object.
Step 207 is executed after step 205.
Step 206: the online server obtains the display keywords corresponding to the current data object online, according to the query log.
Sub-steps 206a-206f of step 206 are respectively identical to sub-steps 1041-1046 of step 104 in embodiment one and are not repeated here.
Step 207: the online server outputs the display keywords corresponding to the current data object.
After step 201, an offline step may also be performed. Sub-steps 208-213 of this offline step are substantially identical to steps 1051-1056 in embodiment one, respectively, and are not repeated here.
In step 212, the offline server may also add the current data object and the selected display keywords corresponding to the current data object to the first dictionary.
In the offline step, the offline server may also preprocess the received current data object (see step 214), in the same manner as in step 202.
The sub-steps of the online steps 104 and 206 in embodiment one and embodiment two may be implemented by a single function or by several functions.
For example, sub-steps 1041 and 206a (extraction of core meaning words), 1042 and 206b (obtaining the set corresponding to the core meaning words), and 1043 and 206c (selection of the third or second candidate display keywords) may be implemented by an expansion function; steps 1044 and 206d (scoring) may be implemented by a scoring function; steps 1045 and 206e (sorting) may be implemented by a sorting function; and steps 1046 and 206f (selection of the display keywords of the current data object) may be implemented by a selection function.
Implementing the different steps with these four functions, rather than with a single function, facilitates the loading of strategy plug-ins and makes the functions easy for other steps to call. For the same or similar processing, one function suffices and multiple functions need not be provided. For example, the above functions can also be called when performing the offline steps 1051-1056 and 208-213; only the strategy plug-ins dynamically loaded by the offline steps and by the online steps may differ.
In addition, each step within step 205 may also be implemented by a different function. For example, in step 205, the step of obtaining from the recommendation library the display keywords corresponding to the relevant historical data object as first candidate display keywords may be implemented by an acquisition function; the step of obtaining the scores of the first candidate display keywords based on the dynamically loaded plug-in of the first scoring strategy may be implemented by the above scoring function; and the step of selecting the top M first candidate display keywords in the sorted order as the display keywords corresponding to the current data object may be implemented by the above selection function.
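The division into an expansion function, a scoring function, a sorting function, and a selection function can be sketched as follows. The function names and signatures are hypothetical, and summing the scores of several plug-ins is an assumption; the description only states that scores are obtained from dynamically loaded scoring plug-ins.

```python
def expand(core_words, inverted_index, screen):
    """Expansion: look up each core meaning word and keep queries that pass screening."""
    candidates = []
    for word in core_words:
        for query in inverted_index.get(word, []):
            if query not in candidates and screen(query):
                candidates.append(query)
    return candidates


def score(candidates, scoring_plugins):
    """Scoring: combine the plug-in scores of each candidate (here: a plain sum)."""
    return {kw: sum(plugin(kw) for plugin in scoring_plugins) for kw in candidates}


def sort_by_score(scores):
    """Sorting: candidates in descending score order; ties keep their original order."""
    return sorted(scores, key=lambda kw: scores[kw], reverse=True)


def select(ranked, m):
    """Selection: the top m candidates become the display keywords."""
    return ranked[:m]
```

Because the screening rule and the scoring plug-ins are passed in as arguments, the same four functions can serve both the online and the offline steps, with only the supplied plug-ins differing.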
In the embodiments of the present application, multiple strategies are employed. The logics of the dynamic library plug-ins involved in the embodiments, and their functions, are shown in Table 2:
Table 2: Functions of each logic of the dynamic library plug-in
The functions in Table 2 are called in two environments, online and offline. For each strategy plug-in, scoring (SCORE) is its main function: in the SCORE logic, scoring is performed according to the different scoring strategies, and the differences in scores reflect the selection tendency of the strategy. See Fig. 5 for the application flow of the strategy plug-ins in the online and offline steps.
Dynamically loading strategy plug-ins brings the following advantages:
(1) The present application loads strategies in plug-in form, separating the main program from the strategy plug-ins and achieving loose coupling between them. Modifying a strategy does not require modifying the main program, so the system remains stable and is easy to maintain.
(2) Strategy changes are made by configuring plug-ins through a configuration file, which increases the flexibility of maintenance.
(3) Because strategies can be modified flexibly, the operations side can participate more in the strategies, and the strategies can be diversified.
In the display keyword recommendation system, a strategy is embedded into the display keyword recommendation platform as a plug-in that provides functions, by supplying a strategy dynamic library, a strategy configuration file, and strategy data file resources.
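Configuration-driven loading of strategy plug-ins can be sketched as follows. A Python module stands in for the strategy dynamic library, and a JSON string stands in for the strategy configuration file; the configuration layout, the `weight` field, and the module name `demo_price_strategy` are hypothetical. Only the name `SCORE` is taken from the description above.

```python
import importlib
import json
import sys
import types


def load_strategy_plugins(config_text):
    """Load every plug-in named in the configuration and return (weight, SCORE) pairs."""
    config = json.loads(config_text)
    plugins = []
    for entry in config["plugins"]:
        # The main program knows only the module name found in the configuration.
        module = importlib.import_module(entry["module"])
        plugins.append((entry.get("weight", 1.0), getattr(module, "SCORE")))
    return plugins


def total_score(keyword, plugins):
    """Weighted sum of the SCORE results of all loaded plug-ins (an assumption)."""
    return sum(weight * score_fn(keyword) for weight, score_fn in plugins)


# Stand-in plug-in, registered in memory so the sketch runs without extra files.
demo = types.ModuleType("demo_price_strategy")
demo.SCORE = lambda keyword: float(len(keyword.split()))
sys.modules["demo_price_strategy"] = demo

CONFIG = '{"plugins": [{"module": "demo_price_strategy", "weight": 2.0}]}'
```

Switching to a different strategy then amounts to editing the configuration text; `load_strategy_plugins` and the rest of the main program stay unchanged, which is the loose coupling described in advantage (1).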
Fig. 6 shows a schematic structural diagram of the display keyword recommendation system for data objects of the present application, which comprises a receiving module 21, a judging module 22, a first processing module 23, and a second processing module 24. The receiving module 21 is configured to receive a current data object for which display keywords are to be recommended. The judging module 22 is connected to the receiving module 21 and is configured to determine whether the recommendation library contains a relevant historical data object related to the current data object. The first processing module 23 is connected to the judging module 22 and is configured to, when the judging module 22 determines that the recommendation library contains a relevant historical data object related to the current data object, obtain from the recommendation library the display keywords corresponding to the relevant historical data object, and determine the display keywords corresponding to the current data object based on the display keywords corresponding to the relevant historical data object. The second processing module 24 is connected to the judging module 22 and is configured to obtain the display keywords corresponding to the current data object online, according to the query log.
The first processing module 23 may determine the display keywords corresponding to the current data object based on the display keywords corresponding to the relevant historical data object and a strategy library. The strategy library may include at least one of the following scoring sub-strategy plug-ins: a plug-in for a strategy based on click arrival rate, a plug-in for a strategy based on display keyword price, a plug-in for a strategy based on relevance, and a plug-in for a strategy based on the number of users purchasing the display keyword.
The first processing module 23 may comprise a first acquiring unit 231, a first scoring unit 232, a first sorting unit 233, and a first selecting unit 234. The first acquiring unit 231 is configured to obtain from the recommendation library the display keywords corresponding to the relevant historical data object as first candidate display keywords. The first scoring unit 232 is connected to the first acquiring unit 231 and is configured to obtain scores of the first candidate display keywords based on a dynamically loaded plug-in of at least one scoring sub-strategy. The first sorting unit 233 is connected to the first scoring unit 232 and is configured to sort the first candidate display keywords based on their scores. The first selecting unit 234 is connected to the first sorting unit 233 and is configured to select the top M first candidate display keywords in the sorted order as the display keywords corresponding to the current data object, where M is a natural number greater than 1.
The second processing module 24 may comprise a first extracting unit 241, a first searching unit 242, a second acquiring unit 243, a second scoring unit 244, a second sorting unit 245, and a second selecting unit 246. The first extracting unit 241 is configured to extract the core meaning words of the current data object. The first searching unit 242 is connected to the first extracting unit 241 and is configured to search for the core meaning words of the current data object in an inverted index dictionary established in advance according to the query log, obtaining the set of query information corresponding to the core meaning words of the current data object. The second acquiring unit 243 is connected to the first searching unit 242 and is configured to select second candidate display keywords from the set of query information corresponding to the core meaning words of the current data object, based on a dynamically loaded plug-in of a simplified screening strategy. The second scoring unit 244 is connected to the second acquiring unit 243 and is configured to obtain scores of the second candidate display keywords based on a dynamically loaded plug-in of a simplified scoring strategy. The second sorting unit 245 is connected to the second scoring unit 244 and is configured to sort the second candidate display keywords based on their scores. The second selecting unit 246 is connected to the second sorting unit 245 and is configured to select the top M second candidate display keywords in the sorted order as the display keywords corresponding to the current data object.
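The inverted index dictionary used by the searching unit can be sketched as follows. Treating each logged query as a whitespace-separated string, lower-casing it, and indexing every word are assumptions for illustration; the description does not specify how the index is built from the query log.

```python
from collections import defaultdict


def build_inverted_index(query_log):
    """Map each word to the set of logged queries that contain it."""
    index = defaultdict(set)
    for query in query_log:
        for word in query.lower().split():
            index[word].add(query)
    return dict(index)


def query_set(core_words, index):
    """Union of the logged queries associated with any of the core meaning words."""
    return set().union(*(index.get(word, set()) for word in core_words))
```

Since the index depends only on the query log, it can be built in advance, leaving only the dictionary lookups of `query_set` to be performed when a current data object arrives.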
The system shown in Fig. 6 may further comprise a third processing module 25. The third processing module 25 is connected to the receiving module 21 and is configured to obtain the display keywords corresponding to the current data object offline, according to the query log, and to add the current data object and the display keywords corresponding to the current data object to the recommendation library.
The third processing module 25 may comprise a second extracting unit 251, a second searching unit 252, a third acquiring unit 253, a third scoring unit 254, a third sorting unit 255, and a third selecting unit 256. The second extracting unit 251 is configured to extract the core meaning words of the current data object. The second searching unit 252 is connected to the second extracting unit 251 and is configured to search for the core meaning words of the current data object in the inverted index dictionary, obtaining the set of query information corresponding to the core meaning words of the current data object. The third acquiring unit 253 is connected to the second searching unit 252 and is configured to select third candidate display keywords from the set of query information corresponding to the core meaning words of the current data object, based on a dynamically loaded plug-in of an offline screening strategy. The third scoring unit 254 is connected to the third acquiring unit 253 and is configured to obtain, based on dynamically loaded plug-ins of different offline scoring strategies, the scores of the third candidate display keywords under each of the different offline scoring strategies. The third sorting unit 255 is connected to the third scoring unit 254 and is configured to sort the third candidate display keywords based on their scores under each of the different offline scoring strategies. The third selecting unit 256 is connected to the third sorting unit 255 and is configured to select, under each of the different offline scoring strategies, the top P third candidate display keywords in the sorted order as the display keywords corresponding to the current data object.
The method provided by the present application and its steps can be implemented by one or more processing devices with data-processing capability, for example one or more servers, running computer-executable instructions. The storage medium of a server can store the instructions for carrying out each step of the method provided by the present application.
Each module in the system of the present application can be implemented by one or more servers running computer-executable instructions. Each module may be a component of the server that has the corresponding function when the server runs the computer-executable instructions. The receiving module, the judging module, the first processing module, and the second processing module can be implemented by an online server, and the third processing module can be implemented by an offline server.
In summary, the method and system provided by the present application clearly analyze the display keyword recommendation process and divide the whole process roughly into two stages, expansion and selection. Expansion improves coverage of the query log, and selection based on various strategies allows display keywords to be recommended to seller users more flexibly and accurately. The query log is thus used effectively and comprehensively, and recommendation efficiency is not lost while relatively accurate recommendations are made. Moreover, according to the usage of computing resources, complex computations are arranged to run offline, which improves the overall performance of the system and makes the system respond quickly.
In addition, in the prior art, in pursuit of better recommendation results, different recommendation methods are usually assembled and combined within a system. This leads to redundant system code and a complex structure, so that the system is difficult to maintain, and the complexity of the system also degrades performance.
In the embodiments of the present application, by contrast, the various strategies are dynamically loaded in the form of plug-ins. If a new strategy needs to be added, only the strategy library needs to be extended; there is no assembling and combining of various recommendation methods, and the code is simple. The system has a modular structure that is easy to extend and readily accommodates various recommendation strategies and relevance algorithms.
Although the present application has been described with reference to exemplary embodiments, it should be understood that the terms used are illustrative and exemplary rather than restrictive. Because the present application can be embodied in many forms without departing from the spirit or essence of the invention, it should be understood that the above embodiments are not limited to any of the foregoing details but should be construed broadly within the spirit and scope defined by the appended claims. Therefore, all changes and modifications falling within the scope of the claims or their equivalents are intended to be covered by the appended claims.