CN111695917B

CN111695917B - Commodity recommendation method, commodity recommendation system, electronic equipment and storage medium

Info

Publication number: CN111695917B
Application number: CN201910181571.2A
Authority: CN
Inventors: 张明阳; 宋增超
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2019-03-11
Filing date: 2019-03-11
Publication date: 2024-10-22
Anticipated expiration: 2039-03-11
Also published as: CN111695917A

Abstract

The invention discloses a commodity recommendation method, a commodity recommendation system, electronic equipment and a storage medium. The commodity recommendation method comprises the following steps: acquiring a plurality of behavior data of a user in an European space; mapping all the behavior data to a Hamming space according to a local sensitive hash algorithm, wherein each line of data corresponds to one binary code in Yu Haiming spaces; selecting a plurality of first cluster centers from all binary codes; clustering all binary codes according to a plurality of first clustering centers to obtain a plurality of first cluster classes; for each first cluster, reselecting a first cluster center, and continuing to execute the step of clustering until the clustering termination condition is met; recommending commodities to the user according to the obtained first cluster classes. Because the exclusive or calculation between binary codes is quite simple, the clustering operation is performed based on the Hamming distance calculation similarity, the calculation amount of measurement similarity is greatly reduced, and the calculation complexity of commodity recommendation is further greatly reduced.

Description

Commodity recommendation method, commodity recommendation system, electronic equipment and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method, a system, an electronic device, and a storage medium for recommending commodities.

Background

The user can show the purchase interests of the user through the actions such as searching, browsing the commodities and the like when shopping online, and recommend proper commodities to the user in order to meet the purchase interests of the user. The clustering is to divide a large number of data sets with unknown labels into a plurality of different categories according to the data characteristics existing in the data, and divide similar data into the same categories, and divide dissimilar data into different categories, thereby belonging to unsupervised learning. Commonly used clustering algorithms (such as K-means algorithms) generally use spatial distances such as euclidean distances to measure similarity between data. However, when the data set to be clustered is large, the time complexity and the space complexity of calculating the similarity based on the space distance are high, and the calculation for recommending the commodity is more complicated.

Disclosure of Invention

The technical problem to be solved by the embodiment of the invention is to overcome the defect of higher calculation complexity of commodity recommendation based on spatial distance clustering in the prior art, and provide a commodity recommendation method, a commodity recommendation system, electronic equipment and a storage medium.

The embodiment of the invention solves the technical problems through the following technical scheme:

the commodity recommending method is characterized by comprising the following steps:

acquiring a plurality of behavior data of a user in an European space, wherein the behavior data comprises at least one of search behavior data and browsing behavior data;

mapping all the behavior data to a Hamming space according to a local sensitive hash algorithm, wherein each line of data corresponds to one binary code in Yu Haiming spaces;

selecting a plurality of first cluster centers from all binary codes;

Clustering all binary codes according to a plurality of first clustering centers to obtain a plurality of first cluster classes;

For each first cluster, reselecting a first cluster center, and continuing to execute the step of clustering all binary codes according to the plurality of first cluster centers until a cluster termination condition is met;

And recommending commodities to the user according to the obtained first cluster after the clustering is ended.

Preferably, the step of clustering all binary codes according to the plurality of first clustering centers includes:

for each binary, calculating hamming distances of the binary from all first cluster centers;

Selecting the shortest hamming distance from all hamming distances;

and classifying the binary codes into a first cluster class of a first cluster center corresponding to the shortest Hamming distance.

Preferably, the step of selecting a plurality of first cluster centers among all binary codes includes:

Selecting a plurality of second hub classes among all the behavioral data;

setting binary codes corresponding to each second clustering center as first clustering centers;

For each first cluster class, the step of reselecting the first cluster center includes:

Setting behavior data corresponding to binary codes classified into the first cluster class to be classified into the same second cluster class, wherein different first cluster classes correspond to different second cluster classes;

reselecting a second cluster center among the second cluster types;

mapping the reselected second aggregation center to a Hamming space according to the local sensitive hash algorithm to obtain a recalculated binary code;

the recalculated binary code is set to the reselected first cluster center.

Preferably, in the step of selecting a plurality of second aggregation centers in all the behavior data, each second aggregation center corresponds to a different binary code;

and/or in the step of reselecting a second cluster center in the second cluster, the second cluster center is obtained by calculation according to Euclidean distances of all behavior data in the second cluster.

Preferably, the clustering termination condition includes:

Executing the step of reselecting the first cluster center for a preset number of times for each first cluster class, or enabling the variation of each first cluster center to be smaller than a preset variation;

and/or the locality sensitive hashing algorithm comprises SimHash.

An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements any of the above-mentioned commodity recommendation methods when executing the computer program.

A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor implements the steps of any of the above-mentioned commodity recommendation methods.

A merchandise recommendation system, the merchandise recommendation system comprising:

the behavior data acquisition module is used for acquiring a plurality of behavior data of a user in an European space, wherein the behavior data comprises at least one of search behavior data and browsing behavior data;

The mapping module is used for mapping all the behavior data to a Hamming space according to a local sensitive hash algorithm, wherein each line of data corresponds to one binary code in Yu Haiming spaces;

A first selection module for selecting a plurality of first cluster centers among all binary codes;

The clustering module is used for clustering all binary codes according to the plurality of first clustering centers to obtain a plurality of first cluster classes;

the second selection module is used for reselecting the first clustering center for each first cluster class and continuously calling the clustering module until the clustering termination condition is met;

and the commodity recommending module is used for recommending commodities to the user according to the obtained first cluster types after the clustering is ended.

Preferably, the clustering module includes:

A distance calculation unit for calculating Hamming distances between each binary code and all first cluster centers;

a shortest distance selecting unit for selecting a shortest hamming distance among all hamming distances;

and the classifying unit is used for classifying the binary codes into a first cluster class of a first cluster center corresponding to the shortest Hamming distance.

Preferably, the first selecting module includes:

A first selecting unit for selecting a plurality of second aggregation centers among all the behavior data;

the first setting unit is used for setting binary codes corresponding to each second clustering center as first clustering centers;

The second selection module includes:

a second setting unit, configured to set behavior data corresponding to binary codes classified in the first cluster class to be classified in the same second cluster class, where different first cluster classes correspond to different second cluster classes;

a second selecting unit, configured to reselect a second cluster center in the second cluster;

The mapping unit is used for mapping the reselected second aggregation center to the Hamming space according to the local sensitive hash algorithm to obtain a recalculated binary code;

and a third setting unit, configured to set the recalculated binary code as the reselected first cluster center.

Preferably, each second aggregation center selected by the first selecting unit corresponds to a different binary code;

and the second selecting unit calculates the second cluster center according to Euclidean distances of all the behavior data in the second cluster.

Preferably, the clustering termination condition includes:

For each first cluster, calling the second selection module for preset times, or enabling the variation of each first cluster center to be smaller than the preset variation;

and/or the locality sensitive hashing algorithm comprises SimHash.

The embodiment of the invention has the positive progress effects that: when the behavior data are clustered, the embodiment of the invention does not measure the similarity between the data according to the Euclidean distance and other spatial distances in the Euclidean space, but maps the behavior data in the Euclidean space to binary codes of the Hamming space through local sensitive hash, measures the similarity between the behavior data before conversion in the Euclidean space by measuring the similarity between the converted behavior data in the Hamming space, and greatly reduces the calculated amount of measuring the similarity, thereby greatly improving the clustering speed of the behavior data and further greatly reducing the calculation complexity of commodity recommendation.

Drawings

Fig. 1 is a flowchart of a commodity recommendation method according to embodiment 1 of the present invention.

Fig. 2 is a schematic hardware structure of an electronic device according to embodiment 2 of the present invention.

Fig. 3 is a flowchart of step S103 in the commodity recommendation method according to embodiment 4 of the present invention.

Fig. 4 is a flowchart of step S105 in the commodity recommendation method according to embodiment 4 of the present invention.

Fig. 5 is a schematic diagram of a commodity recommendation system according to embodiment 7 of the present invention.

Fig. 6 is a schematic block diagram of a commodity recommendation system according to embodiment 8 of the present invention.

Detailed Description

The invention is further illustrated by means of the following examples, which are not intended to limit the scope of the invention.

Example 1

The present embodiment provides a commodity recommendation method, and fig. 1 shows a flowchart of the present embodiment. Referring to fig. 1, the commodity recommendation method of the present embodiment includes:

s101, acquiring a plurality of behavior data of a user in an European space;

S102, mapping all behavior data to a Hamming space according to a local sensitive hash algorithm, wherein each line of data corresponds to one binary code in Yu Haiming spaces;

S103, selecting a plurality of first clustering centers from all binary codes;

s104, clustering all binary codes according to a plurality of first clustering centers to obtain a plurality of first cluster classes;

S105, for each first cluster, reselecting a first cluster center, and continuing to execute the step S104 until a cluster termination condition is met;

s106, recommending commodities to the user according to the obtained first cluster classes.

In the present embodiment, a plurality of behavior data in the european space is first acquired in step S101, where the behavior data may include, but is not limited to, search behavior data generated by a user on a line and browsing behavior data, and more specifically, the behavior data may include time data, session data, and the like.

Then in step S102, the behavior data are mapped to Hamming (Hamming) space one by one according to the locality sensitive hashing algorithm, so as to form binary codes corresponding to the behavior data one by one. The basic idea of the local sensitive hash algorithm (Locality-SENSITIVE HASHING, LSH) is that a hash function is designed to calculate hash values of two points in a high-dimensional space, and if the two points are closer, the probability that the hash values of the two points are the same is high; if the two points are far apart, then the probability that the hash values of the two points are the same is small. Thus, the binary code formed by the locality sensitive hashing algorithm in hamming space can reflect the similarity between behavior data in European space.

Specifically, in this embodiment SimHash may be used to map all behavior data to hamming space. Assuming that the behavior data includes X ₁ = (1, 2,3, 4), the procedure for performing SimHash on X ₁ is as follows:

Firstly, the components of X ₁ are transformed by using a traditional Hash algorithm to obtain signature values, and the signature values are as follows:

next, if a value of 0 in the signature value is set to-1, there are:

Again, the weighting operation is performed on the current signature value, and assuming that the weights of the components in X ₁ are 1, 3, 1, and 3, respectively, there are:

then, the current signature value is summed by bit, and then:

(1,2,3,4)→(-8,-2,0,-4)

Finally, the positive number in the signature value is set to 1, otherwise, the positive number is set to 0, and the following steps are included:

(1,2,3,4)→(-8,-2,0,-4)→(0,0,0,0)

That is, after the behavior data X ₁ = (1, 2,3, 4) is mapped to the hamming space via SimHash, the resulting binary code is X' ₁ = (0, 0).

Assuming that the behavior data also includes X ₂ = (1, 2,3, 5), then after the behavior data X ₂ is mapped to hamming space via SimHash described above, the resulting binary code is X' ₂ = (0, 1).

Wherein the behavior data X ₁ and X ₂ are similar in European space, which differ only in the 4 th component (one 4 and one 5). After the SimHash transformation, behavior data X ₁ and X ₂ were also similar with a hamming distance of 1 in hamming space.

In step S103, N (N is a positive integer and may be set up in a customized manner according to the specific application) first cluster centers are selected from all binary codes, and are denoted as: k ₁,K₂,……,K_N. In step S104, all binary codes in hamming space are clustered according to N first cluster centers, so as to obtain N first cluster classes corresponding to each first cluster center, which are marked as follows: c ₁,C₂,……,C_N.

Specifically, for each binary code in hamming space, its hamming distance from the first cluster center K ₁,K₂,……,K_N is calculated separately, noted as: d ₁,D₂,……,D_N, from which the shortest hamming distance D _M (m=1, 2, … …, N) is selected, then the binary code is categorized in the first cluster class C _M of the first cluster center K _M corresponding to the shortest hamming distance D _M. Thus, clustering of all binary codes is achieved in hamming space.

In step S105, the first cluster center is reselected for each first cluster class, for example, K _I (i=1, 2, … …, N) is used as the cluster center to obtain a first cluster class C _I including a plurality of binary codes, the first cluster center K _I is redetermined according to the plurality of binary codes, and the process goes to step S104 to cluster all binary codes in hamming space to obtain a first cluster class C _I. The iteration is repeated until the clustering termination condition is met, and the clustering termination condition may include, but is not limited to, the iteration repetition number, that is, the iteration repeatedly executing step S104 or S105 is performed for a preset number of times, or the variation of the first cluster center K _I in the hamming space is smaller than the preset variation and tends to converge.

After the clustering is terminated, the finally obtained first cluster classes can represent user behaviors of a plurality of classes, wherein the user behaviors of each class can reflect different shopping interests of the user. In step S106, the user is recommended with the appropriate merchandise according to the resulting plurality of first cluster classes, i.e., according to the inferred shopping interests.

Because the exclusive or calculation between the binary codes is quite simple, the clustering operation is performed based on the Hamming distance calculation similarity, the calculation amount of measuring the similarity is greatly reduced, the clustering speed of behavior data is greatly improved, and the calculation complexity of commodity recommendation is further greatly reduced.

Example 2

The present embodiment provides an electronic device, which may be expressed in the form of a computing device (for example, may be a server device), including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor may implement the commodity recommendation method provided in embodiment 1 when executing the computer program.

Fig. 2 shows a schematic hardware structure of the present embodiment, and as shown in fig. 2, the electronic device 9 specifically includes:

At least one processor 91, at least one memory 92, and a bus 93 for connecting the different system components (including the processor 91 and the memory 92), wherein:

The bus 93 includes a data bus, an address bus, and a control bus.

The memory 92 includes volatile memory such as Random Access Memory (RAM) 921 and/or cache memory 922, and may further include Read Only Memory (ROM) 923.

Memory 92 also includes a program/utility 925 having a set (at least one) of program modules 924, such program modules 924 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

The processor 91 executes various functional applications and data processing such as the commodity recommendation method provided in embodiment 1 of the present invention by running a computer program stored in the memory 92.

The electronic device 9 may further communicate with one or more external devices 94 (e.g., keyboard, pointing device, etc.). Such communication may occur through an input/output (I/O) interface 95. Also, the electronic device 9 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 96. The network adapter 96 communicates with other modules of the electronic device 9 via the bus 93. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in connection with the electronic device 9, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.

It should be noted that although several units/modules or sub-units/modules of an electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present application. Conversely, the features and functions of one unit/module described above may be further divided into ones that are embodied by a plurality of units/modules.

Example 3

The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the commodity recommendation method provided in embodiment 1.

More specifically, among others, readable storage media may be employed including, but not limited to: portable disk, hard disk, random access memory, read only memory, erasable programmable read only memory, optical storage device, magnetic storage device, or any suitable combination of the foregoing.

In a possible embodiment, the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps of implementing the merchandise recommendation method in embodiment 1 when said program product is run on the terminal device.

Wherein the program code for carrying out the invention may be written in any combination of one or more programming languages, which program code may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on the remote device or entirely on the remote device.

Example 4

On the basis of embodiment 1, this embodiment provides a commodity recommendation method. Referring to fig. 3, in the present embodiment, step S103 specifically includes:

s1031, selecting a plurality of second aggregation centers from all behavior data;

s1032, setting the binary code corresponding to each second cluster center as the first cluster center.

Specifically, in this embodiment, N second aggregation centers are first selected from all behavior data, and are denoted as: l ₁,L₂,……,L_N, and in step S102, the second aggregation center L ₁,L₂,……,L_N maps to hamming space via a locality sensitive hashing algorithm to obtain N binary codes, denoted as: k ₁,K₂,……,K_N, setting the binary code K ₁,K₂,……,K_N as the first cluster center.

The difference between this embodiment and embodiment 1 is that the second cluster center is first selected in the euclidean space, and then the first cluster center corresponding to the second cluster center is determined in the hamming space according to the mapping relationship between the euclidean space and the hamming space. For example, N second cluster centers may be randomly selected among all behavior data according to the K-means algorithm, wherein the N first cluster centers are preferably different from each other, so that when selecting the second cluster centers, it should be noted that each second cluster center corresponds to a different binary code. For another example, N second aggregation centers may be selected from all behavior data directly according to the K-means++ algorithm, such that the second aggregation centers are far apart.

Referring to fig. 4, in the present embodiment, step S105 specifically includes:

s1051, setting behavior data corresponding to binary codes classified in a first cluster class to be classified in the same second cluster class;

s1052, in the second cluster, reselecting a second cluster center;

S1053, mapping the re-selected second aggregation center to a Hamming space according to a local sensitive hash algorithm to obtain a re-calculated binary code;

s1054, setting the recalculated binary code as the first cluster center of the reselection.

Specifically, in step S104, a first cluster class C _I having K _I as a cluster center and including a plurality of binary codes is obtained, and according to the mapping relationship between the euclidean space and the hamming space, the behavior data corresponding to the binary codes classified in the first cluster class C _I may be classified in the second cluster class D _I, that is, different first cluster classes correspond to different second cluster classes. In the second cluster D _I, the second cluster center L _I may be recalculated according to the euclidean distance of all the behavior data, and since the reselected second cluster center L _I is calculated and may not exist in the behavior data, it is required to remap the binary code corresponding to the reselected second cluster center L _I according to the locally sensitive hash algorithm to the hamming space, and further, the binary code obtained by the recalculation is set as the reselected first cluster center K _I.

In this embodiment, the second clustering center is selected in the european style space, and then the first clustering center is determined according to the mapping relationship between the european style space and the hamming space, instead of directly selecting the first clustering center in the hamming space, another manner of determining the first clustering center is provided, and further another commodity recommendation method is provided.

Example 5

The present embodiment provides an electronic device, which may be expressed in the form of a computing device (for example, may be a server device), including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor may implement the commodity recommendation method provided in embodiment 4 when executing the computer program.

Example 6

The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the merchandise recommendation method provided by embodiment 4.

Example 7

The embodiment provides a commodity recommendation system, and fig. 5 shows a schematic block diagram of the embodiment. Referring to fig. 5, the commodity recommendation system of the present embodiment includes:

The behavior data acquisition module 1 is used for acquiring a plurality of behavior data of a user in an European space;

The mapping module 2 is used for mapping all the behavior data to a Hamming space according to a local sensitive hash algorithm, wherein each line of data corresponds to one binary code in Yu Haiming spaces;

a first selecting module 3, configured to select a plurality of first cluster centers from all binary codes;

the clustering module 4 is used for clustering all binary codes according to a plurality of first clustering centers to obtain a plurality of first cluster types;

the second selecting module 5 is used for reselecting the first cluster center for each first cluster class, and continuously calling the clustering module 4 until the clustering termination condition is met;

And the commodity recommending module 6 is used for recommending commodities to the user according to the obtained first cluster types after the clustering is ended.

In the present embodiment, the behavior data acquisition module 1 is first invoked to acquire a plurality of behavior data in the european style space, wherein the behavior data may include, but is not limited to, search behavior data generated by a user on a line and browsing behavior data, and more specifically, the behavior data may include time data, session data, and the like.

And then, calling a mapping module 2 to map the behavior data to Hamming (Hamming) space one by one according to a local sensitive hash algorithm to form binary codes corresponding to the behavior data one by one. The basic idea of the local sensitive hash algorithm (Locality-SENSITIVE HASHING, LSH) is that a hash function is designed to calculate hash values of two points in a high-dimensional space, and if the two points are closer, the probability that the hash values of the two points are the same is high; if the two points are far apart, then the probability that the hash values of the two points are the same is small. Thus, the binary code formed by the locality sensitive hashing algorithm in hamming space can reflect the similarity between behavior data in European space.

next, if a value of 0 in the signature value is set to-1, there are:

then, the current signature value is summed by bit, and then:

(1,2,3,4)→(-8,-2,0,-4)

(1,2,3,4)→(-8,-2,0,-4)→(0,0,0,0)

The first selection module 3 selects N (N is a positive integer and may be set up in a customized manner according to a specific application) first cluster centers from all binary codes, and marks as: k ₁,K₂,……,K_N. The clustering module 4 clusters all binary codes in the Hamming space according to the N first clustering centers to obtain N first cluster classes corresponding to the first clustering centers, and the N first cluster classes are recorded as follows: c ₁,C₂,……,C_N.

Referring to fig. 5, the clustering module 4 includes a distance calculation unit 41, a shortest distance selection unit 42, and a classification unit 43. Specifically, the distance calculation unit 41 is configured to calculate, for each binary code in the hamming space, the hamming distance thereof from the first cluster center K ₁,K₂,……,K_N, and is written as: d ₁,D₂,……,D_N, the shortest distance selecting unit 42 is configured to select the shortest hamming distance D _M (m=1, 2, … …, N) from the shortest distance, and the classifying unit 43 is configured to classify the binary code into the first cluster class C _M of the first cluster center K _M corresponding to the shortest hamming distance D _M. Thus, clustering of all binary codes is achieved in hamming space.

For each first cluster class, the first cluster center is reselected, for example, K _I (i=1, 2, … …, N) is used as a cluster center to obtain a first cluster class C _I including a plurality of binary codes, the second selecting module 5 redetermines the first cluster center K _I according to the plurality of binary codes, and clusters all binary codes in hamming space to obtain a first cluster class C _I. The iteration is repeated until the clustering termination condition is met, and the clustering termination condition may include, but is not limited to, iteration repetition times, that is, a preset times clustering module 4 or a second selecting module 5 is called, or the variation of a first clustering center K _I in the hamming space is smaller than a preset variation and tends to converge.

After the clustering is terminated, the finally obtained first cluster classes can represent user behaviors of a plurality of classes, wherein the user behaviors of each class can reflect different shopping interests of the user. The item recommendation module 6 is invoked to recommend the appropriate item to the user based on the resulting plurality of first cluster classes, i.e., based on the inferred shopping interests.

Example 8

On the basis of embodiment 7, this embodiment provides a commodity recommendation system, and fig. 6 shows a schematic block diagram of this embodiment. Referring to fig. 6, in the merchandise recommendation system of the present embodiment, the first selection module 3 specifically includes:

A first selecting unit 31 for selecting a plurality of second aggregation centers among all the behavior data;

The first setting unit 32 is configured to set the binary code corresponding to each second cluster center as the first cluster center.

Specifically, in the present embodiment, the first selection unit 31 first selects N second aggregation centers among all the behavior data, denoted as: l ₁,L₂,……,L_N, and further, mapping the second aggregation center L ₁,L₂,……,L_N to hamming space via a locality sensitive hashing algorithm to obtain N binary codes, which are recorded as: k ₁,K₂,……,K_N, the first setting unit 32 may set the binary code K ₁,K₂,……,K_N as the first cluster center.

The difference between this embodiment and embodiment 7 is that the second cluster center is selected in the euclidean space first, and then the first cluster center corresponding to the second cluster center is determined in the hamming space according to the mapping relationship between the euclidean space and the hamming space. For example, N second cluster centers may be randomly selected among all behavior data according to the K-means algorithm, wherein the N first cluster centers are preferably different from each other, so that when selecting the second cluster centers, it should be noted that each second cluster center corresponds to a different binary code. For another example, N second aggregation centers may be selected from all behavior data directly according to the K-means++ algorithm, such that the second aggregation centers are far apart.

In this embodiment, the second selecting module 5 specifically includes:

a second setting unit 51, configured to set behavior data corresponding to binary codes classified in the first cluster class to be classified in the same second cluster class;

a second selecting unit 52, configured to reselect a second cluster center in a second cluster;

a mapping unit 53, configured to map the re-selected second aggregation center to hamming space according to a locally sensitive hash algorithm, to obtain a re-calculated binary code;

a third setting unit 54 is configured to set the recalculated binary code to the reselected first cluster center.

Specifically, by calling the clustering module 4 to obtain a first cluster class C _I having K _I as a clustering center and including a plurality of binary codes, according to the mapping relationship between the european space and the hamming space, the second setting unit 51 may classify behavior data corresponding to the binary codes classified in the first cluster class C _I into a second cluster class D _I, that is, different first cluster classes correspond to different second cluster classes. In the second cluster D _I, the second selecting unit 52 may recalculate and select the second cluster center L _I according to the euclidean distance of all the behavior data, and since the recalculated second cluster center L _I is calculated and may not exist in the behavior data, the mapping unit 53 needs to be called to remap to the hamming space according to the above-mentioned locally sensitive hash algorithm, so as to obtain the binary code corresponding to the reselected second cluster center L _I, and further, the third setting unit 54 sets the recalculated binary code to be the reselected first cluster center K _I.

In this embodiment, the second clustering center is selected in the european style space, and then the first clustering center is determined according to the mapping relationship between the european style space and the hamming space, instead of directly selecting the first clustering center in the hamming space, another manner of determining the first clustering center is provided, and further another commodity recommendation system is provided.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the principles and spirit of the invention, but such changes and modifications fall within the scope of the invention.

Claims

1. A commodity recommendation method, characterized in that the commodity recommendation method comprises:

selecting a plurality of first cluster centers from all binary codes;

2. The merchandise recommendation method according to claim 1, wherein said step of clustering all binary codes according to a plurality of first clustering centers comprises:

Selecting the shortest hamming distance from all hamming distances;

3. The merchandise recommendation method of claim 1, wherein the step of selecting a plurality of first cluster centers among all binary codes comprises:

Selecting a plurality of second hub classes among all the behavioral data;

reselecting a second cluster center among the second cluster types;

the recalculated binary code is set to the reselected first cluster center.

4. The merchandise recommendation method according to claim 3, wherein in said step of selecting a plurality of second aggregation centers among all of the behavior data, each second aggregation center corresponds to a different binary code;

5. The merchandise recommendation method of claim 1, wherein the cluster termination condition comprises:

and/or the locality sensitive hashing algorithm comprises SimHash.

6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the merchandise recommendation method of any one of claims 1-5 when the computer program is executed by the processor.

7. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the merchandise recommendation method according to any one of claims 1-5.

8. A merchandise recommendation system, the merchandise recommendation system comprising:

9. The merchandise recommendation system of claim 8, wherein the clustering module comprises:

10. The merchandise recommendation system of claim 8 wherein the first selection module comprises:

The second selection module includes:

11. The merchandise recommendation system of claim 10, wherein each second hub selected by said first selection unit corresponds to a different binary code;

And/or the second selecting unit calculates the second cluster center according to Euclidean distances of all behavior data in the second cluster.

12. The merchandise recommendation system of claim 8, wherein the cluster termination condition comprises:

and/or the locality sensitive hashing algorithm comprises SimHash.