CN118761825B - Neural network model data processing method and device of recommendation system - Google Patents
- Publication number
- CN118761825B CN118761825B CN202411239007.9A CN202411239007A CN118761825B CN 118761825 B CN118761825 B CN 118761825B CN 202411239007 A CN202411239007 A CN 202411239007A CN 118761825 B CN118761825 B CN 118761825B
- Authority
- CN
- China
- Prior art keywords
- sub
- embedding
- bucket
- neural network
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Recommending goods or services
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a neural network model data processing method and device for a recommendation system. The method comprises the following steps: acquiring recommendation candidate feature data, which include category-type feature data and numerical-type feature data; obtaining the embedding representation of the category-type feature data through a lookup table; for the numerical-type feature data, setting a window size k and a weighting base b, calculating the weighting coefficient m of the main sub-bucket and the weighting coefficients p of the adjacent sub-buckets from k and b, normalizing and bucketing the data to obtain the in-bucket values, and then taking the weighted average of the embedding vector of the main sub-bucket and the embedding vectors of the k left and k right adjacent sub-buckets (the window size) as the embedding representation of each in-bucket value; and splicing the embedding representations of the two kinds of feature data before inputting them into the neural network model of the recommendation system, thereby improving the accuracy of model data processing and the generalization capability of the model.
Description
Technical Field
The invention belongs to the technical field of electronic data processing, and particularly relates to a neural network model data processing method and device of a recommendation system.
Background
A recommendation-system neural network model is a technology that has been applied in the recommendation field in recent years. It uses the feature extraction and learning capability of a neural network to learn the complex relations between users and items from multi-dimensional data such as users' historical behaviors and item attributes, so as to provide personalized and accurate recommendations for users. E-commerce recommendation neural network models have been widely studied and applied in the recommendation field, and are an important tool for improving user experience and increasing sales on e-commerce platforms. By simulating the working principle of human brain neurons, such a model can perform deep learning and analysis on large amounts of user behavior data and commodity information, thereby providing personalized commodity recommendations for users.
In deep neural networks, data is usually represented using an embedding method. For category-type feature data, such as user and commodity attribute information in an e-commerce recommendation system, a low-dimensional dense vector, commonly called an "embedding", is generally assigned via a hash method, and its parameters are then updated by gradient-descent learning. For numerical-type feature data, such as the associated feature information of users and commodities in an e-commerce recommendation system, the values are first discretized by bucketing, and a low-dimensional dense vector is then assigned to each discrete bucket. However, this data processing method has two problems. The first is SBD (Similar value But Dis-similar embedding): the partitioning may place similar values, such as values on either side of a bucket boundary, into different buckets, so that their final embedding representations differ greatly. The second is DBS (Dis-similar value But Same embedding): two very different values, such as those at the two edges of the same bucket, may be grouped into the same bucket and therefore end up with the same embedding representation. Such discrepancies prevent the model from accurately capturing semantic similarity in machine learning tasks; for example, a neural network model applied to a recommendation system cannot accurately capture the dynamic changes in user behavior, which degrades the performance of model prediction and evaluation.
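The SBD and DBS problems described above can be made concrete with a small sketch; the bucket width, values and function name below are illustrative only and are not from the patent:

```python
# Illustrative sketch (values and names are not from the patent): plain
# equal-width bucketing exhibits both problems described above.
def bucket_id(x: float, n_buckets: int = 100) -> int:
    """Assign a value in [0, 1] to one of n_buckets equal-width buckets."""
    return min(int(x * n_buckets), n_buckets - 1)

# SBD: two nearly identical values straddle a bucket boundary, land in
# different buckets, and so receive sharply different embeddings.
a, b = 0.6195, 0.6205
assert bucket_id(a) != bucket_id(b)

# DBS: two values at opposite edges of the same bucket share one bucket
# (hence one embedding) despite being much farther apart than a and b.
c, d = 0.6101, 0.6199
assert bucket_id(c) == bucket_id(d)
```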
The recent AutoDis method uses meta-learning to address the SBD and DBS problems. It first defines a set of Meta-Embeddings for the numerical features of each domain (field). The feature values of each domain are then automatically discretized by neural network learning and assigned to different Meta-Embeddings buckets, so a feature value may be assigned to one bucket or to several. Finally, the weights of the bucket embeddings are learned by a neural network, and the bucket embeddings are aggregated according to these weights to obtain the final embedding of the continuous feature value. Three aggregation schemes are used: (1) max pooling, which selects the Meta-Embeddings bucket with the highest probability value as the final embedding of the feature value; (2) top-K summation, which sums the top-k embeddings with the highest probabilities; and (3) weighted average, which averages using the probability logit values. However, AutoDis must learn the weight of each sub-bucket through a neural network, which makes the data processing complex; in a recommendation system, especially an e-commerce system with a huge data volume, such complexity requires more hardware resources and raises computation cost, which in turn affects the response speed of the recommendation system.
Therefore, there is an urgent need to improve on the prior art and to find a simple and reliable neural network model data processing method for recommendation systems.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a neural network model data processing method and device of a recommendation system.
In order to achieve the above object, the neural network model data processing method and apparatus of the recommendation system of the present invention adopt the following technical scheme:
in a first aspect, a neural network model data processing method of a recommendation system is provided, and the method includes:
S1: acquiring recommendation candidate feature data, wherein the feature data comprise category-type feature data and numerical-type feature data; the category-type feature data comprise a user ID, a commodity ID, a category ID, a brand ID and a user behavior type, and the numerical-type feature data comprise user portrait features and commodity portrait features;
S2: for the category-type feature data, obtaining its embedding representation through a lookup table;
S3: for the numerical-type feature data, setting a window size k and a weighting base b, and calculating the weighting coefficient m of the main sub-bucket and the weighting coefficients p of the adjacent sub-buckets according to k and b;
S4: normalizing and bucketing the numerical-type feature data to obtain the in-bucket values;
S5: for each in-bucket value, calculating the weighted average of the embedding vector of the main sub-bucket where it is located and the embedding vectors of the k left and k right adjacent sub-buckets (the window size), to obtain the embedding representation of the in-bucket value;
S6: splicing the embedding representations of the category-type feature data obtained in step S2 and of the numerical-type feature data obtained in step S5 to obtain the complete embedding representation of the candidate feature data;
S7: outputting the complete embedding representation to the neural network model of the recommendation system.
Further, after step S7, the method further comprises step S8: outputting, through neural network model processing, the current user's interest score for each commodity, and forming a recommended commodity list for the current user by ordering the commodities from high to low interest score.
Further, in step S3, let p[0] = b; letting i traverse 1 to k-1, p[i] = p[i-1]/2; and m = 1 − 2·Σ_{i=0}^{k−1} p[i], where Σ denotes a summation over the p[i].
Further, the bucketing in step S4 adopts equal-width (equidistant) buckets.
Further, for an in-bucket value, the embedding vector of the main sub-bucket where it is located is denoted embedding_table[n], where n is the sub-bucket index, and the embedding representation of the in-bucket value is calculated as follows:
Step S51: calculate the weighted embedding of the main sub-bucket: embedding_main = m × embedding_table[n];
Step S52: calculate the weighted embeddings of the left and right sub-buckets: letting i traverse 1 to k, compute for each i the left sub-bucket term embedding_left[i] = p[i-1] × embedding_table[n-i] and the right sub-bucket term embedding_right[i] = p[i-1] × embedding_table[n+i];
Step S53: add the main sub-bucket weighted embedding and the left and right sub-bucket weighted embeddings to obtain the embedding representation of the in-bucket value.
In a second aspect, the present invention further includes a neural network model data processing apparatus of a recommendation system, including:
The data acquisition module is used for acquiring recommended candidate feature data, wherein the feature data comprise category type feature data and numerical value type feature data, and the category type feature data comprise user ID, commodity ID, category ID, brand ID and user behavior type data; the numerical data comprises user portrait characteristics and commodity portrait characteristics data;
The category type characteristic data processing module is used for acquiring embedding representations of category type characteristic data through the lookup table;
The numerical-type feature data processing module is used for acquiring embedding representations of the numerical-type feature data and comprises: a parameter setting sub-module, used for setting the window size k and the weighting base b, and calculating the weighting coefficient m of the main sub-bucket and the weighting coefficients p of the adjacent sub-buckets according to k and b; a data processing sub-module, used for normalizing and bucketing the numerical-type features to obtain the in-bucket values; and a calculation sub-module, used for calculating the embedded value: for each in-bucket value, it calculates the weighted average of the embedding vector of the main sub-bucket where the value is located and the embedding vectors of the k left and k right adjacent sub-buckets (the window size), obtaining the embedding representation of the in-bucket value.
The splicing module: used for splicing the embedding representations respectively obtained by the category-type feature data processing module and the numerical-type feature data processing module to obtain the complete embedding representation of the candidate feature data.
The neural network model input module: used for inputting the complete embedding representation into the neural network model of the recommendation system.
The neural network model output module: used for outputting the current user's interest score for each commodity after neural network model processing, and for ordering the commodities from high to low interest score to form a recommended commodity list for the current user.
Further, the parameter setting sub-module is specifically configured to: let p[0] = b; letting i traverse 1 to k-1, p[i] = p[i-1]/2; and m = 1 − 2·Σ_{i=0}^{k−1} p[i], where Σ denotes a summation over the p[i].
Further, the data processing sub-module performs the bucketing with equal-width buckets.
Further, when calculating the embedded value, the calculation sub-module is configured to:
for an in-bucket value, denote the embedding vector of the main sub-bucket where it is located as embedding_table[n]; the embedding representation of the in-bucket value is then calculated as follows:
calculate the weighted embedding of the main sub-bucket: embedding_main = m × embedding_table[n];
calculate the weighted embeddings of the left and right sub-buckets: letting i traverse 1 to k, compute for each i embedding_left[i] = p[i-1] × embedding_table[n-i] and embedding_right[i] = p[i-1] × embedding_table[n+i];
add the main sub-bucket weighted embedding and the left and right sub-bucket weighted embeddings to obtain the embedding representation of the in-bucket value.
Compared with prior-art processing of recommendation-system feature data, the present invention processes category-type and numerical-type feature data in different ways before the data are input into the neural network model. For the numerical-type feature data, the weighted average of the embeddings of the current sub-bucket and its adjacent sub-buckets is used as the embedding representation, which is then spliced with the embeddings of the category-type feature data before being input into the neural network model. This alleviates the SBD problem, so that subsequent neural network processing can better understand users' personalized needs and generate more personalized and accurate recommendation results. In addition, it strengthens the learning of adjacent sub-bucket embeddings, so that numerical feature values the model has not seen can be better represented at prediction time, improving the generalization capability of the model and its applicability in different application scenarios. Compared with AutoDis, this data processing method is simpler: the weight of each sub-bucket does not need to be learned through a neural network, yet the SBD problem is still alleviated, so prediction accuracy and model generalization improve in subsequent processing while the hardware resources and computation cost required for data processing are reduced.
Drawings
FIG. 1 is a flowchart of a neural network model data processing method of a recommendation system in the present invention;
FIG. 2 is a flowchart of neural network model data processing using the recommendation system of the present invention;
FIG. 3 is a block diagram of a neural network model data processing device of the recommendation system of the present invention;
Fig. 4 is a schematic diagram of an electronic device hardware structure for executing the neural network model data processing method of the recommendation system in the present invention.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. It should be noted that, the illustrations provided in the present embodiment merely illustrate the basic concept of the present invention by way of illustration, and only the components related to the present invention are shown in the drawings rather than the number, shape and size of the components in actual implementation, and the form, number and proportion of each component in actual implementation may be arbitrarily changed, and the layout of the components may be more complex. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
Referring to fig. 1, the present embodiment proposes a neural network model data processing method of a recommendation system. Specifically, the method comprises the steps of:
Step S1: acquiring recommendation candidate feature data, wherein the feature data comprises category type feature data and numerical type feature data, and the category type feature data comprises a user ID, a commodity ID, a category ID, a brand ID and a user behavior type; the numerical characteristic data comprises user portrait characteristics and commodity portrait characteristics;
Step S2: obtaining embedding representation of the category type characteristic data through a lookup table;
step S3: setting a window size k and a weighting base b for the numerical characteristic data, and calculating a weighting coefficient m of a main sub-bucket and a weighting coefficient p of an adjacent sub-bucket according to the window size k and the weighting base b;
Let p[0] = b; letting i traverse 1 to k-1, p[i] = p[i-1]/2; and m = 1 − 2·Σ_{i=0}^{k−1} p[i], where Σ denotes a summation over the p[i].
For example, let k = 2 and b = 0.2; then p[0] = 0.2 and p[1] = 0.1 can be calculated. From this, the weighting coefficient m of the main sub-bucket is m = 1 − 2 × (0.2 + 0.1) = 0.4, and the weighting coefficients p of the adjacent sub-buckets are 0.2 and 0.1, respectively.
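This coefficient computation can be sketched as follows; the halving rule p[i] = p[i-1]/2 and the closing formula m = 1 − 2·Σp[i] are assumptions consistent with the worked example (they reproduce k=2, b=0.2 → p = [0.2, 0.1], m = 0.4):

```python
# Sketch of step S3 (assumption: each neighbour weight is half the previous
# one, and m absorbs the remainder so the 2k+1 weights sum to 1 -- this
# reproduces the worked example k=2, b=0.2 -> p=[0.2, 0.1], m=0.4).
def bucket_weights(k: int, b: float):
    """Return (m, p): main-bucket coefficient and k neighbour coefficients."""
    p = [b]                      # p[0] = b
    for _ in range(1, k):        # i traverses 1 .. k-1
        p.append(p[-1] / 2)      # p[i] = p[i-1] / 2
    m = 1 - 2 * sum(p)           # left and right neighbours each use p once
    return m, p

m, p = bucket_weights(k=2, b=0.2)
assert p == [0.2, 0.1]
assert abs(m - 0.4) < 1e-12
```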
Step S4: normalizing and bucketing the numerical-type feature data to obtain the in-bucket values.
Numerical-type features, also called continuous features, are floating-point data. Normalizing them addresses the convergence speed of gradient descent, while bucketing ensures, as far as possible, that features with large numerical differences are not placed in the same bucket. Normalization is performed first; common distributions of numerical features include the Poisson, normal, Bernoulli and uniform distributions. The data processing below takes the common power-law and uniform distributions as examples.
For numerical features with a power-law distribution, normalization is performed by y = max(0, min(ln(1 + x), LN_MAX)) / LN_MAX, where LN_MAX is determined from the order of magnitude of the feature's maximum value; for example, LN_MAX = 16 suffices when the maximum is within the tens of millions, because ln(1e7) = 16.11. If the current feature is x = 20000, the transformation is: y = max(0, min(ln(1 + 20000), 16)) / 16 = 0.619.
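The power-law normalization above can be sketched directly (the function name is illustrative, not from the patent):

```python
import math

# Sketch of the power-law normalization:
# y = max(0, min(ln(1 + x), LN_MAX)) / LN_MAX
def lognorm(x: float, ln_max: float = 16.0) -> float:
    return max(0.0, min(math.log(1.0 + x), ln_max)) / ln_max

# x = 20000 gives y of roughly 0.619, matching the example in the text.
assert abs(lognorm(20000) - 0.619) < 0.001
```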
For uniformly distributed data, max-min normalization is used: y = (x − MIN)/(MAX − MIN), where MAX and MIN are the maximum and minimum of the numerical feature, respectively. For example, if the maximum of the feature is 9, the minimum is 1 and the current value is 3, then y = (3 − 1)/(9 − 1) = 0.25.
After normalization, the numerical feature range is transformed to 0–1, and equal-width bucketing is performed at 0.01 intervals, so that 0.619 and 0.25 fall into buckets 62 and 26, respectively. Of course, other bucket widths may be chosen; this is only an illustrative example.
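A sketch of the max-min normalization and equal-width bucketing; the 1-based bucket indexing here is an assumption chosen to reproduce the examples above (0.619 → 62, 0.25 → 26), as the text does not spell out the indexing convention:

```python
def minmax_norm(x: float, lo: float, hi: float) -> float:
    """Max-min normalization: y = (x - MIN) / (MAX - MIN)."""
    return (x - lo) / (hi - lo)

def to_bucket(y: float, width: float = 0.01) -> int:
    # 1-based index is an assumption; it reproduces 0.619 -> 62, 0.25 -> 26.
    return int(y / width) + 1

assert to_bucket(minmax_norm(3, 1, 9)) == 26   # y = 0.25
assert to_bucket(0.619) == 62
```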
Step S5: calculate the embedded value. Each sub-bucket has a corresponding embedding vector, and these vectors form embedding_table. For an in-bucket value, find the index n of the sub-bucket it falls in together with the 2k nearest sub-buckets on its left and right, and then compute the weighted average of these sub-buckets' embeddings as the embedding representation of the value. The specific calculation is as follows:
Calculate the weighted embedding of the main sub-bucket: embedding_main = m × embedding_table[n].
Calculate the weighted embeddings of the 2k nearest adjacent sub-buckets: letting i traverse 1 to k, compute the left and right sub-bucket terms respectively, namely:
embedding_left[i] = p[i-1] × embedding_table[n-i];
embedding_right[i] = p[i-1] × embedding_table[n+i];
Add the main sub-bucket weighted embedding and the left and right sub-bucket weighted embeddings to obtain the embedding representation of the in-bucket value, i.e., out_embedding.
For example, for a feature value of 0.25 whose sub-bucket is 26, the corresponding lookup vector is embedding_table[26], but the vector finally used to represent the feature is out_embedding = 0.1 × embedding_table[24] + 0.2 × embedding_table[25] + 0.4 × embedding_table[26] + 0.2 × embedding_table[27] + 0.1 × embedding_table[28].
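The weighted-average lookup of step S5 can be sketched as follows; the toy embedding table (where each bucket's vector is simply its own index) is illustrative only, since real tables are learned:

```python
import numpy as np

# Sketch of the step-S5 weighted average (names illustrative):
# out = m*E[n] + sum_{i=1..k} p[i-1] * (E[n-i] + E[n+i])
def soft_embedding(table: np.ndarray, n: int, m: float, p) -> np.ndarray:
    out = m * table[n]
    for i, w in enumerate(p, start=1):
        out = out + w * (table[n - i] + table[n + i])
    return out

# Toy table where bucket j's vector is simply [j]; real tables are learned.
table = np.arange(100, dtype=float).reshape(100, 1)
out = soft_embedding(table, n=26, m=0.4, p=[0.2, 0.1])
# 0.1*24 + 0.2*25 + 0.4*26 + 0.2*27 + 0.1*28 = 26.0
assert abs(out[0] - 26.0) < 1e-9
```

Because the weights are symmetric and sum to 1, the result for this linear toy table collapses back to the main bucket's index, which makes the example easy to verify by hand.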
After the embedding representation of the numerical-type feature data is calculated, it is spliced with that of the category-type feature data: the embedding representation of the category-type feature data is obtained through a lookup table, and the two embedding representations are concatenated to serve as the input of the neural network model.
Step S6: splice the embedding representations of the category-type feature data obtained in step S2 and of the numerical-type feature data obtained in step S5 to obtain the complete embedding representation of the candidate feature data of the recommendation system.
Step S7: input the complete embedding representation of the candidate feature data into the neural network model of the recommendation system.
Step S8: output, through neural network model processing, the current user's interest score for each commodity, and order the commodities from high to low interest score to form a recommended commodity list for the current user.
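Step S8 amounts to a descending sort by predicted score; a minimal sketch with hypothetical item IDs and scores (not from the patent):

```python
def rank_items(scores: dict) -> list:
    """Order candidate commodities by predicted interest score, descending,
    to form the recommendation list (step S8)."""
    return [item for item, _ in sorted(scores.items(), key=lambda kv: -kv[1])]

# Hypothetical item IDs and model outputs, for illustration only.
scores = {"item_a": 0.31, "item_b": 0.87, "item_c": 0.55}
assert rank_items(scores) == ["item_b", "item_c", "item_a"]
```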
Specifically, by optimizing the embedding processing of the numerical-type feature data, the embedding representations of all category-type and numerical-type feature data are spliced together and then input into the neural network model. The core idea of the numerical-type processing is this: by using the weighted average of the embeddings of adjacent sub-buckets as the embedding representation of a numerical feature, values that are numerically close but fall into different sub-buckets are mapped to similar embedding spaces. This alleviates the SBD problem, improves neural network performance, and provides an effective solution for embedding numerical-type feature data in deep neural networks, giving the recommendation system's neural network model better generalization capability and prediction accuracy. Illustratively, the candidate feature data include various attribute features of users and commodities, such as user ID, user behavior and user portrait features, as well as commodity ID, category ID, brand ID and commodity portrait features, which are input to the neural network model after embedding processing. The neural network model in the embodiment of the invention may adopt DNN, DeepFM, DCN or the like; its output is the current user's interest score for each commodity, and high-scoring commodities are placed in more prominent positions when recommending to the corresponding user, realizing personalized and accurate commodity recommendation.
In the embodiment of the invention, the candidate feature data may be the feature data of the current user and commodities obtained after the user clicks a tag (i.e., the recommendation list is output after inference over the user and commodities), or sample data used for model training; the data processing method is the same in either case and is not repeated here.
Illustratively, in the prior art, sample data of an e-commerce recommendation system are input to a neural network. For a user and candidate commodities, the category-type feature data include but are not limited to the user's gender, age and historical click sequence, and the commodities' class, brand, category and label; the continuous features include but are not limited to the candidate commodities' historical click-through rate, historical conversion rate, price and sales, as well as the user's and commodities' portrait feature data. In the ranking model of the e-commerce recommendation system, the numerical-type feature data in the samples are bucketed to obtain the bucket ID of each feature value, the buckets are then directly embedded to obtain the embedding representation of each in-bucket value, the embedding representations of the category-type and numerical-type feature data are concatenated (embedding concat) and input to a deep neural network (DNN), and a binary cross-entropy loss function and related indicators are used for calculation to evaluate and optimize the performance of the model.
For example, for a feature value of 0.619, its bucket ID is 62, so its corresponding feature vector is embedding_table[62]. Referring to fig. 2, when the data processing method of the present invention is used in the e-commerce recommendation ranking model for the numerical-type feature data, with the window size k set to 2 and the weighting base b set to 0.2, the current feature sub-bucket ID of the numerical feature in the sample data, namely 62, together with the 2 left and 2 right adjacent sub-bucket IDs, is obtained for embedding processing, and the final embedding representation of the current feature is obtained by weighted summation: out_embedding = 0.1 × embedding_table[60] + 0.2 × embedding_table[61] + 0.4 × embedding_table[62] + 0.2 × embedding_table[63] + 0.1 × embedding_table[64]. Compared with a basic ranking model that does not adopt this data processing method but directly buckets the numerical-type feature data and converts them into embeddings, the before-and-after effects are compared in the following table:
In the table, AUC (Area Under Curve) refers to the area under the receiver operating characteristic curve. When AUC is used as the evaluation index of the e-commerce recommendation ranking model, its value represents the model's ability to distinguish positive samples (i.e., commodities the user may be interested in) from negative samples (i.e., commodities the user may not be interested in): the larger the AUC, the stronger this discrimination ability, that is, the more accurately the model can predict commodities the user may be interested in and rank them at the front of the recommendation list. To further evaluate overall performance across different users, the embodiment of the invention also uses the UAUC (Average AUC) indicator, whose value is the average of the AUCs over different users: the larger the UAUC, the higher the average AUC across users, i.e., the more accurately the model predicts the commodities different users may be interested in and ranks them at the front. As can be seen from the table, in the ranking model of the e-commerce recommendation system adopting the data processing method of the present invention, both AUC and UAUC are improved over the basic ranking model. This means that processing the neural network model's input data with the data processing method of the invention improves the generalization capability of the model while reducing computational complexity; the personalized and accurate recommendation results can provide satisfactory recommended commodity lists both overall and for individual users, promote the conversion rate, and thereby improve the overall benefit of the e-commerce platform.
From the above description of the method embodiments, it will be clear to those skilled in the art that the present invention may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware alone, but in many cases the former is the preferred embodiment. Based on this understanding, the technical solution of the present invention may be embodied, essentially or in the part contributing to the prior art, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Corresponding to the embodiment of the neural network model data processing method of the recommendation system, the invention also provides a neural network model data processing device of the recommendation system.
Referring to fig. 3, a schematic structural diagram of a neural network model data processing device of a recommendation system according to an embodiment of the present invention is provided, where the device includes:
a data acquisition module 21, configured to acquire recommendation candidate feature data, the feature data including category-type feature data and numerical feature data, the category-type feature data including user ID, commodity ID, category ID, brand ID, and user behavior type data, and the numerical feature data including user portrait feature and commodity portrait feature data;
a category-type feature data processing module 22, configured to obtain embedding representations of the category-type feature data through a lookup table, where embedding refers to an embedded representation;
a numerical feature data processing module 23, configured to obtain embedding representations of the numerical feature data, including: a parameter setting sub-module, configured to set the window size k and the weighting base b, and to calculate the weighting coefficient m of the main sub-bucket and the weighting coefficients p of the adjacent sub-buckets according to k and b; a data processing sub-module, configured to normalize and bucket the numerical feature data to obtain the values in the sub-buckets; and a calculating sub-module, configured to calculate the embedded value: each sub-bucket has a corresponding embedding vector, and the embedding vectors form an embedding_table; for a value in a sub-bucket, the sub-bucket index n where the value is located and the 2k nearest adjacent sub-buckets on the left and right are found, and the weighted average of these sub-buckets' embeddings is calculated as the embedded representation of the value, denoted out_embedding;
a splicing module 24, configured to splice the embedding representations respectively obtained by the category-type feature data processing module and the numerical feature data processing module to obtain a complete embedding representation of the candidate feature data;
a neural network model input module 25, configured to input the complete embedding representation of the candidate feature data into a neural network model of a recommendation system;
a neural network model output module 26, configured to output the interest score values of the user for the commodities processed by the neural network model, and to rank the commodities by interest score value from high to low to form a recommended commodity list for the current user.
In an exemplary embodiment, the parameter setting sub-module is specifically configured to: let p0 = b; let i traverse 1 to k-1, pi = pi-1 × 0.5; m = 1 - 2 × sum(pi), where sum(pi) represents a summation calculation over the pi.
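The weight calculation performed by the parameter setting sub-module can be sketched as follows (the function name is illustrative):

```python
def bucket_weights(k, b):
    """Weighting coefficients for window size k and weighting base b:
    p[0] = b, each further neighbor halves the previous weight, and the
    main-bucket weight m absorbs the remainder so that m + 2 * sum(p) == 1,
    i.e. all 2k + 1 weights in the window sum to one."""
    p = [b]
    for i in range(1, k):
        p.append(p[i - 1] * 0.5)
    m = 1.0 - 2.0 * sum(p)
    return m, p
```

For k = 2 and b = 0.2 this yields p = [0.2, 0.1] and m = 0.4, i.e. the 0.1/0.2/0.4/0.2/0.1 window used in the example of fig. 2.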
In an exemplary embodiment, the data processing sub-module employs equally spaced buckets.
In an exemplary embodiment, the calculation submodule is configured to, when calculating the embedded value:
For the value in a sub-bucket, the embedding vector of the main sub-bucket where the value is located is denoted embedding_table[n], and the specific calculation process of the embedding representation of the value in the sub-bucket is as follows:
(1) calculate the weighted embedding of the main sub-bucket: embedding_main = m × embedding_table[n];
(2) calculate the weighted embeddings of the left and right sub-buckets: let i traverse 1 to k, and calculate embedding_left[i-1] = p[i-1] × embedding_table[n-i] for the left sub-buckets and embedding_right[i-1] = p[i-1] × embedding_table[n+i] for the right sub-buckets, respectively;
(3) add the weighted embedding of the main sub-bucket and the weighted embeddings of the left and right sub-buckets to obtain the embedding representation of the value in the sub-bucket.
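Steps (1) through (3) can be combined into one sketch. The equal-width bucketing convention and the boundary handling (simply skipping neighbors that fall outside the table) are our assumptions; the patent does not specify boundary behavior:

```python
import numpy as np

def soft_bucket_embedding(value, embedding_table, k, m, p):
    # Assumes `value` has already been normalized into [0, 1) and that the
    # table holds one equal-width sub-bucket per row.
    num_buckets = len(embedding_table)
    n = int(value * num_buckets)            # equal-width main sub-bucket index
    out = m * embedding_table[n]            # (1) weighted main-bucket embedding
    for i in range(1, k + 1):               # (2) weighted left/right neighbors
        if n - i >= 0:
            out = out + p[i - 1] * embedding_table[n - i]
        if n + i < num_buckets:
            out = out + p[i - 1] * embedding_table[n + i]
    return out                              # (3) summed embedding representation
```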
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Fig. 4 is a schematic hardware structure of an electronic device for executing a neural network model data processing method of a recommendation system according to an embodiment of the present invention, where, as shown in fig. 4, the device includes:
One or more processors 310 and a memory 320, one processor 310 being illustrated in fig. 4.
The apparatus for performing the neural network model data processing method of the recommendation system may further include: an input device 330 and an output device 340.
The processor 310, memory 320, input device 330, and output device 340 may be connected by a bus or other means, for example in fig. 4.
The memory 320, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the neural network model data processing method of the recommendation system in an embodiment of the present invention (e.g., the parameter setting module, data processing module, and computing module shown in fig. 3). By running the non-volatile software programs, instructions, and modules stored in the memory 320, the processor 310 executes the various functional applications and data processing of the server, that is, implements the neural network model data processing method of the recommendation system in the above method embodiments.
Memory 320 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created according to the use of the neural network model data processing apparatus of the recommendation system, and the like. In addition, memory 320 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 320 optionally includes memory remotely located with respect to processor 310, which may be connected to the neural network model data processing device of the recommendation system via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 330 may receive input numeric or character information and generate key signal inputs related to the parameter settings and function control of the neural network numerical-feature embedding apparatus. The output device 340 may include a display device such as a display screen.
The one or more modules are stored in the memory 320 that, when executed by the one or more processors 310, perform the method of embedding neural network numerical features in any of the method embodiments described above.
The product can execute the method provided by the embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. Technical details not described in detail in this embodiment may be found in the methods provided in the embodiments of the present invention.
In this specification, each embodiment is described in a progressive manner; for identical and similar parts among the embodiments, reference may be made to one another, and each embodiment mainly describes its differences from the others. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, the description is relatively brief, and reference may be made to the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
It should be noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A neural network model data processing method of a recommendation system, the method comprising the steps of:
s1: acquiring recommended candidate feature data, wherein the feature data comprises category type feature data and numerical type feature data;
s2: obtaining embedding representations of the category type characteristic data through a lookup table, wherein embedding refers to embedding;
s3: setting a window size k and a weighting base b for the numerical characteristic data, and calculating a weighting coefficient m of a main sub-bucket and a weighting coefficient p of an adjacent sub-bucket according to the window size k and the weighting base b;
S4: normalizing and barrel dividing the numerical characteristic data to obtain numerical values in the barrels;
S5: for the numerical value in the sub-bucket, calculating the weighted average value of embedding vectors of the main sub-bucket where the sub-bucket is located and embedding vectors of k left sub-buckets and k right sub-buckets adjacent to the main sub-bucket, wherein the number of the left sub-bucket and the number of the right sub-buckets are the window size, and obtaining embedding representation of the numerical value in the sub-bucket;
S6: splicing embedding representations of the category type feature data obtained in the step S2 and embedding representations of the numerical type feature data obtained in the step S5 to obtain a complete embedding representation of the recommended candidate feature data;
s7: and inputting the complete embedding representation of the recommended candidate feature data into a neural network model of a recommendation system.
2. The neural network model data processing method of a recommendation system according to claim 1, wherein in S3, the weighting coefficient m and the weighting coefficients p are specifically calculated as: let p0 = b; let i traverse 1 to k-1, pi = pi-1 × 0.5; m = 1 - 2 × sum(pi), where sum(pi) represents a summation calculation over the pi.
3. The neural network model data processing method of a recommendation system according to claim 1, wherein in S4, the barrel-dividing process employs equidistant barrel-dividing.
4. The neural network model data processing method of a recommendation system according to claim 1, wherein S5 further comprises, for the value in a sub-bucket, denoting the embedding vector of the main sub-bucket where the value is located as embedding_table[n], where n is the sub-bucket index, and the specific calculation process of the embedding representation of the value in the sub-bucket is as follows:
S51: calculating the weighted embedding of the main sub-bucket: embedding_main = m × embedding_table[n];
S52: calculating the weighted embeddings of the left and right sub-buckets: let i traverse 1 to k, and calculate embedding_left[i-1] = p[i-1] × embedding_table[n-i] for the left sub-buckets and embedding_right[i-1] = p[i-1] × embedding_table[n+i] for the right sub-buckets, respectively;
S53: adding the weighted embedding of the main sub-bucket and the weighted embeddings of the left and right sub-buckets to obtain the embedding representation of the value in the sub-bucket.
5. The neural network model data processing method of a recommendation system according to claim 1, wherein the category characteristic data includes a user ID, a commodity ID, a category ID, a brand ID, and a user behavior type; the numerical feature data includes user portrait features and merchandise portrait features.
6. The neural network model data processing method of a recommendation system according to claim 5, further comprising, after S7:
S8: and outputting interest grading values of the current user on the commodities through neural network model processing, and sequentially sequencing the interest grading values according to the commodity from high to low to form a recommended commodity list for the current user.
7. A neural network model data processing apparatus of a recommendation system, comprising:
the data acquisition module is used for acquiring recommended candidate feature data, wherein the feature data comprises category type feature data and numerical value type feature data;
The category type characteristic data processing module is used for acquiring embedding representations of category type characteristic data through a lookup table, wherein embedding refers to embedding;
the numerical feature data processing module, configured to obtain embedding representations of the numerical feature data, comprising: a parameter setting sub-module, configured to set the window size k and the weighting base b, and to calculate the weighting coefficient m of the main sub-bucket and the weighting coefficients p of the adjacent sub-buckets according to k and b; a data processing sub-module, configured to normalize and bucket the numerical feature data to obtain the values in the sub-buckets; and a calculating sub-module, configured to calculate the embedded value: for the value in a sub-bucket, calculating the weighted average of the embedding vector of the main sub-bucket where the value is located and the embedding vectors of the k left and k right sub-buckets adjacent to the main sub-bucket, where k is the window size, to obtain the embedding representation of the value in the sub-bucket;
a splicing module, configured to splice the embedding representations respectively obtained by the category-type feature data processing module and the numerical feature data processing module to obtain a complete embedding representation of the recommendation candidate feature data;
a neural network model input module, configured to input the complete embedding representation of the recommendation candidate feature data into a neural network model of a recommendation system;
a neural network model output module, configured to output the interest score values of the current user for the commodities processed by the neural network model, and to rank the commodities by interest score value from high to low to form a recommended commodity list for the current user.
8. The neural network model data processing device of the recommendation system according to claim 7, wherein the parameter setting sub-module is specifically configured to: let p0 = b; let i traverse 1 to k-1, pi = pi-1 × 0.5; m = 1 - 2 × sum(pi), where sum(pi) represents a summation calculation over the pi.
9. The neural network model data processing device of the recommendation system according to claim 7, wherein the data processing sub-module uses equally spaced sub-buckets.
10. The neural network model data processing device of the recommendation system of claim 7, wherein the computing sub-module is configured to, when computing the embedded value:
for the value in a sub-bucket, the embedding vector of the main sub-bucket where the value is located is denoted as embedding_table[n], where n is the sub-bucket index, and the specific calculation process of the embedding representation of the value in the sub-bucket is as follows:
calculating the weighted embedding of the main sub-bucket: embedding_main = m × embedding_table[n];
calculating the weighted embeddings of the left and right sub-buckets: let i traverse 1 to k, and calculate embedding_left[i-1] = p[i-1] × embedding_table[n-i] for the left sub-buckets and embedding_right[i-1] = p[i-1] × embedding_table[n+i] for the right sub-buckets, respectively;
adding the weighted embedding of the main sub-bucket and the weighted embeddings of the left and right sub-buckets to obtain the embedding representation of the value in the sub-bucket.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411239007.9A CN118761825B (en) | 2024-09-05 | 2024-09-05 | Neural network model data processing method and device of recommendation system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118761825A CN118761825A (en) | 2024-10-11 |
| CN118761825B true CN118761825B (en) | 2024-11-15 |
Family
ID=92947976
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411239007.9A Active CN118761825B (en) | 2024-09-05 | 2024-09-05 | Neural network model data processing method and device of recommendation system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118761825B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111553759A (en) * | 2020-03-25 | 2020-08-18 | 平安科技(深圳)有限公司 | Product information pushing method, device, equipment and storage medium |
| CN113268633A (en) * | 2021-06-25 | 2021-08-17 | 北京邮电大学 | Short video recommendation method |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11687829B2 (en) * | 2020-04-28 | 2023-06-27 | Optum Services (Ireland) Limited | Artificial intelligence recommendation system |
| KR20230045309A (en) * | 2021-09-28 | 2023-04-04 | 삼성전자주식회사 | Computational storage device for deep-learning recommendation system and method of operating the same |
| CN115689639A (en) * | 2022-08-25 | 2023-02-03 | 江南大学 | Commercial advertisement click rate prediction method based on deep learning |
- 2024-09-05: CN application CN202411239007.9A, patent CN118761825B (en), status: Active
Also Published As
| Publication number | Publication date |
|---|---|
| CN118761825A (en) | 2024-10-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |
| GR01 | Patent grant | | |