CN118537053B - E-commerce data asset management and control method and system based on business center
- Publication number
- CN118537053B, CN202411013650.XA
- Authority
- CN
- China
- Prior art keywords
- commodity
- search
- word
- target
- phrase
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Business, Economics & Management (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Accounting & Taxation (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Entrepreneurship & Innovation (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Game Theory and Decision Science (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of e-commerce data asset management and control, and in particular to an e-commerce data asset management and control method and system based on a business center. The method comprises: acquiring e-commerce data of an e-commerce user by using a business center; obtaining a target phrase set according to a search word set and a search word frequency set, and calculating a target word vector of the target phrase set; obtaining commodity description texts according to a commodity click id set, extracting commodity phrases from the commodity description texts, and calculating commodity word vectors of the commodity phrases; calculating the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vectors and the target word vector; obtaining a commodity intention score set according to the number of logins, the commodity browsing duration and the phrase matching degree; and obtaining an intention commodity id set according to the commodity intention score set, and performing e-commerce data asset management and control based on the intention commodity id set. The invention can reduce the human resource consumption of the e-commerce data asset management and control process and improve its efficiency.
Description
Technical Field
The invention relates to the technical field of e-commerce data asset management and control, and in particular to an e-commerce data asset management and control method, system, electronic device and computer-readable storage medium based on a business center.
Background
As an important component of the digital economy, electronic commerce plays an increasingly important role in today's commodity transactions. The rapid development of e-commerce platforms and the rapid growth in the number of e-commerce users have caused the total volume of e-commerce data assets to grow explosively, which makes it a serious challenge for e-commerce platforms to process e-commerce data in time. The management and control of e-commerce data assets has therefore become particularly important.
At present, e-commerce data asset management and control mainly collects the e-commerce data assets of the users of a single e-commerce platform and manages and controls them by studying and processing that platform's e-commerce data. However, because the existing method performs data asset management and control separately on the e-commerce data of each e-commerce platform, it consumes a great deal of manpower and its management and control efficiency is low.
Disclosure of Invention
The invention provides a business center-based e-commerce data asset management and control method and a computer-readable storage medium, whose main purpose is to reduce the human resource consumption of the e-commerce data asset management and control process and to improve its efficiency.
In order to achieve the above object, the present invention provides a business center-based e-commerce data asset management and control method, which includes:
Acquiring e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set and a commodity browsing duration set;
Classifying the search words in the search word set by search heat according to a preset word frequency threshold and the search word frequency set to obtain a search hotword set and a non-search hotword set;
Sequentially extracting search hotwords from the search hotword set, and calculating the text similarity between the search hotword and each non-search hotword in the non-search hotword set to obtain a text similarity set;
Extracting target similarities larger than a preset similarity threshold from the text similarity set, identifying the target phrases corresponding to the target similarities, summarizing the target phrases to obtain a target phrase set, and calculating a target word vector of the target phrase set;
Sequentially extracting commodity click ids from the commodity click id set, acquiring the commodity description text of each commodity click id, extracting commodity phrases from the commodity description text, and calculating commodity word vectors of the commodity phrases;
Calculating the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vectors and the target word vector;
Sequentially extracting commodity browsing durations from the commodity browsing duration set, and calculating the e-commerce user's commodity intention score for each commodity click id by using a preset commodity intention formula according to the number of logins, the commodity browsing duration and the phrase matching degree, to obtain a commodity intention score set;
Extracting a target intention score set, whose scores are larger than a preset score threshold, from the commodity intention score set, identifying the intention commodity id set corresponding to the target intention score set, and pushing commodities to the e-commerce user according to the intention commodity id set, thereby completing the business center-based management and control of e-commerce data assets.
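The final step above amounts to a threshold filter over the scores. The following is a minimal sketch, assuming the commodity intention scores have already been computed and are held in a mapping from commodity click id to score; the function name `select_intention_ids`, the sample ids and the threshold value are illustrative and do not come from the source.

```python
from typing import Dict, List


def select_intention_ids(scores: Dict[str, float], score_threshold: float) -> List[str]:
    """Return the commodity ids whose intention score exceeds the threshold."""
    # Strictly greater than, matching "larger than a preset score threshold".
    return [item_id for item_id, score in scores.items() if score > score_threshold]


# Hypothetical scores for three clicked commodities.
example_scores = {"sku-1": 0.82, "sku-2": 0.35, "sku-3": 0.60}
intention_ids = select_intention_ids(example_scores, score_threshold=0.6)
print(intention_ids)
```

A score equal to the threshold is excluded here because the text says "larger than"; a deployment that wants to include ties would change the comparison.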
Optionally, the acquiring of the e-commerce data of the e-commerce user by using the pre-constructed business center includes:
Interfacing e-commerce data between the business center and a plurality of pre-constructed e-commerce platforms to obtain omni-channel e-commerce data, wherein the e-commerce data are interfaced through API (application programming interface) docking;
Extracting the search word set, the commodity browsing duration set and the commodity click id set of the e-commerce user within a preset statistical period from the omni-channel e-commerce data;
Counting the search word frequency of each search word in the search word set to obtain the search word frequency set;
Combining the user id, the number of logins, the search word set, the search word frequency set, the commodity click id set and the commodity browsing duration set in key-value pair form to obtain the e-commerce data, wherein the e-commerce data are expressed as:
$$D = \{\,\text{id}: u,\ \text{login}: b,\ \text{words}: Q,\ \text{freq}: F,\ \text{click}: I,\ \text{time}: T\,\};$$
Wherein $D$ represents the e-commerce data, $u$ represents the user id, $b$ represents the number of logins, $Q$ represents the search word set, $F$ represents the search word frequency set, $I$ represents the commodity click id set, and $T$ represents the commodity browsing duration set.
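The counting and combining steps above can be sketched as follows. This is an illustration only: the key names, the function name `build_ecommerce_record` and the sample values are assumptions, and the length check reflects the one-to-one correspondence between commodity click ids and browsing durations stated in the description.

```python
from collections import Counter
from typing import Any, Dict, List


def build_ecommerce_record(
    user_id: str,
    login_count: int,
    search_log: List[str],
    click_ids: List[str],
    browse_seconds: List[float],
) -> Dict[str, Any]:
    """Assemble one user's e-commerce data as key-value pairs."""
    if len(click_ids) != len(browse_seconds):
        raise ValueError("each click id needs exactly one browsing duration")
    # Count how often each search word occurs within the statistical period.
    frequencies = Counter(search_log)
    search_words = list(frequencies)  # unique words in first-seen order
    return {
        "user_id": user_id,
        "login_count": login_count,
        "search_words": search_words,
        "search_word_freqs": [frequencies[word] for word in search_words],
        "click_ids": click_ids,
        "browse_seconds": browse_seconds,
    }


record = build_ecommerce_record(
    "u-001", 5, ["phone", "case", "phone", "charger", "phone"],
    ["sku-1", "sku-2"], [42.0, 8.5],
)
print(record["search_words"], record["search_word_freqs"])
```

Keeping the word list and the frequency list in the same order preserves the one-to-one pairing of search words and search word frequencies.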
Optionally, the classifying the search hotness of the search word set according to the preset word frequency threshold and the search word frequency set to obtain a search hotword set and a non-search hotword set includes:
Sequentially extracting search words from a search word set, and identifying search word frequencies of the search words;
Using search words with the search word frequency being greater than or equal to a word frequency threshold value as search hot words, and using search words with the search word frequency being less than the word frequency threshold value as non-search hot words;
Summarizing the search hot words and the non-search hot words respectively to obtain a search hot word set and a non-search hot word set, wherein the search hot word set and the non-search hot word set are expressed as follows:
$$H = \{h_1, h_2, \dots, h_i, \dots\}, \quad G = \{g_1, g_2, \dots, g_j, \dots\};$$
Wherein $H$ represents the search hotword set, $i$ represents the order of a search hotword in the search hotword set, $h_i$ represents the $i$-th search hotword in the search hotword set, $G$ represents the non-search hotword set, $j$ represents the order of a non-search hotword in the non-search hotword set, and $g_j$ represents the $j$-th non-search hotword in the non-search hotword set.
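The classification above is a single pass over the search words with one comparison per word. A minimal sketch, assuming the search words and their frequencies are available as a word-to-count mapping (the function name `split_by_heat` and the sample data are illustrative):

```python
from typing import Dict, List, Tuple


def split_by_heat(word_freqs: Dict[str, int], freq_threshold: int) -> Tuple[List[str], List[str]]:
    """Split search words into (search hotwords, non-search hotwords)."""
    hot_words: List[str] = []
    non_hot_words: List[str] = []
    for word, freq in word_freqs.items():
        # A frequency greater than or equal to the threshold marks a hotword.
        if freq >= freq_threshold:
            hot_words.append(word)
        else:
            non_hot_words.append(word)
    return hot_words, non_hot_words


hot_words, non_hot_words = split_by_heat(
    {"phone": 3, "case": 1, "charger": 2}, freq_threshold=2
)
print(hot_words, non_hot_words)
```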
Optionally, the calculating the text similarity between the search hotword and each non-search hotword in the non-search hotword set to obtain a text similarity set includes:
Sequentially extracting non-search hot words in a non-search hot word set, and sequentially combining the search hot words with the non-search hot words to obtain paired phrase, wherein the paired phrase is expressed as:
$$p_{ij} = (h_i, g_j);$$
Wherein $p_{ij}$ represents the paired phrase of the $i$-th search hotword $h_i$ and the $j$-th non-search hotword $g_j$;
Calculating the text similarity of the search hotword and the non-search hotword in each paired phrase by using a pre-constructed text recognition model, and summarizing the text similarities to obtain a text similarity set, wherein the text similarity set is expressed as:
$$S = \{s_{ij}\};$$
Wherein $S$ represents the text similarity set, and $s_{ij}$ represents the text similarity of the $i$-th search hotword and the $j$-th non-search hotword.
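The pairing and scoring steps can be sketched as below. The source does not specify the text recognition model, so this sketch swaps in `difflib.SequenceMatcher` from the Python standard library, which measures character-level rather than semantic similarity; the function name `similarity_set` and the sample words are illustrative.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import Dict, List, Tuple


def similarity_set(hot_words: List[str], non_hot_words: List[str]) -> Dict[Tuple[str, str], float]:
    """Score every (search hotword, non-search hotword) paired phrase."""
    # Only hotword/non-hotword pairs are formed; pairs of two non-hotwords
    # are never scored, which is where the computation is saved.
    return {
        (hot, non_hot): SequenceMatcher(None, hot, non_hot).ratio()
        for hot, non_hot in product(hot_words, non_hot_words)
    }


sims = similarity_set(["phone"], ["phone case", "socks"])
for pair, score in sims.items():
    print(pair, round(score, 3))
```

With m hotwords and n non-hotwords this scores m × n pairs; replacing `SequenceMatcher` with a semantic model would only change the expression inside the comprehension.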
Optionally, the calculating of the target word vector of the target phrase set includes:
Constructing a user vocabulary according to the search word set and a pre-constructed comment text set;
counting the vocabulary frequency of each target phrase in the target phrase set in the user vocabulary, and constructing an original word vector according to the vocabulary frequency, wherein the original word vector is expressed as:
$$V_o = (f_1, f_2, \dots, f_k);$$
Wherein $V_o$ represents the original word vector, $f_1$ represents the vocabulary frequency of the 1st target phrase, $k$ represents the total number of vocabulary frequencies, and $f_k$ represents the vocabulary frequency of the $k$-th target phrase;
Acquiring a plurality of historical vocabularies and historical vocabulary numbers of the user id, and identifying target vocabulary numbers containing the target phrase in the plurality of historical vocabularies to obtain a target vocabulary number set;
Constructing a weight word vector according to the historical vocabulary number and the target vocabulary number set, wherein the weight word vector is expressed as:
$$V_w = \left(\lg\frac{M}{m_1},\ \dots,\ \lg\frac{M}{m_k}\right);$$
Wherein $V_w$ represents the weight word vector, $\lg$ represents the logarithmic function with base 10, $M$ represents the historical vocabulary number, $m_1$ represents the target vocabulary number corresponding to the 1st target phrase, and $m_k$ represents the target vocabulary number corresponding to the $k$-th target phrase;
according to the original word vector and the weight word vector, calculating a target word vector by using the following formula:
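$$V_t = V_o \cdot V_w;$$
Wherein $V_t$ represents the target word vector, $V_o$ represents the original word vector, $V_w$ represents the weight word vector, and $\cdot$ represents the inner product sign of the vector.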
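The three steps above resemble a frequency-times-inverse-document-frequency weighting. The sketch below rests on two assumptions that the source does not confirm: each weight is the base-10 logarithm of the historical vocabulary number divided by the target vocabulary number, and the product of the original and weight word vectors is taken element by element so that the result remains a vector. The function name and sample numbers are illustrative.

```python
import math
from typing import List


def target_word_vector(
    phrase_freqs: List[int],
    history_count: int,
    containing_counts: List[int],
) -> List[float]:
    """Weight each target phrase frequency by a log-scaled rarity factor."""
    if len(phrase_freqs) != len(containing_counts):
        raise ValueError("one containing count is needed per target phrase")
    if any(count <= 0 for count in containing_counts):
        raise ValueError("containing counts must be positive")
    # Assumed weight: lg(historical vocabularies / vocabularies containing the phrase).
    weights = [math.log10(history_count / count) for count in containing_counts]
    # Assumed combination: element-wise product of frequency and weight.
    return [freq * weight for freq, weight in zip(phrase_freqs, weights)]


vector = target_word_vector([4, 1], history_count=100, containing_counts=[10, 100])
print(vector)
```

A phrase that appears in every historical vocabulary gets weight zero under this assumption, so only phrases that distinguish the user contribute to the vector.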
Optionally, the constructing a user vocabulary according to the search word set and the pre-constructed comment text set includes:
Sequentially extracting comment texts from the comment text set, and identifying the comment phrases of each comment text by using the text recognition model;
summarizing the comment phrase to obtain a comment word set, and summarizing the search word set and the comment word set to obtain an original vocabulary;
Identifying punctuation marks in the original vocabulary, and deleting the punctuation marks to obtain the user vocabulary.
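The vocabulary construction above can be sketched as follows. Because the text recognition model is not specified in the source, whitespace splitting stands in for comment phrase identification, and only ASCII punctuation from `string.punctuation` is removed; both choices, like the function name, are assumptions of this sketch.

```python
import string
from typing import Iterable, List

# Translation table that deletes every ASCII punctuation character.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def build_user_vocabulary(search_words: Iterable[str], comment_texts: Iterable[str]) -> List[str]:
    """Merge search words and comment phrases, then strip punctuation."""
    # Stand-in for the text recognition model: split comments on whitespace.
    comment_phrases = [token for text in comment_texts for token in text.split()]
    original_vocabulary = list(search_words) + comment_phrases
    cleaned = (entry.translate(_PUNCTUATION_TABLE) for entry in original_vocabulary)
    # Entries that consisted only of punctuation become empty and are dropped.
    return [entry for entry in cleaned if entry]


vocabulary = build_user_vocabulary(
    ["phone", "case"], ["Great phone, fast delivery!", "..."]
)
print(vocabulary)
```

Text containing full-width or other non-ASCII punctuation would need a wider character set than `string.punctuation` provides.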
Optionally, extracting the commodity phrase in the commodity description text includes:
Performing word segmentation operation on the commodity description text to obtain a candidate word set;
Sequentially extracting candidate words from the candidate word set, and counting the number of historical vocabularies that contain the candidate word to obtain the candidate vocabulary number;
calculating the candidate weight value of the candidate word by using the following formula to obtain a candidate weight value set:
;
Wherein the quantities in the formula are the candidate weight value, the number of times the candidate word appears in the candidate word set, the total number of candidate words in the candidate word set, and the candidate vocabulary number;
And identifying candidate weight values larger than a preset weight threshold in the candidate weight value set to obtain a target weight value set, and identifying commodity phrase corresponding to the target weight value set.
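The extraction steps above can be illustrated with a TF-IDF-style weight. The exact formula is not reproduced in this text, so the weight used below (relative frequency in the description multiplied by the base-10 logarithm of the number of historical vocabularies over the candidate vocabulary number) is an assumption, as are whitespace word segmentation, the floor of 1 on the candidate vocabulary number, and all names and sample data.

```python
import math
from collections import Counter
from typing import List, Set


def extract_commodity_phrases(
    description: str,
    historical_vocabularies: List[Set[str]],
    weight_threshold: float,
) -> List[str]:
    """Keep the candidate words whose assumed TF-IDF-style weight exceeds the threshold."""
    candidates = description.lower().split()  # stand-in for word segmentation
    counts = Counter(candidates)
    total = len(candidates)
    phrases = []
    for word, count in counts.items():
        containing = sum(1 for vocab in historical_vocabularies if word in vocab)
        # Floor of 1 avoids division by zero for words never seen before.
        rarity = math.log10(len(historical_vocabularies) / max(containing, 1))
        if (count / total) * rarity > weight_threshold:
            phrases.append(word)
    return phrases


histories = [{"phone", "case"}, {"phone"}, {"charger", "phone"}, {"socks"}]
phrases = extract_commodity_phrases("wireless phone charger phone", histories, 0.1)
print(phrases)
```

In this example "phone" is frequent in the description but also common across the historical vocabularies, so its weight falls below the threshold while the rarer words are kept.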
Optionally, the calculating of the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vector and the target word vector includes:
According to the target word vector and the commodity word vector, calculating the phrase matching degree by using the following formula:
$$P = \frac{V_t \cdot V_c}{\lVert V_t \rVert\,\lVert V_c \rVert};$$
Wherein $P$ represents the phrase matching degree, $V_t$ represents the target word vector, $V_c$ represents the commodity word vector, $\cdot$ represents the inner product sign of the vector, and $\lVert\ \rVert$ represents the norm sign of the vector.
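A matching degree built from an inner product and vector norms is commonly computed as a cosine similarity. The sketch below assumes that form and assumes both vectors have the same length; returning 0.0 for a zero vector is a choice made here, not something the source specifies.

```python
import math
from typing import Sequence


def phrase_matching_degree(target_vec: Sequence[float], commodity_vec: Sequence[float]) -> float:
    """Cosine similarity between a target word vector and a commodity word vector."""
    if len(target_vec) != len(commodity_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(target_vec, commodity_vec))
    norm_target = math.sqrt(sum(a * a for a in target_vec))
    norm_commodity = math.sqrt(sum(b * b for b in commodity_vec))
    if norm_target == 0.0 or norm_commodity == 0.0:
        return 0.0  # no direction to compare against
    return dot / (norm_target * norm_commodity)


print(phrase_matching_degree([3.0, 4.0], [4.0, 3.0]))
```

For non-negative frequency-based vectors the result lies between 0 and 1, with 1 meaning the two vectors point in the same direction.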
Optionally, the calculating of the e-commerce user's commodity intention score for the commodity click id by using a preset commodity intention formula according to the number of logins, the commodity browsing duration and the phrase matching degree, to obtain a commodity intention score set, includes:
calculating commodity intention scores according to the login times, commodity browsing duration and phrase matching degree by using the following formula:
;
Wherein the quantities in the formula are the commodity intention score; $b$, the number of logins; a preset duration coefficient; $t$, the commodity browsing duration; the number of target phrases; $n$, the sequence number of a target phrase; and the vocabulary frequency of the $n$-th target phrase.
And summarizing the commodity intention scores to obtain commodity intention score sets.
In order to achieve the above object, the present invention further provides a business center-based e-commerce data asset management and control system, including:
A data classification module, configured to acquire e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set and a commodity browsing duration set, and to classify the search word set by search heat according to a preset word frequency threshold and the search word frequency set to obtain a search hotword set and a non-search hotword set;
A vector calculation module, configured to sequentially extract search hotwords from the search hotword set, calculate the text similarity between the search hotword and each non-search hotword in the non-search hotword set to obtain a text similarity set, extract target similarities larger than a preset similarity threshold from the text similarity set, identify the target phrases corresponding to the target similarities, summarize the target phrases to obtain a target phrase set, calculate a target word vector of the target phrase set, sequentially extract commodity click ids from the commodity click id set, acquire the commodity description text of each commodity click id, extract commodity phrases from the commodity description text, and calculate commodity word vectors of the commodity phrases;
A score acquisition module, configured to calculate the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vectors and the target word vector, sequentially extract commodity browsing durations from the commodity browsing duration set, and calculate the e-commerce user's commodity intention score for each commodity click id by using a preset commodity intention formula according to the number of logins, the commodity browsing duration and the phrase matching degree, to obtain a commodity intention score set;
A data management and control module, configured to extract a target intention score set, whose scores are larger than a preset score threshold, from the commodity intention score set, identify the intention commodity id set corresponding to the target intention score set, and push commodities to the e-commerce user according to the intention commodity id set.
In order to solve the above-mentioned problems, the present invention also provides an electronic device, including:
a memory storing at least one instruction; and
a processor that executes the instruction stored in the memory to implement the above business center-based e-commerce data asset management and control method.
In order to solve the above-mentioned problems, the present invention further provides a computer-readable storage medium storing at least one instruction, wherein the at least one instruction is executed by a processor in an electronic device to implement the above business center-based e-commerce data asset management and control method.
In order to solve the problems described in the background, the invention uses the business center to acquire the e-commerce data of e-commerce users. This combines a plurality of e-commerce platforms and makes the e-commerce data acquired by different platforms interoperable, which improves the efficiency of e-commerce data asset management and control; it also yields a large amount of e-commerce user data, which is a precondition for the subsequent management and control. The search word set is first classified by search heat according to the word frequency threshold and the search word frequency set to obtain a search hotword set and a non-search hotword set, so that the e-commerce data assets are processed at a finer granularity, a large amount of unnecessary computation is removed, and management and control efficiency is further improved. Search hotwords are then extracted in turn from the search hotword set, and the text similarity between each search hotword and each non-search hotword in the non-search hotword set is calculated to obtain a text similarity set. Because only search hotwords are paired with non-search hotwords, the many similarity calculations between pairs of non-search hotwords are avoided, which further speeds up the computation. Likewise, extracting the target similarities larger than the similarity threshold from the text similarity set refines the e-commerce data further, and identifying and summarizing the corresponding target phrases into a target phrase set prepares the input for the subsequent calculation.
Calculating the target word vector of the target phrase set converts the e-commerce data asset from an abstract business form into a numerical form that can be used for calculation, so that the asset can be reused by other e-commerce platforms and the management and control process is accelerated. Commodity click ids are extracted in turn from the commodity click id set, the commodity description text of each commodity click id is acquired, commodity phrases are extracted from the description text, and commodity word vectors of the commodity phrases are calculated, which likewise converts this part of the e-commerce data asset into numerical form. The phrase matching degree between the target phrase set and the commodity phrases is calculated from the commodity word vectors and the target word vector, which is a precondition for the later calculation of the commodity intention score. Commodity browsing durations are then extracted in turn from the commodity browsing duration set, and the e-commerce user's commodity intention score for each commodity click id is calculated with the preset commodity intention formula according to the number of logins, the commodity browsing duration and the phrase matching degree, yielding a commodity intention score set. The scores grade the user's commodity intention at a finer granularity and provide the basis for the subsequent data management and control. Finally, the target intention score set, whose scores are larger than the preset score threshold, is extracted from the commodity intention score set, the corresponding intention commodity id set is identified, and commodities are pushed to the e-commerce user according to the intention commodity id set, which completes the business center-based management and control of e-commerce data assets. Therefore, the invention can reduce the human resource consumption of the e-commerce data asset management and control process and improve its efficiency.
Drawings
FIG. 1 is a flow chart of a business center-based e-commerce data asset management and control method according to an embodiment of the present invention;
FIG. 2 is a functional block diagram of a business center-based e-commerce data asset management and control system according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device for implementing the business center-based e-commerce data asset management and control method according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides a business center-based e-commerce data asset management and control method. The executing entity of the method includes at least one of a server, a terminal and other electronic devices that can be configured to execute the method provided by the embodiment of the application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, where the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
Referring to FIG. 1, a flow chart of a business center-based e-commerce data asset management and control method according to an embodiment of the invention is shown. In this embodiment, the business center-based e-commerce data asset management and control method includes:
S1, acquiring e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set and a commodity browsing duration set.
It should be explained that the business center refers to an e-commerce platform that converts the core business of the e-commerce platform into reusable data assets, wherein the core business includes the search habits of e-commerce users acquired by the e-commerce platform, and the data assets include the intention commodity data of the e-commerce users.
It can be understood that the user id refers to a character string used by the business center to identify an e-commerce user, and each e-commerce user corresponds to exactly one user id. The number of logins refers to the number of times the same e-commerce user logs in to the e-commerce platform within a preset statistical period. The search word set refers to the set of search words that the same e-commerce user searched for on the e-commerce platform within the statistical period. The search word frequency set refers to the set of the numbers of times each search word in the search word set appears in the search word set within the statistical period, wherein the search word frequencies correspond one-to-one to the search words. The statistical period refers to a length of time set manually according to market changes.
It can be understood that the commodity click id set refers to the set of commodity click ids clicked by the same e-commerce user on the e-commerce platform within the statistical period, wherein a commodity click id refers to the id of a commodity clicked on a commodity page. The commodity browsing duration set refers to the set of browsing durations of the commodities corresponding to the commodity click ids of the same e-commerce user within the statistical period, wherein a commodity browsing duration refers to the time from when the e-commerce user clicks into a commodity page until the user leaves it, and the commodity browsing durations correspond one-to-one to the commodity click ids.
In detail, the acquiring of the e-commerce data of the e-commerce user by using the pre-constructed business center includes:
Interfacing e-commerce data between the business center and a plurality of pre-constructed e-commerce platforms to obtain omni-channel e-commerce data, wherein the e-commerce data are interfaced through API (application programming interface) docking;
Extracting the search word set, the commodity browsing duration set and the commodity click id set of the e-commerce user within a preset statistical period from the omni-channel e-commerce data;
Counting the search word frequency of each search word in the search word set to obtain the search word frequency set;
Combining the user id, the number of logins, the search word set, the search word frequency set, the commodity click id set and the commodity browsing duration set in key-value pair form to obtain the e-commerce data, wherein the e-commerce data are expressed as:
$$D = \{\,\text{id}: u,\ \text{login}: b,\ \text{words}: Q,\ \text{freq}: F,\ \text{click}: I,\ \text{time}: T\,\};$$
Wherein $D$ represents the e-commerce data, $u$ represents the user id, $b$ represents the number of logins, $Q$ represents the search word set, $F$ represents the search word frequency set, $I$ represents the commodity click id set, and $T$ represents the commodity browsing duration set.
It should be explained that an e-commerce platform refers to a website or application that provides online commodity transaction services over the internet. API docking refers to data docking that realizes data exchange and data integration between different platforms through a programming interface, wherein a programming interface refers to software middleware that allows different programs or services to interact without the developer needing to know its internal working principle.
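The idea of docking several platforms into omni-channel data can be sketched as a merge over per-platform fetch functions. Everything named below is hypothetical: the source does not describe the platforms' actual APIs, so in-memory functions stand in for real API clients, and the field names are chosen for illustration only.

```python
from typing import Callable, Dict, List

Record = Dict[str, List]
Fetcher = Callable[[str], Record]

FIELDS = ("search_log", "click_ids", "browse_seconds")


def merge_platform_records(fetchers: Dict[str, Fetcher], user_id: str) -> Record:
    """Concatenate one user's records from every platform into omni-channel data."""
    merged: Record = {field: [] for field in FIELDS}
    for fetch in fetchers.values():
        record = fetch(user_id)
        for field in FIELDS:
            # A platform that lacks a field simply contributes nothing to it.
            merged[field].extend(record.get(field, []))
    return merged


# In-memory stand-ins for real platform API clients (hypothetical data).
def fetch_shop_a(user_id: str) -> Record:
    return {"search_log": ["phone"], "click_ids": ["a-1"], "browse_seconds": [30.0]}


def fetch_shop_b(user_id: str) -> Record:
    return {"search_log": ["phone", "case"], "click_ids": ["b-7"], "browse_seconds": [12.5]}


omni = merge_platform_records({"shop_a": fetch_shop_a, "shop_b": fetch_shop_b}, "u-001")
print(omni)
```

Hiding each platform behind the same callable signature keeps the merge step independent of how any one platform exposes its data.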
S2, classifying the search word set by search heat according to a preset word frequency threshold and the search word frequency set to obtain a search hotword set and a non-search hotword set.
It should be explained that the word frequency threshold refers to a manually set word frequency that is used to distinguish the search heat of search words. The search hotword set refers to the set of search hotwords, wherein a search hotword is a search word whose search word frequency is greater than or equal to the word frequency threshold. The non-search hotword set refers to the set of non-search hotwords, wherein a non-search hotword is a search word whose search word frequency is less than the word frequency threshold.
In detail, the classifying the search word set according to the preset word frequency threshold and the search word frequency set to obtain a search hot word set and a non-search hot word set includes:
Sequentially extracting search words from a search word set, and identifying search word frequencies of the search words;
Using search words with the search word frequency being greater than or equal to a word frequency threshold value as search hot words, and using search words with the search word frequency being less than the word frequency threshold value as non-search hot words;
Summarizing the search hot words and the non-search hot words respectively to obtain a search hot word set and a non-search hot word set, wherein the search hot word set and the non-search hot word set are expressed as follows:
$$H = \{h_1, h_2, \dots, h_i, \dots\}, \qquad G = \{g_1, g_2, \dots, g_j, \dots\}$$
Wherein $H$ represents the search hotword set, $i$ represents the order of a search hotword in the search hotword set, $h_i$ represents the i-th search hotword in the search hotword set, $G$ represents the non-search hotword set, $j$ represents the order of a non-search hotword in the non-search hotword set, and $g_j$ represents the j-th non-search hotword in the non-search hotword set.
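The classification in step S2 can be illustrated with a short sketch. The function name and the sample words, frequencies and threshold are made up for illustration; only the comparison rule (greater than or equal to the threshold means hot word) comes from the text.

```python
def split_by_frequency(search_words, search_word_freqs, freq_threshold):
    """Split search words into (hot words, non-hot words) by word frequency."""
    hot_words, non_hot_words = [], []
    for word, freq in zip(search_words, search_word_freqs):
        # A frequency equal to the threshold still counts as a search hot word.
        if freq >= freq_threshold:
            hot_words.append(word)
        else:
            non_hot_words.append(word)
    return hot_words, non_hot_words


hot, non_hot = split_by_frequency(
    ["cell phone", "cell phone case", "bread"], [12, 3, 8], freq_threshold=8
)
```

Keeping the two result lists in the original extraction order preserves the i-th and j-th positions that the later steps refer to.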
S3, sequentially extracting search hot words from the search hot word set, and calculating the text similarity between each search hot word and each non-search hot word in the non-search hot word set to obtain a text similarity set.
The text similarity refers to the semantic similarity between the search hot word and the non-search hot word in a paired phrase, wherein a paired phrase consists of one search hot word and one non-search hot word.
In detail, the calculating the text similarity between the search hotword and each non-search hotword in the non-search hotword set to obtain a text similarity set includes:
Sequentially extracting non-search hot words from the non-search hot word set, and sequentially combining the search hot word with each non-search hot word to obtain paired phrases, wherein a paired phrase is expressed as:
$$P_{ij} = (h_i, g_j)$$
Wherein $P_{ij}$ represents the paired phrase of the i-th search hot word $h_i$ and the j-th non-search hot word $g_j$;
Calculating the text similarity between the search hot word and the non-search hot word in each paired phrase by using a pre-constructed text recognition model, and summarizing the text similarities to obtain a text similarity set, wherein the text similarity set is expressed as:
$$S = \{s_{ij}\}$$
Wherein $S$ represents the text similarity set, and $s_{ij}$ represents the text similarity between the i-th search hot word and the j-th non-search hot word.
It can be appreciated that the text recognition model refers to a model built with natural language processing technology, for example an ELMo model, that can be used to recognize the semantic similarity between a search hot word and a non-search hot word. Natural language processing is a mature and widely used technology and is not described further here.
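The pairing and scoring in step S3 can be sketched as below. The similarity function is passed in as a parameter because the text only names a pre-constructed text recognition model; the `containment_similarity` function used here is a toy stand-in and is not a semantic model such as ELMo. All names are hypothetical.

```python
from itertools import product


def containment_similarity(hot_word, non_hot_word):
    """Toy placeholder: 1.0 if one word contains the other, else 0.0.

    A real system would call the text recognition model here instead.
    """
    if hot_word in non_hot_word or non_hot_word in hot_word:
        return 1.0
    return 0.0


def text_similarity_set(hot_words, non_hot_words,
                        similarity=containment_similarity):
    """Score every (search hot word, non-search hot word) paired phrase."""
    return {
        (hot, non_hot): similarity(hot, non_hot)
        for hot, non_hot in product(hot_words, non_hot_words)
    }


similarities = text_similarity_set(
    ["cell phone", "bread"], ["cell phone case", "strawberry bread"]
)
```

Only hot-to-non-hot pairs are scored, so the number of model calls is the product of the two set sizes rather than the square of the full search word set.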
S4, extracting target similarity larger than a preset similarity threshold value from the text similarity set, identifying target phrases corresponding to the target similarity, summarizing the target phrases to obtain a target word set, and calculating target word vectors of the target word set.
It should be explained that the similarity threshold refers to a manually set similarity value that is used to filter the text similarities in the text similarity set so as to obtain the target similarities. A target similarity refers to a text similarity in the text similarity set that is greater than the similarity threshold. A target phrase refers to the paired phrase corresponding to a target similarity.
It should be explained that, when the target phrases are summarized into the target phrase set, the search hot word and the non-search hot word of each target phrase no longer exist in phrase form but as separate individual words. The target word vector refers to a vector representation of the target phrase set.
Illustratively, suppose there are two target phrases: [cell phone, cell phone case] and [bread, strawberry bread]. Summarizing these target phrases yields the target phrase set {cell phone, cell phone case, bread, strawberry bread}.
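A minimal sketch of the filtering and summarizing in step S4 follows. The similarity values and the threshold are invented for illustration, and the function names are hypothetical; the strict "greater than" comparison and the flattening of phrases into individual words follow the text.

```python
def select_target_phrases(similarities, similarity_threshold):
    """Keep the paired phrases whose similarity exceeds the threshold."""
    return [
        pair for pair, score in similarities.items()
        if score > similarity_threshold
    ]


def summarize_target_phrases(target_phrases):
    """Flatten target phrases into individual words, keeping first-seen order."""
    words = []
    for hot_word, non_hot_word in target_phrases:
        for word in (hot_word, non_hot_word):
            if word not in words:
                words.append(word)
    return words


# Illustrative similarity values only; a real run would use model output.
example_similarities = {
    ("cell phone", "cell phone case"): 0.9,
    ("cell phone", "strawberry bread"): 0.1,
    ("bread", "cell phone case"): 0.05,
    ("bread", "strawberry bread"): 0.8,
}
target_phrases = select_target_phrases(example_similarities, 0.5)
target_words = summarize_target_phrases(target_phrases)
```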
In detail, the calculating the target word vector of the target word group set includes:
Constructing a user vocabulary according to the search word set and the pre-constructed comment text set;
counting the vocabulary frequency of each target phrase in the target phrase set in the user vocabulary, and constructing an original word vector according to the vocabulary frequencies, wherein the original word vector is expressed as:
$$V = (f_1, f_2, \dots, f_k)$$
Wherein $V$ represents the original word vector, $f_1$ represents the vocabulary frequency of the 1st target phrase, $k$ represents the total number of vocabulary frequencies, and $f_k$ represents the vocabulary frequency of the k-th target phrase;
Acquiring a plurality of historical vocabularies and historical vocabulary numbers of the user id, and identifying target vocabulary numbers containing the target phrase in the plurality of historical vocabularies to obtain a target vocabulary number set;
Constructing a weight word vector according to the historical vocabulary number and the target vocabulary number set, wherein the weight word vector is expressed as:
$$W = \left(\lg\frac{N}{n_1}, \lg\frac{N}{n_2}, \dots, \lg\frac{N}{n_k}\right)$$
Wherein $W$ represents the weight word vector, $\lg$ represents the logarithmic function with base 10, $N$ represents the historical vocabulary number, $n_1$ represents the target vocabulary number corresponding to the 1st target phrase, and $n_k$ represents the target vocabulary number corresponding to the k-th target phrase;
according to the original word vector and the weight word vector, calculating the target word vector by using the following formula:
$$T = V \odot W$$
Wherein $T$ represents the target word vector, $V$ represents the original word vector, $W$ represents the weight word vector, and $\odot$ represents the product sign of the vectors, applied component by component so that $T$ is again a vector with one component per target phrase.
The comment text set refers to the set of comment texts of the same e-commerce user within the statistical period, wherein a comment text refers to the text of a comment that the e-commerce user posts on the e-commerce platform after purchasing a commodity. The user vocabulary refers to a vocabulary constructed from the text keywords identified in the comment text set together with the search word set, and the text keywords in the user vocabulary may repeat. The vocabulary frequency of a target phrase refers to the sum of the numbers of occurrences of its search hot word and its non-search hot word in the user vocabulary.
Illustratively, suppose a user vocabulary is {one, cell phone case, good looking, my, cell phone, collocation, bread, good eating, but like, strawberry bread, unfortunately, best, chocolate, bread, however, chocolate, bread, good eating} and a target phrase set is {cell phone, cell phone case, bread, strawberry bread}, summarized from the two target phrases [cell phone, cell phone case] and [bread, strawberry bread]. As described above, once the target phrases are summarized into the target phrase set, the search hot words and non-search hot words exist as individual words rather than as phrases, so the vocabulary frequencies for this target phrase set are 1+1 for [cell phone, cell phone case] and 3+1 for [bread, strawberry bread].
It can be understood that the original word vector refers to the vector constructed for the target phrase set according to the vocabulary frequencies. Common conjunctions account for a large proportion of the original word vector because a large number of them appear in the acquired user vocabulary, for example "possibly", "of course" and "also". These common conjunctions carry no useful commodity information, so a weight word vector is introduced to reduce their proportion in the target word vector.
It is understood that a historical vocabulary refers to a user vocabulary of the e-commerce user from before the statistical period. The target vocabulary number of a target phrase refers to the number of historical vocabularies that contain both its search hot word and its non-search hot word. The weight word vector is used for reducing the weight of the common conjunctions of the historical vocabularies in the calculation of the target word vector.
Illustratively, for the target phrase set {cell phone, cell phone case, bread, strawberry bread}, 5 historical vocabularies are obtained. Two of them contain both "cell phone" and "cell phone case", and one contains both "bread" and "strawberry bread", so the target vocabulary number of [cell phone, cell phone case] is 2, the target vocabulary number of [bread, strawberry bread] is 1, and the weight word vector is $\left(\lg\frac{5}{2}, \lg\frac{5}{1}\right)$.
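The target word vector calculation can be sketched with the numbers of the examples above. The sketch assumes the weight of a target phrase is the base-10 logarithm of the historical vocabulary number divided by its target vocabulary number, and that the target word vector is the component-wise product of the two vectors; both are readings of the description rather than confirmed formulas. The five historical vocabularies are invented so that they reproduce the counts 2 and 1 from the example, and a weight of 0.0 for a phrase found in no historical vocabulary is an added assumption.

```python
import math
from collections import Counter


def original_word_vector(target_phrases, user_vocabulary):
    """Vocabulary frequency per target phrase: hot count + non-hot count."""
    counts = Counter(user_vocabulary)
    return [counts[hot] + counts[non_hot] for hot, non_hot in target_phrases]


def weight_word_vector(target_phrases, historical_vocabularies):
    """Assumed weight: log10(historical vocabulary number / target vocabulary number)."""
    total = len(historical_vocabularies)
    weights = []
    for hot, non_hot in target_phrases:
        containing = sum(
            1 for vocab in historical_vocabularies
            if hot in vocab and non_hot in vocab
        )
        # Assumption: a phrase seen in no historical vocabulary gets weight 0.0.
        weights.append(math.log10(total / containing) if containing else 0.0)
    return weights


phrases = [("cell phone", "cell phone case"), ("bread", "strawberry bread")]
user_vocabulary = [
    "one", "cell phone case", "good looking", "my", "cell phone",
    "collocation", "bread", "good eating", "but like", "strawberry bread",
    "unfortunately", "best", "chocolate", "bread", "however", "chocolate",
    "bread", "good eating",
]
history = [
    ["cell phone", "cell phone case"],
    ["cell phone", "cell phone case", "bread"],
    ["bread", "strawberry bread"],
    ["chocolate"],
    ["cell phone"],
]

original = original_word_vector(phrases, user_vocabulary)  # [2, 4]
weights = weight_word_vector(phrases, history)
target_vector = [f * w for f, w in zip(original, weights)]
```

Because the weight shrinks toward zero for words that appear in almost every historical vocabulary, frequent but uninformative words contribute little to the target word vector.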
In detail, the step of constructing a user vocabulary according to the search word set and the pre-constructed comment text set includes:
Sequentially extracting comment texts from the comment text set, and identifying comment phrase of the comment texts by using a text identification model;
summarizing the comment phrase to obtain a comment word set, and summarizing the search word set and the comment word set to obtain an original vocabulary;
Identifying punctuation marks in the original vocabulary, and deleting the punctuation marks to obtain the user vocabulary.
It is understood that a comment phrase refers to the set of comment words in one comment text, where the comment words may include punctuation marks. The comment word set refers to the set of all comment phrases. The original vocabulary refers to the union of the search word set and the comment word set.
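The three steps above can be sketched as follows. Whitespace splitting is used here only as a placeholder for the text recognition model that identifies comment phrases, and the function name and sample comment are hypothetical.

```python
import string


def build_user_vocabulary(search_words, comment_texts, tokenize=str.split):
    """Merge search words with comment words, then drop punctuation marks."""
    comment_words = []
    for text in comment_texts:
        # Placeholder tokenizer; the source uses a text recognition model here.
        comment_words.extend(tokenize(text))
    original_vocabulary = list(search_words) + comment_words
    user_vocabulary = []
    for token in original_vocabulary:
        # Strip punctuation at the token edges; pure punctuation becomes empty.
        cleaned = token.strip(string.punctuation)
        if cleaned:
            user_vocabulary.append(cleaned)
    return user_vocabulary


vocabulary = build_user_vocabulary(["bread"], ["Nice bread , tasty !"])
```

Repeated words are kept on purpose, since the later frequency counts rely on how often each word occurs in the user vocabulary.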
S5, sequentially extracting commodity click ids from the commodity click id set, acquiring the commodity description text of each commodity click id, extracting commodity phrases from the commodity description text, and calculating commodity word vectors of the commodity phrases.
It should be explained that the commodity description text refers to the descriptive text that appears on the commodity page of the commodity click id, including the commodity name, commodity address, commodity function and so on. The commodity phrase refers to the collection of commodity words appearing in the commodity description text. The commodity word vector refers to a vector representation of the commodity phrase.
In detail, the extracting commodity phrases from the commodity description text includes:
Performing word segmentation operation on the commodity description text to obtain a candidate word set;
sequentially extracting candidate words from the candidate word set, and counting the number of the historical vocabularies containing the candidate words to obtain the number of the candidate vocabularies;
calculating the candidate weight value of each candidate word by using the following formula to obtain a candidate weight value set:
$$\omega = \frac{c}{z} \cdot \lg\frac{N}{d}$$
Wherein $\omega$ represents the candidate weight value, $c$ represents the number of times the candidate word appears in the candidate word set, $z$ represents the total number of candidate words in the candidate word set, $d$ represents the candidate vocabulary number, $N$ represents the historical vocabulary number, and $\lg$ represents the logarithmic function with base 10;
And identifying candidate weight values larger than a preset weight threshold in the candidate weight value set to obtain a target weight value set, and identifying commodity phrase corresponding to the target weight value set.
It is understood that the word segmentation operation refers to the process of dividing the commodity description text into a plurality of candidate words. A candidate weight value refers to a numerical value that represents the importance of a candidate word in the commodity description text, and each candidate weight value corresponds to one candidate word. The candidate weight value set refers to the set of all candidate weight values. The weight threshold refers to a manually set numerical value used for screening the candidate weight values.
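The candidate weighting can be sketched as a term-frequency times inverse-document-frequency calculation, which is how the symbol definitions above read; the exact form of the original formula is an assumption. The function names, the sample data and the weight of 0.0 for a word found in no historical vocabulary are hypothetical.

```python
import math
from collections import Counter


def candidate_weights(candidate_words, historical_vocabularies):
    """Assumed weight: (count / total candidates) * log10(N / candidate vocabulary number)."""
    counts = Counter(candidate_words)
    total_candidates = len(candidate_words)
    total_history = len(historical_vocabularies)
    weights = {}
    for word, count in counts.items():
        containing = sum(1 for vocab in historical_vocabularies if word in vocab)
        # Assumption: a word absent from every historical vocabulary gets 0.0.
        idf = math.log10(total_history / containing) if containing else 0.0
        weights[word] = (count / total_candidates) * idf
    return weights


def select_commodity_words(weights, weight_threshold):
    """Keep the candidate words whose weight exceeds the threshold."""
    return [word for word, value in weights.items() if value > weight_threshold]


candidates = ["bread", "strawberry", "bread", "the"]
history = [["the", "bread"], ["the"], ["the", "strawberry"], ["the"]]
weights = candidate_weights(candidates, history)
commodity_words = select_commodity_words(weights, weight_threshold=0.1)
```

In this sample the word "the" occurs in every historical vocabulary, so its weight is zero and it is screened out.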
In detail, the calculating the commodity word vector of the commodity phrase includes:
Counting the commodity frequency of each commodity word in the commodity word group in the user vocabulary, and identifying the commodity vocabulary number containing the commodity word in the plurality of historical vocabularies;
Calculating an original commodity word vector according to the commodity frequencies, wherein the original commodity word vector is expressed as:
$$V_c = (e_1, e_2, \dots, e_p)$$
Wherein $V_c$ represents the original commodity word vector, $e_1$ represents the commodity frequency of the 1st commodity word, $p$ represents the total number of commodity frequencies, and $e_p$ represents the commodity frequency of the p-th commodity word;
Obtaining a commodity weight word vector according to the historical vocabulary number and the commodity vocabulary numbers, wherein the commodity weight word vector is expressed as:
$$W_c = \left(\lg\frac{N}{m_1}, \lg\frac{N}{m_2}, \dots, \lg\frac{N}{m_p}\right)$$
Wherein $W_c$ represents the commodity weight word vector, $\lg$ represents the logarithmic function with base 10, $N$ represents the historical vocabulary number, $m_1$ represents the commodity vocabulary number corresponding to the 1st commodity word, and $m_p$ represents the commodity vocabulary number corresponding to the p-th commodity word;
Calculating the commodity word vector according to the original commodity word vector and the commodity weight word vector:
$$C = V_c \odot W_c$$
Wherein $C$ represents the commodity word vector, and $\odot$ represents the product sign of the vectors, applied component by component.
It is understood that the commodity frequency refers to the frequency with which a commodity word appears in the user vocabulary. The original commodity word vector refers to the vector constructed for the commodity phrase according to the commodity frequencies. The commodity weight word vector is used for reducing the weight of the common conjunctions of the historical vocabularies in the calculation of the commodity word vector.
S6, calculating the phrase matching degree between the target phrase set and the commodity phrase according to the commodity word vector and the target word vector.
The phrase matching degree refers to the degree of matching between the target phrase set and the commodity phrase, and can be used for adjusting the calculation weight of the related search phrase set in the subsequent calculation.
In detail, the calculating the phrase matching degree between the target phrase set and the commodity phrase according to the commodity word vector and the target word vector includes:
According to the target word vector and the commodity word vector, calculating the phrase matching degree by using the following formula:
$$M = \frac{T \cdot C}{\|T\| \, \|C\|}$$
Wherein $M$ represents the phrase matching degree, $T$ represents the target word vector, $C$ represents the commodity word vector, $\cdot$ represents the inner product of the two vectors, and $\|\cdot\|$ represents the norm sign of a vector.
It can be understood that the target word vector and the commodity word vector must have the same length in the calculation, so when their lengths differ, the shorter vector is padded with 0 elements until the lengths are equal.
Illustratively, suppose a target word vector is (2, 1, 3, 4, 6, 7, 8, 0, 1) and a commodity word vector is (3, 1, 1, 5, 2, 6). The commodity word vector is shorter than the target word vector, so 0 is appended to the end of the commodity word vector until its length equals that of the target word vector, giving the padded commodity word vector (3, 1, 1, 5, 2, 6, 0, 0, 0). The phrase matching degree is then calculated from the target word vector and the padded commodity word vector.
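The padding and matching calculation can be sketched as below, assuming the phrase matching degree is the cosine of the angle between the two vectors (inner product divided by the product of the norms), which is how the description of the formula reads. The function names are hypothetical, and returning 0.0 for an all-zero vector is an added assumption.

```python
import math


def pad_to_same_length(first, second):
    """Append 0 elements to the shorter vector until both lengths match."""
    length = max(len(first), len(second))
    padded_first = list(first) + [0] * (length - len(first))
    padded_second = list(second) + [0] * (length - len(second))
    return padded_first, padded_second


def phrase_matching_degree(target_vector, commodity_vector):
    """Cosine similarity of the two vectors after zero padding."""
    t, c = pad_to_same_length(target_vector, commodity_vector)
    inner = sum(x * y for x, y in zip(t, c))
    norms = math.sqrt(sum(x * x for x in t)) * math.sqrt(sum(y * y for y in c))
    # Assumption: an all-zero vector yields a matching degree of 0.0.
    return inner / norms if norms else 0.0


# Vectors from the example above: inner product 84, squared norms 180 and 76.
degree = phrase_matching_degree([2, 1, 3, 4, 6, 7, 8, 0, 1], [3, 1, 1, 5, 2, 6])
```

Under this cosine reading, the example vectors give a phrase matching degree of about 0.718.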
S7, sequentially extracting commodity browsing durations from the commodity browsing duration set, and calculating the commodity intention score of the e-commerce user for each commodity click id by using a preset commodity intention formula according to the login times, the commodity browsing duration and the phrase matching degree to obtain a commodity intention score set.
It should be explained that the commodity intention score refers to the purchase intention score of the e-commerce user for the commodity corresponding to the commodity click id.
In detail, the calculating the commodity intention score of the e-commerce user for the commodity click id by using the preset commodity intention formula to obtain the commodity intention score set includes:
calculating commodity intention scores according to the login times, commodity browsing duration and phrase matching degree by using the following formula:
;
Wherein b represents the number of logins, t represents the commodity browsing duration, and n represents the sequence number of the target phrase; the remaining symbols of the formula denote the commodity intention score, the preset duration coefficient, the number of target phrases and the vocabulary frequency of the n-th target phrase.
And summarizing the commodity intention scores to obtain commodity intention score sets.
It can be appreciated that the target frequency refers to the number of times a target phrase appears in the search word set. The duration coefficient refers to a manually set numerical value that represents the weight of the commodity browsing duration in the calculation of the commodity intention score.
For example, if a user logs in 5 times within a statistical period, the number of logins is 5. With a set duration coefficient of 0.8, a commodity browsing duration of 10 min for a given commodity, a phrase matching degree of 0.6 and target frequencies of 1, 2, 3, 2 and 4 for the target phrases, the commodity intention score of the commodity click id is obtained by substituting these values into the commodity intention formula.
S8, extracting a target intention score set greater than a preset score threshold from the commodity intention score set, identifying the intention commodity id set corresponding to the target intention score set, and pushing commodities to the e-commerce user according to the intention commodity id set, thereby completing the management and control of e-commerce data assets based on the business center.
It should be explained that the score threshold refers to a manually set score value used for screening the commodity intention score set. When the commodity intention score of the e-commerce user for a commodity click id is greater than the score threshold, the e-commerce user is considered to have a purchase intention for the commodity corresponding to that commodity click id. A target intention score refers to a commodity intention score in the commodity intention score set that is greater than the score threshold.
It is understood that the intent commodity id set refers to a set of all intent commodity ids.
In detail, the identifying the intent commodity id set corresponding to the target intent score set includes:
Sequentially extracting target intention scores from a target intention score set, identifying commodity click ids corresponding to the target intention scores, taking the commodity click ids corresponding to the target intention scores as intention commodity ids, and summarizing the intention commodity ids to obtain an intention commodity id set.
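The screening in step S8 can be sketched as a simple threshold filter. The function name, the commodity ids and the score values are made up for illustration; the scores themselves would come from the commodity intention formula, which is not reproduced here.

```python
def intent_commodity_ids(scores_by_click_id, score_threshold):
    """Return the commodity click ids whose intention score exceeds the threshold."""
    return [
        click_id
        for click_id, score in scores_by_click_id.items()
        if score > score_threshold
    ]


# Hypothetical scores keyed by commodity click id.
scores = {"sku-1": 7.2, "sku-2": 3.1, "sku-3": 9.0}
to_push = intent_commodity_ids(scores, score_threshold=5.0)
```

Keying the scores by commodity click id keeps the link between each target intention score and the commodity to be pushed.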
In order to solve the problems described in the background art, the invention uses the business center to acquire the e-commerce data of e-commerce users, which combines a plurality of e-commerce platforms and allows the e-commerce data collected on different platforms to be shared. This improves the efficiency of e-commerce data asset management and control, yields a large amount of e-commerce data, and provides the precondition for subsequent management and control. The search word set is first classified by search heat according to the word frequency threshold and the search word frequency set to obtain the search hotword set and the non-search hotword set, which makes the processing of the e-commerce data assets more refined, removes a large amount of unnecessary calculation and further improves management and control efficiency. Search hotwords are then extracted in turn from the search hotword set, and the text similarity between each search hotword and each non-search hotword in the non-search hotword set is calculated to obtain the text similarity set. Pairing only search hotwords with non-search hotwords eliminates the many similarity calculations between pairs of non-search hotwords and further speeds up the calculation. Similarly, extracting the target similarities greater than the similarity threshold from the text similarity set further refines the e-commerce data, and identifying and summarizing the corresponding target phrases into a target phrase set provides the conditions for subsequent calculation.
Calculating the target word vector of the target phrase set converts the e-commerce data asset from an abstract business form into a numerical form that can be used for calculation, so that it can be reused by other e-commerce platforms, which speeds up the management and control process. Commodity click ids are extracted in turn from the commodity click id set, the commodity description text of each commodity click id is acquired, commodity phrases are extracted from the commodity description text and their commodity word vectors are calculated, which likewise converts this part of the e-commerce data asset into numerical form. The phrase matching degree between the target phrase set and the commodity phrase is calculated from the commodity word vector and the target word vector and is used in the later calculation of the commodity intention score. Commodity browsing durations are extracted in turn from the commodity browsing duration set, and the commodity intention score of the e-commerce user for each commodity click id is calculated with the preset commodity intention formula from the login times, the commodity browsing duration and the phrase matching degree to obtain the commodity intention score set, which grades purchase intention more finely. Finally, the target intention score set greater than the preset score threshold is extracted from the commodity intention score set, the corresponding intention commodity id set is identified, and commodities are pushed to the e-commerce user according to the intention commodity id set. Therefore, the invention can reduce the human resource consumption of the e-commerce data asset management and control process and improve the efficiency of e-commerce data asset management and control.
FIG. 2 is a functional block diagram of a business center based electronic commerce data asset management and control system according to an embodiment of the present invention.
The electronic commerce data asset management and control system 100 based on the business center can be installed in electronic equipment. Depending on the implementation, the business center-based e-commerce data asset management system 100 may include a data classification module 101, a vector calculation module 102, a score acquisition module 103, and a data management module 104. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
The data classification module 101 is configured to obtain electronic commerce data of an electronic commerce user by using a pre-constructed service center, where the electronic commerce data includes: user id, login times, a search word set, a search word frequency set, a commodity click id set and a commodity browsing duration set, and classifying search heat of the search word set according to a preset word frequency threshold and the search word frequency set to obtain a search hot word set and a non-search hot word set;
The vector calculation module 102 is configured to sequentially extract search hotwords from the search hotword set, calculate the text similarity between each search hotword and each non-search hotword in the non-search hotword set to obtain a text similarity set, extract target similarities greater than a preset similarity threshold from the text similarity set, identify the target phrases corresponding to the target similarities, summarize the target phrases to obtain a target phrase set, calculate the target word vector of the target phrase set, sequentially extract commodity click ids from the commodity click id set, acquire the commodity description text of each commodity click id, extract commodity phrases from the commodity description text, and calculate the commodity word vectors of the commodity phrases;
The score acquisition module 103 is configured to calculate the phrase matching degree between the target phrase set and the commodity phrase according to the commodity word vector and the target word vector, sequentially extract commodity browsing durations from the commodity browsing duration set, and calculate the commodity intention score of the e-commerce user for each commodity click id by using a preset commodity intention formula according to the login times, the commodity browsing duration and the phrase matching degree to obtain a commodity intention score set;
The data management and control module 104 is configured to extract a target intention score set greater than a preset score threshold from the commodity intention score set, identify the intention commodity id set corresponding to the target intention score set, and push commodities to the e-commerce user according to the intention commodity id set.
In detail, the modules in the business center-based electronic commerce data asset management and control system 100 in the embodiment of the present invention use the same technical means as the business center-based electronic commerce data asset management and control method described in fig. 1, and can produce the same technical effects, which are not described herein.
Fig. 3 is a schematic structural diagram of an electronic device for implementing a business center-based electronic commerce data asset management method according to an embodiment of the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus 12, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a program of the e-commerce data asset management and control method based on a business center.
The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the electronic device 1. Further, the memory 11 may comprise both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only for storing application software installed in the electronic device 1 and various types of data, such as the code of the program of the e-commerce data asset management and control method based on a business center, but also for temporarily storing data that has been output or is to be output.
The processor 10 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or a plurality of packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit (Control Unit) of the electronic device; it connects the components of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (for example, the program of the e-commerce data asset management and control method based on a business center), and invokes the data stored in the memory 11 to perform the various functions of the electronic device 1 and process data.
The bus 12 may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus 12 may be divided into an address bus, a data bus, a control bus, etc. The bus 12 is arranged to enable connection and communication between the memory 11, the at least one processor 10 and other components.
Fig. 3 shows only an electronic device with certain components. It will be understood by a person skilled in the art that the structure shown in Fig. 3 does not limit the electronic device 1, which may comprise fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component. Preferably, the power source is logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power source may also include one or more direct current or alternating current power supplies, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and other such components. The electronic device 1 may further include various sensors, a Bluetooth module, a Wi-Fi module, etc., which are not described herein.
Further, the electronic device 1 may also comprise a network interface. Optionally, the network interface may comprise a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), and is typically used for establishing a communication connection between the electronic device 1 and other electronic devices.
The electronic device 1 may optionally further comprise a user interface, which may be a display (Display) or an input unit such as a keyboard (Keyboard), and may optionally be a standard wired interface or a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying the information processed in the electronic device 1 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The business-center-based e-commerce data asset management and control program stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions which, when executed by the processor 10, can implement:
acquiring e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set, and a commodity browsing duration set;
classifying the search word set by search popularity according to a preset word frequency threshold and the search word frequency set, to obtain a search hotword set and a non-search hotword set;
sequentially extracting search hotwords from the search hotword set, and calculating the text similarity between each search hotword and each non-search hotword in the non-search hotword set, to obtain a text similarity set;
extracting, from the text similarity set, target similarities greater than a preset similarity threshold, identifying the target phrases corresponding to the target similarities, aggregating the target phrases to obtain a target phrase set, and calculating a target word vector of the target phrase set;
sequentially extracting commodity click ids from the commodity click id set, acquiring the commodity description text of each commodity click id, extracting commodity phrases from the commodity description text, and calculating a commodity word vector of the commodity phrases;
calculating the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vector and the target word vector;
sequentially extracting commodity browsing durations from the commodity browsing duration set, and calculating, by using a preset commodity intention formula, a commodity intention score of the e-commerce user for each commodity click id according to the number of logins, the commodity browsing duration, and the phrase matching degree, to obtain a commodity intention score set;
and extracting, from the commodity intention score set, a target intention score set greater than a preset score threshold, identifying the intended commodity id set corresponding to the target intention score set, and pushing commodities to the e-commerce user according to the intended commodity id set, thereby completing the business-center-based management and control of e-commerce data assets.
Specifically, for the implementation of the above instructions by the processor 10, reference may be made to the descriptions of the related steps in the embodiments corresponding to Figs. 1 to 3, which are not repeated here.
Further, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or nonvolatile. For example, the computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
acquiring e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set, and a commodity browsing duration set;
classifying the search word set by search popularity according to a preset word frequency threshold and the search word frequency set, to obtain a search hotword set and a non-search hotword set;
sequentially extracting search hotwords from the search hotword set, and calculating the text similarity between each search hotword and each non-search hotword in the non-search hotword set, to obtain a text similarity set;
extracting, from the text similarity set, target similarities greater than a preset similarity threshold, identifying the target phrases corresponding to the target similarities, aggregating the target phrases to obtain a target phrase set, and calculating a target word vector of the target phrase set;
sequentially extracting commodity click ids from the commodity click id set, acquiring the commodity description text of each commodity click id, extracting commodity phrases from the commodity description text, and calculating a commodity word vector of the commodity phrases;
calculating the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vector and the target word vector;
sequentially extracting commodity browsing durations from the commodity browsing duration set, and calculating, by using a preset commodity intention formula, a commodity intention score of the e-commerce user for each commodity click id according to the number of logins, the commodity browsing duration, and the phrase matching degree, to obtain a commodity intention score set;
and extracting, from the commodity intention score set, a target intention score set greater than a preset score threshold, identifying the intended commodity id set corresponding to the target intention score set, and pushing commodities to the e-commerce user according to the intended commodity id set, thereby completing the business-center-based management and control of e-commerce data assets.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, system, and method may be implemented in other manners. For example, the system embodiments described above are merely illustrative, and other ways of dividing the modules are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims may also be implemented by a single unit or means through software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.
Claims (10)
1. An electronic commerce data asset management and control method based on a business center, which is characterized by comprising the following steps:
acquiring e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set, and a commodity browsing duration set;
classifying the search word set by search popularity according to a preset word frequency threshold and the search word frequency set, to obtain a search hotword set and a non-search hotword set;
sequentially extracting search hotwords from the search hotword set, and calculating the text similarity between each search hotword and each non-search hotword in the non-search hotword set, to obtain a text similarity set;
extracting, from the text similarity set, target similarities greater than a preset similarity threshold, identifying the target phrases corresponding to the target similarities, aggregating the target phrases to obtain a target phrase set, and calculating a target word vector of the target phrase set;
sequentially extracting commodity click ids from the commodity click id set, acquiring the commodity description text of each commodity click id, extracting commodity phrases from the commodity description text, and calculating a commodity word vector of the commodity phrases;
calculating the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vector and the target word vector;
sequentially extracting commodity browsing durations from the commodity browsing duration set, and calculating, by using a preset commodity intention formula, a commodity intention score of the e-commerce user for each commodity click id according to the number of logins, the commodity browsing duration, and the phrase matching degree, to obtain a commodity intention score set;
and extracting, from the commodity intention score set, a target intention score set greater than a preset score threshold, identifying the intended commodity id set corresponding to the target intention score set, and pushing commodities to the e-commerce user according to the intended commodity id set, thereby completing the business-center-based management and control of e-commerce data assets.
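The final step of claim 1 (keeping only the scores above a preset threshold and mapping them back to commodity ids) can be illustrated with a minimal Python sketch. The function name, the dictionary layout, the example ids and scores, and the descending sort are all assumptions made for illustration; the claim itself only requires the threshold comparison.

```python
from typing import Dict, List


def select_intended_commodities(scores: Dict[str, float],
                                score_threshold: float) -> List[str]:
    """Return the commodity ids whose intention score exceeds the threshold.

    Sorting by score (highest first) is an illustrative choice, not part of
    the claimed method.
    """
    targets = {cid: s for cid, s in scores.items() if s > score_threshold}
    return sorted(targets, key=targets.get, reverse=True)


# Hypothetical commodity intention score set keyed by commodity click id.
intention_scores = {"sku-1": 0.82, "sku-2": 0.35, "sku-3": 0.91}
intended_ids = select_intended_commodities(intention_scores, score_threshold=0.5)
print(intended_ids)  # ['sku-3', 'sku-1']
```

The resulting id list is what a push service would consume; how the push itself is delivered is outside the scope of this sketch.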
2. The business-center-based e-commerce data asset management and control method of claim 1, wherein the acquiring e-commerce data of an e-commerce user by using a pre-constructed business center comprises:
interfacing the business center with a plurality of pre-constructed e-commerce platforms to obtain omni-channel e-commerce data, wherein the e-commerce data are interfaced through an API (application programming interface);
extracting, from the omni-channel e-commerce data, the search word set, the commodity browsing duration set, and the commodity click id set of the e-commerce user within a preset statistical period;
counting the search word frequency of each search word in the search word set to obtain the search word frequency set;
combining the user id, the number of logins, the search word set, the search word frequency set, the commodity click id set, and the commodity browsing duration set in key-value pair form to obtain the e-commerce data, wherein the e-commerce data are expressed as:
$$D = \{\, u,\ b,\ W,\ F,\ C,\ T \,\};$$
wherein $D$ denotes the e-commerce data, $u$ denotes the user id, $b$ denotes the number of logins, $W$ denotes the search word set, $F$ denotes the search word frequency set, $C$ denotes the commodity click id set, and $T$ denotes the commodity browsing duration set.
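As an illustration of claim 2, the sketch below counts search word frequencies and assembles the six fields into a key-value record. It assumes the raw search log, click ids, and browsing durations have already been pulled from the platforms' APIs; the function name, the field names, and the sample values are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List


def build_ecommerce_record(user_id: str,
                           login_count: int,
                           searches: List[str],
                           clicked_ids: List[str],
                           browse_seconds: List[float]) -> Dict[str, Any]:
    """Assemble one user's e-commerce data as a key-value record."""
    freq = Counter(searches)      # search word -> number of occurrences
    search_words = list(freq)     # unique words, in first-seen order
    return {
        "user_id": user_id,
        "login_count": login_count,
        "search_words": search_words,
        "search_word_freqs": [freq[w] for w in search_words],
        "click_ids": list(clicked_ids),
        "browse_durations": list(browse_seconds),
    }


record = build_ecommerce_record(
    "u42", 7, ["tent", "stove", "tent"], ["sku-1", "sku-3"], [35.0, 120.5]
)
print(record["search_words"], record["search_word_freqs"])
```

Keeping the word list and the frequency list index-aligned mirrors the separate "search word set" and "search word frequency set" named in the claim.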
3. The business-center-based e-commerce data asset management and control method of claim 2, wherein the classifying the search word set by search popularity according to a preset word frequency threshold and the search word frequency set, to obtain a search hotword set and a non-search hotword set, comprises:
sequentially extracting search words from the search word set, and identifying the search word frequency of each search word;
taking search words whose search word frequency is greater than or equal to the word frequency threshold as search hotwords, and taking search words whose search word frequency is less than the word frequency threshold as non-search hotwords;
aggregating the search hotwords and the non-search hotwords respectively to obtain the search hotword set and the non-search hotword set, wherein the search hotword set and the non-search hotword set are expressed as:
$$H = \{h_1, h_2, \ldots, h_i, \ldots\}, \qquad G = \{g_1, g_2, \ldots, g_j, \ldots\};$$
wherein $H$ denotes the search hotword set, $i$ denotes the order of a search hotword in the search hotword set, $h_i$ denotes the $i$-th search hotword in the search hotword set, $G$ denotes the non-search hotword set, $j$ denotes the order of a non-search hotword in the non-search hotword set, and $g_j$ denotes the $j$-th non-search hotword in the non-search hotword set.
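The threshold rule of claim 3 translates almost directly into code. The following sketch assumes the search words and their frequencies are held in two index-aligned lists; the function name and sample data are illustrative.

```python
from typing import List, Tuple


def classify_search_words(words: List[str],
                          freqs: List[int],
                          freq_threshold: int) -> Tuple[List[str], List[str]]:
    """Split search words into hotwords (freq >= threshold) and non-hotwords."""
    hot: List[str] = []
    non_hot: List[str] = []
    for word, freq in zip(words, freqs):
        # ">=" goes to the hot set and "<" to the non-hot set, as in the claim.
        (hot if freq >= freq_threshold else non_hot).append(word)
    return hot, non_hot


hot_words, non_hot_words = classify_search_words(
    ["tent", "stove", "lamp"], [5, 1, 3], freq_threshold=3
)
print(hot_words, non_hot_words)  # ['tent', 'lamp'] ['stove']
```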
4. The business-center-based e-commerce data asset management and control method of claim 3, wherein the calculating the text similarity between each search hotword and each non-search hotword in the non-search hotword set, to obtain a text similarity set, comprises:
sequentially extracting non-search hotwords from the non-search hotword set, and sequentially combining the search hotword with each non-search hotword to obtain paired phrases, wherein a paired phrase is expressed as:
$$p_{ij} = (h_i, g_j);$$
wherein $p_{ij}$ denotes the paired phrase of the $i$-th search hotword $h_i$ and the $j$-th non-search hotword $g_j$;
calculating, by using a pre-constructed text recognition model, the text similarity between the search hotword and the non-search hotword in each paired phrase, and aggregating the text similarities to obtain the text similarity set, wherein the text similarity set is expressed as:
$$S = \{\, s_{ij} \,\};$$
wherein $S$ denotes the text similarity set, and $s_{ij}$ denotes the text similarity between the $i$-th search hotword and the $j$-th non-search hotword.
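A rough sketch of the pairing and similarity steps follows. The claim relies on a pre-constructed text recognition model that is not specified here, so Python's standard `difflib.SequenceMatcher` is swapped in purely as a stand-in character-level similarity; the function names and the treatment of each above-threshold pair as a "target phrase" are likewise assumptions.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def text_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; the patent uses an unspecified model."""
    return SequenceMatcher(None, a, b).ratio()


def similarity_set(hot_words: List[str],
                   non_hot_words: List[str]) -> Dict[Pair, float]:
    """Pair every hotword with every non-hotword and score each pair."""
    return {(h, g): text_similarity(h, g)
            for h, g in product(hot_words, non_hot_words)}


def target_pairs(similarities: Dict[Pair, float],
                 sim_threshold: float) -> List[Pair]:
    """Keep the pairs whose similarity is greater than the threshold."""
    return [pair for pair, s in similarities.items() if s > sim_threshold]


sims = similarity_set(["tent"], ["camping tent", "stove"])
print(target_pairs(sims, sim_threshold=0.48))
```

A production system would more likely use an embedding-based or learned similarity; the pairing and thresholding logic would stay the same.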
5. The business-center-based e-commerce data asset management and control method of claim 4, wherein the calculating a target word vector of the target phrase set comprises:
constructing a user vocabulary according to the search word set and a pre-constructed comment text set;
counting the vocabulary frequency of each target phrase of the target phrase set in the user vocabulary, and constructing an original word vector according to the vocabulary frequencies, wherein the original word vector is expressed as:
$$A = (f_1, f_2, \ldots, f_k);$$
wherein $A$ denotes the original word vector, $f_1$ denotes the vocabulary frequency of the 1st target phrase, $k$ denotes the total number of vocabulary frequencies, and $f_k$ denotes the vocabulary frequency of the $k$-th target phrase;
acquiring a plurality of historical vocabularies of the user id and the number of historical vocabularies, and identifying, for each target phrase, the number of historical vocabularies that contain the target phrase (the target vocabulary number), to obtain a target vocabulary number set;
constructing a weight word vector according to the number of historical vocabularies and the target vocabulary number set, wherein the weight word vector is expressed as:
;
wherein the terms of the formula denote, in order: the weight word vector; the base-10 logarithm function; the number of historical vocabularies; the target vocabulary number corresponding to the 1st target phrase; and the target vocabulary number corresponding to the k-th target phrase;
calculating the target word vector according to the original word vector and the weight word vector by using the following formula:
$$V = A \cdot W;$$
wherein $V$ denotes the target word vector, $A$ denotes the original word vector, $W$ denotes the weight word vector, and $\cdot$ denotes the vector inner-product symbol.
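The three vectors of claim 5 can be sketched as below, under explicitly stated assumptions, because the formulas themselves are not reproduced in this text. The weight of each target phrase is assumed to be the base-10 logarithm of (number of historical vocabularies / number of historical vocabularies containing the phrase), which is the usual inverse-document-frequency form suggested by the listed terms. The combination step is assumed to be an element-wise product, since a true inner product would yield a scalar rather than a vector. All function names and data are hypothetical.

```python
import math
from typing import List, Sequence


def original_word_vector(target_phrases: Sequence[str],
                         user_vocabulary: Sequence[str]) -> List[int]:
    """Vocabulary frequency of each target phrase in the user vocabulary."""
    return [list(user_vocabulary).count(p) for p in target_phrases]


def weight_word_vector(target_phrases: Sequence[str],
                       historical_vocabularies: Sequence[Sequence[str]]) -> List[float]:
    """Assumed IDF-style weight lg(N / n_i); 0.0 if the phrase never occurs."""
    total = len(historical_vocabularies)
    weights = []
    for phrase in target_phrases:
        containing = sum(1 for vocab in historical_vocabularies if phrase in vocab)
        weights.append(math.log10(total / containing) if containing else 0.0)
    return weights


def target_word_vector(original: Sequence[float],
                       weights: Sequence[float]) -> List[float]:
    """Assumed element-wise product of frequency and weight."""
    return [f * w for f, w in zip(original, weights)]


phrases = ["tent", "stove"]
vocabulary = ["tent", "stove", "tent", "lamp"]
history = [["tent", "stove"]] + [["stove"]] * 9   # 10 historical vocabularies
A = original_word_vector(phrases, vocabulary)      # [2, 1]
W = weight_word_vector(phrases, history)           # [lg(10/1), lg(10/10)]
V = target_word_vector(A, W)
print(A, W, V)
```

Under these assumptions a phrase that appears in every historical vocabulary gets weight 0 and contributes nothing to the target word vector, which matches the usual intent of down-weighting ubiquitous terms.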
6. The business-center-based e-commerce data asset management and control method of claim 5, wherein the constructing a user vocabulary according to the search word set and a pre-constructed comment text set comprises:
sequentially extracting comment texts from the comment text set, and identifying the comment phrases of each comment text by using the text recognition model;
aggregating the comment phrases to obtain a comment word set, and aggregating the search word set and the comment word set to obtain an original vocabulary;
identifying punctuation marks in the original vocabulary, and deleting the punctuation marks to obtain the user vocabulary.
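The vocabulary construction of claim 6 can be sketched as follows, assuming the comment phrases have already been extracted by the text recognition model (that model is not reproduced here). The function name and the particular set of full-width punctuation marks are illustrative choices.

```python
import string
from typing import List, Sequence

# ASCII punctuation plus a few common full-width marks (illustrative list).
PUNCTUATION = set(string.punctuation) | set("，。！？；：、（）《》“”‘’")


def build_user_vocabulary(search_words: Sequence[str],
                          comment_phrases: Sequence[str]) -> List[str]:
    """Merge search words and comment phrases, then strip punctuation."""
    original_vocabulary = list(search_words) + list(comment_phrases)
    cleaned = []
    for token in original_vocabulary:
        stripped = "".join(ch for ch in token if ch not in PUNCTUATION)
        if stripped:   # drop entries that consisted only of punctuation
            cleaned.append(stripped)
    return cleaned


user_vocabulary = build_user_vocabulary(["tent"], ["great tent!", ",", "warm"])
print(user_vocabulary)  # ['tent', 'great tent', 'warm']
```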
7. The business-center-based e-commerce data asset management and control method of claim 1, wherein the extracting commodity phrases from the commodity description text comprises:
performing a word segmentation operation on the commodity description text to obtain a candidate word set;
sequentially extracting candidate words from the candidate word set, and counting the number of historical vocabularies containing each candidate word to obtain a candidate vocabulary number;
calculating the candidate weight value of each candidate word by using the following formula to obtain a candidate weight value set:
;
wherein the terms of the formula denote, in order: the candidate weight value; the number of times the candidate word appears in the candidate word set; the total number of candidate words in the candidate word set; and the candidate vocabulary number;
and identifying, in the candidate weight value set, the candidate weight values greater than a preset weight threshold to obtain a target weight value set, and identifying the commodity phrases corresponding to the target weight value set.
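The candidate weighting of claim 7 reads like a TF-IDF score, and the sketch below is written on that assumption only, since the actual formula is not reproduced in this text. The term frequency is the candidate's count divided by the total number of candidates; the inverse-document-frequency part is assumed to be lg(N / (n + 1)), where N is the number of historical vocabularies and n the candidate vocabulary number, with the +1 added here to avoid division by zero. Whitespace splitting stands in for a real word segmenter, and all names and data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def candidate_weights(candidate_words: Sequence[str],
                      historical_vocabularies: Sequence[Sequence[str]]) -> Dict[str, float]:
    """Assumed TF-IDF-style weight for each distinct candidate word."""
    counts = Counter(candidate_words)
    total_candidates = len(candidate_words)
    total_vocabularies = len(historical_vocabularies)
    weights = {}
    for word, count in counts.items():
        containing = sum(1 for vocab in historical_vocabularies if word in vocab)
        idf = math.log10(total_vocabularies / (containing + 1))  # assumed form
        weights[word] = (count / total_candidates) * idf
    return weights


def commodity_phrases(weights: Dict[str, float],
                      weight_threshold: float) -> List[str]:
    """Keep candidates whose weight is greater than the preset threshold."""
    return [word for word, value in weights.items() if value > weight_threshold]


description = "waterproof tent tent light"
candidates = description.split()   # stand-in for a real word segmenter
history = [["waterproof", "light"]] * 4 + [["waterproof"]] * 5 + [["rope"]]
weights = candidate_weights(candidates, history)
print(commodity_phrases(weights, weight_threshold=0.1))  # ['tent']
```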
8. The business-center-based e-commerce data asset management and control method of claim 7, wherein the calculating the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vector and the target word vector comprises:
calculating the phrase matching degree according to the target word vector and the commodity word vector by using the following formula:
$$P = \frac{V \cdot Q}{\lVert V \rVert \, \lVert Q \rVert};$$
wherein $P$ denotes the phrase matching degree, $\lVert \cdot \rVert$ denotes the vector norm symbol, $V$ denotes the target word vector, and $Q$ denotes the commodity word vector.
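The terms listed for claim 8 (a matching degree, a vector norm, and the two word vectors) are consistent with cosine similarity, so the sketch below assumes that form. It also assumes both vectors have the same length and returns 0.0 when either vector is all zeros; the function name is hypothetical.

```python
import math
from typing import Sequence


def phrase_matching_degree(target_vec: Sequence[float],
                           commodity_vec: Sequence[float]) -> float:
    """Cosine similarity between the target and commodity word vectors."""
    if len(target_vec) != len(commodity_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(target_vec, commodity_vec))
    norm_t = math.sqrt(sum(x * x for x in target_vec))
    norm_c = math.sqrt(sum(y * y for y in commodity_vec))
    if norm_t == 0.0 or norm_c == 0.0:
        return 0.0   # assumed convention for an all-zero vector
    return dot / (norm_t * norm_c)


print(phrase_matching_degree([3.0, 4.0], [3.0, 4.0]))  # 1.0
print(phrase_matching_degree([1.0, 0.0], [0.0, 1.0]))  # 0.0
```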
9. The business-center-based e-commerce data asset management and control method of claim 8, wherein the calculating, by using a preset commodity intention formula, a commodity intention score of the e-commerce user for each commodity click id according to the number of logins, the commodity browsing duration, and the phrase matching degree, to obtain a commodity intention score set, comprises:
calculating the commodity intention score according to the number of logins, the commodity browsing duration, and the phrase matching degree by using the following formula:
;
wherein the terms of the formula denote, in order: the commodity intention score; the number of logins, b; a preset duration coefficient; the commodity browsing duration, t; the number of target phrases; the sequence number of the target phrase, n; and the vocabulary frequency of the n-th target phrase;
and aggregating the commodity intention scores to obtain the commodity intention score set.
10. An electronic commerce data asset management and control system based on a business center, the system comprising:
a data classification module, configured to acquire e-commerce data of an e-commerce user by using a pre-constructed business center, wherein the e-commerce data comprise: a user id, a number of logins, a search word set, a search word frequency set, a commodity click id set, and a commodity browsing duration set, and to classify the search word set by search popularity according to a preset word frequency threshold and the search word frequency set, to obtain a search hotword set and a non-search hotword set;
a vector calculation module, configured to sequentially extract search hotwords from the search hotword set, calculate the text similarity between each search hotword and each non-search hotword in the non-search hotword set to obtain a text similarity set, extract, from the text similarity set, target similarities greater than a preset similarity threshold, identify the target phrases corresponding to the target similarities, aggregate the target phrases to obtain a target phrase set, calculate a target word vector of the target phrase set, sequentially extract commodity click ids from the commodity click id set, acquire the commodity description text of each commodity click id, extract commodity phrases from the commodity description text, and calculate a commodity word vector of the commodity phrases;
a score acquisition module, configured to calculate the phrase matching degree between the target phrase set and the commodity phrases according to the commodity word vector and the target word vector, sequentially extract commodity browsing durations from the commodity browsing duration set, and calculate, by using a preset commodity intention formula, a commodity intention score of the e-commerce user for each commodity click id according to the number of logins, the commodity browsing duration, and the phrase matching degree, to obtain a commodity intention score set;
and a data management and control module, configured to extract, from the commodity intention score set, a target intention score set greater than a preset score threshold, identify the intended commodity id set corresponding to the target intention score set, and push commodities to the e-commerce user according to the intended commodity id set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411013650.XA CN118537053B (en) | 2024-07-26 | 2024-07-26 | E-commerce data asset management and control method and system based on business center |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118537053A | 2024-08-23 |
CN118537053B | 2024-09-17 |
Family
ID=92381319
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202411013650.XA (CN118537053B) | Active | 2024-07-26 | 2024-07-26 | E-commerce data asset management and control method and system based on business center |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118537053B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115545832A (en) * | 2022-10-08 | 2022-12-30 | 广州欢聚时代信息科技有限公司 | Commodity search recommendation method and device, equipment and medium thereof |
CN115599768A (en) * | 2022-10-19 | 2023-01-13 | 深圳市灵智数字科技有限公司(Cn) | Association word library construction method, association word recommendation method and device |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5238418B2 (en) * | 2008-09-09 | 2013-07-17 | 株式会社東芝 | Information recommendation device and information recommendation method |
JP5277941B2 (en) * | 2008-12-18 | 2013-08-28 | 大日本印刷株式会社 | Related product presentation method, related product presentation system, program, recording medium |
CN112256822A (en) * | 2020-10-21 | 2021-01-22 | 平安科技(深圳)有限公司 | Text search method, apparatus, computer equipment and storage medium |
CN117635238A (en) * | 2023-12-22 | 2024-03-01 | 税友软件集团股份有限公司 | Commodity recommendation method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN118537053A (en) | 2024-08-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||