
CN118780280B - Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine - Google Patents


Info

Publication number
CN118780280B
Authority
CN
China
Prior art keywords
topic
keyword
information
weight
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411248796.2A
Other languages
Chinese (zh)
Other versions
CN118780280A (en)
Inventor
沈利
陈庆婷
徐春良
钱陈健
车文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Fuwen Network Information Technology Co ltd
Original Assignee
Zhejiang Fuwen Network Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Fuwen Network Information Technology Co ltd
Priority to CN202411248796.2A
Publication of CN118780280A
Application granted
Publication of CN118780280B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the technical field of new media applications, and in particular to a multi-platform integrated intelligent topic selection inspiration generation method and a topic selection inspiration engine. The method includes: crawling the information lists on the corresponding platforms through several channel interfaces and adding them to a topic pool; calculating a first topic weight based on the position of each piece of information in its topic list; selecting a keyword, or a keyword group composed of several keywords, and calculating a second topic weight based on content popularity; calculating a third topic weight of the keyword or keyword group based on the number of times it appears; calculating a final weight based on the first, second and third topic weights; and selecting a preset number of keywords or keyword groups from the topic pool in descending order of final weight to construct a topic list, which contains the final weight value of each keyword or keyword group and a link to the corresponding information. The present application can improve overall optimization across multiple platforms during topic selection.

Description

Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine
Technical Field
The application relates to the technical field of new media applications, and in particular to a multi-platform integrated intelligent topic selection inspiration generation method and a topic selection inspiration engine.
Background
With the continuous development of new media technology, more and more traditional media forms are being replaced by short-video and online media platforms. When bloggers and new media practitioners create short videos, articles, official-account posts and other new media content, how quickly and accurately they can find the hottest current topics directly affects metrics such as the play count and share count of that content.
There are many news platforms and short-video platforms on the market, and they produce a massive amount of information every day, including trending news, daily news, industry news, financial news and so on. Before preparing content, new media practitioners need to select a topic in light of current trending information or the related industry information, so obtaining the best topic inspiration from this mass of information is an important part of their work.
Current media platforms each have their own hot list, but no systematic tool integrates the trending information across multiple media platforms, so practitioners can only select topics from the hot lists of one or a few platforms. On the one hand, because platforms differ in their fields and in what they emphasize, the hottest information on a given platform is not necessarily the hottest across all platforms. On the other hand, practitioners have to search for topic inspiration manually across multiple platforms, which is a heavy workload.
Disclosure of Invention
To improve overall optimization across multiple platforms during topic selection, the application provides a multi-platform integrated intelligent topic selection inspiration generation method and a topic selection inspiration engine.
In a first aspect, the application provides a multi-platform integrated intelligent topic selection inspiration generation method, which adopts the following technical scheme:
A topic selection inspiration generation method comprises the following steps:
crawling the information lists on the corresponding platforms through a plurality of channel interfaces;
adding the information lists to a reserved topic pool, wherein each information list serves as an independent topic list;
calculating a first topic weight for each piece of information based on its position in the topic list;
selecting, based on a keyword library, a keyword or a keyword group composed of several keywords from the title of each piece of information, and calculating a corresponding second topic weight based on the content popularity of each keyword or keyword group;
calculating a third topic weight for the keyword or keyword group based on the number of times it appears in the text of the corresponding piece of information and the number of times it appears in the text of the other pieces of information in the topic list;
calculating the final weight of each keyword or keyword group based on the first topic weight, the second topic weight and the third topic weight;
and selecting a preset number of keywords or keyword groups from the topic pool in descending order of final weight to construct a topic list, wherein the topic list contains the final weight value of each keyword or keyword group and a link to the corresponding information.
In some embodiments, after crawling the information lists on the corresponding platforms through the plurality of channel interfaces, the method further comprises the following steps:
judging, through a preset negative filtering model, whether each piece of information on each information list is negative information, and filtering out the negative information;
judging, through a preset specific word library, whether a specific filter word is present in each piece of information on each information list, and filtering out the information that contains a specific filter word.
In some embodiments, calculating the first topic weight for each piece of information based on its position in the topic list comprises the following steps:
generating a preliminary weight based on the position of the information in the topic list, wherein the higher the position, the larger the preliminary weight;
generating a maximum weight based on the length of the topic list;
generating an effect addition coefficient based on the platform corresponding to the topic list;
and multiplying the effect addition coefficient by the ratio of the preliminary weight to the maximum weight to obtain the first topic weight.
In some embodiments, calculating the corresponding second topic weight based on the content popularity of each keyword or keyword group comprises the following steps:
selecting a daily heat range and a real-time heat range from the keyword library, wherein the daily heat range and the real-time heat range each comprise a plurality of keywords;
acquiring a heat retention value of each keyword in the keyword library, and adding the keywords whose heat retention value is larger than a first preset value to the daily heat range;
acquiring a heat mutation value of each keyword in the keyword library, and taking the keywords whose heat mutation value has increased by more than a second preset value as candidate keywords;
performing a web-wide search based on the candidate keywords to acquire the event time corresponding to each candidate keyword, and adding the candidate keyword to the real-time heat range if the event time matches the current time;
generating the corresponding second topic weight based on whether the keyword or keyword group is in the daily heat range or the real-time heat range.
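The range-building steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `KeywordStats` record, the rule that an event time "matches" when it falls on the current date, the weight values 0.5 and 1.0, and the use of the best member for a keyword group are all assumptions made for the example, since the text does not specify them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set, Tuple


@dataclass
class KeywordStats:
    """Hypothetical per-keyword record; field names are illustrative."""
    keyword: str
    heat_retention: float          # heat retention value
    heat_mutation_increase: float  # increase of the heat mutation value
    event_date: Optional[date]     # event time found by a web-wide search, if any


def build_heat_ranges(stats: Iterable[KeywordStats], first_preset: float,
                      second_preset: float, today: date) -> Tuple[Set[str], Set[str]]:
    """Return (daily_heat_range, real_time_heat_range)."""
    daily, real_time = set(), set()
    for s in stats:
        if s.heat_retention > first_preset:
            daily.add(s.keyword)
        # Candidate keyword: sudden increase above the second preset value.
        if s.heat_mutation_increase > second_preset:
            # Assumption: "matches the current time" means the same date.
            if s.event_date == today:
                real_time.add(s.keyword)
    return daily, real_time


def second_topic_weight(keywords: Iterable[str], daily: Set[str], real_time: Set[str],
                        daily_weight: float = 0.5, real_time_weight: float = 1.0) -> float:
    """Weight for a keyword (one-element list) or a keyword group.

    Assumption: a group takes the best weight among its members.
    """
    best = 0.0
    for k in keywords:
        if k in real_time:
            best = max(best, real_time_weight)
        elif k in daily:
            best = max(best, daily_weight)
    return best
```

Giving the real-time range a larger weight than the daily range is a design choice of this sketch; the text only says the weight depends on which range the keyword falls in.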
In some embodiments, if the information corresponds to a keyword, calculating the third topic weight of the keyword comprises the following steps:
acquiring the number of times the keyword appears in the text of the corresponding information, defined as the original count;
acquiring the number of times the keyword appears in the text of each of the other pieces of information in the topic list, and selecting the maximum count and the minimum count;
calculating the third topic weight based on the following formula:
third topic weight = (original count - minimum count) / (maximum count - minimum count).
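The formula above is a min-max normalization and can be sketched as below. The function name, the use of plain substring counting, and the fallback value of 0.0 when the maximum and minimum counts coincide are assumptions for illustration; the text does not say how that edge case is handled.

```python
from typing import List


def third_topic_weight(keyword: str, own_text: str, other_texts: List[str]) -> float:
    """(original - minimum) / (maximum - minimum), per the formula above."""
    original = own_text.count(keyword)  # non-overlapping substring occurrences
    other_counts = [text.count(keyword) for text in other_texts]
    if not other_counts:
        return 0.0  # assumption: no other information to compare against
    minimum, maximum = min(other_counts), max(other_counts)
    if maximum == minimum:
        return 0.0  # assumption: avoid division by zero
    return (original - minimum) / (maximum - minimum)
```

Because the maximum and minimum are taken over the other pieces of information, the result can exceed 1 when the keyword appears more often in its own text than anywhere else on the list.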
In some embodiments, if the information corresponds to a keyword group, calculating the third topic weight of the keyword group comprises the following steps:
acquiring the keywords in the keyword group;
acquiring, for each keyword, the number of times it appears in the text of the corresponding information, defined as the original count;
acquiring, for each keyword, the number of times it appears in the text of each of the other pieces of information in the topic list, and selecting the maximum count and the minimum count for that keyword;
calculating the third topic weight of each keyword based on the following formula:
third topic weight = (original count - minimum count) / (maximum count - minimum count);
judging whether the difference between the third topic weights of the keywords is larger than a preset value;
if so, selecting the largest third topic weight as the third topic weight of the keyword group;
and if not, taking the average of the third topic weights as the third topic weight of the keyword group.
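The group rule can be sketched as follows, taking the per-keyword third topic weights as already computed. Interpreting "the difference between the third topic weights" as the spread between the largest and smallest weight is an assumption of this sketch, as are the function and parameter names.

```python
from typing import List


def group_third_topic_weight(keyword_weights: List[float], preset_value: float) -> float:
    """Combine per-keyword third topic weights into one weight for the group."""
    if not keyword_weights:
        raise ValueError("a keyword group needs at least one keyword weight")
    # Assumption: "difference" is measured as max minus min across the group.
    spread = max(keyword_weights) - min(keyword_weights)
    if spread > preset_value:
        return max(keyword_weights)  # one keyword dominates: use the largest
    return sum(keyword_weights) / len(keyword_weights)  # otherwise: average
```

Under this reading, a group with one very prominent keyword is scored by that keyword alone, while a balanced group is scored by its average.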
In some embodiments, calculating the final weight of each keyword or keyword group based on the first topic weight, the second topic weight and the third topic weight comprises the following steps:
The final weight is calculated by the following formula:
where n is the number of topic lists; k1, k2 and k3 are calculation coefficients, and the sum of k1, k2 and k3 is 1.
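The formula itself is not reproduced in this text, so the following is only a hypothetical form that is consistent with the stated definitions (n topic lists, coefficients k1, k2 and k3 summing to 1). It assumes the final weight is the average, over the n topic lists, of a weighted sum of the three topic weights; the actual formula may differ.

```python
import math
from typing import List, Tuple


def final_weight(per_list_weights: List[Tuple[float, float, float]],
                 k1: float, k2: float, k3: float) -> float:
    """Hypothetical final weight: mean over topic lists of k1*w1 + k2*w2 + k3*w3.

    per_list_weights holds one (first, second, third) topic-weight triple
    per topic list, so n = len(per_list_weights).
    """
    if not math.isclose(k1 + k2 + k3, 1.0):
        raise ValueError("k1, k2 and k3 must sum to 1")
    n = len(per_list_weights)
    if n == 0:
        raise ValueError("at least one topic list is required")
    total = sum(k1 * w1 + k2 * w2 + k3 * w3 for w1, w2, w3 in per_list_weights)
    return total / n  # assumption: averaged over the n lists
```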
In some embodiments, calculating the final weight of each keyword or keyword group based on the first topic weight, the second topic weight and the third topic weight further comprises the following steps:
judging whether the information list on the corresponding platform has a classification list;
if so, acquiring the classification information corresponding to the classification list, and determining how many pieces of information of each classification are present in the topic pool;
if the keyword or keyword group corresponds to the classification list, multiplying the calculated final weight by the corresponding classification coefficient;
wherein the classification coefficient is larger than 1, and its size is inversely proportional to the number of pieces of information in the topic pool that belong to the classification of the classification list.
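One way to obtain a coefficient that is always larger than 1 and shrinks as the classification becomes more common in the topic pool is sketched below. The form 1 + scale / count and the `scale` parameter are assumptions; the text only states the two constraints.

```python
def classification_coefficient(classification_count: int, scale: float = 1.0) -> float:
    """Coefficient > 1 whose excess over 1 is inversely proportional to the count."""
    if classification_count < 1:
        raise ValueError("the classification must occur at least once in the topic pool")
    if scale <= 0:
        raise ValueError("scale must be positive so the coefficient stays above 1")
    return 1.0 + scale / classification_count


def apply_classification(final_weight_value: float, classification_count: int) -> float:
    """Multiply an already computed final weight by the classification coefficient."""
    return final_weight_value * classification_coefficient(classification_count)
```

With this form, a keyword from a rare classification receives a larger boost than one from a classification that already dominates the pool.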
In some embodiments, the link of the information corresponding to each keyword or keyword group in the topic list is generated by the following steps:
acquiring, among all the information containing the keyword or keyword group, the information with the largest first topic weight as the target information;
acquiring the link corresponding to the target information through the channel interface;
and attaching the link to the keyword or keyword group.
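The target selection above amounts to taking the maximum by first topic weight. In this sketch the `Information` record and its fields are hypothetical, and the link is assumed to have been fetched already rather than requested through a channel interface.

```python
from typing import List, NamedTuple


class Information(NamedTuple):
    """Hypothetical record for one crawled piece of information."""
    title: str
    link: str
    first_topic_weight: float


def link_for_keyword(matching: List[Information]) -> str:
    """Return the link of the information with the largest first topic weight."""
    if not matching:
        raise ValueError("no information matches this keyword or keyword group")
    target = max(matching, key=lambda info: info.first_topic_weight)
    return target.link
```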
In a second aspect, the application provides a topic selection inspiration engine, which adopts the following technical scheme:
A topic selection inspiration engine comprising:
a channel interface, used for interfacing with each platform;
a crawler module, used for crawling the information lists on the platforms corresponding to the channel interfaces;
a topic pool, used for holding the information lists and converting them into corresponding independent topic lists;
a weight extraction module, used for calculating a first topic weight for each piece of information based on its position in the topic list; selecting, based on a keyword library, a keyword or a keyword group composed of several keywords from the title of each piece of information, and calculating a corresponding second topic weight based on the content popularity of each keyword or keyword group; and calculating a third topic weight for the keyword or keyword group based on the number of times it appears in the text of the corresponding information and the number of times it appears in the text of the other pieces of information in the topic list;
a topic weight calculation module, used for calculating the final weight of each keyword or keyword group based on the first topic weight, the second topic weight and the third topic weight;
and a topic generation module, used for selecting a preset number of keywords or keyword groups from the topic pool in descending order of final weight to construct a topic list, wherein the topic list contains the final weight value of each keyword or keyword group and a link to the corresponding information.
The technical scheme provided by the embodiments of the application achieves the following technical effects:
By interfacing with and integrating a large number of platforms and crawling the information lists on each of them, keywords extracted from the trending information of multiple platforms are compared side by side through a weight algorithm, and the popularity of each keyword is judged as a whole. The newest and hottest keywords across all current platforms are obtained quickly and accurately as preferred topics, and users can draw topic inspiration directly from the computed and screened keywords, which reduces their workload.
Drawings
Fig. 1 is a schematic step diagram of a multi-platform fusion intelligent topic-selection inspiration generation method provided by an embodiment of the application.
Fig. 2 is a schematic diagram of module connection of a topic selection inspiration engine according to an embodiment of the present application.
Detailed Description
The present application will be described and illustrated with reference to the accompanying drawings and examples for a clearer understanding of the objects, technical solutions and advantages of the present application. However, it will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In some instances, well known methods, procedures, systems, components, and/or circuits have been described at a high-level so as not to obscure aspects of the present application with unnecessary description. It will be apparent to those having ordinary skill in the art that various changes can be made to the disclosed embodiments of the application and that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Thus, the present application is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the scope of the application as claimed.
The description of these embodiments is provided to assist understanding of the present invention, but is not intended to limit the present invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
In the description of the present application, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood to exclude the stated number, while "above", "below", "within" and the like are understood to include it. The terms "first" and "second" are used only to distinguish technical features and should not be construed as indicating or implying relative importance, the number of the technical features indicated, or their order.
In the description of the present application, the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The embodiment of the application discloses a multi-platform fusion intelligent topic selection inspiration generation method.
As shown in FIG. 1, the intelligent topic selection inspiration generation method with multi-platform fusion comprises the following steps:
S100, crawling the information lists on the corresponding platforms through a plurality of channel interfaces.
The channel interfaces are used to interface with the mainstream news platforms, self-media platforms, short-video platforms and the like on the Internet, and the latest information list on each mainstream channel is crawled through a crawler algorithm. An information list is the hot list that a platform produces by ranking items with its own popularity calculation method. The crawled content includes the title, the content and the source of each piece of information.
S200, adding the information lists to the reserved topic pool, wherein each information list serves as an independent topic list.
The topic pool is an integrated library that stores the information lists crawled from a reserved number of platforms; users can also freely view the information list of each platform in the topic pool.
After the information lists of the platforms enter the topic pool and undergo the subsequent information preprocessing, each information list becomes an independent topic list, in which the pieces of information keep the same order as in the information list.
S300, calculating a first topic weight for each piece of information based on its position in the topic list.
First, the first topic weight of each piece of information is obtained from its ranking on the topic list. Because each topic list is generated from an information list, the higher a piece of information ranks, the higher its popularity generally is.
The position of each piece of information serves as the first reference factor for judging topic popularity.
S400, selecting, based on the keyword library, a keyword or a keyword group composed of several keywords from the title of each piece of information, and calculating the corresponding second topic weight based on the content popularity of each keyword or keyword group.
The keyword library is a word library generated, when the topic selection inspiration engine is created, by a keyword extraction model trained in advance with a large model and a neural network algorithm. It stores a large number of keywords that conform to common semantics and human language; a library commonly used in the prior art, such as the NLTK library or the jieba library, may also be used. Keywords are extracted from the title of the information because the title summarizes the key content of the information and contains relatively few words, so the extracted keywords are more accurate and fewer in number.
The number of keywords selected from the title of each piece of information depends on its content and is not necessarily the same. For example, in the information "express company accelerates expansion into villages, parcel volume in some regions grows tenfold", the selected keyword is "express company"; in the information "xx player wins gold medal", the selected keywords are "xx sport meeting" and "xxx player".
Keywords are selected mainly from the following categories: major activities, major meetings, major policies, major awards, major industries and businesses, public figures, and the like. Generally, to reduce the workload of subsequent topic selection and improve its accuracy, the keyword group of one piece of information contains at most three keywords; when more than three keywords can be selected from one piece of information, the three that appear first in its word order are selected by default.
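The "at most three, earliest first" rule can be sketched as below. Plain substring matching against a set of library keywords stands in for the trained keyword extraction model, which is an assumption of this example; the function and parameter names are likewise illustrative.

```python
from typing import List, Set


def select_keywords(title: str, keyword_library: Set[str], max_keywords: int = 3) -> List[str]:
    """Pick the library keywords found in the title, earliest occurrence first."""
    found = [(title.find(keyword), keyword)
             for keyword in keyword_library if keyword in title]
    found.sort()  # by position in the title, then alphabetically on ties
    return [keyword for _, keyword in found[:max_keywords]]
```

A title that yields one keyword gives a single keyword; a title that yields two or three gives a keyword group.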
After the keyword or keyword group of each piece of information is selected, its content popularity is used as the second reference factor for judging topic popularity. The content popularity of a keyword or keyword group reflects how active the corresponding event or object is in society during the current period; for example, during holidays, keywords related to holiday travel have relatively high content popularity.
The second topic weight mainly reflects how active the information is in the social public-opinion environment, as inferred from its keywords.
S500, calculating a third topic weight for the keyword or keyword group based on the number of times it appears in the text of the corresponding information and the number of times it appears in the text of the other pieces of information in the topic list.
The number of times a keyword or keyword group appears in each piece of information serves as the third reference factor for judging topic popularity. If a keyword appears frequently across several pieces of information in one information list, the event or person it refers to is currently very popular, and the corresponding third topic weight is larger.
S600, calculating the final weight of each keyword or keyword group based on the first topic weight, the second topic weight and the third topic weight.
The final weight of each keyword or keyword group is calculated from the sub-weights obtained above. It expresses the weight the keyword or keyword group earns across the information lists of all crawled platforms: the higher the final weight, the more popular the keyword or keyword group is across the platforms, and the more suitable it is as topic content.
S700, selecting a preset number of keywords or keyword groups from the topic pool in descending order of final weight to construct a topic list, wherein the topic list contains the final weight value of each keyword or keyword group and a link to the corresponding information.
After the final weight of each keyword or keyword group is obtained, the keywords or keyword groups with the highest final weights are selected and a topic list is generated. The topic list displays the selected keywords of the hotter topics, together with the weight value of each keyword and a link to the information that matches it.
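Step S700 is a top-N selection by final weight, which can be sketched as follows. The dictionary inputs and the tuple layout of each row are assumptions for illustration, and the example.com links are placeholders.

```python
import heapq
from typing import Dict, List, Tuple


def build_topic_list(final_weights: Dict[str, float], links: Dict[str, str],
                     preset_number: int) -> List[Tuple[str, float, str]]:
    """Return (keyword, final weight, link) rows in descending order of final weight."""
    top = heapq.nlargest(preset_number, final_weights.items(), key=lambda kv: kv[1])
    # A keyword without a recorded link gets an empty string (assumption).
    return [(keyword, weight, links.get(keyword, "")) for keyword, weight in top]


weights = {"express company": 0.82, "xx sport meeting": 0.91, "holiday travel": 0.47}
links = {
    "express company": "https://example.com/express",
    "xx sport meeting": "https://example.com/games",
}
topic_list = build_topic_list(weights, links, preset_number=2)
```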
With this method, a large number of platforms are interfaced and integrated, the information lists on each of them are crawled, and the keywords extracted from the trending information of multiple platforms are compared side by side through the weight algorithm, so that the popularity of each keyword is judged as a whole. The newest and hottest keywords across all current platforms are obtained quickly and accurately as preferred topics, and users can draw topic inspiration directly from the computed and screened keywords, which reduces their workload.
In other embodiments, after crawling the information lists on the corresponding platforms through the plurality of channel interfaces, the method further comprises the following steps:
S110, judging, through a preset negative filtering model, whether each piece of information on each information list is negative information, and filtering out the negative information.
Because some short-video and self-media platforms are affected by the promotion and traffic-driving of certain sensitive negative content, information preprocessing can be performed after the information lists are crawled, in order to reduce the probability of negative keywords arising during topic selection and to avoid misleading new media practitioners.
First, compliance detection is performed: content such as illegal acts, war, natural disasters and fraud is filtered out using large-model technology.
The negative filtering model is obtained with a conventional machine learning training algorithm. The model is trained on a large training set of negative and non-compliant words, and the trained model is then applied to classify and filter real-time text, automatically judging whether the text contains bad or non-compliant content.
S111, judging, through a preset specific word library, whether a specific filter word is present in each piece of information on each information list, and filtering out the information that contains a specific filter word.
The specific word library stores a large number of classification sets for different industries, fields and types, and each classification set contains the keywords of its industry, field or type. The user can choose which parts of the specific word library to apply. For example, if a new media practitioner mainly produces information about the vehicle industry, the user can select the classification sets of other, irrelevant industries, such as stocks, economy and entertainment, and the information containing the corresponding specific filter words is deleted according to the selected screening range. The topics the user obtains then fit the user's own field better, and topics from unrelated fields do not occupy positions in the final topic list. In addition, deleting irrelevant information makes the later topic weight calculation more accurate for the specific fields and industries concerned.
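The two preprocessing steps can be sketched as a single filter pass. The `is_negative` callable is a stand-in for the trained negative filtering model, and the keyword-based lambda in the usage lines is purely illustrative; substring matching for the specific filter words is also an assumption.

```python
from typing import Callable, Iterable, List, Set


def filter_information(items: Iterable[str], is_negative: Callable[[str], bool],
                       filter_words: Set[str]) -> List[str]:
    """Drop negative information (S110), then information with filter words (S111)."""
    kept = []
    for text in items:
        if is_negative(text):
            continue  # rejected by the negative filtering model
        if any(word in text for word in filter_words):
            continue  # contains a user-selected specific filter word
        kept.append(text)
    return kept


items = ["New vehicle model released", "Stock market rallies", "Investment fraud exposed"]
kept = filter_information(
    items,
    is_negative=lambda text: "fraud" in text.lower(),  # placeholder for a real model
    filter_words={"Stock"},
)
```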
In other embodiments, calculating the first topic weight corresponding to each piece of information based on the position of that information in the topic list includes the following steps:
S310, generating a preliminary weight based on the position of the information in its topic list, wherein the earlier the position, the larger the preliminary weight.
Each topic list is an independent list, and the preliminary weight is calculated from the position of each piece of information on the list it belongs to. The information list of each platform is ranked by some algorithm, such as heat priority, time priority or heat per unit time priority, so the information with the highest heat or the most recent release time is always near the front of the list. The heat of a piece of information can therefore be judged from its position on the list.
S320, generating the maximum weight based on the length of the topic list.
The value of the maximum weight depends on the length of the topic list: if the topic list contains 10 pieces of information, the maximum weight is 10, and the longer the list, the larger the maximum weight.
S330, generating an effect addition coefficient based on the platform corresponding to the topic selection list.
Different platforms have different scales, measured for example by awareness, number of users, activity, number of users online, clicks, forwards and comments. The scale represents the volume of the platform: the larger the platform, the greater the authority and audience of the information on it, and the larger the effect addition coefficient of the corresponding topic list.
Because the final weight represents the heat of a keyword of a piece of information, the effect addition coefficient acts as a bonus applied to the first topic weight during the calculation: the larger the platform and the more active its users, the larger the bonus the coefficient adds to the heat. In the embodiment of the application, the effect addition coefficient is a value between 1 and 2; when the scale of the platform is extremely small, the coefficient is 1, which indicates that there is no platform bonus.
S340, calculating the first topic weight according to the ratio of the preliminary weight to the maximum weight and multiplying the ratio by the effect addition coefficient.
The first topic weight is calculated as (preliminary weight / maximum weight) × effect addition coefficient, and it represents the information weight of a single piece of information.
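Steps S310 to S340 can be sketched as below. The application only states that an earlier position gives a larger preliminary weight and that the maximum weight follows the list length; the concrete rule used here (preliminary weight = list length - position + 1, with position counted from 1) is an assumption made for illustration, and the function name is hypothetical.

```python
def first_topic_weight(
    position: int, list_length: int, effect_coefficient: float
) -> float:
    """(preliminary weight / maximum weight) * effect addition coefficient."""
    if not 1 <= position <= list_length:
        raise ValueError("position must lie between 1 and list_length")
    if not 1.0 <= effect_coefficient <= 2.0:
        raise ValueError("effect addition coefficient is expected in [1, 2]")
    # Assumption: the first position gets the full list length as its
    # preliminary weight and the last position gets 1.
    preliminary_weight = list_length - position + 1
    # Per S320, the maximum weight equals the length of the topic list.
    maximum_weight = list_length
    return preliminary_weight / maximum_weight * effect_coefficient


top_item = first_topic_weight(position=1, list_length=10, effect_coefficient=1.5)
last_item = first_topic_weight(position=10, list_length=10, effect_coefficient=1.0)
```

Under these assumptions the item at the head of a ten-item list on a large platform scores 1.5, while the last item on a very small platform scores 0.1.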
In other embodiments, calculating the corresponding second topic weight based on the content popularity of each keyword or keyword group includes the following steps:
S410, selecting a daily heat range and a real-time heat range from the keyword library, wherein the daily heat range and the real-time heat range each comprise a plurality of keywords.
Two ranges, that is, two initially empty sets, are set up in the reserved keyword library, and any number of qualifying keywords can be placed in each set. One set corresponds to the daily heat range, which stores keywords that keep a certain heat over a long time, such as "semiconductor" and "electric car". The other set corresponds to the real-time heat range, which stores keywords whose heat rises suddenly within a certain period, such as "launch event" and "concert".
S420, acquiring a heat retention value of each keyword in the keyword library, and adding the keywords with the heat retention values larger than a first preset value into a daily heat range.
First, the heat retention value of each keyword is judged from data such as the number of times the keyword appears on each list in the historical data and how long it stays there. For example, if a keyword has appeared frequently on the information lists for 2-3 months, its overall heat is not high but its retention time is long, so the calculated heat retention value is large. When the heat retention value is greater than the first preset value, the keyword is added to the daily heat range.
S430, acquiring the heat mutation value of each keyword in the keyword library, and taking the keywords whose heat mutation value increases by more than a second preset value as candidate keywords.
The heat mutation value is judged by comparing the number of appearances and retention time of each keyword on the lists in the historical data with its number of appearances and retention time over the most recent period. If a low-heat keyword that rarely appeared in the past suddenly appears on the lists frequently and stays there for a period of time, its heat mutation value is considered large. When the increase exceeds the second preset value, the keyword is taken as a candidate keyword and enters the next round of judgment.
S440, performing a global web search based on the candidate keywords and obtaining the event time corresponding to each candidate keyword; if the event time matches the current time, the candidate keyword is added to the real-time heat range.
For example, if the candidate keyword is "concert", the keyword is searched through the interface of an internet search engine to obtain the event corresponding to the keyword and the event time of that event. Suppose the event is that xxx holds a concert. The event time represents the period during which public discussion of the event is hot. If the concert runs from 2024.5.20 to 2024.5.23, the event time can be extended forward and backward by about half a month each, namely 2024.5.5-2024.6.4, and the extended time covers the additional public attention that surrounds a high-heat event, such as the pre-sale period before the concert and the discussion period after it.
If the current time falls within the event time, the candidate keyword is considered to lie within the period of a high-heat event, so a sudden change in its heat is plausible and reasonable, and the candidate keyword can be added to the real-time heat range.
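The time matching in S440 can be sketched as a padded date-window check. This is only an illustration: the symmetric 15-day padding applied to both the start and the end of the event is an assumption, so the computed end date can differ slightly from the one given in the example above, and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple


def event_window(start: date, end: date, padding_days: int = 15) -> Tuple[date, date]:
    """Extend the event period forward and backward by the same padding."""
    padding = timedelta(days=padding_days)
    return start - padding, end + padding


def matches_event_time(
    current: date, start: date, end: date, padding_days: int = 15
) -> bool:
    """True when the current date falls inside the padded event period."""
    window_start, window_end = event_window(start, end, padding_days)
    return window_start <= current <= window_end


# Concert dates taken from the example; the current dates are hypothetical.
concert_start, concert_end = date(2024, 5, 20), date(2024, 5, 23)
during_presale = matches_event_time(date(2024, 5, 10), concert_start, concert_end)
long_after = matches_event_time(date(2024, 7, 1), concert_start, concert_end)
```

A candidate keyword would be added to the real-time heat range only when the check returns `True` for the current date.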
S450, generating corresponding second topic weights based on whether the keywords or the keyword groups are in the daily heat range or the real-time heat range.
The second topic weight differs according to the range in which the keyword or keyword group lies. In general, a keyword in the daily heat range is a reasonable and effective hot topic over a longer period: as a topic it may not be the best match for the current time, but it has a certain heat at any time. A keyword in the real-time heat range is among the hottest topics at the current time, and using it as a topic can bring higher traffic and attention. Therefore, the second topic weight of a keyword in the real-time heat range is larger than that of a keyword in the daily heat range. The specific value of the second topic weight is calculated from data such as the time and number of appearances of the keyword on the information lists in the historical data, and the calculation method can be adjusted according to actual conditions.
Meanwhile, if the keyword or keyword group is in neither the daily heat range nor the real-time heat range, it is not a recommended topic for the current or historical time, so the generated second topic weight is smaller than in the other two cases.
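The ordering described for S450 can be sketched as a simple lookup. The application fixes only the ordering (real-time range > daily range > neither) and leaves the concrete values to historical data, so the three constants below are placeholders chosen for illustration. Letting a keyword group take the highest range that any of its keywords falls into is likewise an assumption.

```python
from typing import Iterable, Set

# Hypothetical values; only their ordering is taken from the description.
REAL_TIME_WEIGHT = 1.0
DAILY_WEIGHT = 0.6
OUTSIDE_WEIGHT = 0.2


def second_topic_weight(
    keywords: Iterable[str], daily_range: Set[str], real_time_range: Set[str]
) -> float:
    """Return a weight according to the heat range the keyword(s) fall into."""
    keyword_set = set(keywords)
    # Assumption: a group counts as real-time if any of its keywords does.
    if keyword_set & real_time_range:
        return REAL_TIME_WEIGHT
    if keyword_set & daily_range:
        return DAILY_WEIGHT
    return OUTSIDE_WEIGHT


daily = {"semiconductor", "electric car"}
real_time = {"concert"}
w_concert = second_topic_weight(["concert"], daily, real_time)
w_chip = second_topic_weight(["semiconductor"], daily, real_time)
w_other = second_topic_weight(["gardening"], daily, real_time)
```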
In other embodiments, if the information corresponds to a keyword, calculating the third topic weight corresponding to the keyword includes the following steps:
S510, obtaining the number of times the keyword appears in the text of its corresponding information, defined as the original number.
That is, the number of occurrences of the keyword in the body of the information it belongs to is counted.
S511, obtaining the number of times the keyword appears in the text of each other piece of information in the topic list where the keyword is located, and selecting the maximum number and the minimum number.
That is, the occurrences of the keyword in the text of every other piece of information in the same topic list are counted, and the largest and smallest counts are selected.
S512, calculating a third topic weight based on the following formula:
third topic weight = (original number - minimum number) / (maximum number - minimum number).
The weight that the occurrence count contributes for each keyword is calculated by normalization. For example, if the original number is 6, the minimum number is 3 and the maximum number is 8, the third topic weight is (6 - 3) / (8 - 3) = 3/5. The closer the original number is to the maximum number, the larger the third topic weight; the closer it is to the minimum number, the smaller the third topic weight.
If a keyword appears in several pieces of information in the topic list, the third topic weight of the keyword is calculated separately for each of those pieces of information, the results are added, and the sum is used as the third topic weight of the keyword in that topic list.
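The normalization in S510 to S512 and the summation over several pieces of information can be sketched as follows. The function names are hypothetical, and returning 0.0 when the maximum and minimum numbers coincide is an assumption added to avoid a division by zero, which the application does not address.

```python
from typing import Iterable, Sequence, Tuple


def third_topic_weight(original_count: int, other_counts: Sequence[int]) -> float:
    """(original - minimum) / (maximum - minimum) over the other items' counts."""
    if not other_counts:
        raise ValueError("at least one other piece of information is required")
    minimum, maximum = min(other_counts), max(other_counts)
    if maximum == minimum:
        # Assumption: no spread between the counts means no contribution.
        return 0.0
    return (original_count - minimum) / (maximum - minimum)


def summed_third_topic_weight(
    occurrences: Iterable[Tuple[int, Sequence[int]]]
) -> float:
    """Add the per-item weights when the keyword appears in several items."""
    return sum(third_topic_weight(orig, others) for orig, others in occurrences)


# Worked example from the description: original 6, minimum 3, maximum 8.
single = third_topic_weight(6, [3, 8, 5])
# Hypothetical keyword found in two items of the same topic list.
combined = summed_third_topic_weight([(6, [3, 8]), (4, [3, 8])])
```

The first call reproduces the 3/5 of the worked example; the second adds 3/5 and 1/5.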
In other embodiments, if the information corresponds to a keyword group, calculating the third topic weight corresponding to the keyword group includes the following steps:
S520, acquiring the plurality of keywords in the keyword group.
If a keyword group is selected, the keyword group is decomposed into its keywords.
S521, obtaining the number of times each keyword appears in the text of its corresponding information, defined as the original number.
S522, the number of times that each keyword appears in the text of other information in the topic selection list where the keyword is located is respectively obtained, and the maximum number of times and the minimum number of times corresponding to each keyword are respectively selected.
S523, calculating third topic weights corresponding to the keywords based on the following formulas:
third topic weight = (original number - minimum number) / (maximum number - minimum number).
The third topic weight of each keyword is obtained through the above steps in the same way as in S510-S512, and when a keyword appears in more than one piece of information in the topic list, its final third topic weight is obtained by addition.
S524, judging whether the difference value between the third topic weights of the keywords is larger than a preset value.
If the information corresponds to a keyword group, the difference between the third topic weights of the decomposed keywords also needs to be determined. Depending on whether this difference is greater than the preset value, a different method is used to select the third topic weight of the keyword group.
S525, if it is greater than the preset value, selecting the third topic weight with the largest value as the third topic weight corresponding to the keyword group.
If the difference between the third topic weights of the keywords is greater than the preset value, the individual heats of the keywords within the group are unbalanced, which may be caused by the different semantic roles the words play in the text. For example, suppose the title of a piece of information is "athlete y wins gold medals at the x Games" and the keyword group is "x Games - y". In the text, "x Games" denotes the overall event and "y" is a person's name. Following the semantic logic and the line of the writing, the text introduces the Games at the beginning and then describes athlete y, so "y" occurs far more often than "x Games". As the description of the overall event, "x Games" does not lose heat simply because it occurs less often. Therefore, to keep the weight of the occurrence-count parameter accurate, the value of the keyword with the most occurrences, that is, the largest third topic weight, can be selected as the third topic weight of the keyword group, so that a keyword with very few occurrences but high heat does not drag down the overall weight value.
S526, if not, taking the average value of the third topic weights as the third topic weight corresponding to the keyword group.
If the difference between the third topic weights of the keywords in the keyword group is not greater than the preset value, the keywords occur with similar frequency and their influence on the frequency-based weight is balanced. Because the group contains several keywords, the calculated third topic weights are averaged to obtain the third topic weight of the group, which keeps the third topic weight numerically comparable between a single keyword and a keyword group composed of several keywords.
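The selection rule of S524 to S526 can be sketched as below. The application does not say how "the difference" is measured when a group has more than two keywords; taking the gap between the largest and the smallest weight is an assumption, and the function name and threshold value are hypothetical.

```python
from typing import Sequence


def group_third_topic_weight(
    keyword_weights: Sequence[float], preset_value: float
) -> float:
    """Pick the maximum when the weights are unbalanced, else their average."""
    if not keyword_weights:
        raise ValueError("a keyword group needs at least one keyword weight")
    # Assumption: the difference is the spread between the extreme weights.
    difference = max(keyword_weights) - min(keyword_weights)
    if difference > preset_value:
        # S525: unbalanced group, use the largest third topic weight.
        return max(keyword_weights)
    # S526: balanced group, use the mean of the third topic weights.
    return sum(keyword_weights) / len(keyword_weights)


unbalanced = group_third_topic_weight([0.9, 0.1], preset_value=0.5)
balanced = group_third_topic_weight([0.6, 0.4], preset_value=0.5)
```

With a hypothetical preset value of 0.5, the first group is treated as unbalanced and keeps 0.9, while the second is averaged to 0.5.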
In other embodiments, the final weight corresponding to each keyword or keyword group is calculated based on the first topic weight, the second topic weight, and the third topic weight, including the steps of:
S610, calculating a final weight by the following formula:
wherein n denotes the number of topic lists, k1, k2 and k3 are calculation coefficients, and the sum of k1, k2 and k3 is 1.
k1, k2 and k3 are numerical coefficients that represent the proportion each topic weight contributes to the final weight. In the embodiment of the present application, k1 and k2 are both 0.3 and k3 is 0.4.
The final weight adds up the weights obtained from the individual topic lists: because a keyword may appear in several topic lists, the values calculated for each topic list are compared across lists and summed to obtain the overall heat weight of the keyword or keyword group.
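The formula of S610 is not reproduced in this text, so the sketch below is one reading of the surrounding description rather than the application's exact expression: it assumes the final weight is the sum, over the n topic lists in which the keyword or keyword group appears, of k1 × first topic weight + k2 × second topic weight + k3 × third topic weight. The function name and the sample weights are hypothetical.

```python
from typing import Iterable, Tuple


def final_weight(
    per_list_weights: Iterable[Tuple[float, float, float]],
    k1: float = 0.3,
    k2: float = 0.3,
    k3: float = 0.4,
) -> float:
    """Sum the weighted combination of the three topic weights over all lists."""
    if abs(k1 + k2 + k3 - 1.0) > 1e-9:
        raise ValueError("k1, k2 and k3 must sum to 1")
    # Assumed form: one (first, second, third) weight triple per topic list.
    return sum(k1 * w1 + k2 * w2 + k3 * w3 for w1, w2, w3 in per_list_weights)


# Hypothetical keyword that appears in two topic lists.
total = final_weight([(1.0, 0.5, 0.6), (0.5, 0.5, 0.0)])
```

With the embodiment's coefficients (0.3, 0.3, 0.4), the two lists contribute 0.69 and 0.30 respectively.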
In other embodiments, the final weight corresponding to each keyword or keyword group is calculated based on the first topic weight, the second topic weight, and the third topic weight, and further comprising the steps of:
S620, judging whether the information list on the corresponding platform has classification lists.
A classification list is a list divided by specific field or specific industry. For example, in addition to their overall information list, some platforms also have an economic information list, an entertainment information list, an industrial information list and the like.
S621, if yes, obtaining the classification information corresponding to the classification list, and judging the existence quantity of each classification information in the topic selection pool.
If classification lists exist, the specific classification information corresponding to each classification list is obtained, and the number of lists of each classification present in the topic pool is counted.
S622, if the keyword or the keyword group corresponds to the classification list, multiplying the calculated final weight by the corresponding classification coefficient.
If the keyword or keyword group corresponds to a classification list, its final weight is multiplied by a coefficient, which acts as a bonus for lists of different fine-grained classifications. For example, a keyword that appears on an economic list most likely corresponds to economic content, which rarely appears on other types of list. Its absence from the other lists does not mean its current heat is low; it appears less often only because that type of list is specialized. Therefore, to prevent keywords of a specific field or industry from having inaccurate weight values because of this limitation when the final weight is calculated, the final weight is multiplied by the classification coefficient to compensate for the limitation imposed by the specialized list type.
The classification coefficient is greater than 1, and its size is inversely proportional to the number of times the classification information corresponding to the classification list appears in the topic pool. That is, the fewer times the field of a classification list appears in the topic pool, the more niche that field is, the larger the classification coefficient to be multiplied, and the greater the weight compensation; conversely, the more times the field appears in the topic pool, the more mainstream that field is, the smaller the classification coefficient, and the smaller the weight compensation.
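One way to obtain a coefficient that is always greater than 1 and shrinks as the classification becomes more common is sketched below. The form 1 + scale / count is an assumption chosen to satisfy both stated properties; the application does not give the exact expression, and the function names and the `scale` parameter are hypothetical.

```python
def classification_coefficient(count_in_pool: int, scale: float = 1.0) -> float:
    """Coefficient > 1 whose bonus part is inversely proportional to the count."""
    if count_in_pool < 1:
        raise ValueError("the classification must appear at least once in the pool")
    if scale <= 0:
        raise ValueError("scale must be positive so the coefficient exceeds 1")
    # Assumed form: the fewer lists of this classification, the larger the bonus.
    return 1.0 + scale / count_in_pool


def apply_classification(final_weight: float, count_in_pool: int) -> float:
    """Multiply a final weight by its classification coefficient (S622)."""
    return final_weight * classification_coefficient(count_in_pool)


niche = classification_coefficient(1)
common = classification_coefficient(4)
compensated = apply_classification(0.8, count_in_pool=4)
```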
In other embodiments, in the topic selection list, the method for generating the links of the information corresponding to each keyword or keyword group includes the following steps:
S710, obtaining, among all the information containing the keyword or keyword group, the information with the largest first topic weight as target information.
S720, obtaining links corresponding to the target information based on the channel interfaces.
S730, sending the link and binding it to the keyword or keyword group.
When a keyword or keyword group corresponds to several pieces of information, the piece with the largest first topic weight is preferred. This is the piece that contains the keyword or keyword group and ranks nearest the front, that is, the hottest information.
The selected information is taken as the target information, and its link is obtained through the channel interface. When the user later selects a keyword or keyword group and wants to view the corresponding content, the user can select the link bound to that keyword or keyword group and jump directly to the corresponding information.
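Steps S710 to S730 amount to picking the item with the largest first topic weight and attaching its link to the keyword. The sketch below assumes each item already carries its link, whereas the application fetches the link through the channel interface; the `Item` class, the function name and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    title: str
    link: str
    first_weight: float


def bind_links(items_by_keyword: Dict[str, List[Item]]) -> Dict[str, str]:
    """Map each keyword to the link of its item with the largest first weight."""
    bound: Dict[str, str] = {}
    for keyword, items in items_by_keyword.items():
        if not items:
            continue  # nothing to bind for a keyword without information
        target = max(items, key=lambda item: item.first_weight)
        bound[keyword] = target.link
    return bound


links = bind_links(
    {
        "concert": [
            Item("Concert tickets on sale", "https://example.com/a", 0.4),
            Item("Concert sells out in minutes", "https://example.com/b", 1.2),
        ]
    }
)
```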
In other embodiments, after the user selects the keyword or the keyword group corresponding to the corresponding topic, the method further includes:
The user selects the link corresponding to the keyword or keyword group and jumps to the text of the information. If the user wants to generate a topic viewpoint based on the information, the text of the information is transmitted, negative viewpoints in the text are identified by the negative filtering model and filtered out, key sentences in the text, including positive viewpoint sentences, summary sentences and the like, are extracted by other big data models, and the topic viewpoint is generated based on the corresponding data.
In other embodiments, after the topic viewpoint is generated, a topic script may be generated by a preset trained large model. The large model learns the writing style, popular internet expressions, writing techniques and the like of specific articles from a large number of script training sets, and a topic script of the corresponding style is generated by feeding the topic viewpoint into the large model.
As shown in fig. 2, the present application also discloses a topic-selecting inspiration engine, which comprises:
a channel interface, used for connecting to each platform;
a crawler module, used for crawling the information lists on the platforms connected through the channel interfaces;
a topic pool, used for holding a plurality of information lists and converting each information list into a corresponding independent topic list;
a weight extraction module, used for calculating the first topic weight corresponding to each piece of information based on the position of that information in the topic list; selecting, based on the keyword library, the keyword corresponding to the title of each piece of information or a keyword group composed of several keywords, and calculating the corresponding second topic weight based on the content heat of each keyword or keyword group; and calculating the third topic weight corresponding to the keyword or keyword group based on the number of times the keyword or keyword group appears in the text of its corresponding information and in the text of the other information in the topic list;
a topic weight calculation module, used for calculating the final weight corresponding to each keyword or keyword group based on the first topic weight, the second topic weight and the third topic weight; and
a topic generation module, used for selecting a preset number of keywords or keyword groups from the topic pool in descending order of final weight to construct a topic list, wherein the topic list includes the final weight value corresponding to each keyword or keyword group and the link of the corresponding information.
The implementation principle is as follows:
A large number of platforms are connected and integrated, and the information lists on those platforms are crawled. The keywords extracted from the hot information of the different platforms are compared across platforms by a weighting algorithm, and the overall heat of each keyword is judged, so that the newest and hottest keywords across all platforms can be obtained quickly and accurately as preferred topics. The user obtains topic inspiration directly from the calculated and screened keywords, which reduces the workload.
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein.
The above embodiments are not intended to limit the scope of protection of the present application. Therefore, all equivalent changes made according to the structure, shape and principle of the application shall fall within the scope of protection of the application.

Claims (8)

1.一种多平台融合的智能选题灵感生成方法,其特征在于,包括以下步骤:1. A multi-platform integrated intelligent topic inspiration generation method, characterized by comprising the following steps: 通过若干渠道接口爬取相应平台上的资讯榜单;Crawl the information lists on the corresponding platforms through several channel interfaces; 将若干所述资讯榜单加入预约的选题池中,且各所述资讯榜单皆作为一个独立的选题列表;Adding a number of the above-mentioned information lists to the reserved topic pool, and each of the above-mentioned information lists is regarded as an independent topic list; 基于所述选题列表中各个资讯所处的位序计算各所述资讯对应的第一选题权重;Calculate the first topic weight corresponding to each information based on the position of each information in the topic list; 基于关键词库选取各所述资讯的标题对应的关键词或由若干关键词组成的关键词组,并基于各所述关键词或所述关键词组的内容热度计算对应的第二选题权重,其中,所述内容热度表征为关键词或关键词组对应的事件或对象在对应时间段上在社会上的活跃程度;Selecting keywords or keyword groups consisting of several keywords corresponding to the titles of the information based on the keyword library, and calculating corresponding second topic selection weights based on the content popularity of the keywords or keyword groups, wherein the content popularity is characterized by the degree of social activity of the events or objects corresponding to the keywords or keyword groups in the corresponding time period; 基于所述关键词或所述关键词组在其对应的所述资讯的正文中出现的次数以及在所述选题列表中其他的所述资讯的正文中出现的次数,以计算出所述关键词或所述关键词组对应的第三选题权重;Calculate the third topic weight corresponding to the keyword or the keyword group based on the number of times the keyword or the keyword group appears in the text of the corresponding information and the number of times the keyword or the keyword group appears in the text of other information in the topic list; 具体的,基于所述资讯对应所述关键词或所述关键词组,直接获取所述关键词或分别获取所述关键词组内的各所述关键词在其对应的所述资讯的正文中出现的次数并定义为原始次数,获取所述关键词或所述关键词组内的各所述关键词在其所处的所述选题列表中其他的各个所述资讯的正文中出现的次数,并选择各所述关键词对应的最大次数和最小次数;Specifically, based on the keyword or the keyword group corresponding to the information, directly obtain the keyword or respectively obtain the number of times each keyword in the keyword group appears in the 
text of the corresponding information and define it as the original number, obtain the number of times the keyword or each keyword in the keyword group appears in the text of each other information in the topic list where it is located, and select the maximum number and the minimum number corresponding to each keyword; 基于以下公式计算所述关键词对应的所述第三选题权重:The third topic weight corresponding to the keyword is calculated based on the following formula: 第三选题权重=(原始次数-最小次数)/(最大次数-最小次数);The weight of the third question = (original number - minimum number) / (maximum number - minimum number); 其中,若所述资讯对应关键词组,则判断若干所述关键词的所述第三选题权重之间的差值是否大于预设值,若大于,则选择数值最大的所述第三选题权重作为该所述关键词组对应的所述第三选题权重,若不大于,则基于若干所述第三选题权重的平均值作为该所述关键词组对应的所述第三选题权重;Wherein, if the information corresponds to a keyword group, it is determined whether the difference between the third topic weights of several keywords is greater than a preset value. If it is greater, the third topic weight with the largest value is selected as the third topic weight corresponding to the keyword group. If it is not greater, the average value of several third topic weights is used as the third topic weight corresponding to the keyword group. 基于所述第一选题权重、所述第二选题权重和所述第三选题权重计算出各个所述关键词或所述关键词组对应的最终权重;Calculate the final weight corresponding to each of the keywords or the keyword groups based on the first topic selection weight, the second topic selection weight, and the third topic selection weight; 在所述选题池中沿所述最终权重从高到低的顺序选择预设数量的所述关键词或所述关键词组构建选题列表,所述选题列表中包含各关键词或所述关键词组对应的所述最终权重的数值及对应的所述资讯的链接。A preset number of the keywords or keyword groups are selected from the topic pool in order from high to low along the final weight to construct a topic list, wherein the topic list includes the value of the final weight corresponding to each keyword or keyword group and the corresponding link to the information. 2.根据权利要求1所述的多平台融合的智能选题灵感生成方法,其特征在于,通过若干渠道接口爬取相应平台上的资讯榜单之后,还包括以下步骤:2. 
The method for generating intelligent topic inspiration based on multi-platform integration according to claim 1 is characterized in that after crawling the information list on the corresponding platform through several channel interfaces, it also includes the following steps: 通过预设的负面过滤模型判断各所述资讯榜单上的若干所述资讯是否属于负面资讯,并将所述负面资讯进行过滤;Determine whether some of the information on each of the information lists is negative information through a preset negative filtering model, and filter the negative information; 通过预设的特定词库判断各个所述资讯榜单上的若干所述资讯中是否存在特定过滤词,并将存在所述特定过滤词的所述资讯进行过滤。It is determined through a preset specific word library whether there are specific filtering words in the information on each of the information lists, and the information containing the specific filtering words is filtered. 3.根据权利要求1所述的多平台融合的智能选题灵感生成方法,其特征在于,基于所述选题列表中各个资讯所处的位序计算各所述资讯对应的第一选题权重,包括以下步骤:3. The multi-platform integrated intelligent topic inspiration generation method according to claim 1 is characterized in that the first topic weight corresponding to each piece of information is calculated based on the position of each piece of information in the topic list, comprising the following steps: 基于所述资讯在其所在的所述选题列表中的位序生成初步权重,其中,所述位序越靠前,所述初步权重越大;Generate a preliminary weight based on the position of the information in the topic list, wherein the higher the position, the greater the preliminary weight; 基于所述选题列表的长度生成最大权重;generating a maximum weight based on the length of the topic list; 基于所述选题列表对应的所述平台生成效应加成系数;The platform generation effect addition coefficient corresponding to the topic list is based on the platform generation effect addition coefficient; 根据所述初步权重与所述最大权重的比值再乘以所述效应加成系数计算出所述第一选题权重。The first topic selection weight is calculated according to the ratio of the preliminary weight to the maximum weight multiplied by the effect addition coefficient. 4.根据权利要求1所述的多平台融合的智能选题灵感生成方法,其特征在于,基于各所述关键词或所述关键词组的内容热度计算对应的第二选题权重,包括以下步骤:4. 
The multi-platform integrated intelligent topic inspiration generation method according to claim 1 is characterized in that the corresponding second topic weight is calculated based on the content heat of each keyword or keyword group, comprising the following steps: 在所述关键词库中框选出日常热度范围和实时热度范围,所述日常热度范围和所述实时热度范围内皆包含若干关键词;Selecting a daily heat range and a real-time heat range from the keyword library, wherein both the daily heat range and the real-time heat range contain a number of keywords; 获取所述关键词库中各所述关键词的热度保持值,并将所述热度保持值大于第一预设值的所述关键词加入所述日常热度范围内;Acquire the heat retention value of each of the keywords in the keyword library, and add the keywords whose heat retention value is greater than a first preset value into the daily heat range; 获取所述关键词库中各所述关键词的热度突变值,并将所述热度突变值增幅大于第二预设值的所述关键词作为候选关键词;Obtaining a sudden change value of heat of each keyword in the keyword library, and taking the keyword whose sudden change value increase is greater than a second preset value as a candidate keyword; 基于所述候选关键词进行全局联网搜索,并获取所述候选关键词对应的事件时间,若所述事件时间与当前时间相匹配,则将所述候选关键词加入所述实时热度范围内;Perform a global online search based on the candidate keyword, and obtain the event time corresponding to the candidate keyword. If the event time matches the current time, add the candidate keyword to the real-time popularity range; 基于所述关键词或所述关键词组是否处于所述日常热度范围内或所述实时热度范围内以生成相应的所述第二选题权重。The corresponding second topic selection weight is generated based on whether the keyword or the keyword group is within the daily popularity range or the real-time popularity range. 5.根据权利要求1所述的多平台融合的智能选题灵感生成方法,其特征在于,基于所述第一选题权重、所述第二选题权重和所述第三选题权重计算出各个所述关键词或所述关键词组对应的最终权重,包括以下步骤:5. 
The multi-platform integrated intelligent topic inspiration generation method according to claim 1, characterized in that the final weight corresponding to each of the keywords or the keyword groups is calculated based on the first topic weight, the second topic weight and the third topic weight, comprising the following steps: 通过以下公式计算所述最终权重:The final weight is calculated by the following formula: , 其中,n表征为所述选题列表的数量,k1、k2、k3皆为计算系数,k1、k2、k3之和为1。Among them, n represents the number of the topic list, k1, k2, k3 are all calculation coefficients, and the sum of k1, k2, k3 is 1. 6.根据权利要求5所述的多平台融合的智能选题灵感生成方法,其特征在于,基于所述第一选题权重、所述第二选题权重和所述第三选题权重计算出各个所述关键词或所述关键词组对应的最终权重,还包括以下步骤:6. The multi-platform integrated intelligent topic inspiration generation method according to claim 5, characterized in that the final weight corresponding to each of the keywords or the keyword groups is calculated based on the first topic weight, the second topic weight and the third topic weight, and further comprising the following steps: 判断相应平台上的所述资讯榜单是否存在分类榜单;Determine whether there is a classified list for the information list on the corresponding platform; 若存在,则获取所述分类榜单对应的分类信息,并判断所述选题池中各所述分类信息存在的数量;If so, obtain the classification information corresponding to the classification list, and determine the number of each classification information in the topic pool; 若所述关键词或所述关键词组对应于所述分类榜单,则将计算出的所述最终权重乘以相应的分类系数;其中,所述分类系数大于1且其大小与其所述分类榜单对应的所述分类信息在所述选题池中的数量成反比。If the keyword or the keyword group corresponds to the classification list, the calculated final weight is multiplied by the corresponding classification coefficient; wherein the classification coefficient is greater than 1 and its size is inversely proportional to the amount of the classification information corresponding to the classification list in the topic pool. 7.根据权利要求1所述的多平台融合的智能选题灵感生成方法,其特征在于,在所述选题列表中,各所述关键词或所述关键词组对应的所述资讯的链接的生成方法,包括以下步骤:7. 
The multi-platform integrated intelligent topic selection inspiration generation method according to claim 1, characterized in that, in the topic list, the method for generating the link to the information corresponding to each keyword or keyword group comprises the following steps:
acquiring, as target information, the piece of information with the largest first topic selection weight among all pieces of information containing the keyword or the keyword group;
acquiring the link corresponding to the target information through the channel interface;
binding the link to the keyword or the keyword group.
8. A topic selection inspiration engine, characterized by comprising:
a channel interface, used to connect to each platform;
a crawler module, used to crawl the information lists on the platforms connected through the channel interfaces;
a topic pool, used to hold a number of the information lists and to convert each information list into a corresponding independent topic list;
a weight extraction module, used to calculate the first topic selection weight of each piece of information based on its position in the topic list; further used to select, based on a keyword library, the keyword, or the keyword group consisting of several keywords, corresponding to the title of each piece of information, and to calculate the corresponding second topic selection weight based on the content popularity of each keyword or keyword group, wherein the content popularity denotes how active the event or object corresponding to the keyword or keyword group is in society during the corresponding time period; further used to calculate the third topic selection weight of the keyword or keyword group based on the number of times it appears in the body text of its corresponding piece of information and the number of times it appears in the body text of the other pieces of information in the topic list;
further used, depending on whether the information corresponds to a keyword or to a keyword group, to obtain the number of times the keyword, or each keyword in the keyword group, appears in the body text of its corresponding piece of information, defined as the original count; to obtain the number of times the keyword, or each keyword in the keyword group, appears in the body text of each of the other pieces of information in the same topic list; and to select the maximum count and the minimum count for each keyword;
the third topic selection weight corresponding to a keyword is calculated by the following formula:
third topic selection weight = (original count - minimum count) / (maximum count - minimum count);
wherein, if the information corresponds to a keyword group, it is determined whether the difference between the third topic selection weights of the keywords in the group is greater than a preset value; if so, the largest third topic selection weight is taken as the third topic selection weight of the keyword group; if not, the average of the third topic selection weights is taken as the third topic selection weight of the keyword group;
a topic selection weight calculation module, used to calculate the final weight corresponding to each keyword or keyword group based on the first topic selection weight, the second topic selection weight and the third topic selection weight;
a topic generation module, used to select, in the topic pool, a preset number of the keywords or keyword groups in descending order of final weight to build a topic list, wherein the topic list contains the final weight value of each keyword or keyword group and the link to the corresponding information.
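The range-building steps of claim 4 can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the names `KeywordStats`, `build_popularity_ranges` and `lookup_event_date` are hypothetical, the global online search is stood in for by a caller-supplied lookup function, and "the event time matches the current time" is assumed here to mean the same calendar date. The claim does not state how membership in a range maps to a numeric second topic selection weight, so the sketch stops at the two ranges.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class KeywordStats:
    """Hypothetical per-keyword statistics from the keyword library."""
    retention: float       # popularity retention value
    surge_increase: float  # increase of the popularity sudden-change value


def build_popularity_ranges(
    stats: Dict[str, KeywordStats],
    first_preset: float,
    second_preset: float,
    lookup_event_date: Callable[[str], Optional[date]],
    today: date,
) -> Tuple[Set[str], Set[str]]:
    """Return (daily popularity range, real-time popularity range)."""
    # Keywords whose retention value exceeds the first preset value
    # enter the daily popularity range.
    daily = {kw for kw, s in stats.items() if s.retention > first_preset}

    # Keywords whose sudden-change increase exceeds the second preset
    # value become candidates for the real-time range.
    candidates = [kw for kw, s in stats.items()
                  if s.surge_increase > second_preset]

    # A candidate is admitted only if the event found for it is current.
    # Assumption: "matches the current time" means the same calendar date.
    realtime: Set[str] = set()
    for kw in candidates:
        event_date = lookup_event_date(kw)  # stand-in for the online search
        if event_date is not None and event_date == today:
            realtime.add(kw)
    return daily, realtime
```

Passing the search step in as a function keeps the thresholding logic testable without network access; a real system would substitute its own search client and its own notion of time matching.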
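The third topic selection weight in claim 8 is a min-max normalization of keyword counts, followed by a max-or-average rule for keyword groups. A minimal Python sketch is given below under several assumptions that the claim text does not settle: the function names are hypothetical, occurrences are counted as plain substring matches, the maximum and minimum are taken over the counts in all pieces of information in the list (the original one included, so the result stays in [0, 1]), a zero denominator yields 0.0, and the "difference" between the weights in a group is read as the largest weight minus the smallest.

```python
from typing import List


def count_occurrences(keyword: str, text: str) -> int:
    """Count non-overlapping substring occurrences (an assumed matching rule)."""
    return text.count(keyword)


def keyword_third_weight(keyword: str, own_text: str,
                         other_texts: List[str]) -> float:
    """(original count - minimum count) / (maximum count - minimum count)."""
    original = count_occurrences(keyword, own_text)
    # Assumption: max/min span the original text and the other texts.
    counts = [original] + [count_occurrences(keyword, t) for t in other_texts]
    lowest, highest = min(counts), max(counts)
    if highest == lowest:
        return 0.0  # assumption: avoid division by zero when counts are equal
    return (original - lowest) / (highest - lowest)


def group_third_weight(keywords: List[str], own_text: str,
                       other_texts: List[str], preset_gap: float) -> float:
    """Combine per-keyword weights for a keyword group."""
    if not keywords:
        raise ValueError("a keyword group needs at least one keyword")
    weights = [keyword_third_weight(k, own_text, other_texts) for k in keywords]
    # Assumption: "difference" means the spread between largest and smallest.
    if max(weights) - min(weights) > preset_gap:
        return max(weights)
    return sum(weights) / len(weights)
```

With this rule, a group dominated by one strongly represented keyword keeps that keyword's weight, while a group of similarly weighted keywords is averaged.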
CN202411248796.2A 2024-09-06 2024-09-06 Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine Active CN118780280B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411248796.2A CN118780280B (en) 2024-09-06 2024-09-06 Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411248796.2A CN118780280B (en) 2024-09-06 2024-09-06 Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine

Publications (2)

Publication Number Publication Date
CN118780280A CN118780280A (en) 2024-10-15
CN118780280B true CN118780280B (en) 2024-11-26

Family

ID=92986648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411248796.2A Active CN118780280B (en) 2024-09-06 2024-09-06 Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine

Country Status (1)

Country Link
CN (1) CN118780280B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111881275A (en) * 2020-07-24 2020-11-03 新华智云科技有限公司 Efficient hotspot identification and matching method
CN111898369A (en) * 2020-08-17 2020-11-06 腾讯科技(深圳)有限公司 Article title generation method, model training method and device and electronic equipment
CN116484845A (en) * 2022-01-14 2023-07-25 腾讯科技(深圳)有限公司 Method and device for updating real-time hotword of input method and electronic equipment
CN118170899A (en) * 2024-05-09 2024-06-11 珠海传媒融创科技有限公司 AIGC-based media news manuscript generation method and related device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105528432B (en) * 2015-12-15 2019-04-26 北大方正集团有限公司 A method and device for generating a digital resource hotspot
CN107153658A (en) * 2016-03-03 2017-09-12 常州普适信息科技有限公司 A method for discovering public opinion hot words based on keyword weighting algorithm
CN107168943B (en) * 2017-04-07 2018-07-03 平安科技(深圳)有限公司 The method and apparatus of topic early warning
CN108959383A (en) * 2018-05-31 2018-12-07 平安科技(深圳)有限公司 Analysis method, device and the computer readable storage medium of network public-opinion
CN110874530B (en) * 2019-10-30 2023-06-13 深圳价值在线信息科技股份有限公司 Keyword extraction method, keyword extraction device, terminal equipment and storage medium
CN113822067A (en) * 2021-08-17 2021-12-21 深圳市东信时代信息技术有限公司 Key information extraction method and device, computer equipment and storage medium
CN115982473B (en) * 2023-03-21 2023-06-23 环球数科集团有限公司 An AIGC-based public opinion analysis and arrangement system

Also Published As

Publication number Publication date
CN118780280A (en) 2024-10-15

Similar Documents

Publication Publication Date Title
CN109933664B (en) An Improved Method for Fine-Grained Sentiment Analysis Based on Sentiment Word Embedding
CN105740228B (en) A kind of internet public feelings analysis method and system
Knees et al. A survey of music similarity and recommendation from music context data
CN103150333B (en) Opinion leader identification method in microblog media
US20020078044A1 (en) System for automatically classifying documents by category learning using a genetic algorithm and a term cluster and method thereof
US20070260586A1 (en) Systems and methods for selecting and organizing information using temporal clustering
CN110705288A (en) Big data-based public opinion analysis system
CN103870523A (en) Analyzing content to determine context and serving relevant content based on the context
CN106354845A (en) Microblog rumor recognizing method and system based on propagation structures
CN112861541A (en) Commodity comment sentiment analysis method based on multi-feature fusion
CN112905800B (en) Sentiment early warning method based on public opinion knowledge graph and XGBoost multi-feature fusion
CN104268130A (en) Social advertising facing Twitter feasibility analysis method
CN110609950B (en) Public opinion system search word recommendation method and system
Dang et al. Machine learning approaches for mood classification of songs toward music search engine
Yao et al. Online deception detection refueled by real world data collection
CN118780280B (en) Multi-platform integrated intelligent topic selection inspiration generation method and topic selection inspiration engine
CN118246988B (en) A personalized content push method and system based on AI
KR101355956B1 (en) Method and apparatus for sorting news articles in order to suggest opposite perspecitves for contentious issues
CN118093857A (en) Tourist destination image visualization method based on LDA and BERT-BiLSTM-attribute model
CN111859165A (en) A real-time personalized information flow recommendation method based on user behavior
Kian et al. An efficient approach for keyword selection; improving accessibility of web contents by general search engines
CN117033655A (en) Scientific and technological mist event spreading method based on social media data
Kamaliha et al. Characterizing network motifs to identify spam comments
CN116167525A (en) A method and system for predicting the spread of public opinion
CN115964574A (en) A data mining-based method for evaluating the popularity of public opinion on smart traffic safety

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant