CN112069802B - Article quality scoring method, article quality scoring device and storage medium - Google Patents

Article quality scoring method, article quality scoring device and storage medium

Info

Publication number
CN112069802B
CN112069802B CN202010873079.4A CN202010873079A CN112069802B CN 112069802 B CN112069802 B CN 112069802B CN 202010873079 A CN202010873079 A CN 202010873079A CN 112069802 B CN112069802 B CN 112069802B
Authority
CN
China
Prior art keywords
article
feature
static
scored
quality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010873079.4A
Other languages
Chinese (zh)
Other versions
CN112069802A (en)
Inventor
史小婉
覃玉清
陈婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202010873079.4A
Publication of CN112069802A
Application granted
Publication of CN112069802B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to an article quality scoring method, an article quality scoring device and a storage medium. The article quality scoring method comprises: determining a plurality of static features contained in an article to be scored; determining a static feature score and a static feature weight for each of the plurality of static features; multiplying each static feature score by its corresponding static feature weight and accumulating the products to obtain a static feature total score; and determining the article quality score of the article to be scored based on the static feature total score. With this method, the article quality score of the article to be scored can be determined efficiently and accurately, so that high-quality articles can be recommended to users and user satisfaction with recommended articles is improved.

Description

Article quality scoring method, article quality scoring device and storage medium
Technical Field
The disclosure relates to the technical field of article quality scoring, in particular to an article quality scoring method, an article quality scoring device and a storage medium.
Background
With the rapid development of internet technology, accessing the internet through a browser or a browser-like application has become an important means for users to obtain information.
To provide a good user experience, a browser or browser-like application often recommends newly added articles to the user every day. However, the newly added articles include many low-quality articles, and recommending a large number of them leads to user churn. To reduce churn and improve the user experience, it is important to recommend high-quality articles. Accordingly, in the related art, the quality of an article is scored and the article is recommended to users based on the scoring result.
Currently, in the related art, quality scoring of an article to be scored is achieved using text scoring features of the article and a scoring model constructed in advance. However, this method uses only the text data of the article and does not consider other features, such as the pictures in the article body or the category of the article, so the accuracy of the resulting quality score is low.
Also in the related art, browsing behavior information is obtained when a user browses a target article (also called the article to be scored), a browsing behavior score of the user for the target article is obtained from the browsing behavior information and the corresponding browsing behavior coefficients, and the article quality score of the target article is finally obtained from the browsing behavior scores of a plurality of users. However, this method uses only users' browsing behavior data and does not make full use of the features of the article itself, which can reduce the accuracy of the quality score. Furthermore, to evaluate the quality of a target article with this method, the article must first be recommended to a plurality of users, so low-quality target articles are likely to be recommended to users, giving them a bad experience.
Therefore, how to score article quality efficiently and accurately is a current focus of attention.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an article quality scoring method, an article quality scoring apparatus, and a storage medium.
According to a first aspect of embodiments of the present disclosure, an article quality scoring method is provided, which includes: determining a plurality of static features contained in an article to be scored; determining a static feature score and a static feature weight for each of the plurality of static features; multiplying the static feature score of each static feature by its static feature weight and accumulating the products to obtain a static feature total score; and determining an article quality score of the article to be scored based on the static feature total score.
In one embodiment, the article quality scoring method further comprises presetting a plurality of feature threshold intervals based on the article type and a feature threshold of at least one reference static feature, wherein each feature threshold interval corresponds to a weight combination and each weight combination comprises weights of the plurality of static features. Determining the static feature weight of each of the plurality of static features comprises: determining, based on the type of the article to be scored, the value of the reference static feature in the article to be scored, and the feature threshold, the feature threshold interval into which the reference static feature of the article to be scored falls; and determining the static feature weight of each of the plurality of static features based on the weight combination corresponding to that feature threshold interval.
In another embodiment, presetting the plurality of feature threshold intervals based on the article type and the feature threshold of at least one reference static feature comprises: setting at least one feature threshold for each of a plurality of preset reference static features, wherein different feature thresholds are set for the reference static features of different article types; and combining any two of the feature thresholds corresponding to all of the reference static features to form the plurality of feature threshold intervals.
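As a rough illustration of the interval-to-weight lookup described in the two embodiments above, the following Python sketch buckets two reference static features (text length and text picture number) by their thresholds and looks up a weight combination for the resulting interval. All names, the article type "news", and every numeric weight are hypothetical; the thresholds 600, 1000 and 5 only echo the example values given later in the description; treating a value equal to a threshold as belonging to the upper bucket, and sharing one weight table across article types, are simplifying assumptions rather than part of the disclosure.

```python
from bisect import bisect_right

# Order of the static features inside every weight combination.
FEATURES = ("body_len", "paragraph", "img_num", "img_clarity", "neg_feature", "author")

# Hypothetical feature thresholds of two reference static features, per article type.
THRESHOLDS = {
    "news": {"body_len": [600, 1000], "img_num": [5]},
}

# Hypothetical weight combinations, one per feature threshold interval.
# Key: (text-length bucket, picture-number bucket); values follow FEATURES.
WEIGHT_COMBINATIONS = {
    (0, 0): (0.30, 0.10, 0.10, 0.10, 0.25, 0.15),
    (0, 1): (0.25, 0.10, 0.15, 0.15, 0.20, 0.15),
    (1, 0): (0.25, 0.15, 0.10, 0.10, 0.25, 0.15),
    (1, 1): (0.20, 0.15, 0.15, 0.15, 0.20, 0.15),
    (2, 0): (0.20, 0.20, 0.10, 0.10, 0.25, 0.15),
    (2, 1): (0.15, 0.15, 0.20, 0.20, 0.15, 0.15),
}


def threshold_interval(article_type, body_len, img_num):
    """Return the interval, as a pair of bucket indices, that the article falls into."""
    thresholds = THRESHOLDS[article_type]
    return (
        bisect_right(thresholds["body_len"], body_len),
        bisect_right(thresholds["img_num"], img_num),
    )


def static_feature_weights(article_type, body_len, img_num):
    """Look up the weight combination that corresponds to the article's interval."""
    combination = WEIGHT_COMBINATIONS[threshold_interval(article_type, body_len, img_num)]
    return dict(zip(FEATURES, combination))
```

For instance, `static_feature_weights("news", 800, 7)` selects the combination stored under interval `(1, 1)`. A real system would likely keep a separate table per article type.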
In still another embodiment, the article quality scoring method further comprises determining whether the article to be scored meets a preset additional scoring standard, wherein the additional scoring standard comprises a bonus standard and/or a deduction standard. Determining the article quality score of the article to be scored based on the static feature total score comprises: if the article to be scored meets the preset additional scoring standard, adding points to and/or deducting points from the static feature total score according to the bonus standard and/or the deduction standard, to obtain the article quality score of the article to be scored.
In yet another embodiment, the article quality scoring method further comprises, before determining the plurality of static features contained in the article to be scored, determining that the article to be scored is a non-low-quality article.
In yet another embodiment, the article quality scoring method further comprises, in response to determining that the article to be scored is a low-quality article, assigning the article to be scored the lowest article quality score.
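The three embodiments above (additional points, the low-quality pre-check, and the lowest score for low-quality articles) can be sketched together in Python. This is only a minimal sketch under stated assumptions: the function name, the parameter names, and the choice of 0.0 as the lowest score are hypothetical, and the disclosure does not say how the amounts added or deducted are computed.

```python
LOWEST_SCORE = 0.0  # assumption: the scale's minimum; the source gives no concrete value


def article_quality_score(static_total, is_low_quality, bonus=0.0, deduction=0.0):
    """Combine the static feature total score with the optional adjustments.

    A low-quality article skips static feature scoring and receives the lowest
    score directly; otherwise the bonus is added and the deduction subtracted.
    """
    if is_low_quality:
        return LOWEST_SCORE
    return static_total + bonus - deduction
```

With these assumptions, a non-low-quality article with a static feature total score of 0.7, a bonus of 0.1 and a deduction of 0.05 ends up at roughly 0.75, while any article flagged as low quality is fixed at `LOWEST_SCORE`.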
In yet another embodiment, the static features include one or more of a text length, paragraphs, a text picture number, a text picture sharpness, a low-quality feature, and an author. Determining the static feature score for each of the plurality of static features includes: determining a text length score based on the text length of the article; determining a paragraph score based on the number of paragraphs; determining a text picture number score based on the number of text pictures; determining a text picture sharpness score based on the text picture sharpness; determining a low-quality feature score based on the low-quality feature; and determining an author score based on the author's level.
According to a second aspect of embodiments of the present disclosure, an article quality scoring device is provided, which comprises a static feature determining module, a processing module and an article quality scoring module. The static feature determining module is configured to determine a plurality of static features contained in an article to be scored and to determine a static feature score and a static feature weight for each of the plurality of static features. The processing module is configured to multiply the static feature score of each static feature by its static feature weight and accumulate the products to obtain a static feature total score. The article quality scoring module is configured to determine the article quality score of the article to be scored based on the static feature total score.
In one embodiment, the article quality scoring device further comprises a feature threshold interval setting module configured to preset a plurality of feature threshold intervals based on the article type and a feature threshold of at least one reference static feature, wherein each feature threshold interval corresponds to a weight combination and each weight combination comprises weights of the plurality of static features. The static feature determining module is configured to determine, based on the type of the article to be scored, the value of the reference static feature in the article to be scored, and the feature threshold, the feature threshold interval into which the reference static feature of the article to be scored falls, and to determine the static feature weight of each of the plurality of static features based on the weight combination corresponding to that feature threshold interval.
In yet another embodiment, the feature threshold interval setting module is configured to set at least one feature threshold for each of a plurality of preset reference static features, with different feature thresholds set for the reference static features of different article types, and to combine any two of the feature thresholds corresponding to all of the reference static features to form the plurality of feature threshold intervals.
In still another embodiment, the article quality scoring device further comprises a judging module configured to determine whether the article to be scored meets a preset additional scoring standard, wherein the additional scoring standard comprises a bonus standard and/or a deduction standard. The article quality scoring module determines the article quality score of the article to be scored based on the static feature total score as follows: if the article to be scored meets the preset additional scoring standard, points are added to and/or deducted from the static feature total score according to the bonus standard and/or the deduction standard, to obtain the article quality score of the article to be scored.
In yet another embodiment, the article quality scoring device further includes a non-low-quality article determining module configured to determine that the article to be scored is a non-low-quality article.
In yet another embodiment, the article quality scoring device further comprises a low-quality article processing module configured to assign the article to be scored the lowest article quality score in response to determining that the article to be scored is a low-quality article.
In yet another embodiment, the static features comprise one or more of a text length, paragraphs, a text picture number, a text picture sharpness, a low-quality feature, and an author, and the static feature determining module determines the static feature score for each of the plurality of static features by determining a text length score based on the text length of the article, a paragraph score based on the number of paragraphs, a text picture number score based on the number of text pictures, a text picture sharpness score based on the text picture sharpness, a low-quality feature score based on the low-quality feature, and an author score based on the author's level.
The technical solution provided by the embodiments of the present disclosure may have the following beneficial effects. The article quality scoring method makes full use of the static features of the article to be scored: it determines a score and a weight for each static feature, obtains a static feature total score from them, and determines the article quality score of the article to be scored based on that total. The article quality score can thus be determined efficiently and accurately, high-quality articles can be recommended to users, and a foundation is laid for improving user satisfaction with recommended articles.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 shows a flow chart of an existing method for scoring the quality of an article to be scored.
FIG. 2 shows a flow chart of another existing method for scoring the quality of an article to be scored.
FIG. 3 is a flowchart illustrating an article quality scoring method, according to one example embodiment.
FIG. 4 shows a schematic diagram of an application scenario in which an article quality scoring method is applied.
FIG. 5 is a flowchart illustrating another article quality scoring method, according to one example embodiment.
FIG. 6 illustrates a flow chart for determining a static feature weight for each of a plurality of static features.
FIG. 7 shows a flow chart for presetting a plurality of feature threshold intervals based on article type and a feature threshold of at least one reference static feature.
FIG. 8 is a flowchart illustrating yet another article quality scoring method, according to one example embodiment.
FIG. 9 is a flowchart illustrating another article quality scoring method, according to one example embodiment.
FIG. 10 is a block diagram illustrating an article quality scoring apparatus according to one example embodiment.
FIG. 11 is a block diagram illustrating an apparatus for article quality scoring, according to an example embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
Currently, in the age of information explosion, a browser or a browser-like application, such as a news push application or a client such as the MIUI browser, adds a large number of articles every day. These newly added articles vary widely and may include low-quality articles such as advertising articles and clickbait ("title party") articles. If a large number of such articles are recommended to the user, the user's reading experience is greatly degraded, leading to user churn. The newly added articles also include good-quality articles; if relevant articles are pushed to users according to their characteristics and preferences, the user experience can be improved and users are more likely to be retained. Therefore, scoring article quality efficiently and accurately becomes particularly important.
FIG. 1 shows a flow chart of an existing method for scoring the quality of an article to be scored.
As shown in FIG. 1, in the related art, quality scoring of an article to be scored may be achieved using text scoring features of the article and a scoring model constructed in advance. However, this method uses only the text data of the article and does not consider other features, such as the pictures in the article body or the category of the article, so the accuracy of the resulting quality score is low.
In an example, the ideographic features, chapter structure features, and lexical-semantic features of some articles to be scored are good, but if the body of such an article contains no pictures, the article reads as dry and dull and its visual appeal is poor, which ultimately results in a poor user experience. Likewise, if the article to be scored contains advertising content, users may be put off. In practice, the overall quality of such articles is not high, yet under this scheme they receive high quality scores.
FIG. 2 shows a flow chart of another existing method for scoring the quality of an article to be scored.
As shown in FIG. 2, in the related art, browsing behavior information may also be obtained when a user browses a target article (which may be referred to as the article to be scored), a browsing behavior score of the user for the target article is obtained from the browsing behavior information and the corresponding browsing behavior coefficients, and the article quality score of the target article is finally obtained from the browsing behavior scores of a plurality of users. However, this method uses only users' browsing behavior data and does not make full use of the features of the article itself, which can reduce the accuracy of the quality score.
Further, under this scheme, suppose the target article is of good quality, but most of the users whose browsing behavior is collected happen not to be interested in it, while other users who would be interested are not sampled. The scheme would then one-sidedly score the target article as low quality, reducing the accuracy of the score.
Further, in an article quality scoring scheme based on user browsing information, evaluating the quality of a target article requires first recommending the article to a plurality of users and then collecting their browsing behavior on it. If the target article recommended in this process is a low-quality article, its quality score can still be calculated from the users' browsing behavior, but the low-quality article gives those users a bad experience, which may cause user churn.
Thus, the method of scoring article quality in the related art needs to be further optimized.
The embodiment of the disclosure provides an article quality scoring method, wherein scoring is performed based on a plurality of static features of articles, so that article quality scores of articles to be scored can be determined efficiently and accurately, high-quality articles are recommended to users, and satisfaction of the users in the process of acquiring recommended articles is improved.
In an embodiment of the present disclosure, the article quality scoring is performed before pushing to the user, so as to further ensure the quality of the pushed articles.
FIG. 3 is a flowchart illustrating a method of article quality scoring, according to one example embodiment.
In an exemplary embodiment of the present disclosure, as shown in fig. 3, the article scoring method may include steps S11 to S13. The steps will be described separately.
In step S11, a plurality of static features included in the article to be scored are determined, and a static feature score and a static feature weight of each of the plurality of static features are determined.
In an example of the present disclosure, a static feature may be understood as a non-dynamic feature of the article to be scored. For example, the static features of the article to be scored may include text length, paragraphs, text picture count, text picture sharpness, low-quality features, authors, and so forth. The present disclosure does not specifically limit the static features; if a new static feature for evaluating article quality emerges later, it can still be applied in the article quality scoring method of the present disclosure.
As can be seen from the above, the static features of the article to be scored cover essentially all aspects of the article itself, such as text, pictures, paragraphs, author, and low-quality features. Scoring the quality of the article based on multiple static features can therefore improve scoring accuracy.
According to the embodiments of the present disclosure, a static feature score and a static feature weight are determined for each of the plurality of static features of the article to be scored, and the quality of the article is scored based on these scores and weights, so that the quality of the article can be scored more accurately.
In step S12, the static feature score corresponding to each static feature in the plurality of static features is multiplied by the static feature weight and accumulated to obtain a total static feature score.
In the application process, the static feature score corresponding to each static feature in the plurality of static features can be multiplied by the static feature weight and then accumulated, so that the total static feature score of the article to be scored is obtained.
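The multiply-and-accumulate operation of step S12 can be sketched in Python as a weighted sum. The function name and the dictionary-based representation are assumptions made for illustration; the disclosure only specifies that each static feature score is multiplied by its weight and the products are accumulated.

```python
def static_feature_total(scores, weights):
    """Weighted sum of static feature scores (step S12).

    `scores` and `weights` map a static feature name to its score and weight.
    """
    missing = scores.keys() - weights.keys()
    if missing:
        raise KeyError(f"no weight for static features: {sorted(missing)}")
    return sum(scores[name] * weights[name] for name in scores)
```

For example, with hypothetical scores `{"body_len": 0.8, "img_num": 0.5}` and weights `{"body_len": 0.6, "img_num": 0.4}`, the total is 0.8 × 0.6 + 0.5 × 0.4 = 0.68.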
It should be noted that each static feature weight may be determined according to the article type of the article to be scored, the text of the article, and its pictures, or in other ways; this embodiment does not specifically limit how each static feature weight is determined.
In step S13, an article quality score for the article to be scored is determined based on the static feature total score.
In one example, the article quality score for the article to be scored may be determined based on a static feature total score determined for each static feature. In one possible implementation, the articles to be scored may be classified into high quality articles, medium quality articles, and low quality articles based on the article quality scores of the articles to be scored.
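The classification into high-, medium-, and low-quality articles can be sketched as a simple cut-off rule. The cut-off values 0.8 and 0.4 and the function name are purely hypothetical; the disclosure does not state where the boundaries between the three classes lie.

```python
def quality_tier(score, high_cutoff=0.8, low_cutoff=0.4):
    """Map an article quality score to a tier using hypothetical cut-offs."""
    if score >= high_cutoff:
        return "high"
    if score >= low_cutoff:
        return "medium"
    return "low"
```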
In a possible implementation, when a browser or browser-like application adds new articles to be pushed to users, the article quality scoring method provided by the embodiments of the present disclosure may be used to score the quality of the newly added articles, and the newly added articles may be recommended to users in a targeted manner based on the scoring results.
In one example, if some newly added articles have low quality scores and are rated as low-quality articles, these low-quality articles may be filtered out as needed when making article recommendations to the user. For example, if analysis of a user's behavior shows that the user strongly dislikes clickbait ("title party") or advertising articles, then low-quality articles of those kinds may be proactively filtered out when making recommendations to that user, so that the user does not receive them.
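The filtering step in this example can be sketched as follows. The record layout (`id`, `tier`, `kind`), the kind labels, and the function name are hypothetical; the disclosure does not specify how articles or user dislikes are represented.

```python
def filter_low_quality(candidates, disliked_kinds):
    """Drop low-quality articles whose kind the user is known to dislike."""
    return [
        article
        for article in candidates
        if not (article["tier"] == "low" and article["kind"] in disliked_kinds)
    ]


candidates = [
    {"id": "a1", "tier": "high", "kind": "news"},
    {"id": "a2", "tier": "low", "kind": "advertisement"},
    {"id": "a3", "tier": "low", "kind": "clickbait"},
    {"id": "a4", "tier": "medium", "kind": "food"},
]
kept = filter_low_quality(candidates, disliked_kinds={"advertisement", "clickbait"})
print([article["id"] for article in kept])
```

Here only the two low-quality articles of disliked kinds are removed; the high- and medium-quality candidates pass through unchanged.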
FIG. 4 shows a schematic diagram of an application scenario in which an article quality scoring method is applied. In FIG. 4, 1 is an article, 2 is the article quality scoring method of the present disclosure, 3 is a personalized recommendation system, and 4 is a user. When a new article is added, its quality score is determined by the article quality scoring method of the present disclosure, and the personalized recommendation system then recommends articles with different quality scores to users according to each user's characteristics. After an article is exposed to a user, the user can give feedback on it, and behavior data for the article is collected from that feedback so that articles can be recommended to the user more accurately.
In an example, as shown in FIG. 4, suppose the user 4 is very interested in food articles 1, and analysis of the user's previous behavior data shows that the user reads food articles 1 with lower article quality scores only briefly and takes no further action, but reads those with higher article quality scores for a long time and frequently comments on or bookmarks them. Then, for food articles 1 newly added by the browser or browser-like application, the article quality scoring method 2 provided by the embodiments of the present disclosure may score their quality, and the food articles with higher article quality scores may be recommended to the user 4 through the personalized recommendation system 3.
In yet another example, if a user often reads or comments on high-quality and medium-quality articles and shows no obvious preference for article type, then high-quality and medium-quality articles may be sampled across multiple categories to make recommendations to the user.
It should be noted that, after recommending an article to the user, the user may make feedback on the article. In one embodiment, the behavior data of the user can be obtained based on feedback made by the user, so as to lay a foundation for recommending articles which more meet the requirements of the user to the user.
According to the article quality scoring method, each static feature of the article to be scored is fully utilized, the total score of the static feature is obtained by determining the score and the static feature weight of each static feature, and the article quality score of the article to be scored is determined based on the total score of the static feature. According to the method and the device for scoring the article quality, the article quality score of the article to be scored can be determined efficiently and accurately, the article with high quality is recommended to the user, and the satisfaction degree of the user in the process of acquiring the recommended article is improved.
The present disclosure will be described with respect to the process of article quality scoring by the following examples.
In the embodiment of the disclosure, a static feature score determining process will be described first.
In an exemplary embodiment of the present disclosure, the static features include one or more of text length, paragraphs, text picture count, text picture sharpness, low quality features, and authors. Determining the static feature score for each of the plurality of static features may be performed in the following manner.
In one example, an article body length score may be determined from the article body length. Wherein, determining the article text length score may be accomplished by the following formula:
Wherein bodyLen represents the body length of the article to be scored; l1 and l2 represent the two thresholds for the body length of the article to be scored, which depend on the article type, where l1 is the smaller threshold (e.g., X_small = 600) and l2 is the larger threshold (e.g., X_big = 1000); and w1 and w2 are two parameters that can be set according to l1 and l2.
In an example, a paragraph score may be determined based on the number of paragraphs. Wherein, determining the paragraph score may be accomplished by the following formula:
Wherein paragraphNum denotes the paragraph number of the article.
In an example, a text picture number score may be determined from the text picture numbers. Wherein, determining the text picture number score may be accomplished by the following formula:
Wherein imgNum denotes the number of text pictures. n2 represents the larger of the two text-picture-number thresholds of the article to be scored, which differ by article type (e.g., P_big = 5 above).
In an example, a text picture sharpness score may be determined from the text picture sharpness. The definition score of the text picture can be determined by the following formula:
wherein imgClarity denotes the sharpness value of a picture, and n denotes the number of text pictures.
In one example, a low-quality feature score may be determined from the low-quality features. Wherein, determining the low-quality feature score may be accomplished by the following formula:
negFeatureScore=min(negFeatureScore1, negFeatureScore2, ...)
wherein the total low-quality feature score is the smallest of the individual low-quality feature scores.
In an example, the author score may be determined based on the author's rank. Wherein, determining the author score may be accomplished by the following formula:
authorScore=0.2*level
wherein level represents the author's level. In one example, authors may be divided into five levels: 1, 2, 3, 4, and 5. The author score is defined by the author level: the higher the author level, the higher the author score.
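The two closed-form scores above (the minimum over low-quality feature scores and authorScore=0.2*level) can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the disclosure; the function names and the input validation are assumptions added here.

```python
def neg_feature_score(scores):
    """Total low-quality feature score: the smallest individual score."""
    if not scores:
        # Assumption: an empty input is treated as an error.
        raise ValueError("at least one low-quality feature score is required")
    return min(scores)


def author_score(level):
    """Author score computed as 0.2 * level for author levels 1 through 5."""
    if level not in (1, 2, 3, 4, 5):
        # Assumption: levels outside the five-level example are rejected.
        raise ValueError("author level must be an integer from 1 to 5")
    return 0.2 * level
```

With five levels, the author score ranges from 0.2 (level 1) to 1.0 (level 5).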
Further, the static feature total score can be obtained by multiplying the static feature score of each of the 6 static features by its static feature weight and then accumulating the products. Wherein the static feature total score may be determined by the following formula:
Wherein i represents the ith static feature, factor represents the static feature score, and weight represents the static feature weight corresponding to the static feature. In the embodiment of the disclosure, the 6 static features are text length, paragraphs, text picture number, text picture sharpness, low-quality features, and author.
The following describes the process of determining static feature weights in the embodiment of the present disclosure.
In an embodiment of the disclosure, the weights of the static features may be preset, and the weights of the static features corresponding to each article type are formed into weight combinations, each weight combination including weights of a plurality of static features. When the static feature weights are determined, they may be taken from the preset weight combinations, and article quality scoring may then be performed.
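The accumulation described above is a weighted sum, i.e. the sum over i = 1..6 of factor_i * weight_i. A minimal Python sketch follows; the feature keys and the function name are illustrative assumptions, not identifiers from the disclosure.

```python
# Illustrative keys for the six static features (names are assumptions).
FEATURES = (
    "body_length",
    "paragraphs",
    "picture_count",
    "picture_sharpness",
    "low_quality",
    "author",
)


def static_feature_total(factors, weights):
    """Multiply each static feature score by its weight and accumulate."""
    return sum(factors[name] * weights[name] for name in FEATURES)
```

Both arguments are dictionaries keyed by feature name, so a missing feature raises a KeyError instead of being silently skipped.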
FIG. 5 is a flowchart illustrating another article quality scoring method, according to one example embodiment.
In an exemplary embodiment of the present disclosure, as shown in fig. 5, the article quality scoring method may include steps S21-S24. The steps will be described separately.
In step S21, a plurality of feature threshold intervals are preset based on the article type and at least one feature threshold of the reference static feature, and each feature threshold interval corresponds to a weight combination, wherein each weight combination includes weights of a plurality of static features.
In step S22, a plurality of static features included in the article to be scored are determined, and a static feature score and a static feature weight of each of the plurality of static features are determined.
In step S23, the static feature score corresponding to each static feature in the plurality of static features is multiplied by the static feature weight and accumulated to obtain a total static feature score.
In step S24, an article quality score for the article to be scored is determined based on the static feature total score.
Steps S22-S24 are the same as steps S11-S13 in the previous embodiments, respectively; for their explanation and beneficial effects, refer to the description of steps S11-S13 above, which is not repeated here. Step S21 will be described in detail below.
In the application process, a plurality of feature threshold intervals can be preset based on the article type and at least one feature threshold of the reference static feature. The article types may include, among others, food-class articles, political-class articles, cartoon-class articles, and the like. A reference static feature is understood to mean a feature that to some extent characterizes the article. In an example, the reference static feature may be the body length, the body picture number, etc. of the article. In the present disclosure, the reference static features are not specifically limited.
In one possible embodiment, each reference static feature may have at least one feature threshold. In an example, for a food article, when the reference static feature is body length, there may be two feature thresholds, e.g., a larger threshold (1000) and a smaller threshold (800). Further, three feature threshold intervals (L ≥ 1000, 800 < L < 1000, L ≤ 800) may be set for the body length based on the larger and smaller thresholds, where L represents the body length.
It should be noted that, each feature threshold interval corresponds to a weight combination, and each weight combination includes weights of a plurality of static features. In the application process, the static feature weight of the static feature in the article to be scored can be determined based on the weight of the corresponding static feature in the feature threshold interval.
Further, the static feature scores of the articles to be scored are multiplied by the static feature weights and accumulated to obtain static feature total scores, and the article quality scores of the articles to be scored are determined based on the static feature total scores.
As can be seen from the above description, in the embodiments of the present disclosure, the static feature weights of the static features in the article to be scored may be determined based on the weights of the corresponding static features in the feature threshold interval.
The present disclosure will explain, by the following embodiments, a process of determining a static feature weight of each of a plurality of static features in an article to be scored based on a weight of a static feature corresponding to a feature threshold interval.
FIG. 6 illustrates a flow chart for determining a static feature weight for each of a plurality of static features.
In an exemplary embodiment of the present disclosure, determining the static feature weight of each of the plurality of static features may include step S31 and step S32. The steps will be described separately.
In step S31, a feature threshold interval corresponding to the reference static feature in the article to be scored is determined based on the type of the article to be scored, the value of the reference static feature in the article to be scored, and the feature threshold.
In the application process, because article types differ, the feature thresholds of the same reference static feature differ accordingly. In an example, for a food article, when the reference static feature is body length, there may be two feature thresholds, e.g., a larger threshold (1000) and a smaller threshold (800). For a political-class article, when the reference static feature is body length, there may be two corresponding feature thresholds, for example, a larger threshold (2000) and a smaller threshold (1000).
In step S32, a static feature weight of each of the plurality of static features is determined based on the weight combination corresponding to the feature threshold interval.
It should be noted that, in addition to corresponding to a weight combination, a feature threshold interval also represents a value range of the reference static feature. By judging whether the value of the static feature corresponding to the reference static feature in the article to be scored falls within the feature threshold interval, it can be determined whether the static feature weights of the article to be scored are to be taken from the weight combination of that feature threshold interval.
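Steps S31 and S32 can be sketched in Python for a single reference static feature (body length). The thresholds for food and political articles are the example values given above; the interval labels, the weight values, and the choice to key weight combinations by interval only are illustrative assumptions.

```python
# (smaller threshold, larger threshold) of body length per article type,
# taken from the examples in the text.
BODY_LENGTH_THRESHOLDS = {
    "food": (800, 1000),
    "political": (1000, 2000),
}

# Hypothetical weight combinations, one per feature threshold interval.
WEIGHT_COMBINATIONS = {
    "short": {"body_length": 0.4, "picture_count": 0.2, "author": 0.4},
    "medium": {"body_length": 0.3, "picture_count": 0.3, "author": 0.4},
    "long": {"body_length": 0.2, "picture_count": 0.4, "author": 0.4},
}


def body_length_interval(article_type, body_len):
    """Step S31: find the feature threshold interval for the body length."""
    small, big = BODY_LENGTH_THRESHOLDS[article_type]
    if body_len <= small:
        return "short"
    if body_len >= big:
        return "long"
    return "medium"


def static_feature_weights(article_type, body_len):
    """Step S32: take the weights from the interval's weight combination."""
    return WEIGHT_COMBINATIONS[body_length_interval(article_type, body_len)]
```

The same body length of 1200 falls in the largest interval for a food article but in the middle interval for a political article, which is why the thresholds are looked up by article type first.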
The present disclosure will explain a process of determining a plurality of characteristic threshold intervals by the following embodiments.
Fig. 7 shows a flow chart for presetting a plurality of feature threshold intervals based on the article type and at least one feature threshold of the reference static feature.
In an exemplary embodiment of the present disclosure, presetting a plurality of feature threshold intervals based on the article type and at least one feature threshold of the reference static feature may include step S41 and step S42.
In step S41, at least one feature threshold is set for each of a plurality of preset reference static features, and different feature thresholds are set for reference static features of different article types.
In step S42, any two feature thresholds among the feature thresholds corresponding to all the reference static features in the plurality of reference static features are combined to form a plurality of feature threshold sections.
In an example, two reference static features (e.g., body length and body picture number) may be determined. Continuing with the example of a food article, the body length may have two thresholds, and the body picture number may also have two thresholds. For ease of description, let the larger threshold of the body length be X_big (1000), the smaller threshold of the body length be X_small (800), the larger threshold of the body picture number be Y_big (5), and the smaller threshold of the body picture number be Y_small (0). Further, three feature threshold intervals (L ≥ 1000, 800 < L < 1000, L ≤ 800) may be set for the body length based on X_big and X_small, where L represents the body length, and three feature threshold intervals (P ≥ 5, 0 < P < 5, P ≤ 0) may be set for the body picture number based on Y_big and Y_small, where P represents the body picture number.
Further, the two reference static features, body length and body picture number, can be combined together with their feature threshold intervals, so that 9 (3*3) different feature threshold interval conditions are formed, arranged like a nine-square grid. Reference may be made to Table 1.
Table 1 Nine-square grid of feature threshold intervals
L ≤ X_small, P ≥ Y_big | X_small < L < X_big, P ≥ Y_big | L ≥ X_big, P ≥ Y_big
L ≤ X_small, Y_small < P < Y_big | X_small < L < X_big, Y_small < P < Y_big | L ≥ X_big, Y_small < P < Y_big
L ≤ X_small, P ≤ Y_small | X_small < L < X_big, P ≤ Y_small | L ≥ X_big, P ≤ Y_small
Note that each article type may correspond to one nine-square grid of the form shown in Table 1. For the static features of the article to be scored, different cells (feature threshold intervals) have different weight combinations, and articles to be scored of different article types have the same weight combination in the same cell (feature threshold interval).
In the application process, for an article to be scored, the cell (L ≤ X_small, P ≤ Y_small) in the corresponding nine-square grid can be found according to the article type of the article to be scored and the values of the static features corresponding to the reference static features (for example, a body length of 600 and a body picture number of 0), and the static feature weights are determined from the weight combination in that cell.
In an example, for a food article, the two thresholds for body length may be 800 and 1000, and the two thresholds for body picture number may be 0 and 5. When the body length of the article to be scored is 600 and the body picture number is 0, the feature threshold interval is determined to be (L ≤ X_small, P ≤ Y_small). When the body length is 900 and the body picture number is 3, the feature threshold interval is determined to be (X_small < L < X_big, Y_small < P < Y_big). When the body length is 950 and the body picture number is 5, the feature threshold interval is determined to be (X_small < L < X_big, P ≥ Y_big). When the body length is 1200 and the body picture number is 6, the feature threshold interval is determined to be (L ≥ X_big, P ≥ Y_big).
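The four worked cases above can be reproduced with a small Python sketch of the nine-square-grid lookup. The default thresholds are the food-article values from the text; the function names and the "low"/"mid"/"high" labels are illustrative assumptions.

```python
def band(value, small, big):
    """Place a value in one of three intervals: <= small, between, >= big."""
    if value <= small:
        return "low"
    if value >= big:
        return "high"
    return "mid"


def grid_cell(body_len, img_num, x_small=800, x_big=1000, y_small=0, y_big=5):
    """Return the (body length band, picture number band) cell of the grid."""
    return band(body_len, x_small, x_big), band(img_num, y_small, y_big)


# The four food-article cases described in the text.
for body_len, img_num in [(600, 0), (900, 3), (950, 5), (1200, 6)]:
    print(body_len, img_num, grid_cell(body_len, img_num))
```

The returned pair can serve directly as a dictionary key for the weight combination stored in that cell.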
Further, the static feature score of the article to be scored may be multiplied by the static feature weight and accumulated to obtain a static feature total score, and the article quality score of the article to be scored may be determined based on the static feature total score.
In the application process, if the articles to be scored meet some additional scoring criteria, the final article quality score of the articles to be scored is affected.
The following embodiments of the present disclosure will describe the process of article quality scoring when the article to be scored meets some additional scoring criteria.
FIG. 8 is a flowchart illustrating yet another article quality scoring method, according to one example embodiment.
In an exemplary embodiment of the present disclosure, as shown in fig. 8, the article quality scoring method includes steps S51-S54. The steps will be described separately.
In step S51, a plurality of static features included in the article to be scored are determined, and a static feature score and a static feature weight of each of the plurality of static features are determined.
In step S52, the static feature scores corresponding to the static features in the plurality of static features are multiplied by the static feature weights and accumulated to obtain a total static feature score.
In step S53, it is determined whether the article to be scored satisfies a preset additional scoring criterion. Wherein the additional scoring criteria include a point-adding criterion and/or a point-deduction criterion.
In step S54, if the article to be scored meets the preset additional scoring criteria, points are added to and/or deducted from the static feature total score according to the point-adding criterion and/or the point-deduction criterion, so as to obtain the article quality score of the article to be scored.
Steps S51-S52 are the same as steps S11-S12 in the previous embodiments, respectively; for their explanation and beneficial effects, refer to the description of steps S11-S12 above, which is not repeated here. Step S53 and step S54 will be described in detail below.
In the application process, whether the article to be scored meets the preset additional scoring criteria can be judged. Wherein the additional scoring criteria include a point-adding criterion and/or a point-deduction criterion. In one example, when the trailing punctuation mark of the article to be scored is detected to be ":", the content of the article has not ended and is incomplete, so the preset point-deduction criterion is met. In this case, points can be deducted from the static feature total score to obtain the article quality score of the article to be scored. In a further example, when the paragraphs in the body of the article to be scored have subheadings, the preset point-adding criterion is met. In this case, points can be added to the static feature total score to obtain the article quality score of the article to be scored.
It should be noted that, the preset additional scoring criteria may be adjusted according to actual situations, and in the present disclosure, the preset additional scoring criteria are not specifically limited.
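The two example criteria (a trailing colon and paragraph subheadings) can be sketched in Python as follows. The disclosure does not state how many points are added or deducted, so the bonus and penalty values, the function name, and the handling of the full-width colon are all assumptions made for illustration.

```python
def apply_additional_criteria(total, body, has_subheadings,
                              bonus=0.05, penalty=0.1):
    """Adjust a static feature total score using two example criteria.

    bonus and penalty are hypothetical amounts, not values from the text.
    """
    score = total
    # Deduction: a trailing colon suggests the content is incomplete.
    # Checking the full-width colon as well is an assumption.
    if body.rstrip().endswith((":", "\uff1a")):
        score -= penalty
    # Addition: paragraphs in the body have subheadings.
    if has_subheadings:
        score += bonus
    return score
```

Because the criteria are adjustable according to actual situations, a real system would likely keep them in a configurable list rather than hard-coding them as done here.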
Because the method is based on the title and the body of the article to be scored, it can be simply judged whether the article is a low-quality article (clickbait, advertisement, bid, feudal superstition, vulgar content, repeated body content, and the like). Thus, when an article to be scored is obtained, it may first be determined whether the article is a low-quality article.
The present disclosure will be described with reference to the following examples.
In an exemplary embodiment of the present disclosure, before determining the plurality of static features included in the article to be scored, it may first be determined that the article to be scored is a non-low-quality article. If it is determined that the article to be scored is a low-quality article, the article may be directly defined as a low-quality article without performing steps such as determining the plurality of static features included in the article to be scored. This embodiment reduces the computation needed to determine the article quality score of the article to be scored.
In an exemplary embodiment of the present disclosure, in response to determining that the article to be scored is a low-quality article, the article quality score of the article to be scored is determined to be the lowest score. In one example, the lowest score may be 0.
The article quality scoring method according to the above embodiment is described below in connection with practical applications.
FIG. 9 is a flowchart illustrating another article quality scoring method, according to one example embodiment.
As shown in fig. 9, in an example, information such as the title, body pictures, article category, and author of the article to be scored may be acquired. Based on this information, it is judged whether the article to be scored is a low-quality article or has repeated content; if so, the article quality score of the article is directly set to 0 without performing the following steps.
If it is judged that the article to be scored is not a low-quality article and has no repeated content, the static feature scores of the article to be scored (for text length, paragraphs, text picture number, text picture sharpness, low-quality features, and author) are further calculated. The nine-square grid of Table 1 is then looked up according to the article type, the text length, and the text picture number, and the static feature weights corresponding to the static features of the article to be scored are determined from the weight combination of the matching cell (feature threshold interval). Each static feature score is multiplied by its static feature weight, and the products are accumulated to obtain the static feature total score of the article to be scored.
Further, it is determined whether the article to be scored meets a preset additional scoring criterion. Wherein the additional scoring criteria include a point-adding criterion and/or a point-deduction criterion. If the article to be scored meets the preset additional scoring criteria, points are added to and/or deducted from the static feature total score according to the point-adding criterion and/or the point-deduction criterion to obtain the final article quality score of the article to be scored.
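The overall flow of fig. 9 can be sketched end to end in Python. This is a simplified outline under stated assumptions: the low-quality and repeated-content checks, the static feature scores, the weights, and the net adjustment from the additional criteria are passed in as precomputed inputs, and the function and parameter names are hypothetical.

```python
def article_quality_score(is_low_quality, has_repeated_content,
                          factors, weights, adjustment=0.0):
    """Outline of the fig. 9 flow with precomputed inputs.

    adjustment is the net effect of the additional scoring criteria
    (positive for added points, negative for deducted points).
    """
    # Low-quality or repeated-content articles get the lowest score
    # and skip all remaining steps.
    if is_low_quality or has_repeated_content:
        return 0.0
    # Weighted sum of the static feature scores.
    total = sum(factors[name] * weights[name] for name in factors)
    # Apply the additional scoring criteria to the total.
    return total + adjustment
```

Returning early for low-quality articles mirrors the stated goal of reducing the computation needed for articles that will receive the lowest score anyway.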
As can be seen from the above description, the article quality scoring method provided by the present disclosure makes full use of each static feature of the article to be scored, obtains a static feature total score by determining each static feature score and static feature weight, and determines the article quality score of the article to be scored based on the static feature total score. With the method and the device of the present disclosure, the article quality score of the article to be scored can be determined efficiently and accurately, high-quality articles are recommended to the user, and the user's satisfaction when obtaining recommended articles is improved.
The article quality scoring method provided by the embodiments of the present disclosure addresses the low accuracy of the text-only quality scoring schemes in the related art. It makes full use of each static feature of the article: the static feature scores are obtained from information such as the title, body, body pictures, and author of the article, and different weights are then assigned to the static features according to the body length and body picture number characteristics of articles in different categories. In this way, whether an article is a low-quality article or a high-quality article can be accurately identified. An article is defined as a low-quality article if, for example, it is clickbait, an advertisement, a bid, feudal superstition, or vulgar content, if its content is repeated, if it is short and has no pictures, if its content is incomplete, or if it is padded with filler. An article is defined as a high-quality article if its content has depth, high specificity, rich text, a well-defined hierarchy, a complete structure, and the like. Other articles may be defined as medium-quality articles. This technical scheme greatly improves the accuracy of article quality scoring.
The article quality scoring method provided by the embodiments of the present disclosure also addresses the defect in the related art that determining article quality scores from user browsing behavior easily causes user loss. The quality of articles is evaluated before they are recommended to users, which avoids recommending low-quality articles, and different articles can be recommended in a targeted manner according to different user characteristics, giving the user a better experience, satisfying the user's interests and hobbies, and retaining the user.
Based on the same conception, the embodiment of the disclosure also provides an article quality scoring device.
It can be appreciated that, in order to implement the above-mentioned functions, the article quality scoring device provided in the embodiments of the present disclosure includes a hardware structure and/or a software module that perform each function. The disclosed embodiments may be implemented in hardware or a combination of hardware and computer software, in combination with the various example elements and algorithm steps disclosed in the embodiments of the disclosure. Whether a function is implemented as hardware or computer software driven hardware depends upon the particular application and design constraints imposed on the solution. Those skilled in the art may implement the described functionality using different approaches for each particular application, but such implementation is not to be considered as beyond the scope of the embodiments of the present disclosure.
FIG. 10 is a block diagram of an article quality scoring apparatus, according to one example embodiment. Referring to fig. 10, the article quality scoring apparatus includes a determine static feature module 110, a processing module 120, and an article quality scoring module 130. The respective modules will be described below.
The determine static feature module 110 may be configured to determine a plurality of static features included in the article to be scored and determine a static feature score and a static feature weight for each of the plurality of static features.
The processing module 120 may be configured to multiply the static feature score corresponding to each of the plurality of static features with the static feature weight and accumulate the multiplied static feature score to obtain a total static feature score.
The article quality scoring module 130 may be configured to determine an article quality score for an article to be scored based on the static feature total score.
In an exemplary embodiment of the present disclosure, the article quality scoring apparatus further includes a set feature threshold interval module.
The set feature threshold interval module may be configured to preset a plurality of feature threshold intervals based on the article type and at least one feature threshold of the reference static feature, wherein each feature threshold interval corresponds to a weight combination, and each weight combination comprises weights of the plurality of static features.
The determine static feature module 110 may be configured to determine a feature threshold interval corresponding to a static feature of the article to be scored corresponding to the reference static feature based on the type of the article to be scored, the static feature of the article to be scored corresponding to the reference static feature, and a feature threshold, and determine a static feature weight for each of the plurality of static features based on a weight combination corresponding to the feature threshold interval.
In an exemplary embodiment of the disclosure, the set feature threshold interval module may be configured to set at least one feature threshold for each of a plurality of preset reference static features, and set different feature thresholds for reference static features of different article types, and combine any two feature thresholds of feature thresholds corresponding to all of the plurality of reference static features to form a plurality of feature threshold intervals.
In an exemplary embodiment of the present disclosure, the article quality scoring apparatus further includes a determining module.
The determination module may be configured to determine whether the article to be scored meets preset additional scoring criteria, the additional scoring criteria including a point-adding criterion and/or a point-deduction criterion.
The article quality scoring module 130 may determine the article quality score of the article to be scored based on the static feature total score as follows: if the article to be scored meets the preset additional scoring criteria, points are added to and/or deducted from the static feature total score according to the point-adding criterion and/or the point-deduction criterion, thereby obtaining the article quality score of the article to be scored.
In an exemplary embodiment of the present disclosure, the article quality scoring apparatus further includes a determine non-low quality article module.
The determine non-low quality articles module may be configured to determine articles to be scored as non-low quality articles.
In an exemplary embodiment of the present disclosure, the article quality scoring apparatus further includes a process low quality article module.
The process low quality article module may be configured to determine that the article quality score of the article to be scored is the lowest score in response to determining that the article to be scored is a low-quality article.
In an exemplary embodiment of the present disclosure, the static feature comprises one or more of a text length, a paragraph, a text picture number, a text picture definition, a low-quality feature, and an author, and the determining static feature module 110 may determine a static feature score for each of the plurality of static features by determining a text length score for the text based on the text length, determining a paragraph score based on the number of paragraphs, determining a text picture number score based on the text picture number, determining a text picture definition score based on the text picture definition, determining a low-quality feature score based on the low-quality feature, and determining an author score based on the author's rank.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments has been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
FIG. 11 is a block diagram illustrating an apparatus 200 for article quality scoring, according to an example embodiment. For example, apparatus 200 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to FIG. 11, the apparatus 200 may include one or more of a processing component 202, a memory 204, a power component 206, a multimedia component 208, an audio component 210, an input/output (I/O) interface 212, a sensor component 214, and a communication component 216.
The processing component 202 generally controls overall operation of the apparatus 200, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 202 may include one or more processors 220 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 202 can include one or more modules that facilitate interactions between the processing component 202 and other components. For example, the processing component 202 may include a multimedia module to facilitate interaction between the multimedia component 208 and the processing component 202.
The memory 204 is configured to store various types of data to support operations at the apparatus 200. Examples of such data include instructions for any application or method operating on the device 200, contact data, phonebook data, messages, pictures, videos, and the like. The memory 204 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power component 206 provides power to the various components of the device 200. The power components 206 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 200.
The multimedia component 208 includes a screen that provides an output interface between the device 200 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 208 includes a front-facing camera and/or a rear-facing camera. The front camera and/or the rear camera may receive external multimedia data when the apparatus 200 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 210 is configured to output and/or input audio signals. For example, the audio component 210 includes a Microphone (MIC) configured to receive external audio signals when the device 200 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 204 or transmitted via the communication component 216. In some embodiments, audio component 210 further includes a speaker for outputting audio signals.
The I/O interface 212 provides an interface between the processing assembly 202 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, a home button, a volume button, an activate button, and a lock button.
The sensor assembly 214 includes one or more sensors for providing status assessments of various aspects of the apparatus 200. For example, the sensor assembly 214 may detect the on/off state of the device 200 and the relative positioning of components, such as the display and keypad of the device 200. The sensor assembly 214 may also detect a change in position of the device 200 or a component of the device 200, the presence or absence of user contact with the device 200, the orientation or acceleration/deceleration of the device 200, and a change in temperature of the device 200. The sensor assembly 214 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor assembly 214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 214 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 216 is configured to facilitate communication between the apparatus 200 and other devices in a wired or wireless manner. The device 200 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 216 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 216 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 200 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for performing the article quality scoring method described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as a memory 204 including instructions executable by the processor 220 of the apparatus 200 to perform the article quality scoring method described above. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
It is understood that the term "plurality" in this disclosure means two or more, and other quantifiers are to be understood similarly. "And/or" describes a relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate three cases: A alone, both A and B, and B alone. The character "/" generally indicates that the objects before and after it are in an "or" relationship. The singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It is further understood that the terms "first," "second," and the like are used to describe various information, but such information should not be limited to these terms. These terms are only used to distinguish one type of information from another and do not denote a particular order or importance. Indeed, the expressions "first", "second", etc. may be used entirely interchangeably. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure.
It will be further understood that the terms "center," "longitudinal," "transverse," "front," "rear," "upper," "lower," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, as used herein, refer to an orientation or positional relationship based on that shown in the drawings, merely for convenience in describing the present embodiments and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed and operate in a particular orientation.
In the drawings, the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The described embodiments are some, but not all, embodiments of the present disclosure. The embodiments described above by referring to the drawings are exemplary and intended to be used for explaining the present disclosure and are not to be construed as limiting the present disclosure. Based on the embodiments in this disclosure, all other embodiments that a person of ordinary skill in the art would obtain without making any inventive effort are within the scope of protection of this disclosure. The embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings.
It will be further understood that, unless specifically stated otherwise, "connected" includes both a direct connection, in which no other element is present between the connected elements, and an indirect connection, in which another element is present between them.
It will be further understood that although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (14)

1. An article quality scoring method, comprising:
determining a plurality of static features contained in an article to be scored, and determining a static feature score and a static feature weight of each static feature in the plurality of static features;
multiplying the static feature score of each static feature in the plurality of static features by the corresponding static feature weight, and accumulating the products to obtain a static feature total score;
determining an article quality score for the article to be scored based on the static feature total score;
the method further comprising:
presetting a plurality of feature threshold intervals based on article types and at least one feature threshold of reference static features, wherein each feature threshold interval corresponds to a weight combination, and each weight combination comprises weights of the plurality of static features;
wherein the determining the static feature weight of each of the plurality of static features comprises:
determining, based on the type of the article to be scored, the static feature of the article to be scored that corresponds to the reference static feature, and the feature threshold, the feature threshold interval corresponding to that static feature;
determining the static feature weight of each of the plurality of static features based on the weight combination corresponding to the feature threshold interval;
wherein the static features comprise text length and paragraphs, a text length score of the article to be scored is determined according to the text length of the article to be scored, and a paragraph score is determined according to the number of paragraphs;
the text length score is determined in the following manner:
[formula image omitted],
wherein bodyLen represents the text length of the article to be scored, l1 and l2 respectively represent two thresholds of the text length of the article to be scored corresponding to different article types, w1 and w2 represent two parameters, and w1 and w2 are set based on l1 and l2;
the paragraph score is determined as follows:
[formula image omitted],
wherein paragraphNum represents the number of paragraphs of the article to be scored.
2. The article quality scoring method of claim 1, wherein presetting a plurality of feature threshold intervals based on article type and feature threshold of at least one reference static feature comprises:
setting at least one feature threshold for each of a plurality of preset reference static features, and setting different feature thresholds for the reference static features of different article types;
combining any two feature thresholds among the feature thresholds corresponding to all of the plurality of reference static features to form the plurality of feature threshold intervals.
3. The article quality scoring method of claim 1, further comprising:
determining whether the article to be scored meets preset additional scoring criteria, wherein the additional scoring criteria comprise bonus criteria and/or deduction criteria;
wherein the determining the article quality score of the article to be scored based on the static feature total score comprises:
if the article to be scored meets the preset additional scoring criteria, adding points to and/or deducting points from the static feature total score according to the bonus criteria and/or the deduction criteria to obtain the article quality score of the article to be scored.
4. The article quality scoring method of claim 1, wherein prior to determining the plurality of static features contained in the article to be scored, the article quality scoring method further comprises:
determining that the article to be scored is a non-low-quality article.
5. The article quality scoring method of claim 4, further comprising:
in response to determining that the article to be scored is a low-quality article, determining that the article quality score of the article to be scored is the lowest score.
6. The article quality scoring method of claim 1, wherein the static features further comprise a plurality of the following: number of text pictures, text picture sharpness, low-quality features, and author, and the determining a static feature score for each of the plurality of static features comprises:
determining a text picture number score according to the number of text pictures;
determining a text picture sharpness score according to the text picture sharpness;
determining a low-quality feature score according to the low-quality features; and
determining an author score according to the rank of the author.
7. An article quality scoring apparatus, comprising:
a static feature determining module, used for determining a plurality of static features contained in an article to be scored, and determining a static feature score and a static feature weight of each static feature in the plurality of static features;
a processing module, used for multiplying the static feature score of each static feature in the plurality of static features by the corresponding static feature weight, and accumulating the products to obtain a static feature total score;
an article quality scoring module, used for determining an article quality score of the article to be scored based on the static feature total score;
the apparatus further comprising:
a feature threshold interval setting module, used for presetting a plurality of feature threshold intervals based on article types and feature thresholds of at least one reference static feature, wherein each feature threshold interval corresponds to a weight combination, and each weight combination comprises weights of the plurality of static features;
wherein the static feature determining module is used for determining, based on the type of the article to be scored, the static feature of the article to be scored that corresponds to the reference static feature, and the feature threshold, the feature threshold interval corresponding to that static feature;
wherein the static features comprise text length and paragraphs, a text length score of the article to be scored is determined according to the text length of the article to be scored, and a paragraph score is determined according to the number of paragraphs;
the text length score is determined in the following manner:
[formula image omitted],
wherein bodyLen represents the text length of the article to be scored, l1 and l2 respectively represent two thresholds of the text length of the article to be scored corresponding to different article types, w1 and w2 represent two parameters, and w1 and w2 are set based on l1 and l2;
the paragraph score is determined as follows:
[formula image omitted],
wherein paragraphNum represents the number of paragraphs of the article to be scored.
8. The article quality scoring apparatus of claim 7, wherein the feature threshold interval setting module is configured to:
set at least one feature threshold for each of a plurality of preset reference static features, and set different feature thresholds for the reference static features of different article types;
combine any two feature thresholds among the feature thresholds corresponding to all of the plurality of reference static features to form the plurality of feature threshold intervals.
9. The article quality scoring apparatus of claim 7, further comprising:
a judging module, used for determining whether the article to be scored meets preset additional scoring criteria, wherein the additional scoring criteria comprise bonus criteria and/or deduction criteria;
wherein the article quality scoring module determines the article quality score of the article to be scored based on the static feature total score in the following manner:
if the article to be scored meets the preset additional scoring criteria, adding points to and/or deducting points from the static feature total score according to the bonus criteria and/or the deduction criteria to obtain the article quality score of the article to be scored.
10. The article quality scoring apparatus of claim 7, further comprising:
a non-low-quality article determining module, used for determining that the article to be scored is a non-low-quality article.
11. The article quality scoring apparatus of claim 10, further comprising:
a low-quality article processing module, used for determining, in response to determining that the article to be scored is a low-quality article, that the article quality score of the article to be scored is the lowest score.
12. The article quality scoring apparatus of claim 7, wherein the static features further comprise a plurality of the following: number of text pictures, text picture sharpness, low-quality features, and author, and the static feature determining module determines a static feature score for each of the plurality of static features by:
determining a text picture number score according to the number of text pictures;
determining a text picture sharpness score according to the text picture sharpness;
determining a low-quality feature score according to the low-quality features; and
determining an author score according to the rank of the author.
13. An image processing apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the article quality scoring method of any one of claims 1 to 6.
14. A non-transitory computer readable storage medium storing instructions which, when executed by a processor of a mobile terminal, cause the mobile terminal to perform the article quality scoring method of any one of claims 1 to 6.
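
As an illustrative, non-limiting sketch of the method of claims 1 and 3 to 5, the following Python code looks up a weight combination from the feature threshold interval into which the text length falls, forms the weighted sum of the static feature scores, and then applies optional bonus or deduction points. The article types, threshold values, weights, feature names, lowest score, and function names are all hypothetical and do not appear in the disclosure. The per-feature scores are taken as inputs, since the scoring formulas are not reproduced here. For simplicity, the sketch assumes a single reference static feature (text length) and intervals bounded by adjacent thresholds.

```python
from bisect import bisect_right

# Hypothetical thresholds of the reference static feature (text length),
# set differently for each article type.
LENGTH_THRESHOLDS = {
    "news": [300, 1500],
    "essay": [800, 3000],
}

# Hypothetical weight combinations: one per threshold interval, so each list
# has len(thresholds) + 1 entries. Each combination holds one weight per
# static feature.
WEIGHT_COMBINATIONS = {
    "news": [
        {"text_length": 0.5, "paragraphs": 0.2, "pictures": 0.3},
        {"text_length": 0.4, "paragraphs": 0.3, "pictures": 0.3},
        {"text_length": 0.2, "paragraphs": 0.4, "pictures": 0.4},
    ],
    "essay": [
        {"text_length": 0.6, "paragraphs": 0.3, "pictures": 0.1},
        {"text_length": 0.5, "paragraphs": 0.4, "pictures": 0.1},
        {"text_length": 0.3, "paragraphs": 0.5, "pictures": 0.2},
    ],
}

LOWEST_SCORE = 0.0  # assumed lowest score, given to low-quality articles


def select_weights(article_type, text_length):
    """Return the weight combination of the interval containing text_length."""
    index = bisect_right(LENGTH_THRESHOLDS[article_type], text_length)
    return WEIGHT_COMBINATIONS[article_type][index]


def static_feature_total(scores, weights):
    """Multiply each static feature score by its weight and accumulate."""
    return sum(scores[name] * weights[name] for name in weights)


def article_quality_score(article_type, text_length, scores,
                          is_low_quality=False, adjustments=()):
    """Score one article; adjustments are signed bonus or deduction points."""
    if is_low_quality:
        # Low-quality articles skip feature scoring and get the lowest score.
        return LOWEST_SCORE
    weights = select_weights(article_type, text_length)
    return static_feature_total(scores, weights) + sum(adjustments)


if __name__ == "__main__":
    example_scores = {"text_length": 80, "paragraphs": 60, "pictures": 100}
    print(article_quality_score("news", 1000, example_scores,
                                adjustments=(5, -3)))
```

Keeping the weight combinations in a table keyed by article type and interval index means that retuning the weights for one article type or one length range does not require changing the scoring logic.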
CN202010873079.4A 2020-08-26 2020-08-26 Article quality scoring method, article quality scoring device and storage medium Active CN112069802B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010873079.4A CN112069802B (en) 2020-08-26 2020-08-26 Article quality scoring method, article quality scoring device and storage medium

Publications (2)

Publication Number Publication Date
CN112069802A CN112069802A (en) 2020-12-11
CN112069802B true CN112069802B (en) 2024-12-03

Family

ID=73658949

Country Status (1)

Country Link
CN (1) CN112069802B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant