
CN101887460A - A Document Quality Evaluation Method and Its Application - Google Patents

Publication number
CN101887460A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010102263535A
Other languages
Chinese (zh)
Inventor
张铭
封盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Application filed by Peking University
Priority to CN2010102263535A
Publication of CN101887460A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a document quality evaluation algorithm for use in a document sharing platform. The algorithm comprises the following steps: constructing an academic network graph from the relationships between documents, between documents and journals/conferences, and between documents and authors; quantifying these relationships as transitions between vertices of the graph and modeling them as a transition probability matrix; building a model of users' bookmarking behavior toward documents and computing a document quality value based on user analysis; and running an iterative random walk with restart on the graph to obtain document quality, journal/conference quality, and author academic reputation. The invention combines user behavior information with document quality evaluation for the first time and, along with the document quality results, also yields results for author academic reputation and journal/conference academic quality. Its ranking performance is markedly better than that of other methods.

Description

A Document Quality Evaluation Method and Its Application

Technical Field

The invention relates to a method for evaluating document quality, in particular a method for evaluating document quality on a document sharing platform, and belongs to the technical field of knowledge mining.

Background

In recent years, with the rapid development of scientific research, the publication rate of scientific literature has increased year by year, and its volume is now enormous. For example, CiteSeerX, a digital library covering only computer and information science, holds more than 1.5 million scientific documents. Researchers need to read and consult a large amount of scientific literature in the course of their work, and high-quality and low-quality documents differ greatly in their value to researchers. Picking out the valuable documents from this vast collection of uneven quality has become very difficult. The question of how to evaluate the quality of scientific literature effectively and automatically has therefore attracted a growing number of researchers.

On social literature sharing websites for academic research, users can bookmark the scientific documents they consider valuable, tag them, comment on them, and share them with other users. Users' bookmarking behavior should be an important reference when analyzing the quality of scientific literature, yet very little research so far has used user behavior for this purpose. How to apply user behavior effectively in a scientific literature quality evaluation system in the Web 2.0 environment therefore deserves further study.

For evaluating the quality of academic papers, the existing methods in academia are mainly peer review, citation analysis, and methods based on link analysis. Peer review is usually used for early-stage evaluation, such as reviewing papers submitted to a conference or journal; citation-based evaluation is used for later-stage evaluation, such as assessing the academic level of a researcher's published papers.

In peer review, senior experts and scholars in the same research field give a comprehensive evaluation covering the significance and novelty of the chosen topic, the research method, the quality of the completed research, and the quality of the writing. Its advantage is that expert evaluation of research quality is detailed and accurate: with deep expertise in the field, experts can judge the level of a piece of research. Its disadvantages are that the current evaluation system is still imperfect, that lax self-discipline among "peers" easily leads to abuses, and that peer review of a large number of papers is too time-consuming and laborious to be practical.

Citation analysis evaluates the quality of papers from the citing and cited relationships among them, using some specific method and evaluation standard. Researchers in citation analysis have proposed a series of quantitative quality indicators, such as citation count and impact factor. Compared with peer review, citation analysis is simpler and easy to automate. At the same time, its results are coarser, and because it relies on citation relationships, newly published documents, which have few citations, tend to be rated too low. This is a significant limitation.

In 1998, Brin and Page proposed the PageRank algorithm, which ranks web pages by importance based on the link relationships between them, and founded the Google search engine on it. Kleinberg proposed another link analysis algorithm, HITS. Since then, given the link structure that citations naturally form among scientific documents, many researchers have drawn on the ideas of these methods to address document quality evaluation.

Summary of the Invention

The purpose of the invention is to model and analyze the relationships among documents, authors, and journals/conferences, and to use the relationship between user behavior and document quality in the Web 2.0 environment to assist in analyzing document quality. The invention unifies the two analysis approaches, peer review and citation analysis, within the framework of a random walk with restart and produces the final analysis result.

The scheme adopted by the invention to solve this technical problem is as follows (the flow is shown in Fig. 1):

The invention proposes a method for evaluating document quality. The method is applied to a scientific literature sharing platform on which users can bookmark documents, add tags, comment, and share documents with other users. The method is characterized in that it comprises the following steps:

A. Construct a weighted directed graph, called the academic network graph, from the citation relationships between documents, the relationships of documents to journals/conferences and authors, and the publication time of the documents;

B. Quantify the citation relationships and the relationships of documents to journals/conferences and authors as transitions between vertices of the graph, and model them to obtain the transition probability matrix on the academic network graph;

C. Build a model from users' bookmarking behavior toward documents, taking the bookmarking time into account, and use the HITS algorithm to compute a document quality value based on user analysis;

D. Using the models established in steps B and C, iterate a random walk with restart until the result converges, obtaining a probability value for each vertex of the academic network graph. These probability values give the document quality, journal/conference quality, and author academic reputation.

The method provided by the invention can be used not only on scientific literature sharing platforms but also on paper sharing platforms or websites (where the documents are papers), picture sharing platforms or websites (where the documents are pictures), and the like.

Beneficial effects of the invention:

The graph-based quality evaluation method for scientific literature proposed by the invention combines user behavior information with document quality evaluation for the first time and, along with the document quality results, also yields results for author academic reputation and journal/conference academic quality. When the invention is applied to a scientific literature search website and the results of a user's keyword search are sorted by quality value, it helps users find high-quality scientific documents faster and learn sooner about high-quality journals and conferences and about authors with high academic reputation. Experiments show that the ranking performance of this method is markedly better than that of other methods.

Brief Description of the Drawings

Fig. 1 is the overall flowchart of the graph-based scientific literature quality evaluation method according to the invention;

Fig. 2 is the academic network graph constructed according to the invention;

Fig. 3 is a diagram of the transitions between vertices on the academic network graph constructed according to the invention;

Fig. 4 is the user-document bookmarking graph constructed according to the invention.

Detailed Description

The invention is described in further detail below with reference to the drawings and specific embodiments:

Step 1: Construct a weighted directed graph, called the academic network graph, from the citation relationships between documents, the relationships of documents to journals/conferences and authors, and the publication time of the documents.

The academic network graph designed by the invention consists of three parts and models the relationships among three kinds of entities: documents, authors, and journals/conferences. The three parts are:

● Document citation subgraph $G_{dd} = (V_d, E_{dd})$

$G_{dd}$ is a directed graph representing citation relationships between documents, where $V_d$ is the set of document vertices and $E_{dd}$ is the edge set. A directed edge $\langle d_i, d_j \rangle \in E_{dd}$ means that document $d_i$ cites document $d_j$;

● Author-document subgraph $G_{ad} = (V_a \cup V_d, E_{ad})$

$G_{ad}$ is a bipartite graph representing authorship relationships between authors and documents, where $V_a$ is the set of author vertices and $E_{ad}$ is the edge set. An undirected edge $(a_i, d_j) \in E_{ad}$ means that author $a_i$ wrote document $d_j$;

● Journal/conference-document subgraph $G_{cd} = (V_c \cup V_d, E_{cd})$

$G_{cd}$ is a bipartite graph representing publication relationships between journals/conferences and documents, where $V_c$ is the set of journal and conference vertices and $E_{cd}$ is the edge set. An undirected edge $(c_i, d_j) \in E_{cd}$ means that document $d_j$ was published in journal or conference $c_i$;

The combination of these three subgraphs is the academic network graph, as shown in Fig. 2.

Define the academic network graph as a directed graph $G = (V, E)$, where $V = V_a \cup V_d \cup V_c$ is the vertex set and $E = E_{dd} \cup E_{ad} \cup E_{cd}$ is the edge set. Because the random walk must run on a directed graph, every undirected edge in the author-document and journal/conference-document subgraphs is represented as two directed edges connecting its two vertices, for example $(c_i, d_j) \rightarrow \langle c_i, d_j \rangle \cup \langle d_j, c_i \rangle$.
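The construction in Step 1 can be sketched as follows. This is a minimal illustration only: the function name `build_academic_network`, the tagged-tuple vertex encoding, and the input format (lists of pairs) are assumptions made for the example, and edge weights and publication times are left out.

```python
def build_academic_network(citations, authorship, venues):
    """Build the directed graph G = (V, E) as an adjacency dict.

    citations:  iterable of (citing_doc, cited_doc) pairs -> E_dd (directed)
    authorship: iterable of (author, doc) pairs           -> E_ad (undirected)
    venues:     iterable of (venue, doc) pairs            -> E_cd (undirected)

    Vertices are tagged tuples such as ("d", "p1") so that documents,
    authors, and venues with the same name cannot collide (an assumption
    of this sketch, not something the source prescribes).
    """
    graph = {}

    def add_edge(src, dst):
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())  # register dst even if it has no out-edges

    for citing, cited in citations:
        add_edge(("d", citing), ("d", cited))

    # Each undirected edge becomes two directed edges, as described above.
    for author, doc in authorship:
        add_edge(("a", author), ("d", doc))
        add_edge(("d", doc), ("a", author))
    for venue, doc in venues:
        add_edge(("c", venue), ("d", doc))
        add_edge(("d", doc), ("c", venue))
    return graph


g = build_academic_network(
    citations=[("p1", "p2")],
    authorship=[("alice", "p1")],
    venues=[("KDD", "p2")],
)
```

In this toy input, document `p1` points to the document it cites and to its author, while `p2` points only to its venue.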

Step 2: Quantify the citation relationships and the relationships of documents to journals/conferences and authors as transitions between vertices of the graph, and model them to obtain the transition probability matrix on the academic network graph.

Each vertex in the academic network graph G represents an author, a document, or a journal/conference, so G is a heterogeneous graph containing three different types of entities. The invention defines a different transition probability α for transitions between different types of vertices (entities), as shown in Fig. 3. These transition probability parameters satisfy:

$$\alpha_{ad} = \alpha_{cd} = 1$$

$$\alpha_{da} + \alpha_{dc} + \alpha_{dd} = 1$$

where $\alpha_{ad}$ is the transition probability from an author vertex to a document vertex, $\alpha_{cd}$ from a publication venue vertex to a document vertex, $\alpha_{da}$ from a document vertex to an author vertex, $\alpha_{dc}$ from a document vertex to a publication venue vertex, and $\alpha_{dd}$ from a document vertex to a document vertex.

Define W(G) as the weighted adjacency matrix of graph G, whose entries are the weights of the relationships between vertices of the academic network graph. Following the definition of the academic network graph above, W(G) can be decomposed into the series of submatrices shown in the table below. First, the invention assigns initial values to each submatrix to obtain the initial weighted adjacency matrix; then a weight scoring function is applied to these initial values to obtain the final weighted adjacency matrix; finally, the transition probability matrix is computed from the weighted adjacency matrix.

[Table image BDA0000023283060000041 (decomposition of W(G) into submatrices) is not reproduced in this text.]

The initial definitions of these submatrices are given below:

● Weighted adjacency matrix from document vertices to document vertices

[Formula image BDA0000023283060000042 (definition of $W_{dd}$) is not reproduced in this text.]

where $t(d)$ is the publication time of document $d$ and $\Gamma_{dd}(d_i)$ is the set of documents cited by document $d_i$.

● Weighted adjacency matrix from author vertices to document vertices

[Formula image BDA0000023283060000043 (definition of $W_{ad}$) is not reproduced in this text.]

where $\Gamma_{ad}(a_i)$ is the set of documents published by author $a_i$, and [formula image BDA0000023283060000044, not reproduced] author $a$ is the $k$-th author of document $d$.

● Weighted adjacency matrix from document vertices to author vertices: $W_{da}(j, i) = |\Gamma_{da}(d_j)| - k + 1$

where $\Gamma_{da}(d_j)$ is the set of authors of document $d_j$, and $k$ indicates that author $a_i$ is the $k$-th author of document $d_j$.

● Weighted adjacency matrix from document vertices to publication venue vertices

[Formula image BDA0000023283060000051 (definition of $W_{dc}$) is not reproduced in this text.]

● Weighted adjacency matrix from publication venue vertices to document vertices

[Formula image BDA0000023283060000052 (definition of $W_{cd}$) is not reproduced in this text.]

where $c_{ik}$ denotes one edition of conference $c_i$ or one volume of journal $c_i$, $\Gamma_{cd}(c_{im})$ is the set of documents published in $c_{im}$, and $t(c_{im})$ is the time (year) of $c_{im}$.

$$\Gamma_{cd}(c_{ik}) = \{\, d \mid t(d) = t(c_{ik}) \wedge d \in \Gamma_{cd}(c_i) \,\}$$

Clearly, $\bigcup_k \Gamma_{cd}(c_{ik}) = \Gamma_{cd}(c_i)$, and [formula image BDA0000023283060000054, not reproduced] $\forall k, l:\ t(c_{ik}) \neq t(c_{il})$.
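Of the initial weights, the document-to-author weight $W_{da}(j, i) = |\Gamma_{da}(d_j)| - k + 1$ is stated explicitly in the text, so it can be illustrated directly. The sketch below computes it for a single document; the function name `author_weights` and the representation of the byline as an ordered list are assumptions made for the example.

```python
def author_weights(ordered_authors):
    """Return {author: W_da weight} for one document d_j.

    ordered_authors lists the authors in byline order (first author first).
    The k-th author (1-based) gets weight |Gamma_da(d_j)| - k + 1, so with
    n authors the first author gets n and the last author gets 1.
    """
    n = len(ordered_authors)
    return {
        author: n - k + 1
        for k, author in enumerate(ordered_authors, start=1)
    }


weights = author_weights(["first", "second", "third"])
```

For a three-author document the weights are 3, 2, and 1, so earlier authors receive a larger share of the transition probability once the row is normalized.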

Next, a weight scoring function $\Phi$ is applied to the initial weights in the matrix:

$$W(i, j) = \Phi(W(i, j))$$

A suitable weight scoring function should be monotonically increasing, with increments that shrink as the argument grows, that is, $\Phi'(x) > 0$ and $\Phi''(x) < 0$. This method uses

[Formula image BDA0000023283060000056 (the chosen $\Phi$) is not reproduced in this text.]
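The specific scoring function chosen by the method is not recoverable from this text, so the sketch below substitutes a stand-in. It uses $\Phi(x) = \ln(1 + x)$ purely as an assumed example of a function that satisfies the stated criteria ($\Phi'(x) > 0$ and $\Phi''(x) < 0$ for $x \ge 0$); the names `phi` and `apply_scoring` are hypothetical.

```python
import math


def phi(x):
    """Stand-in scoring function ln(1 + x): increasing and concave for x >= 0.

    This is an assumption for illustration, not the function used in the source.
    """
    return math.log1p(x)


def apply_scoring(weights):
    """Apply phi to every entry of a weight matrix given as a list of rows.

    Zero entries (no edge) stay zero because phi(0) == 0.
    """
    return [[phi(w) for w in row] for row in weights]


scored = apply_scoring([[0, 1, 3]])
```

Any other increasing, concave function could be swapped in for `phi`; the effect in each case is to damp very large initial weights relative to small ones.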

Next, the transition probability matrices of the three subgraphs are defined, and then the transition probability matrix of the whole academic network graph is computed.

● Document citation subgraph $G_{dd}$

The document-to-document transition probability matrix is:

$$M_{dd} = \big(M_{dd}(i, j)\big)_{i, j \in V_d}$$

where

$$M_{dd}(i, j) = P(d_j \mid d_i) = \frac{W_{dd}(i, j)}{\sum_k W_{dd}(i, k)}$$

● Author-document subgraph $G_{ad}$

The author-to-document transition probability matrix is:

$$M_{ad} = \big(M_{ad}(i, j)\big)_{i \in V_a,\ j \in V_d}$$

where

$$M_{ad}(i, j) = P(d_j \mid a_i) = \frac{W_{ad}(i, j)}{\sum_k W_{ad}(i, k)}$$

The document-to-author transition probability matrix is:

$$M_{da} = \big(M_{da}(j, i)\big)_{j \in V_d,\ i \in V_a}$$

where

$$M_{da}(j, i) = P(a_i \mid d_j) = \frac{W_{da}(j, i)}{\sum_k W_{da}(j, k)}$$

● Journal/conference-document subgraph $G_{cd}$

The document-to-journal/conference transition probability matrix is:

$$M_{dc} = \big(M_{dc}(i, j)\big)_{i \in V_d,\ j \in V_c}$$

where

$$M_{dc}(i, j) = P(c_j \mid d_i) = \frac{W_{dc}(i, j)}{\sum_k W_{dc}(i, k)}$$

The journal/conference-to-document transition probability matrix is:

$$M_{cd} = \big(M_{cd}(j, i)\big)_{j \in V_c,\ i \in V_d}$$

where

$$M_{cd}(j, i) = P(d_i \mid c_j) = \frac{W_{cd}(j, i)}{\sum_k W_{cd}(j, k)}$$

From the transition probability matrices of the subgraphs, the transition probability matrix on the academic network graph is obtained:

$$M(G) = \big(P(j \mid i)\big)_{i, j \in V} = \begin{pmatrix} \alpha_{dd} M_{dd} & \alpha_{da} M_{da} & \alpha_{dc} M_{dc} \\ M_{ad} & 0 & 0 \\ M_{cd} & 0 & 0 \end{pmatrix}$$
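The row normalization and block assembly above can be sketched in plain Python. The helper names (`row_normalize`, `assemble_transition_matrix`), the list-of-rows matrix representation, and the toy weights are assumptions for illustration; vertices are ordered documents first, then authors, then venues, matching the block layout of $M(G)$.

```python
def row_normalize(W):
    """M(i, j) = W(i, j) / sum_k W(i, k); all-zero rows are left as zeros."""
    out = []
    for row in W:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def scale(M, alpha):
    """Multiply every entry of M by the scalar alpha."""
    return [[alpha * x for x in row] for row in M]


def assemble_transition_matrix(W_dd, W_da, W_dc, W_ad, W_cd, a_dd, a_da, a_dc):
    """Assemble M(G) from the five weight submatrices and the alpha parameters."""
    n_authors, n_venues = len(W_ad), len(W_cd)
    pad = [0.0] * (n_authors + n_venues)  # zero blocks for author/venue rows
    doc_rows = [
        dd + da + dc
        for dd, da, dc in zip(
            scale(row_normalize(W_dd), a_dd),
            scale(row_normalize(W_da), a_da),
            scale(row_normalize(W_dc), a_dc),
        )
    ]
    author_rows = [row + pad for row in row_normalize(W_ad)]
    venue_rows = [row + pad for row in row_normalize(W_cd)]
    return doc_rows + author_rows + venue_rows


# Toy example: 2 documents (d0 cites d1), 1 author, 1 venue.
M = assemble_transition_matrix(
    W_dd=[[0, 1], [0, 0]],
    W_da=[[1], [1]],
    W_dc=[[1], [1]],
    W_ad=[[1, 1]],
    W_cd=[[2, 1]],
    a_dd=0.5, a_da=0.25, a_dc=0.25,
)
```

Note that in this sketch the row of a document that cites nothing (here `d1`) sums to less than 1, because its $M_{dd}$ row is all zeros. The source text does not say how such rows are handled, so the sketch leaves them as they are.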

Step 3: Build a model from users' bookmarking behavior toward documents, taking the bookmarking time into account, and use the HITS algorithm to compute a document quality value based on user analysis.

The invention links documents and users through bookmarking behavior to construct a user-document bookmarking graph, in which users and documents are vertices and bookmarking actions are edges, as shown in Fig. 4. The user-document bookmarking system is defined as $B = (U, D, T, R)$, where $U$ is the set of users, $D$ is the set of documents, $T$ is a set of time points, and $R$

[Formula image BDA0000023283060000071 is not reproduced in this text.]

is the set of bookmarking relations. $(u, d, t) \in R$ means that user $u$ bookmarked document $d$ at time $t$.

Define the quality vector of the document set as $q = (q_1, q_2, \ldots, q_m)$, where $m = |D|$, and the expertise vector of the user set as $e = (e_1, e_2, \ldots, e_n)$, where $n = |U|$. Define the adjacency matrix $A$ of the user-document bookmarking graph:

[Formula image BDA0000023283060000072 (definition of $A$) is not reproduced in this text.]

Document quality values and user expertise are computed by repeating the following iteration until the result converges:

$$q = e \times A$$

$$e = q \times A^{T}$$
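The mutual-reinforcement iteration $q = e \times A$, $e = q \times A^{T}$ can be sketched as below. Two points are assumptions of this example rather than statements from the source: the matrix `A` is taken as given (here a plain 0/1 matrix, since the time-aware definition of $A$ is not available in this text), and both vectors are rescaled to sum to 1 after each pass so that the values stay bounded, as is usual for HITS-style iterations. The function name `hits_quality` is hypothetical.

```python
def hits_quality(A, max_iter=100, tol=1e-9):
    """Iterate q = e x A and e = q x A^T on an n-users by m-documents matrix A.

    Returns (q, e): document quality values and user expertise values.
    """
    n, m = len(A), len(A[0])
    e = [1.0] * n
    q = [0.0] * m
    for _ in range(max_iter):
        # q = e x A: a document is good if expert users bookmarked it.
        new_q = [sum(e[u] * A[u][d] for u in range(n)) for d in range(m)]
        # e = q x A^T: a user is an expert if they bookmarked good documents.
        new_e = [sum(new_q[d] * A[u][d] for d in range(m)) for u in range(n)]
        # Rescaling to unit sum is an assumption of this sketch.
        q_total, e_total = sum(new_q) or 1.0, sum(new_e) or 1.0
        new_q = [x / q_total for x in new_q]
        new_e = [x / e_total for x in new_e]
        delta = max(abs(a - b) for a, b in zip(new_q, q))
        q, e = new_q, new_e
        if delta < tol:
            break
    return q, e


# Rows are users, columns are documents; document 0 is bookmarked by everyone.
A = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
]
q, e = hits_quality(A)
```

In this toy matrix, document 0 ends up with the highest quality value, and the two users who bookmarked an additional document end up with higher expertise than the user who bookmarked only document 0.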

Step 4: Using the models established in Step 2 and Step 3, iterate a random walk with restart until the result converges, obtaining a probability value for each vertex of the academic network graph. These probability values give the document quality, journal/conference quality, and author academic reputation.

Let $d$ be the document quality vector, $a$ the author academic reputation vector, and $c$ the journal/conference quality vector. The vectors for the three entity types are concatenated into a single vector: $\pi = [d^{T}, a^{T}, c^{T}]^{T}$. The random walk with restart can be expressed by the following formula, in which $c$ is a scalar coefficient:

$$\pi_{t+1} = c\, M^{T} \pi_t + (1 - c)\, Q, \quad 0 \le c \le 1$$

Q is constructed as follows:

$Q(i)$ is normalized so that $\sum_{i \in V} Q(i) = |V|$.

To check convergence, the $\pi$ vectors from two consecutive iterations are subtracted; if the difference is less than $10^{-6}$, the iteration is considered converged. If the final vector is $\pi_n$, its entries are the document quality values, author academic reputation values, and journal/conference quality values.
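The iteration $\pi_{t+1} = c\, M^{T} \pi_t + (1 - c)\, Q$ can be sketched as follows. Several choices here are assumptions for illustration: the restart vector `Q` is supplied by the caller (its exact construction is not given in this text), the default `c=0.85` is an arbitrary value inside the stated range, the difference between iterations is measured as the largest absolute component change, and the function names are hypothetical.

```python
def normalize_restart(raw):
    """Scale a non-negative vector so that its entries sum to |V|."""
    total = sum(raw)
    return [x * len(raw) / total for x in raw]


def random_walk_with_restart(M, Q, c=0.85, tol=1e-6, max_iter=1000):
    """Iterate pi <- c * M^T pi + (1 - c) * Q until the change is below tol.

    M is the row-stochastic transition matrix (list of rows), Q the restart
    vector. The convergence test uses the maximum absolute component
    difference, which is one possible reading of "the difference".
    """
    n = len(M)
    pi = list(Q)
    for _ in range(max_iter):
        new_pi = [
            c * sum(M[i][j] * pi[i] for i in range(n)) + (1 - c) * Q[j]
            for j in range(n)
        ]
        diff = max(abs(a - b) for a, b in zip(new_pi, pi))
        pi = new_pi
        if diff < tol:
            break
    return pi


# Two vertices that point at each other; vertex 1 has the larger restart mass.
M = [[0.0, 1.0], [1.0, 0.0]]
Q = normalize_restart([1.0, 3.0])
pi = random_walk_with_restart(M, Q)
```

With a row-stochastic `M` and a start vector equal to `Q`, the total mass of `pi` stays at $|V|$ throughout, and the vertex favored by the restart vector ends up with the higher score.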

Performance Evaluation

The scientific literature quality evaluation method of the invention assigns a quality score to documents, journals/conferences, and authors. The rankings obtained from these scores are evaluated experimentally.

The document quality results are evaluated first, using documents from three fields: "Opinion Mining", "Topic Model", and "Social Network". The evaluation combines manual scoring of the quality rankings with the DCG (Discounted Cumulative Gain) measure. Evaluators assign each document a score according to its quality, and documents with higher scores should appear nearer the top of the ranking. The results are then evaluated with DCG; a higher DCG value means that the ranking produced by the algorithm better matches actual needs. The DCG value is calculated as:

$$\mathrm{DCG}_p = \mathrm{score}_1 + \sum_{i=2}^{p} \frac{\mathrm{score}_i}{\log_2 i}$$

where $\mathrm{score}_i$ is the score the evaluator gives to the $i$-th item in the ranking.
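The DCG formula above translates directly into code. The function name `dcg` and the example scores are illustrative; the input is the list of evaluator scores in ranked order.

```python
import math


def dcg(scores):
    """DCG_p = score_1 + sum_{i=2}^{p} score_i / log2(i), with p = len(scores)."""
    if not scores:
        return 0.0
    return scores[0] + sum(
        s / math.log2(i) for i, s in enumerate(scores[1:], start=2)
    )


good_order = dcg([3, 2, 1])  # best-scored item ranked first
bad_order = dcg([1, 2, 3])   # same items, reversed ranking
```

Placing the highest-scored documents first yields the larger value, which is why a higher DCG indicates a ranking closer to the evaluators' judgment.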

For document quality, the baseline methods used for comparison are:

● The document part of the PageRank results

● The document part of the PopRank results

● The document part of the results of the Random Walk algorithm (RW) on the academic network graph

● Citation Count: the number of times a document is cited within the paper collection used in this experiment.

The evaluation results are as follows (for brevity, the method of the invention is denoted RW+U):

  Method          Opinion Mining   Topic Model   Social Network
  PageRank        13.66547         15.78191      12.03448
  PopRank         17.16117         17.33343      16.13546
  CitationCount   17.40092         17.21429      14.63348
  RW              17.68033         17.90367      16.81558
  RW+U            18.16559         18.41081      17.28261

Next, the results for author academic reputation are evaluated in the same way as in the document quality experiment. The baseline methods are:

● The author part of the PageRank results

● The author part of the PopRank results

● The author part of the results of the Random Walk algorithm (RW) on the academic network graph

● Publication Count: the total number of documents the author published in the experimental field collection

● Citation Count: the total number of citations received by the documents the author published in the experimental field collection

The evaluation results are as follows:

  Method          Opinion Mining   Topic Model   Social Network
  PubNum          12.79365         13.29129      10.64821
  CitationCount   16.86091         14.12744      11.1679
  PageRank        15.53911         14.77489      13.77117
  PopRank         17.33779         15.87551      16.48075
  RW              17.81661         16.5786       16.87568
  RW+U            17.99627         16.61291      16.8852

Finally, the results for journal academic quality are evaluated. Because the impact factor is the journal quality measure most widely used in academia, the reference standard for this evaluation is the result of a modified impact factor analysis. The modified impact factor is calculated as:

$$\mathrm{mIF}_X = \frac{C}{D}$$

where $D$ is the total number of documents published in journal $X$ and $C$ is the total number of citations those documents received.

The journal evaluation is measured by the precision of the top N results, which is calculated as follows:

[Formula image BDA0000023283060000092 (definition of precision at N) is not reproduced in this text.]
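The two journal measures can be sketched as below. The modified impact factor follows the formula $\mathrm{mIF}_X = C / D$ given above. The exact precision formula is not available in this text, so `precision_at_n` uses an assumed definition: the fraction of the top N journals in the evaluated ranking that also appear in the top N of the reference (mIF) ranking. Both function names are hypothetical.

```python
def modified_impact_factor(citation_counts):
    """mIF_X = C / D for one journal.

    citation_counts holds one citation count per document published in the
    journal, so D is its length and C is its sum.
    """
    d = len(citation_counts)
    return sum(citation_counts) / d if d else 0.0


def precision_at_n(ranked, reference, n):
    """Assumed definition: share of ranked[:n] that is also in reference[:n]."""
    top_reference = set(reference[:n])
    return sum(1 for venue in ranked[:n] if venue in top_reference) / n


mif = modified_impact_factor([10, 0, 2])
p_at_2 = precision_at_n(["A", "C", "B"], ["A", "B", "C"], 2)
```

Under this assumed definition, a ranking is rewarded for placing the same journals in its top N as the reference does, regardless of their order within the top N.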

The evaluation results are as follows:

  Method     P@50   P@80     P@100
  PageRank   0.24   0.3375   0.4
  PopRank    0.42   0.425    0.47
  RW         0.42   0.425    0.47
  RW+U       0.44   0.4375   0.48

上表所示为几种算法结果中的文献质量值平均值的按年分布情况。这里列出的是从1971年到2009年的平均值,每年的均值是用当年发表文献的质量值之和除以发表的文献数。从图中可以看出,本发明的方法RW和RW+U对新文献的质量值要普遍高于其他两种方法,说明本发明的方法解决了传统方法中新文献评价结果普遍偏低的问题。The table above shows the yearly distribution of the average document quality values in the results of several algorithms. Listed here are the average values from 1971 to 2009, and the average value for each year is the sum of the quality values of published documents in that year divided by the number of published documents. As can be seen from the figure, the method RW and RW+U of the present invention have generally higher quality values for new documents than the other two methods, indicating that the method of the present invention solves the problem that the evaluation results of new documents in traditional methods are generally low .

It should be noted that the embodiments are published to aid further understanding of the present invention, and those skilled in the art will understand that various substitutions and modifications are possible without departing from the spirit and scope of the present invention and the appended claims. For example, the present invention can equally be applied to paper sharing platforms or websites (by replacing documents with papers) and to picture sharing platforms or websites (by replacing documents with pictures). Therefore, the present invention should not be limited to the content disclosed in the embodiments, and the scope of protection claimed is that defined by the claims.

Claims (8)

1. A document quality evaluation method, applied to a scientific and technical literature sharing platform on which users can collect documents, add tags, comment, and share documents with other users, characterized by comprising the following steps:
A. constructing a weighted directed graph, called the academic network graph, from the citation relationships between documents, the relationships between documents and journals/conferences and authors, and the publication times of the documents;
B. quantifying the document citation relationships, the document-journal/conference relationships, and the document-author relationships as transition relationships between vertices of the graph, and modeling them to obtain a transition probability matrix on the academic network graph;
C. building a model from the users' document collection behavior, taking collection time into account, and computing a document quality value based on user analysis with the HITS algorithm;
D. performing, according to the models established in steps B and C, a random walk with restart iteratively until the result converges, to obtain a probability value for each vertex of the academic network graph, the probability values being the document quality, journal/conference quality, and author academic reputation information.
2. The method of claim 1, wherein the academic network graph in step A is composed of three subgraphs:
● the document citation subgraph $G_{dd} = (V_d, E_{dd})$,
$G_{dd}$ is a directed graph representing the citation relationships between documents, where $V_d$ is the set of document vertices, $E_{dd}$ is the edge set, and a directed edge $\langle d_i, d_j \rangle \in E_{dd}$ indicates that document $d_i$ cites $d_j$;
● the author-document subgraph $G_{ad} = (V_a \cup V_d, E_{ad})$,
$G_{ad}$ is a bipartite graph representing the authorship relationships between authors and documents, where $V_a$ is the set of author vertices, $E_{ad}$ is the edge set, and an undirected edge $(a_i, d_j) \in E_{ad}$ indicates that author $a_i$ wrote document $d_j$;
● the journal/conference-document subgraph $G_{cd} = (V_c \cup V_d, E_{cd})$,
$G_{cd}$ is a bipartite graph representing the publication relationships between journals/conferences and documents, where $V_c$ is the set of journal and conference vertices, $E_{cd}$ is the edge set, and an undirected edge $(c_i, d_j) \in E_{cd}$ indicates that document $d_j$ was published in journal or conference $c_i$; the academic network graph is a directed graph $G = (V, E)$, where the vertex set $V = V_a \cup V_d \cup V_c$ and the edge set $E = E_{dd} \cup E_{ad} \cup E_{cd}$; each undirected edge in the author-document subgraph $G_{ad}$ and the journal/conference-document subgraph $G_{cd}$ is replaced with two directed edges connecting the two vertices of that edge.
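The graph construction in claim 2 can be sketched in Python as follows. This is an illustration only: the function name `build_academic_graph`, the tuple-based vertex encoding, and the sample data are assumptions, and edge weights and publication times are left out.

```python
def build_academic_graph(citations, authorship, publication):
    """Build the vertex and directed edge sets of the academic network graph.

    citations:   (citing_doc, cited_doc) pairs     -> edges of G_dd
    authorship:  (author, doc) pairs               -> edges of G_ad
    publication: (venue, doc) pairs                -> edges of G_cd
    Vertices are tagged ("d", id), ("a", id) or ("c", id) so that the three
    vertex sets stay disjoint even when identifiers collide.
    """
    edges = set()
    for citing, cited in citations:
        edges.add((("d", citing), ("d", cited)))   # citation edge is one-way
    for author, doc in authorship:
        # each undirected author-document edge becomes two directed edges
        edges.add((("a", author), ("d", doc)))
        edges.add((("d", doc), ("a", author)))
    for venue, doc in publication:
        # same replacement for journal/conference-document edges
        edges.add((("c", venue), ("d", doc)))
        edges.add((("d", doc), ("c", venue)))
    vertices = {vertex for edge in edges for vertex in edge}
    return vertices, edges


# Hypothetical data: d1 cites d2, alice wrote d1, d2 appeared at conf.
example_vertices, example_edges = build_academic_graph(
    citations=[("d1", "d2")],
    authorship=[("alice", "d1")],
    publication=[("conf", "d2")],
)
```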
3. The method of claim 2, wherein step B is implemented as follows:
B1. defining different transition probabilities $\alpha$ for transitions between different types of vertices:
$$\alpha_{ad} = \alpha_{cd} = 1$$
$$\alpha_{da} + \alpha_{dc} + \alpha_{dd} = 1$$
where $\alpha_{ad}$ is the transition probability from an author vertex to a document vertex, $\alpha_{cd}$ is the transition probability from a journal/conference vertex to a document vertex, $\alpha_{da}$ is the transition probability from a document vertex to an author vertex, $\alpha_{dc}$ is the transition probability from a document vertex to a journal/conference vertex, and $\alpha_{dd}$ is the transition probability from a document vertex to a document vertex;
B2. defining a weighted adjacency matrix $W(G)$ of the graph $G$, corresponding to the weights of the relationships between the vertices of the academic network graph, and decomposing $W(G)$, according to the definition of the academic network graph, into a series of submatrices $W_{dd}$, $W_{ad}$, $W_{da}$, $W_{dc}$, $W_{cd}$, where $W_{dd}$ is the weighted adjacency matrix from document vertices to document vertices, $W_{ad}$ is the weighted adjacency matrix from author vertices to document vertices, $W_{da}$ is the weighted adjacency matrix from document vertices to author vertices, $W_{dc}$ is the weighted adjacency matrix from document vertices to journal/conference vertices, and $W_{cd}$ is the weighted adjacency matrix from journal/conference vertices to document vertices;
B3. assigning initial values to each submatrix to obtain the initial weighted adjacency matrix:
Figure FDA0000023283050000021
where $t(d)$ denotes the publication time of document $d$, and $\Gamma_{dd}(d_i)$ denotes the set of documents cited by document $d_i$;
Figure FDA0000023283050000022
where $\Gamma_{ad}(a_i)$ denotes the set of documents published by author $a_i$,
Figure FDA0000023283050000023
author $a$ is the $k$-th author of document $d$;
c) $W_{da}(j, i) = |\Gamma_{da}(d_j)| - k + 1$
where $\Gamma_{da}(d_j)$ denotes the set of authors of document $d_j$, and $k$ indicates that author $a_i$ is the $k$-th author of document $d_j$;
Figure FDA0000023283050000024
Figure FDA0000023283050000025
where $c_{ik}$ denotes a particular session of conference $c_i$ or a particular volume of journal $c_i$, $\Gamma_{cd}(c_{im})$ denotes the set of documents published in $c_{im}$, and $t(c_{im})$ denotes the time corresponding to $c_{im}$;
B4. applying a weight scoring function to the initial values of the matrix to obtain the final weighted adjacency matrix;
B5. computing the transition probability matrix from the weighted adjacency matrix.
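Steps B1 to B5 can be sketched in Python under several stated assumptions. Only the author-position weight $|\Gamma_{da}(d_j)| - k + 1$ and the constraints on $\alpha$ come from the text; the other initial weights are set to 1 because their formulas survive only as figures, the scoring function of step B4 is skipped, and the block layout (document rows scaled by $\alpha$, author and venue rows leading only to documents) is an assumption consistent with $\alpha_{ad} = \alpha_{cd} = 1$. All function names are hypothetical.

```python
def author_position_weight(num_authors, k):
    """W_da weight for the k-th of num_authors authors: |Gamma_da(d)| - k + 1."""
    return num_authors - k + 1


def row_normalize(weights):
    """Divide each row by its sum; an all-zero row is left as zeros."""
    result = []
    for row in weights:
        total = sum(row)
        result.append([w / total if total else 0.0 for w in row])
    return result


def assemble_transition_matrix(w_dd, w_da, w_dc, w_ad, w_cd, a_dd, a_da, a_dc):
    """Stack row-normalized blocks in vertex order documents, authors, venues."""
    m_dd, m_da, m_dc = row_normalize(w_dd), row_normalize(w_da), row_normalize(w_dc)
    m_ad, m_cd = row_normalize(w_ad), row_normalize(w_cd)
    pad = [0.0] * (len(w_ad) + len(w_cd))   # author/venue rows reach documents only
    rows = []
    for i in range(len(w_dd)):
        rows.append([a_dd * x for x in m_dd[i]]
                    + [a_da * x for x in m_da[i]]
                    + [a_dc * x for x in m_dc[i]])
    rows.extend(row + pad for row in m_ad)
    rows.extend(row + pad for row in m_cd)
    return rows


# Hypothetical network: d1 cites d2; d1 is by (a1, a2), d2 is by a2; one venue.
example_w_da = [[author_position_weight(2, 1), author_position_weight(2, 2)],
                [0, author_position_weight(1, 1)]]
example_m = assemble_transition_matrix(
    w_dd=[[0, 1], [0, 0]], w_da=example_w_da, w_dc=[[1], [1]],
    w_ad=[[1, 0], [1, 1]], w_cd=[[1, 1]],
    a_dd=0.5, a_da=0.25, a_dc=0.25,
)
```

In this sketch the row for d2 sums to 0.5 rather than 1, because d2 cites nothing. The text does not say how such documents are treated, so the sketch leaves the row as is.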
4. The method of claim 3, wherein the weight scoring function used in step B4 is a monotonically increasing function whose increments gradually shrink as the value of the argument grows, that is, $\varphi'(x) > 0$ and $\varphi''(x) < 0$; in the method, the function is taken as
Figure FDA0000023283050000031
5. The method of claim 4, wherein step B5 is implemented as follows:
i. defining the transition probability matrices for the three subgraphs:
Document citation subgraph $G_{dd}$:
Document-to-document transition probability matrix
Figure FDA0000023283050000032
where
$$M_{dd}(i,j) = P(d_j \mid d_i) = \frac{W_{dd}(i,j)}{\sum_k W_{dd}(i,k)};$$
Author-document subgraph $G_{ad}$:
Author-to-document transition probability matrix
Figure FDA0000023283050000034
where
$$M_{ad}(i,j) = P(d_j \mid a_i) = \frac{W_{ad}(i,j)}{\sum_k W_{ad}(i,k)};$$
Document-to-author transition probability matrix
Figure FDA0000023283050000036
where
$$M_{da}(j,i) = P(a_i \mid d_j) = \frac{W_{da}(j,i)}{\sum_k W_{da}(j,k)};$$
Journal/conference-document subgraph $G_{cd}$:
Document-to-journal/conference transition probability matrix
Figure FDA0000023283050000038
where
$$M_{dc}(i,j) = P(c_j \mid d_i) = \frac{W_{dc}(i,j)}{\sum_k W_{dc}(i,k)};$$
Journal/conference-to-document transition probability matrix
Figure FDA00000232830500000310
where
$$M_{cd}(j,i) = P(d_i \mid c_j) = \frac{W_{cd}(j,i)}{\sum_k W_{cd}(j,k)};$$
ii. obtaining the transition probability matrix on the academic network graph from the transition probability matrices of the subgraphs:
$$M(G) = \bigl(P(j \mid i)\bigr)_{i,j \in V} = \begin{bmatrix} \alpha_{dd} M_{dd} & \alpha_{da} M_{da} & \alpha_{dc} M_{dc} \\ M_{ad} & 0 & 0 \\ M_{cd} & 0 & 0 \end{bmatrix}.$$
6. The method of claim 5, wherein step C is implemented as follows:
C1. constructing a user-document collection graph,
in which the vertices are users and documents and the edges are collection actions; a user-document collection system is defined as $B = (U, D, T, R)$, where $U$ is the set of users, $D$ is the set of documents, $T$ is the set of time points, and
Figure FDA0000023283050000041
denotes the set of collection relations, where $(u, d, t) \in R$ indicates that user $u$ collected document $d$ at time $t$;
C2. defining the adjacency matrix $A$ of the user-document collection graph:
first, a quality value vector $q = (q_1, q_2, \ldots, q_m)$ is defined for the document set, where $m = |D|$; an expertise vector $e = (e_1, e_2, \ldots, e_n)$ is defined for the user set, where $n = |U|$; the adjacency matrix of the user-document collection graph is then
Figure FDA0000023283050000042
C3. computing the document quality values and user expertise by repeating the following iteration until the result converges:
$$q = e \times A$$
$$e = q \times A^{T}$$
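The iteration in step C3 can be sketched in Python as below. Several points are assumptions rather than details from the patent: the matrix `a` has one row per user and one column per document, its entries are supplied by the caller (the time-dependent definition of $A$ survives only as a figure), each vector is rescaled to sum to 1 after every update so that the values stay bounded, and convergence is tested on the largest per-entry change. The function names are hypothetical.

```python
def _rescale(vector):
    """Scale a vector so its entries sum to 1 (left unchanged if all zero)."""
    total = sum(vector)
    return [x / total for x in vector] if total else list(vector)


def hits_quality(a, tol=1e-9, max_iter=1000):
    """Alternate q = e x A and e = q x A^T until both vectors stop changing."""
    n, m = len(a), len(a[0])
    e = [1.0] * n            # user expertise, assumed to start uniform
    q = [0.0] * m            # document quality
    for _ in range(max_iter):
        new_q = _rescale([sum(e[u] * a[u][d] for u in range(n)) for d in range(m)])
        new_e = _rescale([sum(new_q[d] * a[u][d] for d in range(m)) for u in range(n)])
        change = max(max(abs(x - y) for x, y in zip(new_q, q)),
                     max(abs(x - y) for x, y in zip(new_e, e)))
        q, e = new_q, new_e
        if change < tol:
            break
    return q, e


# Hypothetical collections: user 0 collected documents 0 and 1, user 1 only document 0.
example_q, example_e = hits_quality([[1.0, 1.0], [1.0, 0.0]])
```

With this input the document collected by both users ends up with the higher quality value, and the user who collected more documents ends up with the higher expertise value.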
7. The method of claim 6, wherein step D is implemented as follows:
D1. letting $d$ be the document quality value vector, $a$ the author academic reputation vector, and $c$ the journal/conference quality value vector, and concatenating the vectors of the three entity types into one vector $\pi = [d^T, a^T, c^T]^T$;
D2. applying the random walk with restart algorithm, iterating with the formula $\pi_{t+1} = c M^T \pi_t + (1 - c) Q$, $0 \le c \le 1$, where
Figure FDA0000023283050000043
and $Q(i)$ is normalized such that $\sum_{i \in V} Q(i) = |V|$;
D3. subtracting the $\pi$ vectors obtained in two consecutive iterations and, if the difference is less than $10^{-6}$, judging the iteration to have converged; denoting the resulting vector $\pi_n$, its values are the document quality values, author academic reputation values, and journal/conference quality values.
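The iteration of claim 7 can be sketched in Python, assuming the standard random walk with restart update $\pi_{t+1} = c M^T \pi_t + (1 - c) Q$. The restart probability of 0.85, the choice of $Q$ as the starting vector, and the use of the largest per-entry change as the convergence test are all assumptions, since the text gives no value for $c$, no initial vector, and no norm. The function name and sample matrix are hypothetical.

```python
def random_walk_with_restart(m, q, c=0.85, tol=1e-6, max_iter=10000):
    """Iterate pi <- c * M^T pi + (1 - c) * Q until successive vectors agree.

    m is a transition matrix with m[i][j] = P(j | i); q is the restart vector,
    expected to be scaled so that its entries sum to the number of vertices.
    """
    n = len(m)
    pi = list(q)                                   # assumed starting point
    for _ in range(max_iter):
        new_pi = [c * sum(m[i][j] * pi[i] for i in range(n)) + (1 - c) * q[j]
                  for j in range(n)]
        change = max(abs(x - y) for x, y in zip(new_pi, pi))
        pi = new_pi
        if change < tol:                           # threshold from step D3
            break
    return pi


# Hypothetical two-vertex chain; Q sums to |V| = 2.
example_pi = random_walk_with_restart([[0.0, 1.0], [0.5, 0.5]], [1.0, 1.0])
```

When every row of `m` sums to 1, the entries of the result keep the same total as `q`, so the scores remain comparable across vertex types.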
8. Use of the method of claim 1 in a paper sharing platform or website, or in a picture sharing platform or website.
CN2010102263535A 2010-07-14 2010-07-14 A Document Quality Evaluation Method and Its Application Pending CN101887460A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010102263535A CN101887460A (en) 2010-07-14 2010-07-14 A Document Quality Evaluation Method and Its Application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010102263535A CN101887460A (en) 2010-07-14 2010-07-14 A Document Quality Evaluation Method and Its Application

Publications (1)

Publication Number Publication Date
CN101887460A true CN101887460A (en) 2010-11-17

Family

ID=43073382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102263535A Pending CN101887460A (en) 2010-07-14 2010-07-14 A Document Quality Evaluation Method and Its Application

Country Status (1)

Country Link
CN (1) CN101887460A (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102984191A (en) * 2011-09-07 2013-03-20 百度在线网络技术(北京)有限公司 Method and device and equipment used for determining behavior related quality information
CN102984191B (en) * 2011-09-07 2017-06-09 百度在线网络技术(北京)有限公司 Method, device and equipment for determining behavior correlated quality information
CN103559407B (en) * 2013-11-14 2016-08-31 北京航空航天大学深圳研究院 A kind of commending system for measuring direct graph with weight interior joint cohesion and method
CN103559407A (en) * 2013-11-14 2014-02-05 北京航空航天大学深圳研究院 Recommendation system and method for measuring node intimacy in weighted graph with direction
CN104462215A (en) * 2014-11-05 2015-03-25 大连理工大学 Scientific and technical literature quoting number predicting method based on time sequence
CN104462215B (en) * 2014-11-05 2017-07-11 大连理工大学 A kind of scientific and technical literature based on time series is cited number Forecasting Methodology
CN104537495A (en) * 2014-12-31 2015-04-22 浙江大学 Scholar ability calculation method and system
CN104657488A (en) * 2015-03-05 2015-05-27 中南大学 Method for calculating author influence based on citation propagation network
CN105404641A (en) * 2015-10-23 2016-03-16 华建宇通科技(北京)有限责任公司 Baseline based journal evaluation method and evaluation apparatus
CN105404641B (en) * 2015-10-23 2018-10-26 华建宇通科技(北京)有限责任公司 A kind of Journal Evaluation method and evaluating apparatus based on baseline
CN105589948A (en) * 2015-12-18 2016-05-18 重庆邮电大学 Document citation network visualization and document recommendation method and system
CN105589948B (en) * 2015-12-18 2018-10-12 重庆邮电大学 A kind of reference citation network visualization and literature recommendation method and system
CN105740386A (en) * 2016-01-27 2016-07-06 北京航空航天大学 Thesis search method and device based on sorting integration
CN105843876A (en) * 2016-03-18 2016-08-10 合网络技术(北京)有限公司 Multimedia resource quality assessment method and apparatus
CN105843876B (en) * 2016-03-18 2020-07-14 阿里巴巴(中国)有限公司 Quality evaluation method and device for multimedia resources
WO2018077181A1 (en) * 2016-10-27 2018-05-03 腾讯科技(深圳)有限公司 Method and device for graph centrality calculation, and storage medium
US10936765B2 (en) 2016-10-27 2021-03-02 Tencent Technology (Shenzhen) Company Limited Graph centrality calculation method and apparatus, and storage medium
CN107391659B (en) * 2017-07-18 2020-05-22 北京工业大学 A Reputation-Based Citation Network Academic Influence Evaluation Ranking Method
CN107391659A (en) * 2017-07-18 2017-11-24 北京工业大学 A kind of citation network academic evaluation sort method based on credit worthiness
CN107833142A (en) * 2017-11-08 2018-03-23 广西师范大学 Academic social networks scientific research cooperative person recommends method
CN109272228B (en) * 2018-09-12 2022-03-15 石家庄铁道大学 Scientific research influence analysis method based on scientific research team cooperation network
CN109272228A (en) * 2018-09-12 2019-01-25 石家庄铁道大学 Scientific research influence power analysis method based on Research Team's cooperative network
CN109801692A (en) * 2018-12-14 2019-05-24 平安医疗健康管理股份有限公司 A kind of Medical record database method for evaluating quality and device
US11328328B2 (en) 2019-03-28 2022-05-10 Coupang Corp. Computer-implemented method for arranging hyperlinks on a grapical user-interface
CN110457439A (en) * 2019-08-06 2019-11-15 北京如优教育科技有限公司 One-stop intelligent writes householder method, device and system
CN110825942A (en) * 2019-10-22 2020-02-21 清华大学 Method and system for calculating quality of thesis
CN110955749A (en) * 2019-10-24 2020-04-03 浙江工业大学 Paper attention prediction method
CN112286988A (en) * 2020-10-23 2021-01-29 平安科技(深圳)有限公司 Medical document sorting method and device, electronic equipment and storage medium
WO2021179687A1 (en) * 2020-10-23 2021-09-16 平安科技(深圳)有限公司 Medical literature sorting method and apparatus, electronic device and storage medium
CN112286988B (en) * 2020-10-23 2023-07-25 平安科技(深圳)有限公司 Medical document ordering method, device, electronic equipment and storage medium
CN112508461A (en) * 2021-01-27 2021-03-16 中国科学院自动化研究所 Academic influence evaluation service platform system and device for multiple elements

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20101117