Kayvan Kousha
  • School of Technology, University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1LY, UK
  • My research includes Web citation analysis and online scholarly impact assessment using Web-based quantitative and qu...
National research evaluation initiatives and incentive schemes choose between simplistic quantitative indicators and time-consuming peer/expert review, sometimes supported by bibliometrics. Here we assess whether machine learning could provide a third alternative, estimating article quality using multiple bibliometric and metadata inputs. We investigated this using provisional three-level REF2021 peer review scores for 84,966 articles submitted to the U.K. Research Excellence Framework 2021, matching a Scopus record 2014–18 and with a substantial abstract. We found that accuracy is highest in the medical and physical sciences Units of Assessment (UoAs) and economics, reaching 42% above the baseline (72% overall) in the best case. This is based on 1,000 bibliometric inputs and half of the articles used for training in each UoA. Prediction accuracies above the baseline for the social science, mathematics, engineering, arts, and humanities UoAs were much lower or close to zero. Th...
Citation counts are widely used as indicators of research quality to support or replace human peer review and for lists of top cited papers, researchers, and institutions. Nevertheless, the relationship between citations and research quality is poorly evidenced. We report the first large‐scale science‐wide academic evaluation of the relationship between research quality and citations (field normalized citation counts), correlating them for 87,739 journal articles in 34 field‐based UK Units of Assessment (UoA). The two correlate positively in all academic fields, from very weak (0.1) to strong (0.5), reflecting broadly linear relationships in all fields. We give the first evidence that the correlations are positive even across the arts and humanities. The patterns are similar for the field classification schemes of Scopus and Dimensions.ai, although varying for some individual subjects and therefore more uncertain for these. We also show for the first time that no field has a citatio...
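The abstract above refers to field normalized citation counts without giving the formula. As a minimal sketch of one common form of normalization (an assumption here, not necessarily the variant used in the paper), each article's citation count can be divided by the mean count of articles from the same field and year. The dictionary keys and sample data are hypothetical.

```python
from collections import defaultdict


def field_normalise(articles):
    """Divide each article's citations by the mean of its (field, year) group.

    `articles` is a list of dicts with the hypothetical keys
    "id", "field", "year" and "citations".
    Returns {id: normalised score}; a group whose mean is zero scores 0.0.
    """
    totals = defaultdict(lambda: [0, 0])  # (field, year) -> [citation sum, article count]
    for a in articles:
        group = totals[(a["field"], a["year"])]
        group[0] += a["citations"]
        group[1] += 1

    scores = {}
    for a in articles:
        total, n = totals[(a["field"], a["year"])]
        mean = total / n
        scores[a["id"]] = a["citations"] / mean if mean > 0 else 0.0
    return scores


# Illustrative data only: a low-citation and a high-citation field.
sample = [
    {"id": "a1", "field": "History", "year": 2015, "citations": 2},
    {"id": "a2", "field": "History", "year": 2015, "citations": 6},
    {"id": "a3", "field": "Physics", "year": 2015, "citations": 20},
    {"id": "a4", "field": "Physics", "year": 2015, "citations": 60},
]
normalised = field_normalise(sample)
```

In this toy data the History article with 2 citations and the Physics article with 20 citations receive the same normalised score, because each sits at half of its own field's average.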
Purpose – To assess whether interdisciplinary research evaluation scores vary between fields. Design/methodology/approach – The authors investigate whether published refereed journal articles were scored differently by expert assessors (two per output, agreeing a score and norm referencing) from multiple subject-based Units of Assessment (UoAs) in the REF2021 UK national research assessment exercise. The primary raw data was 8,015 journal articles published 2014–2020 and evaluated by multiple UoAs, and the agreement rates were compared to the estimated agreement rates for articles multiply-evaluated within a single UoA. Findings – The authors estimated a 53% agreement rate on a four-point quality scale between UoAs for the same article and a within-UoA agreement rate of 70%. This suggests that quality scores vary more between fields than within fields for interdisciplinary research. There were also some hierarchies between fields, in the sense of UoAs that tended to give higher scores for the ...
Although funding is essential for some types of research and beneficial for others, it may constrain academic choice and creativity. Thus, it is important to check whether it ever seems unnecessary. Here we investigate whether funded U.K. research tends to be higher quality in all fields and for all major research funders. Based on peer review quality scores for 113,877 articles from all fields in the U.K.’s Research Excellence Framework (REF) 2021, we estimate that there are substantial disciplinary differences in the proportion of funded journal articles, from Theology and Religious Studies (16%+) to Biological Sciences (91%+). The results suggest that funded research is likely to be of higher quality overall, for all the largest research funders, and for 30 out of 34 REF Units of Assessment (disciplines or sets of disciplines), even after factoring out research team size. There are differences between funders in the average quality of the research supported, however. Funding seem...
Collaboration is encouraged because it is believed to improve academic research, supported by indirect evidence in the form of more coauthored articles being more cited. Nevertheless, this might not reflect quality but increased self‐citations or the “audience effect”: citations from increased awareness through multiple author networks. We address this with the first science‐wide investigation into whether author numbers associate with journal article quality, using expert peer quality judgments for 122,331 articles from the 2014–20 UK national assessment. Spearman correlations between author numbers and quality scores show moderately strong positive associations (0.2–0.4) in the health, life, and physical sciences, but weak or no positive associations in engineering and social sciences, with weak negative/positive or no associations in various arts and humanities, and a possible negative association for decision sciences. This gives the first systematic evidence that greater number...
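For readers unfamiliar with the statistic used above, the following is a minimal pure-Python sketch of a Spearman correlation: the Pearson correlation of the two rank vectors, with tied values sharing their average rank. The function names and sample numbers are illustrative assumptions and are not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative data only: author counts and quality scores for six articles.
author_counts = [1, 2, 2, 5, 12, 3]
quality_scores = [2, 2, 3, 3, 4, 3]
rho = spearman(author_counts, quality_scores)
```

Ranks are used instead of raw values because author counts are highly skewed and quality scores are ordinal, so a rank-based measure is less sensitive to a few very large teams.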
Collaborative research causes problems for research assessments because of the difficulty in fairly crediting its authors. Whilst splitting the rewards for an article amongst its authors has the greatest surface-level fairness, many important evaluations assign full credit to each author, irrespective of team size. The underlying rationales for this are labour reduction and the need to incentivise collaborative work because it is necessary to solve many important societal problems. This article assesses whether full counting changes results compared to fractional counting in the case of the UK's Research Excellence Framework (REF) 2021. For this assessment, fractional counting reduces the number of journal articles to as little as 10% of the full counting value, depending on the Unit of Assessment (UoA). Despite this large difference, allocating an overall grade point average (GPA) based on full counting or fractional counting gives results with a median Pearson correlation with...
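The difference between the two counting schemes described above can be sketched in a few lines. This is a simplified per-author illustration under assumed data structures, not a reconstruction of how REF2021 outputs were attributed: full counting gives every author one whole article, while fractional counting gives each of n authors a 1/n share.

```python
from collections import defaultdict


def count_outputs(articles):
    """Return (full, fractional) article counts per author.

    `articles` maps a hypothetical article id to the list of its authors.
    """
    full = defaultdict(float)
    fractional = defaultdict(float)
    for authors in articles.values():
        share = 1 / len(authors)
        for author in authors:
            full[author] += 1  # full counting: a whole article for each author
            fractional[author] += share  # fractional counting: 1/n for each author
    return dict(full), dict(fractional)


# Illustrative data only: a solo paper, a two-author paper, a four-author paper.
papers = {
    "p1": ["A"],
    "p2": ["A", "B"],
    "p3": ["A", "B", "C", "D"],
}
full, fractional = count_outputs(papers)
```

Under fractional counting the credits always sum to the number of articles, whereas under full counting the total grows with team size, which is why large-team fields shrink most when switching schemes.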
Purpose – Scholars often aim to conduct high quality research and their success is judged primarily by peer reviewers. Research quality is difficult for either group to identify, however, and misunderstandings can reduce the efficiency of the scientific enterprise. In response, we use a novel term association strategy to seek quantitative evidence of aspects of research that are associated with high or low quality. Design/methodology/approach – We extracted the words and 2–5-word phrases most strongly associated with different quality scores in each of 34 Units of Assessment (UoAs) in the Research Excellence Framework (REF) 2021. We extracted the terms from 122,331 journal articles 2014–2020 with individual REF2021 quality scores. Findings – The terms associating with high- or low-quality scores vary between fields but relate to writing styles, methods and topics. We show that the first-person writing style strongly associates with higher quality research in many areas because it is the norm fo...
Category            Type                 Teach.               Res.                 Prof.                Other
Subject             A-B    A-C    B-C    A-B    A-C    B-C    A-B    A-C    B-C    A-B    A-C    B-C    A-B    A-C    B-C
Social sciences     .445   .625   .256   .615   .615   .875   -.020  1.000  -.020  .125   .570   .177   .200   .141   .320
Arts and humanities .434   .663   .301   1.000  .260   .260   -.020  -.031  .485   .205   .451   .172   -.034  .364   .262
Sciences            .306   .441   .227   .336   .239   .519   .380   -.027  .296   .029   .308   .605   .197   .270   .146
Medical sciences    .522   .608   .384   .202   .790   .396   .000   .000   .291   .197   .217   .393   .245   .205   .286
Engineering         .370   .284   .103   .556   .481   .221   -.031  .000   .000   .422   .358   .427   -.027  .380   -.056
Average             .415   .524   .254   .542   .477   .454   .062   .188   .210   .196   .381   .355   .116   .272   .192
Although peer‐review and citation counts are commonly used to help assess the scholarly impact of published research, informal reader feedback might also be exploited to help assess the wider impacts of books, such as their educational or cultural value. The social website Goodreads seems to be a reasonable source for this purpose because it includes a large number of book reviews and ratings by many users inside and outside of academia. To check this, Goodreads book metrics were compared with different book‐based impact indicators for 15,928 academic books across broad fields. Goodreads engagements were numerous enough in the arts (85% of books had at least one), humanities (80%), and social sciences (67%) for use as a source of impact evidence. Low and moderate correlations between Goodreads book metrics and scholarly or non‐scholarly indicators suggest that reader feedback in Goodreads reflects the many purposes of books rather than a single type of impact. Although Goodreads boo...
Although Mendeley bookmarking counts appear to correlate moderately with conventional citation metrics, it is not known whether academic publications are bookmarked in Mendeley in order to be read or not. Without this information, it is not possible to give a confident interpretation of altmetrics derived from Mendeley. In response, a survey of 860 Mendeley users shows that it is reasonable to use Mendeley bookmarking counts as an indication of readership because most (55%) users with a Mendeley library had read or intended to read at least half of their bookmarked publications. This was true across all broad areas of scholarship except for the arts and humanities (42%). About 85% of the respondents also declared that they bookmarked articles in Mendeley to cite them in their publications, but some also bookmark articles for use in professional (50%), teaching (25%), and educational activities (13%). Of course, it is likely that most readers do not record articles in Mendeley and so...
Although there is some evidence that online videos are increasingly used by academics for informal scholarly communication and teaching, the extent to which they are used in published academic research is unknown. This article explores the extent to which YouTube videos are cited in academic publications and whether there are significant broad disciplinary differences in this practice. To investigate, we extracted the URL citations to YouTube videos from academic publications indexed by Scopus. A total of 1,808 Scopus publications cited at least one YouTube video, and there was a steady upward growth in citing online videos within scholarly publications from 2006 to 2011, with YouTube citations being most common within arts and humanities (0.3%) and the social sciences (0.2%). A content analysis of 551 YouTube videos cited by research articles indicated that in science (78%) and in medicine and health sciences (77%), over three fourths of the cited videos had either direct scientifi...
Purpose – This study aims to explore the link creating behaviour of European highly cited scientists based upon their online lists of publications and their institutional personal websites. Design/methodology/approach – A total of 1,525 highly cited scientists working at European institutions were first identified. Outlinks from their online lists of publications and their personal websites pointing to a pre-defined collection of popular academic websites and file types were then gathered by a personal web crawler. Findings – Perhaps surprisingly, a larger proportion of social scientists provided at least one outlink compared to the other disciplines investigated. By far the most linked-to file type was PDF and the most linked-to type of target website was scholarly databases, especially the Digital Object Identifier website. Health science and life science researchers mainly linked to scholarly databases, while scientists from engineering, hard sciences and social sciences linked to a w...
Two partly conflicting academic pressures from the seriousness of the Covid-19 pandemic are the need for faster peer review of Covid-19 health-related research and greater scrutiny of its findings. This paper investigates whether decreases in peer review durations for Covid-19 articles were universal across 97 major medical journals, Nature, Science, and Cell. The results suggest that on average, Covid-19 articles submitted during 2020 were reviewed 1.7-2.1 times faster than non-Covid-19 articles submitted during 2017-2020. Nevertheless, whilst the review speed of Covid-19 research was particularly fast during the first five months (1.9-3.4 times faster) of the pandemic (January-May 2020), this speed advantage was no longer evident for articles submitted November-December 2020. Faster peer review also associates with higher citation impact for Covid-19 articles in the same journals, suggesting it did not usually compromise the scholarly impact of important Covid-19 research. Overall, then, it seems that core medical and general journals responded quickly but carefully to the pandemic, although the situation returned closer to normal within a year.
Introduction. Computer scientists and other researchers often make their programs freely available online. If this software makes a valuable contribution inside or outside of academia then its creators may want to demonstrate this with a suitable indicator, such as download counts. Methods. Download counts, citation counts, labels and licences were extracted for programs that were both hosted in the Google Code software repository and cited in Scopus. Analysis. Download counts were correlated with Scopus citations, the distributions of both were compared and common software labels and licensing arrangements were identified. Results. Although downloads correlate positively and significantly with Scopus citations, the correlation is weak (0.3) because some software has a large natural audience outside of academia. There is disagreement on the best licence to use for shared software, with no licence chosen by more than about a fifth of the projects. The most common language lab...
This literature review assesses indicators derived from social media sources, including both general and academic sites. Such indicators have been termed altmetrics, influmetrics, social media metrics, or a type of webometric, and have recently been commercialised by a number of companies and employed by some publishers and university administrators. The social media metrics analysed here derive mainly from Twitter, Facebook, Google+, F1000, Mendeley, ResearchGate, and Academia.edu. They have the apparent potential to deliver fast, free indicators of the wider societal impact of research, or of different types of academic impacts, complementing academic impact indicators from traditional citation indexes. Although it is unwise to employ them in formal evaluations with stakeholders, due to their susceptibility to gaming and lack of real evidence that they reflect wider research impacts, they are useful for formative evaluations and to investigate science itself. Mendeley reader co...
Although peer review is likely to dominate quality assessment of research in the future UK Research Excellence Framework (REF), citation indicators will also be used in some subject areas to support the peer-review process. However, traditional journal-based citation indexes may be inadequate for the citation impact assessment of book-based disciplines. This article examines whether online citations from Google Books and Google Scholar can provide an alternative. We compared the citation counts to books submitted to the 2008 Research Assessment Exercise (RAE – the forerunner of the REF) from Google Books and Google Scholar with Scopus citations across seven book-based disciplines (archaeology, law, politics and international studies, philosophy, sociology, history, and communication, cultural and media studies) based upon a sample of 1,000 authored books. Google Books and Google Scholar citations to authored books were 1.4 and 3.2 times bigger than Scopus citations and their medians were...
Purpose – The purpose of this study is to explore current practices, challenges and technological needs of different data repositories. Design/methodology/approach – An online survey was designed for data repository managers, and contact information from re3data, a data repository registry, was collected to disseminate the survey. Findings – In total, 189 responses were received, including 47% discipline specific and 34% institutional data repositories. A total of 71% of the repositories reporting their software used bespoke technical frameworks, with DSpace, EPrints and Dataverse being commonly used by institutional repositories. Of repository managers, 32% reported tracking secondary data reuse while 50% would like to. Among data reuse metrics, citation counts were considered extremely important by the majority, followed by links to the data from other websites and download counts. Despite their perceived usefulness, repository managers struggle to track dataset citations. Most repository...
There are many ways in which academic articles can be used outside research contexts for teaching, culture, medical practice, business, policy making, or knowledge communication. Articles that have significant wider benefits may therefore be undervalued if they are assessed through conventional citation indicators and sources (e.g., the Web of Science and Scopus). A range of online document genres and sources may help to evaluate these broader impacts of articles, including academic syllabi, textbooks, clinical trials or guidelines, patents, encyclopedia articles, and grey literature publications. Web citations can be used as a quantitative impact indicator for monitoring the wider impact of articles, especially in the arts, humanities, and social sciences where many research outputs have value beyond academia. This article reviews literature about the web citation analysis of articles and explains different methods to capture web citations from a range of online sources via commercial search engines. The applications and limitations of web citation analysis for wider impact assessment of articles are discussed, in addition to practical advice for data gathering. New web citation indicators can help research evaluation peer review and citation analysis by giving additional information about the wider benefits of published research when a type of impact (e.g., teaching, commercial, or clinical impact) is required to be assessed by authors, research funders, or evaluators in addition to their academic research impact.
While funders increasingly request evidence of the societal benefits of research, all academics in the UK must periodically provide this information to gain part of their block funding within the Research Excellence Framework (REF). The impact case studies produced in the UK are public and can therefore be used to gain insights into the types of sources used to justify societal impact claims. This study focuses on the URLs cited as evidence in the last public REF to help researchers and resource providers to understand what types can be used and the disciplinary differences in their uptake. Based on a new semiautomatic method to classify the URLs cited in impact case studies, the results show that there are a few key online types of source for most broad fields, but these sources differ substantially between subject areas. For example, news websites are more important in some fields than others, and YouTube is sometimes used for multimedia evidence in the arts and humanities. Knowle...
Covid-19 vaccine hesitancy seems likely to increase mortality rates and delay the easing of social distancing restrictions. Online platforms with large audiences may influence vaccine hesitancy by spreading fear and misinformation that is avoided by the mainstream media. Understanding what types of vaccine hesitancy information are shared on the popular social web site Twitter may therefore help to design interventions to address misleading attitudes. This study applies content analysis to a random sample of 446 vaccine hesitant Covid-19 tweets in English posted between 10 March and 5 December 2020. The main themes discussed were conspiracies, vaccine development speed, and vaccine safety. Most (79%) of those tweeting refusal to take a vaccine expressed right-wing opinions, fear of a deep state, or conspiracy theories. A substantial minority of vaccine refusers (18%) mainly tweeted non-politically about other themes. The topics on Twitter reflect vaccine concerns, but those stating v...
Purpose – The purpose of this paper is to investigate the potential of altmetric and webometric indicators to aid with funding agencies’ evaluations of their funding schemes. Design/methodology/approach – This paper analyses a range of altmetric and webometric indicators in terms of suitability for funding scheme evaluations, compares them to traditional indicators and reports some statistics derived from a pilot study with Wellcome Trust-associated publications. Findings – Some alternative indicators have advantages to usefully complement scientometric data by reflecting a different type of impact or through being available before citation data. Research limitations/implications – The empirical part of the results is based on a single case study and does not give statistical evidence for the added value of any of the indicators. Practical implications – A few selected alternative indicators can be used by funding agencies as part of their funding scheme evaluations if they are proc...
Canada is one of the countries that admit a large number of migrants from different nations. According to Citizenship and Immigration Canada, Iran has been one of the top 10 countries in terms of immigration rate over the past decade. It is not fully known how the immigration of Iranian scholars may influence the scientific productivity of the destination country, however. To fill this gap, we assessed the share of Iranian authors' contribution in scientific publications of Canada and reported their educational and occupational backgrounds. Corresponding author affiliations of about 39,500 articles indexed in Scopus (2005-2011) in engineering fields were extracted, and the corresponding authors' names were checked to see whether they were Persian. A sample of online CVs from Iranian corresponding authors was used to determine the researchers' educational and occupational backgrounds. Results showed a constant increase in the proportion of publications with Iranian corresponding authors and Canadian affiliati...
Purpose – Communicating scientific results to the public is essential to inspire future researchers and ensure that discoveries are exploited. News stories about research are a key communication pathway for this and have been manually monitored to assess the extent of press coverage of scholarship. Design/methodology/approach – To make larger scale studies practical, this paper introduces an automatic method to extract citations from newspaper stories to large sets of academic journals. Curated ProQuest queries were used to search for citations to 9,639 Science and 3,412 Social Science Web of Science (WoS) journals from eight UK daily newspapers during 2006–2015. False matches were automatically filtered out by a new program, with 94% of the remaining stories meaningfully citing research. Findings – Most Science (95%) and Social Science (94%) journals were never cited by these newspapers. Half of the cited Science journals covered medical or health-related topics, whereas 43% of the Socia...
Primary data collected during a research study is increasingly shared and may be re-used for new studies. To assess the extent of data sharing in favourable circumstances and whether such checks can be automated, this article investigates the summary statistics of primary human genome-wide association studies (GWAS). This type of data is highly suitable for sharing because it is a standard research output, is straightforward to use in future studies (e.g., for secondary analysis), and may be already stored in a standard format for internal sharing within multi-site research projects. Manual checks of 1799 articles from 2010 and 2017 matching a simple PubMed query for molecular epidemiology GWAS were used to identify 330 primary human GWAS papers. Of these, only 10.6% reported the location of a complete set of GWAS summary data, increasing from 4.3% in 2010 to 16.8% in 2017. Whilst information about whether data was shared was usually located clearly within a data availability statem...
