Abstract
Human papillomavirus (HPV) negative cancers are associated with symptomatic detection, late-stage diagnosis, and worse prognosis. It is thus essential to investigate all possible infectious agents and biomarkers that could enable early identification of these HPV negative cancers. We aimed to analyze and compare the metatranscriptome present in HPV positive and HPV negative cervical cancers. We analyzed the whole RNA sequencing files from 223 HPV negative cervical cancers (negativity established after confirming cervical cancer diagnosis and sample adequacy, and after subjecting specimens to PCR and unbiased RNA sequencing), 223 HPV positive tumors and 11 blank paraffin block pools (used as controls) using Kraken2 software. Overall, 84 bacterial genera were detected, with 6/84 genera showing a positive median number of reads/sample and being present in both cervical tumor groups (HPV positive and negative). Viral reads belonged to 63 different viral genera, with 6/63 genera showing a positive median annotated read/sample value. No significant difference among genera was detected except for the presence of alpha-papillomaviruses. The metatranscriptomes of bacteria and viruses present in HPV positive and HPV negative cervical cancers show no significant difference, except for HPV. Further studies are needed to enable early identification of this biologically distinct group of cervical cancers.
Introduction
Human papillomaviruses are a group of double-stranded DNA viruses comprising up to 223 different types, with new types being continuously identified1,2,3,4. About 12 HPV types are currently classified as oncogenic, high-risk HPV genotypes, and persistent infection with these oncogenic HPV types is known to be a necessary cause of cervical cancer.
Although effective screening methods (cytology and HPV testing) exist and several effective prophylactic vaccines are available on the market to prevent cervical cancer, the annual number of new cases of cervical cancer is estimated to increase from 570,000 to 700,000 between 2018 and 2030, with the annual number of deaths rising from 311,000 to 400,000 (ref. 5). Cervical cancer is preventable, but it is still one of the most common cancers and causes of cancer-related death in women worldwide.
Due to advances in molecular testing, the more sensitive method of HPV-based screening is now replacing cytology as the primary cervical cancer screening tool. This method tests cervical samples for HPV DNA rather than for morphological abnormalities and constitutes a more automated and objective screening strategy. Women with a negative HPV test have a very low risk of subsequent cancer and may safely be recalled in three to seven years. Nevertheless, a significant proportion of cervical cancers (around 7%) appears to be HPV negative6,7,8. Because of these HPV negative cancers, some women may not be categorized correctly by the screening program, and their tumors may be detected only once they have become symptomatic (very late). Of particular note, authors have shown that cervical cancer cases that appear to have lost HPV DNA seem to constitute a specific subgroup of cervical cancers with a different biological behavior (worse prognosis)9,10.
Efforts to explain HPV negativity in cervical cancers have led scientists to reanalyze smears and retest these HPV negative tumors using different approaches. The main explanations for HPV negativity have been sample inadequacy, false diagnosis, presence of cancers associated with types that were not targeted in the studies, and false negative results due to low sensitivity of the HPV detection methods6,7,8,11.
Most HPV screening methods are focused on detecting high-risk HPV types and are based on PCR amplification of conserved regions in the L1 gene by consensus primers, followed by HPV genotype identification by hybridization of amplicons to type-specific probes12,13. However, these methods are biased towards detecting known HPVs that bind specifically to the designed primers and probes. Unknown HPV types, known HPV types that are distantly related, and targeted HPV types presenting variations within probe/primer regions may escape amplification or hybridization4,14,15,16,17. To overcome this bottleneck, it is now possible to perform unbiased metagenomic sequencing (not based on PCR) and detect all HPVs present in a sample, without prior knowledge of which types might be present14,18,19. Furthermore, if cDNA is sequenced, the data will reveal whether there is viral transcriptional activity, which is essential for both initiation and maintenance of the malignant phenotype.
Following the three recommendations mentioned above, cervical tumors classified as HPV negative after (a) assuring sample adequacy, (b) reanalysis of corresponding slides/smears to confirm tumor diagnosis and (c) not revealing presence of HPV after performing an unbiased sequencing analysis, may be called truly HPV negative. Understanding the biology of truly HPV negative cervical cancers is urgently needed in the era of cervical cancer elimination.
Cervical cancer is known to have an infectious cause, with HPV being the primary predisposing factor. Truly HPV negative cervical cancers could have initially been caused by HPV but lost the virus as the tumor progressed. A very high proportion of cervical intraepithelial neoplasia grade 3 (CIN3) lesions contains HPV, and there are cases of HPV negative cancers that were HPV positive at earlier screening appointments20. On the other hand, it is possible that another infectious agent plays a role in this specific subgroup of HPV negative cancers. Taking both possibilities into consideration, a search for an infectious cause/biomarker should focus on unbiased RNA sequencing of truly HPV negative cancers. We aimed to analyze and compare the metatranscriptome of HPV positive cervical cancers with the metatranscriptome of HPV negative cancers and assess whether there are differences that could act as biomarkers for early identification of these HPV negative cancers.
Materials and methods
In a previous study, we systematically genotyped all cervical cancer cases occurring in Sweden from 2002–2011 (n = 2850). For those that were HPV negative (n = 527, 18.5% of total cervical tumors), sample adequacy was analyzed and the diagnosis was reanalyzed. After excluding samples that were not adequate (beta-globin negativity) and those whose diagnosis was not confirmed, a different PCR approach (targeting E6/E7 genes instead of L1) was performed. Repeatedly negative samples decreased to 394/2850 (13.8% of total cervical tumors). We then subjected all HPV negative carcinomas (and a subset of 59 HPV PCR positive cervical cancers used as positive controls) to unbiased RNA sequencing on the NovaSeq 6000 platform (Illumina), generating high quality sequencing data with a median of 30 million reads per sample. HPV RNA positivity was detected in 169/392, decreasing the percentage of cervical cancers negative for HPV from 18.5% to 7.8% (223/2850)15.
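The percentages quoted above follow directly from the reported counts. As a quick arithmetic check (a minimal sketch; the helper name `pct` is illustrative and not part of the study):

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


TOTAL_CANCERS = 2850  # all cervical cancers genotyped, as reported in the text

initially_negative = pct(527, TOTAL_CANCERS)   # negative after first genotyping
repeatedly_negative = pct(394, TOTAL_CANCERS)  # still negative after E6/E7 PCR
truly_negative = pct(223, TOTAL_CANCERS)       # negative after RNA sequencing

print(initially_negative, repeatedly_negative, truly_negative)
```

The three values correspond to the 18.5%, 13.8% and 7.8% given in the text.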
For this study, we selected all 223 HPV negative specimens and compared their metatranscriptomes with those detected in 223 HPV positive cervical cancers. These 223 HPV positive cervical specimens comprised 169 specimens that were originally HPV PCR negative but turned out to be HPV positive when subjected to unbiased sequencing, as well as 54/59 HPV PCR positive samples used previously as positive controls (5/59 HPV PCR positive samples were HPV negative when subjected to sequencing). A total of 11 blank paraffin block pools were used as negative controls. While sectioning each formalin-fixed paraffin-embedded (FFPE) cervical tumor block, blank paraffin blocks had been sectioned in between as a control for contamination. These paraffin blocks were extracted in the same manner as the cervical tumors, pooled (each pool containing about 45 blank blocks) and sequenced together with the corresponding tumor blocks.
High-quality non-human reads were classified using Kraken2 v. 2.1.1 (ref. 21), which was run against a reference database containing all RefSeq bacterial and viral genomes (built in December 2020) with a 0.1 confidence threshold. A cut-off of 10 classified unique reads was used to discriminate positive genera for bacteria and viruses, and results reported all genera which comprised more than 1% of total bacterial or viral reads, respectively.
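The reporting rule described above (at least 10 classified reads and more than 1% of the total bacterial or viral reads) can be sketched as a simple filter. This is an illustration only, not the authors' pipeline: the function name, genus labels and counts are hypothetical, and the sketch assumes that the 1% threshold is applied to the summed reads of the domain being analyzed.

```python
def positive_genera(read_counts, min_reads=10, min_fraction=0.01):
    """Keep genera with >= min_reads reads and > min_fraction of all reads.

    read_counts maps genus name -> classified unique reads for one domain
    (bacteria or viruses). Both thresholds must be satisfied.
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    return {
        genus: reads
        for genus, reads in read_counts.items()
        if reads >= min_reads and reads / total > min_fraction
    }


# Hypothetical counts, for illustration only.
example_counts = {"GenusA": 5000, "GenusB": 40, "GenusC": 8, "GenusD": 952}
reported = positive_genera(example_counts)
print(reported)
```

In this example GenusB passes the read cut-off but falls below 1% of the 6000 total reads, and GenusC fails the read cut-off, so only GenusA and GenusD would be reported.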
As HPV was the key difference between the cervical tumors (and the only significant difference detected between the 2 groups), we also queried all high-quality non-human reads against a reference database containing all human papillomavirus nucleotide sequences deposited in GenBank until July 8th 2022 (parameters: Taxid 151340, length 5000–12,000, excluding non-partial genomes) using Kraken2 v. 2.1.1 (ref. 21), with a 0.1 confidence threshold. A cut-off of 10 classified unique reads was used to discriminate positive species. Reads that corresponded to HPV genomes were subjected to visual inspection using Integrative Genomics Viewer to confirm mapping.
Statistical analysis
Differences in bacterial/viral presence across the two groups (HPV positive and HPV negative cancers) were evaluated by comparing median unique reads with the nonparametric Wilcoxon rank-sum test, and relative proportions of bacterial/viral communities among the HPV positive and HPV negative cervical cancers were analyzed using a two-proportion Z-test and its associated p-value. Bonferroni correction was applied considering a 0.05 error rate (alpha level), and statistical significance was declared at p < 0.0002.
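The three ingredients named above can be sketched with the Python standard library alone. This is a hedged illustration under stated assumptions, not the software used in the study: the rank-sum test uses the normal approximation with a tie correction and no continuity correction, the function names are ours, and the number of comparisons (250) is an assumption chosen only because 0.05/250 reproduces the stated 0.0002 threshold; the text does not state the actual number.

```python
import math
from collections import Counter


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def rank_sum_test(x, y):
    """Wilcoxon rank-sum test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    if n < 2:
        return 0.0, 1.0
    counts = Counter(list(x) + list(y))
    avg_rank, next_rank = {}, 1
    for value in sorted(counts):
        ties = counts[value]
        avg_rank[value] = next_rank + (ties - 1) / 2  # midrank for tied values
        next_rank += ties
    w = sum(avg_rank[v] for v in x)                   # rank sum of group x
    mean_w = n1 * (n + 1) / 2
    tie_term = sum(t**3 - t for t in counts.values()) / (n * (n - 1))
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_w == 0:                                    # all values identical
        return 0.0, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    return z, two_sided_p(z)


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion Z-test for x1/n1 versus x2/n2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0
    z = (x1 / n1 - x2 / n2) / se
    return z, two_sided_p(z)


def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Assumed number of comparisons (not given in the text).
threshold = bonferroni_threshold(0.05, 250)

# Alphapapillomavirus detection counts reported in the Results section.
z_hpv, p_hpv = two_proportion_z_test(142, 223, 3, 223)
print(z_hpv > 0, p_hpv < threshold)
```

With the Alphapapillomavirus counts from the Results (142/223 versus 3/223), the Z statistic is large and the p-value falls far below the corrected threshold, in agreement with the reported significance.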
Ethical approval was granted by the Regional Ethical Review Board of Stockholm, Sweden (EPN-Dnr: 2012/1028/32). The Regional Ethical Review Board determined that, due to the population-based nature of the study, informed consent from study participants was not required (EPN-Dnr: 2011/1026–31/4). All methods were performed in accordance with the relevant guidelines and regulations.
Results
In a previous study, all cervical cancers diagnosed from 2002 to 2011 throughout Sweden were collected and genotyped. HPV negativity was determined after confirmation of sample adequacy, reanalysis of the diagnosis and confirmation of cervical cancer, and failure to detect HPV by both PCR and unbiased RNA sequencing15. A total of 223 HPV negative cervical cancers were reported. The present study aimed to compare the metatranscriptome present in these 223 HPV negative cervical cancer specimens with the corresponding metatranscriptome present in 223 HPV positive cervical tumors.
The median age at cancer diagnosis of the HPV negative cervical cases (n = 223) was 68 years (range 30–93 years) and the median age at cancer diagnosis of the HPV positive cervical cases (n = 223) was 56 years (range 24–95 years).
Bacteriome
The metatranscriptome analysis from RNA sequencing yielded a total of 63.80 M annotated bacterial reads for the HPV positive specimens and 62.81 M annotated bacterial reads for the HPV negative specimens.
A total of 84 different bacterial genera showing at least 10 reads and 1% of total bacterial annotated reads were detected within the 223 HPV positive and 223 HPV negative cervical tumors. Only 6/84 genera showed a positive median number of reads/sample for both specimen groups (HPV positive and HPV negative cancers) (Table 1).
The same 6 genera were found in both types of cervical tumors (HPV positive and HPV negative cancers), and the corresponding proportions were similar. In decreasing order of median annotated reads/sample, the genera detected in HPV positive and HPV negative specimens, respectively, were: Klebsiella (66,563 and 69,618 median reads/sample), Staphylococcus (40,020 and 39,935 reads), Pasteurella (31,263 and 31,026 reads), Burkholderia (5633 and 5643 reads), Paracoccus (4817 and 4955 median reads) and Bacillus (2898 and 3260 reads). All 6 genera were present in at least 87.89% of specimens. No statistical difference (p < 0.001) in median read level or in proportion was detected for any bacterial genus when comparing HPV positive and HPV negative specimens (Table 1).
Analysis of controls (blank paraffin specimens) showed a total of 31 different genera, with 28/31 genera also being detected in cervical cancer specimens. All 6 genera that showed positive median read values in cervical tumors were also present in at least one of the blank controls, with Klebsiella, Staphylococcus, Pasteurella and Paracoccus being identified in 1/11 blank controls and Burkholderia and Bacillus in 5 and 3 blank controls, respectively.
Virome
The metatranscriptome analysis from RNA sequencing yielded a total of 613,606 annotated viral reads for the HPV positive specimens and 575,734 annotated viral reads for the HPV negative specimens.
Overall, 63 different viral genera were detected when analyzing the RNA metatranscriptome within HPV positive and HPV negative tumors, with 6/63 genera showing a positive median annotated read/sample value (Table 2).
Three of these 6 genera (Gorganvirus, Orthobunyavirus and Betabaculovirus) showed a positive median read value in both HPV negative and HPV positive cervical tumors, while Alphabaculovirus, Alphapapillomavirus and Pandoravirus showed a positive median read value only in HPV positive tumors. Statistical significance (p < 0.001) was only detected for the Alphapapillomavirus genus (both for the difference in median unique reads and for the relative proportions). The Alphapapillomavirus genus was detected in 142/223 HPV positive specimens and in 3/223 HPV negative specimens according to the established cut-offs and reporting parameters (10 reads and at least 1% of total viral reads).
Further analysis of HPV presence was performed by querying high-quality non-human reads against a more “complete” database containing thousands of human papillomavirus sequences deposited in GenBank, using Kraken2. The analysis revealed HPV positivity (defined as more than 10 reads/species) in all 223 HPV positive cervical tumors, 7/10 blank control pools and 30/223 HPV negative cervical tumors. Visual inspection of reads using Integrative Genomics Viewer showed that the blank controls and HPV negative cervical tumors that turned out to be HPV positive in this analysis did in fact show more than 10 reads, but the reads were very short (< 50 bp) and viral coverage was below 10%. All 223 HPV positive cervical tumors showed more than 10 reads, and HPV coverage was above 10% for all specimens.
Analysis of controls (blank paraffin specimen pools) revealed the presence of 74 different viral genera, with 30/74 genera also being present in HPV negative and/or HPV positive cervical cancer samples. Most of these genera (25/30) were present in less than half of the blank controls, while Locarnavirus, Chlorovirus, Gemycircularvirus, Orthobunyavirus and Betabaculovirus were present in 6/11, 9/11, 9/11, 10/11 and 11/11 blank pools, respectively.
Discussion
HPV negative cervical cancers exist and have distinctive properties, such as late-stage diagnosis and poor prognosis. In this paper, we compared the metatranscriptome from 223 HPV negative cervical cancers with the metatranscriptome of 223 HPV positive cancers to examine whether there was any difference between them.
Strengths of this study include the use of HPV negative cancers whose diagnosis, sample adequacy and HPV result had been confirmed and analyzed with different methods in order to dismiss false negativity. FFPE specimens were reanalyzed by an expert pathologist and confirmed to indeed contain invasive cervical cancer tissue. Beta-globin was present in all specimens included in the study and HPV detection was performed by using PCR targeting L1, E6/E7 as well as unbiased RNA sequencing. Furthermore, blank paraffin controls were used to assess possible contamination as well as presence of environmental communities, and Bonferroni correction was applied to prevent data from incorrectly appearing to be statistically significant (a common event when performing multiple comparisons).
Bacterial and viral communities have already been analyzed and compared between health and disease, especially thanks to the effort of the Human Microbiome Project (HMP), the first large study to address the diversity of microorganisms present in the different organs of the human body22. The literature agrees that “healthy” women show a vaginal microbiome with low diversity, dominated by Lactobacillus, and that the most prominent features of “disease” are an increase in pH (due to a decrease in lactate concentration), a reduction of lactobacilli and a great diversity of bacterial vaginosis-related bacteria, which are primarily anaerobic bacteria23. Studies on HPV positive CIN and cervical cancers show higher diversity in the vaginal microbiota, with depletion of Lactobacillus crispatus and increased abundance of anaerobic bacteria24,25. While our results from cervical cancer cases agree with the literature (higher diversity of bacteria and low abundance of Lactobacillus, only present in 10/446), microbiome differences between HPV negative and HPV positive cervical cancers have not been studied previously. No significant difference among the genera present was detected except for the presence of alpha-papillomaviruses. This is in line with the fact that HPV is the only infectious agent reported so far to be associated with cervical cancer, and it supports the hypothesis that loss of HPV may occur as the tumor progresses and may explain the occurrence of these cancers, which account for almost 7% of all cervical cancers diagnosed. A weakness of the present study is the lack of previous screening specimens from the corresponding women, which could have shown whether HPV was detectable prior to cancer diagnosis.
Human papillomaviruses were initially reported in 143/223 HPV positive cervical cancers (142 samples showing predominantly an alphapapillomavirus genus, and 1 specimen showing a gammapapillomavirus) and 3/223 HPV negative cervical cancers when querying sequencing reads against the RefSeq database. Genera were reported when they presented more than 10 reads and comprised at least 1% of total bacterial/viral reads/sample. A further investigation of HPV presence using a broader HPV database identified HPV in all HPV positive tumors, as well as in 7/10 blank control pools and 30/223 HPV negative cervical tumors. Visual inspection of these reads revealed questionable calling for both the blank pools and the 30/223 HPV negative cervical tumors, where the viral coverage was below 10%.
There is no consensus about which cut-offs to use for HPV “calling” when performing sequencing analysis, and there is an urgent need to establish them26. To reduce false-negative/positive classification, it is imperative to use complete and updated databases and to take into consideration both the number of reads/k-mers (e.g. at least 10 reads) and the genome coverage (e.g. at least 10%) to achieve accurate identification. Accepting the number of reads/k-mers as the only parameter is not enough, as artefacts/background/noise may produce false positivity26,27.
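A combined read-count and coverage criterion of this kind can be sketched as follows. This is a minimal illustration rather than a validated calling procedure: the function names, the half-open interval representation of aligned reads, the 8000 bp genome length and the example reads are all hypothetical, and the default thresholds simply mirror the example values mentioned above (at least 10 reads and at least 10% coverage).

```python
def genome_coverage(intervals, genome_length):
    """Fraction of the genome covered by the union of [start, end) intervals."""
    covered = 0
    covered_up_to = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, covered_up_to, 0)  # skip bases already counted
        end = min(end, genome_length)         # clip to the genome
        if end > start:
            covered += end - start
            covered_up_to = end
    return covered / genome_length


def call_hpv(intervals, genome_length, min_reads=10, min_coverage=0.10):
    """Call a sample positive only if both thresholds are met."""
    enough_reads = len(intervals) >= min_reads
    enough_coverage = genome_coverage(intervals, genome_length) >= min_coverage
    return enough_reads and enough_coverage


GENOME_LENGTH = 8000  # hypothetical genome length in bp

# Twelve short reads stacked on the same 40 bp region: many reads, low coverage.
stacked_reads = [(100, 140)] * 12
# Twelve 100 bp reads tiled along the genome: 15% coverage.
tiled_reads = [(i * 100, i * 100 + 100) for i in range(12)]

print(call_hpv(stacked_reads, GENOME_LENGTH), call_hpv(tiled_reads, GENOME_LENGTH))
```

The stacked example resembles the questionable calls described above: the read count exceeds the cut-off, but the reads cover well under 10% of the genome, so the sample is not called positive.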
In conclusion, the present study reports the metatranscriptome analysis of both HPV negative and HPV positive cervical tumors and does not detect any statistical difference in the expression of bacterial/viral communities when comparing HPV positive and HPV negative cervical cancers (except for Alphapapillomavirus). Further studies are needed to find differences and/or biomarkers for early identification of this biologically distinct subgroup of HPV negative cervical cancers. As the study clearly indicates that the metatranscriptome does not differ much between HPV negative and HPV positive cancers, we suggest that it may be more rewarding to look for differences in the human genome and transcriptome between HPV positive and HPV negative cervical cancers.
Data availability
All the aligned, non-human sequences have been available at the Sequence Read Archive (SRA) under BioProject ID PRJNA563802 since the previous publication15 (https://www.ncbi.nlm.nih.gov/bioproject/563802).
References
Bzhalava, D. et al. Deep sequencing extends the diversity of human papillomaviruses in human skin. Sci. Rep. 4, 5807. https://doi.org/10.1038/srep05807 (2014).
Ekstrom, J. et al. Diversity of human papillomaviruses in skin lesions. Virology 447, 300–311. https://doi.org/10.1016/j.virol.2013.09.010 (2013).
Martin, E. et al. Characterization of three novel human papillomavirus types isolated from oral rinse samples of healthy individuals. J. Clin. Virol. 59, 30–37. https://doi.org/10.1016/j.jcv.2013.10.028 (2014).
Arroyo Muhr, L. S. et al. Human papillomavirus type 197 is commonly present in skin tumors. Int. J. Cancer 136, 2546–2555. https://doi.org/10.1002/ijc.29325 (2015).
Schiffman, M., Clifford, G. & Buonaguro, F. M. Classification of weakly carcinogenic human papillomavirus types: Addressing the limits of epidemiology at the borderline. Infect. Agent Cancer 4, 8. https://doi.org/10.1186/1750-9378-4-8 (2009).
Cancer Genome Atlas Research Network et al. Integrated genomic and molecular characterization of cervical cancer. Nature 543, 378–384. https://doi.org/10.1038/nature21386 (2017).
Tjalma, W. HPV negative cervical cancers and primary HPV screening. Facts Views Vis. Obgyn. 10, 107–113 (2018).
Walboomers, J. M. et al. Human papillomavirus is a necessary cause of invasive cervical cancer worldwide. J. Pathol. 189, 12–19. https://doi.org/10.1002/(SICI)1096-9896(199909)189:1%3c12::AID-PATH431%3e3.0.CO;2-F (1999).
Lei, J. et al. Human Papillomavirus Infection Determines Prognosis in Cervical Cancer. J. Clin. Oncol. https://doi.org/10.1200/JCO.21.01930 (2022).
Banister, C. E., Liu, C., Pirisi, L., Creek, K. E. & Buckhaults, P. J. Identification and characterization of HPV-independent cervical cancers. Oncotarget 8, 13375–13386. https://doi.org/10.18632/oncotarget.14533 (2017).
Li, N., Franceschi, S., Howell-Jones, R., Snijders, P. J. & Clifford, G. M. Human papillomavirus type distribution in 30,848 invasive cervical cancers worldwide: Variation by geographical region, histological type and year of publication. Int. J. Cancer 128, 927–935. https://doi.org/10.1002/ijc.25396 (2011).
Arbyn, M. et al. 2020 list of human papillomavirus assays suitable for primary cervical cancer screening. Clin. Microbiol. Infect. 27, 1083–1095. https://doi.org/10.1016/j.cmi.2021.04.031 (2021).
Iftner, T. & Villa, L. L. Chapter 12: Human papillomavirus technologies. J. Natl. Cancer Inst. Monogr. 1, 80–88. https://doi.org/10.1093/oxfordjournals.jncimonographs.a003487 (2003).
Arroyo Muhr, L. S. et al. Sequencing detects human papillomavirus in some apparently HPV-negative invasive cervical cancers. J. Gen. Virol. 101, 265–270. https://doi.org/10.1099/jgv.0.001374 (2020).
Arroyo Muhr, L. S. et al. Deep sequencing detects human papillomavirus (HPV) in cervical cancers negative for HPV by PCR. Br. J. Cancer 123, 1790–1795. https://doi.org/10.1038/s41416-020-01111-0 (2020).
Arroyo Muhr, L. S. et al. Does human papillomavirus-negative condylomata exist?. Virology 485, 283–288. https://doi.org/10.1016/j.virol.2015.07.023 (2015).
Arroyo, L. S. et al. Next generation sequencing for human papillomavirus genotyping. J. Clin. Virol. 58, 437–442. https://doi.org/10.1016/j.jcv.2013.07.013 (2013).
Arroyo Muhr, L. S. et al. Viruses in cancers among the immunosuppressed. Int. J. Cancer 141, 2498–2504. https://doi.org/10.1002/ijc.31017 (2017).
Bzhalava, D. et al. Unbiased approach for virus detection in skin lesions. PLoS ONE 8, e65953. https://doi.org/10.1371/journal.pone.0065953 (2013).
Xing, B., Guo, J., Sheng, Y., Wu, G. & Zhao, Y. Human papillomavirus-negative cervical cancer: A comprehensive review. Front. Oncol. 10, 606335. https://doi.org/10.3389/fonc.2020.606335 (2020).
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257. https://doi.org/10.1186/s13059-019-1891-0 (2019).
NIH HMP Working Group et al. The NIH Human Microbiome Project. Genome Res 19, 2317–2323. https://doi.org/10.1101/gr.096651.109 (2009).
Chee, W. J. Y., Chew, S. Y. & Than, L. T. L. Vaginal microbiota and the potential of Lactobacillus derivatives in maintaining vaginal health. Microb Cell Fact 19, 203. https://doi.org/10.1186/s12934-020-01464-4 (2020).
Lin, D. et al. Microbiome factors in HPV-driven carcinogenesis and cancers. PLoS Pathog 16, e1008524. https://doi.org/10.1371/journal.ppat.1008524 (2020).
Carlander, C. et al. HPV types in cervical precancer by HIV status and birth region: A population-based register study. Cancer Epidemiol. Biomark. Prev. 29, 2662–2668. https://doi.org/10.1158/1055-9965.EPI-20-0969 (2020).
Muhr, L. S. A., Guerendiain, D., Cuschieri, K. & Sundstrom, K. Human papillomavirus detection by whole-genome next-generation sequencing: Importance of validation and quality assurance procedures. Viruses 13, 1323. https://doi.org/10.3390/v13071323 (2021).
Zhang, C. et al. Identification of low abundance microbiome in clinical samples using whole genome sequencing. Genome Biol. 16, 265. https://doi.org/10.1186/s13059-015-0821-z (2015).
Acknowledgements
The authors would like to acknowledge support from Science for Life Laboratory, the National Genomics Infrastructure, NGI, and Uppmax for providing assistance in massive parallel sequencing and computational infrastructure.
Funding
Open access funding provided by Karolinska Institute. This study was funded by the Karolinska Institutet Research Foundation (FS-2020:0007, LSAM).
Author information
Contributions
Conceptualization, Supervision, Project administration, Writing/Original Draft: L.S.A.M.; Methodology: C.L.; Formal analysis, Data curation: A.U., L.S.A.M.; Investigation, Validation, Visualization: A.U., C.L., L.S.A.M.; All authors participated in writing and critical revision of the manuscript for important intellectual content. All the authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ure, A.E., Lagheden, C. & Arroyo Mühr, L.S. Metatranscriptome analysis in human papillomavirus negative cervical cancers. Sci Rep 12, 15062 (2022). https://doi.org/10.1038/s41598-022-19008-8
DOI: https://doi.org/10.1038/s41598-022-19008-8