Gene four-type classification system for papillary thyroid carcinoma patients and application thereof
Technical Field
The invention relates to the technical field of biomedicine, in particular to a gene four-type classification system for papillary thyroid carcinoma patients and application thereof.
Background
Thyroid cancer is the most common malignant tumor of the endocrine system, and its incidence is increasing rapidly. Thyroid cancers of different pathological types differ significantly in pathogenesis, biological behavior, histological morphology, clinical manifestations, therapeutic methods and prognosis. Papillary thyroid carcinoma (PTC) is a differentiated thyroid carcinoma that accounts for approximately 90% of all cases of thyroid cancer. PTC grows slowly and its prognosis is generally good. However, its morbidity and recurrence rates are high (20-40%), and the mortality of PTC is about 50% of the total mortality of thyroid cancer. Research shows that several signaling pathways may be involved in the occurrence and development of thyroid cancer, including the MAPK pathway, the PI3K-AKT pathway, the NF-kB pathway and the RASSF1-MST1-FOXO3 pathway, and that mutations of various genes related to these pathways, such as BRAF, HRAS, KRAS, NRAS, RET and TERT, play important roles in the occurrence and development of thyroid cancer. The genetic heterogeneity of PTC patients affects their clinical and pathological characteristics (such as age, stage, lesion diameter and lymph node status) and survival prognosis, and is worthy of exploration.
PTC is a tumor driven by the MAPK signaling pathway and has two mutually exclusive driver alterations: the BRAF-V600E mutation and RAS mutations. BRAF-V600E mutations occur in about 45% of PTCs, and studies have shown that BRAF-V600E is strongly correlated with poor clinicopathological outcomes of PTC, including aggressive pathological features, increased recurrence, loss of radioactive iodine avidity and treatment failure. Two subsequent large meta-analyses demonstrated that tumors carrying BRAF-V600E are more aggressive, because the expression of genes responsible for iodine uptake and metabolism is greatly reduced in BRAF-V600E tumors compared with RAS-mutated and RTK-fusion tumors. RAS mutations, comprising the three subtypes HRAS, KRAS and NRAS, are less frequent than BRAF mutations and occur in 10% to 20% of PTCs. Although RAS is a classical dual activator of the MAPK and PI3K-AKT pathways, RAS mutations appear to preferentially activate the PI3K-AKT pathway in thyroid tumorigenesis. To explore the relationship between BRAF-V600E and RAS, researchers developed a BRAF-V600E-RAS scoring model that quantifies how closely the expression profile of a given tumor resembles the BRAF-V600E or the RAS mutation profile, but this model neglects patients with gene mutation types other than BRAF-V600E and RAS. In addition, other types of gene mutation or fusion occur in PTC, such as RET mutations/fusions and TP53 mutations, which may be associated with invasion and metastasis, and BRAF/TERT and RAS/TERT co-mutations, which are potentially associated with malignant features such as relapse, poor prognosis and high mortality. Meanwhile, the literature reports that TP53 and KRAS have significant effects on PD-L1 expression, T-cell infiltration and the enhancement of tumor immunogenicity. Research shows that BRAF gene mutation is related to prognostic features of PTC patients, such as higher tumor stage, local invasion and cervical lymph node metastasis.
However, other studies have shown that BRAF gene mutations are associated with patient age and the aggressiveness of PTC but not with lymph node metastasis, whereas RET rearrangement suggests increased lymph node metastasis in young PTC patients. In addition, studies have shown that RET, ALK or NTRK1 gene fusions and chromosome 22q deletion are independently associated with PTC metastasis.
Therefore, the genetic heterogeneity of PTC has an important influence on clinical manifestations, lymph node metastasis and prognosis, but no systematic PTC genotyping system has been proposed so far, and the influence of specific genes remains controversial. Meanwhile, there are few relatively large-scale reports on the mutation spectrum of thyroid cancer in Chinese patients and its correlation with clinical characteristics and prognosis.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a gene four-type classification system for papillary thyroid carcinoma patients and application thereof.
In order to achieve the purpose, the invention adopts the technical scheme that:
in a first aspect, the present invention provides a gene four-type classification system for papillary thyroid carcinoma patients, comprising:
1) a data collection module, used for collecting SYSMH-PTC cohort data and TCGA-PTC cohort multi-omics information of papillary thyroid carcinoma patients;
2) a typing module, used for dividing the papillary thyroid carcinoma patients in the SYSMH-PTC cohort data and the TCGA-PTC cohort multi-omics information into a new gene four-type classification of BRAF mutation, RET fusion, RAS mutation and other gene mutation types according to the representative mutant genes of the patients (mutually exclusive relationship among driver genes);
3) an analysis module, used for analyzing the clinical pathological manifestations, tumor immune cell infiltration, proinflammatory factor expression, immune checkpoint expression, lymph node metastasis risk and prognosis of papillary thyroid carcinoma patients of different genotypes.
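As a non-limiting illustration of the typing module, the following minimal Python sketch assigns each patient to one of the four types. The function name `assign_genotype`, the input layout, the toy patients and the order in which the driver events are checked are assumptions made for illustration only and are not specified by the invention.

```python
from collections import Counter

# RAS subtypes named in the Background section.
RAS_GENES = {"HRAS", "KRAS", "NRAS"}

def assign_genotype(alterations):
    """Map a set of (gene, kind) alterations to one of the four types.

    The checking order below is a hypothetical tie-break for co-occurring
    alterations; the source only states that the drivers are mutually exclusive.
    """
    mutated = {gene for gene, kind in alterations if kind == "mutation"}
    fused = {gene for gene, kind in alterations if kind == "fusion"}
    if "BRAF" in mutated:
        return "BRAF mutation"
    if "RET" in fused:
        return "RET fusion"
    if mutated & RAS_GENES:
        return "RAS mutation"
    return "Other gene mutation"

# Toy patients, invented for illustration (not cohort data).
patients = {
    "P1": {("BRAF", "mutation")},
    "P2": {("RET", "fusion")},
    "P3": {("NRAS", "mutation"), ("TERT", "mutation")},
    "P4": {("TP53", "mutation")},
}
genotypes = {pid: assign_genotype(alts) for pid, alts in patients.items()}
type_counts = Counter(genotypes.values())
print(genotypes)
print(type_counts)
```

The per-type counts in `type_counts` correspond to the kind of distribution summary that a typing module would pass on to the analysis module.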
As a preferred embodiment of the gene four-type classification system for papillary thyroid carcinoma patients according to the present invention, the SYSMH-PTC cohort data comprises clinical pathological information and tumor tissue gene sequencing data; the TCGA-PTC cohort multi-omics information comprises clinical pathological information, tumor tissue gene sequencing data and prognosis information.
The new four-type classification of BRAF mutation, RET fusion, RAS mutation and other gene mutation types can be validated in the TCGA-PTC cohort.
As a preferred embodiment of the gene four-type classification system for papillary thyroid carcinoma patients according to the invention, the clinical pathological manifestations include the age, maximum lesion diameter, number of central lymph nodes, sex, stage, tumor TPO, capsular invasion, vascular tumor thrombus, T stage, N stage, tumor CK19, galectin-3, lymph node CK19, lymph node TPO, CD56, Ki-67 or TTF-1 of each genotype.
Comparative analysis of the clinical pathological manifestations of PTC patients of different genotypes shows that the PTC new gene four-type classification system can clearly distinguish the clinical pathological manifestations of PTC patients. Age, number of central lymph nodes and maximum lesion diameter differed significantly among the genotypes, while TPO was associated with RET and BRAF.
As a preferred embodiment of the gene four-type classification system for papillary thyroid carcinoma patients according to the present invention, the analysis of tumor immune cell infiltration is performed by using a heat map, the xCell algorithm or the CIBERSORT algorithm.
Comparative analysis of immune cell infiltration among PTC patients of different genotypes shows that tumor tissue immune cell infiltration differs among patients with different gene mutation status in the PTC new genotyping, that is, the PTC new gene four-type classification system can clearly distinguish the tumor tissue immune cell infiltration of PTC patients. Unsupervised hierarchical clustering based on the content of key cells of the tumor immune microenvironment (TIME) showed that RAS patients were highly similar to one another and differed clearly from the other groups in TIME (red area of the Subtype bar in the heat map); except for M2 macrophages, the other cells were present at low levels in RAS patients. The same conclusion was obtained with the xCell algorithm and the CIBERSORT algorithm.
As a preferred embodiment of the thyroid papillary carcinoma patient genotyping classification system of the present invention, the proinflammatory factor comprises IL-1, IL-2, IL-6, IL-12, IL-17, IL-18, IFN-gamma or TNF-alpha.
In the PTC new genotyping, the expression of proinflammatory factors differs significantly among patients of different genotypes, that is, the PTC new gene four-type classification system can clearly distinguish the proinflammatory factor expression of PTC patients.
As a preferred embodiment of the gene four-type classification system for papillary thyroid carcinoma patients according to the present invention, the immune checkpoints include PDCD1, CD274 and CTLA4. The analysis shows that immune checkpoint expression differs significantly among patients of different genotypes, that is, the PTC new gene four-type classification system can clearly distinguish the immune checkpoint expression of PTC patients.
As a preferred embodiment of the gene four-type classification system for papillary thyroid carcinoma patients, key genes regulating lymph node metastasis of papillary thyroid carcinoma are screened on the basis of the SYSMH-PTC cohort data of papillary thyroid carcinoma patients, and the genotyping is confirmed to be related to lymph node metastasis of papillary thyroid carcinoma patients.
As a preferred embodiment of the genotyping classification system for papillary thyroid carcinoma patients, the analysis module further comprises a coexpression network analysis of new genotyping differential genes.
Preferably, for each of the comparisons RAS mutant vs BRAF mutant, RAS mutant vs RET fusion and BRAF mutant vs RET fusion, the top 1000 ranked differential genes are selected, and the union of these genes is used for WGCNA differential gene co-expression network analysis.
The invention also provides application of the papillary thyroid carcinoma patient genotyping classification system in diagnosis of papillary thyroid carcinoma patients.
In the present invention, statistical analysis was performed using R software (http://www.R-project.org) and Python software (https://www.python.org/). The specific statistical methods are as follows:
General principles of statistical analysis: unless otherwise specified, all statistical tests were two-sided with α = 0.05, and confidence intervals were estimated as two-sided 95% confidence intervals. Quantitative indexes are described by the number of cases, mean, standard deviation, median, upper quartile, lower quartile, minimum and maximum, and categorical indexes are described by the number of cases and the percentage in each category. Baseline balance between two groups was compared using the t test, χ² test, Fisher's exact test or rank-sum test, as appropriate.
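For illustration, one of the tests named above, the two-sided Fisher's exact test at α = 0.05, can be sketched in plain Python for a 2×2 table. The function name and the table values are hypothetical and do not come from either cohort; the two-sided p-value here sums the probabilities of all tables that are no more probable than the observed one, which is one common convention and an assumption of this sketch.

```python
from math import comb

ALPHA = 0.05  # two-sided significance level stated in the text

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(p_value, 1.0)

# Invented counts: rows = two groups, columns = feature present / absent.
p_fisher = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_fisher, 4), p_fisher < ALPHA)
```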
Data analysis methods: differences between groups were compared using the t test, χ² test, Fisher's exact test or rank-sum test, as appropriate. The median survival time and survival curves of the patients were estimated by the Kaplan-Meier method, and the treatment effect was evaluated by the log-rank test. The optimal cutoff values for continuous variables were generated by the R language survivor software package. The frequency of altered genes was plotted with the ComplexHeatmap package in R. The immune cell infiltration of the tumor microenvironment was analyzed with the xCell and CIBERSORT algorithms, and the expression of inflammatory factors and immune checkpoints in the immune microenvironment was analyzed by gene set enrichment analysis (GSEA). Cell function and pathway enrichment of key genes was analyzed by weighted gene co-expression network analysis (WGCNA), GO gene function analysis and KEGG pathway enrichment analysis. Predictors were analyzed using univariate Cox regression. The predictive efficacy of the Nomogram model was assessed graphically with a calibration curve and quantitatively with the concordance index (C-index).
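The Kaplan-Meier estimate mentioned above can be illustrated with a minimal, dependency-free Python sketch. The function names and the follow-up times are hypothetical; in practice an established survival analysis package would normally be used, and this sketch only shows the product-limit calculation and how a median survival time is read from the curve.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    index = 0
    while index < len(data):
        t = data[index][0]
        tied = [e for tt, e in data if tt == t]  # everyone leaving at time t
        deaths = sum(tied)
        if deaths > 0:
            survival *= 1 - deaths / at_risk  # product-limit step
            curve.append((t, survival))
        at_risk -= len(tied)
        index += len(tied)
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Invented follow-up data (months), for illustration only.
km_times = [5, 8, 8, 12, 15, 20]
km_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(km_times, km_events)
km_median = median_survival(km_curve)
print(km_curve, km_median)
```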
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a gene four-type classification system for papillary thyroid carcinoma patients, which was established through detailed exploration and analysis of the clinical pathological manifestations, tumor immune cell infiltration, proinflammatory factor expression, immune checkpoint expression, lymph node metastasis risk and prognosis of PTC patients of different genotypes. The different lymph node metastasis risks can help guide the choice of surgical approach for PTC patients of different genotypes, while their tumor immune cell infiltration, proinflammatory factor expression and immune checkpoint expression help in formulating drug treatment strategies. The PTC new gene four-type classification system is of innovative and scientific guiding significance for the clinical diagnosis of PTC patients and the formulation of treatment strategies.
Drawings
FIG. 1 is a diagram of the basic information of the enrolled cohorts;
FIG. 2 is the gene mutation spectrum of patients in the SYSMH-PTC cohort;
FIG. 3 is a diagram showing the distribution of the PTC new gene four-type classification;
FIG. 4 is a diagram showing the differences in clinical pathological manifestations among the PTC new gene four types;
FIG. 5 is a heat map of cell differences in the tumor microenvironment among the PTC new gene four types;
FIG. 6 is a diagram of cell differences in the tumor microenvironment among the PTC new gene four types (using the xCell algorithm, * p<0.05, ** p<0.01, *** p<0.001);
FIG. 7 is a diagram of cell differences in the tumor microenvironment among the PTC new gene four types (using the CIBERSORT algorithm, * p<0.05, ** p<0.01, *** p<0.001);
FIG. 8 is a diagram showing the differences in proinflammatory factor expression among the PTC new gene four types;
FIG. 9 is a diagram showing immune checkpoint expression among the PTC new gene four types;
FIG. 10 shows correlation analysis graphs of the PTC new gene four-type classification and lymph node metastasis (FIG. 10-a is a graph showing the correlation between different genotypes and lymph node metastasis; FIG. 10-b is a graph showing the difference in lymph node metastasis risk between BRAF mutant and BRAF wild type; FIG. 10-c is a graph showing the difference in lymph node metastasis risk between RAS mutant and RAS wild type; FIG. 10-d is a graph showing the difference in lymph node metastasis risk between RET fusion and RET wild type, ** p<0.01);
FIG. 11 is a diagram showing the correlation between lymph node metastasis and DFS in PTC patients of different genotypes (FIG. 11-A is a diagram showing the correlation between lymph node metastasis and DFS in all PTC patients; FIG. 11-B is a diagram showing the correlation between lymph node metastasis and DFS in BRAF mutant PTC patients; FIG. 11-C is a diagram showing the correlation between lymph node metastasis and DFS in RAS mutant PTC patients; FIG. 11-D is a diagram showing the correlation between lymph node metastasis and DFS in RET fusion PTC patients);
FIG. 12 is a diagram of the differential gene co-expression network analysis for the different PTC genotypes;
FIG. 13 is a diagram of the differential gene modules between BRAF mutant and RAS mutant patients.
Detailed Description
To better illustrate the objects, aspects and advantages of the present invention, the present invention will be further described with reference to the accompanying drawings and specific embodiments.
In the following examples, the experimental methods used are all conventional methods unless otherwise specified, and the materials, reagents and the like used are commercially available unless otherwise specified.
Example 1
The present invention collected SYSMH-PTC cohort data (including clinical pathological information and tumor tissue gene sequencing data) from 252 PTC patients, and collected multi-omics information (including clinical pathological information, tumor tissue gene sequencing data and prognosis information) for 499 cases in the TCGA-PTC cohort (see FIG. 1).
Analysis of the 18-gene tumor sequencing data of the SYSMH-PTC cohort showed that PTC patients can be classified, on the basis of their representative mutant genes (mutually exclusive relationship among driver genes), into four gene types: BRAF mutation, RET fusion, RAS mutation and other gene mutation types (see FIG. 2). This four-type classification of BRAF mutation, RET fusion, RAS mutation and other gene mutation types was verified in the TCGA-PTC cohort (see FIG. 3).
Comparative analysis of the clinical pathological manifestations (including age, maximum lesion diameter, number of central lymph nodes, sex, stage, tumor TPO, capsular invasion, vascular tumor thrombus, T stage, N stage, tumor CK19, galectin-3, lymph node CK19, lymph node TPO, CD56, Ki-67 or TTF-1) of PTC patients of different genotypes showed that the clinical pathological manifestations differ among patients with different gene mutation status in the PTC new genotyping, that is, the PTC new gene four-type classification system can clearly distinguish the clinical pathological manifestations of PTC patients. Age, number of central lymph nodes and maximum lesion diameter differed significantly among the genotypes, while TPO was associated with RET and BRAF (see FIG. 4).
Comparative analysis of immune cell infiltration among PTC patients of different genotypes showed that tumor tissue immune cell infiltration differs among patients with different gene mutation status in the PTC new genotyping, that is, the PTC new gene four-type classification system can clearly distinguish the tumor tissue immune cell infiltration of PTC patients. Unsupervised hierarchical clustering based on the content of key cells of the tumor immune microenvironment (TIME) showed that RAS patients were highly similar to one another and differed clearly from the other groups in TIME (red area of the Subtype bar in the heat map); except for M2 macrophages, the other cells were present at low levels in RAS patients (see FIG. 5). The same conclusions were obtained with both the xCell algorithm and the CIBERSORT algorithm (see FIGS. 6-7).
The expression of proinflammatory factors, including IL-1, IL-2, IL-6, IL-12, IL-17, IL-18, IFN-gamma and TNF-alpha, was compared among PTC patients of different genotypes; the key factors are IL-1, IL-6 and TNF-alpha. The analysis showed that, in the PTC new genotyping, proinflammatory factor expression differs significantly among patients of different genotypes (see FIG. 8), which indicates that the PTC new gene four-type classification system can clearly distinguish the proinflammatory factor expression of PTC patients.
The expression of immune checkpoints, including PDCD1, CD274 and CTLA4, was compared among PTC patients of different genotypes. The analysis showed significant differences in immune checkpoint expression among patients of different genotypes in the PTC new genotyping (see FIG. 9). The PTC new gene four-type classification system can therefore clearly distinguish the immune checkpoint expression of PTC patients.
Key genes regulating PTC lymph node metastasis were screened on the basis of the gene sequencing data of the 252 patients in the SYSMH-PTC cohort (see FIG. 10). Among them, BRAF mutation, RAS mutation and RET fusion were significantly associated with lymph node metastasis in PTC patients (FIG. 10-a). With gene mutation/fusion as the grouping condition, lymph node metastasis differed significantly between groups, confirming that the four-type classification of BRAF mutation, RET fusion, RAS mutation and other gene mutation types is correlated with lymph node metastasis of PTC patients (FIG. 10-b, FIG. 10-c and FIG. 10-d).
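The grouping comparison described above (altered vs wild type against lymph node metastasis status) can be sketched as a χ² test on a 2×2 table. The record fields `altered` and `ln_metastasis`, the function names and the counts are invented for illustration; no continuity correction is applied, which is a simplifying assumption, and the p-value uses the closed form for one degree of freedom.

```python
from math import erfc, sqrt

def two_by_two(records):
    """Count patients as [[a, b], [c, d]]:
    rows = altered / wild type, columns = metastasis yes / no."""
    table = [[0, 0], [0, 0]]
    for rec in records:
        row = 0 if rec["altered"] else 1
        col = 0 if rec["ln_metastasis"] else 1
        table[row][col] += 1
    return table

def chi_square_2x2(table):
    """Pearson chi-square test (1 degree of freedom, no Yates correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical records, not patient data.
ln_records = (
    [{"altered": True, "ln_metastasis": True}] * 3
    + [{"altered": True, "ln_metastasis": False}] * 1
    + [{"altered": False, "ln_metastasis": True}] * 1
    + [{"altered": False, "ln_metastasis": False}] * 3
)
ln_table = two_by_two(ln_records)
chi2_value, p_chi2 = chi_square_2x2(ln_table)
print(ln_table, round(chi2_value, 3), round(p_chi2, 4))
```

With cell counts this small, Fisher's exact test would usually be preferred over the χ² approximation; the toy table is kept small only so that the arithmetic is easy to follow.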
The association between lymph node metastasis and disease-free survival (DFS) was analyzed in PTC patients of different genotypes (see FIG. 11). Among all PTC patients, DFS differed significantly according to whether lymph node metastasis had occurred (FIG. 11-A). When the association was analyzed separately for the BRAF mutant, RAS mutant and RET fusion types, DFS again differed significantly according to lymph node metastasis status in BRAF mutant (FIG. 11-B), RAS mutant (FIG. 11-C) and RET fusion (FIG. 11-D) patients.
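A comparison of DFS between patients with and without lymph node metastasis of the kind described above is typically made with a two-group log-rank test, which can be sketched in plain Python as follows. The function name and the follow-up data are hypothetical, and the sketch assumes that at least one event is observed while both groups still have patients at risk.

```python
from math import erfc, sqrt

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    all_times = times_a + times_b
    all_events = events_a + events_b
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Invented DFS times (months); 1 = recurrence observed, 0 = censored.
dfs_with_ln = [1, 2, 3]
dfs_without_ln = [4, 5, 6]
logrank_chi2, logrank_p = logrank_test(
    dfs_with_ln, [1, 1, 1], dfs_without_ln, [1, 1, 1]
)
print(round(logrank_chi2, 3), logrank_p < 0.05)
```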
Furthermore, for each of the comparisons RAS mutant vs BRAF mutant, RAS mutant vs RET fusion and BRAF mutant vs RET fusion, the top 1000 ranked differential genes were selected, and the union of these genes was used for WGCNA differential gene co-expression network analysis (see FIG. 12). The differences among the genotypes in pathway expression modules are mainly reflected in BRAF mutant vs RAS mutant patients, in which the 4 gene sets black, brown, turquoise and gray are highly related to RAS mutant/BRAF mutant samples; the turquoise gene set shows low expression in RAS mutant patients and high expression in BRAF mutant patients (see FIG. 13). Enrichment analysis was performed on each of the 4 gene sets, and 2 key gene sets affecting immunity in RAS mutant patients were identified: Type II interferon signaling (IFNG) and Cancer immunotherapy by CTLA4 blockade.
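The gene selection step that precedes the co-expression network analysis can be sketched as follows. The ranking criterion (absolute log fold change), the function names and the toy values are assumptions for illustration; the source does not state how the differential genes were ranked, and the network analysis itself is not reproduced here.

```python
def top_genes(diff_table, n=1000):
    """Return the n genes with the largest absolute score.

    diff_table maps gene -> log fold change (hypothetical ranking criterion).
    """
    ranked = sorted(diff_table, key=lambda gene: abs(diff_table[gene]), reverse=True)
    return set(ranked[:n])

def union_of_top_genes(comparisons, n=1000):
    """Union of the top-n differential genes over all pairwise comparisons."""
    selected = set()
    for table in comparisons.values():
        selected |= top_genes(table, n)
    return selected

# Toy differential-expression tables, invented for illustration.
comparisons = {
    "RAS_vs_BRAF": {"G1": 2.5, "G2": -3.0, "G3": 0.1},
    "RAS_vs_RET": {"G2": 1.0, "G4": -2.0, "G5": 0.2},
    "BRAF_vs_RET": {"G1": -0.3, "G5": 1.8, "G6": 0.9},
}
network_genes = union_of_top_genes(comparisons, n=2)  # n=1000 in the text
print(sorted(network_genes))
```

Because the same gene can rank highly in more than one comparison, the union generally contains fewer than three times n genes.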
In conclusion, the gene four-type classification system for papillary thyroid carcinoma patients was established through detailed exploration and analysis of the clinical pathological manifestations, tumor immune cell infiltration, proinflammatory factor expression, immune checkpoint expression, lymph node metastasis risk and prognosis of PTC patients of different genotypes. The different lymph node metastasis risks can help guide the choice of surgical approach for PTC patients of different genotypes, while their tumor immune cell infiltration, proinflammatory factor expression and immune checkpoint expression help in formulating drug treatment strategies. The PTC new gene four-type classification is of innovative and scientific guiding significance for the clinical diagnosis of PTC patients and the formulation of treatment strategies.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention and not for limiting the protection scope of the present invention, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.