CN116334225A - Non-small cell lung cancer PD-1 immune therapy response prediction method for non-disease diagnosis or treatment
- Publication number
- CN116334225A (application CN202310326560.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/106—Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides a method, not for the purpose of disease diagnosis or treatment, for predicting the response of non-small cell lung cancer to PD-1 immunotherapy. The method has two main parts: building a stable prediction model by making full use of single-cell sequencing data from immunotherapy, and accurately acquiring transcriptome information from key peripheral blood cells by full-length transcriptome sequencing. The cell subtypes and signature genes in peripheral blood that are associated with tumor immunotherapy are identified with statistical significance; a machine-learning prediction model with high accuracy and stability is established; and the transcriptome library construction workflow is optimized, which broadens the range of application of the technique. A small amount of pre-treatment peripheral blood from a patient is sufficient to give a relatively accurate prediction of that patient's response to immunotherapy; the detection cost is low and the method is easy to popularize.
Description
Technical Field
The invention belongs to the technical field of biomedicine, and in particular relates to a method, not for the purpose of disease diagnosis or treatment, for predicting the response of non-small cell lung cancer to PD-1 immunotherapy.
Background
Currently, the companion diagnostic methods commonly used for lung cancer immunotherapy fall into two major categories: PD-L1 immunohistochemical staining based on imaging, and tumor mutational burden analysis based on high-throughput sequencing. Immune checkpoint therapies target specific protein interactions on the cell surface (e.g., PD-1/PD-L1 antibody drugs), thereby relieving the suppressed state of immune cells and enhancing anti-tumor function. Immunohistochemical staining directly measures the amount of PD-L1 protein expressed by tumor cells in a tissue sample, from which the patient's potential clinical benefit is judged.
Internationally, PD-L1 immunohistochemical detection methods have been approved by the US FDA, and existing commercial reagents include PD-L1 IHC 22C3 pharmDx (Dako), PD-L1 IHC 28-8 pharmDx (Dako), VENTANA PD-L1 SP142 (Roche), and VENTANA PD-L1 SP263 (Roche).
However, PD-L1 detection by immunohistochemistry has many problems in practice, including:
(1) The method requires a tumor tissue sample from the patient, and a relatively large one, but in practice tissue is difficult to obtain from some patients.
(2) Quantification requires counting the proportion of tumor cells that express the surface protein (tumor proportion score, TPS). Because tumor tissue is heterogeneous, PD-L1 expression at the biopsy site may not represent the microenvironment as a whole, which biases the final treatment prediction for the patient.
(3) Some patients do not respond to immunotherapy despite a positive PD-L1 test, i.e. they show primary resistance; others respond well initially but eventually develop acquired resistance and derive little benefit from treatment. These facts also make patient stratification based on immunohistochemical analysis inaccurate.
(4) Different testing institutions and hospital pathology departments choose different commercial kits, and because the kits use antibodies that recognize different regions, their quantitative results differ. This introduces uncertainty into detection, repeated testing is time-consuming, and clinical work is ultimately affected.
For the reasons above, the sensitivity and accuracy of PD-L1 detection by immunohistochemistry are variable, and PD-L1 detection can hardly serve as the sole criterion for enrolling patients in tumor immunotherapy.
Mutations in the tumor cell genome change the sequence of the encoded proteins, which become immunogenic after antigen presentation, i.e. they form tumor neoantigens. Tumor mutational burden can therefore be analyzed by high-throughput sequencing to estimate how well a patient will respond to immunotherapy. Several companies internationally offer detection kits for tumor mutational burden, including GH Omni 500 (Guardant Health), FoundationOne CDx (Foundation Medicine), and PlasmaSELECT (Personal Genome Diagnostics).
However, the analysis of tumor mutational burden (TMB) is also affected by a number of factors: (1) The range of TMB values varies with cancer type; for example, high TMB values are most common in squamous cell carcinoma of the skin, melanoma and non-small cell lung cancer, and least common in papillary thyroid carcinoma. (2) The patient's living environment and lifestyle have an effect; for example, TMB is generally high in smokers, which biases the analysis. (3) TMB detection is affected by the method and technical platform: different TMB products cover different sets of genes, sequencing approaches are divided into targeted sequencing and whole-exome sequencing, and results for tissue TMB and plasma TMB also differ. (4) Threshold selection is difficult. For example, in the KEYNOTE-158 study, 102 patients (13%) were defined as TMB-high (≥10 mut/Mb); in the B-F1RST (NCT02848651) study the TMB threshold was adjusted (≥14.5 mut/Mb), but the difference in progression-free survival (PFS) between the two patient groups was not statistically significant. (5) Even at the same threshold, the statistical results for TMB contradict one another. For example, in the NEPTUNE study of durvalumab in combination with tremelimumab, TMB-H (≥20 mut/Mb) was not associated with clinical benefit, whereas patients with TMB ≥20 mut/Mb in the MYSTIC (NCT02453282) study had longer overall survival (OS) and progression-free survival (PFS). These problems, and the discrepancies in the clinical data, make TMB difficult to use as an accurate predictor.
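The threshold problem in point (4) can be made concrete with a small sketch. It assumes TMB is computed as somatic mutations per megabase of sequenced region; the patient values and the helper names are hypothetical, and only the three cut-offs (10, 14.5 and 20 mut/Mb) come from the studies cited above.

```python
def tmb_per_mb(mutation_count, region_size_mb):
    """Tumor mutational burden as mutations per megabase (assumed definition)."""
    if region_size_mb <= 0:
        raise ValueError("region_size_mb must be positive")
    return mutation_count / region_size_mb


def is_tmb_high(tmb, threshold):
    """A sample is TMB-high when it meets or exceeds the chosen threshold."""
    return tmb >= threshold


# Hypothetical patient: 30 somatic mutations over a 2.0 Mb sequenced region.
tmb = tmb_per_mb(30, 2.0)

# Cut-offs quoted in the text: KEYNOTE-158 (10), B-F1RST (14.5), NEPTUNE/MYSTIC (20).
calls = {threshold: is_tmb_high(tmb, threshold) for threshold in (10, 14.5, 20)}
```

The same sample is called TMB-high under two of the cut-offs and TMB-low under the third, which is the kind of inconsistency the paragraph describes.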
Given these complications in clinical practice, new detection markers and companion diagnostic strategies urgently need to be identified and developed, and it is therefore important to provide a more accurate companion diagnostic scheme.
Disclosure of Invention
In view of the shortcomings of the prior art, the object of the invention is to provide a method, not for the purpose of disease diagnosis or treatment, for predicting the response of non-small cell lung cancer to PD-1 immunotherapy. The method is intended to address the problems of accuracy, stability and convenience in companion diagnostics for immunotherapy.
To achieve this object, the invention adopts the following technical solutions:
In a first aspect, the invention provides a method, not for the purpose of disease diagnosis or treatment, for predicting the response of non-small cell lung cancer to PD-1 immunotherapy, said method comprising:
(1) Collecting single-cell transcriptome data from tumor tissue, peripheral blood and lymph nodes of cancer patients, and filtering and standardizing the data; analyzing differences in tissue distribution, expression profile, cell proliferation, migration capacity and clonal expansion among different groups of T cells to identify a tumor-responsive T cell subpopulation, which includes terminally differentiated exhausted T cells and regulatory T cells expressing TNFRSF9; finding T cells of the same type in peripheral blood by analyzing the TCR sequences of the tumor-responsive T cell subpopulation, and designating them peripheral blood tumor-responsive T cells;
(2) Collecting publicly available single-cell transcriptome data sets and clinical treatment information from patients with melanoma, triple-negative breast cancer and non-small cell lung cancer, and carrying out data filtering, standardization and annotation;
(3) Stratifying the single-cell transcriptome data obtained in step (2), randomly dividing it into a training set and a test set, and using the clinical response information as the label of the single-cell transcriptome data;
(4) Training a logistic regression model on the single-cell transcriptome data of the training set: fitting a parameter vector to the gene expression of the sample cells and the corresponding clinical response information to construct the logistic regression model;
(5) Validating the model with the test set data, and summing and averaging the response probabilities predicted by the model to calculate the corresponding clinical response probability;
(6) Collecting a peripheral blood sample from a non-small cell lung cancer patient receiving PD-1 immunotherapy, sorting out the peripheral blood tumor-responsive T cells, performing high-throughput sequencing to obtain transcriptome information, filtering and standardizing the transcriptome data, inputting it into the logistic regression model, summing and averaging the response probabilities predicted by the model to calculate the corresponding clinical response probability, and predicting the clinical response from that probability.
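Step (4) can be illustrated with a minimal sketch, assuming plain batch gradient descent on the log-loss. The source does not state which solver or regularization is used, so the function names, learning rate, epoch count and toy expression values below are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability of clinical response for one cell's expression vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights w and bias b by batch gradient descent (assumed solver)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi  # gradient of log-loss w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data: two hypothetical genes per cell; label 1 = responder, 0 = non-responder.
X = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.9], [0.1, 2.2]]
y = [1, 1, 0, 0]
w, b = fit_logistic(X, y)
probs = [predict_proba(x, w, b) for x in X]
```

In practice each row would be the filtered, standardized expression vector of one cell, and the label would be the clinical response of the patient the cell came from.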
In the invention, in step (1), the gene expression characteristics of the peripheral blood tumor-responsive T cells are similar to those of terminally differentiated memory T cells and are closely related to immunotherapy. The terminally differentiated memory T cells are designated Temra (terminally differentiated effector memory or effector T cells).
Preferably, in step (1), the signature genes identified for the tumor-responsive T cell subpopulation comprise: CX3CR1, GZMB, GZMH, KLRD1, NKG7, GNLY, and FGFBP2.
Preferably, in step (3), the training set comprises 80% of the single-cell transcriptome data and the test set comprises the remaining 20%.
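A stratified 80/20 split of this kind might look like the following sketch. It assumes stratification is by clinical response label, which the source does not state explicitly; the function name, seed and toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each label group separately
    so both sets keep the same responder/non-responder proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        members = by_label[label]
        rng.shuffle(members)  # random assignment within the stratum
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 50 cells from responders (1) and 50 from non-responders (0).
labels = [1] * 50 + [0] * 50
train_idx, test_idx = stratified_split(labels)
```

Splitting cell by cell, as here, is the simplest reading; splitting by patient instead would keep cells from one patient out of both sets at once, and the source does not say which was done.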
Preferably, in step (4), the logistic regression model is represented by the following formula:
wherein $X$ is the gene expression, $W^{T}$ is the parameter vector, and $W_{0}$ is the bias parameter.
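The formula itself is not reproduced in this text. A standard logistic regression form that is consistent with the symbols defined here, offered as an assumption rather than as the formula of the filing, would be:

```latex
z = W^{T} X + W_{0}, \qquad
P(y = 1 \mid X) = \frac{1}{1 + e^{-z}}
```

Here $z$ is notation introduced only for this sketch: it is the linear score formed from the gene expression vector, and the logistic function maps it to a probability between 0 and 1.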
Preferably, in step (5), the method for calculating the response probability is as follows:
wherein $P$ is the probability of the corresponding clinical response, $x$ is the feature value, $y$ is the clinical response, and $\theta^{T}$ is the parameter vector.
Preferably, in step (5), the method for calculating the clinical response probability is as follows:
where y=1 denotes clinical response, y=0 denotes clinical non-response, n is the number of cells, and response is the predicted value: response > 0 indicates clinical response, response < 0 indicates clinical non-response, and response = 0 is indeterminate and requires re-testing.
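Because the formula is not reproduced in this text, the following is only one reading that matches the stated sign convention: the per-cell difference between the predicted probabilities of y=1 and y=0, averaged over the n cells. The function names, the parameter vector and the toy cells are hypothetical.

```python
import math


def response_probability(x, theta):
    """P(y=1 | x; theta) for one cell, treating theta[0] as the bias term."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def response_score(cells, theta):
    """Mean over cells of P(y=1) - P(y=0) (assumed form of `response`)."""
    if not cells:
        raise ValueError("at least one cell is required")
    # P(y=0) = 1 - P(y=1), so the per-cell difference is 2*P(y=1) - 1.
    return sum(2.0 * response_probability(x, theta) - 1.0 for x in cells) / len(cells)


def call_response(score):
    """Apply the sign rule stated in the text."""
    if score > 0:
        return "clinical response"
    if score < 0:
        return "clinical non-response"
    return "indeterminate, retest"


theta = [0.0, 1.0, -1.0]                       # hypothetical fitted parameters
cells = [[2.0, 0.5], [1.5, 0.2], [0.3, 0.4]]   # hypothetical expression values
score = response_score(cells, theta)
call = call_response(score)
```

Under this reading, a score of exactly zero means the cells are, on average, equally likely to belong to either class, which is why the text asks for a repeat test.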
Preferably, in step (6), the transcriptome information is obtained by high-throughput sequencing using a method comprising:
lysing the cells and performing reverse transcription, amplifying and fragmenting the cDNA, amplifying and purifying the library after fragmentation, and performing high-throughput sequencing after purification to obtain the transcriptome information.
Preferably, the cDNA is fragmented using a Tn5 cleavage system in which the final concentration of Tn5 enzyme is 0.001-0.01 μM, for example 0.001 μM, 0.005 μM or 0.01 μM, preferably 0.005 μM.
Preferably, the Tn5 cleavage system also contains 5%-20% dimethylformamide, for example 5%, 10%, 15% or 20%, preferably 10%.
Preferably, the pH of the Tn5 cleavage system is 7.0-8.5, for example 7.0, 7.5, 8.0 or 8.5, preferably 7.3.
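As a worked example of the Tn5 cleavage conditions, the sketch below checks a reaction set-up against the stated ranges and computes stock volumes by the usual dilution relation C1·V1 = C2·V2. The ranges and preferred values are taken from the text; the stock concentrations and the 20 µL reaction volume are assumptions chosen only for illustration.

```python
# Ranges and preferred values as stated in the text.
RANGES = {
    "tn5_uM": (0.001, 0.01),
    "dmf_percent": (5.0, 20.0),
    "ph": (7.0, 8.5),
}
PREFERRED = {"tn5_uM": 0.005, "dmf_percent": 10.0, "ph": 7.3}


def in_range(name, value):
    """True when a parameter lies inside its stated range (inclusive)."""
    low, high = RANGES[name]
    return low <= value <= high


def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock needed for a target final concentration (C1*V1 = C2*V2)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * reaction_volume_ul / stock_conc


# Hypothetical set-up: 20 uL reaction, 0.1 uM Tn5 stock, neat (100%) DMF.
reaction_ul = 20.0
tn5_ul = stock_volume_ul(PREFERRED["tn5_uM"], 0.1, reaction_ul)
dmf_ul = stock_volume_ul(PREFERRED["dmf_percent"], 100.0, reaction_ul)
all_ok = all(in_range(name, value) for name, value in PREFERRED.items())
```

With these assumed stocks, the preferred conditions correspond to about 1 µL of Tn5 stock and 2 µL of dimethylformamide per 20 µL reaction.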
Preferably, the library amplification uses an amplification system to which 0.01%-0.012% Tween-20 is added (for example 0.01%, 0.011% or 0.012%).
Preferably, the purification uses the following strategy: 0.7× (retain beads) + 0.6× (retain supernatant) + 0.15× (retain beads) + 0.6× (retain beads).
In a second aspect, the present invention provides a non-small cell lung cancer PD-1 immunotherapeutic response prediction system based on peripheral blood detection, the system comprising:
a tumor-responsive T cell subpopulation screening module for screening a tumor-responsive T cell subpopulation from single cell transcriptome data of tumor tissue, peripheral blood, and lymph nodes of a cancer patient;
a data acquisition module for acquiring publicly available single-cell transcriptome data sets and clinical treatment information from patients with melanoma, triple-negative breast cancer and non-small cell lung cancer;
a data partitioning module for stratifying the acquired single-cell transcriptome data, randomly dividing it into a training set and a test set, and using the clinical response information as the label of the single-cell transcriptome data;
a logistic regression model construction module for fitting a parameter vector to the gene expression of the sample cells and the corresponding clinical response information to construct a logistic regression model;
a logistic regression model validation module for validating the resulting logistic regression model, and for summing and averaging the response probabilities predicted by the model to calculate the clinical response probability;
a clinical response prediction module for filtering and standardizing the transcriptome information of the peripheral blood tumor-responsive T cells of the subject to be predicted, inputting it into the logistic regression model, and summing and averaging the response probabilities predicted by the model to calculate the clinical response probability, so as to predict the clinical response from that probability.
Preferably, in the tumor-responsive T cell subset screening module, the tumor-responsive T cell subset is screened as follows: single-cell transcriptome data of tumor tissue, peripheral blood, and lymph nodes of cancer patients are collected, filtered, and normalized; differences in tissue distribution, expression profile, cell proliferation, migration capacity, and clonal expansion among T cell populations are analyzed to identify tumor-responsive T cell subsets, including terminally differentiated exhausted T cells and regulatory T cells expressing TNFRSF9; T cells of the same type are then found in peripheral blood by analyzing the TCR sequences of the tumor-responsive T cell subsets and are designated peripheral blood tumor-responsive T cells.
In the invention, the gene expression features of the selected peripheral blood tumor-responsive T cells resemble those of terminally differentiated memory T cells and are closely associated with immunotherapy. The terminally differentiated memory T cells are designated Temra (terminally differentiated effector memory T cells).
Preferably, the characteristic genes identified for the tumor-responsive T cell subset include: CX3CR1, GZMB, GZMH, KLRD1, NKG7, GNLY, and FGFBP2.
Preferably, the data acquisition module further performs data filtering, normalization, and annotation after acquiring the data.
Preferably, in the data partitioning module, the training set comprises 80% of the single-cell transcriptome data and the test set comprises 20% of the single-cell transcriptome data.
Preferably, the logistic regression model is represented by the following formula:
where X is the gene expression, W^T is the parameter vector, and W_0 is the bias parameter.
Preferably, the response probability is calculated as follows:
where P is the probability of the corresponding clinical response status, x is the feature value, y is the clinical response status, and θ^T is the parameter vector.
Preferably, the clinical response probability is calculated as follows:
where y=1 indicates clinical response, y=0 indicates clinical non-response, n is the number of cells, and response is the predicted value: response > 0 indicates clinical response, response < 0 indicates clinical non-response, and response = 0 is indeterminate and requires re-testing.
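The three formulas referred to above are not reproduced in this text. A hedged sketch of what they could look like, assuming the standard logistic regression form and using only the symbols defined above, is given below. The third expression in particular is an assumption: the difference P(y=1) - P(y=0) is chosen only because it makes the stated sign rule (response > 0, < 0, = 0) meaningful, and the source does not confirm it.

```latex
% Formula 1 (assumed form): linear predictor of the logistic regression model
z = W^{T} X + W_{0}

% Formula 2 (assumed form): per-cell response probability
P(y = 1 \mid x; \theta) = \frac{1}{1 + e^{-\theta^{T} x}}, \qquad
P(y = 0 \mid x; \theta) = 1 - P(y = 1 \mid x; \theta)

% Formula 3 (assumed form): patient-level score averaged over n cells
\mathrm{response} = \frac{1}{n} \sum_{i=1}^{n}
  \bigl[ P(y = 1 \mid x_{i}; \theta) - P(y = 0 \mid x_{i}; \theta) \bigr]
```

Under this reading, response lies between -1 and 1, and response = 0 corresponds to an average per-cell probability of exactly 0.5.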
In a third aspect, the present invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method of the first aspect when the computer program is executed.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of the first aspect.
The numerical ranges recited herein include not only the point values listed above but also any unlisted point values within those ranges; for reasons of length and brevity, the invention does not exhaustively list the specific point values that each range includes.
Compared with the prior art, the invention has the following beneficial effects:
(1) Based on large-cohort single-cell omics data, the invention identifies cell subtypes and characteristic genes in peripheral blood that are associated with tumor immunotherapy; the findings are scientifically grounded and statistically significant.
(2) The invention establishes a prediction model based on machine learning; the accuracy and stability of treatment response prediction can be improved by continuously accumulating data and iterating the model.
(3) The experimental method of the invention is based on the SMART-seq full-length transcriptome amplification technique, which detects more genes and therefore gives higher detection resolution.
(4) The invention optimizes the transcriptome library construction workflow so that experiments succeed with an input of 1-1000 cells, which broadens the range of application.
(5) The detection method uses conventional transcriptome sequencing (bulk RNA-seq) rather than single-cell transcriptome sequencing, which greatly reduces experimental workload and cost and favors clinical practice and adoption.
(6) The invention collects pre-treatment peripheral blood samples from immunotherapy patients; no more than 4 milliliters (mL) is needed for detection, sampling is convenient and fast, and harm to the patient is small, which favors clinical practice and adoption.
Drawings
FIG. 1 is a flow chart of machine learning prediction model construction based on single-cell data and of companion diagnosis for lung cancer immunotherapy;
FIG. 2 is transcriptome library information at different Tn5 enzyme concentrations;
FIG. 3 is transcriptome library information at different buffer pH values;
FIG. 4 shows the purification results before and after adjustment of the magnetic bead screening strategy;
FIG. 5 is the results of construction of full length transcriptome library at a cell starting amount of 10;
FIG. 6 is the results of construction of full length transcriptome libraries at a cell starting amount of 100;
FIG. 7 is the results of construction of a full length transcriptome library at a cell starting amount of 200;
FIG. 8 is the results of construction of a full length transcriptome library at a cell starting amount of 300;
FIG. 9 is the results of construction of full length transcriptome libraries at a cell initiation amount of 1000;
FIG. 10 shows the peripheral blood-based prediction results of immunotherapy response in lung cancer patients.
Detailed Description
The technical scheme of the invention is further described by the following specific embodiments. It will be apparent to those skilled in the art that the examples are merely to aid in understanding the invention and are not to be construed as a specific limitation thereof.
Where specific techniques or conditions are not indicated in the examples, they follow those described in the literature in this field or the product specifications. Reagents or instruments for which no manufacturer is indicated are conventional products commercially available through regular channels.
The materials and solution formulation methods used in the following embodiments are as follows:
Example 1
To predict drug response in non-small cell lung cancer patients receiving PD-1 immunotherapy, this example builds a prediction model from immunotherapy transcriptome data and develops a peripheral blood sequencing technique based on the SMART-seq full-length transcriptome.
1. To improve the accuracy of the model-building stage, single-cell data were collected and curated:
(1) Using unified filtering criteria and data normalization, single-cell transcriptome data from tumor tissue, peripheral blood, and lymph nodes of more than 300 patients across 21 cancer types were integrated, filtered, and normalized. Differences in tissue distribution, expression profile, cell proliferation, migration capacity, and clonal expansion among T cell populations were analyzed to identify tumor-responsive T cell subsets, including terminally differentiated exhausted T cells and regulatory T cells expressing TNFRSF9. T cells of the same type were then found in peripheral blood by analyzing the TCR sequences of the tumor-responsive T cell subsets and were designated peripheral blood tumor-responsive T cells.
The gene expression features of the peripheral blood tumor-responsive T cells resemble those of terminally differentiated memory T cells and are closely associated with immunotherapy.
(2) Drawing on the latest single-cell immunotherapy studies, publicly available single-cell datasets and clinical treatment information were collected for an additional 38 cancer patients (melanoma, triple-negative breast cancer, and non-small cell lung cancer); after data filtering, normalization, and annotation, the analysis focused on the T cell populations.
(3) The single-cell data were stratified by patient ID and randomly divided at a ratio of 4:1 into a training set (80% of the single-cell data) and a test set (20% of the single-cell data), with the clinical response information used as the label of the single-cell data.
(4) A logistic regression model was trained with default parameters on the training-set single-cell data using Python 3.9. The parameter vector was first fitted from the gene expression of the sample cells and the corresponding clinical response information; the logistic regression model is shown in Formula 1:
where X is the gene expression, W^T is the parameter vector, and W_0 is the bias parameter.
(5) The trained model was validated on the test-set data. The single-cell response probabilities predicted by the model were summed and averaged to give the clinical response probability of the corresponding patient, from which the patient's clinical response was predicted, as shown in Formula 2 and Formula 3:
In Formula 2, P is the probability of the corresponding clinical response status, x is the feature value, y is the clinical response status, and θ^T is the parameter vector.
In Formula 3, y=1 indicates clinical response, y=0 indicates clinical non-response, n is the number of cells, and response is the predicted value: response > 0 indicates clinical response, response < 0 indicates clinical non-response, and response = 0 is indeterminate and requires re-testing.
(6) This model construction strategy was validated on single-cell immunotherapy data from melanoma, triple-negative breast cancer, and non-small cell lung cancer; it effectively predicts patients' clinical response, with a prediction accuracy above 90%.
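The patient-stratified 4:1 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the applicant's code: the tuple layout `(patient_id, features, label)`, the function name `stratified_split`, and the toy data are all hypothetical, and "stratified" is read here as keeping roughly 80% of each patient's cells in the training set.

```python
import random
from collections import defaultdict


def stratified_split(cells, train_fraction=0.8, seed=0):
    """Split cells into train/test sets, stratified by patient ID.

    `cells` is a list of (patient_id, features, label) tuples; this layout
    is a hypothetical choice made for the sketch.
    """
    rng = random.Random(seed)

    # Group the cells of each patient together.
    by_patient = defaultdict(list)
    for cell in cells:
        by_patient[cell[0]].append(cell)

    train, test = [], []
    for patient_id in sorted(by_patient):
        group = by_patient[patient_id][:]
        rng.shuffle(group)  # random assignment within each patient
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Toy data: 3 patients with 10 cells each; label 1 = response, 0 = non-response.
cells = [(f"P{p}", [float(i)], p % 2) for p in range(3) for i in range(10)]
train, test = stratified_split(cells)
print(len(train), len(test))  # 24 6
```

Splitting within each patient keeps every patient represented in both sets; if whole patients were held out instead, the grouping step would stay the same and only the assignment loop would change.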
FIG. 1 shows a flow chart of machine learning prediction model construction based on single-cell data and of companion diagnosis for lung cancer immunotherapy: the left panel is the prediction model construction workflow, and the right panel is the clinical sample processing and treatment response prediction workflow.
2. Development of the peripheral blood sequencing technique:
(1) Peripheral blood samples were collected from non-small cell lung cancer patients receiving PD-1 immunotherapy, and PBMCs were isolated from them (required sample amount: no more than 4 mL of fresh blood, or 10^6 cryopreserved PBMCs).
(2) CD3+ T cells in the PBMCs were labeled with fluorescent antibodies, and specific T cell subsets were obtained by flow cytometric sorting (namely CD3+CX3CR1+ peripheral blood tumor-responsive T cells and, as background, CD3+CX3CR1- peripheral blood T cells that are not tumor-responsive). To ensure stable experimental results, three tubes were collected for each cell subpopulation as technical replicates.
(3) According to the literature, the prototype SMART-seq full-length transcriptome technique is mainly used to construct next-generation sequencing libraries from single cells. In this example the reagents and steps were adjusted so that it is suitable for cell processing and library construction with 1 to 10^3 cells. The process comprises three main steps: cell lysis and reverse transcription; cDNA amplification and fragmentation; and library preparation and purification.
(4) Cell lysis and reverse transcription follow routine experimental procedures. To improve the subsequent cDNA fragmentation and library preparation, conditions were optimized at the key steps, including the Tn5 cleavage system, the reaction buffer, and the library purification strategy.
(5) To improve experimental efficiency, different final reaction concentrations of the Tn5 enzyme complex were tested (0.001 μM to 0.01 μM). FIG. 2 shows the transcriptome library information at the different Tn5 enzyme concentrations, and Table 1 summarizes the corresponding statistics. The results show that library concentration and total amount were highest at a Tn5 concentration of 0.005 μM, which was therefore selected as the optimal condition. Compared with a commercial kit, the proportion of small fragments in the library is reduced, library quality is improved, and experimental cost is markedly reduced.
TABLE 1
Sample | Tn5 enzyme concentration | Library concentration | Total library amount |
---|---|---|---|
1 | 0.01 μM | 8.76 ng/μL | 87.6 ng |
2 | 0.0075 μM | 10.64 ng/μL | 106.4 ng |
3 | 0.005 μM | 12.40 ng/μL | 124.0 ng |
4 | 0.004 μM | 10.50 ng/μL | 105.0 ng |
5 | 0.003 μM | 6.62 ng/μL | 66.2 ng |
6 | 0.001 μM | 1.28 ng/μL | 12.8 ng |
(6) To improve the quality and stability of the experimental results, the pH of the reaction buffer (7.0-8.5) and the concentration of dimethylformamide (DMF; 0%-20%) were adjusted. The optimal combination was determined to be pH 7.3 with 10% DMF, which gave the best cDNA fragmentation, a higher library concentration, and reliable quality. FIG. 3 shows the transcriptome library information at the different buffer pH values, and Table 2 summarizes the corresponding statistics; library concentration and total amount were highest at pH 7.3. In addition, 0.01% Tween-20 was added in the library amplification step after fragmentation; this neutralizes the effect of SDS and improves the efficiency of the amplification enzyme.
TABLE 2
Sample | pH | Library concentration | Total library amount |
---|---|---|---|
1 | 7.2 | 7.86 ng/μL | 78.6 ng |
2 | 7.3 | 10.30 ng/μL | 103.0 ng |
3 | 8.4 | 4.68 ng/μL | 46.8 ng |
(7) To improve sequencing data quality, the magnetic bead size-selection step in library purification was changed to the strategy 0.7X (retain beads) + 0.6X (retain supernatant) + 0.15X (retain beads) + 0.6X (retain beads). This scheme effectively removes small fragments, giving a more concentrated library size distribution and better data quality. FIG. 4 shows the purification results before and after the adjustment: adding one extra purification step effectively reduces the proportion of small fragments and improves library quality. The left panel shows the original literature method, 1X (retain beads) + 0.6X (retain supernatant) + 0.15X (retain beads), which consumes more beads and does not fully remove small fragments; the right panel shows the result after the adjustment, with markedly fewer small fragments around 200 bp.
(8) Transcriptome information for the corresponding cells was then obtained by high-throughput sequencing on an Illumina NovaSeq 6000 (PE150).
(9) The transcriptome data were filtered and normalized, then input into the prediction model to obtain the treatment response probability value for the corresponding patient.
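The final normalize-and-score step can be sketched as below, using only the Python standard library. Several things here are assumptions rather than details from the source: the normalization (counts per million followed by log1p), the weights and bias (hypothetical placeholders, not fitted values), the function names, and the reading of the patient-level score as the mean of P(y=1) - P(y=0), which is one interpretation consistent with the stated rule that a score above 0 means response and below 0 means non-response.

```python
import math


def log_cpm(counts):
    """Counts-per-million followed by log1p (an assumed normalization)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [math.log1p(c / total * 1e6) for c in counts]


def response_probability(x, weights, bias):
    """Logistic model: P(y=1 | x) = 1 / (1 + exp(-(w.x + b)))."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def response_score(samples, weights, bias):
    """Mean of P(y=1) - P(y=0) over all samples of one patient."""
    diffs = []
    for counts in samples:
        p = response_probability(log_cpm(counts), weights, bias)
        diffs.append(p - (1.0 - p))
    return sum(diffs) / len(diffs)


def classify(score):
    """Apply the sign rule: > 0 response, < 0 non-response, 0 re-test."""
    if score > 0:
        return "response"
    if score < 0:
        return "non-response"
    return "indeterminate: re-test"


# Hypothetical example: three technical replicates, three marker genes
# (e.g. CX3CR1, GZMB, NKG7); the weights and bias are made up for illustration.
replicates = [[120, 300, 80], [110, 320, 90], [130, 280, 70]]
weights, bias = [0.02, 0.03, -0.01], -0.4
score = response_score(replicates, weights, bias)
print(classify(score))
```

In practice the weights would come from the fitted logistic regression model, and the same normalization would have to be applied to the training data and to new samples.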
In summary, the invention makes full use of immunotherapy single-cell sequencing data to establish a robust prediction model and accurately obtains the transcriptome information of key peripheral blood cells through full-length transcriptome sequencing:
(1) The invention rests on a solid theoretical basis and abundant data: it integrates single-cell transcriptome data from more than 300 patients across 21 cancer types and identifies the cell type and gene expression features in peripheral blood that are associated with immunotherapy response, namely peripheral blood tumor-responsive T cells, whose characteristic genes include CX3CR1, GZMB, GZMH, KLRD1, NKG7, GNLY, and FGFBP2;
(2) The invention collects and integrates immunotherapy single-cell datasets from an additional 38 cancer patients and, focusing on the CD8+ T lymphocytes therein, establishes a robust machine learning prediction model through logistic regression; subsequent cross-validation with the test-set data gives an overall prediction accuracy above 90%;
(3) The experimental method requires only 2-4 milliliters (mL) of peripheral blood or 10^6 cryopreserved PBMCs; the specific peripheral blood tumor-responsive T cells are enriched by antibody labeling and flow sorting, and the required sample amount is lower than that for tumor mutation burden testing or other diagnostic methods;
(4) Unlike existing single-cell methods, the invention improves the SMART-seq full-length transcriptome library construction workflow so that the input can range from 1 to 1000 cells, with an overall experimental success rate of at least 95%. The results are shown in FIG. 5, FIG. 6, FIG. 7, FIG. 8, and FIG. 9, in which specific T cell populations were flow-sorted for subsequent transcriptome library construction, and fragment analysis of the libraries shows a clear signal peak in the 400-600 bp range. The SMART-seq method thus detects more genes and improves data accuracy, while basing the transcriptome data on multiple cells ensures stable results;
(5) The sequencing library construction process uses an optimized enzymatic fragmentation system (comprising the enzyme complex and the reaction buffer system), which markedly improves library quality and markedly reduces experimental cost;
(6) Pre-treatment blood samples have been collected from more than 50 non-small cell lung cancer patients receiving PD-1 immunotherapy, of whom 16 currently have paired clinical follow-up information. Tested on these patients, the method reaches an overall prediction accuracy above 80%; the results are shown in FIG. 10. In FIG. 10, R denotes clinical treatment response and non-R denotes clinical non-response; data points above 0 on the y-axis indicate that the model predicts response, and data points below 0 indicate that the model predicts non-response. The number of data points corresponds to the number of computational simulations, and the results currently shown are for 10 simulations. The current overall prediction accuracy is 81.3%, demonstrating that the method is effective in practical application.
Based on these features, the invention can predict a patient's post-treatment response fairly accurately from a small pre-treatment peripheral blood sample.
The applicant declares that the above is only a specific embodiment of the present invention and that the scope of protection of the present invention is not limited thereto; it should be apparent to those skilled in the art that any change or substitution readily conceivable within the technical scope disclosed by the present invention falls within the scope of protection and disclosure of the present invention.
Claims (10)
1. A method of predicting a non-small cell lung cancer PD-1 immunotherapeutic response for the purposes of non-disease diagnosis or treatment, the method comprising:
(1) Collecting single-cell transcriptome data of tumor tissue, peripheral blood, and lymph nodes of cancer patients, and filtering and normalizing the single-cell transcriptome data; analyzing differences in tissue distribution, expression profile, cell proliferation, migration capacity, and clonal expansion among T cell populations to identify tumor-responsive T cell subsets, including terminally differentiated exhausted T cells and regulatory T cells expressing TNFRSF9; finding T cells of the same type in peripheral blood by analyzing the TCR sequences of the tumor-responsive T cell subsets, and designating them peripheral blood tumor-responsive T cells;
(2) Collecting existing publicly available single-cell transcriptome datasets and clinical treatment information of melanoma, triple-negative breast cancer, and non-small cell lung cancer patients, and performing data filtering, normalization, and annotation;
(3) Stratifying the single-cell transcriptome data obtained in step (2), randomly dividing it into a training set and a test set, and using the clinical response information as the label of the single-cell transcriptome data;
(4) Training a logistic regression model on the single-cell transcriptome data of the training set, fitting the parameter vector from the gene expression of the sample cells and the corresponding clinical response information, and constructing the logistic regression model;
(5) Validating the model with the test-set data, and summing and averaging the response probabilities predicted by the model to calculate the corresponding clinical response probability;
(6) Collecting a peripheral blood sample from a non-small cell lung cancer patient receiving PD-1 immunotherapy, sorting the peripheral blood tumor-responsive T cells, performing high-throughput sequencing to obtain transcriptome information, filtering and normalizing the transcriptome data, inputting it into the logistic regression model, summing and averaging the response probabilities predicted by the model to calculate the corresponding clinical response probability, and predicting the clinical response from the clinical response probability.
2. The method of predicting the response of non-small cell lung cancer PD-1 immunotherapy for non-disease diagnosis or treatment according to claim 1, wherein in step (1), identifying the characteristic genes of the T cell subset of the tumor response comprises: CX3CR1, GZMB, GZMH, KLRD1, NKG7, GNLY, and FGFBP2.
3. The method of predicting the response of non-small cell lung cancer PD-1 immunotherapy for non-disease diagnosis or treatment according to claim 1 or 2, wherein in step (3), the training set comprises 80% single cell transcriptome data and the test set comprises 20% single cell transcriptome data.
4. The method of predicting the PD-1 immunotherapy response of non-small cell lung cancer for the purpose of non-disease diagnosis or treatment according to any one of claims 1-3, wherein in step (4), the logistic regression model is represented by the following formula:
wherein X is the gene expression, W^T is the parameter vector, and W_0 is the bias parameter.
5. The method of predicting the response of non-small cell lung cancer PD-1 immunotherapy for the purpose of non-disease diagnosis or treatment according to any one of claims 1 to 4, wherein in step (5), the response probability is calculated by the following formula:
wherein P is the probability of the corresponding clinical response status, x is the feature value, y is the clinical response status, and θ^T is the parameter vector;
preferably, in step (5), the clinical response probability is calculated as follows:
wherein y=1 indicates clinical response, y=0 indicates clinical non-response, n is the number of cells, and response is the predicted value: response > 0 indicates clinical response, response < 0 indicates clinical non-response, and response = 0 is indeterminate and requires re-testing.
6. The method of predicting the response to non-small cell lung cancer PD-1 immunotherapy for the purpose of non-disease diagnosis or treatment according to any one of claims 2 to 5, wherein in step (6), the high throughput sequencing obtaining transcriptome information is performed by a method comprising the steps of:
lysing the cells and performing reverse transcription, amplifying and fragmenting the cDNA, amplifying and purifying the library after fragmentation, and performing high-throughput sequencing after purification to obtain the transcriptome information;
preferably, the cDNA is fragmented using a Tn5 cleavage system in which the final concentration of Tn5 enzyme is 0.001-0.01 μM, preferably 0.005 μM;
preferably, the Tn5 cleavage system further comprises 5%-20% dimethylformamide, preferably 10% dimethylformamide;
preferably, the pH of the Tn5 cleavage system is 7.0-8.5, preferably 7.3;
preferably, the library amplification is performed using an amplification system to which 0.01-0.012% Tween-20 is added;
preferably, the purification uses the following strategy: 0.7X (retain beads) + 0.6X (retain supernatant) + 0.15X (retain beads) + 0.6X (retain beads).
7. A non-small cell lung cancer PD-1 immunotherapeutic response prediction system based on peripheral blood detection, the system comprising:
a tumor-responsive T cell subpopulation screening module for screening a tumor-responsive T cell subpopulation from single cell transcriptome data of tumor tissue, peripheral blood, and lymph nodes of a cancer patient;
a data acquisition module for acquiring existing publicly available single-cell transcriptome datasets and clinical treatment information of melanoma, triple-negative breast cancer, and non-small cell lung cancer patients;
a data partitioning module for stratifying the acquired single-cell transcriptome data, randomly dividing it into a training set and a test set, and using the clinical response information as the label of the single-cell transcriptome data;
a logistic regression model construction module for fitting the parameter vector from the gene expression of the sample cells and the corresponding clinical response information to construct a logistic regression model;
a logistic regression model validation module for validating the resulting logistic regression model, in which the response probabilities predicted by the model are summed and averaged to calculate the clinical response probability;
a clinical response prediction module for filtering and normalizing the transcriptome information of the peripheral blood tumor-responsive T cells of the subject to be predicted, inputting it into the logistic regression model, and summing and averaging the response probabilities predicted by the model to calculate the clinical response probability, from which the clinical response is predicted.
8. The peripheral blood detection-based non-small cell lung cancer PD-1 immunotherapeutic response prediction system of claim 7, wherein the logistic regression model is represented by the formula:
wherein X is the gene expression, W^T is the parameter vector, and W_0 is the bias parameter;
preferably, the response probability is calculated as follows:
wherein P is the probability of the corresponding clinical response status, x is the feature value, y is the clinical response status, and θ^T is the parameter vector;
preferably, the clinical response probability is calculated as follows:
wherein y=1 indicates clinical response, y=0 indicates clinical non-response, n is the number of cells, and response is the predicted value: response > 0 indicates clinical response, response < 0 indicates clinical non-response, and response = 0 is indeterminate and requires re-testing.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-5 when the computer program is executed by the processor.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310326560.5A CN116334225A (en) | 2023-03-30 | 2023-03-30 | Non-small cell lung cancer PD-1 immune therapy response prediction method for non-disease diagnosis or treatment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116334225A true CN116334225A (en) | 2023-06-27 |
Family
ID=86891064
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310326560.5A Pending CN116334225A (en) | 2023-03-30 | 2023-03-30 | Non-small cell lung cancer PD-1 immune therapy response prediction method for non-disease diagnosis or treatment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116334225A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117290817A (en) * | 2023-10-18 | 2023-12-26 | 浙江省立同德医院(浙江省精神卫生研究院) | Marker combination of product for relieving hypercoagulability state of lung cancer, method for establishing and applying curative effect discrimination model and traditional Chinese medicine combination |
CN119170087A (en) * | 2024-11-20 | 2024-12-20 | 北京大学人民医院 | Predictive model and construction method for predicting prognosis after neoadjuvant immunotherapy by metastatic lymph nodes |
CN119170087B (en) * | 2024-11-20 | 2025-02-18 | 北京大学人民医院 | Predictive model for predicting prognosis after neoadjuvant immunotherapy through metastatic lymph node and construction method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhao et al. | | Detection of fetal subchromosomal abnormalities by sequencing circulating cell-free DNA from maternal plasma |
US20210002728A1 (en) | | Systems and methods for detection of residual disease |
Ermann et al. | | Immune cell profiling to guide therapeutic decisions in rheumatic diseases |
AU2020221845A1 (en) | | An integrated machine-learning framework to estimate homologous recombination deficiency |
CN106650312B (en) | | Device for detecting copy number variation of circulating tumor DNA |
CN109880910A (en) | | A kind of detection site combination, detection method, detection kit and the system of Tumor mutations load |
AU2021282414B2 (en) | | Systems And Methods For Determining Microsatellite Instability |
Shegekar et al. | | The emerging role of liquid biopsies in revolutionising cancer diagnosis and therapy |
EP3629904A1 (en) | | Methods and systems for identifying or monitoring lung disease |
CN106778073B (en) | | A kind of method and system of assessment tumor load variation |
CN116334225A (en) | | Non-small cell lung cancer PD-1 immune therapy response prediction method for non-disease diagnosis or treatment |
CN105219844A (en) | | A kind of compose examination 11 kinds of diseases gene marker combination, test kit and disease risks predictive model |
Xu-Monette et al. | | A refined cell-of-origin classifier with targeted NGS and artificial intelligence shows robust predictive value in DLBCL |
Zheng | | Study design considerations for cancer biomarker discoveries |
JP2022552723A (en) | | Method and system for measuring cell status |
US20240401035A1 (en) | | Fragment size characterization of cell-free dna mutations from clonal hematopoiesis |
CN113853444A (en) | | Methods for predicting survival in cancer patients |
Michaelsen et al. | | A B-cell–associated gene signature classification of diffuse large B-cell lymphoma by NanoString technology |
Kotecha et al. | | Matched molecular profiling of cell-free DNA and tumor tissue in patients with advanced clear cell renal cell carcinoma |
RU2744604C2 (en) | | Method for non-invasive prenatal diagnostics of fetal chromosomal aneuploidy from maternal blood |
Aung et al. | | Spatially informed gene signatures for response to immunotherapy in melanoma |
Bhattacharya et al. | | DeCompress: tissue compartment deconvolution of targeted mRNA expression panels using compressed sensing |
Kim et al. | | Statistical methods of translating microarray data into clinically relevant diagnostic information in colorectal cancer |
US20250059608A1 (en) | | Molecular signatures for cell typing and monitoring immune health |
Hobbs et al. | | Biostatistics and bioinformatics in clinical trials |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||