CN108491689B - Tumour neoantigen identification method based on transcript profile - Google Patents

Tumour neoantigen identification method based on transcript profile

Info

Publication number
CN108491689B
Authority
CN
China
Prior art keywords
rna
sequence
mutation
sample
amino acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810101545.XA
Other languages
Chinese (zh)
Other versions
CN108491689A (en)
Inventor
莫凡
陈荣昌
罗凯
马志明
周秀卿
黄灵灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou New Ann Tianjin Biological Technology Co Ltd
Original Assignee
Hangzhou New Ann Tianjin Biological Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou New Ann Tianjin Biological Technology Co Ltd
Priority to CN201810101545.XA
Publication of CN108491689A
Application granted
Publication of CN108491689B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a tumor neoantigen identification method based on the transcriptome, comprising: obtaining an RNA sample from a tumor specimen, constructing a library from and amplifying the RNA sample, and obtaining the RNA sequencing results of the tumor tissue; aligning the short reads of the RNA sequencing results to the human reference genome to obtain RNA alignment results; calculating gene expression, performing mutation detection, and predicting fusion events from the RNA alignment results, and predicting the transcriptome HLA type from the alignment results; and delivering the gene expression of the transcriptome sample, the depth of the transcriptome mutation sites in the whole-exome sequencing samples, and the binding affinity between the novel short peptides and the patient's HLA type to downstream analysts as the analysis results, for a total of four steps. The invention provides a method that can identify the tumor-specific antigens of an individual sample from a tumor patient's transcriptome NGS data.

Description

Tumour neoantigen identification method based on transcript profile
Technical field
The present invention relates to a bioinformatic method for identifying tumor neoantigens.
Background art
Conventional gene mutation detection techniques currently include Sanger sequencing, pyrosequencing, the amplification refractory mutation system (ARMS), fluorescence in situ hybridization, and allele-specific PCR. These genetic testing methods generally have low throughput, high cost, and long turnaround times.
With the advent of next-generation sequencing (NGS), massively parallel sequencing has become possible. Compared with first-generation sequencing, NGS is fast, accurate, inexpensive, and broad in coverage. NGS can be used to resequence tumor genomes and then to identify the different types of variation in tumor-related genes (point mutations, insertions, deletions, etc.), copy number variation, gene associations, and polymorphisms. At present, tumor NGS testing can be applied to hereditary tumor screening and mutation identification.
In tumor immunotherapy, identifying the neoantigens produced by tumor tissue cells is a key step in determining downstream clinical treatment. During carcinogenesis, changes in a normal cell's genetic material make its DNA sequence differ from that of other normal cells, so the cell can produce tumor-associated antigens (highly expressed in tumor cells) or tumor-specific antigens (expressed only in tumor cells). Because these antigens carry specific epitopes that mark tumor cells, they can in theory be recognized by human leukocyte antigen (HLA) molecules on antigen-presenting cells, then bind the TCR, activate T cells, and trigger an immune response, which makes them potential targets for tumor immunotherapy. By analyzing NGS data from a patient's tumor tissue and normal tissue, the mutations in the tumor tissue can be identified, such as point mutations, insertion and deletion mutations, structural variations, and gene fusions. From these mutation events, the changes that occur in tumor cells during downstream transcription and expression can be predicted, and the neoantigens that may be present can then be inferred. Compared with genomic data, transcriptome sequencing data reflect changes in the genetic material more directly: they not only show how genome-level mutations are expressed along the central dogma, but also confirm the reliability of genome sequencing analysis results, providing further basis and evidence for tumor neoantigen identification and thereby guiding clinical treatment.
Summary of the invention
The purpose of the present invention is to provide a method that can identify the tumor-specific antigens of an individual sample from a tumor patient's transcriptome NGS data.
A tumor neoantigen identification method based on the transcriptome, comprising the following steps:
S1: obtain an RNA sample from the tumor specimen, construct a library from and amplify the RNA sample, and obtain the RNA sequencing results of the tumor tissue;
S2: align the short reads of the RNA sequencing results to the human reference genome to obtain the RNA alignment results;
S3: calculate gene expression from the RNA alignment results, perform mutation detection on the RNA alignment results, and predict fusion events from the RNA alignment results; predict the transcriptome HLA type from the alignment results; the gene expression calculation, mutation detection, and fusion event prediction are carried out in a specified order or simultaneously;
S3A: calculating gene expression comprises: removing the intronic short reads from the RNA alignment results to form RNA alignment results consisting only of exonic short reads, and calculating gene expression;
S3B: mutation detection comprises:
S3B1: detect the mutations in the RNA sample from the RNA alignment results, remove the mutations already present in the whole-exome sample of the normal tissue, and take the remaining mutations as RNA-level somatic mutations; functionally annotate the RNA mutations;
S3B2: calculate the depth of the mutation sites: for each mutation site, calculate its depth in the whole-exome sequencing samples, which comprise the whole-exome sample of the tumor tissue and the whole-exome sample of the normal tissue;
S3B3: HLA typing affinity prediction: according to the mutation annotation results, edit the corresponding transcript sequence for each mutation event to obtain a polypeptide sequence containing all the mutations; remove the peptide segments that can be matched in the wild-type protein sequence to form the transcriptome-specific novel polypeptide sequence; cut the transcriptome-specific polypeptide sequence into short peptides of the length required for HLA binding; and predict the affinity between the short peptides and the patient's HLA type identified from the RNA sequencing results of step S1;
wherein calculating the depth of the mutation sites and predicting HLA typing affinity are carried out in a specified order or simultaneously;
S3C: fusion event prediction: predict the fusion events of the transcriptome from the RNA alignment results of step S2; after screening out the credible fusions, translate them into protein sequences and cut the protein sequences into short peptides of the length required for HLA binding; predict the affinity between the short peptides and the patient's HLA type identified from the RNA sequencing results of step S1;
S4: deliver the gene expression of the transcriptome sample, the depth of the transcriptome mutation sites in the whole-exome sequencing samples, and the binding affinity between the novel short peptides and the patient's HLA type to downstream analysts as the analysis results.
Further, in step S3A, calculating gene expression comprises the following steps:
S3A1: remove the duplicate short read sequences generated by PCR amplification during library construction;
S3A2: remove the short reads aligned to introns to obtain RNA alignment results consisting of exonic short reads;
S3A3: count the short reads of each exon and convert the count to gene expression with the following formula (the standard RPKM form):
$$\mathrm{RPKM} = \frac{\text{total exon reads}}{\text{mapped reads (millions)} \times \text{exon length (KB)}}$$
where total exon reads is the total number of short reads aligned to a given exon region, mapped reads (millions) is the number of aligned short reads expressed in millions (the per-million normalization factor), and exon length (KB) is the length of that exon in kilobases.
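The conversion in step S3A3 can be illustrated with a short Python sketch. The function name `rpkm` and its parameter names are chosen here for illustration and do not come from the patent; the sketch assumes the three quantities defined above are supplied as a raw read count, a raw count of mapped reads, and an exon length in base pairs.

```python
def rpkm(total_exon_reads: int, mapped_reads: int, exon_length_bp: int) -> float:
    """Reads per kilobase of exon per million mapped reads.

    total_exon_reads: short reads aligned to the exon region
    mapped_reads:     all aligned short reads in the sample (raw count)
    exon_length_bp:   exon length in base pairs
    """
    if mapped_reads <= 0 or exon_length_bp <= 0:
        raise ValueError("mapped_reads and exon_length_bp must be positive")
    mapped_reads_millions = mapped_reads / 1_000_000  # "mapped reads (millions)"
    exon_length_kb = exon_length_bp / 1_000           # "exon length (KB)"
    return total_exon_reads / (mapped_reads_millions * exon_length_kb)


# 500 reads on a 2 kb exon in a sample with 20 million mapped reads
example_value = rpkm(total_exon_reads=500, mapped_reads=20_000_000, exon_length_bp=2_000)
```

Dividing by both the exon length and the sequencing depth makes values comparable between exons of different sizes and between samples sequenced to different depths.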
Further, after step S2 obtains the gene expression, quality control is performed on the RNA alignment results. The quality control covers: the coverage of the entire capture region, the average sequencing depth, the alignment rate, the proportion of duplicate sequences, and the proportion of uniquely aligned reads. If the alignment results meet the quality control requirements, proceed to S3; if they do not, obtain a new sample.
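A quality control gate of this kind could be sketched as follows. The patent lists the five metrics but gives no numeric cut-offs, so every threshold below is a hypothetical placeholder, and the class and function names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class AlignmentQC:
    coverage: float            # fraction of the capture region covered
    mean_depth: float          # average sequencing depth
    alignment_rate: float      # fraction of reads aligned
    duplicate_fraction: float  # proportion of duplicate sequences
    unique_fraction: float     # proportion of uniquely aligned reads


# Hypothetical thresholds: the source does not state any values.
THRESHOLDS = {
    "min_coverage": 0.95,
    "min_mean_depth": 50.0,
    "min_alignment_rate": 0.90,
    "max_duplicate_fraction": 0.30,
    "min_unique_fraction": 0.80,
}


def passes_qc(qc: AlignmentQC, t: dict = THRESHOLDS) -> bool:
    """Return True if the alignment may proceed to S3, False if a new sample is needed."""
    return (
        qc.coverage >= t["min_coverage"]
        and qc.mean_depth >= t["min_mean_depth"]
        and qc.alignment_rate >= t["min_alignment_rate"]
        and qc.duplicate_fraction <= t["max_duplicate_fraction"]
        and qc.unique_fraction >= t["min_unique_fraction"]
    )


good = AlignmentQC(0.98, 120.0, 0.96, 0.12, 0.91)
bad = AlignmentQC(0.98, 120.0, 0.96, 0.55, 0.91)  # too many duplicates
```

Note that the duplicate proportion is the only metric with an upper bound; the other four are lower bounds.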
Further, in step S3B3, the polypeptide sequence containing all the mutations is obtained as follows: look up the nucleotide sequence of the corresponding CDS region in the Ensembl database by transcript ID, append the downstream 3' terminal sequence (3' UTR), and modify the nucleotides at the corresponding positions of the nucleotide sequence according to the functional annotation of the mutations, obtaining a nucleotide sequence that contains all the mutations. For example, if the functional annotation gives G100A, the G at position 100 of the spliced nucleotide sequence is changed to A. The transcript nucleotide sequence containing all the mutations is then translated into a polypeptide sequence.
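The edit-then-translate procedure can be sketched in Python. This is a minimal illustration that handles only single-base substitutions written in the `G100A` style of the example; the function names and the toy sequence are assumptions, and fetching the CDS and 3' UTR from Ensembl is left out.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def apply_substitution(sequence: str, annotation: str) -> str:
    """Apply an annotation such as 'G100A' (1-based position) to a nucleotide sequence."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", annotation)
    if match is None:
        raise ValueError(f"unsupported annotation: {annotation}")
    ref, position, alt = match.group(1), int(match.group(2)), match.group(3)
    if sequence[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return sequence[:position - 1] + alt + sequence[position:]


def translate(sequence: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Toy CDS (ATG GCC GAA) followed by a stop codon standing in for the 3' UTR.
wild_type = "ATGGCCGAA" + "TAA"
mutant = apply_substitution(wild_type, "G4A")  # GCC (Ala) becomes ACC (Thr)
```

Appending the 3' UTR before translation matters for stop-loss and frameshift mutations, where the reading frame continues past the original stop codon.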
Further, the method for small peptide is intercepted in step S3B and S3C are as follows:
SI, sequence obtain sporting in polypeptide sequence and work as premutation, centered on being currently mutated, intercept n ammonia forward M amino acid acquisition polypeptide sequence of base acid and backward interception, the maximum length that n can present for HLA parting, m is HLA parting institute The maximum length or m that can be presented are from the length when premutation to first terminator codon;
SII, successively intercepted length is the small peptide of N in polypeptide sequence, and N is that HLA parting combines the length needed;Obtain N Item includes the small peptide when premutation.Such as: it is 8, N=8 that HLA parting, which combines the length needed,;It is cut forward since mutated site It takes 7 amino acid, form the small peptide that a length is 8 amino acid with the mutated site;Intercept 6 forward since mutated site A amino acid intercepts 1 amino acid backward, and it is the short of 8 amino acid that this 7 amino acid, which form the 2nd article of length with mutated site, Peptide, in this way, 8 small peptides containing the mutation can be obtained altogether.
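Step SII amounts to sliding a window of length N across the mutated position. The sketch below shows one way to write it; the function name and the toy sequence are illustrative, and positions are 0-based.

```python
def peptides_containing(polypeptide: str, mutation_index: int, n: int) -> list[str]:
    """Return every length-n window of `polypeptide` that covers `mutation_index`."""
    first_start = max(0, mutation_index - n + 1)
    last_start = min(mutation_index, len(polypeptide) - n)
    return [polypeptide[start:start + n] for start in range(first_start, last_start + 1)]


# Toy sequence: 7 residues, the mutated residue W at index 7, then 7 more residues.
sequence = "ACDEFGH" + "W" + "IKLMNPQ"
short_peptides = peptides_containing(sequence, mutation_index=7, n=8)
```

With at least N-1 residues on each side the function returns exactly N peptides, matching the N = 8 example; near either end of the protein it returns fewer.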
Further, the rules by which step SI cuts the polypeptide sequence containing the mutation site are:
A. For point mutations: centered on the mutated position, take n amino acids before it and m amino acids after it, where n = m = the longest peptide the HLA type can present; if the preceding or following segment is shorter than that, take as many residues as are available. If the point mutation is a stop loss (loss of the stop codon), m is the length from the current mutation to the first stop codon;
B. For non-frameshift mutations: for a non-frameshift insertion, take n amino acids before the first amino acid of the inserted sequence and m amino acids after the last amino acid of the inserted sequence, where n = m = the longest peptide the HLA type can present; centered on the inserted sequence, the n amino acids before it, the inserted sequence itself, and the m amino acids after it together form the novel polypeptide sequence;
for a non-frameshift deletion, centered on the deleted segment, take n amino acids before it and m amino acids after it, where n = m = the longest peptide the HLA type can present;
C. For frameshift mutations: centered on the first amino acid affected by the frameshift, take n amino acids before it, where n = the longest peptide the HLA type can present, and m amino acids after it, where m is the length from the current mutation to the first stop codon, i.e. continue to the first stop codon.
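Rules A to C share one shape: a flank of up to n residues before the altered span, the altered span itself, and a flank after it that is either fixed-length or runs to the first stop codon. A minimal sketch under that reading follows; the function name, the half-open `start`/`end` convention, and the use of `*` to mark a stop codon in the protein string are assumptions made for illustration.

```python
def mutation_window(protein: str, start: int, end: int, max_len: int,
                    to_stop: bool = False) -> str:
    """Cut the polypeptide around an altered span protein[start:end] (0-based, half-open).

    Point mutation: end == start + 1. Insertion: the span is the inserted residues.
    Deletion: start == end (empty span at the junction).
    Frameshift or stop loss: pass to_stop=True to extend to the first '*'.
    """
    # Before the span: at most max_len residues, fewer near the start of the protein.
    before = protein[max(0, start - max_len):start]
    if to_stop:
        stop = protein.find("*", end)
        after = protein[end:stop if stop != -1 else len(protein)]
    else:
        # After the span: at most max_len residues, fewer near the end of the protein.
        after = protein[end:end + max_len]
    return before + protein[start:end] + after


point = mutation_window("ACDEFGHIKLMNPQRSTVWY", 10, 11, max_len=3)
near_start = mutation_window("ACDEFGHIKLMNPQRSTVWY", 1, 2, max_len=3)
insertion = mutation_window("ACDEWWFGHIK", 4, 6, max_len=2)
frameshift = mutation_window("ACDEFGHIKL*MN", 4, 5, max_len=3, to_stop=True)
```

Python slicing clips out-of-range bounds, which gives the "take as many residues as are available" behavior of rule A without extra checks.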
The advantages of the present invention are:
1. The entire identification workflow starts from the fastq files; the user does not need to prepare any other input files.
2. Every identification step has a corresponding quality control step, which improves the accuracy of the results.
3. Mutation sites are identified more comprehensively, considering not only individual cells but also the mutation distribution of the whole tissue.
4. The results are confirmed by multiple sets of identification results.
5. Each key step is run with several algorithms at once, which allows the results to be cross-validated and also reduces false negatives.
6. The algorithms of the key steps have been optimized and parallel computing has been introduced, shortening the analysis time for a single sample.
Specific embodiments
A tumor neoantigen identification method based on the transcriptome comprises the following steps:
S1: obtain an RNA sample from the tumor specimen, construct a library from and amplify the RNA sample, and obtain the RNA sequencing results of the tumor tissue.
S2: align the short reads of the RNA sequencing results to the human reference genome to obtain the RNA alignment results.
S3: calculate gene expression from the RNA alignment results, perform mutation detection on the RNA alignment results, and predict fusion events from the RNA alignment results; predict the HLA type from the alignment results; the gene expression calculation, mutation detection, and fusion event prediction are carried out in a specified order or simultaneously.
S4: deliver the gene expression of the transcriptome sample, the depth of the transcriptome mutation sites in the whole-exome sequencing samples, and the binding affinity between the novel short peptides and the patient's HLA type to downstream analysts as the analysis results.
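Because the three S3 analyses all read the same alignment results and do not depend on one another, they can run either one after another or side by side. The sketch below illustrates that choice with Python's standard thread pool; the three analysis functions are hypothetical stand-ins that return placeholder strings, where a real pipeline would launch the actual tools.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_s3(alignment: str, analyses: Dict[str, Callable[[str], object]],
           parallel: bool = True) -> Dict[str, object]:
    """Run each S3 analysis on the same alignment, in order or concurrently."""
    if not analyses:
        return {}
    if not parallel:
        return {name: analysis(alignment) for name, analysis in analyses.items()}
    with ThreadPoolExecutor(max_workers=len(analyses)) as pool:
        futures = {name: pool.submit(analysis, alignment)
                   for name, analysis in analyses.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for the three S3 branches.
ANALYSES = {
    "expression": lambda bam: f"expression table from {bam}",
    "mutations": lambda bam: f"mutation calls from {bam}",
    "fusions": lambda bam: f"fusion candidates from {bam}",
}

results = run_s3("tumor_rna.bam", ANALYSES, parallel=True)
```

Both modes return the same dictionary, so the downstream S4 step does not need to know which one was used.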
Calculating gene expression
Remove the intronic short reads from the RNA alignment results to form RNA alignment results consisting only of exonic short reads, and calculate gene expression.
Calculating gene expression comprises the following steps:
S3A1: remove the duplicate short read sequences generated by PCR amplification during library construction. During library construction, the randomly fragmented nucleotide sequences are amplified into clusters on the flowcell; the sequences within a cluster are identical and are called duplicate sequences, and their main purpose is to increase the probability and confidence with which each base is read. Duplicates are removed at this step because, in capture sequencing, the true coverage and depth of a target region cannot be known unless its duplicates are removed, and both mutation identification and expression calculation would otherwise be disturbed.
S3A2: use the split_N_cigar function of the GATK toolkit to cut away the intronic segments of the short reads transcribed into RNA, obtaining RNA alignment results consisting only of exonic short reads. Although what is sequenced is mature RNA, some short reads still align to intronic regions: a short read is only 150 bp long, so it may well align to a location where it does not actually belong. The short reads that fall in intronic regions therefore need to be removed for transcriptome identification.
S3A3: use HTseq to count the short reads aligned to each gene region and convert the count to gene expression with the following formula (the standard RPKM form):
$$\mathrm{RPKM} = \frac{\text{total exon reads}}{\text{mapped reads (millions)} \times \text{exon length (KB)}}$$
where total exon reads is the total number of short reads aligned to a given exon region, mapped reads (millions) is the number of aligned short reads expressed in millions (the per-million normalization factor), and exon length (KB) is the length of that exon in kilobases.
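The idea behind the duplicate removal in step S3A1 can be shown with a deliberately simplified sketch. It treats reads that share chromosome, start position, and strand as duplicates and keeps the one with the highest mean base quality; this grouping key and the `Read` record are assumptions made for illustration, and production duplicate-marking tools use more information than this.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    start: int           # leftmost aligned position
    strand: str          # "+" or "-"
    mean_quality: float  # mean base quality of the read


def remove_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep one read per (chrom, start, strand), preferring the highest quality."""
    best = {}
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if key not in best or read.mean_quality > best[key].mean_quality:
            best[key] = read
    return list(best.values())


reads = [
    Read("r1", "chr1", 1000, "+", 31.0),
    Read("r2", "chr1", 1000, "+", 35.5),  # duplicate of r1 with better quality
    Read("r3", "chr1", 1000, "-", 30.0),  # same position, opposite strand: kept
    Read("r4", "chr2", 500, "+", 28.0),
]
deduplicated = remove_duplicates(reads)
```

After this step the per-region read counts reflect distinct fragments rather than PCR copies, which is what the depth and expression calculations rely on.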
Calculating mutation site depth
Calculating the depth of the mutation sites in the whole-exome samples comprises the following steps:
S3B.1: detect the mutations in the RNA sample from the RNA alignment results, remove the mutations already present in the whole-exome sample of the normal tissue, and take the remaining mutations as RNA-level somatic mutations; functionally annotate the RNA mutations;
S3B.2: calculate the depth of the mutation sites: for each mutation site, calculate its depth in the whole-exome sequencing samples, which comprise the whole-exome sample of the tumor tissue and the whole-exome sample of the normal tissue. Tools such as Bam-recount can be used to calculate the depth of the transcriptome mutation sites in the whole-exome samples. Bam-recount can only calculate the depth of SNVs, whereas javakit can calculate the depth of indels; the two sets of results are merged as the final result.
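Steps S3B.1 and S3B.2 reduce to a set difference followed by a merge of two depth tables. The sketch below assumes variants are represented as `(chrom, pos, ref, alt)` tuples and that the per-site depths have already been parsed from the tools' output into dictionaries; neither representation comes from the patent.

```python
def somatic_rna_variants(rna_variants, normal_wes_variants):
    """S3B.1: drop RNA variants that also appear in the normal whole-exome sample."""
    normal = set(normal_wes_variants)
    return [variant for variant in rna_variants if variant not in normal]


def merge_depths(snv_depths, indel_depths):
    """S3B.2: combine the SNV depth table and the indel depth table into one."""
    merged = dict(snv_depths)
    merged.update(indel_depths)
    return merged


rna_calls = [
    ("chr1", 12345, "G", "A"),
    ("chr2", 67890, "C", "T"),    # also seen in normal tissue (germline)
    ("chr7", 55555, "T", "TAGG"),
]
normal_calls = [("chr2", 67890, "C", "T")]

somatic = somatic_rna_variants(rna_calls, normal_calls)
depths = merge_depths(
    {("chr1", 12345, "G", "A"): 84},       # from the SNV-capable tool
    {("chr7", 55555, "T", "TAGG"): 61},    # from the indel-capable tool
)
```

The same merge would be run twice in practice, once against the tumor whole-exome sample and once against the normal one.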
HLA typing affinity prediction
S3C.1: detect the mutations in the RNA sample from the RNA alignment results, remove the mutations already identified in the whole-exome sample of the normal tissue, and take the remaining mutations as RNA-level mutations; functionally annotate the RNA mutations;
S3C.2: according to the mutation annotation results, edit the corresponding transcript sequence for each mutation event to obtain a nucleotide sequence containing all the mutations, then translate it into a polypeptide sequence; remove from the polypeptide sequence the peptide segments that can be matched in the wild-type protein, forming the transcriptome-specific novel polypeptide sequence. Cut the transcriptome-specific polypeptide sequence into short peptides of the length required for HLA binding, and predict the affinity between the short peptides and the patient's HLA type identified from the RNA sequencing results of step S1.
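One way to read "remove the peptide segments that can be matched in the wild-type protein" is at the level of fixed-length windows: enumerate the k-mers of the mutant protein and keep only those absent from the wild-type protein. That interpretation, the function names, and the toy sequences below are assumptions for illustration.

```python
def kmers(sequence: str, k: int) -> list[str]:
    """All overlapping windows of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def novel_kmers(mutant_protein: str, wild_type_protein: str, k: int) -> list[str]:
    """k-mers of the mutant protein that do not occur anywhere in the wild type."""
    wild_type = set(kmers(wild_type_protein, k))
    seen = set()
    novel = []
    for peptide in kmers(mutant_protein, k):
        if peptide not in wild_type and peptide not in seen:
            seen.add(peptide)
            novel.append(peptide)
    return novel


wild_type = "ACDEFGHIK"
mutant = "ACDEWGHIK"  # F to W at the fifth residue
specific = novel_kmers(mutant, wild_type, k=3)
```

Only the windows that overlap the altered residue survive, so the filter leaves exactly the peptides that distinguish the tumor transcriptome from the wild-type protein.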
The transcript nucleotide sequence is obtained as follows: look up the nucleotide sequence of the corresponding CDS region in the Ensembl database by transcript ID, append the downstream 3' terminal sequence, and modify the nucleotides at the corresponding positions of the nucleotide sequence according to the functional annotation of the mutations, obtaining a nucleotide sequence that contains all the mutations. For example, if the functional annotation gives G100A, the G at position 100 of the spliced nucleotide sequence is changed to A. The transcript nucleotide sequence containing all the mutations is then translated into a polypeptide sequence.
The HLA type of the transcriptome sample is identified with several HLA typing tools, such as SOAP-HLA, and the results of the individual algorithms are considered together to give the final result.
Affinity is predicted with several HLA affinity prediction programs, for example netMHC4.0 for HLA class I and netMHCII 2.2 for HLA class II; among the prediction results, every result that at least one program judges to have affinity is retained.
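The "at least one program" retention rule is a union over the per-tool binder calls. The sketch below assumes each tool's output has already been reduced to a mapping from peptide to a yes/no call; the tool labels and peptides are made up, and no real predictor is invoked.

```python
def retained_peptides(calls_by_tool: dict[str, dict[str, bool]]) -> list[str]:
    """Keep every peptide that at least one predictor calls a binder."""
    kept = set()
    for calls in calls_by_tool.values():
        kept.update(peptide for peptide, is_binder in calls.items() if is_binder)
    return sorted(kept)


# Hypothetical, pre-parsed binder calls from two predictors.
calls = {
    "predictor_a": {"ACDEWGHIK": True, "CDEWGHIKL": False, "DEWGHIKLM": False},
    "predictor_b": {"ACDEWGHIK": False, "CDEWGHIKL": True, "DEWGHIKLM": False},
}
kept = retained_peptides(calls)
```

Taking the union rather than the intersection favors sensitivity: a peptide missed by one predictor is still carried forward if another predictor supports it, which is consistent with the stated aim of reducing false negatives.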
Predicting fusion events
The fusion events of the transcriptome are predicted from the RNA alignment results of step S2, using tools such as SOAPfuse (v1.27) and STAR-Fusion. After the credible fusions have been screened out, they are translated into protein sequences, and the protein sequences are cut into short peptides of the length required for HLA binding; the affinity between the short peptides and the patient's HLA type identified from the RNA sequencing results of step S1 is then predicted.
After step S2 obtains the gene expression, quality control is performed on the RNA alignment results. The quality control covers: the coverage of the entire capture region, the average sequencing depth, the alignment rate, the proportion of duplicate sequences, and the proportion of uniquely aligned reads. If the alignment results meet the quality control requirements, proceed to S3; if they do not, obtain a new sample.
Cutting short peptides
The method for cutting short peptides is:
SI: take the mutations in the polypeptide sequence one at a time as the current mutation; centered on the current mutation, take n amino acids before it and m amino acids after it to obtain a polypeptide sequence, where n is the maximum length the HLA type can present, and m is either the maximum length the HLA type can present or the length from the current mutation to the first stop codon;
SII: cut short peptides of length N successively from the polypeptide sequence, where N is the length required for HLA binding, obtaining N short peptides that contain the current mutation. For example, if the length required for HLA binding is 8, then N = 8: taking the 7 amino acids before the mutated position together with the mutated position gives one short peptide of 8 amino acids; taking 6 amino acids before and 1 amino acid after the mutated position, together with the mutated position, gives the second short peptide of 8 amino acids; continuing in this way yields 8 short peptides containing the mutation in total.
For the different types of mutation, the rules for cutting short peptides are:
A. For point mutations: centered on the mutated position, take n amino acids before it and m amino acids after it, where n = m = the longest peptide the HLA type can present; if the preceding or following segment is shorter than that, take as many residues as are available. If the point mutation is a stop loss (loss of the stop codon), m is the length from the current mutation to the first stop codon;
B. For non-frameshift mutations: for a non-frameshift insertion, take n amino acids before the first amino acid of the inserted sequence and m amino acids after the last amino acid of the inserted sequence, where n = m = the longest peptide the HLA type can present; centered on the inserted sequence, the n amino acids before it, the inserted sequence itself, and the m amino acids after it together form the polypeptide sequence;
for a non-frameshift deletion, centered on the deleted segment, take n amino acids before it and m amino acids after it, where n = m = the longest peptide the HLA type can present;
C. For frameshift mutations: centered on the first amino acid affected by the frameshift, take n amino acids before it, where n = the longest peptide the HLA type can present, and m amino acids after it, where m is the length from the current mutation to the first stop codon, i.e. continue to the first stop codon.
The specific embodiments further illustrate the present invention, but the invention is not limited to these specific embodiments.

Claims (4)

1. A tumor neoantigen identification method based on the transcriptome, characterized by comprising the following steps:
S1: obtain an RNA sample from the tumor specimen, construct a library from and amplify the RNA sample, obtain the RNA sequencing results of the tumor tissue, and identify the patient's HLA type from the sequencing results;
S2: align the short reads of the RNA sequencing results to the human reference genome to obtain the RNA alignment results;
S3: calculate gene expression from the RNA alignment results, perform mutation detection on the RNA alignment results, and predict fusion events from the RNA alignment results; predict the transcriptome HLA type from the alignment results; the gene expression calculation, mutation detection, and fusion event prediction are carried out in a specified order or simultaneously;
S3A: calculating gene expression comprises: removing the intronic short reads from the RNA alignment results to form RNA alignment results consisting only of exonic short reads, and calculating gene expression;
calculating gene expression comprises the following steps:
S3A1: remove the duplicate short read sequences generated by PCR amplification during library construction;
S3A2: remove the short reads aligned to introns to obtain RNA alignment results consisting of exonic short reads;
S3A3: count the short reads of each exon and convert the count to gene expression with the following formula (the standard RPKM form):
$$\mathrm{RPKM} = \frac{\text{total exon reads}}{\text{mapped reads (millions)} \times \text{exon length (KB)}}$$
where total exon reads is the total number of short reads aligned to a given exon region, mapped reads (millions) is the number of aligned short reads expressed in millions (the per-million normalization factor), and exon length (KB) is the length of that exon in kilobases;
S3B: mutation detection comprises:
S3B1: detect the mutations in the RNA sample from the RNA alignment results, remove the mutations already present in the whole-exome sample of the normal tissue, and take the remaining mutations as RNA-level somatic mutations; functionally annotate the RNA mutations;
S3B2: calculate the depth of the mutation sites: for each mutation site, calculate its depth in the whole-exome sequencing samples, which comprise the whole-exome sample of the tumor tissue and the whole-exome sample of the normal tissue;
S3B3: HLA typing affinity prediction: according to the mutation annotation results, edit the corresponding transcript sequence for each mutation event to obtain a polypeptide sequence containing all the mutations; remove the peptide segments that can be matched in the wild-type protein sequence to form the transcriptome-specific novel polypeptide sequence; cut the transcriptome-specific polypeptide sequence into short peptides of the length required for HLA binding; and predict the affinity between the short peptides and the patient's HLA type identified from the RNA sequencing results of step S1;
wherein calculating the depth of the mutation sites and predicting HLA typing affinity are carried out in a specified order or simultaneously;
S3C: fusion event prediction: predict the fusion events of the transcriptome from the RNA alignment results of step S2; after screening out the credible fusions, translate them into protein sequences and cut the protein sequences into short peptides of the length required for HLA binding; predict the affinity between the short peptides and the patient's HLA type identified from the RNA sequencing results of step S1;
the method for cutting short peptides in steps S3B3 and S3C is:
SI: take the mutations in the polypeptide sequence one at a time as the current mutation; centered on the current mutation, take n amino acids before it and m amino acids after it to obtain a polypeptide sequence, where n is the maximum length the HLA type can present, and m is either the maximum length the HLA type can present or the length from the current mutation to the first stop codon;
SII: cut short peptides of length N successively from the polypeptide sequence, where N is the length required for HLA binding, obtaining N short peptides that contain the current mutation;
S4: deliver the gene expression of the transcriptome sample, the depth of the transcriptome mutation sites in the whole-exome sequencing samples, and the affinity between the novel short peptides and the patient's HLA type to downstream analysts as the analysis results.
2. the tumor antigen identification method according to claim 1 based on transcript profile, it is characterised in that: step S2 obtains base After expression quantity, Quality Control is carried out to RNA comparison result, the content of Quality Control includes: the coverage of entire capture region, it is average to be sequenced Depth, comparison rate, the specific gravity of repetitive sequence and unique specific gravity for comparing reads;If comparison result meets Quality Control requirement, enter S3;If comparison result does not meet Quality Control requirement, sample is reacquired.
3. the tumor antigen identification method according to claim 2 based on transcript profile, it is characterised in that: step S3B3 includes The acquisition methods of the polypeptide sequence of all mutation are as follows: the corresponding area CDS is found out from ensembl database according to transcript number Domain nucleotide sequence splices 3 ' terminal sequence of downstream, according to the functional annotation of mutation by the nucleosides of corresponding position on nucleotide sequence Acid makes modification, obtains the nucleotide sequence comprising all mutation;By the transcript nucleotide sequence comprising all mutation Translate into polypeptide sequence.
4. The transcriptome-based tumor antigen identification method according to claim 3, characterized in that: in step SI, the rules for extracting the polypeptide sequence containing the mutation site are:
A. For a point mutation: centered on the mutation position, n amino acids are extracted upstream and m amino acids downstream, where n = m = the length of the longest peptide that the HLA type can present; if the upstream or downstream segment is shorter than this, as many amino acids as are available are extracted; if the point mutation is a stop-loss, m is the length from the current mutation to the first stop codon;
B. For a non-frameshift mutation: for a non-frameshift insertion, n amino acids are extracted upstream from the first amino acid of the inserted sequence, and m amino acids are extracted downstream from the last amino acid of the inserted sequence, where n = m = the length of the longest peptide that the HLA type can present; with the inserted sequence at the center, the n amino acids before the inserted sequence, the inserted sequence itself, and the m amino acids after the inserted sequence together form the neo-polypeptide sequence;
for a non-frameshift deletion, centered on the deleted segment, n amino acids are extracted upstream and m amino acids downstream, where n = m = the length of the longest peptide that the HLA type can present;
C. For a frameshift mutation: centered on the first amino acid at which the frameshift begins, n amino acids are extracted upstream, where n = the length of the longest peptide that the HLA type can present; m amino acids are extracted downstream, where m is the length from the current mutation to the first stop codon, i.e. extraction continues downstream to the first stop codon.
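The extraction rules A to C can be condensed into one slicing function, sketched here in Python under stated assumptions: the function operates on the mutant protein after it has been translated up to its first stop codon, indices are 0-based, and the name `mutant_region` and its parameters are hypothetical rather than taken from the patent. Placeholder letters stand in for amino acids.

```python
def mutant_region(mutant: str, start: int, alt_len: int, flank: int,
                  to_stop: bool = False) -> str:
    """Extract the polypeptide segment around a mutation.

    mutant  : mutant protein, already truncated at its first stop codon
    start   : index of the first altered residue (for a deletion, the index
              of the first residue after the deletion junction)
    alt_len : number of altered residues (1 for a point mutation, the insert
              length for an insertion, 0 for a deletion)
    flank   : n = m = longest peptide length the HLA type can present
    to_stop : True for frameshift and stop-loss mutations, where the
              downstream part runs to the first stop codon
    """
    left = max(0, start - flank)  # take what is available if upstream is short
    if to_stop:
        right = len(mutant)
    else:
        right = min(len(mutant), start + alt_len + flank)
    return mutant[left:right]


protein = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # placeholder residues, not a real protein
point = mutant_region(protein, 12, 1, 3)                      # "JKLMNOP"
insertion = mutant_region(protein, 10, 2, 3)                  # "HIJKLMNO"
deletion = mutant_region(protein, 10, 0, 3)                   # "HIJKLM"
frameshift = mutant_region(protein, 20, 1, 3, to_stop=True)   # "RSTUVWXYZ"
```

Treating a deletion as a zero-length alteration at the junction lets all four cases share the same slice arithmetic; a real pipeline would derive `start` and `alt_len` from the variant annotation.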
CN201810101545.XA 2018-02-01 2018-02-01 Tumour neoantigen identification method based on transcript profile Active CN108491689B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810101545.XA CN108491689B (en) 2018-02-01 2018-02-01 Tumour neoantigen identification method based on transcript profile

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810101545.XA CN108491689B (en) 2018-02-01 2018-02-01 Tumour neoantigen identification method based on transcript profile

Publications (2)

Publication Number Publication Date
CN108491689A CN108491689A (en) 2018-09-04
CN108491689B true CN108491689B (en) 2019-07-09

Family

ID=63344296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810101545.XA Active CN108491689B (en) 2018-02-01 2018-02-01 Tumour neoantigen identification method based on transcript profile

Country Status (1)

Country Link
CN (1) CN108491689B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110093316A (en) * 2018-09-30 2019-08-06 北京鼎成肽源生物技术有限公司 A kind of construction method of AFF cell
CN109136269A (en) * 2018-09-30 2019-01-04 北京鼎成肽源生物技术有限公司 A kind of AFF cell
CN109584960B (en) * 2018-12-14 2021-07-30 序康医疗科技(苏州)有限公司 Method, device and storage medium for predicting tumor neoantigen
CN109706065A (en) * 2018-12-29 2019-05-03 深圳裕策生物科技有限公司 Tumor neogenetic antigen load detection device and storage medium
CN109584966B (en) * 2019-01-08 2019-09-20 杭州纽安津生物科技有限公司 A kind of design method and cancer of pancreas general vaccines of tumour general vaccines
CN110675913B (en) * 2019-01-16 2022-04-12 倍而达药业(苏州)有限公司 Screening method of tumor neoantigen based on HLA typing and structure
CN109637587B (en) * 2019-01-18 2022-11-04 臻悦生物科技江苏有限公司 Method, device, storage medium, processor and method for standardizing transcriptome data expression quantity for detecting gene fusion mutation
CN109801678B (en) * 2019-01-25 2023-07-25 上海鲸舟基因科技有限公司 Tumor antigen prediction method based on complete transcriptome and application thereof
CN111621564B (en) * 2019-02-28 2022-03-25 武汉大学 A method for identifying potent tumor neoantigens
CN111696628A (en) * 2019-03-15 2020-09-22 痕准生物科技有限公司 Methods for the identification of neoantigens
CN110415766A (en) * 2019-06-05 2019-11-05 复旦大学 A method and related device for predicting the degree of effect of mutation on RNA secondary structure
CN110322925B (en) * 2019-07-18 2021-09-03 杭州纽安津生物科技有限公司 Method for predicting generation of neoantigen by fusion gene
CN110600077B (en) * 2019-08-29 2022-07-12 北京优迅医学检验实验室有限公司 Prediction method of tumor neoantigen and application thereof
CN111192632B (en) * 2019-12-16 2023-06-13 深圳市新合生物医疗科技有限公司 Method and device for extracting gene fusion immunotherapy new antigen by integrating DNA and RNA deep sequencing data
CN111415707B (en) * 2020-03-10 2023-04-25 四川大学 Prediction method of clinical individualized tumor neoantigen
CN111627497B (en) * 2020-05-19 2023-06-13 深圳市新合生物医疗科技有限公司 Method for extracting immunotherapeutic new antigen based on tumor specific transcription region assembled by new transcripts and application
CN113160887B (en) * 2021-04-23 2022-06-14 哈尔滨工业大学 Screening method of tumor neoantigen fused with single cell TCR sequencing data
CN116825188B (en) * 2023-06-25 2024-04-09 北京泛生子基因科技有限公司 Method, device and computer readable storage medium for identifying tumor neoantigen at multiple groups of chemical layers based on high-throughput sequencing technology
CN119785886A (en) * 2024-12-31 2025-04-08 合肥综合性国家科学中心大健康研究院 Method and system for detecting new proteins based on high-throughput sequencing technology of full-length and short-fragment transcriptomes

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107406876A (en) * 2014-12-31 2017-11-28 夸登特健康公司 Detection and treatment of diseases exhibiting heterogeneity of diseased cells and systems and methods for communicating test results
CN107604062A (en) * 2017-09-01 2018-01-19 北京启辰生生物科技有限公司 A kind of more antigen detection methods for liver cancer immunity treatment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102017898B1 (en) * 2010-05-14 2019-09-04 더 제너럴 하스피톨 코포레이션 Compositions and methods of identifying tumor specific neoantigens
JP2014526032A (en) * 2011-06-07 2014-10-02 カリス ライフ サイエンシズ ルクセンブルク ホールディングス エス.アー.エール.エル. Circulating biomarkers for cancer
CN102507937A (en) * 2011-09-27 2012-06-20 中国人民解放军第四军医大学 Method for screening and identifying tumor/testis antigens based on monoclonal antibodies with spermatogenic cell specificities
CA2908434C (en) * 2013-04-07 2021-12-28 The Broad Institute, Inc. Compositions and methods for personalized neoplasia vaccines
WO2014180490A1 (en) * 2013-05-10 2014-11-13 Biontech Ag Predicting immunogenicity of t cell epitopes
CN104059966A (en) * 2014-05-20 2014-09-24 吴松 STAG2 gene mutant sequence and detection method thereof as well as use of STAG2 gene mutation in detecting bladder cancer
WO2016128060A1 (en) * 2015-02-12 2016-08-18 Biontech Ag Predicting t cell epitopes useful for vaccination
CN108351916A (en) * 2015-07-14 2018-07-31 个人基因组诊断公司 Neoantigen is analyzed

Also Published As

Publication number Publication date
CN108491689A (en) 2018-09-04

Similar Documents

Publication Publication Date Title
CN108491689B (en) Tumour neoantigen identification method based on transcript profile
CN108388773B (en) A kind of identification method of tumor neogenetic antigen
Eberlein et al. Hybridization is a recurrent evolutionary stimulus in wild yeast speciation
Marchant et al. The C-Fern (Ceratopteris richardii) genome: insights into plant genome evolution with the first partial homosporous fern genome assembly
Marburger et al. Interspecific introgression mediates adaptation to whole genome duplication
Kozak et al. Rampant genome-wide admixture across the Heliconius radiation
Tennessen et al. Evolutionary origins and dynamics of octoploid strawberry subgenomes revealed by dense targeted capture linkage maps
Kamneva et al. Evaluating allopolyploid origins in strawberries (Fragaria) using haplotypes generated from target capture sequencing
Salojärvi et al. The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars
Neafsey et al. Population genomic sequencing of Coccidioides fungi reveals recent hybridization and transposon control
Cochetel et al. Diploid chromosome-scale assembly of the Muscadinia rotundifolia genome supports chromosome fusion and disease resistance gene expansion during Vitis and Muscadinia divergence
Wang et al. Dispersed emergence and protracted domestication of polyploid wheat uncovered by mosaic ancestral haploblock inference
Izraelson et al. Comparative analysis of murine T‐cell receptor repertoires
Bao et al. A chromosomal-scale genome assembly of modern cultivated hybrid sugarcane provides insights into origination and evolution
CN110621785B (en) Method and device for haplotyping diploid genome based on three-generation capture sequencing
Nakagome et al. Estimating the ages of selection signals from different epochs in human history
Larson et al. Admixture may be extensive among hyperdominant Amazon rainforest tree species
Chen et al. Tracing Bai-Yue ancestry in aboriginal Li people on hainan island
CN111534602A (en) Method for analyzing human blood type and genotype based on high-throughput sequencing and application thereof
CN107247890A (en) A kind of gene data system for clinical diagnosis and prediction
Owens et al. Standing variation rather than recent adaptive introgression probably underlies differentiation of the texanus subspecies of Helianthus annuus
Hao et al. Convergent evolution of polyploid genomes from across the eukaryotic tree of life
Moutinho et al. Evolutionary history of two cryptic species of northern African jerboas
Su et al. Phased telomere-to-telomere reference genome and pangenome reveal an expansion of resistance genes during apple domestication
CN109712672A (en) Detect method, apparatus, storage medium and the processor of gene rearrangement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Transcriptome-based tumor neoantigen identification method

Effective date of registration: 20200618

Granted publication date: 20190709

Pledgee: Bank of Hangzhou Limited by Share Ltd. science and Technology Branch

Pledgor: HANGZHOU NEOANTIGEN BIOTECHNOLOGY Co.,Ltd.

Registration number: Y2020330000365

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230627

Granted publication date: 20190709

Pledgee: Bank of Hangzhou Limited by Share Ltd. science and Technology Branch

Pledgor: HANGZHOU NEOANTIGEN BIOTECHNOLOGY Co.,Ltd.

Registration number: Y2020330000365