CN110379458A - Pathogenic variant site determination method, device, computer equipment and storage medium
- Publication number
- CN110379458A (application number CN201910634602.5A)
- Authority
- CN
- China
- Prior art keywords
- variant
- pathogenicity
- evidence
- sample
- tested
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/50—Mutagenesis
Landscapes
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Chemical & Material Sciences (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Bioinformatics & Computational Biology (AREA)
- Analytical Chemistry (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention is applicable to the field of biotechnology and provides a method and device for determining pathogenic variant sites, a computer device, and a storage medium. The method includes: performing genetic variation annotation on the variant sites in a matrix to be annotated; converting the variant feature values annotated at the variant sites into pathogenicity evidence for the variants; interpreting the pathogenicity evidence to obtain a variant pathogenicity interpretation result for each variant site of each sample to be tested; generating new variant pathogenicity evidence according to the user's modification of the variant pathogenicity evidence in the interpretation results for the variant sites of the samples to be tested; and updating the new variant pathogenicity evidence into the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules, repeating the above steps until no new variant pathogenicity evidence is added, and then outputting a list of genetic variant sites for each sample to be tested. The determination method provided by the embodiments of the invention interprets genetic variant sites with high precision and recall.
Description
Technical Field
The invention belongs to the field of biotechnology, and in particular relates to a method and device for determining pathogenic variant sites, a computer device, and a storage medium.
Background
With advances in science and technology, genome projects have developed rapidly, and reanalysis of genetic data is now widely used to discover and detect the pathogenic variant sites of Mendelian genetic diseases.
Conventional genetic data reanalysis methods mainly apply strict, layer-by-layer filtering on features to obtain candidate pathogenic variants, and then combine these with published reports of variant pathogenicity to obtain high-confidence pathogenic variants. Such a method strictly filters the variant sites of a single sample on multiple features to reduce the number of variants that need pathogenicity evaluation and, combined with phenotype information, can yield fairly accurate pathogenic variants. However, such strict initial screening criteria also filter out potentially pathogenic variants, resulting in a high false-negative rate.
To address the high false-negative rate of the above method, an approach has been proposed that applies only a preliminary screen on sequencing quality and genotype quality, and then generates evidence for all variants from annotation information, population data, and the like. No potentially pathogenic variant is lost along the way: every variant that passes the preliminary screen enters the evidence conversion and pathogenicity interpretation steps. This approach still has limitations, however. For example, because its evidence may be insufficient and it relies too heavily on known pathogenic variants, its recall and precision are low.
Summary of the Invention
The embodiments of the invention provide a method for determining pathogenic variant sites, which aims to solve the problem that existing interpretation of genetic variant sites has low precision and recall.
The embodiments of the invention are implemented as a method for determining pathogenic variant sites, comprising the following steps:
Obtain the sequencing sequences of the samples to be tested, and align them with a reference genome sequence to generate a matrix to be annotated;
Use a gene annotation tool to perform genetic variation annotation on the variant sites in the matrix to be annotated, obtaining an annotated genetic variation matrix;
According to preset variant pathogenicity evidence conversion rules, convert the variant feature values annotated at the variant sites into pathogenicity evidence for the variants;
According to preset variant pathogenicity interpretation rules, analyze the pathogenicity evidence to obtain a variant pathogenicity interpretation result for the variant sites of each sample to be tested;
Generate new variant pathogenicity evidence according to the modification of variant pathogenicity evidence that the user inputs for those variant sites whose interpretation results do not conform to the preset pathogenicity evidence conversion rules;
Update the new variant pathogenicity evidence into the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules, and repeat the steps of evidence conversion, pathogenicity interpretation, and manual correction for the variant sites of the samples to be tested until no new variant pathogenicity evidence is added, then output a list of genetic variant sites for each sample to be tested.
The embodiments of the invention also provide a device for determining pathogenic variant sites, comprising:
an acquisition unit, configured to obtain the sequencing sequences of the samples to be tested and align them with a reference genome sequence to generate a matrix to be annotated;
a genetic variation annotation unit, configured to use a gene annotation tool to perform genetic variation annotation on the variant sites in the matrix to be annotated, obtaining an annotated genetic variation matrix;
a variant pathogenicity evidence conversion unit, configured to convert, according to preset variant pathogenicity evidence conversion rules, the variant feature values annotated at the variant sites into pathogenicity evidence for the variants;
a variant pathogenicity interpretation unit, configured to analyze the pathogenicity evidence according to preset variant pathogenicity interpretation rules, obtaining a variant pathogenicity interpretation result for the variant sites of each sample to be tested;
a variant pathogenicity evidence modification unit, configured to generate new variant pathogenicity evidence according to the modification of variant pathogenicity evidence that the user inputs for those variant sites whose interpretation results do not conform to the preset pathogenicity evidence conversion rules;
a genetic variant site list output unit, configured to update the new variant pathogenicity evidence into the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules, and to repeat the steps of evidence conversion, pathogenicity interpretation, and manual correction for the variant sites of the samples to be tested until no new variant pathogenicity evidence is added, then output a list of genetic variant sites for each sample to be tested.
The embodiments of the invention also provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the above method for determining pathogenic variant sites.
The embodiments of the invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above method for determining pathogenic variant sites.
In the method for determining pathogenic variant sites provided by the embodiments of the invention, new variant pathogenicity evidence is generated from the modifications that the user makes to the variant pathogenicity evidence of those variant sites whose interpretation results do not conform to the preset pathogenicity evidence conversion rules. The new evidence is updated into the variant pathogenicity evidence base of the preset conversion rules, and the steps of evidence conversion, pathogenicity interpretation, and manual correction are repeated for the variant sites of the samples to be tested. This enlarges the body of evidence in the known variant pathogenicity evidence base, so that more pathogenicity evidence associated with the variant sites can be found, more potentially pathogenic variant sites can be uncovered, fewer results are interpreted as "uncertain", and recall is improved.
Brief Description of the Drawings
Figure 1 is a flowchart of the method for determining pathogenic variant sites provided by Embodiment 1 of the invention;
Figure 2 shows part of the variant file of a sample to be tested, provided by an embodiment of the invention;
Figure 3 is a flowchart of the method for determining pathogenic variant sites provided by Embodiment 2 of the invention;
Figure 4 is a schematic diagram of the VCF-format list of the annotated genetic variation matrix provided by an embodiment of the invention;
Figure 5 is a flowchart of the method for determining pathogenic variant sites provided by Embodiment 3 of the invention;
Figure 6 is a structural block diagram of a device for determining pathogenic variant sites provided by an embodiment of the invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
The terms used in the embodiments of the invention are for the purpose of describing particular embodiments only and are not intended to limit the invention. The singular forms "a" and "the" used in the embodiments and the appended claims are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used here refers to and includes any and all possible combinations of one or more of the associated listed items.
In the method for determining pathogenic variant sites provided by the embodiments of the invention, new variant pathogenicity evidence is generated from the modifications that the user makes to the variant pathogenicity evidence of those variant sites whose interpretation results do not conform to the preset pathogenicity evidence conversion rules. The new evidence is updated into the variant pathogenicity evidence base of the preset conversion rules, and the steps of evidence conversion, pathogenicity interpretation, and manual correction are repeated for the variant sites of the samples to be tested. This enlarges the body of evidence in the known variant pathogenicity evidence base, so that more pathogenicity evidence associated with the variant sites can be found, more potentially pathogenic variant sites can be uncovered, fewer results are interpreted as "uncertain", and recall is improved.
Figure 1 is a flowchart of a method for determining pathogenic variant sites provided by Embodiment 1 of the invention. As shown in Figure 1, the method comprises the following steps:
In step S102, the sequencing sequences of the samples to be tested are obtained and aligned with a reference genome sequence to generate a matrix to be annotated.
In the embodiments of the invention, a sample to be tested is a sample containing nucleic acid. The type of nucleic acid is not particularly limited: it may be deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), preferably DNA. Those skilled in the art will understand that RNA can be reverse-transcribed into DNA experimentally for subsequent detection and analysis.
A reference genome sequence is a gene sequence fragment with a unique arrangement of bases; aligning against such a fragment allows positions on a chromosome to be located unambiguously. In the embodiments of the invention, a reference genome sequence of version hg38, hg19, or hg18 may be used.
In the embodiments of the invention, sequence alignment and variant identification tools may be used to align the sequencing sequences of the samples to be tested with the reference genome sequence and generate a variant list for each sample; the per-sample variant lists are then merged to generate the matrix to be annotated.
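The merging of per-sample variant lists into one matrix can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in the patent: the key `(chromosome, position, ref, alt)`, the function name `merge_variant_lists`, the sample data, and the choice to fill `0/0` for a sample that has no call at a site are all hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def merge_variant_lists(
    per_sample: Dict[str, Dict[Variant, str]],
) -> Tuple[List[str], List[Tuple[Variant, List[str]]]]:
    """Merge per-sample {variant: genotype} maps into one matrix.

    Rows are variant sites, columns are samples (in sorted name order).
    """
    samples = sorted(per_sample)
    all_variants = sorted({v for calls in per_sample.values() for v in calls})
    rows = []
    for variant in all_variants:
        # Assumption: a sample with no call at this site is treated as 0/0.
        genotypes = [per_sample[s].get(variant, "0/0") for s in samples]
        rows.append((variant, genotypes))
    return samples, rows


# Made-up input for illustration only.
calls = {
    "sample1": {("chr6", 33152074, "A", "AG"): "0/1"},
    "sample2": {
        ("chr6", 33152074, "A", "AG"): "1/1",
        ("chr7", 107300000, "C", "T"): "0/1",
    },
}
samples, matrix = merge_variant_lists(calls)
for variant, genotypes in matrix:
    print(variant, dict(zip(samples, genotypes)))
```

In practice a variant identification tool would normally perform joint genotyping rather than assume `0/0` for missing calls; the sketch only shows the row-per-site, column-per-sample layout.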
In step S104, a gene annotation tool is used to perform genetic variation annotation on the variant sites in the matrix to be annotated, obtaining an annotated genetic variation matrix.
In the embodiments of the invention, the gene annotation tool may be VEP (Variant Effect Predictor), ANNOVAR, or the like.
Variant sites typically include point mutations (SNVs), short insertions and deletions (indels), and the like.
Genetic variation annotation is performed on the variant sites in the matrix to be annotated. The annotation covers: ① the position of the variant in the genome, including the gene at the DNA level, the transcript at the RNA level, and the domain at the protein level; ② the type of variant, which indicates its effect on the translated protein, such as nonsense, missense, or synonymous mutation; ③ the frequency of the variant in the population, for example its frequency among the samples collected by databases such as the 1000 Genomes Project, gnomAD (Genome Aggregation Database), and ExAC (Exome Aggregation Consortium).
Suppose the matrix above shows the alignment of the base sequence at a certain gene locus (variant site) in sample 1 and sample 2 against the reference genome sequence. Taking sample 1 as an example: in the reference genome sequence, the codon for the first amino acid is ACC (threonine) and the codon for the second amino acid is GGC (glycine), whereas in sample 1 the codon for the first amino acid is AT(U)C (isoleucine) and the codon for the second amino acid is CGT(U) (arginine). That is, the base substitutions at this locus in sample 1 change the encoded amino acids, so the variant site can be annotated as a "missense mutation".
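The codon comparison in this example can be illustrated with a short Python sketch. It is only an illustration: the codon table is deliberately partial (the four codons from the example plus two extra standard codons added to exercise the other branches), and the function name `classify_codon_change` is hypothetical. A real annotation tool works from transcript models rather than isolated codons.

```python
# Partial standard codon table (DNA alphabet): the four codons from the
# example, plus ACT (threonine) and TGA (stop) added for illustration.
CODON_TO_AA = {
    "ACC": "Thr",
    "ATC": "Ile",
    "GGC": "Gly",
    "CGT": "Arg",
    "ACT": "Thr",
    "TGA": "Stop",
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify the effect of replacing ref_codon with alt_codon."""
    if ref_codon == alt_codon:
        return "no change"
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"  # different codon, same amino acid
    if alt_aa == "Stop":
        return "nonsense"  # premature stop codon
    return "missense"  # different amino acid


print(classify_codon_change("ACC", "ATC"))  # threonine -> isoleucine
print(classify_codon_change("GGC", "CGT"))  # glycine -> arginine
```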
In step S106, according to preset variant pathogenicity evidence conversion rules, the variant feature values annotated at the variant sites are converted into pathogenicity evidence for the variants.
In the embodiments of the invention, the preset variant pathogenicity evidence conversion rules follow the evidence rules in the sequence variant interpretation guidelines issued by the American College of Medical Genetics and Genomics (ACMG). The variant feature values annotated at the variant sites are converted into pathogenicity evidence according to the ACMG evidence conversion rules.
Take the annotated feature value "gnomAD allele frequency" as an example. The gnomAD database covers the normal population. When the annotated feature value, that is, the frequency of the variant at a given site in this population, exceeds a certain threshold (for example 0.1%), the probability that the variant is pathogenic is considered very small, and the benign evidence BA1 is marked at that variant site.
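The frequency rule above amounts to a simple threshold test. The following Python sketch assumes the 0.1% example threshold from the text; the function name `ba1_evidence` and the handling of variants that are absent from the database are assumptions made for the illustration.

```python
from typing import Optional


def ba1_evidence(gnomad_af: Optional[float], threshold: float = 0.001) -> Optional[str]:
    """Return "BA1" when the population allele frequency exceeds the threshold.

    threshold=0.001 corresponds to the 0.1% example value in the text.
    """
    if gnomad_af is None:
        # Assumption: a variant absent from the database yields no frequency evidence.
        return None
    return "BA1" if gnomad_af > threshold else None


print(ba1_evidence(0.02))    # common variant
print(ba1_evidence(0.0005))  # rare variant
```

Keeping the threshold as a parameter makes it easy to apply a different cutoff per disease or per rule set.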
Evidence can also be converted from information about the sample itself. Taking PP4 as an example, one of the evidence rules in the ACMG guidelines is: if the sample has the phenotype of deafness with enlarged vestibular aqueduct and has Pendred syndrome (also known as familial goiter with congenital deafness syndrome), then a variant of that sample within the SLC26A4 gene may be pathogenic, and the pathogenic evidence PP4 is marked at the SLC26A4 gene.
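A sample-based rule of this kind can be represented as a lookup from gene to required sample features. The Python sketch below is hypothetical throughout: the table `PP4_RULES`, the feature strings, and the function `pp4_genes` are illustrative names, and the single entry encodes only the SLC26A4 example from the text.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical rule table: gene -> sample features that must all be present.
PP4_RULES: Dict[str, FrozenSet[str]] = {
    "SLC26A4": frozenset(
        {"deafness with enlarged vestibular aqueduct", "Pendred syndrome"}
    ),
}


def pp4_genes(sample_features: Set[str]) -> List[str]:
    """Return the genes whose variants would be marked PP4 for this sample."""
    return sorted(
        gene
        for gene, required in PP4_RULES.items()
        if required <= sample_features  # all required features are present
    )


print(pp4_genes({"deafness with enlarged vestibular aqueduct", "Pendred syndrome"}))
print(pp4_genes({"Pendred syndrome"}))
```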
In step S108, according to preset variant pathogenicity interpretation rules, the pathogenicity evidence is interpreted to obtain a variant pathogenicity interpretation result for the variant sites of each sample to be tested.
After step S106, a variant file is obtained for each sample to be tested (individual). The variant file is a matrix in which each row represents the variant at one variant site and each column holds the variant pathogenicity evidence for the variant sites. The evidence shown for each variant site in the variant file is interpreted to obtain the variant pathogenicity interpretation result for the variant sites of each sample to be tested.
Specifically, evidence is divided into pathogenic evidence and benign evidence, and its weight falls into four levels: very strong, strong, moderate, and supporting. Pathogenicity is interpreted through rules that combine evidence strengths, and the result falls into one of five classes: pathogenic, likely pathogenic, benign, likely benign, and uncertain. For example, one very strong pathogenic evidence item combined with at least one strong pathogenic evidence item is interpreted as pathogenic, and two or more strong benign evidence items are interpreted as benign.
Taking these two rules as an example: if a variant is marked with the evidence PVS1 (very strong) and PS1 (strong), the variant site is interpreted as a pathogenic variant; if a variant is marked with the evidence BS1 (strong) and BS2 (strong), the variant site is interpreted as a benign variant.
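The two combination rules named above can be expressed as counts over evidence strengths. The Python sketch below implements only those two rules and returns "uncertain" for everything else, so it is not a complete implementation of the ACMG combining criteria; the `EVIDENCE` table and the function name `interpret` are assumptions for the illustration.

```python
from collections import Counter
from typing import Iterable

# Evidence code -> (direction, strength); only the codes named in the text.
EVIDENCE = {
    "PVS1": ("pathogenic", "very strong"),
    "PS1": ("pathogenic", "strong"),
    "BS1": ("benign", "strong"),
    "BS2": ("benign", "strong"),
}


def interpret(codes: Iterable[str]) -> str:
    """Apply the two example rules; anything else is reported as uncertain."""
    counts = Counter(EVIDENCE[code] for code in codes)
    # Rule 1: one very strong plus at least one strong pathogenic item.
    if counts[("pathogenic", "very strong")] >= 1 and counts[("pathogenic", "strong")] >= 1:
        return "pathogenic"
    # Rule 2: two or more strong benign items.
    if counts[("benign", "strong")] >= 2:
        return "benign"
    return "uncertain"


print(interpret(["PVS1", "PS1"]))
print(interpret(["BS1", "BS2"]))
print(interpret(["PS1"]))
```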
Figure 2 shows the pathogenicity evidence for some of the variant sites of a sample to be tested. Taking the first row of Figure 2 as an example: it shows an insertion of G at genomic position 33152074 on chromosome 6. The pathogenicity evidence marked for this variant is PVS1=0; PM2=3; PP5=0; PM3=1, where each number indicates the strength of the evidence, decreasing from 0 to 3. According to the rules for combining criteria to classify genetic variants in the ACMG guidelines, the interpretation result is pathogenic. In the same way, the variant pathogenicity evidence shown for every variant site in every variant file is interpreted, yielding the variant pathogenicity interpretation results for all variant sites of each sample to be tested.
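An evidence field such as the one quoted from Figure 2 can be parsed into codes and strength levels. The Python sketch below makes two assumptions that the text does not confirm: that the field is a semicolon-separated list of `CODE=level` pairs, and that levels 0 to 3 correspond to the four strength grades in decreasing order. The names `STRENGTH` and `parse_evidence` are hypothetical.

```python
from typing import Dict

# Assumed mapping: the text says only that strength decreases from 0 to 3.
STRENGTH = {0: "very strong", 1: "strong", 2: "moderate", 3: "supporting"}


def parse_evidence(field: str) -> Dict[str, str]:
    """Parse 'CODE=level;CODE=level' into {code: strength name}."""
    parsed = {}
    for item in field.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        code, _, level = item.partition("=")
        parsed[code.strip()] = STRENGTH[int(level)]
    return parsed


print(parse_evidence("PVS1=0; PM2=3; PP5=0; PM3=1"))
```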
In step S110, new variant pathogenicity evidence is generated according to the modification of variant pathogenicity evidence that the user inputs for those variant sites whose interpretation results do not conform to the preset pathogenicity evidence conversion rules.
In the embodiments of the invention, the user corrects the variant pathogenicity interpretation results for the variant sites of the samples to be tested. The correction mainly targets variant sites labeled pathogenic in the interpretation results. Specifically, the user confirms and corrects the interpretation of each such variant site using the sample's phenotype data, the literature consulted, and so on, to obtain high-confidence pathogenic variants. Finally, the corrected pathogenic variants are merged with the pathogenic variants in public databases and used to generate new variant pathogenicity evidence.
In the embodiments of the invention, the user is a professional such as a genetic counselor. The user reviews the variant pathogenicity interpretation results for the variant sites of each sample to be tested, picks out the pathogenicity evidence of those variant sites that do not conform to the preset pathogenicity evidence conversion rules, and modifies and saves it. New variant pathogenicity evidence is generated from the modified content.
In step S112, it is determined whether new variant pathogenicity evidence has been added. If so, step S114 is executed, and the new variant pathogenicity evidence is updated into the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules; if not, step S116 is executed, and a list of genetic variant sites is output for each sample to be tested.
In the embodiments of the invention, by repeatedly checking whether new variant pathogenicity evidence has been added, the variant pathogenicity evidence base can be continuously updated and improved. This enlarges the body of evidence in the known variant pathogenicity evidence base, so that more pathogenicity evidence associated with the variant sites can be found, more potentially pathogenic variant sites can be uncovered, fewer results are interpreted as "uncertain", and recall and detection precision are improved.
In the embodiments of the invention, updating the new variant pathogenicity evidence into the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules means, specifically, adding the new variant pathogenicity evidence obtained in step S110 to the variant pathogenicity evidence base used with the ACMG rules, thereby improving the precision and recall of variant interpretation.
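The loop formed by steps S106 to S116 can be sketched as a fixed-point iteration: convert, interpret, review, and stop once the review adds nothing new. The Python code below is a toy illustration under stated assumptions. The function `run_until_stable`, the callback signatures, the `max_rounds` guard, and the stand-in `convert`, `interpret`, and `review` functions are all hypothetical; in the method described here the review step is performed by a human expert.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Evidence = Tuple[str, str]  # (variant id, evidence code)


def run_until_stable(
    variants: Iterable[str],
    evidence_base: Set[Evidence],
    convert: Callable[[str, Set[Evidence]], Set[str]],
    interpret: Callable[[Set[str]], str],
    review: Callable[[Dict[str, str]], Set[Evidence]],
    max_rounds: int = 100,
) -> Dict[str, str]:
    """Repeat convert -> interpret -> review until no new evidence appears."""
    variants = list(variants)
    results: Dict[str, str] = {}
    for _ in range(max_rounds):  # guard against a review that never settles
        results = {v: interpret(convert(v, evidence_base)) for v in variants}
        new_items = review(results) - evidence_base  # S112: anything new?
        if not new_items:
            break  # S116: output the final list
        evidence_base |= new_items  # S114: update the evidence base in place
    return results


# Toy stand-ins for the three steps.
def convert(variant: str, base: Set[Evidence]) -> Set[str]:
    return {code for v, code in base if v == variant}


def interpret(codes: Set[str]) -> str:
    return "pathogenic" if codes else "uncertain"


def review(results: Dict[str, str]) -> Set[Evidence]:
    # Stands in for an expert who knows that v2 should carry PS1.
    return {("v2", "PS1")}


base = {("v1", "PVS1")}
final = run_until_stable(["v1", "v2"], base, convert, interpret, review)
print(final)
```

In this toy run the first round leaves `v2` as "uncertain", the review adds one evidence item, and the second round finds nothing new and stops.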
In the embodiments of the invention, each row of the output list of genetic variant sites for a sample to be tested shows the variant at one variant site together with its pathogenicity interpretation result (that is, its pathogenicity rating): pathogenic, likely pathogenic, benign, likely benign, or uncertain.
In the method for determining pathogenic variant sites provided by the embodiments of the invention, new variant pathogenicity evidence is generated from the modifications that the user makes to the variant pathogenicity evidence of those variant sites whose interpretation results do not conform to the preset pathogenicity evidence conversion rules. The new evidence is updated into the variant pathogenicity evidence base of the preset conversion rules, and the steps of evidence conversion, pathogenicity interpretation, and manual correction are repeated for the variant sites of the samples to be tested. This enlarges the body of evidence in the known variant pathogenicity evidence base, so that more pathogenicity evidence associated with the variant sites can be found, more potentially pathogenic variant sites can be uncovered, fewer results are interpreted as "uncertain", and recall is improved.
In one embodiment of the present invention, step S102 above is specifically: obtain the sequencing reads of the samples to be tested and align them to the reference genome sequence to obtain a DNA variant list for each sample; then identify and merge the DNA variant lists of the samples to generate the matrix to be annotated.
In this embodiment, the alignment tool may be BWA, ANNOVAR annotation software, or the like; the variant identification tool may be GATK (The Genome Analysis Toolkit), FreeBayes (a Bayesian genetic variant detector), or the like. The matrix uses the multi-sample VCF file format: each column defines one sample to be tested and each row defines one variant site. If sample 1 carries a heterozygous variant at variant site 1, the entry is marked 0/1; if sample 1 carries a homozygous variant at variant site 1, it is marked 1/1; if sample 1 carries a hemizygous variant at variant site 1, it is marked ./1; if sample 1 does not carry the variant at variant site 1, it is marked 0/0; if the status of sample 1 at variant site 1 cannot be determined (for example, because the sequencing data in that region are of poor quality and variant identification is not possible), it is marked ./.; and if none of the samples carries a given variant, that variant site is not listed in the matrix.
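The genotype encoding above can be illustrated with a small sketch that merges per-sample calls into a site-by-sample matrix. The call labels (`"het"`, `"hom"`, `"hemi"`, `"ref"`), the function names, and the choice to treat a site missing from a sample's call set as not carried are assumptions made for this example; the patent only specifies the resulting codes.

```python
from typing import Dict, List, Optional

# Mapping from a hypothetical call label to the genotype code used in the matrix.
GENOTYPE_CODES = {"het": "0/1", "hom": "1/1", "hemi": "./1", "ref": "0/0"}
CARRIED = {"het", "hom", "hemi"}


def genotype_code(call: Optional[str]) -> str:
    """Return the matrix code for one call; None means the site is uncallable."""
    if call is None:
        return "./."
    return GENOTYPE_CODES[call]


def build_matrix(
    samples: List[str],
    calls: Dict[str, Dict[str, Optional[str]]],
) -> Dict[str, List[str]]:
    """Merge per-sample calls (calls[sample][site]) into rows keyed by site.

    Assumption: a site absent from a sample's call set counts as "ref".
    Sites that no sample carries are left out of the matrix.
    """
    sites = sorted({site for per_sample in calls.values() for site in per_sample})
    matrix: Dict[str, List[str]] = {}
    for site in sites:
        row = [calls[s].get(site, "ref") for s in samples]
        if not any(c in CARRIED for c in row):
            continue  # no sample carries this variant, so it is not listed
        matrix[site] = [genotype_code(c) for c in row]
    return matrix
```

Each row of the returned dictionary corresponds to one variant site and each position in the row to one sample, in the order given by `samples`.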
By way of example, the variant lists of sample 1 and sample 2 are shown in Table 1 below.
Table 1
FIG. 3 is a flowchart of a method for determining pathogenic variant sites provided by Embodiment 2 of the present invention. As shown in FIG. 3, this embodiment is essentially the same as Embodiment 1 above, differing only in that step S110 specifically comprises step S202.
In step S202, new variant pathogenicity evidence that conforms to the preset pathogenicity evidence conversion rules is generated according to the evidence modification operations entered by the user for those variant sites whose pathogenicity interpretation results do not conform to the rules.
In an exemplary embodiment of the present invention, for deafness, the variant site 7:107302082:A,T is shown as not satisfying the PVS1 rule, yet it is labeled with very strong pathogenic evidence. In this case, a professional combines his or her own experience and knowledge with the pathogenicity data recorded for variant site 7:107302082:A,T in existing databases, and adds reliable ACMG variant pathogenicity evidence for that site.
As one embodiment of the present invention, step S104 of Embodiment 1 above is specifically: using a gene annotation tool, annotate each variant site in the matrix to be annotated with its position in the genome, the variant type, and the frequency of the variant in the population, obtaining an annotated genetic variant matrix in VCF file format (as shown in FIG. 4, which shows only part of the file).
In this embodiment, the variant annotation step adds annotation information to the VCF file of genetic variants. A VCF file consists of a header section whose lines begin with "#" and a body section whose lines do not. Each line of the body represents one variant site and contains at least eight columns, CHROM, POS, ID, REF, ALT, QUAL, FILTER, and INFO, which give the chromosome, genomic position, variant site identifier, reference base, altered base, variant quality, filter status, and detailed variant information, respectively.
The genetic variant annotations are stored in the INFO column in the following format:
"gnomAD_AF_raw=0.0001601;gnomAD_AN=108404;gnomAD_AN_AFR=9550". This indicates that the frequency of the variant site in the gnomAD database before genotype filtering is 0.0001601, that it was detected in 108404 samples in the gnomAD database, and that 9550 of those samples are recorded as African American.
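As an illustration of the layout described above, the following sketch splits a VCF body line into its eight fixed columns and turns the INFO string into a dictionary. The function names are hypothetical, and the sketch handles only the simple `key=value` and flag forms; a production pipeline would more likely rely on an established VCF parser.

```python
from typing import Dict, Union

VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

InfoValue = Union[int, float, str, bool]


def parse_info(info: str) -> Dict[str, InfoValue]:
    """Parse a semicolon-separated INFO string into a dictionary."""
    fields: Dict[str, InfoValue] = {}
    for item in info.split(";"):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            fields[key] = True  # a key without "=" is treated as a flag
            continue
        # Try integer first, then float, otherwise keep the raw string.
        try:
            fields[key] = int(value)
        except ValueError:
            try:
                fields[key] = float(value)
            except ValueError:
                fields[key] = value
    return fields


def parse_body_line(line: str) -> Dict[str, object]:
    """Split one tab-separated body line into the eight fixed columns."""
    if line.startswith("#"):
        raise ValueError("header lines are not variant records")
    parts = line.rstrip("\n").split("\t")
    record: Dict[str, object] = dict(zip(VCF_COLUMNS, parts[:8]))
    record["POS"] = int(parts[1])
    record["INFO"] = parse_info(parts[7])
    return record
```

Applied to the INFO string quoted above, `parse_info` yields the three gnomAD keys with numeric values, which later steps can compare against frequency thresholds.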
FIG. 5 is a flowchart of a method for determining pathogenic variant sites provided by Embodiment 3 of the present invention. As shown in FIG. 5, this embodiment is essentially the same as Embodiment 1 above, differing only in that step S106 specifically comprises step S502.
In step S502, associations are established between the variant sites in the matrix to be annotated and known pathogenic variant sites retrieved from public databases, and the variant feature values annotated at the variant sites are converted into variant pathogenicity evidence.
In this embodiment, associations are established between the variant sites in the matrix to be annotated and known pathogenic variant sites retrieved from public databases (for example, ClinVar, HGMD, OMIM, and DVD). The associations include causing the same amino acid change as a known pathogenic variant site, compound heterozygosity, linkage, and the like. They are used to judge how likely a variant site is to be pathogenic and are converted into variant pathogenicity evidence of corresponding strength.
By way of example, suppose the annotation of a known pathogenic variant V1 from a public database is "NP_065756.1:p.Arg40Cys", meaning that the variant changes the arginine at position 40 of protein NP_065756.1 to cysteine. If a detected variant V2 has a DNA change different from that of V1 but carries the same annotation "NP_065756.1:p.Arg40Cys", then V2 and V1 cause the same amino acid change; that is, V2 stands in a same-amino-acid-change relationship with V1. On this basis it can be inferred that the DNA change of V2 changes the arginine at position 40 of protein NP_065756.1 to cysteine, and this inference is converted into variant pathogenicity evidence of corresponding strength.
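The same-amino-acid-change relationship can be sketched as a lookup keyed by the protein-level annotation. The record layout (`id`, `dna`, `protein`) and the function name are assumptions made for illustration, and the DNA-level strings in any usage are opaque placeholders rather than real variants.

```python
from typing import Dict, List, Tuple

Variant = Dict[str, str]  # hypothetical record with keys "id", "dna", "protein"


def same_amino_acid_change(
    detected: List[Variant],
    known_pathogenic: List[Variant],
) -> List[Tuple[str, str]]:
    """Return (detected_id, known_id) pairs whose protein annotation is
    identical while the underlying DNA change differs."""
    # Index the known pathogenic variants by protein-level annotation.
    by_protein: Dict[str, List[Variant]] = {}
    for known in known_pathogenic:
        by_protein.setdefault(known["protein"], []).append(known)

    pairs: List[Tuple[str, str]] = []
    for variant in detected:
        for known in by_protein.get(variant["protein"], []):
            if variant["dna"] != known["dna"]:
                pairs.append((variant["id"], known["id"]))
    return pairs
```

A detected variant that is identical to the known variant at the DNA level is deliberately excluded here, since the text describes the relationship for a V2 whose DNA change differs from that of V1.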
As another example, if two rare heterozygous variants V3 and V4 are identified in a gene G1 of the sample to be tested, and V3 and V4 are inherited from the father and the mother respectively, then V3 and V4 form a compound heterozygous relationship. According to the pathogenic mechanism of monogenic hereditary diseases, one homozygous pathogenic variant, or two pathogenic variants in a compound heterozygous relationship, causes disease. Taking PM3 evidence as an example, when one of the heterozygous variants V3 and V4 is a pathogenic variant, the other variant is labeled with PM3 evidence.
As a further example, if two variants V5 and V6 are identified in a gene G2 of the sample to be tested, and V5 and V6 are both inherited from the father (or both from the mother), then V5 and V6 are linked to each other. Taking BP2 evidence as an example, when one of the two linked variants V5 and V6 in the sample is a pathogenic variant, the other variant is labeled with BP2 evidence.
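The two phase-based examples above can be sketched together. The record fields (`id`, `gene`, `origin`, `pathogenic`) and the function name are hypothetical, the input is assumed to be already restricted to the relevant variants of a single sample with known parental origin, and only the PM3 and BP2 cases described in the text are covered.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, List, Set


def phase_evidence(variants: List[dict]) -> Dict[str, Set[str]]:
    """Label the partner of a pathogenic variant in the same gene.

    Different parental origin (compound heterozygous) -> PM3.
    Same parental origin (linked)                     -> BP2.
    """
    evidence: Dict[str, Set[str]] = defaultdict(set)
    # Ordered pairs: `a` is the pathogenic variant, `b` receives the label.
    for a, b in permutations(variants, 2):
        if a["gene"] != b["gene"] or not a["pathogenic"]:
            continue
        code = "PM3" if a["origin"] != b["origin"] else "BP2"
        evidence[b["id"]].add(code)
    return dict(evidence)
```

Because ordered pairs are used, two pathogenic variants in the same gene would each label the other; whether that is desirable is a design decision the source does not address.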
All of the optional technical solutions above may be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
The following are embodiments of the apparatus for determining pathogenic variant sites disclosed by the present invention, which can be used to carry out the method embodiments disclosed by the present invention. For details not disclosed in the apparatus embodiments, refer to the method embodiments disclosed by the present invention.
FIG. 6 is a block diagram of an apparatus for determining pathogenic variant sites provided by an embodiment of the present invention. For ease of description, only the parts relevant to the present invention are shown. As shown in FIG. 6, the apparatus comprises: an acquisition unit 610, a genetic variant annotation unit 620, a variant pathogenicity evidence conversion unit 630, a variant pathogenicity interpretation unit 640, a variant pathogenicity evidence modification unit 650, and a genetic variant site list output unit 660.
The acquisition unit 610 is configured to obtain the sequencing reads of the sample to be tested and align them to the reference genome sequence to generate the matrix to be annotated.
The genetic variant annotation unit 620 is configured to use a gene annotation tool to annotate the variant sites in the matrix to be annotated, obtaining an annotated genetic variant matrix.
The variant pathogenicity evidence conversion unit 630 is configured to convert the variant feature values annotated at the variant sites into variant pathogenicity evidence according to the preset variant pathogenicity evidence conversion rules.
The variant pathogenicity interpretation unit 640 is configured to analyze the variant pathogenicity evidence according to the preset variant pathogenicity interpretation rules, obtaining a pathogenicity interpretation result for the variant sites of each sample to be tested.
The variant pathogenicity evidence modification unit 650 is configured to generate new variant pathogenicity evidence according to the evidence modification operations entered by the user for those variant sites of the sample to be tested whose pathogenicity interpretation results do not conform to the preset pathogenicity evidence conversion rules.
The genetic variant site list output unit 660 is configured to add the new variant pathogenicity evidence to the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules, to repeat the steps of evidence conversion, pathogenicity interpretation, and manual correction for the variant sites of the sample to be tested, and, once no further new variant pathogenicity evidence is added, to output the genetic variant site list of each sample to be tested.
The apparatus for determining pathogenic variant sites provided by this embodiment generates new variant pathogenicity evidence from the evidence modification operations that a user enters for those variant sites whose pathogenicity interpretation results do not conform to the preset pathogenicity evidence conversion rules. The new evidence is added to the variant pathogenicity evidence base of the preset conversion rules, and the steps of evidence conversion, pathogenicity interpretation, and manual correction are repeated for the variant sites of the sample to be tested. This enlarges the body of evidence in the known variant pathogenicity evidence base, so that more evidence associated with the variant sites can be found, more potentially pathogenic variant sites can be uncovered, fewer interpretations come back as "uncertain", and recall improves.
Preferably, the acquisition unit 610 is configured to obtain the sequencing reads of the samples to be tested, align them to the reference genome sequence to obtain a DNA variant list for each sample, and identify and merge the DNA variant lists of the samples to generate the matrix to be annotated.
Preferably, the variant pathogenicity evidence modification unit 650 is configured to generate new variant pathogenicity evidence that conforms to the preset pathogenicity evidence conversion rules, according to the evidence modification operations entered by the user for those variant sites of the sample to be tested whose pathogenicity interpretation results do not conform to the rules.
Preferably, the genetic variant annotation unit 620 is configured to use a gene annotation tool to annotate each variant site in the matrix to be annotated with its position in the genome, the variant type, and the frequency of the variant in the population, obtaining an annotated genetic variant matrix.
Preferably, the variant pathogenicity evidence conversion unit 630 is configured to establish associations between the variant sites in the matrix to be annotated and known pathogenic variant sites retrieved from public databases, and to convert the variant feature values annotated at the variant sites into variant pathogenicity evidence.
As for the apparatus in the above embodiments, the specific manner in which each unit performs its operations has been described in detail in the method embodiments and is not repeated here.
In one embodiment, a computer device is provided. The computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the following steps when executing the computer program:
obtaining the sequencing reads of the sample to be tested and aligning them to the reference genome sequence to generate a matrix to be annotated;
using a gene annotation tool to annotate the variant sites in the matrix to be annotated, obtaining an annotated genetic variant matrix;
converting the variant feature values annotated at the variant sites into variant pathogenicity evidence according to preset variant pathogenicity evidence conversion rules;
analyzing the variant pathogenicity evidence according to preset variant pathogenicity interpretation rules, obtaining a pathogenicity interpretation result for the variant sites of each sample to be tested;
generating new variant pathogenicity evidence according to the evidence modification operations entered by the user for those variant sites of the sample to be tested whose pathogenicity interpretation results do not conform to the preset pathogenicity evidence conversion rules;
adding the new variant pathogenicity evidence to the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules, repeating the steps of evidence conversion, pathogenicity interpretation, and manual correction for the variant sites of the sample to be tested, and, once no further new variant pathogenicity evidence is added, outputting the genetic variant site list of each sample to be tested.
In one embodiment, a computer-readable storage medium is provided. A computer program is stored on the computer-readable storage medium and, when executed by a processor, causes the processor to perform the following steps:
obtaining the sequencing reads of the sample to be tested and aligning them to the reference genome sequence to obtain a matrix to be annotated;
using a gene annotation tool to annotate the variant sites in the matrix to be annotated, generating an annotated genetic variant matrix;
converting the variant feature values annotated at the variant sites into variant pathogenicity evidence according to preset variant pathogenicity evidence conversion rules;
analyzing the variant pathogenicity evidence according to preset variant pathogenicity interpretation rules, obtaining a pathogenicity interpretation result for the variant sites of each sample to be tested;
generating new variant pathogenicity evidence according to the evidence modification operations entered by the user for those variant sites of the sample to be tested whose pathogenicity interpretation results do not conform to the preset pathogenicity evidence conversion rules;
adding the new variant pathogenicity evidence to the variant pathogenicity evidence base of the preset pathogenicity evidence conversion rules, repeating the steps of evidence conversion, pathogenicity interpretation, and manual correction for the variant sites of the sample to be tested, and, once no further new variant pathogenicity evidence is added, outputting the genetic variant site list of each sample to be tested.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined in any manner. For brevity, not every possible combination of these technical features is described; however, any combination of them that involves no contradiction should be regarded as falling within the scope of this specification.
To further illustrate the technical effect of the present invention, a specific test example is described below.
The gene sequencing data of 11475 deaf patients were analyzed for pathogenic variant sites using both the method provided by the embodiment of the present invention and an existing method for analyzing pathogenic variant sites, and the genetic diagnosis rate was determined after three rounds of cyclic analysis. Genetic diagnosis means identifying, through genetic testing and genetic analysis, the gene variant that causes a patient's disease. The diagnosis rate is defined as diagnosis rate = number of diagnosable patients / total number of patients × 100%. The test results are shown in Table 2 below:
As the test results in Table 2 above show, the existing analysis method identified 2257 diagnosable patients, a diagnosis rate of 19.67%, whereas the method provided by the embodiment of the present invention identified 2654 diagnosable patients, a diagnosis rate of 23.12%. Thus, for determining the pathogenic variant sites of Mendelian genetic diseases, in this three-cycle test the method provided by the embodiment raised the diagnosis rate by nearly one fifth relative to the existing analysis method; that is, it improved the recall and accuracy of screening and diagnosis of pathogenic variant sites.
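The diagnosis-rate formula can be checked against the reported counts with a few lines of arithmetic. The function names are illustrative; the counts 2257, 2654, and 11475 are taken from the text.

```python
def diagnosis_rate(diagnosed: int, total: int) -> float:
    """Diagnosis rate in percent: diagnosable patients / total patients x 100."""
    return diagnosed / total * 100


def relative_improvement(baseline: int, improved: int) -> float:
    """Relative gain in diagnosable patients over the baseline, as a fraction."""
    return (improved - baseline) / baseline


TOTAL_PATIENTS = 11475
existing_rate = diagnosis_rate(2257, TOTAL_PATIENTS)   # existing analysis method
proposed_rate = diagnosis_rate(2654, TOTAL_PATIENTS)   # method of the embodiment
gain = relative_improvement(2257, 2654)

print(f"existing: {existing_rate:.2f}%  proposed: {proposed_rate:.2f}%  gain: {gain:.1%}")
```

With these counts, 2257 patients correspond to roughly 19.67% and 2654 patients to roughly 23.1%, and the relative gain of about 17.6% is consistent with the "nearly one fifth" stated in the text.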
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910634602.5A CN110379458A (en) | 2019-07-15 | 2019-07-15 | Pathogenicity variation site determination method, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110379458A true CN110379458A (en) | 2019-10-25 |
Family
ID=68253142
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910634602.5A Pending CN110379458A (en) | 2019-07-15 | 2019-07-15 | Pathogenicity variation site determination method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110379458A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110957006A (en) * | 2019-12-14 | 2020-04-03 | 杭州联川基因诊断技术有限公司 | Interpretation method of BRCA1/2 gene variation |
CN111798926A (en) * | 2020-06-30 | 2020-10-20 | 广州金域医学检验中心有限公司 | Pathogenic gene locus database and establishment method thereof |
CN112489727A (en) * | 2020-12-24 | 2021-03-12 | 厦门基源医疗科技有限公司 | Method and system for rapidly acquiring pathogenic site of rare disease |
CN112735520A (en) * | 2021-02-03 | 2021-04-30 | 深圳裕康医学检验实验室 | Interpretation method, system and storage medium for tumor individualized immunotherapy gene detection result |
CN112795635A (en) * | 2020-12-31 | 2021-05-14 | 南昌瑞因康生物科技有限公司 | A detection method, device and storage medium for Marfan syndrome and related genes |
CN112908412A (en) * | 2021-02-10 | 2021-06-04 | 北京贝瑞和康生物技术有限公司 | Methods, devices and media for compounding the applicability of heterozygous variant pathogenic evidence |
CN113808662A (en) * | 2021-09-01 | 2021-12-17 | 基诺莱(重庆)生物技术有限公司 | Neural network-based prediction method and system for pathogenicity of gene variation sites |
CN114429785A (en) * | 2022-04-01 | 2022-05-03 | 普瑞基准生物医药(苏州)有限公司 | Automatic classification method and device for genetic variation and electronic equipment |
CN114496080A (en) * | 2022-01-17 | 2022-05-13 | 中国人民解放军总医院第一医学中心 | Deafness pathogenicity gene screening method and device, storage medium and server |
CN114613435A (en) * | 2022-03-15 | 2022-06-10 | 中国人民解放军陆军军医大学第一附属医院 | Method for measuring small number of cell translation groups and application thereof |
CN115171781A (en) * | 2022-07-13 | 2022-10-11 | 广州市金圻睿生物科技有限责任公司 | Method, system, device and medium for identifying whether tumor mutation sites are noise |
WO2024092681A1 (en) * | 2022-11-04 | 2024-05-10 | 深圳华大基因股份有限公司 | Method and apparatus for determining loss-of-function evidence of pathogenicity |
CN119049569A (en) * | 2024-11-04 | 2024-11-29 | 广州女娲生命科技有限公司 | Genetic database updating method, classification method and system based on sequencing data |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106156538A (en) * | 2016-06-29 | 2016-11-23 | 天津诺禾医学检验所有限公司 | The annotation method of a kind of full-length genome variation data and annotation system |
CN106599613A (en) * | 2016-12-15 | 2017-04-26 | 博奥生物集团有限公司 | Method for judging genetic tumor variation site classification |
AU2017100960A4 (en) * | 2017-07-13 | 2017-08-10 | Macau University Of Science And Technology | Method of identifying a gene associated with a disease or pathological condition of the disease |
CN107247890A (en) * | 2017-06-30 | 2017-10-13 | 张巍 | A kind of gene data system for clinical diagnosis and prediction |
CN109182483A (en) * | 2018-09-04 | 2019-01-11 | 天津诺禾致源生物信息科技有限公司 | The method and device that genetic mutation is interpreted |
CN109616155A (en) * | 2018-11-19 | 2019-04-12 | 江苏科技大学 | A data processing system and method for pathogenicity classification of genetic variants in coding regions |
CN109920481A (en) * | 2019-01-31 | 2019-06-21 | 北京诺禾致源科技股份有限公司 | The genetic mutation unscrambling data library BRCA1/2 and its construction method |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110957006A (en) * | 2019-12-14 | 2020-04-03 | 杭州联川基因诊断技术有限公司 | Interpretation method of BRCA1/2 gene variation |
CN110957006B (en) * | 2019-12-14 | 2023-08-11 | 杭州联川基因诊断技术有限公司 | Interpretation method of BRCA1/2 gene variation |
CN111798926A (en) * | 2020-06-30 | 2020-10-20 | 广州金域医学检验中心有限公司 | Pathogenic gene locus database and establishment method thereof |
CN111798926B (en) * | 2020-06-30 | 2023-09-29 | 广州金域医学检验中心有限公司 | Pathogenic gene locus database and establishment method thereof |
CN112489727B (en) * | 2020-12-24 | 2023-06-23 | 厦门基源医疗科技有限公司 | Method and system for rapidly acquiring rare disease pathogenic sites |
CN112489727A (en) * | 2020-12-24 | 2021-03-12 | 厦门基源医疗科技有限公司 | Method and system for rapidly acquiring pathogenic site of rare disease |
CN112795635A (en) * | 2020-12-31 | 2021-05-14 | 南昌瑞因康生物科技有限公司 | A detection method, device and storage medium for Marfan syndrome and related genes |
CN112735520B (en) * | 2021-02-03 | 2021-07-20 | 深圳裕康医学检验实验室 | Interpretation method, system and storage medium for tumor individualized immunotherapy gene detection result |
CN112735520A (en) * | 2021-02-03 | 2021-04-30 | 深圳裕康医学检验实验室 | Interpretation method, system and storage medium for tumor individualized immunotherapy gene detection result |
CN112908412A (en) * | 2021-02-10 | 2021-06-04 | 北京贝瑞和康生物技术有限公司 | Methods, devices and media for compounding the applicability of heterozygous variant pathogenic evidence |
CN113808662A (en) * | 2021-09-01 | 2021-12-17 | 基诺莱(重庆)生物技术有限公司 | Neural network-based prediction method and system for pathogenicity of gene variation sites |
CN114496080A (en) * | 2022-01-17 | 2022-05-13 | 中国人民解放军总医院第一医学中心 | Deafness pathogenicity gene screening method and device, storage medium and server |
CN114613435A (en) * | 2022-03-15 | 2022-06-10 | 中国人民解放军陆军军医大学第一附属医院 | Method for measuring small number of cell translation groups and application thereof |
CN114429785A (en) * | 2022-04-01 | 2022-05-03 | 普瑞基准生物医药(苏州)有限公司 | Automatic classification method and device for genetic variation and electronic equipment |
CN115171781A (en) * | 2022-07-13 | 2022-10-11 | 广州市金圻睿生物科技有限责任公司 | Method, system, device and medium for identifying whether tumor mutation sites are noise |
WO2024092681A1 (en) * | 2022-11-04 | 2024-05-10 | 深圳华大基因股份有限公司 | Method and apparatus for determining loss-of-function evidence of pathogenicity |
CN119049569A (en) * | 2024-11-04 | 2024-11-29 | 广州女娲生命科技有限公司 | Genetic database updating method, classification method and system based on sequencing data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110379458A (en) | Pathogenicity variation site determination method, device, computer equipment and storage medium | |
Sedlazeck et al. | Accurate detection of complex structural variations using single-molecule sequencing | |
Tanudisastro et al. | Sequencing and characterizing short tandem repeats in the human genome | |
Aguet et al. | Molecular quantitative trait loci | |
Pugh et al. | VisCap: inference and visualization of germ-line copy-number variants from targeted clinical sequencing data | |
JP7637139B2 (en) | Systems and methods for automating rna expression calling in cancer prediction pipelines | |
Zook et al. | A robust benchmark for germline structural variant detection | |
Lee et al. | MOSAIK: a hash-based algorithm for accurate next-generation sequencing short-read mapping | |
Magi et al. | Characterization of MinION nanopore data for resequencing analyses | |
Li et al. | Transcriptome sequencing of a large human family identifies the impact of rare noncoding variants | |
US20240105282A1 (en) | Methods for detecting biallelic loss of function in next-generation sequencing genomic data | |
CN107849612A (en) | Compare and variant sequencing analysis pipeline | |
JP7634626B2 (en) | Method for detecting genetic variations in highly homologous sequences by independent alignment and pairing of sequence reads | |
Smith et al. | Benchmarking splice variant prediction algorithms using massively parallel splicing assays | |
US12154662B2 (en) | Method of analyzing nucleic acid sequence of patient sample, presentation method, presentation apparatus, and presentation program of analysis result, and system for analyzing nucleic acid sequence of patient sample | |
Tae et al. | Discretized Gaussian mixture for genotyping of microsatellite loci containing homopolymer runs | |
Legault et al. | Comparison of sequencing based CNV discovery methods using monozygotic twin quartets | |
JP2018536914A (en) | Systems and methods for genetic medicine testing | |
CN117497047B (en) | Method, equipment and medium for screening tumor gene markers based on exon sequencing | |
Lebo et al. | Bioinformatics in clinical genomic sequencing | |
Söylev et al. | CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data | |
Behera et al. | Fixing reference errors efficiently improves sequencing results | |
Lin et al. | MapCaller–An integrated and efficient tool for short-read mapping and variant calling using high-throughput sequenced data | |
US20240355415A1 (en) | Methods and devices for non-invasive prenatal testing | |
Zook et al. | Integrating sequencing datasets to form highly confident SNP and indel genotype calls for a whole human genome |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191025 |