CN118737291A - Method, system and device for realizing normalization of detection signal of gene analyzer - Google Patents
- Publication number
- CN118737291A (application CN202410764366.XA)
- Authority
- CN
- China
- Prior art keywords
- internal standard
- capillary
- peak
- matching
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/10—Signal processing, e.g. from mass spectrometry [MS] or from PCR
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
- G06F18/15—Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/02—Preprocessing
- G06F2218/04—Denoising
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Analytical Chemistry (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Probability & Statistics with Applications (AREA)
- Molecular Biology (AREA)
- Signal Processing (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Epidemiology (AREA)
- Public Health (AREA)
- Software Systems (AREA)
- Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)
Abstract
Description
Technical Field
The present invention relates to the field of biochemical detection technology, and in particular to a method, system and device for normalizing the detection signals of a gene analyzer.
Background Art
Gene analyzers based on capillary electrophoresis can be used for Sanger sequencing and gene fragment analysis. Capillary electrophoresis uses a quartz capillary as the separation channel, a high-voltage DC electric field as the driving force, and a porous gel filling as the supporting medium; temperature control maintains the pore size distribution of the gel and the conformation of the DNA. When the size of a DNA molecule is comparable to the gel pore size, its mobility depends on its size: short fragments meet less resistance and migrate through the capillary faster, while long fragments meet more resistance and migrate more slowly. Because DNA molecules are negatively charged, once a high DC voltage is applied across the two ends of the capillary, fluorescently labeled DNA enters the capillary at its cathode end by electrokinetic injection and migrates toward the anode. DNA molecules of different lengths pass the detection window one after another. When a DNA molecule passes the optical detection window, a laser excites the fluorescent group on the DNA, and the resulting fluorescence is collected by a spectrometer, which converts the optical signal into an electrical signal and then into a digital signal. After the raw digital signal is processed and analyzed by analysis software, the base sequence or relative fragment length of the DNA molecule is obtained.
The signal collected by a gene analyzer is essentially an optical signal. It comprises the fluorescence signal from the fluorescent groups on the DNA and the Raman signal from the electrophoresis gel. The Raman signal is generally used to characterize the state of the optical system, while the DNA fluorescence signal is affected by many factors, such as loading time, loading voltage, loading position, sample concentration, the optical system, and the state of the capillary itself. Consequently, detection with a gene analyzer has always been subject to signal differences between capillaries and between successive runs on the same capillary, which bias the detection results. Resolving these capillary-to-capillary and run-to-run signal differences is of great significance for improving the quality of gene analysis results.
Summary of the Invention
In view of the above problems, the present invention is proposed to provide a method, system and device for normalizing the detection signals of a gene analyzer that overcome the above problems or at least partially solve them.
One aspect of the present invention provides a method for normalizing the detection signals of a gene analyzer, the method comprising:
performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer, and calculating the single-channel internal standard average value of the internal standard channel of each capillary from the successfully matched internal standard matching sequence;
dividing a preset internal standard value by the single-channel internal standard average value of the internal standard channel of each capillary, and taking the result as the normalization coefficient of the corresponding capillary;
applying a normalization correction to the spectral signals of the other color channels in each capillary according to that capillary's normalization coefficient, thereby normalizing the detection signal.
Optionally, before performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer, the method further comprises:
performing electrophoresis on multiple capillaries loaded only with internal standard at a specified concentration, collecting the spectral signal of the internal standard channel of each capillary and performing internal standard matching, and calculating, for each capillary, the average peak height of the internal standard peaks in the internal standard matching sequence that meets the matching requirements, to obtain the single-channel internal standard average value of the internal standard channel of each capillary;
calculating the coefficient of variation of the single-channel internal standard average values across the capillaries and, if the coefficient of variation is less than a preset variation threshold, taking the mean of the single-channel internal standard average values, or a preset multiple of that mean, as the internal standard value.
Optionally, after taking the result as the normalization coefficient of the corresponding capillary, the method further comprises:
determining whether the normalization coefficient of each capillary is compliant: the normalization coefficient of a capillary is compliant when it lies within a preset normalization coefficient range, and non-compliant otherwise;
correcting the internal standard value if, in the current electrophoresis run, the proportion of capillaries with non-compliant normalization coefficients exceeds a preset first ratio threshold, or if, in at least three consecutive electrophoresis runs, that proportion exceeds a preset second ratio threshold;
wherein the second ratio threshold is smaller than the first ratio threshold.
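The compliance check and the two recalibration triggers described above can be sketched as follows. This is only an illustration: the function names, the inclusive range bounds, and the example threshold values in the usage note are assumptions, not values given in the text.

```python
def is_compliant(coefficient, low, high):
    """A coefficient is compliant when it lies inside the preset range.

    Treating the bounds as inclusive is an assumption.
    """
    return low <= coefficient <= high


def needs_standard_value_correction(history, low, high,
                                    first_ratio, second_ratio, consecutive=3):
    """Decide whether the internal standard value should be corrected.

    history: list of runs (oldest first); each run is a list holding one
             normalization coefficient per capillary.
    """
    if second_ratio >= first_ratio:
        raise ValueError("second ratio threshold must be smaller than the first")
    if not history:
        return False

    def non_compliant_ratio(run):
        bad = sum(not is_compliant(c, low, high) for c in run)
        return bad / len(run)

    # Trigger 1: the current run alone exceeds the first ratio threshold.
    if non_compliant_ratio(history[-1]) > first_ratio:
        return True

    # Trigger 2: the last `consecutive` runs all exceed the second threshold.
    recent = history[-consecutive:]
    return (len(recent) == consecutive
            and all(non_compliant_ratio(r) > second_ratio for r in recent))
```

For example, with a hypothetical range of 0.5 to 2.0 and ratio thresholds of 0.5 and 0.2, a single run with five of eight capillaries out of range triggers a correction, as do three consecutive runs with two of eight out of range.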
Optionally, applying a normalization correction to the spectral signals of the other color channels in each capillary according to that capillary's normalization coefficient comprises:
if the normalization coefficient of the capillary is compliant, multiplying the spectral signals of the other color channels in that capillary by the capillary's normalization coefficient;
if the normalization coefficient of the capillary is non-compliant, multiplying the spectral signals of the other color channels in that capillary by the bound of the normalization coefficient range that lies closest to the capillary's normalization coefficient.
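The two cases above amount to clamping the coefficient to the permitted range before scaling. A minimal sketch, in which the function names and the example range are hypothetical:

```python
def effective_coefficient(coefficient, low, high):
    """Return the coefficient itself if compliant, else the nearest range bound."""
    return min(max(coefficient, low), high)


def correct_trace(trace, coefficient, low, high):
    """Scale one color-channel trace by the (possibly clamped) coefficient."""
    k = effective_coefficient(coefficient, low, high)
    return [value * k for value in trace]
```

With a hypothetical range of 0.5 to 2.0, a coefficient of 3.0 is applied as 2.0 and a coefficient of 0.2 is applied as 0.5, while 1.25 is applied unchanged.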
Optionally, before performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer, the method further comprises:
monitoring the Raman signal during electrophoresis, and obtaining the signal value, peak balance data and/or uniformity data of the Raman signal;
when any of the signal value, peak balance data and/or uniformity data of the Raman signal fails to meet the corresponding preset optical parameter standard, performing an optical alignment operation until the signal value, peak balance data and/or uniformity data all meet the corresponding optical parameter standards.
Optionally, performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer comprises:
S10. pre-processing the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer, so as to remove the background noise of the spectral signal and smooth it;
S11. performing peak identification on the pre-processed spectral signal to extract the candidate peak sequence contained in the current spectral signal;
S12. in sampling order, matching the candidate peaks of the candidate peak sequence against the standard peak sequence of the standard spectral signal corresponding to the internal standard selected for the electrophoresis, so as to find the candidate peak combinations in the candidate peak sequence that satisfy a preset internal standard matching condition, each candidate peak combination containing at least 3 candidate peaks; the internal standard matching condition requires that the absolute value of the difference between a first distance and a second distance be less than a preset distance error threshold, and that max{peak heights of the candidate peaks already selected in the combination, peak height of the candidate peak currently being matched} / min{peak heights of the candidate peaks already selected in the combination, peak height of the candidate peak currently being matched} < a preset relative height threshold, where the first distance is the distance between the apexes of adjacent candidate peaks in the candidate peak sequence, and the second distance is the distance between the apexes of the adjacent standard peaks in the standard peak sequence that occupy the same positions in sampling order as the candidate peaks used for the first distance;
S13. using each candidate peak combination as the matching basis, matching the remaining candidate peaks of the candidate peak sequence in turn, to obtain the matching result corresponding to each candidate peak combination;
S14. counting the number of candidate peaks contained in the matching result corresponding to each candidate peak combination;
S15. when the maximum number of candidate peaks contained in the matching results of the candidate peak combinations equals the number of standard peaks in the standard peak sequence, determining that the internal standard matching has succeeded, and taking the matching result corresponding to that maximum as the optimal internal standard matching result.
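The matching condition of S12 and the success criterion of S15 can be illustrated with a deliberately simplified sketch. It is not the procedure of S12 to S13 as claimed: instead of building seed combinations of at least three peaks, it greedily extends a match from every possible starting peak and keeps the longest one. The function names, the `(position, height)` tuple layout, and the reading of the height condition as spanning all already-selected peaks are assumptions; peak heights are assumed to be positive.

```python
def satisfies_conditions(selected, candidate, standard_gap,
                         dist_tol, rel_height_max):
    """Check the distance and relative-height conditions for one candidate."""
    last_pos, _ = selected[-1]
    pos, height = candidate
    heights = [h for _, h in selected] + [height]
    # |first distance - second distance| < distance error threshold
    gap_ok = abs((pos - last_pos) - standard_gap) < dist_tol
    # max height / min height < relative height threshold
    height_ok = max(heights) / min(heights) < rel_height_max
    return gap_ok and height_ok


def extend_match(candidates, standard_positions, start,
                 dist_tol, rel_height_max):
    """Greedily grow a match that begins at candidates[start]."""
    selected = [candidates[start]]
    index = start
    for k in range(1, len(standard_positions)):
        gap = standard_positions[k] - standard_positions[k - 1]
        nxt = next((j for j in range(index + 1, len(candidates))
                    if satisfies_conditions(selected, candidates[j], gap,
                                            dist_tol, rel_height_max)), None)
        if nxt is None:
            break
        selected.append(candidates[nxt])
        index = nxt
    return selected


def match_internal_standard(candidates, standard_positions,
                            dist_tol, rel_height_max):
    """Return the best match if it covers every standard peak, else None."""
    best = max((extend_match(candidates, standard_positions, s,
                             dist_tol, rel_height_max)
                for s in range(len(candidates))), key=len, default=[])
    return best if len(best) == len(standard_positions) else None
```

Candidates are expected in sampling order. A return value of `None` corresponds to the case in which no matching result contains as many peaks as the standard peak sequence.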
Optionally, after taking the matching result corresponding to the maximum as the optimal internal standard matching result, the method comprises:
performing curve fitting between the standard peak sequence and the peak sequence of the optimal internal standard matching result, and taking the standard deviation, mean residual and maximum residual of the fit as the feature data of the current optimal internal standard matching result;
inputting the feature data into a preset internal standard matching scoring model to obtain a matching score for the current optimal internal standard matching result.
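The three fit features can be computed as in the sketch below. The text does not specify the form of the fitted curve or the scoring model, so a straight-line least-squares fit is assumed here purely for illustration, residuals are taken as absolute values for the mean and maximum (the signed mean of least-squares residuals is zero by construction), and the scoring model itself is left out. All names are hypothetical.

```python
from statistics import mean, pstdev


def fit_features(standard_positions, matched_positions):
    """Fit matched = slope * standard + intercept and summarize the residuals."""
    xs, ys = standard_positions, matched_positions
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    abs_residuals = [abs(r) for r in residuals]
    return {
        "std": pstdev(residuals),              # spread of the residuals
        "mean_residual": mean(abs_residuals),  # mean absolute residual
        "max_residual": max(abs_residuals),    # largest absolute residual
    }
```

A perfectly linear correspondence yields features close to zero; larger values indicate a poorer match and would presumably lower the score assigned by the scoring model.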
Optionally, the method further comprises:
if the maximum number of candidate peaks contained in the matching results of the candidate peak combinations is not equal to the number of standard peaks in the standard peak sequence, updating the distance error threshold according to a preset first threshold adjustment rule and returning to step S12, until the updated distance error threshold exceeds the maximum value of the distance error threshold;
when the updated distance error threshold exceeds the maximum value of the distance error threshold, updating the relative height threshold according to a preset second threshold adjustment rule, resetting the distance error threshold to its initial value, and returning to step S12, until the updated relative height threshold exceeds the maximum value of the relative height threshold;
when the updated relative height threshold exceeds the maximum value of the relative height threshold, determining that the internal standard matching has failed.
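The retry logic above is a nested relaxation loop: the distance error threshold is relaxed first, and only when it is exhausted is the relative height threshold relaxed and the distance error threshold reset. A minimal sketch follows, assuming additive step sizes for both adjustment rules (the text leaves the rules unspecified) and a hypothetical `try_match` callable that returns `None` when matching fails.

```python
def match_with_relaxation(try_match,
                          dist_init, dist_step, dist_max,
                          height_init, height_step, height_max):
    """Retry matching with progressively looser thresholds.

    try_match(dist_tol, height_tol) -> match result, or None on failure.
    Returns (result, dist_tol, height_tol) on success, or None if both
    thresholds pass their maxima without a successful match.
    """
    height_tol = height_init
    while height_tol <= height_max:
        dist_tol = dist_init          # reset distance threshold to initial value
        while dist_tol <= dist_max:
            result = try_match(dist_tol, height_tol)
            if result is not None:
                return result, dist_tol, height_tol
            dist_tol += dist_step     # first threshold adjustment rule (assumed additive)
        height_tol += height_step     # second threshold adjustment rule (assumed additive)
    return None                       # internal standard matching failed
```

Ordering the loops this way tries every permitted distance tolerance at the strictest height tolerance before loosening the height condition at all.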
In a second aspect, the present invention further provides a system for normalizing the detection signals of a gene analyzer, the system comprising functional modules for implementing the above method for normalizing the detection signals of a gene analyzer.
In a third aspect, the present invention further provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method for normalizing the detection signals of a gene analyzer.
The method, system and device for normalizing the detection signals of a gene analyzer provided by the embodiments of the present invention calculate the normalization coefficient of each capillary from the preset internal standard value and the single-channel internal standard average value of that capillary's internal standard channel during electrophoresis, and apply a normalization correction to the spectral signals of the other color channels in each capillary according to its normalization coefficient. The peak heights of the sample to be tested are thus adjusted according to the peak heights of the internal standard, which reduces the signal differences between capillaries and between successive runs on the same capillary and effectively improves the quality of gene analysis results.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the present invention are more readily apparent, specific embodiments of the present invention are set out below.
Brief Description of the Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art from the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a flow chart of a method for normalizing the detection signals of a gene analyzer provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the principle of the distance similarity criterion in the internal standard matching condition provided by an embodiment of the present invention;
FIG. 3 is a flow chart of an internal standard matching method for the detection spectrum of a gene analyzer provided by an embodiment of the present invention;
FIG. 4 is a structural block diagram of a system for normalizing the detection signals of a gene analyzer provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural forms. It should be further understood that the term "comprising" used in the specification of the present invention refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by those of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general-purpose dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art and, unless specifically defined, will not be interpreted in an idealized or overly formal sense.
Detection with a gene analyzer has always been subject to signal differences between capillaries and between successive runs on the same capillary, which bias the detection results. The factors causing signal deviation include loading time, loading voltage, loading position, sample concentration and the optical system. On a typical instrument, the loading time can be controlled by software, the loading voltage can be calibrated with a test fixture, the loading position can be calibrated via the loading stage, and the optical system can be calibrated by optical alignment. Beyond these factors, however, there are deviations caused by the state of the capillaries themselves, such as deviations due to the inner diameters of the capillaries in the array, resistance deviations due to poor contact between a capillary and its electrode, resistance deviations due to bubbles randomly entering a capillary, and resistance deviations due to uneven temperatures across capillaries. These factors cannot be avoided entirely. Therefore, to prevent signal deviation in the detection signal from degrading the quality of gene analysis results, the present invention proposes a method for normalizing the detection signals of a gene analyzer.
FIG. 1 schematically shows a flow chart of a method for normalizing the detection signals of a gene analyzer according to an embodiment of the present invention. Referring to FIG. 1, the method specifically comprises the following steps:
S1. Perform internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer, and calculate the single-channel internal standard average value of the internal standard channel of each capillary from the successfully matched internal standard matching sequence.
In an embodiment of the present invention, before performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis on the gene analyzer, the method further comprises a step of establishing the internal standard value, specifically: performing electrophoresis on multiple capillaries loaded only with internal standard at a specified concentration, collecting the spectral signal of the internal standard channel of each capillary and performing internal standard matching, and calculating, for each capillary, the average peak height of the internal standard peaks in the internal standard matching sequence that meets the matching requirements, to obtain the single-channel internal standard average value of the internal standard channel of each capillary; then calculating the coefficient of variation of the single-channel internal standard average values across the capillaries and, if the coefficient of variation is less than a preset variation threshold, for example 30%, taking the mean of the single-channel internal standard average values, or a preset multiple of that mean, as the internal standard value. If the coefficient of variation is greater than or equal to the preset variation threshold, the step of establishing the internal standard value is executed again. The preset multiple may be selected as 00.1-10 times.
Specifically, the coefficient of variation CV is calculated as CV = standard deviation / mean, that is, $\mathrm{CV} = \sigma / \bar{x}$ with $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$;
where n is the number of capillary channels, $x_i$ is the single-channel internal standard average value of the i-th capillary channel, and $\sigma$ is the standard deviation of the $x_i$.
Specifically, when establishing the internal standard value, electrophoresis is performed with internal standard at a given concentration, and the steps used for internal standard matching (data pre-processing, peak identification, internal standard matching, scoring, and so on) are carried out to select qualified internal standard matching sequence data for peak height normalization. If the data are not qualified, that is, the internal standard was not matched successfully, the data from that run are not used. In one optional embodiment, for a single-channel instrument, the average peak height of all internal standard peaks in the single channel is recorded as the single-channel internal standard average value. For example, if the internal standard has 10 peaks, the average of the 10 peak heights is calculated to obtain the single-channel internal standard average value, which is used as the internal standard value. In another optional embodiment, for an 8-channel instrument, if the data of all 8 channels are qualified, the single-channel internal standard average value of each of the 8 channels is calculated, and the mean of these 8 single-channel internal standard average values is taken as the internal standard value.
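The procedure for establishing the internal standard value can be sketched as follows. The function name and defaults are hypothetical; the 30% threshold is the example given in the text, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text only says "standard deviation".

```python
from statistics import mean, stdev


def internal_standard_value(channel_means, cv_threshold=0.30, multiple=1.0):
    """Return the internal standard value, or None if the run must be repeated.

    channel_means: single-channel internal standard averages, one per capillary.
    cv_threshold:  preset variation threshold (0.30 is the example in the text).
    multiple:      preset multiple applied to the mean (1.0 is a placeholder).
    """
    if len(channel_means) == 1:
        # Single-channel instrument: the single-channel average is used directly.
        return multiple * channel_means[0]
    mu = mean(channel_means)
    # Sample standard deviation; the choice of denominator is an assumption.
    cv = stdev(channel_means) / mu
    if cv < cv_threshold:
        return multiple * mu
    return None  # CV too large: repeat the establishment step
```

Returning `None` mirrors the rule that the establishment step is repeated when the coefficient of variation is at or above the threshold.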
其中,基因分析仪电泳的DNA样本通常分为三个组分,作为溶剂的甲酰胺或纯水,作为标尺的内标物以及标记有荧光基团的待测样本。其中,一般内标物的浓度是固定的,建立内标标准值时内标物浓度与检测时待测样本中的内标物浓度相同,待测样本的浓度是未知的。因此根据内标物的峰高来调整待测样本的峰高在一定范围内可以改善信号的偏差。Among them, the DNA sample electrophoresed by the gene analyzer is usually divided into three components, formamide or pure water as a solvent, an internal standard as a ruler, and a sample to be tested labeled with a fluorescent group. Among them, the concentration of the internal standard is generally fixed. When establishing the internal standard standard value, the concentration of the internal standard is the same as the concentration of the internal standard in the sample to be tested during detection, and the concentration of the sample to be tested is unknown. Therefore, adjusting the peak height of the sample to be tested according to the peak height of the internal standard can improve the signal deviation within a certain range.
S2. Divide the preset internal standard value by the single-channel internal standard average of the internal standard channel of each capillary, and take the result as the normalization coefficient of the corresponding capillary.
S3. Using the normalization coefficient of each capillary, apply normalization correction to the spectral signals of the other color channels in that capillary, thereby normalizing the detection signal.
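Steps S2 and S3 amount to one division and one multiplication per capillary. The sketch below illustrates this under stated assumptions: the function names, the color labels, and the dictionary layout of the traces are hypothetical and are not taken from the source.

```python
def normalization_coefficients(standard_value, channel_averages):
    """S2: one coefficient per capillary = standard value / that capillary's average."""
    return [standard_value / avg for avg in channel_averages]

def normalize_capillary(signals_by_color, coefficient):
    """S3: scale every non-internal-standard color trace of one capillary."""
    return {
        color: [value * coefficient for value in trace]
        for color, trace in signals_by_color.items()
    }

# Hypothetical values: a weak capillary (average 500) and a strong one (average 2000).
coefficients = normalization_coefficients(1000.0, [500.0, 2000.0])
corrected = normalize_capillary({"blue": [10.0, 20.0], "green": [5.0]}, coefficients[0])
```

A capillary whose internal standard reads low receives a coefficient above 1, so its sample peaks are scaled up, and the reverse holds for a capillary that reads high.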
The method for normalizing the detection signal of a gene analyzer provided by the embodiment of the present invention calculates the normalization coefficient of each capillary from the preset internal standard value and the single-channel internal standard average of the internal standard channel of that capillary during electrophoresis, and then uses the coefficient to apply normalization correction to the spectral signals of the other color channels in that capillary. The peak height of the sample to be tested is thus adjusted according to the peak height of the internal standard, which reduces signal differences between capillaries as well as differences between repeated runs on the same capillary, and effectively improves the quality of the gene analysis result.
In an embodiment of the present invention, after the calculation result is taken as the normalization coefficient of the corresponding capillary, the method further includes: judging whether the normalization coefficient of each capillary is compliant, where the coefficient is judged compliant when it lies within the preset normalization coefficient value range and non-compliant otherwise; and correcting the internal standard value if the proportion of capillaries with non-compliant normalization coefficients in the current electrophoresis detection data exceeds a preset first ratio threshold, or if that proportion exceeds a preset second ratio threshold in at least three consecutive sets of electrophoresis detection data; wherein the second ratio threshold is less than the first ratio threshold.
Specifically, the normalization coefficient of a capillary should lie within a certain value range; if it falls outside the range, the coefficient of that capillary is judged non-compliant. Optionally, the value range is 0.3 to 3. If, during detection, the proportion of capillaries with non-compliant normalization coefficients in the current electrophoresis detection data exceeds the preset first ratio threshold, or that proportion exceeds the preset second ratio threshold in at least three consecutive sets of electrophoresis detection data, the internal standard value is judged to need correction. The first ratio threshold may be 50%, and the second ratio threshold may be 30%.
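The two trigger conditions can be sketched as follows, using the optional values given above (range 0.3 to 3, thresholds 50% and 30%). The function names are hypothetical, treating the range endpoints as compliant is an assumption, and "at least three consecutive" is read here as the three most recent runs all exceeding the second threshold.

```python
def noncompliant_ratio(coefficients, low=0.3, high=3.0):
    """Fraction of capillaries whose coefficient lies outside [low, high]."""
    bad = sum(1 for c in coefficients if not (low <= c <= high))
    return bad / len(coefficients)

def needs_correction(ratio_history, first=0.5, second=0.3, runs=3):
    """ratio_history holds the non-compliant ratio of each run, most recent last."""
    if not ratio_history:
        return False
    if ratio_history[-1] > first:
        return True  # a single run exceeds the first (larger) threshold
    recent = ratio_history[-runs:]
    # otherwise require `runs` consecutive runs above the second (smaller) threshold
    return len(recent) == runs and all(r > second for r in recent)
```

Because the second threshold is smaller than the first, a persistent moderate drift triggers correction even when no single run is bad enough on its own.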
Further, correcting the internal standard value includes calculating the coefficient of variation of the capillary normalization coefficients. If this coefficient of variation is less than a preset second variation threshold, for example 30%, the mean of the current single-channel internal standard averages of the internal standard channels of all capillaries, or a preset multiple of that mean, is taken as the new internal standard value. If the coefficient of variation is greater than or equal to the preset second variation threshold, a prompt is issued to re-establish the internal standard value by electrophoresis. The preset multiple may be 00.1-10 times. The coefficient of variation of the capillary normalization coefficients is the ratio of the standard deviation of the normalization coefficients to the mean of the normalization coefficients of all capillaries; it is calculated in the same way as the coefficient of variation of the single-channel internal standard averages of the capillary internal standard channels.
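The correction rule can be sketched as below. The function names are hypothetical, and the population standard deviation is used as an assumption because the text does not say whether the population or sample form is meant. Returning `None` stands in for the prompt to re-establish the value by electrophoresis.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation divided by mean (population form, an assumption)."""
    return pstdev(values) / mean(values)

def corrected_standard_value(coefficients, channel_averages,
                             cv_threshold=0.30, multiple=1.0):
    """Return the new internal standard value, or None if re-establishment is needed."""
    if coefficient_of_variation(coefficients) < cv_threshold:
        return multiple * mean(channel_averages)
    return None
```

A low coefficient of variation means the capillaries drifted together, so rescaling the reference is reasonable; a high one suggests the run itself is unreliable.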
In an embodiment of the present invention, applying normalization correction to the spectral signals of the other color channels in each capillary according to its normalization coefficient includes: if the normalization coefficient of the capillary is compliant, multiplying the spectral signals of the other color channels in that capillary by the coefficient; if the normalization coefficient is non-compliant, multiplying the spectral signals of the other color channels in that capillary by the bound of the normalization coefficient value range that lies nearest to the coefficient.
Specifically, during testing, normalization correction of the spectral signal can be enabled or disabled according to test needs. If it is enabled, then after each electrophoresis run the data passes through the steps used for internal standard matching (data pre-processing, peak identification, internal standard matching, scoring, and so on) to select qualified internal standard matching sequence data for peak normalization. If the run is unqualified, its data is not used in the calculation. The correction itself divides the standard value by the single-channel internal standard average of each capillary to obtain that capillary's normalization coefficient, and then multiplies the spectral signals of the other color channels in that capillary by the coefficient, which normalizes the spectral signal. The normalization coefficient should lie within a certain value range. If a capillary's coefficient falls outside the range but the conditions for correcting the internal standard value have not yet been met, the bound of the value range nearest to that coefficient can be used as the normalization coefficient instead and multiplied with the spectral signals of the other color channels in that capillary.
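Using the nearest bound of the value range is equivalent to clamping the coefficient. A minimal sketch, assuming the optional range of 0.3 to 3 mentioned earlier and a hypothetical function name:

```python
def effective_coefficient(coefficient, low=0.3, high=3.0):
    """Return the coefficient itself if compliant, otherwise the nearest range bound."""
    return min(max(coefficient, low), high)
```

For example, a computed coefficient of 5.0 would be applied as 3.0, and one of 0.1 as 0.3.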
In an embodiment of the present invention, before internal standard matching is performed on the spectral signal of the internal standard channel of each capillary collected during electrophoresis, the method further includes: monitoring the Raman signal during electrophoresis to obtain its signal value, peak balance data and/or uniformity data; and, when any of these parameters fails to meet the corresponding preset optical parameter standard, performing an optical alignment operation until the signal value, peak balance data and/or uniformity data all meet the corresponding optical parameter standards.
Specifically, automatic optical alignment is performed each time the capillary consumable is replaced, to reduce the effect on the optical system of deviations in capillary manufacture and installation. Even when the capillary consumable has not been replaced, lens contamination, capillary contamination, vibration, and similar factors can affect the optical system. In that case the Raman signal during electrophoresis can be monitored to determine whether the optics have been affected. Raman monitoring comprises a spectral signal value check for each channel, a peak balance (tilt) check, and a uniformity check. The optics are considered acceptable if the spectral signal value of every channel is greater than the standard value (100-10000, chosen according to the test scenario), the peak balance data is less than the balance standard value (100-10000), and the uniformity data is greater than the uniformity standard value (0.1-1). Here uniformity = min/max, where min and max are the minimum and maximum spectral signal values over all channels, and the peak balance data is the absolute value of the difference between the average of the left baseline (for example 560-590) and the average of the right baseline (for example 630-660). If any of these values is found unqualified during an electrophoresis run, the optical alignment operation is performed automatically.
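The three Raman checks can be sketched as below. The function names are hypothetical, and the threshold values in the example call are illustrative picks from inside the stated ranges, not values given by the source.

```python
def uniformity(channel_values):
    """min/max over the spectral signal values of all channels."""
    return min(channel_values) / max(channel_values)

def peak_balance(left_baseline_mean, right_baseline_mean):
    """Absolute difference between the left and right baseline averages."""
    return abs(left_baseline_mean - right_baseline_mean)

def optics_acceptable(channel_values, balance,
                      signal_standard, balance_standard, uniformity_standard):
    """True only if all three checks pass; otherwise alignment should be run."""
    return (all(v > signal_standard for v in channel_values)
            and balance < balance_standard
            and uniformity(channel_values) > uniformity_standard)

# Illustrative thresholds: 1000 (signal), 500 (balance), 0.7 (uniformity).
ok = optics_acceptable([1200, 1500, 1300, 1400], peak_balance(310, 250),
                       signal_standard=1000, balance_standard=500,
                       uniformity_standard=0.7)
```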
Further, the optical alignment operation is carried out as follows:
Start the automatic optical alignment procedure; the alignment motor runs to the optocoupler zero position.
The alignment motor moves toward the optocoupler limit position at a speed set by the lower-level controller, optionally 1000 um/s. While the alignment motor is moving, the spectral data of the spectrometer is read in real time and displayed on the interface.
Taking four channels as an example, the detection values of the four channels are checked; if all of them are greater than the product of the preset spatial correction standard value and the spatial correction optical alignment threshold, the slow scan of the alignment motor begins. The spatial correction standard value may be 500, and the spatial correction optical alignment threshold may be 50%. The detection value is calculated as the maximum spectral intensity between the sampling wavelength lower limit (610) and the sampling wavelength upper limit (620), minus the baseline value. The baseline is calculated by taking the average over the left baseline window (from the left baseline lower limit to the left baseline upper limit, 560-590) and the average over the right baseline window (from the right baseline lower limit to the right baseline upper limit, 630-660), and then taking the mean of these two averages.
Each time the alignment motor advances by the preset single-move step threshold of the scanning motor, optionally 8 microsteps, the spectral signal values of the four channels are recorded in the optical automatic alignment file.
The spectral signal values of a channel are checked; if they are all less than the product of the spatial correction standard value and the spatial correction optical alignment threshold, recording of that channel's spectral signal values ends, the slow scan of the alignment motor ends, and recording of the single-step signal value file begins. If that value is never found, then once the number of spectra acquired exceeds 500 (200 seconds if spatial correction uses a 0.4-second integration time), the single-step signal value file is still recorded and the message "Failed to find alignment position" is displayed.
The four-channel spectral signal values recorded in the optical automatic alignment file are then processed. The range coefficient is calculated from the four channel signals at each position; positions whose range coefficient is greater than the range coefficient threshold are taken as candidate positions, and the candidate position with the largest mean signal is selected as the alignment position. Here the range coefficient = min/max, and it serves to evaluate whether the alignment is uniform.
The alignment motor moves to the alignment position. The number of steps moved is the difference in steps between the motor's current position and the alignment position.
The spectral signal value is then acquired, that is, the average detection value over a preset acquisition time (optionally 10 seconds), and compared with the spatial correction standard value. If the spectral signal value is greater than the spatial correction standard value, the procedure moves on to the peak tilt check.
Peak tilt check: the average of the right baseline (from the right baseline lower limit to the right baseline upper limit, 630-660) is subtracted from the average of the left baseline (from the left baseline lower limit to the left baseline upper limit, 560-590), the absolute value is taken, and the result is compared with the preset normalized symmetry value, optionally 2000. If the result is greater, the peak is tilted and the check fails. If it is smaller, spatial correction is judged to have passed and calculation of the normalization value begins. If spatial correction fails, the motor stays at the position where the sum of the four channels is largest.
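The numeric parts of the alignment procedure above (detection value, alignment position selection, and the peak tilt check) can be sketched as follows. This is an illustration under assumptions: a spectrum is modeled as a dictionary mapping wavelength to intensity, the function names are hypothetical, and the range coefficient threshold of 0.8 in the example is illustrative because the source does not give a value.

```python
from statistics import mean

LEFT_BASELINE = (560, 590)
RIGHT_BASELINE = (630, 660)
SAMPLING_BAND = (610, 620)

def band_values(spectrum, band):
    """Intensities whose wavelength falls inside the inclusive band."""
    low, high = band
    return [v for w, v in spectrum.items() if low <= w <= high]

def detection_value(spectrum):
    """Maximum intensity in the sampling band minus the two-sided baseline."""
    baseline = (mean(band_values(spectrum, LEFT_BASELINE))
                + mean(band_values(spectrum, RIGHT_BASELINE))) / 2
    return max(band_values(spectrum, SAMPLING_BAND)) - baseline

def peak_is_tilted(spectrum, symmetry_value=2000):
    """Tilted if |left baseline mean - right baseline mean| exceeds the symmetry value."""
    diff = abs(mean(band_values(spectrum, LEFT_BASELINE))
               - mean(band_values(spectrum, RIGHT_BASELINE)))
    return diff > symmetry_value

def choose_alignment_position(records, range_threshold):
    """records: list of (position, [channel signal values]) from the slow scan."""
    candidates = [(mean(vals), pos) for pos, vals in records
                  if min(vals) / max(vals) > range_threshold]
    return max(candidates)[1] if candidates else None

spectrum = {560: 100, 590: 100, 610: 900, 615: 1100, 620: 950, 630: 200, 660: 200}
records = [(0, [100, 100, 100, 100]),
           (8, [900, 1000, 950, 980]),
           (16, [1500, 400, 1450, 1480])]
best_position = choose_alignment_position(records, range_threshold=0.8)
```

Position 16 has the strongest individual channels but is rejected because one channel is far weaker than the others, which is the situation the range coefficient is meant to catch.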
In an embodiment of the present invention, performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during electrophoresis specifically includes the following steps, which are not shown in the drawings:
S10. Perform data pre-processing on the spectral signal of the internal standard channel of each capillary collected during electrophoresis, to filter out the background noise of the spectral signal and smooth it;
S11. Perform peak identification on the pre-processed spectral signal to screen out the candidate peak sequence contained in the current spectral signal.
Specifically, step S11 includes: identifying the local maxima in the spectral signal and calculating the peak features of the peak signal at each maximum, the peak features including one or more of peak height, base peak width, full width at half maximum, peak spacing, and adjacent-point drop height; then screening the peak features of the peak signal at each maximum in the order peak height, full width at half maximum, peak spacing, adjacent-point drop height, base peak width, and taking the peak signals that satisfy the threshold requirement of every feature as candidate peaks to form the candidate peak sequence.
In this embodiment, after the fluorescence spectral signal has been collected, all local maxima and minima of fluorescence intensity in the signal are identified. The maxima are the peaks being sought, and the following features are calculated for each peak:
Peak height: the height of the peak, that is, the magnitude of the peak value;
Base peak width: the distance between the left and right boundaries of the peak;
Full width at half maximum: the horizontal distance between the two positions, one on each side of the peak apex, at which the fluorescence intensity has fallen to half of the peak height;
Peak spacing: the horizontal distance between the apexes of two peaks;
Adjacent-point drop height: the height difference between the peak apex and its neighboring points.
The threshold requirements for these features are set according to the test accuracy and the characteristics of the chosen internal standard, as shown in Table 1. The peak points are screened against them, and those that satisfy the requirements are kept to form the candidate peak sequence.
Table 1. Threshold requirements
In a specific embodiment, the screening rule is as follows:
Peak identification ==> peak height ==> full width at half maximum ==> peak spacing ==> adjacent-point drop height ==> base peak width
In this embodiment the features are checked one at a time, and the order of the checks depends on the characteristics of the data. Take full width at half maximum ==> peak spacing as an example. Checking the full width at half maximum first removes abnormal peaks: some peaks with a small full width at half maximum are abnormal even though they are tall. The peak spacing check processes peaks in order of peak height, so if the abnormal peaks were not removed beforehand, some normal peaks would be rejected at the peak spacing stage. The order of these two checks therefore cannot be swapped.
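The ordered screening can be sketched as a chain of filters. This is a minimal sketch under assumptions: the content of Table 1 is not available here, so every threshold name, value, and comparison direction below is hypothetical, and each peak is assumed to carry precomputed features in a dictionary. The spacing step processes peaks tallest first, as the text describes.

```python
def screen_peaks(peaks, th):
    """Apply the checks in the stated order; return surviving peaks by position."""
    kept = [p for p in peaks if p["height"] >= th["min_height"]]
    kept = [p for p in kept if th["min_fwhm"] <= p["fwhm"] <= th["max_fwhm"]]
    # Peak spacing: visit peaks tallest first and drop any peak that lies
    # too close to an already accepted (taller) peak.
    accepted = []
    for p in sorted(kept, key=lambda q: q["height"], reverse=True):
        if all(abs(p["position"] - q["position"]) >= th["min_spacing"] for q in accepted):
            accepted.append(p)
    kept = sorted(accepted, key=lambda q: q["position"])
    kept = [p for p in kept if p["drop"] >= th["min_drop"]]
    kept = [p for p in kept if p["base_width"] <= th["max_base_width"]]
    return kept

thresholds = {"min_height": 200, "min_fwhm": 3, "max_fwhm": 20,
              "min_spacing": 10, "min_drop": 5, "max_base_width": 40}
peaks = [
    {"position": 100, "height": 5000, "fwhm": 1, "drop": 900, "base_width": 4},   # narrow spike
    {"position": 104, "height": 1000, "fwhm": 8, "drop": 30, "base_width": 25},   # normal peak
    {"position": 200, "height": 1100, "fwhm": 9, "drop": 35, "base_width": 27},   # normal peak
]
candidate_positions = [p["position"] for p in screen_peaks(peaks, thresholds)]
```

In this example the tall, narrow spike at position 100 is removed by the width check. Had the spacing check run first, the spike would have been visited before the normal peak at 104 and would have caused that peak to be dropped, which is the failure the fixed order avoids.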
S12. In sampling order, perform internal standard matching between the candidate peaks in the candidate peak sequence and the standard peak sequence in the standard spectral signal of the internal standard chosen for the electrophoresis run, to find candidate peak combinations in the candidate peak sequence that satisfy the preset internal standard matching conditions, each candidate peak combination containing 3 or more candidate peaks. The internal standard matching conditions are that the absolute value of the difference between the first distance and the second distance is less than the preset distance error threshold, and that max{peak heights of the candidate peaks already selected in the combination, peak height of the candidate peak currently being matched}/min{peak heights of the candidate peaks already selected in the combination, peak height of the candidate peak currently being matched} < the preset relative height threshold. Here the first distance is the distance between the apexes of adjacent candidate peaks in the candidate peak sequence, and the second distance is the distance between the apexes of the adjacent standard peaks in the standard peak sequence that occupy the same sampling-order positions as the candidate peaks used for the first distance.
Specifically, step S12 includes: obtaining the distance between the apexes of every pair of adjacent candidate peaks in the candidate peak sequence, and the distance between the apexes of every pair of adjacent standard peaks in the standard peak sequence; and, according to the preset number of candidate peaks in a candidate peak combination, matching that number of candidate peaks satisfying the preset internal standard matching conditions from the candidate peak sequence in sampling order, to obtain the candidate peak combination.
In an embodiment of the present invention, two indicators are introduced for internal standard matching, an acceptable distance error and a relative height, and matching follows a mode that requires both distance similarity and height uniformity. The distance error may be either a distance length error or a proportional error ratioD, and the relative height is the relative peak height relativeH. In a specific embodiment, the proportional error ratioD is used as the distance error and the relative peak height relativeH as the relative height in the explanation below. Optionally, the proportional error generally takes a value less than 2, and the relative peak height relativeH takes a value greater than 1.
The proportional error ratioD is explained first. Referring to Figure 2, the internal standard is known; the standard peak sequence is shown in red in Figure 2 and the candidate peak sequence in blue. The values a1, a2, ... are calculated. In the electrophoresis data of the standard, that is, the candidate peak sequence, if |ai-bi|<ratioD, the distance corresponding to bi is considered to meet the requirement. Here ratioD is the acceptable proportional error, ai and bi denote the coefficients in the distance relationships of the standard internal standard and the identified internal standard respectively, and "||" denotes the absolute value.
The relative peak height relativeH is explained next. Each time a new peak (the peak under consideration) is added to the match, it must satisfy max{peak heights of the selected peaks, peak height of the peak under consideration}/min{peak heights of the selected peaks, peak height of the peak under consideration}<relativeH.
The threshold for the proportional error ratioD is matchTH, the maximum acceptable proportional error; the threshold for the relative peak height relativeH is evennessTH, the maximum acceptable relative peak height error. relativeH is derived from evennessTH, and ratioD from matchTH, through a threshold adjustment rule or a conversion function.
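The two matching conditions of step S12 can be sketched as a single acceptance test. This sketch uses the distance-length form of the distance error from step S12 and assumes the candidate gap and the standard gap are already expressed in comparable units; the coefficients ai and bi of Figure 2 are not reproduced here. The function and parameter names are hypothetical.

```python
def distance_matches(candidate_gap, standard_gap, distance_error):
    """Distance similarity: |first distance - second distance| < threshold."""
    return abs(candidate_gap - standard_gap) < distance_error

def height_is_even(selected_heights, candidate_height, relative_h):
    """Height uniformity: max/min over selected peaks plus the new peak < relativeH."""
    heights = list(selected_heights) + [candidate_height]
    return max(heights) / min(heights) < relative_h

def accept_candidate(candidate_gap, standard_gap, selected_heights,
                     candidate_height, distance_error, relative_h):
    """A candidate peak joins the combination only if both conditions hold."""
    return (distance_matches(candidate_gap, standard_gap, distance_error)
            and height_is_even(selected_heights, candidate_height, relative_h))
```

Checking height uniformity against all peaks already selected, not just the previous one, keeps a slow drift in height from accumulating across the combination.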
S13. Using each candidate peak combination as the matching basis, perform internal standard matching on the remaining candidate peaks in the candidate peak sequence in turn, to obtain the matching result corresponding to each candidate peak combination.
S14. Count the number of candidate peaks contained in the matching result corresponding to each candidate peak combination.
S15. When the maximum number of candidate peaks contained in the matching results of the candidate peak combinations equals the number of standard peaks in the standard peak sequence, internal standard matching is judged successful, and the matching result corresponding to that maximum is taken as the optimal internal standard matching result.
Taking the matching result corresponding to the maximum as the optimal internal standard matching result in step S15 specifically includes: if only one matching result corresponds to the maximum, taking it as the optimal internal standard matching result; if more than one matching result corresponds to the maximum, taking the one whose first candidate peak has the largest sampling point position as the optimal internal standard matching result.
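Steps S14 and S15, including the tie-break rule, can be sketched as below. Each matching result is assumed to be a list of the sampling point positions of its matched candidate peaks in sampling order; that representation and the function name are hypothetical.

```python
def best_match(results, n_standard_peaks):
    """Return the optimal matching result, or None if matching did not succeed."""
    if not results:
        return None
    longest = max(len(r) for r in results)
    if longest != n_standard_peaks:
        return None  # no result covers every standard peak
    tied = [r for r in results if len(r) == longest]
    # Tie-break: prefer the result whose first matched peak is sampled latest.
    return max(tied, key=lambda r: r[0])

results = [[100, 200, 300], [150, 250, 350], [120, 220]]
optimal = best_match(results, n_standard_peaks=3)
```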
The embodiment of the present invention performs peak identification on the spectral signal to screen out the qualifying candidate peak sequence it contains. Then, based on the two internal standard matching conditions of distance similarity and height uniformity, it matches the candidate peaks in the candidate peak sequence, in sampling order, against the standard peak sequence in the standard spectral signal of the internal standard chosen for the electrophoresis run, to find candidate peak combinations that satisfy the preset internal standard matching conditions. Each candidate peak combination is used as the initial part of a matching result and as the basis for matching the remaining candidate peaks in turn, giving a complete matching result for each combination. When the number of candidate peaks in a matching result equals the number of standard peaks in the standard peak sequence, internal standard matching is judged successful and that matching result is taken as the optimal internal standard matching result. The invention can thus achieve optimal matching of the internal standard peaks quickly and accurately, which in turn ensures the accuracy of the fragment lengths called from the spectrum.
In an embodiment of the present invention, after the matching result corresponding to the maximum is taken as the optimal internal standard matching result, the method includes an operation that scores the degree of internal standard matching, implemented as follows: curve fitting is performed between the standard peak sequence and the peak sequence of the optimal internal standard matching result, and the standard deviation, mean residual, and maximum residual after fitting are taken as the feature data of the current optimal internal standard matching result; the feature data is then input into a preset internal standard matching scoring model to obtain the matching degree score of the current optimal internal standard matching result.
Specifically, when internal standard matching succeeds, the present invention can use the Local Southern method or high-order polynomial fitting to fit a curve between the standard peak sequence and the peak sequence of the internal standard matching result. The standard deviation, mean residual, and maximum residual after fitting are selected as the three feature data for each matching case, and the feature data is then input into the preset internal standard matching scoring model to score the degree of internal standard matching.
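Once a fit has been produced by whichever method is chosen, the three features follow from the residuals. The sketch below does not implement Local Southern or polynomial fitting; it assumes the fitted sizes are already available. Taking the standard deviation over signed residuals in population form, and the mean and maximum over absolute residuals, are assumptions, since the text does not define them further.

```python
from statistics import mean, pstdev

def fit_features(standard_sizes, fitted_sizes):
    """Standard deviation, mean residual and maximum residual of a fit."""
    residuals = [f - s for s, f in zip(standard_sizes, fitted_sizes)]
    absolute = [abs(r) for r in residuals]
    return {
        "std": pstdev(residuals),          # spread of signed residuals
        "mean_residual": mean(absolute),   # average absolute deviation
        "max_residual": max(absolute),     # worst single peak
    }

features = fit_features([100.0, 200.0, 300.0], [101.0, 199.0, 300.0])
```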
Further, the method also includes a training step for the internal standard matching scoring model, which specifically includes: taking the standard deviation, mean residual, and maximum residual after curve fitting of preset sample data under different internal standard matching cases as the sample feature data of each sample, labeling the sample data of correct internal standard matching results as the positive class and the sample data of incorrect internal standard matching results as the negative class, to obtain a training data set; and, using the hinge loss as the loss function for training and the sigmoid function to normalize the classification result, training a preset machine learning model on the training data set to obtain the trained internal standard matching scoring model.
具体的,本发明计算大量样本对应不同内标的匹配情况的以上三个特征数据,将正确的匹配设为正类,不正确的匹配设为负类,组成训练数据集,选择Hinge损失作为模型训练的损失函数,采用机器学习模型,如支持向量机,对该训练数据集进行训练,得到模型参数,最后使用sigmoid函数对结果作进一步处理,得到训练好的内标匹配评分模型。Specifically, the present invention calculates the above three feature data of the matching conditions of a large number of samples corresponding to different internal standards, sets the correct match as a positive class, and the incorrect match as a negative class to form a training data set, selects Hinge loss as the loss function for model training, and uses a machine learning model, such as a support vector machine, to train the training data set to obtain model parameters, and finally uses a sigmoid function to further process the results to obtain a trained internal standard matching scoring model.
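As an illustration of the training step, the sketch below trains a linear support vector machine by sub-gradient descent on the hinge loss and maps the decision value to a score in (0, 1) with a sigmoid. The learning rate, regularisation strength, epoch count, function names and toy data are all assumptions made for the example; the source does not specify them.

```python
import math

def train_linear_svm(features, labels, epochs=200, lr=0.01, reg=0.01):
    """Sub-gradient descent on the regularised hinge loss
    max(0, 1 - y * (w.x + b)), with labels y in {+1, -1}."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample violates the margin: move towards classifying it correctly.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regularisation term contributes.
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def match_score(w, b, x):
    """Squash the signed decision value into (0, 1) with a sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: (std, mean residual, max residual) per match.
X = [[0.1, 0.1, 0.2], [0.2, 0.1, 0.3], [2.0, 1.5, 4.0], [2.5, 2.0, 5.0]]
y = [1, 1, -1, -1]   # +1 = correct match, -1 = incorrect match
w, b = train_linear_svm(X, y)
```

With these toy values, small fit residuals score above 0.5 and large residuals score below 0.5.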
在本发明一个可选实施例中,所述方法还包括:若每一候选峰组合对应的匹配结果中包含的候选峰数量的最大值均不等于标准峰序列中包含的标准峰数量,则根据预设的第一阈值调整规则更新所述距离误差阈值,并返回步骤S12重新进行内标匹配,直到更新后的距离误差阈值大于距离误差阈值的最大值;In an optional embodiment of the present invention, the method further includes: if the maximum number of candidate peaks included in the matching results corresponding to each candidate peak combination is not equal to the number of standard peaks included in the standard peak sequence, updating the distance error threshold according to a preset first threshold adjustment rule, and returning to step S12 to re-perform internal standard matching until the updated distance error threshold is greater than the maximum value of the distance error threshold;
当更新后的距离误差阈值大于距离误差阈值的最大值时,根据预设的第二阈值调整规则更新所述相对高度阈值,且将距离误差阈值更新为对应的初始值,并返回步骤S12重新进行内标匹配,直到更新后的相对高度阈值大于相对高度阈值的最大值;When the updated distance error threshold is greater than the maximum value of the distance error threshold, the relative height threshold is updated according to the preset second threshold adjustment rule, and the distance error threshold is updated to the corresponding initial value, and the process returns to step S12 to re-perform the internal standard matching until the updated relative height threshold is greater than the maximum value of the relative height threshold;
当更新后的相对高度阈值大于相对高度阈值的最大值时,则判定内标匹配失败。When the updated relative height threshold is greater than the maximum value of the relative height threshold, it is determined that the internal standard matching fails.
Embodiments of the present invention match the internal standard dynamically by adjusting the relative height threshold and the distance error threshold within preset value ranges. If the internal standard cannot be matched with the initial thresholds, the two thresholds are adjusted in turn so that an unsuitable threshold choice does not by itself cause a matching failure. First, with the relative height threshold held fixed, the distance error threshold is increased, for example by a preset unit step each time, and internal standard matching is repeated with the current height threshold and the adjusted distance error threshold until matching succeeds or the updated distance error threshold exceeds its maximum value. When the updated distance error threshold exceeds its maximum value, the relative height threshold is increased and matching restarts from the initial distance error threshold, until matching succeeds or the updated relative height threshold exceeds its maximum value. Matching the identified internal standard peaks against the standard internal standard lengths by dynamic programming inside this two-level (inner and outer) threshold loop further helps ensure the accuracy of internal standard matching.
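The two-level threshold sweep can be sketched as a pair of nested loops. The callback `try_match`, which is assumed to return the largest number of peaks matched under the given thresholds, and all parameter names are hypothetical; fixed additive steps are also an assumption, since the source only refers to preset adjustment rules.

```python
def match_with_adaptive_thresholds(try_match, n_standard,
                                   dist_init, dist_max, dist_step,
                                   height_init, height_max, height_step):
    """Outer loop widens the relative height threshold; inner loop widens
    the distance error threshold. Returns the (distance, height) thresholds
    that first give a full match, or None when both ranges are exhausted."""
    height = height_init
    while height <= height_max:
        dist = dist_init                      # restart the inner sweep each time
        while dist <= dist_max:
            if try_match(dist, height) == n_standard:
                return dist, height           # internal standard matched
            dist += dist_step
        height += height_step
    return None                               # internal standard matching failed
```

Returning the thresholds that succeeded makes it easy to log how far the sweep had to relax them for a given capillary.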
本发明实施例中,对基因分析仪电泳过程采集的每一道毛细管的内标物通道的光谱信号进行数据前处理,具体包括如下步骤:In the embodiment of the present invention, data pre-processing is performed on the spectral signal of the internal standard channel of each capillary collected during the electrophoresis process of the gene analyzer, specifically including the following steps:
S01、采用预设的局部自适应多项式拟合算法滤除所述光谱信号的背景噪声。S01. Using a preset local adaptive polynomial fitting algorithm to filter out background noise of the spectral signal.
具体的,首先将数据分段,然后按照下述方式进行多项式拟合:Specifically, the data is first segmented, and then polynomial fitting is performed in the following manner:
initial: $k = 1$, $O_0(i) = O(i)$;
step1: Fit the signal $O_{k-1}(i)$ with a polynomial of suitable order (1, 2, 3, ...) to obtain $P_k(i)$;
step2: Compute the residual between the signal and the fitted signal, $R_k(i) = O_{k-1}(i) - P_k(i)$;
step3: Compute the standard deviation of the residual, $DEV_k$;
step4: On the first iteration, delete the points that satisfy $O_0(i) > P_k(i) + DEV_k$ (remove the corresponding indices); on later iterations, assign values by the following rule:
if $O_{k-1}(i) < P_k(i) + DEV_k$, then $O_k(i) = O_{k-1}(i)$;
otherwise, $O_k(i) = P_k(i) + DEV_k$;
step5: Check whether $|(DEV_k - DEV_{k-1}) / DEV_k| < gradient$, where gradient = 0.01 is an adjustable parameter. If the condition holds, continue to the next step; if not, increase $k$ by 1, return to step1 and repeat the process;
step6: Compute the baseline $P_k(i)$ from the polynomial coefficients and output the baseline-removed signal $S(i) = O_0(i) - P_k(i)$.
Where: $k$ is the iteration number; $i$ is the index of the spectral data point, $i = 0, 1, 2, \ldots, N-1$; $O$ is the original signal; $P$ is the polynomial fitted signal (the baseline); $R$ is the residual between the signal and the fitted signal; $DEV$ is the standard deviation; $O_k(i)$ is the i-th value of the spectral data at the k-th iteration; $P_k(i)$ is the i-th value of the polynomial fit at the k-th iteration; $R_k(i)$ is the residual at the i-th point at the k-th iteration; $DEV_k$ is the standard deviation at the k-th iteration; gradient is the threshold for stopping the iteration; $S(i)$ is the i-th value of the baseline-removed data (the data used in the next processing step).
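The iteration in steps 1 to 6 can be sketched for a single data segment as below. This is an illustrative reading, not the patented code: the helper names, the default polynomial order of 2, the rescaling of indices to [-1, 1] for numerical stability, and the `max_iter` guard are all assumptions, and the residual standard deviation is computed as a root-mean-square.

```python
import math

def polyfit(xs, ys, order):
    """Least-squares polynomial coefficients (lowest power first) from the
    normal equations, solved by Gauss-Jordan elimination with pivoting."""
    m = order + 1
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(m):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[r][m] / a[r][r] for r in range(m)]

def polyval(coeffs, x):
    return sum(c * x ** p for p, c in enumerate(coeffs))

def remove_baseline(signal, order=2, gradient=0.01, max_iter=100):
    """Return S(i) = O_0(i) - P_k(i) for one segment of the spectrum."""
    n = len(signal)
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]   # indices scaled to [-1, 1]
    keep = list(range(n))        # indices still used for fitting
    work = list(signal)          # O_{k-1}(i)
    prev_dev = None
    coeffs = None
    for k in range(1, max_iter + 1):
        coeffs = polyfit([xs[i] for i in keep], [work[i] for i in keep], order)
        fit = {i: polyval(coeffs, xs[i]) for i in keep}
        dev = math.sqrt(sum((work[i] - fit[i]) ** 2 for i in keep) / len(keep))
        if k == 1:
            # First pass: drop points that stand above the fit (likely peaks).
            keep = [i for i in keep if work[i] <= fit[i] + dev]
        else:
            # Later passes: clip anything above fit + DEV_k.
            for i in keep:
                work[i] = min(work[i], fit[i] + dev)
        if dev == 0 or (prev_dev is not None
                        and abs((dev - prev_dev) / dev) < gradient):
            break
        prev_dev = dev
    return [signal[i] - polyval(coeffs, xs[i]) for i in range(n)]
```

On a sloped baseline carrying one narrow peak, the function leaves the peak height essentially intact and brings the flat regions to about zero.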
S02、采用Savitzky Golay多项式平滑算法对滤除背景噪声后的光谱信号进行平滑处理。S02. Use Savitzky Golay polynomial smoothing algorithm to smooth the spectral signal after filtering out the background noise.
Specifically, at each data point $S_i$, a window of length $2n+1$ is taken and a polynomial of order p (commonly p = 1, 2, 3, ...) is fitted to the data in that window. The value of the fitted polynomial at the window center then replaces the original data point $S_i$, which completes the smoothing.
Specifically, the Savitzky-Golay filter can be written as:
$$H_i = \sum_{j=-n}^{n} c_j \, S_{i+j}$$
where $H_i$ is the smoothed data point, $S_i$ is the original data point, and $c_j$ are the filter coefficients. The coefficients follow from the polynomial fit and can be obtained by, for example, the least-squares method.
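A small sketch of the smoothing step, assuming a 5-point window (n = 2) and the well-known quadratic/cubic Savitzky-Golay coefficients (-3, 12, 17, 12, -3)/35. The window size, the choice to leave the first and last n points unchanged, and the function name are assumptions for illustration only.

```python
# Savitzky-Golay coefficients for a 5-point window and a quadratic
# (or cubic) polynomial: (-3, 12, 17, 12, -3) / 35.
SG5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

def savgol_smooth(s, coeffs=SG5):
    """Replace each interior point by the weighted sum of its window.
    Edge points, where the window does not fit, are left unchanged."""
    n = len(coeffs) // 2
    h = list(s)
    for i in range(n, len(s) - n):
        h[i] = sum(c * s[i + j] for j, c in zip(range(-n, n + 1), coeffs))
    return h
```

Because the coefficients come from a quadratic fit, a purely quadratic input passes through unchanged, while an isolated spike is attenuated.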
由于激光背景、基底拉曼信号、光谱仪暗噪声等问题,不同波段采集出的信号基线不一致,导致无法直接通过信号强度判断DNA种类与含量。因此本发明通过除基线、平滑等方式对原始数据进行处理,能够保证候选峰的准确识别。Due to the problems of laser background, base Raman signal, spectrometer dark noise, etc., the signal baselines collected in different bands are inconsistent, which makes it impossible to directly determine the type and content of DNA by signal intensity. Therefore, the present invention processes the raw data by removing the baseline, smoothing, etc., which can ensure the accurate identification of candidate peaks.
In a specific embodiment of the present invention, the method for internal standard matching of a spectrum detected by a gene analyzer is shown in FIG. 3. Each identified peak is written as $(p_i, H_{p_i})$, meaning that the signal intensity at the $p_i$-th data point of the spectral data is $H_{p_i}$, where $p_0 < p_1 < p_2 < \cdots < p_i < \cdots$. All identified peaks together form the candidate peak sequence $\{(p_0, H_{p_0}), (p_1, H_{p_1}), \cdots, (p_i, H_{p_i}), \cdots\}$.
The peaks corresponding to the standard internal standard lengths form the standard peak sequence $\{s_0, s_1, s_2, \cdots, s_i, \cdots\}$, with $s_0 < s_1 < s_2 < \cdots < s_i < \cdots$.
1. Compute the differences between all adjacent data points in the candidate peak sequence, denoted $C = \{C_0, C_1, \cdots\} = \{p_1 - p_0, p_2 - p_1, \cdots\}$;
2. Compute the differences between all adjacent lengths in the standard peak sequence, denoted $S = \{S_0, S_1, \cdots\} = \{s_1 - s_0, s_2 - s_1, \cdots\}$;
3. Initialize the relative height ralativeH;
4. Initialize the acceptable ratio error ratioD;
5. Scanning the candidate peak sequence in order of increasing data point, find a preset number of candidate peaks that satisfy the internal standard matching conditions, and enumerate all such combinations; the preset number may be 3;
6. Extend every combination to a complete match. At this point it is possible that all detected peaks have been examined while the internal standard is still not fully matched;
7. Count the matched peaks in each combination and take the maximum;
8. Check whether this maximum equals the number of internal standard peaks;
9. If it does, internal standard matching succeeds; output the combination that attains the maximum and, among such combinations, has the largest data point for its first peak. If it does not, go to step 10;
10. Update ratioD and check whether the updated ratioD is less than matchTH. If so, return to step 5; otherwise, go to step 11;
11. Increase the relative height ralativeH by 1 and check whether the increased ralativeH exceeds the relative peak height threshold. If not, return to step 4 and repeat the matching process; otherwise, internal standard matching fails, the analysis ends, and the matching score is 0.
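Steps 1, 2 and 7 to 9 above can be sketched as follows. The function names and the representation of a combination as a list of `(data_point, height)` tuples are assumptions chosen for illustration; steps 5 and 6, which build the combinations, are not shown.

```python
def adjacent_differences(values):
    """Steps 1 and 2: differences between neighbouring entries."""
    return [b - a for a, b in zip(values, values[1:])]

def pick_best_combination(combinations, n_standard):
    """Steps 7 to 9: return the winning combination, or None when no
    combination matches every internal standard peak.

    Each combination is a list of (data_point, height) tuples in
    increasing data_point order (a hypothetical representation)."""
    if not combinations:
        return None
    best_count = max(len(c) for c in combinations)
    if best_count != n_standard:
        return None                      # caller relaxes thresholds (steps 10 and 11)
    full = [c for c in combinations if len(c) == best_count]
    # Tie-break: prefer the combination whose first peak has the largest data point.
    return max(full, key=lambda c: c[0][0])
```

A `None` result corresponds to falling through to steps 10 and 11, where the thresholds are relaxed and matching is retried.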
本发明采用局部自适应荧光背景噪声去除算法,将数据分段,通过多项式拟合除基线并进行平滑处理,再将处理后的数据通过设置特征对应的阈值,对峰值点进行筛选,留下符合要求的候选峰值点。通过距离相似、高度均匀两个匹配条件,利用动态规划对识别的内标峰与标准的内标长度进行匹配。在内标匹配成功的情况下,将标准峰序列和匹配到的序列进行曲线拟合,并选择曲线拟合后的标准差、平均残差和最大残差作为每种匹配情况的三个特征数据,然后将特征数据输入预设的内标匹配评分模型进行内标匹配程度评分,不仅能够准确地实现内标匹配,还能够智能地对内标匹配程度进行评分,提升电泳光谱数据分析过程的智能化和精准度。The present invention adopts a local adaptive fluorescence background noise removal algorithm, segments the data, removes the baseline and performs smoothing by polynomial fitting, and then screens the peak points by setting the threshold value corresponding to the feature of the processed data, leaving the candidate peak points that meet the requirements. Through the two matching conditions of distance similarity and height uniformity, dynamic programming is used to match the identified internal standard peak with the standard internal standard length. In the case of successful internal standard matching, the standard peak sequence and the matched sequence are curve fitted, and the standard deviation, average residual and maximum residual after curve fitting are selected as the three characteristic data of each matching situation, and then the characteristic data is input into a preset internal standard matching scoring model to score the internal standard matching degree, which can not only accurately realize the internal standard matching, but also intelligently score the internal standard matching degree, and improve the intelligence and accuracy of the electrophoresis spectrum data analysis process.
对于方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本发明实施例并不受所描述的动作顺序的限制,因为依据本发明实施例,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作并不一定是本发明实施例所必须的。For the method embodiments, for the sake of simplicity, they are all described as a series of action combinations, but those skilled in the art should know that the embodiments of the present invention are not limited by the order of the actions described, because according to the embodiments of the present invention, some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
此外,本发明实施例还提供了一种实现基因分析仪检测信号归一化的系统。所述系统包括用于实现如上实现基因分析仪检测信号归一化的方法的功能模块。如图4所示,本发明实施例的实现基因分析仪检测信号归一化的系统包括内标匹配单元40、计算单元50和归一化单元60,其中:In addition, an embodiment of the present invention further provides a system for realizing normalization of detection signals of a gene analyzer. The system includes a functional module for realizing the method for realizing normalization of detection signals of a gene analyzer as described above. As shown in FIG4 , the system for realizing normalization of detection signals of a gene analyzer according to an embodiment of the present invention includes an internal standard matching unit 40, a calculation unit 50 and a normalization unit 60, wherein:
内标匹配单元40,用于对基因分析仪电泳过程采集的每一道毛细管的内标物通道的光谱信号进行内标匹配,并根据匹配成功的内标匹配序列计算每一道毛细管内标物通道的单通道内标平均值;The internal standard matching unit 40 is used to perform internal standard matching on the spectral signal of the internal standard channel of each capillary collected during the electrophoresis process of the gene analyzer, and calculate the single channel internal standard average value of each capillary internal standard channel according to the successfully matched internal standard matching sequence;
计算单元50,用于采用预设的内标标准值分别除以每一道毛细管的内标物通道的单通道内标平均值,将得到的计算结果作为对应毛细管的归一化系数;A calculation unit 50 is used to divide the preset internal standard value by the single channel internal standard average value of the internal standard channel of each capillary, and use the obtained calculation result as the normalization coefficient of the corresponding capillary;
归一化单元60,用于根据每一道毛细管的归一化系数分别对各自毛细管中其他颜色通道的光谱信号进行归一化修正,实现检测信号归一化。The normalization unit 60 is used to perform normalization correction on the spectral signals of other color channels in the respective capillaries according to the normalization coefficient of each capillary, so as to realize the normalization of the detection signal.
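The calculation and normalization units can be illustrated with a short sketch: the coefficient is the preset internal standard value divided by the mean matched internal standard peak height of the capillary, and each other color channel is multiplied by it. The function names, the dictionary layout of the channels and the numbers in the usage note are hypothetical.

```python
def normalization_coefficient(standard_value, internal_peak_heights):
    """Preset internal standard value divided by the single-channel
    internal standard average of one capillary."""
    mean_height = sum(internal_peak_heights) / len(internal_peak_heights)
    return standard_value / mean_height

def normalize_channels(channels, coefficient):
    """Scale every non-internal-standard color channel of one capillary."""
    return {name: [v * coefficient for v in signal]
            for name, signal in channels.items()}
```

For example, a capillary whose internal standard peaks average 500 against a standard value of 1000 gets a coefficient of 2.0, so its sample peaks are doubled.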
本发明实施例中,所述的内标匹配单元,还用于在对基因分析仪电泳过程中采集的每一道毛细管的内标物通道的光谱信号进行内标匹配之前,对仅装有指定浓度的内标物的多道毛细管进行电泳,以采集每一道毛细管的内标物通道的光谱信号并进行内标匹配,计算每一道毛细管对应的满足内标匹配要求的内标匹配序列中各个内标峰的平均峰值,得到每一道毛细管内标物通道的单通道内标平均值;In the embodiment of the present invention, the internal standard matching unit is further used to perform electrophoresis on a plurality of capillaries containing only an internal standard of a specified concentration before performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during the electrophoresis of the gene analyzer, so as to collect the spectral signal of the internal standard channel of each capillary and perform internal standard matching, calculate the average peak value of each internal standard peak in the internal standard matching sequence corresponding to each capillary that meets the internal standard matching requirements, and obtain the single-channel internal standard average value of the internal standard channel of each capillary;
所述的计算单元,还用于计算各道毛细管内标物通道的单通道内标平均值的变异系数,若变异系数小于预设变异阈值,则将各道毛细管内标物通道的单通道内标平均值的均值或均值的预设倍数作为所述内标标准值。The calculation unit is also used to calculate the coefficient of variation of the single-channel internal standard average value of each capillary internal standard channel. If the coefficient of variation is less than a preset variation threshold, the mean of the single-channel internal standard average values of each capillary internal standard channel or a preset multiple of the mean is used as the internal standard standard value.
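A minimal sketch of how the internal standard value might be derived from the calibration run. Using the population standard deviation for the coefficient of variation and returning `None` when the variation is too large are assumptions; the source does not state either detail, and the function name is hypothetical.

```python
import statistics

def internal_standard_value(per_capillary_means, cv_threshold, multiple=1.0):
    """Return the mean (times an optional preset multiple) of the
    per-capillary internal standard averages when their coefficient of
    variation is below the threshold, else None (an assumed convention)."""
    mean = statistics.mean(per_capillary_means)
    cv = statistics.pstdev(per_capillary_means) / mean
    if cv < cv_threshold:
        return multiple * mean
    return None
```

A `None` result would signal that the capillaries disagree too much for the calibration run to define a standard value.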
本发明实施例中,所述系统还包括判断单元和重置单元。其中,判断单元,用于在计算单元将得到的计算结果作为对应毛细管的归一化系数之后,判断每一毛细管的归一化系数是否合规,当毛细管的归一化系数处于预设的归一化系数取值范围内时则判定当前毛细管的归一化系数合规,否则判定当前毛细管的归一化系数不合规;In an embodiment of the present invention, the system further includes a judgment unit and a reset unit. The judgment unit is used to judge whether the normalization coefficient of each capillary is compliant after the calculation unit uses the obtained calculation result as the normalization coefficient of the corresponding capillary, and when the normalization coefficient of the capillary is within the preset normalization coefficient value range, the normalization coefficient of the current capillary is judged to be compliant, otherwise, the normalization coefficient of the current capillary is judged to be non-compliant;
重置单元,还用于若当前电泳检测数据中归一化系数不合规的毛细管数量占毛细管总数的比例大于预设的第一比例阈值,或是,至少连续三次的电泳检测数据中归一化系数不合规的毛细管数量占毛细管总数的比例大于预设的第二比例阈值,则对所述内标标准值进行修正;The reset unit is further used to correct the internal standard value if the ratio of the number of capillaries with non-compliant normalization coefficients in the current electrophoresis detection data to the total number of capillaries is greater than a preset first ratio threshold, or if the ratio of the number of capillaries with non-compliant normalization coefficients in the electrophoresis detection data for at least three consecutive times to the total number of capillaries is greater than a preset second ratio threshold;
其中,第二比例阈值小于第一比例阈值。The second ratio threshold is smaller than the first ratio threshold.
Further, the normalization unit 60 is specifically configured as follows: if a capillary's normalization coefficient is compliant, the spectral signals of the other color channels of that capillary are multiplied by the coefficient to normalize them; if the coefficient is not compliant, those spectral signals are instead multiplied by the bound of the normalization coefficient value range that lies nearest to the coefficient.
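The compliance handling and the reset rule can be sketched together. Treating the bounds of the permitted range as inclusive, and the names `effective_coefficient` and `needs_standard_reset`, are assumptions made for this illustration.

```python
def effective_coefficient(coeff, low, high):
    """Compliant coefficients are used as-is; a non-compliant coefficient
    is replaced by the nearer bound of the permitted range [low, high]."""
    return min(max(coeff, low), high)

def needs_standard_reset(noncompliant_ratios, first_threshold,
                         second_threshold, runs=3):
    """noncompliant_ratios holds, per electrophoresis run (newest last),
    the share of capillaries whose coefficient was non-compliant.
    Reset when the latest run exceeds the first threshold, or when the
    last `runs` consecutive runs all exceed the smaller second threshold."""
    if noncompliant_ratios and noncompliant_ratios[-1] > first_threshold:
        return True
    recent = noncompliant_ratios[-runs:]
    return len(recent) == runs and all(r > second_threshold for r in recent)
```

The second threshold is lower than the first, so a persistent moderate problem triggers a correction just as a single severe run does.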
本发明实施例中,所述系统还包括光学对准单元,用于在对基因分析仪电泳过程采集的每一道毛细管的内标物通道的光谱信号进行内标匹配之前,监测电泳过程中的拉曼信号,获取拉曼信号的信号值、峰平衡数据和/或均匀度数据;当所述拉曼信号的信号值、峰平衡数据和/或均匀度数据中任一参数不满足对应的预设光学参数标准时,执行光学对准操作,直到信号值、峰平衡数据和/或均匀度数据均满足对应的光学参数标准。In an embodiment of the present invention, the system further includes an optical alignment unit, which is used to monitor the Raman signal during the electrophoresis process and obtain the signal value, peak balance data and/or uniformity data of the Raman signal before performing internal standard matching on the spectral signal of the internal standard channel of each capillary collected during the electrophoresis process of the genetic analyzer; when any parameter among the signal value, peak balance data and/or uniformity data of the Raman signal does not meet the corresponding preset optical parameter standard, perform an optical alignment operation until the signal value, peak balance data and/or uniformity data all meet the corresponding optical parameter standard.
在本发明实施例中,所述内标匹配单元40包括附图中未示出的数据前处理模块400、峰识别模块401、初始匹配模块402、内标匹配模块403、统计模块404和判定模块405,其中:In the embodiment of the present invention, the internal standard matching unit 40 includes a data pre-processing module 400, a peak identification module 401, an initial matching module 402, an internal standard matching module 403, a statistical module 404 and a determination module 405, which are not shown in the drawings, wherein:
数据前处理模块400,用于对基因分析仪电泳过程采集的每一道毛细管的内标物通道的光谱信号进行数据前处理,以滤除所述光谱信号的背景噪声并对光谱信号进行平滑处理;The data pre-processing module 400 is used to perform data pre-processing on the spectral signal of the internal standard channel of each capillary collected during the electrophoresis process of the gene analyzer, so as to filter out the background noise of the spectral signal and smooth the spectral signal;
峰识别模块401,对数据前处理后的光谱信号进行峰识别,以筛选出当前光谱信号中包含的候选峰序列;The peak identification module 401 performs peak identification on the spectral signal after data pre-processing to screen out candidate peak sequences contained in the current spectral signal;
The initial matching module 402 is configured to match, in sampling order, the candidate peaks of the candidate peak sequence against the standard peak sequence of the standard spectral signal corresponding to the internal standard selected for electrophoresis, so as to find the candidate peak combinations in the candidate peak sequence that satisfy the preset internal standard matching conditions, each combination containing at least 3 candidate peaks. The internal standard matching conditions are: (a) the absolute value of the difference between a first distance and a second distance is less than the preset distance error threshold; and (b) max{peak height of the candidate peak already selected in the combination, peak height of the candidate peak currently being matched} / min{peak height of the candidate peak already selected in the combination, peak height of the candidate peak currently being matched} < the preset relative height threshold. Here, the first distance is the distance between the peak points of adjacent candidate peaks in the candidate peak sequence, and the second distance is the distance between the peak points of the adjacent standard peaks that occupy the same positions in sampling order in the standard peak sequence;
内标匹配模块403,以每一候选峰组合作为匹配基础对候选峰序列中其他候选峰依次进行内标匹配,得到每一候选峰组合对应的匹配结果;The internal standard matching module 403 performs internal standard matching on other candidate peaks in the candidate peak sequence in sequence with each candidate peak combination as a matching basis to obtain a matching result corresponding to each candidate peak combination;
统计模块404,统计每一候选峰组合对应的匹配结果中包含的候选峰数量;A statistics module 404 is used to count the number of candidate peaks included in the matching results corresponding to each candidate peak combination;
判定模块405,当各个候选峰组合对应的匹配结果中包含的候选峰数量的最大值等于标准峰序列中包含的标准峰数量时,则判定内标匹配成功,并将所述最大值对应的匹配结果作为最优内标匹配结果。The determination module 405 determines that the internal standard match is successful when the maximum number of candidate peaks included in the matching results corresponding to each candidate peak combination is equal to the number of standard peaks included in the standard peak sequence, and uses the matching result corresponding to the maximum value as the optimal internal standard matching result.
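The two internal standard matching conditions applied by the initial matching module can be expressed as a single predicate. The function and parameter names are hypothetical, and rejecting non-positive peak heights is an added safeguard not stated in the source.

```python
def satisfies_matching_condition(first_distance, second_distance,
                                 selected_height, candidate_height,
                                 distance_error_threshold,
                                 relative_height_threshold):
    """True when both the distance condition and the height-uniformity
    condition hold for the candidate peak being considered."""
    distance_ok = abs(first_distance - second_distance) < distance_error_threshold
    hi = max(selected_height, candidate_height)
    lo = min(selected_height, candidate_height)
    # Guard against division by zero (an assumption, not from the source).
    height_ok = lo > 0 and hi / lo < relative_height_threshold
    return distance_ok and height_ok
```

Raising either threshold makes the predicate more permissive, which is how the dynamic threshold adjustment recovers from an initial failure to match.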
The internal standard matching unit 40 provided in this embodiment further includes an intelligent scoring module, not shown in the drawings. After the matching result corresponding to the maximum value has been taken as the optimal internal standard matching result, the intelligent scoring module performs curve fitting on the standard peak sequence and the peak sequence of the optimal internal standard matching result, takes the standard deviation, average residual and maximum residual of the fit as the feature data of the current optimal internal standard matching result, and inputs the feature data into a preset internal standard matching scoring model to obtain a matching degree score for the current optimal internal standard matching result.
具体的,本发明在内标匹配成功的情况下,可采用高阶多项式拟合方法实现标准峰序列和内标匹配结果的峰序列的曲线拟合,并选择曲线拟合后的标准差、平均残差和最大残差作为每种匹配情况的三个特征数据,然后将特征数据输入预设的内标匹配评分模型进行内标匹配程度评分。Specifically, when the internal standard match is successful, the present invention can use a high-order polynomial fitting method to realize curve fitting of the standard peak sequence and the peak sequence of the internal standard matching result, and select the standard deviation, average residual and maximum residual after curve fitting as the three characteristic data of each matching situation, and then input the characteristic data into a preset internal standard matching scoring model to score the degree of internal standard matching.
进一步地,本发明实施例提供的内标匹配单元40,还包括附图中未示出的模型训练模块,用于执行内标匹配评分模型的训练操作,具体包括:将预设的不同内标匹配情况下的样本数据对应的曲线拟合后的标准差、平均残差和最大残差作为对应样本的样本特征数据,将正确的内标匹配结果的样本数据设为正类,将不正确的内标匹配结果的样本数据设为负类,得到训练数据集;采用Hinge损失作为模型训练的损失函数,采用Sigmoid函数归一化分类结果,基于预设的机器学习模型对所述训练数据集进行学习训练,得到训练好的内标匹配评分模型。Furthermore, the internal standard matching unit 40 provided in the embodiment of the present invention also includes a model training module not shown in the accompanying drawings, which is used to perform training operations of the internal standard matching scoring model, specifically including: using the standard deviation, average residual and maximum residual after curve fitting corresponding to the sample data under different preset internal standard matching conditions as sample feature data of the corresponding sample, setting the sample data of the correct internal standard matching result as the positive class, and setting the sample data of the incorrect internal standard matching result as the negative class, to obtain a training data set; using Hinge loss as the loss function for model training, using Sigmoid function to normalize the classification results, and learning and training the training data set based on a preset machine learning model to obtain a trained internal standard matching scoring model.
具体的,本发明计算大量样本对应不同内标的匹配情况的以上三个特征数据,将正确的匹配设为正类,不正确的匹配设为负类,组成训练数据集,选择Hinge损失作为模型训练的损失函数,采用机器学习模型,如支持向量机,对该训练数据集进行训练,得到模型参数,最后使用sigmoid函数对结果作进一步处理,得到训练好的内标匹配评分模型。Specifically, the present invention calculates the above three feature data of the matching conditions of a large number of samples corresponding to different internal standards, sets the correct match as a positive class, and the incorrect match as a negative class to form a training data set, selects Hinge loss as the loss function for model training, and uses a machine learning model, such as a support vector machine, to train the training data set to obtain model parameters, and finally uses a sigmoid function to further process the results to obtain a trained internal standard matching scoring model.
本发明实施例提供的内标匹配单元40,还包括附图中未示出的阈值动态调整模块,用于当每一候选峰组合对应的匹配结果中包含的候选峰数量的最大值均不等于标准峰序列中包含的标准峰数量时,根据预设的第一阈值调整规则更新所述距离误差阈值,并返回初始匹配模块402执行相应操作,直到阈值动态调整模块更新后的距离误差阈值大于距离误差阈值的最大值;The internal standard matching unit 40 provided in the embodiment of the present invention further includes a threshold dynamic adjustment module not shown in the drawings, which is used to update the distance error threshold according to a preset first threshold adjustment rule when the maximum number of candidate peaks contained in the matching results corresponding to each candidate peak combination is not equal to the number of standard peaks contained in the standard peak sequence, and return to the initial matching module 402 to perform corresponding operations until the distance error threshold updated by the threshold dynamic adjustment module is greater than the maximum value of the distance error threshold;
阈值动态调整模块,还用于当更新后的距离误差阈值大于距离误差阈值的最大值时,根据预设的第二阈值调整规则更新所述相对高度阈值,且将距离误差阈值更新为对应的初始值,并返回初始匹配模块402执行相应操作,直到阈值动态调整模块更新后的相对高度阈值大于相对高度阈值的最大值;The threshold dynamic adjustment module is further used to update the relative height threshold according to a preset second threshold adjustment rule when the updated distance error threshold is greater than the maximum value of the distance error threshold, and update the distance error threshold to the corresponding initial value, and return to the initial matching module 402 to perform corresponding operations until the relative height threshold updated by the threshold dynamic adjustment module is greater than the maximum value of the relative height threshold;
判定模块405,还用于当更新后的相对高度阈值大于相对高度阈值的最大值时,判定内标匹配失败。The determination module 405 is further configured to determine that the internal standard matching fails when the updated relative height threshold is greater than the maximum value of the relative height threshold.
对于系统实施例而言,由于其与方法实施例基本相似,所以描述的比较简单,相关之处参见方法实施例的部分说明即可,且具有相应的技术效果。As for the system embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the partial description of the method embodiment, and it has the corresponding technical effects.
此外,本发明实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,该计算机程序被处理器执行时实现如上实现基因分析仪检测信号归一化的方法的步骤。In addition, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the method for normalizing detection signals of a genetic analyzer are implemented as described above.
In this embodiment, if the method for normalizing the detection signal of a gene analyzer is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the above method embodiments may also be carried out by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the computer-readable medium may be expanded or reduced as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals or telecommunication signals.
In addition, an embodiment of the present invention provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the method for normalizing the detection signal of a gene analyzer described above, for example steps S1 to S3 shown in FIG. 1. Alternatively, when the processor executes the computer program, it implements the functions of the modules/units of the system embodiment described above, for example the internal standard matching unit 40, the calculation unit 50, and the normalization unit 60 shown in FIG. 4.
The method, system, and device for normalizing the detection signal of a gene analyzer provided by the embodiments of the present invention compute a normalization coefficient for each capillary from the preset standard value of the internal standard and the single-channel internal standard average of that capillary's internal standard channel during electrophoresis. The spectral signals of the other color channels in each capillary are then corrected using that capillary's normalization coefficient. In this way, the peak heights of the sample under test are adjusted according to the peak heights of the internal standard, which reduces signal differences between capillaries as well as differences between repeated runs on the same capillary, and thereby improves the quality of the gene analysis results.
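The per-capillary correction summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the normalization coefficient is the ratio of the preset standard value to the capillary's internal standard average, and that the internal standard peaks have already been matched. The names `normalization_coefficient`, `normalize_capillary`, and the channel labels are hypothetical.

```python
from statistics import mean


def normalization_coefficient(standard_value, internal_standard_peaks):
    """Coefficient for one capillary.

    Assumption (not confirmed by this section): coefficient is
    standard_value divided by the mean matched internal standard peak height.
    """
    if not internal_standard_peaks:
        raise ValueError("no matched internal standard peaks for this capillary")
    average = mean(internal_standard_peaks)
    if average <= 0:
        raise ValueError("internal standard average must be positive")
    return standard_value / average


def normalize_capillary(sample_channels, standard_value, internal_standard_peaks):
    """Scale every sample color channel of one capillary by its coefficient.

    sample_channels maps a (hypothetical) channel name to a list of signal
    values; the internal standard channel itself is passed separately as
    its list of matched peak heights.
    """
    coeff = normalization_coefficient(standard_value, internal_standard_peaks)
    return {
        name: [value * coeff for value in signal]
        for name, signal in sample_channels.items()
    }


if __name__ == "__main__":
    # A weak capillary: internal standard peaks average 1000 against a
    # preset standard value of 2000, so sample signals are doubled.
    corrected = normalize_capillary(
        {"blue": [10.0, 20.0], "green": [5.0, 0.0]},
        standard_value=2000,
        internal_standard_peaks=[500, 1500],
    )
    print(corrected)
```

Because each capillary gets its own coefficient, a capillary whose internal standard reads low is scaled up and one that reads high is scaled down, which is how the correction reduces capillary-to-capillary differences.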
In addition, those skilled in the art will appreciate that, although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features from different embodiments are meant to fall within the scope of the present invention and to form different embodiments. For example, any of the claimed embodiments may be used in any combination.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in those embodiments or make equivalent replacements for some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410764366.XA CN118737291A (en) | 2024-06-13 | 2024-06-13 | Method, system and device for realizing normalization of detection signal of gene analyzer |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118737291A (en) | 2024-10-01 |
Family
ID=92857974
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202410764366.XA CN118737291A (en) | Pending | 2024-06-13 | 2024-06-13 | Method, system and device for realizing normalization of detection signal of gene analyzer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118737291A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119026064A (en) * | 2024-10-25 | 2024-11-26 | 宁波海尔施基因科技股份有限公司 | A signal processing and determination method based on capillary electrophoresis nucleic acid fragment analysis |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106715768A (en) * | 2014-07-30 | 2017-05-24 | 哈佛学院院长及董事 | Systems and methods for determining nucleic acids |
CN115035948A (en) * | 2022-07-20 | 2022-09-09 | 北京阅微基因技术股份有限公司 | CE platform multiple multi-channel STR primer design method and system based on maximum point weight cluster |
Non-Patent Citations (1)
Title |
---|
许如苏;周广彪;魏霜;段建发;刘中勇;陈冠武;: "锁核酸探针多重荧光PCR快速检测肉制品中4种动物肉掺假的研究", 检验检疫学刊, no. 02, 20 April 2016 (2016-04-20) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN118737291A (en) | Method, system and device for realizing normalization of detection signal of gene analyzer | |
US20240085329A1 (en) | Analysis system and analysis method | |
US20200003728A1 (en) | Automated quality control and spectral error correction for sample analysis instruments | |
JP2020510822A5 (en) | ||
US10041884B2 (en) | Nucleic acid analyzer and nucleic acid analysis method using same | |
US8653447B2 (en) | Chromatograph mass spectrometer | |
US8368010B2 (en) | Quadrupole mass spectrometer | |
CN111610179A (en) | System and method for LIBS rapid detection of high temperature sample components in front of furnace | |
JP5786776B2 (en) | Substance identification method and mass spectrometry system used in the method | |
JP2023159214A (en) | Waveform analysis method and waveform analysis device | |
CN118038981A (en) | Method and measuring instrument for extracting Cq value based on curvature change of qPCR amplification curve | |
CN118169110A (en) | Spectral analysis method, sample component analysis method and device, equipment and medium | |
CN118737290A (en) | Internal standard matching method, system and equipment for detecting spectrum of gene analyzer | |
CN112513618B (en) | Biopolymer analysis method and biopolymer analysis device | |
CN114391098B (en) | Biological sample analysis device and biological sample analysis method | |
CN117173059A (en) | Abnormal point and noise removing method and device for near infrared moisture meter | |
CN111650175B (en) | Nondestructive testing method for fat oxidation degree of fresh meat | |
RU2707949C1 (en) | Multichannel capillary genetic analyzer | |
CN117250183B (en) | Gas component analysis method, apparatus, device, and storage medium | |
CN115728276B (en) | Explosive detection method and detection system | |
CN112599189B (en) | Data quality assessment method for whole genome sequencing and application thereof | |
CN114910754B (en) | A method and device for evaluating the aging state of insulating materials of enclosed power equipment | |
CN118730969A (en) | Water content detection method, device and system | |
CN115856062A (en) | Method, device, equipment, medium and program product for mass spectrometer acquisition card testing and instrument parameter optimization | |
US20240044836A1 (en) | Capillary-array-electrophoresis device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||