
CN120019165A - Method for determining reaction time in sequencing and sequencing method and system - Google Patents


Info

Publication number
CN120019165A
Authority
CN
China
Prior art keywords
reaction
cycle
sequencing
reaction time
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202380068162.6A
Other languages
Chinese (zh)
Inventor
韦小芳
龚梅花
周爽
赵微
缪海涛
王静静
罗银铃
樊帆
赵胜明
徐崇钧
李计广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MGI Tech Co Ltd
Original Assignee
MGI Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MGI Tech Co Ltd
Publication of CN120019165A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62: Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63: Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/64: Fluorescence; Phosphorescence

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • General Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Wood Science & Technology (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Genetics & Genomics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A method for determining reaction time in sequencing, and a sequencing method and system. The sequencing method comprises: (1) reacting a target nucleic acid immobilized on a chip surface with a polymerization reagent to incorporate a nucleotide or nucleotide analog and obtain a reaction product; (2) detecting a fluorescent signal; (3) reacting the reaction product with a regeneration reagent to obtain a product that can undergo the next round of polymerization; and (4) repeating steps (1) to (3) for multiple cycles to finally obtain sequencing data; wherein the reaction time in at least one of steps (1) and (3) is calculated from a predetermined cycle number-reaction time relationship.

Description

Method for determining reaction time in sequencing and sequencing method and system
Technical Field
The present disclosure relates to the field of biology. In particular, the present disclosure relates to a method for determining reaction time in sequencing, and to a sequencing method and system.
Background
Second-generation sequencing is currently the most widely used sequencing technology owing to its low cost, high speed, and high throughput. It introduces reversible blocking groups, attaches fluorescent labels to the bases, and reads out the DNA sequence from the intensity of the fluorescent signals. The technology has limits, however: as read length increases, the fluorescent signal weakens and base-call quality declines, which restricts the achievable read length. Long-read sequencing on second-generation platforms therefore remains a major challenge, and accurate long reads are very difficult to obtain.
During sequencing, complementary nucleotides are incorporated into the DNA strand by a polymerase. Because each nucleotide carries a reversible blocking group, only one nucleotide is incorporated per cycle. The chip is then scanned to read the DNA sequence; once the cycle is complete, a chemical reagent is added to cleave the reversible blocking group and the fluorescent group (regeneration), and the next cycle begins.
Incorporating a nucleotide into the DNA template strand by a polymerase requires a certain reaction temperature and reaction time, as does the reaction that cleaves the reversible blocking group and the fluorescent group. Different reaction times affect the outcome of polymerization and regeneration (cleavage), and as the sequencing read length increases, reaction efficiency gradually declines owing to the effects of the chemical reagents and of the DNA template strand structure.
Therefore, how to determine the reaction time during sequencing remains to be studied.
Disclosure of Invention
The present disclosure aims to solve, at least to some extent, the technical problems existing in the prior art.
It should be noted that the present disclosure has been completed based on the following findings of the applicant:
Both polymerization and regeneration require a certain reaction time. The prior art uses the same polymerization time and the same regeneration time from the first cycle to the last and, for paired-end sequencing, the same reaction times for both strands.
As the sequencing read length grows, reagent effects on the DNA template strand structure, photodamage, and declining enzyme activity reduce the efficiency of polymerization and regeneration. Using the same reaction time throughout wastes time in the early cycles, where the read is still short and the reaction can be completed in less time. Conversely, in the later cycles (for example, from the 150th cycle onward), where reaction efficiency has dropped, continuing to use the reaction time suited to the early cycles leaves the reaction incomplete and lowers sequencing quality. Thus, for the same total time, the poor allocation of reaction time in the prior art results in insufficient sequencing reactions and poor sequencing quality.
If the reaction time is blindly increased late in the run to obtain a better result, better-quality data may be obtained for short reads in some cases, but long-read sequencing suffers: an excessively long reaction time keeps the nucleic acid at high temperature for longer and damages it irreversibly, so that it can no longer maintain the structural characteristics needed to complete sequencing, and sequencing quality deteriorates.
Similarly, if the same reaction temperature is used in the sequencing process, insufficient reaction due to low reaction temperature or damage to the nucleic acid sequence due to high reaction temperature is liable to occur, which affects the sequencing quality.
In theory, the longer the DNA template strand reacts at high temperature, the greater the damage to the nucleic acid sequence, and the damage becomes more severe as the cumulative reaction time or the temperature increases. For paired-end sequencing, the reaction time or reaction temperature used for the first strand affects the first-strand result, the second-strand result, and the overall result; for single-end sequencing, the reaction time or reaction temperature used for the strand directly affects the quality of the sequencing result. Poor allocation of reaction time or reaction temperature wastes reaction time in the early cycles, leaves the reaction in the later cycles incomplete, and leads to long reaction times, low accuracy, and increased sequencing time.
In view of this, the applicant adopts different reaction times or reaction temperatures for different cycles, constructs a cycle number-reaction time relationship and a cycle number-reaction temperature relationship, and allocates the reaction time and reaction temperature of each cycle on the basis of these pre-constructed relationships, so that the reaction in each cycle is more complete and damage to the nucleic acid sequence is reduced. The method thus addresses the problems caused by reaction times that are too long or too short and reaction temperatures that are too high or too low, balances the reaction time well, and is suitable for long-read sequencing. With a shorter reaction time in the early cycles, the nucleic acid sequence is protected from high-temperature damage without greatly reducing sequencing quality, which leaves sufficient margin for the later cycles, safeguards the overall sequencing quality, and mitigates the effect of the early cycles on the later cycles and on the run as a whole.
To this end, in one aspect of the disclosure, the disclosure proposes a sequencing method. According to an embodiment of the disclosure, the sequencing method comprises: (1) reacting a target nucleic acid immobilized on a chip surface with a polymerization reagent to incorporate a nucleotide or nucleotide analog and obtain a reaction product; (2) detecting a fluorescent signal; (3) reacting the reaction product with a regeneration reagent to obtain a product that can undergo the next round of polymerization; and (4) repeating steps (1) to (3) for multiple cycles to finally obtain sequencing data, wherein the reaction time in at least one of steps (1) and (3) is calculated from a predetermined cycle number-reaction time relationship.
According to the method of the embodiment of the disclosure, different reaction times are adopted for different cycles, a cycle number-reaction time relationship is constructed, and the reaction time of each cycle is allocated on the basis of the pre-constructed relationship, so that the reaction in each cycle is more complete while the damage that prolonged high temperature causes to the nucleic acid is reduced, which improves sequencing quality and facilitates long-read sequencing.
According to an embodiment of the present disclosure, the above-described sequencing method may further have the following additional technical features:
According to one embodiment of the disclosure, the total number of cycles is divided into N cycle segments, the reaction time of every cycle within a cycle segment is the same, the reaction time differs between cycle segments, and N is an integer greater than 1.
According to one embodiment of the present disclosure, the reaction time of each cycle segment increases with the cycle segment number.
According to one embodiment of the present disclosure, as the cycle segment number increases, the reaction time of the cycle segments increases following an exponential or linear function distribution.
According to one embodiment of the present disclosure, the linear function distribution is selected from an arithmetic series distribution or a geometric series distribution.
According to one embodiment of the disclosure, the cycle number-reaction time relationship is selected from the formula reaction time (s) = A + (B - A)/N × (C - 1), wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and N represents the total number of sequencing cycles, and wherein the reaction time is the formula result rounded to an integer.
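The linear relationship can be sketched in a few lines of Python. This is only an illustration of the stated formula: the function name, the argument names, and the use of round-to-nearest for "rounded" are assumptions, and the 30 s and 60 s values used to exercise it are placeholders rather than values from the disclosure.

```python
def linear_reaction_time(cycle: int, total_cycles: int,
                         first: float, last: float) -> int:
    """Reaction time (s) = A + (B - A) / N * (C - 1), rounded.

    first        -> A, reaction time constant of the first cycle
    last         -> B, reaction time constant of the last cycle
    cycle        -> C, current sequencing cycle number (1-based)
    total_cycles -> N, total number of sequencing cycles
    """
    if not 1 <= cycle <= total_cycles:
        raise ValueError("cycle must lie in 1..total_cycles")
    # Assumption: "rounded" is taken as Python's round-to-nearest.
    return round(first + (last - first) / total_cycles * (cycle - 1))


# Hypothetical 300-cycle run ramping from 30 s towards 60 s.
schedule = [linear_reaction_time(c, 300, 30, 60) for c in range(1, 301)]
```

Because the slope divides by N rather than N - 1, the unrounded value at the last cycle is B - (B - A)/N, slightly below B.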
According to one embodiment of the disclosure, the cycle number-reaction time relationship is determined as follows: steps (1)-(4) are performed with the same reaction time in every cycle, and a normalized curve is obtained from the resulting sequencing error rates; the normalized value of each cycle in the normalized curve is multiplied by a coefficient P, where P is the reaction time expected to be added in the last cycle, and an exponential curve is constructed from the resulting values and the cycle numbers; the reaction time required for the first cycle is then added to the exponential formula corresponding to the exponential curve to obtain the cycle number-reaction time relationship.
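A minimal sketch of this procedure follows. The disclosure does not state how the error rates are normalized or how the exponential curve is fitted, so two assumptions are made here and marked in the code: normalization divides by the maximum error rate, and the curve y = a·exp(b·cycle) is fitted by least squares on the logarithm of the scaled values. All names are hypothetical.

```python
import math
from typing import Callable, Sequence


def fit_exponential_schedule(error_rates: Sequence[float],
                             extra_last: float,
                             first_time: float) -> Callable[[int], float]:
    """Build a cycle -> reaction time (s) function from per-cycle error rates.

    error_rates -> error rate of cycles 1..n from a constant-time run (all > 0)
    extra_last  -> coefficient P, reaction time expected to be added at the last cycle
    first_time  -> reaction time required for the first cycle
    """
    if len(error_rates) < 2 or min(error_rates) <= 0:
        raise ValueError("need at least two strictly positive error rates")
    # Assumption: normalize by the maximum so the values lie in (0, 1].
    peak = max(error_rates)
    scaled = [extra_last * e / peak for e in error_rates]
    # Assumption: fit ln(y) = ln(a) + b * cycle by ordinary least squares.
    xs = list(range(1, len(scaled) + 1))
    ys = [math.log(v) for v in scaled]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)

    def reaction_time(cycle: int) -> float:
        # Exponential formula plus the time required for the first cycle.
        return first_time + a * math.exp(b * cycle)

    return reaction_time
```

With error rates that grow over the run, the fitted exponent is positive, so the added time is small in the early cycles and approaches P at the last cycle.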
According to one embodiment of the present disclosure, in each cycle the reaction includes immersing the sequencing chip in a reaction vessel containing a polymerization reagent or a regeneration reagent and, after the reaction is completed, transferring the sequencing chip to another reaction vessel; as the number of cycles increases, the number of immersions in each reaction vessel increases, and the time of each immersion increases.
According to one embodiment of the present disclosure, the time of each soak is determined according to the following formula: soak time (s) = 50 + 10 × (X - 1)/(Y - 1), where X is the number of soaks performed so far in the same reaction vessel and Y is the total number of soaks in the same reaction vessel.
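The soak-time formula can be written out directly. The sketch below reads "10 (X-1)/(Y-1)" as 10 multiplied by (X - 1)/(Y - 1), which is an assumption about the original notation; the function and argument names are hypothetical.

```python
def soak_time(soak_index: int, total_soaks: int) -> float:
    """Soak time (s) = 50 + 10 * (X - 1) / (Y - 1).

    soak_index  -> X, count of the current soak in the same reaction vessel
    total_soaks -> Y, total number of soaks in the same reaction vessel
    """
    if total_soaks < 2:
        raise ValueError("the formula is undefined for fewer than two soaks")
    if not 1 <= soak_index <= total_soaks:
        raise ValueError("soak_index must lie in 1..total_soaks")
    return 50 + 10 * (soak_index - 1) / (total_soaks - 1)
```

Under this reading the first soak lasts 50 s and the last soak 60 s, with the intermediate soaks spaced evenly between them.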
According to one embodiment of the present disclosure, the reaction temperature in at least one of step (1) and step (3) is calculated from a predetermined cycle number-reaction temperature relationship. The total number of cycles is divided into N cycle segments, the reaction temperature of every cycle within a cycle segment is the same, the reaction temperature differs between cycle segments, and N is an integer greater than 1; the reaction temperature of each cycle segment increases with the cycle segment number, following an exponential or linear function distribution, and the linear function distribution is selected from an arithmetic series distribution or a geometric series distribution.
According to one embodiment of the disclosure, after the multiple cycles, the target nucleic acid immobilized on the chip (referred to as the first strand) is subjected to a multiple displacement amplification reaction using a DNA polymerase having strand displacement activity to obtain a complementary strand (referred to as the second strand), and steps (1)-(4) are performed on the second strand, wherein the reaction time and/or reaction temperature of each cycle of the first strand differs from the corresponding reaction time and/or reaction temperature of each cycle of the second strand.
In yet another aspect of the disclosure, the disclosure presents a sequencing system. According to one embodiment of the disclosure, the sequencing system comprises a chip, a sequencing device for sequencing target nucleic acid immobilized on the surface of the chip, one or more processors configured to perform (1) reacting target nucleic acid immobilized on the surface of the chip with a polymerization reagent, incorporating a nucleotide or nucleotide analogue to obtain a reaction product, (2) detecting a fluorescent signal, (3) reacting the reaction product with a regeneration reagent to obtain a product that can undergo the next polymerization reaction, (4) repeating steps (1) to (3), and so forth, performing a plurality of cycles to finally obtain sequencing data, wherein the time of the reaction in at least one of the steps (1) and (3) is calculated by a predetermined cycle number-reaction time relationship.
According to one embodiment of the disclosure, the one or more processors are configured to perform: dividing the total number of cycles into N cycle segments, the reaction time of every cycle within a cycle segment being the same, the reaction time differing between cycle segments, and N being an integer greater than 1; and increasing the reaction time of the cycle segments following an exponential or linear function distribution as the cycle segment number increases.
According to one embodiment of the disclosure, the one or more processors are configured to perform determining a reaction time according to a formula, wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and D represents the total number of sequencing cycles, and wherein the reaction time is the formula result rounded to an integer; or the one or more processors are configured to perform: obtaining a normalized curve from the sequencing error rates obtained by sequencing with the same reaction time in every cycle; multiplying the normalized value of each cycle in the normalized curve by a coefficient P, which is the reaction time expected to be added in the last cycle; constructing an exponential curve from the resulting values and the cycle numbers; and adding the reaction time required for the first cycle to the exponential formula corresponding to the exponential curve, to obtain the cycle number-reaction time relationship.
According to one embodiment of the present disclosure, the one or more processors are configured to perform, in each cycle of reaction: immersing the chip in a reaction vessel containing a polymerization reagent or a regeneration reagent and, after the reaction is completed, transferring the chip to another reaction vessel, wherein as the number of cycles increases, the number of immersions in each reaction vessel increases and the time of each immersion increases.
According to one embodiment of the present disclosure, the one or more processors are configured to perform determining the time of each soak according to the following formula: soak time (s) = 50 + 10 × (X - 1)/(Y - 1), where X is the number of soaks performed so far in the same reaction vessel and Y is the total number of soaks in the same reaction vessel.
According to one embodiment of the disclosure, the reaction temperature in at least one of steps (1) and (3) is calculated from a predetermined cycle number-reaction temperature relationship, and the one or more processors are configured to perform: dividing the total number of cycles into N cycle segments, the reaction temperature of every cycle within a cycle segment being the same, the reaction temperature differing between cycle segments, and N being an integer greater than 1; the reaction temperature of each cycle segment increases with the cycle segment number following an exponential or linear function distribution, the linear function distribution being selected from an arithmetic series distribution or a geometric series distribution.
In yet another aspect of the disclosure, the disclosure presents an electronic device. According to one embodiment of the disclosure, the electronic device includes a memory coupled with one or more processors, the memory for storing computer program code comprising computer instructions that the one or more processors call to cause the electronic device to perform a sequencing method as previously described.
In yet another aspect of the disclosure, the disclosure presents a computer-readable storage medium comprising computer instructions. According to one embodiment of the disclosure, the computer instructions, when run on an electronic device, cause the electronic device to perform a sequencing method as previously described.
In yet another aspect of the disclosure, the disclosure presents a computer program product. According to an embodiment of the present disclosure, the computer program product, when run on a computer, causes the computer to perform the sequencing method as described previously.
Additional aspects and advantages of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The foregoing and/or additional aspects and advantages of the present disclosure will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, wherein:
FIG. 1 shows a schematic flow diagram of a sequencing method according to one embodiment of the present disclosure;
FIG. 2 shows a cycle number-linear incubation time graph according to one embodiment of the present disclosure;
FIG. 3 shows a cycle number-exponential incubation time graph according to one embodiment of the present disclosure, in which a represents the error rate curve of the sequenced strand, b represents the curve obtained by normalizing the error rate curve, and c represents the exponential curve obtained from the normalized values × a coefficient B;
FIG. 4 shows a schematic diagram of a sequencing system according to one embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of an electronic device structure according to one embodiment of the present disclosure;
FIG. 6 shows a graph comparing constant incubation and gradient incubation according to example 1 of the present disclosure, with the cycle number on the abscissa and the Q30% value of each cycle on the ordinate; the decline of Q30% with increasing cycle number reflects, to some extent, the quality of the sequencing result;
FIG. 7 shows a graph of incubation time according to example 2 of the present disclosure;
FIG. 8 shows a graph of Q30% value analysis with different incubation time patterns on the abscissa and Q30% index on the ordinate, according to example 2 of the present disclosure;
FIG. 9 shows a graph of the Q30% value of each cycle according to example 2 of the present disclosure, with the cycle number on the abscissa and the Q30% value of each cycle on the ordinate, showing how Q30% declines as the cycle number increases;
FIG. 10 shows a graph of the Q30% value of each cycle according to example 3 of the present disclosure, with the cycle number on the abscissa and the Q30% value of each cycle on the ordinate; the decline of Q30% with increasing cycle number reflects, to some extent, the quality of the sequencing result;
FIG. 11 shows a comparative analysis of the effect of an intermediate constant reaction temperature and a gradient reaction temperature on the decline in sequencing quality according to example 4 of the present disclosure, with the cycle number on the abscissa and the Q30% value of each cycle on the ordinate; the decline of Q30% with increasing cycle number reflects, to some extent, the quality of the sequencing result.
Detailed Description
Embodiments of the present disclosure are described in detail below. The embodiments described below are exemplary only for explaining the present disclosure, and are not to be construed as limiting the present disclosure.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implying the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features. Further, in the description of the present disclosure, unless otherwise indicated, "a plurality" means two or more.
The endpoints and any values of the ranges disclosed herein are not limited to the precise range or value, and are understood to encompass values approaching those ranges or values. For numerical ranges, one or more new numerical ranges may be found between the endpoints of each range, between the endpoint of each range and the individual point value, and between the individual point value, in combination with each other, and are to be considered as specifically disclosed herein.
In this document, the term "comprising" is an open-ended expression, i.e., including what is indicated in the disclosure, but not excluding other aspects.
The present disclosure presents a sequencing method, a method of determining reaction time in sequencing, a sequencing system, an electronic device, and a computer readable storage medium, each of which is described in detail below.
Sequencing method
In one aspect of the disclosure, the disclosure proposes a sequencing method. According to an embodiment of the present disclosure, referring to FIG. 1, the sequencing method includes S100 reacting a target nucleic acid with a polymerization reagent, S200 detecting a fluorescent signal, S300 reacting a reaction product with a regeneration reagent, and S400 repeating steps S100 to S300, respectively, which will be described in detail below.
S100 reacting the target nucleic acid with a polymerization reagent
In this step, the target nucleic acid immobilized on the chip surface is reacted with a polymerization reagent (also referred to as the "polymerization reaction" in the present disclosure), and a nucleotide or nucleotide analog is incorporated to obtain a reaction product. Guided by the template, the polymerization reagent adds dNTPs to the 3'-OH end of the primer, or of the nucleotide or nucleotide analog incorporated in the previous cycle, so that the primer is extended and a new complementary DNA strand is synthesized.
In the present disclosure, the term "polymerization reagent" refers to the reagents required for the polymerization reaction, and all reagents disclosed in the art as being used in polymerization reactions are included in the present disclosure, such as dNTPs, DNA polymerase, and buffers. The term "polymerization" may include both strand synthesis, i.e. strand extension, and fill-in, i.e. during strand synthesis some 3'-OH ends remain without an attached dNTP carrying the reversible blocking group and fluorophore, and these are attached during fill-in, so that the signals of all reads are synchronized.
According to one embodiment of the present disclosure, step S100 may also include washing the synthesized product with a washing reagent; specifically, the sequencing chip carrying the synthesized product is placed in a reaction vessel containing the washing reagent to wash away the free reaction reagent on the chip.
In the present disclosure, the term "reaction vessel" refers to a reaction site where polymerization and removal of reversible blocking groups and fluorophores can be performed during sequencing, and the kind of a specific reaction vessel is not strictly limited, and may be, for example, a reagent tank on a sequencing platform.
S200 detecting fluorescent signals
By detecting the fluorescent signal, sequencing read information is known based on signal color, intensity, and the like.
S300, reacting the reaction product with a regeneration reagent
In this step, the reaction product is reacted with a regeneration reagent (also referred to as the "regeneration reaction" in this disclosure) to obtain a product that can undergo the next round of polymerization. Reacting the reaction product with the regeneration reagent removes the reversible blocking group and the fluorescent group carried on the reaction product, thereby enabling the next round of polymerization.
In the present disclosure, the term "blocking group" refers to a group that terminates chain extension by binding to the 3' hydroxyl or another site of the deoxyribose, so that the deoxyribose cannot form a phosphodiester bond with subsequent dNTPs, or by creating steric hindrance that prevents the polymerase from carrying out polymerization. When the group is reversibly removed, the 3'-hydroxyl structure is restored and a phosphodiester bond can be formed with subsequent dNTPs, thereby completing chain extension; alternatively, reversible removal restores the spatial structure of the DNA, so that polymerization with subsequent dNTPs and polymerase can proceed and chain extension is completed. The blocking group may be any group used in the art to block dNTPs, typically but not limited to azidomethylenes, and some fluorescent groups can also serve as blocking groups. Other useful blocking groups include, for example, those disclosed in International application WO2014139596A1. The present disclosure is not particularly limited as to the kind of blocking group, and any group used in the art to block the 3' hydroxyl group of dNTPs may be used as the blocking group in the present disclosure. These blocking groups can be reversibly detached from the dNTPs.
The present disclosure is not limited in terms of the type of "regeneration agent", and any material disclosed in the art that can remove both the reversible blocking group and the fluorescent group is included in the present disclosure.
S400 repeating steps S100 to S300
In this step, steps S100 to S300 are repeated, and so on, a plurality of cycles are performed, and finally sequencing data is obtained.
According to one embodiment of the present disclosure, the reaction time in at least one of steps S100 and S300 is calculated from a predetermined cycle number-reaction time relationship. According to the method of the embodiment of the disclosure, different reaction times are adopted for different cycles, a cycle number-reaction time relationship is constructed, and the reaction time of each cycle is allocated on the basis of the pre-constructed relationship, so that the reaction in each cycle is more complete while the damage that prolonged high temperature causes to the nucleic acid is reduced, which improves sequencing quality and facilitates long-read sequencing.
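The control flow of steps S100 to S400 with per-cycle reaction times can be sketched as a simple loop. This is a schematic illustration only: the callables standing in for the instrument operations (`polymerize`, `detect`, `regenerate`) and for the two schedules are hypothetical placeholders and do not correspond to any real instrument interface.

```python
from typing import Any, Callable, List


def run_sequencing(total_cycles: int,
                   polymerize: Callable[[float], None],
                   detect: Callable[[], Any],
                   regenerate: Callable[[float], None],
                   polymerization_time: Callable[[int], float],
                   regeneration_time: Callable[[int], float]) -> List[Any]:
    """Run S100-S300 once per cycle, looking up each reaction time by cycle number."""
    signals = []
    for cycle in range(1, total_cycles + 1):   # S400: repeat for multiple cycles
        polymerize(polymerization_time(cycle))  # S100: polymerization reaction
        signals.append(detect())                # S200: detect fluorescent signal
        regenerate(regeneration_time(cycle))    # S300: regeneration reaction
    return signals
```

Passing the schedules as functions of the cycle number keeps the loop independent of whether a linear, exponential, or segmented relationship is used.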
According to one embodiment of the present disclosure, the total number of cycles of a cycle is divided into N cycle segments, the reaction time of each cycle within each cycle segment is the same, the reaction time between each cycle segment is different, and N is an integer greater than 1.
Different cycles may use different reaction times. The cycles may be divided into cycle segments according to a given rule, with all cycles within a segment using the same reaction time and different segments using different reaction times. Alternatively, every cycle may use its own reaction time, calculated from the cycle number according to a given formula; in that case the number of cycle segments equals the total number of cycles.
According to one embodiment of the present disclosure, the reaction time per cycle increases as the cycle number increases. According to another embodiment of the present disclosure, as the cycle segment number increases, the reaction time of the successive cycle segments increases as an arithmetic progression or a geometric progression.
During the sequencing reaction, the reaction time of the polymerization and regeneration reactions can be increased stepwise (as a gradient) within a certain range from the first cycle to the last cycle. The increase may follow an arithmetic progression, a geometric progression, or another pattern, specifically as follows:
1. The total number of cycles can be divided into segments by setting the number of segments; cycles within the same cycle segment use the same reaction time, and different cycle segments use different reaction times.
1) The equal-division segmentation rule is as follows:
If the total number of cycles is 300, the 300 cycles may be divided into X segments, so that each cycle segment contains Y = 300/X cycles. For example, 300 cycles may be divided into 2 cycle segments of 150 cycles each, into 5 cycle segments of 60 cycles each, or into 6 cycle segments of 50 cycles each, and so on.
2) The custom segmentation rule is as follows:
For example, 300 cycles can be divided so that cycles 1-150 form the first cycle segment, cycles 151-200 the second, cycles 201-250 the third, and cycles 251-300 the fourth, and so on. The number of segments and the number of cycles per segment can be adjusted as needed, and the segments need not contain the same number of cycles.
2. After the segmentation is set, cycles in the same cycle segment use the same reaction time and different cycle segments use different reaction times. The reaction time can increase as an arithmetic progression or a geometric progression, or the reaction time of each cycle segment can be customized.
1) Arithmetic progression increment, for example:
If 300 cycles are divided into 5 cycle segments of 60 cycles each and the reaction time increases as an arithmetic progression within the range of 20 s-100 s, the reaction times of the 5 cycle segments are 20 s, 40 s, 60 s, 80 s and 100 s, respectively. This is merely illustrative; other similar segmentations and reaction times may be used.
2) Geometric progression increment, for example:
If 200 cycles are divided into 4 cycle segments of 50 cycles each and the reaction time increases as a geometric progression within the range of 15 s-120 s, the reaction times of the 4 cycle segments are 15 s, 30 s, 60 s and 120 s, respectively. This is merely illustrative; other similar segmentations and reaction times may be used.
3) Custom increment, for example:
If 300 cycles are divided into 5 cycle segments of 60 cycles each and the reaction time increases stepwise within the range of 20 s-100 s, the reaction times of the 5 cycle segments may be 20 s, 30 s, 60 s, 80 s and 100 s, respectively. This is merely illustrative; other similar segmentations and reaction times may be used.
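The segmentation and per-segment reaction times described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`equal_segments`, `arithmetic_times`, `geometric_times`, `time_for_cycle`) are hypothetical and are not part of the disclosure; the numbers are the illustrative values given in the text.

```python
def equal_segments(total_cycles, n_segments):
    """Split cycles 1..total_cycles into n_segments equal (first, last) ranges."""
    if total_cycles % n_segments != 0:
        raise ValueError("total_cycles must be divisible by n_segments")
    size = total_cycles // n_segments
    return [(i * size + 1, (i + 1) * size) for i in range(n_segments)]


def arithmetic_times(first, last, n_segments):
    """Reaction times rising from `first` to `last` in equal steps."""
    if n_segments < 2:
        raise ValueError("n_segments must be at least 2")
    step = (last - first) / (n_segments - 1)
    return [first + i * step for i in range(n_segments)]


def geometric_times(first, ratio, n_segments):
    """Reaction times where each segment is `ratio` times the previous one."""
    return [first * ratio ** i for i in range(n_segments)]


def time_for_cycle(cycle, segments, times):
    """Look up the reaction time of the segment that contains `cycle`."""
    for (start, end), seconds in zip(segments, times):
        if start <= cycle <= end:
            return seconds
    raise ValueError(f"cycle {cycle} is outside every segment")


# 300 cycles in 5 equal segments, 20 s to 100 s as an arithmetic progression.
segments = equal_segments(300, 5)
times = arithmetic_times(20, 100, 5)
print(time_for_cycle(61, segments, times))  # cycle 61 lies in the second segment

# 200 cycles in 4 equal segments, 15 s doubling each segment (geometric).
geo_segments = equal_segments(200, 4)
geo_times = geometric_times(15, 2, 4)
print(time_for_cycle(200, geo_segments, geo_times))
```

A custom segmentation needs no helper: a hand-written list such as `[(1, 150), (151, 200), (201, 250), (251, 300)]` can be passed to `time_for_cycle` together with any list of per-segment times of the same length.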
The polymerization and regeneration reactions may use different segmentation schemes and reaction-time settings, each of which can be edited separately.
The segmentation scheme and reaction-time settings of the first strand and the second strand may also differ and can be edited separately.
For example, as shown in table 1 below:
Table 1. One arrangement of the gradient reaction
According to one embodiment of the disclosure, the cycle number-reaction time relationship is given by the formula: reaction time (s) = A + (B - A)/D × (C - 1), wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and D represents the total number of sequencing cycles, and wherein the reaction time is the result of the formula rounded to an integer.
During the sequencing reaction, the reaction time of the polymerization and regeneration reactions increases linearly within a certain range from the first cycle to the last cycle.
The reaction time of each current cycle is determined according to the following formula:
Reaction time (s) = A + (B - A)/D × (C - 1)
wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and D represents the total number of sequencing cycles, and wherein the reaction time is the result of the formula rounded to an integer.
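As a minimal sketch of the linear formula above, the following Python function evaluates A + (B - A)/D × (C - 1) and rounds the result to an integer. The function name and the `rounding` parameter are hypothetical; the text does not say which rounding rule is used, so rounding to the nearest integer is assumed by default and the other modes are offered for comparison. The values A = 20 s, B = 100 s and D = 300 are illustrative only.

```python
import math
from fractions import Fraction


def linear_reaction_time(a, b, cycle, total_cycles, rounding="nearest"):
    """Reaction time (s) = A + (B - A)/D * (C - 1), rounded to an integer.

    a, b: integer reaction time constants (s) of the first and last cycle (A, B).
    cycle: current sequencing cycle number C (1-based).
    total_cycles: total number of sequencing cycles D.
    rounding: assumed rule; the source only states that the result is rounded.
    """
    if not 1 <= cycle <= total_cycles:
        raise ValueError("cycle must be between 1 and total_cycles")
    # Exact rational arithmetic avoids floating-point noise before rounding.
    raw = a + Fraction(b - a, total_cycles) * (cycle - 1)
    if rounding == "nearest":
        return math.floor(raw + Fraction(1, 2))  # round half up
    if rounding == "up":
        return math.ceil(raw)
    if rounding == "down":
        return math.floor(raw)
    raise ValueError(f"unknown rounding mode: {rounding}")


# Illustrative values only: A = 20 s, B = 100 s, D = 300 cycles.
for c in (1, 151, 300):
    print(c, linear_reaction_time(20, 100, c, 300))
```

Because the formula multiplies by (C - 1) but divides by D, the unrounded value at the last cycle is slightly below B; whether the last cycle reaches B therefore depends on the rounding rule chosen.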
The linear reaction-time settings for the polymerization and regeneration reactions may differ and can be edited separately.
The linear reaction-time settings of the first and second strands may differ and can be edited separately.
FIG. 2 shows one reaction-time setting for the linear mode; it is not the only linear setting contemplated by the present disclosure.
According to one embodiment of the disclosure, the cycle number-reaction time relationship is determined as follows: steps (1)-(4) are performed with the same reaction time in every cycle, and a normalized curve is obtained from the resulting sequencing error rates; the normalized value of each cycle in the normalized curve is multiplied by a coefficient P, where P is the reaction time by which the last cycle is expected to increase, and an exponential curve is constructed from the resulting values and the cycle numbers; the reaction time required for the first cycle is then added to the exponential formula corresponding to the exponential curve to obtain the cycle number-reaction time relationship.
During the sequencing reaction, the reaction time of the polymerization and regeneration reactions may increase exponentially within a certain range from the first cycle to the last cycle. The formula for the exponential mode may be set as follows, although it is not limited to this approach:
Sequencing is performed with a fixed reaction time, and the sequencing error rate curve is normalized to the error rate of the last cycle: taking the normalization coefficient as Q, the error rates of all cycles are divided by Q to obtain a normalized curve;
further, the normalized value of each cycle is multiplied by a coefficient P, where P is the incubation time by which the last cycle is planned to increase;
further, Excel, MATLAB or another tool is used to fit the resulting curve and obtain the exponential formula;
further, the reaction time required for the first cycle is added to the exponential formula to obtain the exponential reaction time curve.
Using Excel, MATLAB or another tool to fit the curve gives a formula of the form:
S = a*K^3 + b*K^2 + c;
Adding the incubation time T of the first cycle to this formula gives the exponential incubation formula:
S = a*K^3 + b*K^2 + c + T. In this example T = 50, i.e., the exponential incubation formula is S = a*K^3 + b*K^2 + c + 50, where K is the current cycle number, S is the incubation time of that cycle, and a, b and c are constants.
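The procedure above (normalize by Q, scale by P, fit a curve, add T) can be sketched as follows. This is an assumption-laden illustration, not the tool chain of the disclosure: the text names Excel or MATLAB for the fit, whereas this sketch uses an ordinary least-squares fit written in plain Python; the fitted form a*K^3 + b*K^2 + c is the one given in the text; all function names and the error-rate data are made up for the example.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Swap in a row with a non-zero entry in this column.
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def fit_curve(cycles, values):
    """Least-squares fit of values to a*K^3 + b*K^2 + c; returns (a, b, c)."""
    rows = [(Fraction(k) ** 3, Fraction(k) ** 2, Fraction(1)) for k in cycles]
    ys = [Fraction(v) for v in values]
    # Normal equations (X^T X) coeffs = X^T y for the three basis terms.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    target = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    a, b, c = solve_linear(normal, target)
    return a, b, c


def build_time_formula(error_rates, p_seconds, first_cycle_seconds):
    """Return S(K) built from per-cycle error rates of a fixed-time run."""
    q = Fraction(error_rates[-1])  # normalization coefficient Q (last cycle)
    scaled = [Fraction(e) / q * p_seconds for e in error_rates]  # times P
    a, b, c = fit_curve(range(1, len(error_rates) + 1), scaled)

    def reaction_time(k):
        # Fitted curve plus the first-cycle time T.
        return a * k ** 3 + b * k ** 2 + c + first_cycle_seconds

    return reaction_time


# Made-up error rates (arbitrary units) that follow the fitted form exactly.
error_rates = [2 * k ** 3 + 3 * k ** 2 + 5 for k in range(1, 11)]
formula = build_time_formula(error_rates, p_seconds=30, first_cycle_seconds=50)
print(float(formula(1)), float(formula(10)))
```

Exact fractions are used so that the small normal-equation system is solved without rounding error; with real error-rate data a numerical library fit would be the more usual choice.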
FIG. 3 illustrates one time setting for the exponential mode; it is not the only exponential setting contemplated by the present disclosure.
In the present disclosure, the exponential settings for the polymerization and regeneration reactions may differ and can be edited separately.
Further, the exponential reaction-time settings of the first and second strands may differ and can be edited separately.
The sequencing methods of the present disclosure can be single-end or paired-end sequencing methods, including sequencing by primer extension using labeled or unlabeled nucleotides, such as sequencing by ligation or pyrosequencing, for example any of the Sanger dideoxy, nanopore, or "NexGen" sequencing methods of the art (e.g., an MGI sequencing platform, the ROCHE 454 sequencing platform, the ILLUMINA™ SOLEXA™ sequencing platform, the SOLiD™ sequencing platform of LIFE TECHNOLOGIES/APPLIED BIOSYSTEMS, the SMRT™ sequencing platform of PACIFIC BIOSCIENCES, the POLONATOR polony sequencing platform, the COMPLETE GENOMICS sequencing platform, the sequencing platform of INTELLIGENT BIOSYSTEMS, the HELICOS sequencing platform, or any other sequencer and system in the art). The manner in which the polymerization reaction and the regeneration reaction of steps S100 and S300 are carried out is not strictly limited. On some sequencing platforms, a reaction reagent or a regeneration reagent is injected onto the sequencing chip by a syringe pump, as on MGI's MGISEQ-200RS, MGISEQ-2000RS, DNBSEQ-T7, DNBSEQ-G99 and DNBSEQ-E25. Other sequencing platforms use immersion (immersion-type biochemistry platforms): the chip is immersed in a reaction vessel containing the reaction reagent or the regeneration reagent, and after a given reaction is complete the chip is transferred to another reaction vessel for the next operation, as on the DNBSEQ-T10×4RS sequencing platform.
According to one embodiment of the present disclosure, each cycle of the reaction includes immersing the chip in a reaction vessel containing a polymerization or regeneration reagent, transferring the sequencing chip to another reaction vessel after the reaction is completed, and as the number of cycles increases, the number of times of immersion in each reaction vessel increases and the time of each immersion increases.
According to one embodiment of the present disclosure, the time of each soak is determined according to the following formula: soak time (s) = 50 + 10 × (X - 1)/(Y - 1), where X is the ordinal number of the current soak in the same reaction vessel and Y is the total number of soaks in the same reaction vessel.
When operating on a sequencing platform that uses reagent immersion, a small amount of residual liquid is carried along each time the chip is transferred into a new reaction vessel (e.g., a reagent tank), which lowers the concentration of the polymerization or regeneration reagent in that vessel. In addition, the active components of the reagents in some key reagent tanks may change as the reagents are kept heated. For the reaction to proceed sufficiently, the time of each soak therefore needs to be lengthened. When the reagent concentration in a reaction vessel becomes too low, or the active components have degraded to the point of hindering the reaction, the reagent must be replaced with fresh reagent, at which point the soak time can be shortened accordingly. Specifically, through the biochemical parameter settings in the software of the sequencing platform (e.g., an immersion-type biochemistry platform), different temperatures can be set for different sequencing stages to obtain a suitable sequencing result, and different reaction times can be set for different soak counts within the same batch of reagent to obtain the best experimental result.
The specific scheme is as follows:
Set the initial and final biochemical (reaction) times of the biochemical tank; for example, the initial biochemical time of biochemical tank 1 is 50 s and the final biochemical time is 60 s;
when 6 slides are sequenced simultaneously, the number of soaks is set to 78;
for the X-th soak in the reagent of the biochemical tank, reaction time (s) = 50 + 10 × (X - 1)/(78 - 1).
The reaction time is 50 s for the first soak of a slide in biochemical tank 1 and 60 s for the 78th soak; the reaction time increases linearly with the number of soaks.
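The soak-time rule can be written as a short function. The sketch below generalizes the constants 50 and 10 of the formula into a start and end time, which is an assumption made for readability; the function name `soak_time` is hypothetical. The text does not state whether the soak time is rounded, so no rounding is applied here.

```python
def soak_time(soak_index, total_soaks, start_s=50.0, end_s=60.0):
    """Reaction time (s) for the X-th soak in one reagent tank.

    Implements start + (end - start) * (X - 1) / (Y - 1); with the defaults
    this is 50 + 10 * (X - 1) / (Y - 1) as in the text.
    """
    if total_soaks < 2:
        raise ValueError("total_soaks must be at least 2")
    if not 1 <= soak_index <= total_soaks:
        raise ValueError("soak_index must be between 1 and total_soaks")
    return start_s + (end_s - start_s) * (soak_index - 1) / (total_soaks - 1)


# The scheme from the text: 78 soaks, 50 s for the first and 60 s for the last.
for x in (1, 40, 78):
    print(x, round(soak_time(x, 78), 2))
```

When the reagent in a tank is replaced, the counter X would restart at 1 for that tank, which returns the soak time to its starting value.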
The reaction temperature can be adjusted according to the soaking times.
According to one embodiment of the present disclosure, the temperature of the reaction in at least one of step S100 and step S300 is calculated by a predetermined cycle number-reaction temperature relationship.
According to one embodiment of the disclosure, the total number of cycles is divided into N cycle segments, N being an integer greater than 1; the reaction temperature is the same for every cycle within a cycle segment and differs between cycle segments; and the reaction temperature of the cycle segments increases with the segment number following an exponential function or a linear function distribution, the linear function distribution being selected from an arithmetic progression or a geometric progression.
According to one embodiment of the disclosure, after the plurality of cycles, a multiple displacement amplification reaction is performed on the target nucleic acid immobilized on the chip using a DNA polymerase having strand displacement activity to obtain a complementary strand, and steps S100 to S400 are performed on each of the two strands, wherein the reaction time and/or reaction temperature used for one strand differs from the corresponding reaction time and/or reaction temperature used for the other strand. Specifically, the reaction time/reaction temperature of the n-th cycle of the first strand differs from that of the n-th cycle of the second strand.
Sequencing system, electronic device, computer-readable storage medium, and computer program product
In yet another aspect of the disclosure, the disclosure presents a sequencing system. According to an embodiment of the disclosure, referring to FIG. 4, the sequencing system 1000 comprises a chip 100, a sequencing device 200 and one or more processors 300, wherein the one or more processors 300 are configured to perform (1) reacting a target nucleic acid immobilized on the surface of the chip with a polymerization reagent, incorporating a nucleotide or nucleotide analogue to obtain a reaction product, (2) detecting a fluorescent signal, (3) reacting the reaction product with a regeneration reagent to obtain a product that can undergo a next round of polymerization reaction, (4) repeating steps (1) to (3), and so forth, for a plurality of rounds of cycles, to finally obtain sequencing data, wherein the time of the reaction in at least one of steps (1) and (3) is calculated by a predetermined cycle number-reaction time relationship.
Different cycles use different reaction times: a cycle number-reaction time relationship is constructed in advance, and the reaction time of each cycle is allocated according to that relationship. As a result, the reaction in each cycle proceeds more completely while damage to the nucleic acid from prolonged exposure to high temperature is reduced, which improves sequencing quality and facilitates long-read sequencing.
According to one embodiment of the present disclosure, referring to fig. 4, one or more processors 300 include:
A first module 210, the first module 210 being used for reacting a target nucleic acid immobilized on the surface of a chip with a polymerization reagent and incorporating a nucleotide or nucleotide analogue to obtain a reaction product;
A second module 220, the second module 220 for detecting a fluorescent signal;
A third module 230, wherein the third module 230 is used for reacting the reaction product with a regeneration reagent to obtain a product which can be subjected to a next polymerization reaction;
A fourth module 240, where the fourth module 240 is configured to sequentially repeat the operations performed in the first module 210, the second module 220, and the third module 230, and so on, perform a plurality of cycles, and finally obtain sequencing data;
Wherein the time of reaction in at least one of the first module 210 and the third module 230 is calculated by a predetermined cycle number-reaction time relationship.
According to one embodiment of the present disclosure, the one or more processors 300 are configured to perform dividing a total number of cycles of a cycle into N cycle segments, each cycle segment having the same reaction time for each cycle, the reaction times being different between the cycle segments, N being an integer greater than 1.
According to one embodiment of the present disclosure, as the number of circulation segments increases, the reaction time between the individual circulation segments increases in an exponential or linear function distribution.
According to one embodiment of the present disclosure, the one or more processors 300 are configured to determine the reaction time according to the formula: reaction time (s) = A + (B - A)/D × (C - 1), wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and D represents the total number of sequencing cycles, and wherein the reaction time is the result of the formula rounded to an integer.
According to one embodiment of the present disclosure, the one or more processors 300 are configured to perform obtaining a normalized curve from sequencing error rates obtained by sequencing under the same reaction time conditions of each cycle, multiplying the normalized value of each cycle in the normalized curve by a coefficient P, and constructing an exponential curve from the obtained normalized value and the cycle number, the coefficient P being the expected increased reaction time of the last cycle, and adding the reaction time required for the first cycle to an exponential formula corresponding to the exponential curve, to obtain a cycle number-reaction time relationship.
According to one embodiment of the present disclosure, each cycle of the reaction includes immersing the chip in a reaction vessel containing a polymerization or regeneration reagent and removing the chip from the reaction vessel after the reaction is completed; as the number of cycles increases, the number of immersions in each reaction vessel increases and the time of each immersion increases. The one or more processors 300 are configured to determine the time of each immersion according to the following formula: immersion time (s) = 50 + 10 × (X - 1)/(Y - 1), where X is the ordinal number of the current immersion in the same reaction vessel and Y is the total number of immersions in the same reaction vessel.
According to one embodiment of the present disclosure, the temperature of the reaction in at least one of step (1) and step (3) is calculated from a predetermined cycle number-reaction temperature relationship, and the one or more processors 300 are configured to divide the total number of cycles into N cycle segments, N being an integer greater than 1, wherein the reaction temperature is the same for every cycle within a cycle segment and differs between cycle segments, and the reaction temperature of the cycle segments increases with the segment number following an exponential function or a linear function distribution, the linear function distribution being selected from an arithmetic progression or a geometric progression.
The one or more processors 300 may be general-purpose processors, special-purpose processors, or the like, for example a baseband processor or a central processing unit. The baseband processor may be used to process communication protocols and communication data, and the central processing unit may be used to control communication devices (e.g., base stations, baseband chips, terminal devices, terminal device chips, DUs or CUs, etc.), execute computer programs, and process data of the computer programs. The processor 300 may be implemented on an integrated circuit (IC), an analog IC, a radio frequency integrated circuit (RFIC), a mixed-signal IC, an application-specific integrated circuit (ASIC), a printed circuit board (PCB), an electronic device, or the like. The processor and transceiver may also be fabricated using a variety of IC process technologies such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), gallium arsenide (GaAs), etc.
In yet another aspect of the disclosure, the disclosure presents an electronic device. According to an embodiment of the disclosure, an electronic device includes a memory, one or more processors, the memory coupled with the one or more processors, the memory for storing computer program code, the computer program code including computer instructions, the one or more processors invoking the computer instructions to cause the electronic device to perform a sequencing method as previously described. Specifically, the electronic device may be any intelligent terminal including a sequencer, a tablet computer, a computing cluster, and the like.
The term "memory" as used in this disclosure refers to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The memory may be implemented in the form of read-only memory (ROM), static storage, dynamic storage, or random access memory (RAM). The memory may store an operating system and other application programs; when the technical solutions provided in the embodiments of the present disclosure are implemented by software or firmware, the relevant program code is stored in the memory and invoked by the processor to perform the sequencing method of the embodiments of the present disclosure.
In some embodiments, referring to FIG. 5, the electronic device 400 may include a processor 410, a memory 420, an input/output interface 430, a communication interface 440, and a bus 450. Wherein processor 410, memory 420, input/output interface 430, and communication interface 440 enable communication connections within the device between each other via bus 450.
The processor 410 may be implemented by a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, etc., for executing related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 420 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory ), static storage, dynamic storage, etc. Memory 420 may store an operating system and other application programs, and when the technical solutions provided by the embodiments of the present specification are implemented in software or firmware, the relevant program codes are stored in memory 420 and invoked for execution by processor 410.
The input/output interface 430 is used to connect with an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown in the figure) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.
The communication interface 440 is used to connect communication modules (not shown) to enable communication interactions of the device with other devices. The communication module may implement communication through a wired manner (such as USB, network cable, etc.), or may implement communication through a wireless manner (such as mobile network, WIFI, bluetooth, etc.).
Bus 450 includes a path to transfer information between components of the device (e.g., processor 410, memory 420, input/output interface 430, and communication interface 440).
It should be noted that although the above device only shows the processor 410, the memory 420, the input/output interface 430, the communication interface 440, and the bus 450, in the implementation, the device may further include other components necessary to achieve normal operation. Furthermore, it will be understood by those skilled in the art that the above-described apparatus may include only the components necessary to implement the embodiments of the present description, and not all the components shown in the drawings.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer programs. When the computer program is loaded and executed on a computer, the flow or functions according to embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer program may be stored in or transmitted from one computer readable storage medium to another, e.g., from one website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line (digital subscriber line, DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means.
To this end, in yet another aspect of the disclosure, the disclosure proposes a computer-readable storage medium comprising computer instructions. According to an embodiment of the present disclosure, the computer instructions, when run on an electronic device, cause the electronic device to perform the sequencing method as previously described. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a high-density digital video disc (DVD)), or a semiconductor medium (e.g., a solid-state disk (SSD)), or the like.
In yet another aspect of the disclosure, the disclosure presents a computer program product. According to an embodiment of the present disclosure, the computer program product, when run on a computer, causes the computer to perform the sequencing method as described previously.
Those of ordinary skill in the art will appreciate that implementing all or part of the above-described method embodiments may be accomplished by a computer program to instruct related hardware, the program may be stored in a computer readable storage medium, and the program may include the above-described method embodiments when executed. The storage medium includes a ROM or a random access memory RAM, a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that the features and advantages described above for the sequencing method and the method for determining the reaction time in sequencing are equally applicable to the sequencing system, the electronic device, the computer readable storage medium and the computer program product, and are not described here again.
The aspects of the present disclosure are explained below with reference to examples. Those skilled in the art will appreciate that the following examples are illustrative of the present disclosure and should not be construed as limiting its scope. Where specific techniques or conditions are not indicated in the examples, they follow the techniques or conditions described in the literature in this field or the product specifications. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
In the following examples, the key equipment and key reagents used were as follows:
1. key equipment:
MGISEQ-2000RS sequencer, FTAT sequencer, PCR instrument, PCR 8-strip tubes, 3.0 fluorescence quantification instrument, a set of pipettes, a high-speed centrifuge, 200 μl wide-bore tips, and an ice box.
2. The key reagents are shown in table 2 below:
Table 2 reagents required
Example 1
This example is based on the MGISEQ-2000RS sequencer platform manufactured by MGI Tech. The reagents are from the library preparation kit and the single-end sequencing kit (hereinafter the SE400 kit) supplied for the sequencer, and the verification sample is derived from Escherichia coli. During verification, the preparation and loading of the DNBs (DNA nanoballs) required for SE400 and the preparation of the SE400 reagent tank followed the instructions for use of the MGISEQ-2000RS high-throughput (rapid) sequencing kit.
this example designed two sets of incubation times for comparison:
First group: the synthesis time is held at 60 s throughout, i.e., a constant incubation time.
Second group: incubation follows the gradient mode, with one gradient step per 100 cycles and a different incubation time for each step; the specific incubation time settings are shown in Table 3.
The results are shown in Table 4 and FIG. 6. With the gradient incubation times set according to the method of the disclosed embodiments, Q30 (%) declines more slowly than with the constant incubation time, demonstrating that gradient incubation gives better results than constant incubation.
Table 3. Time settings for SE400 gradient incubation
Table 4. Comparison of constant incubation time and gradient incubation time to overall sequencing quality in this example.
Note that ESR (%) is greater than the ratio of reads of Q30 by filtering reads according to the Q value.
Example 2
This example is based on the MGISEQ-2000RS sequencer platform manufactured by MGI Tech, and the verification sample is derived from Escherichia coli. DNB preparation, loading and reagent tank preparation are common to other MGI platforms, so the method can be adapted to them. This example takes SE150 sequencing as an example: the incubation times are set in the linear incubation mode, and the SE150 biochemical script required for sequencing is modified accordingly.
This example designed two sets of incubation times for comparison:
The first group maintains the overall synthesis time at 20s, the complementary time at 30s, and the regeneration time at 30s, i.e., a constant incubation time.
The second group uses a linear incubation mode, with the incubation time calculated according to the following formula:
Reaction time (s) = A + (B - A)/D × (C - 1), where A is the reaction time constant of the first cycle, B is the reaction time constant of the last cycle, C is the current sequencing cycle number, and D is the total number of sequencing cycles; the reaction time is the formula result rounded to an integer.
The incubation time for synthesis is 13 + (43 - 13)/150 × (C - 1).
The incubation time for fill-in is 22 + (52 - 22)/150 × (C - 1).
The incubation time for regeneration is 27 + (57 - 27)/150 × (C - 1).
The specific incubation times are shown in Fig. 7.
The results are shown in Figs. 8 and 9. With the linear incubation time set according to the method of the disclosed embodiments, Q30% declines more slowly than with the constant incubation time, demonstrating that the linear incubation time method gives better results than constant incubation.
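The linear relation of this example, Reaction time (s) = A + (B - A)/D × (C - 1), can be evaluated as in the following Python sketch. The function name `linear_reaction_time` is an illustrative assumption. The source only says the result is rounded, so Python's built-in `round` is used here as one possible rounding rule.

```python
def linear_reaction_time(first, last, total_cycles, cycle):
    """Reaction time (s) = A + (B - A) / D * (C - 1), rounded to an integer.

    first        -- A, reaction time constant of the first cycle
    last         -- B, reaction time constant of the last cycle
    total_cycles -- D, total number of sequencing cycles
    cycle        -- C, current sequencing cycle (1-based)
    """
    if not 1 <= cycle <= total_cycles:
        raise ValueError("cycle must lie between 1 and total_cycles")
    # Assumption: built-in round(); the source does not name a rounding rule.
    return round(first + (last - first) / total_cycles * (cycle - 1))


# SE150 constants taken from the three formulas of this example.
synthesis = [linear_reaction_time(13, 43, 150, c) for c in range(1, 151)]
fill_in = [linear_reaction_time(22, 52, 150, c) for c in range(1, 151)]
regeneration = [linear_reaction_time(27, 57, 150, c) for c in range(1, 151)]
```

Because the formula divides by D rather than D - 1, the unrounded value at the last cycle is slightly below B (42.8 s for synthesis), and it reaches B only after rounding.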
Example 3
This example is based on the MGISEQ-2000RS sequencer platform manufactured by MGI Tech. The reagents come from the library construction kit and the paired-end sequencing kit (hereinafter, the PE300 kit) matched to this sequencer, and the verification sample is derived from Escherichia coli. During verification, the preparation and loading of the DNBs required for PE300 and the preparation of the PE300 reagent tank were carried out according to the instructions for use of the MGISEQ-2000RS high-throughput (rapid) sequencing kit. This example takes PE300 sequencing as an example: the incubation times are set in an exponential incubation mode, and the PE300 biochemical script required for sequencing is modified accordingly.
This example uses two sets of incubation times for comparison:
The first group keeps the synthesis (strand synthesis) time at 60 s, the fill-in time at 120 s, and the regeneration time at 60 s throughout the run, i.e., constant incubation times.
The second group follows Fig. 3, with the incubation time calculated according to the formula:
Y = 3×10^-6×K^3 - 0.0001×K^2 - 0.0089×K + 1.3741 + T
where K is the current cycle, Y is the incubation time of that cycle, and T is the incubation time of the first cycle. Then:
The synthesis incubation time formula is: Y = 3×10^-6×K^3 - 0.0001×K^2 - 0.0089×K + 1.3741 + 19
The fill-in incubation time formula is: Y = 3×10^-6×K^3 - 0.0001×K^2 - 0.0089×K + 1.3741 + 60
The regeneration incubation time formula is: Y = 3×10^-6×K^3 - 0.0001×K^2 - 0.0089×K + 1.3741 + 40
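As a worked illustration, the fitted relation of this example can be evaluated per cycle with the short Python sketch below. The function name `fitted_incubation_time` is an assumption, the coefficients are copied from the formula in this example, and no rounding is applied because this example does not state a rounding rule.

```python
def fitted_incubation_time(cycle, first_cycle_time):
    """Y = 3e-6*K^3 - 0.0001*K^2 - 0.0089*K + 1.3741 + T.

    cycle            -- K, the current cycle
    first_cycle_time -- T, the incubation time of the first cycle (s)
    """
    k = cycle
    return 3e-6 * k**3 - 0.0001 * k**2 - 0.0089 * k + 1.3741 + first_cycle_time


# T values from this example: synthesis 19 s, fill-in 60 s, regeneration 40 s.
for name, t in (("synthesis", 19), ("fill-in", 60), ("regeneration", 40)):
    print(name, [round(fitted_incubation_time(k, t), 2) for k in (1, 100, 200, 300)])
```

The three steps share the same cycle-dependent term and differ only in the constant T, so a single function covers all of them.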
The results are shown in Table 5 and Fig. 10. With the exponential incubation time set according to the method of the disclosed embodiments, Q30% declines more slowly than with the constant incubation time, demonstrating that the exponential incubation time method gives better results than constant incubation.
Table 5. Comparison of sequencing quality between the constant incubation time and the exponential incubation time in this example.
Note: SplitRate (%) is the split rate, i.e., the proportion of the total data whose tags were successfully split.
RecoverValue (AVG): this index reflects the recovery of the second-strand signal and applies only to PE sequencing.
Example 4
This example is based on the MGISEQ-2000RS sequencer platform manufactured by MGI Tech. The reagents all come from the library construction kit and the paired-end sequencing kit matched to this sequencer, and the verification sample is derived from Escherichia coli. During verification, the preparation and loading of the DNBs required for PE250 and the preparation of the PE250 reagent tank were carried out according to the instructions for use of the MGISEQ-2000RS high-throughput (rapid) sequencing kit.
This example uses two sets of incubation temperatures for comparison:
The first group keeps the fill-in temperature at 60 ℃ throughout the run, i.e., a constant reaction temperature.
The second group uses a gradient temperature mode: each gradient has a different incubation temperature, and the specific incubation temperature settings are shown in Table 6.
The results are shown in Table 7 and Fig. 11. The gradient reaction temperature set according to the method of the disclosed embodiments performs better than the constant incubation temperature: the second-strand signal recovers, Q30% declines more slowly, and the overall data output is higher, demonstrating that the gradient temperature method gives better results than a constant temperature.
Table 6. Gradient temperature settings for the first strand in PE250.
Table 7. Comparison of overall sequencing quality between the constant incubation temperature and the gradient incubation temperature in this example.
Although embodiments of the present disclosure have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the present disclosure, and that one of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved; no limitation is imposed herein.
It should be noted that in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The above is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (18)

  1. A sequencing method, characterized by comprising the following steps:
    (1) reacting a target nucleic acid immobilized on the surface of a chip with a polymerization reagent to incorporate a nucleotide or nucleotide analogue, obtaining a reaction product;
    (2) detecting a fluorescent signal;
    (3) reacting the reaction product with a regeneration reagent to obtain a product that can undergo the next polymerization reaction;
    (4) repeating steps (1) to (3) for multiple cycles to finally obtain sequencing data;
    wherein the reaction time in at least one of step (1) and step (3) is calculated from a predetermined cycle number-reaction time relationship.
  2. The method of claim 1, wherein the total number of cycles is divided into N cycle segments, the reaction time of every cycle within a cycle segment is the same, and the reaction time differs between cycle segments;
    N is an integer greater than 1.
  3. The method of claim 2, wherein the reaction time of each cycle segment increases as the cycle segment number increases;
    as the cycle segment number increases, the reaction time of the cycle segments increases in an exponential function or linear function distribution mode;
    the linear function distribution mode is selected from an arithmetic series or an arithmetic series distribution mode.
  4. The method according to claim 1, wherein the cycle number-reaction time relationship is the formula: Reaction time (s) = A + (B - A)/D × (C - 1);
    wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and D represents the total number of sequencing cycles, and wherein the reaction time is the formula result rounded to an integer.
  5. The method of claim 1, wherein determining the cycle number-reaction time relationship comprises:
    performing steps (1) to (4) with the same reaction time in every cycle, and obtaining a normalized curve from the resulting sequencing error rate;
    multiplying the normalized value of each cycle in the normalized curve by a coefficient P, and constructing an exponential curve from the resulting values and the cycle numbers, wherein the coefficient P is the reaction time by which the last cycle is expected to increase;
    adding the reaction time required by the first cycle to the exponential formula corresponding to the exponential curve to obtain the cycle number-reaction time relationship.
  6. The method of claim 1, wherein in each cycle, the reaction comprises immersing the chip in a reaction vessel containing a polymerization reagent or a regeneration reagent and, after the reaction is completed, transferring the chip to another reaction vessel;
    as the number of cycles increases, the number of soaks in each reaction vessel increases, and the time of each soak is extended.
  7. The method of claim 6, wherein the time of each soak is determined according to the following formula:
    Soaking time (s) = 50 + 10 × (X - 1)/(Y - 1), where X is the number of the current soak in the same reaction vessel, and Y is the total number of soaks in the same reaction vessel.
  8. The method according to claim 1, wherein the temperature of the reaction in at least one of the step (1) and the step (3) is calculated by a predetermined cycle number-reaction temperature relationship.
  9. The method of claim 1, wherein the total number of cycles is divided into N cycle segments, the reaction temperature of every cycle within a cycle segment is the same, and the reaction temperature differs between cycle segments;
    N is an integer greater than 1;
    the reaction temperature of each cycle segment increases as the cycle segment number increases;
    as the cycle segment number increases, the reaction temperature of the cycle segments increases in an exponential function or linear function distribution mode;
    the linear function distribution mode is selected from an arithmetic series or an arithmetic series distribution mode.
  10. The method according to any one of claims 1 to 9, wherein after a plurality of cycles, the target nucleic acid is subjected to a multiple displacement amplification reaction using a DNA polymerase having strand displacement activity to obtain a complementary second strand;
    wherein the reaction time and/or reaction temperature of each cycle of the first strand differs from the corresponding reaction time and/or reaction temperature of each cycle of the second strand.
  11. A sequencing system, characterized by comprising:
    a chip;
    a sequencing device for sequencing a target nucleic acid immobilized on the surface of the chip;
    one or more processors configured to perform:
    (1) reacting the target nucleic acid immobilized on the surface of the chip with a polymerization reagent to incorporate a nucleotide or nucleotide analogue, obtaining a reaction product;
    (2) detecting a fluorescent signal;
    (3) reacting the reaction product with a regeneration reagent to obtain a product that can undergo the next polymerization reaction;
    (4) repeating steps (1) to (3) for multiple cycles to finally obtain sequencing data;
    wherein the reaction time in at least one of step (1) and step (3) is calculated from a predetermined cycle number-reaction time relationship.
  12. The sequencing system of claim 11, wherein the one or more processors are configured to perform:
    dividing the total number of cycles into N cycle segments, wherein the reaction time of every cycle within a cycle segment is the same, and the reaction time differs between cycle segments;
    N is an integer greater than 1;
    as the cycle segment number increases, the reaction time of the cycle segments increases in an exponential function or linear function distribution mode.
  13. The sequencing system of claim 11, wherein the one or more processors are configured to perform determining the reaction time according to the formula: Reaction time (s) = A + (B - A)/D × (C - 1);
    wherein A represents the reaction time constant of the first cycle, B represents the reaction time constant of the last cycle, C represents the current sequencing cycle number, and D represents the total number of sequencing cycles, and wherein the reaction time is the formula result rounded to an integer; or
    the one or more processors are configured to perform:
    obtaining a normalized curve from the sequencing error rate obtained by sequencing with the same reaction time in every cycle;
    multiplying the normalized value of each cycle in the normalized curve by a coefficient P, and constructing an exponential curve from the resulting values and the cycle numbers, wherein the coefficient P is the reaction time by which the last cycle is expected to increase;
    adding the reaction time required by the first cycle to the exponential formula corresponding to the exponential curve to obtain the cycle number-reaction time relationship.
  14. The sequencing system of claim 11, wherein the one or more processors are configured to perform, in each cycle, the reaction by immersing the chip in a reaction vessel containing a polymerization reagent or a regeneration reagent and, after the reaction is completed, removing the chip from the reaction vessel;
    as the number of cycles increases, the number of soaks in each reaction vessel increases, and the time of each soak is extended;
    the one or more processors are configured to perform determining the time of each soak according to the following formula:
    Soaking time (s) = 50 + 10 × (X - 1)/(Y - 1), where X is the number of the current soak in the same reaction vessel, and Y is the total number of soaks in the same reaction vessel.
  15. The system of claim 11, wherein the temperature of the reaction in at least one of step (1) and step (3) is calculated from a predetermined cycle number-reaction temperature relationship;
    the one or more processors are configured to perform:
    dividing the total number of cycles into N cycle segments, wherein the reaction temperature of every cycle within a cycle segment is the same, and the reaction temperature differs between cycle segments;
    N is an integer greater than 1;
    the reaction temperature of each cycle segment increases as the cycle segment number increases;
    as the cycle segment number increases, the reaction temperature of the cycle segments increases in an exponential function or linear function distribution mode;
    the linear function distribution mode is selected from an arithmetic series or an arithmetic series distribution mode.
  16. An electronic device comprising a memory, one or more processors, the memory coupled to the one or more processors, the memory to store computer program code, the computer program code comprising computer instructions that the one or more processors invoke to cause the electronic device to perform the sequencing method of any of claims 1-10.
  17. A computer readable storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the sequencing method of any of claims 1 to 10.
  18. A computer program product, characterized in that the computer program product, when run on a computer, causes the computer to perform the sequencing method according to any of claims 1 to 10.
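Two of the relations recited in the claims can be illustrated with a short Python sketch: the derivation of a cycle number-reaction time relationship from a constant-time run (claims 5 and 13) and the soaking-time formula (claims 7 and 14). This is a sketch under stated assumptions, not the claimed implementation. The function names are hypothetical, the error rates are synthetic, normalization is assumed to mean division by the maximum error rate, and the exponential curve is assumed to be fitted by least squares on the logarithm of the values. The source fixes none of these details.

```python
import math


def fit_exponential(xs, ys):
    """Least-squares fit of y = a * exp(b * x) on log(y); requires every y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


def cycle_time_relation(error_rates, p, first_cycle_time):
    """Build reaction_time(cycle) from per-cycle error rates of a constant-time run.

    p                -- coefficient P, the extra reaction time expected in the last cycle
    first_cycle_time -- reaction time required by the first cycle (s)
    """
    peak = max(error_rates)  # assumption: normalize by the maximum error rate
    scaled = [e / peak * p for e in error_rates]  # normalized value times P
    cycles = list(range(1, len(error_rates) + 1))
    a, b = fit_exponential(cycles, scaled)
    return lambda cycle: first_cycle_time + a * math.exp(b * cycle)


def soak_time(x, y):
    """Soaking time (s) = 50 + 10 * (X - 1) / (Y - 1); undefined for Y < 2."""
    if y < 2 or not 1 <= x <= y:
        raise ValueError("need total soaks Y >= 2 and 1 <= X <= Y")
    return 50 + 10 * (x - 1) / (y - 1)


# Synthetic error rates that grow exponentially with the cycle number.
synthetic_errors = [0.001 * math.exp(0.01 * c) for c in range(1, 151)]
reaction_time = cycle_time_relation(synthetic_errors, p=30, first_cycle_time=13)
```

With these synthetic inputs, the relation adds the full coefficient P to the first-cycle time at the last cycle, and the soaking time runs from 50 s for the first soak to 60 s for the last soak in a vessel.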
CN202380068162.6A 2023-03-01 2023-03-01 Method for determining reaction time in sequencing and sequencing method and system Pending CN120019165A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/079100 WO2024178682A1 (en) 2023-03-01 2023-03-01 Method for determining reaction time in sequencing, and sequencing method and system

Publications (1)

Publication Number Publication Date
CN120019165A true CN120019165A (en) 2025-05-16

Family

ID=92589291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202380068162.6A Pending CN120019165A (en) 2023-03-01 2023-03-01 Method for determining reaction time in sequencing and sequencing method and system

Country Status (2)

Country Link
CN (1) CN120019165A (en)
WO (1) WO2024178682A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120252682A1 (en) * 2011-04-01 2012-10-04 Maples Corporate Services Limited Methods and systems for sequencing nucleic acids
GB201416106D0 (en) * 2014-09-11 2014-10-29 Illumina Cambridge Ltd Paired-end sequencing by attachment of the complementary strand
CN105001292A (en) * 2015-07-14 2015-10-28 深圳市瀚海基因生物科技有限公司 Light-fractured fluorescence-labeling reversible terminal compound and use thereof in DNA (Deoxyribonucleic Acid) or RNA (Ribonucleic Acid) sequencing
EP4678766A2 (en) * 2015-11-18 2026-01-14 Kalim U. Mir Super-resolution sequencing
US11427867B2 (en) * 2017-11-29 2022-08-30 Xgenomes Corp. Sequencing by emergence
AU2020282704A1 (en) * 2019-05-29 2022-01-27 Xgenomes Corp. Sequencing by emergence

Also Published As

Publication number Publication date
WO2024178682A1 (en) 2024-09-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination