CN115323045A - Gene sequencing reagent and gene sequencing method - Google Patents

Gene sequencing reagent and gene sequencing method

Info

Publication number
CN115323045A
Authority
CN
China
Prior art keywords
compound
group
fluorescent
independently
absent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211137859.8A
Other languages
Chinese (zh)
Inventor
陈鑫
伍建
卓少春
周蓉
冯越
赵晓飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Mingyi Intelligent Manufacturing Technology Co ltd
Original Assignee
Chongqing Mingyi Intelligent Manufacturing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Mingyi Intelligent Manufacturing Technology Co ltd
Publication of CN115323045A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P20/00: Technologies relating to chemical industry
    • Y02P20/50: Improvements relating to the production of bulk chemicals
    • Y02P20/55: Design of synthesis routes, e.g. reducing the use of auxiliary or protecting groups

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Saccharide Compounds (AREA)

Abstract

The invention relates to the technical field of gene sequencing, and in particular to a gene sequencing reagent and a gene sequencing method. The method comprises: adding compounds 1-4 and a polymerase to the system simultaneously to carry out a nucleotide polymerization reaction, giving an intermediate-state complex; washing away unreacted compounds 1-4 while maintaining the intermediate-state complex, detecting and recording the fluorescent label of each incorporated nucleotide derivative, and determining the base at the corresponding position on the DNA template; adding compounds 5-8 and a polymerase to the treated reaction system to carry out a nucleotide polymerization reaction; adding a cleavage solution to the reacted system to remove the cleavable protecting groups of compounds 5-8, removing the solution phase and rinsing with buffer; and washing away the displaced compounds 1-4. The bases finally incorporated into the newly synthesized strand are natural bases with no residual scar, so neither the flexibility of the new strand nor the binding efficiency of the polymerase during continued strand extension is impaired, which helps reduce the probability of base mismatches and improves sequencing read length and data quality.

Description

Gene sequencing reagent and gene sequencing method

Technical field

The invention relates to the technical field of gene sequencing, and in particular to a gene sequencing reagent and a gene sequencing method.

Background

Gene sequencing is an important driver of medical and biological discovery. High-throughput gene sequencing combines clonal amplification with sequencing-by-synthesis (SBS) chemistry to achieve fast and accurate sequencing. After nearly a decade of rapid development it has penetrated every field of the life sciences, strongly advancing basic research and playing an important role in clinical applications. On the most popular high-throughput sequencing platforms today, four fluorescently labeled reversible terminator nucleotides (Reversible Termination Nucleotide, NRT) are added to the reaction system to determine the base type. Typically, the fluorescent dye of each modified NRT is attached to the base through a cleavable linker, and the 3'-OH is capped with a cleavable protecting group. Different NRTs emit distinct fluorescent signals, which are used to determine the DNA sequence. When the dye on an NRT is attached to the base through a cleavable linker, part of the linker's chemical structure usually remains on the newly synthesized nucleotide strand after elution. These residual structures affect the flexibility of the new strand, the binding efficiency of the polymerase during continued strand extension, and the probability of base mismatches, and thereby degrade sequencing quality.

Summary of the invention

In view of the above deficiencies of the prior art, the present invention aims to provide a gene sequencing reagent and a gene sequencing method in which the bases finally incorporated into the newly synthesized strand are natural bases with no residual "scar". The flexibility of the new strand and the binding efficiency of the polymerase during continued strand extension are therefore unaffected, which helps reduce the probability of base mismatches and improves sequencing read length and data quality.

To solve the above problems, the present invention adopts the following technical solutions:

In one aspect, the present invention provides a gene sequencing method, comprising:

S1, preparing compound 1, compound 2, compound 3, compound 4, compound 5, compound 6, compound 7 and compound 8 separately;

S2, attaching the nucleic acid template to be sequenced to a test carrier, and forming clusters of the nucleic acid molecules to be tested by amplification;

S3, adding compound 1, compound 2, compound 3, compound 4 and a polymerase simultaneously to the S2 system to carry out a nucleotide polymerization reaction, giving an intermediate-state complex;

S4, washing away unreacted compound 1, compound 2, compound 3 and compound 4 while maintaining the intermediate-state complex of S3, detecting and recording the fluorescent label of each incorporated nucleotide derivative, and determining the base at the corresponding position on the DNA template;

S5, adding compound 5, compound 6, compound 7, compound 8 and a polymerase to the reaction system treated in S4 to carry out a nucleotide polymerization reaction;

S6, adding a cleavage solution to the system after the S5 reaction to remove the cleavable protecting groups of compound 5, compound 6, compound 7 and compound 8, removing the solution phase and rinsing thoroughly with buffer;

S7, washing away the compound 1, compound 2, compound 3 and compound 4 displaced after the reaction in S6;

S8, repeating steps S3 to S7 one or more times;

wherein compound 1, compound 2, compound 3 and compound 4 are 2',3'-dideoxynucleoside triphosphate derivatives having the structure of formula (I);

compound 5, compound 6, compound 7 and compound 8 are 2'-deoxynucleotide derivatives having the structure of formula (II);

the structures of formula (I) and formula (II) are as follows:

Figure BDA0003852928080000021

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or a reactive group, or are absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each R5 is independently a protecting group capable of undergoing an orthogonal cleavage reaction;

L1 is a linking group or a cleavable linking group.
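
The order of operations in steps S3 to S7 can be traced with a toy model. The Python sketch below is illustrative only and reduces the chemistry to bookkeeping. The assignment of bases to compound numbers, the function name `run_cycles` and the log fields are assumptions that do not appear in the source.

```python
# Toy bookkeeping model of steps S3-S7; it does not model any real chemistry.
# ASSUMPTION: the source does not say which base each compound carries;
# the assignments below are hypothetical and chosen only for illustration.
TERMINATORS = {"A": "compound 1", "C": "compound 2", "G": "compound 3", "T": "compound 4"}
PROTECTED = {"A": "compound 5", "C": "compound 6", "G": "compound 7", "T": "compound 8"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def run_cycles(template):
    """Run one S3-S7 cycle per template position; return (new strand, log)."""
    strand = []
    log = []
    for template_base in template:
        base = COMPLEMENT[template_base]
        # S3: the matching labeled dideoxy derivative pairs with the template,
        # giving the intermediate-state complex.
        bound = TERMINATORS[base]
        # S4: unreacted derivatives are washed away and the label is read.
        detected = bound
        # S5: the 3'-protected deoxynucleotide displaces the labeled derivative
        # and is incorporated into the strand.
        incorporated = PROTECTED[base]
        # S6: the cleavage solution removes the 3' protecting group.
        # S7: the displaced labeled derivative is washed away.
        strand.append(base)
        log.append({"detected": detected, "incorporated": incorporated, "called": base})
    return "".join(strand), log


strand, log = run_cycles("ATGC")
print(strand)
print([entry["detected"] for entry in log])
```

In this model the labeled derivative is only ever recorded in the log, never in the strand, which mirrors the claim that the final strand contains natural bases only.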

Further, the structure of formula (I) is as follows:

Figure BDA0003852928080000022

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

each R2 is independently a fluorescent group, the groups emitting mutually different fluorescent signals;

R3 is H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.

Further, the structure of formula (I) is as follows:

Figure BDA0003852928080000023

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or absent, wherein: in compound 1, R2 is a fluorescent group and R3 is absent; in compound 2, R2 is a different fluorescent group that emits a fluorescent signal distinct from that of R2 in compound 1, and R3 is absent; in compound 3, R2 is the same fluorescent group as R2 in compound 1, or one that emits the same fluorescent signal, and R3 is the same fluorescent group as R2 in compound 2, or one that emits the same fluorescent signal; in compound 4, R2 and R3 are H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.
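
The two-dye labeling described above amounts to a two-bit code: each compound is identified by which of the two dye signals it shows. The following Python sketch makes that decode table explicit. The function name, the normalized 0..1 intensities and the fixed threshold are hypothetical and not taken from the source; the result is a compound number, because the source does not state which base each compound carries.

```python
# Decode table for the two-dye scheme: (dye A signal, dye B signal) -> compound.
# Dye A is the R2 dye of compound 1; dye B is the R2 dye of compound 2.
TWO_COLOR_CODE = {
    (True, False): 1,   # dye A only
    (False, True): 2,   # dye B only
    (True, True): 3,    # R2 gives dye A signal, R3 gives dye B signal
    (False, False): 4,  # unlabeled
}


def call_two_color(intensity_a, intensity_b, threshold=0.5):
    """Return the compound number (1-4) implied by two channel intensities.

    ASSUMPTION: intensities are normalized to 0..1 and one fixed threshold
    separates signal from background; both choices are hypothetical.
    """
    return TWO_COLOR_CODE[(intensity_a > threshold, intensity_b > threshold)]


print(call_two_color(0.9, 0.1))
print(call_two_color(0.8, 0.7))
```

A real base caller would need per-channel normalization and crosstalk correction; the sketch only shows why two dyes are enough to tell four compounds apart.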

Further, S4 comprises:

S4a, washing away unreacted compound 1, compound 2, compound 3 and compound 4;

S4b, adding a fluorophore-labeled active group that binds rapidly to the R3 reactive group of compound 3, thereby introducing a second fluorescent group into compound 3, wherein the fluorescent group on the active group is the same as R2 in compound 2, or is one that emits the same fluorescent signal as R2 in compound 2;

S4c, adding scanning buffer, exciting with the light source and detecting and recording the fluorescent signal, then washing away the scanning buffer;

the structure of formula (I) is as follows:

Figure BDA0003852928080000031

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or a reactive group, or are absent, wherein: in compound 1, R2 is a fluorescent group and R3 is absent; in compound 2, R2 is a different fluorescent group that emits a fluorescent signal distinct from that of R2 in compound 1, and R3 is absent; in compound 3, R2 is the same fluorescent group as R2 in compound 1, or one that emits the same fluorescent signal, and R3 is a reactive group that binds rapidly to a fluorophore-labeled active group, the fluorescent group on the active group being the same as R2 in compound 2, or one that emits the same fluorescent signal; in compound 4, R2 and R3 are H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.

Further, S4 comprises:

S4a, washing away unreacted compound 1, compound 2, compound 3 and compound 4, then adding scanning buffer to adjust the reaction environment so that compound 1 and compound 2 can be excited to fluoresce, and detecting and recording the fluorescent signal;

S4b, washing away the scanning buffer, adding an active group labeled with the fluorescent group of compound 1 so that this fluorescent group is introduced into compound 3 and compound 3 can emit a fluorescent signal, and at the same time adjusting the reaction environment to quench the fluorescence of the fluorescent group on compound 2;

S4c, adding scanning buffer, exciting with the light source and detecting and recording the fluorescent signal, then washing away the scanning buffer;

the structure of formula (I) is as follows:

Figure BDA0003852928080000032

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

each R2 is independently a fluorescent group or a reactive group, or is absent, wherein: in compound 1, R2 is a fluorescent group; in compound 2, R2 is an environment-sensitive fluorescent dye group that under specific conditions emits the same fluorescent signal as the fluorescent group of compound 1 and is rapidly quenched in response to a change in the reaction environment; in compound 3, R2 is a reactive group; in compound 4, R2 is H or absent;

R3 is H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.
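
In this single-color variant the four compounds are told apart by two scans of the same channel: compound 1 is lit in both scans, compound 2 only in the first (it is quenched in S4b), compound 3 only in the second (it is labeled in S4b), and compound 4 in neither. The Python sketch below writes out that decode table. The function name, the normalized intensities and the threshold are hypothetical illustrations, not part of the source.

```python
# Decode table for the single-color scheme:
# (signal in first scan, S4a; signal in second scan, S4c) -> compound.
ONE_COLOR_CODE = {
    (True, True): 1,    # ordinary fluorescent group, lit in both scans
    (True, False): 2,   # environment-sensitive dye, quenched before scan 2
    (False, True): 3,   # reactive group, labeled before scan 2
    (False, False): 4,  # unlabeled
}


def call_one_color(first_scan, second_scan, threshold=0.5):
    """Return the compound number (1-4) implied by two scans of one channel.

    ASSUMPTION: intensities are normalized to 0..1 and one fixed threshold
    separates signal from background; both choices are hypothetical.
    """
    return ONE_COLOR_CODE[(first_scan > threshold, second_scan > threshold)]


# Four clusters, each given as (scan 1 intensity, scan 2 intensity).
clusters = [(0.9, 0.8), (0.9, 0.1), (0.1, 0.9), (0.0, 0.0)]
print([call_one_color(a, b) for a, b in clusters])
```

Because both scans use the same excitation and emission channel, this scheme trades a second imaging pass for the second light source and camera.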

In another aspect, the present invention provides a gene sequencing reagent comprising a polymerase, compound 1, compound 2, compound 3, compound 4, compound 5, compound 6, compound 7 and compound 8;

compound 1, compound 2, compound 3 and compound 4 are 2',3'-dideoxynucleoside triphosphate derivatives having the structure of formula (I);

compound 5, compound 6, compound 7 and compound 8 are 2'-deoxynucleotide derivatives having the structure of formula (II);

the structures of formula (I) and formula (II) are as follows:

Figure BDA0003852928080000041

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or a reactive group, or are absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each R5 is independently a protecting group capable of undergoing an orthogonal cleavage reaction;

L1 is a linking group or a cleavable linking group.

Further, the structure of formula (I) is as follows:

Figure BDA0003852928080000042

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

each R2 is independently a fluorescent group, the groups emitting mutually different fluorescent signals;

R3 is H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.

Further, the structure of formula (I) is as follows:

Figure BDA0003852928080000051

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or absent, wherein: in compound 1, R2 is a fluorescent group and R3 is absent; in compound 2, R2 is a different fluorescent group that emits a fluorescent signal distinct from that of R2 in compound 1, and R3 is absent; in compound 3, R2 is the same fluorescent group as R2 in compound 1, or one that emits the same fluorescent signal, and R3 is the same fluorescent group as R2 in compound 2, or one that emits the same fluorescent signal; in compound 4, R2 and R3 are H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.

Further, the structure of formula (I) is as follows:

Figure BDA0003852928080000052

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or a reactive group, or are absent, wherein: in compound 1, R2 is a fluorescent group and R3 is absent; in compound 2, R2 is a different fluorescent group that emits a fluorescent signal distinct from that of R2 in compound 1, and R3 is absent; in compound 3, R2 is the same fluorescent group as R2 in compound 1, or one that emits the same fluorescent signal, and R3 is a reactive group that binds rapidly to a fluorophore-labeled active group, the fluorescent group on the active group being the same as R2 in compound 2, or one that emits the same fluorescent signal; in compound 4, R2 and R3 are H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.

Further, the structure of formula (I) is as follows:

Figure BDA0003852928080000053

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

each R2 is independently a fluorescent group or a reactive group, or is absent, wherein: in compound 1, R2 is a fluorescent group; in compound 2, R2 is an environment-sensitive fluorescent dye group that under specific conditions emits the same fluorescent signal as the fluorescent group of compound 1 and is rapidly quenched in response to a change in the reaction environment; in compound 3, R2 is a reactive group; in compound 4, R2 is H or absent;

R3 is H or absent;

each R4 is independently a monophosphate group or a polyphosphate group;

each L1 is independently a linking group or absent.

Further, the base or analogue is a compound having any one of the following structures:

Figure BDA0003852928080000061

Further, the protecting group is a group having any one of the following structures:

Figure BDA0003852928080000062

Further, the cleavable linking group is a group having any one of the following structures:

Figure BDA0003852928080000063

Figure BDA0003852928080000071

wherein R1' and R2' are each independently halogen, -H, or a C1-C5 aliphatic chain.

The beneficial effects of the present invention are as follows. The gene sequencing reagent and gene sequencing method provided by the invention use 2',3'-dideoxynucleoside triphosphate derivatives. Their first feature is that, at the 5' end, the O between the first and second phosphate groups is replaced by a CH2 group; their second feature is that the base of the nucleotide polyphosphate derivative carries a fluorescent group or a reactive group. During SBS sequencing, such a nucleotide polyphosphate derivative is recognized by the DNA polymerase and incorporated onto the DNA template strand being sequenced. Because of the CH2 group, a phosphodiester bond cannot form normally and DNA strand extension terminates; the absence of the 2'-OH on the 2',3'-dideoxynucleoside triphosphate also ensures that only a single nucleotide derivative is incorporated at a time. At this point the nucleotide polyphosphate derivative, the polymerase and the nucleic acid template strand form an intermediate-state complex. The nucleotide derivative incorporated at the 3' end of the nucleic acid strand is identified by detecting its fluorescent label and photographing and storing the image, and the base at the corresponding position on the DNA template is determined accordingly. After imaging, 3'-protected 2'-deoxynucleoside triphosphates, which can form a phosphodiester bond normally, are added; they competitively displace the 2',3'-dideoxynucleoside triphosphate derivatives and complete this round of SBS strand growth. The 3' protecting group ensures that the strand grows by only one base per round, and after each round of SBS sequencing the protecting group is cleaved to give a natural 3'-OH 2'-deoxynucleotide, so the next round of SBS is unaffected. The bases finally incorporated into the newly synthesized strand are natural bases with no residual "scar", which helps improve sequencing read length and data quality.

In addition to four-color and two-color fluorescent SBS sequencing reagents and methods, the invention also provides single-color fluorescent reagents and a sequencing method for which the sequencer needs only one excitation light source and one camera. This reduces the manufacturing cost and size of the sequencer and lowers the entry barrier for hospitals, which helps sequencers spread to hospitals and research institutions in third- and fourth-tier cities and suits local testing at hospitals of all levels.

Detailed description

The present invention is described in further detail below with reference to specific embodiments.

It should be noted that these embodiments serve only to illustrate the present invention and do not limit it; simple modifications of the method made within the concept of the present invention all fall within the scope of protection claimed.

A gene sequencing reagent comprising a polymerase, compound 1, compound 2, compound 3, compound 4, compound 5, compound 6, compound 7 and compound 8;

compound 1, compound 2, compound 3 and compound 4 are 2',3'-dideoxynucleoside triphosphate derivatives having the structure of formula (I), and compound 5, compound 6, compound 7 and compound 8 are 2'-deoxynucleotide derivatives having the structure of formula (II):

Figure BDA0003852928080000072

wherein R1 is a base or analogue that differs between the compounds, being one of adenine, guanine, cytosine, thymine and uracil or an analogue thereof;

R2 and R3 are each independently a fluorescent group or a reactive group, or are absent; the fluorescent groups include environment-sensitive fluorescent groups capable of emitting the same fluorescent signal, and fluorescent groups capable of emitting the same or different fluorescent signals;

each R4 is independently a monophosphate group or a polyphosphate group;

each R5 is independently a protecting group capable of undergoing an orthogonal cleavage reaction;

L1 is a linking group or a cleavable linking group.

In the embodiments of the present invention, a polymerase is used to carry out the nucleotide polymerization reaction. A polymerase is any naturally or non-naturally occurring enzyme or other catalyst capable of catalyzing a polymerization reaction, including the various known natural and modified nucleic acid polymerases, for example DNA (deoxyribonucleic acid) polymerase, RNA (ribonucleic acid) polymerase and reverse transcriptase.

A polymerase can synthesize a new DNA strand using RNA or single-stranded DNA as a template. A suitable polymerase can be selected for the nucleotide polymerization reaction according to actual needs, and a mixture of several polymerases may also be used.

In the embodiments of the present invention, a DNA polymerase (hereinafter, polymerase M) is used for the nucleotide polymerization reaction.

Each compound in the present invention is a derivative of nucleotide A, T/U, C, or G. A nucleotide here means a nucleoside 5'-polyphosphate compound, or a structural analogue thereof, that is capable of complementary base pairing and can be incorporated by a nucleic acid polymerase to extend a growing nucleic acid strand. A nucleotide may be modified at one or more of its base, sugar, or phosphate groups, and may be labeled with a fluorescent dye or a reactive group.

The protecting group is a 3'-OH modifying group that is efficiently recognized by the polymerase, can be incorporated into the growing DNA strand, and can be cleaved off after each round of sequencing by synthesis (SBS). The cleavable linking group is any small-molecule group that links the 5' phosphate or the base to a fluorescent group, an environment-sensitive fluorescent group, a fluorescent group capable of emitting the same fluorescent signal, or a reactive group capable of a ligation reaction. A "fluorescent group capable of emitting the same fluorescent signal" is a non-environment-sensitive fluorescent dye that, under the same excitation wavelength, emits at a wavelength identical or close to that of the chosen environment-sensitive dye and is detected as the same fluorescent signal. The protecting group can be removed by cleaving agents including, but not limited to, Na2S2O4, THP, TEC, DTT, weak acid, Pd(0), or light irradiation (for example ultraviolet irradiation). Protecting groups include, but are not limited to, the following:

Figure BDA0003852928080000081

In the embodiments of the present invention, the nucleotide derivatives are modified with a cleavable linking group, that is, a linking group that can be cleaved orthogonally (for example, specifically) in response to an external stimulus such as an enzyme, a nucleophilic or basic reagent, a reducing agent, light irradiation, an electrophilic or acidic reagent, an organometallic or metallic reagent, or an oxidizing agent. Cleaving agents used in the orthogonal cleavage reaction include, but are not limited to, Na2S2O4, THP, TEC, DTT, weak acid, Pd(0), or light irradiation (for example ultraviolet irradiation). The cleavable linking group has any one of the following structures:

Figure BDA0003852928080000091

where R1' and R2' are each independently a halogen, -H, or a C1-C5 aliphatic chain.

The reactive group (Active Group) is a conjugation-reactive group capable of a specific orthogonal ligation reaction with a complementary group that carries a fluorescent group. The chemistry of the orthogonal ligation reaction includes, but is not limited to: the Staudinger ligation; copper-catalyzed azide-alkyne cycloaddition; strain-promoted azide-alkyne cycloaddition; binding between digoxin and an anti-digoxin antibody; the Diels-Alder reaction; Suzuki cross-coupling; disulfide bond formation between a thiol and a thiol derivative; thioether formation between a thiol and a maleimide; photocatalyzed radical addition of a thiol to an alkene derivative; photocatalyzed radical addition of a thiol to an alkyne derivative; sulfonyl fluoride exchange; binding between biotin and streptavidin; and the reaction between an amino group and an activated ester.

The fluorescent group is derived from any one or more of the following fluorescent dyes: AF488, AF532, AF633, AF680, AF660, AF700, AF647, AF594, AF555, AF568, CY3, CY5, CY5.5, CY7, CY7.5, ROX, R6G, ATTO 495, ATTO 532, ATTO 700, ATTO 680, ATTO 655, ATTO 647N, ATTO 594, ATTO Rho101, ATTO 590, ATTO Thio12, FAM, VIC, TET, JOE, HEX, CAL Fluor Orange 560, TAMRA, CAL Fluor Red 610, TEXAS RED, CAL Fluor Red 635, iFluor 488, iFluor 514, iFluor 532, iFluor 546, iFluor 555, iFluor 568, iFluor 590, iFluor 610, iFluor 633, iFluor 647, iFluor 680, iFluor 700, iFluor 710, Quasar 705, and Quasar 670.

The environment-sensitive fluorescent dye is a fluorescent dye that responds rapidly to changes in its environment, such as polarity, pH, voltage, light source, and viscosity, by changing its emission color, emission wavelength, or emission intensity. Such dyes include, but are not limited to, the various known environment-sensitive fluorescent dyes widely used in fluorescent probes, chemical sensors, detection of microenvironmental changes, biological imaging, molecular switches, and visualization of phase separation, for example: Cy-7, DyLight 800, IRDye 800, Alexa Fluor 790, HiLyte Fluor 750, Oyster 800, rhodamine isothiocyanate, Texas Red derivatives, Alexa Fluor 680, DyLight 680, Cy5.5 NHS ester (~670 nm, Lumiprobe), Alexa Fluor 546, DyLight 549, Oregon Green 514 carboxylic acid, pHrodo Red, 6-carboxynaphthofluorescein, 7-hydroxycoumarin-3-carboxylic acid, SNARF-5F, SNARF-4F, SNARF-1, BCECF, CypHer5E, HCyC-647, Square-650-pH, and 6-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein.

A gene sequencing method, comprising the following steps:

S1: prepare the aforementioned compounds 1 to 8 separately. Compounds 1, 2, 3, and 4 are 2',3'-dideoxynucleoside triphosphate derivatives having the structure of formula (I); compounds 5, 6, 7, and 8 are 2'-deoxynucleotide derivatives having the structure of formula (II):

Figure BDA0003852928080000101

where R1 is a base or an analogue thereof that differs between the compounds, the base or analogue being one of adenine, guanine, cytosine, thymine, and uracil, or an analogue of these;

R2 and R3 are each independently a fluorescent group or a reactive group, or are absent; the fluorescent group may be an environment-sensitive fluorescent group capable of emitting the same fluorescent signal, or a fluorescent group capable of emitting the same or a different fluorescent signal;

R4 is, in each case independently, a monophosphate group or a polyphosphate group;

R5 is, in each case independently, a protecting group capable of undergoing an orthogonal cleavage reaction;

L1 is a linking group or a cleavable linking group;

S2: attach the nucleic acid template to be sequenced to a test support, and form clusters of the nucleic acid molecules to be sequenced by amplification;

S3: add compounds 1, 2, 3, and 4 together with the polymerase to the S2 system and carry out the nucleotide polymerization reaction to obtain an intermediate-state complex;

S4: wash away unreacted compounds 1, 2, 3, and 4 while maintaining the intermediate-state complex of S3, then detect and record the fluorescent label of each incorporated nucleotide derivative to determine the base at the corresponding position on the DNA template;

S5: add compounds 5, 6, 7, and 8 and the polymerase to the reaction system treated in S4 and carry out the nucleotide polymerization reaction;

S6: add a cleavage solution to the system after the S5 reaction to remove the cleavable protecting groups from compounds 5, 6, 7, and 8, then remove the solution phase and rinse with buffer;

S7: wash away the compounds 1, 2, 3, and 4 that have been displaced following the reaction in S6;

S8: repeat steps S3 to S7 one or more times.
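The step order above can be sketched as a simple schedule in which S1 and S2 run once and S3 to S7 repeat for every sequencing cycle, which is what S8 prescribes. The function name `build_schedule` and the one-line step summaries are illustrative paraphrases introduced here; they are not part of the described method.

```python
# Illustrative sketch only: step labels follow S1-S8 above; the summaries are paraphrases.
SETUP_STEPS = [
    ("S1", "prepare compounds 1-8"),
    ("S2", "attach template to support and amplify into clusters"),
]
CYCLE_STEPS = [
    ("S3", "add compounds 1-4 and polymerase; form intermediate-state complex"),
    ("S4", "wash, then detect and record fluorescence; determine base"),
    ("S5", "add compounds 5-8 and polymerase; polymerization reaction"),
    ("S6", "add cleavage solution; remove protecting groups; rinse"),
    ("S7", "wash away displaced compounds 1-4"),
]


def build_schedule(n_cycles):
    """Return an ordered list of (cycle, step, summary) tuples.

    Setup steps carry cycle number 0; S8 is represented by repeating S3-S7
    once per cycle.
    """
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    schedule = [(0, step, text) for step, text in SETUP_STEPS]
    for cycle in range(1, n_cycles + 1):
        schedule.extend((cycle, step, text) for step, text in CYCLE_STEPS)
    return schedule


if __name__ == "__main__":
    for cycle, step, text in build_schedule(2):
        print(cycle, step, text)
```

Each pass through S3 to S7 reads one template position, so the number of cycles equals the read length.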

Depending on the sequencing requirements, the method can be implemented in different ways, which differ in individual steps as follows.

The first is a sequencing method using four fluorescent dyes (four-color sequencing):

In step S1, compounds 1, 2, 3, and 4 are prepared separately, and each is labeled with a fluorescent group whose excitation and emission wavelengths differ from those of the others. Compounds 1-4 each independently have the structure of formula (I):

Figure BDA0003852928080000102

Here, each R2 is independently a fluorescent group capable of emitting a different fluorescent signal; R3 is [not given]; L1 is H or absent.

In step S2, the nucleic acid template to be sequenced is attached to a test support and amplified to form clusters of the nucleic acid molecules to be sequenced, for example by attaching the template to a chip or to microspheres;

In step S3, the compounds and polymerase M are added together to the S2 reaction system for the nucleotide polymerization reaction. Polymerase M recognizes the matching compound and incorporates any one of them at the 3' end of the growing nucleic acid strand. Because the O between the first and second phosphate groups is replaced by a CH2 group, a phosphodiester bond cannot form normally and chain elongation terminates; the 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1, 2, 3, and 4 forms an intermediate-state complex with polymerase M and the nucleic acid template strand;

In step S4, unreacted compounds 1, 2, 3, and 4 are washed away while the intermediate-state complex of step S3 is maintained. The fluorescent label of each incorporated nucleotide derivative is detected and imaged, and the image is stored, to identify the nucleotide derivative incorporated at the 3' end of the nucleic acid strand. Because compounds 1-4 each carry a different fluorescent group, the detected fluorescent signals differ, and the base at the corresponding position on the DNA template can be determined from them;

In step S5, compounds 5, 6, 7, and 8 displace compounds 1, 2, 3, and 4 from the intermediate-state complex of step S3 and form a phosphodiester bond, thereby extending the chain. Because of the cleavable protecting group R5 in compounds 5-8, only a single base is added to the DNA strand in each extension;

In step S6, a cleavage buffer is added to remove the cleavable protecting group R5 from compounds 5-8; the solution phase is removed and the system is rinsed with buffer, washing away the displaced compounds 1, 2, 3, and 4. This leaves a natural, scar-free nucleotide, which favors incorporation of the next base.
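The four-color readout in step S4 amounts to asking which of four dye channels is brightest for a given cluster. The following is a minimal sketch under assumptions that are not in the source: the channel order (index i corresponds to the dye on compound i+1), the `min_signal` cutoff, and the function name `call_four_color` are all hypothetical.

```python
# Hypothetical channel order: index i holds the intensity for the dye on compound i+1.
COMPOUNDS = ("compound 1", "compound 2", "compound 3", "compound 4")


def call_four_color(intensities, min_signal=0.0):
    """Return the compound whose channel is brightest.

    Returns None when even the brightest channel does not exceed min_signal
    (an assumed cutoff; the source does not specify one).
    """
    if len(intensities) != 4:
        raise ValueError("expected four channel intensities")
    best = max(range(4), key=lambda i: intensities[i])
    if intensities[best] <= min_signal:
        return None
    return COMPOUNDS[best]


if __name__ == "__main__":
    print(call_four_color([0.1, 0.9, 0.2, 0.0]))
```

Because each compound carries a base that pairs with the template, identifying the compound identifies the template base at that position.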

The second is a sequencing method using two fluorescent dyes (two-color sequencing):

In step S1, compounds 1, 2, 3, and 4 are prepared separately; compounds 1-4 each independently have the structure of formula (I):

Figure BDA0003852928080000111

Here, in compound 1, R2 is dye A and R3 is absent; in compound 2, R2 is dye B and R3 is absent; in compound 3, R2 is dye A and R3 is dye B; in compound 4, both R2 and R3 are absent. Dye A and dye B differ in structure and emit different fluorescent signals. Each L1 is independently a linking group or absent;

In step S2, the nucleic acid template to be sequenced is attached to a test support and amplified to form clusters of the nucleic acid molecules to be sequenced, for example by immobilizing the nucleic acid template on a chip and building the clusters by bridge amplification;

In step S3, polymerase M recognizes the matching compound and incorporates any one of them at the 3' end of the growing nucleic acid strand. Owing to the β-CH2 group at the 5'-OH end, a phosphodiester bond cannot form normally and chain elongation terminates; the 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-3 forms an intermediate-state complex with polymerase M and the nucleic acid template strand;

In step S4, the fluorescent label of each incorporated nucleotide derivative is detected and imaged, and the image is stored, to identify the nucleotide derivative incorporated at the 3' end of the nucleic acid strand. Compound 1 gives a fluorescent signal in filter channel 1 and no (or only a very weak) signal in filter channel 2; compound 2 gives no (or only a very weak) signal in filter channel 1 and a signal in filter channel 2; compound 3 gives a signal in both filter channels; and compound 4, which carries no fluorescent label, gives no signal in either channel. The type of nucleotide derivative incorporated, and hence the base at the corresponding position on the DNA template, can be determined from this pattern.

The remaining steps are the same as in the first method.
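The two-channel pattern described in step S4 of the second method is a four-entry lookup table. A minimal sketch follows; the `threshold` that separates "signal" from "no or very weak signal" is an assumed value, and the names `TWO_COLOR_TABLE` and `call_two_color` are introduced here for illustration only.

```python
# Presence/absence pattern taken from step S4 of the two-color method:
# key = (signal in filter channel 1, signal in filter channel 2).
TWO_COLOR_TABLE = {
    (True, False): "compound 1",   # dye A only
    (False, True): "compound 2",   # dye B only
    (True, True): "compound 3",    # dye A and dye B
    (False, False): "compound 4",  # unlabeled
}


def call_two_color(ch1, ch2, threshold=0.5):
    """Map two channel intensities to a compound.

    threshold is an assumed cutoff between "signal" and "no or very weak
    signal"; a real instrument would need its own calibration.
    """
    return TWO_COLOR_TABLE[(ch1 > threshold, ch2 > threshold)]


if __name__ == "__main__":
    print(call_two_color(0.9, 0.1))
```

Two dyes suffice because two on/off channels give four distinguishable states, one for each compound.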

The third is also a sequencing method using two fluorescent dyes (two-color sequencing):

In step S1, compounds 1, 2, 3, and 4 are prepared separately; compounds 1-4 each independently have the structure of formula (I):

Figure BDA0003852928080000121

Here, in compound 1, R2 is dye A and R3 is absent; in compound 2, R2 is dye B and R3 is absent; in compound 3, R2 is dye A and R3 is a reactive group (Active Group, for example biotin, which binds rapidly to fluorescently labeled avidin or streptavidin); in compound 4, both R2 and R3 are absent. Dye A and dye B differ in structure and emit different fluorescent signals. Each L1 is independently a linking group or absent.

In step S2, the nucleic acid template to be sequenced is attached to a test support and amplified to form clusters of the nucleic acid molecules to be sequenced, for example by immobilizing the nucleic acid template on a chip and building the clusters by bridge amplification;

In step S3, polymerase M recognizes the matching compound and incorporates any one of them at the 3' end of the growing nucleic acid strand. Owing to the β-CH2 group at the 5'-OH end, a phosphodiester bond cannot form normally and chain elongation terminates; the 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-3 forms an intermediate-state complex with polymerase M and the nucleic acid template strand;

In step S4, an active substance that carries dye B and binds specifically to the reactive group, for example water-soluble dye B-streptavidin, must additionally be added so that dye B is introduced onto compound 3; excess material is washed away after the binding reaction. The fluorescent label of each incorporated nucleotide derivative is then detected and imaged, and the image is stored, to identify the nucleotide derivative incorporated at the 3' end of the nucleic acid strand. Compound 1 gives a fluorescent signal in filter channel 1 and no (or only a very weak) signal in filter channel 2; compound 2 gives no (or only a very weak) signal in filter channel 1 and a signal in filter channel 2; compound 3 gives a signal in both filter channels; and compound 4, which carries no fluorescent label, gives no signal in either channel. The type of nucleotide derivative incorporated, and hence the base at the corresponding position on the DNA template, can be determined from this pattern.

The remaining steps are the same as in the first method.
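Why the third method needs the extra binding step in S4 can be seen by modeling which filter channels each compound lights up before and after the dye B conjugate is added. This is a sketch under the assumptions stated in the text (channel 1 reports dye A, channel 2 reports dye B, and the reactive group on compound 3 captures the dye B conjugate); the function name `channels_for` is hypothetical.

```python
def channels_for(compound, stained):
    """Return the set of filter channels expected to show signal.

    compound: 1, 2, 3, or 4.
    stained:  True once the dye B-carrying substance (for example
              dye B-streptavidin) has been added and bound in step S4.
    Channel 1 reports dye A and channel 2 reports dye B.
    """
    direct_labels = {1: {1}, 2: {2}, 3: {1}, 4: set()}
    if compound not in direct_labels:
        raise ValueError("compound must be 1, 2, 3, or 4")
    channels = set(direct_labels[compound])
    if compound == 3 and stained:
        # The reactive group on compound 3 captures the dye B conjugate.
        channels.add(2)
    return channels


if __name__ == "__main__":
    for c in (1, 2, 3, 4):
        print(c, sorted(channels_for(c, False)), sorted(channels_for(c, True)))
```

Before staining, compounds 1 and 3 show the same pattern (channel 1 only); after staining, all four compounds are distinguishable.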

The fourth is a sequencing method using a single fluorescence color (one-color sequencing):

In step S1, compounds 1, 2, 3, and 4 are prepared separately; compounds 1-4 each independently have the structure of formula (I):

Figure BDA0003852928080000122

Here, in compound 1, R2 is dye A and R3 is absent; in compound 2, R2 is dye B and R3 is absent; in compound 3, R2 is a reactive group (Active Group, for example biotin, which binds rapidly to fluorescently labeled avidin or streptavidin) and R3 is absent; in compound 4, both R2 and R3 are absent. Dye A and dye B differ in structure. Dye B is an environment-sensitive fluorescent group: it can emit the same fluorescent signal as dye A, and it also responds rapidly to slight changes in its environment by undergoing fluorescence quenching. Each L1 is independently a linking group or absent.

In step S2, the nucleic acid template to be sequenced is attached to a test support and amplified to form clusters of the nucleic acid molecules to be sequenced, for example by immobilizing the nucleic acid template on a chip and building the first double-stranded nucleic acid clusters by bridge amplification;

In step S3, compounds 1, 2, 3, and 4 and the polymerase are added together to the S2 system for the nucleotide recognition and polymerization reaction. Polymerase M recognizes the matching compound and incorporates any one of them at the 3' end of the growing nucleic acid strand. Owing to the β-CH2 group at the 5'-OH end, a phosphodiester bond cannot form normally and chain elongation terminates; the 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-3 forms an intermediate-state complex with polymerase M and the nucleic acid template strand;

Step S4 comprises the following sub-steps. S4a: wash away unreacted compounds 1, 2, 3, and 4, then add scanning buffer to adjust the reaction environment so that dye B on compound 2 can be excited and emits the same fluorescent signal as dye A; detect the fluorescent signal and record it by imaging and storing the image. S4b: wash away the scanning buffer and add a dye A-labeled active substance that binds specifically to the reactive group on compound 3, thereby introducing fluorescent group A onto compound 3 so that it emits a fluorescent signal; at the same time, adjust the reaction environment to quench the fluorescent group on compound 2. This change does not affect compounds 1 and 4, but it quenches the environment-sensitive fluorescent group on compound 2 so that no signal is detected from it. S4c: add scanning buffer, excite with the light source, detect the fluorescent signal and record it by imaging and storing the image, then wash away the scanning buffer;

The two images are then compared to determine the type of nucleotide derivative incorporated. Compound 1 gives a fluorescent signal in both images; compound 2 gives a signal in the first image and no (or only a very weak) signal in the second; compound 3 gives no (or only a very weak) signal in the first image and a signal in the second; and compound 4, which carries no fluorescent label, gives no signal in either image. The type of nucleotide derivative incorporated, and hence the base at the corresponding position on the DNA template, can be determined from this pattern.

The remaining steps are the same as in the first method.
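In the one-color method the two distinguishing measurements are separated in time rather than by wavelength: each cluster is imaged once in S4a and once in S4c, and the pair of results identifies the compound. A minimal sketch, assuming an arbitrary intensity `threshold` and hypothetical names (`ONE_COLOR_TABLE`, `call_one_color`, `call_clusters`):

```python
# key = (signal in first image, signal in second image), per the comparison above.
ONE_COLOR_TABLE = {
    (True, True): "compound 1",    # dye A: bright in both images
    (True, False): "compound 2",   # dye B: bright, then quenched
    (False, True): "compound 3",   # reactive group: dark, then labeled with dye A
    (False, False): "compound 4",  # unlabeled
}


def call_one_color(first, second, threshold=0.5):
    """Identify a compound from its intensities in the two images.

    threshold is an assumed cutoff, not a value given in the source.
    """
    return ONE_COLOR_TABLE[(first > threshold, second > threshold)]


def call_clusters(first_image, second_image, threshold=0.5):
    """Apply call_one_color to paired per-cluster intensity lists."""
    if len(first_image) != len(second_image):
        raise ValueError("both images must cover the same clusters")
    return [call_one_color(a, b, threshold)
            for a, b in zip(first_image, second_image)]


if __name__ == "__main__":
    print(call_clusters([0.9, 0.8, 0.1, 0.0], [0.9, 0.1, 0.7, 0.0]))
```

The trade-off relative to the two-color methods is one detection channel instead of two, at the cost of a second imaging pass per cycle.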

Example 1

Compounds 1, 2, 3, and 4 have the structure of formula I-a:

Figure BDA0003852928080000131

Compounds 5, 6, 7, and 8 have the structure of formula IIa:

Figure BDA0003852928080000132

Base denotes the different bases; in this example, Base is adenine (A), guanine (G), cytosine (C), and thymine (T) or uracil (U), respectively.

Dye denotes fluorescent dyes with different excitation and emission wavelengths; in this example, the dyes are AF532, Cy5, AF568, and IF700, respectively.

Block is a cleavable protecting group; in this example, Block is an N3 or S-S-Et group.

The synthetic route for compounds 1, 2, 3, and 4 of formula I-a is as follows:

Figure BDA0003852928080000141

The synthetic route for compounds 5, 6, 7, and 8 of formula IIa is as follows:

Figure BDA0003852928080000142

In a preferred embodiment, the specific structures of compounds 1 to 8 are as follows:

Figure BDA0003852928080000143

Figure BDA0003852928080000151

The gene sequencing method is as follows:

a) Immobilize the DNA template to be sequenced on a chip, and build double-stranded nucleic acid clusters by bridge amplification;

b) Add compounds 1, 2, 3, and 4 and polymerase M together to the reaction system for the nucleotide recognition and polymerization reaction. Polymerase M recognizes the matching compound and incorporates any one of them at the 3' end of the growing nucleic acid strand. Owing to the β-CH2 group at the 5'-OH end, a phosphodiester bond cannot form normally and chain elongation terminates. The 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-4 forms an intermediate-state complex with polymerase M and the nucleic acid template strand;

c) Wash away unreacted compounds 1, 2, 3, and 4 while maintaining the intermediate-state complex of step b). Detect and image the fluorescent label of each incorporated nucleotide derivative, and store the image, to identify the nucleotide derivative incorporated at the 3' end of the nucleic acid strand. Because compounds 1, 2, 3, and 4 are labeled with AF532, Cy5, AF568, and IF700 respectively, the detected fluorescent signals are entirely different, and the incorporated nucleotide derivative and the base at the corresponding position on the DNA template can be determined from them;

d) Add compounds 5, 6, 7, and 8 and the polymerase together to the above double-stranded nucleic acid cluster reaction system for the nucleotide polymerization reaction. Compounds 5, 6, 7, and 8 displace compounds 1, 2, 3, and 4 from the intermediate-state complex of step b) and form a phosphodiester bond, thereby extending the chain. Because of the cleavable protecting group R5 in compounds 5-8, only a single base is added to the DNA strand in each extension.

e) Add cleavage buffer to remove the 3'-end protecting group from compounds 5-8, remove the solution phase and rinse with buffer, washing away the displaced compounds 1, 2, 3, and 4 at the same time, which favors incorporation of the next base.

f) Repeat steps b) to e) one or more times.
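Reading out Example 1 over several cycles can be sketched as a dye-to-compound lookup followed by a compound-to-base lookup. The dye assignment (compound 1: AF532, compound 2: Cy5, compound 3: AF568, compound 4: IF700) comes from step c) above. The text does not state which base each compound carries, so the `example_bases` mapping below is purely hypothetical, as are the names `DYE_TO_COMPOUND` and `assemble_read`.

```python
# Dye assignment from step c) of Example 1.
DYE_TO_COMPOUND = {"AF532": 1, "Cy5": 2, "AF568": 3, "IF700": 4}


def assemble_read(dyes_per_cycle, compound_to_base):
    """Turn the dye detected in each cycle into a string of incorporated bases.

    compound_to_base must be supplied by the caller because the source text
    does not state which base each compound carries.
    """
    read = []
    for dye in dyes_per_cycle:
        compound = DYE_TO_COMPOUND[dye]
        read.append(compound_to_base[compound])
    return "".join(read)


if __name__ == "__main__":
    # Hypothetical base assignment, for illustration only.
    example_bases = {1: "A", 2: "C", 3: "G", 4: "T"}
    print(assemble_read(["Cy5", "AF532", "IF700"], example_bases))
```

The string returned lists the incorporated bases; the template strand at those positions carries their complements.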

Example 2

Compounds 1 and 2 have the structure of formula (I-a), compound 3 has the structure of formula (I-b), compound 4 has the structure of formula (I-c), and compounds 5, 6, 7, and 8 have the structure of formula (IIa). For instance, compounds 1 and 2 have the structure of formula Ia-1:

Figure BDA0003852928080000161

Compound 3 has the structure of formula I-b:

Figure BDA0003852928080000162

Compound 4 has the structure of formula I-c:

Figure BDA0003852928080000163

Compounds 5, 6, 7, and 8 have the structure of formula IIa:

Figure BDA0003852928080000164

Base (B) denotes the different bases; in this example, B is adenine (A), guanine (G), cytosine (C), thymine (T) or uracil (U), respectively.

Dye A and Dye B are each independently a fluorophore, and the two emit different fluorescent signals.

Block is a cleavable protecting group; in this example, Block is an N3 or S-S-Et group.

The synthetic routes of compound 1, compound 2, compound 4, compound 5, compound 6, compound 7 and compound 8 are the same as in Example 1.

The synthetic route of compound 3, which has formula I-b, is shown below:

Figure BDA0003852928080000171

In a preferred embodiment, the specific structures of compound 1, compound 2, compound 3 and compound 4 are as follows:

Figure BDA0003852928080000172

Figure BDA0003852928080000181

Gene sequencing with these compounds proceeds by the following steps:

a) The DNA template to be sequenced is immobilized on a chip, and double-stranded nucleic acid clusters are built by bridge amplification;

b) Compound 1, compound 2, compound 3, compound 4 and polymerase M are added simultaneously to the reaction system for the nucleotide recognition and polymerization reaction. Polymerase M recognizes the matching compound and incorporates it at the 3' end of the growing nucleic acid strand. Because of the β-CH2 group at the 5'-OH end, a normal phosphodiester bond cannot form and chain extension terminates. The 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-4 forms an intermediate complex with polymerase M and the nucleic acid template strand;

c) Unreacted compound 1, compound 2, compound 3 and compound 4 are washed away while the intermediate complex of step b is maintained. The fluorescent label of each incorporated nucleotide derivative is detected and photographed, and the stored images are used to identify the nucleotide derivative incorporated at the 3' end of the nucleic acid strand. Compound 1, labeled with the dye Cy5, gives a fluorescent signal in filter channel 1 and no signal (or a very weak one) in filter channel 2. Compound 2, labeled with the dye AF532, gives no signal (or a very weak one) in filter channel 1 and a fluorescent signal in filter channel 2. Compound 3 carries both Cy5 and AF532 and therefore gives a signal in both filter channel 1 and filter channel 2. Compound 4 carries no fluorophore and gives no signal in either channel. The incorporated nucleotide derivative, and hence the base at the corresponding position on the DNA template, is determined from this signal pattern;

d) Compound 5, compound 6, compound 7, compound 8 and a polymerase are added simultaneously to the above double-stranded nucleic acid cluster reaction system to carry out nucleotide polymerization. Compounds 5-8 displace compound 1, compound 2, compound 3 and compound 4 from the intermediate complex of step b and form a phosphodiester bond, thereby extending the chain. Because compounds 5-8 carry the cleavable protecting group R2, only a single base is incorporated into the DNA strand in each extension.

e) A cleavage buffer is added to remove the cleavable protecting group R2 from compounds 5-8. The solution phase is removed and the chip is rinsed with buffer, which also washes away the displaced compound 1, compound 2, compound 3 and compound 4. This leaves a natural, scar-free nucleotide and facilitates incorporation of the next base.

f) Repeat steps a-e one or more times.
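The two-channel read-out described in step c can be sketched as a small lookup. This is an illustrative sketch only, not part of the disclosed method: the function name, the normalized intensity scale and the 0.5 threshold are assumptions introduced here, and the text does not state how "no signal (or a very weak one)" is thresholded in practice.

```python
# Hypothetical two-channel caller for the labeling scheme of Example 2.
# Channel 1 detects Cy5, channel 2 detects AF532 (as stated in step c).
# The threshold and the normalized intensity scale are assumptions.

PATTERN_TO_COMPOUND = {
    (True, False): 1,   # compound 1: Cy5 only
    (False, True): 2,   # compound 2: AF532 only
    (True, True): 3,    # compound 3: Cy5 and AF532
    (False, False): 4,  # compound 4: no fluorophore
}


def call_compound(channel1, channel2, threshold=0.5):
    """Return the compound number (1-4) implied by two channel intensities."""
    pattern = (channel1 >= threshold, channel2 >= threshold)
    return PATTERN_TO_COMPOUND[pattern]


if __name__ == "__main__":
    # Made-up intensities for four clusters, one per signal pattern.
    for ch1, ch2 in [(0.9, 0.05), (0.1, 0.8), (0.85, 0.9), (0.02, 0.03)]:
        print(ch1, ch2, "-> compound", call_compound(ch1, ch2))
```

Because each of the four compounds produces a distinct on/off pattern across the two channels, a single imaging pass per cycle is enough to tell them apart. Mapping the compound number to a base requires the base assignment shown in the structure figures.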

Example 3

Compound 1 and compound 2 have the structure of formula (I-a); compound 3 has the structure of formula (I-d); compound 4 has the structure of formula (I-c); and compound 5, compound 6, compound 7 and compound 8 have the structure of formula (IIa). For example, compound 1 and compound 2 have the structure of formula Ia-1:

Figure BDA0003852928080000182

Compound 3 has the structure of formula I-d:

Figure BDA0003852928080000191

Compound 4 has the structure of formula I-c:

Figure BDA0003852928080000192

Compound 5, compound 6, compound 7 and compound 8 have the structure of formula IIa:

Figure BDA0003852928080000193

Base (B) denotes the different bases; in this example, B is adenine (A), guanine (G), cytosine (C), thymine (T) or uracil (U), respectively.

Dye A and Dye B are each independently a fluorophore, and the two emit different fluorescent signals. AG (Active Group) is a reactive group capable of undergoing a linking reaction.

Block is a cleavable protecting group; in this example, Block is an N3 or S-S-Et group.

The synthetic routes of compound 1, compound 2, compound 4, compound 5, compound 6, compound 7 and compound 8 are the same as in Example 1.

The synthetic route of compound 3, which has formula I-d, is shown below:

Figure BDA0003852928080000201

In a preferred embodiment, the specific structures of compound 1, compound 2, compound 3 and compound 4 are as follows:

Figure BDA0003852928080000202

Figure BDA0003852928080000211

Gene sequencing with these compounds proceeds by the following steps:

a) The DNA template to be sequenced is immobilized on a chip, and double-stranded nucleic acid clusters are built by bridge amplification;

b) Compound 1, compound 2, compound 3, compound 4 and polymerase M are added simultaneously to the reaction system for the nucleotide recognition and polymerization reaction. Polymerase M recognizes the matching compound and incorporates it at the 3' end of the growing nucleic acid strand. Because of the β-CH2 group at the 5'-OH end, a normal phosphodiester bond cannot form and chain extension terminates. The 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-4 forms an intermediate complex with polymerase M and the nucleic acid template strand;

c) Water-soluble AF532-streptavidin is added. It binds specifically to the biotin group on compound 3 and thereby introduces the AF532 fluorophore into compound 3. Unreacted compound 1, compound 2, compound 3 and compound 4, together with excess AF532-streptavidin, are washed away while the intermediate complex of step b is maintained. The fluorescent label of each incorporated nucleotide derivative is detected and photographed, and the stored images are used to identify the nucleotide derivative incorporated at the 3' end of the nucleic acid strand. Compound 1, labeled with the dye Cy5, gives a fluorescent signal in filter channel 1 and no signal (or a very weak one) in filter channel 2. Compound 2, labeled with the dye AF532, gives no signal (or a very weak one) in filter channel 1 and a fluorescent signal in filter channel 2. Compound 3 now carries both Cy5 and AF532 and therefore gives a signal in both filter channel 1 and filter channel 2. Compound 4 carries no fluorophore and gives no signal in either channel. The incorporated nucleotide derivative, and hence the base at the corresponding position on the DNA template, is determined from this signal pattern;

d) Compound 5, compound 6, compound 7, compound 8 and a polymerase are added simultaneously to the above double-stranded nucleic acid cluster reaction system to carry out nucleotide polymerization. Compounds 5-8 displace compound 1, compound 2, compound 3 and compound 4 from the intermediate complex of step b and form a phosphodiester bond, thereby extending the chain. Because compounds 5-8 carry the cleavable protecting group R2, only a single base is incorporated into the DNA strand in each extension.

e) A cleavage buffer is added to remove the cleavable protecting group R2 from compounds 5-8. The solution phase is removed and the chip is rinsed with buffer, which also washes away the displaced compound 1, compound 2, compound 3 and compound 4. This leaves a natural, scar-free nucleotide and facilitates incorporation of the next base.

f) Repeat steps a-e one or more times.

Example 4

Compound 1 and compound 2 have the structure of formula (I-a); compound 3 has the structure of formula (I-f); compound 4 has the structure of formula (I-c); and compound 5, compound 6, compound 7 and compound 8 have the structure of formula (IIa). For example, compound 1 and compound 2 have the structure of formula Ia-1:

Figure BDA0003852928080000221

Compound 3 has the structure of formula I-d:

Figure BDA0003852928080000222

Compound 4 has the structure of formula I-c:

Figure BDA0003852928080000223

Compound 5, compound 6, compound 7 and compound 8 have the structure of formula IIa:

Figure BDA0003852928080000224

Base (B) denotes the different bases; in this example, B is adenine (A), guanine (G), cytosine (C), thymine (T) or uracil (U), respectively.

Dye denotes an environment-sensitive fluorescent dye or a fluorophore that emits the same fluorescent signal; in this example these are HCyC-647 and Cy5, respectively.

AG (Active Group) is a reactive group capable of undergoing a linking reaction.

Block is a cleavable protecting group; in this example, Block is an N3 or S-S-Et group.

The synthetic routes of compound 1, compound 2, compound 3, compound 4, compound 5, compound 6, compound 7 and compound 8 are the same as or similar to those in Example 3.

In a preferred technical scheme, one set of structures for compound 1, compound 2, compound 3 and compound 4 is:

Figure BDA0003852928080000225

Figure BDA0003852928080000231

Gene sequencing with these compounds proceeds by the following steps:

a) The DNA template to be sequenced is immobilized on a chip, and double-stranded nucleic acid clusters are built by bridge amplification;

b) Compound 1, compound 2, compound 3, compound 4 and polymerase M are added simultaneously to the reaction system for the nucleotide recognition and polymerization reaction. Polymerase M recognizes the matching compound and incorporates it at the 3' end of the growing nucleic acid strand. Because of the β-CH2 group at the 5'-OH end, a normal phosphodiester bond cannot form and chain extension terminates. The 2',3'-dideoxynucleoside triphosphate structure also ensures that only a single nucleotide derivative is incorporated at a time. At this point, any one of compounds 1-4 forms an intermediate complex with polymerase M and the nucleic acid template strand;

c) Unreacted dNTPs are washed away, scanning buffer is added and the pH is adjusted to 6.8. The fluorescent signal is detected with a 640 nm excitation source, photographed and stored as image A;

d) The scanning buffer is washed away and water-soluble Cy5-streptavidin is added. This treatment has no effect on compound 1, compound 2 or compound 4, while Cy5-streptavidin binds specifically to the biotin on compound 3 and thereby introduces the Cy5 fluorophore into compound 3 so that it emits a fluorescent signal;

e) The pH is adjusted to 7.5. The weakly alkaline environment has no effect on compound 1, compound 3 or compound 4, but quenches the fluorescence of HCyC-647 on compound 2, so that compound 2 gives no detectable signal under the 640 nm source;

f) Scanning buffer is added, and the fluorescent signal is detected with a 640 nm excitation source, photographed and stored as image B;

g) The scanning buffer is washed away, and compound 5, compound 6, compound 7, compound 8 and a polymerase are added simultaneously to the above double-stranded nucleic acid cluster reaction system to carry out nucleotide polymerization. Compounds 5-8 displace compound 1, compound 2, compound 3 and compound 4 from the intermediate complex of step b and form a phosphodiester bond, thereby extending the chain. Because compounds 5-8 carry a cleavable protecting group on the 3'-OH, only a single base is incorporated into the DNA strand in each extension.

h) The chip is treated with the cleavage reagent THP, which removes the protecting group at the 3'-position of compound 5, compound 6, compound 7 and compound 8 and regenerates a free 3'-OH. The displaced compound 1, compound 2, compound 3 and compound 4 are washed away at the same time, which facilitates incorporation of the next base.

i) Repeat steps (b)-(h);

j) After the two images of each sequencing cycle have been acquired, the fluorescent signals of the nucleic acid cluster at the same position are compared. If a signal is present in both image A and image B, the incorporated nucleotide derivative is compound 1 (A derivative, Cy5-labeled), and the base at the corresponding position on the DNA template is T. If no signal is present in either image A or image B, the incorporated nucleotide derivative is compound 4 (G derivative, no fluorescent label), and the template base is C. If a signal is present in image A but not in image B, the incorporated nucleotide derivative is compound 2 (T derivative, labeled with the environment-sensitive dye HCyC-647), and the template base is A. If no signal is present in image A but a signal is present in image B, the incorporated nucleotide derivative is compound 3 (C derivative, biotin-labeled), and the template base is G. The results are summarized in Table 1.

Table 1: Detection results and corresponding bases for Example 4

Figure BDA0003852928080000241
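The comparison of image A and image B in Example 4 can be sketched as a lookup from the two on/off observations to the incorporated derivative and, by Watson-Crick complementarity, the template base (A pairs with T, and C pairs with G). This is an illustrative sketch under stated assumptions: the function names are hypothetical, and signals are assumed to have already been reduced to present/absent for each cluster.

```python
# Hypothetical decoder for the single-color, two-image scheme of Example 4.
# Image A: pH 6.8, before Cy5-streptavidin. Image B: pH 7.5, after it.
# Keys are (signal in image A, signal in image B).

IMAGE_PATTERN = {
    (True, True): ("compound 1", "A"),    # Cy5: visible in both images
    (True, False): ("compound 2", "T"),   # HCyC-647: quenched at pH 7.5
    (False, True): ("compound 3", "C"),   # biotin: lit by Cy5-streptavidin
    (False, False): ("compound 4", "G"),  # unlabeled: dark in both images
}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def decode_cycle(signal_a, signal_b):
    """Return (compound, incorporated base, template base) for one cycle."""
    compound, incorporated = IMAGE_PATTERN[(bool(signal_a), bool(signal_b))]
    return compound, incorporated, COMPLEMENT[incorporated]


def decode_template(cycles):
    """Concatenate the template bases called over successive cycles."""
    return "".join(decode_cycle(a, b)[2] for a, b in cycles)


if __name__ == "__main__":
    # Made-up observations for four consecutive cycles of one cluster.
    observations = [(True, True), (False, False), (True, False), (False, True)]
    print(decode_template(observations))
```

Two binary observations give four distinguishable states, which is why a single excitation wavelength (640 nm) suffices to call all four bases in this scheme.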

Sequencing was verified using the reagents and method described in Example 4. The sample consisted of Lambda DNA fragments, sequenced in single-end mode with 30-base reads. Table 2 shows the base sequences of some of the fragments, and Table 3 shows the test results and analysis.

Table 2: Base sequences of selected Lambda DNA fragments

Figure BDA0003852928080000242

Figure BDA0003852928080000251

Table 3: Sequencing results and analysis

Figure BDA0003852928080000252

The sequencing results show an overall error rate of 0.2847%. The analysis indicates that the sequencing reagents and method of this example can sequence the sample DNA fragments accurately.
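For orientation, an overall per-base error rate of this kind is commonly computed as mismatched bases divided by total bases over all reads. The sketch below assumes that definition and assumes each read is already aligned to a reference of equal length. The text does not state how the 0.2847% figure was derived, and the sequences in the usage example are made up rather than taken from Table 2 or Table 3.

```python
# Hypothetical per-base error rate over a set of equal-length aligned reads.
# This is an assumed definition, not the analysis pipeline of the source.

def error_rate(reads, references):
    """Return mismatched bases / total bases across paired reads."""
    mismatches = 0
    total = 0
    for read, reference in zip(reads, references):
        if len(read) != len(reference):
            raise ValueError("read and reference must have equal length")
        total += len(read)
        mismatches += sum(r != t for r, t in zip(read, reference))
    return mismatches / total if total else 0.0


if __name__ == "__main__":
    # Two made-up 4-base reads with a single mismatch in the first read.
    rate = error_rate(["ACGT", "AAAA"], ["ACGA", "AAAA"])
    print(f"{rate:.4%}")
```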

Finally, the above examples are intended only to illustrate the technical solution of the invention and not to limit it. Although the invention has been described with reference to its preferred embodiments, those of ordinary skill in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method of gene sequencing comprising:
S1, preparing compound 1, compound 2, compound 3, compound 4, compound 5, compound 6, compound 7 and compound 8, respectively;
S2, attaching a nucleic acid template to be sequenced to a test carrier, and forming nucleic acid clusters to be sequenced by amplification;
S3, simultaneously adding compound 1, compound 2, compound 3, compound 4 and a polymerase to the system of S2 for a nucleotide polymerization reaction to obtain an intermediate complex;
S4, washing away unreacted compound 1, compound 2, compound 3 and compound 4 while maintaining the intermediate complex of S3, detecting and recording the fluorescent label of each incorporated nucleotide derivative, and determining the base at the corresponding position on the DNA template;
S5, adding compound 5, compound 6, compound 7, compound 8 and a polymerase to the reaction system treated in S4 to carry out a nucleotide polymerization reaction;
S6, adding a cleavage solution to the system after the reaction of S5 to remove the cleavable protecting groups of compound 5, compound 6, compound 7 and compound 8, removing the solution phase, and rinsing with a buffer solution;
S7, after the reaction of S6, washing away the displaced compound 1, compound 2, compound 3 and compound 4;
S8, repeating steps S3 to S7 one or more times;
wherein compound 1, compound 2, compound 3 and compound 4 are 2',3'-dideoxy triphosphate nucleotide derivatives having the structure of formula (I);
compound 5, compound 6, compound 7 and compound 8 are 2'-deoxynucleotide derivatives having the structure of formula (II);
the structures of formula (I) and formula (II) are as follows:
Figure FDA0003852928070000011
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 and R3 are each independently one of a fluorophore and a reactive group, or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
R5 is each independently a protecting group capable of undergoing an orthogonal cleavage reaction;
L1 is one of a linking group and a cleavable linking group.
2. The method for gene sequencing of claim 1, wherein the structure of formula (I) is as follows:
Figure FDA0003852928070000021
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 is each independently a fluorophore, the fluorophores emitting different fluorescent signals;
R3 is H or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
L1 is each independently a linking group or absent.
3. The method for gene sequencing of claim 1, wherein the structure of formula (I) is as follows:
Figure FDA0003852928070000022
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 and R3 are each independently a fluorophore or absent; wherein, in compound 1, R2 is a fluorophore and R3 is absent; in compound 2, R2 is a fluorophore that emits a fluorescent signal different from that of R2 in compound 1, and R3 is absent; in compound 3, R2 is a fluorophore that is the same as R2 in compound 1 or emits the same fluorescent signal as R2 in compound 1, and R3 is a fluorophore that is the same as R2 in compound 2 or emits the same fluorescent signal as R2 in compound 2; in compound 4, R2 and R3 are H or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
L1 is each independently a linking group or absent.
4. The method for gene sequencing of claim 1, wherein S4 comprises:
S4a, washing away unreacted compounds 1, 2, 3 and 4;
S4b, adding a fluorophore-labeled active group capable of rapidly binding to the reactive group R3 of compound 3, thereby introducing a second fluorophore into compound 3, the fluorophore on the active group being the same as R2 in compound 2 or emitting the same fluorescent signal as R2 in compound 2;
S4c, adding a scanning buffer solution, detecting and recording the fluorescent signal under an excitation light source, and then washing away the scanning buffer solution;
the structure of formula (I) is as follows:
Figure FDA0003852928070000031
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 and R3 are each independently one of a fluorophore and a reactive group, or absent; wherein, in compound 1, R2 is a fluorophore and R3 is absent; in compound 2, R2 is a fluorophore that emits a fluorescent signal different from that of R2 in compound 1, and R3 is absent; in compound 3, R2 is a fluorophore that is the same as R2 in compound 1 or emits the same fluorescent signal as R2 in compound 1, and R3 is a reactive group capable of rapidly binding to a fluorophore-labeled active group, the fluorophore on the active group being the same as R2 in compound 2 or emitting the same fluorescent signal as R2 in compound 2; in compound 4, R2 and R3 are H or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
L1 is each independently a linking group or absent.
5. The method for gene sequencing of claim 1, wherein S4 comprises:
S4a, washing away unreacted compounds 1, 2, 3 and 4, adding a scanning buffer solution to adjust the reaction environment so that compounds 1 and 2 can be excited to fluoresce, and detecting and recording the fluorescent signals;
S4b, washing away the scanning buffer solution, adding an active group labeled with the fluorophore of compound 1, thereby introducing that fluorophore into compound 3 so that compound 3 emits a fluorescent signal, and at the same time adjusting the reaction environment to quench the fluorescence of the fluorophore on compound 2;
S4c, adding a scanning buffer solution, detecting and recording the fluorescent signal under an excitation light source, and washing away the scanning buffer solution;
the structure of formula (I) is as follows:
Figure FDA0003852928070000041
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 is each independently one of a fluorophore and a reactive group, or absent; wherein, in compound 1, R2 is a fluorophore; in compound 2, R2 is a fluorophore that, under specific conditions, emits the same fluorescent signal as the fluorophore of compound 1 and that responds rapidly to changes in the environment of the reaction system by undergoing fluorescence quenching; in compound 3, R2 is a reactive group; in compound 4, R2 is H or absent;
R3 is H or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
L1 is each independently a linking group or absent.
6. A gene sequencing reagent, characterized by comprising a polymerase, compound 1, compound 2, compound 3, compound 4, compound 5, compound 6, compound 7 and compound 8;
wherein compound 1, compound 2, compound 3 and compound 4 are 2',3'-dideoxy triphosphate nucleotide derivatives having the structure of formula (I);
compound 5, compound 6, compound 7 and compound 8 are 2'-deoxynucleotide derivatives having the structure of formula (II);
the structures of formula (I) and formula (II) are as follows:
Figure FDA0003852928070000042
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 and R3 are each independently one of a fluorophore and a reactive group, or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
R5 is each independently a protecting group capable of undergoing an orthogonal cleavage reaction;
L1 is one of a linking group and a cleavable linking group.
7. The gene sequencing reagent of claim 6, wherein the structure of the formula (I) is as follows:
Figure FDA0003852928070000051
wherein R1 represents the different bases or analogues, each being one of adenine, guanine, cytosine, thymine and uracil, or an analogue thereof;
R2 is each independently a fluorophore, the fluorophores emitting different fluorescent signals;
R3 is H or absent;
R4 is each independently one of a monophosphate group and a polyphosphate group;
L1 is each independently a linking group or absent.
8. The gene sequencing reagent of claim 6, wherein the base or the analogue is a compound having any one of the following structures:
Figure FDA0003852928070000052
Figure FDA0003852928070000061
9. The gene sequencing reagent of claim 6, wherein the protecting group is a group having any one of the following structures:
Figure FDA0003852928070000062
10. The gene sequencing reagent of claim 6, wherein the cleavable linking group is a group having any one of the following structures:
Figure FDA0003852928070000063
wherein R1' and R2' are each independently one of halogen, -H and a C1-C5 aliphatic chain.
CN202211137859.8A 2022-05-05 2022-09-19 Gene sequencing reagent and gene sequencing method Pending CN115323045A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210478585.2A CN114574571A (en) 2022-05-05 2022-05-05 Nucleotide derivative and gene sequencing method
CN2022104785852 2022-05-05

Publications (1)

Publication Number Publication Date
CN115323045A true CN115323045A (en) 2022-11-11

Family

ID=81778635

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210478585.2A Pending CN114574571A (en) 2022-05-05 2022-05-05 Nucleotide derivative and gene sequencing method
CN202211137859.8A Pending CN115323045A (en) 2022-05-05 2022-09-19 Gene sequencing reagent and gene sequencing method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202210478585.2A Pending CN114574571A (en) 2022-05-05 2022-05-05 Nucleotide derivative and gene sequencing method

Country Status (1)

Country Link
CN (2) CN114574571A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115266662B (en) * 2022-06-13 2024-06-04 深圳赛陆医疗科技有限公司 Hyperspectral sequencing method, hyperspectral sequencing system and gene sequencer
CN117924392A (en) * 2024-01-16 2024-04-26 深圳太古语科技有限公司 Nucleotide derivative and application thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107074904A (en) * 2014-10-23 2017-08-18 考利达基因组股份有限公司 Signal bondage is sequenced(SCS)With the nucleotide analog being sequenced for signal bondage
CN109562376A (en) * 2016-04-04 2019-04-02 纽约哥伦比亚大学董事会 A single-molecule/cluster DNA sequencing-by-synthesis based on fluorescence energy transfer
CN114250283A (en) * 2021-10-15 2022-03-29 深圳铭毅智造科技有限公司 Monochromatic fluorescence MRT gene sequencing reagent and method based on environment sensitive dye
CN114958995A (en) * 2022-04-27 2022-08-30 深圳赛陆医疗科技有限公司 Gene sequencing method

Also Published As

Publication number Publication date
CN114574571A (en) 2022-06-03

Similar Documents

Publication Publication Date Title
US11827932B2 (en) Methods and compositions for nucleic acid sequencing
US11939631B2 (en) Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators
JP7797426B2 (en) Nucleosides and nucleotides with 3' acetal blocking groups
CN109562376B (en) A single-molecule/cluster DNA sequencing-by-synthesis based on fluorescence energy transfer
CN114250283B (en) Monochromatic fluorescent MRT gene sequencing reagent and method based on environment-sensitive dye
WO2020093261A1 (en) Method for sequencing polynucleotides
CN115323045A (en) Gene sequencing reagent and gene sequencing method
CN114250282A (en) Gene sequencing reagent and method based on pH value sensitive dye
Tong et al. Combinatorial fluorescence energy transfer tags: New molecular tools for genomics applications
WO2023070010A1 (en) Ultrabright dna nanostructures for biosensing
HK40073095A (en) Compositions for nucleic acid sequencing
HK40014484A (en) Compositions for nucleic acid sequencing
HK40014484B (en) Compositions for nucleic acid sequencing
HK1247254B (en) Methods and compositions for nucleic acid sequencing
HK40049621B (en) Method for sequencing polynucleotides
HK1201079B (en) Methods and compositions for nucleic acid sequencing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination