CN114921533A

CN114921533A - Methods and adaptors for characterising a target polynucleotide

Info

Publication number: CN114921533A
Application number: CN202210482806.3A
Authority: CN
Inventors: 刘先宇; 常馨; 肖宓
Original assignee: Chengdu Qitan Technology Ltd
Current assignee: Chengdu Qitan Technology Ltd
Priority date: 2022-05-05
Filing date: 2022-05-05
Publication date: 2022-08-19

Abstract

The invention provides methods and adaptors for characterising a target polynucleotide. The method comprises: (a) moving a target polynucleotide through a nanopore, wherein a sequencing terminus of the target polynucleotide comprises a modification that blocks the nanopore; and (b) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore. The invention also provides an adaptor for characterising a target polynucleotide, wherein the adaptor comprises a modification moiety that binds to a sequencing terminus of the target polynucleotide and causes the nanopore to become blocked. The method prevents large amounts of sequenced library from accumulating at the Trans end of a nanopore sequencing device, and thereby increases the flexibility with which the Trans end can be engineered.

Description

Methods and adaptors for characterising a target polynucleotide

Technical Field

The present invention is in the field of gene sequencing, and relates to a method for characterizing a polynucleotide, and to adaptors used in the method.

Background

Nanopore sequencing offers long read lengths, direct readout of modification information, and real-time data production with parallel analysis. Compared with second-generation sequencing and other sequencing platforms, it has clear advantages in detecting variation in long nucleic acid fragments (including but not limited to point mutations, insertions and deletions, inversions and translocations, gene fusions, aberrant RNA splicing, RNA editing and other related variations) and in detecting modification information (including but not limited to methylation and acetylation). Because the platform produces and analyses data in parallel, enables real-time detection and diagnosis of mutations and modifications, and has a portable design, it has broad application prospects.

When a voltage is applied across the nanopore, the current drops as analytes (e.g., polynucleotides, polypeptides, polysaccharides, and lipids) pass through the nanopore, and the degree of current blockage varies with the structure of the analyte. The current changes while the analyte resides temporarily in the nanopore barrel. Nanopore detection of nucleotides gives a current change of known characteristics and duration.

In current nanopore sequencing, all analytes to be tested are loaded at the Cis end of the nanopore sequencing device at the start of sequencing. During sequencing, the analytes continuously pass through the pores and accumulate in large amounts at the Trans end of the nanopore sequencer.

Disclosure of Invention

The present invention aims to provide a method of characterising a target polynucleotide, and also provides adaptors for use in the method. The adaptors of the invention can be used to avoid enrichment of sequencing libraries at the Trans end of the sequencing apparatus, thereby avoiding the interference that this enrichment may cause and further increasing the flexibility with which the Trans end can be engineered.

The purpose of the invention is realized by the following technical scheme:

In a first aspect, the present invention provides a method of characterising a target polynucleotide, comprising:

(a) moving the target polynucleotide through the nanopore,

wherein the sequencing terminus of the target polynucleotide comprises a modification that occludes a nanopore;

(b) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the target polynucleotide, and thereby characterising the target polynucleotide.

The method of the invention further comprises the following steps:

(c) moving the target polynucleotide in a reverse direction relative to the nanopore back to the starting side of the nanopore.

The method according to the present invention, wherein,

the nanopore comprises a solid state nanopore and/or a biological nanopore; the biological nanopore comprises a transmembrane pore; the transmembrane pore comprises a transmembrane protein pore.

The method according to the present invention, wherein step (c) does not comprise:

taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.

The method according to the present invention, wherein in step (c) the target polynucleotide is moved in the reverse direction relative to the pore by at least one means comprising applying a reverse voltage.

The means for moving the target polynucleotide in the reverse direction relative to the pore may further comprise pulling by AFM (atomic force microscopy), or pulling by a reverse-acting enzyme that moves the target polynucleotide in the reverse direction relative to the nanopore. For example, if the helicase that moves the target polynucleotide toward the nanopore is a 5'-3' helicase, the reverse-acting enzyme may be a 3'-5' helicase.

The method according to the present invention, wherein the step (a) comprises:

attaching an adaptor comprising a modification moiety to the polynucleotide of interest such that a sequencing terminus of the polynucleotide of interest comprises the modification moiety.

Preferably, the modification moiety comprises a modification moiety with uncharged side chains or with positively charged side chains; and/or

The modification moiety with uncharged side chains comprises any one of, or a combination of two or more of, PNA, a polypeptide, and a nucleotide with an alkylated phosphate backbone; and/or

The modification moiety with positively charged side chains comprises a nucleotide whose phosphate backbone is modified with a cationic oligomer; and/or

The cationic oligomer comprises any one of, or a combination of two or more of, spermine, spermidine and putrescine; and/or

The modification moiety comprises a pair of binding partners that bind to each other, including streptavidin and biotin, or an antigen and an antibody.

The method according to the invention, wherein the adaptor is a Y-adaptor comprising a double-stranded region and at least one single-stranded region, or an E-adaptor comprising a double-stranded region and no single-stranded region.

The E-type adaptor is an adaptor used in conventional sequencing; E-type adaptors suitable for use herein additionally comprise a modification moiety.

The method according to the invention, wherein the adaptor is a Y-adaptor and the modification moiety of the Y-adaptor is located in or forms the overhang portion of the Y-adaptor; and/or

The modification moiety comprises a modification moiety with uncharged side chains, more preferably PNA or a polypeptide; and/or

The modification moiety is covalently attached to the Y-adaptor, or is attached to the Y-adaptor by a click chemistry reaction.

The method according to the invention, wherein the adaptor is an E-type adaptor and the modification moiety is located at the end that is not attached to the target polynucleotide; and/or

The modification moiety comprises an affinity substance, preferably streptavidin and biotin; or

The modification moiety comprises cholesterol.

The method according to the present invention, wherein the adaptor comprises a blocking strand for stalling a motor protein, the blocking strand having a different structure from the polynucleotide;

preferably, the blocking strand comprises one or more nitroindoles, one or more inosines, one or more acridines, one or more 2-aminopurines, one or more 2,6-diaminopurines, one or more 5-bromo-deoxyuridines, one or more inverted thymidines, one or more inverted dideoxythymidines, one or more dideoxycytidines, one or more 5-methylcytidines, one or more 5-hydroxymethylcytidines, one or more 2'-alkoxy-modified ribonucleotides (preferably 2'-methoxy-modified ribonucleotides), one or more isodeoxycytidines, one or more isodeoxyguanosines, one or more C3 groups, one or more photocleavable groups, one or more hexanediol groups, one or more iSp9 groups, one or more iSp18 groups, a polymer, or one or more thiol linkages.

In a second aspect, the present invention provides an adaptor for characterising a target polynucleotide, wherein the adaptor comprises a modification moiety for binding to a sequencing terminal of the target polynucleotide, the modification moiety being capable of causing nanopore blockage.

In a third aspect, the present invention provides a construct for characterising a polynucleotide of interest, the construct comprising an adaptor, and a polynucleotide of interest;

the adaptor comprises a modification moiety that binds to a sequencing terminus of the polynucleotide of interest, the modification moiety causing the nanopore to be blocked;

preferably, the target polynucleotide is a double-stranded polynucleotide.

In a fourth aspect, the present invention provides a complex for characterising a polynucleotide of interest, said complex comprising a construct and a motor protein, wherein,

the construct comprises an adaptor and a polynucleotide of interest, the adaptor comprising a modification moiety, the modification moiety binding to a sequencing terminal of the polynucleotide of interest, the modification moiety causing the nanopore to be blocked;

the motor protein is a protein capable of binding to the target polynucleotide and controlling its movement through the pore;

preferably, the motor protein is selected from one or more of a polymerase, an exonuclease, a helicase and a topoisomerase;

more preferably, the helicase is selected from one or more of Hel308 helicase, RecD helicase, TraI helicase, TrwC helicase, XPD helicase and Dda helicase.

In a fifth aspect, the present invention provides a kit for characterising polynucleotides using a nanopore, the kit comprising:

1) an adaptor comprising a modification moiety that binds to a sequencing terminus of the polynucleotide of interest, the modification moiety causing nanopore blockage; and

2) a motor protein.

In nanopore sequencing, all libraries are loaded at the Cis end of the nanopore sequencing device at the start of sequencing. As the sequencing library continuously passes through the pores during sequencing, a large amount of nucleic acid library accumulates at the Trans end of the nanopore sequencer; because nucleic acid is charged, this accumulation may interfere with the sequencing result. The inventors of the present invention provide a method that avoids this enrichment.

The technical concept of the present invention is described below with reference to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of the principle of using adaptors of the present invention to avoid massive enrichment of sequencing libraries at the Trans end; FIG. 2 is a schematic diagram of a complex comprising an adaptor of the invention, a target polynucleotide and a helicase.

In FIG. 1, the adaptors are ligated to both ends of a double-stranded target polynucleotide to be characterized. Under the action of a helicase, the duplex is unwound as it moves relative to the pore, and the target polynucleotide is sequenced from the change in current through the pore, that is, by strand sequencing. Because the modification moiety of the adaptor is located at the sequencing terminus of the sequenced strand, the modification cannot pass through the nanopore and blocks the pore, so that the strand that has completed sequencing is ejected from the pore back to the Cis end.

Briefly, the helicase added during sequencing is a 5'-3' helicase. Guided by the complex, the 5' end of the library enters the pore, and the helicase translocates in the 5'-3' direction while sequencing proceeds. When the single-stranded library reaches the 3' end, it cannot cross the energy barrier at that end, so the pore becomes blocked; the system then applies a reverse voltage to clear the pore, and the sequenced single-stranded library is ejected from the Trans end back to the Cis end.
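The blockage-and-ejection cycle described above can be pictured as a simple control loop. The following Python sketch is illustrative only: the current and voltage values, the blockage threshold, and the function name `voltage_schedule` are assumptions made for this example and do not come from the specification. It holds a forward voltage while the strand is read and, once the current has stayed below a blockage threshold for several consecutive samples (as might happen when the modification lodges at the pore), issues one reverse-voltage sample to return the strand to the Cis end.

```python
from typing import List

# All numeric values are illustrative assumptions, not values from the specification.
OPEN_PORE_PA = 200.0   # assumed open-pore current (pA)
BLOCK_FRACTION = 0.15  # assumed: below 15 % of open-pore current counts as blocked
BLOCK_SAMPLES = 5      # assumed: consecutive blocked samples needed to call a blockage
FORWARD_MV = 180.0     # assumed sequencing voltage (mV)
REVERSE_MV = -180.0    # assumed ejection voltage (mV)


def voltage_schedule(current_pa: List[float]) -> List[float]:
    """Return the voltage applied at each sample of a current trace.

    The forward voltage is held until BLOCK_SAMPLES consecutive samples are
    blocked; the next sample is taken at reverse voltage (ejection), after
    which the forward voltage resumes.
    """
    schedule = []
    blocked_run = 0
    for sample in current_pa:
        if blocked_run >= BLOCK_SAMPLES:
            # Sustained blockage detected: eject the strand toward the Cis end.
            schedule.append(REVERSE_MV)
            blocked_run = 0
            continue
        schedule.append(FORWARD_MV)
        if sample < BLOCK_FRACTION * OPEN_PORE_PA:
            blocked_run += 1
        else:
            blocked_run = 0
    return schedule


if __name__ == "__main__":
    # Four sequencing samples, five blocked samples, then two recovery samples.
    trace = [80.0] * 4 + [10.0] * 5 + [200.0] * 2
    print(voltage_schedule(trace))
```

Requiring several consecutive blocked samples, rather than a single one, is a design choice of this sketch so that brief current dips during normal translocation do not trigger an ejection.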

In FIG. 2, the complex comprises the adaptor, the double-stranded target polynucleotide to be characterised, and a motor protein. The Y1 strand comprises a blocking strand S, a polynucleotide strand D' linked to the blocking strand S, and a motor protein stalled on the blocking strand S; the region in which the Y2 strand is complementary to the Y1 strand is the double-stranded portion of polynucleotide strand L. The YB-Dn strand comprises a modification moiety, which in one particular embodiment comprises PNA; because PNA is uncharged, it blocks the pore and the strand is eventually ejected. The region in which the YB-Up strand is complementary to the Y1 strand is the double-stranded polynucleotide D. The YB-Dn strand and the YB-Up strand are connected by click chemistry, for example between DBCO and an azide (N3) group.

Compared with the prior art, the technical scheme of the invention has the following advantages:

The method provided by the invention prevents large amounts of sequenced library from accumulating at the Trans end of the nanopore sequencing device. Compared with existing sequencing technology, it greatly reduces the interference that may be caused by enrichment of large amounts of charged nucleic acid at the Trans end, and further increases the flexibility with which the Trans end can be engineered.

In addition, methods that use modifications with uncharged or positively charged side chains to block the pore are simpler to prepare and operate, and offer greater functionality, than methods that use, for example, streptavidin and biotin, and they can be carried out during sequencing with an adaptor such as a Y-adaptor.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram showing the principle of using adaptors of the present invention to avoid massive enrichment of sequencing libraries at the Trans end;

FIG. 2 shows a schematic diagram of a complex comprising an adaptor of the invention, a polynucleotide of interest and a helicase in one particular embodiment;

wherein the Y1 strand comprises a blocking strand S, a polynucleotide strand D' linked to the blocking strand S, and a motor protein stalled on the blocking strand S; the region in which the Y2 strand is complementary to the Y1 strand is the double-stranded portion of polynucleotide strand L; the YB-Dn strand comprises a modification moiety; the region in which the YB-Up strand is complementary to the Y1 strand is the double-stranded polynucleotide D; and the YB-Dn strand and the YB-Up strand are connected by click chemistry;

FIG. 3 shows a schematic diagram of a complex comprising an adaptor of the invention, a target polynucleotide and a helicase in another specific embodiment; a biotin-streptavidin complex is introduced at the sequencing terminus of the target polynucleotide, and when sequencing reaches that terminus the electric field force is insufficient to pull the complex apart, so the pore is blocked and the sequenced strand is ejected;

FIG. 4 shows a diagram of sequencing signals when sequencing is performed using adaptors according to example 1 of the invention;

FIG. 5 shows a graph of sequencing signals when sequencing is performed using adaptors according to example 2 of the invention;

FIG. 6 shows the adaptor according to example 3 of the invention and a diagram of sequencing signals when sequencing is performed using that adaptor.

Detailed Description

It is understood that different applications of the disclosed products and methods may be tailored to specific needs in the art. It is to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only and is not intended to be limiting.

In addition, as used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a polynucleotide" includes two or more polynucleotides, reference to "a polynucleotide binding protein" includes two or more such proteins, reference to "a helicase" includes two or more helicases, reference to "a monomer" includes two or more monomers, reference to "a pore" includes two or more pores, and the like.

All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.

Method

The present invention provides a method of characterising a target polynucleotide, comprising:

(a) moving the target polynucleotide through the nanopore,

(b) taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.

The method of the invention further comprises the following steps:

(c) optionally applying a reverse voltage to reverse the movement of the target polynucleotide relative to the nanopore back to the starting side of the nanopore.

The method of the invention comprises measuring one or more characteristics of the target polynucleotide. The method may comprise measuring 2, 3, 4, 5 or more characteristics of the target polynucleotide. The one or more characteristics are preferably selected from (i) the length of the target polynucleotide, (ii) the identity of the target polynucleotide, (iii) the sequence of the target polynucleotide, (iv) the secondary structure of the target polynucleotide, and (v) whether the target polynucleotide is modified. Any combination of (i) to (v) may be measured according to the present invention.

For (i), the length of the polynucleotide may be determined, for example, by determining the number of interactions of the target polynucleotide with the pore and the duration of time between interactions of the target polynucleotide with the pore.

For (ii), the identity of the polynucleotide may be determined in a variety of ways, with or without determining the sequence of the target polynucleotide. The former is straightforward: the polynucleotide is sequenced and thereby identified. The latter can be done in several ways. For example, the presence of a particular motif in the polynucleotide can be determined (without determining the remaining sequence of the polynucleotide). Alternatively, a particular electrical and/or optical signal measured in the method can identify the target polynucleotide as coming from a particular source.

For (iii), the sequence of the polynucleotide may be determined as described previously. Suitable sequencing methods, particularly those using electrical measurements, are described in Stoddart D et al., Proc Natl Acad Sci USA, 2009; 106(19): 7702-7; Lieberman KR et al., J Am Chem Soc, 2010; 132(50): 17961-72; and International application WO 2000/28312.

For (iv), the secondary structure can be measured in a variety of ways. For example, if the method involves electrical measurements, the secondary structure may be measured using changes in residence time or changes in the current through the pore. This allows single-stranded and double-stranded regions of the polynucleotide to be distinguished.

For (v), the presence or absence of any modification can be determined. The method preferably comprises determining whether the target polynucleotide has been modified by methylation, by oxidation, by damage, with one or more proteins, or with one or more labels, tags or blocking strands. Specific modifications result in specific interactions with the pore, which can be determined using the methods described below. For example, methylated cytosine can be distinguished from cytosine on the basis of the current passing through the pore during its interaction with each nucleotide.
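As a toy illustration of distinguishing a modified base from its unmodified counterpart by current level, the Python sketch below assigns an event to whichever reference level its mean current is closest to. The reference levels, the dictionary `REFERENCE_LEVELS_PA` and the function `call_base` are hypothetical and chosen only for this example; real signals depend on the pore, the surrounding sequence context and the running conditions, and practical base callers are considerably more elaborate.

```python
from statistics import mean
from typing import Sequence

# Hypothetical reference current levels (pA); not values from the specification.
REFERENCE_LEVELS_PA = {"C": 52.0, "5mC": 55.0}


def call_base(event_samples_pa: Sequence[float]) -> str:
    """Label an event with the reference whose level is nearest its mean current."""
    level = mean(event_samples_pa)
    return min(
        REFERENCE_LEVELS_PA,
        key=lambda base: abs(REFERENCE_LEVELS_PA[base] - level),
    )


if __name__ == "__main__":
    print(call_base([51.5, 52.5, 52.0]))
    print(call_base([55.2, 54.9, 55.1]))
```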

The process is generally carried out in the presence of a buffer. In the exemplary apparatus discussed above, the buffer is present in the aqueous solution of the chamber. Any buffer may be used in the methods of the invention. Typically, the buffer is a phosphate buffer. Other suitable buffers are HEPES and Tris-HCl buffers. The process is typically carried out at a pH of 4.0 to 12.0, 4.5 to 10.0, 5.0 to 9.0, 5.5 to 8.8, 6.0 to 8.7, 7.0 to 8.8, or 7.5 to 8.5. The pH used is preferably about 7.5.

The process can be carried out at 0 to 100 ℃, 15 to 95 ℃, 16 to 90 ℃, 17 to 85 ℃, 18 to 80 ℃, 19 to 70 ℃, or 20 to 60 ℃. The process is typically carried out at room temperature. The process is optionally carried out at a temperature that supports helicase function, for example about 37 ℃.
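The nested pH and temperature ranges listed above can be encoded as data and queried. The short Python sketch below is offered only as a convenience for reading those ranges: the range values are copied from the text, while the names `PH_RANGES`, `TEMP_RANGES_C` and `narrowest_range` are assumptions of this example. Given a proposed condition, it reports the narrowest listed range that contains it, or `None` if the condition falls outside every listed range.

```python
from typing import List, Optional, Tuple

Range = Tuple[float, float]

# Ranges as listed in the text (inclusive bounds assumed for this sketch).
PH_RANGES: List[Range] = [
    (4.0, 12.0), (4.5, 10.0), (5.0, 9.0), (5.5, 8.8),
    (6.0, 8.7), (7.0, 8.8), (7.5, 8.5),
]
TEMP_RANGES_C: List[Range] = [
    (0, 100), (15, 95), (16, 90), (17, 85), (18, 80), (19, 70), (20, 60),
]


def narrowest_range(value: float, ranges: List[Range]) -> Optional[Range]:
    """Return the narrowest listed range containing value, or None if there is none."""
    containing = [r for r in ranges if r[0] <= value <= r[1]]
    if not containing:
        return None
    return min(containing, key=lambda r: r[1] - r[0])


if __name__ == "__main__":
    print(narrowest_range(7.5, PH_RANGES))
    print(narrowest_range(37, TEMP_RANGES_C))
```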

The method may be carried out in the presence of free nucleotides or free nucleotide analogues and/or a cofactor that assists helicase function. The method may also be carried out in the absence of free nucleotides or free nucleotide analogues and in the absence of a helicase cofactor. The free nucleotides can be any one or more of the individual nucleotides discussed above. Free nucleotides include, but are not limited to, adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxyguanosine monophosphate (dGMP), deoxyguanosine diphosphate (dGDP), deoxyguanosine triphosphate (dGTP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (dTDP), deoxythymidine triphosphate (dTTP), deoxyuridine monophosphate (dUMP), deoxyuridine diphosphate (dUDP), deoxyuridine triphosphate (dUTP), deoxycytidine monophosphate (dCMP), deoxycytidine diphosphate (dCDP), and deoxycytidine triphosphate (dCTP). The free nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP. The free nucleotide is preferably adenosine triphosphate (ATP). A helicase cofactor is a factor that allows the helicase or construct to function. The helicase cofactor is preferably a divalent metal cation. The divalent metal cation is preferably Mg²⁺, Mn²⁺, Ca²⁺ or Co²⁺. The helicase cofactor is most preferably Mg²⁺.

Adaptor

The adaptors of the present invention comprise a modification moiety that binds to the sequencing terminus of the target polynucleotide to cause nanopore blockage.

The modification moiety comprises a modification moiety with uncharged side chains or with positively charged side chains. Optionally, the modification moiety with uncharged side chains comprises any one of, or a combination of two or more of, PNA, a polypeptide, and a nucleotide with an alkylated phosphate backbone. Alternatively, the modification moiety with positively charged side chains comprises a nucleotide whose phosphate backbone is modified with a cationic oligomer; see those described in CN101370817A, incorporated herein by reference in its entirety.

In particular, the cationic oligonucleotide $A_iB_jH$ has an oligonucleotide moiety $A_i$ and an oligocationic moiety $B_j$, wherein $A_i$ is an i-mer oligonucleotide residue, with i = 5 to 50, having natural or unnatural nucleobases and/or pentofuranosyl groups and/or natural phosphodiester bonds, and $B_j$ is a j-mer organic oligocationic moiety, with j = 1 to 50, wherein B is selected from the group comprising:

$-\mathrm{HPO_3}-R^1-(X-R^2_n)_{n1}-X-R^3-\mathrm{O}-$, wherein $R^1$, $R^2_n$ and $R^3$ are identical or different lower alkylene, X is NH or $\mathrm{NC(NH_2)_2}$, and n1 = 2 to 20;

$-\mathrm{HPO_3}-R^4-\mathrm{CH}(R^5X^1)-R^6-\mathrm{O}-$, wherein $R^4$ is lower alkylene, $R^5$ and $R^6$ are identical or different lower alkylene, and $X^1$ is a putrescine, spermidine or spermine residue;

$-\mathrm{HPO_3}-R^7-(\mathrm{aa})_{n2}-R^8-\mathrm{O}-$, wherein $R^7$ is lower alkylene, $R^8$ is lower alkylene, serine, or an amino alcohol obtained by reduction of a natural amino acid, $(\mathrm{aa})_{n2}$ is a peptide containing natural amino acids with cationic side chains, such as arginine, lysine, ornithine, histidine or diaminopropionic acid, and n2 = 2 to 20.

As used in the specification and claims, "lower alkyl" and "lower alkylene" preferably mean optionally substituted $C_1$-$C_5$ straight or branched chain alkyl or alkylene.

For example, A is selected from the group comprising deoxyribonucleotides, ribonucleotides, locked nucleotides (LNA), and chemical modifications or substitutions thereof such as phosphorothioates (also known as thiophosphates), 2'-fluoro groups, 2'-O-alkyl groups, or labelling groups such as fluorescent agents. Preferably, the cationic oligomer comprises any one of, or a combination of two or more of, spermine, spermidine and putrescine.

Optionally, the modification moiety comprises a pair of binding partners that bind to each other, the binding partners comprising streptavidin and biotin, and/or an antigen and an antibody.

The length of the modification moiety varies with its chemical structure. In a specific embodiment, the modification moiety is PNA, and its length may be 3-100 nucleotide units, preferably 5-50, more preferably 10-30, and even more preferably 13-20.

The adaptor is ligated to both ends of a double-stranded target polynucleotide to be characterized. Under the action of a helicase, the duplex is unwound as it moves relative to the pore, and the sequence of the target polynucleotide is determined from the change in current through the pore, that is, by strand sequencing. Because the modification moiety of the adaptor is located at the sequencing terminus of the sequenced strand, the modification cannot pass through the nanopore and blocks the pore, so that the strand that has completed sequencing is ejected from the pore back to the Cis end.

Wherein the adaptor may be a Y-type adaptor comprising a double-stranded region and at least one single-stranded region, or an E-type adaptor comprising a double-stranded region and no single-stranded region.

In a particular embodiment, the adaptor is a Y-adaptor, and the modification moiety of the Y-adaptor is located in or forms the overhang portion of the Y-adaptor.

In particular, the Y-adaptor comprises {S-D}_n or {D-S}_n in the 5' to 3' direction, wherein D is a double-stranded polynucleotide comprising a modification moiety, S is a blocking strand, and n is a positive integer;

and the D duplex comprises a polynucleotide strand D' linked to S and a strand D'' complementary to D', wherein the motor protein moves in the direction S → D' during characterization, and the modification moiety is located on the complementary strand D''.

For example, in a specific embodiment, the helicase is a 5'-3' helicase. Guided by the complex, the 5' end of the library enters the pore, and the helicase translocates in the 5'-3' direction while sequencing proceeds. When the single-stranded library reaches the 3' end, it cannot cross the energy barrier at that end, so the pore becomes blocked; the system then applies a reverse voltage to clear the pore, and the sequenced single-stranded library is ejected from the Trans end back to the Cis end.

The adaptor of the invention, wherein the adaptor comprises {L-S-D}_n or {D-S-L}_n in the 5' to 3' direction, wherein L is a polynucleotide strand; preferably, at least part of L is double-stranded; and/or at least part of L is single-stranded; and/or L comprises one or more blocking molecules; and/or L comprises a leader sequence that preferentially threads into the pore.

It will be appreciated that the L moiety is the moiety that first contacts the sequencing pore.
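The segment notation used above can be spelled out programmatically. The Python sketch below assumes that the subscript n denotes n tandem repeats of the bracketed unit; that reading, and the function name `adaptor_layout`, are assumptions of this example rather than statements from the specification. It writes out the order of segments (L for the leader-containing strand, S for the blocking strand, D for the duplex carrying the modification moiety) read in the 5' to 3' direction.

```python
def adaptor_layout(n: int, leader_first: bool = True) -> str:
    """Spell out {L-S-D}_n (leader_first=True) or {D-S-L}_n as a flat string.

    Segments are listed in the 5' to 3' direction; n is assumed to be the
    number of tandem repeats of the bracketed unit.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    unit = ["L", "S", "D"] if leader_first else ["D", "S", "L"]
    return "-".join(unit * n)


if __name__ == "__main__":
    print(adaptor_layout(1))                      # L-S-D
    print(adaptor_layout(2, leader_first=False))  # D-S-L-D-S-L
```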

In a specific embodiment, the target polynucleotide is modified to include a Y-type adaptor comprising a leader sequence and an E-type adaptor. The Y-adaptor containing the leader sequence is ligated to one end of the polynucleotide and the E-adaptor is ligated to the other end, the modification moiety of the invention being located at the end of the E-adaptor that is not attached to the target polynucleotide; the modification moiety comprises an affinity substance, the affinity substance being streptavidin and biotin; or the modification moiety comprises cholesterol.

The leader sequence preferentially enters the nanopore. The double-stranded adaptor, which has no single-stranded region, cannot pass through the nanopore because its modification moiety blocks the pore; the system therefore applies a reverse voltage to clear the pore, and the sequenced single-stranded library is ejected from the Trans end back to the Cis end.

In the present invention, the Cis and Trans ends are understood as follows:

A nanopore typically has two openings: a first opening and a second opening. These are commonly referred to as the cis opening and the trans opening of the nanopore. Typically the first opening is the cis opening and the second opening is the trans opening. The terms "cis" and "trans" for nanopore openings are conventional in the art. For example, the cis opening of a nanopore typically faces the cis end, and the trans opening typically faces the trans end. It will be appreciated that the cis end is the end from which the target polynucleotide moves into the nanopore, and the trans end is the end at which the target polynucleotide exits the nanopore.

Blocking strand

One or more blocking strands are included in the target polynucleotide. The blocking strand or strands are preferably part of the target polynucleotide, e.g. they interrupt the polynucleotide sequence. The one or more blocking strands are preferably not part of one or more blocking molecules, such as speed bumps, that hybridize to the target polynucleotide.

There may be any number of blocking strands in the target polynucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more blocking strands. Preferably there are 2, 4 or 6 blocking strands in the target polynucleotide. Different regions of the target polynucleotide may have a blocking strand, for example a blocking strand in the leader sequence and a blocking strand in the hairpin loop.

The one or more blocking strands each provide an energy barrier that the one or more helicases cannot overcome even in the active mode. The one or more blocking strands may stall the one or more helicases by reducing the traction of the helicase (e.g., by removing the bases from the nucleotides in the target polynucleotide) or by physically blocking the movement of the one or more helicases (e.g., using bulky chemical groups).

The one or more blocking strands may comprise any molecule or combination of molecules that stalls the one or more helicases. The one or more blocking strands may comprise any molecule or combination of molecules that prevents the one or more helicases from moving along the target polynucleotide. It is straightforward to determine whether the one or more helicases are stalled at one or more blocking strands in the absence of a nanopore and an applied potential. For example, this can be tested as shown in the examples; for instance, the ability of a helicase to move past a blocking strand and displace a complementary strand of DNA can be measured by PAGE.

The one or more blocking strands typically comprise a linear molecule such as a polymer. The one or more blocking strands typically have a different structure from the target polynucleotide. For example, if the target polynucleotide is DNA, the one or more blocking strands are typically not DNA. In particular, if the target polynucleotide is deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), the one or more blocking strands preferably comprise peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or a synthetic polymer with nucleotide side chains.

The one or more blocking strands preferably include one or more nitroindoles, such as one or more 5-nitroindoles, one or more inosines, one or more acridines, one or more 2-aminopurines, one or more 2,6-diaminopurines, one or more 5-bromo-deoxyuridines, one or more inverted thymidines (inverted dTs), one or more inverted dideoxythymidines (ddTs), one or more dideoxycytidines (ddCs), one or more 5-methylcytidines, one or more 5-hydroxymethylcytidines, one or more 2'-alkoxy-modified ribonucleotides (preferably 2'-methoxy-modified ribonucleotides), one or more iso-deoxycytidines (iso-dCs), one or more iso-deoxyguanosines (iso-dGs), one or more iSpC3 groups (i.e., nucleotides lacking a sugar and a base), one or more photocleavable (PC) groups, one or more hexanediol groups, one or more spacer 9 (iSp9) groups, one or more spacer 18 (iSp18) groups, a polymer, or one or more thiol linkages. The one or more blocking strands may comprise any combination of these groups. Many of these groups are commercially available from IDT (Integrated DNA Technologies).

The one or more blocking strands may comprise any number of these groups. For example, for 2-aminopurines, 2,6-diaminopurines, 5-bromo-deoxyuridines, inverted dTs, ddTs, ddCs, 5-methylcytidines, 5-hydroxymethylcytidines, 2'-alkoxy-modified ribonucleotides (preferably 2'-methoxy-modified ribonucleotides), iso-dCs, iso-dGs, iSpC3 groups, PC groups, hexanediol groups and thiol linkages, the one or more blocking strands preferably comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more. The one or more blocking strands preferably comprise 2, 3, 4, 5, 6, 7, 8 or more iSp9 groups. The one or more blocking strands preferably comprise 2, 3, 4, 5 or 6 or more iSp18 groups. The most preferred blocking strand is four iSp18 groups.

The polymer is preferably a polypeptide or polyethylene glycol (PEG). The polypeptide preferably comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more amino acids. The PEG preferably comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more monomeric units.

The one or more blocking strands preferably comprise one or more abasic nucleotides (i.e., nucleotides lacking a nucleobase), for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more abasic nucleotides. In abasic nucleotides the nucleobase may be replaced by -H (idSp) or -OH. Abasic blocking strands can be inserted into a target polynucleotide by removing the nucleobases from one or more adjacent nucleotides.

The one or more blocking strands preferably comprise one or more chemical groups that physically cause the one or more helicases to stall. The one or more chemical groups are preferably one or more pendant chemical groups. The one or more chemical groups may be attached to one or more nucleobases in the target polynucleotide. The one or more chemical groups may be attached to the backbone of the target polynucleotide. Any number, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more of these chemical groups may be present. Suitable groups include, but are not limited to, fluorophores, streptavidin and/or biotin, cholesterol, methylene blue, Dinitrophenols (DNPs), digoxigenin and/or anti-digoxigenin and diphenylcyclooctyne groups.

Different blocking strands in a target polynucleotide may comprise different stalling molecules. For example, one blocking strand may comprise a linear molecule as discussed above, and another blocking strand may comprise one or more chemical groups that physically cause the one or more helicases to stall. A blocking strand may also comprise any linear molecule as discussed above together with one or more chemical groups that physically cause the one or more helicases to stall, such as one or more abasic nucleotides and a fluorophore.

Complex

The invention provides a complex comprising an adaptor according to the invention and a motor protein, wherein the motor protein is located on a blocking strand.

Preferably, the motor protein is a protein capable of binding to a polynucleotide and controlling its movement through a pore, and is preferably an enzyme. For example, the enzyme is selected from one or more of a polymerase, an exonuclease, a helicase and a topoisomerase. For example, the helicase is selected from one or more of Hel308 helicase, RecD helicase, TraI helicase, TrwC helicase, XPD helicase and Dda helicase.

Polynucleotide

Polynucleotides, such as nucleic acids, are macromolecules containing two or more nucleotides. The polynucleotide or nucleic acid may include any combination of any nucleotides. Nucleotides may be naturally occurring or synthetic. One or more nucleotides in a polynucleotide may be oxidized or methylated. One or more nucleotides in the polynucleotide may be damaged. For example, the polynucleotide may comprise a pyrimidine dimer. Such dimers are often associated with damage caused by ultraviolet light and are the leading cause of cutaneous melanoma. One or more nucleotides in a polynucleotide may be modified, for example with a label or tag. Suitable labels are described below.

The nucleotides in a polynucleotide are typically ribonucleotides or deoxyribonucleotides. The polynucleotide may comprise the following nucleosides: adenosine, uridine, guanosine and cytidine. The nucleotide is preferably a deoxyribonucleotide. The polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC).

Nucleotides typically contain a monophosphate, diphosphate or triphosphate. The phosphate may be attached on the 5' or 3' side of the nucleotide.

Suitable nucleotides include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP) and deoxycytidine monophosphate (dCMP). The nucleotide is preferably selected from the group consisting of AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP. The nucleotide is most preferably selected from dAMP, dTMP, dGMP, dCMP and dUMP. The polynucleotide preferably comprises the following nucleotides: dAMP, dUMP and/or dTMP, and dCMP.

The nucleotides in the polynucleotide may be linked to each other in any manner. Nucleotides are typically linked by their sugars and phosphate groups, as in nucleic acids. The nucleotides may be linked by their nucleobases, such as in a pyrimidine dimer.

The polynucleotide may be a nucleic acid. The polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA), or another synthetic polymer having nucleotide side chains. The PNA backbone consists of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The GNA backbone is composed of repeating glycol units linked by phosphodiester bonds. The TNA backbone consists of repeating threose sugars linked together by phosphodiester bonds. LNA is formed from nucleotides with an additional bridge linking the 2' oxygen and the 4' carbon of the ribose sugar.

The polynucleotide is most preferably ribonucleic acid (RNA) or deoxyribonucleic acid (DNA).

The polynucleotide may be of any length. For example, a polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotides in length. The polynucleotide may be 1000 or more nucleotides, 5000 or more nucleotides in length or 100000 or more nucleotides in length.

Helicases may move along all or only part of the target polynucleotide in the methods of the invention. All or a portion of the target polynucleotide can be characterized using the methods of the invention.

The target polynucleotide may be single stranded. At least a portion of the target polynucleotide is preferably double stranded. Helicases are typically bound to single stranded polynucleotides. If at least a portion of the target polynucleotide is double-stranded, the target polynucleotide preferably comprises a single-stranded region or a non-hybridizing region. The one or more helicases are capable of binding to one strand of the single-stranded region or the non-hybridizing region. The target polynucleotide preferably comprises one or more single stranded regions or one or more non-hybridising regions.

Sample

The target polynucleotide may be present in any suitable sample. The invention is generally practiced on samples known to contain or suspected of containing the target polynucleotide. Alternatively, the invention may be carried out on a sample to confirm the identity of one or more target polynucleotides whose presence in the sample is known or expected.

The sample may be a biological sample. The invention may be practiced in vitro on samples obtained or extracted from any organism or microorganism. The organism or microorganism is typically archaeal, prokaryotic or eukaryotic, and typically belongs to one of the five kingdoms: plants, animals, fungi, prokaryotes and protists. The invention may also be carried out in vitro on samples obtained or extracted from any virus. The sample is preferably a liquid sample. The sample typically comprises a body fluid of the patient. The sample may be urine, lymph, saliva, mucus or amniotic fluid, but is preferably blood, plasma or serum. Typically, the sample is of human origin, but it may alternatively be from other mammals, such as commercially farmed animals (e.g., horses, cattle, sheep or pigs) or pets (e.g., cats or dogs). Alternatively, samples of plant origin are typically obtained from commercial crops, such as cereals, legumes, fruits or vegetables, e.g. wheat, quinoa, barley, oats, canola, corn, soybean, rice, bananas, apples, tomatoes, potatoes, grapes, tobacco, beans, lentils, sugar cane, cocoa and cotton.

The sample may be a non-biological sample. The non-biological sample is preferably a liquid sample. Examples of non-biological samples include surgical fluids, water such as drinking water, seawater or river water, and reagents for laboratory testing.

The sample is typically processed prior to testing, for example by centrifugation or by membrane filtration to remove unwanted molecules or cells, such as red blood cells. The detection may be performed immediately after the sample is obtained. The sample may also be stored prior to analysis, preferably below -70 ℃.

Click chemistry

The polynucleotides of the present application may be covalently linked, for example by copper-free click chemistry or copper-catalyzed click chemistry. Click chemistry is used in these applications because of its desirable properties and its scope for creating covalent linkages between diverse building blocks. For example, it is fast, clean and non-toxic, generating only harmless by-products. Click chemistry is a term first introduced by Kolb et al. in 2001 to describe an expanding set of powerful, selective and modular building blocks that work reliably in both small- and large-scale applications (Kolb HC, Finn MG, Sharpless KB, Click chemistry: diverse chemical function from a few good reactions, Angew. Chem. Int. Ed. 40 (2001) 2004-). They defined a set of stringent criteria for click chemistry as follows: "The reaction must be modular, wide in scope, give very high yields, generate only harmless by-products that can be removed by non-chromatographic methods, and be stereospecific (but not necessarily enantioselective). The required process characteristics include simple reaction conditions (ideally, the process should be insensitive to oxygen and water), readily available starting materials and reagents, the use of no solvent or a solvent that is benign (such as water) or easily removed, and simple product isolation. Purification, if required, must be by non-chromatographic methods, such as crystallization or distillation, and the product must be stable under physiological conditions."

Suitable examples of click chemistry include, but are not limited to, the following:

(a) a copper-free variant of the 1,3-dipolar cycloaddition, in which an azide reacts with an alkyne under strain, for example in a cyclooctyne ring;

(b) reaction of an oxygen nucleophile on one linker with an epoxide or aziridine reactive moiety on the other linker; and

(c) Staudinger ligation, in which the alkyne moiety can be replaced by an aryl phosphine, resulting in a specific reaction with the azide to give an amide bond.

Preferably, the click chemistry reaction is a Cu(I)-catalyzed 1,3-dipolar cycloaddition between an alkyne and an azide. In a preferred embodiment, the first group is an azide group and the second group is an alkyne group. Nucleic acid bases have been synthesized with azide and alkyne groups incorporated at preferred positions (e.g., Kocalka P, El-Sagheer AH, Brown T, Rapid and efficient DNA strand cross-linking by click chemistry, Chembiochem. 2008. 9(8): 1280-5). Alkyne groups are commercially available from Berry Associates (Michigan, USA) and azide groups are synthesized by ATDBio or idtbio.

In a particular embodiment of the present application, the reactive groups are preferably an azide group and an alkynyl group, such as azide (N3) and DBCO.

Kit

In a further aspect, the invention also provides a kit for characterising a polynucleotide, the kit comprising the adaptor or the complex.

The kit comprises (a) one or more adaptors and (b) one or more helicases. The kit may include any of the helicases and pores discussed above.

The kit may also include the components of a membrane, such as the phospholipids needed to form an amphiphilic layer, for example a lipid bilayer.

The kit of the invention may additionally comprise one or more other reagents or instruments enabling the performance of any of the embodiments mentioned above. Such reagents or instruments include one or more of the following: suitable buffers (aqueous solutions), means for obtaining a sample from a subject (e.g., a vessel or an instrument comprising a needle), means for amplifying and/or expressing a polynucleotide, a membrane as defined above, or a voltage clamp or patch clamp apparatus. The reagents may be present in the kit in a dry state, such that a fluid sample resuspends the reagents. The kit may also, optionally, include instructions for how to use the kit in the methods of the invention, or detailed information about the patients for whom the methods are useful. The kit optionally includes components necessary to facilitate helicase movement (e.g., ATP and Mg²⁺).

Example 1: Preparation and sequencing of the Y adaptor-enzyme complex of the invention

SEQ ID NO:1 GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT

SEQ ID NO:2 ACTGCTCATTCGGTCCTGCTGACT

SEQ ID NO:3 CGACTTCTACCGTTTGACTCCGC

SEQ ID NO:4 GTCAGCAGGACCGAATGAGCAGT

SEQ ID NO:5 AGTCCAGCACCGACC, wherein SEQ ID NO:5 consists of PNA.

The complex is formed by hybridizing four different strands together.

The first strand (Y1) comprises, in order, a leader sequence consisting of iSpC3 blocking-strand units (each denoted 3) linked to the 5' end of SEQ ID NO:1; the 3' end of SEQ ID NO:1 is in turn linked through iSp18 blocking-strand units (denoted 8888) to the 5' end of SEQ ID NO:2.

The second strand (Y2) is as shown in SEQ ID NO:3.

The third strand (YB-Up): the 5' end of SEQ ID NO:4 carries a phosphate (P) for ligation to the polynucleotide to be characterized, and the 3' end of SEQ ID NO:4 carries the click chemistry group DBCO.

The fourth strand (YB-Dn): N-Lys(azide)-OO-AGTCCAGCACCGACC-RR-C, where O is an O-linker (also known as AEEA or eg1) and R is Lys, both included to increase solubility.

Y1: 5'-333333333333333333333333333333GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT-8888-ACTGCTCATTCGGTCCTGCTGACT-3'

Y2: 5'-CGACTTCTACCGTTTGACTCCGC-3'

YB-Up: 5'-P-GTCAGCAGGACCGAATGAGCAGT-DBCO-3'

YB-Dn: N-Lys(azide)-OO-AGTCCAGCACCGACC-RR-C
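As an illustrative aid only (it is not part of the disclosed method), the base pairing implied by the strands above can be checked with a few lines of Python. The sketch assumes ordinary Watson-Crick pairing between the DNA segments and ignores the iSpC3/iSp18 units, the DBCO group and the PNA strand; the function names `reverse_complement` and `annealing_site` are hypothetical helpers introduced here.

```python
# DNA segments copied from SEQ ID NO:1-4 above, written 5'->3'.
SEQ1 = "GCGGAGTCAAACGGTAGAAGTCG" + "T" * 10  # Y1 segment before the 8888 units
SEQ2 = "ACTGCTCATTCGGTCCTGCTGACT"            # Y1 segment after the 8888 units
SEQ3 = "CGACTTCTACCGTTTGACTCCGC"             # Y2
SEQ4 = "GTCAGCAGGACCGAATGAGCAGT"             # DNA part of YB-Up

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealing_site(top: str, bottom: str) -> dict:
    """Find where `bottom` can base-pair with `top` (both given 5'->3').

    Returns the paired length and the unpaired 5'/3' flanks of `top`.
    Raises ValueError if `bottom` is not fully complementary to any
    contiguous stretch of `top`.
    """
    site = reverse_complement(bottom)
    start = top.find(site)
    if start < 0:
        raise ValueError("no fully complementary site found")
    end = start + len(site)
    return {
        "paired": len(site),
        "top_5p_flank": top[:start],
        "top_3p_flank": top[end:],
    }


if __name__ == "__main__":
    print("Y2 on SEQ ID NO:1   :", annealing_site(SEQ1, SEQ3))
    print("YB-Up on SEQ ID NO:2:", annealing_site(SEQ2, SEQ4))
```

Under these assumptions, Y2 pairs with the first 23 nucleotides of SEQ ID NO:1 and leaves its ten 3' T residues unpaired, while the DNA part of YB-Up pairs with the first 23 nucleotides of SEQ ID NO:2 and leaves a single 3' T unpaired.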

YB-Up and YB-Dn are mixed in equimolar amounts (at a concentration of 10 μM to 100 μM); the mixture is reacted at 50 ℃ for 4 hours, and the ligation product is then separated and purified on a urea-PAGE gel to give the YB-PNA strand.

Preparation of the modified Y-adaptor complex: the three synthetic single strands Y1, Y2 and YB-PNA are annealed at a molar ratio of 1:1.1:1.1 (slow cooling from 95 ℃ to 25 ℃ at a rate not exceeding 0.1 ℃/s). The final annealing system contains 160 mM HEPES (pH 7.0) and 200 mM NaCl, with a final Y1 concentration of 4-8 μM, and yields the Y-adaptor. The Y-adaptor (500 nM) is mixed with a 6-fold molar amount of T4 Dda-M1G/E94C/C109A/C136A/A360C (3 μM) (SEQ ID NO:6) in buffer (100 mM NaAc (pH 7); 1.5 mM TMAD) and incubated for 30 min at room temperature, giving sample 1.
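The quantities in this preparation step are mutually consistent, which the following Python sketch makes explicit. It is only an arithmetic illustration: the function names are hypothetical, and the only inputs are the figures stated above (95 ℃ to 25 ℃ at no more than 0.1 ℃/s, a 1:1.1:1.1 molar ratio, and 500 nM adaptor with 3 μM enzyme).

```python
def min_anneal_seconds(start_c: float, end_c: float, max_rate_c_per_s: float) -> float:
    """Shortest cooling time that still respects the maximum cooling rate."""
    return (start_c - end_c) / max_rate_c_per_s


def strand_concentrations_um(y1_um: float, ratio=(1.0, 1.1, 1.1)) -> dict:
    """Concentrations of Y1, Y2 and YB-PNA implied by the molar ratio."""
    names = ("Y1", "Y2", "YB-PNA")
    return {name: y1_um * r / ratio[0] for name, r in zip(names, ratio)}


def enzyme_fold_excess(enzyme_nm: float, adaptor_nm: float) -> float:
    """Molar excess of enzyme over adaptor."""
    return enzyme_nm / adaptor_nm


if __name__ == "__main__":
    seconds = min_anneal_seconds(95.0, 25.0, 0.1)
    print(f"minimum ramp time: {seconds:.0f} s (about {seconds / 60:.1f} min)")
    for y1 in (4.0, 8.0):
        rounded = {k: round(v, 2) for k, v in strand_concentrations_um(y1).items()}
        print(f"Y1 = {y1} uM ->", rounded)
    print(f"enzyme excess: {enzyme_fold_excess(3000.0, 500.0):.0f}-fold")
```

The cooling step therefore takes at least 700 s (a little under 12 minutes), Y2 and YB-PNA are present at 4.4-8.8 μM when Y1 is at 4-8 μM, and 3 μM enzyme against 500 nM adaptor corresponds to the stated 6-fold amount.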

SEQ ID NO:6

GTFDDLTEGQKNAFNIVMKAIKEKKHHVTINGPAGTGKTTLTKFIIEALISTGETGIILAAPTHAAKKILSKLSGKEASTIHSILKINPVTYECNVLFEQKEVPDLAKARVLICDEVSMYDRKLFKILLSTIPPWATIIGIGDNKQIRPVDPGENTAYISPFFTHKDFYQCELTEVKRSNAPIIDVATDVRNGKWIYDKVVDGHGVRGFTGDTALRDFMVNYFSIVKSLDDLFENRVMAFTNKSVDKLNSIIRKKIFETDKDFIVGEIIVMQEPLFKTYKIDGKPVSEIIFNNGQLVRIIEAEYTSTFVKARGVPGEYLIRHWDLTVETYGDDEYYREKIKIISSDEELYKFNLFLGKTCETYKNWNKGGKAPWSDFWDAKSQFSKVKALPASTFHKAQGMSVDRAFIYTPCIHYADVELAQQLLYVGVTRGRYDVFYV
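The variant name M1G/E94C/C109A/C136A/A360C can be cross-checked against SEQ ID NO:6. The Python sketch below assumes the usual convention that a label such as E94C gives the original residue, a 1-based position in the listed sequence, and the replacement residue; the helper names are hypothetical, and the sequence is SEQ ID NO:6 split into 16-residue chunks to match the sequence listing.

```python
import re

# SEQ ID NO:6 as given above, in 16-residue chunks (last chunk has 7).
DDA_VARIANT = "".join([
    "GTFDDLTEGQKNAFNI", "VMKAIKEKKHHVTING", "PAGTGKTTLTKFIIEA", "LISTGETGIILAAPTH",
    "AAKKILSKLSGKEAST", "IHSILKINPVTYECNV", "LFEQKEVPDLAKARVL", "ICDEVSMYDRKLFKIL",
    "LSTIPPWATIIGIGDN", "KQIRPVDPGENTAYIS", "PFFTHKDFYQCELTEV", "KRSNAPIIDVATDVRN",
    "GKWIYDKVVDGHGVRG", "FTGDTALRDFMVNYFS", "IVKSLDDLFENRVMAF", "TNKSVDKLNSIIRKKI",
    "FETDKDFIVGEIIVMQ", "EPLFKTYKIDGKPVSE", "IIFNNGQLVRIIEAEY", "TSTFVKARGVPGEYLI",
    "RHWDLTVETYGDDEYY", "REKIKIISSDEELYKF", "NLFLGKTCETYKNWNK", "GGKAPWSDFWDAKSQF",
    "SKVKALPASTFHKAQG", "MSVDRAFIYTPCIHYA", "DVELAQQLLYVGVTRG", "RYDVFYV",
])

MUTATIONS = "M1G/E94C/C109A/C136A/A360C"


def parse_mutation(label: str) -> tuple:
    """Split a label such as 'E94C' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def check_mutations(seq: str, labels: str) -> dict:
    """Report, per label, whether `seq` carries the replacement residue."""
    result = {}
    for label in labels.split("/"):
        _original, position, replacement = parse_mutation(label)
        result[label] = seq[position - 1] == replacement  # 1-based position
    return result


if __name__ == "__main__":
    print("length:", len(DDA_VARIANT))
    for label, ok in check_mutations(DDA_VARIANT, MUTATIONS).items():
        print(label, "present" if ok else "NOT present")
```

With the numbering assumption stated above, all five substitutions are found at the named positions of the 439-residue sequence.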

FIG. 2 is a schematic representation of the complex: the Y1 strand comprises a blocking strand S, a polynucleotide strand D' connected to the blocking strand S, and a motor protein stalled on the blocking strand S; the region in which the Y2 strand is complementary to the Y1 strand forms the double-stranded part of the polynucleotide strand L; the YB-Dn strand comprises the modification moiety; the region in which the YB-Up strand is complementary to the Y1 strand forms the double-stranded polynucleotide D; and the YB-Dn strand and the YB-Up strand are connected by click chemistry.

Sample 1 was then purified on a DNAPac PA200 column using the following elution buffers: buffer A (20 mM Na-CHES, 250 mM NaCl, 4% (w/v) glycerol, pH 8.6) and buffer B (20 mM Na-CHES, 1 M NaCl, 4% (w/v) glycerol, pH 8.6). Sample 1 was loaded onto the column, and enzyme not bound to DNA was eluted from the column with buffer A. The enzyme-bound Y-adaptor complex was then eluted with a gradient of 0-100% buffer B over 10 column volumes. The main elution peak was collected and its concentration measured, giving the adaptor complex of the invention.
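To make the elution step concrete, the NaCl concentration reaching the column can be written as a function of the gradient position. The Python sketch below assumes that the 0-100% buffer B gradient is linear over the 10 column volumes, which the text does not state explicitly; the function names are hypothetical, and the buffer concentrations (250 mM and 1 M NaCl) are those given above.

```python
def percent_b(column_volumes: float, gradient_cv: float = 10.0) -> float:
    """Percentage of buffer B after a given number of column volumes.

    Assumes a linear gradient, clamped to the range 0-100 %.
    """
    fraction = min(max(column_volumes / gradient_cv, 0.0), 1.0)
    return 100.0 * fraction


def nacl_mm(pct_b: float, nacl_a_mm: float = 250.0, nacl_b_mm: float = 1000.0) -> float:
    """NaCl concentration (mM) of a mixture of buffer A and buffer B."""
    return nacl_a_mm + (nacl_b_mm - nacl_a_mm) * pct_b / 100.0


if __name__ == "__main__":
    for cv in (0, 2.5, 5, 7.5, 10):
        pct = percent_b(cv)
        print(f"{cv:>4} CV: {pct:5.1f} % B, {nacl_mm(pct):6.1f} mM NaCl")
```

Under the linearity assumption the salt concentration rises by 75 mM per column volume, from 250 mM at the start of the gradient to 1 M at its end; Na-CHES, glycerol and pH are the same in both buffers and so stay constant.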

A 2.7 kb polynucleotide library (i.e., the analyte to be detected) was ligated at both ends to the adaptor complex using T4 ligase and then sequenced on a QNome-9604 nanopore sequencer (Qitan Technology). Sequencing buffer (final concentrations): 10 mM HEPES, 100 mM MgCl₂, 375 mM KCl, 100 mM ATP, pH 7.1; sequencing temperature: 30-40 ℃. The sequencing results are shown in FIG. 4. During sequencing, the 5' end of the library enters the pore, and the helicase translocates in the 5'-3' direction as the strand is sequenced. Passage of the analyte through the nanopore causes the current to change. When the run reaches the 3' end (i.e., the YB-Dn strand region), no current change is produced during translocation because this region is PNA, which is uncharged; the system therefore interprets this as pore blockage and applies a reverse voltage to clear the pore, and the single-stranded library that has been sequenced is ejected from the trans side back to the cis side.
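The behaviour described here, in which an absence of current change is treated as pore blockage and answered with a reverse voltage, can be pictured with a toy controller. The Python sketch below is purely hypothetical: it is not the instrument's algorithm, and the window length, the flatness threshold and the example trace are invented for illustration.

```python
from statistics import pstdev


def first_flat_index(trace_pa, window=5, max_std_pa=0.5):
    """Index of the sample that completes the first 'flat' window.

    A window is flat when the population standard deviation of its
    `window` samples is at most `max_std_pa`. Returns None when no
    window is flat. Both parameters are illustrative assumptions.
    """
    for end in range(window, len(trace_pa) + 1):
        if pstdev(trace_pa[end - window:end]) <= max_std_pa:
            return end - 1
    return None


def polarity_per_sample(trace_pa, **kwargs):
    """Voltage polarity per sample: forward until a flat window is seen,
    reverse for every sample after it."""
    idx = first_flat_index(trace_pa, **kwargs)
    if idx is None:
        return ["forward"] * len(trace_pa)
    return ["forward"] * (idx + 1) + ["reverse"] * (len(trace_pa) - idx - 1)


if __name__ == "__main__":
    # Invented trace (pA): fluctuating signal, then a flat segment.
    trace = [80, 95, 70, 88, 76, 91, 60, 60.1, 59.9, 60, 60.2, 60]
    print("blockage flagged at sample:", first_flat_index(trace))
    print(polarity_per_sample(trace))
```

In this sketch the fluctuating samples stand for a strand being translocated by the helicase, and the flat segment stands for the uncharged 3' region sitting in the pore; once a full window of flat samples has been seen, every later sample is assigned reverse polarity, which corresponds to ejecting the strand back to the cis side.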

Example 2: Preparation and sequencing of another adaptor-enzyme complex of the invention (E-type adaptor)

SEQ ID NO:7 GGTAGTCAGCAGGACCGAATGAGCAGTTT

SEQ ID NO:8 ACTGCTCATTCGGTCCTGCTGAC

E-type adaptor sequences

EA-1: 5'-P-GGTAGTCAGCAGGACCGAATGAGCAGTTT-biotin-3'

EA-2: 5'-ACTGCTCATTCGGTCCTGCTGAC-3'

The 5' end of EA-1 carries a phosphate (P) for ligation to the polynucleotide to be characterized, and the 3' end of EA-1 is labeled with biotin.
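For orientation, the duplex formed by EA-1 and EA-2 can be laid out from the two sequences alone. The Python sketch below is illustrative only: it assumes plain Watson-Crick pairing, ignores the 5' phosphate and the 3' biotin, and uses hypothetical helper names (`duplex_layout`, `render_duplex`).

```python
EA1 = "GGTAGTCAGCAGGACCGAATGAGCAGTTT"  # SEQ ID NO:7, 5'->3'
EA2 = "ACTGCTCATTCGGTCCTGCTGAC"        # SEQ ID NO:8, 5'->3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def duplex_layout(top: str, bottom: str) -> tuple:
    """Return (start, 5' flank, 3' flank) of `top` when `bottom` anneals to it."""
    site = reverse_complement(bottom)
    start = top.find(site)
    if start < 0:
        raise ValueError("bottom strand is not fully complementary to top")
    return start, top[:start], top[start + len(site):]


def render_duplex(top: str, bottom: str) -> str:
    """Two-line text picture: top strand 5'->3', bottom strand 3'->5'."""
    start, _five, three = duplex_layout(top, bottom)
    line1 = "5'-" + top + "-3'"
    line2 = "3'-" + " " * start + bottom[::-1] + " " * len(three) + "-5'"
    return line1 + "\n" + line2


if __name__ == "__main__":
    print(duplex_layout(EA1, EA2))
    print(render_duplex(EA1, EA2))
```

Under these assumptions EA-2 pairs with 23 nucleotides of EA-1, leaving four unpaired nucleotides (GGTA) at the phosphorylated 5' end of EA-1 and two unpaired T residues next to the 3' biotin.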

Preparation of the modified E-type adaptor complex: the two synthetic single strands EA-1 and EA-2 are annealed at a molar ratio of 1:1.1 (slow cooling from 95 ℃ to 25 ℃ at a rate not exceeding 0.1 ℃/s). The final annealing system contains 160 mM HEPES (pH 7.0) and 200 mM NaCl, with a final product concentration of 4-8 μM. Streptavidin is then added at a molar ratio of 1:1 and the mixture is incubated at 30 ℃ for 10 min, so that the streptavidin binds to the biotin at the end of EA-1 to give the E-type adaptor. FIG. 3 is a schematic representation of the complex. A biotin-streptavidin complex is introduced at the sequencing terminus of the target polynucleotide; when sequencing reaches that terminus, the electric field force is insufficient to pull the biotin-streptavidin complex apart, so the pore is blocked and the sequenced strand is ejected.

An asymmetric 2.7 kb polynucleotide library (i.e., the analyte to be detected) was prepared by single-enzyme digestion. The Y-adaptor complex prepared in Example 1 and the E-type adaptor prepared in this example were added to the library for ligation and purification, followed by sequencing on a QNome-9604 nanopore sequencer (Qitan Technology). Sequencing buffer (final concentrations): 10 mM HEPES, 100 mM MgCl₂, 375 mM KCl, 100 mM ATP, pH 7.1; sequencing temperature: 30-40 ℃. The sequencing results are shown in FIG. 5. At the start of sequencing, guided by the complex, the 5' end of the analyte enters the pore, and the helicase translocates in the 5'-3' direction as the strand is sequenced. When the run reaches the 3' end, the biotin-streptavidin at the 3' end cannot cross the nanopore under the electric field force, so pore blockage occurs; the system then applies a reverse voltage to clear the pore, and the single-stranded library that has been sequenced is ejected from the trans side back to the cis side.

Example 3: Preparation and sequencing of another Y adaptor of the invention

The procedure is as in Example 1, except that SEQ ID NO:5 is a polynucleotide whose 5'-terminal phosphate is modified with spermine.

The complex is formed by hybridizing three different strands together.

The first strand (Y-Top-1-NS) comprises, in order, a leader sequence consisting of iSpC3 blocking-strand units (each denoted 3) linked to the 5' end of SEQ ID NO:1; the 3' end of SEQ ID NO:1 is in turn linked through iSp18 blocking-strand units (denoted 8888) to the 5' end of SEQ ID NO:2.

The second strand (Y-Top-2-NS) is as shown in SEQ ID NO:3.

The third strand (Y-Bottom-S): the 5' end of SEQ ID NO:4 carries a phosphate (P) for ligation to the polynucleotide to be characterized, the 3' end of SEQ ID NO:4 is linked to the 5' end of SEQ ID NO:5, and SEQ ID NO:5 carries a cationic oligomer modification, namely a spermine modification.

Y-Top-1-NS: 5'-333333333333333333333333333333GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT-8888-ACTGCTCATTCGGTCCTGCTGACT-3'

Y-Top-2-NS: 5'-CGACTTCTACCGTTTGACTCCGC-3'

Y-Bottom-S: 5'-P-GTCAGCAGGACCGAATGAGCAGTSSSAGTCCAGCACCGACC (S stands for the cationic oligomer spermine)

The sequencing results are shown in FIG. 6. During sequencing, the 5' end of the library enters the pore, and the helicase translocates in the 5'-3' direction as the strand is sequenced. Passage of the analyte through the nanopore causes the current to change. When the run reaches the 3' end (i.e., the region corresponding to the YB-Dn strand), no current change is produced during translocation because this region is positively charged; the system therefore interprets this as pore blockage and applies a reverse voltage to clear the pore, and the single-stranded library that has been sequenced is ejected from the trans side back to the cis side.

In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.

It should be understood that, in the embodiments of the present invention, "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Sequence listing

<110> Chengdu Qitan Technology Ltd

<120> method for characterizing a polynucleotide of interest and adaptors

<160> 8

<170> SIPOSequenceListing 1.0

<210> 1

<211> 33

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

gcggagtcaa acggtagaag tcgttttttt ttt 33

<210> 2

<211> 24

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

actgctcatt cggtcctgct gact 24

<210> 3

<211> 23

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

cgacttctac cgtttgactc cgc 23

<210> 4

<211> 23

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

gtcagcagga ccgaatgagc agt 23

<210> 5

<211> 15

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

agtccagcac cgacc 15

<210> 6

<211> 439

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 6

Gly Thr Phe Asp Asp Leu Thr Glu Gly Gln Lys Asn Ala Phe Asn Ile

1 5 10 15

Val Met Lys Ala Ile Lys Glu Lys Lys His His Val Thr Ile Asn Gly

20 25 30

Pro Ala Gly Thr Gly Lys Thr Thr Leu Thr Lys Phe Ile Ile Glu Ala

35 40 45

Leu Ile Ser Thr Gly Glu Thr Gly Ile Ile Leu Ala Ala Pro Thr His

50 55 60

Ala Ala Lys Lys Ile Leu Ser Lys Leu Ser Gly Lys Glu Ala Ser Thr

65 70 75 80

Ile His Ser Ile Leu Lys Ile Asn Pro Val Thr Tyr Glu Cys Asn Val

85 90 95

Leu Phe Glu Gln Lys Glu Val Pro Asp Leu Ala Lys Ala Arg Val Leu

100 105 110

Ile Cys Asp Glu Val Ser Met Tyr Asp Arg Lys Leu Phe Lys Ile Leu

115 120 125

Leu Ser Thr Ile Pro Pro Trp Ala Thr Ile Ile Gly Ile Gly Asp Asn

130 135 140

Lys Gln Ile Arg Pro Val Asp Pro Gly Glu Asn Thr Ala Tyr Ile Ser

145 150 155 160

Pro Phe Phe Thr His Lys Asp Phe Tyr Gln Cys Glu Leu Thr Glu Val

165 170 175

Lys Arg Ser Asn Ala Pro Ile Ile Asp Val Ala Thr Asp Val Arg Asn

180 185 190

Gly Lys Trp Ile Tyr Asp Lys Val Val Asp Gly His Gly Val Arg Gly

195 200 205

Phe Thr Gly Asp Thr Ala Leu Arg Asp Phe Met Val Asn Tyr Phe Ser

210 215 220

Ile Val Lys Ser Leu Asp Asp Leu Phe Glu Asn Arg Val Met Ala Phe

225 230 235 240

Thr Asn Lys Ser Val Asp Lys Leu Asn Ser Ile Ile Arg Lys Lys Ile

245 250 255

Phe Glu Thr Asp Lys Asp Phe Ile Val Gly Glu Ile Ile Val Met Gln

260 265 270

Glu Pro Leu Phe Lys Thr Tyr Lys Ile Asp Gly Lys Pro Val Ser Glu

275 280 285

Ile Ile Phe Asn Asn Gly Gln Leu Val Arg Ile Ile Glu Ala Glu Tyr

290 295 300

Thr Ser Thr Phe Val Lys Ala Arg Gly Val Pro Gly Glu Tyr Leu Ile

305 310 315 320

Arg His Trp Asp Leu Thr Val Glu Thr Tyr Gly Asp Asp Glu Tyr Tyr

325 330 335

Arg Glu Lys Ile Lys Ile Ile Ser Ser Asp Glu Glu Leu Tyr Lys Phe

340 345 350

Asn Leu Phe Leu Gly Lys Thr Cys Glu Thr Tyr Lys Asn Trp Asn Lys

355 360 365

Gly Gly Lys Ala Pro Trp Ser Asp Phe Trp Asp Ala Lys Ser Gln Phe

370 375 380

Ser Lys Val Lys Ala Leu Pro Ala Ser Thr Phe His Lys Ala Gln Gly

385 390 395 400

Met Ser Val Asp Arg Ala Phe Ile Tyr Thr Pro Cys Ile His Tyr Ala

405 410 415

Asp Val Glu Leu Ala Gln Gln Leu Leu Tyr Val Gly Val Thr Arg Gly

420 425 430

Arg Tyr Asp Val Phe Tyr Val

435

<210> 7

<211> 29

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

ggtagtcagc aggaccgaat gagcagttt 29

<210> 8

<211> 23

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 8

actgctcatt cggtcctgct gac 23

Claims

1. A method of characterizing a target polynucleotide, comprising:

(a) moving the target polynucleotide through the nanopore,

wherein the sequencing terminus of the target polynucleotide comprises a modification that occludes the nanopore;

2. The method of claim 1, wherein,

3. The method of claim 1 or 2, further comprising:

4. The method of claim 3, wherein step (c) does not include:

(ii) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the target polynucleotide, and thereby characterising the target polynucleotide; and/or

In step (c), said moving of said target polynucleotide in the opposite direction relative to said pore is achieved by at least one of: applying a reverse voltage, pulling with an atomic force microscope, and/or pulling by a reverse-acting enzyme that moves the target polynucleotide backwards relative to the nanopore.

5. The method of any one of claims 1 to 4, wherein step (a) comprises:

attaching an adaptor comprising the modification moiety to the target polynucleotide such that the sequencing terminus of the target polynucleotide comprises the modification moiety; and/or

the modification moiety comprises a moiety whose side chain is uncharged or positively charged; and/or

the modification moiety with an uncharged side chain comprises any one of, or a combination of two or more of, PNA, a polypeptide, and a nucleotide with an alkylated phosphate backbone; and/or

the cationic oligomer comprises any one of, or a combination of two or more of, spermine, spermidine, and putrescine; and/or

the modification moiety comprises a pair of binding partners that bind to each other, the binding partners comprising streptavidin and biotin, and/or

an antigen and an antibody.

6. The method of claim 5, wherein the adaptor is a Y-adaptor comprising a double-stranded region and at least one single-stranded region, or

an E-type adaptor comprising a double-stranded region and no single-stranded region.

7. The method of claim 6, wherein the adaptor is a Y-adaptor, and the modification moiety of the Y-adaptor is located in, or forms, the overhang portion of the Y-adaptor;

and/or, the modification moiety is covalently attached to the Y-adaptor, or the modification moiety is attached to the Y-adaptor by a click chemistry reaction.

8. The method of claim 6, wherein the adaptor is an E-type adaptor and the modification moiety is located at the end that is not attached to the target polynucleotide.

9. The method of any one of claims 1 to 8, wherein the adaptor comprises a blocking strand for stalling a motor protein, the blocking strand having a structure different from that of the polynucleotide.

10. An adaptor for characterising a target polynucleotide, characterised in that the adaptor comprises a modification moiety for binding to a sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage.

11. A construct for characterising a polynucleotide of interest, the construct comprising an adaptor, and a polynucleotide of interest;

the adaptor comprises a modification moiety that binds to a sequencing terminus of the polynucleotide of interest, the modification moiety being capable of causing nanopore blockage;

the target polynucleotide is a double-stranded polynucleotide.

12. A complex for characterising a polynucleotide of interest, the complex comprising an adaptor or construct, and a motor protein, wherein,

the construct comprising the adaptor and a polynucleotide of interest,

the motor protein is a protein capable of binding to the target polynucleotide and controlling its movement through the pore.

13. A complex according to claim 12, wherein the motor protein is selected from one or more of a polymerase, exonuclease, helicase and topoisomerase;

the helicase is selected from one or more of Hel308 helicase, RecD helicase, TraI helicase, TrwC helicase, XPD helicase and Dda helicase.

14. A kit for nanopore characterization of a target polynucleotide, the kit comprising:

1) an adaptor comprising a modification moiety for binding to a sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage; and

2) a motor protein.