WO2025149478A1 - Compositions of modified nucleoside triphosphates - Google Patents

Compositions of modified nucleoside triphosphates

Info

Publication number
WO2025149478A1
Authority
WO
WIPO (PCT)
Prior art keywords
composition
nucleic acid
complementary strand
group
nucleoside triphosphates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2025/050239
Other languages
French (fr)
Inventor
Drew GOODMAN
Aaron Jacobs
Mark Stamatios Kokoris
Tylor LEHMANN
Matthew Lopez
Melud Nabavi
Dylan O'CONNELL
John C. Tabone
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
F Hoffmann La Roche AG
Roche Sequencing Solutions Inc
Original Assignee
F Hoffmann La Roche AG
Roche Sequencing Solutions Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by F Hoffmann La Roche AG and Roche Sequencing Solutions Inc
Publication of WO2025149478A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00: Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H19/00: Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof
    • C07H19/02: Compounds containing a hetero ring sharing one ring hetero atom with a saccharide radical; Nucleosides; Mononucleotides; Anhydro-derivatives thereof sharing nitrogen
    • C07H19/04: Heterocyclic radicals containing only nitrogen atoms as ring hetero atom
    • C07H19/06: Pyrimidine radicals
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00: Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids

Definitions

  • the present invention relates to compositions comprising an excess of a specific diastereomer of nucleoside triphosphates with a 5’ phosphoramidate.
  • the invention also relates to methods for generating a complementary strand and for sequencing using the compositions.
  • Xpandomer synthesis is based on the natural function of DNA replication where expandable nucleoside triphosphates (XNTPs) act as substrates for replication.
  • XNTPs: expandable nucleoside triphosphates
  • Modified nucleoside triphosphates with two clickable (e.g. terminal alkyne) groups such as dNTP-2c are used as building blocks for reagents in nanopore sequencing, especially within the technology of sequencing by expansion.
  • building blocks are typically clicked to tethers that contain reporter and translocation control elements in order to generate XNTPs.
  • Structures and processes of sequencing by expansion and reagents used therein are disclosed in WO 2016/081871, WO 2020/236526 and WO 2020/172479.
  • The synthesis of dNTP-2c has previously been conducted by solid-phase synthesis using commercially available DNA/RNA-synthesizers with a proprietary synthetic method. Since the α-phosphoramidate in dNTP-2c (and XNTPs derived therefrom) is chiral, dNTP-2c (and XNTPs derived therefrom) are obtained as 1:1 diastereomeric mixtures of two isomers in this process.
  • compositions comprising an excess of the active isomer over the inactive isomer (e.g. in which at least 80%, preferably at least 90% or even 100% of the nucleoside triphosphates represents the active isomer) thus enables efficient and cost-effective Xpandomer production for sequencing by expansion. Based on these findings, it is desirable to remove the inactive isomer from the final solution used in Xpandomer synthesis, or even to avoid its production in the first place.
  • the disclosure thus provides a composition comprising a nucleoside triphosphate with a chiral 5’ phosphoramidate, wherein one isomer is present in excess over the other.
  • Exemplary embodiments of the disclosure are as follows:
  • a composition comprising nucleoside triphosphates having the structure: wherein NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; and R4 comprises or consists of a hydrocarbon; G1 and G2 independently represent terminal clickable groups; L1 and L2 independently represent linking groups; and T is a tether molecule; wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
  • composition of item 1 wherein at least 90% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
  • composition of item 1 or 2 wherein 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
  • composition of any one of the preceding items comprising a mixture of four different types of nucleoside triphosphates.
  • composition of item 7, wherein the four different types of nucleoside triphosphates comprise four different types of nucleobases.
  • composition of item 7 or 8 wherein the four different types of nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively.
  • composition of any one of items 7-9, wherein the four different types of nucleoside triphosphate comprise the four different types of nucleobases of item 6, respectively.
  • composition of any one of the preceding items, wherein R1 is linear.
  • R1 comprises 1-20 carbon atoms, such as 1-10 carbon atoms or 5-10 carbon atoms, such as 6 or 8 carbon atoms.
  • R4 comprises 1-20 carbon atoms, such as 3-15 carbon atoms, 3-10 carbon atoms, such as 4 carbon atoms.
  • composition of any one of the preceding items wherein each of L1 and L2 is a 1,2,3-triazole.
  • composition of any one of the preceding items, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
  • composition of any one of items 1-29, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
  • the method of any one of items 36-40, comprising hybridizing a primer to the nucleic acid, at the same time as or followed by contacting the nucleic acid with the composition.
  • a method for determining the sequence of a nucleic acid comprising the following steps in order:
  • nanopore inserted into the barrier, wherein the nanopore has an entrance side on the cis side of the barrier and an exit side on the trans side of the barrier;
  • composition can thus further comprise additional reagents for complementary strand synthesis.
  • the composition can further comprise a nucleic acid polymerase (for details see section IV.).
  • the composition can further comprise a buffering agent, such as TrisCl.
  • the composition further comprises at least one of (including each of) TrisOAc, NH4OAc, PEG, water-miscible organic solvent, such as DMF or NMP, polyphosphate 60, N-methyl succinimide (NMS), and MnCl2, a single-strand binding protein (SSB), and urea.
  • the SSB can be Kod SSB (from Thermococcus kodakarensis), for example.
  • the composition may further comprise a polymerase-enhancing molecule (PEM), such as described in EP 3 735 409 B1.
  • PEM: polymerase-enhancing molecule
  • a reaction solution will also typically comprise at least one nucleic acid.
  • the composition typically comprises four different types of the nucleoside triphosphate, wherein the four different types of the nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively.
  • the reaction can look as follows: wherein G1 and G2 independently represent terminal clickable groups, G1a and G2a independently represent terminal clickable groups, L1 and L2 represent linking groups formed by reacting G1 and G2 with G1a and G2a, respectively, and T, NB, R1, R2, R3, and R4 are as defined above.
  • the disclosure also provides a method for generating a complementary strand to a nucleic acid, comprising contacting the nucleic acid with a composition comprising a nucleoside triphosphate as disclosed herein.
  • the complementary strand generated by such method is typically an Xpandomer (in constrained configuration).
  • the type of nucleic acid is not particularly limited, and includes DNA or RNA.
  • the nucleic acid is a DNA, such as a genomic DNA or a cDNA.
  • the DNA is cell-free DNA.
  • the nucleic acid can be part of a library of nucleic acids.
  • the library can be a library of genomic DNA or cDNA.
  • the nucleic acid can also be denatured, e.g. to facilitate primer hybridization.
  • Means for denaturation are not particularly limited, and include e.g. applying heat (e.g. 90°C-100°C).
  • the method can comprise denaturing the nucleic acid and then hybridizing a primer to the nucleic acid, followed by contacting the nucleic acid with a composition comprising a nucleoside triphosphate as disclosed herein.
  • the complementary strand is generated by using a polymerase, such as a (DNA-dependent) DNA polymerase.
  • the composition used preferably further comprises a nucleic acid polymerase.
  • Polymerases will typically be used comprising mutations that sterically allow the use of XNTPs as substrates.
  • a suitable class of polymerases for incorporating XNTPs includes the translesion DNA polymerase (i.e. class Y polymerase) family that includes e.g. the DPO4 polymerase.
  • Translesion DNA polymerases exhibit a more flexible substrate recognition than conventional (e.g. replication) polymerases owing to their relatively large substrate binding sites, which have evolved to accommodate naturally occurring, bulky DNA lesions.
  • the P-N bond can be selectively cleaved in step 2) under acidic conditions, for example. This can be achieved by addition of an acid, such as DCl. Cleavage in step 2) typically yields an Xpandomer in expanded configuration. The product of step 2) can optionally be purified before proceeding with step 3).
  • the cis well can be perfused with a buffer containing 0.4 M NH4Cl, 600 mM GuanCl, 100 mM HEPES; pH 7.4, and 5% glycerol and the trans well can be perfused with buffer containing 0.4 M NH4Cl, 600 mM GuanCl, 5% ethyl acetate, 10 mM HEPES; pH 7.4, before introducing the Xpandomer to the cis side for sequencing.
  • Racemic mixtures comprising nucleoside triphosphates with two clickable groups as disclosed herein were produced by the methods disclosed in WO 2016/081871.
  • Four separate racemic mixtures were produced for four different types of nucleoside triphosphates with four different nucleobases corresponding to C, T, A and G, respectively, each one with -R1-G1 on the nucleobase and -R4-G2 on the α-phosphoramidate as disclosed herein.
  • the two isomers were then separated from one another by preparative HPLC.
  • Exemplary HPLC chromatograms for the four different types of nucleoside triphosphates are shown in Fig. 1.
  • the chromatogram shows two distinct peaks for the active and inactive isomers of the four different types of nucleoside triphosphates. The two isomers were separated from one another by collecting suitable fractions from the two peaks. This allows the production of compositions in which one of the two isomers is present in excess.
  • Example 2 Xpandomer synthesis and sequencing using active vs. inactive isomers
  • the 50 µl extension reaction includes the following reagents: 50 mM TrisCl, pH 8.84, 200 mM NH4OAc, 50 mM GuCl, 20% PEG8K, 10% N-methylpyrrolidone (NMP), 15 nmol polyphosphate PP-60.23, 2.5 µg Kod single-strand binding protein (SSB), 0.1 M urea, 15 mM PEM additive and 13 µg purified recombinant DNA polymerase C4760 (SEQ ID NO: 2, a variant of DPO4 polymerase; other suitable variants include SEQ ID NOs: 1 and 3-5).
  • the extension reaction is run for 60 minutes at 37°C.
  • Xpandomer products are next sequenced using the SBX protocol. Briefly, the constrained Xpandomer products are washed in buffer B.064 (1% Tween-20/3% SDS/5 mM HEPES, pH 8.0/100 mM NaPO4/15% DMF) and cleaved to generate linearized Xpandomer by adding 200 µl buffer C.001 (7.5 M DCl) and incubating for 30 minutes at 23°C. The sample is then neutralized by adding 2000 µl buffer B.064 and incubating for 2 min at RT. The Xpandomer sample is then subjected to amine modification by adding 500 pmol succinate anhydride in buffer B.064 and incubating for 5 minutes at 23°C. The sample is then washed in buffer D.102 (50% ACN) and the Xpandomers are released from the substrate by photocleavage and eluted in 60 µl elution buffer.
  • Protein nanopores are prepared by inserting α-hemolysin into a DPhPE/hexadecane bilayer membrane in a buffer of 2 M NH4Cl and 100 mM HEPES, pH 7.4.
  • the cis well is perfused with buffer AG242 containing 0.4 M NH4Cl, 600 mM GuanCl, 100 mM HEPES; pH 7.4, and 5% glycerol and the trans well is perfused with buffer AB080 containing 0.4 M NH4Cl, 600 mM GuanCl, 5% ethyl acetate, 10 mM HEPES; pH 7.4.
  • the use of the active isomer achieved a higher percentage of full-length Xpandomers compared to a mixture of both isomers. This was even the case when the concentration of the active isomer was the same in the mixture as in the diastereomerically pure preparation: While 100 µM of pure active isomers yielded 40% of full-length product, the 90:10 mixture at 111 µM (comprising 100 µM active and 11 µM inactive isomers) or the 75:25 mixture at 133 µM (comprising 100 µM active and 33 µM inactive isomers) only yielded 37% and 32% of full-length product, respectively.
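
The concentration figures in the last item follow from simple proportion: if the active isomer is held at a fixed concentration while its share of the mixture drops, the total concentration must rise. A minimal Python sketch of that arithmetic is given below. The function name `mixture_for_target_active` is hypothetical and not part of the disclosure, and concentrations come back in whatever unit is passed in.

```python
def mixture_for_target_active(active_conc: float, active_pct: float) -> tuple[float, float]:
    """Return (total, inactive) concentrations for a two-isomer mixture.

    active_conc: desired concentration of the active isomer (any unit).
    active_pct:  share of the active isomer in the mixture, in percent.
    Hypothetical helper for illustration only.
    """
    if not 0 < active_pct <= 100:
        raise ValueError("active_pct must be in the range (0, 100]")
    total = active_conc * 100.0 / active_pct
    return total, total - active_conc


if __name__ == "__main__":
    # Ratios taken from the comparison described above (pure, 90:10, 75:25).
    for pct in (100, 90, 75):
        total, inactive = mixture_for_target_active(100.0, pct)
        print(f"{pct}% active: total {total:.0f}, inactive {inactive:.0f}")
```

Rounded to whole numbers, the 90:10 and 75:25 cases give totals of 111 and 133 with 11 and 33 of inactive isomer, matching the figures quoted in the comparison.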

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Saccharide Compounds (AREA)

Abstract

The invention relates to a diastereomer of nucleoside triphosphates suitable for use in sequencing by expansion. The diastereomer provides a better acceptance and incorporation by a DNA polymerase and better performance in sequencing by expansion workflows. The invention also relates to sequencing methods using the diastereomer of the nucleoside triphosphates.

Description

COMPOSITIONS OF MODIFIED NUCLEOSIDE TRIPHOSPHATES
SEQUENCE LISTING INCORPORATION BY REFERENCE
[0001] This application hereby incorporates by reference a sequence listing submitted herewith in a computer-readable format.
FIELD OF THE INVENTION
[0002] The present invention relates to compositions comprising an excess of a specific diastereomer of nucleoside triphosphates with a 5’ phosphoramidate. The invention also relates to methods for generating a complementary strand and for sequencing using the compositions.
BACKGROUND
[0003] Over the last two decades, biological membranes have emerged as an important tool in a variety of biomedical applications. This includes the use of lipid bilayer membranes in nanopore based sequencing applications, where nanopores provide a constant and reproducible physical aperture, through which a target molecule can be directed and sequenced.
[0004] One approach for nanopore-based sequencing of, for example, nucleic acids involves a sequencing-by-expansion approach by transcribing the sequence of nucleic acids into a simple to measure polymer molecule called an Xpandomer. Much like with polymerase chain reaction (PCR), Xpandomer synthesis is based on the natural function of DNA replication where expandable nucleoside triphosphates (XNTPs) act as substrates for replication.
[0005] Xpandomer synthesis is based on four easily differentiated XNTPs that include High Signal-to-Noise Reporters, one for each DNA base. Engineered polymerases incorporate these modified nucleotides into Xpandomers, producing a copy of the target nucleic acid template from the library. As the Xpandomer molecule transits through the nanopore, the distinct electrical signal of each base reporter is easily identifiable to enable highly accurate and high throughput nanopore-based nucleic acid sequencing. See, e.g., U.S. Pat. No. 7,939,259, titled “High Throughput Nucleic Acid Sequencing by Expansion”; and PCT publication WO 2020/236526 A1, titled “Translocation control elements, reporter codes, and further means for translocation control for use in nanopore sequencing”, both of which are hereby incorporated herein in their entirety.
[0006] Modified nucleoside triphosphates with two clickable (e.g. terminal alkyne) groups, such as dNTP-2c, are used as building blocks for reagents in nanopore sequencing, especially within the technology of sequencing by expansion. Within this technology, such building blocks are typically clicked to tethers that contain reporter and translocation control elements in order to generate XNTPs. Structures and processes of sequencing by expansion and reagents used therein are disclosed in WO 2016/081871, WO 2020/236526 and WO 2020/172479.
[0007] The synthesis of dNTP-2c has previously been conducted by solid-phase synthesis using commercially available DNA/RNA-synthesizers with a proprietary synthetic method. Since the α-phosphoramidate in dNTP-2c (and XNTPs derived therefrom) is chiral, dNTP-2c (and XNTPs derived therefrom) are obtained as 1:1 diastereomeric mixtures of two isomers in this process.
SUMMARY OF THE INVENTION
[0008] The disclosure relates to a nucleoside triphosphate with a 5’ phosphoramidate that may be useful in the field of sequencing by expansion. The α-phosphoramidate in such nucleoside triphosphates is chiral and can have two distinct stereoconfigurations:
[0009] The present inventors have surprisingly found that one stereoconfiguration (“active” isomer) of XNTPs provides the desired functional performance in sequencing by expansion due to better polymerase incorporation and Xpandomer production compared to the other, “inactive” isomer or a mixture of both. As shown in the Examples, only the active isomer allows generation of full-length complementary strand (Xpandomer) product. Surprisingly, the inactive isomer not only does not yield any full-length product, its presence in a mixture with the active isomer demonstrates a negative impact on the yield of full-length product.
[0010] The specific use of a composition comprising an excess of the active isomer over the inactive isomer (e.g. in which at least 80%, preferably at least 90% or even 100% of the nucleoside triphosphates represents the active isomer) thus enables efficient and cost-effective Xpandomer production for sequencing by expansion. Based on these findings, it is desirable to remove the inactive isomer from the final solution used in Xpandomer synthesis, or even to avoid its production in the first place. The disclosure thus provides a composition comprising a nucleoside triphosphate with a chiral 5’ phosphoramidate, wherein one isomer is present in excess over the other.
[0011] Exemplary embodiments of the disclosure are as follows:
[0012] 1. A composition comprising nucleoside triphosphates having the structure: wherein NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; and R4 comprises or consists of a hydrocarbon; G1 and G2 independently represent terminal clickable groups; L1 and L2 independently represent linking groups; and T is a tether molecule; wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
[0013] 2. The composition of item 1, wherein at least 90% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
[0014] 3. The composition of item 1 or 2, wherein 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
[Structure image not reproduced: the α-phosphoramidate (O-P-N linkage) carrying the -R4-G2 substituent]
[0015] 4. The composition of any one of the preceding items, wherein NB is selected from cytosine, thymine, 7-deazaadenine and 7-deazaguanine.
[0016] 5. The composition of any one of the preceding items, wherein R1 is attached to position 5 of the nucleobase when the nucleobase is a pyrimidine nucleobase, and to position 7 of the nucleobase when the nucleobase is a purine nucleobase.
[0017] 6. The composition of any one of the preceding items, wherein NB has one of the following structures:
[0018] 7. The composition of any one of the preceding items, comprising a mixture of four different types of nucleoside triphosphates.
[0019] 8. The composition of item 7, wherein the four different types of nucleoside triphosphates comprise four different types of nucleobases.
[0020] 9. The composition of item 7 or 8, wherein the four different types of nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively.
[0021] 10. The composition of any one of items 7-9, wherein the four different types of nucleoside triphosphate comprise the four different types of nucleobases of item 6, respectively.
[0022] 11. The composition of any one of the preceding items, wherein R1 comprises or consists of an unsaturated hydrocarbon.
[0023] 12. The composition of any one of the preceding items, wherein R1 consists of a hydrocarbon, such as an alkynyl.
[0024] 13. The composition of any one of the preceding items, wherein R1 is acyclic.
[0025] 14. The composition of any one of the preceding items, wherein R1 is linear.
[0026] 15. The composition of any one of the preceding items, wherein R1 comprises 1-20 carbon atoms, such as 1-10 carbon atoms or 5-10 carbon atoms, such as 6 or 8 carbon atoms.
[0027] 16. The composition of any one of the preceding items, wherein R1 is a hexa-1-ynyl or octa-1-ynyl group.
[0028] 17. The composition of any one of the preceding items, wherein R1 with G1 is an octa-1,7-diynyl or a deca-1,9-diynyl group.
[0029] 18. The composition of any one of the preceding items, wherein both R2 are H.
[0030] 19. The composition of any one of the preceding items, wherein R3 is H.
[0031] 20. The composition of any one of the preceding items, wherein R4 comprises or consists of a saturated hydrocarbon.
[0032] 21. The composition of any one of the preceding items, wherein R4 comprises 1-20 carbon atoms, such as 3-15 carbon atoms, 3-10 carbon atoms, such as 4 carbon atoms.
[0033] 22. The composition of any one of the preceding items, wherein R4 is acyclic.
[0034] 23. The composition of any one of the preceding items, wherein R4 is linear.
[0035] 24. The composition of any one of the preceding items, wherein R4 consists of a hydrocarbon.
[0036] 25. The composition of any one of the preceding items, wherein R4 is an n-butyl group.
[0037] 26. The composition of any one of the preceding items, wherein R4 with G2 is a hex-5-ynyl group.
[0038] 27. The composition of any one of items 1-23, wherein R4 comprises or consists of two or more hydrocarbons that are linked by an atom or group of atoms other than carbon, such as a phosphorus atom and/or an oxygen atom.
[0039] 28. The composition of any one of the preceding items, wherein the terminal clickable group is a terminal alkyne group or a terminal azide group, preferably a terminal alkyne group.
[0040] 29. The composition of any one of the preceding items, wherein G1 and G2 represent the same type of terminal clickable group.
[0041] 30. The composition of any one of the preceding items, wherein L1 and L2 independently represent linking groups formed via click reactions.
[0042] 31. The composition of any one of the preceding items, wherein each of L1 and L2 is a 1,2,3-triazole.
[0043] 32. The composition of any one of the preceding items, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
[0044] 33. The composition of any one of the preceding items, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
[0045] 34. The composition of any one of items 1-29, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
[0046] 35. The composition of any one of items 1-29, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
[0047] 36. A method for generating a complementary strand to a nucleic acid, comprising contacting the nucleic acid with the composition of any one of items 1-33.
[0048] 37. The method of item 36, wherein the nucleic acid is comprised in a library of nucleic acids.
[0049] 38. The method of item 36 or 37, wherein the nucleic acid is a DNA.
[0050] 39. The method of any one of items 36-38, wherein the composition further comprises a nucleic acid polymerase.
[0051] 40. The method of any one of items 36-39, wherein the composition further comprises a buffering agent, such as TrisCl, and/or a polymerase cofactor, such as MnCl2.
[0052] 41. The method of any one of items 36-40, comprising hybridizing a primer to the nucleic acid, at the same time as or followed by contacting the nucleic acid with the composition.
[0053] 42. A method for determining the sequence of a nucleic acid, comprising the following steps in order:
1) Generating a complementary strand to the nucleic acid by the method of any one of items 36-41;
2) Selectively cleaving the P-N bond within the nucleoside triphosphate to generate an expanded complementary strand;
3) Sequencing the expanded complementary strand;
4) Determining the sequence of the nucleic acid based on the sequence of the expanded complementary strand.
[0054] 43. The method of item 42, wherein the P-N bond is selectively cleaved in step 2) under acidic conditions.
[0055] 44. The method of item 42 or 43, wherein the expanded complementary strand is sequenced in step 3) by nanopore-based sequencing.
[0056] 45. The method of item 44, wherein the nanopore-based sequencing comprises:
(a) providing a chip for nanopore-based sequencing comprising:
(i) an electrochemically resistive barrier disposed over an aperture on a surface of the chip, wherein the barrier separates a cis side from a trans side;
(ii) a nanopore inserted into the barrier, wherein the nanopore has an entrance side on the cis side of the barrier and an exit side on the trans side of the barrier;
(b) contacting the cis side of the barrier with the expanded complementary strand;
(c) applying a voltage across the barrier of the chip to translocate the expanded complementary strand to the trans side;
(d) determining one or more changes in an electrical characteristic of the nanopore associated with occupation of the nanopore by the expanded complementary strand during the translocation; and
(e) determining, based on the one or more changes in the electrical characteristic of the nanopore, a sequence for the expanded complementary strand.
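
Steps (d) and (e) can be illustrated with a toy example. The Python sketch below is not the base-calling method of the disclosure. It assumes, purely for illustration, that each reporter holds the pore current at a distinct, steady level, and the level table, jump threshold and function names are all hypothetical.

```python
from statistics import mean


def segment_levels(trace: list[float], min_jump: float) -> list[float]:
    """Split a current trace into steady runs and return the mean of each run.

    A new run starts when a sample differs from the running mean of the
    current run by more than min_jump (a hypothetical threshold).
    """
    if not trace:
        return []
    levels: list[float] = []
    run = [trace[0]]
    for sample in trace[1:]:
        if abs(sample - mean(run)) > min_jump:
            levels.append(mean(run))  # close the finished run
            run = [sample]
        else:
            run.append(sample)
    levels.append(mean(run))
    return levels


def call_bases(levels: list[float], reference_levels: dict[str, float]) -> str:
    """Assign each level to the base whose reference level is closest."""
    return "".join(
        min(reference_levels, key=lambda base: abs(reference_levels[base] - level))
        for level in levels
    )


# Hypothetical reference levels in arbitrary units, one per reporter.
REFERENCE = {"A": 10.0, "C": 20.0, "G": 30.0, "T": 40.0}

if __name__ == "__main__":
    trace = [10.1, 9.9, 10.0, 30.2, 29.8, 20.1, 19.9, 20.0, 40.0, 40.1]
    print(call_bases(segment_levels(trace, min_jump=5.0), REFERENCE))
```

A change-based segmentation like this one cannot separate two consecutive reporters of the same type, since no jump occurs between them; a practical scheme would need additional information, such as dwell times or separator signals.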
BRIEF DESCRIPTION OF THE DRAWINGS
[0057] FIG. 1: HPLC chromatograms showing two separate peaks for the active vs. inactive diastereomers for four different dNTP-2c molecules.
[0058] FIG. 2: Graphical representation of data selected from Table 2. Ratio: the ratio of active vs. inactive isomer that is present during Xpandomer synthesis; % Full-length: percentage of full-length complementary strand among all complementary strand products detected.
[0059] FIG. 3: Gel-electrophoresis after Xpandomer synthesis using active or inactive isomer. A full-length Xpandomer product is obtained using the active isomer (lane 37), whereas no full-length Xpandomer product is obtained using the inactive isomer (lane 38).
DETAILED DESCRIPTION OF THE INVENTION
[0060] The invention will now be described in detail by way of reference only using the following definitions and examples. All patents and publications, including all sequences disclosed within such patents and publications, referred to herein are expressly incorporated by reference.
[0061] Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton (Singleton et al., Dictionary of microbiology and molecular biology, 2nd ed., 1994, John Wiley and Sons, New York), Hale (Hale and Marham, The Harper Collins dictionary of biology, 1991, Harper Perennial, NY) and Walker (Walker and Cox, The Language of Biotechnology: A Dictionary of Terms. 1988, American Chemical Society, Washington, D.C. ISBN-0-8412-1499-1) provide one of skill with a general dictionary of many of the terms used in this invention. Practitioners are particularly directed to Sambrook (Sambrook et al., Molecular cloning: A laboratory manual, 1989, Cold Spring Harbor Laboratory Press), and Ausubel (Ausubel et al., Current protocols in molecular biology, 1993, John Wiley & Sons, Inc.), for definitions and terms of the art. It is to be understood that this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary.
[0062] As used herein, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.
[0063] In the structures shown herein, when not all natural valencies of an atom are filled by named groups, it should be understood that the unfilled valencies are filled by hydrogen. When a wavy line in a structure intersects a bond, then the intersected bond is the location where the structure joins to the remainder of a molecule.
[0064] When a structure depicts a molecule with one or more negatively charged oxygens, the structure likewise encompasses the molecule with the oxygen(s) in conjunction with H+ and/or any organic or inorganic cations. When a structure depicts a molecule with one or more hydroxyl groups, the structure likewise encompasses the molecule with the oxygen(s) from the hydroxyl group(s) in conjunction with H+ and/or any organic or inorganic cations.
[0065] Reference throughout this specification to "one embodiment" or "an embodiment" and variations thereof means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[0066] Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
[0067] The headings provided herein are not limitations of the various aspects or embodiments of the invention, which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole.
I. Terms
[0068] Percent identity: The term “% identity” in the context of nucleic acid or amino acid sequences refers to the level of sequence identity between a nucleic acid sequence and a reference nucleic acid sequence or between an amino acid sequence and a reference amino acid sequence, when aligned using a sequence alignment program. For example, as used herein, 80% identity indicates that a sequence has greater than 80% sequence identity over a length of the reference sequence. Exemplary levels of sequence identity include, but are not limited to, 80% or more, 85% or more, 90% or more, 95% or more, and 98% or more sequence identity to a reference sequence, e.g., the wildtype sequence for any one of the polypeptides described herein. Exemplary computer programs which can be used to determine identity between two sequences include, but are not limited to, the suite of BLAST programs, e.g., BLASTN, BLASTX, and TBLASTX, BLASTP and TBLASTN, publicly available on the Internet. See also, Altschul I and Altschul II. Sequence searches are typically carried out using the BLASTN program when evaluating a given nucleic acid sequence relative to nucleic acid sequences in the GenBank DNA Sequences and other public databases. The BLASTX program may be used for searching nucleic acid sequences that have been translated in all reading frames against amino acid sequences in the GenBank Protein Sequences and other public databases. The BLASTP program may be used for searching amino acid sequences against amino acid sequences in the GenBank Protein Sequences and other public databases. All of BLASTN, BLASTX and BLASTP are run using default parameters of an open gap penalty of 11.0, and an extended gap penalty of 1.0, and utilize the BLOSUM-62 matrix. (See, e.g., Altschul II).
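By way of illustration only, the notion of “% identity over a length of the reference sequence” can be sketched as follows. The sketch assumes that the two sequences have already been aligned by an alignment program and are supplied as equal-length strings with "-" marking gaps; the function name and the choice of the ungapped reference length as denominator are assumptions made for this example, and alignment programs may report identity over a different length.

```python
def percent_identity(aligned_query: str, aligned_reference: str, gap: str = "-") -> float:
    """Percent identity of a pre-aligned query relative to the reference length.

    Hypothetical helper: the denominator is the number of non-gap positions
    in the reference, which is an assumption of this sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    # Count reference residues (alignment columns where the reference is not a gap).
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    # Count columns where the query residue equals the reference residue.
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if r != gap and q.upper() == r.upper()
    )
    return 100.0 * matches / reference_length
```

Under these assumptions, a query that matches 5 of 6 reference positions has about 83.3% identity and would therefore satisfy an "80% identity" criterion but not a "85% or more" criterion.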
In certain example embodiments, an alignment of selected sequences to determine “% identity” between two or more sequences is performed using, for example, the CLUSTAL-W program in MacVector version 13.0.7, operated with default parameters, including an open gap penalty of 10.0, an extended gap penalty of 0.1, and a BLOSUM 30 similarity matrix.
[0069] Phosphate: A “phosphate” includes an “organophosphate” as well as variants thereof, such as an “amidophosphate” (which is a synonym for “phosphoramidate”). A phosphate can include a side chain, such as -R4-G2 in the nucleoside triphosphates disclosed herein. The first, second and third phosphate counted from the 5’ end of a nucleoside are also referred to as α-phosphate, β-phosphate and γ-phosphate, respectively (or, in case the α-phosphate is a phosphoramidate, it can also be referred to as “α-phosphoramidate”). The type of a given phosphate is also derivable from the structures provided herein.
[0070] Expandable NTP: An “expandable NTP” or “XNTP” refers to a 5' phosphate modified non-natural nucleoside triphosphate (NTP) molecule (typically a non-natural 2’-deoxynucleoside triphosphate molecule) compatible with template-dependent enzymatic polymerization. Each XNTP has two distinct functional regions, i.e., a selectively cleavable bond (e.g. a phosphoramidate bond) linking the 5’ α-phosphate to a sugar comprised in a nucleoside and a tether that is attached within the XNTP at positions that allow for controlled expansion by cleavage of the cleavable bond (e.g. a tether linking the 5’ α-phosphate and the nucleobase). An XNTP can thus be present in a constrained configuration (when the cleavable bond is still intact) or in an expanded configuration (when the cleavable bond has been cleaved, e.g. via acid treatment).
[0071] dNTP-2c: A “dNTP-2c” refers to a 5' phosphate modified non-natural dNTP molecule that can serve as an intermediate in the synthesis of XNTPs. A dNTP-2c comprises two clickable groups, such as terminal alkynes, one as part of a modification at the 5’ α-phosphate, and one as part of a modification at the nucleobase. The two clickable groups allow addition of a tether between the α-phosphate and the nucleobase to form an XNTP.
[0072] Xpandomer: An “Xpandomer” or “Xp” refers to a molecule consisting of at least two XNTPs. An Xpandomer is obtainable, for example, by polymerase-mediated synthesis of a complementary strand to a template nucleic acid using XNTPs as polymerase substrates. An expanded configuration of the Xpandomer can be obtained by cleavage of the phosphoramidate bond in the XNTPs, e.g. via acid treatment.
II. Nucleoside triphosphates with clickable groups
[0073] In some embodiments, the disclosure relates to nucleoside triphosphates comprising two clickable groups, one attached to the α-phosphoramidate, the other attached to the nucleobase. This allows linking the α-phosphoramidate to the nucleobase via a tether molecule by a click reaction to yield expandable NTPs. Such a nucleoside triphosphate generally has the following structure: wherein NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; and R4 comprises or consists of a hydrocarbon; and G1 and G2 independently represent terminal clickable groups. Preferably, the nucleoside triphosphate is a modified 2’-deoxynucleoside triphosphate (dNTP).
[0074] The nucleoside triphosphate comprises a stereocenter at a phosphorus atom, and can therefore exist as two different isomers with different stereoconfigurations at the α-phosphoramidate as follows:
[0075] The disclosure thus provides a composition comprising the nucleoside triphosphate (with two clickable groups) disclosed herein. In some embodiments, the disclosure provides a composition comprising nucleoside triphosphates having the structure: wherein NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; and R4 comprises or consists of a hydrocarbon; G1 and G2 independently represent terminal clickable groups; and T is a tether molecule;
[0076] wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
[0077] Preferably, at least 90%, and more preferably 100% of the nucleoside triphosphates in the composition have the following stereoconfiguration at the α-phosphoramidate:
[0078] Thus, for example, at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates in the composition can have the following structure:
[0079] A clickable group can be any group that allows selective reaction with a complementary clickable group via click chemistry. Click chemistry and suitable pairs of clickable groups are well known in the art, see e.g. Fantoni et al., 2021, Chemical Reviews, 121 (12): 7122-7154; and Klöcker et al., 2020, Chem. Soc. Rev., 49:8749-8773. Examples of click reactions include copper-catalyzed alkyne + azide reactions (CuAAC), copper-free strain-promoted azide-alkyne click (SPAAC) reactions (e.g. DBCO + azide), and inverse-electron demand Diels-Alder cycloaddition (IEDDA). Thus, for example, a terminal clickable group can be a terminal alkyne or azide group, preferably an alkyne group. In preferred embodiments, G1 and G2 are terminal clickable groups of the same type. Preferably, both G1 and G2 represent a terminal alkyne group.
[0080] NB is a nucleobase, and generally will be a pyrimidine nucleobase or a purine nucleobase. This includes naturally occurring nucleobases, like adenine, guanine, cytosine or thymine, and nucleobases with modifications that do not interfere with base pairing to a complementary nucleobase. For instance, pyrimidine nucleobases can be modified at the position 5, and purine nucleobases can be modified at the position 7. Non-limiting examples of nucleobases are adenine, guanine, thymine, cytosine, uracil, xanthine, hypoxanthine, 8-azapurine, purines substituted at the 8 position with methyl or bromine, 9-oxo-N6-methyladenine, 2-aminoadenine, 7-deazaxanthine, 7-deazaguanine, 7-deazaadenine, N4-ethanocytosine, 2,6-diaminopurine, N6-ethano-2,6-diaminopurine, 5-methylcytosine, 5-(C3-C10)-alkynylcytosine, 5-fluorouracil, 5-bromouracil, thiouracil, pseudoisocytosine, 2-hydroxy-5-methyl-4-triazolopyridine, isocytosine, isoguanine, inosine, 7,8-dimethylalloxazine, 6-dihydrothymine, 5,6-dihydrouracil, 4-methyl-indole, ethenoadenine and the non-naturally occurring nucleobases described in U.S. Pat. Nos. 5,432,272 and 6,150,510 and published PCT applications WO 92/002258, WO 93/10820, WO 94/22892 and WO 94/24144, and Fasman ("Practical Handbook of Biochemistry and Molecular Biology", pp. 385-394, 1989, CRC Press, Boca Raton, Fla.), all herein incorporated by reference in their entireties. In one embodiment, the nucleobase is selected from adenine, guanine, uracil, and cytosine, and modified versions of these nucleobases, such as those disclosed herein (e.g. 7-deazaadenine or 7-deazaguanine). NB is preferably selected from cytosine, thymine, 7-deazaadenine and 7-deazaguanine. In preferred embodiments, NB has one of the following structures:
[0081] R1 is typically such that it does not interfere with base-pairing with a complementary nucleobase. For example, R1 is attached to position 5 of the nucleobase when the nucleobase is a pyrimidine nucleobase, and to position 7 of the nucleobase when the nucleobase is a purine nucleobase (wherein a naturally occurring nitrogen at position 7 can be replaced by a carbon, as e.g. in 7-deazaadenine or 7-deazaguanine). Nucleobases with modifications at position 5 (pyrimidine bases) or 7 (purine bases) and their synthesis are commonly known, see e.g. Kozak et al., 2020 (Russ. Chem. Rev., 2020, 89 (3) 281-310) and Matyugina et al., 2021 (Russ. Chem. Rev., 2021, 90 (11) 1454-1491). For concrete synthesis methods for nucleosides with bases as shown above, see also WO 2016/081871.
[0082] R1 comprises or consists of a (substituted or unsubstituted, preferably unsubstituted) hydrocarbon. For application in sequencing by expansion, R1 typically consists of a (substituted or unsubstituted, preferably unsubstituted) hydrocarbon. The hydrocarbon can be saturated or unsaturated, preferably unsaturated. For example, R1 can comprise or consist of a (substituted or unsubstituted, preferably unsubstituted) alkyl, alkenyl, or alkynyl, preferably alkynyl.
[0083] Preferably, R1 comprises 1-100 carbon atoms, preferably 1-30 carbon atoms or 1-20 carbon atoms, such as 3-20 carbon atoms, 3-10 carbon atoms or 5-10 carbon atoms, such as 6 or 8 carbon atoms. Typically, R1 will be acyclic. Preferably, R1 is linear. In some embodiments, the molecular weight of R1 is 1500 g/mol or less, 1000 g/mol or less, 500 g/mol or less, 200 g/mol or less, or 100 g/mol or less.
[0084] In some embodiments, R1 is a substituted or unsubstituted, branched or unbranched, saturated or unsaturated alkyl group comprising 1-100 carbon atoms, which optionally includes one or more oxygen, nitrogen, phosphorus or sulfur heteroatoms (e.g. to include an ether, a thioether, a phosphodiester or phosphotriester, or PEG, a heterocycle, such as a triazole or imidazole).
[0085] In some embodiments, R1 is -Rw-Z, wherein Rw is a substituted or unsubstituted, branched or unbranched, saturated or unsaturated alkyl group having between 1 and 100 carbon atoms, which optionally includes one or more oxygen, nitrogen, phosphorus or sulfur heteroatoms, and where Z is alkyl, alkenyl, alkynyl, acyl, -Het, or -CH2-Het, where "Het" is a substituted or unsubstituted 5- or 6-membered heterocyclic moiety.
[0086] As a preferred example, R1 consists of a linear hydrocarbon, such as an alkynyl, and G1 is a terminal alkyne group. Preferably, R1 is a hexa-1-ynyl or octa-1-ynyl group. In most preferred examples, R1 with G1 is an octa-1,7-diynyl or a deca-1,9-diynyl group.
[0087] R2 is independently H, OH or any 2'-ribose modification. In some embodiments, both R2 are H or one R2 is H and the other R2 is OH. Preferably, both R2 are H. 2’-ribose modifications are known in the art and include, for example, tert-butyldimethylsilyl and tri-iso-propylsilyloxymethyl ether groups as well as a 2’-O-methyl group or a 2’-fluoro group.
[0088] R3 is H or any protecting group, preferably H. Protecting groups are known in the art. Examples of a protecting group include acetyl, benzoyl, benzyl, methoxyethoxymethyl ether, dimethoxytrityl, ethoxymethyl ether, methoxytrityl, p-methoxybenzyl ether, p-methoxyphenyl ether, methylthiomethyl ether, pivaloyl, tert-butyl ethers, tetrahydropyranyl, tetrahydrofuran, trityl, silyl ether (e.g. trimethylsilyl, tert-butyldimethylsilyl, tri-iso-propylsilyloxymethyl, or triisopropylsilyl ethers), methyl ethers, and ethoxyethyl ethers.
[0089] R4 comprises or consists of a hydrocarbon. For application in sequencing by expansion, R4 typically consists of a hydrocarbon. The hydrocarbon can be substituted or unsubstituted, preferably unsubstituted. The hydrocarbon can be saturated or unsaturated, preferably saturated. For example, R4 can comprise or consist of an alkyl, alkenyl, or alkynyl, preferably alkyl.
[0090] Typically, R4 comprises 1-100 carbon atoms, preferably 1-30 carbon atoms or 1-20 carbon atoms, such as 1-15 carbon atoms, 3-15 carbon atoms, 3-10 carbon atoms or 3-6 carbon atoms, such as 4 carbon atoms. Typically, R4 will be acyclic. Preferably, R4 is linear. In preferred embodiments, R4 comprises or consists of a linear (saturated) alkyl. In some embodiments, the molecular weight of R4 is 1500 g/mol or less, 1000 g/mol or less, 500 g/mol or less, 200 g/mol or less, or 100 g/mol or less.
[0091] In some embodiments, R4 comprises or consists of a branched, linear, cyclic or heterocyclic, substituted or unsubstituted, saturated or unsaturated hydrocarbon, optionally including one or more heteroatoms, optionally selected from nitrogen, oxygen, phosphorus and sulfur. A cyclic or heterocyclic hydrocarbon can be 5-membered or 6-membered, for example. A cyclic or heterocyclic hydrocarbon can be aromatic, for example.
[0092] In some embodiments, R4 is a substituted or unsubstituted, branched or unbranched, saturated or unsaturated alkyl group comprising 1-100 carbon atoms, which optionally includes one or more oxygen, nitrogen, phosphorus or sulfur heteroatoms (e.g. to include an ether, a thioether, a phosphodiester or phosphotriester, or PEG, a heterocycle, such as a triazole or imidazole).
[0093] In some embodiments, R4 is -Rw-Z, wherein Rw is a substituted or unsubstituted, branched or unbranched, saturated or unsaturated alkyl group having between 1 and 100 carbon atoms, which optionally includes one or more oxygen, nitrogen, phosphorus or sulfur heteroatoms, and where Z is alkyl, alkenyl, alkynyl, acyl, -Het, or -CH2-Het, where "Het" is a substituted or unsubstituted 5- or 6-membered heterocyclic moiety.
[0094] In a preferred embodiment, R4 consists of a linear hydrocarbon, such as an alkyl, and G2 is a terminal alkyne group. As a preferred example, R4 with G2 is a hex-5-ynyl group. Thus, as a preferred example, R4 with G2 has the structure:
[0095] R4 can also comprise or consist of two or more hydrocarbons that are linked by an atom or group of atoms other than carbon, such as a phosphorus atom and/or an oxygen atom. For example, two hydrocarbons (each independently comprising 1-10, 2-10 or 2-6 carbon atoms), such as alkyls, can be linked by an oxygen atom. In another example, two or three hydrocarbons (each independently comprising 1-10, 2-10 or 2-6 carbon atoms), such as alkyls, can be linked by a phosphate diester or a phosphate triester, respectively. Thus, in an embodiment, R4 with G2 has the structure:
[0096] During production, R4 comprising a phosphate typically has a protective group at the hydroxyl function of the phosphate, such as a beta-cyanoethyl group. The protective group can be removed after the pyrophosphate has been added to yield a triphosphate.
[0097] For example, R4 can consist of a) a linear hydrocarbon, or b) two linear hydrocarbons linked by a phosphate diester, wherein R4 comprises 3-15 carbon atoms, and G2 is a terminal alkyne group.
[0098] In preferred embodiments, at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates in the composition have a structure selected from the following structures:
[0099] In more preferred embodiments, at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates in the composition have a structure selected from the following structures:
[0100] A mixture of both isomers of the nucleoside triphosphate with two clickable groups as disclosed herein can be produced, for example, by the methods disclosed in WO 2016/081871. Certain types of R4, e.g. those comprising a phosphate, may comprise a protective group, such as an ethyl cyanide group, at the phosphate, until the 5’ triphosphate has been generated. A given diastereomerically pure isomer can then be separated from the other isomer by high-performance liquid chromatography (HPLC), for example. Concrete conditions for separating the two isomers by preparative HPLC are given in Example 1. The active configuration can be identified as eluting first in HPLC as described in Example 1. This configuration can be further functionally identified, for example, by separating the two isomers and testing whether expandable nucleoside triphosphates synthesized with a given isomer are suitable for Xpandomer synthesis as described in Example 2.
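By way of illustration only, the percentage of nucleoside triphosphates having the active stereoconfiguration can be estimated from the two HPLC peaks and compared with the thresholds recited above (80%, 90%, 95%, 99%, 100%). The sketch assumes that the integrated peak areas are proportional to the amounts of the two isomers (i.e. equal detector response for both isomers); the function names and this proportionality are assumptions made for this example and are not taken from Example 1.

```python
# Thresholds recited in the disclosure for the share of the active isomer.
THRESHOLDS = (80.0, 90.0, 95.0, 99.0, 100.0)


def active_isomer_percent(active_area: float, inactive_area: float) -> float:
    """Share (in %) of the active isomer, estimated from two integrated peak areas.

    Hypothetical helper; assumes equal detector response for both isomers.
    """
    if active_area < 0 or inactive_area < 0:
        raise ValueError("peak areas must be non-negative")
    total = active_area + inactive_area
    if total <= 0:
        raise ValueError("at least one peak area must be positive")
    return 100.0 * active_area / total


def thresholds_met(percent: float) -> list[float]:
    """Return every recited threshold that the given percentage reaches."""
    return [t for t in THRESHOLDS if percent >= t]
```

Under these assumptions, peak areas of 96 (active) and 4 (inactive) correspond to 96% active isomer, which meets the "at least 80%", "at least 90%" and "at least 95%" levels but not "at least 99%".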
III. Expandable NTPs
[0101] The nucleoside triphosphates disclosed herein can be used, for example, for sequencing by expansion. In this case, the α-phosphoramidate is typically linked to the nucleobase, e.g. via a tether molecule (to form an expandable NTP). For example, R1 can be linked to R4, e.g. via a tether molecule. The disclosure thus also relates to nucleoside triphosphates that can be expanded, and are thus suitable for sequencing by expansion, for example. Such a nucleoside triphosphate is obtainable, for example, by linking the α-phosphoramidate and the nucleobase in the nucleoside triphosphate with two clickable groups by a click reaction. Such a nucleoside triphosphate generally has the following structure: wherein T is a tether molecule; NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; R4 comprises or consists of a hydrocarbon, and L1 and L2 independently represent linking groups. The further description of NB, R1, R2, R3 and R4 from the context of the nucleoside triphosphate with two clickable groups equally applies.
[0102] The nucleoside triphosphate comprises a stereocenter at a phosphorus atom, and can therefore exist as two different diastereomers with different stereoconfigurations at the α-phosphoramidate as follows:
(active) (inactive)
[0103] The disclosure thus provides a composition comprising (expandable) nucleoside triphosphates as disclosed herein suitable for sequencing by expansion. In some embodiments, the disclosure provides a composition comprising (expandable) nucleoside triphosphates having the structure: wherein NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; and R4 comprises or consists of a hydrocarbon; L1 and L2 independently represent linking groups; and T is a tether molecule; wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate: preferably
[0104] Thus, for example, at least 80%, preferably at least 90%, such as at least 95%, at least 99% or 100% of the nucleoside triphosphates in the composition can have the following structure:
[0105] The composition can also comprise a mixture of different types of the nucleoside triphosphate. In some embodiments, the composition comprises a mixture of four different types of nucleoside triphosphates. Typically, the four different types of nucleoside triphosphates comprise four different types of nucleobases. Preferably, the four different types of nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively. Thus, in some embodiments, the composition comprises four different types of nucleoside triphosphates, each comprising a unique nucleobase, e.g. selected from 7-deazaadenine, 7-deazaguanine, thymine and cytosine. Preferably, the four different types of nucleoside triphosphates each comprise a unique nucleobase selected from:
[0106] The tether molecule is not particularly limited, but will typically comprise a reporter (to allow specific identification of the attached nucleobase, e.g. via nanopore-based sequencing). A tether molecule can be, for example, a symmetrically synthesized reporter tether (SSRT) as disclosed in WO 2020/236526 A1. Such tether molecules typically have the following structure: Linker A - reporter - Linker B. Linker A can be attached to the α-phosphoramidate and linker B to the nucleobase, or vice versa. For example, Linker A and Linker B can be polymers comprising two or more repeat units selected from: spermine (Q), hexaethylene glycol (D), 2-((4-((3-(benzoyloxy)-2-(((1-(3-(benzoyloxy)-2-((benzoyloxy)methyl)-2-((phosphodiester-oxy)methyl)propyl)-1H-1,2,3-triazol-4-yl)methoxy)methyl)-2-((benzoyloxy)methyl)propoxy)methyl)-1H-1,2,3-triazol-1-yl)methyl)-2-O-phosphodiester-propane-1,3-diyl dibenzoate, 1,3-O-bis(phosphodiester-2,2-bis(1-Me-4-(Me-O-PEG2-O-Bz)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2-(4-(Me-O-PEG5)-1-(Et-O-Ac)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2s-O-(4-(Me-O-PEG7)-1-(Et-OBz)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2s-O-(4-(Me-O-PEG3)-1-(Et-2,2,2-Tris-(Me-O-Bz))-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2-(4-(Me-O-PEG5)-1-(Et-2,2,2-Tris-(Me-O-Ac))-1,2,3-triazole)-propane, 1,2-O-bis(phosphodiester)-3-(4-(Me-O-PEG3-O-Bz)-1-(1,2,3-triazole))-propane, 1,3-O-bis(phosphodiester-2,2-bis(4-(Me-O-PEG2-O-Me)-1-(Et-O-Bz)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2,2-bis(4-(Me-O-PEG3-O-Me)-1-(Et-2,2,2-Tris-(Me-O-Bz))-1,2,3-triazole)-propane, 1,2-O-bis(phosphodiester)-3-(4-methylpiperazine-1-yl)-propane, 1,3-O-bis(phosphodiester-2,2-bis(4-(Me-O-PEG3-O-Me)-1-(Et-O-Bz)-1,2,3-triazole)-propane, and 1,1’-O-bis(phosphodiester)-N(p-tolyl)-diethanolamine, preferably spermine. In some embodiments, linker A and B are inverted copies of each other.
[0107] For example, a reporter can be a polymer comprising two or more repeat units selected from: hexaethylene glycol (D), ethane (L), triethylene glycol (X), 1,3-O-bis(phosphodiester)-2S-O-mPEG4-propane, 1,3-O-bis(phosphodiester)-2-(4-Me-O-PEG3)-1-(Et-O-Ac)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2,2-bis(Me-O-mPEG2)-propane, 1,3-O-bis(phosphodiester-2S-O-(PEG4-O-Bz)-propane, 1,3-O-bis(phosphodiester)-2s-O-mPEG6-propane, 1,3-O-bis(phosphodiester-2s-O-(4-(Me-O-PEG3)-1-(Et-2,2,2-Tris-(Me-O-Bz))-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester-2s-O-(4-(Me-O-PEG3)-1-(Me-acetate)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester)-2s-O-(4-(Me-O-PEG2)-1-(Et-OBz)-1,2,3-triazole)-propane, 1,3-O-bis(phosphodiester)-2-(4-Et-1-(Et-O-mPEG1)-1,2,3-triazole)-propane, 2,3-O-bis(phosphodiester)-1-(1 dimethoxyquinazolinedione)-propane, 2,3-O-bis(phosphodiester)-1-(N9-(3,6-dimethoxycarbazole)-propane, 1,1’-O-bis(phosphodiester)-2,2’-(sulfonylbis(benz-4-yl))-diethanol, 1,1’-O-bis(phosphodiester)-2,2’-bipyridin-4,4’-yl)-dimethanol, 2,3-O-bis(phosphodiester)-1-(N1-(4,6-dimethoxy-3-Me-indole)-propane, 3-(1,2-O-bis(phosphodiester)-propyl)-8,8-dimethylhexahydro-3H-3a,6-methanobenzo[c]isothiazole 2,2-dioxide, 2,3-O-bis(phosphodiester)-1-(N1-(6-Azathymine))-propane, 1,5-O-bis(phosphodiester)-hexahydrofuro[2,6]furan, 1,1’-O-bis(phosphodiester)-octahydro-2,6-dimethyl-3,8:4,7-dimethano-2,6-naphthyridin-4,8-diyl)-dimethanol, 2,3-O-bis(phosphodiester)-1-(N1-(2-Me-5-nitroindole)-propane, 2,3-O-bis(phosphodiester)-1-(N1-(2-Me-5-nitroindole)-propane, 2,3-O-bis(phosphodiester)-1-(5-benzofuran)-propane, 1,2-O-bis(phosphodiester)-3-O-mPEG2-propane, 1,3-O-bis(phosphodiester)-2-(4-Et-1-(Et-O-mPEG3)-1,2,3-triazole)-propane, and 1,3-O-bis(phosphodiester)-3-O-mPEG4-propane (see WO 2020/236526 A1). In some embodiments, the reporter comprises or consists of two inverted copies of the same polymer comprising two or more repeat units selected from the above. The two inverted copies of the same polymer in the reporter can be linked via a branching element that is further linked to a translocation control element (TCE), as described in WO 2020/236526 A1.
[0108] L1 and L2 independently represent linking groups. The linking groups are not particularly limited, and include substituted or unsubstituted hydrocarbons, including e.g. a 1,2,3-triazole.
[0109] Preferably, the tether molecule is attached via click chemistry reactions. In other words, the terminal clickable groups in G1 and G2 can be used to link a tether molecule via a click reaction. These terminal clickable groups are typically clickable groups of the same type, e.g. each is an alkyne group. When the tether is attached via click reactions, L1 and L2 will each be a product of a click reaction, such as a 1,2,3-triazole. For example, when both G1 and G2 are terminal alkyne groups, a tether molecule attached to terminal azide groups on two ends can be reacted with G1 and G2, thereby yielding two 1,2,3-triazoles (or vice versa, i.e. G1 and G2 are terminal azide groups and the tether is attached to two terminal alkyne groups). In preferred embodiments, L1 and L2 are 1,2,3-triazoles.
[0110] In preferred embodiments, at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates in the composition have a structure selected from the following structures:
[0111] The first two structures are obtainable by reacting a nucleoside triphosphate with two clickable groups as disclosed herein, with R1 with G1 = octa-1,7-diynyl or deca-1,9-diynyl group and R4 with G2 = hex-5-ynyl group, with a tether attached to two terminal azide groups.
[0112] In more preferred embodiments, at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates in the composition have a structure selected from the following structures:
[0113] The composition can also be provided as a master mix or a reaction solution. The composition can thus further comprise additional reagents for complementary strand synthesis. For example, the composition can further comprise a nucleic acid polymerase (for details see section IV.). Moreover, the composition can further comprise a buffering agent, such as TrisCl. In some embodiments, the composition further comprises at least one of (including each of) TrisOAc, NH4OAc, PEG, water-miscible organic solvent, such as DMF or NMP, polyphosphate 60, N-methyl succinimide (NMS), and MnCl2, a single-strand binding protein (SSB), and urea. The SSB can be Kod SSB (from Thermococcus kodakarensis), for example. The composition may further comprise a polymerase-enhancing molecule (PEM), such as described in EP 3 735 409 B1. A reaction solution will also typically comprise at least one nucleic acid. In these embodiments, the composition typically comprises four different types of the nucleoside triphosphate, wherein the four different types of the nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively.
[0114] The (expandable) nucleoside triphosphates can be obtained, for example, by linking the α-phosphoramidate to the nucleobase of the nucleoside triphosphate with the two clickable groups G1 and G2. The α-phosphoramidate can be linked to the nucleobase via a tether molecule that can be attached by click chemistry reactions. For example, when G1 and G2 are terminal alkyne groups and the tether molecule is attached to two terminal azide groups, a double click reaction at each end will attach the tether molecule to R1 and R4 via two 1,2,3-triazoles. The reaction can look as follows: wherein G1 and G2 independently represent terminal clickable groups, G1a and G2a independently represent terminal clickable groups, L1 and L2 represent linking groups formed by reacting G1 and G2 with G1a and G2a, respectively, and T, NB, R1, R2, R3, and R4 are as defined above.
[0115] The nucleoside triphosphate can be further purified, for example by HPLC. Purification can take place e.g. after the reaction with a pyrophosphate and/or after linking the α-phosphoramidate to the nucleobase.
IV. Methods
[0116] The disclosure also provides a method for generating a complementary strand to a nucleic acid, comprising contacting the nucleic acid with a composition comprising a nucleoside triphosphate as disclosed herein. The complementary strand generated by such method is typically an Xpandomer (in constrained configuration).
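By way of illustration only, template-dependent selection of nucleoside triphosphates during generation of the complementary strand can be sketched as a lookup from each template base to the nucleobase of the incorporated nucleoside triphosphate. The sketch assumes a DNA template written 5' to 3' and the four preferred nucleobases named herein (cytosine, thymine, 7-deazaadenine and 7-deazaguanine), with 7-deazaadenine and 7-deazaguanine pairing like adenine and guanine; the names `PAIRING` and `complementary_nucleobases` are hypothetical and do not model the polymerase, the tether, or reaction conditions.

```python
# Template base -> nucleobase of the nucleoside triphosphate that pairs with it.
# Assumption: 7-deazaadenine pairs with T and 7-deazaguanine pairs with C.
PAIRING = {
    "G": "cytosine",
    "A": "thymine",
    "T": "7-deazaadenine",
    "C": "7-deazaguanine",
}


def complementary_nucleobases(template_5_to_3: str) -> list[str]:
    """Nucleobases of the complementary strand, listed 5' to 3'.

    The complementary strand is antiparallel to the template, so the
    template is traversed from its 3' end to its 5' end.
    """
    bases = template_5_to_3.upper()
    unknown = sorted(set(bases) - set(PAIRING))
    if unknown:
        raise ValueError(f"unsupported template base(s): {unknown}")
    return [PAIRING[base] for base in reversed(bases)]
```

For a template written 5'-GAT-3', the sketch lists 7-deazaadenine, thymine and cytosine, in that order, for the complementary strand read 5' to 3'.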
[0117] The type of nucleic acid is not particularly limited, and includes DNA or RNA. In preferred embodiments, the nucleic acid is a DNA, such as a genomic DNA or a cDNA. Typically, the DNA is cell-free DNA.
[0118] The nucleic acid can be part of a library of nucleic acids. For example, the library can be a library of genomic DNA or cDNA.
[0119] The generation of the complementary strand is typically primed by a primer. The design and generation of primers is known in the art. The primer to be used is not particularly limited and can be designed, for example, to hybridize with the nucleic acid at a position so as to allow the generation of the complementary strand to parts of the nucleic acid that are of interest, including full-length. When a library of nucleic acids is to be sequenced, it is possible, for example, to use a standard primer binding to all nucleic acids of interest in the library, or a random primer mixture.
[0120] In some embodiments, the method can comprise hybridizing a primer to the nucleic acid, followed by contacting the nucleic acid with a composition comprising a nucleoside triphosphate as disclosed herein.
[0121] If necessary, the nucleic acid can also be denatured, e.g. to facilitate primer hybridization. Means for denaturation are not particularly limited, and include e.g. applying heat (e.g. 90°C-100°C). Thus, in some embodiments, the method can comprise denaturing the nucleic acid and then hybridizing a primer to the nucleic acid, followed by contacting the nucleic acid with a composition comprising a nucleoside triphosphate as disclosed herein.
[0122] Typically, the complementary strand is generated by using a polymerase, such as a (DNA-dependent) DNA polymerase. Thus, the composition used preferably further comprises a nucleic acid polymerase. Polymerases will typically be used comprising mutations that sterically allow the use of XNTPs as substrates. A suitable class of polymerases for incorporating XNTPs includes the translesion DNA polymerase (i.e. class Y polymerase) family that includes e.g. the DPO4 polymerase. Translesion DNA polymerases exhibit a more flexible substrate recognition than conventional (e.g. replication) polymerases owing to their relatively large substrate binding sites, which have evolved to accommodate naturally occurring, bulky DNA lesions. Suitable polymerases include e.g. modified DPO4 polymerases as described in WO 2017/087281, WO 2018/204707, or WO 2019/118372, which are herein incorporated by reference in their entireties. Suitable examples are provided herein as SEQ ID NOs: 1-5. Thus, in some embodiments, the polymerase has at least 95%, such as at least 98% or 99% sequence identity to the amino acid sequence of SEQ ID NO: 1. Such a polymerase has DNA-dependent DNA polymerase activity, and more specifically, is capable of using XNTPs as polymerization substrate.
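The sequence identity threshold mentioned above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length (with "-" marking gaps) and uses one common convention, namely identical residues divided by the number of aligned columns. The function name, the gap convention and the toy sequences are hypothetical illustrations and are not taken from the disclosure, which may rely on a different identity definition or alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy 10-residue example with a single substitution (hypothetical data).
identity = percent_identity("MIVLFVDFDY", "MIVLFVDFDF")
print(f"{identity:.1f}% identity, meets 95% threshold: {identity >= 95.0}")
```

With this convention, a single substitution in a 10-residue toy alignment gives 90% identity and would therefore not meet a 95% threshold.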
[0123] Moreover, the composition typically further comprises a buffering agent, such as TrisCl. In some embodiments, the composition comprises at least one of (including each of) TrisOAc, NH4OAc, PEG, a water-miscible organic solvent, such as dimethylformamide (DMF) or N-methylpyrrolidone (NMP), polyphosphate 60, N-methyl succinimide (NMS), MnCl2, a single-strand binding protein (SSB), and urea. The SSB can be Kod SSB, for example.
[0124] The composition may further comprise a polymerase-enhancing molecule (PEM), such as described in EP 3 735 409 B1. For example, a PEM can be as described in the claims of EP 3 735 409 B1, i.e. a compound of the following formula that increases the processivity, rate, or fidelity of the nucleic acid polymerase reaction: wherein independently at each occurrence: m is 1, 2 or 3; n is 0, 1 or 2; p is 0, 1 or 2; Ar1 is optionally substituted aryl; Ar2 is selected from 5- and 6-membered monocyclic aromatic rings and 9- and 10-membered fused bicyclic rings comprising two 5- and/or 6-membered monocyclic rings fused together, where at least one of the two monocyclic rings is an aromatic ring, where Ar2 is optionally substituted with one or more substituents selected from halide, C1-C6alkyl, C1-C6haloalkyl, E-CO2R°, E-CONH2, E-CHO, E-C(O)NH(OH), E-N(R°)2, and E-OR°, where E is selected from a direct bond and C1-C6alkylene; and R° is selected from H, C1-C6alkyl and C1-C6haloalkyl; M is selected from hydrogen, halogen and C1-C4alkyl; and L is a linking group; or a solvate, hydrate, tautomer, chelate or salt thereof.
[0125] The composition typically comprises four different types of the nucleoside triphosphate, wherein the four different types of the nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively.
[0126] The disclosure also provides a method for sequencing a nucleic acid using the expandable nucleoside triphosphate or the composition comprising the same as disclosed herein.
[0127] The disclosure thus provides a method for determining the sequence of a nucleic acid, comprising the following steps in order:
1) Generating a complementary strand to the nucleic acid by the method for generating a complementary strand as disclosed herein;
2) Selectively cleaving the P-N bond within the nucleoside triphosphate to generate an expanded complementary strand;
3) Sequencing the expanded complementary strand;
4) Determining the sequence of the nucleic acid based on the sequence of the expanded complementary strand.
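Step 4) can be illustrated with a short sketch. It assumes, purely for illustration, that the read obtained in step 3) is reported 5'→3' in the letters A, C, G and T of the nucleobases carried by the incorporated nucleotides, and that the nucleic acid is DNA. The function name is hypothetical and not part of the disclosure.

```python
# Illustrative sketch only: infer the template sequence from the read of the
# expanded complementary strand (assumed 5'->3', letters A/C/G/T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_from_complement(read: str) -> str:
    """Return the reverse complement of the read, i.e. the template 5'->3'."""
    read = read.upper()
    if set(read) - set("ACGT"):
        raise ValueError("read may only contain A, C, G and T")
    # Complement each base, then reverse to restore 5'->3' orientation.
    return read.translate(_COMPLEMENT)[::-1]


print(template_from_complement("ATGC"))  # GCAT
```

In practice, base calling of the nanopore signal and error handling would precede this step; the sketch only covers the final complementarity relationship.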
[0128] The nucleoside triphosphates used in such a method are such that they allow the generation of an expanded complementary strand. This is typically achieved by using nucleoside triphosphates in which the α-phosphoramidate is linked to the nucleobase via a tether molecule. When the phosphoramidate (P-N) bond is cleaved, the α-phosphoramidate and the nucleobase remain linked via the tether molecule, thereby generating the expanded complementary strand.
[0129] In some embodiments, the complementary strand is separated from the nucleic acid after step 1), for example by denaturation. The complementary strand can optionally be purified before proceeding with step 2).
[0130] The P-N bond can be selectively cleaved in step 2) under acidic conditions, for example. This can be achieved by addition of an acid, such as DCl. Cleavage in step 2) typically yields an Xpandomer in expanded configuration. The product of step 2) can optionally be purified before proceeding with step 3).
[0131] In preferred embodiments, the expanded complementary strand is sequenced in step 3) by nanopore-based sequencing. Methods for nanopore-based sequencing are known in the art, see e.g. WO 2020/236526. For example, nanopore-based sequencing can comprise:
(a) providing a chip for nanopore-based sequencing comprising:
(i) an electrochemically resistive barrier disposed over an aperture on a surface of the chip, wherein the barrier separates a cis side from a trans side;
(ii) a nanopore inserted into the barrier, wherein the nanopore has an entrance side on the cis side of the barrier and an exit side on the trans side of the barrier;
(b) contacting the cis side of the barrier with the expanded complementary strand;
(c) applying a voltage across the barrier of the chip to translocate the expanded complementary strand to the trans side;
(d) determining one or more changes in an electrical characteristic of the nanopore associated with occupation of the nanopore by the expanded complementary strand during the translocation; and
(e) determining, based on the one or more changes in the electrical characteristic of the nanopore, a sequence for the expanded complementary strand.
[0132] The barrier is typically a lipid bilayer membrane, such as a DPhPE/hexadecane bilayer membrane. A nanopore, such as an α-hemolysin nanopore, can be inserted into the membrane by electroporation in a buffer, such as a buffer of 2 M NH4Cl and 100 mM HEPES, pH 7.4. The cis well can be perfused with a buffer containing 0.4 M NH4Cl, 600 mM GuanCl, 100 mM HEPES, pH 7.4, and 5% glycerol, and the trans well can be perfused with a buffer containing 0.4 M NH4Cl, 600 mM GuanCl, 5% ethyl acetate, 10 mM HEPES, pH 7.4, before introducing the Xpandomer to the cis side for sequencing.
V. Examples
[0133] Example 1: Production of diastereomers of dNTP-2c molecules
[0134] Racemic mixtures comprising nucleoside triphosphates with two clickable groups as disclosed herein were produced by the methods disclosed in WO 2016/081871. Four separate racemic mixtures were produced for four different types of nucleoside triphosphates with four different nucleobases corresponding to C, T, A and G, respectively, each one with -R1-G1 on the nucleobase and -R4-G2 on the α-phosphoramidate as disclosed herein. For each racemic mixture, the two isomers were then separated from one another by preparative HPLC.
HPLC system: Agilent 1290 Infinity II Preparative LC System
Column: 50 x 250 mm Waters XBridge C18
Guard column: Waters XBridge C18
Mobile phase: MeOH/H2O were premixed to remove the heat of mixing. 1 M TEAB was mixed on the HPLC instrument by pumping it at 10%
[0135] Table 1 shows the preparative workflow used. Table 1
[0136] Exemplary HPLC chromatograms for the four different types of nucleoside triphosphates are shown in Fig. 1. The chromatogram shows two distinct peaks for the active and inactive isomers of the four different types of nucleoside triphosphates. The two isomers were separated from one another by collecting suitable fractions from the two peaks. This allows the production of compositions in which one of the two isomers is present in excess.
[0137] Example 2: Xpandomer synthesis and sequencing using active vs. inactive isomers
[0138] To produce Xpandomer copies of a DNA template, solid-state primer extension reactions are conducted using isomolar amounts of each XNTP, 4 pmol template and 20 pmol E-oligo primer (solid-state Xpandomer synthesis in which the extension oligo is covalently bound to a chip substrate is described in WO 2020/172479 A1, which is herein incorporated by reference in its entirety). The 50 µl extension reaction includes the following reagents: 50 mM TrisCl, pH 8.84, 200 mM NH4OAc, 50 mM GuCl, 20% PEG8K, 10% N-methylpyrrolidone (NMP), 15 nmol polyphosphate PP-60.23, 2.5 µg Kod single-strand binding protein (SSB), 0.1 M urea, 15 mM PEM additive and 13 µg purified recombinant DNA polymerase C4760 (SEQ ID NO: 2, a variant of DPO4 polymerase; other suitable variants include SEQ ID NOs: 1 and 3-5). The extension reaction is run for 60 minutes at 37°C.
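As a rough aid for setting up such a reaction, the absolute amount of each component specified by molar concentration can be computed from the reaction volume. The sketch below assumes a 50 microlitre reaction and uses the identity that a concentration in mM multiplied by a volume in microlitres gives an amount in nmol. The function name and the dictionary layout are hypothetical, and only components given above as molar concentrations are included.

```python
def amount_nmol(concentration_mm: float, volume_ul: float) -> float:
    """Amount in nmol, since 1 mM x 1 microlitre = 1 nmol."""
    return concentration_mm * volume_ul


REACTION_VOLUME_UL = 50.0  # assumed reaction volume in microlitres

# Final concentrations in mM, taken from the reagent list above.
components_mm = {
    "TrisCl": 50.0,
    "NH4OAc": 200.0,
    "GuCl": 50.0,
    "urea": 100.0,  # 0.1 M
    "PEM additive": 15.0,
}

for name, conc in components_mm.items():
    nmol = amount_nmol(conc, REACTION_VOLUME_UL)
    print(f"{name}: {nmol:.0f} nmol ({nmol / 1000:.2f} umol)")
```

Volumes to pipette would additionally depend on the stock concentrations, which are not stated here.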
[0139] Xpandomer products are next sequenced using the SBX protocol. Briefly, the constrained Xpandomer products are washed in buffer B.064 (1% Tween-20/3% SDS/5 mM HEPES, pH 8.0/100 mM NaPO4/15% DMF) and cleaved to generate linearized Xpandomer by adding 200 µl buffer C.001 (7.5 M DCl) and incubating for 30 minutes at 23°C. The sample is then neutralized by adding 2000 µl buffer B.064 and incubating for 2 min at RT. The Xpandomer sample is then subjected to amine modification by adding 500 pmol succinate anhydride in buffer B.064 and incubating for 5 minutes at 23°C. The sample is then washed in buffer D.102 (50% ACN) and the Xpandomers are released from the substrate by photocleavage and eluted in 60 µl elution buffer.
[0140] Protein nanopores are prepared by inserting α-hemolysin into a DPhPE/hexadecane bilayer membrane in a buffer of 2 M NH4Cl and 100 mM HEPES, pH 7.4. The cis well is perfused with buffer AG242 containing 0.4 M NH4Cl, 600 mM GuanCl, 100 mM HEPES, pH 7.4, and 5% glycerol, and the trans well is perfused with buffer AB080 containing 0.4 M NH4Cl, 600 mM GuanCl, 5% ethyl acetate, 10 mM HEPES, pH 7.4. The Xpandomer sample is heated to 70°C for 2 minutes, cooled completely and vortexed, then a 2 µL aliquot is added to the cis well. The voltage parameters are run as follows: 70 mV/625 mV/6 µs/1.0 ms (read voltage/pulse voltage/pulse voltage duration/pulse frequency). Data are acquired via LabVIEW acquisition software.
[0141] A mixture of four XNTPs (complementary to the four naturally occurring nucleobases A, G, C and T, respectively) was used for Xpandomer synthesis and subsequent sequencing by expansion in the form of 1) the diastereomerically pure active isomer, 2) the diastereomerically pure inactive isomer, or 3) mixtures of 90:10 or 75:25 of active:inactive isomers. The results are summarized in the following Table 2:
[0142] Table 2
[0143] The use of active isomer at a concentration of 100 µM or higher yielded ~40% full-length Xpandomer product, while 75 µM of active isomer yielded 33% full-length product, thus providing a correlation between the yield of full-length product and the concentration of active isomer up to 100 µM. This is also graphically shown in Fig. 2 (see data series for 100:0 ratio). Table 2 further shows that the % full-length value did not drastically change at a concentration of 150 µM of active isomer compared to 100 µM, suggesting a possible saturation effect between 75 and 100 µM of active isomer. In contrast, the use of 100 µM inactive isomer did not yield any full-length Xpandomer product at all. This difference between the use of diastereomerically pure active and inactive isomers is confirmed by Fig. 3 showing a representative image after a gel electrophoresis with products from Xpandomer synthesis using the active or inactive isomers. Fig. 3 shows that the active isomer allows for the generation of full-length Xpandomers, as evidenced by the presence of a band in lane 37. In contrast, the use of the inactive isomer does not yield any full-length Xpandomer, as evidenced by the absence of any band in lane 38.
[0144] Table 2 also shows that the truncated Xpandomer products that could be obtained using the inactive isomer had higher rates of sequence errors (deletions, substitutions, or insertion-deletions) than the Xpandomer products synthesized in the presence of the active isomer.
[0145] Moreover, as further demonstrated by Table 2, the use of the active isomer achieved a higher percentage of full-length Xpandomers compared to a mixture of both isomers. This was even the case when the concentration of the active isomer was the same in the mixture as in the diastereomerically pure preparation: While 100 µM of pure active isomers yielded 40% of full-length product, the 90:10 mixture at 111 µM (comprising 100 µM active and 11 µM inactive isomers) or the 75:25 mixture at 133 µM (comprising 100 µM active and 33 µM inactive isomers) only yielded 37% and 32% of full-length product, respectively. Likewise, while 75 µM of pure active isomers yielded 33% of full-length product, the 75:25 mixture at 100 µM (comprising 75 µM active and 25 µM inactive isomers) only yielded 29% of full-length product. This is graphically shown in Fig. 2. The figure plots the concentration of active isomer vs. the % full-length Xpandomer product obtained. It is evident that the % full-length value at a given concentration of active isomer decreases when the inactive isomer is present, depending on the amount of inactive isomer present: While there was only a slight negative effect in a 90:10 mixture, the negative effect was stronger with a 75:25 mixture.
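The arithmetic behind the mixture compositions quoted above can be checked with a small sketch. It simply splits a total micromolar concentration according to the active:inactive ratio; the function name is hypothetical and the totals and ratios are the ones stated in the preceding paragraph.

```python
from typing import Tuple


def split_mixture(total_um: float, active_parts: float,
                  inactive_parts: float) -> Tuple[float, float]:
    """Split a total concentration (micromolar) by an active:inactive ratio."""
    total_parts = active_parts + inactive_parts
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    active = total_um * active_parts / total_parts
    return active, total_um - active


# Totals and ratios as quoted in the text above.
for total, (a, i) in [(111, (90, 10)), (133, (75, 25)), (100, (75, 25))]:
    active, inactive = split_mixture(total, a, i)
    print(f"{total} uM at {a}:{i} -> "
          f"{active:.0f} uM active, {inactive:.0f} uM inactive")
```

Rounded to whole micromolar values, this reproduces the approximately 100/11, 100/33 and 75/25 splits given for the three mixtures.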
[0146] A negative effect of the inactive isomer on Xpandomer length is also supported by the mean length of the products obtained. See Table 2 showing 281 nt mean length with 100 µM of pure active isomers, and slightly shorter mean lengths of 279 nt with the mixtures comprising 100 µM of active isomers plus 11 or 33 µM inactive isomers.
[0147] Overall, the data demonstrate that the yield of full-length Xpandomer product directly depends on the presence of active isomer. The inactive isomer is incapable of producing any high-quality full-length Xpandomer products and, surprisingly, even has a concentration-dependent negative impact on the yield of full-length Xpandomer product in the presence of the active isomer. Without wishing to be bound by any theory, one explanation for this negative effect might be that, despite a strong preference for the active isomer, the polymerase can occasionally incorporate an inactive isomer, which may result in premature termination of Xpandomer elongation.
[0148] In conclusion, it is desirable for Xpandomer synthesis to use a composition comprising an excess of active isomers, for example at least 80%, preferably at least 90% or even 100% of active isomers. Conversely, a composition comprising an excess of inactive isomers may have some utility e.g. as negative control for Xpandomer synthesis.
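Whether a given preparation meets such a threshold can be expressed as a simple fraction check. The sketch below assumes that relative amounts of the two isomers are available in any consistent unit (for example, as an assumption, integrated HPLC peak areas taken as proportional to amount). The function names and the default threshold argument are hypothetical illustrations rather than part of the disclosure.

```python
def active_fraction(active_amount: float, inactive_amount: float) -> float:
    """Fraction of active isomer among both isomers (same unit for both)."""
    total = active_amount + inactive_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return active_amount / total


def meets_threshold(active_amount: float, inactive_amount: float,
                    minimum: float = 0.80) -> bool:
    """True if the active isomer makes up at least `minimum` of the mixture."""
    return active_fraction(active_amount, inactive_amount) >= minimum


# The 90:10 and 75:25 mixtures from Example 2, checked against 80%.
print(meets_threshold(90, 10))  # True
print(meets_threshold(75, 25))  # False
```

Under these assumptions, the 90:10 mixture of Example 2 satisfies an 80% threshold, whereas the 75:25 mixture does not.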
SEQUENCES
[0149] SEQ ID NO: 1 (DPO4 C4552)
[0150] MTVLFVDFDYFYAQVEEVLNPSLKGKPVWCVFSGRFEDSGWATANYEAR
KFGVYAGIPIVEAKKILPNAVYLPWRDLVYWGVSERIMNLLREYSEKIEIASIDEAYLDISDK
VRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAAVAGRMAKPNGIKVIDDEEVKRLIR
ELDIADVQGIPYFTAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRT
RVRKSIGRTVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWNSQYRWS
WFPHGISKETAYSESVQLLQQILKKDKRKIRRIGVRFSKF
[0151] SEQ ID NO: 2 (DPO4 C4760)
[0152] MIVLFVDFDYFYAQVEEVLNPSLKGKPVVVCVFSGRFEDSGVVATANYEAR
KFGVYAGIPIVRAKKILPNAVYLPWRDLVYWGVSERIMNLLREYSEKIEIASIDEAYLDISDK
VRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAAVAGRMAKPNGIKVIDDEEVKRLIR
ELDIADVQGIPYFTAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRT
RVRKSIGRTVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHWAWKSYWNSQYRWS
WFPHGISKETAYSESVQLLQQILKKDKRKIRRIGVRFSKF
[0153] SEQ ID NO: 3 (DPO4 C4842)
[0154] MIVLFVDFDYFYAQVEEVLNPSLKGKPVWCVFSGRFEDSGWATANYEAR
KFGVYAGIPIVRAKKILPNAVYLPWRDLVYWGVSERIMNLLREYSEKIEIASIDEAYLDISDK
VRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAAVAGRMAKPNGIKVIDDEEVKRLIR
ELDIADVQGIPYFTAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRT
RVRRSIGRTVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHVVAWKSYWNSQYRWS
WFPHGISKETAYSESVQLLQQILKKDKRKIRRIGVRFSKF
[0155] SEQ ID NO: 4 (DPO4 C4852)
[0156] MIVLFVDFDYFYAQVEEVLNPSLKGKPVWCVFSGRFEDSGWATANYEAR
KFGVYAGIPIVRAKKILPNAVYLPWRDLVYWGVSERIMNLLREYSEKIEIASIDEAYLDISDK
VRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAAVAGRMAKPNGIKVIDDEEVKRLIR
ELDIADVQGIPYFTAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRT
RVRKSIGRTVTMKRDSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHWAWKSYWNSQYRWS
WFPHGISKETAYSESVQLLQQILKKDKRKIRRIGVRFSKF
[0157] SEQ ID NO: 5 (DPO4 C4862)
[0158] MIVLFVDFDYFYAQVEEVLNPSLKGKPVWCVFSGRFEDSGWATANYEAR
KFGVYAGIPIKRAKKILPNAVYLPWRDLVYWGVSERIMNLLREYSEKIEIASIDEAYLDISDK
VRDYREAYNLGLEIKNKILEKEKITVTVGISKNKVFAAVAGRMAKPNGIKVIDDEEVKRLIR
ELDIADVQGIPYFTAEKLKKLGINKLVDTLSIEFDKLKGMIGEAKAKYLISLARDEYNEPIRT
RVRKSIGRTVTMKRNSRNLEEIKPYLFRAIEESYYKLDKRIPKAIHWAWKSYWNSQYRWS
WFPHGISKETAYSESVQLLQQILKKDKRKIRRIGVRFSKF

Claims

PATENT CLAIMS What is claimed is:
1. A composition comprising nucleoside triphosphates having the structure: wherein NB is a nucleobase; R1 comprises or consists of a hydrocarbon; R2 is independently H, OH or any 2'-ribose modification; R3 is H or any protecting group; and R4 comprises or consists of a hydrocarbon; G1 and G2 independently represent terminal clickable groups; L1 and L2 independently represent linking groups; and T is a tether molecule; wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
2. The composition of claim 1, wherein at least 90% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
3. The composition of claim 1 or 2, wherein 100% of the nucleoside triphosphates have the following stereoconfiguration at the α-phosphoramidate:
4. The composition of any one of the preceding claims, wherein NB is selected from cytosine, thymine, 7-deazaadenine and 7-deazaguanine.
5. The composition of any one of the preceding claims, wherein R1 is attached to position 5 of the nucleobase when the nucleobase is a pyrimidine nucleobase, and to position 7 of the nucleobase when the nucleobase is a purine nucleobase.
6. The composition of any one of the preceding claims, wherein NB has one of the following structures:
7. The composition of any one of the preceding claims, comprising a mixture of four different types of nucleoside triphosphates.
8. The composition of claim 7, wherein the four different types of nucleoside triphosphates comprise four different types of nucleobases.
9. The composition of claim 7 or 8, wherein the four different types of nucleoside triphosphate base pair with guanine, adenine, thymine and cytosine, respectively.
10. The composition of any one of claims 7-9, wherein the four different types of nucleoside triphosphate comprise the four different types of nucleobases of claim 6, respectively.
11. The composition of any one of the preceding claims, wherein R1 comprises or consists of an unsaturated hydrocarbon.
12. The composition of any one of the preceding claims, wherein R1 consists of a hydrocarbon, such as an alkynyl.
13. The composition of any one of the preceding claims, wherein R1 is acyclic.
14. The composition of any one of the preceding claims, wherein R1 is linear.
15. The composition of any one of the preceding claims, wherein R1 comprises 1-20 carbon atoms, such as 1-10 carbon atoms or 5-10 carbon atoms, such as 6 or 8 carbon atoms.
16. The composition of any one of the preceding claims, wherein R1 is a hexa-1-ynyl or octa-1-ynyl group.
17. The composition of any one of the preceding claims, wherein R1 with G1 is an octa-1,7-diynyl or a deca-1,9-diynyl group.
18. The composition of any one of the preceding claims, wherein both R2 are H.
19. The composition of any one of the preceding claims, wherein R3 is H.
20. The composition of any one of the preceding claims, wherein R4 comprises or consists of a saturated hydrocarbon.
21. The composition of any one of the preceding claims, wherein R4 comprises 1-20 carbon atoms, such as 3-15 carbon atoms, 3-10 carbon atoms, such as 4 carbon atoms.
22. The composition of any one of the preceding claims, wherein R4 is acyclic.
23. The composition of any one of the preceding claims, wherein R4 is linear.
24. The composition of any one of the preceding claims, wherein R4 consists of a hydrocarbon.
25. The composition of any one of the preceding claims, wherein R4 is an n-butyl group.
26. The composition of any one of the preceding claims, wherein R4 with G2 is a hex-5-ynyl group.
27. The composition of any one of claims 1-23, wherein R4 comprises or consists of two or more hydrocarbons that are linked by an atom or group of atoms other than carbon, such as a phosphorus atom and/or an oxygen atom.
28. The composition of any one of the preceding claims, wherein the terminal clickable group is a terminal alkyne group or a terminal azide group, preferably a terminal alkyne group.
29. The composition of any one of the preceding claims, wherein G1 and G2 represent the same type of terminal clickable group.
30. The composition of any one of the preceding claims, wherein L1 and L2 independently represent linking groups formed via click reactions.
31. The composition of any one of the preceding claims, wherein each of L1 and L2 is a 1,2,3-triazole.
32. The composition of any one of the preceding claims, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
33. The composition of any one of the preceding claims, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
34. The composition of any one of claims 1-29, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
35. The composition of any one of claims 1-29, wherein at least 80%, preferably at least 90%, such as at least 95%, at least 99%, or 100% of the nucleoside triphosphates have a structure selected from the following structures:
36. A method for generating a complementary strand to a nucleic acid, comprising contacting the nucleic acid with the composition of any one of claims 1-33.
37. The method of claim 36, wherein the nucleic acid is comprised in a library of nucleic acids.
38. The method of claim 36 or 37, wherein the nucleic acid is a DNA.
39. The method of any one of claims 36-38, wherein the composition further comprises a nucleic acid polymerase.
40. The method of any one of claims 36-39, wherein the composition further comprises a buffering agent, such as TrisCl, and/or a polymerase cofactor, such as MnCl2.
41. The method of any one of claims 36-40, comprising hybridizing a primer to the nucleic acid, at the same time as or followed by contacting the nucleic acid with the composition.
42. A method for determining the sequence of a nucleic acid, comprising the following steps in order:
1) Generating a complementary strand to the nucleic acid by the method of any one of claims 36-41;
2) Selectively cleaving the P-N bond within the nucleoside triphosphate to generate an expanded complementary strand;
3) Sequencing the expanded complementary strand,
4) Determining the sequence of the nucleic acid based on the sequence of the expanded complementary strand.
43. The method of claim 42, wherein the P-N bond is selectively cleaved in step 2) under acidic conditions.
44. The method of claim 42 or 43, wherein the expanded complementary strand is sequenced in step 3) by nanopore-based sequencing.
45. The method of claim 44, wherein the nanopore-based sequencing comprises:
(a) providing a chip for nanopore-based sequencing comprising:
(i) an electrochemically resistive barrier disposed over an aperture on a surface of the chip, wherein the barrier separates a cis side from a trans side;
(ii) a nanopore inserted into the barrier, wherein the nanopore has an entrance side on the cis side of the barrier and an exit side on the trans side of the barrier;
(b) contacting the cis side of the barrier with the expanded complementary strand;
(c) applying a voltage across the barrier of the chip to translocate the expanded complementary strand to the trans side;
(d) determining one or more changes in an electrical characteristic of the nanopore associated with occupation of the nanopore by the expanded complementary strand during the translocation; and
(e) determining, based on the one or more changes in the electrical characteristic of the nanopore, a sequence for the expanded complementary strand.
PCT/EP2025/050239 2024-01-12 2025-01-07 Compositions of modified nucleoside triphosphates Pending WO2025149478A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463620346P 2024-01-12 2024-01-12
US63/620,346 2024-01-12

Publications (1)

Publication Number Publication Date
WO2025149478A1 true WO2025149478A1 (en) 2025-07-17

Family

ID=94382074

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2025/050239 Pending WO2025149478A1 (en) 2024-01-12 2025-01-07 Compositions of modified nucleoside triphosphates

Country Status (1)

Country Link
WO (1) WO2025149478A1 (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992002258A1 (en) 1990-07-27 1992-02-20 Isis Pharmaceuticals, Inc. Nuclease resistant, pyrimidine modified oligonucleotides that detect and modulate gene expression
US5432272A (en) 1990-10-09 1995-07-11 Benner; Steven A. Method for incorporating into a DNA or RNA oligonucleotide using nucleotides bearing heterocyclic bases
WO1993010820A1 (en) 1991-11-26 1993-06-10 Gilead Sciences, Inc. Enhanced triple-helix and double-helix formation with oligomers containing modified pyrimidines
WO1994022892A1 (en) 1993-03-30 1994-10-13 Sterling Winthrop Inc. 7-deazapurine modified oligonucleotides
WO1994024144A2 (en) 1993-04-19 1994-10-27 Gilead Sciences, Inc. Enhanced triple-helix and double-helix formation with oligomers containing modified purines
US6150510A (en) 1995-11-06 2000-11-21 Aventis Pharma Deutschland Gmbh Modified oligonucleotides, their preparation and their use
US7939259B2 (en) 2007-06-19 2011-05-10 Stratos Genomics, Inc. High throughput nucleic acid sequencing by expansion
WO2016041877A1 (en) * 2014-09-15 2016-03-24 Medivir Ab Methods for the preparation of diastereomerically pure phosphoramidate prodrugs
WO2016081871A1 (en) 2014-11-20 2016-05-26 Stratos Genomics, Inc. Nulceoside phosphoroamidate esters and derivatives thereof, use and synthesis thereof
WO2017087281A1 (en) 2015-11-16 2017-05-26 Stratos Genomics, Inc. Dp04 polymerase variants
WO2018204707A1 (en) 2017-05-04 2018-11-08 Stratos Genomics Inc. Dp04 polymerase variants
WO2019118372A1 (en) 2017-12-11 2019-06-20 Stratos Genomics, Inc. Dpo4 polymerase variants with improved accuracy
WO2019135975A1 (en) * 2018-01-05 2019-07-11 Stratos Genomics Inc. Enhancement of nucleic acid polymerization by aromatic compounds
EP3735409B1 (en) 2018-01-05 2023-06-21 Stratos Genomics Inc. Enhancement of nucleic acid polymerization by aromatic compounds
WO2020172479A1 (en) 2019-02-21 2020-08-27 Stratos Genomics, Inc. Methods, compositions, and devices for solid-state synthesis of expandable polymers for use in single molecule sequencing
WO2020236526A1 (en) 2019-05-23 2020-11-26 Stratos Genomics, Inc. Translocation control elements, reporter codes, and further means for translocation control for use in nanopore sequencing

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
AUSUBEL ET AL.: "Current protocols in molecular biology", 1993, JOHN WILEY & SONS, INC.
FANTONI ET AL., CHEMICAL REVIEWS, vol. 121, no. 12, 2021, pages 7122 - 7154
HALEMARHAM: "The Harper Collins dictionary of biology", 1991, HARPER PERENNIAL
KLOCKER ET AL., CHEM. SOC. REV., vol. 49, 2020, pages 8749 - 8773
KOZAK ET AL., RUSS. CHEM. REV., vol. 89, no. 3, 2020, pages 281 - 310
MATYUGINA ET AL., RUSS. CHEM. REV., vol. 90, no. 11, 2021, pages 1454 - 1491
SAMBROOK ET AL.: "Practical Handbook of Biochemistry and Molecular Biology", 1989, COLD SPRING HARBOR LABORATORY PRESS, pages: 385 - 394
SINGLETON ET AL.: "Dictionary of microbiology and molecular biology", 1994, JOHN WILEY AND SONS
WALKERCOX: "The Language of Biotechnology: A Dictionary of Terms", 1988, AMERICAN CHEMICAL SOCIETY


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25700611

Country of ref document: EP

Kind code of ref document: A1