WO2025212654A1 - Library molecule titration for tunable surface density in polony sequencing
- Publication number
- WO2025212654A1 (PCT/US2025/022547)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequencing
- batch
- sequence
- template molecules
- molecules
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
Definitions
- the present disclosure provides compositions, apparatus and methods for conducting separate batches of nucleic acid sequencing on a support.
- the separate batches of sequencing can be performed on a support comprising a plurality of nucleic acid template molecules immobilized to the support at high density.
- Massively parallel sequencing methods have applications in biomedical research and healthcare settings because they allow large quantities of biological samples to be analyzed.
- the limit of optical resolution impedes the ability to perform highly multiplexed sequencing.
- Current technologies cannot handle large numbers of molecules analyzed at once: crowded molecules produce overlapping signals and images during sequencing, which ultimately increases the cost and time of these methods.
- the disclosure provides a method for nucleic acid sequencing comprising: (a) providing a support comprising a plurality of nucleic acid template molecules immobilized to the support, wherein the plurality of nucleic acid template molecules comprises at least a first and a second sub-population of template molecules, wherein individual template molecules in the first sub-population of template molecules comprise a first batch sequencing primer binding site, a first batch barcode sequence and at least one first sequence-of-interest, wherein individual template molecules in the second sub-population of template molecules comprise a second batch sequencing primer binding site, a second batch barcode sequence and at least one second sequence-of-interest, (b) sequencing the first sub-population of template molecules using a plurality of first batch sequencing primers, thereby generating a plurality of first batch sequencing read products and imaging a region of the support to detect the first batch sequencing read products; and (c) sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers, thereby generating a plurality of second
- the plurality of nucleic acid template molecules immobilized to the support are at a density of about 10² - 10¹⁵ template molecules per mm². In some embodiments, the plurality of nucleic acid template molecules are immobilized to the support at a high density. In some embodiments, at least some individual template molecules of the first and second sub-populations of template molecules comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the support lacks partitions and/or barriers that separate regions of the support. In some embodiments, the plurality of template molecules are immobilized to the support at random and non-determined positions on the support. In some embodiments, the plurality of template molecules are immobilized to the support at pre-determined positions on the support (e.g., a patterned support).
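To see why sequencing a dense support in separate batches can ease optical crowding, the sketch below estimates the fraction of templates that have a neighbor closer than the optical resolution. It is an illustration only: it assumes templates are placed as a homogeneous two-dimensional Poisson process and that the total density is split evenly across batches, and the function names, the density of 10^6 per mm², and the 0.5 µm resolution are hypothetical values chosen for the example, not figures from the disclosure.

```python
import math


def crowded_fraction(density_per_mm2: float, resolution_um: float) -> float:
    """Fraction of templates with at least one neighbor within the resolution distance.

    Assumption: templates follow a homogeneous 2-D Poisson point process, so the
    chance that a disc of radius r around a template is empty is exp(-density * pi * r^2).
    """
    r_mm = resolution_um / 1000.0  # micrometres to millimetres
    return 1.0 - math.exp(-density_per_mm2 * math.pi * r_mm ** 2)


def per_batch_density(total_density_per_mm2: float, n_batches: int) -> float:
    """Density of templates that emit signal in one batch (even split assumed)."""
    return total_density_per_mm2 / n_batches


total_density = 1e6   # templates per mm^2 (hypothetical)
resolution = 0.5      # micrometres (hypothetical)

for n_batches in (1, 2, 4):
    active = per_batch_density(total_density, n_batches)
    frac = crowded_fraction(active, resolution)
    print(f"{n_batches} batch(es): {frac:.3f} of active templates are crowded")
```

Under these assumptions the crowded fraction falls as the same physical density is read out in more batches, because only one sub-population generates signal at a time.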
- the plurality of nucleic acid template molecules comprises concatemer template molecules comprising at least a first and second sub-population of concatemer template molecules.
- the first batch sequencing read products comprise: the first batch barcode sequence; or the first batch barcode sequence and the first sequence of interest.
- the second batch sequencing read products comprise: the second batch barcode sequence; or the second batch barcode sequence and the second sequence of interest.
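The batch-wise selectivity described above can be sketched in a few lines. This is a minimal illustration, not the disclosed implementation: the `Template` class, the `sequence_batch` function, the toy sequences, and the simplification that a read starts immediately after the primer binding site on the same strand are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Template:
    """Toy template: primer binding site, then batch barcode, then sequence of interest."""
    primer_site: str  # batch sequencing primer binding site
    barcode: str      # batch barcode sequence
    insert: str       # sequence of interest


def sequence_batch(templates: List[Template], primer_site: str, cycles: int) -> List[str]:
    """Return reads only from templates whose binding site matches the batch primer.

    Each read covers the first `cycles` bases downstream of the binding site,
    so a short read holds only the barcode and a longer read also reaches the insert.
    """
    reads = []
    for t in templates:
        if t.primer_site != primer_site:
            continue  # templates of other batches stay silent in this batch
        downstream = t.barcode + t.insert
        reads.append(downstream[:cycles])
    return reads


support = [
    Template("ACGTAC", "TTAG", "GGCCA"),  # first sub-population (hypothetical)
    Template("TGCATG", "CCGA", "ATATC"),  # second sub-population (hypothetical)
]

print(sequence_batch(support, "ACGTAC", cycles=4))  # barcode-only reads, batch 1
print(sequence_batch(support, "TGCATG", cycles=9))  # barcode plus insert, batch 2
```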
- the first plurality of sequencing read products of step (c) comprises: a first seeding batch barcode sequence; or a first seeding batch barcode sequence and a first sequence of interest.
- second individual circularized library molecules in the second plurality of circularized library molecules comprise a second seeding batch sequencing primer binding site, a second seeding batch barcode sequence, and a second sequence of interest.
- the second plurality of sequencing read products of step (e) comprises: a second seeding batch barcode sequence; or a second seeding batch barcode sequence and a second sequence of interest.
- the sequencing of at least the subset of the second plurality of concatemer template molecules of step (e) comprises: Step (e1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second plurality of concatemer template molecules to generate a second plurality of sequencing read products that are up to 1000 bases in length; Step (e2): stopping and/or blocking the short read sequencing of step (e1); Step (e3): removing the second plurality of sequencing read products and retaining the second plurality of immobilized concatemer template molecules; and optionally Step (e4): repeating steps (e1) - (e3) at least once.
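The stop, strip, and repeat cycle of the short read steps can be pictured as a simple loop. The sketch below is a hedged illustration under strong simplifications: the function name `reiterative_sequencing` is hypothetical, the template is a plain string, each cycle adds exactly one base, and the read is taken from the same strand rather than its complement.

```python
from typing import List

MAX_CYCLES = 1000  # upper bound on cycles per round stated in the text


def reiterative_sequencing(template: str, cycles: int, rounds: int) -> List[str]:
    """Model repeated short reads of one immobilized template.

    Per round: extend for `cycles` bases, stop, strip the read product, and
    keep the template in place so the next round starts from the same position.
    """
    if not 0 < cycles <= MAX_CYCLES:
        raise ValueError(f"cycles must be between 1 and {MAX_CYCLES}")
    if rounds < 1:
        raise ValueError("at least one round is required")

    reads = []
    for _ in range(rounds):
        read = template[:cycles]  # conduct up to `cycles` sequencing cycles
        # stop or block further extension: nothing beyond `cycles` is read
        reads.append(read)        # record the product, then strip it from the template
        # the template itself is untouched, so the loop can repeat the read
    return reads


print(reiterative_sequencing("ACGTACGTAC", cycles=4, rounds=3))
```

In this idealized model every round returns the same read because the template is retained unchanged between rounds.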
- the plurality of surface capture primers immobilized to the support are at a density of about 10² - 10¹⁵ capture primers per mm².
- at least some of the surface capture primers comprise nearest neighbor surface capture primers that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
- the support lacks partitions and/or barriers that separate regions of the support.
- FIG. 1 is a schematic of various exemplary configurations of multivalent molecules.
- Left (Class I): schematics of multivalent molecules having a “starburst” or “helter-skelter” configuration.
- Center (Class II): a schematic of a multivalent molecule having a dendrimer configuration.
- Right (Class III): a schematic of multiple multivalent molecules formed by reacting streptavidin with 4-arm or 8-arm PEG-NHS with biotin and dNTPs. Nucleotide units are designated ‘N’, biotin is designated ‘B’, and streptavidin is designated ‘SA’.
- FIG. 2 is a schematic of an exemplary multivalent molecule comprising a generic core attached to a plurality of nucleotide-arms.
- FIG. 3 is a schematic of an exemplary multivalent molecule comprising a dendrimer core attached to a plurality of nucleotide-arms.
- FIG. 4 is a schematic of an exemplary multivalent molecule comprising a core attached to a plurality of nucleotide-arms, where the nucleotide-arms comprise biotin, spacer, linker and a nucleotide unit.
- FIG. 5 is a schematic of an exemplary nucleotide-arm comprising a core attachment moiety, spacer, linker and nucleotide unit.
- FIG. 6 shows the chemical structure of an exemplary spacer (top), and the chemical structures of various exemplary linkers, including an 11-atom Linker, 16-atom Linker, 23-atom Linker, and an N3 Linker (bottom).
- FIG. 7 shows the chemical structures of various exemplary linkers, including Linkers 1-9.
- FIG. 10 shows the chemical structures of various exemplary linkers joined/attached to nucleotide units.
- FIG. 13 is a schematic of an exemplary intramolecular G-quadruplex structure.
- FIG. 14A is a pair of schematics, (i) and (ii), of an exemplary support having a plurality of nucleic acid capture primers arranged on the support in a non-predetermined and random manner.
- the capture primers can be attached to the support such that some of the nearest neighbor capture primers touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
- the dotted lines that surround the four capture primers represent nearest neighbor capture primers that touch each other.
- nucleic acid template molecule having one of four different batch sequences.
- the different batch sequences of the template molecules are represented by horizontal stripes, vertical dashed, brick, or solid black.
- the template molecules can attach to the support (e.g., via attachment to the capture primers) such that some of the nearest neighbor template molecules touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
- the dotted lines that surround the four template molecules represent nearest neighbor template molecules that touch each other.
- FIG. 14B is a pair of schematics, (iii) and (iv), of an exemplary support having a plurality of nucleic acid template molecules immobilized to the support (e.g., via attachment to the capture primers) where the template molecules are arranged on the support in a predetermined manner.
- the template molecules comprise one of four different batch sequences.
- the different batch sequences of the template molecules are represented by horizontal stripes, vertical dashed, brick, or solid black.
- the template molecules can be immobilized to the support to form spots arranged in rows and columns (iii), or the template molecules can be immobilized to the support to form stripes (iv).
- FIG. 14C is a schematic of an exemplary low binding support comprising a glass substrate and alternating layers of hydrophilic coatings which are covalently or non-covalently adhered to the glass, and which further comprises chemically reactive functional groups that serve as attachment sites for oligonucleotide primers (e.g., capture oligonucleotides).
- the support can be made of any material such as glass, plastic, or a polymer material.
- FIG. 15A is a schematic showing an exemplary workflow for generating circularized padlock probes, comprising hybridizing first and second target-specific padlock probes to the first and second target molecules (respectively) to generate first (left schematic) and second (right schematic) circularized padlock probes (respectively) having a nick or gap, and closing the nick or gap to generate circularized padlock probes.
- the first padlock probe (left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence), which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer (also referred to herein as a “batch sequencing primer”) binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iii) a capture primer binding site; and (iv) a compaction oligonucleotide binding site.
- the first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows), where the first sequencing read products include the first batch barcode sequence (Batch BC-1).
- the first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the second concatemer template molecules do not undergo first batch sequencing.
- the first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length.
- the first and second concatemer template molecules can be subjected to a second sequencing workflow using second batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows), where the second sequencing read products include the second batch barcode sequence (Batch BC-2).
- the second concatemers can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the first concatemers do not undergo second batch sequencing.
- the second sequencing read products from the second concatemers can be up to 1000 bases in length.
- FIG. 16 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing.
- the RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
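Rolling circle amplification copies a circular template repeatedly, so its product is a single strand made of tandem repeats complementary to the circle. The sketch below shows that relationship on a toy sequence. The function names are hypothetical, the point at which the circle is written out as a linear string is arbitrary, and the copy number is passed in directly rather than derived from reaction conditions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def rolling_circle_product(circle: str, copies: int) -> str:
    """Return a concatemer of `copies` tandem repeats complementary to the circle.

    `circle` is the circular molecule written 5' to 3' from an arbitrary start.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return reverse_complement(circle) * copies


# Toy circularized probe (hypothetical sequence) amplified to three tandem copies.
concatemer = rolling_circle_product("AACG", copies=3)
print(concatemer)
```

Because every repeat carries the complement of each element in the circle, one concatemer presents many copies of the same batch sequencing primer binding site and batch barcode.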
- the first circularized padlock probe (Left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site.
- the first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the second concatemer template molecules do not undergo first batch sequencing.
- the first sequencing read products from the first concatemers can be up to 1000 bases in length.
- the first and second concatemer template molecules can be subjected to a second sequencing workflow using second batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows), where the second sequencing read products include the second batch barcode sequence (Batch BC-2).
- the second concatemers can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the first concatemers do not undergo second batch sequencing.
- FIG. 17 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing.
- the RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
- the circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support.
- the first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support).
- the first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows), where the first sequencing read products include the first batch barcode sequence (Batch BC-1).
- FIG. 18 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing.
- the RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
- the first circularized padlock probe (Left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first and second target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site.
- the circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support.
- the first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first and second sequencing read products (dashed arrows).
- the first sequencing read products can include the first batch barcode sequence (Batch BC-
- the second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles.
- the second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
- FIG. 19 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing.
- the RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
- the first circularized padlock probe (Left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first and second target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first and second target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site.
- the circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support.
- the first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first and second sequencing read products (dashed arrows).
- the first sequencing read products include the first batch barcode sequence (Batch BC-1) and at least a portion of the first target sequence.
- the first circularized padlock probe (Left schematic) can comprise: (i) a first sample index which distinguishes sequences of interest obtained from a first sample source (e.g., Sample index-1); (ii) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (iii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iv) a first batch capture primer binding site; and (v) a compaction oligonucleotide binding site.
- the second circularized padlock probe can comprise: (i) a second sample index which distinguishes sequences of interest obtained from a second sample source (e.g., Sample index-2); (ii) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (iii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iv) a first batch capture primer binding site; and (v) a compaction oligonucleotide binding site.
- the first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first and second sequencing read products (dashed arrows).
- the first sequencing read products can include the first batch barcode sequence (Batch BC-1) and the first sample index sequence.
- the first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles.
- the first sequencing read products from the first concatemer can be up to 1000 bases in length.
- the second sequencing read products can include the second batch barcode sequence (Batch BC-2) and the second sample index sequence.
- the second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles.
- the second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
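Since read products can carry both a batch barcode and a sample index, reads can be binned on those two keys. The sketch below is one way to do that and is not taken from the disclosure: it assumes each read begins with the batch barcode followed directly by the sample index, requires exact matches, and uses hypothetical barcode and index sequences and a hypothetical `demultiplex` function.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Key = Tuple[Optional[str], Optional[str]]


def demultiplex(reads: List[str],
                batch_barcodes: Dict[str, str],
                sample_indexes: Dict[str, str]) -> Dict[Key, List[str]]:
    """Group reads by (batch, sample); unmatched parts are keyed as None.

    Assumed read layout: batch barcode, then sample index, then remaining bases.
    """
    bins: Dict[Key, List[str]] = defaultdict(list)
    for read in reads:
        for barcode, batch in batch_barcodes.items():
            if not read.startswith(barcode):
                continue
            rest = read[len(barcode):]
            for index, sample in sample_indexes.items():
                if rest.startswith(index):
                    bins[(batch, sample)].append(rest[len(index):])
                    break
            else:
                bins[(batch, None)].append(rest)  # batch known, sample unknown
            break
        else:
            bins[(None, None)].append(read)       # no batch barcode recognized
    return dict(bins)


barcodes = {"TTAG": "batch-1", "CCGA": "batch-2"}  # hypothetical
indexes = {"AAC": "sample-1", "GGT": "sample-2"}   # hypothetical
result = demultiplex(["TTAGAACGATTACA", "CCGAGGTCCATG", "NNNN"], barcodes, indexes)
for key, grouped in result.items():
    print(key, grouped)
```

Exact prefix matching keeps the example short; a practical pipeline would normally tolerate sequencing errors in the barcode and index, which is outside the scope of this sketch.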
- the exemplary library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific surface pinning primer binding site sequence); an optional left unique identification sequence (180) (e.g., UMI); a left sample index sequence (160); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence).
- the single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the surface pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the linear single-stranded library molecule (100).
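The relationship between the library molecule's terminal sites and the two splint regions can be checked with a short sketch. It is illustrative only: the element order follows the list given for library molecule (100), but the toy sequences, the function names, and the placement of region (210) at the 5' end of the splint are assumptions made for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_library(pinning_site: str, left_index: str, fwd_site: str, insert: str,
                  rev_site: str, right_index: str, capture_site: str,
                  umi: str = "") -> str:
    """Join the elements 5' to 3': 120, optional 180, 160, 140, 110, 150, 170, 130."""
    return "".join([pinning_site, umi, left_index, fwd_site,
                    insert, rev_site, right_index, capture_site])


def design_splint(pinning_site: str, capture_site: str) -> str:
    """Return a splint whose regions pair with the two ends of the linear library."""
    region_210 = reverse_complement(pinning_site)  # pairs with site (120)
    region_220 = reverse_complement(capture_site)  # pairs with site (130)
    return region_210 + region_220


pin, cap = "ACGGT", "TTCAG"  # hypothetical terminal sites
library = build_library(pin, "AAC", "GGATC", "GATTACA", "CTAGG", "GGT", cap)
splint = design_splint(pin, cap)

# When the ends meet, the junction reads capture site then pinning site (5' to 3').
junction = library[-len(cap):] + library[:len(pin)]
print(splint, reverse_complement(splint) == junction)
```

The final comparison confirms that the splint is fully complementary to the junction formed when the 3' capture site is brought next to the 5' pinning site, which leaves a single nick at the junction for ligation.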
- FIG. 22 is a schematic of an exemplary workflow of a linear single stranded library molecule (100) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a library-splint complex (300) with a nick which is enzymatically ligatable.
- the exemplary linear single stranded library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific surface pinning primer binding site sequence); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a batch barcode sequence (195); a left sample index sequence (160); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence).
- the single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the surface pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the linear single-stranded library molecule (100).
- the exemplary second linear single stranded library molecule (100-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2); a second batch barcode sequence (195-2); a first sample index sequence (160-1); a second sequence of interest (insert-2, 110-2); and a first surface capture primer binding site sequence (130-1).
- the single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the first surface pinning primer binding site sequence (120-1) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the first surface capture primer binding site sequence (130-1) of the linear single-stranded library molecule (100).
- the first sequence of interest in the library-splint complex shown in FIG. 23A (110-1) and the second sequence of interest in the library-splint complex shown in FIG. 23B (110-2) can have the same sequence or different sequences.
- FIG. 24B is a schematic of an exemplary workflow in which the nick in the second library-splint complex (300-2) shown in FIG. 23B is ligated to generate a second covalently closed circular library molecule (400-2) which is shown in FIG. 24B.
- the second covalently closed circular library molecule (400-2) is subjected to rolling circle amplification (RCA) to generate a second concatemer template molecule, and the second concatemer template molecule is subjected to batch reiterative sequencing.
- the rolling circle amplification reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
- the second covalently closed circular library molecule (400-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2) which corresponds with the second sequence of interest (insert-2, 110-2); a second batch barcode sequence (195-2) which corresponds with the second sequence of interest (110-2); a first sample index sequence (160-1); a second sequence of interest (110-2); and a first surface capture primer binding site sequence (130-1).
- the second concatemer template molecules are not sequenced when first batch sequencing primers are used to sequence the first concatemer template molecules.
- the first concatemer template molecules are not sequenced when second batch sequencing primers are used to sequence the second concatemer template molecules.
- the second sequencing read products can include the second batch barcode sequence (195-2) as shown in FIG. 24B.
- the second sequencing read products can include the second batch barcode sequence (195-2) and the first sample index sequence (160-1) (not shown).
- the second sequencing read products include the second batch barcode sequence (195-2), the first sample index sequence (160-1), and at least a portion of the second sequence of interest (110-2) (not shown).
- the second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles.
- the second sequencing read products from the second concatemer can be up to 1000 bases in length.
- the exemplary first linear single stranded library molecule (100-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1); a first batch barcode sequence (195-1); a first sequence of interest (insert-1, 110-1); and a first surface capture primer binding site sequence (130-1).
- the single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the first surface pinning primer binding site sequence (120-1) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the first surface capture primer binding site sequence (130-1) of the first linear single-stranded library molecule (100-1).
- FIG. 25B is a schematic of an exemplary workflow of a second single-stranded library molecule (100-2) (linear single-stranded library molecule-2) hybridizing with a single-stranded splint molecule/strand (ss-splint strand) (200) thereby circularizing the library molecule to form a second library-splint complex (300-2) with a nick which is enzymatically ligatable.
- FIG. 26B is a schematic of an exemplary workflow in which the nick in the library-splint complex (300-2) shown in FIG. 25B is ligated to generate a second covalently closed circular library molecule (400-2) which is shown in FIG. 26B.
- the second covalently closed circular library molecule (400-2) is subjected to rolling circle amplification (RCA) to generate a second concatemer template molecule, and the second concatemer template molecule is subjected to batch reiterative sequencing.
- the RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
- a plurality of the first covalently closed circular library molecule (400-1) shown in FIG. 26A and a plurality of the second covalently closed circular library molecule (400-2) shown in FIG. 26B can be distributed onto the same support.
- the first covalently closed circular library molecules (400-1) shown in FIG. 26A and the second covalently closed circular library molecules (400-2) shown in FIG. 26B can be distributed onto the support essentially simultaneously.
- the first covalently closed circular library molecules (400-1) shown in FIG. 26A and the second covalently closed circular library molecules (400-2) shown in FIG. 26B can be distributed onto the support sequentially (e.g., re-seeding the support).
- the second covalently closed circular library molecules (400-2) can be subjected to rolling circle amplification (RCA) to generate a plurality of second concatemer template molecules which are immobilized to the support.
- the second concatemer template molecules can be subjected to a sequencing workflow using second batch sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows).
- the second concatemer template molecules are not sequenced when first batch sequencing primers are used to sequence the first concatemer template molecules.
- the first concatemer template molecules are not sequenced when second batch sequencing primers are used to sequence the second concatemer template molecules.
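The mutual exclusivity of the two batches can be sketched as a simple selection step. This is a simplified model that treats primer hybridization as an exact label match; the class, field, and label names are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """An immobilized concatemer template (hypothetical model)."""
    name: str
    batch_primer_site: str  # label of its batch-specific sequencing primer binding site


def templates_sequenced(templates, batch_primer):
    """Return only the templates whose primer binding site matches the primer used."""
    return [t for t in templates if t.batch_primer_site == batch_primer]


# Both sub-populations share one support.
support = [
    Template("concatemer-1a", "batch-1"),
    Template("concatemer-2a", "batch-2"),
    Template("concatemer-1b", "batch-1"),
]

first_run = templates_sequenced(support, "batch-1")   # second batch stays dark
second_run = templates_sequenced(support, "batch-2")  # first batch stays dark
```

Each run images the same region of the support, but only the templates that match the primer in use contribute signal, so fewer spots compete in any one image.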
- the exemplary linear single stranded library molecule (100) can comprise: a pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); an optional left unique identification sequence (180) (e.g., UMI); a left sample index sequence (160); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence).
- FIG. 28 is a schematic of an exemplary workflow of a linear single-stranded library molecule (100) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the library molecule to form a library-splint complex (800) with two nicks (solid arrowheads).
- the exemplary linear single-stranded library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a batch-specific barcode sequence (195); a left sample index sequence (160); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence).
- the first region (620) of the first splint strand (600) can hybridize to at least a portion of the surface pinning primer binding site sequence (120) of a linear single stranded nucleic acid library molecule (100), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the surface capture primer binding site sequence (130) of the same single-stranded nucleic acid library molecule (100).
- the exemplary library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a batch barcode sequence (195); a left sample index sequence (160); a sequence of interest (110); and a surface capture primer binding site sequence (130) (e.g., batch-specific surface capture primer binding site sequence).
- the double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700).
- the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700).
- the second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion.
- the first region (620) of the first splint strand (600) can hybridize to at least a portion of the surface pinning primer binding site sequence (120) of a linear single-stranded library molecule (100), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the surface capture primer binding site sequence (130) of the same linear single-stranded library molecule (100).
- FIG. 30A is a schematic of an exemplary workflow of a first linear single-stranded library molecule (100-1) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the first linear single-stranded library molecule to form a first library-splint complex (800-1) with two nicks (solid arrowheads) that are enzymatically ligatable.
- the exemplary first linear single stranded library molecule (100-1) can comprise: a first pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1); a first batch barcode sequence (195-1); a first sequence of interest (insert-1, 110-1); and a first surface capture primer binding site sequence (130-1).
- the double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700).
- the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700).
- the second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion.
- the first region (620) of the first splint strand (600) can hybridize to at least a portion of the first pinning primer binding site sequence (120-1) of a linear single-stranded library molecule (100-1), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the first surface capture primer binding site sequence (130-1) of the same linear single-stranded library molecule (100-1).
- FIG. 30B is a schematic of an exemplary workflow of a second linear single-stranded library molecule (100-2) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the library molecule to form a second library-splint complex (800-2) with two nicks (solid arrowheads) that are enzymatically ligatable.
- the exemplary second linear single-stranded library molecule (100-2) can comprise: a first pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2); a second batch barcode sequence (195-2); a second sequence of interest (insert-2, 110-2); and a first surface capture primer binding site sequence (130-1).
- the double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700).
- the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700).
- the second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion.
- FIG. 31A is a schematic of an exemplary workflow in which the two nicks in the first library-splint complex (800-1) shown in FIG. 30A are ligated to generate a first covalently closed circular library molecule (900-1) which is shown in FIG. 31A.
- the first covalently closed circular library molecule (900-1) is subjected to rolling circle amplification (RCA) to generate a first concatemer template molecule, and the first concatemer template molecule is subjected to batch reiterative sequencing.
- the RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support.
- the first covalently closed circular library molecule (900-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1) which corresponds with the first sequence of interest (insert-1, 110-1); a first batch barcode sequence (195-1) which corresponds with the first sequence of interest (110-1); a first sequence of interest (110-1); and a first surface capture primer binding site sequence (130-1).
- the first covalently closed circular library molecule (900-1) can further comprise a second splint strand (700) from the double-stranded adaptor shown in FIG. 30A.
- the first covalently closed circular library molecules (900-1) shown in FIG. 31A can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first surface capture primer binding site sequence (130-1) in the first covalently closed circular library molecules (900-1).
- the first covalently closed circular library molecules (900-1) can be subjected to rolling circle amplification (RCA) to generate a plurality of first concatemer template molecules which are immobilized to the support.
- the first concatemer template molecules can be subjected to a sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows).
- the first sequencing read products can include the first batch barcode sequence (195-1) as shown in FIG. 31A.
- the first sequencing read products can include the first batch barcode sequence (195-1) and at least a portion of the first sequence of interest (110-1) (not shown).
- the first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles.
- the first sequencing read products from the first concatemer template molecule can be up to 1000 bases in length.
- the second covalently closed circular library molecule (900-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2) which corresponds with the second sequence of interest (110-2); a second batch barcode sequence (195-2) which corresponds with the second sequence of interest (insert-2, 110-2); a second sequence of interest (110-2); and a first surface capture primer binding site sequence (130-1).
- the second covalently closed circular library molecule (900-2) can further comprise a second splint strand (700) from the double-stranded adaptor shown in FIG. 30B.
- the second covalently closed circular library molecules (900-2) shown in FIG. 31B can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first surface capture primer binding site sequence (130-1) in the second covalently closed circular library molecules (900-2).
- a plurality of the first covalently closed circular library molecule (900-1) shown in FIG. 31A and a plurality of the second covalently closed circular library molecule (900-2) shown in FIG. 31B are distributed onto the same support.
- the first covalently closed circular library molecules (900-1) shown in FIG. 31A and the second covalently closed circular library molecules (900-2) shown in FIG. 31B can be distributed onto the support essentially simultaneously.
- the second covalently closed circular library molecules (900-2) shown in FIG. 31B can be distributed onto the support sequentially (e.g., re-seeding the support).
- the second covalently closed circular library molecules (900-2) can be subjected to rolling circle amplification (RCA) to generate a plurality of second concatemer template molecules which are immobilized to the support.
- the second concatemer template molecules can be subjected to a sequencing workflow using second batch sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows).
- the second concatemer template molecules are not sequenced when first batch sequencing primers are used to sequence the first concatemer template molecules.
- the first concatemer template molecules are not sequenced when second batch sequencing primers are used to sequence the second concatemer template molecules.
- the second sequencing read products can include the second batch barcode sequence (195-2) as shown in FIG. 31B.
- the second sequencing read products include the second batch barcode sequence (195-2) and at least a portion of the second sequence of interest (110-2) (not shown).
- the second concatemer template molecules undergo reiterative sequencing comprising up to 1000 sequencing cycles.
- the second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
- FIG. 32 is a schematic showing an exemplary linear single-stranded library molecule (100) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a library-splint complex (300) with a nick.
- the linear single stranded library molecule (100) can comprise: a first left junction adaptor sequence (121); an adaptor sequence for a surface pinning primer binding site sequence (120); a second left junction adaptor sequence (125); a left sample index sequence (160); a third left junction adaptor sequence (165); an adaptor sequence for a forward sequencing primer binding site sequence (140); a fourth left junction adaptor sequence (145); a sequence of interest (e.g., an insert (110)); a fourth right junction adaptor sequence (155); an adaptor sequence for a reverse sequencing primer binding site sequence (150); a third right junction adaptor sequence (175); a right sample index sequence (170); a second right junction adaptor sequence (135); an adaptor sequence for a surface capture primer binding site (130); and a first right junction adaptor sequence (131).
- the single-stranded splint strand (200) comprises a second region (220) that hybridizes with the other end (e.g., right end or 3’ end) of the linear single stranded library molecule (100) including at least a portion of the adaptor sequence for a surface capture primer binding site (130) and/or at least a portion of the first right junction adaptor sequence (131).
- the library-splint complex (300) does not show any of the junction adaptors.
- the library-splint complex (300) can include any one or any combination of two or more of the junction adaptors that are present in the linear single stranded library molecule (100).
- FIG. 33 is a schematic showing an exemplary linear single-stranded library molecule (100) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the library molecule to form a library-splint complex (800) with two nicks (solid arrowheads).
- the double-stranded splint adaptor (500) comprises a first splint strand (600) having a first region (620) that hybridizes with one end (e.g., left end or 5’ end) of the linear single stranded library molecule (100) including at least a portion of the adaptor sequence for a surface pinning primer binding site sequence (120) and/or at least a portion of the first left junction adaptor sequence (121).
- the double-stranded splint adaptor (500) comprises a first splint strand (600) having a second region (630) that hybridizes with the other end (e.g., right end or 3’ end) of the linear single stranded library molecule (100) including at least a portion of the adaptor sequence for a surface capture primer binding site sequence (130) and/or at least a portion of the first right junction adaptor sequence (131).
- the library-splint complex (300) does not show any of the junction adaptors.
- the library-splint complex (300) can include any one or any combination of two or more of the junction adaptors that are present in the linear single stranded library molecule (100).
- FIG. 34 shows sequencing images of polonies (e.g., DNA nanoballs) immobilized on a support at high density (top) and a table summarizing read count, Q30 scores and percent error (bottom).
- the support was loaded with 20 picomolar (pM) of a 1:1 mixture of covalently closed circular library molecules generated from either single-stranded splint strands (right) or double-stranded splints (left).
- the loaded covalently closed circular library molecules were subjected to rolling circle amplification to generate immobilized concatemer template molecules.
- first batch sequencing primers e.g., TruSeq sequencing primers; SEQ ID NO: 2
- second batch sequencing primers e.g., ss-Splint sequencing primers, e.g., SEQ ID NO: 1
- Other loading concentrations were tested including 30 pM and 40 pM.
- FIG. 35A is a bar graph showing the pass filter count (PF Count, in millions (M)) from an experiment conducted to determine the density of immobilized polonies using 8-plex batch sequencing primers.
- FIG. 35B is a Table listing the estimated loading concentrations (extrapolated pM) of the libraries corresponding to the number of batch sequencing primers used. The Table in FIG. 35B corresponds to the bar graph shown in FIG. 35A.
- FIG. 36A is a bar graph showing the percent pass filter from an experiment conducted to determine the density of immobilized polonies using 8-plex batch sequencing primers.
- FIG. 36B is a Table listing the estimated loading concentrations (extrapolated pM) of the libraries corresponding to the number of batch sequencing primers used.
- the Table in FIG. 36B corresponds to the bar graph shown in FIG. 36A.
- FIG. 37A is a bar graph showing the %Q30 from an experiment conducted to determine the density of immobilized polonies using 8-plex batch sequencing primers.
- FIG. 37B is a Table listing the estimated loading concentrations (extrapolated pM) of the libraries corresponding to the number of batch sequencing primers used.
- the Table in FIG. 37B corresponds to the bar graph shown in FIG. 37A.
- FIG. 38 is a graph showing the nucleotide base diversity (A, T, C, or G) of a right sample index sequence (170) which includes a universal right sample index and a 3-mer random sequence (NNN).
- the graph shows a nucleotide diversity of the 3-mer random sequence (NNN) of approximately 30% for A and T base calls, and approximately 20% for C and G base calls.
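A base-diversity figure of this kind can be computed from index reads as sketched below. The reads, the position of the 3-mer within the read, and the function name are invented for illustration; the sketch pools the three NNN positions, whereas the graph in FIG. 38 may report diversity per position.

```python
from collections import Counter


def base_fractions(reads, start, length):
    """Fraction of A, C, G and T calls pooled over read[start:start+length]."""
    counts = Counter()
    for read in reads:
        counts.update(read[start:start + length])
    total = sum(counts[base] for base in "ACGT")
    return {base: counts[base] / total for base in "ACGT"}


# Toy index reads: a fixed 4-base index followed by a random 3-mer (NNN).
reads = ["GATCAAT", "GATCTCA", "GATCGTA", "GATCATC", "GATCTAG"]
fractions = base_fractions(reads, start=4, length=3)
```

A perfectly balanced random 3-mer would give 25% for each base, so values near 30% for A and T and 20% for C and G indicate a modest AT bias in the random region.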
- Batch-specific sequencing enables sequencing of a desired subset (e.g., a batch) of the template molecules immobilized to the same flow cell using selected batch-specific sequencing primers, which reduces over-crowding of the signals and images generated during sequencing.
- the use of batch-specific sequencing primers produces optical images that are intense and resolvable.
- the batch-specific sequencing methods described herein have many uses. For example, the number of spots that are imaged and associated with sequencing can be counted. The counted spots can be used as a measure for target nucleic acid levels in a sample.
- the present disclosure provides compositions, apparatus and methods for conducting separate sequencing batches on a support having nucleic acid template molecules immobilized thereon, where the separate sequencing batches can be conducted using any massively parallel sequencing technology.
- a plurality of subpopulations of nucleic acid template molecules are immobilized to the support including at least a first and second sub-population.
- the first sub-population of template molecules undergo first sequencing reactions (e.g., first batch sequencing) and a region of the support is imaged to detect the first sequencing reactions, wherein the second sub-population of template molecules do not undergo sequencing reactions.
- the second sub-population of template molecules undergo second sequencing reactions (e.g., second batch sequencing) and the same region of the support is imaged to detect the second sequencing reactions, wherein the first sub-population of template molecules do not undergo sequencing reactions.
- the first and second sub-populations of template molecules undergo batch sequencing.
- the present disclosure also provides compositions, apparatus, and methods for conducting massively parallel sequencing methods using concatemerized template molecules that are generated by rolling circle amplification.
- the concatemer template molecules contain multiple copies of the target sequences and unique barcode sequences and sequencing primer binding sequences associated with the target sequences. Use of the concatemer template molecules increases the accuracy of the sequencing.
- the methods described herein employ batch sequencing on high density immobilized template molecules which offers the advantage of maximizing space on a support (e.g., a flow cell). Furthermore, the same seeded support can be re-used by re-seeding the support with additional template molecules and conducting additional sequencing reactions on the re-seeded template molecules.
- Batch sequencing can be conducted using template molecules arranged in a predetermined manner on the support (e.g., a patterned support). Alternatively, batch sequencing can be conducted using template molecules arranged in a random manner on the support which obviates the need to fabricate a support having organized and pre-determined features for attaching template molecules (e.g., fabrication via lithography is not needed).
- By conducting short sequencing reads of the batch barcode regions of the template molecules, batch sequencing also significantly reduces sequencing run times, reagent use, and reagent costs.
- Batch sequencing also offers the flexibility of re-seeding the support any time between sequencing different batches, or an ongoing sequencing batch can be interrupted to permit re-seeding then the ongoing batch sequencing can be resumed.
- the ability to re-seed the support any time increases throughput and efficiency.
- the terms “about” and “approximately” refer to a value or composition that is within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined, i.e., the limitations of the measurement system.
- “about” or “approximately” can mean within one or more than one standard deviation per the practice in the art.
- “about” or “approximately” can mean a range of up to 10% (i.e., ±10%) or more depending on the limitations of the measurement system.
- about 5 mg can include any number between 4.5 mg and 5.5 mg.
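The 5 mg example follows directly from the ±10% reading of “about”. A minimal sketch of the arithmetic, with a hypothetical function name and the 10% tolerance taken as the default:

```python
def about_range(value, tolerance=0.10):
    """Return (low, high) bounds for `value` plus or minus a fractional tolerance."""
    delta = abs(value) * tolerance
    return value - delta, value + delta


low, high = about_range(5.0)  # "about 5 mg" with a 10% tolerance
```

Here 10% of 5 mg is 0.5 mg, which gives the 4.5 mg to 5.5 mg interval stated above.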
- “corresponding to” or “corresponds to” refers to two or more entities whose identities are sufficiently related such that the identity of one entity can be used to determine the identity, position and/or other properties of the other entity.
- a barcode sequence can be said to correspond to a particular sequence of interest if the barcode sequence can be used to determine the identity of the sequence of interest.
- a polymerase comprises one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization can occur.
- a polymerase includes other enzymatic activities, such as for example, 3' to 5' exonuclease activity or 5' to 3' exonuclease activity.
- a polymerase has strand displacing activity.
- a polymerase can be expressed in prokaryote, eukaryote, viral, or phage organisms. In some embodiments, a polymerase can be a post-translationally modified protein or fragment thereof. A polymerase can be derived from a prokaryote, eukaryote, virus or phage. A polymerase comprises DNA-directed DNA polymerase and RNA-directed DNA polymerase.
- As used herein, the term “strand displacing” refers to the ability of a polymerase to locally separate strands of double-stranded nucleic acids and synthesize a new strand in a template-based manner.
- Strand displacing polymerases displace a complementary strand from a template strand and catalyze new strand synthesis.
- Strand displacing polymerases include mesophilic and thermophilic polymerases.
- Strand displacing polymerases include wild type enzymes, and variants including exonuclease minus mutants, mutant versions, chimeric enzymes and truncated enzymes. Examples of strand displacing polymerases include phi29 DNA polymerase, large fragment of Bst DNA polymerase, large fragment of Bsu DNA polymerase (exo-), Bca DNA polymerase (exo-), Klenow fragment of E.
- the phi29 DNA polymerase can be wild type phi29 DNA polymerase (e.g., MagniPhi® from Expedeon), or variant EquiPhi29 DNA polymerase (e.g., from Thermo Fisher Scientific®), or chimeric QualiPhi® DNA polymerase (e.g., from 4basebio®).
- Nucleic acids can be single-stranded or double-stranded. Nucleic acids comprise polymers of nucleotides, where the nucleotides include natural or non-natural bases and/or sugars. Nucleic acids comprise naturally-occurring internucleosidic linkages, for example phosphodiester linkages. Nucleic acids can lack a phosphate group. Nucleic acids comprise non-natural internucleoside linkages, including phosphorothioate, phosphorothiolate, or peptide nucleic acid (PNA) linkages. In some embodiments, nucleic acids comprise one type of polynucleotide or a mixture of two or more different types of polynucleotides.
- “operably linked” and “operably joined” or related terms as used herein refer to juxtaposition of components.
- the juxtapositioned components can be linked together covalently.
- two nucleic acid components can be enzymatically ligated together where the linkage that joins together the two components comprises phosphodiester linkage.
- a first and second nucleic acid component can be linked together, where the first nucleic acid component can confer a function on a second nucleic acid component.
- linkage between a primer binding sequence and a sequence of interest forms a nucleic acid library molecule having a portion that can bind to a primer.
- a transgene (e.g., a nucleic acid encoding a polypeptide or a nucleic acid sequence of interest) can be ligated to a vector where the linkage permits expression or functioning of the transgene sequence contained in the vector.
- a transgene is operably linked to a host cell regulatory sequence (e.g., a promoter sequence) that affects expression of the transgene.
- the vector comprises at least one host cell regulatory sequence, including a promoter sequence, enhancer, transcription and/or translation initiation sequence, transcription and/or translation termination sequence, polypeptide secretion signal sequences, and the like.
- the host cell regulatory sequence controls the level, timing and/or location of expression of the transgene.
- the terms “linked”, “joined”, “attached”, “appended” and variants thereof comprise any type of fusion, bond, adherence or association between any combination of compounds or molecules that is of sufficient stability to withstand use in the particular procedure.
- the procedure can include, but is not limited to: nucleotide binding; nucleotide incorporation; de-blocking (e.g., removal of chain-terminating moiety); washing; removing; flowing; detecting; imaging and/or identifying.
- Such linkage can comprise, for example, covalent, ionic, hydrogen, dipole-dipole, hydrophilic, hydrophobic, or affinity bonding, bonds or associations involving van der Waals forces, mechanical bonding, and the like.
- such linkage occurs intramolecularly, for example linking together the ends of a single-stranded or double-stranded linear nucleic acid molecule to form a circular molecule.
- such linkage can occur between a combination of different molecules, or between a molecule and a non-molecule, including but not limited to: linkage between a nucleic acid molecule and a solid surface; linkage between a protein and a detectable reporter moiety; linkage between a nucleotide and detectable reporter moiety; and the like.
- the 3’ end of the primer can lack a 3’ OH moiety, or can include a terminal 3’ blocking group that inhibits nucleotide polymerization in a polymerase-catalyzed reaction. Any one nucleotide, or more than one nucleotide, along the length of the primer can be labeled with a detectable reporter moiety.
- a primer can be in solution (e.g., a soluble primer) or can be immobilized to a support (e.g., a capture primer).
- “template nucleic acid” refers to a nucleic acid strand that serves as the basis nucleic acid molecule for any of the methods described herein, e.g., sequencing or amplification methods.
- the template nucleic acid can be single-stranded or double-stranded, or the template nucleic acid can have single-stranded or double-stranded portions.
- the template nucleic acid can be obtained from a naturally-occurring source, recombinant form, or chemically synthesized to include any type of nucleic acid analog.
- the template nucleic acid can be linear, circular, or other forms.
- the template nucleic acids can include an insert portion having an insert sequence.
- the template nucleic acids can also include at least one adaptor sequence.
- the insert portion can be isolated in any form, including chromosomal, genomic, organellar (e.g., mitochondrial, chloroplast or ribosomal), recombinant molecules, cloned, amplified, cDNA, RNA such as precursor mRNA or mRNA, oligonucleotides, whole genomic DNA, obtained from fresh frozen paraffin embedded tissue, needle biopsies, circulating tumor cells, cell free circulating DNA, or any type of nucleic acid library.
- the insert portion can be isolated from any organ, including head, neck, brain, breast, ovary, cervix, colon, rectum, endometrium, gallbladder, intestines, bladder, prostate, testicles, liver, lung, kidney, esophagus, pancreas, thyroid, pituitary, thymus, skin, heart, larynx, or other organs.
- the template nucleic acid can be subjected to nucleic acid analysis, including sequencing and composition analysis.
- the template molecules disclosed herein can be concatemer template molecules, which comprise two or more copies of a particular sequence.
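The relationship between a circular library molecule and its concatemer can be sketched as follows. This is an idealized model of rolling circle amplification: it ignores where on the circle the primer starts and assumes a whole number of copies. The function names and the sequence are placeholders.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_product(circle, copies):
    """Idealized RCA product: tandem repeats complementary to the circle."""
    if copies < 2:
        raise ValueError("a concatemer comprises two or more copies")
    return reverse_complement(circle) * copies


circle = "ACCTGATCGGGCCCAAATTTGTTACGCA"  # placeholder circular library sequence
concatemer = rolling_circle_product(circle, copies=3)
```

Because every repeat carries the same primer binding sites, barcode, and insert, a single concatemer presents multiple priming sites for the same sequence of interest.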
- Adaptors can be single-stranded, double-stranded, or have single-stranded and/or double-stranded portions. Adaptors can be configured to be linear, stem-looped, hairpin, or Y-shaped forms. Adaptors can be any length, including 4-100 nucleotides or longer. Adaptors can have blunt ends, overhang ends, or a combination of both. Overhang ends include 5’ overhang and 3’ overhang ends. The 5’ end of a single-stranded adaptor, or one strand of a double-stranded adaptor, can have a 5’ phosphate group or lack a 5’ phosphate group.
- Adaptors can include a 5’ tail that does not hybridize to a target polynucleotide (e.g., tailed adaptor), or adaptors can be non-tailed.
- An adaptor can include a sequence that is complementary to at least a portion of a primer, such as an amplification primer, a sequencing primer, or a capture primer (e.g., soluble or immobilized capture primers).
- Adaptors can include a random sequence or degenerate sequence.
- Adaptors can include at least one inosine residue.
- Adaptors can include at least one phosphorothioate, phosphorothiolate and/or phosphoramidate linkage.
- Adaptors can include a barcode sequence which can be used to distinguish polynucleotides (e.g., insert sequences) from different sample sources in a multiplex assay.
- Adaptors can include a unique identification sequence (e.g., unique molecular index, UMI; or a unique molecular tag) that can be used to uniquely identify a nucleic acid molecule to which the adaptor is appended.
- a unique identification sequence can be used to increase error correction and accuracy, reduce the rate of false-positive variant calls and/or increase sensitivity of variant detection.
- Adaptors can include at least one restriction enzyme recognition sequence, including any one or any combination of two or more selected from a group consisting of type I, type II, type III, type IV, type IIS or type IIB.
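The use of sample barcodes and unique identification sequences (UMIs) for error correction can be illustrated with a minimal Python sketch. It assumes each read has already been parsed into a (sample barcode, UMI, insert sequence) tuple and takes a simple per-position majority vote within each barcode/UMI group. The function name `collapse_by_umi` and the tuple layout are hypothetical and are not taken from the disclosure.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Group reads by (sample barcode, UMI) and return one consensus per group.

    `reads` is an iterable of (barcode, umi, sequence) tuples. The consensus
    is a per-position majority vote, truncated to the shortest read in the group.
    """
    groups = defaultdict(list)
    for barcode, umi, sequence in reads:
        groups[(barcode, umi)].append(sequence)

    consensus = {}
    for key, sequences in groups.items():
        length = min(len(s) for s in sequences)
        # Pick the most frequent base at each position across the group.
        consensus[key] = "".join(
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


example_reads = [
    ("S1", "AAC", "ACGT"),
    ("S1", "AAC", "ACGT"),
    ("S1", "AAC", "ACCT"),  # one read with a single-base error
    ("S2", "AAC", "TTTT"),  # same UMI, different sample barcode
]
collapsed = collapse_by_umi(example_reads)
```

In this sketch the erroneous third read is outvoted by the two matching reads, and the read from sample S2 stays in its own group even though it shares a UMI with the others.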
- “hybridize” or “hybridizing” or “hybridization” or other related terms refer to hydrogen bonding between two different nucleic acids to form a duplex nucleic acid.
- Hybridization also includes hydrogen bonding between two different regions of a single nucleic acid molecule to form a self-hybridizing molecule having a duplex region.
- Hybridization can comprise Watson-Crick or Hoogsteen binding to form a duplex double-stranded nucleic acid, or a double-stranded region within a nucleic acid molecule.
- the double-stranded nucleic acid may be wholly complementary, or partially complementary.
- Complementary nucleic acid strands need not hybridize with each other across their entire length.
- the complementary base pairing can be the standard A-T or C-G base pairing, or can be other forms of base-pairing interactions.
- Duplex nucleic acids can include mismatched base-paired nucleotides.
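As a rough illustration of wholly versus partially complementary duplexes, the following Python sketch counts mismatched positions when two equal-length DNA strands are paired in antiparallel orientation using standard A-T and C-G pairing only. The helper names are hypothetical, and the sketch ignores non-standard base pairing, gaps and thermodynamics.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_mismatches(strand_a, strand_b):
    """Count mismatched positions in an antiparallel duplex.

    Both strands are written 5' to 3' and must have the same length.
    A result of 0 means the strands are wholly complementary.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only compares equal-length strands")
    return sum(
        1
        for base_a, base_b in zip(strand_a, reversed(strand_b))
        if COMPLEMENT[base_a] != base_b
    )


perfect = duplex_mismatches("AAGT", reverse_complement("AAGT"))
partial = duplex_mismatches("AAGT", "ACTA")
```

Here `perfect` describes a wholly complementary duplex, while `partial` describes a duplex that contains a mismatched base pair.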
- nucleic acid incorporation comprises polymerization of one or more nucleotides into the terminal 3’ OH end of a nucleic acid strand, resulting in extension of the nucleic acid strand. Nucleotide incorporation can be conducted with natural nucleotides and/or nucleotide analogs. Typically, but not necessarily, nucleotide incorporation occurs in a template-dependent fashion.
- nucleotides refers to a molecule comprising an aromatic base, a five carbon sugar (e.g., ribose or deoxyribose), and at least one phosphate group. Canonical or non-canonical nucleotides are consistent with use of the term.
- the phosphate in some embodiments comprises a monophosphate, diphosphate, or triphosphate, or corresponding phosphate analog.
- nucleoside refers to a molecule comprising an aromatic base and a sugar.
- Nucleotides typically comprise a sugar moiety, such as carbocyclic moiety (Ferraro and Gotor 2000 Chem. Rev. 100: 4319-48), acyclic moieties (Martinez, et al., 1999 Nucleic Acids Research 27: 1271-1274; Martinez, et al., 1997 Bioorganic & Medicinal Chemistry Letters vol. 7: 3013-3016), and other sugar moieties (Joeng, et al., 1993 J. Med. Chem. 36: 2627-2638; Kim, et al., 1993 J. Med. Chem. 36: 30-7; Eschenmosser 1999 Science 284:2118-2124; and U.S. Pat. No.
- the sugar moiety comprises: ribosyl; 2'-deoxyribosyl; 3'-deoxyribosyl; 2',3'-dideoxyribosyl; 2',3'-didehydrodideoxyribosyl; 2'-alkoxyribosyl; 2'-azidoribosyl; 2'-aminoribosyl; 2'-fluororibosyl; 2'-mercaptoriboxyl; 2'-alkylthioribosyl; 3'-alkoxyribosyl; 3'-azidoribosyl; 3'-aminoribosyl; 3'-fluororibosyl; 3'-mercaptoriboxyl; 3'-alkylthioribosyl carbocyclic; acyclic or other modified sugars.
- a proximity event includes two reporter moieties approaching each other, or associating with each other, or binding each other. It is well known to one skilled in the art to select reporter moieties so that each absorbs excitation radiation and/or emits fluorescence at a wavelength distinguishable from the other reporter moieties to permit monitoring the presence of different reporter moieties in the same reaction or in different reactions. Two or more different reporter moieties can be selected having spectrally distinct emission profiles, or having minimal overlapping spectral emission profiles. Reporter moieties can be linked (e.g., operably linked) to nucleotides, nucleosides, nucleic acids, enzymes (e.g., polymerases or reverse transcriptases), or support (e.g., surfaces).
- a reporter moiety comprises a fluorescent label or a fluorophore.
- fluorescent moieties which may serve as fluorescent labels or fluorophores include, but are not limited to fluorescein and fluorescein derivatives such as carboxyfluorescein, tetrachlorofluorescein, hexachlorofluorescein, carboxynapthofluorescein, fluorescein isothiocyanate, NHS-fluorescein, iodoacetamidofluorescein, fluorescein maleimide, SAMSA-fluorescein, fluorescein thiosemicarbazide, carbohydrazinomethylthioacetyl-amino fluorescein, rhodamine and rhodamine derivatives such as TRITC, TMR, lissamine rhodamine, Texas Red, rhodamine B, rhodamine 6G, rhodamine
- Cyanine dyes may exist in either sulfonated or non-sulfonated forms, and consist of two indolenin, benzo-indolium, pyridium, thiozolium, and/or quinolinium groups separated by a polymethine bridge between two nitrogen atoms.
- cyanine fluorophores include, for example, Cy3 (which may comprise 1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-2-(3-{1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-3,3-dimethyl-1,3-dihydro-2H-indol-2-ylidene}prop-1-en-1-yl)-3,3-dimethyl-3H-indolium or 1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-2-(3-{1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-3,3-dimethyl-5-sulfo-1,3
- Cy2 which is an oxazole derivative rather than indolenin, and the benzo-derivatized Cy3.5, Cy5.5 and Cy7.5 are exceptions to this rule. Additional suitable dyes are described, for example, in U.S. 2024/0240249A1, the contents of which are incorporated by reference in their entirety herein.
- the reporter moiety can be a FRET pair, such that multiple classifications can be performed under a single excitation and imaging step.
- FRET may comprise excitation exchange (Förster) transfers, or electron-exchange (Dexter) transfers.
- the terms “amplify”, “amplifying”, “amplification”, and other related terms include producing multiple copies of an original polynucleotide template molecule, where the copies comprise a sequence that is complementary to the template sequence, or the copies comprise a sequence that is the same as the template sequence. In some embodiments, the copies comprise a sequence that is substantially identical to a template sequence, or is substantially identical to a sequence that is complementary to the template sequence.
- support refers to a substrate that is designed for deposition of biological molecules or biological samples for assays and/or analyses.
- biological molecules to be deposited onto a support include nucleic acids (e.g., DNA, RNA), polypeptides, saccharides, lipids, a single cell or multiple cells.
- biological samples include but are not limited to saliva, phlegm, mucus, blood, plasma, serum, urine, stool, sweat, tears and fluids from tissues or organs.
- a “capture primer” or “surface capture primer” and the like refers to an oligonucleotide immobilized to a support that is complementary to a portion of, and capable of hybridizing with a given oligonucleotide, such as the library molecules and/or template molecules described herein.
- a “pinning primer” or “surface pinning primer” and the like refers to an oligonucleotide immobilized to a support that is complementary to a portion of, and capable of hybridizing with the concatemer template molecules described herein, thereby “pinning” down a portion of the concatemer template molecule to the support.
- the support is solid, semi-solid, or a combination of both. In some embodiments, the support is porous, semi-porous, non-porous, or any combination of porosity. In some embodiments, the support can be substantially planar, concave, convex, or any combination thereof. In some embodiments, the support can be cylindrical, for example comprising a capillary or interior surface of a capillary.
- the support can have a plurality (e.g., two or more) of nucleic acid templates immobilized thereon.
- the plurality of immobilized nucleic acid templates have the same sequence or have different sequences.
- individual nucleic acid template molecules in the plurality of nucleic acid templates are immobilized to a different site on the support.
- two or more individual nucleic acid template molecules in the plurality of nucleic acid templates are immobilized to a site on the support.
- array refers to a support comprising a plurality of sites located at predetermined locations on a support described herein to form an array of sites.
- the sites can be discrete and separated by interstitial regions.
- the pre-determined sites on the support can be arranged in one dimension in a row or a column, or arranged in two dimensions in rows and columns.
- the plurality of pre-determined sites is arranged on the support in an organized fashion.
- the plurality of pre-determined sites is arranged in any organized pattern, including rectilinear, hexagonal patterns, grid patterns, patterns having reflective symmetry, patterns having rotational symmetry, or the like. The pitch between different pairs of sites can be the same or can vary.
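One way to picture a hexagonal arrangement of pre-determined sites with a constant pitch is the following Python sketch, which generates site coordinates by offsetting every other row by half a pitch. The function name, the row/column parameterization and the choice of units are assumptions made for illustration only.

```python
import math


def hexagonal_sites(rows, cols, pitch):
    """Return (x, y) coordinates of sites on a hexagonal lattice.

    Odd rows are shifted by half a pitch and rows are spaced by
    pitch * sqrt(3) / 2, so every nearest-neighbor distance equals `pitch`.
    """
    row_spacing = pitch * math.sqrt(3) / 2
    sites = []
    for row in range(rows):
        x_offset = pitch / 2 if row % 2 else 0.0
        for col in range(cols):
            sites.append((col * pitch + x_offset, row * row_spacing))
    return sites


sites = hexagonal_sites(rows=2, cols=2, pitch=1.0)
```

A rectilinear pattern follows from the same sketch by dropping the row offset and setting the row spacing equal to the pitch.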
- the support comprises between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween located at pre-determined locations on the support.
- the nucleic acid templates are immobilized at a plurality of pre-determined sites by hybridization to immobilized surface capture primers, or the nucleic acid templates are covalently attached to the surface capture primers.
- the nucleic acid templates are immobilized at a plurality of pre-determined sites, for example at 10² - 10¹⁵ sites or more.
- the immobilized nucleic acid templates are clonally-amplified to generate immobilized nucleic acid clusters at the plurality of pre-determined sites.
- individual immobilized nucleic acid clusters comprise linear clusters, or comprise single-stranded or double-stranded concatemers.
- the support comprises at least 10² sites, at least 10³ sites, at least 10⁴ sites, at least 10⁵ sites, at least 10⁶ sites, at least 10⁷ sites, at least 10⁸ sites, at least 10⁹ sites, at least 10¹⁰ sites, at least 10¹¹ sites, at least 10¹² sites, at least 10¹³ sites, at least 10¹⁴ sites, at least 10¹⁵ sites, or more, where the sites are randomly located on the support.
- the support comprises between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, or between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween located at random locations on the support.
- the template molecules are immobilized at between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, or between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween, on the support.
- the immobilized nucleic acid templates are clonally-amplified to generate immobilized nucleic acid clusters at the plurality of randomly located sites.
- individual immobilized nucleic acid clusters comprise linear clusters, or comprise single-stranded or double-stranded concatemers.
- the plurality of immobilized nucleic acid clusters on the support are in fluid communication with each other to permit flowing a solution of reagents (e.g., enzymes, nucleotides, divalent cations, and the like) onto the support so that the plurality of immobilized nucleic acid clusters on the support can be essentially simultaneously reacted with the reagents in a massively parallel manner.
- the fluid communication of the plurality of immobilized nucleic acid clusters can be used to conduct nucleotide binding assays and/or conduct nucleotide polymerization reactions (e.g., primer extension or sequencing) essentially simultaneously on the plurality of immobilized nucleic acid clusters, and optionally to conduct detection and imaging for massively parallel sequencing.
- When used in reference to immobilized enzymes, the term “immobilized” and related terms refer to enzymes (e.g., polymerases) that are attached to a support through covalent bond or non-covalent interaction, or attached to a coating on the support, or buried within a matrix formed by a coating on the support.
- When used in reference to immobilized nucleic acids, the term “immobilized” and related terms refer to nucleic acid molecules that are attached to a support through covalent bond or non-covalent interaction, or attached to a coating on the support, or buried within a matrix formed by a coating on the support, where the nucleic acid molecules include surface capture primers, nucleic acid template molecules and extension products of capture primers. Extension products of capture primers include nucleic acid concatemers (e.g., nucleic acid clusters).
- the one or more nucleic acid templates are clonally-amplified (e.g., in solution or on the support) using a nucleic acid amplification reaction, including any one or any combination of: polymerase chain reaction (PCR), multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, bridge amplification, isothermal bridge amplification, rolling circle amplification (RCA), circle-to-circle amplification, helicase-dependent amplification, recombinase-dependent amplification, and/or single-stranded binding (SSB) protein-dependent amplification.
- Persistence time refers to the length of time that a binding complex, which is formed between the target nucleic acid, a primer, a polymerase, and a conjugated or unconjugated nucleotide, remains stable without any binding component dissociating from the binding complex.
- the persistence time is indicative of the stability of the binding complex and strength of the binding interactions. Persistence time can be measured by observing the onset and/or duration of a binding complex, such as by observing a signal from a labeled component of the binding complex.
- Tris refers to a pH buffering agent Tris(hydroxymethyl)-aminomethane.
- Tris-HCl refers to a pH buffering agent Tris(hydroxymethyl)-aminomethane hydrochloride.
- Tris-acetate refers to a pH buffering agent comprising an acetate salt of Tris(hydroxymethyl)-aminomethane.
- Tricine refers to a pH buffering agent N-[tris(hydroxymethyl)methyl]glycine.
- HEPES refers to a pH buffering agent 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid.
- MES refers to a pH buffering agent 2-(N-morpholino)ethanesulfonic acid.
- MOPSO refers to a pH buffering agent 3-(N-morpholino)-2-hydroxypropanesulfonic acid.
- BES refers to a pH buffering agent N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic acid.
- TES refers to a pH buffering agent 2-[(2-Hydroxy-
- CAPS refers to a pH buffering agent 3-(cyclohexylamino)-1-propanesulfonic acid.
- the plurality of sub-populations of nucleic acid template molecules are immobilized to the support at a high density.
- at least some of the immobilized template molecules in the first and second sub-populations comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
- the plurality of sub-populations of nucleic acid template molecules are immobilized to the support at a density of about 10² - 10¹⁵ template molecules per mm².
- the template molecules are at a density of between about 10¹⁰ and about 10¹⁵ template molecules per mm², between about 10⁵ and about 10¹⁵ template molecules per mm², between about 10³ and about 10¹⁴ template molecules per mm², between about 10⁴ and about 10¹³ template molecules per mm², between about 10⁵ and about 10¹² template molecules per mm², between about 10⁶ and about 10¹¹ template molecules per mm², between about 10⁷ and about 10¹⁰ template molecules per mm², or between about 10⁸ and about 10¹⁰ template molecules per mm² on the support, or any range therebetween.
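To give a feel for what these surface densities mean, the following Python sketch converts a density in template molecules per mm² into a total molecule count for a given area and into an approximate center-to-center spacing. The spacing estimate assumes the molecules sit on an idealized square lattice, which is a simplification, and the 50 mm² area used in the example is a hypothetical value, not one stated in the disclosure.

```python
import math

NM_PER_MM = 1_000_000  # nanometers per millimeter


def molecules_on_area(density_per_mm2, area_mm2):
    """Total template molecules for a uniform density over an area."""
    return density_per_mm2 * area_mm2


def mean_spacing_nm(density_per_mm2):
    """Approximate center-to-center spacing, assuming a square lattice."""
    if density_per_mm2 <= 0:
        raise ValueError("density must be positive")
    return NM_PER_MM / math.sqrt(density_per_mm2)


# Example: 10^8 template molecules per mm^2 over a hypothetical 50 mm^2 area.
total = molecules_on_area(1e8, 50)
spacing = mean_spacing_nm(1e8)
```

Under these assumptions, 10⁸ template molecules per mm² corresponds to about 5 billion molecules on 50 mm² and a spacing of roughly 100 nm.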
- the support comprises a plurality of template molecules immobilized at pre-determined positions on the support (e.g., a patterned support). In some embodiments, the support comprises a plurality of template molecules immobilized at random and non-pre-determined positions on the support. In some embodiments, the support comprises a mixture of at least two sub-populations of template molecules immobilized at random and non-pre-determined positions on the support.
- the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern. In some embodiments, the support lacks contours which include features as sites for attachment of the nucleic acid template molecules. In some embodiments, the support lacks interstitial regions arranged in a predetermined pattern where the interstitial regions are sites designed to have no attached surface capture primers and/or template molecules. In some embodiments, the support lacks features that can be prepared using photo-chemical, photo-lithography, or micron-scale or nano-scale printing.
- individual template molecules in a given sub-population of template molecules comprise a sequence of interest, a batch barcode sequence that corresponds to the sequence of interest, and a batch sequencing primer binding site sequence that corresponds to the sequence of interest.
- a pre-determined batch barcode sequence can be linked to a given sequence of interest, thus the pre-determined batch barcode sequence corresponds to a given sequence of interest.
- a predetermined batch sequencing primer binding site sequence can be linked to a given sequence of interest, thus the pre-determined batch sequencing primer binding site sequence corresponds to a given sequence of interest.
- template molecules within a given sub-population have the same or different sequences of interest.
- template molecules within a given sub-population have the same batch barcode sequence. In some embodiments, template molecules within a given sub-population have the same sequencing primer binding site sequence. Thus, the different sub-populations of template molecules can undergo batch sequencing using a batch-specific sequencing primer. In some embodiments, the sequence of interest region need not undergo sequencing. Instead, the batch barcode can be sequenced by conducting a small number of sequencing cycles to reveal the batch barcode which corresponds to its sequence of interest. In some embodiments, the batch barcode and the sequence of interest can be sequenced.
- individual template molecules in a given sub-population of template molecules further comprise a sample index sequence that can be used to distinguish sequences of interest obtained from different sample sources in a multiplex assay.
- template molecules within a given sub-population have the same or different sample index sequences.
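The idea of batch sequencing with a batch-specific sequencing primer can be sketched in Python as follows. The sketch models each template as a record with a primer binding site label, a batch barcode and a sequence of interest, and it assumes, purely for illustration, that the batch barcode lies immediately downstream of the primer binding site so that a small number of cycles reads the barcode first. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    primer_site: str           # label of the batch sequencing primer binding site
    batch_barcode: str         # barcode shared by the sub-population
    sequence_of_interest: str  # insert sequence


def read_batch(templates, primer_site, cycles):
    """Return the first `cycles` bases read from each template in one batch.

    Only templates carrying the matching primer binding site are read;
    templates from other batches produce no read in this batch.
    """
    reads = []
    for template in templates:
        if template.primer_site != primer_site:
            continue
        readable = template.batch_barcode + template.sequence_of_interest
        reads.append(readable[:cycles])
    return reads


mixed_support = [
    Template("P1", "ACG", "TTTTGGGG"),
    Template("P2", "TGC", "CCCCAAAA"),
    Template("P1", "ACG", "GGGGCCCC"),
]
barcode_only = read_batch(mixed_support, "P1", cycles=3)
barcode_plus_insert = read_batch(mixed_support, "P2", cycles=5)
```

With three cycles, only the batch barcode of the first sub-population is revealed. With more cycles, the read continues into the sequence of interest.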
- the same portion of individual template molecules can be re-sequenced (e.g., reiterative sequencing) from the same start position to generate overlapping sequencing reads that can be aligned to a reference sequence.
- the same portion of individual template molecules can be sequenced at least two, three, four, five, up to 50 times, up to 100 times, or more than 100 times.
- the start sequencing site can be any location of the template molecule and is dictated by the sequencing primers which are designed to anneal to a selected position within the template molecule.
- the support after sequencing the first and/or second sub-populations of template molecules, can be re-seeded at least once with additional sub-population of template molecules (e.g., a third sub-population) which can undergo additional batch sequencing.
- an ongoing batch sequencing run can be stopped prior to completion (e.g., interrupted) to permit re-seeding the support with an additional sub-population of template molecules (e.g., the third sub-population) and then the interrupted batch sequencing can be resumed.
- the support can be re-seeded any time and/or before a previous sequencing batch is completed.
- the same support can undergo a first re-seeding with additional template molecules immobilized to the support so that the first re-seeded density has some nearest template molecules (e.g., 10 - 30% of the first immobilized re-seeded template molecules) that touch each other and/or overlap each other.
- the resulting first re-seeded support comprises a plurality of template molecules having a reduced number of interstitial spaces (and/or having a reduced size of interstitial spaces) between the template molecules compared to the initial low density support.
- the same support can undergo a second re-seeding with additional template molecules immobilized to the support so that the second re-seeded density has an increase in nearest neighbor template molecules (e.g., 25 - 50% or more of the first re-seeded template molecules) that touch each other and/or overlap each other.
- the resulting second re-seeded support comprises a plurality of template molecules having a further reduced number of interstitial spaces (and/or having a further reduced size of interstitial spaces) between the template molecules compared to the first re-seeded density support.
- the support can undergo multiple re-seeding workflows to generate increasing nearest neighbor template molecules that touch each other and/or overlap each other.
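How repeated re-seeding could raise the fraction of touching nearest neighbors can be estimated with a simple model. The Python sketch below assumes template molecules land at random, independent positions (a two-dimensional Poisson process) and behave as discs of a fixed diameter, so a molecule touches or overlaps its nearest neighbor when their centers are within one diameter. Under that assumption the fraction is 1 - exp(-π · density · diameter²). The model, the disc diameter and the example densities are illustrative assumptions and are not taken from the disclosure.

```python
import math


def fraction_touching(density_per_um2, diameter_um):
    """Fraction of molecules whose nearest neighbor touches or overlaps them.

    Assumes random (Poisson) placement of discs with the given diameter.
    """
    if density_per_um2 < 0 or diameter_um < 0:
        raise ValueError("density and diameter must be non-negative")
    return 1.0 - math.exp(-math.pi * density_per_um2 * diameter_um ** 2)


# Hypothetical densities after an initial seeding and two re-seedings,
# in molecules per square micrometer, with a hypothetical 1 um disc diameter.
seeding_rounds = [0.05, 0.10, 0.20]
fractions = [fraction_touching(density, 1.0) for density in seeding_rounds]
```

In this model each re-seeding that raises the density also raises the touching fraction, which is consistent with the qualitative trend described above.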
- individual template molecules comprise nucleic acid concatemer template molecules.
- a concatemer template molecule can be generated by conducting rolling circle amplification of a circularized nucleic acid library molecule.
- a concatemer template molecule comprises a single-stranded nucleic acid strand carrying numerous tandem copies of a polynucleotide unit, where each polynucleotide unit comprises a sequence of interest region and at least one batch sequencing primer binding site.
- each polynucleotide unit further comprises at least one batch barcode sequence.
- each polynucleotide unit further comprises at least one sample index sequence.
- Individual polynucleotide units can bind a sequencing primer, a sequencing polymerase and a detectably-labeled nucleotide reagent (e.g., detectably labeled multivalent molecules or nucleotide analogs), to form a detectable sequencing complex.
- individual concatemer template molecules can collapse into a compact DNA nanoball, where individual nanoballs carry numerous tandem copies of a polynucleotide unit along their lengths. During batch sequencing, individual nanoballs carry numerous detectable sequencing complexes.
- the compact nature of the nanoballs increases the local concentration of detectably-labeled nucleotide reagents that are used during batch sequencing which increases the signal intensity emitted from a nanoball to give a discrete detectable signal which can be imaged as a fluorescent spot.
- a spot corresponds to a concatemer and each concatemer corresponds to a sequence of interest. Multiple spots can be detected and imaged simultaneously on a support having high density concatemer template molecules immobilized thereon.
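The tandem-repeat structure of a concatemer template molecule can be sketched in Python as a polynucleotide unit repeated several times, with one sequencing primer binding site per unit. The unit layout (primer binding site, then batch barcode, then sequence of interest), the short example sequences and the function names are assumptions chosen for illustration.

```python
def build_unit(primer_site, batch_barcode, sequence_of_interest):
    """Join the parts of one polynucleotide unit in an assumed order."""
    return primer_site + batch_barcode + sequence_of_interest


def build_concatemer(unit, copies):
    """Return a strand carrying `copies` tandem copies of `unit`."""
    if copies < 2:
        raise ValueError("a concatemer carries two or more copies of the unit")
    return unit * copies


def primer_site_positions(concatemer, primer_site):
    """Return the start index of every primer binding site occurrence."""
    positions = []
    start = concatemer.find(primer_site)
    while start != -1:
        positions.append(start)
        start = concatemer.find(primer_site, start + 1)
    return positions


unit = build_unit("ACGTAC", "GG", "TTTTT")
concatemer = build_concatemer(unit, copies=3)
positions = primer_site_positions(concatemer, "ACGTAC")
```

Each position found corresponds to one site where a sequencing primer could bind, which is why a single concatemer can carry numerous sequencing complexes at once.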
- the present disclosure provides methods for sequencing comprising step (a): providing a support comprising a plurality of nucleic acid template molecules immobilized to the support.
- the plurality of template molecules comprises a plurality of sub-populations of template molecules including at least a first and a second subpopulation of template molecules.
- the first sub-population of template molecules comprises a first batch sequencing primer binding site and at least one first sequence-of-interest.
- the second sub-population of template molecules comprises a second batch sequencing primer binding site and at least one second sequence-of-interest.
- template molecules within the first sub-population have the same first batch sequencing primer binding site.
- template molecules within the first sub-population have the same sequence of interest or different sequences of interest.
- the first batch sequencing primer binding site sequence corresponds to the first sequence of interest, or the first batch sequencing primer binding site sequence corresponds to one of the first sequences of interest in the first sub-population.
- a pre-determined first batch sequencing primer binding site sequence can be linked to a given sequence of interest in the first subpopulation (or can be linked to different sequences of interest in a first sub-population), thus the pre-determined first batch sequencing primer binding site sequence corresponds to a given sequence of interest in the first sub-population.
- sequences of interest in the first sub-population are about 50-250 bases in length, about 250-500 bases in length, about 500-800 bases in length, about 800-1200 bases in length, about 1200-2000 bases in length, or up to 2000 bases in length, or any range therebetween.
- sequences of interest in the second sub-population are about 50-250 bases in length, about 250-500 bases in length, about 500-800 bases in length, about 800-1200 bases in length, about 1200-2000 bases in length, or up to 2000 bases in length, or any range therebetween.
- the first and second batch sequencing primer binding sites have different sequences.
- the support comprises a plurality of nucleic acid template molecules immobilized thereon at a density of about 10² - 10¹⁵ template molecules per mm², e.g., between about 10¹⁰ and about 10¹⁵ template molecules per mm², between about 10⁵ and about 10¹⁵ template molecules per mm², between about 10³ and about 10¹⁴ template molecules per mm², between about 10⁴ and about 10¹³ template molecules per mm², between about 10⁵ and about 10¹² template molecules per mm², between about 10⁶ and about 10¹¹ template molecules per mm², between about 10⁷ and about 10¹⁰ template molecules per mm², or between about 10⁸ and about 10¹⁰ template molecules per mm², or any range therebetween.
- the template molecules comprise a mixture of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules.
- the plurality of sub-populations of template molecules are immobilized to the support at a high density where at least some of the template molecules in the first and second sub-populations comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
- the support comprises up to 500 million template molecules immobilized thereon, or up to 1 billion template molecules immobilized thereon, or up to 2 billion template molecules immobilized thereon, or up to 3 billion template molecules immobilized thereon, or up to 4 billion template molecules immobilized thereon, or up to 5 billion template molecules immobilized thereon, or up to 6 billion template molecules immobilized thereon.
- the support comprises up to 7 billion template molecules immobilized thereon, or up to 8 billion template molecules immobilized thereon, or up to 9 billion template molecules immobilized thereon, or up to 10 billion template molecules immobilized thereon, or up to 20 billion template molecules immobilized thereon.
- the support comprises between about 500 million and about 20 billion template molecules immobilized thereon, between about 1 billion and about 10 billion template molecules immobilized thereon, between about 2 billion and about 9 billion template molecules immobilized thereon, between about 3 billion and about 8 billion template molecules immobilized thereon, between about 4 billion and about 7 billion template molecules immobilized thereon, or between about 5 billion and about 6 billion template molecules immobilized thereon, or any range therebetween.
- the support comprises features that are located in a random and non-pre-determined manner, where the features are sites for attachment of the template molecules.
- the support is passivated with multiple polymer layers.
- at least one of the polymer layers comprises oligonucleotide primers including capture primers, pinning primers, or a mixture of capture and pinning primers.
- the plurality of oligonucleotide primers comprises one type of capture primer (e.g., having the same batch capture primer sequence).
- the plurality of oligonucleotide primers comprises a mixture of 2-500 different types of capture primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch capture primer sequences, or any range therebetween).
- the plurality of oligonucleotide primers comprises one type of pinning primer (e.g., having the same batch pinning primer sequence). In some embodiments, the plurality of oligonucleotide primers comprise a mixture of 2-500 different types of pinning primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch pinning primer sequences, or any range therebetween).
- the plurality of surface capture primers comprise a plurality of sub-populations of surface capture primers including at least a first and second sub-population of surface capture primers.
- the surface capture primers in the at least first and second sub-population have different sequences.
- the surface capture primers in the at least first and second sub-population can hybridize to and thereby capture different circularized library molecules carrying different surface capture primer binding site sequences.
- the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules.
- the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
- the support lacks partitions and/or barriers that would create separate regions of the support.
- the template molecules immobilized to the support are in fluid communication with each other in a massively parallel manner with no barriers to physically separate different batches of template molecules.
- the plurality of surface capture primers are located at predetermined positions on the at least one polymer layer and/or the plurality of surface capture primers are embedded within the at least one polymer layer at pre-determined locations.
- the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules (e.g., by localizing capture primers thereto).
- the support includes interstitial regions arranged in a predetermined pattern where the interstitial regions are sites designed to have no attached template molecules.
- individual template molecules in the first sub-population further comprise a first batch barcode sequence which corresponds to the first sequence of interest.
- the first batch barcode sequence corresponds to one of the first sequences of interest in the first subpopulation.
- a pre-determined first batch barcode sequence can be linked to a given sequence of interest in the first sub-population, thus the pre-determined first batch barcode sequence corresponds to a given sequence of interest in the first subpopulation.
- a pre-determined first batch barcode sequence can be linked to different sequences of interest in a first sub-population.
- individual template molecules in the second subpopulation further comprise a second batch barcode sequence which corresponds to the second sequence of interest.
- the second batch barcode sequence corresponds to one of the second sequences of interest in the second sub-population.
- a pre-determined second batch barcode sequence can be linked to a given sequence of interest in the second sub-population, thus the pre-determined second batch barcode sequence corresponds to a given sequence of interest in the second sub-population.
- a pre-determined second batch barcode sequence can be linked to different sequences of interest in a second sub-population.
- the first batch barcode sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- the first batch sample index sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- the first batch barcode sequence and the first batch sample index sequence both include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- sequencing the short random sequence can provide nucleotide diversity and color balance.
- sequencing and imaging the short random sequence can be used for polony mapping, location, and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
- the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the first sub-population of library molecules, the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the first sub-population of library molecules, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the first sub-population of library molecules, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, in the first sub-population of library molecules, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
- the second batch barcode sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- the second batch sample index sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- the second batch barcode sequence and the second batch sample index sequence both include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- sequencing the short random sequence can provide nucleotide diversity and color balance.
- sequencing and imaging the short random sequence can be used for polony mapping, location, and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
- the short random sequence (e.g., NNN) has an overall base composition of about 25% or about 20-30% of all four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
- the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the second sub-population of library molecules, the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the second sub-population of library molecules, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the second sub-population of library molecules, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, in the second sub-population of library molecules, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
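The per-position base proportions described above can be checked computationally. The following is a minimal sketch, not part of the disclosure: the function names and the default 10-40% window are illustrative assumptions chosen to match the broadest single-base range recited above.

```python
from collections import Counter

def position_base_fractions(sequences):
    """Return, for each position, the fraction of A, C, G and T (U counted as T)."""
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    fractions = []
    for pos in range(length):
        counts = Counter(s[pos].upper().replace("U", "T") for s in sequences)
        fractions.append({b: counts.get(b, 0) / len(sequences) for b in "ACGT"})
    return fractions

def is_color_balanced(sequences, low=0.10, high=0.40):
    """True if every base falls within [low, high] at every position (assumed window)."""
    return all(
        low <= frac <= high
        for per_pos in position_base_fractions(sequences)
        for frac in per_pos.values()
    )

# Hypothetical 3-mer random sequences read from a sub-population of library molecules.
balanced = ["ACG", "CGT", "GTA", "TAC"]
skewed = ["AAA", "AAC", "AAG", "AAT"]
print(is_color_balanced(balanced), is_color_balanced(skewed))
```

In this sketch a set of short random sequences passes only if no base dominates or is missing at any sequencing cycle, which is the property relied on for polony mapping and template registration.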
- the plurality of template molecules comprise concatemer template molecules.
- the concatemer template molecules comprise at least first and second sub-populations of concatemer template molecules.
- the concatemer template molecules can be generated by conducting rolling circle amplification (RCA) using circularized library molecules and amplification primers.
- a concatemer template molecule comprises numerous tandem copies of a polynucleotide unit.
- each polynucleotide unit comprises a sequence of interest and at least one sequencing primer binding site.
- concatemer template molecules immobilized to a support can be generated using circularized library molecules and conducting rolling circle amplification.
- individual concatemer template molecules in the first subpopulation comprise a plurality of tandem polynucleotide units.
- each polynucleotide unit comprises a first sequence of interest and a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest.
- the polynucleotide unit further comprises a first batch barcode sequence which corresponds to the first sequence of interest.
- the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources.
- concatemer template molecules in the first sub-population have the same first batch sequencing primer binding site.
- concatemer template molecules in the first sub-population have the same sequence of interest or different sequences of interest.
- individual concatemer template molecules in the second sub-population comprise a plurality of tandem polynucleotide units.
- each polynucleotide unit comprises a second sequence of interest and a second batch sequencing primer binding site sequence which corresponds to the second sequence of interest.
- the polynucleotide unit further comprises a second batch barcode sequence which corresponds to the second sequence of interest.
- the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources.
- concatemer template molecules in the second sub-population have the same second batch sequencing primer binding site.
- concatemer template molecules in the second sub-population have the same sequence of interest or different sequences of interest.
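The relationship between a polynucleotide unit and a concatemer can be illustrated with a small string model. This is a sketch under stated assumptions: the helper names are hypothetical, the order of elements within the unit is arbitrary, and the model ignores that rolling circle amplification produces the complement of the circularized library molecule.

```python
def build_polynucleotide_unit(sequence_of_interest, seq_primer_site,
                              batch_barcode="", sample_index=""):
    """Join the elements of one unit; the element order here is an assumption."""
    return seq_primer_site + batch_barcode + sample_index + sequence_of_interest

def build_concatemer(unit, copies):
    """Model a concatemer as `copies` tandem repeats of the same unit."""
    if copies < 1:
        raise ValueError("a concatemer needs at least one unit")
    return unit * copies

def count_tandem_copies(concatemer, unit):
    """Count non-overlapping occurrences of the unit in the concatemer."""
    return concatemer.count(unit)

# Hypothetical sequences for a first sub-population template molecule.
unit = build_polynucleotide_unit("GATTACA", "ACGTACGT", "TTG", "CCA")
concatemer = build_concatemer(unit, 5)
print(len(unit), len(concatemer), count_tandem_copies(concatemer, unit))
```

Because every unit carries its own sequencing primer binding site, each tandem copy offers a separate site at which a batch sequencing primer can hybridize.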
- the plurality of concatemer template molecules can be generated by conducting a rolling circle amplification reaction in the presence of a plurality of compaction oligonucleotides.
- compaction oligonucleotides are described in WO2024040058, the contents of which are incorporated by reference herein in their entirety.
- the plurality of concatemer template molecules can be generated by conducting a rolling circle amplification reaction in the absence of a plurality of compaction oligonucleotides.
- individual compaction oligonucleotides can hybridize to two different locations on the same concatemer template molecule to pull together distal portions of the concatemer template molecule, causing compaction of the template molecule to form a DNA nanoball.
- individual concatemer template molecules collapse into a polony or nucleic acid nanoball having a compact size and shape compared to a non-collapsed concatemer template molecule.
- the methods for sequencing further comprise step (b): sequencing the first sub-population of template molecules using a plurality of first batch sequencing primers, thereby generating a plurality of first batch sequencing read products.
- the sequencing of step (b) comprises imaging a region of the support to detect the sequencing reactions of the first sub-population of template molecules.
- the first stage comprises contacting the first sub-population of template molecules with a plurality of first batch sequencing primers, a first plurality of sequencing polymerase and a plurality of detectably labeled multivalent molecules.
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., a nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent.
- the trapping reagent further comprises at least one chaotropic agent.
- the trapping reagent further comprises an amino acid or a modified amino acid.
- the trapping reagent further comprises a plurality of multivalent molecules.
- the trapping reagent further comprises a first plurality of sequencing polymerases.
- the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
- the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the first sub-population of template molecules can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of template molecules.
- the second stage of the two-stage sequencing method comprises contacting the first sub-population of template molecules and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primers.
- the methods for sequencing further comprise step (b1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of template molecules to generate a plurality of first batch sequencing read products.
- the plurality of first batch sequencing read products comprises up to 1000 bases in length.
- step (b1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the first batch sequencing read products comprise a first batch barcode sequence.
- the first batch sequencing read products comprise a first batch barcode sequence and a sample index sequence. In some embodiments, the first batch sequencing read products comprise a first batch barcode sequence and at least a portion of a first sequence of interest. In some embodiments, the first batch sequencing read products comprise a first batch barcode sequence, a sample index sequence, and at least a portion of a first sequence of interest. In some embodiments, the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, 500 million - 1 billion of the first subpopulation of concatemer template molecules can be sequenced.
- the methods for sequencing further comprise step (b2): stopping and/or blocking the short read sequencing of step (b1).
- the stopping/blocking comprises incorporating a chain terminating nucleotide to the 3’ terminal end of the first batch sequencing read products to inhibit further sequencing reactions.
- exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
- the methods for sequencing further comprise step (b4): reiteratively sequencing the template molecules of the first sub-population by repeating steps (b1) - (b3) at least once.
- the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween, or more than 50 times.
- the reiterative sequencing can be conducted up to 100 times.
- the sequences of all of the first batch sequencing read products can be determined and aligned with a first reference sequence to confirm the presence of the first sequence of interest.
- the first reference sequence can be the first batch barcode and/or the first sequence of interest.
- the plurality of first batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising an SSC (saline-sodium citrate) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
- the de-hybridization of step (b3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
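The reiterative workflow of steps (b1) to (b4) can be summarized as a loop in which the template molecule is retained while read products are generated, blocked and removed. The following is a toy sketch only: the function names are hypothetical, strand orientation is ignored, and the read product is modeled simply as the complement of the template bases adjacent to the sequencing primer binding site.

```python
# Toy model of steps (b1)-(b4); all names here are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def short_read(template, primer_site, cycles):
    """Step (b1): read up to `cycles` bases downstream of the primer binding site."""
    start = template.find(primer_site)
    if start == -1:
        return ""  # primer does not hybridize, so nothing is sequenced
    begin = start + len(primer_site)
    return template[begin:begin + cycles].translate(COMPLEMENT)

def reiterative_sequencing(template, primer_site, cycles, repeats=1):
    """Run (b1)-(b3) once, then repeat them `repeats` more times as in step (b4)."""
    reads = []
    for _ in range(1 + repeats):
        read_product = short_read(template, primer_site, cycles)  # (b1)
        # (b2) a chain terminator would block the 3' end here; nothing more is added
        reads.append(read_product)
        # (b3) the read product is stripped; `template` itself is never modified
    return reads

template = "TTTT" + "ACGTACGT" + "GATTACAGG"  # hypothetical immobilized template
print(reiterative_sequencing(template, "ACGTACGT", cycles=5, repeats=2))
```

Each pass yields an independent read of the same region, so the reads from repeated passes can be compared with one another or with a reference sequence.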
- the methods for sequencing further comprise step (c): sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers thereby generating a plurality of second batch sequencing read products and imaging the same region of the support to detect the sequencing reactions of the second sub-population of template molecules.
- the sequencing of step (c) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. Exemplary sequencing methods are described in WO2022266470, the contents of which are incorporated by reference in their entirety herein.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex.
- the nucleic acid duplex comprises a template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the sequencing of step (c) comprises sequencing at least a portion of the second batch barcode and/or sequencing at least a portion of the second sample index. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second sequence of interest.
- the methods for sequencing further comprise step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second sub-population of template molecules to generate a plurality of second batch sequencing read products that comprise up to 1000 bases in length.
- step (c1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the second batch sequencing read products comprise a second batch barcode sequence.
- the second batch sequencing read products comprise a second batch barcode sequence and a sample index sequence.
- hybridizing the sequencing primers to the concatemer template molecules of step (c1) can be conducted with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate) with formamide (e.g., 10-20% formamide).
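Batch-wise sequencing on a single unpartitioned support can be pictured as selecting template molecules by the sequencing primer binding site they carry. The sketch below is illustrative only: the support is modeled as a hypothetical mapping from image coordinates to template sequences, and for simplicity the function reports the template bases downstream of the binding site rather than the complementary read product.

```python
def sequence_batch(support, primer_site, cycles):
    """Return reads only for positions whose template carries `primer_site`."""
    reads = {}
    for position, template in support.items():
        start = template.find(primer_site)
        if start == -1:
            continue  # this template belongs to a different batch and stays dark
        begin = start + len(primer_site)
        reads[position] = template[begin:begin + cycles]
    return reads

# Hypothetical support: two sub-populations intermixed in the same imaged region.
FIRST_SITE, SECOND_SITE = "AAAACCCC", "GGGGTTTT"
support = {
    (0, 0): FIRST_SITE + "TTG" + "GATTACA",
    (0, 1): SECOND_SITE + "CAT" + "CCGGAAT",
    (1, 0): FIRST_SITE + "TTG" + "ACGTTGCA",
}

first_batch = sequence_batch(support, FIRST_SITE, cycles=3)    # step (b)
second_batch = sequence_batch(support, SECOND_SITE, cycles=3)  # step (c)
print(first_batch, second_batch)
```

In this model the same region is imaged in both steps, but only the positions holding templates of the batch being sequenced generate reads, and the first bases read are the batch barcode.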
- the present disclosure provides methods for re-seeding a support comprising step (a): providing a support comprising a plurality of surface capture primers immobilized to the support.
- the plurality of capture primers have the same sequence.
- the plurality of capture primers comprise at least two sub-populations of capture primers including at least a first sub-population of capture primers having a first sequence and a second sub-population of capture primers having a second sequence.
- the plurality of surface capture primers comprise single-stranded oligonucleotides.
- the plurality of surface capture primers can be used to generate concatemer template molecules immobilized to the support.
- the density of the plurality of surface capture primers is about 10^2 - 10^15 per um^2, e.g., between about 10^10 and about 10^15 surface capture primers per mm^2, between about
- the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules.
- the support includes interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
- individual circularized library molecules in the first subpopulation further comprise a first sub-population seeding batch barcode sequence which corresponds to the first sequence of interest.
- the first sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the first sub-population.
- a pre-determined first sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the first sub-population of circularized library molecules, thus the pre-determined first sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the first sub-population of circularized library molecules.
- sequences of interest in the first sub-population of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, any range therebetween, or up to 2000 bases in length.
- the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner, using individual circularized library molecules in the first sub-population, thereby generating a plurality of first sub-population concatemer template molecules immobilized to the support.
- a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of first sub-population concatemer template molecules.
- the first sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions on the support, or at pre-determined positions on the support (e.g., patterned support).
- individual circularized library molecules in the second sub-population comprise the same second sub-population seeding batch sequencing primer binding site sequence and have the same sequence of interest or different sequences of interest.
- the second sub-population seeding batch sequencing primer binding site sequence corresponds to the second sequence of interest.
- the second sub-population seeding batch sequencing primer binding site sequence corresponds to one of the sequences of interest in the second sub-population.
- individual circularized library molecules in the second subpopulation further comprise a second sub-population seeding batch barcode sequence which corresponds to the second sequence of interest, or the second sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the second subpopulation.
- a pre-determined second sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the second sub-population of circularized library molecules, thus the pre-determined second sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the second sub-population of circularized library molecules.
- a pre-determined second sub-population seeding batch barcode sequence can be linked to different sequences of interest in a second sub-population of circularized library molecules.
- the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner, using individual circularized library molecules in the second sub-population, thereby generating a plurality of second sub-population concatemer template molecules immobilized to the support.
- a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of second sub-population concatemer template molecules.
- the second sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions on the support, or at pre-determined positions on the support (e.g., patterned support).
- the plurality of nucleotides further comprises a plurality of nucleotides having a scissile moiety (e.g., uracil).
- the full length of the concatemer template molecules in the first plurality is sequenced. In some embodiments, a partial length of the concatemer template molecules in the first plurality is sequenced.
- the sequencing of step (c) comprises hybridizing sequencing primers to sequencing primer binding sites on the first sub-population of the first plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- the concatemer template molecules in the first sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- a partial length of the concatemer template molecules in the first sub-population are reiteratively sequenced.
- the sequencing of step (c) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the sequencing of step (c) comprises conducting a two-stage sequencing method.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the first sub-population of template molecules in the first plurality. In some embodiments, the first sub-population of template molecules in the first plurality can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of template molecules in the first plurality.
- the second stage of the two-stage sequencing method comprises contacting the first sub-population of template molecules in the first plurality and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primer.
- the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
- the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of first batch sequencing read products.
- one sequencing cycle comprises completion of a first and a second stage.
- the sequencing of step (c) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
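The two-stage cycle described above, in which a first stage binds and images a multivalent molecule without incorporation and a second stage incorporates a free nucleotide to extend the primer, can be sketched as follows. This is a simplified illustration with hypothetical function names: it assumes error-free base pairing, one base per cycle, and a template string that begins immediately after the sequencing primer binding site.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def stage_one_bind_and_image(template, extended):
    """First stage: identify the nucleotide unit that binds opposite the next
    template base. The primer strand `extended` is not lengthened here."""
    return COMPLEMENT[template[len(extended)]]

def stage_two_incorporate(template, extended):
    """Second stage: incorporate one free nucleotide, extending the primer by one base."""
    return extended + COMPLEMENT[template[len(extended)]]

def two_stage_sequencing(template, cycles):
    """Run up to `cycles` cycles; one cycle is a first stage followed by a second stage."""
    extended = ""
    calls = []
    for _ in range(min(cycles, len(template))):
        calls.append(stage_one_bind_and_image(template, extended))
        extended = stage_two_incorporate(template, extended)
    return "".join(calls), extended

read, primer_extension = two_stage_sequencing("GATTACA", cycles=4)
print(read, primer_extension)
```

Separating detection from incorporation in this way means the base call comes from the first stage, while the second stage only advances the primer so that the next cycle interrogates the following template position.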
- a second sub-population of concatemer template molecules in the first plurality are sequenced using the second batch sequencing primer binding sites in the second sub-population of concatemer template molecules.
- the concatemer template molecules in the second sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent.
- the trapping reagent further comprises a plurality of multivalent molecules.
- the trapping reagent further comprises a first plurality of sequencing polymerases.
- the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- the first plurality of sequencing polymerases can be dissociated from the second sub-population of template molecules in the first plurality.
- the second sub-population of template molecules in the first plurality can remain immobilized to the support and the second batch sequencing primers can be retained and can remain hybridized to the second sub-population of template molecules in the first plurality.
- the second stage of the two-stage sequencing method generally comprises contacting the second sub-population of template molecules in the first plurality and the retained second batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch sequencing primer.
- the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
- the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
- the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of second batch sequencing read products.
- one sequencing cycle comprises completion of a first and a second stage.
- the sequencing of step (c) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the methods for re-seeding a support further comprise reiteratively sequencing the first sub-population of the first plurality of concatemer template molecules, which comprises step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of concatemer template molecules to generate a plurality of first sub-population batch sequencing read products that comprise up to 1000 bases in length.
- step (c1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence.
- the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence and the sample index sequence.
- the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence and at least a portion of the first sequence of interest.
- the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence, the sample index sequence, and at least a portion of the first sequence of interest.
- the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on the first sub-population of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- 500 million - 1 billion of the first sub-population of concatemer template molecules can be sequenced.
- up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the first sub-population of concatemer template molecules can be sequenced.
- the sequencing of step (c1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the reiterative sequencing of step (c1) comprises conducting a two-stage sequencing method described herein.
- the methods for re-seeding a support further comprise step (c3): removing the plurality of first sub-population batch sequencing read products and retaining the concatemer template molecules of the first sub-population.
- step (c3) is optional.
- the first sub-population batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a de-hybridization reagent.
- the methods for re-seeding a support further comprise step (c4): reiteratively sequencing the concatemer template molecules of the first sub-population by repeating steps (c1) - (c3) at least once.
- the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times or more.
- the methods for re-seeding a support further comprise reiteratively sequencing the second sub-population of concatemer template molecules in a manner similar to steps (c1) - (c4) as described above for the first sub-population of concatemer template molecules.
- hybridizing the sequencing primers to the concatemer template molecules of step (c1) can be conducted with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate) with formamide (e.g., 10-20% formamide).
- the plurality of first sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising an SSC buffer (e.g., saline-sodium citrate), with or without formamide, at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
- in step (c3), the plurality of first sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent.
- the de-hybridization reagent further comprises at least one chaotropic agent.
- the de-hybridization reagent further comprises at least one nucleic acid compaction agent.
- the de-hybridization of step (c3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
- the support comprises up to 500 million of a second plurality of concatemer template molecules immobilized thereon, or up to 1 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 2 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 3 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 4 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 5 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 6 billion of a second plurality of concatemer template molecules immobilized thereon.
- the support comprises up to 7 billion concatemer template molecules immobilized thereon, or up to 8 billion concatemer template molecules immobilized thereon, or up to 9 billion concatemer template molecules immobilized thereon, or up to 10 billion concatemer template molecules immobilized thereon, or up to 20 billion concatemer template molecules immobilized thereon.
- individual concatemer template molecules in the second plurality comprise a plurality of tandem copies of a polynucleotide unit, where each polynucleotide unit comprises a sequence of interest and a batch seeding sequencing primer binding site sequence.
- the first plurality of concatemer template molecules of step (c) can be completely sequenced or the sequencing can be interrupted at any time prior to distributing the second plurality of circularized nucleic acid library molecules onto the support of step (d).
- the second plurality of circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors.
- the second plurality of circularized library molecules comprise a mixture of any combination of circularized padlock probes, linear library molecules circularized using single-stranded splint strands, and/or linear library molecules circularized using double-stranded adaptors. Methods for generating circularized library molecules are described herein.
- individual circularized library molecules in the second plurality comprise a sequence of interest, a seeding batch sequencing primer binding site sequence which corresponds to the sequence of interest, and a surface capture primer binding site.
- a predetermined second seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the second plurality of circularized library molecules.
- a pre-determined second seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a second plurality of circularized library molecules, thus the pre-determined second seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the second plurality of circularized library molecules.
- individual circularized library molecules in the second plurality further comprise a seeding batch barcode sequence which corresponds to the sequence of interest.
- a pre-determined second seeding batch barcode sequence can be linked to a given sequence of interest in the second plurality of circularized library molecules, thus the pre-determined second seeding batch barcode sequence corresponds to a given sequence of interest in the second plurality of circularized library molecules.
- a pre-determined second seeding batch barcode sequence can be linked to different sequences of interest in a second plurality of circularized library molecules.
- individual circularized library molecules in the second plurality comprise a sequence of interest, the same seeding batch sequencing primer binding site sequence which corresponds to the sequence of interest, and individual circularized library molecules further comprise a surface capture primer binding site, and a second seeding batch barcode sequence which corresponds to the sequence of interest.
- the second plurality of circularized nucleic acid library molecules comprise a plurality of subpopulations of circularized library molecules including at least a third and fourth subpopulation of circularized library molecules.
- a pre-determined third sub-population seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the third sub-population of circularized library molecules, thus the pre-determined third sub-population seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the third sub-population of circularized library molecules.
- a pre-determined third subpopulation seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a third sub-population of circularized library molecules.
- individual circularized library molecules in the third subpopulation further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources.
- individual circularized library molecules in the third sub-population further comprise a surface capture primer binding site.
- individual circularized library molecules in the third sub-population further comprise a surface pinning primer binding site.
- individual circularized library molecules in the third subpopulation further comprise a compaction oligonucleotide binding site.
- the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner using individual circularized library molecules in the third sub-population, thereby generating a plurality of third sub-population concatemer template molecules immobilized to the support.
- a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of third sub-population concatemer template molecules.
- the third sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions, or at predetermined positions (e.g., patterned support).
- individual circularized library molecules in the fourth sub-population comprise the same fourth sub-population seeding batch sequencing primer binding site sequence and have the same sequence of interest or different sequences of interest.
- the fourth sub-population seeding batch sequencing primer binding site sequence corresponds to the fourth sequence of interest, or the fourth sub-population seeding batch sequencing primer binding site sequence corresponds to one of the sequences of interest in the fourth subpopulation.
- a pre-determined fourth sub-population seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the fourth sub-population of circularized library molecules, thus the pre-determined fourth subpopulation seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the fourth sub-population of circularized library molecules.
- a pre-determined fourth sub-population seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a fourth subpopulation of circularized library molecules.
- individual circularized library molecules in the fourth subpopulation further comprise a fourth sub-population seeding batch barcode sequence which corresponds to the fourth sequence of interest, or the fourth sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the fourth subpopulation.
- a pre-determined fourth sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the fourth sub-population of circularized library molecules, thus the pre-determined fourth sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the fourth sub-population of circularized library molecules.
- a pre-determined fourth sub-population seeding batch barcode sequence can be linked to different sequences of interest in a fourth sub-population of circularized library molecules.
- individual circularized library molecules in the fourth subpopulation further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources.
- individual circularized library molecules in the fourth sub-population further comprise a surface capture primer binding site.
- individual circularized library molecules in the fourth sub-population further comprise a surface pinning primer binding site.
- individual circularized library molecules in the fourth sub-population further comprise a compaction oligonucleotide binding site.
- sequences of interest in the fourth sub-population of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
- the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner using individual circularized library molecules in the fourth sub-population, thereby generating a plurality of fourth sub-population concatemer template molecules immobilized to the support.
- a subset of the surface capture primers hybridize to individual circularized library molecules to generate the fourth sub-population concatemer template molecules.
- the fourth sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions, or at predetermined positions (e.g., patterned support).
- the rolling circle amplification reaction comprises contacting the primed circularized library molecules with a plurality of strand displacing polymerases, and a plurality of nucleotides which include dATP, dCTP, dGTP, and dTTP.
- the plurality of nucleotides further comprises a plurality of nucleotides having a scissile moiety (e.g., uracil).
- the rolling circle amplification reaction of step (d) can be conducted in the presence, or in the absence, of a plurality of compaction oligonucleotides.
- individual compaction oligonucleotides can hybridize to two different locations on the same template molecule to pull together distal portions of the template molecule, causing compaction of the template molecule to form a DNA nanoball.
- the methods for re-seeding a support further comprise step (e): sequencing at least a subset of the second plurality of immobilized concatemer template molecules thereby generating a second plurality of sequencing read products.
- the sequencing of step (e) comprises imaging a region of the support to detect the sequencing reactions of the second plurality of template molecules.
- the same region of the support is sequenced in steps (c) and (e).
- different regions of the support are sequenced in steps (c) and (e).
- between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of concatemer template molecules of the second plurality of concatemer template molecules can be sequenced.
- the full length of the concatemer template molecules in the second plurality are sequenced. In some embodiments, a partial length of the concatemer template molecules in the second plurality are sequenced.
- the sequencing of step (e) comprises hybridizing sequencing primers to sequencing primer binding sites on the second plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- the concatemer template molecules in the second plurality can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- a partial length of the concatemer template molecules in the second plurality are reiteratively sequenced.
- the full length of the concatemer template molecules in the third sub-population are sequenced. In some embodiments, a partial length of the concatemer template molecules in the third sub-population are sequenced.
- the sequencing of step (e) comprises hybridizing sequencing primers to sequencing primer binding sites on the third sub-population of the second plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- the immobilized concatemer template molecules in the third sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the sequencing of step (e) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the sequencing of step (e) comprises conducting a two-stage sequencing method.
- the first stage generally comprises contacting the third sub-population of template molecules in the second plurality with a plurality of third batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent.
- the trapping reagent further comprises a plurality of multivalent molecules.
- the trapping reagent further comprises a first plurality of sequencing polymerases.
- the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
- the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- the first plurality of sequencing polymerases can be dissociated from the third sub-population of template molecules in the second plurality.
- the third sub-population of template molecules in the second plurality can remain immobilized to the support and the third batch sequencing primers can be retained and can remain hybridized to the third sub-population of template molecules in the second plurality.
- the second stage of the two-stage sequencing method comprises contacting the third sub-population of template molecules in the second plurality and the retained third batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the third batch sequencing primer.
- the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
- the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
- nucleotide incorporation can be conducted in the presence of a stepping reagent.
- the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides.
- individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position.
- the plurality of nucleotides are not chain terminating nucleotides.
- the fourth sub-population of the concatemer template molecules in the second plurality are sequenced using the fourth batch sequencing primer binding sites in the fourth sub-population of concatemer template molecules.
- the full length of the concatemer template molecules in the fourth sub-population are sequenced. In some embodiments, a partial length of the concatemer template molecules in the fourth sub-population are sequenced.
- the sequencing of step (e) comprises hybridizing sequencing primers to sequencing primer binding sites on the fourth sub-population of the second plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- the concatemer template molecules in the fourth sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the sequencing of step (e) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the sequencing of step (e) comprises conducting a two-stage sequencing method.
- the first stage comprises contacting the fourth sub-population of template molecules in the second plurality with a plurality of fourth batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules.
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent.
- the trapping reagent further comprises a plurality of multivalent molecules.
- the trapping reagent further comprises a first plurality of sequencing polymerases.
- the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
- the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- the first plurality of sequencing polymerases can be dissociated from the fourth sub-population of template molecules in the second plurality.
- the fourth sub-population of template molecules in the second plurality can remain immobilized to the support and the fourth batch sequencing primers can be retained and can remain hybridized to the fourth subpopulation of template molecules in the second plurality.
- the second stage of the two-stage sequencing method comprises contacting the fourth sub-population of template molecules in the second plurality and the retained fourth batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the fourth batch sequencing primer.
- the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
- the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group. In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent.
- the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
- the sequencing of step (e) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of fourth batch sequencing read products.
- one sequencing cycle comprises completion of a first and a second stage.
- the sequencing of step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
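The cycle structure described above (a first stage of binding and imaging, then a second stage of incorporation, repeated for a chosen number of cycles) can be laid out as a simple schedule. The following is a minimal sketch only: the step labels and the `build_schedule` function are hypothetical names introduced for illustration and do not appear in the disclosure.

```python
# Hypothetical step labels; the text describes the stages but does not name steps this way.
FIRST_STAGE = ("bind_multivalent_molecules", "image", "dissociate_and_wash")
SECOND_STAGE = ("bind_nucleotides", "incorporate")


def build_schedule(n_cycles, chain_terminating=True):
    """Return (cycle, stage, step) tuples for n_cycles two-stage sequencing cycles."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    # Cleavage of the terminating moiety only applies to chain terminating nucleotides.
    second = SECOND_STAGE + (("cleave_terminator",) if chain_terminating else ())
    schedule = []
    for cycle in range(1, n_cycles + 1):
        schedule += [(cycle, "first", step) for step in FIRST_STAGE]
        schedule += [(cycle, "second", step) for step in second]
    return schedule


if __name__ == "__main__":
    for entry in build_schedule(2):
        print(entry)
```

One sequencing cycle in this sketch is complete once every first-stage and second-stage step for that cycle has been listed, matching the definition of a cycle given above.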
- the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence. In some embodiments, the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence and the sample index sequence.
- the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence and at least a portion of the second sequence of interest.
- the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence, the sample index sequence, and at least a portion of the second sequence of interest.
- the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on the third subpopulation of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- 500 million - 1 billion of the third sub-population of concatemer template molecules can be sequenced.
- up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the third sub-population of concatemer template molecules can be sequenced.
- up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the third sub-population of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of the third sub-population of concatemer template molecules can be sequenced.
- the sequencing of step (e1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the reiterative sequencing of step (e1) comprises conducting a two-stage sequencing method described herein.
- the methods for re-seeding a support further comprise step (e2): stopping and/or blocking the short read sequencing of step (e1).
- the stopping/blocking comprises incorporating a chain terminating nucleotide to the 3’ terminal end of the second sub-population batch sequencing read products to inhibit further sequencing reactions.
- Exemplary chain terminating nucleotides include dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
- the methods for re-seeding a support further comprise step (e3): removing the plurality of second sub-population batch sequencing read products and retaining the concatemer template molecules of the second sub-population.
- step (e3) is optional.
- the third sub-population batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a de-hybridization reagent.
- hybridizing the sequencing primers to the concatemer template molecules of step (e1) can be conducted with a hybridization reagent comprising a saline-sodium citrate (SSC) buffer (e.g., 2X SSC) with formamide (e.g., 10-20% formamide).
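As a worked illustration of the example concentrations above, the volumes needed for a given batch of hybridization reagent follow from C1·V1 = C2·V2. The sketch below assumes a 20X SSC stock and 100% formamide stock; those stock strengths, the 15% formamide default, and the function name `hybridization_mix` are assumptions chosen for illustration, not values from the disclosure.

```python
def hybridization_mix(final_volume_ul, ssc_final_x=2.0, formamide_final_pct=15.0,
                      ssc_stock_x=20.0, formamide_stock_pct=100.0):
    """Return stock volumes (in microliters) for an SSC/formamide mix via C1*V1 = C2*V2."""
    ssc_ul = final_volume_ul * ssc_final_x / ssc_stock_x
    formamide_ul = final_volume_ul * formamide_final_pct / formamide_stock_pct
    # Whatever volume remains is made up with water.
    water_ul = final_volume_ul - ssc_ul - formamide_ul
    if water_ul < 0:
        raise ValueError("requested concentrations exceed what the stocks allow")
    return {"ssc_stock_ul": ssc_ul, "formamide_ul": formamide_ul, "water_ul": water_ul}


if __name__ == "__main__":
    # 1 mL of 2X SSC with 15% formamide (within the 10-20% example range).
    print(hybridization_mix(1000.0))
```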
- the plurality of third sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising a saline-sodium citrate (SSC) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation such as, for example, 50-90 °C.
- Conventional methods for achieving a desired density of immobilized nucleic acid template molecules for massively parallel sequencing include determining the concentration of library molecules in-solution prior to immobilizing the library molecules on the support.
- the conventional methods typically employ qPCR and/or a fluorometer with a fluorescence-based assay (e.g., Qubit). Even when the desired in-solution library concentration is achieved, these conventional methods can yield immobilized template densities that are too high or too low.
- individual template molecules of the first sub-population comprise (i) a first batch sequencing primer binding site, (ii) a first sequence of interest, and (iii) optionally a first batch barcode sequence and/or a first batch sample index sequence.
- individual template molecules within the first subpopulation comprise the same first batch sequencing primer binding site.
- individual template molecules within the first sub-population comprise the same sequence of interest, or comprise different sequences of interest.
- the sequence of the first batch sequencing primer binding site sequence corresponds to the first sequence of interest, or the first batch sequencing primer binding site sequence corresponds to one of the first sequences of interest in the first sub-population.
- the first batch barcode and/or the first batch sample index can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- sequencing the short random sequence can provide nucleotide diversity and color balance.
- sequencing and imaging the short random sequence can be used for polony mapping and location and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
- the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
- individual template molecules of the second subpopulation comprise (i) a second batch sequencing primer binding site, (ii) a second sequence of interest, and (iii) optionally a second batch barcode sequence and/or a second batch sample index sequence.
- individual template molecules within the second subpopulation comprise the same second batch sequencing primer binding site. In some embodiments, individual template molecules within the second sub-population comprise the same sequence of interest or comprise different sequences of interest. In some embodiments, the sequence of the second batch sequencing primer binding site sequence corresponds to, i.e. can be used to selectively sequence in a batch sequencing workflow, the second sequence of interest, or the second batch sequencing primer binding site sequence corresponds to one of the second sequences of interest in the second sub-population. In some embodiments, a predetermined second batch sequencing primer binding site sequence can be linked to a given sequence of interest in the second sub-population.
- sequences of interest in the second sub-population are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
- the second batch barcode and/or the second batch sample index can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length.
- sequencing the short random sequence can provide nucleotide diversity and color balance.
- sequencing and imaging the short random sequence can be used for polony mapping and location and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
- the short random sequence (e.g., NNN) has an overall base composition of about 25% or about 20-30% of all four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
- the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
- the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%.
- the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
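The per-position base proportions listed above can be checked computationally for a set of short random sequences. The following is a minimal sketch, assuming equal-length sequences supplied as strings; the function names and the default thresholds (the widest ranges stated above, 10-40% per base and 10-65% for the A+T/U and G+C pairs) are illustrative choices.

```python
from collections import Counter


def per_position_fractions(seqs):
    """Return, for each position, the fraction of A, C, G and T (U is counted as T)."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    seqs = [s.upper().replace("U", "T") for s in seqs]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    fractions = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        fractions.append({base: counts.get(base, 0) / len(seqs) for base in "ACGT"})
    return fractions


def is_balanced(seqs, low=0.10, high=0.40, pair_low=0.10, pair_high=0.65):
    """True if every position meets the per-base and paired (A+T, G+C) ranges."""
    for frac in per_position_fractions(seqs):
        if any(not (low <= frac[base] <= high) for base in "ACGT"):
            return False
        at, gc = frac["A"] + frac["T"], frac["G"] + frac["C"]
        if not (pair_low <= at <= pair_high and pair_low <= gc <= pair_high):
            return False
    return True


if __name__ == "__main__":
    print(is_balanced(["ACG", "CGT", "GTA", "TAC"]))
    print(is_balanced(["AAA", "AAA", "AAC", "AAG"]))
```

Tightening `low` and `high` to 0.20 and 0.30 applies the narrowest of the stated per-base ranges instead.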
- the first and second batch sequencing primer binding sites have different sequences.
- the plurality of nucleic acid template molecules can be immobilized to the support at random and non-pre-determined positions on the support, or at pre-determined positions on the support (e.g., a patterned support).
- the support in the methods for determining template density of step (a) comprises a plurality of nucleic acid template molecules immobilized thereon at a density of about 10^2 - 10^15 template molecules per mm^2, or any range described herein.
- the template molecules immobilized to the support comprise a plurality of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules.
- the plurality of sub-populations of template molecules are immobilized to the support at a high density where at least some of the immobilized template molecules in the first and second sub-populations comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
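To illustrate what touching or overlapping nearest neighbors means in a top-down view, the sketch below flags pairs of template locations from two sub-populations whose centers lie within one diameter of each other. It assumes each immobilized template molecule is approximated as a circle of equal diameter with known center coordinates; that geometric model, the units, and the name `overlapping_pairs` are assumptions made only for illustration.

```python
import math


def overlapping_pairs(first_centers, second_centers, diameter_um):
    """Return index pairs (i, j) whose circles of the given diameter touch or overlap."""
    pairs = []
    for i, (x1, y1) in enumerate(first_centers):
        for j, (x2, y2) in enumerate(second_centers):
            # Two equal circles touch or overlap when their centers are
            # no farther apart than one diameter.
            if math.hypot(x1 - x2, y1 - y2) <= diameter_um:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    first = [(0.0, 0.0), (10.0, 10.0)]
    second = [(0.5, 0.0), (5.0, 5.0)]
    print(overlapping_pairs(first, second, diameter_um=1.0))
```

The nested loop is quadratic in the number of locations, so this form suits small regions only; a spatial index would be needed at the template counts described herein.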
- the support comprises up to 500 million template molecules immobilized thereon, or up to 1 billion template molecules immobilized thereon, or up to 2 billion template molecules immobilized thereon, or up to 3 billion template molecules immobilized thereon, or up to 4 billion template molecules immobilized thereon, or up to 5 billion template molecules immobilized thereon, or up to 6 billion template molecules immobilized thereon.
- the support comprises up to 7 billion template molecules immobilized thereon, or up to 8 billion template molecules immobilized thereon, or up to 9 billion template molecules immobilized thereon, or up to 10 billion template molecules immobilized thereon, or up to 20 billion template molecules immobilized thereon. In some embodiments, the support comprises between about 500 million and about 20 billion template molecules immobilized thereon, between about 1 billion and about 10 billion template molecules immobilized thereon, between about 2 billion and about 9 billion template molecules immobilized thereon, between about 3 billion and about 8 billion template molecules immobilized thereon, between about 4 billion and about 7 billion template molecules immobilized thereon, or between about 5 billion and about 6 billion template molecules immobilized thereon, or any range therebetween.
- the support in the methods for determining template density of step (a) comprises features on the support that are located in a random and non-pre-determined manner. In some embodiments, the features are sites for attachment of the template molecules.
- the support is passivated with at least one polymer layer comprising a plurality of surface capture primers covalently tethered to the at least one polymer layer.
- At least one of the polymer layers comprises oligonucleotide primers including capture primers, pinning primers, or a mixture of capture and pinning primers.
- the plurality of oligonucleotide primers comprise one type of capture primer (e.g., having the same batch capture primer sequence) or a mixture of 2-500 different types of capture primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch capture primer sequences, or any range therebetween).
- the plurality of oligonucleotide primers comprise one type of pinning primer (e.g., having the same batch pinning primer sequence) or a mixture of 2-500 different types of pinning primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch pinning primer sequences, or any range therebetween).
- the plurality of oligonucleotide types comprises between 2 and 500, between 10 and 400, between 20 and 300, between 50 and 200, between 100 and 500, between 200 and 400, between 2 and 250, between 10 and 150, between 20 and 200, or between 20 and 100 or between 5 and 50 different capture primers and/or pinning primers, or any range therebetween.
- the plurality of surface capture primers comprise a plurality of sub-populations of surface capture primers including at least a first and second sub-population of surface capture primers.
- the surface capture primers in the at least first and second sub-populations have different sequences.
- the surface capture primers in the at least first and second sub-populations can hybridize, i.e. capture, different circularized library molecules carrying different surface capture primer binding site sequences.
- the plurality of surface capture primers are randomly distributed throughout and embedded within the at least one polymer layer.
- the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules.
- the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
- the support in the methods for determining template density of step (a), lacks partitions and/or barriers that would create separate regions of the support.
- the template molecules immobilized to the support are in fluid communication with each other in a massively parallel manner with no barriers to physically separate different batches of template molecules.
- the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the template molecules.
- the support includes interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
- individual template molecules in the first sub-population further comprise a first batch barcode sequence which corresponds to the first sequence of interest, or the first batch barcode sequence corresponds to one of the first sequences of interest in the first subpopulation.
- a pre-determined first batch barcode sequence can be linked to a given sequence of interest in the first sub-population thus the pre-determined first batch barcode sequence corresponds to a given sequence of interest in the first subpopulation.
- a pre-determined first batch barcode sequence can be linked to different sequences of interest in a first sub-population.
- individual template molecules in the second subpopulation further comprise a second batch barcode sequence which corresponds to the second sequence of interest, or the second batch barcode sequence corresponds to one of the second sequences of interest in the second sub-population.
- a predetermined second batch barcode sequence can be linked to a given sequence of interest in the second sub-population (or can be linked to different sequences of interest in a second subpopulation), thus the pre-determined second batch barcode sequence corresponds to a given sequence of interest in the second sub-population.
- a pre-determined second batch barcode sequence can be linked to different sequences of interest in a second sub-population.
- individual template molecules in the first sub-population further comprise at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest in the first sub-population obtained from different sample sources.
- individual template molecules in the second sub-population further comprise at least one sample index sequence that can be used in a multiplex assay to distinguish the sequences of interest in the second sub-population obtained from different sample sources.
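The use of a sample index to distinguish sequences from different sample sources in a multiplex assay can be sketched as a simple grouping of reads by index. The example assumes the sample index occupies the first `index_len` bases of each read and requires an exact match; that layout, the sample names, and the `demultiplex` function are hypothetical and are not specified in the disclosure.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample, index_len):
    """Group reads by sample, assuming the sample index is the read prefix."""
    by_sample = defaultdict(list)
    for read in reads:
        index, remainder = read[:index_len], read[index_len:]
        # Reads whose index is not recognized are kept apart rather than dropped.
        sample = index_to_sample.get(index, "undetermined")
        by_sample[sample].append(remainder)
    return dict(by_sample)


if __name__ == "__main__":
    reads = ["ACGTTTGA", "TTAGCCCA", "GGGGAAAA"]
    samples = {"ACGT": "sample_1", "TTAG": "sample_2"}
    print(demultiplex(reads, samples, index_len=4))
```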
- the plurality of template molecules comprise concatemer template molecules, including at least a first and second sub-population of concatemer template molecules.
- the concatemer template molecules can be generated by conducting rolling circle amplification using circularized library molecules and amplification primers.
- the amplification primers comprise capture primers immobilized to a support.
- the amplification primers comprise soluble (non-immobilized) primers.
- a concatemer template molecule comprises numerous tandem copies of a polynucleotide unit.
- each polynucleotide unit comprises a sequence of interest and at least one sequencing primer binding site.
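The tandem-repeat structure of a concatemer can be pictured with plain strings. In the sketch below, a polynucleotide unit is formed by joining a sequencing primer binding site to a sequence of interest, and the concatemer is that unit repeated; the order of elements within the unit, the toy sequences, and the function names are assumptions made for illustration.

```python
def build_concatemer(primer_binding_site, sequence_of_interest, copies):
    """Return a string of `copies` tandem polynucleotide units."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Element order within the unit is an assumption for this sketch.
    unit = primer_binding_site + sequence_of_interest
    return unit * copies


def count_primer_sites(concatemer, primer_binding_site):
    """Count non-overlapping occurrences of the primer binding site."""
    return concatemer.count(primer_binding_site)


if __name__ == "__main__":
    concatemer = build_concatemer("AACC", "GATTACA", copies=3)
    print(concatemer)
    print(count_primer_sites(concatemer, "AACC"))
```

Because every unit carries its own primer binding site, a single concatemer offers as many primer hybridization sites as it has tandem copies.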
- the rolling circle amplification can be conducted in the presence or absence of a plurality of compaction oligonucleotides.
- individual concatemer template molecules immobilized to the support collapse into a polony or nucleic acid nanoball having a compact size and shape compared to a non-collapsed concatemer template molecule.
- the circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors.
- individual concatemer template molecules in the first subpopulation comprise a plurality of tandem polynucleotide units.
- each polynucleotide unit comprises a first sequence of interest and a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest.
- the polynucleotide unit further comprises a first batch barcode sequence which corresponds to the first sequence of interest.
- the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources.
- concatemer template molecules in the first sub-population have the same first batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
- the sequencing of step (b) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or any range therebetween, or conducting more than 100 sequencing cycles.
- the sequencing of step (b) comprises sequencing at least a portion of the first batch barcode and/or sequencing at least a portion of the first sample index. In some embodiments, the sequencing of step (b) comprises sequencing at least a portion of the first sequence of interest.
- the sequencing of step (b) comprises imaging at least one region of the support to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support. In some embodiments, the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products.
- the sequencing of step (b) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the sequencing of step (b) comprises conducting a two-stage sequencing method.
- the first stage generally comprises contacting the first sub-population of template molecules with a plurality of first batch sequencing primers, a first plurality of sequencing polymerases and a first plurality of detectably labeled multivalent molecules.
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex.
- the nucleic acid duplex comprises a first sub-population template molecule hybridized to a first batch sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent which does not generate an extended sequencing primer.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent.
- the trapping reagent further comprises at least one chaotropic agent.
- the trapping reagent further comprises an amino acid or a modified amino acid. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
- the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the first sub-population of template molecules.
- the first sub-population of template molecules can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of template molecules.
- the second stage of the two-stage sequencing method comprises contacting the first sub-population of template molecules and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second sequencing stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primer.
- the first batch sequencing read product comprises an extended first batch sequencing primer after conducting the second sequencing stage.
- the plurality of nucleotides comprises fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
- the second stage of step (b) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the first sub-population of template molecules.
- the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support (e.g., template mapping).
- the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support using the images of the sequencing reactions of the first sub-population of template molecules.
- the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products.
- the imaging, location determination, and counting data obtained from the first and second stage sequencing reactions can be combined to determine the density of the first batch sequencing read products on the support.
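The counting and density determination described above reduces to dividing the number of mapped read products by the imaged area. The sketch below assumes read-product locations are available as (x, y) coordinates and that identical coordinates reported by both sequencing stages refer to the same read product; the function names and the target window passed to `classify_density` are hypothetical.

```python
def template_density(locations, imaged_area_mm2):
    """Return read products per mm^2 from mapped (x, y) locations."""
    if imaged_area_mm2 <= 0:
        raise ValueError("imaged_area_mm2 must be positive")
    # Locations seen in both stages are counted once (an assumption of this sketch).
    unique_locations = set(locations)
    return len(unique_locations) / imaged_area_mm2


def classify_density(density, target_low, target_high):
    """Compare a density with a user-chosen target window."""
    if density < target_low:
        return "too_low"
    if density > target_high:
        return "too_high"
    return "in_range"


if __name__ == "__main__":
    first_stage = [(1, 1), (2, 5), (7, 3)]
    second_stage = [(2, 5)]
    density = template_density(first_stage + second_stage, imaged_area_mm2=0.5)
    print(density, classify_density(density, target_low=10, target_high=20))
```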
- the nucleotides comprise chain terminating nucleotides.
- individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position.
- the nucleotides are not chain terminating nucleotides.
- the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
- nucleotide incorporation can be conducted in the presence of a stepping reagent.
- the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprise chain terminating nucleotides.
- individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position.
- the plurality of nucleotides are not chain terminating nucleotides.
- the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support when imaging is conducted (e.g., template mapping). In some embodiments, the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support using the images of the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products.
- the de-hybridization reagent comprises a saline-sodium citrate (SSC) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation such as, for example, 50-90 °C.
- the de-hybridization reagent comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent.
- the de-hybridization reagent further comprises at least one chaotropic agent.
- the de-hybridization reagent further comprises at least one nucleic acid compaction agent.
- the de-hybridization step can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
- the first batch sequencing read products are not removed from the first sub-population of template molecules.
- the methods for determining template density further comprise step (c): sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers, thereby generating a plurality of second batch sequencing read products and imaging the same region of the support to detect the sequencing reactions of the second sub-population of template molecules.
- the second batch sequencing read products comprise extension products of the second batch sequencing primers.
- the second batch sequencing read products comprise second batch sequencing primers that are not extended.
- the sequencing of step (c) does not require conducting more than 4 sequencing cycles.
- the sequencing of step (c) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or any range therebetween, or conducting more than 100 sequencing cycles. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second batch barcode and/or sequencing at least a portion of the second sample index. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second sequence of interest.
- the sequencing of step (c) comprises imaging at least one region of the support to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (c) further comprises counting the number of second batch sequencing read products on the support. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
- the sequencing reactions of the first sub-population of template molecules are stopped or inhibited before initiating the sequencing reactions of the second sub-population of template molecules.
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex.
- the nucleic acid duplex comprises a second sub-population template molecule hybridized to a second batch sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent which does not generate an extended sequencing primer.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent.
- the trapping reagent further comprises at least one chaotropic agent.
- the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the sequencing of step (c) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (c) further comprises counting the number of second batch sequencing read products on the support using the images of the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
- prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., by washing).
- prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the second sub-population of template molecules.
- the second sub-population of template molecules can remain immobilized to the support and the second batch sequencing primers can be retained and can remain hybridized to the second sub-population of template molecules.
- the second stage of the two-stage sequencing method comprises contacting the second sub-population of template molecules and the retained second batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second sequencing stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch sequencing primer.
- the second batch sequencing read product comprises an extended second batch sequencing primer after conducting the second sequencing stage.
- the second stage of step (c) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the second sub-population of template molecules.
- the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support (e.g., template mapping).
- the sequencing of step (c) further comprises counting the number of second batch sequencing read products on the support using the images of the sequencing reactions of the second sub-population of template molecules.
- the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
- the imaging, location determination, and counting data obtained from the first and second stage sequencing reactions can be combined to determine the density of the second batch sequencing read products on the support.
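The density determination described above can be sketched as a count divided by an imaged area. This is a minimal illustration only: the `RegionCount` record, the `density_per_mm2` function, the choice of mm² as the unit, and the example numbers are all assumptions made for the sketch and do not come from the disclosure, which does not specify how the counts from imaged regions are pooled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionCount:
    """Counted read products in one imaged region (hypothetical record)."""
    count: int        # number of read products located in the region
    area_mm2: float   # imaged area of the region, in square millimetres


def density_per_mm2(regions):
    """Return read products per mm^2, pooled over all imaged regions."""
    total_area = sum(r.area_mm2 for r in regions)
    if total_area <= 0:
        raise ValueError("total imaged area must be positive")
    return sum(r.count for r in regions) / total_area


# Illustrative numbers only: two imaged regions of the same support.
second_batch = [
    RegionCount(count=1200, area_mm2=0.5),
    RegionCount(count=1800, area_mm2=1.0),
]
print(density_per_mm2(second_batch))
```

Pooling counts and areas before dividing weights each region by its size; averaging per-region densities instead would be an equally plausible reading.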
- the nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
- nucleotide incorporation can be conducted in the presence of a stepping reagent.
- the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprise chain terminating nucleotides.
- individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position.
- the plurality of nucleotides are not chain terminating nucleotides.
- the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once, thereby generating a plurality of second batch sequencing read products.
- one sequencing cycle comprises completion of a first and a second sequencing stage.
- the sequencing of step (c) does not require conducting more than 4 sequencing cycles.
- the sequencing of step (c) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or any range therebetween, or conducting more than 100 sequencing cycles.
- the sequencing of step (c) comprises imaging at least one region of the support to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the imaging is conducted at the first sequencing cycle and/or at the last sequencing cycle. In some embodiments, the imaging is conducted at every sequencing cycle. In some embodiments, the imaging is conducted at fewer than every sequencing cycle, for example every other sequencing cycle or every third sequencing cycle.
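The imaging schedules listed above (first and/or last cycle, every cycle, or every other or every third cycle) can be sketched as a function that returns the cycle numbers at which imaging occurs. The function name `imaged_cycles`, its `mode` and `step` parameters, and the choice to start periodic imaging at cycle 1 are assumptions made for illustration.

```python
def imaged_cycles(n_cycles, mode="every", step=1):
    """Return the 1-based sequencing cycle numbers at which imaging is conducted.

    mode="first_and_last": image only the first and the last cycle.
    mode="every": image every `step`-th cycle starting at cycle 1
                  (step=1 is every cycle, step=2 every other cycle, ...).
    """
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    if mode == "first_and_last":
        return sorted({1, n_cycles})  # a set avoids a duplicate when n_cycles == 1
    if mode == "every":
        if step < 1:
            raise ValueError("step must be at least 1")
        return list(range(1, n_cycles + 1, step))
    raise ValueError(f"unknown mode: {mode!r}")


print(imaged_cycles(4, mode="first_and_last"))  # [1, 4]
print(imaged_cycles(8, mode="every", step=2))   # [1, 3, 5, 7]
```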
- the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support when imaging is conducted (e.g., template mapping). In some embodiments, the sequencing of step (c) further comprises counting the number of second batch sequencing read products on the support using the images of the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
- the sequencing of step (c) comprises hybridizing the second batch sequencing primers to the second sub-population of template molecules in the presence of a hybridization reagent.
- the hybridization reagent comprises an SSC buffer (e.g., 2X saline-sodium citrate) with formamide (e.g., 10-20% formamide).
- the methods for determining template density comprise hybridizing a plurality of detectably labeled oligonucleotide probes to the plurality of immobilized template molecules instead of sequencing the immobilized template molecules of steps (b) and (c).
- the closing the nick in the open circle padlock probes of the first and second sub-populations comprises conducting an enzymatic ligation reaction to close the nick thereby generating a plurality of covalently closed circular padlock probes including at least first and second sub-populations of covalently closed circular padlock probes.
- closing the gap of the open circle padlock probes of the first and second sub-populations comprises conducting a polymerase-catalyzed fill-in reaction using the first or second target molecule as a template, and conducting an enzymatic ligation reaction, thereby generating a plurality of covalently closed circular padlock probes including first and second sub-populations of covalently closed circular padlock probes.
- various embodiments of padlock probes carrying different adaptor sequences in their internal region can be used to generate various embodiments of covalently closed circularized padlock probes (e.g., see FIGs. 15-20).
- methods for generating circularized library molecules further comprise step (e): sequencing the plurality of concatemer template molecules immobilized to the support.
- the sequencing of step (e) comprises sequencing the first sub-population of concatemer template molecules by conducting up to 1000 sequencing cycles to generate a plurality of first sequencing read products, and sequencing the second sub-population of concatemer template molecules by conducting up to 1000 sequencing cycles to generate a plurality of second sequencing read products.
- the concatemer template molecules of the first and second sub-populations can be sequenced essentially simultaneously using a mixture of first and second batch-specific sequencing primers.
- the concatemer template molecules of the first and second sub-populations can be sequenced separately in batches using first batch-specific sequencing primers and then using second batch-specific sequencing primers.
- between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of the second batch concatemer template molecules can be sequenced.
- the sequencing of step (e) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the sequencing of step (e) comprises conducting a two-stage sequencing method.
- in step (e), individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex, where the nucleic acid duplex comprises a concatemer template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- in step (e), the multivalent-complexed polymerases can be exposed to excitation illumination to induce emission of fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals emitted from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photodamage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- the first plurality of sequencing polymerases can be dissociated from the concatemer template molecules of the first sub-population.
- the concatemer template molecules of the first sub-population can remain immobilized to the support and the first batch-specific sequencing primers can be retained and can remain hybridized to the concatemer template molecules of the first sub-population.
- the second stage of the two-stage sequencing method generally comprises contacting the concatemer template molecules of the first sub-population and the retained first batch-specific sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch-specific sequencing primer.
- the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled.
- detecting and imaging of the incorporated nucleotides can be performed.
- detecting and imaging of the incorporated nucleotides can be omitted.
- in step (e), nucleotide incorporation can be conducted in the presence of a stepping reagent.
- the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides.
- individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position.
- the plurality of nucleotides are not chain terminating nucleotides.
- the sequencing of step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
- the sequencing of step (e) comprises a reiterative sequencing workflow, which comprises: step (e1) contacting the plurality of concatemer template molecules with (i) a plurality of batch-specific sequencing primers, (ii) a plurality of sequencing polymerases, and (iii) a plurality of nucleotide reagents, under a condition suitable for hybridizing the plurality of batch-specific sequencing primers to their respective batch sequencing primer binding sites on the concatemer template molecules.
- the reiterative sequencing further comprises step (e2) conducting up to 1000 sequencing cycles to generate at least a first plurality of sequencing read products and optionally a second plurality of sequencing read products.
- the reiterative sequencing further comprises step (e3) removing the first plurality of sequencing read products from the concatemers and retaining the plurality of concatemer template molecules, and optionally removing the second plurality of sequencing read products from the concatemer template molecules and retaining the plurality of concatemer template molecules.
- the reiterative sequencing further comprises step (e4) repeating steps (e1)-(e3) at least once. In some embodiments, the reiterative sequencing further comprises step (e4) repeating steps (e1)-(e3) up to 100 times (e.g., between 1 and 100, between 10 and 80, between 20 and 70, between 30 and 50 or between 5 and 40 times, or any range therebetween). In some embodiments, the reiterative sequencing can be conducted using a sequencing-by-binding procedure, labeled and/or non-labeled chain-terminating nucleotides, or multivalent molecules. These three sequencing methods are described below.
- the plurality of batch-specific sequencing primers can be hybridized to concatemer template molecules with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate) with formamide (e.g., 10-20% formamide).
- the reiterative sequencing of steps (e1) and (e2) comprises conducting the two-stage sequencing method described above.
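The reiterative workflow of steps (e1) through (e4) can be sketched as a nested loop: each round hybridizes primers, runs a number of sequencing cycles, and strips the read products while retaining the templates. The sketch below only builds an ordered plan of steps; the function name `reiterative_plan` and its parameters are hypothetical, and the limits (up to 1000 cycles, 1 to 100 repeats) are taken from the ranges stated above.

```python
def reiterative_plan(repeats, cycles_per_round):
    """Return the ordered workflow as (round, step, detail) tuples.

    `repeats` is the number of times steps (e1)-(e3) are repeated in step (e4),
    so the total number of rounds is 1 + repeats.
    """
    if not 1 <= repeats <= 100:
        raise ValueError("the text describes repeating at least once and up to 100 times")
    if not 1 <= cycles_per_round <= 1000:
        raise ValueError("the text describes up to 1000 sequencing cycles")

    plan = []
    for rnd in range(1, repeats + 2):  # initial round plus `repeats` repetitions
        plan.append((rnd, "e1", "hybridize batch-specific sequencing primers"))
        for cycle in range(1, cycles_per_round + 1):
            plan.append((rnd, "e2", f"sequencing cycle {cycle}"))
        plan.append((rnd, "e3", "remove read products, retain concatemer templates"))
    return plan


for entry in reiterative_plan(repeats=1, cycles_per_round=2):
    print(entry)
```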
- the present disclosure provides methods for generating circularized library molecules comprising step (a): providing a plurality of linear single stranded library molecules (100).
- individual library molecules comprise the following components arranged in any order: (i) a surface pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); (ii) a left unique identification sequence (e.g., UMI) (180); (iii) a batch barcode sequence (195); (iv) a left sample index sequence (160); (v) a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); (vi) a sequence of interest (e.g., insert sequence) (110); (vii) a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); (viii) a right sample index sequence (170); and/or (ix) a surface capture primer binding site sequence (130) (e.g.
- individual linear single stranded library molecules (100) lack any one or any combination of: a left unique identification sequence (e.g., UMI) (180); a batch barcode sequence (195); a left sample index sequence (160); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); and/or a right sample index sequence (170).
- the left and right sample index sequences can be used to distinguish insert sequences (e.g., sequences of interest) that are isolated from different sample sources in a multiplex assay.
- the first left index sequences (160) and/or first right index sequences (170) can be employed to prepare separate sample-indexed libraries using input nucleic acids isolated from different sources.
- the sample-indexed libraries can be pooled together to generate a multiplex library mixture, and the pooled libraries can be circularized, amplified and/or sequenced.
- the sequences of the left sample index (160) and the right sample index (170) are the same or different from each other.
- the left sample index sequence (160) can be 3-20 nucleotides in length.
- the right sample index sequence (170) can be 3-20 nucleotides in length.
- the left sample index sequence (160) and/or the right sample index sequence (170) can include a short random sequence (e.g., NNN).
- the short random sequence can be 3-20 nucleotides in length.
- the left sample index sequence (160) and/or the right sample index sequence (170) can be batch-specific index sequences, i.e., the index sequence or sequences correspond to a particular batch in a batch-sequencing workflow.
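The use of left and right sample index sequences to distinguish inserts from different sample sources can be sketched as a lookup keyed on the index pair. This is an illustrative sketch only: the `demultiplex` function, the index sequences, the sample names, and the "undetermined" bucket for unmatched pairs are assumptions, and exact matching is used although real workflows often tolerate mismatches.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group insert sequences by the (left, right) sample index pair on each read.

    `reads` is an iterable of (left_index, right_index, insert) tuples.
    Reads whose index pair is not in `index_to_sample` go to "undetermined".
    """
    by_sample = defaultdict(list)
    for left_index, right_index, insert in reads:
        sample = index_to_sample.get((left_index, right_index), "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical 6-nt indexes (within the 3-20 nt length range given above).
index_to_sample = {
    ("ACGTAC", "TTGCAA"): "sample_A",
    ("GGATCC", "CATGCA"): "sample_B",
}
reads = [
    ("ACGTAC", "TTGCAA", "ATGCCGTA"),
    ("GGATCC", "CATGCA", "TTAACCGG"),
    ("ACGTAC", "TTGCAA", "GGCCTTAA"),
    ("AAAAAA", "CCCCCC", "ACACACAC"),  # index pair not assigned to any sample
]
print(demultiplex(reads, index_to_sample))
```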
- in step (a), a pre-determined first batch barcode sequence (195-1) can be linked to a given sequence of interest (110-1) in the first sub-population (or can be linked to different sequences of interest in a first sub-population), thus the pre-determined first batch barcode sequence (195-1) corresponds to a given sequence of interest (110-1) in the first sub-population.
- the single-stranded library molecules (100-2) within the second sub-population have the same second batch barcode sequence (195-2), and have the same or different second sequence(s) of interest (110-2).
- in step (a), the sequences of the first and second batch barcode sequences ((195-1) and (195-2)) are the same or different.
- in step (a), the sequences of the first and second batch capture primer binding site sequences ((130-1) and (130-2)) are the same or different.
- the methods further comprise step (b): providing a plurality of single-stranded splint strands (200).
- the methods for generating circularized library molecules described herein further comprise conducting separate and sequential phosphorylation and ligation reactions which are conducted in separate reaction vessels.
- the methods for generating circularized library molecules further comprise step (c1): contacting in a first reaction vessel the plurality of the single-stranded splint strands (200) and the plurality of the linear single stranded library molecules (100) with a T4 polynucleotide kinase enzyme under a condition suitable to phosphorylate the 5’ ends of the plurality of single-stranded splint strands (200) and/or the plurality of linear single stranded library molecules (100); and transferring the phosphorylation reaction to a second reaction vessel.
- the methods for generating circularized library molecules described herein further comprise conducting sequential phosphorylation and ligation reactions which are conducted sequentially in the same reaction vessel.
- the methods for generating circularized library molecules further comprise step (c2): contacting in a first reaction vessel the plurality of the single-stranded splint strands (200) and the plurality of the linear single stranded nucleic acid library molecules (100) with a T4 polynucleotide kinase enzyme under a condition suitable to phosphorylate the 5’ ends of the plurality of single-stranded splint strands (200) and the plurality of linear single stranded nucleic acid library molecules (100), thereby generating phosphorylated single stranded nucleic acid library molecules.
- the methods for generating circularized library molecules further comprise step (d2): contacting in the same first reaction vessel the phosphorylated single-stranded splint strands (200) and the phosphorylated single-stranded nucleic acid library molecules with a ligase under a condition suitable to enzymatically ligate the nicks, thereby generating a plurality of covalently closed circular library molecules (400) each hybridized to a single-stranded splint strand (200).
- the ligase enzyme comprises T7 DNA ligase, T3 ligase, T4 ligase, or Taq ligase.
- the methods for generating circularized library molecules further comprise the optional step of enzymatically removing the plurality of single-stranded splint strands (200) from the plurality of covalently closed circular library molecules (400), which comprises the step: contacting the plurality of covalently closed circular library molecules (400) with at least one exonuclease enzyme to remove the plurality of single-stranded splint strands (200) and retaining the plurality of covalently closed circular library molecules (400).
- the exonuclease reaction can be conducted in the same reaction buffer used to conduct the phosphorylation and/or ligation reactions, or in a different reaction buffer.
- the exonuclease reaction can be conducted in a third reaction vessel after conducting the phosphorylation reaction in the first reaction vessel (step c1, see above), and conducting the ligation reaction in the second reaction vessel (step d1, see above).
- the exonuclease reaction can be conducted in the first reaction vessel after conducting the phosphorylation reaction in the first reaction vessel (step c2, see above), and conducting the sequential ligation reaction in the first reaction vessel (step d2, see above).
- the exonuclease reaction can be conducted in the first reaction vessel after conducting the essentially simultaneous phosphorylation and ligation reactions in the first reaction vessel (step c3, see above).
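The three vessel arrangements described above (separate vessels, one vessel with sequential reactions, one vessel with essentially simultaneous reactions) can be summarized as a small lookup table. The dictionary keys and the `vessels_needed` helper are hypothetical names chosen for this sketch; the vessel numbers follow the text, and the exonuclease step is optional in each arrangement.

```python
# Each workflow is an ordered list of (reaction, reaction-vessel number) pairs.
WORKFLOWS = {
    "separate_vessels": [
        ("phosphorylation (c1)", 1),
        ("ligation (d1)", 2),
        ("exonuclease (optional)", 3),
    ],
    "same_vessel_sequential": [
        ("phosphorylation (c2)", 1),
        ("ligation (d2)", 1),
        ("exonuclease (optional)", 1),
    ],
    "same_vessel_simultaneous": [
        ("phosphorylation + ligation (c3)", 1),
        ("exonuclease (optional)", 1),
    ],
}


def vessels_needed(workflow):
    """Return the number of distinct reaction vessels used by a workflow."""
    return len({vessel for _, vessel in WORKFLOWS[workflow]})


for name in WORKFLOWS:
    print(name, vessels_needed(name))
```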
- the at least one exonuclease enzyme comprises any combination of two or more of exonuclease I, thermolabile exonuclease I and/or T7 exonuclease.
- the covalently closed circular library molecules (400) can be subjected to rolling circle amplification and sequencing (e.g., batch sequencing) as described herein.
- the forward sequencing primer binding site sequence (140) in the library molecules comprises the sequence 5’- CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT -3’ (SEQ ID NO: 1).
- the forward sequencing primer binding site sequence (140) in the library molecules comprises the sequence 5’- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3’ (SEQ ID NO: 2).
- the forward sequencing primer binding site sequence (140) in the library molecules comprises the sequence 5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG -3’ (SEQ ID NO: 20).
- the reverse sequencing primer binding site sequence (150) in the library molecules comprises the sequence
- the reverse sequencing primer binding site sequence (150) in the library molecules comprises the sequence
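An oligonucleotide that hybridizes to a given sequence is that sequence's reverse complement. As a minimal sketch, the helper below computes the reverse complement and length of each forward sequencing primer binding site sequence listed above. The `reverse_complement` function and the `FORWARD_SITES` dictionary are names introduced for illustration, and whether the listed sequences are written as the template strand or the primer strand is an assumption not settled by this passage.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


# Forward sequencing primer binding site sequences as listed in the text.
FORWARD_SITES = {
    "SEQ ID NO: 1": "CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT",
    "SEQ ID NO: 2": "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "SEQ ID NO: 20": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG",
}

for name, site in FORWARD_SITES.items():
    # Print the site length and the sequence that would hybridize to it.
    print(name, len(site), reverse_complement(site))
```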
- the present disclosure provides reagents, kits and methods for preparing circularized library molecules.
- the circularized library molecules are prepared by hybridizing any of the linear library molecules described herein with a plurality of double-stranded splint strands (500) to generate a plurality of library-splint complexes (800) which includes two nicks (e.g., see FIGs. 27, 28, 29, 30A and 30B).
- the nicks can be enzymatically ligated to generate covalently closed circular molecules (900) in which the second splint strand (700) is covalently joined at both ends to the linear single stranded library molecule (100), thereby introducing the new adaptor sequences into the circularized library molecule.
- the present disclosure provides methods for forming a plurality of library-splint complexes (800) comprising: (a) providing a plurality of linear single stranded nucleic acid library molecules (100).
- individual library molecules comprise: (i) a left universal adaptor sequence having a first surface pinning primer binding site sequence (120); (ii) a left universal adaptor sequence having a forward sequencing primer binding site sequence (140); (iii) a sequence of interest (110); (iv) a right universal adaptor sequence having a reverse sequencing primer binding site sequence (150); and (v) a right universal adaptor having a second surface capture primer binding site sequence (130).
- the left universal adaptor sequence (120) comprises a binding sequence for a first surface primer P5.
- the right universal adaptor sequence (130) comprises a binding sequence for a second surface primer P7.
- the linear library further comprises a left sample index sequence (160) and/or a right sample index sequence (170).
- the left and right sample index sequences can be used to distinguish insert sequences that are isolated from different sample sources in a multiplex assay.
- the left index sequence (160) can include a random sequence (e.g., NNN) or lack a random sequence.
- the right index sequence (170) can include a random sequence (e.g., NNN) or lack a random sequence. Exemplary single-stranded library molecules are shown in FIGs. 27, 28, 29, 30A and 30B.
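The use of sample index sequences to separate reads from different sample sources in a multiplex assay can be sketched as follows. This is a minimal illustration only: the index sequences, the index length, and the assumption that the index sits at the start of each read are all hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict

# Hypothetical sample index sequences; the disclosure does not list these values.
SAMPLE_INDEXES = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}

def demultiplex(reads, index_len=6):
    """Group reads by the sample index assumed to sit at the start of each read."""
    bins = defaultdict(list)
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        # Reads whose index is not recognized are kept apart rather than dropped.
        sample = SAMPLE_INDEXES.get(index, "undetermined")
        bins[sample].append(insert)
    return dict(bins)

reads = ["ACGTACGGGTTT", "TGCATGCCCAAA", "ACGTACTTTGGG", "NNNNNNAAAA"]
by_sample = demultiplex(reads)
```

Here `by_sample` maps each sample name to the insert portions of its reads, so inserts from different sources can be analyzed separately.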
- the methods for forming a plurality of library-splint complexes (800) can further comprise step (b): hybridizing the plurality of linear single stranded nucleic acid library molecules (100) with a plurality of double-stranded splint adaptors (500).
- individual double-stranded splint adaptors (500) in the plurality comprise a first splint strand (600) hybridized to a second splint strand (700).
- the double-stranded splint adaptor includes a double-stranded region and two flanking single-stranded regions.
- the first splint strand comprises a first region (620), an internal region (610), and a second region (630).
- the internal region of the first splint strand (610) is hybridized to the second splint strand (700).
- the first splint strand (600) comprises regions arranged in a 5’ to 3’ order: a first region (620), an internal region (610), and a second region (630).
- the second splint strand (700) comprises regions arranged in a 5’ to 3’ order: (i) a second subregion having a universal binding sequence for a fourth surface primer, and (ii) a first subregion having a universal binding sequence for a third surface primer.
- the universal binding sequences for the third surface primer do not bind the first surface primer (e.g., P5) or the second surface primer (e.g., P7).
- the universal binding sequences for the fourth surface primer do not bind the first surface primer (e.g., P5) or the second surface primer (e.g., P7).
- Exemplary double-stranded splint adaptors (500) are shown in FIGs. 27, 28, 29, 30A and 30B.
- the hybridizing of step (b) is conducted under a condition suitable for hybridizing the first region of the first splint strand (620) to the at least first left universal adaptor sequence (120) (e.g., the surface pinning primer binding site sequence) of the library molecule, and the condition is suitable for hybridizing the second region of the first splint strand (630) to the at least first right universal sequence (130) (e.g., the surface capture primer binding site sequence) of the library molecule, thereby circularizing the plurality of library molecules to form a plurality of library-splint complexes (800).
- the library-splint complex (800) comprises a first nick between the 5’ end of the library molecule and the 3’ end of the second splint strand (e.g., see FIGs. 27, 28, 29, 30A and 30B).
- the library-splint complex (800) also comprises a second nick between the 5’ end of the second splint strand and the 3’ end of the library molecule (e.g., see FIGs. 27, 28, 29, 30A and 30B).
- the first and second nicks are enzymatically ligatable.
- the 5’ end of the first splint strand (600) is phosphorylated or lacks a phosphate group.
- the 3’ end of the first splint strand (600) includes a terminal 3’ OH group or a terminal 3’ blocking group.
- the 5’ end of the second splint strand (700) is phosphorylated or lacks a phosphate group.
- the 3’ end of the second splint strand (700) includes a terminal 3’ OH group or a terminal 3’ blocking group.
- the first region of the first splint strand (620) can hybridize to a sense or anti-sense strand of a double-stranded nucleic acid library molecule.
- the second region of the first splint strand (630) can hybridize to a sense or anti-sense strand of a double-stranded nucleic acid library molecule.
- the double-stranded nucleic acid library molecule can be denatured to generate the single-stranded sense and antisense library strands.
- the second splint strand (700) does not hybridize to the sequence of interest (110), and the internal region of the first splint strand (610) does not hybridize to the sequence of interest (110).
- the first region of the first splint strand (620) does not hybridize to the sequence of interest (110), and the second region of the first splint strand (630) does not hybridize to the sequence of interest (110).
- the 5’ end of the linear single stranded library molecule (100) is phosphorylated or lacks a phosphate group.
- the 3’ end of the single-stranded library molecule includes a terminal 3’ OH group or a terminal 3’ blocking group.
- the methods for forming a plurality of library-splint complexes (800) further comprise step (c): contacting the plurality of library-splint complexes (800) from step (b) with a ligase, under a condition suitable to enzymatically ligate the first and second nicks, thereby generating a plurality of covalently closed circular library molecules (900) each hybridized to the first splint strand (600).
- the ligase enzyme comprises T7 DNA ligase, T3 ligase, T4 ligase, or Taq ligase.
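Whether the two nicks in a library-splint complex can be sealed depends on the end groups described above: a DNA ligase joins a free 3’ OH to an adjacent 5’ phosphate. The following minimal sketch checks both nicks under that rule. The `Strand` class and function names are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Strand:
    name: str
    five_prime_phosphate: bool  # True if the 5' end is phosphorylated
    three_prime_oh: bool        # True if the 3' end is a free OH (not blocked)

def nick_is_ligatable(upstream: Strand, downstream: Strand) -> bool:
    """A ligase joins the 3' OH of the upstream strand to the 5' phosphate of the downstream strand."""
    return upstream.three_prime_oh and downstream.five_prime_phosphate

def can_circularize(library: Strand, second_splint: Strand) -> bool:
    # First nick: the 3' end of the second splint strand meets the 5' end of the library molecule.
    first_nick = nick_is_ligatable(second_splint, library)
    # Second nick: the 3' end of the library molecule meets the 5' end of the second splint strand.
    second_nick = nick_is_ligatable(library, second_splint)
    return first_nick and second_nick

library = Strand("library molecule (100)", five_prime_phosphate=True, three_prime_oh=True)
splint = Strand("second splint strand (700)", five_prime_phosphate=True, three_prime_oh=True)
unphosphorylated_splint = Strand("second splint strand (700)", five_prime_phosphate=False, three_prime_oh=True)
```

With these example values, `can_circularize(library, splint)` is true, while the unphosphorylated splint strand leaves the second nick unsealed, so a covalently closed circle (900) would not form in this model.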
- the methods for forming a plurality of library-splint complexes (800) can further comprise an optional step (d): enzymatically removing the plurality of first splint strands (600) from the plurality of covalently closed circular library molecules (900) by contacting the plurality of covalently closed circular library molecules (900) with at least one exonuclease enzyme to remove the plurality of first splint strands (600) and retaining the plurality of covalently closed circular library molecules (900).
- the at least one exonuclease enzyme comprises any combination of two or more of exonuclease I, thermolabile exonuclease I and/or T7 exonuclease.
- the covalently closed circular library molecules (900) can be subjected to rolling circle amplification and sequencing (e.g., batch sequencing) which are described herein.
- the library molecules can include a left universal binding sequence (e.g., a pinning primer binding site sequence (120)) which binds the first region of the first splint strand (620).
- the left universal binding sequence comprises the sequence
- the library molecules can include a left universal binding sequence (e.g., a pinning primer binding site sequence (120)).
- the left universal binding sequence comprises the sequence 5’- CATGTAATGCACGTACTTTCAGGGT -3’ (SEQ ID NO: 18).
- the library molecule in any of the methods for forming a plurality of library-splint complexes (800) described herein includes a forward sequencing primer binding site sequence (140) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3’ (SEQ ID NO: 2). In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a forward sequencing primer binding site sequence (140) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG -3’ (SEQ ID NO: 20).
- the library molecule in any of the methods for forming a plurality of library-splint complexes (800) described herein includes a first sequencing primer binding site sequence (1 0) comprising a universal binding sequence for a sequencing primer.
- the universal binding sequence comprises the sequence 5’- CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT -3’ (SEQ ID NO: 1).
- the library molecule in any of the methods for forming a plurality of library-splint complexes (800) described herein includes a reverse sequencing primer binding site sequence (150) comprising a universal binding sequence for a sequencing primer.
- the universal binding sequence comprises the sequence 5’- AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -3’ (SEQ ID NO: 22).
- the library molecule in any of the methods for forming a plurality of library-splint complexes (800) described herein includes a reverse sequencing primer binding site sequence (150) comprising a universal binding sequence for a sequencing primer.
- the universal binding sequence comprises the sequence 5’- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -3’ (SEQ ID NO: 23).
- the library molecule in any of the methods for forming a plurality of library-splint complexes (800) described herein includes a reverse sequencing primer binding site sequence (150) comprising a universal binding sequence for a sequencing primer.
- the universal binding sequence comprises the sequence 5’- ATGTCGGAAGGTGTGCAGGCTACCGCTTGTCAACT -3’ (SEQ ID NO: 21).
- the library molecule in any of the methods for forming a plurality of library-splint complexes (800) described herein includes a surface capture primer binding site sequence (130) that is a universal binding sequence, and which binds the second region of the first splint strand (630).
- the universal binding sequence comprises the sequence
- the library molecule includes a surface capture primer binding site sequence (130) that is a universal binding sequence and comprises the sequence 5’- AGTCGTCGCAGCCTCACCTGATC -3’ (SEQ ID NO: 24).
- the first sub-region of the second splint strand (700) comprises the sequence 5’- CATGTAATGCACGTACTTTCAGGGT-3’ (SEQ ID NO: 18).
- the second sub-region of the second splint strand (700) comprises the sequence 5’-AGTCGTCGCAGCCTCACCTGATC-3’ (SEQ ID NO: 24).
- the second splint strand (700) comprises first and second sub-regions comprising the sequence 5’- AGTCGTCGCAGCCTCACCTGATCCATGTAATGCACGTACTTTCAGGGT-3’ (SEQ ID NO: 26).
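The relationship between the sub-region sequences and the full second splint strand sequence can be checked directly: read 5’ to 3’, the second sub-region (SEQ ID NO: 24) followed by the first sub-region (SEQ ID NO: 18) gives SEQ ID NO: 26. The short sketch below performs that comparison using only the sequences recited above; the variable names are illustrative.

```python
# Sequences as recited in the text, written 5' to 3'.
SEQ_ID_18 = "CATGTAATGCACGTACTTTCAGGGT"  # first sub-region of the second splint strand (700)
SEQ_ID_24 = "AGTCGTCGCAGCCTCACCTGATC"    # second sub-region of the second splint strand (700)
SEQ_ID_26 = "AGTCGTCGCAGCCTCACCTGATCCATGTAATGCACGTACTTTCAGGGT"

# The second sub-region comes first in the 5' to 3' order, followed by the first sub-region.
assembled_second_splint = SEQ_ID_24 + SEQ_ID_18
sub_regions_match = assembled_second_splint == SEQ_ID_26
```

The same check is a convenient guard against transcription errors when these sequences are copied into an oligonucleotide order or an analysis script.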
- the first region of the first splint strand (620) includes a first universal adaptor sequence which comprises a universal binding sequence (or a complementary sequence thereof) for a first surface capture primer (also referred to as a surface primer).
- the first region (620) comprises the sequence 5’- TCGGTGGTCGCCGTATCATT-3’ (SEQ ID NO: 27).
- the first region of the first splint strand (620) can hybridize to a P5 surface primer or a complementary sequence of the P5 surface primer.
- the P5 surface primer comprises the sequence 5’- AATGATACGGCGACCACCGA-3’ (short P5; SEQ ID NO: 19), or the P5 surface primer comprises the sequence 5’- AATGATACGGCGACCACCGAGATC-3’ (long P5; SEQ ID NO: 28).
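The statement that the first region (620) can hybridize to the P5 surface primer can be illustrated by computing a reverse complement: SEQ ID NO: 27 is the reverse complement of the short P5 sequence (SEQ ID NO: 19), and the long P5 sequence (SEQ ID NO: 28) begins with the short P5 sequence. The helper function below is a generic sketch and not a method from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

SEQ_ID_27 = "TCGGTGGTCGCCGTATCATT"      # first region (620) of the first splint strand
SHORT_P5 = "AATGATACGGCGACCACCGA"       # SEQ ID NO: 19
LONG_P5 = "AATGATACGGCGACCACCGAGATC"    # SEQ ID NO: 28

# Full-length complementarity between the first region and short P5.
first_region_pairs_with_short_p5 = reverse_complement(SEQ_ID_27) == SHORT_P5
# Long P5 extends short P5 at its 3' end.
long_p5_extends_short_p5 = LONG_P5.startswith(SHORT_P5)
```

Both comparisons evaluate to true for the sequences recited above.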
- the second region of the first splint strand (630) includes a second universal adaptor sequence which comprises a universal binding sequence (or a complementary sequence thereof) for a second surface primer.
- the second region (630) comprises the sequence 5’- CAAGCAGAAGACGGCATACGA -3’ (SEQ ID NO: 29).
- the first splint strand (600) includes an internal region (610) which comprises a fifth sub-region having the sequence 5’- GATCAGGTGAGGCTGCGACGACT -3’ (SEQ ID NO: 32). In some embodiments, the first splint strand (600) comprises a first region (620), an internal region (610) having a fourth and fifth sub-region, and a second region (630), having the sequence
- covalently closed circular library molecules can be generated using linear single stranded library molecules (100) and either single-stranded splint strands (200) (e.g., FIGs. 21, 22, 23A, 23B, 25A and 25B) or double-stranded splint adaptors (500) (e.g., FIGs. 27, 28, 29, 30A and 30B), as described above.
- the method for generating circularized library molecules further comprises step (e): conducting a rolling circle amplification reaction by hybridizing the plurality of covalently closed circular library molecules (e.g., (400) or (900)) with a plurality of amplification primers and conducting rolling circle amplification reaction in a template-dependent manner, using a plurality of strand displacing polymerases and a plurality of nucleotides, thereby generating a plurality of concatemer template molecules.
- the rolling circle amplification reaction comprises hybridizing first and second sub-populations of covalently closed circular library molecules (e.g., (400) or (900)) to first and second amplification primers, respectively.
- the first and second amplification primers can be immobilized to a support (e.g., first and second capture primers), or the first and second amplification primers can be in solution.
- the first and second amplification primers have the same sequence or have different sequences.
- the first and second amplification primers having different sequences comprise first and second batch amplification primers.
- the rolling circle amplification reaction is conducted in a template-dependent manner, using a plurality of strand displacing polymerases, a plurality of nucleotides, and the first and second sub-populations of covalently closed circular library molecules (e.g., (400) or (900)), thereby generating a plurality of concatemer template molecules including at least a first sub-population of concatemer template molecules and a second sub-population of concatemer template molecules.
- the rolling circle amplification reaction is conducted in the presence of a plurality of compaction oligonucleotides.
- the rolling circle amplification reaction is conducted in the absence of a plurality of compaction oligonucleotides.
- individual concatemer template molecules in the first subpopulation comprise tandem repeat polynucleotide units.
- a unit comprises a first sequence of interest, the first batch barcode sequence, and a first batch sequencing primer binding site (or a complementary sequence thereof).
- individual concatemer template molecules in the second sub-population comprise tandem repeat polynucleotide units.
- a unit comprises a second sequence of interest, the second batch barcode sequence, and a second batch sequencing primer binding site (or a complementary sequence thereof).
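The tandem repeat structure of a concatemer template molecule can be sketched with a toy model of rolling circle amplification: each pass of the strand-displacing polymerase around the circle appends one more copy of the complement of the circle. The sequences below are hypothetical placeholders for a batch sequencing primer binding site, a batch barcode and a sequence of interest, and the model ignores where on the circle the amplification primer binds.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def rolling_circle(circle: str, n_units: int) -> str:
    """Return a concatemer of n_units tandem copies of the circle's complement.

    The circle is written as a linear string opened at an arbitrary position,
    so the repeat unit is correct up to rotation.
    """
    unit = reverse_complement(circle)
    return unit * n_units

# Hypothetical element sequences, for illustration only.
BATCH_PRIMER_SITE = "GGATCC"
BATCH_BARCODE = "ACT"
SEQUENCE_OF_INTEREST = "TTGACA"

circle = BATCH_PRIMER_SITE + BATCH_BARCODE + SEQUENCE_OF_INTEREST
concatemer = rolling_circle(circle, n_units=3)
```

Every unit of `concatemer` carries the complement of the sequence of interest, the barcode and the primer binding site, which is why a single batch sequencing primer can hybridize at many positions along one concatemer.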
- the covalently closed circular library molecules (e.g., (400) or (900)) can be distributed onto the support comprising a plurality of immobilized surface primers, under a condition suitable to hybridize at least one portion of the covalently closed circular library molecules to the immobilized surface primers, and the rolling circle amplification reaction is conducted thereby generating a plurality of immobilized concatemer template molecules including at least a first sub-population of concatemer template molecules and a second sub-population of concatemer template molecules.
- the on-support rolling circle amplification reaction is conducted in the presence of a plurality of compaction oligonucleotides. In some embodiments, the on-support rolling circle amplification reaction is conducted in the absence of a plurality of compaction oligonucleotides.
- the rolling circle amplification reaction can be conducted in-solution
- the rolling circle amplification reaction and nascent concatemer template molecules can be distributed onto a support having a plurality of surface primers immobilized thereon, under a condition suitable to hybridize at least one portion of the nascent concatemer template molecules to the immobilized surface primers, and the rolling circle amplification reaction can be resumed thereby generating a plurality of immobilized concatemer template molecules including at least a first sub-population of concatemer template molecules and a second sub-population of concatemer template molecules.
- the in-solution rolling circle amplification reaction is conducted in the presence of a plurality of compaction oligonucleotides. In some embodiments, the in-solution rolling circle amplification reaction is conducted in the absence of a plurality of compaction oligonucleotides.
- the methods for generating circularized library molecules further comprise step (f): sequencing the first sub-population of concatemer template molecules using a plurality of first batch sequencing primers.
- the sequencing of step (f) comprises imaging a region of the support to detect the sequencing reactions of the first sub-population of template molecules.
- the sequencing of step (f) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the sequencing of step (f) comprises conducting a two-stage sequencing method.
- the first stage comprises contacting the first sub-population of concatemer template molecules with a plurality of first batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules.
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a concatemer template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules.
- the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent.
- the trapping reagent further comprises a plurality of multivalent molecules.
- the trapping reagent further comprises a first plurality of sequencing polymerases.
- the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
- the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases.
- the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent.
- the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination.
- the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing).
- the first plurality of sequencing polymerases can be dissociated from the first sub-population of concatemer template molecules.
- the first sub-population of concatemer template molecules can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of concatemer template molecules.
- the second stage of the two-stage sequencing method comprises contacting the first sub-population of concatemer template molecules and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation.
- the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primer.
- the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
- the nucleotides comprise chain terminating nucleotides.
- individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position.
- the nucleotides are not chain terminating nucleotides.
- the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
- nucleotide incorporation can be conducted in the presence of a stepping reagent.
- the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a second plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
- the sequencing of step (f) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of first batch sequencing read products.
- one sequencing cycle comprises completion of a first and a second stage.
- the sequencing of step (f) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
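The two-stage sequencing cycle described above can be sketched as a toy loop in which the first stage reports a base without extending the primer and the second stage extends the primer. This sketch assumes chain terminating nucleotides, so that each cycle advances the primer by exactly one base; the function and variable names are hypothetical, and binding kinetics, reagents and imaging are not modeled.

```python
# Toy model of the two-stage cycle. Assumption for this sketch: chain terminating
# nucleotides are used, so each cycle extends the primer by exactly one base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def run_cycles(template: str, n_cycles: int):
    """template: bases in the order the polymerase encounters them (hypothetical input)."""
    base_calls = []   # what imaging reports during the first stage
    extension = ""    # primer extension product built during the second stage
    for _ in range(min(n_cycles, len(template))):
        next_template_base = template[len(extension)]
        # First stage: a labeled multivalent molecule carrying the complementary
        # nucleotide unit binds the complexed polymerase and is imaged;
        # nothing is incorporated.
        base_calls.append(COMPLEMENT[next_template_base])
        # Second stage: a free nucleotide is incorporated, advancing the primer by one base.
        extension += COMPLEMENT[next_template_base]
    return "".join(base_calls), extension

calls, product = run_cycles("TACGGA", n_cycles=4)
```

One pass through the loop body corresponds to one sequencing cycle, so the read length equals the number of completed cycles.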
- the methods for sequencing further comprise step (f1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of concatemer template molecules to generate a plurality of first batch sequencing read products that comprise up to 1000 bases in length.
- the first batch sequencing read products comprise the first batch barcode sequence.
- the first batch sequencing read products comprise the first batch barcode sequence and the sample index sequence.
- the first batch sequencing read products comprise the first batch barcode sequence and at least a portion of the first sequence of interest.
- the first batch sequencing read products comprise the first batch barcode sequence, the sample index sequence, and at least a portion of the first sequence of interest.
- the short read sequencing comprises hybridizing first batch sequencing primers to the first batch sequencing primer binding sites on the first sub-population of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
- the sequencing of step (f1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents.
- the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
- the reiterative sequencing of step (f1) comprises conducting a two-stage sequencing method described herein.
- the methods for sequencing further comprise step (f2): stopping and/or blocking the short read sequencing of step (f1).
- the stopping and/or blocking comprises incorporating a chain terminating nucleotide to the 3’ terminal end of the first batch sequencing read products to inhibit further sequencing reactions.
- Exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
- the methods for sequencing further comprise step (f3): removing the plurality of first batch sequencing read products from the concatemer template molecules of the first sub-population, and retaining the concatemer template molecules of the first sub-population.
- the first batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a dehybridization reagent.
- the methods for sequencing further comprise step (f4): reiteratively sequencing the concatemer template molecules of the first sub-population by repeating steps (f1)-(f3) at least once.
- the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween or more than 50 times.
- the reiterative sequencing can be conducted up to 100 times.
- Exemplary schematics of reiterative sequencing workflows are shown in FIGs. 24A, 24B, 26A, 26B, 31A and 31B.
- the sequences of all of the first batch sequencing read products can be determined and aligned with a first reference sequence to confirm the presence of the first sequence of interest.
- the first reference sequence can be the first batch barcode and/or the first sequence of interest.
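The reiterative workflow of steps (f1) through (f4), followed by comparison against a reference, can be sketched as a loop over a retained template. This is a simplified, error-free illustration: the template, the reference and the identity measure are hypothetical, and a real workflow would use a proper alignment method rather than position-by-position comparison.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def short_read(template: str, n_cycles: int) -> str:
    # (f1) one base call per cycle, in the order the polymerase encounters the template
    return "".join(COMPLEMENT[base] for base in template[:n_cycles])

def reiterative_sequencing(template: str, n_cycles: int, n_rounds: int):
    reads = []
    for _ in range(n_rounds):
        read = short_read(template, n_cycles)  # (f1)
        # (f2) stop extension; (f3) denature and wash away the read product.
        # The template is retained, so the same string is reused in the next round.
        reads.append(read)
    return reads  # (f4) is the repetition itself

def identity(read: str, reference: str) -> float:
    """Fraction of read positions that agree with the reference (no gaps considered)."""
    matches = sum(r == ref for r, ref in zip(read, reference))
    return matches / len(read)

template = "TACGGATC"    # hypothetical retained template
reference = "ATGCCTAG"   # hypothetical reference for the expected read
reads = reiterative_sequencing(template, n_cycles=6, n_rounds=3)
confirmed = all(identity(read, reference) == 1.0 for read in reads)
```

Because the template is retained between rounds, every round yields an independent read of the same molecule, and agreement across rounds supports the presence of the sequence of interest.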
- in step (f3), the plurality of first batch sequencing read products can be removed from the concatemer template molecules and the plurality of concatemer template molecules can be retained using a de-hybridization reagent comprising an SSC (saline-sodium citrate) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation, such as, for example, 50-90 °C.
- in step (f3), the plurality of first batch sequencing read products can be removed from the concatemer template molecules and the plurality of concatemer template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent.
- the de-hybridization reagent further comprises at least one chaotropic agent.
- the de-hybridization reagent further comprises at least one nucleic acid compaction agent.
- the de-hybridization of step (f3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
- the methods for generating circularized library molecules further comprise step (g): sequencing the second sub-population of concatemer template molecules which are immobilized to the support using a plurality of second batch sequencing primers.
- the sequencing of step (g) comprises imaging the same region of the support to detect the sequencing reactions of the second sub-population of concatemer template molecules.
- the sequencing reactions of the first sub-population of concatemer template molecules are stopped before initiating the sequencing reactions of the second sub-population of concatemer template molecules.
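Batch sequencing of two sub-populations on the same region of the support can be sketched as a selection step: only the templates carrying the binding site for the current batch sequencing primer are sequenced in that batch. The class, the template names and the site labels below are hypothetical, and hybridization is reduced to a label match.

```python
from dataclasses import dataclass

@dataclass
class Concatemer:
    name: str
    primer_site: str  # label of the batch sequencing primer binding site on this template

def templates_for_batch(templates, batch_primer_site: str):
    """Return the names of templates that the given batch sequencing primer can prime."""
    return [t.name for t in templates if t.primer_site == batch_primer_site]

# Hypothetical templates immobilized in the same imaged region of the support.
region = [
    Concatemer("polony_1", "batch1_site"),
    Concatemer("polony_2", "batch2_site"),
    Concatemer("polony_3", "batch1_site"),
]

# First batch is sequenced and stopped before the second batch is started.
first_batch = templates_for_batch(region, "batch1_site")
second_batch = templates_for_batch(region, "batch2_site")
```

Because only one sub-population produces signal per batch, neighboring templates belonging to different batches do not need to be optically resolved from each other in the same imaging pass.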
- the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
- individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5).
- the multivalent molecules can be labeled with at least one detectable moiety that emits a signal.
- the multivalent molecules can be labeled with at least one fluorophore.
- individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer.
- the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases.
- the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent.
- the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases.
- the present disclosure provides one or more imaging reagents.
- the present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with an imaging reagent.
- the imaging reagents can reduce photo damage of a fluorescently-labeled compound upon exposure to the excitation illumination.
- the fluorescently-labeled compound comprises a fluorophore-labeled nucleotide, a fluorophore-labeled multivalent molecule or a fluorophore-labeled multivalent-complexed polymerase.
- the imaging reagent can inhibit polymerase-catalyzed nucleotide incorporation.
- the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the imaging reagents further comprise at least one amino acid or modified amino acids.
- the imaging reagent lacks a reducing agent.
- the present disclosure provides one or more stepping reagents.
- the present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with a stepping reagent.
- the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
- the stepping reagent comprises: water; any one or any combination of two or more pH buffering agents comprising Tris (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); any one or any combination of two or more monovalent cations comprising NaCl (e.g., 25-100 mM), KCl (e.g., 10-75 mM) and/or ammonium sulfate (e.g., 1-50 mM); any one or any combination of two or more catalytic cations comprising magnesium chloride (e.g., 1-30 mM), magnesium sulfate (e.g., 1-30 mM) and/or manganese chloride (e.g., 1-30 mM); any one or any combination of two or more pH
- the present disclosure provides one or more nucleic acid de-hybridization reagents.
- the present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with a de-hybridization reagent.
- the de-hybridization reagents can promote nucleic acid denaturation between any two nucleic acid strands.
- the de-hybridization reagents can promote nucleic acid denaturation between a nucleic acid template molecule and a nucleic acid extension product while retaining the nucleic acid template molecule.
- the de-hybridization reagents can promote nucleic acid denaturation between immobilized concatemer template molecules and the plurality of first batch sequencing read products while retaining the immobilized concatemer molecules.
- the de-hybridization reagents can promote nucleic acid denaturation between immobilized concatemer molecules and the plurality of second batch sequencing read products while retaining the immobilized concatemer molecules.
- the de-hybridization reagent comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent.
- the de-hybridization reagent further comprises at least one chaotropic agent.
- the de-hybridization reagent further comprises at least one nucleic acid condenser agent.
- the de-hybridization reagent comprises: any one or any combination of two or more solvents comprising water, acetonitrile (e.g., 10-20%) and/or formamide (e.g., 10-40%); any one or any combination of two or more pH buffering agents comprising MES (e.g., pH 5-7, 10-75 mM), Tris (e.g., pH 6-9, 10-50 mM), HEPES (e.g., pH 6-9, 10-50 mM) and/or PBS (phosphate buffered saline) (e.g., comprising disodium hydrogen phosphate and sodium chloride) (e.g., pH 5-8); at least one reducing agent comprising DMSO (e.g., 10-50%) or TCEP (e.g., 1-10 mM); at least one monovalent salt comprising NaCl (e.g., 0.25-2 M) and/or ammonium sulfate (e
- the at least one pH buffering agent comprises any one or any combination of two or more of Tris, Tris-HCl, Tris-acetate, Tricine, Bicine, Bis-Tris propane, HEPES, MES, 3-(N-morpholino)propanesulfonic acid (MOPS), 2-Hydroxy-3-morpholinopropanesulfonic acid (MOPSO), N,N-Bis(2-hydroxyethyl)-2-aminoethanesulfonic acid (BES), 2-{[1,3-Dihydroxy-2-(hydroxymethyl)propan-2-yl]amino}ethane-1-sulfonic acid (TES), 3-(Cyclohexylamino)-1-propanesulfonic acid (CAPS), 3-{[1,3-dihydroxy-2-(hydroxymethyl)propan-2-yl]amino}propane-1-sulfonic acid (TAPS), 3-{[1,3-
- the reagents can include at least one monovalent salt at a concentration of about 25-500 mM, or about 50-250 mM, or about 100-200 mM, or about 500 mM - 750 mM, or about 750 mM - 1 M, or about 1 M - 1.5 M, or about 1.5 - 2 M, or any range therebetween.
- the reagents can include the reducing agent at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M. In some embodiments, the reagents can include the reducing agent at a concentration of about 0.01-0.1 mM, or about 0.1-1 mM, or about 1-2.5 mM, or about 2.5-5 mM, or about 5-7.5 mM, or about 7.5-9 mM, or about 9-12 mM, or about 12-25 mM, or about 25-50 mM, or any range therebetween.
- the reagents can include the viscosity agent at a concentration of about 1-50 mM, or about 50-100 mM, or about 100-150 mM, or about 150-200 mM.
- the reagents can include the viscosity agent at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M, or about 2-3 M, or about 3-5 M, or any range therebetween.
- any of the reagents described herein comprise at least one chaotropic agent.
- the at least one chaotropic agent can disrupt non-covalent bonds such as hydrogen bonds or van der Waals forces.
- the at least one chaotropic agent comprises any one or any combination of two or more of SDS (sodium dodecyl sulfate), urea, thiourea, guanidinium chloride, guanidine hydrochloride, guanidine thiocyanate, guanidine isothiocyanate, guanidine isothionate, potassium thiocyanate, lithium chloride, sodium iodide, sodium perchlorate or imidazole.
- any of the reagents described herein comprise at least one zwitterion.
- the zwitterion comprises a cationic zwitterionic compound such as a betaine including N,N,N-trimethylglycine and cocamidopropyl betaine.
- the zwitterion comprises an albuminoid, including ovalbumin and serum albumins derived from bovine, equine, or human sources.
- the reagent can include a zwitterion at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M, or any range therebetween.
- any of the reagents described herein comprise at least one sugar alcohol.
- the at least one sugar alcohol comprises sucrose, trehalose, maltose, rhamnose, arabinose, fucose, mannitol, sorbitol or adonitol.
- the reagents can include the sugar alcohol at a concentration of about 1-50 mM, or about 50-100 mM, or about 100-150 mM, or about 150-200 mM, or any range therebetween.
- the reagents can include the sugar alcohol at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M, or about 2-3 M, or about 3-5 M, or any range therebetween.
- any of the reagents described herein comprise at least one crowding agent.
- the at least one crowding agent can increase molecular crowding.
- the at least one crowding agent comprises any one or any combination of two or more of polyethylene glycol (PEG, e.g., 1-50K molecular weight), dextran, dextran sulfate, hydroxypropyl methyl cellulose (HPMC), hydroxyethyl methyl cellulose (HEMC), hydroxybutyl methyl cellulose, hydroxypropyl cellulose, methylcellulose, and hydroxyl methyl cellulose.
- the polyethylene glycol comprises PEG 100, PEG 200, PEG 300, PEG 400, PEG 600 or PEG 800. In some embodiments, the polyethylene glycol comprises PEG 1000, PEG 2000, PEG 3000 or PEG 4000. In some embodiments, the dextran sulfate comprises 150 kDa or 500 kDa forms. In some embodiments, the crowding agent can be present in the reagent at about 1-10%, or about 10-25%, or about 25-30%, or about 30-35%, or about 35-50% or higher percentages by volume based on the total volume of the reagent, or any range therebetween.
- Exemplary but non limiting triplet state quenchers comprise ascorbic acid, 1,4-diazabicyclo[2.2.2]octane (DABCO), cyclo-octatetraene (COT), dithiothreitol (DTT), mercaptoethylamine (MEA), β-mercaptoethanol (BME), n-propyl gallate, p-phenylenediamine (PPD), hydroquinone and sodium azide (NaN3), TEMP (2,2,6,6-tetramethyl-4-piperidone), TEMP amine, TEMPO (2,2,6,6-tetramethyl-1-piperidinyloxyl), TEMPOH (2,2,6,6-tetramethyl-4-piperidinol), HTEMPO (4-hydroxy derivative of TEMPO), 1,3,5-trihydroxybenzene (THB) and DTBN (di-t-butylnitroxide).
- Exemplary but non limiting singlet oxygen quenchers comprise thiol-based quenchers such as glutathione, dithiothreitol, ergothioneine, methionine, cysteine, beta-dimethyl cysteine (penicillamine), mercaptopropionylglycine, MESNA, imidazole, N-acetyl cysteine and captopril.
- oxygen scavengers comprise glutathione, N-acetylcysteine, histidine, tryptophan, hydrazine (N2H4), sodium sulfite (Na2SO3) and hydroxylamine.
- Exemplary but non limiting electron scavengers comprise methyl viologen (e.g., 1,1'-dimethyl-4,4'-bipyridinium dichloride).
- anti-fade formulations comprise commercially available products including Fluoroguard® Antifade Reagent (e.g., from BioRad®), SlowFade Antifade Kit (e.g., includes DABCO, from Molecular Probes-Invitrogen®), ProLong™ Gold Antifade Reagent (e.g., from Invitrogen), and CitiFluor™ (e.g., from CitiFluor).
- Exemplary but non limiting electron rich polyphenols comprise rutin, hesperidin, catechin and epigallocatechin-3-gallate (EGCG).
- Exemplary but non limiting acidic polyphenols comprise dihydroxybenzoic acids (DHBA), gallic acid and derivatives of gallic acid, tiron, potassium hydroquinonesulfonate (HQSA) and 3,6-dihydronaphthalene-2,7-disulfonic acid (e.g., disodium salt) (DHNA).
- Exemplary but non limiting alkenes comprise chlorogenic acid, 4-cyclohexene-1,2-dicarboxylic acid, caffeic acid and sinapic acid including demethylated sinapic acid.
- Exemplary but non limiting hydrogen donor compounds comprise citric acid, shikimic acid, quinic acid, kojic acid, ergothioneine and 2-mercaptoimidazole.
- Exemplary but non limiting thiol based reducing agents comprise cysteine and methionine.
- the reagents can include at least one of the compounds for reducing photo damage at a concentration of about 0.1-1 mM, or about 1-10 mM, or about 10-25 mM, or about 25-50 mM, or about 50-75 mM, or about 75-100 mM.
- the kit comprises any one or any combination of two or more reagents comprising a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
- the kit comprises a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
- each reagent is stored in a separate container.
- the kit can include instructions for use of the kit, e.g. for conducting methods for batch sequencing with or without reiterative sequencing using a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
- the kit can include instructions for use of the kit, e.g. for conducting methods for re-seeding with or without reiterative sequencing using a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
- the trapping reagent in the kit comprises: water; any one or any combination of two or more pH buffering agents comprising Tris-HCl (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); at least one non-catalytic cation comprising strontium acetate (e.g., 1-7 mM) and/or strontium nitrate (e.g., 1-7 mM); any one or any combination of two or more viscosity agents comprising sucrose (e.g., 50-300 mM), ethylene glycol (e.g., 5-20%) and/or propylene glycol (e.g., 1-5%); at least one chelating agent comprising EDTA (e.g., 0.1-
- the trapping reagent further comprises at least one chaotropic agent comprising guanidinium hydrochloride (e.g., 50-150 mM) or guanidinium isothiocyanate (e.g., 50-150 mM).
- the trapping reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM) and/or L-arginine (e.g., 25-100 mM).
- the trapping reagent further comprises any one or any combination of two or more types of multivalent molecules carrying nucleotide units dATP, dGTP, dCTP, dTTP and/or dUTP (e.g., 10-75 nM each type).
- the trapping reagent further comprises a plurality of sequencing polymerases (e.g., 100-600 nM).
- the trapping reagent lacks a non-catalytic cation.
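
The example concentration ranges quoted above for the trapping reagent lend themselves to a simple range check. The following is a minimal sketch in Python, assuming a formulation is written down as component names with a value and a unit. The table layout, the name `TRAPPING_RANGES` and the function `out_of_range` are hypothetical and are not part of the disclosure; only the numeric ranges are taken from the text.

```python
# Illustrative only: the ranges are the example values quoted in the text;
# the data layout and all names are hypothetical.
TRAPPING_RANGES = {
    # component: (low, high, unit)
    "strontium acetate": (1, 7, "mM"),
    "guanidinium hydrochloride": (50, 150, "mM"),
    "betaine": (50, 500, "mM"),
    "beta-alanine": (25, 150, "mM"),
    "L-arginine": (25, 100, "mM"),
    "sequencing polymerase": (100, 600, "nM"),
}


def out_of_range(formulation, ranges=TRAPPING_RANGES):
    """Return the sorted names of components outside their example range.

    `formulation` maps a component name to a (value, unit) pair. Components
    with no quoted range are skipped, since every component is optional.
    """
    problems = []
    for name, (value, unit) in formulation.items():
        if name not in ranges:
            continue  # no example range quoted for this component
        low, high, expected_unit = ranges[name]
        # A unit mismatch is reported rather than silently converted.
        if unit != expected_unit or not (low <= value <= high):
            problems.append(name)
    return sorted(problems)


candidate = {
    "strontium acetate": (5, "mM"),
    "betaine": (600, "mM"),  # above the quoted 50-500 mM example range
    "sequencing polymerase": (250, "nM"),
}
print(out_of_range(candidate))
```

Because the quoted ranges are examples ("e.g."), a flagged component is not necessarily outside the disclosure; the check only reports departures from the listed example values.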
- the imaging reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photodamage, at least one reducing agent, at least one detergent and at least one viscosity agent.
- the imaging reagents further comprise at least one amino acid or modified amino acids.
- the imaging reagent lacks a reducing agent.
- the imaging reagent in the kit comprises: water; any one or any combination of two or more pH buffering agents comprising Tris-HCl (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (pH 5-7, 10-50 mM); at least one monovalent cation comprising NaCl (e.g., 25-100 mM); at least one chelating agent comprising EDTA (e.g., 0.1-0.7 mM); at least one non-catalytic cation comprising strontium acetate (e.g., 1-7 mM) and/or strontium nitrate (e.g., 1-7 mM); any one or any combination of two or more compounds for reducing photo-damage comprising Trolox (e.g., 0.1-0.5 mM),
- the stepping reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent.
- the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides).
- the stepping reagent further comprises a plurality of sequencing polymerases.
- the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation.
- the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
- the stepping reagent in the kit comprises: water; any one or any combination of two or more pH buffering agents comprising Tris (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); any one or any combination of two or more monovalent cations comprising NaCl (e.g., 25-100 mM), KCl (e.g., 10-75 mM) and/or ammonium sulfate (e.g., 1-50 mM); any one or any combination of two or more catalytic cations comprising magnesium chloride (e.g., 1-30 mM), magnesium sulfate (e.g., 1-30 mM) and/or manganese chloride (e.g., 1-30 m
- the stepping reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM), L-arginine (e.g., 25-100 mM) and/or methionine (e.g., 0.1-5 mM).
- the stepping reagent further comprises any one or any combination of two or more types of nucleotides dATP, dGTP, dCTP, dTTP and/or dUTP (e.g., 0.1-5 µM each type).
- the stepping reagent comprises a plurality of detectably labeled nucleotides. In some embodiments, at least one type of nucleotides in the stepping reagent comprises detectably labeled nucleotides. In some embodiments, the detectable label comprises a fluorophore. In some embodiments, the stepping reagent comprises a plurality of non-labeled nucleotides. In some embodiments, the nucleotides in the stepping reagent comprise 3’ chain terminator nucleotide analogs. In some embodiments, the stepping reagent further comprises a plurality of sequencing polymerases (e.g., 100-600 nM). In some embodiments, the stepping reagent lacks a catalytic cation comprising magnesium or manganese.
- the de-hybridization reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent.
- the de-hybridization reagent further comprises at least one chaotropic agent.
- the de-hybridization reagent further comprises at least one nucleic acid condenser agent.
- the de-hybridization reagent in the kit comprises: any one or any combination of two or more solvents comprising water, acetonitrile (e.g., 10-20%) and/or formamide (e.g., 10-40%); any one or any combination of two or more pH buffering agents comprising MES (e.g., pH 5-7, 10-75 mM), Tris (e.g., pH 6-9, 10-50 mM), HEPES (e.g., pH 6-9, 10-50 mM) and/or PBS (phosphate buffered saline) (e.g., comprising disodium hydrogen phosphate and sodium chloride) (e.g., pH 5-8); at least one reducing agent comprising DMSO (e.g., 10-50%) or TCEP (e.g., 1-10 mM); at least one monovalent salt comprising NaCl (e.g., 0.25-2 M) and/or ammonium sulf
- the present disclosure provides one or more cartridges each containing one or more reagents used for conducting any of the methods for batch sequencing with or without reiterative sequencing, and any of the methods for re-seeding with or without reiterative sequencing described above.
- the cartridge can contain any of the reagents described herein including the trapping reagent, imaging reagent, stepping reagent and/or de-hybridization reagent.
- the cartridge can be sub-divided into two or more separate reservoirs where each reservoir contains a different reagent.
- the cartridge can be sub-divided into two or more separate spaces where each space can hold a container containing a reagent.
- the cartridge can include at least four separate spaces.
- each space holds a container comprising a different reagent, e.g. a trapping reagent container, an imaging reagent container, a stepping reagent container or a de-hybridization reagent container.
- the cartridge is configured to fit into a nucleic acid sequencing apparatus.
- the cartridge is connected to at least one capillary that is configured to deliver the contents of the cartridge to one or more supports that are integrated or assembled on a microfluidic flow cell.
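
The cartridge layout described above (at least four separate spaces, each holding a container with a different reagent) can be pictured as a small data structure. The following Python sketch is an assumption-laden illustration: the class name `Cartridge`, its fields and its methods are hypothetical, and the disclosure does not specify any software model of the cartridge.

```python
from dataclasses import dataclass, field

# The four reagent types named in the text.
REAGENT_TYPES = {"trapping", "imaging", "stepping", "de-hybridization"}


@dataclass
class Cartridge:
    """Hypothetical model: numbered spaces, each holding one reagent container."""

    n_spaces: int = 4
    spaces: dict = field(default_factory=dict)  # space index -> reagent type

    def load(self, index, reagent):
        """Place a reagent container in a space, enforcing one reagent per space."""
        if reagent not in REAGENT_TYPES:
            raise ValueError(f"unknown reagent: {reagent}")
        if not 0 <= index < self.n_spaces:
            raise IndexError(f"no space {index}")
        if index in self.spaces:
            raise ValueError(f"space {index} is already occupied")
        if reagent in self.spaces.values():
            raise ValueError(f"{reagent} reagent is already loaded")
        self.spaces[index] = reagent

    def is_complete(self):
        """True when all four reagent types are present."""
        return set(self.spaces.values()) == REAGENT_TYPES


cartridge = Cartridge()
for i, reagent in enumerate(sorted(REAGENT_TYPES)):
    cartridge.load(i, reagent)
print(cartridge.is_complete())
```

The sketch requires each space to hold a different reagent, which matches the embodiment in which each space holds a container comprising a different reagent; other embodiments in the text are less restrictive.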
- the present disclosure provides a support for use in conducting any of the batch sequencing, reiterative sequencing and/or re-seeding methods described herein.
- the support is solid, semi-solid, or a combination of both.
- the support is porous, semi-porous, non-porous, or any combination of porosity.
- the support can be substantially planar, concave, convex, or any combination thereof.
- the support can be cylindrical, for example comprising a capillary or interior surface of a capillary.
- the support comprises any material, including but not limited to glass, fused silica, silicon, a polymer (e.g., polystyrene (PS), macroporous polystyrene (MPPS), polymethylmethacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high density polyethylene (HDPE), cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET)), or any combination thereof.
- the contours and interstitial regions can be fabricated using any combination of photo-chemical, photo-lithography, electron beam lithography, micro- or nano-imprint lithography, ink-jet printing, or micron-scale printing and/or nano-scale printing.
- the contours can be functionalized to promote tethering/immobilizing nucleic acid molecules (e.g., capture primers, pinning primers and/or template molecules) and/or for tethering an enzyme (e.g., a polymerase).
- the interstitial regions can be modified to inhibit tethering nucleic acid molecules (e.g., capture primers, pinning primers and/or template molecules) and/or for inhibiting tethering an enzyme (e.g., a polymerase).
- the support comprises at least one region (e.g., a feature) which can be functionalized to tether/immobilize nucleic acid molecules and/or enzymes.
- the features are arranged on the support in a non-predetermined manner (e.g., randomly positioned features; e.g., FIG. 14A part (i)).
- the features are arranged on the support in a predetermined manner (e.g., patterned features; e.g., FIGs. 14B parts (iii) and (iv)).
- the features are arranged on the support in a repeating pattern (e.g., FIGs. 14B parts (iii) and (iv)).
- a support comprises a plurality of features located at random and non-predetermined positions on the support.
- individual features can attach to a nucleic acid molecule (e.g., surface capture primers, surface pinning primers and/or template molecules).
- Each of the features on the support can be functionalized with a chemical compound to attach to a nucleic acid molecule.
- the surface capture primers on the support can attach to nucleic acid template molecules having one of four different batch sequences (e.g., see FIG. 14A part (ii)).
- the template molecules can attach to the support (via attachment to the capture primers) such that some of the nearest neighbor template molecules touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
- the dotted lines that surround the four template molecules represent nearest neighbor template molecules that touch each other (e.g., FIG. 14A part (ii)).
- the support comprises a contour and at least one feature on or near the contour for tethering nucleic acid molecules.
- one or more wells e.g., a plurality of contours
- the support can be fabricated with any type of contour(s) and feature(s) that are on or near the contour(s), where the features are designed to tether at least one nucleic acid molecule.
- the support lacks contours.
- the support lacks features arranged in a pre-determined pattern where the features have a chemical functionality for tethering nucleic acid molecules and/or enzymes to the support.
- the support comprises features positioned at random non-predetermined locations on the support.
- the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to inhibit tethering nucleic acid molecules or enzymes.
- any of the features for tethering nucleic acids and/or enzymes can be positioned on the support using ink-jet printing, or micron-scale or nanoscale printing.
- the features can be made in any shape including for example, circular, square, triangular or rectangular (e.g., FIGs. 14A parts (i) and (iii)).
- at least one surface of the support can be modified with a chemical compound that enables attachment of a polymer coating to the support.
- the support can be modified with a silane compound.
- the silane compound can bind a polymer coating.
- At least one surface of the support is passivated with at least one polymer coating layer (e.g., FIG. 14C).
- the support is passivated with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more polymer coating layers.
- the coating forms a continuous layer on the support. In some embodiments, the coating forms no pre-determined pattern.
- the surface coating may be patterned, such that the chemical modification layers are confined to one or more discrete regions of the support.
- the coating may be patterned using photolithographic techniques to create an ordered array or random pattern of chemically-modified regions on the support.
- the coating may be patterned using, e.g., contact printing and/or ink-jet printing techniques.
- the coating is distributed on the support in a predetermined pattern, for example the pre-determined pattern comprises stripes or spots arranged in rows and/or columns or other pre-determined patterns.
- the coating having a pre-determined pattern comprises at least one interstitial region that lacks a polymer coating.
- the passivated layer forms a porous or semi-porous layer.
- At least one of the polymer layers comprises a hydrophilic polymer layer.
- at least one polymer layer comprises polymer molecules having a molecular weight of at least 1000 Daltons.
- the hydrophilic polymer layer can comprise polyethylene glycol (PEG).
- the hydrophilic polymer layer can comprise unbranched PEG.
- the hydrophilic polymer layer can comprise branched PEG having at least 4 branches, for example the branched PEG comprises 4-16 branches.
- the hydrophilic polymer layer comprises cross-linking or lacks cross-linking.
- the hydrophilic polymer layer comprises cross-linking to form a hydrogel.
- the hydrophilic polymer layer comprises a monolayer having unbranched polymers which can form a brush monolayer.
- the brush monolayer can form an extended brush monolayer.
- the brush monolayer comprises a plurality of unbranched polymers where one end of a given unbranched polymer is attached to the support and the other end of the same given unbranched polymer is attached to an oligonucleotide primer (e.g., capture primer or pinning primer).
- the density of the plurality of oligonucleotide primers attached to the brush monolayer is about 10^2 - 10^15 per µm^2, for example, between about 10^10 and about 10^15 surface oligonucleotide primers per mm^2, between about 10^5 and about 10^15 oligonucleotide primers per mm^2, between about 10^3 and about 10^14 oligonucleotide primers per mm^2, between about 10^4 and about 10^13 oligonucleotide primers per mm^2, between about 10^5 and about 10^12 oligonucleotide primers per mm^2, between about 10^6 and about 10^11 oligonucleotide primers per mm^2, between about 10^7 and about 10^10 oligonucleotide primers per mm^2, or between about 10^8 and about 10^10 oligonucleotide primers per mm^2, or any range therebetween.
- the coating layer has a degree of hydrophilicity which can be measured as a water contact angle, where the water contact angle is no more than 45 degrees.
- any layer of the polymer coating includes a plurality of oligonucleotide primers covalently tethered to the polymer layer.
- the plurality of oligonucleotide primers are distributed at a plurality of depths throughout any of the polymer layers.
- the density of the plurality of oligonucleotide primers in any of the polymer layers is about 10^2 - 10^15 per µm^2, for example, between about 10^10 and about 10^15 surface oligonucleotide primers per mm^2, between about 10^5 and about 10^15 oligonucleotide primers per mm^2, between about 10^3 and about 10^14 oligonucleotide primers per mm^2, between about 10^4 and about 10^13 oligonucleotide primers per mm^2, between about 10^5 and about 10^12 oligonucleotide primers per mm^2, between about 10^6 and about 10^11 oligonucleotide primers per mm^2, between about 10^7 and about 10^10 oligonucleotide primers per mm^2, or between about 10^8 and about 10^10 oligonucleotide primers per mm^2, or any range therebetween.
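
The primer densities above are quoted both per square micrometer and per square millimeter. Since 1 mm equals 1000 µm, 1 mm² equals 10^6 µm², and the two units differ by a factor of one million. The Python sketch below shows the conversion; the function names are hypothetical and only illustrate the arithmetic.

```python
# 1 mm = 1000 um, so 1 mm^2 = 1000 * 1000 = 10^6 um^2.
UM2_PER_MM2 = 1_000_000


def per_um2_to_per_mm2(density_per_um2):
    """Convert an areal density from per um^2 to per mm^2."""
    return density_per_um2 * UM2_PER_MM2


def per_mm2_to_per_um2(density_per_mm2):
    """Convert an areal density from per mm^2 to per um^2."""
    return density_per_mm2 / UM2_PER_MM2


# The lower bound of 10^2 primers per um^2 corresponds to 10^8 per mm^2.
print(per_um2_to_per_mm2(10**2))
# A density of 10^10 primers per mm^2 corresponds to 10^4 per um^2.
print(per_mm2_to_per_um2(10**10))
```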
- individual oligonucleotide primers comprise nucleic acid molecules comprising DNA, RNA, DNA/RNA chimeric or analogs thereof. In some embodiments, the plurality of oligonucleotide primers are about 10 - 100 nucleotides in length. In some embodiments, individual oligonucleotide primers in the plurality comprise 3’ extendible ends or 3’ non-extendible ends. In some embodiments, the 3’ non-extendible ends comprise a 3’ chain terminating moiety. In some embodiments, individual oligonucleotide primers have their 5’ or 3’ ends or an internal region attached to the polymer layer.
- the 5’ ends of the plurality of oligonucleotide primers are attached to the polymer layer.
- the plurality of oligonucleotide primers are randomly distributed throughout and embedded within at least one of the polymer layers.
- the plurality of oligonucleotide primers are distributed in or on at least one of the polymer layers in a random manner or a pre-determined pattern.
- the plurality of oligonucleotide primers are distributed in or on at least one of the polymer layers in a nonrandom pre-determined pattern, for example the pre-determined pattern comprises stripes or spots arranged in rows and/or columns or other pre-determined patterns.
- the support comprises a first layer comprising a first monolayer having hydrophilic polymer molecules tethered to the support.
- at least some of the polymer molecules in the first layer are covalently tethered to oligonucleotide primers.
- the tethered oligonucleotide primers in the first monolayer are arranged in a random manner or in a pre-determined pattern.
- the polymer molecules in the first layer are not tethered to oligonucleotide primers.
- the support further comprises a second layer comprising a second monolayer having hydrophilic polymer molecules tethered to the first monolayer.
- at least some of the polymer molecules in the second layer are covalently tethered to oligonucleotide primers.
- the tethered oligonucleotide primers in the second monolayer are arranged in a random manner or in a pre-determined pattern.
- the polymer molecules in the second layer are not tethered to oligonucleotide primers.
- the support further comprises a third layer comprising a third monolayer having hydrophilic polymer molecules tethered to the second monolayer.
- at least some of the polymer molecules in the third layer are covalently tethered to oligonucleotide primers.
- the tethered oligonucleotide primers in the third monolayer are arranged in a random manner or in a pre-determined pattern.
- the polymer molecules in the third layer are not tethered to oligonucleotide primers.
- the support comprises a functionalized polymer coating layer covalently bound at least to a portion of the support via a chemical group on the support, a primer grafted to the functionalized polymer coating, and a water-soluble protective coating on the primer and the functionalized polymer coating.
- the functionalized polymer coating comprises a poly(N-(5-azidoacetamidylpentyl)acrylamide-co-acrylamide) (PAZAM).
- At least one of the polymer layers comprises oligonucleotide primers including capture primers, pinning primers, or a mixture of capture and pinning primers.
- the plurality of oligonucleotide primers comprise one type of capture primer (e.g., having the same batch capture primer sequence) or a mixture of 2-100 different types of capture primers (e.g., having 2-100 different batch capture primer sequences).
- the plurality of oligonucleotide primers comprise one type of pinning primer (e.g., having the same batch pinning primer sequence) or a mixture of 2-100 different types of pinning primers (e.g., having 2-100 different batch pinning primer sequences).
- individual capture primers (e.g., which are tethered to and/or embedded in a polymer layer) can be used in an on-support amplification reaction.
- individual capture primers hybridize to a capture primer binding site in a circularized library molecule, and rolling circle amplification can be conducted to generate a concatemer template molecule which is tethered and/or embedded in the polymer layer.
- individual capture primers (e.g., which are tethered to and/or embedded in a polymer layer) can be used in an in-solution amplification workflow.
- individual surface capture primers can hybridize to a surface capture primer binding site in a nascent concatemer template molecule, and rolling circle amplification can continue on the polymer layer to generate a concatemer template molecule which is tethered and/or embedded in the polymer layer.
- the density of the surface capture primers in a polymer layer can be modulated (e.g., increased or decreased) to achieve a desired density of immobilized concatemer template molecules on a support.
- a polymer layer having a high density of surface capture primers will generate concatemer template molecules that are tightly packed and immobilized to the support at a density of about 10^5 - 10^15 per mm^2, which cannot be achieved using supports fabricated to include nano-scale features for attachment of template molecules.
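
To give a feel for what such densities mean geometrically, the mean center-to-center spacing can be estimated from the areal density. The Python sketch below assumes, purely for illustration, that the immobilized molecules sit on an idealized square lattice, so that each molecule occupies an area of 1/density and the spacing is the square root of that area. The disclosure describes randomly positioned molecules, so this is an order-of-magnitude estimate only, and the function name is hypothetical.

```python
import math


def mean_spacing_um(density_per_mm2):
    """Estimate center-to-center spacing in micrometers.

    Assumes an idealized square lattice (an illustrative simplification):
    area per molecule = 1 / density, spacing = sqrt(area).
    With density in per mm^2 and 1 mm = 1000 um, spacing = 1000 / sqrt(density).
    """
    if density_per_mm2 <= 0:
        raise ValueError("density must be positive")
    return 1000.0 / math.sqrt(density_per_mm2)


for density in (1e5, 1e6, 1e10):
    print(f"{density:.0e} per mm^2 -> about {mean_spacing_um(density):.3g} um apart")
```

Under this simplification, a density of 10^6 per mm² corresponds to roughly 1 µm between neighbors, and higher densities push the spacing below that, consistent with the statement elsewhere in the text that neighboring template molecules can touch or overlap.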
- a surface single pinning primer (e.g., which is tethered to or embedded in a polymer layer) can hybridize to a surface pinning primer binding site in a concatemer template molecule to generate a concatemer template molecule which is tethered or embedded (e.g., pinned down) in the polymer layer.
- At least one of the polymer layers comprises a plurality of capture primers and/or pinning primers having a cleavable region that is cleavable with a restriction endonuclease enzyme.
- the cleavable region comprises a recognition site for a type I, type II, type IIS, type IIB, type III, or type IV restriction enzyme.
- the plurality of surface capture primers and/or pinning primers include a cleavable region that is cleavable with an enzyme that generates an abasic site.
- the cleavable region comprises at least one nucleotide having a scissile moiety including uridine, 8-oxo-7,8-dihydroguanine or deoxyinosine.
- the plurality of capture primers and/or pinning primers lack a cleavable region.
- the support comprises at least one partition/barrier that creates separate regions of the support.
- the partition/barrier can prevent fluid flow on one portion of the support.
- the partition/barrier can inhibit nucleic acid and/or enzyme reactions on a portion of the support.
- the partition/barrier can be placed on the support.
- the partition/barrier is not placed on the support but is positioned to block fluid flow onto the support.
- the support lacks partitions/barriers that would create separate regions of the support.
- the support is passivated with at least one polymer coating formed as a continuous layer, and at least one of the polymer layers comprise a plurality of surface capture primers that are randomly distributed throughout and on the polymer layer.
- the surface capture primers can be used to generate immobilized concatemer template molecules.
- the immobilized template molecules are in fluid communication with each other in a massively parallel manner with no barriers to physically separate different batches of template molecules. Instead, sub-populations of template molecules carry different batch sequencing primer binding sites, which enables batch sequencing.
- Asynchronous sequencing is achieved using concatemer template molecules in fluid communication with each other on the same non-partitioned support.
Fragmenting Nucleic Acids
- the present disclosure provides methods for preparing nucleic acid library molecules for use in any of the methods described including batch sequencing, re-seeding, reiterative sequencing, padlock probe workflows, single-stranded splint workflows and/or double-stranded splint workflows.
- the insert region of a nucleic acid library molecule comprises a sequence of interest extracted from any source.
- the insert region can be prepared using recombinant nucleic acid technology including but not limited to any combination of vector cloning, transgenic host cell preparation, host cell culturing and/or PCR amplification.
- the insert region can be in fragmented or un-fragmented form, and can be used to prepare linear nucleic acid library molecules. Fragmented forms of the insert region can be obtained by mechanical force, enzymatic or chemical fragmentation methods. The fragmented insert regions can be generated using procedures that yield a population of fragments having overlapping sequences or non-overlapping sequences.
- Mechanical fragmentation typically generates randomly fragmented nucleic acid molecules.
- Mechanical fragmentation methods include mechanical shearing such as fluid shear, constant shear and pulsatile shear. Mechanical fragmentation methods also include mechanical stress including sonication, nebulization and acoustic cavitation.
- focused acoustic energy can be used to randomly fragment nucleic acid molecules.
- a commercially-available apparatus (e.g., Covaris®) can be used to fragment nucleic acid molecules using focused acoustic energy.
- Enzymatic fragmentation procedures can be conducted under conditions suitable to generate randomly or non-randomly fragmented nucleic acid molecules.
- restriction endonuclease enzyme digestion can be conducted to completion to generate non-randomly fragmented nucleic acid molecules.
- partial or incomplete restriction enzyme digestion can be conducted to generate randomly-fragmented nucleic acid molecules.
- Enzymatic fragmentation using restriction endonuclease enzymes includes any one or any combination of two or more restriction enzymes selected from a group consisting of type I, type II, type IIS, type IIB, type III, or type IV restriction enzymes.
- Enzymatic fragmentation includes digestion of the nucleic acid with a rare-cutting restriction enzyme, comprising Not I, Asc I, Bae I, AspC I, Pac I, Fse I, Sap I, Sfi I or Psr I. Enzymatic fragmentation includes use of any combination of a nicking restriction endonuclease, endonuclease and/or exonuclease. Enzymatic fragmentation can be achieved by conducting a nick translation reaction. In some embodiments, enzymatic fragmentation can be achieved by reacting nucleic acids with an enzyme mixture, for example an enzyme that generates single-stranded nicks and another enzyme that catalyzes double-stranded cleavage. An exemplary enzyme mixture is Fragmentase (e.g., from New England Biolabs®).
- Fragments of the insert region can be generated with PCR using sequence-specific primers that hybridize to target regions in genomic DNA samples to generate insert regions having known fragment lengths and sequences.
- Targeted genome fragmentation methods using CRISPR/Cas9 can be used to generate fragmented insert regions.
- Fragments of the insert portion can also be generated using a transposase-based tagmentation method, for example using NEXTERA® (from Epicentre®).
- the insert region can be single-stranded or double-stranded.
- the ends of the double-stranded insert region can be blunt-ended, or have a 5’ overhang or a 3’ overhang end, or any combination thereof.
- One or both ends of the insert region can be subjected to an enzymatic tailing reaction to generate a non-template poly-A tail by employing a terminal transferase reaction.
- the ends of the insert region can be compatible for joining to at least one adaptor sequence (e.g., universal adaptor sequence or batch-specific adaptor sequence).
- the insert region can be any length, for example the insert region can be about 50-250, or about 250-500, or about 500-750, or about 750-1000, or about 1000-1500, or about 1500-2000 bases or base pairs in length, or any range therebetween. In some embodiments, the insert region can be 2000-5000 bases or base pairs in length.
- the fragments containing the insert region can be subjected to a size selection process, or the fragments are not size selected.
- the fragments can be size selected by gel electrophoresis and gel slice extraction.
- the fragments can be size selected using a solid phase adherence/immobilization method which typically employs micro paramagnetic beads coated with a chemical functional group that interacts with nucleic acids under certain ionic strength conditions with or without polyethylene glycol or polyalkylene glycol.
- Commercially-available solid phase adherence beads include SPRI (Solid Phase Reversible Immobilization) beads from Beckman Coulter® (AMPURE XP® paramagnetic beads, catalog No. B23318), MAGNA PURE® magnetic glass particles (Roche Diagnostics®, catalog No.
- MAGNASIL paramagnetic beads from Promega® (catalog No. MD1360), MAGTRATION® paramagnetic beads and system from Precision System Science (catalog Nos. A1120 and A1060), MAG-BIND® from Omega Bio-Tek (catalog No. M1378-01), MAGPREP® silica from Millipore® (catalog No. 101193), SNARE DNA purification systems from Bangs Laboratories® (catalog Nos. BP691, BP692 and BP693), and CHEMAGEN M-PVA beads from Perkin Elmer® (catalog No. CMG-200).
- the fragmented nucleic acids can be subjected to enzymatic reactions for end-repair and/or A-tailing.
- the fragmented nucleic acids can be contacted with a plurality of enzymes under a condition suitable to generate nucleic acid fragments having blunt-ended 5’ phosphorylated ends.
- the plurality of enzymes generates blunt-ended fragments having a non-template A-tail at their 3’ ends.
- the plurality of enzymes comprise two or more enzymes that can catalyze nucleic acid end-repair, phosphorylation and/or A-tailing.
- the end-repair enzymes include a DNA polymerase (e.g., T4 DNA polymerase) and Klenow fragment.
- the 5’ end phosphorylation enzyme comprises T4 polynucleotide kinase.
- the A-tailing enzyme includes a Taq polymerase (e.g., non-proofreading polymerase) and dATP.
- the fragmenting, end-repair, phosphorylation and A-tailing can be conducted in a one-pot reaction using a mixture of enzymes.
- individual fragmented (or unfragmented) nucleic acids can be covalently joined to at least one adaptor sequence for library preparation.
- a nucleic acid fragment is covalently joined at both ends to one or more adaptors to generate a linear library molecule having the arrangement left adaptor-insert-right adaptor.
- at least one fragment in the population of fragmented nucleic acids comprises a sequence-of-interest.
- Individual library molecules in the population of library molecules can have an insert region that is the same as or different from other library molecules in the population. In some embodiments, about 1-10 ng, or about 10-50 ng, or about 50-100 ng, or any range therebetween, of input fragmented nucleic acids can be appended to one or more adaptors to generate a linear library.
- Individual nucleic acid fragments can be appended on one or both ends to at least one adaptor sequence to form a recombinant nucleic acid linear library molecule having the general arrangement left adaptor-insert-right adaptor.
- any of the adaptors comprise universal adaptor sequences or batch-specific adaptor sequences.
- the adaptors can be prepared using chemical synthesis procedures using native nucleotides with or without nucleotide analogs or modified nucleotide linkages that confer certain properties, including resistance to enzymatic digestion, or increased thermal stability.
- nucleotide analogs and modified nucleotide linkages that inhibit nuclease digestion include phosphorothioate, 2’-O-methyl RNA, inverted dT, and 2’ 3’ dideoxy-dT.
- Insert regions that include locked nucleic acids (LNA) have increased thermal stability.
- a third right junction adaptor sequence (175) can be located between the right sample index sequence (170) and the adaptor sequence for a reverse sequencing primer binding site sequence (150).
- a fourth right junction adaptor sequence (155) can be located between the adaptor sequence for a reverse sequencing primer binding site sequence (150) and the sequence of interest (e.g., insert (110)).
- the library molecule can be generated by joining the first end of a double-stranded insert region (110) to a first double-stranded adaptor having a forward sequencing primer binding site sequence (140), and joining the second end of the double-stranded insert region (110) to a second double-stranded adaptor having a reverse sequencing primer binding site sequence (150).
- the joining is conducted using a DNA ligase enzyme to generate a double-stranded recombinant molecule.
- the first double-stranded adaptor further comprises a left sample index sequence (160) and/or a surface pinning primer binding site sequence (120).
- the second double-stranded adaptor further comprises a right sample index sequence (170) and/or a binding sequence for a capture primer binding site sequence (130).
- the ligating end of the first and/or the second double-stranded adaptors comprise a blunt end, or an overhang end (e.g., 5’ or 3’ overhang end).
- a linear single stranded library molecule (100) can be generated by employing a ligation reaction and primer extension reaction.
- the library molecule can be generated by joining the first end of a double-stranded insert region (110) to a first double-stranded Y-shaped adaptor (e.g., a first forked adaptor), and joining the second end of a double-stranded insert region (110) to a second double-stranded Y-shaped adaptor (e.g., a second forked adaptor).
- the first and second Y-shaped adaptors each comprise two nucleic acid strands, where a portion of the two strands are fully complementary to each other and are annealed together and another portion of the two strands are not complementary to each other and are mismatched.
- the ligating end of the first and second Y-shaped adaptors comprise an annealed portion that forms a blunt end or an overhang end (e.g., 5’ or 3’ overhang end).
- the first strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can include at least a portion of an adaptor sequence having a forward sequencing primer binding site sequence (140) (or a complementary sequence thereof).
- the first strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can further include a left sample index sequence (160).
- the first strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can further include an adaptor sequence having a surface pinning primer binding site sequence (120).
- the double-stranded insert region (110) can be joined to the first and second double-stranded Y-shaped adaptors using a DNA ligase enzyme to generate a double-stranded recombinant molecule.
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Analytical Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Physics & Mathematics (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present disclosure provides compositions, apparatus and methods for conducting separate sequencing batches on a support having nucleic acid template molecules immobilized thereon, where the separate sequencing batches can be conducted using any massively parallel sequencing technology. In some embodiments, a plurality of sub-populations of nucleic acid template molecules are immobilized to the support including at least a first and second sub-population. In some embodiments, the first sub-population of template molecules undergo first batch sequencing reactions and a region of the support is imaged to detect the first sequencing reactions, wherein the second sub-population of template molecules do not undergo sequencing reactions. In some embodiments, the second sub-population of template molecules undergo second batch sequencing and the same region of the support is imaged to detect the second sequencing reactions, wherein the first sub-population of template molecules do not undergo sequencing reactions.
Description
LIBRARY MOLECULE TITRATION FOR TUNABLE SURFACE DENSITY IN POLONY SEQUENCING
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to, and benefit of, U.S. Provisional Application No. 63/573,300, filed on April 2, 2024, the contents of which are incorporated by reference in their entirety.
INCORPORATION BY REFERENCE OF SEQUENCE LISTING
[0002] The contents of the electronic sequence listing (ELEM_025_001WO_SeqList_ST26.xml; Size: 32,233 bytes; and Date of Creation: March 27, 2025) are herein incorporated by reference in their entireties.
TECHNICAL FIELD
[0003] The present disclosure provides compositions, apparatus and methods for conducting separate batches of nucleic acid sequencing on a support. In some embodiments, the separate batches of sequencing can be performed on a support comprising a plurality of nucleic acid template molecules immobilized to the support at high density.
BACKGROUND
[0004] Massively parallel sequencing methods have applications in biomedical research and healthcare settings as they allow for analyzing large quantities of biological samples. However, the limit of optical resolution impedes the ability to perform highly multiplex sequencing. Current technologies are unable to deal with large numbers of molecules being analyzed, as these lead to over-crowded signals and images during sequencing and ultimately to increased costs and time when using these methods. Thus, there exists a need for improved methods which can be used for performing highly multiplex sequencing.
SUMMARY
[0005] The disclosure provides a method for nucleic acid sequencing comprising: (a) providing a support comprising a plurality of nucleic acid template molecules immobilized to the support, wherein the plurality of nucleic acid template molecules comprises at least a first and a second sub-population of template molecules, wherein individual template molecules in the first sub-population of template molecules comprise a first batch sequencing primer binding site, a first batch barcode sequence and at least one first sequence-of-interest, wherein the individual template molecules in the second sub-population of template molecules comprise a second batch sequencing primer binding site, a second batch barcode sequence and at least one second sequence-of-interest; (b) sequencing the first sub-population of template molecules using a plurality of first batch sequencing primers, thereby generating a plurality of first batch sequencing read products and imaging a region of the support to detect the first batch sequencing read products; and (c) sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers, thereby generating a plurality of second batch sequencing read products and imaging the same region of the support to detect the second batch sequencing read products.
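The batch logic of steps (a) through (c) can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the `Template` class, the `sequence_batch` function, the site labels and the example sequences are all hypothetical, and the read is modeled as the barcode followed by the sequence of interest in template orientation, which is a simplification rather than an arrangement stated in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical model of one immobilized template molecule."""
    position: tuple   # (x, y) location on the support, arbitrary units
    primer_site: str  # label of the batch sequencing primer binding site
    barcode: str      # batch barcode sequence
    insert: str       # sequence of interest


def sequence_batch(support, batch_primer, cycles):
    """Return reads keyed by position for templates matching batch_primer.

    Templates carrying a different primer binding site produce no read,
    so imaging the same region detects only the selected batch.
    """
    reads = {}
    for template in support:
        if template.primer_site == batch_primer:
            # Assumed read layout: barcode first, then sequence of interest.
            reads[template.position] = (template.barcode + template.insert)[:cycles]
    return reads


# Three templates sharing one imaged region (all values are made up).
support = [
    Template((0, 0), "SeqSite-1", "ACGT", "TTGACCA"),
    Template((0, 1), "SeqSite-2", "GGCA", "CATTGAC"),
    Template((1, 0), "SeqSite-1", "ACGT", "GGATCCA"),
]

first_batch = sequence_batch(support, "SeqSite-1", cycles=8)
second_batch = sequence_batch(support, "SeqSite-2", cycles=8)
```

In this model the two batches report signal from disjoint sets of positions, even though all templates share one unpartitioned region.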
[0006] In some embodiments of the method of the disclosure, the first batch sequencing primer binding site and the second batch sequencing primer binding site have different sequences. In some embodiments, the first batch barcode sequence and the second batch barcode sequence are different.
[0007] In some embodiments, sequencing the first sub-population of template molecules of step (b) comprises: Step (b1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of template molecules to generate a plurality of first batch sequencing read products that comprise up to 1000 bases in length; Step (b2): stopping and/or blocking the short read sequencing of step (b1); Step (b3): removing the plurality of first batch sequencing read products and retaining the first sub-population of template molecules; and optionally Step (b4): repeating steps (b1) - (b3) at least once.
[0008] In some embodiments, sequencing the second sub-population of template molecules of step (c) comprises: Step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second sub-population of template molecules to generate a plurality of second batch sequencing read products that comprise up to 1000 bases in length; Step (c2): stopping and/or blocking the short read sequencing of step (c1); Step (c3): removing the plurality of second batch sequencing read products and retaining the second sub-population of template molecules; and optionally Step (c4): repeating steps (c1) - (c3) at least once.
[0009] In some embodiments, the first sub-population of template molecules have the same first batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
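The reiterative short-read steps (b1) - (b4) and (c1) - (c4) described above follow the same loop: extend, stop, strip the read product, and repeat on the retained template. The sketch below is a minimal illustration under stated assumptions: the function name `reiterative_sequencing` and the example sequence are hypothetical, the template is a plain string, and each read is reported in template orientation for simplicity.

```python
def reiterative_sequencing(template, cycles, rounds):
    """Model repeated short-read sequencing of one retained template.

    Each round extends a nascent read for up to `cycles` bases (step 1),
    stops (step 2), then removes the read while keeping the template
    (step 3). Repeating the loop corresponds to the optional step 4.
    """
    if not 1 <= cycles <= 1000:
        raise ValueError("cycles must be between 1 and 1000")
    reads = []
    for _ in range(rounds):
        nascent = ""
        for base in template[:cycles]:  # step 1: one base per cycle
            nascent += base
        # step 2: extension stops once the cycle limit is reached
        reads.append(nascent)           # record the read product
        nascent = ""                    # step 3: read removed, template retained
    return reads


reads = reiterative_sequencing("ACGTACGTAC", cycles=4, rounds=3)
```

Because the template is retained after each round, every round in this model yields a read of the same region.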
[0010] In some embodiments, the individual template molecules of the second sub-population of template molecules have the same second batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
[0011] In some embodiments, the plurality of nucleic acid template molecules immobilized to the support are at a density of about 10² - 10¹⁵ template molecules per mm². In some embodiments, the plurality of nucleic acid template molecules are immobilized to the support at a high density. In some embodiments, at least some individual template molecules of the first and second sub-populations of template molecules comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the support lacks partitions and/or barriers that separate regions of the support. In some embodiments, the plurality of template molecules are immobilized to the support at random and non-determined positions on the support. In some embodiments, the plurality of template molecules are immobilized to the support at pre-determined positions on the support (e.g., a patterned support).
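To relate a surface density to the typical spacing between neighboring template molecules, one can use the standard result for points placed uniformly at random in two dimensions, where the mean nearest-neighbor distance is 1/(2√ρ) for density ρ. The sketch below assumes such random placement; the function name and the example densities are chosen for illustration and are not values singled out by the disclosure.

```python
import math


def mean_nearest_neighbor_um(density_per_mm2):
    """Mean nearest-neighbor distance in micrometers.

    Assumes molecules are placed uniformly at random in 2D, for which
    the mean nearest-neighbor distance is 1 / (2 * sqrt(density)).
    The result in mm is converted to micrometers (1 mm = 1000 um).
    """
    if density_per_mm2 <= 0:
        raise ValueError("density must be positive")
    return 1000.0 / (2.0 * math.sqrt(density_per_mm2))


for density in (1e2, 1e6, 1e10):
    spacing = mean_nearest_neighbor_um(density)
    print(f"{density:.0e} per mm^2 -> {spacing:g} um mean spacing")
```

Comparing such spacings with the optical resolution of a given imaging system (a value not stated here) indicates when neighboring molecules would no longer be resolved individually, which is the situation batch sequencing is meant to address.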
[0012] In some embodiments, the plurality of nucleic acid template molecules comprises concatemer template molecules comprising at least a first and second sub-population of concatemer template molecules.
[0013] In some embodiments, individual concatemer template molecules in the first sub-population of concatemer template molecules comprise a plurality of tandem polynucleotide units comprising a first sequence of interest, a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest, and a first batch barcode sequence which corresponds to the first sequence of interest. In some embodiments, individual concatemer template molecules in the second sub-population of concatemer template molecules comprise a plurality of tandem polynucleotide units comprising a second sequence of interest, a second batch sequencing primer binding site sequence which corresponds to the second sequence of interest, and a second batch barcode sequence which corresponds to the second sequence of interest.
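The tandem-unit structure of a concatemer can be sketched as a repeated string. This is a minimal illustration under stated assumptions: the function names, the example sequences, and the order of elements within a unit (primer binding site, then barcode, then sequence of interest) are hypothetical and are not taken from the disclosure.

```python
def build_concatemer(primer_site, barcode, insert, copies):
    """Return a concatemer made of `copies` tandem polynucleotide units.

    The element order inside each unit is an assumption for illustration.
    """
    unit = primer_site + barcode + insert
    return unit * copies


def count_primer_sites(concatemer, primer_site):
    """Count non-overlapping occurrences of a primer binding site."""
    return concatemer.count(primer_site)


# Hypothetical first-batch sequences.
first_site = "AACCGGTT"
second_site = "TTGGCCAA"
first_concatemer = build_concatemer(first_site, "TGCA", "GATTACAGATTACA", copies=5)

sites_for_first_primer = count_primer_sites(first_concatemer, first_site)
sites_for_second_primer = count_primer_sites(first_concatemer, second_site)
```

Each tandem unit contributes one binding site for its own batch sequencing primer and none for the other batch, so in this model a first-batch concatemer offers five sites to the first batch primer and zero to the second.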
[0014] In some embodiments, the first batch sequencing read products comprise: the first batch barcode sequence; or the first batch barcode sequence and the first sequence of interest.
[0015] In some embodiments, the second batch sequencing read products comprise: the second batch barcode sequence; or the second batch barcode sequence and the second sequence of interest.
[0016] The disclosure provides a method for re-seeding a support, comprising: (a) providing a support comprising a plurality of surface capture primers immobilized to the support; (b) distributing on the support a first plurality of circularized library molecules under a condition suitable for hybridizing individual circularized library molecules to individual surface capture primers, and conducting a first rolling circle amplification reaction thereby generating a first plurality of concatemer template molecules immobilized to the support; (c) sequencing at least a subset of the first plurality of concatemer template molecules, thereby generating a first plurality of sequencing read products; (d) distributing on the support a second plurality of circularized library molecules under a condition suitable for hybridizing individual circularized library molecules of the second plurality to individual surface capture primers, and conducting a second rolling circle amplification reaction thereby generating a second plurality of concatemer template molecules immobilized to the support; and (e) sequencing at least a subset of the second plurality of concatemer template molecules, thereby generating a second plurality of sequencing read products.
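The re-seeding steps (a) through (e) above can be pictured with a toy occupancy model. The sketch below rests on assumptions that are not stated in the disclosure: each capture primer holds at most one library molecule, a second seeding only occupies primers left free by the first, and the amount of library input sets how many primers become occupied. The `seed` function and the batch labels are hypothetical.

```python
import random


def seed(support, library, rng):
    """Hybridize library molecules to randomly chosen free capture primers.

    `support` is a list with one slot per capture primer; None marks a
    free primer. Molecules in excess of the free primers are not captured.
    """
    free = [i for i, occupant in enumerate(support) if occupant is None]
    rng.shuffle(free)
    for site, molecule in zip(free, library):
        support[site] = molecule
    return support


rng = random.Random(0)
support = [None] * 10                # step (a): ten immobilized capture primers

seed(support, ["batch-1"] * 4, rng)  # step (b): first seeding and amplification
after_first = list(support)          # step (c): first sequencing would occur here

seed(support, ["batch-2"] * 3, rng)  # step (d): second seeding (re-seeding)
# step (e): second sequencing would occur here
```

In this model the input amount at each seeding (4, then 3 molecules) determines the final occupancy of the support, which is the sense in which titrating library molecules tunes surface density.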
[0017] In some embodiments, the first plurality of circularized library molecules comprises: circularized padlock probes; linear library molecules circularized using single-stranded splint strands; linear library molecules circularized using double-stranded adaptors; or a mixture of any combination of circularized padlock probes, linear library molecules circularized using single-stranded splint strands and/or linear library molecules circularized using double-stranded adaptors.
[0018] In some embodiments, the plurality of surface capture primers are immobilized to the support at random and non-pre-determined positions. In some embodiments, the plurality of surface capture primers are immobilized to the support at pre-determined positions.
[0019] In some embodiments, individual circularized library molecules in the first plurality of circularized library molecules comprise a first seeding batch sequencing primer binding site, a first seeding batch barcode sequence, and a first sequence of interest.
[0020] In some embodiments, the first plurality of sequencing read products of step (c) comprises: a first seeding batch barcode sequence; or a first seeding batch barcode sequence and a first sequence of interest.
[0021] In some embodiments, second individual circularized library molecules in the second plurality of circularized library molecules comprise a second seeding batch sequencing primer binding site, a second seeding batch barcode sequence, and a second sequence of interest.
[0022] In some embodiments, the second plurality of sequencing read products of step (e) comprises: a second seeding batch barcode sequence; or a second seeding batch barcode sequence and a second sequence of interest.
[0023] In some embodiments, sequencing at least the subset of the first plurality of concatemer template molecules of step (c) comprises: Step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first plurality of concatemer template molecules to generate a first plurality of sequencing read products that comprise up to 1000 bases in length; Step (c2): stopping and/or blocking the short read sequencing of step (c1); Step (c3): removing the first plurality of sequencing read products and retaining the first plurality of immobilized concatemer template molecules; and optionally Step (c4): repeating steps (c1) - (c3) at least once.
[0024] In some embodiments, the sequencing at least the subset of the second plurality of concatemer template molecules of step (e) comprises: Step (e1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second plurality of concatemer template molecules to generate a second plurality of sequencing read products that comprise up to 1000 bases in length; Step (e2): stopping and/or blocking the short read sequencing of step (e1); Step (e3): removing the second plurality of sequencing read products and retaining the second plurality of immobilized concatemer template molecules; and optionally Step (e4): repeating steps (e1) - (e3) at least once.
[0025] In some embodiments, the plurality of surface capture primers immobilized to the support are at a density of about 10² - 10¹⁵ capture primers per mm². In some embodiments, at least some of the surface capture primers comprise nearest neighbor surface capture primers that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the support lacks partitions and/or barriers that separate regions of the support.
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:
[0027] FIG. 1 is a schematic of various exemplary configurations of multivalent molecules. Left (Class I): schematics of multivalent molecules having a “starburst” or “helter-skelter” configuration. Center (Class II): a schematic of a multivalent molecule having a dendrimer configuration. Right (Class III): a schematic of multiple multivalent molecules formed by reacting streptavidin with 4-arm or 8-arm PEG-NHS with biotin and dNTPs. Nucleotide units are designated ‘N’, biotin is designated ‘B’, and streptavidin is designated ‘SA’.
[0028] FIG. 2 is a schematic of an exemplary multivalent molecule comprising a generic core attached to a plurality of nucleotide-arms.
[0029] FIG. 3 is a schematic of an exemplary multivalent molecule comprising a dendrimer core attached to a plurality of nucleotide-arms.
[0030] FIG. 4 is a schematic of an exemplary multivalent molecule comprising a core attached to a plurality of nucleotide-arms, where the nucleotide-arms comprise biotin, spacer, linker and a nucleotide unit.
[0031] FIG. 5 is a schematic of an exemplary nucleotide-arm comprising a core attachment moiety, spacer, linker and nucleotide unit.
[0032] FIG. 6 shows the chemical structure of an exemplary spacer (top), and the chemical structures of various exemplary linkers, including an 11-atom Linker, 16-atom Linker, 23-atom Linker, and an N3 Linker (bottom).
[0033] FIG. 7 shows the chemical structures of various exemplary linkers, including Linkers 1-9.
[0034] FIG. 8 shows the chemical structures of various exemplary linkers joined/attached to nucleotide units.
[0035] FIG. 9 shows the chemical structures of various exemplary linkers joined/attached to nucleotide units.
[0036] FIG. 10 shows the chemical structures of various exemplary linkers joined/attached to nucleotide units.
[0037] FIG. 11 shows the chemical structure of an exemplary biotinylated nucleotide-arm. In this example, the nucleotide unit is connected to the linker via a propargyl amine attachment at the 5 position of a pyrimidine base or the 7 position of a purine base.
[0038] FIG. 12 is a schematic of a guanine tetrad (e.g., a G-tetrad).
[0039] FIG. 13 is a schematic of an exemplary intramolecular G-quadruplex structure.
[0040] FIG. 14A is a pair of schematics, (i) and (ii), of an exemplary support having a plurality of nucleic acid capture primers arranged on the support in a non-predetermined and random manner. In (i), the capture primers can be attached to the support such that some of the nearest neighbor capture primers touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. The dotted lines that surround the four capture primers represent nearest neighbor capture primers that touch each other. In (ii), which is a schematic of the same support shown in (i), individual nucleic acid capture primers are attached to a nucleic acid template molecule having one of four different batch sequences. The different batch sequences of the template molecules are represented by horizontal stripes, vertical dashed, brick, or solid black. The template molecules can attach to the support (e.g., via attachment to the capture primers) such that some of the nearest neighbor template molecules touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. The dotted lines that surround the four template molecules represent nearest neighbor template molecules that touch each other.
[0041] FIG. 14B is a pair of schematics, (iii) and (iv), of an exemplary support having a plurality of nucleic acid template molecules immobilized to the support (e.g., via attachment to the capture primers) where the template molecules are arranged on the support in a predetermined manner. The template molecules comprise one of four different batch sequences. The different batch sequences of the template molecules are represented by horizontal stripes, vertical dashed, brick, or solid black. For example, the template molecules can be immobilized to the support to form spots arranged in rows and columns (iii), or the template molecules can be immobilized to the support to form stripes (iv).
[0042] FIG. 14C is a schematic of an exemplary low binding support comprising a glass substrate and alternating layers of hydrophilic coatings which are covalently or non-covalently adhered to the glass, and which further comprises chemically reactive functional groups that serve as attachment sites for oligonucleotide primers (e.g., capture oligonucleotides). Alternatively, the support can be made of any material such as glass, plastic, or a polymer material.
[0043] FIG. 15A is a schematic showing an exemplary workflow for generating circularized padlock probes, comprising hybridizing first and second target-specific padlock probes to the first and second target molecules (respectively) to generate first (left schematic) and second (right schematic) circularized padlock probes (respectively) having a nick or gap, and closing the nick or gap to generate circularized padlock probes. The first padlock probe (left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence), which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer (also referred to herein as a “batch sequencing primer”) binding site
sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iii) a capture primer binding site; and (iv) a compaction oligonucleotide binding site. The second padlock probe (right schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence), which corresponds to the second target sequence (Batch BC-2); (ii) a batch-specific sequencing primer (also referred to herein as a “batch sequencing primer”) binding site sequence which corresponds to the second target sequence (e.g., Batch Seq-2); (iii) a capture primer binding site; and (iv) a compaction oligonucleotide binding site.
[0044] FIG. 15B is a schematic showing an exemplary workflow in which the circularized padlock probes shown in FIG. 15A are subjected to rolling circle amplification (RCA) to generate first (left schematic) and second (right schematic) concatemer template molecules which are immobilized to a support having one type of immobilized capture primers. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support). The first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows), where the first sequencing read products include the first batch barcode sequence (Batch BC-1). The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the second concatemer template molecules do not undergo first batch sequencing. The first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length. In addition, or alternatively, the first and second concatemer template molecules can be subjected to a second sequencing workflow using second batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows), where the second sequencing read products include the second batch barcode sequence (Batch BC-2). The second concatemers can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the first concatemers do not undergo second batch sequencing. 
The second sequencing read products from the second concatemers can be up to 1000 bases in length.
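By way of non-limiting illustration only, the batch-selective priming described for FIG. 15B can be sketched as a small Python model. The class name `ConcatemerTemplate` and the function `sequence_batch` are hypothetical names introduced for this sketch and are not part of the disclosure; the model assumes that a template yields a read only when the batch-specific sequencing primer in use matches its batch sequencing primer binding site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConcatemerTemplate:
    """Hypothetical record for one immobilized concatemer template molecule."""
    name: str
    batch_seq_site: str  # batch sequencing primer binding site label, e.g. "Batch Seq-1"
    batch_barcode: str   # batch barcode label, e.g. "Batch BC-1"


def sequence_batch(templates, batch_primer):
    """Return {template name: batch barcode read} for templates primed by batch_primer.

    Assumption: a template whose binding site does not match the primer
    is not sequenced in that batch and contributes no read.
    """
    return {
        t.name: t.batch_barcode
        for t in templates
        if t.batch_seq_site == batch_primer
    }


templates = [
    ConcatemerTemplate("first concatemer", "Batch Seq-1", "Batch BC-1"),
    ConcatemerTemplate("second concatemer", "Batch Seq-2", "Batch BC-2"),
]

# First sequencing workflow: only the first concatemer is read.
first_reads = sequence_batch(templates, "Batch Seq-1")
# Second sequencing workflow: only the second concatemer is read.
second_reads = sequence_batch(templates, "Batch Seq-2")
```

In this sketch both template types share the same support, and the choice of sequencing primer alone determines which batch is read in a given workflow.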
[0045] FIG. 16 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using
capture primers immobilized to a support. The first circularized padlock probe (Left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The second padlock probe (Right schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the second target sequence (Batch BC-2); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the second target sequence (e.g., Batch Seq-2); (iii) a second batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The first and second circularized padlock probes can be distributed onto a support having two types of immobilized capture primers which selectively hybridize to the first or second batch capture primer binding site sequences in the first or second circularized padlock probes. The circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support. For instance, the first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support). The first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows), where the first sequencing read products include the first batch barcode sequence (Batch BC-1). 
The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the second concatemer template molecules do not undergo first batch sequencing. The first sequencing read products from the first concatemers can be up to 1000 bases in length. In addition, or alternatively, the first and second concatemer template molecules can be subjected to a second sequencing workflow using second batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows), where the second sequencing read products include the second batch barcode sequence (Batch BC-2). The second concatemers can undergo reiterative sequencing comprising up to 1000 sequencing cycles, but the first concatemers do not undergo second batch sequencing. Alternatively, or in addition, the second sequencing read products from the second concatemers can be up to 1000 bases in length.
[0046] FIG. 17 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first (left schematic) and second (right schematic) circularized padlock probes can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The first and second circularized padlock probes can be distributed onto a support having one type of immobilized capture primers which selectively hybridize to the first batch capture primer binding site sequences in the first and second circularized padlock probes. The circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support. For instance, the first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support). The first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows), where the first sequencing read products include the first batch barcode sequence (Batch BC-1). The first and second concatemer template molecules can each undergo reiterative sequencing comprising up to 1000 sequencing cycles.
The first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length. Alternatively, or in addition, the second sequencing read products from the second concatemer can be up to 1000 bases in length.
[0047] FIG. 18 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first circularized padlock probe (Left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first and second target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The second padlock probe (Right schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the second target sequence (Batch BC-2); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first and second target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The first and second circularized padlock probes can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first batch capture primer binding site sequence in the first and second circularized padlock probes. The circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support. For instance, the first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support). The first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first and second sequencing read products (dashed arrows). The first sequencing read products can include the first batch barcode sequence (Batch BC-1). The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length. Alternatively, or in addition, the second sequencing read products can include the second batch barcode sequence (Batch BC-2). The second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
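By way of non-limiting illustration only, when both concatemer types share one batch sequencing primer binding site as in FIG. 18, the resulting reads could be assigned to a batch by their batch barcode. The following Python sketch assumes, purely for illustration, that the barcode occupies the first bases of each read; the four-base barcode sequences, the `BATCH_BARCODES` table and the function `demultiplex` are hypothetical and are not specified by the disclosure.

```python
from collections import defaultdict

# Hypothetical toy barcodes; the disclosure does not specify these sequences.
BATCH_BARCODES = {"ACGT": "Batch BC-1", "TGCA": "Batch BC-2"}


def demultiplex(reads, barcode_len=4):
    """Group reads by the batch barcode assumed to lead each read.

    Reads whose leading bases match no known barcode are placed in
    an "unassigned" bin. The barcode itself is trimmed from the read.
    """
    bins = defaultdict(list)
    for read in reads:
        label = BATCH_BARCODES.get(read[:barcode_len], "unassigned")
        bins[label].append(read[barcode_len:])
    return dict(bins)


# Toy reads produced in a single sequencing workflow with one shared primer.
reads = ["ACGTGGATTC", "TGCACCTTAA", "ACGTTTAGGC", "NNNNACGTAC"]
bins = demultiplex(reads)
```

Exact matching is used here for brevity; a practical implementation would typically tolerate a limited number of barcode mismatches.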
[0048] FIG. 19 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first circularized padlock probe (Left schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first and second target sequence (Batch BC-1); (ii) a batch-specific sequencing primer binding site sequence which corresponds to the first and second target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The second padlock probe (Right schematic) can comprise: (i) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first and second target sequence (Batch BC-1); (ii) a batch-specific
sequencing primer binding site sequence which corresponds to the first and second target sequence (e.g., Batch Seq-1); (iii) a first batch capture primer binding site; and (iv) a compaction oligonucleotide binding site. The first and second circularized padlock probes can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first batch capture primer binding site sequence in the first and second circularized padlock probes. The circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support. For instance, the first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support). The first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first and second sequencing read products (dashed arrows). The first sequencing read products include the first batch barcode sequence (Batch BC-1) and at least a portion of the first target sequence. The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length. Alternatively, or in addition, the second sequencing read products include the second batch barcode sequence (Batch BC-2) and at least a portion of the second target sequence. The second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
[0049] FIG. 20 is a schematic of an exemplary workflow in which circularized padlock probes are subjected to rolling circle amplification (RCA) and batch sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first circularized padlock probe (Left schematic) can comprise: (i) a first sample index which distinguishes sequences of interest obtained from a first sample source (e.g., Sample index-1); (ii) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (iii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iv) a first batch capture primer binding site; and (v) a compaction oligonucleotide binding site. The second circularized padlock probe (Right schematic) can comprise: (i) a second sample index which distinguishes sequences of interest obtained from a second sample source (e.g., Sample index-2); (ii) a batch barcode sequence (i.e., a batch-specific barcode sequence) which corresponds to the first target sequence (Batch BC-1); (iii) a batch-specific sequencing primer binding site sequence which corresponds to the first target sequence (e.g., Batch Seq-1); (iv) a first batch capture primer binding site; and (v) a compaction oligonucleotide binding site. The first and second circularized padlock probes can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first batch capture primer binding site sequence in the first and second circularized padlock probes. The circularized padlock probes can be subjected to rolling circle amplification (RCA) to generate first and second concatemer template molecules which are immobilized to the support. For instance, the first and second circularized padlock probes can be distributed onto the support essentially simultaneously, or distributed onto the support sequentially (e.g., re-seeding the support). The first and second concatemer template molecules can be subjected to a first sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first and second sequencing read products (dashed arrows). The first sequencing read products can include the first batch barcode sequence (Batch BC-1) and the first sample index sequence. The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The first sequencing read products from the first concatemer can be up to 1000 bases in length. Alternatively, or in addition, the second sequencing read products can include the second batch barcode sequence (Batch BC-2) and the second sample index sequence. The second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles.
The second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
[0050] FIG. 21 is a schematic of an exemplary workflow of a linear single stranded library molecule (100) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a library-splint complex (300) with a nick which is enzymatically ligatable. The exemplary library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific surface pinning primer binding site sequence); an optional left unique identification sequence (180) (e.g., UMI); a left sample index sequence (160); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence). The single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the surface pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the linear single-stranded library molecule (100).
[0051] FIG. 22 is a schematic of an exemplary workflow of a linear single stranded library molecule (100) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a library-splint complex (300) with a nick which is enzymatically ligatable. The exemplary linear single stranded library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific surface pinning primer binding site sequence); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a batch barcode sequence (195); a left sample index sequence (160); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence). The single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the surface pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the linear single-stranded library molecule (100).
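By way of non-limiting illustration only, the splint-mediated circularization described for FIG. 22 can be sketched in Python using the reference numerals as labels. The sketch assumes that the elements are arranged in the order in which they are listed above, so that elements (120) and (130) are the two terminal elements; the names `LIBRARY_MOLECULE_100`, `SPLINT_200`, `splint_can_circularize` and `nick_junction` are hypothetical and introduced only for this sketch.

```python
# Elements of the linear library molecule (100), in the order listed for FIG. 22.
# Assumption: the listed order reflects the arrangement along the strand.
LIBRARY_MOLECULE_100 = [
    ("120", "surface pinning primer binding site sequence"),
    ("140", "forward sequencing primer binding site sequence"),
    ("195", "batch barcode sequence"),
    ("160", "left sample index sequence"),
    ("110", "sequence of interest"),
    ("150", "reverse sequencing primer binding site sequence"),
    ("170", "right sample index sequence"),
    ("130", "surface capture primer binding site sequence"),
]

# Splint strand (200): each splint region maps to the element it hybridizes with.
SPLINT_200 = {"210": "120", "220": "130"}


def splint_can_circularize(molecule, splint):
    """True if the splint regions hybridize with both terminal elements,
    which brings the two ends together to form a ligatable nick."""
    terminal = {molecule[0][0], molecule[-1][0]}
    return set(splint.values()) == terminal


def nick_junction(molecule):
    """Return the pair of elements that flank the nick after circularization."""
    return (molecule[-1][0], molecule[0][0])


can_circularize = splint_can_circularize(LIBRARY_MOLECULE_100, SPLINT_200)
junction = nick_junction(LIBRARY_MOLECULE_100)
```

The model tracks only the order of elements and does not represent nucleotide sequences or strand complementarity.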
[0052] FIG. 23A is a schematic of an exemplary workflow of a first linear single stranded library molecule (100-1) (linear single stranded library molecule-1) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a first library-splint complex (300-1) with a nick which is enzymatically ligatable. The exemplary first linear single stranded library molecule (100-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1); a first batch barcode sequence (195-1); a first sample index sequence (160-1); a first sequence of interest (insert-1, 110-1); and a first surface capture primer binding site sequence (130-1). The single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the first surface pinning primer binding site sequence (120-1) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the first surface capture primer binding site sequence (130-1) of the linear single-stranded library molecule (100).
[0053] FIG. 23B is a schematic of an exemplary workflow of a second linear single stranded library molecule (100-2) (linear single stranded library molecule-2) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a second library-splint complex (300-2) with a nick which is enzymatically ligatable. The exemplary second linear single stranded library molecule (100-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2); a second batch barcode sequence (195-2); a first sample index sequence (160-1); a second sequence of interest (insert-2, 110-2); and a first surface capture primer binding site sequence (130-1). The single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the first surface pinning primer binding site sequence (120-1) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the first surface capture primer binding site sequence (130-1) of the linear single-stranded library molecule (100). The first sequence of interest in the library-splint complex shown in FIG. 23A (110-1) and the second sequence of interest in the library-splint complex shown in FIG. 23B (110-2) can have the same sequence or different sequences.
[0054] FIG. 24A is a schematic of an exemplary workflow in which the nick in the first library-splint complex (300-1) shown in FIG. 23A is ligated to generate a first covalently closed circular library molecule (400-1) which is shown in FIG. 24A. The first covalently closed circular library molecule (400-1) is subjected to rolling circle amplification (RCA) to generate a first concatemer template molecule, and the first concatemer template molecule is subjected to batch reiterative sequencing. The rolling circle amplification reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first covalently closed circular library molecule (400-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1) which corresponds with the first sequence of interest (insert-1, 110-1); a first batch barcode sequence (195-1) which corresponds with the first sequence of interest (110-1); a first sample index sequence (160-1); a first sequence of interest (110-1); and a first surface capture primer binding site sequence (130-1). A plurality of the first covalently closed circular library molecule (400-1) shown in FIG. 24A can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first capture primer binding site sequence (130-1) in the first covalently closed circular library molecules (400-1). The first covalently closed circular library molecules (400-1) can be subjected to rolling circle amplification (RCA) to generate a
plurality of first concatemer template molecules which are immobilized to the support. The first concatemer template molecules can be subjected to a sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows). The first sequencing read products can include the first batch barcode sequence (195-1) as shown in FIG. 24A. Alternatively, or in addition, the first sequencing read products can include the first batch barcode sequence (195-1) and the first sample index sequence (160-1) (not shown). Alternatively, or in addition, the first sequencing read products can include the first batch barcode sequence (195-1), the first sample index sequence (160-1), and at least a portion of the first sequence of interest (110-1) (not shown). The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length.
[0055] FIG. 24B is a schematic of an exemplary workflow in which the nick in the second library-splint complex (300-2) shown in FIG. 23B is ligated to generate a second covalently closed circular library molecule (400-2) which is shown in FIG. 24B. The second covalently closed circular library molecule (400-2) is subjected to rolling circle amplification (RCA) to generate a second concatemer template molecule, and the second concatemer template molecule is subjected to batch reiterative sequencing. The rolling circle amplification reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The second covalently closed circular library molecule (400-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2) which corresponds with the second sequence of interest (insert-2, 110-2); a second batch barcode sequence (195-2) which corresponds with the second sequence of interest (110-2); a first sample index sequence (160-1); a second sequence of interest (110-2); and a first surface capture primer binding site sequence (130-1). A plurality of the second covalently closed circular library molecule (400-2) shown in FIG. 24B can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first capture primer binding site sequence (130-1) in the second covalently closed circular library molecules (400-2). A plurality of the first covalently closed circular library molecule (400-1) shown in FIG. 24A and a plurality of the second covalently closed circular library molecule (400-2) shown in FIG. 24B can be distributed onto the same support. For instance, the first covalently closed circular library molecules (400-1) shown in FIG. 24A and the second covalently closed
circular library molecules (400-2) shown in FIG. 24B can be distributed onto the support essentially simultaneously. Alternatively, or in addition, the first covalently closed circular library molecules (400-1) shown in FIG. 24A and the second covalently closed circular library molecules (400-2) shown in FIG. 24B can be distributed onto the support sequentially (e.g., re-seeding the support). The second covalently closed circular library molecules (400-2) can be subjected to rolling circle amplification (RCA) to generate a plurality of second concatemer template molecules which are immobilized to the support. The second concatemer template molecules can be subjected to a sequencing workflow using second batch sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows). In some cases, the second concatemer template molecules are not sequenced when first batch sequencing primers are used to sequence the first concatemer template molecules. Alternatively, or in addition, the first concatemer template molecules are not sequenced when second batch sequencing primers are used to sequence the second concatemer template molecules. The second sequencing read products can include the second batch barcode sequence (195-2) as shown in FIG. 24B. Alternatively, or in addition, the second sequencing read products can include the second batch barcode sequence (195-2) and the first sample index sequence (160-1) (not shown). Alternatively, or in addition, the second sequencing read products include the second batch barcode sequence (195-2), the first sample index sequence (160-1), and at least a portion of the second sequence of interest (110-2) (not shown). The second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The second sequencing read products from the second concatemer can be up to 1000 bases in length.
[0056] FIG. 25A is a schematic of an exemplary workflow of a first linear single stranded library molecule (100-1) (linear single-stranded library molecule-1) hybridizing with a single-stranded splint molecule/strand (ss-splint strand) (200) thereby circularizing the library molecule to form a first library-splint complex (300-1) with a nick which is enzymatically ligatable. The exemplary first linear single stranded library molecule (100-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1); a first batch barcode sequence (195-1); a first sequence of interest (insert-1, 110-1); and a first surface capture primer binding site sequence (130-1). The single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the first surface pinning primer binding site sequence (120-1) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the first surface capture primer binding site sequence (130-1) of the first linear single-stranded library molecule (100-1).
[0057] FIG. 25B is a schematic of an exemplary workflow of a second single-stranded library molecule (100-2) (linear single-stranded library molecule-2) hybridizing with a single-stranded splint molecule/strand (ss-splint strand) (200) thereby circularizing the library molecule to form a second library-splint complex (300-2) with a nick which is enzymatically ligatable. The exemplary second linear single stranded library molecule (100-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2); a second batch barcode sequence (195-2); a second sequence of interest (insert-2, 110-2); and a first surface capture primer binding site sequence (130-1). The single-stranded splint strand (200) can comprise a first region (210) that hybridizes with the first surface pinning primer binding site sequence (120-1) of the linear single-stranded library molecule (100), and a second region (220) that hybridizes with the first surface capture primer binding site sequence (130-1) of the second linear single-stranded library molecule (100-2). The first sequence of interest (110-1) in the first library-splint complex shown in FIG. 25A and the second sequence of interest (110-2) in the second library-splint complex shown in FIG. 25B can have the same sequence or different sequences.
[0058] FIG. 26A is a schematic of an exemplary workflow in which the nick in the first library-splint complex (300-1) shown in FIG. 25A is ligated to generate a first covalently closed circular library molecule (400-1) which is shown in FIG. 26A. The first covalently closed circular library molecule (400-1) is subjected to rolling circle amplification (RCA) to generate a first concatemer template molecule, and the first concatemer template molecule is subjected to batch reiterative sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first covalently closed circular library molecule (400-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1) which corresponds with the first sequence of interest (insert-1, 110-1); a first batch barcode sequence (195-1) which corresponds with the first sequence of interest (110-1); a first sequence of interest (110-1); and a first surface capture primer binding site sequence (130-1). A plurality of the first covalently closed circular library molecule (400-1) shown in FIG. 26A can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first capture primer binding site sequence (130-1) in the first covalently closed circular library molecules (400-1). The first covalently closed circular library molecules (400-1) can be subjected to rolling circle amplification (RCA) to generate a plurality of first concatemer template molecules which are immobilized to the support. The first concatemer template molecules can be subjected to a sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows). The first sequencing read products can include the first batch barcode sequence (195-1) as shown in FIG. 26A. Alternatively, or in addition, the first sequencing read products can include the first batch barcode sequence (195-1) and at least a portion of the first sequence of interest (110-1) (not shown). The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The first sequencing read products from the first concatemer template molecules can be up to 1000 bases in length.
[0059] FIG. 26B is a schematic of an exemplary workflow in which the nick in the library-splint complex (300-2) shown in FIG. 25B is ligated to generate a second covalently closed circular library molecule (400-2) which is shown in FIG. 26B. The second covalently closed circular library molecule (400-2) is subjected to rolling circle amplification (RCA) to generate a second concatemer template molecule, and the second concatemer template molecule is subjected to batch reiterative sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The second covalently closed circular library molecule (400-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2) which corresponds with the second sequence of interest (insert-2, 110-2); a second batch barcode sequence (195-2) which corresponds with the second sequence of interest (110-2); a second sequence of interest (110-2); and a first surface capture primer binding site sequence (130-1). A plurality of the second covalently closed circular library molecule (400-2) shown in FIG. 26B can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first capture primer binding site sequence (130-1) in the second covalently closed circular library molecules (400-2). A plurality of the first covalently closed circular library molecule (400-1) shown in FIG. 26A and a plurality of the second covalently closed circular library molecule (400-2) shown in FIG. 26B can be distributed onto the same support. For instance, the first covalently closed circular library molecules (400-1) shown in FIG. 26A and the second covalently closed circular library molecules (400-2) shown in FIG. 26B can be distributed onto the support essentially simultaneously. Alternatively, or in addition, the first covalently closed circular library molecules (400-1) shown in FIG. 26A and the second covalently closed circular library molecules (400-2) shown in FIG. 26B can be distributed onto the support sequentially (e.g., re-seeding the support). The second covalently closed circular library molecules (400-2) can be subjected to rolling circle amplification (RCA) to generate a plurality of second concatemer template molecules which are immobilized to the support. The second concatemer template molecules can be subjected to a sequencing workflow using second batch sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows). In some cases, the second concatemer template molecules are not sequenced when first batch sequencing primers are used to sequence the first concatemer template molecules. Alternatively, or in addition, the first concatemer template molecules are not sequenced when second batch sequencing primers are used to sequence the second concatemer template molecules. The second sequencing read products can include the second batch barcode sequence (195-2) as shown in FIG. 26B. Alternatively, or in addition, the second sequencing read products can include the second batch barcode sequence (195-2) and at least a portion of the second sequence of interest (110-2) (not shown). Alternatively, or in addition, the second concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The second sequencing read products from the second concatemer template molecule can be up to 1000 bases in length.
[0060] FIG. 27 is a schematic of an exemplary workflow of a linear single-stranded library molecule (100) hybridizing with a double-stranded adaptor (ds-splint adaptor) (500) thereby circularizing the linear single-stranded library molecule to form a library-splint complex (800) with two nicks (solid arrowheads). The exemplary linear single-stranded library molecule (100) can comprise: a pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); an optional left unique identification sequence (180) (e.g., UMI); a left sample index sequence (160); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence). The double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700). In the double-stranded adaptor, the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700). The second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion. The first region (620) of the first splint strand (600) can hybridize to at least a portion of the surface pinning primer binding site sequence (120) of a linear single-stranded library molecule (100), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the surface capture primer binding site sequence (130) of the same linear single-stranded nucleic acid library molecule (100).
[0061] FIG. 28 is a schematic of an exemplary workflow of a linear single-stranded library molecule (100) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the library molecule to form a library-splint complex (800) with two nicks (solid arrowheads). The exemplary linear single-stranded library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a batch-specific barcode sequence (195); a left sample index sequence (160); a sequence of interest (110); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); a right sample index sequence (170); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence). The double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700). In the double-stranded adaptor, the first splint strand (600) comprises a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700). The second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion. The first region (620) of the first splint strand (600) can hybridize to at least a portion of the surface pinning primer binding site sequence (120) of a linear single-stranded nucleic acid library molecule (100), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the surface capture primer binding site sequence (130) of the same single-stranded nucleic acid library molecule (100).
[0062] FIG. 29 is a schematic of an exemplary workflow of a linear single-stranded library molecule (100) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the linear single-stranded library molecule (100) to form a library-splint complex (800) with two nicks (solid arrowheads). The exemplary library molecule (100) can comprise: a surface pinning primer binding site sequence (120) (e.g., a batch-specific pinning primer binding site sequence); a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); a batch barcode sequence (195); a left sample index sequence (160); a sequence of interest (110); and a surface capture primer binding site sequence (130) (e.g., a batch-specific surface capture primer binding site sequence). The double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700). In the double-stranded adaptor, the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700). The second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion. The first region (620) of the first splint strand (600) can hybridize to at least a portion of the surface pinning primer binding site sequence (120) of a linear single-stranded library molecule (100), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the surface capture primer binding site sequence (130) of the same linear single-stranded library molecule (100).
[0063] FIG. 30A is a schematic of an exemplary workflow of a first linear single-stranded library molecule (100-1) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the first linear single-stranded library molecule to form a first library-splint complex (800-1) with two nicks (solid arrowheads) that are enzymatically ligatable. The exemplary first linear single-stranded library molecule (100-1) can comprise: a first pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1); a first batch barcode sequence (195-1); a first sequence of interest (insert-1, 110-1); and a first surface capture primer binding site sequence (130-1). The double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700). In the double-stranded adaptor, the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700). The second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion. The first region (620) of the first splint strand (600) can hybridize to at least a portion of the first pinning primer binding site sequence (120-1) of a linear single-stranded library molecule (100-1), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the first surface capture primer binding site sequence (130-1) of the same linear single-stranded library molecule (100-1).
[0064] FIG. 30B is a schematic of an exemplary workflow of a second linear single-stranded library molecule (100-2) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the library molecule to form a second library-splint complex (800-2) with two nicks (solid arrowheads) that are enzymatically ligatable. The exemplary second linear single-stranded library molecule (100-2) can comprise: a first pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2); a second batch barcode sequence (195-2); a second sequence of interest (insert-2, 110-2); and a first surface capture primer binding site sequence (130-1). The double-stranded adaptor can comprise a first splint strand (600) hybridized to a second splint strand (700). In the double-stranded adaptor, the first splint strand (600) can comprise a first region (620), an internal region (610), and a second region (630), wherein the internal region of the first splint strand (610) is hybridized to the second splint strand (700). The second splint strand (700) can comprise a first, a second, and a third subregion, and the internal region (610) of the first splint strand (600) can comprise a fourth, a fifth, and a sixth subregion. The first region (620) of the first splint strand (600) can hybridize to at least a portion of the first surface pinning primer binding site sequence (120-1) of a single-stranded library molecule (100-2), and the second region (630) of the first splint strand (600) can hybridize to at least a portion of the first surface capture primer binding site sequence (130-1) of the same single-stranded library molecule (100-2). The first sequence of interest (110-1) in the first library-splint complex (800-1) shown in FIG. 30A and the second sequence of interest (110-2) in the second library-splint complex (800-2) shown in FIG. 30B can have the same sequence or different sequences.
[0065] FIG. 31A is a schematic of an exemplary workflow in which the two nicks in the first library-splint complex (800-1) shown in FIG. 30A are ligated to generate a first covalently closed circular library molecule (900-1) which is shown in FIG. 31A. The first covalently closed circular library molecule (900-1) is subjected to rolling circle amplification (RCA) to generate a first concatemer template molecule, and the first concatemer template molecule is subjected to batch reiterative sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The first covalently closed circular library molecule (900-1) can comprise: a first surface pinning primer binding site sequence (120-1); a first batch forward sequencing primer binding site sequence (140-1) which corresponds with the first sequence of interest (insert-1, 110-1); a first batch barcode sequence (195-1) which corresponds with the first sequence of interest (110-1); a first sequence of interest (110-1); and a first surface capture primer binding site sequence (130-1). The first covalently closed circular library molecule (900-1) can further comprise a second splint strand (700) from the double-stranded adaptor shown in FIG. 30A. A plurality of the first covalently closed circular library molecule (900-1) shown in FIG. 31A can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first surface capture primer binding site sequence (130-1) in the first covalently closed circular library molecules (900-1). The first covalently closed circular library molecules (900-1) can be subjected to rolling circle amplification (RCA) to generate a plurality of first concatemer template molecules which are immobilized to the support. The first concatemer template molecules can be subjected to a sequencing workflow using first batch-specific sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of first sequencing read products (dashed arrows). The first sequencing read products can include the first batch barcode sequence (195-1) as shown in FIG. 31A. Alternatively, or in addition, the first sequencing read products can include the first batch barcode sequence (195-1) and at least a portion of the first sequence of interest (110-1) (not shown). The first concatemer template molecules can undergo reiterative sequencing comprising up to 1000 sequencing cycles. The first sequencing read products from the first concatemer template molecule can be up to 1000 bases in length.
[0066] FIG. 31B is a schematic of an exemplary workflow in which the nicks in the second library-splint complex (800-2) shown in FIG. 30B are ligated to generate a second covalently closed circular library molecule (900-2) which is shown in FIG. 31B. The second covalently closed circular library molecule (900-2) is subjected to rolling circle amplification (RCA) to generate a second concatemer template molecule, and the concatemer template molecule is subjected to batch reiterative sequencing. The RCA reaction can be conducted in-solution using soluble amplification primers or on-support using capture primers immobilized to a support. The second covalently closed circular library molecule (900-2) can comprise: a first surface pinning primer binding site sequence (120-1); a second batch forward sequencing primer binding site sequence (140-2) which corresponds with the second sequence of interest (110-2); a second batch barcode sequence (195-2) which corresponds with the second sequence of interest (insert-2, 110-2); a second sequence of interest (110-2); and a first surface capture primer binding site sequence (130-1). The second covalently closed circular library molecule (900-2) can further comprise a second splint strand (700) from the double-stranded adaptor shown in FIG. 30B. A plurality of the second covalently closed circular library molecule (900-2) shown in FIG. 31B can be distributed onto a support having one type of immobilized capture primers which selectively hybridizes to the first surface capture primer binding site sequence (130-1) in the second covalently closed circular library molecules (900-2). A plurality of the first covalently closed circular library molecule (900-1) shown in FIG. 31A and a plurality of the second covalently closed circular library molecule (900-2) shown in FIG. 31B are distributed onto the same support. For instance, the first covalently closed circular library molecules (900-1) shown in FIG. 31A and the second covalently closed circular library molecules (900-2) shown in FIG. 31B can be distributed onto the support essentially simultaneously. Alternatively, or in addition, the first covalently closed circular library molecules (900-1) shown in FIG. 31A and the second covalently closed circular library molecules (900-2) shown in FIG. 31B can be distributed onto the support sequentially (e.g., re-seeding the support). The second covalently closed circular library molecules (900-2) can be subjected to rolling circle amplification (RCA) to generate a plurality of second concatemer template molecules which are immobilized to the support. The second concatemer template molecules can be subjected to a sequencing workflow using second batch sequencing primers (solid arrows), sequencing polymerases, and a plurality of nucleotide reagents to generate a plurality of second sequencing read products (dashed arrows). In some cases, the second concatemer template molecules are not sequenced when first batch sequencing primers are used to sequence the first concatemer template molecules. Alternatively, or in addition, the first concatemer template molecules are not sequenced when second batch sequencing primers are used to sequence the second concatemer template molecules. The second sequencing read products can include the second batch barcode sequence (195-2) as shown in FIG. 31B. Alternatively, or in addition, the second sequencing read products include the second batch barcode sequence (195-2) and at least a portion of the second sequence of interest (110-2) (not shown). The second concatemer template molecules undergo reiterative sequencing comprising up to 1000 sequencing cycles. The second sequencing read products from the second concatemer template molecules can be up to 1000 bases in length.
[0067] FIG. 32 is a schematic showing an exemplary linear single-stranded library molecule (100) hybridizing with a single-stranded splint molecule/strand (200) (ss-splint strand) thereby circularizing the library molecule to form a library-splint complex (300) with a nick. The linear single-stranded library molecule (100) can comprise: a first left junction adaptor sequence (121); an adaptor sequence for a surface pinning primer binding site sequence (120); a second left junction adaptor sequence (125); a left sample index sequence (160); a third left junction adaptor sequence (165); an adaptor sequence for a forward sequencing primer binding site sequence (140); a fourth left junction adaptor sequence (145); a sequence of interest (e.g., an insert (110)); a fourth right junction adaptor sequence (155); an adaptor sequence for a reverse sequencing primer binding site sequence (150); a third right junction adaptor sequence (175); a right sample index sequence (170); a second right junction adaptor sequence (135); an adaptor sequence for a surface capture primer binding site (130); and a first right junction adaptor sequence (131). The single-stranded splint strand (200) comprises a first region (210) that hybridizes with one end (e.g., left end or 5’ end) of the linear single-stranded library molecule (100) including at least a portion of the adaptor sequence for a surface pinning primer binding site (120) and/or at least a portion of the first left junction adaptor sequence (121). The single-stranded splint strand (200) comprises a second region (220) that hybridizes with the other end (e.g., right end or 3’ end) of the linear single-stranded library molecule (100) including at least a portion of the adaptor sequence for a surface capture primer binding site (130) and/or at least a portion of the first right junction adaptor sequence (131). For the sake of simplicity, the library-splint complex (300) does not show any of the junction adaptors. The skilled artisan will recognize that the library-splint complex (300) can include any one or any combination of two or more of the junction adaptors that are present in the linear single-stranded library molecule (100).
[0068] FIG. 33 is a schematic showing an exemplary linear single-stranded library molecule (100) hybridizing with a double-stranded adaptor (500) (ds-splint adaptor) thereby circularizing the library molecule to form a library-splint complex (800) with two nicks (solid arrowheads). The linear single-stranded library molecule (100) can comprise: a first left junction adaptor sequence (121); an adaptor sequence for a surface pinning primer binding site sequence (120); a second left junction adaptor sequence (125); a left sample index sequence (160); a third left junction adaptor sequence (165); an adaptor sequence for a forward sequencing primer binding site sequence (140); a fourth left junction adaptor sequence (145); a sequence of interest (e.g., an insert (110)); a fourth right junction adaptor sequence (155); an adaptor sequence for a reverse sequencing primer binding site sequence (150); a third right junction adaptor sequence (175); a right sample index sequence (170); a second right junction adaptor sequence (135); an adaptor sequence for a surface capture primer binding site (130); and a first right junction adaptor sequence (131). The double-stranded splint adaptor (500) comprises a first splint strand (600) having a first region (620) that hybridizes with one end (e.g., left end or 5’ end) of the linear single-stranded library molecule (100) including at least a portion of the adaptor sequence for a surface pinning primer binding site sequence (120) and/or at least a portion of the first left junction adaptor sequence (121). The double-stranded splint adaptor (500) comprises a first splint strand (600) having a second region (630) that hybridizes with the other end (e.g., right end or 3’ end) of the linear single-stranded library molecule (100) including at least a portion of the adaptor sequence for a surface capture primer binding site sequence (130) and/or at least a portion of the first right junction adaptor sequence (131). For the sake of simplicity, the library-splint complex (300) does not show any of the junction adaptors. The skilled artisan will recognize that the library-splint complex (300) can include any one or any combination of two or more of the junction adaptors that are present in the linear single-stranded library molecule (100).
[0069] FIG. 34 shows sequencing images of polonies (e.g., DNA nanoballs) immobilized on a support at high density (top) and a table summarizing read count, Q30 scores and percent error (bottom). The support (e.g., a flow cell) was loaded with 20 picomolar (pM) of a 1:1 mixture of covalently closed circular library molecules generated from either single-stranded splint strands (right) or double-stranded splints (left). The loaded covalently closed circular library molecules were subjected to rolling circle amplification to generate immobilized concatemer template molecules. 31 cycles of first batch sequencing were conducted using first batch sequencing primers (e.g., TruSeq sequencing primers; SEQ ID NO: 2) that selectively hybridized to the concatemer template molecules generated from double-stranded splint adaptors (ds-Splint; left image was obtained at one of the 31 sequencing cycles). The first batch sequencing read products were removed. 31 cycles of second batch sequencing were conducted using second batch sequencing primers (e.g., ss-Splint sequencing primers, e.g., SEQ ID NO: 1) that selectively hybridized to the concatemer template molecules generated from single-stranded splint strands (ss-Splint; right image was obtained at one of the 31 sequencing cycles). Other loading concentrations were tested including 30 pM and 40 pM.
[0070] FIG. 35A is a bar graph showing the pass filter count (PF Count, in millions (M)) from an experiment conducted to determine the density of immobilized polonies using 8-plex batch sequencing primers. The data represented by the bar graphs shown in FIGs. 35A, 36A and 37A were generated from the same experiment.
[0071] FIG. 35B is a Table listing the estimated loading concentrations (extrapolated pM) of the libraries corresponding to the number of batch sequencing primers used. The Table in FIG. 35B corresponds to the bar graph shown in FIG. 35A.
[0072] FIG. 36A is a bar graph showing the percent pass filter from an experiment conducted to determine the density of immobilized polonies using 8-plex batch sequencing primers.
[0073] FIG. 36B is a Table listing the estimated loading concentrations (extrapolated pM) of the libraries corresponding to the number of batch sequencing primers used. The Table in FIG. 36B corresponds to the bar graph shown in FIG. 36A.
[0074] FIG. 37A is a bar graph showing the %Q30 from an experiment conducted to determine the density of immobilized polonies using 8-plex batch sequencing primers.
[0075] FIG. 37B is a Table listing the estimated loading concentrations (extrapolated pM) of the libraries corresponding to the number of batch sequencing primers used. The Table in FIG. 37B corresponds to the bar graph shown in FIG. 37A.
[0076] FIG. 38 is a graph showing the nucleotide base diversity (A, T, C, or G) of a right sample index sequence (170) which includes a universal right sample index and a 3-mer random sequence (NNN). The graph shows a nucleotide diversity of the 3-mer random sequence (NNN) of approximately 30% for A and T base calls, and approximately 20% for C and G base calls.
[0077] FIG. 39 is a graph showing the nucleotide base diversity (A, T, C, or G) of a left sample index sequence (160) which lacks a 3-mer random sequence (NNN). The graph shows a nucleotide diversity of approximately 40% for A and T base calls, approximately 15% for C base calls, and approximately 5% for G base calls.
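By way of illustration only, the per-position nucleotide base diversity of the kind plotted in FIGs. 38 and 39 can be computed from a set of index reads as sketched below. The function name base_diversity and the example reads are hypothetical and are not data from the disclosure; the sketch assumes the reads are aligned at their first base and contain only A, C, G and T calls.

```python
from collections import Counter


def base_diversity(reads):
    """Return, for each read position, the fraction of A, C, G and T calls."""
    if not reads:
        return []
    # Only positions covered by every read are evaluated.
    length = min(len(read) for read in reads)
    fractions = []
    for pos in range(length):
        counts = Counter(read[pos] for read in reads)
        total = sum(counts[base] for base in "ACGT")
        fractions.append(
            {base: (counts[base] / total if total else 0.0) for base in "ACGT"}
        )
    return fractions


# Hypothetical index reads, used only to exercise the function.
example_reads = ["ACG", "ATG", "TCG", "AAG"]
diversity = base_diversity(example_reads)
for position, calls in enumerate(diversity, start=1):
    print(position, calls)
```

A position at which one base dominates, such as the third position of the hypothetical reads above, has low diversity, whereas a random NNN region would be expected to show fractions closer to an even split.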
DETAILED DESCRIPTION
Introduction
[0078] For massively parallel sequencing, the limit of optical resolution impedes the ability to perform highly multiplexed sequencing. Batch-specific sequencing enables sequencing a desired subset (e.g., a batch) of the template molecules immobilized to the same flow cell using selected batch-specific sequencing primers, which reduces over-crowding of the signals and images that are generated during sequencing. The use of batch-specific sequencing primers produces optical images that are intense and resolvable. The batch-specific sequencing methods described herein have many uses. For example, the number of spots that are imaged and associated with sequencing can be counted. The counted spots can be used as a measure of target nucleic acid levels in a sample.
[0079] The present disclosure provides compositions, apparatus and methods for conducting separate sequencing batches on a support having nucleic acid template molecules immobilized thereon, where the separate sequencing batches can be conducted using any massively parallel sequencing technology. In some embodiments, a plurality of sub-populations of nucleic acid template molecules are immobilized to the support including at least a first and second sub-population. In some embodiments, the first sub-population of template molecules undergoes first sequencing reactions (e.g., first batch sequencing) and a region of the support is imaged to detect the first sequencing reactions, wherein the second sub-population of template molecules does not undergo sequencing reactions. In some embodiments, the second sub-population of template molecules undergoes second sequencing reactions (e.g., second batch sequencing) and the same region of the support is imaged to detect the second sequencing reactions, wherein the first sub-population of template molecules does not undergo sequencing reactions. Thus, the first and second sub-populations of template molecules undergo batch sequencing.
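The batch selection described above can be illustrated with a minimal sketch in which each immobilized template carries a label for its batch sequencing primer binding site, and only templates matching the supplied batch primer yield reads. The class Template, the function run_batch, the batch labels and the barcode strings are all hypothetical and chosen for illustration; the sketch models primer selectivity as an exact label match and does not model the sequencing chemistry or imaging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    spot_id: int      # position of the immobilized template on the support
    primer_site: str  # label of the batch sequencing primer binding site
    barcode: str      # sequence read downstream of the primer binding site


def run_batch(templates, batch_primer, cycles):
    """Return {spot_id: read} for templates whose primer site matches.

    Templates belonging to other batches produce no read in this batch.
    """
    return {
        t.spot_id: t.barcode[:cycles]
        for t in templates
        if t.primer_site == batch_primer
    }


# Hypothetical support carrying two sub-populations of template molecules.
support = [
    Template(0, "batch-1", "ACGTAC"),
    Template(1, "batch-2", "TTGCAA"),
    Template(2, "batch-1", "GGATCC"),
]

first_reads = run_batch(support, "batch-1", cycles=4)
second_reads = run_batch(support, "batch-2", cycles=4)
print(first_reads)
print(second_reads)
```

In this sketch the two batches report on disjoint sets of spots within the same region of the support, which is the property that allows closely spaced templates to be resolved in separate imaging rounds.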
[0080] The present disclosure also provides compositions, apparatus, and methods for conducting massively parallel sequencing methods using concatemerized template molecules that are generated by rolling circle amplification. The concatemer template molecules contain multiple copies of the target sequences and unique barcode sequences and sequencing primer binding sequences associated with the target sequences. Use of the concatemer template molecules increases the accuracy of the sequencing.
[0081] The methods described herein employ batch sequencing on high density immobilized template molecules which offers the advantage of maximizing space on a support (e.g., a flow cell). Furthermore, the same seeded support can be re-used by re-seeding the support with additional template molecules and conducting additional sequencing reactions on the re-seeded template molecules.
[0082] Batch sequencing can be conducted using template molecules arranged in a predetermined manner on the support (e.g., a patterned support). Alternatively, batch sequencing can be conducted using template molecules arranged in a random manner on the support which obviates the need to fabricate a support having organized and pre-determined features for attaching template molecules (e.g., fabrication via lithography is not needed).
[0083] By conducting short sequencing reads of the batch barcode regions of the template molecules, batch sequencing also significantly reduces sequencing run times, reagent use, and reagent costs.
[0084] When short sequencing reads of the batch barcode regions are conducted in a reiterative manner, it is not necessary to assemble the sequencing reads or to obtain a full length sequence of the sequence of interest, which reduces the need for long assembly computations. Also, the redundant sequencing information obtained from the short sequencing reads obviates the need to sequence the complementary strand of the template molecules, thus obviating the need for pairwise sequencing.
[0085] Batch sequencing also offers the flexibility of re-seeding the support at any time between sequencing different batches; alternatively, an ongoing sequencing batch can be interrupted to permit re-seeding, after which the ongoing batch sequencing can be resumed. The ability to re-seed the support at any time increases throughput and efficiency.
[0086] Conducting batch sequencing with immobilized concatemer template molecules offers advantages over one-copy template molecules (e.g., one-copy template molecules generated via bridge amplification). For example, concatemer template molecules carry multiple sequencing primer binding sites along the same concatemer template molecule. The multiple sequencing primer binding sites can be used to generate multiple sequencing reads for increased sequencing depth. Together, reiteratively sequencing one strand of the concatemer template molecules increases sequencing base coverage and sequencing depth compared to sequencing a one-copy template molecule.
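As a non-limiting illustration of how multiple reads from the tandem copies of one concatemer could be combined, the following Python sketch takes a per-position majority vote. The majority-vote rule and the function name `consensus` are assumptions made for illustration only; the disclosure states that multiple reads increase depth and does not specify how the reads are combined.

```python
from collections import Counter


def consensus(reads):
    """Majority-vote consensus across equal-length reads of the same target.

    Each read is assumed to come from a different copy of the target
    sequence within one concatemer template molecule.
    """
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must have equal length")
    # zip(*reads) yields one tuple of base calls per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


# Three reads of the same copy unit; the third carries one miscalled base.
reads = ["ACGT", "ACGT", "ACTT"]
called = consensus(reads)
```

With three reads, a single miscalled base at one position is outvoted by the two concordant reads, which is the kind of benefit that additional depth on one strand can provide.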
[0087] Batch sequencing has many uses including but not limited to detecting specific nucleic acids of interest, mutant nucleic acid sequences, splice variants, and abundance levels thereof.
Definitions
[0088] The headings provided herein are not limitations of the various aspects of the disclosure, which aspects can be understood by reference to the specification as a whole.
[0089] Unless defined otherwise, technical and scientific terms used herein have meanings that are commonly understood by those of ordinary skill in the art. Generally, terminologies pertaining to techniques of molecular biology, nucleic acid chemistry, protein chemistry, genetics, microbiology, transgenic cell production, and hybridization described herein are those well-known and commonly used in the art. Techniques and procedures described herein are generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the instant specification. For example, see Sambrook et al., Molecular Cloning: A Laboratory Manual (Third ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 2000). See also Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992). The nomenclatures utilized in connection with, and the laboratory procedures and techniques described herein are those well-known and commonly used in the art.
[0090] Unless otherwise required by context herein, singular terms shall include pluralities and plural terms shall include the singular. Singular forms “a”, “an” and “the”, and singular use of any word, include plural referents unless expressly and unequivocally limited to one referent.
[0091] It is understood the use of the alternative term (e.g., “or”) is taken to mean either one or both or any combination thereof of the alternatives.
[0092] The term “and/or” used herein is to be taken to mean specific disclosure of each of the specified features or components with or without the other. For example, the term “and/or” as used in a phrase such as “A and/or B” herein is intended to include: “A and B”; “A or B”; “A” (A alone); and “B” (B alone). In a similar manner, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following aspects: “A, B, and C”; “A, B, or C”; “A or C”; “A or B”; “B or C”; “A and B”; “B and C”; “A and C”; “A” (A alone); “B” (B alone); and “C” (C alone).
[0093] As used herein and in the appended claims, the terms “comprising”, “including”, “having” and “containing”, and their grammatical variants, are intended to be non-limiting so that one item or multiple items in a list do not exclude other items that can be substituted or added to the listed items. It is understood that wherever aspects are described herein with the language “comprising,” otherwise analogous aspects described in terms of “consisting of” and/or “consisting essentially of” are also provided.
[0094] As used herein, the terms “about” and “approximately” refer to a value or composition that is within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined, i.e., the limitations of the measurement system. For example, “about” or “approximately” can mean within one or more than one standard deviation per the practice in the art. Alternatively, “about” or “approximately” can mean a range of up to 10% (i.e., ±10%) or more depending on the limitations of the measurement system. For example, about 5 mg can include any number between 4.5 mg and 5.5 mg.
Furthermore, particularly with respect to biological systems or processes, the terms can mean up to an order of magnitude or up to 5-fold of a value. When particular values or compositions are provided in the instant disclosure, unless otherwise stated, the meaning of “about” or “approximately” should be assumed to be within an acceptable error range for that particular value or composition. Also, where ranges and/or subranges of values are provided, the ranges and/or subranges can include the endpoints of the ranges and/or subranges.
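As a non-limiting illustration of the ±10% reading of “about” given above, the following Python sketch checks whether a measured value falls within the stated range of a nominal value, endpoints included. The function name `within_about` and the default `tolerance` of 0.10 are assumptions chosen to match the 5 mg example; the other readings of “about” described above (standard deviation, order of magnitude, 5-fold) are not modeled.

```python
def within_about(value, nominal, tolerance=0.10):
    """Return True if value lies within nominal +/- tolerance * |nominal|.

    The endpoints of the range are included, consistent with the statement
    that ranges can include their endpoints.
    """
    margin = abs(nominal) * tolerance
    return nominal - margin <= value <= nominal + margin


# "About 5 mg" under the +/-10% reading spans 4.5 mg to 5.5 mg.
low_ok = within_about(4.5, 5.0)
high_ok = within_about(5.5, 5.0)
outside = within_about(5.6, 5.0)
```

Under this reading, 4.5 mg and 5.5 mg both qualify as about 5 mg, while 5.6 mg does not.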
[0095] As used herein, “corresponding to” or “corresponds to” refers to two or more entities whose identities are sufficiently related such that the identity of one entity can be used to determine the identity, position and/or other properties of the other entity. As a non-limiting example, a barcode sequence can be said to correspond to a particular sequence of interest if the barcode sequence can be used to determine the identity of the sequence of interest.
[0096] The term “polymerase” and its variants, as used herein, comprises an enzyme comprising a domain that binds a nucleotide (or nucleoside) where the polymerase can form a complex having a template nucleic acid and a complementary nucleotide. The polymerase can have one or more activities including, but not limited to, base analog detection activities, DNA polymerization activity, reverse transcriptase activity, DNA binding, strand displacement activity, and nucleotide binding and recognition. A polymerase can be any enzyme that can catalyze polymerization of nucleotides (including analogs thereof) into a nucleic acid strand. Typically but not necessarily such nucleotide polymerization can occur in a template-dependent fashion. Typically, a polymerase comprises one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization can occur. In some embodiments, a polymerase includes other enzymatic activities, such as for example, 3' to 5' exonuclease activity or 5' to 3' exonuclease activity. In some embodiments, a polymerase has strand displacing activity. A polymerase can include without limitation naturally occurring polymerases and any subunits and truncations thereof, mutant polymerases, variant polymerases, recombinant, fusion or otherwise engineered polymerases, chemically modified polymerases, synthetic molecules or assemblies, and any analogs, derivatives or fragments thereof that retain the ability to catalyze nucleotide polymerization (e.g., catalytically active fragment). The polymerase includes catalytically inactive polymerases, catalytically active polymerases, reverse transcriptases, and other enzymes comprising a nucleotide binding domain. In some embodiments, a polymerase can be isolated from a cell, or generated using recombinant DNA technology or chemical synthesis methods. In some embodiments, a polymerase can be expressed in prokaryote, eukaryote, viral, or phage organisms. 
In some embodiments, a polymerase can be post-translationally modified proteins or fragments thereof. A polymerase can be derived from a prokaryote, eukaryote, virus or phage. A polymerase comprises DNA-directed DNA polymerase and RNA-directed DNA polymerase.
[0097] As used herein, the term “strand displacing” refers to the ability of a polymerase to locally separate strands of double-stranded nucleic acids and synthesize a new strand in a template-based manner. Strand displacing polymerases displace a complementary strand from a template strand and catalyze new strand synthesis. Strand displacing polymerases include mesophilic and thermophilic polymerases. Strand displacing polymerases include wild type enzymes, and variants including exonuclease minus mutants, mutant versions, chimeric enzymes and truncated enzymes. Examples of strand displacing polymerases include phi29 DNA polymerase, large fragment of Bst DNA polymerase, large fragment of Bsu DNA polymerase (exo-), Bca DNA polymerase (exo-), Klenow fragment of E. coli DNA polymerase, T5 polymerase, M-MuLV reverse transcriptase, HIV viral reverse transcriptase, Deep Vent DNA polymerase and KOD DNA polymerase. The phi29 DNA polymerase can be wild type phi29 DNA polymerase (e.g., MagniPhi® from Expedeon), or variant EquiPhi29 DNA polymerase (e.g., from Thermo Fisher Scientific®), or chimeric QualiPhi® DNA polymerase (e.g., from 4basebio®).
[0098] The terms “nucleic acid”, “polynucleotide” and “oligonucleotide” and other related terms used herein are used interchangeably and refer to polymers of nucleotides and are not limited to any particular length. Nucleic acids include recombinant and chemically-synthesized forms. Nucleic acids can be isolated. Nucleic acids include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using nucleotide analogs (e.g., peptide nucleic acids (PNA) and non-naturally occurring nucleotide analogs), and chimeric forms containing DNA and RNA. Nucleic acids can be single-stranded or double-stranded. Nucleic acids comprise polymers of nucleotides, where the nucleotides include natural or non-natural bases and/or sugars. Nucleic acids comprise naturally-occurring internucleosidic linkages, for example phosphodiester linkages. Nucleic acids can lack a phosphate group. Nucleic acids comprise non-natural internucleoside linkages, including phosphorothioate, phosphorothiolate, or peptide nucleic acid (PNA) linkages. In some embodiments, nucleic acids comprise one type of polynucleotide or a mixture of two or more different types of polynucleotides.
[0099] The term “operably linked” and “operably joined” or related terms as used herein refers to juxtaposition of components. The juxtapositioned components can be linked together covalently. For example, two nucleic acid components can be enzymatically ligated together where the linkage that joins together the two components comprises phosphodiester linkage. A first and second nucleic acid component can be linked together, where the first nucleic acid component can confer a function on a second nucleic acid component. For example, linkage between a primer binding sequence and a sequence of interest forms a nucleic acid library molecule having a portion that can bind to a primer. In another example, a transgene (e.g., a nucleic acid encoding a polypeptide or a nucleic acid sequence of interest) can be ligated to a vector where the linkage permits expression or functioning of the transgene sequence contained in the vector. In some embodiments, a transgene is operably linked to a host cell regulatory sequence (e.g., a promoter sequence) that affects expression of the transgene. In some embodiments, the vector comprises at least one host cell regulatory sequence, including a promoter sequence, enhancer, transcription and/or translation initiation sequence, transcription and/or translation termination sequence, polypeptide secretion signal sequences, and the like. In some embodiments, the host cell regulatory sequence controls expression of the level, timing and/or location of the transgene.
[00100] The terms “linked”, “joined”, “attached”, “appended” and variants thereof comprise any type of fusion, bond, adherence or association between any combination of compounds or molecules that is of sufficient stability to withstand use in the particular procedure. The procedure can include but is not limited to: nucleotide binding; nucleotide incorporation; de-blocking (e.g., removal of chain-terminating moiety); washing; removing; flowing; detecting; imaging and/or identifying. Such linkage can comprise, for example, covalent, ionic, hydrogen, dipole-dipole, hydrophilic, hydrophobic, or affinity bonding, bonds or associations involving van der Waals forces, mechanical bonding, and the like. In some embodiments, such linkage occurs intramolecularly, for example linking together the ends of a single-stranded or double-stranded linear nucleic acid molecule to form a circular molecule. In some embodiments, such linkage can occur between a combination of different molecules, or between a molecule and a non-molecule, including but not limited to: linkage between a nucleic acid molecule and a solid surface; linkage between a protein and a detectable reporter moiety; linkage between a nucleotide and detectable reporter moiety; and the like. Some examples of linkages can be found, for example, in Hermanson, G., “Bioconjugate Techniques”, Second Edition (2008); and Aslam, M., Dent, A., “Bioconjugation: Protein Coupling Techniques for the Biomedical Sciences”, London: Macmillan (1998).
[00101] The term “primer” and related terms used herein refer to an oligonucleotide that is capable of hybridizing with a DNA and/or RNA polynucleotide template to form a duplex molecule. Primers comprise natural nucleotides and/or nucleotide analogs. Primers can be recombinant nucleic acid molecules. Primers may have any length, but typically range from 4-50 nucleotides. A typical primer comprises a 5’ end and 3’ end. The 3’ end of the primer can include a 3’ OH moiety which serves as a nucleotide polymerization initiation site in a polymerase-catalyzed primer extension reaction. Alternatively, the 3’ end of the primer can lack a 3’ OH moiety, or can include a terminal 3’ blocking group that inhibits nucleotide polymerization in a polymerase-catalyzed reaction. Any one nucleotide, or more than one nucleotide, along the length of the primer can be labeled with a detectable reporter moiety. A primer can be in solution (e.g., a soluble primer) or can be immobilized to a support (e.g., a capture primer).
[00102] The terms “template nucleic acid”, “template polynucleotide”, “target nucleic acid”, “target polynucleotide”, “template strand,” “template molecule” and other variations refer to a nucleic acid strand that serves as the basis nucleic acid molecule for any of the methods described herein, e.g., sequencing or amplification methods. The template nucleic acid can be single-stranded or double-stranded, or the template nucleic acid can have single-stranded or double-stranded portions. The template nucleic acid can be obtained from a naturally-occurring source, recombinant form, or chemically synthesized to include any type of nucleic acid analog. The template nucleic acid can be linear, circular, or other forms. The template nucleic acids can include an insert portion having an insert sequence. The template nucleic acids can also include at least one adaptor sequence. The insert portion can be isolated in any form, including chromosomal, genomic, organellar (e.g., mitochondrial, chloroplast or ribosomal), recombinant molecules, cloned, amplified, cDNA, RNA such as precursor mRNA or mRNA, oligonucleotides, whole genomic DNA, obtained from fresh frozen paraffin embedded tissue, needle biopsies, circulating tumor cells, cell free circulating DNA, or any type of nucleic acid library. The insert portion can be isolated from any source including from organisms such as prokaryotes, eukaryotes (e.g., humans, plants and animals), fungus, viruses, cells, tissues, normal or diseased cells or tissues, body fluids including blood, urine, serum, lymph, tumor, saliva, anal and vaginal secretions, amniotic samples, perspiration, semen, environmental samples, culture samples, or synthesized nucleic acid molecules prepared using recombinant molecular biology or chemical synthesis methods.
The insert portion can be isolated from any organ, including head, neck, brain, breast, ovary, cervix, colon, rectum, endometrium, gallbladder, intestines, bladder, prostate, testicles, liver, lung, kidney, esophagus, pancreas, thyroid, pituitary, thymus, skin, heart, larynx, or other organs. The template nucleic acid can be subjected to nucleic acid analysis, including sequencing and composition analysis. The template molecules disclosed herein can be concatemer template molecules, which comprise two or more copies of a particular sequence.
For example, a concatemer template molecule can comprise two or more tandem copies of a polynucleotide unit, where each polynucleotide unit comprises a sequence of interest and at least one other sequence feature, such as any of the barcode sequences, index sequences, or sequencing, surface capture or surface pinning primer binding sequences disclosed herein.
[00103] The term “adaptor” and related terms refers to oligonucleotides that can be operably linked to a target polynucleotide, where the adaptor confers a function to the co-joined adaptor-target molecule. Adaptors comprise DNA, RNA, chimeric DNA/RNA, or analogs thereof. Adaptors can include at least one ribonucleoside residue. Adaptors can be single-stranded, double-stranded, or have single-stranded and/or double-stranded portions. Adaptors can be configured to be linear, stem-looped, hairpin, or Y-shaped forms. Adaptors can be any length, including 4-100 nucleotides or longer. Adaptors can have blunt ends, overhang ends, or a combination of both. Overhang ends include 5’ overhang and 3’ overhang ends. The 5’ end of a single-stranded adaptor, or one strand of a double-stranded adaptor, can have a 5’ phosphate group or lack a 5’ phosphate group. Adaptors can include a 5’ tail that does not hybridize to a target polynucleotide (e.g., tailed adaptor), or adaptors can be non-tailed. An adaptor can include a sequence that is complementary to at least a portion of a primer, such as an amplification primer, a sequencing primer, or a capture primer (e.g., soluble or immobilized capture primers). Adaptors can include a random sequence or degenerate sequence. Adaptors can include at least one inosine residue. Adaptors can include at least one phosphorothioate, phosphorothiolate and/or phosphoramidate linkage. Adaptors can include a barcode sequence which can be used to distinguish polynucleotides (e.g., insert sequences) from different sample sources in a multiplex assay.
Adaptors can include a unique identification sequence (e.g., unique molecular index, UMI; or a unique molecular tag) that can be used to uniquely identify a nucleic acid molecule to which the adaptor is appended. In some embodiments, a unique identification sequence can be used to increase error correction and accuracy, reduce the rate of false-positive variant calls and/or increase sensitivity of variant detection. Adaptors can include at least one restriction enzyme recognition sequence, including any one or any combination of two or more selected from a group consisting of type I, type II, type III, type IV, type IIS or type IIB.
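As a non-limiting illustration of how a unique identification sequence can be used to identify reads that derive from the same original molecule, the following Python sketch groups reads by their UMI. The function name `group_by_umi`, the `(umi, sequence)` tuple layout, and the example sequences are hypothetical; exact-match grouping is an assumption, and no particular error-correction scheme is implied.

```python
from collections import defaultdict


def group_by_umi(reads):
    """Group (umi, sequence) pairs by UMI.

    Reads sharing a UMI are treated as copies of one original molecule,
    so the number of groups estimates the number of distinct molecules.
    """
    groups = defaultdict(list)
    for umi, sequence in reads:
        groups[umi].append(sequence)
    return dict(groups)


# Three reads, two of which carry the same UMI.
reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("GTT", "ACGA")]
molecules = group_by_umi(reads)
molecule_count = len(molecules)
```

Here three reads collapse to two molecules, so a variant seen only in duplicated copies of a single molecule is counted once rather than twice.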
[00104] In some embodiments, primer sequences, such as any of the amplification primer sequences, sequencing primer sequences, surface capture primer sequences, surface pinning primer sequences, and any of the sample barcode sequences, can be about 3-50 nucleotides in length, or about 5-40 nucleotides in length, or about 5-25 nucleotides in length.
[00105] The term “universal sequence” and related terms refer to a sequence in a nucleic acid molecule that is common among two or more polynucleotide molecules. For example, an adaptor having a universal sequence can be operably joined to a plurality of polynucleotides so that the population of co-joined molecules carry the same universal adaptor sequence. Examples of universal adaptor sequences include an amplification primer sequence, a sequencing primer sequence or a capture primer sequence (e.g., soluble or immobilized capture primers).
[00106] When used in reference to nucleic acid molecules, the terms “hybridize” or “hybridizing” or “hybridization” or other related terms refer to hydrogen bonding between two different nucleic acids to form a duplex nucleic acid. Hybridization also includes hydrogen bonding between two different regions of a single nucleic acid molecule to form a self-hybridizing molecule having a duplex region. Hybridization can comprise Watson-Crick or Hoogsteen binding to form a duplex double-stranded nucleic acid, or a double-stranded region within a nucleic acid molecule. The double-stranded nucleic acid, or the two different regions of a single nucleic acid, may be wholly complementary, or partially complementary. Complementary nucleic acid strands need not hybridize with each other across their entire length. The complementary base pairing can be the standard A-T or C-G base pairing, or can be other forms of base-pairing interactions. Duplex nucleic acids can include mismatched base-paired nucleotides.
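As a non-limiting illustration of wholly versus partially complementary strands, the following Python sketch counts mismatched positions when a primer is paired antiparallel with a binding site using standard A-T and C-G pairing only. The function names and example sequences are hypothetical, and the sketch does not model Hoogsteen binding, other base-pairing interactions, or hybridization thermodynamics.

```python
# Standard Watson-Crick pairing for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(primer, site):
    """Count mismatched positions between a primer and a binding site.

    Both sequences are written 5' to 3' and are assumed to pair
    antiparallel over their full, equal length.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    expected = reverse_complement(site)
    return sum(p != e for p, e in zip(primer.upper(), expected))


wholly = count_mismatches("ACGTTG", "CAACGT")     # fully complementary pair
partially = count_mismatches("ACGTTG", "CAACGA")  # one mismatched position
```

A count of zero corresponds to a wholly complementary duplex, while a nonzero count corresponds to a partially complementary duplex that includes mismatched base-paired nucleotides.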
[00107] When used in reference to nucleic acids, the terms “extend”, “extending”, “extension” and other variants, refers to incorporation of one or more nucleotides into a nucleic acid molecule. Nucleotide incorporation comprises polymerization of one or more nucleotides into the terminal 3’ OH end of a nucleic acid strand, resulting in extension of the nucleic acid strand. Nucleotide incorporation can be conducted with natural nucleotides and/or nucleotide analogs. Typically, but not necessarily, nucleotide incorporation occurs in a template-dependent fashion. Any suitable method of extending a nucleic acid molecule may be used, including primer extension catalyzed by a DNA polymerase or RNA polymerase.
[00108] The term “nucleotides” and related terms refers to a molecule comprising an aromatic base, a five carbon sugar (e.g., ribose or deoxyribose), and at least one phosphate group. Canonical or non-canonical nucleotides are consistent with use of the term. The phosphate in some embodiments comprises a monophosphate, diphosphate, or triphosphate, or corresponding phosphate analog. The term “nucleoside” refers to a molecule comprising an aromatic base and a sugar. Nucleotides and nucleosides can be non-labeled or labeled with a detectable reporter moiety.
[00109] Nucleotides (and nucleosides) typically comprise a heterocyclic base including substituted or unsubstituted nitrogen-containing parent heteroaromatic ring which are commonly found in nucleic acids, including naturally-occurring, substituted, modified, or engineered variants, or analogs of the same. The base of a nucleotide (or nucleoside) is capable of forming Watson-Crick and/or Hoogsteen hydrogen bonds with an appropriate complementary base. Exemplary bases include, but are not limited to, purines and pyrimidines such as: 2-aminopurine, 2,6-diaminopurine, adenine (A), ethenoadenine, N6-Δ2-isopentenyladenine (6iA), N6-Δ2-isopentenyl-2-methylthioadenine (2ms6iA), N6-methyladenine, guanine (G), isoguanine, N2-dimethylguanine (dmG), 7-methylguanine (7mG), 2-thiopyrimidine, 6-thioguanine (6sG), hypoxanthine and O6-methylguanine; 7-deaza-purines such as 7-deazaadenine (7-deaza-A) and 7-deazaguanine (7-deaza-G); pyrimidines such as cytosine (C), 5-propynylcytosine, isocytosine, thymine (T), 4-thiothymine (4sT), 5,6-dihydrothymine, O4-methylthymine, uracil (U), 4-thiouracil (4sU) and 5,6-dihydrouracil (dihydrouracil; D); indoles such as nitroindole and 4-methylindole; pyrroles such as nitropyrrole; nebularine; inosines; hydroxymethylcytosines; 5-methylcytosines; base (Y); as well as methylated, glycosylated, and acylated base moieties; and the like. Additional exemplary bases can be found in Fasman, 1989, in “Practical Handbook of Biochemistry and Molecular Biology”, pp. 385-394, CRC Press, Boca Raton, Fla.
[00110] Nucleotides (and nucleosides) typically comprise a sugar moiety, such as carbocyclic moiety (Ferrero and Gotor 2000 Chem. Rev. 100: 4319-48), acyclic moieties (Martinez, et al., 1999 Nucleic Acids Research 27: 1271-1274; Martinez, et al., 1997 Bioorganic & Medicinal Chemistry Letters vol. 7: 3013-3016), and other sugar moieties (Jeong, et al., 1993 J. Med. Chem. 36: 2627-2638; Kim, et al., 1993 J. Med. Chem. 36: 30-7; Eschenmoser 1999 Science 284:2118-2124; and U.S. Pat. No. 5,558,991). The sugar moiety comprises: ribosyl; 2'-deoxyribosyl; 3'-deoxyribosyl; 2',3'-dideoxyribosyl; 2',3'-didehydrodideoxyribosyl; 2'-alkoxyribosyl; 2'-azidoribosyl; 2'-aminoribosyl; 2'-fluororibosyl; 2'-mercaptoriboxyl; 2'-alkylthioribosyl; 3'-alkoxyribosyl; 3'-azidoribosyl; 3'-aminoribosyl; 3'-fluororibosyl; 3'-mercaptoriboxyl; 3'-alkylthioribosyl; carbocyclic; acyclic; or other modified sugars.
[00111] In some embodiments, nucleotides comprise a chain of one, two or three phosphorus atoms where the chain is typically attached to the 5’ carbon of the sugar moiety via an ester or phosphoramide linkage. In some embodiments, the nucleotide is an analog having a phosphorus chain in which the phosphorus atoms are linked together with intervening O, S, NH, methylene or ethylene. In some embodiments, the phosphorus atoms in the chain include substituted side groups including O, S or BH3. In some embodiments, the chain includes phosphate groups substituted with analogs including phosphoramidate, phosphorothioate, phosphorodithioate, and O-methylphosphoramidite groups.
[00112] The term “reporter moiety”, “reporter moieties” or related terms refer to a compound that generates, or causes to generate, a detectable signal. A reporter moiety is sometimes called a “label”. Any suitable reporter moiety may be used, including luminescent, photoluminescent, electroluminescent, bioluminescent, chemiluminescent, fluorescent, phosphorescent, chromophore, radioisotope, electrochemical, mass spectrometry, Raman, hapten, affinity tag, atom, or an enzyme. A reporter moiety generates a detectable signal resulting from a chemical or physical change (e.g., heat, light, electrical, pH, salt concentration, enzymatic activity, or proximity events). A proximity event includes two reporter moieties approaching each other, or associating with each other, or binding each other. It is well known to one skilled in the art to select reporter moieties so that each absorbs excitation radiation and/or emits fluorescence at a wavelength distinguishable from the other reporter moieties to permit monitoring the presence of different reporter moieties in the same reaction or in different reactions. Two or more different reporter moieties can be selected having spectrally distinct emission profiles, or having minimal overlapping spectral emission profiles. Reporter moieties can be linked (e.g., operably linked) to nucleotides, nucleosides, nucleic acids, enzymes (e.g., polymerases or reverse transcriptases), or support (e.g., surfaces).
[00113] A reporter moiety (or label) comprises a fluorescent label or a fluorophore. Exemplary fluorescent moieties which may serve as fluorescent labels or fluorophores include, but are not limited to fluorescein and fluorescein derivatives such as carboxyfluorescein, tetrachlorofluorescein, hexachlorofluorescein, carboxynaphthofluorescein, fluorescein isothiocyanate, NHS-fluorescein, iodoacetamidofluorescein, fluorescein maleimide, SAMSA-fluorescein, fluorescein thiosemicarbazide, carbohydrazinomethylthioacetyl-amino fluorescein, rhodamine and rhodamine derivatives such as TRITC, TMR, lissamine rhodamine, Texas Red, rhodamine B, rhodamine 6G, rhodamine 10, NHS-rhodamine, TMR-iodoacetamide, lissamine rhodamine B sulfonyl chloride, lissamine rhodamine B sulfonyl hydrazine, Texas Red sulfonyl chloride, Texas Red hydrazide, coumarin and coumarin derivatives such as AMCA, AMCA-NHS, AMCA-sulfo-NHS, AMCA-HPDP, DCIA, AMCE-hydrazide, BODIPY and derivatives such as BODIPY FL C3-SE, BODIPY 530/550 C3, BODIPY 530/550 C3-SE, BODIPY 530/550 C3 hydrazide, BODIPY 493/503 C3 hydrazide, BODIPY FL C3 hydrazide, BODIPY FL IA, BODIPY 530/551 IA, Br-BODIPY 493/503, Cascade Blue and derivatives such as Cascade Blue acetyl azide, Cascade Blue cadaverine, Cascade Blue ethylenediamine, Cascade Blue hydrazide, Lucifer Yellow and derivatives such as Lucifer Yellow iodoacetamide, Lucifer Yellow CH, cyanine and derivatives such as indolium based cyanine dyes, benzo-indolium based cyanine dyes, pyridinium based cyanine dyes, thiazolium based cyanine dyes, quinolinium based cyanine dyes, imidazolium based cyanine dyes, Cy3, Cy5, lanthanide chelates and derivatives such as BCPDA, TBP, TMT, BHHCT, BCOT, Europium chelates, Terbium chelates, Alexa Fluor® dyes, DyLight® dyes, Atto™ dyes, LightCycler® Red dyes, CAL Fluor dyes, JOE and derivatives thereof, Oregon Green™ dyes, WellRED dyes, IRD dyes, phycoerythrin and phycobilin dyes, Malachite green, stilbene, DEG dyes, NR dyes, near-infrared dyes and others known in the art such as those described in Haugland, Molecular Probes Handbook, (Eugene, Oreg.) 6th Edition; Lakowicz, Principles of Fluorescence Spectroscopy, 2nd Ed., Plenum Press New York (1999), or Hermanson, Bioconjugate Techniques, 2nd Edition, or derivatives thereof, or any combination thereof. Cyanine dyes may exist in either sulfonated or non-sulfonated forms, and consist of two indolenine, benzo-indolium, pyridinium, thiazolium, and/or quinolinium groups separated by a polymethine bridge between two nitrogen atoms.
Commercially available cyanine fluorophores include, for example, Cy3 (which may comprise 1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-2-(3-{1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-3,3-dimethyl-1,3-dihydro-2H-indol-2-ylidene}prop-1-en-1-yl)-3,3-dimethyl-3H-indolium or 1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-2-(3-{1-[6-(2,5-dioxopyrrolidin-1-yloxy)-6-oxohexyl]-3,3-dimethyl-5-sulfo-1,3-dihydro-2H-indol-2-ylidene}prop-1-en-1-yl)-3,3-dimethyl-3H-indolium-5-sulfonate), Cy5 (which may comprise 1-(6-((2,5-dioxopyrrolidin-1-yl)oxy)-6-oxohexyl)-2-((1E,3E)-5-((E)-1-(6-((2,5-dioxopyrrolidin-1-yl)oxy)-6-oxohexyl)-3,3-dimethyl-5-indolin-2-ylidene)penta-1,3-dien-1-yl)-3,3-dimethyl-3H-indol-1-ium or 1-(6-((2,5-dioxopyrrolidin-1-yl)oxy)-6-oxohexyl)-2-((1E,3E)-5-((E)-1-(6-((2,5-dioxopyrrolidin-1-yl)oxy)-6-oxohexyl)-3,3-dimethyl-5-sulfoindolin-2-ylidene)penta-1,3-dien-1-yl)-3,3-dimethyl-3H-indol-1-ium-5-sulfonate), and Cy7 (which may comprise 1-(5-carboxypentyl)-2-[(1E,3E,5E,7Z)-7-(1-ethyl-1,3-dihydro-2H-indol-2-ylidene)hepta-1,3,5-trien-1-yl]-3H-indolium or 1-(5-carboxypentyl)-2-[(1E,3E,5E,7Z)-7-(1-ethyl-5-sulfo-1,3-dihydro-2H-indol-2-ylidene)hepta-1,3,5-trien-1-yl]-3H-indolium-5-sulfonate), where “Cy” stands for “cyanine”, and the first digit identifies the number of carbon atoms between two indolenine groups. Cy2, which is an oxazole derivative rather than indolenine, and the benzo-derivatized Cy3.5, Cy5.5 and Cy7.5 are exceptions to this rule. Additional suitable dyes are described, for example, in U.S. 2024/0240249A1, the contents of which are incorporated by reference in their entirety herein.
[00114] In some embodiments, the reporter moiety can be a FRET pair, such that multiple classifications can be performed under a single excitation and imaging step. As used herein, FRET may comprise excitation exchange (Förster) transfers, or electron-exchange (Dexter) transfers.
[00115] When used in reference to nucleic acids, the terms “amplify”, “amplifying”, “amplification”, and other related terms include producing multiple copies of an original polynucleotide template molecule, where the copies comprise a sequence that is complementary to the template sequence, or the copies comprise a sequence that is the same as the template sequence. In some embodiments, the copies comprise a sequence that is substantially identical to a template sequence, or is substantially identical to a sequence that is complementary to the template sequence.
[00116] The term “support” as used herein refers to a substrate that is designed for deposition of biological molecules or biological samples for assays and/or analyses. Examples of biological molecules to be deposited onto a support include nucleic acids (e.g., DNA, RNA), polypeptides, saccharides, lipids, a single cell or multiple cells. Examples of biological samples include but are not limited to saliva, phlegm, mucus, blood, plasma, serum, urine, stool, sweat, tears and fluids from tissues or organs.
[00117] A “capture primer” or “surface capture primer” and the like refers to an oligonucleotide immobilized to a support that is complementary to a portion of, and capable of hybridizing with a given oligonucleotide, such as the library molecules and/or template molecules described herein. A “pinning primer” or “surface pinning primer” and the like refers to an oligonucleotide immobilized to a support that is complementary to a portion of, and capable of hybridizing with the concatemer template molecules described herein, thereby “pinning” down a portion of the concatemer template molecule to the support.
[00118] In some embodiments, the support is solid, semi-solid, or a combination of both. In some embodiments, the support is porous, semi-porous, non-porous, or any combination of porosity. In some embodiments, the support can be substantially planar, concave, convex, or any combination thereof. In some embodiments, the support can be cylindrical, for example comprising a capillary or interior surface of a capillary.
[00119] In some embodiments, the surface of the support can be substantially smooth. In some embodiments, the support can be regularly or irregularly textured, including bumps, etched features, pores, three-dimensional scaffolds, or any combination thereof.
[00120] In some embodiments, the support comprises a bead having any shape, including spherical, hemi-spherical, cylindrical, barrel-shaped, toroidal, disc-shaped, rod-like, conical, triangular, cubical, polygonal, tubular or wire-like.
[00121] The support can be fabricated from any material, including but not limited to glass, fused-silica, silicon, a polymer (e.g., polystyrene (PS), macroporous polystyrene (MPPS), polymethylmethacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high density polyethylene (HDPE), cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET)), or any combination thereof. Various compositions of both glass and plastic substrates are contemplated.
[00122] The support can have a plurality (e.g., two or more) of nucleic acid templates immobilized thereon. The plurality of immobilized nucleic acid templates have the same sequence or have different sequences. In some embodiments, individual nucleic acid template molecules in the plurality of nucleic acid templates are immobilized to a different site on the support. In some embodiments, two or more individual nucleic acid template molecules in the plurality of nucleic acid templates are immobilized to a site on the support.
[00123] The term “array” refers to a support comprising a plurality of sites located at predetermined locations on a support described herein to form an array of sites. The sites can be discrete and separated by interstitial regions. In some embodiments, the pre-determined sites on the support can be arranged in one dimension in a row or a column, or arranged in two dimensions in rows and columns. In some embodiments, the plurality of pre-determined sites is arranged on the support in an organized fashion. In some embodiments, the plurality of pre-determined sites is arranged in any organized pattern, including rectilinear, hexagonal patterns, grid patterns, patterns having reflective symmetry, patterns having rotational symmetry, or the like. The pitch between different pairs of sites can be the same or can vary. In some embodiments, the support comprises at least 10² sites, at least 10³ sites, at least 10⁴ sites, at least 10⁵ sites, at least 10⁶ sites, at least 10⁷ sites, at least 10⁸ sites, at least 10⁹ sites, at least 10¹⁰ sites, at least 10¹¹ sites, at least 10¹² sites, at least 10¹³ sites, at least 10¹⁴ sites, at least 10¹⁵ sites, or more, where the sites are located at pre-determined locations on the support. In some embodiments, the support comprises between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween located at pre-determined locations on the support. In some embodiments, a plurality of pre-determined sites on the support (e.g., 10² - 10¹⁵ sites or more) are immobilized with nucleic acid templates to form a nucleic acid template array. In some embodiments, the nucleic acid templates are immobilized at a plurality of pre-determined sites by hybridization to immobilized surface capture primers, or the nucleic acid templates are covalently attached to the surface capture primer. In some embodiments, the nucleic acid templates are immobilized at a plurality of pre-determined sites, for example immobilized at 10² - 10¹⁵ sites or more. In some embodiments, the immobilized nucleic acid templates are clonally-amplified to generate immobilized nucleic acid clusters at the plurality of pre-determined sites. In some embodiments, individual immobilized nucleic acid clusters comprise linear clusters, or comprise single-stranded or double-stranded concatemers.
[00124] In some embodiments, a support comprising a plurality of sites located at random locations on the support is referred to herein as a support having randomly located sites thereon. The locations of the randomly located sites on the support are not pre-determined. The plurality of randomly-located sites is arranged on the support in a disordered and/or unpredictable fashion. In some embodiments, the support comprises at least 10² sites, at least 10³ sites, at least 10⁴ sites, at least 10⁵ sites, at least 10⁶ sites, at least 10⁷ sites, at least 10⁸ sites, at least 10⁹ sites, at least 10¹⁰ sites, at least 10¹¹ sites, at least 10¹² sites, at least 10¹³ sites, at least 10¹⁴ sites, at least 10¹⁵ sites, or more, where the sites are randomly located on the support. In some embodiments, the support comprises between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, or between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween located at random locations on the support. In some embodiments, a plurality of randomly located sites on the support (e.g., 10² - 10¹⁵ sites or more) are immobilized with nucleic acid templates to form a support immobilized with nucleic acid templates. In some embodiments, the nucleic acid templates are immobilized at a plurality of randomly located sites by hybridization to immobilized surface capture primers, or the nucleic acid templates are covalently attached to the surface capture primer. In some embodiments, the nucleic acid templates are immobilized at a plurality of randomly located sites, for example immobilized at 10² - 10¹⁵ sites or more. In some embodiments, the template molecules are immobilized at between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, or between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween, on the support. In some embodiments, the immobilized nucleic acid templates are clonally-amplified to generate immobilized nucleic acid clusters at the plurality of randomly located sites. In some embodiments, individual immobilized nucleic acid clusters comprise linear clusters, or comprise single-stranded or double-stranded concatemers.
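The distinction between sites at pre-determined locations and randomly located sites can be pictured with a small sketch. The code below generates site coordinates for a rectilinear pattern, a hexagonal pattern, and a random layout, and reports the smallest spacing between any two sites. The function names, region size, and pitch are hypothetical values chosen only for illustration and are not taken from the disclosure.

```python
import math
import random

def rectilinear_sites(width, height, pitch):
    """Sites at pre-determined positions arranged in rows and columns."""
    cols = int(width // pitch) + 1
    rows = int(height // pitch) + 1
    return [(c * pitch, r * pitch) for r in range(rows) for c in range(cols)]

def hexagonal_sites(width, height, pitch):
    """Sites at pre-determined positions in a hexagonal pattern.

    Every other row is shifted by half a pitch, and rows are spaced
    pitch * sqrt(3) / 2 apart, so each site's nearest neighbors lie one pitch away.
    """
    row_step = pitch * math.sqrt(3) / 2
    sites = []
    r = 0
    while r * row_step <= height:
        offset = pitch / 2 if r % 2 else 0.0
        c = 0
        while offset + c * pitch <= width:
            sites.append((offset + c * pitch, r * row_step))
            c += 1
        r += 1
    return sites

def random_sites(width, height, count, seed=0):
    """Sites at random, non-pre-determined positions (uniform placement assumed)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(count)]

def min_spacing(sites):
    """Smallest center-to-center distance between any two sites."""
    return min(math.dist(a, b) for i, a in enumerate(sites) for b in sites[i + 1:])

grid = rectilinear_sites(5.0, 5.0, 1.0)
hexa = hexagonal_sites(5.0, 5.0, 1.0)
rand = random_sites(5.0, 5.0, len(grid))

for name, sites in (("rectilinear", grid), ("hexagonal", hexa), ("random", rand)):
    print(name, len(sites), round(min_spacing(sites), 3))
```

In the patterned layouts the smallest spacing equals the pitch, whereas a random layout with a comparable number of sites will typically contain pairs that sit much closer together than the pitch.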
[00125] In some embodiments, the plurality of immobilized surface capture primers on the support are in fluid communication with each other to permit flowing a solution of reagents (e.g., nucleic acid template molecules, soluble primers, enzymes, nucleotides, divalent cations, buffers, and the like) onto the support so that the plurality of immobilized surface capture primers on the support can be essentially simultaneously reacted with the reagents in a massively parallel manner. In some embodiments, the fluid communication of the plurality of immobilized surface capture primers can be used to conduct nucleic acid amplification reactions (e.g., RCA, MDA, PCR and bridge amplification) essentially simultaneously on the plurality of immobilized surface capture primers.
[00126] In some embodiments, the plurality of immobilized nucleic acid clusters on the support are in fluid communication with each other to permit flowing a solution of reagents (e.g., enzymes, nucleotides, divalent cations, and the like) onto the support so that the plurality of immobilized nucleic acid clusters on the support can be essentially simultaneously reacted with the reagents in a massively parallel manner. In some embodiments, the fluid communication of the plurality of immobilized nucleic acid clusters can be used to conduct nucleotide binding assays and/or conduct nucleotide polymerization reactions (e.g., primer extension or sequencing) essentially simultaneously on the plurality of immobilized nucleic acid clusters, and optionally to conduct detection and imaging for massively parallel sequencing.
[00127] When used in reference to immobilized enzymes, the term “immobilized” and related terms refer to enzymes (e.g., polymerases) that are attached to a support through covalent bond or non-covalent interaction, or attached to a coating on the support, or buried within a matrix formed by a coating on the support.
[00128] When used in reference to immobilized nucleic acids, the term “immobilized” and related terms refer to nucleic acid molecules that are attached to a support through covalent bond or non-covalent interaction, or attached to a coating on the support, or buried within a matrix formed by a coating on the support, where the nucleic acid molecules include surface capture primers, nucleic acid template molecules and extension products of capture primers. Extension products of capture primers include nucleic acid concatemers (e.g., nucleic acid clusters).
[00129] In some embodiments, one or more nucleic acid templates are immobilized on the support, for example immobilized at the sites on the support. In some embodiments, the one or more nucleic acid templates are clonally-amplified. In some embodiments, the one or more nucleic acid templates are clonally-amplified off the support (e.g., in-solution) and then deposited onto the support and immobilized on the support. In some embodiments, the clonal amplification reaction of the one or more nucleic acid templates is conducted on the support resulting in immobilization on the support. In some embodiments, the one or more nucleic acid templates are clonally-amplified (e.g., in solution or on the support) using a nucleic acid amplification reaction, including any one or any combination of: polymerase chain reaction (PCR), multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, bridge amplification, isothermal bridge amplification, rolling circle amplification (RCA), circle-to-circle amplification, helicase-dependent amplification, recombinase-dependent amplification, and/or single-stranded binding (SSB) protein-dependent amplification.
[00130] As used herein, the term “binding complex” refers to a complex formed by binding together a nucleic acid duplex, a polymerase, and a free nucleotide or a nucleotide unit of a multivalent molecule, where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a nucleic acid primer. In the binding complex, the free nucleotide or nucleotide unit may or may not be bound to the 3’ end of the nucleic acid primer at a position that is opposite a complementary nucleotide in the nucleic acid template molecule. A “ternary complex” is an example of a binding complex which is formed by binding together a nucleic acid duplex, a polymerase, and a free nucleotide or nucleotide unit of a multivalent molecule, where the free nucleotide or nucleotide unit is bound to the 3’ end of the nucleic acid primer (as part of the nucleic acid duplex) at a position that is opposite a complementary nucleotide in the nucleic acid template molecule.
[00131] The term “persistence time” and related terms refer to the length of time that a binding complex, which is formed between the target nucleic acid, a primer, a polymerase, and a conjugated or unconjugated nucleotide, remains stable without any binding component dissociating from the binding complex. The persistence time is indicative of the stability of the binding complex and strength of the binding interactions. Persistence time can be measured by observing the onset and/or duration of a binding complex, such as by observing a signal from a labeled component of the binding complex. For example, a labeled nucleotide or a labeled reagent comprising one or more nucleotides may be present in a binding complex, thus allowing the signal from the label to be detected during the persistence time of the binding complex. One exemplary label is a fluorescent label.
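As a rough illustration of measuring persistence time from a labeled component, the sketch below takes a sampled fluorescence trace and returns the longest continuous interval during which the signal stays at or above a detection threshold. The trace values, the threshold, the frame interval, and the function name are all hypothetical; the disclosure does not prescribe this particular calculation.

```python
def persistence_time(trace, threshold, frame_interval_s):
    """Longest continuous duration (in seconds) with signal >= threshold.

    trace: fluorescence intensities sampled at a fixed frame interval.
    """
    longest = current = 0
    for value in trace:
        if value >= threshold:
            current += 1                  # binding complex still detected
            longest = max(longest, current)
        else:
            current = 0                   # signal lost: complex dissociated
    return longest * frame_interval_s

# Hypothetical trace: one long binding event followed by a brief one.
trace = [2, 3, 40, 42, 41, 39, 38, 4, 3, 37, 2]
example_persistence = persistence_time(trace, threshold=30, frame_interval_s=0.5)
print(example_persistence)  # 5 consecutive frames above threshold -> 2.5
```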
[00132] The present disclosure provides various reagents, and methods that employ the reagents for conducting a trapping reaction, an imaging reaction, a nucleic acid denaturation (de-hybridization) and/or a stepping reaction. The various reagents can include at least one pH buffering agent. The full names of the pH buffering agents are listed herein.
[00133] The term “Tris” refers to a pH buffering agent Tris(hydroxymethyl)aminomethane. The term “Tris-HCl” refers to a pH buffering agent Tris(hydroxymethyl)aminomethane hydrochloride. The term “Tris-acetate” refers to a pH buffering agent comprising an acetate salt of Tris(hydroxymethyl)aminomethane.
[00134] The term “Tricine” refers to a pH buffering agent N-[tris(hydroxymethyl)methyl]glycine.
[00135] The term “Bicine” refers to a pH buffering agent N,N-bis(2-hydroxyethyl)glycine.
[00136] The term “Bis-Tris propane” refers to a pH buffering agent 1,3-bis[tris(hydroxymethyl)methylamino]propane.
[00137] The term “HEPES” refers to a pH buffering agent 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid.
[00138] The term “MES” refers to a pH buffering agent 2-(N-morpholino)ethanesulfonic acid.
[00139] The term “MOPS” refers to a pH buffering agent 3-(N-morpholino)propanesulfonic acid.
[00140] The term “MOPSO” refers to a pH buffering agent 3-(N-morpholino)-2-hydroxypropanesulfonic acid.
[00141] The term “BES” refers to a pH buffering agent N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic acid.
[00142] The term “TES” refers to a pH buffering agent 2-[(2-hydroxy-1,1-bis(hydroxymethyl)ethyl)amino]ethanesulfonic acid.
[00143] The term “CAPS” refers to a pH buffering agent 3-(cyclohexylamino)-1-propanesulfonic acid.
[00144] The term “TAPS” refers to a pH buffering agent N-[tris(hydroxymethyl)methyl]-3-aminopropanesulfonic acid.
[00145] The term “TAPSO” refers to a pH buffering agent N-[tris(hydroxymethyl)methyl]-3-amino-2-hydroxypropanesulfonic acid.
[00146] The term “ACES” refers to a pH buffering agent N-(2-acetamido)-2-aminoethanesulfonic acid.
[00147] The term “PIPES” refers to a pH buffering agent piperazine-1,4-bis(2-ethanesulfonic acid).
[00148] The term “ethanolamine” refers to a pH buffering agent that is also known as 2-aminoethanol.
[00149] Throughout this application various publications, patents, and/or patent applications are referenced. The disclosures of the publications, patents and/or patent applications are hereby incorporated by reference in their entireties into this application in order to more fully describe the state of the art to which this disclosure pertains.
[00150] The present disclosure provides compositions, apparatus and methods for conducting separate sequencing batches on a support having nucleic acid template molecules immobilized thereon, where the separate sequencing batches can be conducted using any massively parallel sequencing technology. In some embodiments, a plurality of sub-populations of nucleic acid template molecules are immobilized to the support including at least a first and second sub-population. In some embodiments, the first sub-population of template molecules undergo first sequencing reactions (e.g., first batch sequencing) and a region of the support is imaged to detect the first sequencing reactions, wherein the second sub-population of template molecules do not undergo sequencing reactions. In some embodiments, the second sub-population of template molecules undergo second sequencing reactions (e.g., second batch sequencing) and the same region of the support is imaged to detect the second sequencing reactions, wherein the first sub-population of template molecules do not undergo sequencing reactions. Thus, the first and second sub-populations of nucleic acid template molecules undergo batch sequencing.
[00151] In some embodiments, the plurality of sub-populations of nucleic acid template molecules are immobilized to the support at a high density. In some embodiments, at least some of the immobilized template molecules in the first and second sub-populations comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. For example, the plurality of sub-populations of nucleic acid template molecules are immobilized to the support at a density of about 10² - 10¹⁵ template molecules per mm². In some embodiments, the template molecules are at a density of between about 10¹⁰ and about 10¹⁵ template molecules per mm², between about 10⁵ and about 10¹⁵ template molecules per mm², between about 10³ and about 10¹⁴ template molecules per mm², between about 10⁴ and about 10¹³ template molecules per mm², between about 10⁵ and about 10¹² template molecules per mm², between about 10⁶ and about 10¹¹ template molecules per mm², between about 10⁷ and about 10¹⁰ template molecules per mm², or between about 10⁸ and about 10¹⁰ template molecules per mm² on the support, or any range therebetween.
[00152] In some embodiments, the support comprises a plurality of template molecules immobilized at pre-determined positions on the support (e.g., a patterned support). In some embodiments, the support comprises a plurality of template molecules immobilized at random and non-pre-determined positions on the support. In some embodiments, the support comprises a mixture of at least two sub-populations of template molecules immobilized at random and non-pre-determined positions on the support.
[00153] In some embodiments, the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern. In some embodiments, the support lacks contours which include features as sites for attachment of the nucleic acid template molecules. In some embodiments, the support lacks interstitial regions arranged in a predetermined pattern where the interstitial regions are sites designed to have no attached surface capture primers and/or template molecules. In some embodiments, the support lacks features that can be prepared using photo-chemical, photo-lithography, or micron-scale or nano-scale printing.
[00154] In some embodiments, individual template molecules in a given sub-population of template molecules comprise a sequence of interest, a batch barcode sequence that corresponds to the sequence of interest, and a batch sequencing primer binding site sequence that corresponds to the sequence of interest. In some embodiments, a pre-determined batch barcode sequence can be linked to a given sequence of interest, thus the pre-determined batch barcode sequence corresponds to a given sequence of interest. In some embodiments, a predetermined batch sequencing primer binding site sequence can be linked to a given sequence of interest, thus the pre-determined batch sequencing primer binding site sequence corresponds to a given sequence of interest. In some embodiments, template molecules within a given sub-population have the same or different sequences of interest. In some embodiments, template molecules within a given sub-population have the same batch barcode sequence. In some embodiments, template molecules within a given sub-population have the same sequencing primer binding site sequence. Thus, the different sub-populations of template molecules can undergo batch sequencing using a batch-specific sequencing primer.
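The role of the batch-specific sequencing primer can be sketched in a few lines: each template molecule carries a sequence of interest, a batch barcode, and a batch sequencing primer binding site, and only templates whose binding site is complementary to the primer flowed in a given batch are sequenced. The class name, field names, and the short example sequences are hypothetical and chosen purely for illustration; real primer binding sites are longer and hybridization is not an exact-match test.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TemplateMolecule:
    sequence_of_interest: str
    batch_barcode: str
    batch_primer_site: str  # template sequence that the batch primer anneals to

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def templates_sequenced_in_batch(templates, batch_primer):
    """Templates whose primer binding site is complementary to the batch primer.

    Simplifying assumption: binding requires an exact complementary match.
    """
    site = reverse_complement(batch_primer)
    return [t for t in templates if t.batch_primer_site == site]

# Hypothetical binding sites for two sub-populations on the same support.
SITE_1, SITE_2 = "AAGGCTTC", "TTCAGGAC"
support = [
    TemplateMolecule("ACGTACGTAC", "AAAC", SITE_1),
    TemplateMolecule("GGGTTTCCCA", "AAAC", SITE_1),
    TemplateMolecule("TTGACCATGA", "CCGT", SITE_2),
]

first_batch = templates_sequenced_in_batch(support, reverse_complement(SITE_1))
second_batch = templates_sequenced_in_batch(support, reverse_complement(SITE_2))
print(len(first_batch), len(second_batch))
```

Each batch primer selects only its own sub-population, so the two sub-populations can be imaged in the same region of the support in separate batches.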
[00155] In some embodiments, the sequence of interest region need not undergo sequencing. Instead, the batch barcode can be sequenced by conducting a small number of sequencing cycles to reveal the batch barcode which corresponds to its sequence of interest. In some embodiments, the batch barcode and the sequence of interest can be sequenced.
[00156] In some embodiments, individual template molecules in a given sub-population of template molecules further comprise a sample index sequence that can be used to distinguish sequences of interest obtained from different sample sources in a multiplex assay. In some embodiments, template molecules within a given sub-population have the same or different sample index sequences.
[00157] In some embodiments, the sequence of interest region need not undergo sequencing. Instead, the batch barcode and the sample index can be sequenced by conducting a small number of sequencing cycles to reveal the batch barcode which corresponds to its sequence of interest and to reveal the sample index which corresponds to the sample source of the sequence of interest. In some embodiments, the template molecules lack a sample index and the batch barcode can serve as a sample index.
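Identifying a sequence of interest from only a few sequencing cycles amounts to a lookup. The sketch below assumes, purely for illustration, that the short read begins with the batch barcode followed immediately by the sample index, and that both are resolved against pre-determined tables. The layout, barcode lengths, table contents, and function name are hypothetical and are not specified by the disclosure.

```python
def decode_short_read(read, barcode_len, index_len, barcode_to_target, index_to_sample):
    """Resolve a short read into (sequence-of-interest label, sample source).

    Assumes the read starts with the batch barcode, followed by the sample index.
    Unknown barcodes or indexes resolve to None.
    """
    barcode = read[:barcode_len]
    sample_index = read[barcode_len:barcode_len + index_len]
    return barcode_to_target.get(barcode), index_to_sample.get(sample_index)

# Hypothetical pre-determined lookup tables.
barcode_to_target = {"ACGT": "target_A", "TTAG": "target_B"}
index_to_sample = {"GGC": "sample_1", "CAT": "sample_2"}

# Seven sequencing cycles are enough here: 4 barcode bases plus 3 index bases.
decoded = decode_short_read("TTAGGGC", 4, 3, barcode_to_target, index_to_sample)
print(decoded)  # ('target_B', 'sample_1')
```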
[00158] In some embodiments, the same portion of individual template molecules can be re-sequenced (e.g., reiterative sequencing) from the same start position to generate overlapping sequencing reads that can be aligned to a reference sequence. For example, the same portion of individual template molecules can be sequenced at least two, three, four, five, up to 50 times, up to 100 times, or more than 100 times. The start sequencing site can be any location of the template molecule and is dictated by the sequencing primers which are designed to anneal to a selected position within the template molecule. In some embodiments, the batch barcodes (or the batch barcodes and sample indexes) can be reiteratively sequenced by repeatedly conducting a short number of sequencing cycles of the batch barcode region (or the batch barcode and sample index regions) of a given template molecule. The reiterative sequencing reads increase the redundancy of sequencing information for individual bases in the template molecule. Reiteratively sequencing one strand of the template molecule can provide enough base coverage so that pairwise sequencing of the complementary strand is not necessary.
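The way reiterative reads add redundancy for individual bases can be illustrated with a per-position majority vote over reads that share the same start position. This is one simple way to combine such reads, shown as an assumption for illustration; the disclosure does not state how reiterative reads are combined, and the example reads are made up.

```python
from collections import Counter

def consensus(reads):
    """Per-position majority vote across reads that share the same start position.

    Positions beyond the shortest read are ignored. Ties are broken in favor of
    the base seen first at that position.
    """
    length = min(len(r) for r in reads)
    calls = []
    for i in range(length):
        base, _count = Counter(r[i] for r in reads).most_common(1)[0]
        calls.append(base)
    return "".join(calls)

# Five hypothetical reiterative reads of the same region; two contain a single error.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTTC", "ACGTAC"]
print(consensus(reads))  # ACGTAC
```

Because each position is called from several independent reads, an isolated error in any one read is outvoted by the others.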
[00159] In some embodiments, after sequencing the first and/or second sub-populations of template molecules, the support can be re-seeded at least once with an additional sub-population of template molecules (e.g., a third sub-population) which can undergo additional batch sequencing. In some embodiments, an ongoing batch sequencing run can be stopped prior to completion (e.g., interrupted) to permit re-seeding the support with an additional sub-population of template molecules (e.g., the third sub-population) and then the interrupted batch sequencing can be resumed. Thus, the support can be re-seeded any time and/or before a previous sequencing batch is completed.
[00160] In some embodiments, the support comprises a plurality of template molecules immobilized at an initial low density where most of the nearest neighbor template molecules do not touch each other and/or do not overlap each other. In some embodiments, the initial low density support comprises a plurality of template molecules having interstitial space between the template molecules.
[00161] In some embodiments, the same support can undergo a first re-seeding with additional template molecules immobilized to the support so that the first re-seeded density has some nearest neighbor template molecules (e.g., 10 - 30% of the first immobilized re-seeded template molecules) that touch each other and/or overlap each other. In some embodiments, the resulting first re-seeded support comprises a plurality of template molecules having a reduced number of interstitial spaces (and/or having a reduced size of interstitial spaces) between the template molecules compared to the initial low density support.
[00162] In some embodiments, the same support can undergo a second re-seeding with additional template molecules immobilized to the support so that the second re-seeded density has an increase in nearest neighbor template molecules (e.g., 25 - 50% or more of the first re-seeded template molecules) that touch each other and/or overlap each other. In some embodiments, the resulting second re-seeded support comprises a plurality of template molecules having a further reduced number of interstitial spaces (and/or having a further reduced size of interstitial spaces) between the template molecules compared to the first re-seeded density support. In some embodiments, the support can undergo multiple re-seeding workflows to generate increasing nearest neighbor template molecules that touch each other and/or overlap each other.
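The relationship between seeding density and the fraction of touching or overlapping nearest neighbors can be estimated with a simple model. The sketch below assumes, as an illustration only, that template molecules land at uniformly random positions (a two-dimensional Poisson process) and that each occupies a disc of fixed diameter, so two molecules touch or overlap when their centers are no farther apart than one diameter. Under that assumption the fraction of molecules with at least one touching neighbor is 1 - exp(-density × π × diameter²). The spot diameter and function names are hypothetical; real seeding may deviate from this model.

```python
import math

def touching_fraction(density_per_mm2, spot_diameter_um):
    """Fraction of molecules with a neighbor center within one spot diameter.

    Assumes random (Poisson) placement of equally sized discs.
    """
    d_mm = spot_diameter_um / 1000.0
    return 1.0 - math.exp(-density_per_mm2 * math.pi * d_mm ** 2)

def density_for_fraction(fraction, spot_diameter_um):
    """Density (per mm^2) at which the given fraction of molecules touch a neighbor."""
    d_mm = spot_diameter_um / 1000.0
    return -math.log(1.0 - fraction) / (math.pi * d_mm ** 2)

SPOT_DIAMETER_UM = 0.5  # hypothetical spot size, not taken from the disclosure

# Densities corresponding to the example touching fractions after re-seeding.
for fraction in (0.10, 0.30, 0.50):
    density = density_for_fraction(fraction, SPOT_DIAMETER_UM)
    print(f"{fraction:.0%} touching at about {density:.2e} molecules per mm^2")
```

Under these assumptions, each successive re-seeding raises the density and therefore the touching fraction, which mirrors the progression from the initial low density support to the first and second re-seeded supports.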
[00163] In some embodiments, individual template molecules comprise nucleic acid concatemer template molecules. In some embodiments, a concatemer template molecule can be generated by conducting rolling circle amplification of a circularized nucleic acid library molecule. In some embodiments, a concatemer template molecule comprises a single-stranded nucleic acid strand carrying numerous tandem copies of a polynucleotide unit, where each polynucleotide unit comprises a sequence of interest region and at least one batch sequencing primer binding site. In some embodiments, each polynucleotide unit further comprises at least one batch barcode sequence. In some embodiments, each polynucleotide unit further comprises at least one sample index sequence. Individual polynucleotide units can bind a sequencing primer, a sequencing polymerase and a detectably-labeled nucleotide reagent (e.g., detectably labeled multivalent molecules or nucleotide analogs), to form a detectable sequencing complex. In some embodiments, individual concatemer template molecules can collapse into a compact DNA nanoball, where individual nanoballs carry numerous tandem copies of a polynucleotide unit along their lengths. During batch sequencing, individual nanoballs carry numerous detectable sequencing complexes. Thus, the compact nature of the nanoballs increases the local concentration of detectably-labeled nucleotide reagents that are used during batch sequencing which increases the signal intensity emitted from a nanoball to give a discrete detectable signal which can be imaged as a fluorescent spot. In some embodiments, a spot corresponds to a concatemer and each concatemer corresponds to a sequence of interest. Multiple spots can be detected and imaged simultaneously on a support having high density concatemer template molecules immobilized thereon.
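The tandem-repeat structure of a concatemer can be sketched as a repeated string: each polynucleotide unit contributes one batch sequencing primer binding site, so the number of potential sequencing complexes on one concatemer scales with the number of tandem copies. The unit layout, the example sequences, and the copy number below are hypothetical and serve only to illustrate the idea.

```python
def build_concatemer(unit, copies):
    """Single strand carrying tandem copies of one polynucleotide unit."""
    return unit * copies

def count_primer_sites(concatemer, primer_site):
    """Count occurrences of the primer binding site, including overlapping ones."""
    count = start = 0
    while True:
        pos = concatemer.find(primer_site, start)
        if pos == -1:
            return count
        count += 1
        start = pos + 1

# Hypothetical unit: primer binding site + batch barcode + sequence of interest.
PRIMER_SITE = "AAGGCTTC"
unit = PRIMER_SITE + "CCGT" + "TTGACCATGA"

concatemer = build_concatemer(unit, copies=50)
sites = count_primer_sites(concatemer, PRIMER_SITE)
print(len(concatemer), sites)  # 1100 bases, 50 primer binding sites
```

With one binding site per unit, a concatemer of 50 units offers 50 positions at which a sequencing complex can form, which is the basis for the stronger signal from a single spot.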
Batch Sequencing
[00164] The present disclosure provides methods for sequencing comprising step (a): providing a support comprising a plurality of nucleic acid template molecules immobilized to the support. In some embodiments, the plurality of template molecules comprises a plurality of sub-populations of template molecules including at least a first and a second sub-population of template molecules. In some embodiments, the first sub-population of template molecules comprises a first batch sequencing primer binding site and at least one first sequence-of-interest. In some embodiments, the second sub-population of template molecules comprises a second batch sequencing primer binding site and at least one second sequence-of-interest. In some embodiments, template molecules within the first sub-population have the same first batch sequencing primer binding site. In some embodiments, template molecules within the first sub-population have the same sequence of interest or different sequences of interest. In some embodiments, the first batch sequencing primer binding site sequence corresponds to the first sequence of interest, or the first batch sequencing primer binding site sequence corresponds to one of the first sequences of interest in the first sub-population. In some embodiments, a pre-determined first batch sequencing primer binding site sequence can be linked to a given sequence of interest in the first sub-population (or can be linked to different sequences of interest in a first sub-population), thus the pre-determined first batch sequencing primer binding site sequence corresponds to a given sequence of interest in the first sub-population.
[00165] In some embodiments, the sequences of interest in the first sub-population are about 50-250 bases in length, about 250-500 bases in length, about 500-800 bases in length, about 800-1200 bases in length, about 1200-2000 bases in length, or up to 2000 bases in length, or any range therebetween.
[00166] In some embodiments, template molecules within the second sub-population have the same second batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest. In some embodiments, the second batch sequencing primer binding site sequence corresponds to the second sequence of interest. In some embodiments, the second batch sequencing primer binding site sequence corresponds to one of the second sequences of interest in the second sub-population. In some embodiments, a pre-determined second batch sequencing primer binding site sequence can be linked to a given sequence of interest in the second sub-population (or can be linked to different sequences of interest in a second sub-population), thus the pre-determined second batch sequencing primer binding site sequence corresponds to a given sequence of interest in the second sub-population.
[00167] In some embodiments, the sequences of interest in the second sub-population are about 50-250 bases in length, about 250-500 bases in length, about 500-800 bases in length, about 800-1200 bases in length, about 1200-2000 bases in length, or up to 2000 bases in length, or any range therebetween.
[00168] In some embodiments, the first and second batch sequencing primer binding sites have different sequences.
[00169] In some embodiments, the plurality of nucleic acid template molecules can be immobilized to the support at random and non-pre-determined positions on the support, or at pre-determined positions on the support (e.g., a patterned support).
[00170] In some embodiments, in the methods for sequencing of step (a), the support comprises a plurality of nucleic acid template molecules immobilized thereon at a density of about 10² - 10¹⁵ template molecules per mm², e.g., between about 10¹⁰ and about 10¹⁵ template molecules per mm², between about 10⁵ and about 10¹⁵ template molecules per mm², between about 10³ and about 10¹⁴ template molecules per mm², between about 10⁴ and about 10¹³ template molecules per mm², between about 10⁵ and about 10¹² template molecules per mm², between about 10⁶ and about 10¹¹ template molecules per mm², between about 10⁷ and about 10¹⁰ template molecules per mm², or between about 10⁸ and about 10¹⁰ template molecules per mm², or any range therebetween. In some embodiments, the template molecules comprise a mixture of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules. In some embodiments, the plurality of sub-populations of template molecules are immobilized to the support at a high density where at least some of the template molecules in the first and second sub-populations comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the support comprises up to 500 million template molecules immobilized thereon, or up to 1 billion template molecules immobilized thereon, or up to 2 billion template molecules immobilized thereon, or up to 3 billion template molecules immobilized thereon, or up to 4 billion template molecules immobilized thereon, or up to 5 billion template molecules immobilized thereon, or up to 6 billion template molecules immobilized thereon. In some embodiments, the support comprises up to 7 billion template molecules immobilized thereon, or up to 8 billion template molecules immobilized thereon, or up to 9 billion template molecules immobilized thereon, or up to 10 billion template molecules immobilized thereon, or up to 20 billion template molecules immobilized thereon. In some embodiments, the support comprises between about 500 million and about 20 billion template molecules immobilized thereon, between about 1 billion and about 10 billion template molecules immobilized thereon, between about 2 billion and about 9 billion template molecules immobilized thereon, between about 3 billion and about 8 billion template molecules immobilized thereon, between about 4 billion and about 7 billion template molecules immobilized thereon, or between about 5 billion and about 6 billion template molecules immobilized thereon, or any range therebetween.
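The density ranges and the total-count ranges above are related by simple arithmetic: the number of immobilized template molecules is the surface density multiplied by the usable support area. The following is a minimal sketch of that relationship; the function names and the 1000 mm² area used in the usage note are hypothetical and are not taken from the disclosure.

```python
def template_count(density_per_mm2: float, area_mm2: float) -> float:
    """Total immobilized template molecules = surface density x support area."""
    if density_per_mm2 < 0 or area_mm2 <= 0:
        raise ValueError("density must be non-negative and area must be positive")
    return density_per_mm2 * area_mm2


def density_for_count(count: float, area_mm2: float) -> float:
    """Surface density needed to place `count` template molecules on `area_mm2`."""
    if count < 0 or area_mm2 <= 0:
        raise ValueError("count must be non-negative and area must be positive")
    return count / area_mm2
```

For instance, under the assumed area of 1000 mm², a density of 10⁶ template molecules per mm² corresponds to 1 billion template molecules, and 5 billion template molecules would require a density of 5 x 10⁶ per mm².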
[00171] In some embodiments, in the methods for sequencing of step (a), the support comprises features that are located in a random and non-pre-determined manner, where the features are sites for attachment of the template molecules.
[00172] In some embodiments, the support is passivated with at least one polymer layer comprising a plurality of surface capture primers covalently tethered to the at least one polymer layer.
[00173] In some embodiments, the support is passivated with multiple polymer layers. In some embodiments, at least one of the polymer layers comprises oligonucleotide primers including capture primers, pinning primers, or a mixture of capture and pinning primers. In some embodiments, the plurality of oligonucleotide primers comprises one type of capture primer (e.g., having the same batch capture primer sequence). In some embodiments, the plurality of oligonucleotide primers comprises a mixture of 2-500 different types of capture primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch capture primer sequences, or any range therebetween). In some embodiments, the plurality of oligonucleotide primers comprises one type of pinning primer (e.g., having the same batch pinning primer sequence). In some embodiments, the plurality of oligonucleotide primers comprises a mixture of 2-500 different types of pinning primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch pinning primer sequences, or any range therebetween). In some embodiments, the plurality of oligonucleotide primers comprises between 2 and 500, between 10 and 400, between 20 and 300, between 50 and 200, between 100 and 500, between 200 and 400, between 2 and 250, between 10 and 150, between 20 and 200, between 20 and 100, or between 5 and 50 different capture primers and/or pinning primers, or any range therebetween.
[00174] In some embodiments, the plurality of surface capture primers comprise a plurality of sub-populations of surface capture primers including at least a first and second sub-population of surface capture primers. In some embodiments, the surface capture primers in the at least first and second sub-population have different sequences. In some embodiments, the surface capture primers in the at least first and second sub-population can hybridize to and thereby capture different circularized library molecules carrying different surface capture primer binding site sequences.
[00175] In some embodiments, the plurality of surface capture primers are randomly distributed throughout and embedded within the at least one polymer layer.
[00176] In some embodiments, the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules. In some embodiments, the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
[00177] In some embodiments, in the methods for sequencing of step (a), the support lacks partitions and/or barriers that would create separate regions of the support. Thus, the template molecules immobilized to the support are in fluid communication with each other in a massively parallel manner with no barriers to physically separate different batches of template molecules.
[00178] In some embodiments, the plurality of surface capture primers are located at predetermined positions on the at least one polymer layer and/or the plurality of surface capture primers are embedded within the at least one polymer layer at pre-determined locations.
[00179] In some embodiments, the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules (e.g., by localizing capture primers thereto). In some embodiments, the support includes interstitial regions arranged in a predetermined pattern where the interstitial regions are sites designed to have no attached template molecules.
[00180] In some embodiments, in the methods for sequencing of step (a), individual template molecules in the first sub-population further comprise a first batch barcode sequence which corresponds to the first sequence of interest. In some embodiments, the first batch barcode sequence corresponds to one of the first sequences of interest in the first sub-population. In some embodiments, a pre-determined first batch barcode sequence can be linked to a given sequence of interest in the first sub-population, thus the pre-determined first batch barcode sequence corresponds to a given sequence of interest in the first sub-population. In some embodiments, a pre-determined first batch barcode sequence can be linked to different sequences of interest in a first sub-population.
[00181] In some embodiments, individual template molecules in the second sub-population further comprise a second batch barcode sequence which corresponds to the second sequence of interest. In some embodiments, the second batch barcode sequence corresponds to one of the second sequences of interest in the second sub-population. In some embodiments, a pre-determined second batch barcode sequence can be linked to a given sequence of interest in the second sub-population, thus the pre-determined second batch barcode sequence corresponds to a given sequence of interest in the second sub-population. In some embodiments, a pre-determined second batch barcode sequence can be linked to different sequences of interest in a second sub-population.
[00182] In some embodiments, in the methods for sequencing of step (a), individual template molecules in the first sub-population further comprise at least one sample index sequence that can be used in a multiplex assay to distinguish the first sequences of interest obtained from different sample sources. In some embodiments, individual template molecules in the second sub-population further comprise at least one sample index sequence that can be used in a multiplex assay to distinguish the second sequences of interest obtained from different sample sources.
[00183] In some embodiments, the first batch barcode sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, the first batch sample index sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, the first batch barcode sequence and the first batch sample index sequence both include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, sequencing the short random sequence can provide nucleotide diversity and color balance. In some embodiments, sequencing and imaging the short random sequence can be used for polony mapping, location, and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
[00184] In some embodiments, in the first sub-population of library molecules, the short random sequence (e.g., NNN) has an overall base composition of about 25%, or about 20-30%, of each of the four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
[00185] In some embodiments, in the first sub-population of library molecules, the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the first sub-population of library molecules, the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the first sub-population of library molecules, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the first sub-population of library molecules, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
[00186] In some embodiments, in the first sub-population of library molecules, the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, in the first sub-population of library molecules, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
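The per-position base proportions described above can be checked computationally for a set of short random sequences. The following is a minimal sketch, assuming the random sequences have already been extracted from the library molecules as equal-length DNA strings; the function names and the default 20-30% window are illustrative choices, not part of the disclosure.

```python
from collections import Counter


def position_base_fractions(seqs):
    """For each position, return the fraction of A, C, G and T across all sequences."""
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    fractions = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        fractions.append({base: counts.get(base, 0) / len(seqs) for base in "ACGT"})
    return fractions


def is_balanced(fractions, low=0.20, high=0.30):
    """True if every base at every position falls within [low, high]."""
    return all(low <= f <= high for pos in fractions for f in pos.values())
```

With the four sequences "ACG", "CGT", "GTA" and "TAC", each base occurs once at every position, so every proportion is 25% and `is_balanced` returns True. Passing `low=0.10, high=0.40` would test the widest range recited above.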
[00187] In some embodiments, the second batch barcode sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, the second batch sample index sequence can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, the second batch barcode sequence and the second batch sample index sequence both include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, sequencing the short random sequence can provide nucleotide diversity and color balance. In some embodiments, sequencing and imaging the short random sequence can be used for polony mapping, location, and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
[00188] In some embodiments, in the second sub-population of library molecules, the short random sequence (e.g., NNN) has an overall base composition of about 25%, or about 20-30%, of each of the four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
[00189] In some embodiments, in the second sub-population of library molecules, the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the second sub-population of library molecules, the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the second sub-population of library molecules, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, in the second sub-population of library molecules, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
[00190] In some embodiments, in the second sub-population of library molecules, the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, in the second sub-population of library molecules, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
[00191] In some embodiments, in the methods for sequencing of step (a), the plurality of template molecules comprise concatemer template molecules. In some embodiments, the concatemer template molecules comprise at least first and second sub-populations of concatemer template molecules. In some embodiments, the concatemer template molecules can be generated by conducting rolling circle amplification (RCA) using circularized library molecules and amplification primers. In some embodiments, a concatemer template molecule comprises numerous tandem copies of a polynucleotide unit. In some embodiments, each polynucleotide unit comprises a sequence of interest and at least one sequencing primer binding site. In some embodiments, concatemer template molecules immobilized to a support can be generated using circularized library molecules and conducting rolling circle amplification. In some embodiments, the circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors. In some embodiments, the circularized library molecules comprise a mixture of any combination of circularized padlock probes, linear library molecules (single-stranded linear library molecules) circularized using single-stranded splint strands, and/or linear library molecules circularized using double-stranded adaptors. Methods for generating circularized library molecules are described herein and in WO2023168444, WO2023168443, WO2024011145, WO2024059550, and WO2025024465, the contents of each of which are incorporated by reference in their entirety herein.
[00192] In some embodiments, individual concatemer template molecules in the first sub-population comprise a plurality of tandem polynucleotide units. In some embodiments, each polynucleotide unit comprises a first sequence of interest and a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest. In some embodiments, the polynucleotide unit further comprises a first batch barcode sequence which corresponds to the first sequence of interest. In some embodiments, the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, concatemer template molecules in the first sub-population have the same first batch sequencing primer binding site. In some embodiments, concatemer template molecules in the first sub-population have the same sequence of interest or different sequences of interest.
[00193] In some embodiments, individual concatemer template molecules in the second sub-population comprise a plurality of tandem polynucleotide units. In some embodiments, each polynucleotide unit comprises a second sequence of interest and a second batch sequencing primer binding site sequence which corresponds to the second sequence of interest. In some embodiments, the polynucleotide unit further comprises a second batch barcode sequence which corresponds to the second sequence of interest. In some embodiments, the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, concatemer template molecules in the second sub-population have the same second batch sequencing primer binding site. In some embodiments, concatemer template molecules in the second sub-population have the same sequence of interest or different sequences of interest.
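The tandem-repeat structure of a concatemer template molecule described above can be modeled as a string built from repeated polynucleotide units. The following is a minimal sketch; the ordering of elements within a unit, the example sequences, and the function names are assumptions made only for illustration, since the disclosure does not fix a particular arrangement here.

```python
def build_unit(primer_site: str, barcode: str, sample_index: str, insert: str) -> str:
    """Assemble one polynucleotide unit (element order is an illustrative assumption)."""
    return primer_site + barcode + sample_index + insert


def build_concatemer(unit: str, copies: int) -> str:
    """Model a rolling circle amplification product as `copies` tandem units."""
    if copies < 1:
        raise ValueError("a concatemer needs at least one unit")
    return unit * copies


def count_primer_sites(concatemer: str, primer_site: str) -> int:
    """Count non-overlapping occurrences of the primer binding site sequence."""
    return concatemer.count(primer_site)
```

For example, a unit built from the hypothetical elements "AAGGTTCC" (primer binding site), "ACG" (batch barcode), "TTA" (sample index) and "GATTACA" (sequence of interest), repeated five times, carries five copies of the primer binding site, which is why a single concatemer can hybridize multiple sequencing primers.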
[00194] In some embodiments, in the methods for sequencing of step (a), the plurality of concatemer template molecules can be generated by conducting a rolling circle amplification reaction in the presence of a plurality of compaction oligonucleotides. Exemplary compaction oligonucleotides are described in WO2024040058, the contents of which are incorporated by reference herein in their entirety. In some embodiments, in the methods for sequencing of step (a), the plurality of concatemer template molecules can be generated by conducting a rolling circle amplification reaction in the absence of a plurality of compaction oligonucleotides. In some embodiments, individual compaction oligonucleotides can hybridize to two different locations on the same concatemer template molecule to pull together distal portions of the concatemer template molecule, causing compaction of the template molecule to form a DNA nanoball. In some embodiments, individual concatemer template molecules collapse into a polony or nucleic acid nanoball having a compact size and shape compared to a non-collapsed concatemer template molecule.
[00195] In some embodiments, the methods for sequencing further comprise step (b): sequencing the first sub-population of template molecules using a plurality of first batch sequencing primers, thereby generating a plurality of first batch sequencing read products. In some embodiments, the sequencing of step (b) comprises imaging a region of the support to detect the sequencing reactions of the first sub-population of template molecules.
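Batch-wise sequencing relies on the first batch sequencing primers hybridizing only to template molecules that carry the first batch sequencing primer binding site. The following is a minimal sketch of that selection logic, assuming each template molecule is represented by its primer binding site and its sequence of interest; the `Template` class, the function name, and the example sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    primer_site: str  # batch sequencing primer binding site carried by the template
    insert: str       # sequence of interest


def templates_for_batch(templates, batch_primer_site):
    """Return the templates that the given batch sequencing primer would select."""
    return [t for t in templates if t.primer_site == batch_primer_site]
```

Given a mixed support holding templates with two different primer binding sites, calling `templates_for_batch` with the first site returns only the first sub-population, while the second sub-population stays on the support unsequenced until its own primers are introduced.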
[00196] In some embodiments, the sequencing of step (b) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. Exemplary methods are described in WO2022266470, US20240191278A1 and WO2024159166, the contents of which are incorporated by reference in their entirety herein.
[00197] In some embodiments, the sequencing of step (b) comprises conducting a two-stage sequencing method. In some embodiments, the first stage comprises contacting the first sub-population of template molecules with a plurality of first batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., a nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00198] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent. In some embodiments, the trapping reagent further comprises at least one chaotropic agent. In some embodiments, the trapping reagent further comprises an amino acid or a modified amino acid. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00199] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo-damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., by washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the first sub-population of template molecules. In some embodiments, the first sub-population of template molecules can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of template molecules.
[00200] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the first sub-population of template molecules and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primers.
[00201] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides. In some embodiments, the plurality of nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00202] In some embodiments, the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00203] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00204] In some embodiments, the sequencing of step (b) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of first batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (b) comprises conducting 4-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween. In some embodiments, the sequencing of step (b) comprises sequencing at least a portion of the first batch barcode and/or sequencing at least a portion of the first sample index. In some embodiments, the sequencing of step (b) comprises sequencing at least a portion of the first sequence of interest.
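The cycle structure of the two-stage sequencing method can be illustrated with a toy model in which each cycle first identifies the next base (first stage: binding and imaging of a labeled multivalent molecule, without incorporation) and then advances the primer by one position (second stage: nucleotide incorporation). The following is a deliberately simplified sketch that ignores all reagent chemistry, signal noise and phasing; the function name is hypothetical and the template is assumed to be given in the order in which the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def two_stage_sequencing(template: str, cycles: int) -> str:
    """Toy model: one base call (stage 1) and one primer extension (stage 2) per cycle."""
    read = []
    position = 0  # next unpaired template position downstream of the primer
    for _ in range(cycles):
        if position >= len(template):
            break  # template exhausted before the requested cycle count
        # Stage 1: the complementary labeled multivalent molecule binds and is imaged.
        read.append(COMPLEMENT[template[position]])
        # Stage 2: a nucleotide is incorporated, extending the primer by one base.
        position += 1
    return "".join(read)
```

For the hypothetical template "ACGT", three cycles yield the read "TGC"; requesting more cycles than the template length simply stops at the end of the template.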
[00205] In some embodiments, prior to sequencing the second sub-population of template molecules, the plurality of first batch sequencing read products can be removed from the first sub-population of template molecules. In some embodiments, the first sub-population of template molecules can be retained on the support using a de-hybridization reagent. In some embodiments, the de-hybridization reagent comprises an SSC (saline-sodium citrate) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation such as, for example, 50-90 °C. In some embodiments, the de-hybridization reagent comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization step can be conducted at a temperature that promotes nucleic acid denaturation such as, for example, 50-90 °C. In some embodiments, the first batch sequencing read products are not removed from the first sub-population of template molecules.
[00206] In some embodiments, the sequencing reactions of the first sub-population of template molecules are stopped before initiating the sequencing reactions of the second sub-population of template molecules.
[00207] In some embodiments, the methods for sequencing further comprise step (b1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of template molecules to generate a plurality of first batch sequencing read products. In some embodiments, the first batch sequencing read products are up to 1000 bases in length. In some embodiments, step (b1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween. In some embodiments, the first batch sequencing read products comprise a first batch barcode sequence. In some embodiments, the first batch sequencing read products comprise a first batch barcode sequence and a sample index sequence. In some embodiments, the first batch sequencing read products comprise a first batch barcode sequence and at least a portion of a first sequence of interest. In some embodiments, the first batch sequencing read products comprise a first batch barcode sequence, a sample index sequence, and at least a portion of a first sequence of interest. In some embodiments, the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, 500 million - 1 billion of the first sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the first sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the first sub-population of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion, between about 1 billion and about 9 billion, between about 2 billion and about 8 billion, between about 3 billion and about 7 billion, between about 4 billion and about 6 billion, or any range therebetween of the first sub-population of concatemer template molecules can be sequenced.
[00208] In some embodiments, the sequencing of step (b1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the reiterative sequencing of step (b1) comprises conducting a two-stage sequencing method described herein.
[00209] In some embodiments, the methods for sequencing further comprise step (b2): stopping and/or blocking the short read sequencing of step (b1). In some embodiments, the stopping/blocking comprises incorporating a chain terminating nucleotide at the 3’ terminal end of the first batch sequencing read products to inhibit further sequencing reactions. Exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
[00210] In some embodiments, the methods for sequencing further comprise step (b3): removing the plurality of first batch sequencing read products from the template molecules of the first sub-population, and retaining the template molecules of the first sub-population. In some embodiments, the first batch sequencing read products can be removed from the template molecules by denaturation using heat and/or a de-hybridization reagent.
[00211] In some embodiments, the methods for sequencing further comprise step (b4): reiteratively sequencing the template molecules of the first sub-population by repeating steps (b1) - (b3) at least once. In some embodiments, the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween, or more than 50 times. For example, the reiterative sequencing can be conducted up to 100 times.
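The repeated pass over a retained template described in steps (b1) - (b4) can be pictured with a minimal Python sketch. It is an illustration only, not the disclosed implementation: the `Template` class, the `run_pass` and `reiterative_sequencing` names, and the simplification that each cycle adds exactly one complementary base are assumptions made for this example.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass
class Template:
    """Hypothetical model of one immobilized template molecule."""
    sequence: str            # written in the order the polymerase reads it
    read_product: str = ""   # read product currently hybridized, if any
    blocked: bool = False    # True once a chain terminator has been added


def run_pass(template: Template, cycles: int) -> str:
    """One pass of steps (b1)-(b3) on a single template."""
    # (b1) short read sequencing: assume one base is added per cycle
    for base in template.sequence[:cycles]:
        template.read_product += COMPLEMENT[base]
    # (b2) stop/block: a chain terminator prevents further extension
    template.blocked = True
    read = template.read_product
    # (b3) remove the read product; the template itself is retained
    template.read_product = ""
    template.blocked = False
    return read


def reiterative_sequencing(template: Template, cycles: int, passes: int) -> list[str]:
    """(b4) run the (b1)-(b3) pass `passes` times on the same template."""
    return [run_pass(template, cycles) for _ in range(passes)]
```

In this toy model every pass returns the same read because the template is never altered; in practice each pass is an independent measurement of the same molecule.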
[00212] In some embodiments, the sequences of all of the first batch sequencing read products can be determined and aligned with a first reference sequence to confirm the presence of the first sequence of interest. The first reference sequence can be the first batch barcode and/or the first sequence of interest.
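As a hedged illustration of comparing a read product with a reference such as a batch barcode or a sequence of interest, the sketch below uses an ungapped sliding-window comparison with a mismatch budget. The function names and the mismatch threshold are assumptions for this example; a production aligner would also handle insertions and deletions.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def matches_reference(read: str, reference: str, max_mismatches: int = 1) -> bool:
    """True if `reference` occurs somewhere in `read` within the mismatch budget.

    Ungapped comparison only; this is a simplification, not a full aligner.
    """
    n = len(reference)
    return any(
        hamming(read[i:i + n], reference) <= max_mismatches
        for i in range(len(read) - n + 1)
    )
```

For example, a read can first be checked against the batch barcode with `max_mismatches=0` and then against the sequence of interest with a looser budget.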
[00213] In some embodiments, hybridizing the sequencing primers to the concatemer template molecules of step (b1) can be conducted with a hybridization reagent comprising a saline-sodium citrate (SSC) buffer (e.g., 2X SSC) with formamide (e.g., 10-20% formamide).
[00214] In some embodiments, in step (b3) the plurality of first batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising a saline-sodium citrate (SSC) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation, for example 50 - 90 °C.
[00215] In some embodiments, in step (b3) the plurality of first batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization of step (b3) can be conducted at a temperature that promotes nucleic acid denaturation, for example 50 - 90 °C.
[00216] In some embodiments, the methods for sequencing further comprise step (c): sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers thereby generating a plurality of second batch sequencing read products and imaging the same region of the support to detect the sequencing reactions of the second sub-population of template molecules.
[00217] In some embodiments, the sequencing reactions of the first sub-population of template molecules are stopped before initiating the sequencing reactions of the second sub-population of template molecules.
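The serial, non-overlapping nature of batch sequencing can be sketched as follows. This is a simplified model under stated assumptions: the `Template` class, the `primer_site` labels, and the `run_batches` function are hypothetical, and primer hybridization is reduced to an exact label match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str
    primer_site: str  # hypothetical label for the batch sequencing primer binding site


def run_batches(templates: list[Template], batch_primer_sites: list[str]):
    """Sequence one sub-population at a time; batches never overlap."""
    schedule = []
    active = None
    for site in batch_primer_sites:
        # The previous batch must be stopped before the next one starts.
        assert active is None
        active = site
        # Only templates carrying the matching primer binding site are read.
        sequenced = [t.name for t in templates if t.primer_site == site]
        schedule.append((site, sequenced))
        active = None  # stop/block this batch before moving on
    return schedule
```

Templates from different sub-populations can sit side by side in the same imaged region, because only the sub-population whose primer is currently hybridized produces signal.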
[00218] In some embodiments, the sequencing of step (c) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. Exemplary sequencing methods are described in WO2022266470, the contents of which are incorporated by reference in their entirety herein.
[00219] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method. In some embodiments, the first stage generally comprises contacting the second sub-population of template molecules with a plurality of second batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00220] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex. In some embodiments, the nucleic acid duplex comprises a template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00221] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the second sub-population of template molecules. In some embodiments, the second sub-population of template molecules can remain immobilized to the support and the second batch sequencing primers can be retained and can remain hybridized to the second sub-population of template molecules.
[00222] In some embodiments, the second stage of the two-stage sequencing method generally comprises contacting the second sub-population of template molecules and the retained second batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch sequencing primer.
[00223] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides. In some embodiments, the plurality of nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00224] In some embodiments, the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00225] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00226] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of second batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (c) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second batch barcode and/or sequencing at least a portion of the second sample index. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second sequence of interest.
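The relationship between the two stages and a sequencing cycle can be sketched in Python. The sketch assumes, for illustration only, that the first stage identifies the next base without extending the primer, that the second stage extends the primer by exactly one base, and that the template string is written in the order the polymerase reads it. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def two_stage_cycle(template: str, extension: str) -> tuple[str, str]:
    """One sequencing cycle: a detection stage followed by an incorporation stage."""
    position = len(extension)
    # Stage 1: a labeled multivalent molecule binds opposite the next template
    # base and is imaged; nothing is incorporated, so `extension` is unchanged.
    called_base = COMPLEMENT[template[position]]
    # Stage 2: a free nucleotide is incorporated, extending the primer by one base.
    return called_base, extension + called_base


def sequence(template: str, cycles: int) -> tuple[str, str]:
    """Run up to `cycles` two-stage cycles; return (base calls, extended product)."""
    calls = []
    extension = ""
    for _ in range(min(cycles, len(template))):
        base, extension = two_stage_cycle(template, extension)
        calls.append(base)
    return "".join(calls), extension
```

Under these assumptions the number of completed cycles equals the number of bases called, which is why a read of up to 1000 bases corresponds to up to 1000 cycles.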
[00227] In some embodiments, prior to sequencing a subsequent sub-population of template molecules (e.g., after sequencing the second sub-population of template molecules), the plurality of second batch sequencing read products can be removed from the second sub-population of template molecules and the second sub-population of template molecules can be retained on the support using a de-hybridization reagent. In some embodiments, the de-hybridization reagent comprises a saline-sodium citrate (SSC) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation, for example 50 - 90 °C. In some embodiments, the de-hybridization reagent comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization step can be conducted at a temperature that promotes nucleic acid denaturation, for example 50 - 90 °C. In some embodiments, the second batch sequencing read products are not removed from the second sub-population of template molecules.
[00228] In some embodiments, the sequencing reactions of the second sub-population of template molecules are stopped before initiating the sequencing reactions of the subsequent sub-population of template molecules.
[00229] In some embodiments, the methods for sequencing further comprise step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second sub-population of template molecules to generate a plurality of second batch sequencing read products that comprise up to 1000 bases in length. In some embodiments, step (c1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween. In some embodiments, the second batch sequencing read products comprise a second batch barcode sequence. In some embodiments, the second batch sequencing read products comprise a second batch barcode sequence and a sample index sequence. In some embodiments, the second batch sequencing read products comprise a second batch barcode sequence and at least a portion of a second sequence of interest. In some embodiments, the second batch sequencing read products comprise a second batch barcode sequence, a sample index sequence, and at least a portion of a second sequence of interest. In some embodiments, the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, 500 million - 1 billion of the second sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the second sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the second sub-population of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion, between about 1 billion and about 9 billion, between about 2 billion and about 8 billion, between about 3 billion and about 7 billion, between about 4 billion and about 6 billion, or any range therebetween of the second sub-population of concatemer template molecules can be sequenced.
[00230] In some embodiments, the sequencing of step (c1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the reiterative sequencing of step (c1) comprises conducting a two-stage sequencing method described herein.
[00231] In some embodiments, the methods for sequencing further comprise step (c2): stopping and/or blocking the short read sequencing of step (c1). In some embodiments, the stopping/blocking comprises incorporating a chain terminating nucleotide at the 3’ terminal end of the second batch sequencing read products to inhibit further sequencing reactions. Exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
[00232] In some embodiments, the methods for sequencing further comprise step (c3): removing the plurality of second batch sequencing read products from the template molecules of the second sub-population, and retaining the template molecules of the second sub-population. In some embodiments, the second batch sequencing read products can be removed from the template molecules by denaturation using heat and/or a de-hybridization reagent.
[00233] In some embodiments, the methods for sequencing further comprise step (c4): reiteratively sequencing the template molecules of the second sub-population by repeating steps (c1) - (c3) at least once. In some embodiments, the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween, or more than 50 times.
[00234] In some embodiments, the sequences of all of the second batch sequencing read products can be determined and aligned with a second reference sequence to confirm the presence of the second sequence of interest. The second reference sequence can be the second batch barcode and/or the second sequence of interest.
[00235] In some embodiments, hybridizing the sequencing primers to the concatemer template molecules of step (c1) can be conducted with a hybridization reagent comprising a saline-sodium citrate (SSC) buffer (e.g., 2X SSC) with formamide (e.g., 10-20% formamide).
[00236] In some embodiments, in step (c3) the plurality of second batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising a saline-sodium citrate (SSC) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation, for example 50 - 90 °C.
[00237] In some embodiments, in step (c3) the plurality of second batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization of step (c3) can be conducted at a temperature that promotes nucleic acid denaturation, for example 50 - 90 °C.
Re-Seeding a Support with Interrupted Sequencing
[00238] The present disclosure provides methods for re-seeding a support comprising step (a): providing a support comprising a plurality of surface capture primers immobilized to the support. In some embodiments, the plurality of capture primers have the same sequence. In some embodiments, the plurality of capture primers comprise at least two sub-populations of capture primers including at least a first sub-population of capture primers having a first sequence and a second sub-population of capture primers having a second sequence. In some embodiments, the plurality of surface capture primers comprise single-stranded oligonucleotides. In some embodiments, the plurality of surface capture primers can be used to generate concatemer template molecules immobilized to the support. In some embodiments, the density of the plurality of surface capture primers is about 10² - 10¹⁵ per µm², e.g., between about 10¹⁰ and about 10¹⁵ surface capture primers per mm², between about 10⁵ and about 10¹⁵ surface capture primers per mm², between about 10³ and about 10¹⁴ surface capture primers per mm², between about 10⁴ and about 10¹³ surface capture primers per mm², between about 10⁵ and about 10¹² surface capture primers per mm², between about 10⁶ and about 10¹¹ surface capture primers per mm², between about 10⁷ and about 10¹⁰ surface capture primers per mm², or between about 10⁸ and about 10¹⁰ surface capture primers per mm², or any range therebetween.
[00239] In some embodiments, the plurality of surface capture primers can be immobilized to the support at random and non-pre-determined positions. In some embodiments, the plurality of surface capture primers can be immobilized to the support at pre-determined positions (e.g., a patterned support).
[00240] In some embodiments, the support is passivated with at least one polymer layer comprising a plurality of surface capture primers covalently tethered to the at least one polymer layer. In some embodiments, the plurality of surface capture primers are randomly distributed throughout and embedded within the at least one polymer layer.
[00241] In some embodiments, the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment (e.g., immobilization) of the nucleic acid template molecules.
[00242] In some embodiments, the support lacks partitions and/or barriers that would create separate regions of the support.
[00243] In some embodiments, the plurality of surface capture primers are located at predetermined positions on the at least one polymer layer and/or the plurality of surface capture primers are embedded within the at least one polymer layer at pre-determined locations.
[00244] In some embodiments, the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules. In some embodiments, the support includes interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
[00245] In some embodiments, the methods for re-seeding a support further comprise step (b): distributing on the support a first plurality of circularized nucleic acid library molecules under a condition suitable for hybridizing individual circularized library molecules to individual surface capture primers and conducting a rolling circle amplification reaction in a template-dependent manner using individual circularized library molecules in the first plurality as templates, thereby generating a first plurality of nucleic acid concatemer template molecules immobilized to the support. In some embodiments, a subset of the surface capture primers hybridize to individual circularized library molecules to generate a first plurality of concatemer template molecules. In some embodiments, the number of surface capture primers immobilized to the support exceeds the number of the first plurality of circularized nucleic acid library molecules distributed onto the support. In some embodiments, the support comprises up to 500 million of a first plurality of concatemer template molecules immobilized thereon, or up to 1 billion of a first plurality of concatemer template molecules immobilized thereon, or up to 2 billion of a first plurality of concatemer template molecules immobilized thereon, or up to 3 billion of a first plurality of concatemer template molecules immobilized thereon, or up to 4 billion of a first plurality of concatemer template molecules immobilized thereon, or up to 5 billion of a first plurality of concatemer template molecules immobilized thereon, or up to 6 billion of a first plurality of concatemer template molecules immobilized thereon. 
In some embodiments, the support comprises up to 7 billion concatemer template molecules immobilized thereon, or up to 8 billion concatemer template molecules immobilized thereon, or up to 9 billion concatemer template molecules immobilized thereon, or up to 10 billion concatemer template molecules immobilized thereon, or up to 20 billion concatemer template molecules immobilized thereon. In some embodiments, the support comprises between about 500 million and about 20 billion concatemer template molecules immobilized thereon, between about 1 billion and about 10 billion concatemer template molecules immobilized thereon, between about 2 billion and about 9 billion concatemer template molecules immobilized thereon, between about 3 billion and about 8 billion concatemer template molecules immobilized thereon, between about 4 billion and about 7 billion concatemer template molecules immobilized thereon, or between about 5 billion and about 6 billion concatemer template molecules immobilized thereon, or any range therebetween. In some embodiments, individual concatemer template molecules in the first plurality comprise a plurality of tandem copies of a polynucleotide unit. In some embodiments, each polynucleotide unit comprises a sequence of interest and a seeding batch sequencing primer binding site sequence. In some embodiments, the first plurality of circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors. In some embodiments, the first plurality of circularized library molecules comprise a mixture of any combination of circularized padlock probes, linear library molecules circularized using single-stranded splint strands, and/or linear library molecules circularized using double-stranded adaptors. Methods for generating circularized library molecules are described herein.
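The tandem-copy structure of a concatemer produced by rolling circle amplification can be illustrated with a short Python sketch. The sequences, the function names, and the choice to start the product at the first base of the unit are assumptions for this example; in practice the starting point depends on where the surface capture primer hybridizes on the circle.

```python
_COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(_COMPLEMENT_TABLE)[::-1]


def rolling_circle_product(circle: str, copies: int) -> str:
    """Concatemer modeled as tandem copies of the complement of the circular unit."""
    return reverse_complement(circle) * copies


# Hypothetical unit: a primer binding site followed by a sequence of interest.
primer_site = "AAGC"
sequence_of_interest = "TTTG"
concatemer = rolling_circle_product(primer_site + sequence_of_interest, copies=3)
```

Because every tandem copy carries its own primer binding site, a single concatemer in this model offers one hybridization site per copy for the sequencing primers.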
[00246] In some embodiments, in the methods for re-seeding a support of step (b), individual circularized library molecules in the first plurality comprise a sequence of interest, a seeding batch sequencing primer binding site sequence which corresponds to the sequence of interest, and a surface capture primer binding site. In some embodiments, a pre-determined first seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the first plurality of circularized library molecules, thus the pre-determined first seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the first plurality of circularized library molecules. In some embodiments, a predetermined first seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a first plurality of circularized library molecules.
[00247] In some embodiments, individual circularized library molecules in the first plurality further comprise a seeding batch barcode sequence which corresponds to the sequence of interest. In some embodiments, a pre-determined first seeding batch barcode sequence can be linked to a given sequence of interest in the first plurality of circularized library molecules, thus the pre-determined first seeding batch barcode sequence corresponds to a given sequence of interest in the first plurality of circularized library molecules. In some embodiments, a pre-determined first seeding batch barcode sequence can be linked to different sequences of interest in a first plurality of circularized library molecules.
[00248] In some embodiments, individual circularized library molecules in the first plurality comprise a sequence of interest and an identical seeding batch sequencing primer binding site sequence which corresponds to the sequence of interest. In some embodiments, individual circularized library molecules further comprise a surface capture primer binding site and a first seeding batch barcode sequence which corresponds to the sequence of interest.
[00249] In some embodiments, the sequences of interest in the first plurality of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00250] In some embodiments, the concentration of the first plurality of circularized nucleic acid library molecules that are distributed onto the support can be about 1-5 pM, or about 5-10 pM, or about 10-50 pM, or any range therebetween.
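To relate a loading concentration in pM to an absolute number of library molecules, the standard conversion N = c × V × N_A can be used. The sketch below is a worked illustration only; the 100 µL loading volume is an assumed value that does not come from this disclosure, and the calculation counts molecules delivered to the support, not molecules actually captured.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_loaded(conc_pM: float, volume_uL: float) -> float:
    """Number of molecules in `volume_uL` microliters at `conc_pM` picomolar."""
    moles = (conc_pM * 1e-12) * (volume_uL * 1e-6)
    return moles * AVOGADRO


def concentration_for_target(target_molecules: float, volume_uL: float) -> float:
    """Concentration in pM needed to deliver `target_molecules` in `volume_uL`."""
    moles = target_molecules / AVOGADRO
    return moles / (volume_uL * 1e-6) / 1e-12


# Assumed loading volume of 100 uL (hypothetical, for illustration only).
n = molecules_loaded(conc_pM=10, volume_uL=100)  # about 6.0e8 molecules
```

Under this assumed volume, 10 pM corresponds to roughly 600 million molecules, which shows how titrating the library concentration scales the number of molecules available for seeding.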
[00251] In some embodiments, in the methods for re-seeding a support of step (b), the first plurality of circularized nucleic acid library molecules comprise a plurality of sub-populations of circularized library molecules including at least a first and second sub-population of circularized library molecules.
[00252] In some embodiments, individual circularized library molecules in the first sub-population comprise the same first sub-population seeding batch sequencing primer binding site sequences. In some embodiments, individual circularized library molecules in the first sub-population have the same sequence of interest or different sequences of interest. In some embodiments, the first sub-population seeding batch sequencing primer binding site sequence corresponds to the first sequence of interest, or the first sub-population seeding batch sequencing primer binding site sequence corresponds to one of the sequences of interest in the first sub-population. In some embodiments, a pre-determined first sub-population seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the first sub-population of circularized library molecules, thus the pre-determined first sub-population seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the first sub-population of circularized library molecules. In some embodiments, a pre-determined first sub-population seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a first sub-population of circularized library molecules.
[00253] In some embodiments, individual circularized library molecules in the first sub-population further comprise a first sub-population seeding batch barcode sequence which corresponds to the first sequence of interest. In some embodiments, the first sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the first sub-population. In some embodiments, a pre-determined first sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the first sub-population of circularized library molecules, thus the pre-determined first sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the first sub-population of circularized library molecules. In some embodiments, a pre-determined first sub-population seeding batch barcode sequence can be linked to different sequences of interest in a first sub-population of circularized library molecules.
[00254] In some embodiments, individual circularized library molecules in the first sub-population further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, individual circularized library molecules in the first sub-population further comprise a surface capture primer binding site. In some embodiments, individual circularized library molecules in the first sub-population further comprise a surface pinning primer binding site. In some embodiments, individual circularized library molecules in the first sub-population further comprise a compaction oligonucleotide binding site.
[00255] In some embodiments, the sequences of interest in the first sub-population of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00256] In some embodiments, in the methods for re-seeding a support of step (b), the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner, using individual circularized library molecules in the first sub-population, thereby generating a plurality of first sub-population concatemer template molecules immobilized to the support. In some embodiments, a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of first sub-population concatemer template molecules.
[00257] In some embodiments, the first sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions on the support, or at pre-determined positions on the support (e.g., patterned support).
[00258] In some embodiments, in the methods for re-seeding a support of step (b), individual circularized library molecules in the second sub-population comprise the same second sub-population seeding batch sequencing primer binding site sequence and have the same sequence of interest or different sequences of interest. In some embodiments, the second sub-population seeding batch sequencing primer binding site sequence corresponds to the second sequence of interest. In some embodiments, the second sub-population seeding batch sequencing primer binding site sequence corresponds to one of the sequences of interest in the second sub-population. In some embodiments, a pre-determined second sub-population seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the second sub-population of circularized library molecules, thus the pre-determined second sub-population seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the second sub-population of circularized library molecules. In some embodiments, a pre-determined second sub-population seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a second sub-population of circularized library molecules.
[00259] In some embodiments, individual circularized library molecules in the second sub-population further comprise a second sub-population seeding batch barcode sequence which corresponds to the second sequence of interest, or the second sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the second sub-population. In some embodiments, a pre-determined second sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the second sub-population of circularized library molecules, thus the pre-determined second sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the second sub-population of circularized library molecules. In some embodiments, a pre-determined second sub-population seeding batch barcode sequence can be linked to different sequences of interest in a second sub-population of circularized library molecules.
[00260] In some embodiments, individual circularized library molecules in the second sub-population further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, individual circularized library molecules in the second sub-population further comprise a surface capture primer binding site. In some embodiments, individual circularized library molecules in the second sub-population further comprise a surface pinning primer binding site. In some embodiments, individual circularized library molecules in the second sub-population further comprise a compaction oligonucleotide binding site.
[00261] In some embodiments, the sequences of interest in the second sub-population of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00262] In some embodiments, the first sub-population seeding batch sequencing primer binding site sequence and second sub-population seeding batch sequencing primer binding site sequence have different sequences.
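As an illustration only, the relationship between batch-specific sequencing primer binding sites and sub-populations can be sketched in Python. The class name, field names, and placeholder site labels below are hypothetical and are not part of the disclosure. The sketch assumes that a batch sequencing primer selects only those library molecules that carry the matching binding site, so that two sub-populations seeded together can be addressed separately.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LibraryMolecule:
    """Hypothetical record for one circularized library molecule."""
    batch_primer_site: str               # label for the batch sequencing primer binding site
    sequence_of_interest: str            # insert sequence (toy value)
    batch_barcode: Optional[str] = None  # optional seeding batch barcode
    sample_index: Optional[str] = None   # optional sample index


def select_batch(molecules: List[LibraryMolecule], batch_primer_site: str) -> List[LibraryMolecule]:
    """Return the molecules that a given batch sequencing primer could address."""
    return [m for m in molecules if m.batch_primer_site == batch_primer_site]


# Toy library: two sub-populations with different binding sites (labels are placeholders).
library = [
    LibraryMolecule("SITE_1", "ACGTACGT", batch_barcode="BC1", sample_index="S1"),
    LibraryMolecule("SITE_1", "TTGACCGG", batch_barcode="BC1", sample_index="S2"),
    LibraryMolecule("SITE_2", "GGCATTAC", batch_barcode="BC2", sample_index="S1"),
]

first_batch = select_batch(library, "SITE_1")
second_batch = select_batch(library, "SITE_2")
```

Because the two binding site labels differ, each call returns a disjoint subset of the library, which mirrors the requirement that the first and second sub-population binding site sequences differ.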
[00263] In some embodiments, in the methods for re-seeding a support of step (b), the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner, using individual circularized library molecules in the second sub-population, thereby generating a plurality of second sub-population concatemer template molecules immobilized to the support. In some embodiments, a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of second sub-population concatemer template molecules.
[00264] In some embodiments, the second sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions on the support, or at pre-determined positions on the support (e.g., patterned support).
[00265] In some embodiments, in the methods for re-seeding a support of step (b), the rolling circle amplification reaction comprises contacting the primed circularized library molecules with a plurality of strand-displacing polymerases and a plurality of nucleotides which include dATP, dCTP, dGTP, and dTTP.
[00266] In some embodiments, the plurality of nucleotides further comprises a plurality of nucleotides having a scissile moiety (e.g., uracil).
[00267] In some embodiments, the rolling circle amplification reaction of step (b) can be conducted in the presence of a plurality of compaction oligonucleotides. In some embodiments, the rolling circle amplification reaction of step (b) can be conducted in the absence of a plurality of compaction oligonucleotides. In some embodiments, individual compaction oligonucleotides can hybridize to two different locations on the same template molecule to pull together distal portions of the template molecule, causing compaction of the template molecule to form a DNA nanoball.
[00268] In some embodiments, the methods for re-seeding a support further comprise step (c): sequencing at least a subset of the first plurality of concatemer template molecules, thereby generating a first plurality of sequencing read products. In some embodiments, the sequencing of step (c) comprises imaging a region of the support to detect the sequencing reactions of the first plurality of concatemer template molecules.
[00269] In some embodiments, the concatemer template molecules immobilized to the support in the first plurality are sequenced. For example, at least 30-50%, or at least 50-70%, or at least 70-90% of the concatemer template molecules in the first plurality are sequenced. In some embodiments, 500 million - 1 billion of the first plurality of concatemer template molecules can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the first plurality of concatemer template molecules can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the first plurality of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of concatemer template molecules of the first plurality of concatemer template molecules can be sequenced.
[00270] In some embodiments, the full length of the concatemer template molecules in the first plurality are sequenced. In some embodiments, a partial length of the concatemer template molecules in the first plurality are sequenced.
[00271] In some embodiments, the sequencing of step (c) comprises hybridizing sequencing primers to sequencing primer binding sites on the first plurality of concatemer template molecules immobilized to the support and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, the concatemer template molecules in the first plurality can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00272] In some embodiments, a partial length of the concatemer template molecules in the first plurality are reiteratively sequenced.
[00273] In some embodiments, in the methods for re-seeding a support of step (c), a first sub-population of the concatemer template molecules in the first plurality are sequenced using the first batch sequencing primer binding sites in the first sub-population of concatemer template molecules.
[00274] In some embodiments, the full length of the concatemer template molecules in the first sub-population are sequenced. In some embodiments, a partial length of the concatemer template molecules in the first sub-population are sequenced.
[00275] In some embodiments, the sequencing of step (c) comprises hybridizing sequencing primers to sequencing primer binding sites on the first sub-population of the first plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, the concatemer template molecules in the first sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00276] In some embodiments, a partial length of the concatemer template molecules in the first sub-population are reiteratively sequenced.
[00277] In some embodiments, the sequencing of step (c) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00278] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method. In some embodiments, the first stage generally comprises contacting the first sub-population of template molecules in the first plurality with a plurality of first batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00279] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00280] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the first sub-population of template molecules in the first plurality. In some embodiments, the first sub-population of template molecules in the first plurality can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of template molecules in the first plurality.
[00281] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the first sub-population of template molecules in the first plurality and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primer.
[00282] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides. In some embodiments, the plurality of nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00283] In some embodiments, the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00284] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00285] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of first batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (c) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
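The cycle structure of the two-stage sequencing method can be illustrated with a toy Python model. This is a sketch under simplifying assumptions and not the disclosed chemistry: the function name is hypothetical, the template is given as the bases encountered downstream of the hybridized batch sequencing primer, base identification in the first stage is modeled as a perfect complementary call, and the second stage is assumed to extend the primer by exactly one base per cycle (as would be the case for the embodiment using chain terminating nucleotides).

```python
# Watson-Crick complement used to model which nucleotide unit binds a template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def two_stage_sequencing(template: str, cycles: int) -> str:
    """Return toy base calls from repeated two-stage sequencing cycles.

    template: bases in the order the polymerase encounters them (assumption).
    cycles:   number of requested cycles; one cycle = first stage + second stage.
    """
    extended = 0  # number of bases by which the sequencing primer has been extended
    calls = []
    for _ in range(min(cycles, len(template))):
        # First stage: a labeled multivalent molecule whose nucleotide unit
        # complements the next template base binds the complexed polymerase
        # without incorporation, and the signal is imaged.
        calls.append(COMPLEMENT[template[extended]])
        # Multivalent molecules and first-stage polymerases are then removed,
        # while the primer stays hybridized to the template.
        # Second stage: a free nucleotide is incorporated, extending the
        # primer by one base (assumed single-base stepping).
        extended += 1
    return "".join(calls)


read = two_stage_sequencing("ACGTTGCA", cycles=5)
```

In this model the number of base calls equals the number of completed cycles, capped by the template length, which reflects the statement that one sequencing cycle comprises completion of a first and a second stage.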
[00286] In some embodiments, in the methods for re-seeding a support of step (c), a second sub-population of concatemer template molecules in the first plurality are sequenced using the second batch sequencing primer binding sites in the second sub-population of concatemer template molecules.
[00287] In some embodiments, the full length of the concatemer template molecules in the second sub-population are sequenced. In some embodiments, a partial length of the concatemer template molecules in the second sub-population are sequenced.
[00288] In some embodiments, the sequencing of step (c) comprises hybridizing sequencing primers to sequencing primer binding sites on the second sub-population of the first plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, the concatemer template molecules in the second sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00289] In some embodiments, a partial length of the concatemer template molecules in the second sub-population are reiteratively sequenced.
[00290] In some embodiments, the sequencing of step (c) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00291] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method. In some embodiments, the first stage comprises contacting the second sub-population of template molecules in the first plurality with a plurality of second batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00292] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00293] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the second sub-population of template molecules in the first plurality. In some embodiments, the second sub-population of template molecules in the first plurality can remain immobilized to the support and the second batch sequencing primers can be retained and can remain hybridized to the second sub-population of template molecules in the first plurality.
[00294] In some embodiments, the second stage of the two-stage sequencing method generally comprises contacting the second sub-population of template molecules in the first plurality and the retained second batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch sequencing primer.
[00295] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00296] In some embodiments, the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00297] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00298] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of second batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (c) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00299] In some embodiments, the methods for re-seeding a support further comprise reiteratively sequencing the first sub-population of the first plurality of concatemer template molecules, which comprises step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of concatemer template molecules to generate a plurality of first sub-population batch sequencing read products that comprise up to 1000 bases in length. In some embodiments, step (c1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00300] In some embodiments, the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence.
[00301] In some embodiments, the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence and the sample index sequence.
[00302] In some embodiments, the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence and at least a portion of the first sequence of interest.
[00303] In some embodiments, the first sub-population batch sequencing read products comprise the first sub-population seeding batch barcode sequence, the sample index sequence, and at least a portion of the first sequence of interest.
[00304] In some embodiments, in step (c1), the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on the first sub-population of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, 500 million - 1 billion of the first sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the first sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the first sub-population of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of the first sub-population of concatemer template molecules can be sequenced.
[00305] In some embodiments, the sequencing of step (c1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the reiterative sequencing of step (c1) comprises conducting a two-stage sequencing method described herein.
[00306] In some embodiments, the methods for re-seeding a support further comprise step (c2): stopping and/or blocking the short read sequencing of step (c1). In some embodiments, the stopping and/or blocking comprises incorporating a chain terminating nucleotide at the 3’ terminal end of the first sub-population batch sequencing read products to inhibit further sequencing reactions. Exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
[00307] In some embodiments, the methods for re-seeding a support further comprise step (c3): removing the plurality of first sub-population batch sequencing read products and retaining the concatemer template molecules of the first sub-population. In some embodiments, step (c3) is optional. In some embodiments, the first sub-population batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a de-hybridization reagent.
[00308] In some embodiments, the methods for re-seeding a support further comprise step (c4): reiteratively sequencing the concatemer template molecules of the first sub-population by repeating steps (c1) - (c3) at least once. In some embodiments, the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times or more.
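As a purely illustrative sketch, and not part of the disclosed methods, the reiterative loop of steps (c1) - (c3) can be modeled in Python. The function name, the toy template, and the use of a simple base-complement rule in place of real sequencing chemistry are assumptions made for illustration only; strand orientation is ignored.

```python
# Hypothetical model: a template is a string of bases, and a "read product"
# is the complementary string synthesized during one short read.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reiterative_sequencing(template: str, cycles: int, repeats: int) -> list[str]:
    """Run one initial pass of steps (c1)-(c3), then repeat it `repeats` times."""
    collected_reads = []
    for _ in range(1 + repeats):
        # Step (c1): short read, one base per cycle, up to `cycles` bases.
        read = "".join(COMPLEMENT[base] for base in template[:cycles])
        # Step (c2): stop/block further extension; modeled here as a flag.
        read_product = {"sequence": read, "terminated": True}
        collected_reads.append(read_product["sequence"])
        # Step (c3): the read product is removed (denatured) while the
        # template string itself is retained unchanged for the next pass.
        del read_product
    return collected_reads


reads = reiterative_sequencing("ACGTAC", cycles=4, repeats=2)
```

In this sketch the template is never modified, which mirrors the retention of the concatemer template molecules between passes; each pass yields the same short read.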
[00309] In some embodiments, the sequences of the first sub-population batch sequencing read products can be determined and aligned with a first reference sequence to confirm the presence of the first sequence of interest. The first reference sequence can be the first sub-population seeding batch barcode and/or the first sequence of interest.
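A minimal sketch of such a confirmation step is given below, assuming exact string matching instead of a real alignment algorithm. The read layout (batch barcode first, followed by a portion of the sequence of interest), the function name, the example sequences, and the `min_overlap` threshold are hypothetical and are not taken from this disclosure.

```python
def confirm_read(read: str, batch_barcode: str, reference: str,
                 min_overlap: int = 8) -> bool:
    """Return True if the read carries the expected batch barcode and at
    least `min_overlap` contiguous bases that occur in the reference."""
    # Assumed layout: the batch barcode is the first part of the read.
    if not read.startswith(batch_barcode):
        return False
    insert = read[len(batch_barcode):]
    if len(insert) < min_overlap:
        return False
    # Slide a window over the insert and look for an exact match in the
    # reference (a stand-in for alignment, which tolerates mismatches).
    return any(
        insert[i:i + min_overlap] in reference
        for i in range(len(insert) - min_overlap + 1)
    )


reference = "ACGTTGCAAGGCTTAACG"   # hypothetical first sequence of interest
barcode = "GATTACA"                # hypothetical seeding batch barcode
ok = confirm_read(barcode + "TGCAAGGCTT", barcode, reference)
```

A production pipeline would typically use a dedicated aligner that tolerates sequencing errors; exact matching is used here only to keep the example self-contained.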
[00310] In some embodiments, the methods for re-seeding a support further comprise reiteratively sequencing the second sub-population of concatemer template molecules in a manner similar to steps (c1) - (c4) as described above for the first sub-population of concatemer template molecules.
[00311] In some embodiments, hybridizing the sequencing primers to the concatemer template molecules of any of steps (c1) can be conducted with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate) with formamide (e.g., 10-20% formamide).
[00312] In some embodiments, in step (c3) the plurality of first sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising a saline-sodium citrate (SSC) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
[00313] In some embodiments, in step (c3) the plurality of first sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization of step (c3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
[00314] In some embodiments, the methods for re-seeding a support further comprise step (d): distributing on the support a second plurality of circularized nucleic acid library molecules under a condition suitable for hybridizing individual circularized library molecules to individual surface capture primers and conducting a second rolling circle amplification reaction, in a template-dependent manner using individual circularized library molecules in the second plurality as templates, thereby generating a second plurality of nucleic acid concatemer template molecules immobilized to the support. In some embodiments, the support comprises up to 500 million of a second plurality of concatemer template molecules immobilized thereon, or up to 1 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 2 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 3 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 4 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 5 billion of a second plurality of concatemer template molecules immobilized thereon, or up to 6 billion of a second plurality of concatemer template molecules immobilized thereon. In some embodiments, the support comprises up to 7 billion concatemer template molecules immobilized thereon, or up to 8 billion concatemer template molecules immobilized thereon, or up to 9 billion concatemer template molecules immobilized thereon, or up to 10 billion concatemer template molecules immobilized thereon, or up to 20 billion concatemer template molecules immobilized thereon. In some embodiments, the support comprises between about 500 million and about 20 billion concatemer template molecules immobilized thereon, between about 1 billion and about 10 billion concatemer template molecules immobilized thereon, between about 2 billion and about 9 billion concatemer template molecules immobilized thereon, between about 3 billion and about 8 billion concatemer template molecules immobilized thereon, between about 4 billion and about 7 billion concatemer template molecules immobilized thereon, or between about 5 billion and about 6 billion concatemer template molecules immobilized thereon, or any range therebetween. In some embodiments, individual concatemer template molecules in the second plurality comprise a plurality of tandem copies of a polynucleotide unit, where each polynucleotide unit comprises a sequence of interest and a seeding batch sequencing primer binding site sequence. In some embodiments, the first plurality of concatemer template molecules of step (c) can be completely sequenced or the sequencing can be interrupted at any time prior to distributing the second plurality of circularized nucleic acid library molecules onto the support of step (d). In some embodiments, the second plurality of circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors. In some embodiments, the second plurality of circularized library molecules comprise a mixture of any combination of circularized padlock probes, linear library molecules circularized using single-stranded splint strands, and/or linear library molecules circularized using double-stranded adaptors. Methods for generating circularized library molecules are described herein.
[00315] In some embodiments, in the methods for re-seeding the support of step (d), individual circularized library molecules in the second plurality comprise a sequence of interest, a seeding batch sequencing primer binding site sequence which corresponds to the sequence of interest, and a surface capture primer binding site. In some embodiments, a pre-determined second seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the second plurality of circularized library molecules. In some embodiments, a pre-determined second seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a second plurality of circularized library molecules, thus the pre-determined second seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the second plurality of circularized library molecules.
[00316] In some embodiments, individual circularized library molecules in the second plurality further comprise a seeding batch barcode sequence which corresponds to the sequence of interest.
[00317] In some embodiments, a pre-determined second seeding batch barcode sequence can be linked to a given sequence of interest in the second plurality of circularized library molecules, thus the pre-determined second seeding batch barcode sequence corresponds to a given sequence of interest in the second plurality of circularized library molecules. In some embodiments, a pre-determined second seeding batch barcode sequence can be linked to different sequences of interest in a second plurality of circularized library molecules.
[00318] In some embodiments, individual circularized library molecules in the second plurality comprise a sequence of interest, the same seeding batch sequencing primer binding site sequence which corresponds to the sequence of interest, and individual circularized library molecules further comprise a surface capture primer binding site, and a second seeding batch barcode sequence which corresponds to the sequence of interest.
[00319] In some embodiments, the sequences of interest in the second plurality of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00320] In some embodiments, the concentration of the second plurality of circularized nucleic acid library molecules that are distributed onto the support can be about 1-5 pM, or about 5-10 pM, or about 10-50 pM, or any range therebetween.
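For orientation only, a loading concentration can be converted to an approximate number of library molecules delivered to the support using the Avogadro constant. The loading volume used below (100 µL) is an assumed value that is not stated in this disclosure, and the result is an upper bound, since not every distributed molecule necessarily hybridizes to a surface capture primer.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_loaded(conc_pM: float, volume_uL: float) -> float:
    """Number of molecules in `volume_uL` microliters at `conc_pM` picomolar."""
    moles_per_liter = conc_pM * 1e-12   # pM -> mol/L
    liters = volume_uL * 1e-6           # uL -> L
    return moles_per_liter * liters * AVOGADRO


# Assumed example: 10 pM library in a hypothetical 100 uL loading volume.
n_molecules = molecules_loaded(conc_pM=10, volume_uL=100)  # about 6.0e8
```

Under these assumptions, 10 pM in 100 µL corresponds to roughly 600 million molecules, which is the same order of magnitude as the template counts recited above.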
[00321] In some embodiments, in the methods for re-seeding a support of step (d), the second plurality of circularized nucleic acid library molecules comprise a plurality of sub-populations of circularized library molecules including at least a third and fourth sub-population of circularized library molecules.
[00322] In some embodiments, individual circularized library molecules in the third sub-population comprise the same third sub-population seeding batch sequencing primer binding site sequence and have the same sequence of interest. In some embodiments, individual circularized library molecules in the third sub-population comprise the same third sub-population seeding batch sequencing primer binding site sequence and have different sequences of interest. In some embodiments, the third sub-population seeding batch sequencing primer binding site sequence corresponds to the third sequence of interest, or the third sub-population seeding batch sequencing primer binding site sequence corresponds to one of the sequences of interest in the third sub-population. In some embodiments, a pre-determined third sub-population seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the third sub-population of circularized library molecules, thus the pre-determined third sub-population seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the third sub-population of circularized library molecules. In some embodiments, a pre-determined third sub-population seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a third sub-population of circularized library molecules.
[00323] In some embodiments, individual circularized library molecules in the third subpopulation further comprise a third sub-population seeding batch barcode sequence which corresponds to the third sequence of interest, or the third sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the third sub-population. In some embodiments, a pre-determined third sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the third sub-population of circularized library molecules, thus the pre-determined third sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the third sub-population of circularized library molecules. In some embodiments, a pre-determined third sub-population seeding batch barcode sequence can be linked to different sequences of interest in a third sub-population of circularized library molecules.
[00324] In some embodiments, individual circularized library molecules in the third subpopulation further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, individual circularized library molecules in the third sub-population further comprise a surface capture primer binding site. In some embodiments, individual circularized library molecules in the third sub-population further comprise a surface pinning primer binding site. In some embodiments, individual circularized library molecules in the third subpopulation further comprise a compaction oligonucleotide binding site.
[00325] In some embodiments, the sequences of interest in the third sub-population of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00326] In some embodiments, in the methods for re-seeding a support of step (d), the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner using individual circularized library molecules in the third sub-population, thereby generating a plurality of third sub-population concatemer template molecules immobilized to the support. In some embodiments, a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of third sub-population concatemer template molecules.
[00327] In some embodiments, the third sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions, or at predetermined positions (e.g., patterned support).
[00328] In some embodiments, in the methods for re-seeding a support of step (d), individual circularized library molecules in the fourth sub-population comprise the same fourth sub-population seeding batch sequencing primer binding site sequence and have the same sequence of interest or different sequences of interest. In some embodiments, the fourth sub-population seeding batch sequencing primer binding site sequence corresponds to the fourth sequence of interest, or the fourth sub-population seeding batch sequencing primer binding site sequence corresponds to one of the sequences of interest in the fourth subpopulation. In some embodiments, a pre-determined fourth sub-population seeding batch sequencing primer binding site sequence can be linked to a given sequence of interest in the fourth sub-population of circularized library molecules, thus the pre-determined fourth subpopulation seeding batch sequencing primer binding site sequence corresponds to a given sequence of interest in the fourth sub-population of circularized library molecules. In some embodiments, a pre-determined fourth sub-population seeding batch sequencing primer binding site sequence can be linked to different sequences of interest in a fourth subpopulation of circularized library molecules.
[00329] In some embodiments, individual circularized library molecules in the fourth sub-population further comprise a fourth sub-population seeding batch barcode sequence which corresponds to the fourth sequence of interest, or the fourth sub-population seeding batch barcode sequence corresponds to one of the sequences of interest in the fourth sub-population. In some embodiments, a pre-determined fourth sub-population seeding batch barcode sequence can be linked to a given sequence of interest in the fourth sub-population of circularized library molecules, thus the pre-determined fourth sub-population seeding batch barcode sequence corresponds to a given sequence of interest in the fourth sub-population of circularized library molecules. In some embodiments, a pre-determined fourth sub-population seeding batch barcode sequence can be linked to different sequences of interest in a fourth sub-population of circularized library molecules.
[00330] In some embodiments, individual circularized library molecules in the fourth sub-population further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, individual circularized library molecules in the fourth sub-population further comprise a surface capture primer binding site. In some embodiments, individual circularized library molecules in the fourth sub-population further comprise a surface pinning primer binding site. In some embodiments, individual circularized library molecules in the fourth sub-population further comprise a compaction oligonucleotide binding site.
[00331] In some embodiments, the sequences of interest in the fourth sub-population of circularized nucleic acid library molecules are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00332] In some embodiments, the third sub-population seeding batch sequencing primer binding site sequence and fourth sub-population seeding batch sequencing primer binding site sequence have different sequences.
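The practical consequence of distinct batch sequencing primer binding sites can be illustrated with a short sketch: a given batch sequencing primer primes only the sub-population that carries its matching binding site. The record layout and the site labels `SP3` and `SP4` are hypothetical placeholders, not sequences or names from this disclosure.

```python
def select_for_batch(templates: list[dict], primer_site: str) -> list[dict]:
    """Return the templates whose binding site matches the batch primer."""
    return [t for t in templates if t["primer_site"] == primer_site]


# Toy support carrying two co-immobilized sub-populations.
support = [
    {"id": 1, "primer_site": "SP3"},  # third sub-population
    {"id": 2, "primer_site": "SP4"},  # fourth sub-population
    {"id": 3, "primer_site": "SP3"},  # third sub-population
]

third_batch = select_for_batch(support, "SP3")   # templates 1 and 3
fourth_batch = select_for_batch(support, "SP4")  # template 2
```

Because each sub-population is addressed by its own primer, the two batches can be sequenced separately even though their templates share one support.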
[00333] In some embodiments, in the methods for re-seeding a support of step (d), the method comprises conducting a rolling circle amplification reaction, in a template-dependent manner using individual circularized library molecules in the fourth sub-population, thereby generating a plurality of fourth sub-population concatemer template molecules immobilized to the support. In some embodiments, a subset of the surface capture primers hybridize to individual circularized library molecules to generate the plurality of fourth sub-population concatemer template molecules.
[00334] In some embodiments, the fourth sub-population concatemer template molecules can be immobilized to the support at random and non-predetermined positions, or at predetermined positions (e.g., patterned support).
[00335] In some embodiments, in the methods for re-seeding a support of step (d), the rolling circle amplification reaction comprises contacting the primed circularized library molecules with a plurality of strand displacing polymerases and a plurality of nucleotides which include dATP, dCTP, dGTP, and dTTP.
[00336] In some embodiments, the plurality of nucleotides further comprises a plurality of nucleotides having a scissile moiety (e.g., uracil).
[00337] In some embodiments, the rolling circle amplification reaction of step (d) can be conducted in the presence, or in the absence, of a plurality of compaction oligonucleotides. In some embodiments, individual compaction oligonucleotides can hybridize to two different locations on the same template molecule to pull together distal portions of the template molecule, causing compaction of the template molecule to form a DNA nanoball.
[00338] In some embodiments, the methods for re-seeding a support further comprise step (e): sequencing at least a subset of the second plurality of immobilized concatemer template molecules thereby generating a second plurality of sequencing read products. In some embodiments, the sequencing of step (e) comprises imaging a region of the support to detect the sequencing reactions of the second plurality of template molecules. In some embodiments, the same region of the support is sequenced in steps (c) and (e). In some embodiments, different regions of the support are sequenced in steps (c) and (e).
[00339] In some embodiments, the concatemer template molecules in the second plurality are sequenced. For example, at least 30-50%, or at least 50-70%, or at least 70-90% of the concatemer template molecules in the second plurality are sequenced. In some embodiments, 500 million - 1 billion of the second plurality of concatemer template molecules can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the second plurality of concatemer template molecules can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the second plurality of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of concatemer template molecules of the second plurality of concatemer template molecules can be sequenced.
[00340] In some embodiments, the full length of the concatemer template molecules in the second plurality are sequenced. In some embodiments, a partial length of the concatemer template molecules in the second plurality are sequenced.
[00341] In some embodiments, the sequencing of step (e) comprises hybridizing sequencing primers to sequencing primer binding sites on the second plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, the concatemer template molecules in the second plurality can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00342] In some embodiments, a partial length of the concatemer template molecules in the second plurality are reiteratively sequenced.
[00343] In some embodiments, in the methods for re-seeding a support of step (e), the third sub-population of the concatemer template molecules in the second plurality are sequenced using the third batch sequencing primer binding sites in the third sub-population of concatemer template molecules.
[00344] In some embodiments, the full length of the concatemer template molecules in the third sub-population are sequenced. In some embodiments, a partial length of the concatemer template molecules in the third sub-population are sequenced.
[00345] In some embodiments, the sequencing of step (e) comprises hybridizing sequencing primers to sequencing primer binding sites on the third sub-population of the second plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, the immobilized concatemer template molecules in the third sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00346] In some embodiments, a partial length of the concatemer template molecules in the third sub-population are reiteratively sequenced.
[00347] In some embodiments, the sequencing of step (e) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00348] In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method. In some embodiments, the first stage generally comprises contacting the third sub-population of template molecules in the second plurality with a plurality of third batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00349] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00350] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo-damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the third sub-population of template molecules in the second plurality. In some embodiments, the third sub-population of template molecules in the second plurality can remain immobilized to the support and the third batch sequencing primers can be retained and can remain hybridized to the third sub-population of template molecules in the second plurality.
[00351] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the third sub-population of template molecules in the second plurality and the retained third batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the third batch sequencing primer.
[00352] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00353] In some embodiments, the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00354] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00355] In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of third batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
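The cycle structure of the two-stage method can be sketched as a simple loop in which the first stage yields a base call without extending the primer and the second stage extends the primer by one base. This is a simplified conceptual model under stated assumptions: the function name is hypothetical, base identity is derived from a complement lookup in place of fluorescence imaging, and reagent chemistry (trapping, imaging, stepping) is not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def two_stage_sequencing(template: str, cycles: int) -> tuple[str, int]:
    """Return (called bases, number of bases added to the primer)."""
    base_calls = []
    primer_extension = 0
    for position in range(min(cycles, len(template))):
        # First stage: a labeled multivalent molecule binds the complexed
        # polymerase opposite the template base and is detected; nothing is
        # incorporated, so the primer length does not change here.
        base_calls.append(COMPLEMENT[template[position]])
        # Second stage: a free nucleotide is incorporated, advancing the
        # primer by one base so the next cycle reads the next position.
        primer_extension += 1
    # One completed first stage plus one completed second stage = one cycle.
    return "".join(base_calls), primer_extension


calls, extended = two_stage_sequencing("ACGTAC", cycles=3)
```

Separating detection (first stage) from extension (second stage) is the design point the sketch is meant to convey; the number of completed cycles equals the number of bases called.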
[00356] In some embodiments, in the methods for re-seeding a support of step (e), the fourth sub-population of the concatemer template molecules in the second plurality are sequenced using the fourth batch sequencing primer binding sites in the fourth sub-population of concatemer template molecules.
[00357] In some embodiments, the full length of the concatemer template molecules in the fourth sub-population are sequenced. In some embodiments, a partial length of the concatemer template molecules in the fourth sub-population are sequenced.
[00358] In some embodiments, the sequencing of step (e) comprises hybridizing sequencing primers to sequencing primer binding sites on the fourth sub-population of the second plurality of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, the concatemer template molecules in the fourth sub-population can be subjected to 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00359] In some embodiments, a partial length of the concatemer template molecules in the fourth sub-population are reiteratively sequenced.
[00360] In some embodiments, the sequencing of step (e) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00361] In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method. In some embodiments, the first stage comprises contacting the fourth sub-population of template molecules in the second plurality with a plurality of fourth batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00362] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00363] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled
multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the fourth sub-population of template molecules in the second plurality. In some embodiments, the fourth sub-population of template molecules in the second plurality can remain immobilized to the support and the fourth batch sequencing primers can be retained and can remain hybridized to the fourth subpopulation of template molecules in the second plurality.
[00364] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the fourth sub-population of template molecules in the second plurality and the retained fourth batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the fourth batch sequencing primer.
[00365] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00366] In some embodiments, the nucleotides comprise chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00367] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00368] In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of fourth batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00369] In some embodiments, the methods for re-seeding a support further comprise reiteratively sequencing the third sub-population of concatemer template molecules, which comprises step (e1): conducting short read sequencing by performing up to 1000 sequencing cycles of the third sub-population of the second plurality of concatemer template molecules to generate a plurality of second sub-population batch sequencing read products that comprise up to 1000 bases in length. In some embodiments, step (e1) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00370] In some embodiments, the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence.
[00371] In some embodiments, the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence and the sample index sequence.
[00372] In some embodiments, the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence and at least a portion of the second sequence of interest.
[00373] In some embodiments, the third sub-population batch sequencing read products comprise the third sub-population seeding batch barcode sequence, the sample index sequence, and at least a portion of the second sequence of interest.
[00374] In some embodiments, in step (e1), the short read sequencing comprises hybridizing sequencing primers to sequencing primer binding sites on the third sub-population of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents. In some embodiments, 500 million - 1 billion of the third sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the third sub-population of concatemer template molecules can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the third sub-population of concatemer template molecules can be sequenced. In some embodiments, between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of the third sub-population of concatemer template molecules can be sequenced.
[00375] In some embodiments, the sequencing of step (e1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the reiterative sequencing of step (e1) comprises conducting a two-stage sequencing method described herein.
[00376] In some embodiments, the methods for re-seeding a support further comprise step (e2): stopping and/or blocking the short read sequencing of step (el). In some embodiments, the stopping/blocking comprises incorporating a chain terminating nucleotide to the 3’
terminal end of the second sub-population batch sequencing read products to inhibit further sequencing reactions. Exemplary chain terminating nucleotides include dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
[00377] In some embodiments, the methods for re-seeding a support further comprise step (e3): removing the plurality of second sub-population batch sequencing read products and retaining the concatemer template molecules of the second sub-population. In some embodiments, step (e3) is optional. In some embodiments, the third sub-population batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a de-hybridization reagent.
[00378] In some embodiments, the methods for re-seeding a support further comprise step (e4): reiteratively sequencing the concatemer template molecules of the third sub-population by repeating steps (e1) - (e3) at least once. In some embodiments, the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween or more than 50 times.
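The repetition of the short read, stop/block and removal steps can be pictured as a simple loop. The following Python sketch is illustrative only and is not part of the disclosed methods: the function names, the string model of a template (written 3’ to 5’ so that reading left to right follows primer extension), and the treatment of the blocking and removal steps as no-ops are all assumptions made for the example.

```python
# Hypothetical string-based model of reiterative short read sequencing.
# Strand chemistry, blocking and de-hybridization are NOT modeled; the
# names below are illustrative assumptions, not terms from the disclosure.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def short_read(template, primer_site, cycles):
    """Short read step: call one base per cycle downstream of the primer site."""
    start = template.index(primer_site) + len(primer_site)
    region = template[start:start + cycles]
    # The read product is complementary to the template region it copies.
    return "".join(COMPLEMENT[base] for base in region)


def reiterative_sequencing(template, primer_site, cycles, passes):
    """Repeat the short read on the same retained template `passes` times."""
    reads = []
    for _ in range(passes):
        read = short_read(template, primer_site, cycles)
        # Blocking the read product and removing it from the template are
        # modeled as no-ops: the template string is retained unchanged.
        reads.append(read)
    return reads


if __name__ == "__main__":
    # Three passes of four cycles each over the same template.
    print(reiterative_sequencing("GGCCAATTACGTACGT", "GGCC", cycles=4, passes=3))
```

Because the template is retained between passes, every pass in this idealized model returns the same read; in practice the repeated reads could be compared to one another to improve confidence in the base calls.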
[00379] In some embodiments, the sequences of the third sub-population batch sequencing read products can be determined and aligned with a second reference sequence to confirm the presence of the second sequence of interest. The second reference sequence can be the third sub-population seeding batch barcode and/or the second sequence of interest.
[00380] In some embodiments, the methods for re-seeding a support further comprise reiteratively sequencing the fourth sub-population of concatemer template molecules in a manner similar to steps (e1) - (e4) as described above for the third sub-population of concatemer template molecules.
[00381] In some embodiments, hybridizing the sequencing primers to the concatemer template molecules of any of steps (e1) can be conducted with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate buffer) with formamide (e.g., 10-20% formamide).
[00382] In some embodiments, in step (e3) the plurality of third sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising an SSC buffer (e.g., saline-sodium citrate buffer), with or without formamide, at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
[00383] In some embodiments, in step (e3) the plurality of third sub-population batch sequencing read products can be removed from the template molecules and the plurality of template molecules can be retained using a de-hybridization reagent comprising at least one
solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization of step (e3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
Methods for Determining Nucleic Acid Template Density on a Support
[00384] Conventional methods for achieving a desired density of immobilized nucleic acid template molecules for massively parallel sequencing include determining the concentration of library molecules in-solution prior to immobilizing the library molecules on the support. The conventional methods typically employ qPCR and/or a fluorometer with a fluorescence-based assay (e.g., Qubit). Even when the desired in-solution library concentration is achieved, these conventional methods can yield immobilized template densities that are too high or too low.
[00385] The present disclosure provides methods for determining the density of nucleic acid template molecules that are already immobilized to a support, thus providing more accurate density information of the template molecules that are immobilized to the support. In some embodiments, the density determining methods can be used to determine the density of a mixture of template molecules including at least a first and second sub-population of template molecules immobilized to a support. In some embodiments, when the density of any given sub-population of immobilized template molecules is determined to be too low, then the support can be re-seeded with that particular sub-population of library molecules, which can then be amplified to increase the density.
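The re-seeding decision described above reduces to a per-sub-population density comparison. The following Python sketch is a minimal illustration under stated assumptions: the counts of immobilized template molecules per sub-population, the imaged area, and the minimum acceptable density are hypothetical inputs, and the function names are not taken from the disclosure.

```python
# Hypothetical sketch: counts, area and the minimum density threshold are
# illustrative inputs, not values or names defined by the disclosure.
from typing import Dict, List


def densities_per_mm2(counts: Dict[str, int], area_mm2: float) -> Dict[str, float]:
    """Convert per-sub-population template counts into densities per mm^2."""
    if area_mm2 <= 0:
        raise ValueError("area_mm2 must be positive")
    return {name: count / area_mm2 for name, count in counts.items()}


def subpopulations_to_reseed(counts: Dict[str, int], area_mm2: float,
                             min_density_per_mm2: float) -> List[str]:
    """Return the sub-populations whose measured density is below the minimum."""
    densities = densities_per_mm2(counts, area_mm2)
    return [name for name, d in densities.items() if d < min_density_per_mm2]


if __name__ == "__main__":
    observed = {"first": 50000, "second": 5000}
    print(densities_per_mm2(observed, area_mm2=10.0))
    print(subpopulations_to_reseed(observed, area_mm2=10.0,
                                   min_density_per_mm2=1000.0))
```

In this example only the second sub-population falls below the assumed minimum, so only that sub-population would be a candidate for re-seeding and amplification.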
[00386] The present disclosure provides methods for determining nucleic acid template density comprising step (a): providing a support comprising a plurality of nucleic acid template molecules immobilized to the support. In some embodiments, the plurality of template molecules comprises a plurality of sub-populations of template molecules, including at least a first and a second sub-population of template molecules. In some embodiments, the plurality of template molecules comprises 1-50 sub-populations, or 50-100 sub-populations, or 100-150 sub-populations, or 150-200 sub-populations, or any range therebetween, or more than 200 sub-populations of template molecules.
[00387] In some embodiments, individual template molecules of the first sub-population comprise (i) a first batch sequencing primer binding site, (ii) a first sequence of interest, and (iii) optionally a first batch barcode sequence and/or a first batch sample index sequence.
[00388] In some embodiments, individual template molecules within the first sub-population comprise the same first batch sequencing primer binding site. In some embodiments, individual template molecules within the first sub-population comprise the same sequence of interest, or comprise different sequences of interest. In some embodiments, the sequence of the first batch sequencing primer binding site sequence corresponds to the first sequence of interest, or the first batch sequencing primer binding site sequence corresponds to one of the first sequences of interest in the first sub-population. In some embodiments, a pre-determined first batch sequencing primer binding site sequence can be linked to, i.e. can be used to selectively sequence in a batch sequencing workflow, a given sequence of interest in the first sub-population. In some embodiments, a pre-determined first batch sequencing primer binding site sequence can be linked to different sequences of interest in the first sub-population. Thus, the pre-determined first batch sequencing primer binding site sequence corresponds to a given sequence of interest or sequences of interest in the first sub-population.
[00389] In some embodiments, the sequences of interest in the first sub-population are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00390] In some embodiments, the first batch barcode and/or the first batch sample index can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some embodiments, sequencing the short random sequence can provide nucleotide diversity and color balance. In some embodiments, sequencing and imaging the short random sequence can be used for polony mapping and location and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
[00391] In some embodiments, in the first sub-population of library molecules, the short random sequence (e.g., NNN) has an overall base composition of about 25% or about 20- 30% of all four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
[00392] In some embodiments, in the first sub-population of library molecules, the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of guanine (G)
at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
[00393] In some embodiments, in the first sub-population of library molecules the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
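The per-position base proportions recited above can be checked computationally for a given pool of short random sequences. The following Python sketch is illustrative only: the function names and the example sequences are assumptions, the default bounds reuse the broadest recited ranges (10-40% per base, 10-65% per A+T or G+C pair), and DNA bases (T rather than U) are assumed.

```python
# Hypothetical sketch: function names and example inputs are assumptions.
# Default bounds reuse the broadest ranges recited in the text.
from collections import Counter


def position_proportions(random_seqs):
    """Return, for each position, the fraction of sequences carrying A, C, G, T."""
    length = len(random_seqs[0])
    if any(len(seq) != length for seq in random_seqs):
        raise ValueError("all short random sequences must have the same length")
    total = len(random_seqs)
    proportions = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in random_seqs)
        proportions.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return proportions


def bases_within(proportions, low=0.10, high=0.40):
    """True if every base at every position falls within [low, high]."""
    return all(low <= p <= high for col in proportions for p in col.values())


def pairs_within(proportions, low=0.10, high=0.65):
    """True if A+T and G+C at every position fall within [low, high]."""
    return all(low <= col["A"] + col["T"] <= high
               and low <= col["G"] + col["C"] <= high
               for col in proportions)


if __name__ == "__main__":
    pool = ["ACG", "CGT", "GTA", "TAC"]  # each base appears once per position
    props = position_proportions(pool)
    print(props[0], bases_within(props), pairs_within(props))
```

A pool in which every molecule carries the same base at a position (for example, four copies of "AAA") fails both checks, which corresponds to the low-diversity situation the short random sequence is intended to avoid.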
[00394] In some embodiments, individual template molecules of the second sub-population comprise (i) a second batch sequencing primer binding site, (ii) a second sequence of interest, and (iii) optionally a second batch barcode sequence and/or a second batch sample index sequence.
[00395] In some embodiments, individual template molecules within the second subpopulation comprise the same second batch sequencing primer binding site. In some embodiments, individual template molecules within the second sub-population comprise the same sequence of interest or comprise different sequences of interest. In some embodiments, the sequence of the second batch sequencing primer binding site sequence corresponds to, i.e. can be used to selectively sequence in a batch sequencing workflow, the second sequence of interest, or the second batch sequencing primer binding site sequence corresponds to one of the second sequences of interest in the second sub-population. In some embodiments, a predetermined second batch sequencing primer binding site sequence can be linked to a given sequence of interest in the second sub-population. In some embodiments, a pre-determined second batch sequencing primer binding site sequence can be linked to different sequences of interest in the second sub-population. Thus, the pre-determined second batch sequencing primer binding site sequence corresponds to a given sequence of interest or sequences of interest in the second sub-population.
[00396] In some embodiments, the sequences of interest in the second sub-population are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00397] In some embodiments, the second batch barcode and/or the second batch sample index can include a short random sequence (e.g., NNN) that is 3-20 nucleotides in length. In some
embodiments, sequencing the short random sequence can provide nucleotide diversity and color balance. In some embodiments, sequencing and imaging the short random sequence can be used for polony mapping and location and template registration because the short random sequence provides sufficient nucleotide diversity and color balance.
[00398] In some embodiments, in the second sub-population of library molecules, the short random sequence (e.g., NNN) has an overall base composition of about 25% or about 20-30% of all four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
[00399] In some embodiments, in the second sub-population of library molecules the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
[00400] In some embodiments, in the second sub-population of library molecules the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
[00401] In some embodiments, the first and second batch sequencing primer binding sites have different sequences.
[00402] In some embodiments, the plurality of nucleic acid template molecules can be immobilized to the support at random and non-pre-determined positions on the support, or at pre-determined positions on the support (e.g., a patterned support).
[00403] In some embodiments, in the methods for determining template density of step (a), the support comprises a plurality of nucleic acid template molecules immobilized thereon at a density of about 10² - 10¹⁵ template molecules per mm², or any range described herein. In some embodiments, the template molecules immobilized to the support comprise a plurality of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules. In some embodiments, the plurality of sub-populations of template molecules are immobilized to the support at a high density where at least some of the immobilized template molecules in the first and second sub-populations comprise nearest
neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the support comprises up to 500 million template molecules immobilized thereon, or up to 1 billion template molecules immobilized thereon, or up to 2 billion template molecules immobilized thereon, or up to 3 billion template molecules immobilized thereon, or up to 4 billion template molecules immobilized thereon, or up to 5 billion template molecules immobilized thereon, or up to 6 billion template molecules immobilized thereon. In some embodiments, the support comprises up to 7 billion template molecules immobilized thereon, or up to 8 billion template molecules immobilized thereon, or up to 9 billion template molecules immobilized thereon, or up to 10 billion template molecules immobilized thereon, or up to 20 billion template molecules immobilized thereon. In some embodiments, the support comprises between about 500 million and about 20 billion template molecules immobilized thereon, between about 1 billion and about 10 billion template molecules immobilized thereon, between about 2 billion and about 9 billion template molecules immobilized thereon, between about 3 billion and about 8 billion template molecules immobilized thereon, between about 4 billion and about 7 billion template molecules immobilized thereon, or between about 5 billion and about 6 billion template molecules immobilized thereon, or any range therebetween.
[00404] In some embodiments, in the methods for determining template density of step (a), the support comprises features on the support that are located in a random and non-pre-determined manner. In some embodiments, the features are sites for attachment of the template molecules.
[00405] In some embodiments, the support is passivated with at least one polymer layer comprising a plurality of surface capture primers covalently tethered to the at least one polymer layer.
[00406] In some embodiments, at least one of the polymer layers comprises oligonucleotide primers including capture primers, pinning primers, or a mixture of capture and pinning primers. In some embodiments, the plurality of oligonucleotide primers comprise one type of capture primer (e.g., having the same batch capture primer sequence) or a mixture of 2-500 different types of capture primers (e.g., having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch capture primer sequences, or any range therebetween). In some embodiments, the plurality of oligonucleotide primers comprise one type of pinning primer (e.g., having the same batch pinning primer sequence) or a mixture of 2-500 different types of pinning primers (e.g.,
having between about 2-500, between about 50-400, between about 100-300 or between about 20-150 different batch pinning primer sequences, or any range therebetween). In some embodiments, the plurality of oligonucleotide types comprises between 2 and 500, between 10 and 400, between 20 and 300, between 50 and 200, between 100 and 500, between 200 and 400, between 2 and 250, between 10 and 150, between 20 and 200, or between 20 and 100 or between 5 and 50 different capture primers and/or pinning primers, or any range therebetween.
[00407] In some embodiments, the plurality of surface capture primers comprise a plurality of sub-populations of surface capture primers including at least a first and second sub-population of surface capture primers. In some embodiments, the surface capture primers in the at least first and second sub-populations have different sequences. In some embodiments, the surface capture primers in the at least first and second sub-populations can hybridize, i.e. capture, different circularized library molecules carrying different surface capture primer binding site sequences.
[00408] In some embodiments, the plurality of surface capture primers are randomly distributed throughout and embedded within the at least one polymer layer.
[00409] In some embodiments, the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules. In some embodiments, the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
[00410] In some embodiments, in the methods for determining template density of step (a), the support lacks partitions and/or barriers that would create separate regions of the support. Thus, the template molecules immobilized to the support are in fluid communication with each other in a massively parallel manner with no barriers to physically separate different batches of template molecules.
[00411] In some embodiments, the plurality of surface capture primers are located at predetermined positions on the at least one polymer layer and/or the plurality of surface capture primers are embedded within the at least one polymer layer at pre-determined locations.
[00412] In some embodiments, the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the template molecules. In some embodiments, the support includes interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
[00413] In some embodiments, in the methods for determining template density of step (a), individual template molecules in the first sub-population further comprise a first batch barcode sequence which corresponds to the first sequence of interest, or the first batch barcode sequence corresponds to one of the first sequences of interest in the first subpopulation. In some embodiments, a pre-determined first batch barcode sequence can be linked to a given sequence of interest in the first sub-population thus the pre-determined first batch barcode sequence corresponds to a given sequence of interest in the first subpopulation. In some embodiments, a pre-determined first batch barcode sequence can be linked to different sequences of interest in a first sub-population.
[00414] In some embodiments, individual template molecules in the second sub-population further comprise a second batch barcode sequence which corresponds to the second sequence of interest, or the second batch barcode sequence corresponds to one of the second sequences of interest in the second sub-population. In some embodiments, a pre-determined second batch barcode sequence can be linked to a given sequence of interest in the second sub-population (or can be linked to different sequences of interest in a second sub-population), thus the pre-determined second batch barcode sequence corresponds to a given sequence of interest in the second sub-population. In some embodiments, a pre-determined second batch barcode sequence can be linked to different sequences of interest in a second sub-population.
[00415] In some embodiments, in the methods for determining template density of step (a), individual template molecules in the first sub-population further comprise at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest in the first sub-population obtained from different sample sources. In some embodiments, individual template molecules in the second sub-population further comprise at least one sample index sequence that can be used in a multiplex assay to distinguish the sequences of interest in the second sub-population obtained from different sample sources.
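The roles of the batch barcode and the sample index described above can be illustrated with a minimal sketch. The barcode and index sequences, the exact-match lookup, and the function name `assign_reads` are all hypothetical and chosen only for illustration; the disclosure does not prescribe any particular sequences or matching scheme.

```python
from collections import defaultdict

# Hypothetical barcode/index sequences, chosen only for illustration.
BATCH_BARCODES = {"ACGT": "batch_1", "TGCA": "batch_2"}
SAMPLE_INDEXES = {"AAGG": "sample_A", "CCTT": "sample_B"}


def assign_reads(reads):
    """Group reads by (batch, sample) using exact barcode and index matches.

    Each read is a (batch_barcode, sample_index, insert_sequence) tuple.
    Reads with an unknown barcode or index are collected under "unassigned".
    """
    groups = defaultdict(list)
    for barcode, index, insert in reads:
        batch = BATCH_BARCODES.get(barcode)
        sample = SAMPLE_INDEXES.get(index)
        if batch is None or sample is None:
            groups["unassigned"].append(insert)
        else:
            groups[(batch, sample)].append(insert)
    return dict(groups)


example_reads = [
    ("ACGT", "AAGG", "GATTACA"),
    ("ACGT", "CCTT", "CATCATC"),
    ("TGCA", "AAGG", "TTGACCA"),
    ("NNNN", "AAGG", "GGGGGGG"),
]
assigned = assign_reads(example_reads)
```

In this sketch the batch barcode identifies the sub-population and the sample index identifies the sample source, so a single sample can contribute reads to more than one batch.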
[00416] In some embodiments, in the methods for determining template density of step (a), the plurality of template molecules comprise concatemer template molecules, including at least a first and second sub-population of concatemer template molecules. In some embodiments, the concatemer template molecules can be generated by conducting rolling circle amplification using circularized library molecules and amplification primers. In some embodiments, the amplification primers comprise capture primers immobilized to a support. In some embodiments, the amplification primers comprise soluble (non-immobilized) primers. In some embodiments, a concatemer template molecule comprises numerous tandem
copies of a polynucleotide unit. In some embodiments, each polynucleotide unit comprises a sequence of interest and at least one sequencing primer binding site. In some embodiments, the rolling circle amplification can be conducted in the presence or absence of a plurality of compaction oligonucleotides. In some embodiments, individual concatemer template molecules immobilized to the support collapse into a polony or nucleic acid nanoball having a compact size and shape compared to a non-collapsed concatemer template molecule. In some embodiments, the circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors. In some embodiments, the circularized library molecules comprise a mixture of any combination of circularized padlock probes, linear library molecules circularized using single-stranded splint strands, and/or linear library molecules circularized using double-stranded adaptors. Methods for generating circularized library molecules are described herein and known in the art.
[00417] In some embodiments, individual concatemer template molecules in the first sub-population comprise a plurality of tandem polynucleotide units. In some embodiments, each polynucleotide unit comprises a first sequence of interest and a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest. In some embodiments, the polynucleotide unit further comprises a first batch barcode sequence which corresponds to the first sequence of interest. In some embodiments, the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, concatemer template molecules in the first sub-population have the same first batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
[00418] In some embodiments, individual concatemer template molecules in the second sub-population comprise a plurality of tandem polynucleotide units. In some embodiments, each polynucleotide unit comprises a second sequence of interest and a second batch sequencing primer binding site sequence which corresponds to the second sequence of interest. In some embodiments, the polynucleotide unit further comprises a second batch barcode sequence which corresponds to the second sequence of interest. In some embodiments, the polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, concatemer template molecules in the second sub-population have the same second batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
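As a purely illustrative sketch of the tandem-unit structure described above, a concatemer can be modeled as repeated copies of one polynucleotide unit. The ordering of elements within the unit, the placeholder sequences, and the function names are assumptions made for this example and are not taken from the disclosure.

```python
def polynucleotide_unit(primer_site, batch_barcode, sample_index, insert):
    """Join the elements of one unit in an assumed, illustrative order."""
    return primer_site + batch_barcode + sample_index + insert


def concatemer(unit, copies):
    """Return `copies` tandem repeats of a unit, modeling a rolling circle
    amplification product as a simple repeated string."""
    if copies < 1:
        raise ValueError("a concatemer needs at least one unit copy")
    return unit * copies


# Hypothetical placeholder sequences for a first batch unit.
unit_1 = polynucleotide_unit("GGCC", "ACGT", "AAGG", "GATTACA")
template_1 = concatemer(unit_1, copies=3)
```

Because every unit carries the same batch sequencing primer binding site, a single concatemer in this model offers one primer binding site per tandem copy.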
[00419] In some embodiments, the methods for determining template density comprise step (b): sequencing the first sub-population of template molecules using a plurality of first batch sequencing primers, thereby generating a plurality of first batch sequencing read products. In some embodiments, the first batch sequencing read products comprise extension products of the first batch sequencing primers. In some embodiments, the first batch sequencing read products comprise first batch sequencing primers that are not extended. In some embodiments, the sequencing of step (b) does not require conducting more than 4 sequencing cycles. In some embodiments, the sequencing of step (b) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or any range therebetween, or conducting more than 100 sequencing cycles. In some embodiments, the sequencing of step (b) comprises sequencing at least a portion of the first batch barcode and/or sequencing at least a portion of the first sample index. In some embodiments, the sequencing of step (b) comprises sequencing at least a portion of the first sequence of interest.
[00420] In some embodiments, the sequencing of step (b) comprises imaging at least one region of the support to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support. In some embodiments, the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products.
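The density determination described above reduces to dividing the counted read products by the imaged area. The following is a minimal sketch under stated assumptions: the area unit (square millimeters), the example numbers, and the function name `read_density` are hypothetical and are not specified by the disclosure.

```python
def read_density(read_count, imaged_area_mm2):
    """Return read products per square millimeter of imaged support.

    `read_count` is the number of sequencing read products counted in the
    imaged region; `imaged_area_mm2` is the area of that region (assumed unit).
    """
    if read_count < 0:
        raise ValueError("read_count must be non-negative")
    if imaged_area_mm2 <= 0:
        raise ValueError("imaged_area_mm2 must be positive")
    return read_count / imaged_area_mm2


# Hypothetical example: 150,000 first batch read products in a 0.5 mm^2 region.
first_batch_density = read_density(150_000, 0.5)
```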
[00421] In some embodiments, the sequencing of step (b) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00422] In some embodiments, the sequencing of step (b) comprises conducting a two-stage sequencing method. In some embodiments, the first stage generally comprises contacting the first sub-population of template molecules with a plurality of first batch sequencing primers, a first plurality of sequencing polymerases and a first plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases.
In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00423] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex. In some embodiments, the nucleic acid duplex comprises a first sub-population template molecule hybridized to a first batch sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent which does not generate an extended sequencing primer. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent. In some embodiments, the trapping reagent further comprises at least one chaotropic agent. In some embodiments, the trapping reagent further comprises an amino acid or a modified amino acid. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00424] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed
nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
[00425] In some embodiments, the sequencing of step (b) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support using the images of the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products.
[00426] In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the first sub-population of template molecules. In some embodiments, the first sub-population of template molecules can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of template molecules.
[00427] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the first sub-population of template molecules and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second sequencing stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primer. In some embodiments, the first batch sequencing read product comprises an extended first batch sequencing primer after conducting the second sequencing stage.
[00428] In some embodiments, the plurality of nucleotides comprises fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides
are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00429] In some embodiments, when the second sequencing stage employs labeled nucleotides, then the second stage of step (b) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support using the images of the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products. In some embodiments, the imaging, location determination, and counting data obtained from the first and second stage sequencing reactions can be combined to determine the density of the first batch sequencing read products on the support.
[00430] In some embodiments, the nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00431] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’
sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00432] In some embodiments, the sequencing of step (b) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of first batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second sequencing stage. In some embodiments, the sequencing of step (b) does not require conducting more than 4 sequencing cycles. In some embodiments, the sequencing of step (b) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or conducting more than 100 sequencing cycles.
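The structure of one sequencing cycle, as a first stage followed by a second stage, can be sketched as an ordered list of steps. This is an illustrative outline only: the step wording paraphrases the paragraphs above, and the function names and the two boolean options are hypothetical conveniences, not part of the disclosure.

```python
def two_stage_cycle(labeled_nucleotides=False, chain_terminating=True):
    """Return the ordered steps of one two-stage sequencing cycle."""
    steps = [
        "stage 1: bind labeled multivalent molecules to complexed polymerases (trapping reagent)",
        "stage 1: image multivalent-complexed polymerases (imaging reagent)",
        "stage 1: dissociate and wash away multivalent molecules and polymerases",
        "stage 2: incorporate nucleotides to extend the sequencing primer (stepping reagent)",
    ]
    # Imaging in the second stage applies only when the nucleotides are labeled.
    if labeled_nucleotides:
        steps.append("stage 2: image incorporated nucleotides (imaging reagent)")
    # Chain terminating moieties are cleaved to restore an extendible 3' OH group.
    if chain_terminating:
        steps.append("stage 2: cleave chain terminating moieties")
    return steps


def run_plan(n_cycles, **options):
    """Return (cycle_number, step) pairs for `n_cycles` repeated cycles."""
    return [
        (cycle, step)
        for cycle in range(1, n_cycles + 1)
        for step in two_stage_cycle(**options)
    ]


plan = run_plan(4)  # e.g., a short run of 4 sequencing cycles
```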
[00433] In some embodiments, the sequencing of step (b) comprises imaging at least one region of the support to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the imaging is conducted at the first sequencing cycle and/or the imaging is conducted at the last sequencing cycle. In some embodiments, the sequencing of step (b) comprises imaging at least one region of the support to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the imaging is conducted at every sequencing cycle. In some embodiments, the sequencing of step (b) comprises imaging at least one region of the support to detect the sequencing reactions of the first sub-population of template molecules. In some embodiments, the imaging is conducted at fewer than every sequencing cycle, for example every other sequencing cycle or every third sequencing cycle. The skilled artisan will appreciate that many other imaging schedules are possible. In some embodiments, the sequencing of step (b) further comprises determining the location of the first batch sequencing read products on the support when imaging is conducted (e.g., template mapping). In some embodiments, the sequencing of step (b) further comprises counting the number of first batch sequencing read products on the support using the images of the sequencing reactions of the first sub-population of template molecules. In some embodiments, the sequencing of step (b) further comprises determining the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products.
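The imaging schedules described above (first and/or last cycle, every cycle, or fewer than every cycle) can be expressed as a small helper that returns the cycle numbers at which imaging occurs. The schedule names, the `interval` parameter, and the function name are hypothetical and serve only to illustrate the options.

```python
def imaging_cycles(n_cycles, schedule="every", interval=1):
    """Return the 1-based cycle numbers at which the support is imaged."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    if schedule == "first_and_last":
        # A set avoids listing cycle 1 twice when only one cycle is run.
        return sorted({1, n_cycles})
    if schedule == "every":
        return list(range(1, n_cycles + 1))
    if schedule == "every_nth":
        if interval < 1:
            raise ValueError("interval must be at least 1")
        return list(range(1, n_cycles + 1, interval))
    raise ValueError(f"unknown schedule: {schedule!r}")
```

For example, `imaging_cycles(10, "every_nth", interval=3)` images at cycles 1, 4, 7 and 10, which corresponds to imaging at every third sequencing cycle.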
[00434] In some embodiments, the sequencing of step (b) comprises hybridizing the first batch sequencing primers to the first sub-population of template molecules in the presence of
a hybridization reagent. In some embodiments, the hybridization reagent comprises an SSC (saline-sodium citrate) buffer (e.g., 2X SSC) with formamide (e.g., 10-20% formamide).
[00435] In some embodiments, prior to sequencing the second sub-population of template molecules, the plurality of first batch sequencing read products can be removed from the first sub-population of template molecules and the first sub-population of template molecules can be retained on the support using a de-hybridization reagent. In some embodiments, the de-hybridization reagent comprises an SSC (saline-sodium citrate) buffer, with or without formamide, at a temperature that promotes nucleic acid denaturation, for example 50-90 °C. In some embodiments, the de-hybridization reagent comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization step can be conducted at a temperature that promotes nucleic acid denaturation, for example 50-90 °C. In some embodiments, the first batch sequencing read products are not removed from the first sub-population of template molecules.
[00436] In some embodiments, the sequencing reactions of the first sub-population of template molecules are inhibited or stopped before initiating the sequencing reactions of the second sub-population of template molecules.
[00437] In some embodiments, the methods for determining template density further comprise step (c): sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers, thereby generating a plurality of second batch sequencing read products and imaging the same region of the support to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the second batch sequencing read products comprise extension products of the second batch sequencing primers. In some embodiments, the second batch sequencing read products comprise second batch sequencing primers that are not extended. In some embodiments, the sequencing of step (c) does not require conducting more than 4 sequencing cycles. In some embodiments, the sequencing of step (c) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or any range therebetween, or conducting more than 100 sequencing cycles. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second batch barcode and/or sequencing at
least a portion of the second sample index. In some embodiments, the sequencing of step (c) comprises sequencing at least a portion of the second sequence of interest.
[00438] In some embodiments, the sequencing of step (c) comprises imaging at least one region of the support to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (c) further comprises counting the number of second batch sequencing read products on the support. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
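Because steps (b) and (c) image the same region of the support, the per-batch densities can be placed side by side. The sketch below also sums them into a total for that region; this summation is an assumption made for illustration and is not stated in the passage above, as are the area unit, the example counts, and the function name.

```python
def batch_densities(batch_counts, imaged_area_mm2):
    """Return read-product density per batch, plus their sum under "total".

    `batch_counts` maps a batch name to the number of read products counted
    in the same imaged region; the area unit (mm^2) is an assumption.
    """
    if imaged_area_mm2 <= 0:
        raise ValueError("imaged_area_mm2 must be positive")
    per_batch = {
        name: count / imaged_area_mm2 for name, count in batch_counts.items()
    }
    # Assumed for illustration: total density is the sum over batches.
    per_batch["total"] = sum(per_batch.values())
    return per_batch


# Hypothetical counts for one 0.5 mm^2 region imaged in steps (b) and (c).
densities = batch_densities({"batch_1": 120_000, "batch_2": 80_000}, 0.5)
```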
[00439] In some embodiments, the sequencing reactions of the first sub-population of template molecules are stopped or inhibited before initiating the sequencing reactions of the second sub-population of template molecules.
[00440] In some embodiments, the sequencing of step (c) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00441] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method. In some embodiments, the first stage comprises contacting the second sub-population of template molecules with a plurality of second batch sequencing primers, a first plurality of sequencing polymerases and a second plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00442] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex. In some embodiments, the nucleic acid duplex comprises a second sub-population template molecule hybridized to a second batch sequencing primer. In some embodiments, the detectably labeled multivalent molecules
bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent which does not generate an extended sequencing primer. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent. In some embodiments, the trapping reagent further comprises at least one chaotropic agent. In some embodiments, the trapping reagent further comprises an amino acid or a modified amino acid. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00443] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent.
[00444] In some embodiments, the sequencing of step (c) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of
step (c) further comprises counting the number of second batch sequencing read products on the support using the images of the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
[00445] In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the second sub-population of template molecules. In some embodiments, the second sub-population of template molecules can remain immobilized to the support and the second batch sequencing primers can be retained and can remain hybridized to the second sub-population of template molecules.
[00446] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the second sub-population of template molecules and the retained second batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second sequencing stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch sequencing primer. In some embodiments, the second batch sequencing read product comprises an extended second batch sequencing primer after conducting the second sequencing stage.
[00447] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00448] In some embodiments, when the second sequencing stage employs labeled nucleotides, then the second stage of step (c) comprises imaging at least one region of the support in the presence of an imaging reagent to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support (e.g., template mapping). In some embodiments, the sequencing of step (c) further
comprises counting the number of second batch sequencing read products on the support using the images of the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products. In some embodiments, the imaging, location determination, and counting data obtained from the first and second stage sequencing reactions can be combined to determine the density of the second batch sequencing read products on the support.
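One way to picture combining location data from the first and second stage is to merge the two sets of detected coordinates and treat detections that fall within a small distance of each other as the same read product. The matching rule, the tolerance value, the coordinate units, and the function names below are all assumptions made for this sketch; the disclosure does not specify how the data are combined.

```python
import math


def merge_locations(stage1_spots, stage2_spots, tolerance=1.0):
    """Merge (x, y) detections from two stages, dropping stage 2 detections
    that lie within `tolerance` of a location already recorded."""
    merged = list(stage1_spots)
    for spot in stage2_spots:
        if all(math.dist(spot, known) > tolerance for known in merged):
            merged.append(spot)
    return merged


def density_from_locations(spots, imaged_area):
    """Return the number of distinct locations per unit of imaged area."""
    if imaged_area <= 0:
        raise ValueError("imaged_area must be positive")
    return len(spots) / imaged_area


# Hypothetical coordinates in arbitrary units.
stage1 = [(0.0, 0.0), (10.0, 10.0)]
stage2 = [(0.2, 0.1), (20.0, 5.0)]  # first spot re-detects (0.0, 0.0)
combined = merge_locations(stage1, stage2)
```

In this example the stage 2 detection at (0.2, 0.1) is treated as a repeat of the stage 1 detection at (0.0, 0.0), so the combined set holds three distinct locations.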
[00449] In some embodiments, the nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00450] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00451] In some embodiments, the sequencing of step (c) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once, thereby generating a plurality of second batch sequencing read products. In some embodiments, one sequencing cycle comprises completion of a first and a second sequencing stage. In some embodiments, the sequencing of step (c) does not require conducting more than 4 sequencing cycles. In some embodiments, the sequencing of step (c) comprises conducting 4-20 sequencing cycles, or conducting 20-50 sequencing cycles, or conducting 50-75 sequencing cycles, or conducting 75-100 sequencing cycles, or any range therebetween, or conducting more than 100 sequencing cycles.
[00452] In some embodiments, the sequencing of step (c) comprises imaging at least one region of the support to detect the sequencing reactions of the second sub-population of template molecules. In some embodiments, the imaging is conducted at the first sequencing cycle and/or at the last sequencing cycle. In some embodiments, the imaging is conducted at every sequencing cycle. In some embodiments, the imaging is conducted at fewer than every sequencing cycle, for example every other sequencing cycle or every third sequencing cycle. The skilled artisan will appreciate that many other imaging schedules are possible. In some embodiments, the sequencing of step (c) further comprises determining the location of the second batch sequencing read products on the support when imaging is conducted (e.g., template mapping). In some embodiments, the sequencing of step (c) further comprises counting the number of second batch sequencing read products on the support using the images of the sequencing reactions of the second sub-population of template molecules. In some embodiments, the sequencing of step (c) further comprises determining the density of the second batch sequencing read products on the support using the counted number of second batch sequencing read products.
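The imaging schedules described above can be illustrated with a minimal Python sketch. The function name `imaging_cycles`, the schedule labels, and the choice to start an every-nth schedule at cycle 1 are assumptions made for illustration and are not taken from the disclosure.

```python
def imaging_cycles(total_cycles, schedule="every", step=1):
    """Return the 1-based sequencing cycles at which the support is imaged.

    schedule (hypothetical labels):
      "first_and_last" : image at the first and last cycle only
      "every"          : image at every cycle
      "every_nth"      : image at every `step`-th cycle, starting at cycle 1
    """
    if total_cycles < 1:
        raise ValueError("total_cycles must be at least 1")
    if schedule == "first_and_last":
        # A set avoids listing cycle 1 twice when there is a single cycle.
        return sorted({1, total_cycles})
    if schedule == "every":
        return list(range(1, total_cycles + 1))
    if schedule == "every_nth":
        if step < 1:
            raise ValueError("step must be at least 1")
        return list(range(1, total_cycles + 1, step))
    raise ValueError(f"unknown schedule: {schedule}")


# Every other cycle of a 6-cycle run (assumed to start at cycle 1).
print(imaging_cycles(6, "every_nth", step=2))
```

Other schedules, such as imaging only at the last cycle, could be added as further branches without changing the calling code.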
[00453] In some embodiments, the sequencing of step (c) comprises hybridizing the second batch sequencing primers to the second sub-population of template molecules in the presence of a hybridization reagent. In some embodiments, the hybridization reagent comprises an SSC buffer (e.g., 2X saline-sodium citrate buffer) with formamide (e.g., 10-20% formamide).
[00454] In some embodiments, the methods for determining template density comprise hybridizing a plurality of detectably labeled oligonucleotide probes to the plurality of immobilized template molecules instead of sequencing the immobilized template molecules of steps (b) and (c).
[00455] In some embodiments, the plurality of detectably labeled oligonucleotide probes comprises at least a first and second sub-population of probes. In some embodiments, the first sub-population of probes can hybridize to a universal adaptor sequence in the first sub-population of template molecules. In some embodiments, the first sub-population of probes can hybridize to the first batch sequencing primer binding sites on the first sub-population of template molecules. In some embodiments, the second sub-population of probes can hybridize to a universal adaptor sequence in the second sub-population of template molecules. In some embodiments, the second sub-population of probes can hybridize to the second batch sequencing primer binding sites on the second sub-population of template molecules. In some embodiments, the first and second sub-populations of probes can be labeled with different fluorophores that distinguish the first and second sub-populations of probes.
[00456] In some embodiments, the methods for determining template density using detectably labeled probes comprise: (i) hybridizing the first sub-population of probes to the first sub-population of immobilized template molecules, thereby generating a first sub-population of immobilized labeled duplexes, and hybridizing the second sub-population of probes to the second sub-population of immobilized template molecules, thereby generating a second sub-population of immobilized labeled duplexes. In some embodiments, the hybridizing is conducted essentially simultaneously. In some embodiments, the methods comprise: (ii) imaging at least one region of the support to detect the first and second sub-populations of immobilized labeled duplexes. In some embodiments, the methods comprise: (iii) determining the location of the first and second sub-populations of immobilized labeled duplexes on the support. In some embodiments, the methods comprise: (iv) counting the number of first and second sub-populations of immobilized labeled duplexes on the support; and (v) determining the density of the first and second sub-populations of immobilized labeled duplexes on the support using the counted number from step (iv). In some embodiments, the counted number of first and second sub-populations of immobilized labeled duplexes from step (iv) can be extrapolated to encompass a larger region of the support than the region that was imaged in step (ii). For example, the counted number of first and second sub-populations of immobilized labeled duplexes from step (iv) can be extrapolated to encompass the full surface of the support.
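The density determination of step (v) and the extrapolation to a larger region can be sketched as simple arithmetic. The following Python sketch assumes that the labeled duplexes are distributed uniformly over the support, which the disclosure does not state; the function names, the square-millimeter units, and the example numbers are hypothetical.

```python
def density_per_mm2(count, imaged_area_mm2):
    """Density of labeled duplexes in the imaged region (count per mm^2)."""
    if imaged_area_mm2 <= 0:
        raise ValueError("imaged area must be positive")
    return count / imaged_area_mm2


def extrapolate_count(count, imaged_area_mm2, total_area_mm2):
    """Scale a count from the imaged region to a larger region.

    Assumes uniform density across the support (an illustrative assumption).
    """
    return density_per_mm2(count, imaged_area_mm2) * total_area_mm2


# Hypothetical counts for the two probe sub-populations in a 2 mm^2 region,
# extrapolated to a 10 mm^2 support surface.
counts = {"first": 1000, "second": 600}
for name, count in counts.items():
    print(name, density_per_mm2(count, 2.0), extrapolate_count(count, 2.0, 10.0))
```

Because the two probe sub-populations carry different fluorophores, each count can be processed independently in this way.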
[00457] In some embodiments, the methods for determining template density using detectably labeled probes comprise: (i) hybridizing the first sub-population of probes to the first sub-population of immobilized template molecules thereby generating a first sub-population of immobilized labeled duplexes. In some embodiments, the methods comprise: (ii) imaging at least one region of the support to detect the first sub-population of immobilized labeled duplexes. In some embodiments, the methods comprise: (iii) determining the location of the first sub-population of immobilized labeled duplexes on the support. In some embodiments, the methods comprise: (iv) counting the number of immobilized labeled duplexes of the first sub-population on the support. In some embodiments, the methods comprise: (v) determining the density of the first sub-population of immobilized labeled duplexes on the support using the counted number from step (iv). In some embodiments, the methods comprise: (vi) hybridizing the second sub-population of probes to the second sub-population of immobilized template molecules thereby generating a second sub-population of immobilized labeled duplexes. In some embodiments, the methods comprise: (vii) imaging at least one region of the support to detect the second sub-population of immobilized labeled duplexes. In some embodiments, the methods comprise: (viii) determining the location of the second sub-population of immobilized labeled duplexes on the support. In some embodiments, the methods comprise: (ix) counting the number of immobilized labeled duplexes of the second sub-population on the support. In some embodiments, the methods comprise: (x) determining the density of the second sub-population of immobilized labeled duplexes on the support using the counted number from step (ix). In some embodiments, the counted numbers of immobilized labeled duplexes of the first and second sub-populations from steps (iv) and (ix) can be extrapolated to encompass a larger region of the support than the region that was imaged in steps (ii) and (vii). For example, the counted numbers from steps (iv) and (ix) can be extrapolated to encompass the full surface of the support.
[00458] In some embodiments, the methods for determining template density optionally comprise step (d): re-seeding the support by distributing on the support a third sub-population of circularized nucleic acid library molecules under a condition suitable for hybridizing individual circularized library molecules of the third sub-population to individual surface capture primers and conducting a rolling circle amplification reaction, in a template-dependent manner using individual circularized library molecules in the third sub-population, thereby generating a third sub-population of concatemer template molecules immobilized to the support. In some embodiments, the re-seeding of step (d) can be conducted when the density of the first and/or second sub-population of immobilized template molecules is determined to be lower than desired. In some embodiments, the density determination is conducted by sequencing or by hybridization with the labeled probes. In some embodiments, the third sub-population of circularized library molecules can be the same as the first or second sub-population of circularized library molecules, e.g., the sub-population whose density was determined to be lower than desired. In some embodiments, the third sub-population of circularized library molecules can be different from the first or second sub-population of circularized library molecules. In some embodiments, the third sub-population of concatemer template molecules can be sequenced using any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, sequencing the third sub-population of concatemer template molecules can be conducted as described in step (b) or (c) above. In some embodiments, the re-seeding of step (d) can be omitted.
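The decision to conduct the optional re-seeding of step (d) can be sketched as a comparison of measured and desired densities. In this minimal Python sketch the function name, the dictionary layout, and the numeric values are hypothetical; the disclosure states only that re-seeding can be conducted when a density is determined to be lower than desired.

```python
def subpopulations_to_reseed(measured, desired):
    """Return the sub-populations whose measured density is below the desired density.

    measured, desired: dicts mapping a sub-population name to a density
    (any consistent unit). An empty result means step (d) can be omitted.
    """
    return [name for name, density in measured.items() if density < desired[name]]


# Hypothetical densities (templates per mm^2).
measured = {"first": 800.0, "second": 1200.0}
desired = {"first": 1000.0, "second": 1000.0}
print(subpopulations_to_reseed(measured, desired))
```

In this example only the first sub-population falls short, so the third sub-population used for re-seeding could be the same as the first.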
Generating Circularized Library Molecules with Padlock Probes
[00459] The present disclosure provides methods for generating circularized library molecules comprising step (a): providing a plurality of target nucleic acid molecules comprising at least a first and second target nucleic acid molecule. In some embodiments, the target nucleic acid molecules comprise RNA, DNA, cDNA or chimeric RNA/DNA. In some embodiments, the target nucleic acid molecules are present in a mixture of non-target nucleic acid molecules.
[00460] In some embodiments, methods for generating circularized library molecules further comprise step (b): contacting the plurality of target nucleic acid molecules with a plurality of target-specific padlock probes. In some embodiments, individual target-specific padlock probes comprise a first and second end (e.g., first and second padlock binding arms) and an internal region having at least one adaptor sequence. In some embodiments, the first end of individual padlock probes selectively hybridizes to a first region of a target molecule and the second end selectively hybridizes to a second region of the same target molecule. In some embodiments, the first and second ends of individual target-specific padlock probes hybridize to proximal positions on the target molecule to form an open circle target-specific padlock probe having a nick or gap between the hybridized first and second ends. In some embodiments, the plurality of target-specific padlock probes includes at least a first and second sub-population of target-specific padlock probes.
[00461] In some embodiments, individual target-specific padlock probes of step (b) comprise a first and second end (e.g., first and second padlock binding arms) and an internal region having at least one adaptor sequence. In some embodiments, the first end selectively hybridizes to a first region of a target nucleic acid molecule and the second end selectively hybridizes to a second region of the same target nucleic acid molecule. In some embodiments, the internal region of individual target-specific padlock probes comprises any one or any combination of adaptor sequences, organized in any order, including: (i) a batch-specific sequencing primer binding site sequence which corresponds to the target sequence (e.g., sequence of interest); (ii) a batch barcode sequence which corresponds to the target sequence (e.g., sequence of interest); (iii) a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources; (iv) a capture primer binding site; (v) a surface pinning primer binding site; and/or (vi) a compaction oligonucleotide binding site. Exemplary embodiments of target-specific padlock probes are shown in FIGs. 15A, 15B, and 16-20.
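The layout of a target-specific padlock probe (two binding arms flanking an internal region of adaptor sequences in any order) can be modeled as a small data structure. The following Python sketch is illustrative only: the class name, the adaptor labels, and the short sequences are hypothetical placeholders and are not sequences from the disclosure.

```python
from dataclasses import dataclass

# Adaptor types listed in the text as items (i)-(vi); the labels are illustrative.
ADAPTOR_TYPES = {
    "batch_seq_primer_site",
    "batch_barcode",
    "sample_index",
    "capture_primer_site",
    "pinning_primer_site",
    "compaction_site",
}


@dataclass
class PadlockProbe:
    first_arm: str         # hybridizes to a first region of the target
    second_arm: str        # hybridizes to a second region of the same target
    internal_region: list  # ordered (adaptor_type, sequence) pairs, in any order

    def __post_init__(self):
        # Reject adaptor labels that are not among the listed types.
        for adaptor_type, _ in self.internal_region:
            if adaptor_type not in ADAPTOR_TYPES:
                raise ValueError(f"unknown adaptor type: {adaptor_type}")

    def linear_sequence(self):
        """Concatenate first arm, internal adaptors, and second arm in probe order."""
        internal = "".join(seq for _, seq in self.internal_region)
        return self.first_arm + internal + self.second_arm


# Placeholder sequences chosen only to make the example runnable.
probe = PadlockProbe(
    first_arm="ACGTAC",
    second_arm="TTGACC",
    internal_region=[("batch_seq_primer_site", "GGATCC"), ("batch_barcode", "CAT")],
)
print(probe.linear_sequence())
```

Keeping the internal region as an ordered list reflects the statement that the adaptor sequences can be organized in any order, and placing the barcode directly after the sequencing primer binding site mirrors the embodiments in which the batch barcode is sequenced first.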
[00462] In some embodiments, a pre-determined batch-specific sequencing primer binding site sequence can be linked to given first and second padlock binding arms, thus the pre-determined batch-specific sequencing primer binding site sequence corresponds to a given target region of a target nucleic acid molecule.
[00463] In some embodiments, a pre-determined batch barcode sequence can be linked to given first and second padlock binding arms, thus the pre-determined batch barcode sequence corresponds to a given target region of a target nucleic acid molecule.
[00464] In some embodiments, the plurality of target-specific padlock probes comprises at least a first and second sub-population of target-specific padlock probes. In some embodiments, individual padlock probes in the first sub-population of target-specific padlock probes comprise a first and second end (e.g., first and second padlock binding arms) and an internal region. In some embodiments, the first end selectively hybridizes to a first region of the first target molecule (Target-1; e.g., see FIG. 15A) and the second end selectively hybridizes to a second region of the first target molecule. In some embodiments, the contacting of step (b) comprises: hybridizing the first and second ends of individual target-specific padlock probes of the first sub-population to proximal positions on the first target molecule to form an open circle first target-specific padlock probe having a nick or gap between the hybridized first and second ends (e.g., FIG. 15A). In some embodiments, the internal region of individual target-specific padlock probes of the first sub-population comprises a first batch barcode sequence (Batch BC-1; see e.g., FIG. 15A) that corresponds to the first target sequence. In some embodiments, the first batch barcode sequence is located adjacent to one of the regions of the first target-specific padlock probe that selectively hybridizes to the first target molecule. In some embodiments, individual target-specific padlock probes of the first sub-population comprise a first batch sequencing primer binding site sequence (Batch Seq-1; e.g., see FIG. 15A) (or a complementary sequence thereof). In some embodiments, individual target-specific padlock probes of the first sub-population comprise a primer binding site for a rolling circle amplification primer (surface capture primer binding site; e.g., see FIG. 15A) (or a complementary sequence thereof). In some embodiments, individual target-specific padlock probes of the first sub-population comprise a compaction oligonucleotide binding site (compaction; e.g., see FIG. 15A) (or a complementary sequence thereof). In some embodiments, individual target-specific padlock probes of the first sub-population comprise a first batch sequencing primer binding site and a first batch barcode sequence that are adjacent to each other so that the first batch barcode region of the concatemer is sequenced first. The first batch barcode sequence can be any length, for example 3-15 bases, or 15-25 bases, or 25-40 bases, or any range therebetween, or longer. Other examples of first target-specific padlock probes are shown in FIGs. 15B, 16, 18 and 19.
[00465] In some embodiments, individual padlock probes in the second sub-population of target-specific padlock probes comprise a first and second end (e.g., first and second padlock binding arms) and an internal region. In some embodiments, the first end selectively hybridizes to a first region of the second target molecule (Target-2; e.g., see FIG. 15A) and the second end selectively hybridizes to a second region of the second target molecule. In some embodiments, the contacting of step (b) comprises: hybridizing the first and second ends of individual target-specific padlock probes of the second sub-population to proximal positions on the second target molecule to form an open circle second target-specific padlock probe having a nick or gap between the hybridized first and second ends (e.g., FIG. 15A). In some embodiments, the internal region of individual target-specific padlock probes of the second sub-population comprises a second batch barcode sequence (Batch BC-2; see e.g., FIG. 15A) that corresponds to the second target sequence. In some embodiments, the second batch barcode sequence is located adjacent to one of the regions of the second target-specific padlock probe that selectively hybridizes to the second target molecule. In some embodiments, individual target-specific padlock probes of the second sub-population comprise a second batch sequencing primer binding site sequence (Batch Seq-2; e.g., see FIG. 15A) (or a complementary sequence thereof). In some embodiments, individual target-specific padlock probes of the second sub-population comprise a primer binding site for a rolling circle amplification primer (surface capture primer binding site; e.g., see FIG. 15A) (or a complementary sequence thereof). In some embodiments, individual target-specific padlock probes of the second sub-population comprise a compaction oligonucleotide binding site (compaction; e.g., see FIG. 15A) (or a complementary sequence thereof). In some embodiments, individual target-specific padlock probes of the second sub-population comprise a second batch sequencing primer binding site and a second batch barcode sequence that are adjacent to each other so that the second batch barcode region of the concatemer is sequenced first. The second batch barcode sequence can be any length, for example 3-15 bases, or 15-25 bases, or 25-40 bases, or any range therebetween, or longer. Other examples of second target-specific padlock probes are shown in FIGs. 15B, 16, 18 and 19.
[00466] In some embodiments, methods for generating circularized library molecules further comprise step (c): closing the nick or gap of individual open circle target-specific padlock probes of the first and second sub-populations by conducting one or more enzymatic reactions, thereby generating a plurality of covalently closed circularized padlock probes including at least first and second sub-populations of covalently closed circularized padlock probes (e.g., FIG. 15B). In some embodiments, closing the nick in the open circle padlock probes of the first and second sub-populations comprises conducting an enzymatic ligation reaction to close the nick, thereby generating a plurality of covalently closed circular padlock probes including at least first and second sub-populations of covalently closed circular padlock probes. In some embodiments, closing the gap of the open circle padlock probes of the first and second sub-populations comprises conducting a polymerase-catalyzed fill-in reaction using the first or second target molecule as a template, and conducting an enzymatic ligation reaction, thereby generating a plurality of covalently closed circular padlock probes including first and second sub-populations of covalently closed circular padlock probes. In some embodiments, padlock probes carrying different adaptor sequences in their internal region can be used to generate various embodiments of covalently closed circularized padlock probes (e.g., see FIGs. 15-20).
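The choice between ligation alone (for a nick) and fill-in followed by ligation (for a gap) can be sketched as follows. This Python sketch assumes half-open target coordinates for the two hybridized arms; the function names and the coordinate convention are hypothetical and serve only to illustrate the distinction drawn above.

```python
def gap_length(upstream_arm_end, downstream_arm_start):
    """Number of unpaired target bases between the two hybridized arms.

    Coordinates are positions on the target, half-open: the upstream arm
    covers bases up to (but not including) upstream_arm_end, and the
    downstream arm begins at downstream_arm_start.
    """
    gap = downstream_arm_start - upstream_arm_end
    if gap < 0:
        raise ValueError("hybridized arms overlap on the target")
    return gap


def closing_steps(gap):
    """Enzymatic reactions used to close the open circle padlock probe."""
    if gap < 0:
        raise ValueError("gap length cannot be negative")
    if gap == 0:
        return ["ligation"]                    # nick: ends are directly adjacent
    return ["polymerase fill-in", "ligation"]  # gap: fill in using the target as template


print(closing_steps(gap_length(20, 20)))  # arms abut, leaving a nick
print(closing_steps(gap_length(20, 25)))  # five unpaired bases, leaving a gap
```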
[00467] In some embodiments, methods for generating circularized library molecules further comprise step (d): conducting a rolling circle amplification reaction by hybridizing the plurality of covalently closed circular padlock probes with a plurality of amplification primers and conducting a rolling circle amplification reaction in a template-dependent manner, using a plurality of strand displacing polymerases and a plurality of nucleotides, thereby generating a plurality of concatemer molecules. In some embodiments, the rolling circle amplification reaction comprises hybridizing covalently closed circular padlock probes of the first and second sub-populations to first and second amplification primers, respectively. In some embodiments, the first and second amplification primers are immobilized to a support (e.g., first and second surface capture primers). In some embodiments, the first and second amplification primers are in solution. In some embodiments, the first and second amplification primers have the same sequence or have different sequences. In some embodiments, the rolling circle amplification reaction is conducted in a template-dependent manner, using a plurality of strand displacing polymerases, a plurality of nucleotides, and the covalently closed circular padlock probes of the first and second sub-populations, thereby generating a plurality of concatemer template molecules including at least a first concatemer template molecule that corresponds to a first target nucleic acid molecule and at least a second concatemer template molecule that corresponds to a second target nucleic acid molecule.
[00468] In some embodiments, individual concatemer template molecules of the first sub-population (e.g., first batch concatemer template molecules) comprise tandem repeat polynucleotide units. In some embodiments, a unit comprises the sequence of the first target molecule, the first batch barcode sequence, and the first batch sequencing primer binding site (e.g., see FIGs. 15B-20) (or a complementary sequence thereof).
[00469] In some embodiments, individual concatemer template molecules of the second sub-population (e.g., second batch concatemer template molecules) comprise tandem repeat polynucleotide units. In some embodiments, a unit comprises the sequence of the second target molecule, the second batch barcode sequence, and the second batch sequencing primer binding site (e.g., see FIGs. 15B-20) (or a complementary sequence thereof).
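The tandem repeat structure of a concatemer produced by rolling circle amplification can be sketched with string operations. In this minimal Python sketch the circle is represented by an arbitrary linearization of its sequence, and the concatemer is modeled as repeated copies of the reverse complement of that sequence; the helper names and the toy sequence are hypothetical, and the true starting point of the concatemer would depend on where the amplification primer hybridizes.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def concatemer(circle_seq, copies):
    """Model a concatemer as tandem copies of the complement of the circle.

    circle_seq: one linearization of the covalently closed circular probe.
    copies: number of tandem repeat units to generate.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    unit = reverse_complement(circle_seq)
    return unit * copies


# Toy circle sequence; each repeat unit is its reverse complement.
print(concatemer("AACG", 3))
```

Each repeat unit therefore carries the complement of every element of the circularized probe, which is consistent with the recurring "(or a complementary sequence thereof)" language above.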
[00470] In some embodiments, when the rolling circle amplification of step (d) is conducted with amplification primers that are immobilized to a support, the covalently closed circularized padlock probes can be distributed onto the support comprising a plurality of immobilized surface capture primers, under a condition suitable to hybridize at least one portion of individual covalently closed circularized padlock probes to one of the immobilized surface capture primers, and the rolling circle amplification reaction is conducted thereby generating concatemer template molecules immobilized to the support.
[00471] In some embodiments, when the rolling circle amplification of step (d) is conducted with amplification primers that are in solution, individual covalently closed circularized padlock probes can be hybridized to one of the amplification primers in solution, the rolling circle amplification reaction can be conducted in solution thereby generating nascent concatemer template molecules, and the rolling circle amplification reaction and nascent concatemer template molecules can be distributed onto a support having a plurality of surface capture primers immobilized thereon, under a condition suitable to hybridize at least one portion of individual nascent concatemer template molecules to the immobilized surface capture primers, and the rolling circle amplification reaction can be resumed thereby generating concatemer template molecules immobilized to a support.
[00472] In some embodiments, methods for generating circularized library molecules further comprise step (e): sequencing the plurality of concatemer template molecules immobilized to the support. In some embodiments, the sequencing of step (e) comprises sequencing the first sub-population of concatemer template molecules by conducting up to 1000 sequencing cycles to generate a plurality of first sequencing read products, and sequencing the second sub-population of concatemer template molecules by conducting up to 1000 sequencing cycles to generate a plurality of second sequencing read products. In some embodiments, the concatemer template molecules of the first and second sub-populations can be sequenced essentially simultaneously using a mixture of first and second batch-specific sequencing primers. In some embodiments, the concatemer template molecules of the first and second sub-populations can be sequenced separately in batches using first batch-specific sequencing primers and then using second batch-specific sequencing primers.
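The two primer strategies described above (a mixture for essentially simultaneous sequencing, or separate batches in sequence) can be sketched as a list of sequencing rounds. The function name, the mode labels, and the primer labels in this Python sketch are hypothetical.

```python
def primer_rounds(mode):
    """Return the sequencing rounds as lists of primer sets applied together.

    mode (hypothetical labels):
      "simultaneous" : one round with a mixture of both batch-specific primers
      "batch"        : first batch-specific primers, then second batch-specific primers
    """
    first = "first batch-specific sequencing primers"
    second = "second batch-specific sequencing primers"
    if mode == "simultaneous":
        return [[first, second]]
    if mode == "batch":
        return [[first], [second]]
    raise ValueError(f"unknown mode: {mode}")


for round_number, primers in enumerate(primer_rounds("batch"), start=1):
    print(round_number, primers)
```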
[00473] In some embodiments, step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween of the first batch concatemer template molecules.
[00474] In some embodiments, step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween of the second batch concatemer template molecules.
[00475] In some embodiments, the first batch barcode region of the concatemer template molecules of the first sub-population is selectively sequenced (e.g., left schematics of FIGs. 15B, 16, 17 and 18). In some embodiments, the first batch barcode region and the sample index region of the concatemer template molecules of the first sub-population are sequenced (e.g., left schematics of FIG. 20). In some embodiments, the first batch barcode region and a portion of the first sequence of interest region of the concatemer template molecules of the first sub-population are sequenced (e.g., FIG. 19; left schematic).
[00476] In some embodiments, the second batch barcode region of the concatemer template molecules of the second sub-population is selectively sequenced (e.g., right schematics of FIGs. 15B, 16, 17 and 18). In some embodiments, the second batch barcode region and the sample index region of the concatemer template molecules of the second sub-population are sequenced (e.g., right schematics of FIG. 20). In some embodiments, the second batch barcode region and a portion of the second sequence of interest region of the concatemer template molecules of the second sub-population are sequenced (e.g., right schematics of FIG. 19).
[00477] In some embodiments, 500 million - 1 billion of the concatemer template molecules of the first sub-population can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the concatemer template molecules of the first sub-population can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the concatemer template molecules of the first sub-population can be sequenced.
[00478] In some embodiments, 500 million - 1 billion of the concatemer template molecules of the second sub-population can be sequenced. In some embodiments, up to 1 billion, or up to 2 billion, or up to 3 billion, or up to 4 billion, or up to 5 billion of the concatemer template molecules of the second sub-population can be sequenced. In some embodiments, up to 6 billion, or up to 7 billion, or up to 8 billion, or up to 9 billion, or up to 10 billion of the concatemer template molecules of the second sub-population can be sequenced. In some embodiments, between about 500 million and about 10 billion concatemer template molecules, between about 1 billion and about 9 billion concatemer template molecules, between about 2 billion and about 8 billion concatemer template molecules, between about 3 billion and about 7 billion concatemer template molecules, between about 4 billion and about 5 billion concatemer template molecules, or any range therebetween of the second batch concatemer template molecules can be sequenced.
[00479] In some embodiments, the sequencing of step (e) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00480] In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method.
[00481] In some embodiments, in step (e), the first stage generally comprises contacting the concatemer template molecules of the first sub-population with a plurality of first batch-specific sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage generally comprises contacting the concatemer template molecules of the second sub-population with a plurality of second batch-specific sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the concatemer template molecules of the first and second sub-populations can be sequenced essentially simultaneously using a mixture of first and second batch-specific sequencing primers. In some embodiments, the concatemer template molecules of the first and second sub-populations can be sequenced separately in batches using first batch-specific sequencing primers and then using second batch-specific sequencing primers.
[00482] In some embodiments, in step (e), the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00483] In some embodiments, in step (e), individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a concatemer template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00484] In some embodiments, in step (e), the multivalent-complexed polymerases can be exposed to excitation illumination to induce emission of fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals emitted from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photodamage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photodamage, at least one reducing agent, at least one detergent and at least one viscosity agent.
[00485] In some embodiments, in step (e), prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the concatemer template molecules of the first sub-population. In some embodiments, the concatemer template molecules of the first sub-population can remain immobilized to the support and the first batch-specific sequencing primers can be retained and can remain hybridized to the concatemer template molecules of the first sub-population.
[00486] In some embodiments, in step (e), prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the concatemer template molecules of the second sub-population. In some embodiments, the concatemer template molecules of the second sub-population can remain immobilized to the support and the second batch-specific sequencing primers can be retained and can remain hybridized to the concatemer template molecules of the second sub-population.
[00487] In some embodiments, in step (e), the second stage of the two-stage sequencing method generally comprises contacting the concatemer template molecules of the first sub-population and the retained first batch-specific sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch-specific sequencing primer.
[00488] In some embodiments, in step (e), the second stage of the two-stage sequencing method generally comprises contacting the concatemer template molecules of the second sub-population and the retained second batch-specific sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch-specific sequencing primer.
[00489] In some embodiments, in step (e), the plurality of nucleotides comprises fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00490] In some embodiments, in step (e), the nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00491] In some embodiments, in step (e), nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00492] In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method on the first sub-population of concatemer template molecules, including repeating the first stage and second stage at least once, thereby generating a plurality of first batch sequencing read products. In some embodiments, the sequencing of step (e) comprises conducting a two-stage sequencing method on the second sub-population of concatemer template molecules, including repeating the first stage and second stage at least once, thereby generating a plurality of second batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (e) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
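The two-stage cycle described above can be illustrated with a toy model. The following Python sketch is not the disclosed method; it only assumes that the first stage identifies the next base by binding and imaging without incorporation, and that the second stage extends the primer by one nucleotide. The names `PrimedTemplate`, `first_stage`, `second_stage` and `run_cycles` are hypothetical.

```python
from dataclasses import dataclass, field

# Watson-Crick pairing used to turn a template base into the called base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass
class PrimedTemplate:
    """Toy primed template; `template` lists bases in the order they are copied."""
    template: str
    extended: int = 0                      # bases incorporated so far
    calls: list = field(default_factory=list)


def first_stage(t: PrimedTemplate) -> str:
    """Binding/imaging stage: call the next base without incorporating it."""
    base = COMPLEMENT[t.template[t.extended]]
    t.calls.append(base)                   # stands in for the detected signal
    return base


def second_stage(t: PrimedTemplate) -> None:
    """Stepping stage: extend the primer by exactly one nucleotide."""
    t.extended += 1


def run_cycles(t: PrimedTemplate, n_cycles: int) -> str:
    """One sequencing cycle = one first stage followed by one second stage."""
    remaining = len(t.template) - t.extended
    for _ in range(min(n_cycles, remaining)):
        first_stage(t)
        second_stage(t)
    return "".join(t.calls)
```

In this sketch the read grows by one base per cycle, which mirrors the statement that one sequencing cycle comprises completion of a first and a second stage.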
[00493] In some embodiments, the sequencing of step (e) comprises a reiterative sequencing workflow, which comprises: step (e1) contacting the plurality of concatemer template molecules with (i) a plurality of batch-specific sequencing primers, (ii) a plurality of sequencing polymerases, and (iii) a plurality of nucleotide reagents, under a condition suitable for hybridizing the plurality of batch-specific sequencing primers to their respective batch sequencing primer binding sites on the concatemer template molecules.
[00494] In some embodiments, the reiterative sequencing further comprises step (e2) conducting up to 1000 sequencing cycles to generate at least a first plurality of sequencing read products and optionally a second plurality of sequencing read products.
[00495] In some embodiments, the reiterative sequencing further comprises step (e3) removing the first plurality of sequencing read products from the concatemers and retaining the plurality of concatemer template molecules, and optionally removing the second plurality of sequencing read products from the concatemer template molecules and retaining the plurality of concatemer template molecules.
[00496] In some embodiments, the reiterative sequencing further comprises step (e4) repeating steps (e1) - (e3) at least once. In some embodiments, the reiterative sequencing further comprises step (e4) repeating steps (e1) - (e3) up to 100 times (e.g., between 1 and 100, between 10 and 80, between 20 and 70, between 30 and 50 or between 5 and 40 times, or any range therebetween).
[00497] In some embodiments, the reiterative sequencing can be conducted using a sequencing-by-binding procedure, labeled and/or non-labeled chain-terminating nucleotides, or multivalent molecules. These three sequencing methods are described below.
[00498] In some embodiments, in the reiterative sequencing of step (e1), the plurality of batch-specific sequencing primers can be hybridized to concatemer template molecules with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate buffer) with formamide (e.g., 10-20% formamide).
[00499] In some embodiments, the reiterative sequencing of steps (e1) and (e2) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprises nucleotides, nucleotide analogs and/or multivalent molecules.
[00500] In some embodiments, the reiterative sequencing of steps (e1) and (e2) comprises conducting a two-stage sequencing method which is described above.
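The control flow of the reiterative workflow, steps (e1) through (e4), can be sketched as a loop. The Python below is a simplified illustration under stated assumptions: each immobilized template is modeled as a pair of a primer-binding-site label and its bases in copy order, a read is the complement of the first bases copied, and the function names and the `MAX_CYCLES` constant are hypothetical.

```python
MAX_CYCLES = 1000  # step (e2) is described as conducting up to 1000 cycles

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sequence_pass(templates, primer_site, n_cycles):
    """Steps (e1)-(e2): prime matching templates and read up to n_cycles bases."""
    n_cycles = min(n_cycles, MAX_CYCLES)
    reads = {}
    for name, (site, bases) in templates.items():
        if site == primer_site:            # primer hybridizes only to its own site
            reads[name] = bases[:n_cycles].translate(_COMPLEMENT)
    return reads


def reiterative_sequencing(templates, primer_sites, n_cycles, n_repeats):
    """Run one initial pass plus n_repeats repeats (step (e4))."""
    history = []
    for _ in range(1 + n_repeats):
        pass_reads = {}
        for site in primer_sites:
            pass_reads.update(sequence_pass(templates, site, n_cycles))
        history.append(pass_reads)
        # Step (e3): read products are removed; `templates` is left unchanged,
        # so the next pass starts again from the same immobilized molecules.
    return history
```

Because the templates are retained between passes, each repeat in this model re-reads the same molecules, which is the property the reiterative workflow relies on.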
Generating Circularized Library Molecules using Single-Stranded Splint Strands
[00501] The present disclosure provides methods for generating circularized library molecules comprising step (a): providing a plurality of linear single stranded library molecules (100). In some embodiments, individual library molecules comprise the following components arranged in any order: (i) surface pinning primer binding site sequence (120) (e.g., batch-specific pinning primer binding site sequence); (ii) a left unique identification sequence (e.g., UMI) (180); (iii) a batch barcode sequence (195); (iv) a left sample index sequence (160); (v) a forward sequencing primer binding site sequence (140) (e.g., a batch-specific forward sequencing primer binding site sequence); (vi) a sequence of interest (e.g., insert sequence) (110); (vii) a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); (viii) a right sample index sequence (170); and/or (ix) a surface capture primer binding site sequence (130) (e.g., a batch-specific capture primer binding site sequence). Embodiments of various single-stranded library molecules are shown in FIGs. 21, 22, 23A, 23B, 25A and 25B.
[00502] In some embodiments, in step (a), individual linear single stranded library molecules (100) lack any one or any combination of: a left unique identification sequence (e.g., UMI) (180); a batch barcode sequence (195); a left sample index sequence (160); a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); and/or a right sample index sequence (170).
[00503] In some embodiments, in step (a), the left and right sample index sequences can be used to distinguish insert sequences (e.g., sequences of interest) that are isolated from different sample sources in a multiplex assay. The left sample index sequences (160) and/or the right sample index sequences (170) can be employed to prepare separate sample-indexed libraries using input nucleic acids isolated from different sources. The sample-indexed libraries can be pooled together to generate a multiplex library mixture, and the pooled libraries can be circularized, amplified and/or sequenced. In some embodiments, the sequences of the left sample index (160) and the right sample index (170) are the same or different from each other. The left sample index sequence (160) can be 3-20 nucleotides in length. The right sample index sequence (170) can be 3-20 nucleotides in length. The left sample index sequence (160) and/or the right sample index sequence (170) can include a short random sequence (e.g., NNN). The short random sequence can be 3-20 nucleotides in length. The left sample index sequence (160) and/or the right sample index sequence (170) can be batch-specific index sequences, i.e., the sequence or sequences of the index sequences correspond to a particular batch in a batch-sequencing workflow.
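How sample index sequences separate pooled reads can be shown with a short demultiplexing sketch. This Python example is illustrative only: it assumes the index has already been read and sits at the start of each read string, requires exact matches, and uses hypothetical names (`demultiplex`, `sample_by_index`). Only the 3-20 nucleotide index length comes from the text.

```python
def demultiplex(reads, sample_by_index, index_len):
    """Group insert sequences by the sample index found at the start of each read."""
    if not 3 <= index_len <= 20:
        raise ValueError("index length is expected to be 3-20 nucleotides")
    bins = {}
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        # Reads whose index matches no known sample are kept separately.
        sample = sample_by_index.get(index, "undetermined")
        bins.setdefault(sample, []).append(insert)
    return bins
```

A practical pipeline would usually tolerate a small number of index mismatches; exact matching is used here to keep the sketch short.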
[00504] In some embodiments, in step (a), the unique identification sequence such as the left unique identification sequence (180) (e.g., a unique molecular tag) can be used to uniquely identify individual nucleic acid library molecules to which the unique identification sequence is appended.
[00505] In some embodiments, in step (a), the plurality of linear single stranded library molecules (100) includes at least a first and second sub-population of linear single stranded nucleic acid library molecules ((100-1) and (100-2)).
[00506] In some embodiments, in step (a), individual single-stranded library molecules of the first sub-population (100-1) comprise any combination of two or more of: (i) a first batch surface pinning primer binding site sequence (120-1); (ii) a unique identification sequence (e.g., UMI) (180-1); (iii) a first batch barcode sequence (195-1); (iv) a left sample index sequence (160-1); (v) a first batch forward sequencing primer binding site sequence (140-1); (vi) a first sequence of interest (e.g., first insert sequence) (110-1); (vii) a first batch reverse sequencing primer binding site sequence (150-1); (viii) a right sample index sequence (170-1); and/or (ix) a surface capture primer binding site sequence (130-1).
[00507] In some embodiments, the single-stranded library molecules within the first sub-population (100-1) have the same first batch forward sequencing primer binding site sequence (140-1), and have the same or different first sequence(s) of interest (110-1).
[00508] In some embodiments, in step (a), the sequence of the first batch forward sequencing primer binding site sequence (140-1) corresponds to the first sequence of interest (110-1) in the same library molecule, or the first batch forward sequencing primer binding site sequence (140-1) corresponds to one of the first sequences of interest (110-1) in the first sub-population of library molecules.
[00509] In some embodiments, in step (a), a pre-determined first batch forward sequencing primer binding site sequence (140-1) can be linked to a given sequence of interest (110-1) in the first sub-population (or can be linked to different sequences of interest in a first sub-population), thus the pre-determined first batch forward sequencing primer binding site sequence (140-1) corresponds to a given sequence of interest (110-1) in the first sub-population.
[00510] In some embodiments, in step (a), the single-stranded library molecules (100-1) within the first sub-population have the same first batch barcode sequence (195-1), and have the same or different first sequence(s) of interest (110-1).
[00511] In some embodiments, in step (a), the sequence of the first batch barcode sequence (195-1) corresponds to the first sequence of interest (110-1), or the first batch barcode sequence (195-1) corresponds to one of the first sequences of interest (110-1) in the first sub-population.
[00512] In some embodiments, in step (a), a pre-determined first batch barcode sequence (195-1) can be linked to a given sequence of interest (110-1) in the first sub-population (or can be linked to different sequences of interest in a first sub-population), thus the pre-determined first batch barcode sequence (195-1) corresponds to a given sequence of interest (110-1) in the first sub-population.
[00513] In some embodiments, in step (a), the sequences of interest (110-1) in the first sub-population are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00514] In some embodiments, in step (a), individual single-stranded library molecules of the second sub-population (100-2) comprise any combination of two or more of: (i) a second batch surface pinning primer binding site sequence (120-2); (ii) a unique identification sequence (e.g., UMI) (180-2); (iii) a second batch barcode sequence (195-2); (iv) a left sample index sequence (160-2); (v) a second batch forward sequencing primer binding site sequence (140-2); (vi) a second sequence of interest (e.g., second insert sequence) (110-2); (vii) a second batch reverse sequencing primer binding site sequence (150-2); (viii) a right sample index sequence (170-2); and/or (ix) a surface capture primer binding site sequence (130-2).
[00515] In some embodiments, in step (a), the single-stranded library molecules within the second sub-population (100-2) have the same second batch forward sequencing primer binding site sequence (140-2), and have the same or different second sequence(s) of interest (110-2).
[00516] In some embodiments, in step (a), the sequence of the second batch forward sequencing primer binding site sequence (140-2) corresponds to the second sequence of interest (110-2), or the second batch forward sequencing primer binding site sequence (140-2) corresponds to one of the second sequences of interest (110-2) in the second sub-population of library molecules.
[00517] In some embodiments, in step (a), a pre-determined second batch forward sequencing primer binding site sequence (140-2) can be linked to a given sequence of interest (110-2) in the second sub-population (or can be linked to different sequences of interest in a second sub-population), thus the pre-determined second batch forward sequencing primer binding site sequence (140-2) corresponds to a given sequence of interest (110-2) in the second sub-population.
[00518] In some embodiments, in step (a), the single-stranded library molecules (100-2) within the second sub-population have the same second batch barcode sequence (195-2), and have the same or different second sequence(s) of interest (110-2).
[00519] In some embodiments, in step (a), the sequence of the second batch barcode sequence (195-2) corresponds to the second sequence of interest (110-2), or the second batch barcode sequence (195-2) corresponds to one of the second sequences of interest (110-2) in the second sub-population.
[00520] In some embodiments, in step (a), a pre-determined second batch barcode sequence (195-2) can be linked to a given sequence of interest (110-2) in the second sub-population (or can be linked to different sequences of interest in a second sub-population), thus the pre-determined second batch barcode sequence (195-2) corresponds to a given sequence of interest (110-2) in the second sub-population.
[00521] In some embodiments, in step (a), the sequences of interest (110-2) in the second sub-population are about 50-250 bases in length, or about 250-500 bases in length, or about 500-800 bases in length, or about 800-1200 bases in length, or any range therebetween, or up to 2000 bases in length.
[00522] In some embodiments, in step (a), the sequences of the first and second batch surface pinning primer binding site sequences ((120-1) and (120-2)) are the same or different.
[00523] In some embodiments, in step (a), the sequences of the first and second batch forward sequencing primer binding site sequences ((140-1) and (140-2)) are the same or different.
[00524] In some embodiments, in step (a), the sequences of the first and second batch barcode sequences ((195-1) and (195-2)) are the same or different.
[00525] In some embodiments, in step (a), the sequences of the first and second batch reverse sequencing primer binding site sequences ((150-1) and (150-2)) are the same or different.
[00526] In some embodiments, in step (a), the sequences of the first and second batch capture primer binding site sequences ((130-1) and (130-2)) are the same or different.
[00527] In some embodiments, the method for generating circularized library molecules further comprises step (b): providing a plurality of single-stranded splint strands (200). In some embodiments, individual single-stranded splint strands (200) comprise regions arranged in any order: (i) a first region (210) having a sequence that hybridizes with the surface pinning primer binding site sequence (120) of the single stranded library molecule, and (ii) a second region (220) having a sequence that hybridizes with the surface capture primer binding site sequence (130) of the single stranded library molecule.
[00528] In some embodiments, the method for generating circularized library molecules further comprises step (c): contacting the plurality of linear single stranded library molecules (100) with the plurality of single-stranded splint strands (200). In some embodiments, the contacting is conducted under a condition suitable to hybridize individual linear single stranded library molecules (100) with individual single-stranded splint strands (200) thereby circularizing the library molecule to generate a library-splint complex (300). In some embodiments, the first region (210) of the single-stranded splint strand is hybridized to the surface pinning primer binding site sequence (120) of the single stranded library molecule (100). In some embodiments, the second region (220) of the single-stranded splint strand is hybridized to the surface capture primer binding site sequence (130) of the same single stranded library molecule (100). In some embodiments, the library-splint complex (300) comprises a nick between the terminal 5’ and 3’ ends of the library molecule. In some embodiments, the nick is enzymatically ligatable (e.g., see FIGs. 21, 22, 23A, 23B, 25A and 25B).
[00529] In some embodiments, the method for generating circularized library molecules further comprises step (d): ligating the nick in the plurality of library-splint complexes (300) thereby generating a plurality of covalently closed circular library molecules (400) each hybridized to a single-stranded splint strand (200).
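The geometry of splint-mediated circularization in steps (b) through (d) can be checked with a small sequence sketch. The Python below is an illustration under assumptions that the text does not fix: the pinning primer binding site (120) is taken to sit at the 5’ end of the library molecule and the capture primer binding site (130) at the 3’ end, and each splint region is taken to be the full reverse complement of its site. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_splint(pinning_site: str, capture_site: str) -> str:
    """Return a splint (5'->3') that bridges the library's 3' and 5' ends.

    Assumes the pinning site is at the library 5' end and the capture site
    is at the library 3' end (one arrangement among those permitted).
    """
    first_region = revcomp(pinning_site)    # region (210), pairs with site (120)
    second_region = revcomp(capture_site)   # region (220), pairs with site (130)
    return first_region + second_region


def junction_after_circularization(library: str, n_5prime: int, n_3prime: int) -> str:
    """Sequence spanning the ligated nick: last n_3prime bases, then first n_5prime."""
    return library[-n_3prime:] + library[:n_5prime]
```

With these assumptions, the reverse complement of the splint equals the junction sequence, meaning the splint holds the two library ends adjacent so that only a ligatable nick remains.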
[00530] In some embodiments, the methods for generating circularized library molecules described herein can further comprise at least one enzymatic reaction, including a phosphorylation reaction, ligation reaction and/or exonuclease reaction. The enzymatic reactions can be conducted sequentially or essentially simultaneously. The enzymatic reactions can be conducted in a single reaction vessel. Alternatively, a first enzymatic reaction can be conducted in a first reaction vessel, then transferred to a second reaction vessel where the second enzymatic reaction is conducted, then transferred to a third reaction vessel where the third enzymatic reaction is conducted.
[00531] In some embodiments, the methods for generating circularized library molecules described herein further comprise conducting separate and sequential phosphorylation and ligation reactions which are conducted in separate reaction vessels. In some embodiments, the methods for generating circularized library molecules further comprise step (c1): contacting in a first reaction vessel the plurality of the single-stranded splint strands (200) and the plurality of the linear single stranded library molecules (100) with a T4 polynucleotide kinase enzyme under a condition suitable to phosphorylate the 5’ ends of the plurality of single-stranded splint strands (200) and/or the plurality of linear single stranded library molecules (100); and transferring the phosphorylation reaction to a second reaction vessel. In some embodiments, the methods for generating circularized library molecules further comprise step (d1): contacting in the second reaction vessel the plurality of phosphorylated single-stranded splint strands (200) and the plurality of linear single stranded nucleic acid library molecules (100) which are phosphorylated with a ligase, under a condition suitable to enzymatically ligate the nicks, thereby generating a plurality of covalently closed circular library molecules (400) each hybridized to a single-stranded splint strand (200). In some embodiments, the ligase enzyme comprises T7 DNA ligase, T3 ligase, T4 ligase, or Taq ligase.
[00532] In some embodiments, the methods for generating circularized library molecules described herein further comprise conducting sequential phosphorylation and ligation reactions which are conducted sequentially in the same reaction vessel. In some embodiments, the methods for generating circularized library molecules further comprise step (c2): contacting in a first reaction vessel the plurality of the single-stranded splint strands (200) and the plurality of the linear single stranded nucleic acid library molecules (100) with a T4 polynucleotide kinase enzyme under a condition suitable to phosphorylate the 5’ ends of the plurality of single-stranded splint strands (200) and the plurality of linear single stranded nucleic acid library molecules (100), thereby generating phosphorylated single stranded nucleic acid library molecules. In some embodiments, the methods for generating circularized library molecules further comprise step (d2): contacting in the same first reaction vessel the phosphorylated single-stranded splint strands (200) and the phosphorylated single-stranded nucleic acid library molecules with a ligase under a condition suitable to enzymatically ligate the nicks, thereby generating a plurality of covalently closed circular library molecules (400) each hybridized to a single-stranded splint strand (200). In some embodiments, the ligase enzyme comprises T7 DNA ligase, T3 ligase, T4 ligase, or Taq ligase.
[00533] In some embodiments, the methods for generating circularized library molecules described herein further comprise conducting essentially simultaneous phosphorylation and ligation reactions which are conducted together in the same reaction vessel. In some embodiments, the methods for generating circularized library molecules further comprise step (c3): contacting in a first reaction vessel the plurality of the single-stranded splint strands (200) and the plurality of the linear single stranded nucleic acid library molecules (100) with (i) a T4 polynucleotide kinase enzyme and (ii) a ligase enzyme, under a condition suitable to phosphorylate the 5’ ends of the plurality of single-stranded splint strands (200) and the plurality of linear single stranded nucleic acid library molecules (100), and suitable to enzymatically ligate the nicks, thereby generating a plurality of covalently closed circular library molecules (400) each hybridized to a single-stranded splint strand (200). In some embodiments, the ligase enzyme comprises T7 DNA ligase, T3 ligase, T4 ligase, or Taq ligase.
[00534] In some embodiments, the methods for generating circularized library molecules further comprise the optional step of enzymatically removing the plurality of single-stranded splint strands (200) from the plurality of covalently closed circular library molecules (400), which comprises the step: contacting the plurality of covalently closed circular library molecules (400) with at least one exonuclease enzyme to remove the plurality of single-stranded splint strands (200) and retaining the plurality of covalently closed circular library molecules (400). In some embodiments, the exonuclease reaction can be conducted in the same reaction buffer used to conduct the phosphorylation and/or ligation reactions, or in a different reaction buffer. In some embodiments, the exonuclease reaction can be conducted in a third reaction vessel after conducting the phosphorylation reaction in the first reaction vessel (step c1, see above), and conducting the ligation reaction in the second reaction vessel (step d1, see above). In some embodiments, the exonuclease reaction can be conducted in the first reaction vessel after conducting the phosphorylation reaction in the first reaction vessel (step c2, see above), and conducting the sequential ligation reaction in the first reaction vessel (step d2, see above). In some embodiments, the exonuclease reaction can be conducted in the first reaction vessel after conducting the essentially simultaneous phosphorylation and ligation reactions in the first reaction vessel (step c3, see above). In some embodiments, the at least one exonuclease enzyme comprises any combination of two or more of exonuclease I, thermolabile exonuclease I and/or T7 exonuclease.
[00535] In some embodiments, the covalently closed circular library molecules (400) can be subjected to rolling circle amplification and sequencing (e.g., batch sequencing) as described herein.
[00536] In some embodiments, the surface pinning primer binding site sequence (120) in the library molecules comprises the sequence 5’- CATGTAATGCACGTACTTTCAGGGT - 3’ (SEQ ID NO: 18).
[00537] In some embodiments, the surface pinning primer binding site sequence (120) in the library molecules comprises the sequence 5’- AATGATACGGCGACCACCGA-3’ (SEQ ID NO: 19).
[00538] In some embodiments, the forward sequencing primer binding site sequence (140) in the library molecules comprises the sequence 5’- CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT -3’ (SEQ ID NO: 1).
[00539] In some embodiments, the forward sequencing primer binding site sequence (140) in the library molecules comprises the sequence 5’- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3’ (SEQ ID NO: 2).
[00540] In some embodiments, the forward sequencing primer binding site sequence (140) in the library molecules comprises the sequence 5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG -3’ (SEQ ID NO: 20).
[00541] In some embodiments, the reverse sequencing primer binding site sequence (150) in the library molecules comprises the sequence 5’- ATGTCGGAAGGTGTGCAGGCTACCGCTTGTCAACT -3’ (SEQ ID NO: 21).
[00542] In some embodiments, the reverse sequencing primer binding site sequence (150) in the library molecules comprises the sequence 5’- AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -3’ (SEQ ID NO: 22).
[00543] In some embodiments, the reverse sequencing primer binding site sequence (150) in the library molecules comprises the sequence 5’- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -3’ (SEQ ID NO: 23).
[00544] In some embodiments, the surface capture primer binding site sequence (130) in the library molecules comprises the sequence 5’- AGTCGTCGCAGCCTCACCTGATC -3’ (SEQ ID NO: 24).
[00545] In some embodiments, the surface capture primer binding site sequence (130) in the library molecules comprises the sequence 5’ - TCGTATGCCGTCTTCTGCTTG -3’ (SEQ ID NO: 25).
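The listed sequences can be combined into a toy library molecule to show how a batch-specific forward sequencing primer binding site might be recognized. The Python sketch below copies SEQ ID NOs: 1, 2, 18, 20 and 24 from the paragraphs above; the component order (pinning site, forward site, insert, capture site), the omission of index, barcode and reverse primer components, the insert sequence, and the function names are all assumptions made for illustration.

```python
# Forward sequencing primer binding site sequences (140) listed above.
FORWARD_SITES = {
    "SEQ ID NO: 1": "CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT",
    "SEQ ID NO: 2": "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "SEQ ID NO: 20": "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG",
}
PINNING_SITE = "CATGTAATGCACGTACTTTCAGGGT"  # SEQ ID NO: 18, site (120)
CAPTURE_SITE = "AGTCGTCGCAGCCTCACCTGATC"    # SEQ ID NO: 24, site (130)


def assemble_library_molecule(forward_site: str, insert: str) -> str:
    """Concatenate components 5'->3' in one assumed order (hypothetical layout)."""
    return PINNING_SITE + forward_site + insert + CAPTURE_SITE


def forward_sites_present(molecule: str) -> list:
    """Names of the listed forward sites found as exact substrings."""
    return [name for name, seq in FORWARD_SITES.items() if seq in molecule]
```

If two sub-populations carry different forward sites, a lookup of this kind indicates which batch-specific sequencing primer would be expected to hybridize to a given molecule.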
Generating Circularized Library Molecules using Double-Stranded Splint Adaptors
[00546] The present disclosure provides reagents, kits and methods for preparing circularized library molecules. In some embodiments, the circularized library molecules are prepared by hybridizing any of the linear library molecules described herein with a plurality of double-stranded splint adaptors (500) to generate a plurality of library-splint complexes (800), each of which includes two nicks (e.g., see FIGs. 27, 28, 29, 30A and 30B). The nicks can be enzymatically ligated to generate covalently closed circular molecules (900) in which the second splint strand (700) is covalently joined at both ends to the linear single stranded library molecule (100), thereby introducing the new adaptor sequences into the circularized library molecule.
[00547] The present disclosure provides methods for forming a plurality of library-splint complexes (800) comprising: (a) providing a plurality of linear single stranded nucleic acid library molecules (100). In some embodiments, individual library molecules comprise: (i) a left universal adaptor sequence having a first surface pinning primer binding site sequence (120); (ii) a left universal adaptor sequence having a forward sequencing primer binding site sequence (140); (iii) a sequence of interest (110); (iv) a right universal adaptor sequence having a reverse sequencing primer binding site sequence (150); and (v) a right universal adaptor sequence having a second surface capture primer binding site sequence (130). In some embodiments, the left universal adaptor sequence (120) comprises a binding sequence for a first surface primer P5. In some embodiments, the right universal adaptor sequence (130) comprises a binding sequence for a second surface primer P7. In some embodiments, individual linear library molecules further comprise a left sample index sequence (160) and/or a right sample index sequence (170). The left and right sample index sequences can be used to distinguish insert sequences that are isolated from different sample sources in a multiplex assay. The left index sequence (160) can include a random sequence (e.g., NNN) or lack a random sequence. The right index sequence (170) can include a random sequence (e.g., NNN) or lack a random sequence. Exemplary single-stranded library molecules are shown in FIGs. 27, 28, 29, 30A and 30B.
[00548] The methods for forming a plurality of library-splint complexes (800) can further comprise step (b): hybridizing the plurality of linear single stranded nucleic acid library molecules (100) with a plurality of double-stranded splint adaptors (500). In some embodiments, individual double-stranded splint adaptors (500) in the plurality comprise a first splint strand (600) hybridized to a second splint strand (700). In some embodiments, the double-stranded splint adaptor includes a double-stranded region and two flanking single-stranded regions. In some embodiments, the first splint strand comprises a first region (620), an internal region (610), and a second region (630). In some embodiments, the internal region of the first splint strand (610) is hybridized to the second splint strand (700). In some embodiments, the first splint strand (600) comprises regions arranged in a 5’ to 3’ order: a first region (620), an internal region (610), and a second region (630). In some embodiments, the second splint strand (700) comprises regions arranged in a 5’ to 3’ order: (i) a second subregion having a universal binding sequence for a fourth surface primer, and (ii) a first subregion having a universal binding sequence for a third surface primer. The universal binding sequences for the third surface primer do not bind the first surface primer (e.g., P5) or the second surface primer (e.g., P7). The universal binding sequences for the fourth surface primer do not bind the first surface primer (e.g., P5) or the second surface primer (e.g., P7). Exemplary double-stranded splint adaptors (500) are shown in, e.g., FIGs. 27, 28, 29, 30A and 30B.
[00549] The hybridizing of step (b) is conducted under a condition suitable for hybridizing the first region of the first splint strand (620) to the at least first left universal adaptor sequence (120) (e.g., the surface pinning primer binding site sequence) of the library molecule, and the condition is suitable for hybridizing the second region of the first splint strand (630) to the at least first right universal sequence (130) (e.g., the surface capture primer binding site sequence) of the library molecule, thereby circularizing the plurality of library molecules to form a plurality of library-splint complexes (800). The library-splint complex (800) comprises a first nick between the 5’ end of the library molecule and the 3’ end of the second splint strand (e.g., see FIGs. 27, 28, 29, 30A and 30B). The library-splint complex (800) also comprises a second nick between the 5’ end of the second splint strand and the 3’ end of the library molecule (e.g., see FIGs. 27, 28, 29, 30A and 30B). In some embodiments, the first and second nicks are enzymatically ligatable.
[00550] In some embodiments, in the methods for forming a plurality of library-splint complexes (800), the 5’ end of the first splint strand (600) is phosphorylated or lacks a phosphate group. In some embodiments, the 3’ end of the first splint strand (600) includes a terminal 3’ OH group or a terminal 3’ blocking group.
[00551] In some embodiments, in the methods for forming a plurality of library-splint complexes (800), the 5’ end of the second splint strand (700) is phosphorylated or lacks a phosphate group. In some embodiments, the 3’ end of the second splint strand (700) includes a terminal 3’ OH group or a terminal 3’ blocking group.
[00552] In some embodiments, in the methods for forming a plurality of library-splint complexes (800), the first region of the first splint strand (620) can hybridize to a sense or anti-sense strand of a double-stranded nucleic acid library molecule. In the library-splint complex (800), the second region of the first splint strand (630) can hybridize to a sense or anti-sense strand of a double-stranded nucleic acid library molecule. The double-stranded nucleic acid library molecule can be denatured to generate the single-stranded sense and antisense library strands.
[00553] In some embodiments, in the methods for forming a plurality of library-splint complexes (800), the second splint strand (700) does not hybridize to the sequence of interest (110), and the internal region of the first splint strand (610) does not hybridize to the sequence of interest (110).
[00554] In some embodiments, in the methods for forming a plurality of library-splint complexes (800), the first region of the first splint strand (620) does not hybridize to the sequence of interest (110), and the second region of the first splint strand (630) does not hybridize to the sequence of interest (110).
[00555] In some embodiments, in the methods for forming a plurality of library-splint complexes (800), the 5’ end of the linear single stranded library molecule (100) is phosphorylated or lacks a phosphate group. In some embodiments, the 3’ end of the single-stranded library molecule includes a terminal 3’ OH group or a terminal 3’ blocking group.
[00556] The methods for forming a plurality of library-splint complexes (800) further comprise step (c): contacting the plurality of library-splint complexes (800) from step (b) with a ligase, under a condition suitable to enzymatically ligate the first and second nicks, thereby generating a plurality of covalently closed circular library molecules (900) each hybridized to the first splint strand (600). In some embodiments, the ligase enzyme comprises T7 DNA ligase, T3 ligase, T4 ligase, or Taq ligase.
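As a non-limiting illustration of the end-group options recited above, the following minimal sketch checks whether both nicks of a library-splint complex (800) could be sealed. It assumes the general requirement of DNA ligases for a 5’ phosphate abutting a free 3’ OH, which is standard biochemistry and is not recited in this paragraph; the names `Strand`, `nick_is_ligatable` and `can_circularize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    """End chemistry of one strand in a library-splint complex (hypothetical model)."""
    name: str
    five_prime_phosphate: bool  # True if the 5' end carries a phosphate group
    three_prime_oh: bool        # True if the 3' end is a free OH (no blocking group)


def nick_is_ligatable(five_prime_side: Strand, three_prime_side: Strand) -> bool:
    """Assumption: a nick is sealable only when the 5' end at the nick is
    phosphorylated and the abutting 3' end carries a free OH."""
    return five_prime_side.five_prime_phosphate and three_prime_side.three_prime_oh


def can_circularize(library: Strand, second_splint: Strand) -> bool:
    # First nick: 5' end of the library molecule / 3' end of the second splint strand.
    first_nick = nick_is_ligatable(library, second_splint)
    # Second nick: 5' end of the second splint strand / 3' end of the library molecule.
    second_nick = nick_is_ligatable(second_splint, library)
    return first_nick and second_nick


if __name__ == "__main__":
    library = Strand("library molecule (100)", True, True)
    second_splint = Strand("second splint strand (700)", True, True)
    print(can_circularize(library, second_splint))
```

Under this model, a library molecule lacking a 5’ phosphate, or a second splint strand carrying a 3’ blocking group, would leave one nick unsealed, so a covalently closed circle (900) would not form.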
[00557] The methods for forming a plurality of library-splint complexes (800) can further comprise an optional step (d): enzymatically removing the plurality of first splint strands (600) from the plurality of covalently closed circular library molecules (900) by contacting the plurality of covalently closed circular library molecules (900) with at least one exonuclease enzyme to remove the plurality of first splint strands (600) and retaining the plurality of covalently closed circular library molecules (900). In some embodiments, the at least one exonuclease enzyme comprises any combination of two or more of exonuclease I, thermolabile exonuclease I and/or T7 exonuclease.
[00558] In some embodiments, the covalently closed circular library molecules (900) can be subjected to rolling circle amplification and sequencing (e.g., batch sequencing) which are described herein.
[00559] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecules can include a left universal binding sequence (e.g., a pinning primer binding site sequence (120)) which binds the first region of the first splint strand (620). In some embodiments, the left universal binding sequence comprises the sequence
5’- AATGATACGGCGACCACCGA-3’ (SEQ ID NO: 19).
[00560] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecules can include a left universal binding sequence (e.g., a pinning primer binding site sequence (120)). In some embodiments, the left universal binding sequence comprises the sequence 5’- CATGTAATGCACGTACTTTCAGGGT -3’ (SEQ ID NO: 18).
[00561] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a forward sequencing primer binding site sequence (140) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3’ (SEQ ID NO: 2).
[00562] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a forward sequencing primer binding site sequence (140) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG -3’ (SEQ ID NO: 20).
[00563] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a first sequencing primer binding site sequence (140) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT -3’ (SEQ ID NO: 1).
[00564] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a reverse sequencing primer binding site sequence (150) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -3’ (SEQ ID NO: 22).
[00565] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a reverse sequencing primer binding site sequence (150) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- CTGTCTCTTATACACATCTCCGAGCCCACGAGAC -3’ (SEQ ID NO: 23).
[00566] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a reverse sequencing primer binding site sequence (150) comprising a universal binding sequence for a sequencing primer. In some embodiments, the universal binding sequence comprises the sequence 5’- ATGTCGGAAGGTGTGCAGGCTACCGCTTGTCAACT -3’ (SEQ ID NO: 21).
[00567] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a surface capture primer binding site sequence (130) that is a universal binding sequence, and which binds the second region of the first splint strand (630). In some embodiments, the universal binding sequence comprises the sequence
5’- TCGTATGCCGTCTTCTGCTTG -3’ (SEQ ID NO: 25).
[00568] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the library molecule includes a surface capture primer binding site sequence (130) that is a universal binding sequence, and comprises the sequence 5’- AGTCGTCGCAGCCTCACCTGATC -3’ (SEQ ID NO: 24).
[00569] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the first sub-region of the second splint strand (700) comprises the sequence 5’- CATGTAATGCACGTACTTTCAGGGT-3’ (SEQ ID NO: 18). In some embodiments, the second sub-region of the second splint strand (700) comprises the sequence 5’-AGTCGTCGCAGCCTCACCTGATC-3’ (SEQ ID NO: 24). In some embodiments, the second splint strand (700) comprises the first and second sub-regions and comprises the sequence 5’- AGTCGTCGCAGCCTCACCTGATCCATGTAATGCACGTACTTTCAGGGT-3’ (SEQ ID NO: 26).
[00570] In some embodiments, in any of the methods for forming a plurality of library-splint complexes (800) described herein, the first region of the first splint strand (620) includes a first universal adaptor sequence which comprises a universal binding sequence (or a complementary sequence thereof) for a first surface capture primer (also referred to as a surface primer). In some embodiments, the first region (620) comprises the sequence 5’- TCGGTGGTCGCCGTATCATT-3’ (SEQ ID NO: 27). For example, the first region of the first splint strand (620) can hybridize to a P5 surface primer or a complementary sequence of the P5 surface primer. For example, the P5 surface primer comprises the sequence 5’- AATGATACGGCGACCACCGA-3’ (short P5; SEQ ID NO: 19), or the P5 surface primer comprises the sequence 5’- AATGATACGGCGACCACCGAGATC-3’ (long P5; SEQ ID NO: 28). In some embodiments, the second region of the first splint strand (630) includes a second universal adaptor sequence which comprises a universal binding sequence (or a complementary sequence thereof) for a second surface primer. In some embodiments, the second region (630) comprises the sequence 5’- CAAGCAGAAGACGGCATACGA -3’ (SEQ ID NO: 29). For example, the second region of the first splint strand (630) can hybridize to a P7 surface primer or a complementary sequence of the P7 surface primer. For example, the P7 surface primer comprises the sequence 5’- CAAGCAGAAGACGGCATACGA -3’ (short P7; SEQ ID NO: 29), or the P7 surface primer comprises the sequence 5’- CAAGCAGAAGACGGCATACGAGAT-3’ (long P7; SEQ ID NO: 30). In some embodiments, the first splint strand (600) includes an internal region (610) which comprises a fourth sub-region having the sequence 5’- ACCCTGAAAGTACGTGCATTACATG-3’ (SEQ ID NO: 31). In some embodiments, the first splint strand (600) includes an internal region (610) which comprises a fifth sub-region having the sequence 5’- GATCAGGTGAGGCTGCGACGACT -3’ (SEQ ID NO: 32).
In some embodiments, the first splint strand (600) comprises a first region (620), an internal region (610) having a fourth and fifth sub-region, and a second region (630), having the sequence 5’- TCGGTGGTCGCCGTATCATTACCCTGAAAGTACGTGCATTACATGGATCAGGTGAGGCTGCGACGACTCAAGCAGAAGACGGCATACGA-3’ (SEQ ID NO: 33).
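As a non-limiting illustration, the relationships among the exemplary splint sequences recited above can be checked computationally. The sketch below uses only sequences quoted in this section; the function name `check_splint_design` and the variable names are hypothetical. It tests that SEQ ID NO: 33 is the concatenation of the first region, the fourth and fifth sub-regions and the second region; that SEQ ID NO: 26 is the concatenation of SEQ ID NOs: 24 and 18; and that the first splint strand is complementary to the 3’ end of the library molecule, the second splint strand and the 5’ end of the library molecule, in that order, so that it spans both nicks.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the SEQ ID NOs quoted in the text.
SEQ18 = "CATGTAATGCACGTACTTTCAGGGT"   # first sub-region of second splint strand (700)
SEQ19 = "AATGATACGGCGACCACCGA"        # left binding sequence (120), short P5
SEQ24 = "AGTCGTCGCAGCCTCACCTGATC"     # second sub-region of second splint strand (700)
SEQ25 = "TCGTATGCCGTCTTCTGCTTG"       # right binding sequence (130)
SEQ26 = "AGTCGTCGCAGCCTCACCTGATCCATGTAATGCACGTACTTTCAGGGT"  # second splint strand
SEQ27 = "TCGGTGGTCGCCGTATCATT"        # first region (620)
SEQ29 = "CAAGCAGAAGACGGCATACGA"       # second region (630), short P7
SEQ31 = "ACCCTGAAAGTACGTGCATTACATG"   # fourth sub-region of internal region (610)
SEQ32 = "GATCAGGTGAGGCTGCGACGACT"     # fifth sub-region of internal region (610)
SEQ33 = ("TCGGTGGTCGCCGTATCATTACCCTGAAAGTACGTGCATTACATGGATCAGGTGA"
         "GGCTGCGACGACTCAAGCAGAAGACGGCATACGA")  # first splint strand (600)


def check_splint_design() -> dict:
    """Return a dict of named consistency checks (True means the check passes)."""
    return {
        "first_strand_is_620_610_630": SEQ33 == SEQ27 + SEQ31 + SEQ32 + SEQ29,
        "second_strand_is_24_then_18": SEQ26 == SEQ24 + SEQ18,
        "region_620_pairs_with_120": reverse_complement(SEQ19) == SEQ27,
        "region_630_pairs_with_130": reverse_complement(SEQ25) == SEQ29,
        "internal_610_pairs_with_700": reverse_complement(SEQ26) == SEQ31 + SEQ32,
        # Read 5' to 3' along the circle: library 3' end, second splint, library 5' end.
        "first_strand_spans_both_nicks":
            reverse_complement(SEQ33) == SEQ25 + SEQ26 + SEQ19,
    }


if __name__ == "__main__":
    for name, ok in check_splint_design().items():
        print(f"{name}: {ok}")
```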
Rolling Circle Amplification and Sequencing for ss-Splint and ds-Splint
[00571] In some embodiments, covalently closed circular library molecules (e.g., (400) and (900)) can be generated using linear single stranded library molecules (100) and either single-stranded splint strands (200) (e.g., FIGs. 21, 22, 23A, 23B, 25A and 25B) or double-stranded splint adaptors (500) (e.g., FIGs. 27, 28, 29, 30A and 30B), as described above. In some embodiments, the covalently closed circular library molecules (e.g., (400) and (900)) can be subjected to a rolling circle amplification (RCA) reaction.
[00572] In some embodiments, the method for generating circularized library molecules further comprises step (e): conducting a rolling circle amplification reaction by hybridizing the plurality of covalently closed circular library molecules (e.g., (400) or (900)) with a plurality of amplification primers and conducting the rolling circle amplification reaction in a template-dependent manner, using a plurality of strand displacing polymerases and a plurality of nucleotides, thereby generating a plurality of concatemer template molecules. In some embodiments, the plurality of covalently closed circular library molecules (e.g., (400) or (900)) comprises first and second sub-populations of covalently closed circular library molecules.
[00573] In some embodiments, the rolling circle amplification reaction comprises hybridizing first and second sub-populations of covalently closed circular library molecules (e.g., (400) or (900)) to first and second amplification primers, respectively. In some embodiments, the first and second amplification primers can be immobilized to a support (e.g., first and second capture primers), or the first and second amplification primers can be in solution. In some embodiments, the first and second amplification primers have the same sequence or have different sequences. In some embodiments, the first and second amplification primers having different sequences comprise first and second batch amplification primers. In some embodiments, the rolling circle amplification reaction is conducted in a template-dependent manner, using a plurality of strand displacing polymerases, a plurality of nucleotides, and the first and second sub-populations of covalently closed circular library molecules (e.g., (400) or (900)), thereby generating a plurality of concatemer template molecules including at least a first sub-population of concatemer template molecules and a second sub-population of concatemer template molecules. In some embodiments, the rolling circle amplification reaction is conducted in the presence of a plurality of compaction oligonucleotides. In some embodiments, the rolling circle amplification reaction is conducted in the absence of a plurality of compaction oligonucleotides.
[00574] In some embodiments, individual concatemer template molecules in the first sub-population comprise tandem repeat polynucleotide units. In some embodiments, a unit comprises a first sequence of interest, the first batch barcode sequence, and a first batch sequencing primer binding site (or a complementary sequence thereof). For concatemer template molecules generated using single-stranded splint strands see FIGs. 24A, 24B, 26A, 26B; for concatemer template molecules generated using double-stranded splint adaptors see FIGs. 31A and 31B.
[00575] In some embodiments, individual concatemer template molecules in the second sub-population comprise tandem repeat polynucleotide units. In some embodiments, a unit comprises a second sequence of interest, the second batch barcode sequence, and a second batch sequencing primer binding site (or a complementary sequence thereof). For concatemer template molecules generated using single-stranded splint strands see FIGs. 24A, 24B, 26A, 26B; for concatemer template molecules generated using double-stranded splint adaptors see FIGs. 31A and 31B.
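As a non-limiting illustration of how rolling circle amplification yields tandem repeat units, the following toy model extends an amplification primer around a circular template a chosen number of times. It is a string-level sketch only, assuming exact base pairing and a single primer binding site; the function name `rolling_circle_product` and the example sequences are hypothetical and do not correspond to any SEQ ID NO.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_product(circle: str, primer: str, n_units: int) -> str:
    """Toy concatemer: n_units tandem copies of the complement of the circle.

    `circle` is the covalently closed template written 5' to 3' from an
    arbitrary starting point. The product is written 5' to 3' and begins
    with the primer sequence.
    """
    site = reverse_complement(primer)      # primer binding site on the circle
    start = (circle + circle).find(site)   # doubling handles a site spanning the origin
    if start < 0:
        raise ValueError("primer has no binding site on the circle")
    end = (start + len(site)) % len(circle)
    # Rotate the circle so that it ends with the primer binding site.
    rotated = circle[end:] + circle[:end]
    unit = reverse_complement(rotated)     # one repeat unit, starting with the primer
    return unit * n_units


if __name__ == "__main__":
    product = rolling_circle_product("GATTACAGGC", "CCTGT", 3)
    print(product)
```

Each repeat unit in this model is the complement of one full trip around the circle, so every element of the circular library molecule recurs once per unit.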
[00576] In some embodiments, when the rolling circle amplification of step (e) is conducted with amplification primers that are immobilized to a support, the covalently closed circular library molecules (e.g., (400) or (900)) can be distributed onto the support comprising a plurality of immobilized surface primers, under a condition suitable to hybridize at least one portion of the covalently closed circular library molecules to the immobilized surface primers, and the rolling circle amplification reaction is conducted thereby generating a plurality of immobilized concatemer template molecules including at least a first sub-population of concatemer template molecules and a second sub-population of concatemer template molecules. In some embodiments, the on-support rolling circle amplification reaction is conducted in the presence of a plurality of compaction oligonucleotides. In some embodiments, the on-support rolling circle amplification reaction is conducted in the absence of a plurality of compaction oligonucleotides.
[00577] In some embodiments, when the rolling circle amplification of step (e) is conducted with amplification primers that are in solution, the covalently closed circular library molecules (e.g., (400) or (900)) can be hybridized to the amplification primers in solution, and the rolling circle amplification reaction can be conducted in solution. The rolling circle amplification reaction, including the nascent concatemer template molecules, can then be distributed onto a support having a plurality of surface primers immobilized thereon, under a condition suitable to hybridize at least one portion of the nascent concatemer template molecules to the immobilized surface primers, and the rolling circle amplification reaction can be resumed, thereby generating a plurality of immobilized concatemer template molecules including at least a first sub-population of concatemer template molecules and a second sub-population of concatemer template molecules. In some embodiments, the in-solution rolling circle amplification reaction is conducted in the presence of a plurality of compaction oligonucleotides. In some embodiments, the in-solution rolling circle amplification reaction is conducted in the absence of a plurality of compaction oligonucleotides.
[00578] In some embodiments, the methods for generating circularized library molecules further comprise step (f): sequencing the first sub-population of concatemer template molecules using a plurality of first batch sequencing primers. In some embodiments, the sequencing of step (f) comprises imaging a region of the support to detect the sequencing reactions of the first sub-population of template molecules.
[00579] In some embodiments, the sequencing of step (f) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00580] In some embodiments, the sequencing of step (f) comprises conducting a two-stage sequencing method. In some embodiments, the first stage comprises contacting the first sub-population of concatemer template molecules with a plurality of first batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00581] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a concatemer template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping
reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00582] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the first sub-population of concatemer template molecules. In some embodiments, the first sub-population of concatemer template molecules can remain immobilized to the support and the first batch sequencing primers can be retained and can remain hybridized to the first sub-population of concatemer template molecules.
[00583] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the first sub-population of concatemer template molecules and the retained first batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the first batch sequencing primer.
[00584] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00585] In some embodiments, the nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00586] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00587] In some embodiments, the sequencing of step (f) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of first batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (f) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
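As a non-limiting illustration of the cycle structure described above, in which one sequencing cycle is one first stage followed by one second stage, the following toy model steps along a template one base per cycle. It is a bookkeeping sketch only, assuming error-free base pairing and one incorporation per cycle; the function name `two_stage_sequencing` is hypothetical and no reagent chemistry is modeled.

```python
PAIRING = {"A": "T", "C": "G", "G": "C", "T": "A"}


def two_stage_sequencing(template_in_read_order: str, n_cycles: int) -> str:
    """Return the bases called over up to n_cycles two-stage cycles.

    `template_in_read_order` lists the template bases in the order in which
    the polymerase encounters them downstream of the sequencing primer.
    """
    calls = []
    extended = 0  # number of bases added to the sequencing primer so far
    for _ in range(n_cycles):
        if extended >= len(template_in_read_order):
            break  # no template left to read
        # First stage: a labeled multivalent molecule carrying the cognate
        # nucleotide binds the complexed polymerase and is imaged; nothing is
        # incorporated. The binder is then dissociated and washed away.
        next_template_base = template_in_read_order[extended]
        calls.append(PAIRING[next_template_base])
        # Second stage: a free nucleotide is incorporated, advancing the
        # primer by one position (any chain terminator is then cleaved).
        extended += 1
    return "".join(calls)


if __name__ == "__main__":
    print(two_stage_sequencing("ACGTAC", 4))
```

In this sketch the base call comes from the first stage and the primer advances only in the second stage, which is why both stages are needed to complete a cycle.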
[00588] In some embodiments, the methods for sequencing further comprise step (f1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of concatemer template molecules to generate a plurality of first batch sequencing read products that comprise up to 1000 bases in length. In some embodiments, the first batch sequencing read products comprise the first batch barcode sequence. In some embodiments, the first batch sequencing read products comprise the first batch barcode sequence and the sample index sequence. In some embodiments, the first batch sequencing read products comprise the first batch barcode sequence and at least a portion of the first sequence of interest. In some embodiments, the first batch sequencing read products comprise the first batch barcode sequence, the sample index sequence, and at least a portion of the first sequence of interest. In some embodiments, the short read sequencing comprises hybridizing first batch sequencing primers to the first batch sequencing primer binding sites on the first sub-population of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
[00589] In some embodiments, the sequencing of step (f1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the reiterative sequencing of step (f1) comprises conducting a two-stage sequencing method described herein.
[00590] In some embodiments, the methods for sequencing further comprise step (f2): stopping and/or blocking the short read sequencing of step (f1). In some embodiments, the stopping and/or blocking comprises incorporating a chain terminating nucleotide to the 3’ terminal end of the first batch sequencing read products to inhibit further sequencing reactions. Exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
[00591] In some embodiments, the methods for sequencing further comprise step (f3): removing the plurality of first batch sequencing read products from the concatemer template molecules of the first sub-population, and retaining the concatemer template molecules of the first sub-population. In some embodiments, the first batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a dehybridization reagent.
[00592] In some embodiments, the methods for sequencing further comprise step (f4): reiteratively sequencing the concatemer template molecules of the first sub-population by repeating steps (f1) - (f3) at least once. In some embodiments, the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween or more than 50 times. For example, the reiterative sequencing can be conducted up to 100 times.
[00593] Exemplary schematics of reiterative sequencing workflows are shown in FIGs. 24A, 24B, 26A, 26B, 31A and 31B.
[00594] In some embodiments, the sequences of all of the first batch sequencing read products can be determined and aligned with a first reference sequence to confirm the presence of the first sequence of interest. The first reference sequence can be the first batch barcode and/or the first sequence of interest.
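As a non-limiting illustration of the reiterative workflow of steps (f1) through (f4) followed by comparison with a reference, the following toy model generates one short read per round from the same retained template and then reports the fraction of reads containing a reference barcode. It assumes error-free reads and uses exact substring matching as a stand-in for alignment; the function names and the example sequences are hypothetical.

```python
def reiterative_sequencing(read_strand: str, primer: str,
                           read_length: int, n_rounds: int) -> list:
    """Collect one short read per round from a retained template.

    `read_strand` is the strand complementary to the concatemer unit, written
    5' to 3', so a read is simply the bases that follow the primer sequence.
    """
    start = read_strand.find(primer)
    if start < 0:
        raise ValueError("sequencing primer does not match the template")
    begin = start + len(primer)
    reads = []
    for _ in range(n_rounds):
        # (f1) short read sequencing downstream of the hybridized primer.
        read = read_strand[begin:begin + read_length]
        # (f2) the read is capped; (f3) the read is stripped off while the
        # template is retained, so `read_strand` is left unchanged.
        reads.append(read)
    return reads  # (f4) corresponds to running the loop more than once


def fraction_with_reference(reads: list, reference: str) -> float:
    """Fraction of reads that contain the reference (e.g., a batch barcode)."""
    if not reads:
        return 0.0
    return sum(reference in read for read in reads) / len(reads)


if __name__ == "__main__":
    # Hypothetical unit: spacer + primer site + batch barcode + insert.
    strand = "TTTT" + "GGCC" + "ACGT" + "AAGGTTCC"
    reads = reiterative_sequencing(strand, "GGCC", read_length=8, n_rounds=3)
    print(reads, fraction_with_reference(reads, "ACGT"))
```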
[00595] In some embodiments, hybridizing the first batch sequencing primers to the concatemer template molecules of step (f1) can be conducted with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate buffer) with formamide (e.g., 10-20% formamide).
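As a non-limiting illustration of preparing such a hybridization reagent, the following sketch computes component volumes by simple dilution (C1V1 = C2V2). The 20X SSC stock, the 100% formamide stock, and the conventional definition of 1X SSC as 150 mM sodium chloride with 15 mM sodium citrate are assumptions not recited in this paragraph; the function name `hybridization_mix` is hypothetical.

```python
def hybridization_mix(total_ul: float, ssc_final_x: float,
                      formamide_final_pct: float,
                      ssc_stock_x: float = 20.0,
                      formamide_stock_pct: float = 100.0) -> dict:
    """Return stock volumes (in microliters) for an SSC/formamide mixture.

    Assumes ideal mixing, i.e. volumes are additive.
    """
    ssc_ul = total_ul * ssc_final_x / ssc_stock_x
    formamide_ul = total_ul * formamide_final_pct / formamide_stock_pct
    water_ul = total_ul - ssc_ul - formamide_ul
    if water_ul < 0:
        raise ValueError("requested concentrations exceed what the stocks allow")
    return {
        "ssc_stock_ul": ssc_ul,
        "formamide_ul": formamide_ul,
        "water_ul": water_ul,
        # Assumed convention: 1X SSC = 150 mM NaCl and 15 mM sodium citrate.
        "nacl_mM": 150.0 * ssc_final_x,
        "citrate_mM": 15.0 * ssc_final_x,
    }


if __name__ == "__main__":
    # 1 mL of 2X SSC with 15% formamide (midpoint of the 10-20% example range).
    print(hybridization_mix(1000.0, 2.0, 15.0))
```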
[00596] In some embodiments, in step (f3) the plurality of first batch sequencing read products can be removed from the concatemer template molecules and the plurality of concatemer template molecules can be retained using a de-hybridization reagent comprising an SSC buffer (e.g., saline-sodium citrate buffer), with or without formamide, at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
[00597] In some embodiments, in step (f3) the plurality of first batch sequencing read products can be removed from the concatemer template molecules and the plurality of concatemer template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de- hybridization reagent further comprises at least one nucleic acid compaction agent. In some embodiments, the de-hybridization of step (f3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
[00598] In some embodiments, methods generating circularized library molecules further comprise step (g): sequencing the second sub-population of concatemer template molecules which are immobilized to the support using a plurality of second batch sequencing primers.
In some embodiments, the sequencing of step (g) comprises imaging the same region of the support to detect the sequencing reactions of the second sub-population of concatemer template molecules.
[00599] In some embodiments, the sequencing reactions of the first sub-population of concatemer template molecules are stopped before initiating the sequencing reactions of the second sub-population of concatemer template molecules.
[00600] In some embodiments, the sequencing of step (g) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules.
[00601] In some embodiments, the sequencing of step (g) comprises conducting a two-stage sequencing method. In some embodiments, the first stage comprises contacting the second sub-population of concatemer template molecules with a plurality of second batch sequencing primers, a first plurality of sequencing polymerases and a plurality of detectably labeled multivalent molecules. In some embodiments, the first stage comprises binding detectably labeled multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, individual multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-5). In some embodiments, the multivalent molecules can be labeled with at least one detectable moiety that emits a signal. In some embodiments, the multivalent molecules can be labeled with at least one fluorophore.
[00602] In some embodiments, individual complexed polymerases comprise a first sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the detectably labeled multivalent molecules bind to the complexed polymerases to form a plurality of multivalent-complexed polymerases. In some embodiments, the detectably labeled multivalent molecules are bound to the complexed polymerases in the presence of a trapping reagent. In some embodiments, the trapping reagent can be formulated to promote binding of the detectably labeled multivalent molecules to the complexed polymerases. In some embodiments, the trapping reagent can be formulated to inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating
agent, at least one detergent, at least one monovalent cation, at least one reducing agent, and at least one chaotropic agent. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00603] In some embodiments, the multivalent-complexed polymerases can be exposed to excitation illumination to induce fluorescent signals from the multivalent-complexed polymerases. In some embodiments, the fluorescent signals from the multivalent-complexed polymerases can be imaged in the presence of an imaging reagent. In some embodiments, the imaging reagent can be formulated to reduce photo damage of the fluorescently-labeled multivalent-complexed polymerases upon exposure to the excitation illumination. In some embodiments, the imaging reagent can be formulated to inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, prior to conducting the second sequencing stage, the detectably labeled multivalent molecules can be dissociated from the complexed polymerases and removed (e.g., washing). In some embodiments, prior to conducting the second sequencing stage, the first plurality of sequencing polymerases can be dissociated from the second sub-population of concatemer template molecules. In some embodiments, the second sub-population of concatemer template molecules can remain immobilized to the support and the second batch sequencing primers can be retained and can remain hybridized to the second sub-population of concatemer template molecules.
[00604] In some embodiments, the second stage of the two-stage sequencing method comprises contacting the second sub-population of concatemer template molecules and the retained second batch sequencing primers with a second plurality of sequencing polymerases and a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the second stage comprises binding the plurality of nucleotides to the complexed polymerases to form nucleotide-complexed polymerases, and promoting nucleotide incorporation. In some embodiments, the second stage of the two-stage sequencing method comprises nucleotide incorporation and extension of the second batch sequencing primer.
[00605] In some embodiments, the plurality of nucleotides comprise fluorophore-labeled nucleotides, or the nucleotides are non-labeled. In some embodiments, when the nucleotides
are fluorophore-labeled, then detecting and imaging of the incorporated nucleotides can be performed. In some embodiments, when the nucleotides are non-labeled, detecting and imaging of the incorporated nucleotides can be omitted.
[00606] In some embodiments, the nucleotides comprise chain terminating nucleotides. In some embodiments, individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, the nucleotides are not chain terminating nucleotides. In some embodiments, when the nucleotides comprise chain terminating nucleotides, then the chain terminating moieties can be cleaved from the incorporated chain terminating nucleotides to generate an extendible 3’ OH group.
[00607] In some embodiments, nucleotide incorporation can be conducted in the presence of a stepping reagent. In some embodiments, the stepping reagent can be formulated to promote polymerase-catalyzed nucleotide incorporation. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a second plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00608] In some embodiments, the sequencing of step (g) comprises conducting a two-stage sequencing method including repeating the first stage and second stage at least once thereby generating a plurality of second batch sequencing read products. In some embodiments, when conducting a two-stage sequencing method, one sequencing cycle comprises completion of a first and a second stage. In some embodiments, the sequencing of step (g) comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00609] In some embodiments, the methods for sequencing further comprise step (g1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second sub-population of concatemer template molecules to generate a plurality of second batch sequencing read products that comprise up to 1000 bases in length. In some embodiments, the second batch sequencing read products comprise the second batch barcode sequence. In some embodiments, the second batch sequencing read products comprise the second batch barcode sequence and the sample index sequence. In some embodiments, the second batch sequencing read products comprise the second batch barcode sequence and at least a portion of the second sequence of interest. In some embodiments, the second batch sequencing read products comprise the second batch barcode sequence, the sample index sequence, and at least a portion of the second sequence of interest. In some embodiments, the short read sequencing comprises hybridizing second batch sequencing primers to the second batch sequencing primer binding sites on the second sub-population of concatemer template molecules and conducting up to 1000 cycles of polymerase-catalyzed sequencing reactions using nucleotide reagents.
[00610] In some embodiments, the sequencing of step (g1) comprises conducting any massively parallel nucleic acid sequencing method that employs a plurality of sequencing polymerases and a plurality of nucleotide reagents. In some embodiments, the plurality of nucleotide reagents comprise nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the reiterative sequencing of step (g1) comprises conducting a two-stage sequencing method described herein.
[00611] In some embodiments, the methods for sequencing further comprise step (g2): stopping and/or blocking the short read sequencing of step (g1). In some embodiments, the stopping and/or blocking comprises incorporating a chain terminating nucleotide to the 3’ terminal end of the second batch sequencing read products to inhibit further sequencing reactions. Exemplary chain terminating nucleotides include a dideoxynucleotide or a nucleotide having a 2’ or 3’ chain terminating moiety.
[00612] In some embodiments, the methods for sequencing further comprise step (g3): removing the plurality of second batch sequencing read products from the concatemer template molecules of the second sub-population, and retaining the concatemer template molecules of the second sub-population. In some embodiments, the second batch sequencing read products can be removed from the concatemer template molecules by denaturation using heat and/or a de-hybridization reagent.
[00613] In some embodiments, the methods for sequencing further comprise step (g4): reiteratively sequencing the concatemer template molecules of the second sub-population by repeating steps (g1) - (g3) at least once. In some embodiments, the reiterative sequencing can be conducted 1-10 times, or 10-25 times, or 25-50 times, or any range therebetween, or more than 50 times. For example, the reiterative sequencing can be conducted up to 100 times.
[00614] Exemplary schematics of reiterative sequencing workflows are shown in FIGs. 24A, 24B, 26A, 26B, 31A and 31B.
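The reiterative workflow of steps (g1) to (g3), repeated per step (g4), can be sketched as a simple ordered plan. The Python below is only an illustrative outline under stated assumptions: the function name `reiterative_plan`, its parameters, and the step descriptions are hypothetical and do not represent instrument control software. It encodes two constraints taken from the text: a short read uses up to 1000 sequencing cycles, and steps (g1) to (g3) are repeated at least once after the initial pass.

```python
from typing import List


def reiterative_plan(batch: str, cycles_per_read: int, repeats: int) -> List[str]:
    """Build an ordered list of step descriptions for one batch.

    batch           -- label for the batch, e.g. "second" (hypothetical parameter)
    cycles_per_read -- sequencing cycles per short read (1 to 1000)
    repeats         -- number of repetitions of (g1)-(g3) after the initial pass
    """
    if not 1 <= cycles_per_read <= 1000:
        raise ValueError("cycles_per_read must be between 1 and 1000")
    if repeats < 1:
        raise ValueError("steps (g1)-(g3) are repeated at least once")

    steps: List[str] = []
    # One initial pass plus `repeats` repetitions.
    for round_no in range(1, repeats + 2):
        steps.append(
            f"round {round_no}: (g1) hybridize {batch} batch sequencing primers "
            f"and run {cycles_per_read} sequencing cycles"
        )
        steps.append(
            f"round {round_no}: (g2) stop/block extension with a chain terminating nucleotide"
        )
        steps.append(
            f"round {round_no}: (g3) de-hybridize read products, retain concatemer templates"
        )
    return steps


if __name__ == "__main__":
    for line in reiterative_plan("second", cycles_per_read=50, repeats=2):
        print(line)
```

Because the concatemer template molecules are retained in step (g3), each round starts from the same immobilized templates, which is what makes the repetition possible.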
[00615] In some embodiments, the sequences of all of the second batch sequencing read products can be determined and aligned with a second reference sequence to confirm the presence of the second sequence of interest. The second reference sequence can be the second batch barcode and/or the second sequence of interest.
[00616] In some embodiments, hybridizing the second batch sequencing primers to the concatemer template molecules of step (g1) can be conducted with a hybridization reagent comprising an SSC buffer (e.g., 2X saline-sodium citrate buffer) with formamide (e.g., 10-20% formamide).
[00617] In some embodiments, in step (g3) the plurality of second batch sequencing read products can be removed from the concatemer template molecules and the plurality of concatemer template molecules can be retained using a de-hybridization reagent comprising an SSC buffer (e.g., saline-sodium citrate buffer), with or without formamide, at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
[00618] In some embodiments, in step (g3) the plurality of second batch sequencing read products can be removed from the concatemer template molecules and the plurality of concatemer template molecules can be retained using a de-hybridization reagent comprising at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization of step (g3) can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C.
Trapping Reagents
[00619] The present disclosure provides one or more trapping reagents. The present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with a trapping reagent.
[00620] In some embodiments, the trapping reagents can promote binding of multivalent molecules to complexed polymerases. In some embodiments, individual complexed polymerases comprise a sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the trapping reagent can inhibit incorporation of the nucleotide unit of the multivalent molecules. In some embodiments, the trapping reagent comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent. In some embodiments, the trapping reagent further comprises at least one chaotropic agent. In some embodiments, the trapping reagent further comprises an amino acid or a modified amino acid. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation.
[00621] In some embodiments, the trapping reagent comprises: water; any one or any combination of two or more pH buffering agents comprising Tris-HCl (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); at least one non-catalytic cation comprising strontium acetate (e.g., 1-7 mM) and/or strontium nitrate (e.g., 1-7 mM); any one or any combination of two or more viscosity agents comprising sucrose (e.g., 50-300 mM), ethylene glycol (e.g., 5-20%) and/or propylene glycol (e.g., 1-5%); at least one chelating agent comprising ethylenediaminetetraacetic acid (EDTA; e.g., 0.1-0.7 mM); at least one detergent comprising Triton X100 (e.g., 0.1-0.5%) or Tween 20 (also known as polysorbate 20; e.g., 0.01-0.05%); at least one monovalent cation comprising NaCl (e.g., 25-100 mM); and at least one reducing agent comprising dimethyl sulfoxide (DMSO; e.g., 0.1-0.7%) and/or tris(2-carboxyethyl)phosphine (TCEP; e.g., 0.1-0.7%). In some embodiments, the trapping reagent further comprises at least one chaotropic agent comprising guanidinium hydrochloride (e.g., 50-150 mM) or guanidinium isothiocyanate (e.g., 50-150 mM). In some embodiments, the trapping reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM) and/or L-arginine (e.g., 25-100 mM). In some embodiments, the trapping reagent further comprises any one or any combination of two or more types of multivalent molecules carrying nucleotide units dATP, dGTP, dCTP, dTTP and/or dUTP (e.g., 10-75 nM each type). In some embodiments, the trapping reagent further comprises a plurality of sequencing polymerases (e.g., 100-600 nM). In some embodiments, the trapping reagent lacks a non-catalytic cation.
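The exemplary concentration ranges listed above lend themselves to a simple lookup table. The following Python sketch, offered only as an illustration, records a subset of those example ranges and flags recipe entries that fall outside them. The dictionary keys, the function name `out_of_range`, and the idea of validating a recipe in software are assumptions; the ranges themselves are the "e.g." values from the paragraph above and are not presented as limits.

```python
# Subset of the exemplary trapping reagent ranges quoted in the text.
# Key names (component plus unit suffix) are hypothetical labels.
TRAPPING_EXAMPLE_RANGES = {
    "Tris-HCl_mM": (10, 50),
    "strontium_acetate_mM": (1, 7),
    "sucrose_mM": (50, 300),
    "EDTA_mM": (0.1, 0.7),
    "Triton_X100_pct": (0.1, 0.5),
    "NaCl_mM": (25, 100),
    "TCEP_pct": (0.1, 0.7),
}


def out_of_range(recipe, ranges=TRAPPING_EXAMPLE_RANGES):
    """Return a dict describing every recipe entry that is unknown or
    falls outside its exemplary range; an empty dict means no issues."""
    problems = {}
    for name, value in recipe.items():
        if name not in ranges:
            problems[name] = "no example range listed"
            continue
        low, high = ranges[name]
        if not low <= value <= high:
            problems[name] = f"{value} outside {low}-{high}"
    return problems


if __name__ == "__main__":
    # Hypothetical recipe chosen for illustration only.
    recipe = {"Tris-HCl_mM": 20, "NaCl_mM": 150, "EDTA_mM": 0.5}
    print(out_of_range(recipe))
```

The same table-driven check could be applied to the imaging, stepping and de-hybridization reagents by supplying a different `ranges` dictionary.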
Imaging Reagents
[00622] The present disclosure provides one or more imaging reagents. The present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with an imaging reagent.
[00623] In some embodiments, the imaging reagents can reduce photo damage of a fluorescently-labeled compound upon exposure to the excitation illumination. In some embodiments, the fluorescently-labeled compound comprises a fluorophore-labeled nucleotide, a fluorophore-labeled multivalent molecule or a fluorophore-labeled multivalent-complexed polymerase. In some embodiments, the imaging reagent can inhibit polymerase-catalyzed nucleotide incorporation. In some embodiments, the imaging reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, the imaging reagents further comprise at least one amino acid or modified amino acid. In some embodiments, the imaging reagent lacks a reducing agent.
[00624] In some embodiments, the imaging reagent comprises: water; any one or any combination of two or more pH buffering agents comprising Tris-HCl (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); at least one monovalent cation comprising NaCl (e.g., 25-100 mM); at least one chelating agent comprising EDTA (e.g., 0.1-0.7 mM); at least one non-catalytic cation comprising strontium acetate (e.g., 1-7 mM) and/or strontium nitrate (e.g., 1-7 mM); any one or any combination of two or more compounds for reducing photo-damage comprising 6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid (Trolox; e.g., 0.1-0.5 mM), ascorbic acid (e.g., 10-75 mM), sinapic acid (e.g., 0.1-20 mM), TEMP (e.g., 0.1-20 mM), 1-hydroxy-2,2,6,6-tetramethylpiperidine (TEMPOH; e.g., 0.1-20 mM), kojic acid (e.g., 0.1-20 mM), gallic acid (e.g., 0.1-20 mM), caffeic acid (e.g., 0.1-20 mM) and/or ergothioneine (e.g., 0.1-20 mM); at least one reducing agent comprising DMSO (e.g., 0.1-0.7%) and/or TCEP (e.g., 0.1-0.7 mM); at least one detergent comprising Triton X100 (e.g., 0.1-0.5%) or Tween 20 (e.g., 0.01-0.05%); any one or any combination of two or more viscosity agents comprising sucrose (e.g., 0.1-0.5 M), ethylene glycol (e.g., 5-50%), propylene glycol (e.g., 0.1-10%) and/or glycerol (e.g., 1-8%). In some embodiments, the imaging reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM), L-arginine (e.g., 25-100 mM) and/or methionine (e.g., 0.1-5 mM). In some embodiments, the imaging reagent lacks a non-catalytic cation comprising strontium. In some embodiments, the imaging reagent lacks an amino acid or modified amino acid.
Stepping Reagents
[00625] The present disclosure provides one or more stepping reagents. The present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with a stepping reagent.
[00626] In some embodiments, the stepping reagents can promote binding of nucleotides to complexed polymerases. In some embodiments, individual complexed polymerases comprise a sequencing polymerase bound to a nucleic acid duplex where the nucleic acid duplex comprises a nucleic acid template molecule hybridized to a sequencing primer. In some embodiments, the stepping reagent can promote incorporation of the nucleotide into the terminal 3’ end of the sequencing primer. In some embodiments, the stepping reagent comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00627] In some embodiments, the stepping reagent comprises: water; any one or any combination of two or more pH buffering agents comprising Tris (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); any one or any combination of two or more monovalent cations comprising NaCl (e.g., 25-100 mM), KCl (e.g., 10-75 mM) and/or ammonium sulfate (e.g., 1-50 mM); any one or any combination of two or more catalytic cations comprising magnesium chloride (e.g., 1-30 mM), magnesium sulfate (e.g., 1-30 mM) and/or manganese chloride (e.g., 1-30 mM); any one or any combination of two or more viscosity agents comprising propylene glycol (e.g., 0.1-10%), ethylene glycol (e.g., 5-50%), glycerol (e.g., 1-8%), sucrose (e.g., 0.01-5 M) and/or trehalose (e.g., 0.01-5 M); at least one chelating agent comprising EDTA (e.g., 0.1-0.7 mM); at least one reducing agent comprising TCEP (e.g., 0.1-0.7 mM) and/or DMSO (e.g., 0.1-0.7%); and at least one detergent comprising Triton X100 (e.g., 0.1-0.5%) or Tween 20 (e.g., 0.01-0.05%). In some embodiments, the stepping reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM), L-arginine (e.g., 25-100 mM) and/or methionine (e.g., 0.1-5 mM). In some embodiments, the stepping reagent further comprises any one or any combination of two or more types of nucleotides dATP, dGTP, dCTP, dTTP and/or dUTP (e.g., 0.1-5 µM each type). In some embodiments, the stepping reagent comprises a plurality of detectably labeled nucleotides. In some embodiments, at least one type of nucleotides in the stepping reagent comprises detectably labeled nucleotides. In some embodiments, the stepping reagent comprises a plurality of non-labeled nucleotides. In some embodiments, the nucleotides in the stepping reagent comprise 3’ chain terminator nucleotide analogs. In some embodiments, the stepping reagent further comprises a plurality of sequencing polymerases (e.g., 100-600 nM). In some embodiments, the stepping reagent lacks a catalytic cation comprising magnesium or manganese.
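As a worked illustration of how a formulation within the exemplary ranges above might be prepared from concentrated stocks, the sketch below applies the standard dilution relation C1·V1 = C2·V2 to compute stock volumes and the remaining water volume. The stock concentrations, the 1000 µL final volume, and the function names `stock_volume_ul` and `mix_plan` are hypothetical choices for illustration; the disclosure does not describe stock solutions or a preparation procedure.

```python
def stock_volume_ul(stock_conc, target_conc, final_volume_ul):
    """Volume of stock (in µL) needed so that the final mix has `target_conc`.

    Uses C1 * V1 = C2 * V2; both concentrations must share the same unit.
    """
    if stock_conc <= 0 or final_volume_ul <= 0 or target_conc < 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    return target_conc * final_volume_ul / stock_conc


def mix_plan(targets, stocks, final_volume_ul):
    """Return {component: volume in µL}, plus the water needed to reach the final volume."""
    volumes = {
        name: stock_volume_ul(stocks[name], conc, final_volume_ul)
        for name, conc in targets.items()
    }
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    # Target concentrations (mM) chosen inside the exemplary ranges in the text.
    targets_mM = {"Tris": 20, "NaCl": 50, "MgCl2": 5, "EDTA": 0.5}
    # Stock concentrations (mM) are assumptions made for this example.
    stocks_mM = {"Tris": 1000, "NaCl": 5000, "MgCl2": 1000, "EDTA": 500}
    print(mix_plan(targets_mM, stocks_mM, final_volume_ul=1000))
```

Components specified as percentages (for example detergents or glycols) would follow the same relation with percent stocks in place of molar stocks.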
De-Hybridization Reagents
[00628] The present disclosure provides one or more nucleic acid de-hybridization reagents. The present disclosure provides methods for batch sequencing with or without reiterative sequencing, and methods for re-seeding with or without reiterative sequencing, which can be conducted with a de-hybridization reagent.
[00629] In some embodiments, the de-hybridization reagents can promote nucleic acid denaturation between any two nucleic acid strands. In some embodiments, the de-hybridization reagents can promote nucleic acid denaturation between a nucleic acid template molecule and a nucleic acid extension product while retaining the nucleic acid template molecule. In some embodiments, the de-hybridization reagents can promote nucleic acid denaturation between immobilized concatemer template molecules and the plurality of first batch sequencing read products while retaining the immobilized concatemer molecules. In some embodiments, the de-hybridization reagents can promote nucleic acid denaturation between immobilized concatemer molecules and the plurality of second batch sequencing read products while retaining the immobilized concatemer molecules.
[00630] In some embodiments, the de-hybridization reagent comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid condenser agent. In some embodiments, the de-hybridization reaction can be conducted at a temperature that promotes nucleic acid denaturation such as for example 50 - 90 °C. In some embodiments, the de-hybridization reagent has a pH range of about 5 - 5.25, or a pH range of about 5.25 - 5.5, or a pH range of about 5.5 - 5.75, or a pH range of about 5.75 - 6, or a pH range of about 6 - 7.
[00631] In some embodiments, the de-hybridization reagent comprises: any one or any combination of two or more solvents comprising water, acetonitrile (e.g., 10-20%) and/or formamide (e.g., 10-40%); any one or any combination of two or more pH buffering agents comprising MES (e.g., pH 5-7, 10-75 mM), Tris (e.g., pH 6-9, 10-50 mM), HEPES (e.g., pH 6-9, 10-50 mM) and/or PBS (phosphate buffered saline) (e.g., comprising disodium hydrogen phosphate and sodium chloride) (e.g., pH 5-8); at least one reducing agent comprising DMSO (e.g., 10-50%) or TCEP (e.g., 1-10 mM); at least one monovalent salt comprising NaCl (e.g., 0.25-2 M) and/or ammonium sulfate (e.g., 1-50 mM); any one or any combination of two or more crowding agents comprising PEG 200 (e.g., 10-50%), PEG 400 (e.g., 10-50%) and/or dextran sulfate (e.g., 150 kDa, 1-5%); at least one chaotropic agent comprising guanidinium hydrochloride (e.g., 50 mM - 2 M) or guanidinium isothiocyanate (e.g., 50 mM - 2 M); and any one or any combination of two or more nucleic acid condenser agents comprising dextran sulfate (e.g., 150 kDa or 500 kDa, 1-5%), a polyamine (e.g., 0.01-15%), poly-lysine (e.g., poly-L-lysine, 0.01-15%) and/or manganese chloride (e.g., 0.1-0.8 M). In some embodiments, the polyamine in the de-hybridization reagent can be in hydrochloride form. In some embodiments, the polyamine comprises spermine, spermidine, cadaverine or putrescine.
Solvents
[00632] In some embodiments, any of the reagents described herein comprise at least one solvent. In some embodiments, the at least one solvent comprises water. In some embodiments, the at least one solvent comprises an alcohol comprising a short chain alcohol having 1-6 carbon backbone, including linear or branched alcohols. The short chain alcohol can be methanol, ethanol, propanol, butanol, pentanol or hexanol. In some embodiments, the solvent comprises a polar aprotic solvent including acetonitrile, diethylene glycol, N,N-dimethylacetamide, dimethyl formamide, dimethyl sulfoxide, ethylene glycol, 1,4-dioxane (1,4-diethyleneoxide), formamide, glycerin, N-methyl-2-pyrrolidinone, hexamethylphosphoramide, nitrobenzene, or nitromethane.
pH Buffering Agents
[00633] In some embodiments, any of the reagents described herein comprise at least one pH buffering agent. In some embodiments, the at least one pH buffering agent can maintain the pH of the reagent in a range that is suitable for nucleic acids and enzymatic activity. In some embodiments, the at least one pH buffering agent comprises any one or any combination of two or more of Tris, Tris-HCl, Tris-acetate, Tricine, Bicine, Bis-Tris propane, HEPES, MES, 3-(N-morpholino)propanesulfonic acid (MOPS), 2-hydroxy-3-morpholinopropanesulfonic acid (MOPSO), N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic acid (BES), 2-{[1,3-dihydroxy-2-(hydroxymethyl)propan-2-yl]amino}ethane-1-sulfonic acid (TES), 3-(cyclohexylamino)-1-propanesulfonic acid (CAPS), 3-{[1,3-dihydroxy-2-(hydroxymethyl)propan-2-yl]amino}propane-1-sulfonic acid (TAPS), 3-{[1,3-dihydroxy-2-(hydroxymethyl)propan-2-yl]amino}-2-hydroxypropane-1-sulfonic acid (TAPSO), N-(2-acetamido)-2-aminoethanesulfonic acid (ACES), 1,4-piperazinediethanesulfonic acid (PIPES), ethanolamine (a.k.a. 2-aminoethanol; MEA), a citrate compound, a citrate mixture, NaOH and/or KOH. In some embodiments, the pH buffering agent can be present in any of the reagents described herein at a concentration of about 1 mM - 1 M, or about 5 mM - 0.5 M, or about 10 mM - 0.25 M. In some embodiments, the pH of the pH buffering agent which is present in the reagents described herein can be at a pH of about 4-9.5, or a pH of about 5-9, or a pH of about 5-8, or a pH of about 5.5-7, or any range therebetween.
Monovalent Salts
[00634] In some embodiments, any of the reagents described herein comprise at least one monovalent salt. In some embodiments, any of the reagents described herein can include at least one monovalent salt comprising any one or any combination of two or more of NaCl, KCl, ammonium sulfate (e.g., (NH4)2SO4), potassium acetate (e.g., KCH3CO2), MgCl2 and/or potassium glutamate. In some embodiments, the reagents can include at least one monovalent salt at a concentration of about 25-500 mM, or about 50-250 mM, or about 100-200 mM, or about 500 mM - 750 mM, or about 750 mM - 1 M, or about 1 M - 1.5 M, or about 1.5 M - 2 M, or any range therebetween.
Ammonium Ions
[00635] In some embodiments, any of the reagents described herein comprise ammonium ions. In some embodiments, any of the reagents described herein can include a source of ammonium ions, for example ammonium sulfate (e.g., (NH4)2SO4). In some embodiments, ammonium sulfate is included in the reagent at a concentration of about 1-50 mM, or about 10-25 mM, or any range therebetween.
Detergents
[00636] In some embodiments, any of the reagents described herein comprise at least one detergent. In some embodiments, the at least one detergent comprises an ionic detergent such as SDS (sodium dodecyl sulfate). In some embodiments, the at least one detergent comprises a non-ionic detergent such as Triton X-100, Tween 20, Tween 80 or Nonidet P-40. In some embodiments, the at least one detergent comprises a zwitterionic detergent such as CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), N-dodecyl-N,N-dimethyl-3-ammonio-1-propanesulfonate (DetX) or n-dodecyl beta-D-maltoside (DDM). In some embodiments, the at least one detergent comprises LDS (lithium dodecyl sulfate), sodium taurodeoxycholate, sodium taurocholate, sodium glycocholate, sodium deoxycholate or sodium cholate. In some embodiments, the detergent is included in a reagent at a concentration of about 0.01-0.05%, or about 0.05-0.1%, or about 0.1-0.15%, or about 0.15-0.2%, or about 0.2-0.25%, or any range therebetween.
Reducing Agents
[00637] In some embodiments, any of the reagents described herein comprise at least one reducing agent. In some embodiments, the at least one reducing agent comprises any one or any combination of two or more of DTT (dithiothreitol), 2-beta mercaptoethanol, TCEP (tris(2-carboxyethyl)phosphine), formamide, DMSO (dimethylsulfoxide), sodium dithionite (Na2S2O4), glutathione, methionine, betaine, Tris(3-hydroxypropyl)phosphine (THPP) and/or N-acetyl cysteine. In some embodiments, the reagents can include the reducing agent at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M. In some embodiments, the reagents can include the reducing agent at a concentration of about 0.01-0.1 mM, or about 0.1-1 mM, or about 1-2.5 mM, or about 2.5-5 mM, or about 5-7.5 mM, or about 7.5-9 mM, or about 9-12 mM, or about 12-25 mM, or about 25-50 mM, or any range therebetween. In some embodiments, the reagents can include the reducing agent at a concentration of about 1%-5%, or about 5%-10%, or about 10%-20%, or about 20%-30%, or about 30%-40%, or about 40%-50%, or any range therebetween.
Viscosity Agents
[00638] In some embodiments, any of the reagents described herein comprise at least one viscosity agent. In some embodiments, the at least one viscosity agent comprises a saccharide such as trehalose, sucrose, cellulose, xylitol, mannitol, sorbitol or inositol. In some embodiments, the at least one viscosity agent comprises glycerol or a glycol compound such as ethylene glycol or propylene glycol. The reagents can include the viscosity agent at a concentration of about 0.1-1%, or about 1-5%, or about 5-10%, or about 10-15% based on volume, or any range therebetween. The reagents can include the viscosity agent at a concentration of about 1-50 mM, or about 50-100 mM, or about 100-150 mM, or about 150-200 mM. The reagents can include the viscosity agent at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M, or about 2-3 M, or about 3-5 M, or any range therebetween.
Chaotropic Agents
[00639] In some embodiments, any of the reagents described herein comprise at least one chaotropic agent. In some embodiments, the at least one chaotropic agent can disrupt non-covalent bonds such as hydrogen bonds or van der Waals forces. In some embodiments, the at least one chaotropic agent comprises any one or any combination of two or more of SDS (sodium dodecyl sulfate), urea, thiourea, guanidinium chloride, guanidine hydrochloride, guanidine thiocyanate, guanidine isothiocyanate, guanidine isothionate, potassium thiocyanate, lithium chloride, sodium iodide, sodium perchlorate or imidazole. In some embodiments, the reagents can include at least one chaotropic agent at a concentration of about 0.1-5 M, about 0.5-4 M, about 0.5-3 M, about 0.5-1 M, about 0.1-1 M, about 1-2 M, about 2-3 M, about 3-4 M, about 4-5 M, or any range therebetween.
Chelating Agents
[00640] In some embodiments, any of the reagents described herein comprise at least one chelating agent. In some embodiments, the at least one chelating agent can bind metal ions by chelation, coordination or covalent bonding. In some embodiments, the at least one chelating agent comprises EDTA (ethylenediaminetetraacetic acid), EGTA (ethylene glycol tetraacetic acid), HEDTA (hydroxyethylethylenediaminetriacetic acid), DTPA (diethylene triamine pentaacetic acid), NTA (N,N-bis(carboxymethyl)glycine), citrate anhydrous, sodium citrate, calcium citrate, ammonium citrate, ammonium bicitrate, citric acid, potassium citrate, or magnesium citrate. In some embodiments, the reagents can include at least one chelating agent at a concentration of about 0.01 - 50 mM, or about 0.1 - 20 mM, or about 0.2 - 10 mM, or any range therebetween.
Zwitterions
[00641] In some embodiments, any of the reagents described herein comprise at least one zwitterion. In some embodiments, the zwitterion comprises a cationic zwitterionic compound such as a betaine including N,N,N-trimethylglycine and cocamidopropyl betaine. In some embodiments, the zwitterion comprises an albuminoid including ovalbumin, and the serum albumins derived from bovine, equine, or human sources. In some embodiments, the reagent can include a zwitterion at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M, or any range therebetween.
Sugar Alcohols
[00642] In some embodiments, any of the reagents described herein comprise at least one sugar alcohol. In some embodiments, the at least one sugar alcohol comprises sucrose, trehalose, maltose, rhamnose, arabinose, fucose, mannitol, sorbitol or adonitol. In some embodiments, the reagents can include the sugar alcohol at a concentration of about 1-50 mM, or about 50-100 mM, or about 100-150 mM, or about 150-200 mM, or any range therebetween. In some embodiments, the reagents can include the sugar alcohol at a concentration of about 0.1-0.5 M, or about 0.5-1 M, or about 1-2 M, or about 2-3 M, or about 3-5 M, or any range therebetween.
Crowding Agents
[00643] In some embodiments, any of the reagents described herein comprise at least one crowding agent. In some embodiments, the at least one crowding agent can increase molecular crowding. In some embodiments, the at least one crowding agent comprises any one or any combination of two or more of polyethylene glycol (PEG, e.g., 1-50K molecular weight), dextran, dextran sulfate, hydroxypropyl methyl cellulose (HPMC), hydroxyethyl methyl cellulose (HEMC), hydroxybutyl methyl cellulose, hydroxypropyl cellulose, methylcellulose, and hydroxyl methyl cellulose. In some embodiments, the polyethylene glycol comprises PEG 100, PEG 200, PEG 300, PEG 400, PEG 600 or PEG 800. In some embodiments, the polyethylene glycol comprises PEG 1000, PEG 2000, PEG 3000 or PEG 4000. In some embodiments, the dextran sulfate comprises 150 kDa or 500 kDa forms. In some embodiments, the crowding agent can be present in the reagent at about 1-10%, or about 10-25%, or about 25-30%, or about 30-35%, or about 35-50% or higher percentages by volume based on the total volume of the reagent, or any range therebetween.
Nucleic Acid Condenser Agents
[00644] In some embodiments, any of the reagents described herein comprise at least one nucleic acid condenser agent. In some embodiments, the at least one nucleic acid condenser agent comprises a chemical compound which condenses DNA and/or RNA. In some embodiments, the at least one condenser agent comprises any one or any combination of two or more of a polyamine (e.g., MW approximately 600), spermine, spermidine, cadaverine, putrescine, 1,3-diaminopropane (1,3-DAP), polypeptide (e.g., poly(lysine)), manganese chloride, dextran sulfate (e.g., about 150 kDa or about 500 kDa) and/or poly-lysine. In some embodiments, the condenser agent can be present in the reagent at a concentration of about 0.1-1 mM, or about 1-5 mM, or about 5-10 mM, or about 10-20 mM, or about 20-40 mM, or about 40-60 mM, or about 60-80 mM, or about 80-100 mM, or any range therebetween.
Amino acids
[00645] In some embodiments, any of the reagents described herein comprise at least one amino acid or modified amino acid. In some embodiments, the at least one type of amino acid or modified amino acid includes di-peptides. In some embodiments, the at least one amino acid comprises any one or any combination of two or more of betaine, beta-alanine, histidine and/or arginine (e.g., L-arginine). In some embodiments, the at least one dipeptide comprises any one or any combination of two or more of carnosine (beta-alanyl-L-histidine), anserine (beta-alanyl-L-1-methylhistidine), and/or balenine (beta-alanyl-L-3-methylhistidine).
Compounds for Reducing Photo-Damage
[00646] In some embodiments, any of the reagents described herein can include one or more compounds for reducing photo-damage. In some embodiments, compounds that can reduce photo-damage include antioxidants, triplet state quenchers, singlet oxygen quenchers, oxygen scavengers, electron scavengers, anti-fade formulations, electron rich polyphenols, acidic polyphenols, alkenes, hydrogen donor compounds, and thiol based reducing agents. The skilled artisan will appreciate that some of these compounds can be classified as more than one type of photo-damage reducing compound. In some embodiments, the compounds that can reduce photo-damage comprise chemical compounds or enzymes.
[00647] Exemplary but non-limiting antioxidants comprise ascorbic acid, ascorbyl palmitate, D-isoascorbic acid (erythorbic acid), sodium ascorbate, butylated hydroxytoluene (BTH), butylated hydroxy toluene (BHT), polyphenol antioxidants, polyvinyl alcohols, butylated hydroxy anisol (BHA), Trolox (6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid) and other vitamin E analogs including nitrated Trolox derivatives (see U.S. patent No. 9,994,541, the entire contents of which are expressly incorporated by reference in its entirety).
[00648] Exemplary but non-limiting triplet state quenchers comprise ascorbic acid, 1,4-diazabicyclo[2.2.2]octane (DABCO), cyclo-octatetraene (COT), dithiothreitol (DTT), mercaptoethylamine (MEA), β-mercaptoethanol (BME), n-propyl gallate, p-phenylenediamine (PPD), hydroquinone, sodium azide (NaN3), TEMP (2,2,6,6-tetramethyl-4-piperidone), TEMP amine, TEMPO (2,2,6,6-tetramethyl-1-piperidinyloxyl), TEMPOH (2,2,6,6-tetramethyl-4-piperidinol), HTEMPO (4-hydroxy derivative of TEMPO), 1,3,5-trihydroxybenzene (THB) and DTBN (di-t-butylnitroxide).
[00649] Exemplary but non-limiting singlet oxygen quenchers comprise thiol-based quenchers such as glutathione, dithiothreitol, ergothioneine, methionine, cysteine, beta-dimethyl cysteine (penicillamine), mercaptopropionylglycine, MESNA, imidazole, N-acetyl cysteine and captopril.
[00650] Exemplary but non-limiting oxygen scavengers comprise glutathione, N-acetylcysteine, histidine, tryptophan, hydrazine (N2H4), sodium sulfite (Na2SO3) and hydroxylamine.
[00651] Exemplary but non-limiting electron scavengers comprise methyl viologen (e.g., 1,1'-dimethyl-4,4'-bipyridinium dichloride).
[00652] Exemplary but non-limiting anti-fade formulations comprise commercially available products including Fluoroguard® Antifade Reagent (e.g., from BioRad®), SlowFade Antifade Kit (e.g., includes DABCO, from Molecular Probes-Invitrogen®), ProLong™ Gold Antifade Reagent (e.g., from Invitrogen), and CitiFluor™ (e.g., from CitiFluor).
[00653] Exemplary but non-limiting electron rich polyphenols comprise rutin, hesperidin, catechin and epigallocatechin-3-gallate (EGCG).
[00654] Exemplary but non-limiting acidic polyphenols comprise dihydroxybenzoic acids (DHBA), gallic acid and derivatives of gallic acid, tiron, potassium hydroquinonesulfonate (HQSA) and 3,6-dihydronaphthalene-2,7-disulfonic acid (e.g., disodium salt) (DHNA).
[00655] Exemplary but non-limiting alkenes comprise chlorogenic acid, 4-cyclohexene-1,2-dicarboxylic acid, caffeic acid and sinapic acid including demethylated sinapic acid.
[00656] Exemplary but non-limiting hydrogen donor compounds comprise citric acid, shikimic acid, quinic acid, kojic acid, ergothioneine and 2-mercaptoimidazole.
[00657] Exemplary but non limiting thiol based reducing agents comprise cysteine and methionine.
[00658] In some embodiments, the reagents can include at least one of the compounds for reducing photo damage at a concentration of about 0.1-1 mM, or about 1-10 mM, or about 10-25 mM, or about 25-50 mM, or about 50-75 mM, or about 75-100 mM.
Kits
[00659] The present disclosure provides at least one kit for conducting any of the methods for batch sequencing with or without reiterative sequencing, and any of the methods for reseeding with or without reiterative sequencing described above. In some embodiments, the kit comprises any one or any combination of two or more reagents comprising a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
[00660] In some embodiments, the kit comprises a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent. In some embodiments, each reagent is stored in a separate container.
[00661] The kit can include instructions for use of the kit, e.g. for conducting methods for batch sequencing with or without reiterative sequencing using a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
[00662] The kit can include instructions for use of the kit, e.g. for conducting methods for re-seeding with or without reiterative sequencing using a trapping reagent, an imaging reagent, a stepping reagent and/or a de-hybridization reagent.
Kits Comprising Trapping Reagents
[00663] In some embodiments, the trapping reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one non-catalytic cation, at least one viscosity agent, at least one chelating agent, at least one detergent, at least one monovalent cation, and at least one reducing agent. In some embodiments, the trapping reagent further comprises at least one chaotropic agent. In some embodiments, the trapping reagent further comprises an amino acid or a modified amino acid. In some embodiments, the trapping reagent further comprises a plurality of multivalent molecules. In some embodiments, the trapping reagent further comprises a first plurality of sequencing polymerases. In some embodiments, the at least one non-catalytic cation inhibits polymerase-catalyzed nucleotide incorporation. In some embodiments, the trapping reagent lacks a non-catalytic cation.
[00664] In some embodiments, the trapping reagent in the kit comprises: water; any one or any combination of two or more pH buffering agents comprising Tris-HCl (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); at least one non-catalytic cation comprising strontium acetate (e.g., 1-7 mM) and/or strontium nitrate (e.g., 1-7 mM); any one or any combination of two or more viscosity agents comprising sucrose (e.g., 50-300 mM), ethylene glycol (e.g., 5-20%) and/or propylene glycol (e.g., 1-5%); at least one chelating agent comprising EDTA (e.g., 0.1-0.7 mM); at least one detergent comprising Triton X100 (e.g., 0.1-0.5%) or Tween 20 (e.g., 0.01-0.05%); at least one monovalent cation comprising NaCl (e.g., 25-100 mM); and at least one reducing agent comprising DMSO (e.g., 0.1-0.7%) and/or TCEP (e.g., 0.1-0.7%). In some embodiments, the trapping reagent further comprises at least one chaotropic agent comprising guanidinium hydrochloride (e.g., 50-150 mM) or guanidinium isothiocyanate (e.g., 50-150 mM). In some embodiments, the trapping reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM) and/or L-arginine (e.g., 25-100 mM). In some embodiments, the trapping reagent further comprises any one or any combination of two or more types of multivalent molecules carrying nucleotide units dATP, dGTP, dCTP, dTTP and/or dUTP (e.g., 10-75 nM each type). In some embodiments, the trapping reagent further comprises a plurality of sequencing polymerases (e.g., 100-600 nM). In some embodiments, the trapping reagent lacks a non-catalytic cation.
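As an illustration only, the exemplary trapping reagent ranges above can be captured in a small lookup table and used to flag a candidate recipe whose components fall outside those ranges. The following Python sketch is not part of the disclosed methods; the dictionary keys, the helper name `out_of_range` and the example recipe are hypothetical, and only a subset of the listed components is included.

```python
# Exemplary trapping-reagent ranges taken from the paragraph above.
# Key names are hypothetical labels; units are those given in the text.
TRAPPING_RANGES = {
    "Tris-HCl (mM)": (10, 50),
    "strontium acetate (mM)": (1, 7),
    "sucrose (mM)": (50, 300),
    "EDTA (mM)": (0.1, 0.7),
    "Triton X100 (%)": (0.1, 0.5),
    "NaCl (mM)": (25, 100),
    "TCEP (%)": (0.1, 0.7),
}


def out_of_range(recipe, ranges=TRAPPING_RANGES):
    """Return {component: value} for every entry outside its exemplary range."""
    problems = {}
    for name, value in recipe.items():
        if name not in ranges:
            raise KeyError(f"no exemplary range listed for {name!r}")
        low, high = ranges[name]
        if not (low <= value <= high):
            problems[name] = value
    return problems


# Hypothetical candidate recipe; NaCl is deliberately set too high.
candidate = {"Tris-HCl (mM)": 20, "EDTA (mM)": 0.5, "NaCl (mM)": 150}
print(out_of_range(candidate))
```

Because the ranges are exemplary ("e.g."), a flagged value is a prompt to double-check a recipe rather than an indication that it falls outside the disclosure.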
Kits Comprising Imaging Reagents
[00665] In some embodiments, the imaging reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one chelating agent, at least one non-catalytic divalent cation, at least one compound for reducing photo-damage, at least one reducing agent, at least one detergent and at least one viscosity agent. In some embodiments, the imaging reagent further comprises at least one amino acid or modified amino acid. In some embodiments, the imaging reagent lacks a reducing agent.
[00666] In some embodiments, the imaging reagent in the kit comprises: water; any one or any combination of two or more pH buffering agents comprising Tris-HCl (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (pH 5-7, 10-50 mM); at least one monovalent cation comprising NaCl (e.g., 25-100 mM); at least one chelating agent comprising EDTA (e.g., 0.1-0.7 mM); at least one non-catalytic cation comprising strontium acetate (e.g., 1-7 mM) and/or strontium nitrate (e.g., 1-7 mM); any one or any combination of two or more compounds for reducing photo-damage comprising Trolox (e.g., 0.1-0.5 mM), ascorbic acid (e.g., 10-75 mM), sinapic acid (e.g., 0.1-20 mM), TEMP (e.g., 0.1-20 mM), TEMPOH (e.g., 0.1-20 mM), kojic acid (e.g., 0.1-20 mM), gallic acid (e.g., 0.1-20 mM), caffeic acid (e.g., 0.1-20 mM) and/or ergothioneine (e.g., 0.1-20 mM); at least one reducing agent comprising DMSO (e.g., 0.1-0.7%) and/or TCEP (e.g., 0.1-0.7 mM); at least one detergent comprising Triton X100 (e.g., 0.1-0.5%) or Tween 20 (e.g., 0.01-0.05%); any one or any combination of two or more viscosity agents comprising sucrose (e.g., 0.1-0.5 M), ethylene glycol (e.g., 5-50%), propylene glycol (e.g., 0.1-10%) and/or glycerol (e.g., 1-8%). In some embodiments, the imaging reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM), L-arginine (e.g., 25-100 mM) and/or methionine (e.g., 0.1-5 mM). In some embodiments, the imaging reagent lacks a non-catalytic cation comprising strontium. In some embodiments, the imaging reagent lacks an amino acid or modified amino acid.
Kits Comprising Stepping Reagents
[00667] In some embodiments, the stepping reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one monovalent cation, at least one catalytic cation, at least one viscosity agent, at least one chelating agent, at least one amino acid, and at least one detergent. In some embodiments, the stepping reagent further comprises a plurality of nucleotides (e.g., non-conjugated free nucleotides). In some embodiments, the stepping reagent further comprises a plurality of sequencing polymerases. In some embodiments, the at least one catalytic cation promotes polymerase-catalyzed nucleotide incorporation. In some embodiments, in the stepping reagent, the plurality of nucleotides comprises chain terminating nucleotides where individual nucleotides comprise a chain terminating moiety attached to the 3’ sugar position. In some embodiments, in the stepping reagent, the plurality of nucleotides are not chain terminating nucleotides.
[00668] In some embodiments, the stepping reagent in the kit comprises: water; any one or any combination of two or more pH buffering agents comprising Tris (e.g., pH 7-9, 10-50 mM), Bis-Tris propane (e.g., pH 7-9, 10-50 mM), HEPES (e.g., pH 7-9, 10-50 mM) and/or MES (e.g., pH 5-7, 10-50 mM); any one or any combination of two or more monovalent cations comprising NaCl (e.g., 25-100 mM), KCl (e.g., 10-75 mM) and/or ammonium sulfate (e.g., 1-50 mM); any one or any combination of two or more catalytic cations comprising magnesium chloride (e.g., 1-30 mM), magnesium sulfate (e.g., 1-30 mM) and/or manganese chloride (e.g., 1-30 mM); any one or any combination of two or more viscosity agents comprising propylene glycol (e.g., 0.1-10%), ethylene glycol (e.g., 5-50%), glycerol (e.g., 1-8%), sucrose (e.g., 0.01-5 M) and/or trehalose (e.g., 0.01-5 M); at least one chelating agent comprising EDTA (e.g., 0.1-0.7 mM); at least one reducing agent comprising TCEP (e.g., 0.1-0.7 mM) and/or DMSO (e.g., 0.1-0.7%); and at least one detergent comprising Triton X100 (e.g., 0.1-0.5%) or Tween 20 (e.g., 0.01-0.05%). In some embodiments, the stepping reagent further comprises any one or any combination of two or more amino acids or modified amino acids comprising betaine (e.g., 50-500 mM), beta-alanine (e.g., 25-150 mM), L-arginine (e.g., 25-100 mM) and/or methionine (e.g., 0.1-5 mM). In some embodiments, the stepping reagent further comprises any one or any combination of two or more types of nucleotides dATP, dGTP, dCTP, dTTP and/or dUTP (e.g., 0.1-5 µM each type). In some embodiments, the stepping reagent comprises a plurality of detectably labeled nucleotides. In some embodiments, at least one type of nucleotides in the stepping reagent comprises detectably labeled nucleotides. In some embodiments, the detectable label comprises a fluorophore. In some embodiments, the stepping reagent comprises a plurality of non-labeled nucleotides. In some embodiments, the nucleotides in the stepping reagent comprise 3’ chain terminator nucleotide analogs. In some embodiments, the stepping reagent further comprises a plurality of sequencing polymerases (e.g., 100-600 nM). In some embodiments, the stepping reagent lacks a catalytic cation comprising magnesium or manganese.
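By way of a worked example only, the standard dilution relation C1·V1 = C2·V2 can be used to work out how much of each stock solution yields final concentrations within the exemplary stepping reagent ranges above. In the Python sketch below, the stock concentrations, the chosen targets, the 1000 µL final volume and the function names are all assumptions made for illustration; none of them is specified by the disclosure.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Volume of stock to add, from C1*V1 = C2*V2 (same concentration units)."""
    if stock_conc <= 0 or final_conc < 0 or final_conc > stock_conc:
        raise ValueError("need 0 <= final_conc <= stock_conc and stock_conc > 0")
    return final_conc * final_volume_ul / stock_conc


def mixing_plan(stocks, targets, final_volume_ul):
    """Return the stock volume per component plus the water needed to top up."""
    volumes = {
        name: stock_volume_ul(stocks[name], targets[name], final_volume_ul)
        for name in targets
    }
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


# Hypothetical stock concentrations (mM); not taken from the disclosure.
stocks_mM = {"Tris": 1000.0, "NaCl": 5000.0, "magnesium chloride": 1000.0, "EDTA": 500.0}
# Targets (mM) chosen inside the exemplary ranges quoted above.
targets_mM = {"Tris": 20.0, "NaCl": 50.0, "magnesium chloride": 10.0, "EDTA": 0.5}

plan = mixing_plan(stocks_mM, targets_mM, final_volume_ul=1000.0)
for component, volume in plan.items():
    print(f"{component}: {volume:.1f} uL")
```

The sketch covers only components expressed in molar units; percentage-based components such as detergents and glycols would need their own handling.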
Kits Comprising De-hybridization Reagents
[00669] In some embodiments, the de-hybridization reagent in the kit comprises at least one solvent, at least one pH buffering agent, at least one reducing agent, at least one monovalent salt and at least one crowding agent. In some embodiments, the de-hybridization reagent further comprises at least one chaotropic agent. In some embodiments, the de-hybridization reagent further comprises at least one nucleic acid condenser agent.
[00670] In some embodiments, the de-hybridization reagent in the kit comprises: any one or any combination of two or more solvents comprising water, acetonitrile (e.g., 10-20%) and/or formamide (e.g., 10-40%); any one or any combination of two or more pH buffering agents comprising MES (e.g., pH 5-7, 10-75 mM), Tris (e.g., pH 6-9, 10-50 mM), HEPES (e.g., pH 6-9, 10-50 mM) and/or PBS (phosphate buffered saline) (e.g., comprising disodium hydrogen phosphate and sodium chloride) (e.g., pH 5-8); at least one reducing agent comprising DMSO (e.g., 10-50%) or TCEP (e.g., 1-10 mM); at least one monovalent salt comprising NaCl (e.g., 0.25-2 M) and/or ammonium sulfate (e.g., 1-50 mM); any one or any combination of two or more crowding agents comprising PEG 200 (e.g., 10-50%), PEG 400 (e.g., 10-50%) and/or dextran sulfate (e.g., 150 kDa, 1-5%); at least one chaotropic agent comprising guanidinium hydrochloride (e.g., 50 mM-2 M) or guanidinium isothiocyanate (e.g., 50 mM-2 M); and any one or any combination of two or more nucleic acid condenser agents comprising dextran sulfate (e.g., 150 kDa or 500 kDa, 1-5%), a polyamine (e.g., 0.01-15%), poly-lysine (e.g., poly-L-lysine, 0.01-15%) and/or manganese chloride (e.g., 0.1-0.8 M). In some embodiments, the polyamine in the de-hybridization reagent can be in hydrochloride form. In some embodiments, the polyamine comprises spermine, spermidine, cadaverine or putrescine.
Cartridges Containing Reagents
[00671] The present disclosure provides one or more cartridges each containing one or more reagents used for conducting any of the methods for batch sequencing with or without reiterative sequencing, and any of the methods for re-seeding with or without reiterative sequencing described above. The cartridge can contain any of the reagents described herein including the trapping reagent, imaging reagent, stepping reagent and/or de-hybridization reagent. In some embodiments, the cartridge can be sub-divided into two or more separate reservoirs where each reservoir contains a different reagent. In some embodiments, the cartridge can be sub-divided into two or more separate spaces where each space can hold a container containing a reagent. For example, the cartridge can include at least four separate spaces. In some embodiments, each space holds a container comprising a different reagent, e.g. a trapping reagent container, an imaging reagent container, a stepping reagent container or a de-hybridization reagent container. In some embodiments, the cartridge is configured to fit into a nucleic acid sequencing apparatus. In some embodiments, the cartridge is connected to at least one capillary that is configured to deliver the contents of the cartridge to one or more supports that are integrated or assembled on a microfluidic flow cell.
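For illustration, the cartridge layout described above (separate spaces, each holding a container with a different reagent) could be modeled in instrument software roughly as follows. This Python sketch is hypothetical: the class name, method names and the four-space default are assumptions and do not describe any actual apparatus or control software.

```python
from dataclasses import dataclass, field

# The four reagent types named in the paragraph above.
REAGENTS = ("trapping", "imaging", "stepping", "de-hybridization")


@dataclass
class Cartridge:
    """Hypothetical model of a cartridge with a fixed number of spaces."""
    n_spaces: int = 4
    slots: dict = field(default_factory=dict)  # space index -> reagent name

    def load(self, space, reagent):
        """Place a reagent container into an empty space."""
        if reagent not in REAGENTS:
            raise ValueError(f"unknown reagent: {reagent!r}")
        if not 0 <= space < self.n_spaces:
            raise IndexError(f"space {space} does not exist")
        if space in self.slots:
            raise ValueError(f"space {space} is already occupied")
        self.slots[space] = reagent

    def missing(self):
        """List reagent types that have not been loaded yet."""
        return [r for r in REAGENTS if r not in self.slots.values()]


cartridge = Cartridge()
cartridge.load(0, "trapping")
cartridge.load(1, "imaging")
print(cartridge.missing())
```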
The Support
[00672] The present disclosure provides a support for use in conducting any of the batch sequencing, reiterative sequencing and/or re-seeding methods described herein. In some embodiments, the support is solid, semi-solid, or a combination of both. In some embodiments, the support is porous, semi-porous, non-porous, or any combination of porosity. In some embodiments, the support can be substantially planar, concave, convex, or any combination thereof. In some embodiments, the support can be cylindrical, for example comprising a capillary or interior surface of a capillary.
[00673] The support comprises any material, including but not limited to glass, fused silica, silicon, a polymer (e.g., polystyrene (PS), macroporous polystyrene (MPPS), polymethylmethacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high density polyethylene (HDPE), cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET)), or any combination thereof. Various compositions of both glass and plastic substrates are contemplated.
[00674] In some embodiments, the surface of the support can be substantially smooth and lack contours and texture. In some embodiments, the support can be regularly or irregularly contoured or textured, including protrusions, bumps, wells, etchings, pores, three-dimensional scaffolds, or any combination thereof. In some embodiments, the support comprises contours arranged in a pre-determined pattern. In some embodiments, the support comprises contours arranged in a repeating pattern. In some embodiments, the support comprises interstitial regions between the contours, where the interstitial regions are arranged in a pre-determined pattern. In some embodiments, the interstitial regions are arranged in a repeating pattern.
[00675] In some embodiments, the contours and interstitial regions can be fabricated using any combination of photo-chemical, photo-lithography, electron beam lithography, micro- or nano-imprint lithography, ink-jet printing, or micron-scale printing and/or nano-scale printing.
[00676] In some embodiments, the contours can be functionalized to promote tethering/immobilizing nucleic acid molecules (e.g., capture primers, pinning primers and/or template molecules) and/or for tethering an enzyme (e.g., a polymerase). In some embodiments, the interstitial regions can be modified to inhibit tethering nucleic acid molecules (e.g., capture primers, pinning primers and/or template molecules) and/or for inhibiting tethering an enzyme (e.g., a polymerase).
[00677] In some embodiments, the support comprises at least one region (e.g., a feature) which can be functionalized to tether/immobilize nucleic acid molecules and/or enzymes. In some embodiments, the features are arranged on the support in a non-predetermined manner (e.g., randomly positioned features; e.g., FIG. 14A part (i)). In some embodiments, the features are arranged on the support in a predetermined manner (e.g., patterned features; e.g., FIGs. 14B parts (iii) and (iv)). In some embodiments, the features are arranged on the support in a repeating pattern (e.g., FIGs. 14B parts (iii) and (iv)).
[00678] In some embodiments, a support comprises a plurality of features located at random and non-predetermined positions on the support. In some embodiments, individual features can attach to a nucleic acid molecule (e.g., surface capture primers, surface pinning primers and/or template molecules). Each of the features on the support can be functionalized with a chemical compound to attach to a nucleic acid molecule.
[00679] For example, the features on the support can attach to surface capture primers (e.g., see FIG. 14A part (i)). In some embodiments, the surface capture primers can be attached to the support such that some of the nearest neighbor surface capture primers touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. The dotted lines that surround the four capture primers represent nearest neighbor capture primers that touch each other (e.g., FIG. 14A part (i)).
[00680] In some embodiments, the surface capture primers on the support can attach to nucleic acid template molecules having one of four different batch sequences (e.g., see FIG. 14A part (ii)). In some embodiments, the template molecules can attach to the support (via attachment to the capture primers) such that some of the nearest neighbor template molecules touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. The dotted lines that surround the four template molecules represent nearest neighbor template molecules that touch each other (e.g., FIG. 14A part (ii)).
[00681] In some embodiments, the support comprises a contour and at least one feature on or near the contour for tethering nucleic acid molecules. For example, one or more wells (e.g., a plurality of contours) can be fabricated on the support where the bottom of individual wells include a feature having a chemical modification for tethering one or more nucleic acid molecules. The skilled artisan will recognize that the support can be fabricated with any type of contour(s) and feature(s) that are on or near the contour(s), where the features are designed to tether at least one nucleic acid molecule.
[00682] In some embodiments, the support lacks contours. In some embodiments, the support lacks features arranged in a pre-determined pattern where the features have a chemical functionality for tethering nucleic acid molecules and/or enzymes to the support. In some embodiments, the support comprises features positioned at random non-predetermined locations on the support. In some embodiments, the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to inhibit tethering nucleic acid molecules or enzymes.
[00683] In some embodiments, any of the features for tethering nucleic acids and/or enzymes can be positioned on the support using ink-jet printing, or micron-scale or nanoscale printing. In some embodiments, the features can be made in any shape including for example, circular, square, triangular or rectangular (e.g., FIGs. 14A parts (i) and (iii)).
[00684] In some embodiments, at least one surface of the support can be modified with a chemical compound that enables attachment of a polymer coating to the support. For example, the support can be modified with a silane compound. In some embodiments, the silane compound can bind a polymer coating. In some embodiments, at least one surface of the support is passivated with at least one polymer coating layer (e.g., FIG. 14C). In some embodiments, the support is passivated with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more polymer coating layers. In some embodiments, the coating forms a continuous layer on the support. In some embodiments, the coating forms no pre-determined pattern.
[00685] In some embodiments, the surface coating may be patterned, such that the chemical modification layers are confined to one or more discrete regions of the support. For example, the coating may be patterned using photolithographic techniques to create an ordered array or random pattern of chemically-modified regions on the support. Alternately or in combination, the coating may be patterned using, e.g., contact printing and/or ink-jet printing techniques. In some embodiments, the coating is distributed on the support in a predetermined pattern, for example the pre-determined pattern comprises or spots arranged in rows and/or columns or other pre-determined patterns. In some embodiments, the coating having a pre-determined pattern comprises at least one interstitial region that lacks a polymer coating. In some embodiments, the passivated layer forms a porous or semi-porous layer.
[00686] In some embodiments, at least one of the polymer layers comprises a hydrophilic polymer layer. In some embodiments, at least one polymer layer comprises polymer molecules having a molecular weight of at least 1000 Daltons. The hydrophilic polymer layer can comprise polyethylene glycol (PEG). The hydrophilic polymer layer can comprise unbranched PEG. The hydrophilic polymer layer can comprise branched PEG having at least 4 branches, for example the branched PEG comprises 4-16 branches. In some embodiments, the hydrophilic polymer layer comprises cross-linking or lacks cross-linking. In some embodiments, the hydrophilic polymer layer comprises cross-linking to form a hydrogel.
[00687] In some embodiments, the hydrophilic polymer layer comprises a monolayer having unbranched polymers which can form a brush monolayer. In some embodiments, the brush monolayer can form an extended brush monolayer. In some embodiments, the brush monolayer comprises a plurality of unbranched polymers where one end of a given unbranched polymer is attached to the support and the other end of the same given unbranched polymer is attached to an oligonucleotide primer (e.g., capture primer or pinning primer). In some embodiments, the density of the plurality of oligonucleotide primers attached to the brush monolayer is about 10^2 - 10^15 per um^2, for example, between about 10^10 and about 10^15 surface oligonucleotide primers per mm^2, between about 10^5 and about 10^15 oligonucleotide primers per mm^2, between about 10^3 and about 10^14 oligonucleotide primers per mm^2, between about 10^4 and about 10^13 oligonucleotide primers per mm^2, between about 10^5 and about 10^12 oligonucleotide primers per mm^2, between about 10^6 and about 10^11 oligonucleotide primers per mm^2, between about 10^7 and about 10^10 oligonucleotide primers per mm^2, or between about 10^8 and about 10^10 oligonucleotide primers per mm^2, or any range therebetween.
[00688] In some embodiments, the coating layer has a degree of hydrophilicity which can be measured as a water contact angle, where the water contact angle is no more than 45 degrees.
[00689] In some embodiments, any layer of the polymer coating includes a plurality of oligonucleotide primers covalently tethered to the polymer layer. In some embodiments, the plurality of oligonucleotide primers are distributed at a plurality of depths throughout any of the polymer layers. In some embodiments, the density of the plurality of oligonucleotide primers in any of the polymer layers is about 10^2 - 10^15 per um^2, for example, between about 10^10 and about 10^15 surface oligonucleotide primers per mm^2, between about 10^5 and about 10^15 oligonucleotide primers per mm^2, between about 10^3 and about 10^14 oligonucleotide primers per mm^2, between about 10^4 and about 10^13 oligonucleotide primers per mm^2, between about 10^5 and about 10^12 oligonucleotide primers per mm^2, between about 10^6 and about 10^11 oligonucleotide primers per mm^2, between about 10^7 and about 10^10 oligonucleotide primers per mm^2, or between about 10^8 and about 10^10 oligonucleotide primers per mm^2, or any range therebetween. In some embodiments, individual oligonucleotide primers comprise nucleic acid molecules comprising DNA, RNA, DNA/RNA chimeric or analogs thereof. In some embodiments, the plurality of oligonucleotide primers are about 10 - 100 nucleotides in length. In some embodiments, individual oligonucleotide primers in the plurality comprise 3’ extendible ends or 3’ non-extendible ends. In some embodiments, the 3’ non-extendible ends comprise a 3’ chain terminating moiety. In some embodiments, individual oligonucleotide primers have their 5’ or 3’ ends or an internal region attached to the polymer layer. In some embodiments, the 5’ ends of the plurality of oligonucleotide primers are attached to the polymer layer. In some embodiments, the plurality of oligonucleotide primers are randomly distributed throughout and embedded within at least one of the polymer layers. In some embodiments, the plurality of oligonucleotide primers are distributed in or on at least one of the polymer layers in a random manner or a pre-determined pattern. In some embodiments, the plurality of oligonucleotide primers are distributed in or on at least one of the polymer layers in a non-random pre-determined pattern, for example the pre-determined pattern comprises stripes or spots arranged in rows and/or columns or other pre-determined patterns.
[00690] In some embodiments, the support comprises a first layer comprising a first monolayer having hydrophilic polymer molecules tethered to the support. In some embodiments, at least some of the polymer molecules in the first layer are covalently tethered to oligonucleotide primers. In some embodiments, the tethered oligonucleotide primers in the first monolayer are arranged in a random manner or in a pre-determined pattern. In some embodiments, the polymer molecules in the first layer are not tethered to oligonucleotide primers.
[00691] In some embodiments, the support further comprises a second layer comprising a second monolayer having hydrophilic polymer molecules tethered to the first monolayer. In some embodiments, at least some of the polymer molecules in the second layer are covalently tethered to oligonucleotide primers. In some embodiments, the tethered oligonucleotide primers in the second monolayer are arranged in a random manner or in a pre-determined pattern. In some embodiments, the polymer molecules in the second layer are not tethered to oligonucleotide primers.
[00692] In some embodiments, the support further comprises a third layer comprising a third monolayer having hydrophilic polymer molecules tethered to the second monolayer. In some embodiments, at least some of the polymer molecules in the third layer are covalently tethered to oligonucleotide primers. In some embodiments, the tethered oligonucleotide primers in the third monolayer are arranged in a random manner or in a pre-determined pattern. In some embodiments, the polymer molecules in the third layer are not tethered to oligonucleotide primers.
[00693] In some embodiments, the support comprises a functionalized polymer coating layer covalently bound at least to a portion of the support via a chemical group on the support, a primer grafted to the functionalized polymer coating, and a water-soluble protective coating on the primer and the functionalized polymer coating. In some embodiments, the functionalized polymer coating comprises a poly(N-(5-azidoacetamidylpentyl)acrylamide-co-acrylamide) (PAZAM).
[00694] In some embodiments, at least one of the polymer layers comprises oligonucleotide primers including capture primers, pinning primers, or a mixture of capture and pinning primers. In some embodiments, the plurality of oligonucleotide primers comprise one type of capture primer (e.g., having the same batch capture primer sequence) or a mixture of 2-100 different types of capture primers (e.g., having 2-100 different batch capture primer sequences). In some embodiments, the plurality of oligonucleotide primers comprise one type of pinning primer (e.g., having the same batch pinning primer sequence) or a mixture of 2-100 different types of pinning primers (e.g., having 2-100 different batch pinning primer sequences).
[00695] In some embodiments, individual capture primers (e.g., which are tethered to and/or embedded in a polymer layer) can be used in an on-support amplification reaction. In some embodiments, individual capture primers hybridize to a capture primer binding site in a circularized library molecule, and rolling circle amplification can be conducted to generate a concatemer template molecule which is tethered and/or embedded in the polymer layer.
[00696] In some embodiments, individual capture primers (e.g., which are tethered to and/or embedded in a polymer layer) can be used in an in-solution amplification workflow. In some embodiments, individual surface capture primers can hybridize to a surface capture primer binding site in a nascent concatemer template molecule, and rolling circle amplification can continue on the polymer layer to generate a concatemer template molecule which is tethered and/or embedded in the polymer layer.
[00697] In some embodiments, the density of the surface capture primers in a polymer layer can be modulated (e.g., increased or decreased) to achieve a desired density of immobilized concatemer template molecules on a support. Generally, a polymer layer having a high density of surface capture primers will generate concatemer template molecules that are tightly packed and immobilized to the support at a density of about 10^5 - 10^15 per mm^2, which cannot be achieved using supports fabricated to include nano-scale features for attachment of template molecules.
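As a rough illustration of what such densities imply geometrically, the following sketch converts a surface density into an approximate mean center-to-center spacing. It assumes, purely for illustration, that the molecules are spread evenly on a square grid; the function names and the grid model are hypothetical and are not part of the disclosure.

```python
import math

UM_PER_MM = 1000.0  # 1 mm = 1000 um, so 1 mm^2 = 1e6 um^2


def per_mm2_to_per_um2(density_per_mm2: float) -> float:
    """Convert a surface density from molecules per mm^2 to molecules per um^2."""
    return density_per_mm2 / (UM_PER_MM ** 2)


def mean_spacing_um(density_per_mm2: float) -> float:
    """Approximate center-to-center spacing in um.

    Assumption (illustrative only): molecules sit on an evenly spaced square
    grid, so each molecule occupies an area of 1/density.
    """
    if density_per_mm2 <= 0:
        raise ValueError("density must be positive")
    area_per_molecule_mm2 = 1.0 / density_per_mm2
    return math.sqrt(area_per_molecule_mm2) * UM_PER_MM
```

Under this simplified model, a density of 10^6 per mm^2 corresponds to one molecule per um^2 and a spacing of about 1 um.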
[00698] In some embodiments, a surface single pinning primer (e.g., which is tethered to or embedded in a polymer layer) can hybridize to a surface pinning primer binding site in a concatemer template molecule to generate a concatemer template molecule which is tethered or embedded (e.g., pinned down) in the polymer layer.
[00699] In some embodiments, at least one of the polymer layers comprises a plurality of capture primers and/or pinning primers having a cleavable region that is cleavable with a restriction endonuclease enzyme. For example, the cleavable region comprises a recognition site for a type I, type II, type IIS, type IIB, type III, or type IV restriction enzyme. In some embodiments, the plurality of surface capture primers and/or pinning primers include a cleavable region that is cleavable with an enzyme that generates an abasic site. For example, the cleavable region comprises at least one nucleotide having a scissile moiety including uridine, 8-oxo-7,8-dihydroguanine or deoxyinosine. In some embodiments, the plurality of capture primers and/or pinning primers lack a cleavable region.
[00700] In some embodiments, the support comprises at least one partition/barrier that creates separate regions of the support. For example, the partition/barrier can prevent fluid flow on one portion of the support. The partition/barrier can inhibit nucleic acid and/or enzyme reactions on a portion of the support. In some embodiments, the partition/barrier can be placed on the support. In some embodiments, the partition/barrier is not placed on the support but is positioned to block fluid flow onto the support.
[00701] In some embodiments, the support lacks partitions/barriers that would create separate regions of the support. For example, the support is passivated with at least one polymer coating formed as a continuous layer, and at least one of the polymer layers comprises a plurality of surface capture primers that are randomly distributed throughout and on the polymer layer. The surface capture primers can be used to generate immobilized concatemer template molecules. Thus, the immobilized template molecules are in fluid communication with each other in a massively parallel manner with no barriers to physically separate different batches of template molecules. Instead, sub-populations of template molecules carry different batch sequencing primer binding sites, which enables batch sequencing. Asynchronous sequencing is achieved using concatemer template molecules in fluid communication with each other on the same non-partitioned support.
Fragmenting Nucleic Acids
[00702] The present disclosure provides methods for preparing nucleic acid library molecules for use in any of the methods described including batch sequencing, re-seeding, reiterative sequencing, padlock probe workflows, single-stranded splint workflows and/or double-stranded splint workflows.
[00703] In some embodiments, the insert region of a nucleic acid library molecule comprises a sequence of interest extracted from any source. The insert region can be prepared using recombinant nucleic acid technology including but not limited to any combination of vector cloning, transgenic host cell preparation, host cell culturing and/or PCR amplification.
[00704] In some embodiments, the insert region can be in fragmented or un-fragmented form, and can be used to prepare linear nucleic acid library molecules. Fragmented forms of the insert region can be obtained by mechanical force, enzymatic or chemical fragmentation methods. The fragmented insert regions can be generated using procedures that yield a population of fragments having overlapping sequences or non-overlapping sequences.
[00705] Mechanical fragmentation typically generates randomly fragmented nucleic acid molecules. Mechanical fragmentation methods include mechanical shearing such as fluid shear, constant shear and pulsatile shear. Mechanical fragmentation methods also include mechanical stress including sonication, nebulization and acoustic cavitation. In some embodiments focused acoustic energy can be used to randomly fragment nucleic acid molecules. A commercially-available apparatus (e.g., Covaris®) can be used to fragment nucleic acid molecules using focused acoustic energy.
[00706] Enzymatic fragmentation procedures can be conducted under conditions suitable to generate randomly or non-randomly fragmented nucleic acid molecules. For example, restriction endonuclease enzyme digestion can be conducted to completion to generate non-randomly fragmented nucleic acid molecules. Alternatively, partial or incomplete restriction enzyme digestion can be conducted to generate randomly-fragmented nucleic acid molecules. Enzymatic fragmentation using restriction endonuclease enzymes includes any one or any combination of two or more restriction enzymes selected from a group consisting of type I, type II, type IIS, type IIB, type III, or type IV restriction enzymes. Enzymatic fragmentation includes digestion of the nucleic acid with a rare-cutting restriction enzyme, comprising Not I, Asc I, Bae I, AspC I, Pac I, Fse I, Sap I, Sfi I or Psr I. Enzymatic fragmentation includes use of any combination of a nicking restriction endonuclease, endonuclease and/or exonuclease. Enzymatic fragmentation can be achieved by conducting a nick translation reaction.
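The non-random outcome of a complete restriction digest can be illustrated with a minimal in-silico sketch. It models only the top strand of a linear molecule and uses Not I (recognition sequence GCGGCCGC, cut after the second base) as the example enzyme; the function name, the toy input sequence and the single-strand simplification are assumptions made for illustration.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut `sequence` at every occurrence of `site` (top strand only).

    `cut_offset` is the number of bases of the recognition site that stay
    on the upstream fragment. Returns the fragments in 5' to 3' order.
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


NOTI_SITE = "GCGGCCGC"  # Not I recognition sequence
NOTI_CUT = 2            # Not I cuts GC^GGCCGC on the top strand

# Hypothetical toy sequence with two Not I sites.
toy = "AAAA" + NOTI_SITE + "TTTT" + NOTI_SITE + "CC"
fragments = digest(toy, NOTI_SITE, NOTI_CUT)
```

Because every site is cut, the same input always yields the same fragment set. A partial digest, in which only some sites are cut, would give a mixture of fragment sets.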
[00707] In some embodiments, enzymatic fragmentation can be achieved by reacting nucleic acids with an enzyme mixture, for example an enzyme that generates single-stranded nicks and another enzyme that catalyzes double-stranded cleavage. An exemplary enzyme mixture is Fragmentase (e.g., from New England Biolabs®).
[00708] Fragments of the insert region can be generated with PCR using sequence-specific primers that hybridize to target regions in genomic DNA samples to generate insert regions having known fragment lengths and sequences.
[00709] Targeted genome fragmentation methods using CRISPR/Cas9 can be used to generate fragmented insert regions.
[00710] Fragments of the insert portion can also be generated using a transposase-based tagmentation method, for example using NEXTERA® (from Epicentre®).
[00711] The insert region can be single-stranded or double-stranded. The ends of the double-stranded insert region can be blunt-ended, or have a 5’ overhang or a 3’ overhang end, or any combination thereof. One or both ends of the insert region can be subjected to an enzymatic tailing reaction to generate a non-template poly-A tail by employing a terminal transferase reaction. The ends of the insert region can be compatible for joining to at least one adaptor sequence (e.g., universal adaptor sequence or batch-specific adaptor sequence).
[00712] The insert region can be any length, for example the insert region can be about 50-250, or about 250-500, or about 500-750, or about 750-1000, or about 1000-1500, or about 1500-2000 bases or base pairs in length, or any range therebetween. In some embodiments, the insert region can be 2000-5000 bases or base pairs in length.
[00713] The fragments containing the insert region can be subjected to a size selection process, or the fragments are not size selected. For example, the fragments can be size selected by gel electrophoresis and gel slice extraction. The fragments can be size selected using a solid phase adherence/immobilization method which typically employs micro paramagnetic beads coated with a chemical functional group that interacts with nucleic acids under certain ionic strength conditions with or without polyethylene glycol or polyalkylene glycol. Commercially-available solid phase adherence beads include SPRI (Solid Phase Reversible Immobilization) beads from Beckman Coulter® (AMPURE XP® paramagnetic beads, catalog No. B23318), MAGNA PURE® magnetic glass particles (Roche Diagnostics®, catalog No. 03003990001), MAGNASIL paramagnetic beads from Promega® (catalog No. MD1360), MAGTRATION® paramagnetic beads and system from Precision System Science (catalog Nos. A1120 and A1060), MAG-BIND® from Omega Bio-Tek (catalog No. M1378-01), MAGPREP® silica from Millipore® (catalog No. 101193), SNARE DNA purification systems from Bangs Laboratories® (catalog Nos. BP691, BP692 and BP693), and CHEMAGEN M-PVA beads from Perkin Elmer® (catalog No. CMG-200).
[00714] In some embodiments, the fragmented nucleic acids can be subjected to enzymatic reactions for end-repair and/or A-tailing. The fragmented nucleic acids can be contacted with a plurality of enzymes under a condition suitable to generate nucleic acid fragments having blunt-ended 5’ phosphorylated ends. In some embodiments, the plurality of enzymes generates blunt-ended fragments having a non-template A-tail at their 3’ ends. The plurality of enzymes comprise two or more enzymes that can catalyze nucleic acid end-repair, phosphorylation and/or A-tailing. The end-repair enzymes include a DNA polymerase (e.g., T4 DNA polymerase) and Klenow fragment. The 5’ end phosphorylation enzyme comprises T4 polynucleotide kinase. The A-tailing enzyme includes a Taq polymerase (e.g., non-proofreading polymerase) and dATP. In some embodiments, the fragmenting, end-repair, phosphorylation and A-tailing can be conducted in a one-pot reaction using a mixture of enzymes.
Appending Adaptors to Fragmented or Unfragmented Nucleic Acids
[00715] In some embodiments, individual fragmented (or unfragmented) nucleic acids can be covalently joined to at least one adaptor sequence for library preparation. In general, a nucleic acid fragment is covalently joined at both ends to one or more adaptors to generate a linear library molecule having the arrangement left adaptor-insert-right adaptor. In some embodiments, at least one fragment in the population of fragmented nucleic acids comprises a sequence-of-interest. Individual library molecules in the population of library molecules can have an insert region that is the same as or different from other library molecules in the population. In some embodiments, about 1-10 ng, or about 10-50 ng, or about 50-100 ng, or any range therebetween, of input fragmented nucleic acids can be appended to one or more adaptors to generate a linear library.
[00716] Individual nucleic acid fragments can be appended on one or both ends to at least one adaptor sequence to form a recombinant nucleic acid linear library molecule having the general arrangement left adaptor-insert-right adaptor.
[00717] In some embodiments, the nucleic acid fragments can be appended with any one or any combination of two or more adaptors. In some embodiments, the nucleic acid fragments are arranged in any order. In some embodiments, the adaptors comprise an adaptor having a binding sequence for a surface pinning primer binding site sequence (120), an adaptor having a surface capture primer binding site sequence (130), an adaptor having a forward sequencing primer binding site sequence (140), an adaptor having a reverse sequencing primer binding site sequence (150), an adaptor having a left sample index sequence (160), an adaptor having a right sample index sequence (170), an adaptor having a left unique identification sequence (180), an adaptor having a batch-specific barcode sequence (195) and/or an adaptor sequence for binding a compaction oligonucleotide.
[00718] In some embodiments, any of the adaptors comprise universal adaptor sequences or batch-specific adaptor sequences.
[00719] Exemplary linear library molecules are shown in FIGs. 21, 22, 23A, 23B, 25A, 25B, 27, 28, 29, 30A, and 30B. The skilled artisan appreciates that many other embodiments of linear library molecules comprising adaptor sequences with other arrangements are possible.
[00720] The adaptors can be prepared using chemical synthesis procedures using native nucleotides with or without nucleotide analogs or modified nucleotide linkages that confer certain properties, including resistance to enzymatic digestion, or increased thermal stability. Examples of nucleotide analogs and modified nucleotide linkages that inhibit nuclease digestion include phosphorothioate, 2’-O-methyl RNA, inverted dT, and 2’,3’-dideoxy-dT. Insert regions that include locked nucleic acids (LNA) have increased thermal stability.
[00721] The insert region can be joined at one or both ends to at least one adaptor sequence using a ligase enzyme and/or primer extension reaction to generate a linear library molecule. Covalent linkage between an insert region and the adaptor(s) can be achieved with a DNA or RNA ligase. Exemplary DNA ligases that can ligate double-stranded DNA molecules include T4 DNA ligase and T7 DNA ligase. An adaptor sequence can be appended to an insert sequence by PCR using a tailed primer having a 5’ region carrying an adaptor sequence and a 3’ region that is complementary to a portion of the insert sequence. An adaptor sequence can be appended to an insert sequence which is flanked on one side or both sides with first and second adaptor sequences by PCR using a tailed primer having a 5’ region carrying a third adaptor sequence and a 3’ region that is complementary to a portion of the first or second adaptor sequence.
[00722] In some embodiments, the linear single stranded library molecule (100) further comprises at least one junction adaptor sequence located between any of the adaptor sequences described herein (e.g., see FIGs. 32 and 33). For example, a first left junction adaptor sequence (121) can be located upstream (e.g., located 5’) of the adaptor sequence for a surface pinning primer binding site sequence (120). In some embodiments, a second left junction adaptor sequence (125) can be located between the adaptor sequence for a surface pinning primer binding site sequence (120) and the left sample index sequence (160). In some embodiments, a third left junction adaptor sequence (165) can be located between the left sample index sequence (160) and the adaptor sequence for the forward sequencing primer binding site sequence (140). In some embodiments, a fourth junction adaptor sequence (145) can be located between the adaptor sequence for the forward sequencing primer binding site sequence (140) and the sequence-of-interest (e.g., insert region (110)). In some embodiments, a first right junction adaptor sequence (131) can be located downstream (e.g., located 3’) of the adaptor sequence for a surface capture primer binding site sequence (130). In some embodiments, a second right junction adaptor sequence (135) can be located between the adaptor sequence for a surface capture primer binding site sequence (130) and the right sample index sequence (170). In some embodiments, a third right junction adaptor sequence (175) can be located between the right sample index sequence (170) and the adaptor sequence for a reverse sequencing primer binding site sequence (150). In some embodiments, a fourth right junction adaptor sequence (155) can be located between the adaptor sequence for a reverse sequencing primer binding site sequence (150) and the sequence of interest (e.g., insert region (110)).
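The relative positions described in this paragraph imply one possible 5’ to 3’ ordering of the elements, which the following sketch makes explicit by assembling a linear library molecule from whichever elements are supplied. The ordering is inferred from the junction placements above; the function name and the short placeholder sequences in the usage note are hypothetical, and only the insert region (110) is treated as required.

```python
# Reference numerals listed 5' to 3', as inferred from the junction
# placements described in the text (junction adaptors are optional).
ELEMENT_ORDER = [
    121,  # first left junction adaptor
    120,  # surface pinning primer binding site
    125,  # second left junction adaptor
    160,  # left sample index
    165,  # third left junction adaptor
    140,  # forward sequencing primer binding site
    145,  # fourth junction adaptor (next to the insert)
    110,  # insert region (sequence of interest)
    155,  # fourth right junction adaptor (next to the insert)
    150,  # reverse sequencing primer binding site
    175,  # third right junction adaptor
    170,  # right sample index
    135,  # second right junction adaptor
    130,  # surface capture primer binding site
    131,  # first right junction adaptor
]


def assemble_library_molecule(parts: dict[int, str]) -> str:
    """Concatenate the supplied elements in 5' to 3' order.

    `parts` maps a reference numeral to its sequence. Elements that are
    not supplied are skipped; unknown numerals are rejected.
    """
    unknown = set(parts) - set(ELEMENT_ORDER)
    if unknown:
        raise KeyError(f"unknown element numerals: {sorted(unknown)}")
    if 110 not in parts:
        raise ValueError("an insert region (110) is required")
    return "".join(parts.get(numeral, "") for numeral in ELEMENT_ORDER)
```

For example, `assemble_library_molecule({120: "GG", 140: "AAA", 110: "NNNN", 150: "CCC", 130: "TT"})` joins the placeholder elements in the order 120, 140, 110, 150, 130.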
[00723] Any of the junction adaptor sequences comprise any sequence and can be 3-60 nucleotides in length. Any of the junction adaptor sequences comprise a universal sequence, a batch-specific sequence, or a unique sequence. Any of the junction adaptor sequences comprise a random sequence (e.g., NNN) having 3-20 nucleotides. Any of the junction adaptor sequences comprise a binding sequence for an amplification primer, a sequencing primer or a compaction oligonucleotide. Any of the junction adaptor sequences comprise a binding sequence for an immobilized capture primer. Any of the junction adaptor sequences comprise a sample index sequence. Any of the junction adaptor sequences comprise a unique identification sequence (e.g., UMI). Any of the junction adaptor sequences, particularly junction adaptor sequence (145), comprise a Tn5 transposon-end sequence, for example 5’-AGATGTGTATAAGAGACAG-3’ (SEQ ID NO: 34). Any of the junction adaptor sequences, particularly junction adaptor sequence (155), comprise a Tn5 transposon-end sequence, for example 5’-CTGTCTCTTATACACATCT-3’ (SEQ ID NO: 35). The Tn5 transposon-end sequences can be introduced into the linear single stranded library molecule (100) via a transposase-mediated reaction which includes contacting double-stranded input DNA (e.g., genomic DNA) with a Tn5-type transposase enzyme, and a double-stranded oligonucleotide comprising the Tn transposon-end sequence linked to an adaptor sequence or a sample index sequence under a condition that is suitable to form a transposon synaptic complex. In the double-stranded oligonucleotide, the Tn transposon-end sequence can be located 5’ or 3’ relative to an adaptor sequence or a sample index sequence.
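The two exemplary Tn5 transposon-end sequences given above are reverse complements of each other, which is consistent with their placement on opposite sides of the insert. A short sketch confirms this; the helper name `reverse_complement` is illustrative.

```python
# Translation table for DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


SEQ_ID_NO_34 = "AGATGTGTATAAGAGACAG"  # exemplary sequence for junction adaptor (145)
SEQ_ID_NO_35 = "CTGTCTCTTATACACATCT"  # exemplary sequence for junction adaptor (155)

are_reverse_complements = reverse_complement(SEQ_ID_NO_34) == SEQ_ID_NO_35
```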
[00724] In some embodiments, a linear single stranded library molecule (100) can be generated by employing a ligation reaction and an optional primer extension reaction. The library molecule can be generated by joining the first end of a double-stranded insert region (110) to a first double-stranded adaptor, and joining the second end of the double-stranded insert region (110) to a second double-stranded adaptor. The first and second double-stranded adaptors each comprise two nucleic acid strands that are fully complementary along their lengths.
[00725] In some embodiments, individual double-stranded insert regions (110) can be joined to a first and a second double-stranded adaptor using a DNA ligase enzyme to generate a double-stranded recombinant molecule. In some embodiments the first and second double-stranded adaptors carry the same adaptor sequences. In some embodiments the first and second double-stranded adaptors carry different adaptor sequences.
[00726] In some embodiments, the library molecule can be generated by joining the first end of a double-stranded insert region (110) to a first double-stranded adaptor having a forward sequencing primer binding site sequence (140), and joining the second end of the double-stranded insert region (110) to a second double-stranded adaptor having a reverse sequencing primer binding site sequence (150). In some embodiments, the joining is conducted using a DNA ligase enzyme to generate a double-stranded recombinant molecule. In some embodiments, the first double-stranded adaptor further comprises a left sample index sequence (160) and/or a surface pinning primer binding site sequence (120). In some embodiments, the second double-stranded adaptor further comprises a right sample index sequence (170) and/or a surface capture primer binding site sequence (130).
[00727] In some embodiments, the ligating end of the first and/or the second double-stranded adaptors comprise a blunt end, or an overhang end (e.g., 5’ or 3’ overhang end).
[00728] In some embodiments, a linear single stranded library molecule (100) can be generated by employing a ligation reaction and primer extension reaction. The library molecule can be generated by joining the first end of a double-stranded insert region (110) to a first double-stranded Y-shaped adaptor (e.g., a first forked adaptor), and joining the second end of a double-stranded insert region (110) to a second double-stranded Y-shaped adaptor (e.g., a second forked adaptor). The first and second Y-shaped adaptors each comprise two nucleic acid strands, where a portion of the two strands are fully complementary to each other and are annealed together and another portion of the two strands are not complementary to each other and are mismatched. In some embodiments, the ligating end of the first and second Y-shaped adaptors comprise an annealed portion that forms a blunt end or an overhang end (e.g., 5’ or 3’ overhang end).
[00729] In some embodiments the first and second Y-shaped adaptors carry the same adaptor sequences. In some embodiments the first and second Y-shaped adaptors carry different adaptor sequences.
[00730] In some embodiments, the first strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can include at least a portion of an adaptor sequence having a forward sequencing primer binding site sequence (140) (or a complementary sequence thereof). In some embodiments, the first strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can further include a left sample index sequence (160). In some embodiments, the first strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can further include an adaptor sequence having a surface pinning primer binding site sequence (120).
[00731] In some embodiments, the second strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can include at least a portion of an adaptor sequence having a reverse sequencing primer binding site sequence (150) (or a complementary sequence thereof). In some embodiments, the second strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can further include a right sample index sequence (170). In some embodiments, the second strand of the annealed portion and/or the mismatched portion of the Y-shaped adaptor can further include an adaptor sequence having a surface capture primer binding site sequence (130).
[00732] The double-stranded insert region (110) can be joined to the first and second double-stranded Y-shaped adaptors using a DNA ligase enzyme to generate a double-stranded recombinant molecule.
[00733] In some embodiments, the double-stranded recombinant molecules which are generated by ligating the insert region (110) to double-stranded adaptors or Y-shaped adaptors can be subjected to a denaturing condition to generate single-stranded recombinant molecules, and then a primer extension reaction. At least one additional adaptor sequence can be appended to the recombinant molecules by conducting a primer extension reaction using tailed primers (e.g., tailed PCR primers), by contacting/hybridizing the single-stranded recombinant molecules with a plurality of first tailed primers and conducting at least one primer extension reaction to generate a first double-stranded tailed extension product.
[00734] In some embodiments, an additional adaptor sequence can be appended to the first double-stranded tailed extension product by conducting a primer extension reaction using tailed primers (e.g., tailed PCR primers), by contacting/hybridizing the first double-stranded tailed extension product with a plurality of second tailed primers and conducting at least one primer extension reaction to generate a second double-stranded tailed extension product.
[00735] In some embodiments, the plurality of first tailed primers each comprise a 5’ region carrying an adaptor sequence having a surface capture primer binding site sequence (130), and a 3’ region that is complementary to at least a portion of the adaptor sequence having a reverse sequencing primer binding site sequence (150) of the single-stranded recombinant molecules.
[00736] In some embodiments, the plurality of first tailed primers each comprise a 5’ region carrying an adaptor sequence having a surface capture primer binding site sequence (130), an internal region comprising a right sample index sequence (170), and a 3’ region that is complementary to at least a portion of the adaptor sequence having a reverse sequencing primer binding site sequence (150) of the single-stranded recombinant molecules.
[00737] In some embodiments, the plurality of second tailed primers each comprise a 5’ region carrying an adaptor sequence having a surface pinning primer binding site sequence (120), and a 3’ region that is complementary to at least a portion of the adaptor sequence having a forward sequencing primer binding site sequence (140) of the first double-stranded tailed extension product.
[00738] In some embodiments, the plurality of second tailed primers each comprise a 5’ region carrying an adaptor sequence having a surface pinning primer binding site sequence (120), an internal region comprising a left sample index sequence (160), and a 3’ region that is complementary to at least a portion of the adaptor sequence having a forward sequencing primer binding site sequence (140) of the first double-stranded tailed extension product.
[00739] In some embodiments, the first tailed PCR primers can be used to conduct a first primer extension reaction and the second tailed PCR primers can be used to conduct a second primer extension reaction to generate library molecules comprising an insert region appended on both sides with at least one adaptor. In some embodiments, the first and second tailed PCR primers can be used to conduct multiple PCR cycles (e.g., about 5-20 PCR cycles) to generate library molecules comprising an insert region appended on both sides with at least one adaptor.
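The tailed-primer layout described above (a 5’ adaptor tail, an optional internal sample index, and a 3’ region complementary to a sequencing primer binding site) can be illustrated with a minimal Python sketch. All sequences below are invented placeholders and do not come from the disclosure; the helper names `reverse_complement` and `tailed_primer` are hypothetical, and the sketch assumes the 3’ region is the exact reverse complement of the target site as written on the template strand.

```python
# Illustrative sketch only: every sequence here is a made-up placeholder.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def tailed_primer(tail: str, target_site: str, index: str = "") -> str:
    """Assemble a tailed primer, written 5'->3'.

    tail        -- 5' adaptor tail, e.g. a surface capture primer binding site (130)
    index       -- optional internal sample index, e.g. a right sample index (170)
    target_site -- site on the template strand that the 3' region anneals to,
                   e.g. a reverse sequencing primer binding site (150)
    """
    return tail + index + reverse_complement(target_site)


capture_site_130 = "AATGATACGG"   # placeholder
right_index_170 = "CGTACT"        # placeholder
rev_seq_site_150 = "AGATCGGAAG"   # placeholder, as read on the template strand

first_tailed_primer = tailed_primer(capture_site_130, rev_seq_site_150, right_index_170)
```

The same helper could be reused for the second tailed primers by passing a surface pinning primer binding site (120) as the tail, a left sample index (160) as the index, and the forward sequencing primer binding site (140) as the target site.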
Nucleic Acid Template Molecules
[00740] The present disclosure provides a plurality of nucleic acid template molecules for use in conducting any of the batch sequencing, reiterative sequencing and/or re-seeding methods described herein. In some embodiments, the plurality of template molecules are immobilized to a support. In some embodiments, the plurality of template molecules comprise single-stranded or double-stranded nucleic acid molecules, or a mixture of single-stranded and double-stranded nucleic acid molecules. In some embodiments, the plurality of template molecules comprise nucleic acid molecules comprising DNA, RNA, DNA/RNA chimeric or analogs thereof. In some embodiments, the plurality of template molecules are immobilized to the support at a density of about 10² - 10¹⁵ template molecules per mm², or any of the densities described herein.
[00741] In some embodiments, the plurality of template molecules comprises at least one nucleotide having a scissile moiety that can be cleaved to generate an abasic site in the template molecule. Exemplary nucleotides having a scissile moiety include uridine, 8-oxo-7,8-dihydroguanine and deoxyinosine. In some embodiments, the plurality of template molecules lack a nucleotide having a scissile moiety. In some embodiments, the plurality of template molecules comprise a mixture of template molecules that either lack a nucleotide having a scissile moiety or include at least one nucleotide having a scissile moiety. In some embodiments, the plurality of template molecules lack a scissile moiety.
[00742] In some embodiments, the plurality of template molecules comprise at least one recognition site for a restriction endonuclease enzyme, including a type I, type II, type IIS, type IIB, type III, or type IV restriction enzyme. In some embodiments, the plurality of template molecules comprise the same restriction enzyme site. In some embodiments, the plurality of template molecules comprise a mixture of template molecules having different restriction enzyme sites, or a mixture of template molecules lacking a restriction enzyme site and template molecules having a restriction enzyme site. In some embodiments, the plurality of template molecules lack a recognition site for a restriction endonuclease enzyme.
[00743] In some embodiments, individual template molecules in the plurality of template molecules comprise nucleic acid concatemer template molecules. In some embodiments, the concatemer template molecules can be generated by conducting rolling circle amplification using circularized library molecules and amplification primers. In some embodiments, a concatemer template molecule comprises a single-stranded nucleic acid strand carrying numerous tandem copies of a polynucleotide unit, where each polynucleotide unit comprises a sequence of interest and at least one sequencing primer binding site. In some embodiments, the sequence of interest of one of the concatemer template molecules in the plurality and the sequence of interest of a different concatemer template molecule are the same or different.
[00744] In some embodiments, concatemer template molecules immobilized to a support can be generated using circularized library molecules and conducting rolling circle amplification. In some embodiments, the circularized library molecules can be generated using padlock probes, single-stranded splint strands, or double-stranded adaptors. Methods for generating circularized library molecules are described herein.
[00745] In some embodiments, the at least one sequencing primer binding site sequence comprises a pre-determined batch sequencing primer binding site sequence. In some embodiments, a pre-determined batch sequencing primer binding site sequence can be linked to a given sequence of interest, thus the pre-determined batch sequencing primer binding site sequence corresponds to a given sequence of interest. In some embodiments, in a batch-specific sequencing workflow, a batch sequencing primer can be used to selectively sequence at least a portion of a polynucleotide unit having a cognate batch sequencing primer binding site sequence.
[00746] In some embodiments, the polynucleotide unit of a concatemer template molecule further comprises at least one barcode sequence. In some embodiments, a pre-determined batch barcode sequence can be linked to a given sequence of interest, thus the pre-determined batch barcode sequence corresponds to a given sequence of interest. In some embodiments, in a batch-specific sequencing workflow, the batch barcode sequence can be sequenced and the sequence of interest need not be sequenced. Thus, the batch barcode sequence serves as a surrogate for the sequence of interest that is linked to the batch barcode sequence.
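The surrogate role of the batch barcode can be pictured as a lookup: once the barcode is read, the linked sequence of interest is identified without being sequenced itself. The following is a minimal sketch under the assumption that the barcode-to-target assignments were recorded when the library was designed; the barcode strings, the target names and the function `identify_target` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical assignments, assumed to be fixed at library design time.
BARCODE_TO_TARGET = {
    "ACGTAC": "target_A",
    "TTGCAG": "target_B",
}


def identify_target(barcode_read: str) -> Optional[str]:
    """Return the sequence of interest linked to a barcode, or None if unknown."""
    return BARCODE_TO_TARGET.get(barcode_read)


def count_targets(barcode_reads: Iterable[str]) -> Counter:
    """Tally sequences of interest using only their barcode reads."""
    targets = (identify_target(read) for read in barcode_reads)
    return Counter(t for t in targets if t is not None)


counts = count_targets(["ACGTAC", "TTGCAG", "ACGTAC", "GGGGGG"])
```

An exact-match dictionary is the simplest choice here; a real workflow would likely need to tolerate sequencing errors in the barcode read, which this sketch does not attempt.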
[00747] In some embodiments, a polynucleotide unit further comprises at least one sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources.
[00748] In some embodiments, a polynucleotide unit further comprises a capture primer binding site. In some embodiments, a capture primer serves as an amplification primer for a circularized library molecule in a rolling circle amplification reaction. In some embodiments, the capture primer binding site of the circularized library molecule can hybridize to a surface capture primer which is immobilized to a support thereby immobilizing the circularized library molecule to the support. In some embodiments, an immobilized concatemer template molecule can be generated by hybridizing a single surface capture primer to a single circularized library molecule and conducting rolling circle amplification to generate an immobilized concatemer template molecule.
[00749] In some embodiments, a polynucleotide unit further comprises a surface pinning binding site. In some embodiments, in a concatemer template molecule, the surface pinning binding site can hybridize to a surface pinning primer which is immobilized to a support thereby pinning a portion of the concatemer template molecule to the support.
[00750] In some embodiments, a polynucleotide unit further comprises a compaction oligonucleotide binding site. In some embodiments, in a concatemer template molecule, the compaction oligonucleotide binding site binds a compaction oligonucleotide to cause compaction of the concatemer template molecule into a DNA nanoball.
[00751] In some embodiments, the plurality of template molecules comprises a plurality of sub-populations of template molecules including at least a first sub-population and a second sub-population. In some embodiments, the plurality of template molecules comprises 2 - 100 (e.g., about 5-90, about 10-80, about 20-75, about 35-50, about 10-30, or about 5-50, or any range therebetween) or more sub-populations of template molecules. In some embodiments, individual template molecules in a given sub-population comprise a sequence of interest, a sequencing primer binding site sequence that corresponds to the sequence of interest, and optionally a barcode sequence that corresponds to the sequence of interest. In some embodiments, the template molecules of a given sub-population have a sequencing primer binding site that differs from the sequencing primer binding site in the other sub-populations. Thus, the different sequencing primer binding sites of the different sub-populations enable batch sequencing of the template molecules.
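The way distinct sequencing primer binding sites enable batch sequencing can be sketched as a simple selection step: a batch sequencing primer is modeled as addressing only those template molecules that carry its cognate binding site. This is an illustrative simplification, not the method of the disclosure; hybridization is reduced to an exact label match, and the `Template` class, its field names and the site labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Template:
    """One immobilized template molecule (hypothetical representation)."""
    sequence_of_interest: str
    batch_primer_site: str  # label of the batch sequencing primer binding site


def templates_for_batch(templates: List[Template], primer_site: str) -> List[Template]:
    """Select templates whose binding site is cognate to the given batch primer."""
    return [t for t in templates if t.batch_primer_site == primer_site]


support = [
    Template("target_1", "batch_site_1"),
    Template("target_2", "batch_site_2"),
    Template("target_3", "batch_site_1"),
]

# Each batch is sequenced separately by supplying only its own primer.
first_batch = templates_for_batch(support, "batch_site_1")
second_batch = templates_for_batch(support, "batch_site_2")
```

Because each batch reads only a subset of the immobilized molecules, the signals detected in any one batch are sparser than the total number of template molecules on the support.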
[00752] In some embodiments, the plurality of nucleic acid template molecules further comprise any combination of a sample index sequence, a capture primer binding site, a surface pinning primer binding site and/or a compaction oligonucleotide binding site.
[00753] In some embodiments, at least one of the template molecules in the plurality comprises a concatemer template molecule which includes a plurality of tandem copies of a polynucleotide unit, where each polynucleotide unit comprises (i) a sequence of interest; (ii) a sequencing primer binding site sequence which corresponds to the sequence of interest; and (iii) optionally a barcode sequence which corresponds to the sequence of interest. In some embodiments, the polynucleotide unit of the at least one concatemer template molecule further comprises any combination of (iv) a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources; (v) a surface capture primer binding site; (vi) a surface pinning primer binding site; and/or (vii) a compaction oligonucleotide binding site.
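The structure of a concatemer template molecule, namely tandem copies of one polynucleotide unit, can be modeled as string repetition. In the sketch below the element order within the unit (sequencing primer binding site, then barcode, then sequence of interest) is an arbitrary assumption made for illustration, the sequences are placeholders, and the function names are hypothetical.

```python
def polynucleotide_unit(primer_site: str, sequence_of_interest: str, barcode: str = "") -> str:
    """Join the elements of one polynucleotide unit (element order is assumed)."""
    return primer_site + barcode + sequence_of_interest


def concatemer(unit: str, copies: int) -> str:
    """Model a concatemer as `copies` tandem repeats of one polynucleotide unit."""
    if copies < 1:
        raise ValueError("a concatemer needs at least one copy of the unit")
    return unit * copies


# Placeholder sequences, not taken from the disclosure.
PRIMER_SITE = "ACACTCTTTC"
BARCODE = "GATTACA"
INSERT = "TTGGCCAATG"

unit = polynucleotide_unit(PRIMER_SITE, INSERT, BARCODE)
template = concatemer(unit, copies=5)

# Every tandem copy offers one sequencing primer binding site.
binding_sites = template.count(PRIMER_SITE)
```

Since each tandem copy carries its own sequencing primer binding site, a single concatemer can hybridize multiple sequencing primers, one per polynucleotide unit.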
[00754] In some embodiments, individual template molecules in the first sub-population comprise a first sequence of interest, a first batch sequencing primer binding site sequence that corresponds to the first sequence of interest, and optionally a first batch barcode sequence that corresponds to the first sequence of interest. In some embodiments, template molecules in the first sub-population have the same sequence of interest or different sequences of interest. In some embodiments, template molecules in the first sub-population have the same first batch sequencing primer binding site sequence which corresponds to the first sequence of interest or corresponds to one of the first sequences of interest. In some embodiments, template molecules in the first sub-population have the same first batch barcode sequence or different first batch barcode sequences. In some embodiments, a first barcode sequence corresponds to a first sequence of interest, or corresponds to one of the first sequences of interest.
[00755] In some embodiments, individual template molecules in the first sub-population comprise the same first sequence of interest, the same first batch sequencing primer binding site sequence that corresponds to the first sequence of interest, and the same first batch barcode sequence that corresponds to the first sequence of interest.
[00756] In some embodiments, individual template molecules in the first sub-population comprise at least two different first sequences of interest, the same first batch sequencing primer binding site sequence that corresponds to the different first sequences of interest, and at least two different first batch barcode sequences where each first batch barcode sequence corresponds to a particular first sequence of interest.
[00757] In some embodiments, individual template molecules in the first sub-population comprise at least two different first sequences of interest, the same first batch sequencing primer binding site sequence that corresponds to the different first sequences of interest, and one first batch barcode sequence that corresponds to the different first sequences of interest.
[00758] In some embodiments, the template molecules in the first sub-population further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, template molecules in the first sub-population have the same sample index sequences. In some embodiments, the first sub-population comprises a mixture of template molecules having different sample index sequences. For example, some of the template molecules in the first sub-population comprise a first batch first sample index sequence, and some of the template molecules in the first sub-population comprise a first batch second sample index sequence.
[00759] In some embodiments, the first sub-population comprising a mixture of template molecules having different sample index sequences can be generated by conducting separate library preparation workflows to generate: (i) a first set of library molecules comprising a first sequence of interest from a first source, a first batch barcode sequence that corresponds to the first sequence of interest, a first batch sequencing primer binding site sequence that corresponds to the first sequence of interest, and a first sample index that corresponds to the first source of the first sequence of interest, and (ii) a second set of library molecules comprising the first sequence of interest from a second source, a first batch barcode sequence that corresponds to the first sequence of interest, a first batch sequencing primer binding site sequence that corresponds to the first sequence of interest, and a second sample index that corresponds to the second source of the first sequence of interest. The resulting first and second library preparations can be mixed together to generate a mixture of template molecules in the first sub-population having a mixture of different sample index sequences.
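The pooling of separately prepared libraries, and the later use of the sample index to tell the sources apart, can be sketched as follows. This is a minimal illustration rather than the workflow of the disclosure; the dictionary keys, index strings and function names are hypothetical, and each library molecule is reduced to a small record.

```python
from collections import defaultdict
from typing import Dict, List

Molecule = Dict[str, str]


def make_library(sequence_of_interest: str, batch_site: str, sample_index: str, n: int) -> List[Molecule]:
    """Create n identical library molecule records for one source (illustrative)."""
    return [
        {"sequence_of_interest": sequence_of_interest,
         "batch_primer_site": batch_site,
         "sample_index": sample_index}
        for _ in range(n)
    ]


def demultiplex(molecules: List[Molecule]) -> Dict[str, List[Molecule]]:
    """Group molecules by sample index so each source can be analyzed separately."""
    by_sample: Dict[str, List[Molecule]] = defaultdict(list)
    for molecule in molecules:
        by_sample[molecule["sample_index"]].append(molecule)
    return dict(by_sample)


# Same sequence of interest and same batch site, two different sources.
prep_1 = make_library("target_1", "batch_site_1", "INDEX_A", n=3)
prep_2 = make_library("target_1", "batch_site_1", "INDEX_B", n=2)

pooled = prep_1 + prep_2          # the two library preparations mixed together
per_sample = demultiplex(pooled)  # sources separated again by sample index
```

Both preparations share the first batch sequencing primer binding site, so they are sequenced in the same batch, and only the sample index distinguishes the two sources.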
[00760] In some embodiments, the template molecules in the first sub-population further comprise at least one binding site for a compaction oligonucleotide (e.g., a universal binding site for a compaction oligonucleotide). In some embodiments, individual compaction oligonucleotides can hybridize to two different locations on the same template molecule to pull together distal portions of the template molecule causing compaction of the template molecule to form a DNA nanoball.
[00761] In some embodiments, the template molecules in the first sub-population further comprise a first batch surface capture primer binding site sequence. In some embodiments, template molecules in the first sub-population have the same first batch surface capture primer binding site sequence.
[00762] In some embodiments, the template molecules in the first sub-population further comprise a first batch surface pinning primer binding site sequence which can hybridize to a first surface pinning primer which is immobilized to a support thereby pinning a portion of the template molecules of the first sub-population to the support. In some embodiments, template molecules in the first sub-population have the same first batch surface pinning primer binding site sequence.
[00763] In some embodiments, individual template molecules in the first sub-population of template molecules comprise first sub-population concatemer template molecules. In some embodiments, individual concatemer template molecules in the first sub-population comprise a single-stranded nucleic acid strand carrying a plurality of tandem copies of a polynucleotide unit, where each polynucleotide unit comprises (i) a first sequence of interest; and (ii) a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest. In some embodiments, the polynucleotide unit of individual concatemer template molecules in the first sub-population further comprises any combination of (iii) a first batch barcode sequence which corresponds to the first sequence of interest; (iv) a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources; (v) a first batch surface capture primer binding site sequence; (vi) a first batch surface pinning primer binding site sequence; and/or (vii) a compaction oligonucleotide binding site.
[00764] In some embodiments, individual template molecules in the second sub-population comprise a second sequence of interest, a second batch sequencing primer binding site sequence that corresponds to the second sequence of interest, and optionally a second batch barcode sequence that corresponds to the second sequence of interest. In some embodiments, template molecules in the second sub-population have the same sequence of interest or different sequences of interest. In some embodiments, template molecules in the second sub-population have the same second batch sequencing primer binding site sequence which corresponds to the second sequence of interest or corresponds to one of the second sequences of interest. In some embodiments, the first and second batch sequencing primer binding sites have different sequences. In some embodiments, template molecules in the second sub-population have the same second batch barcode sequence or different second batch barcode sequences. In some embodiments, a second barcode sequence corresponds to a second sequence of interest, or corresponds to one of the second sequences of interest.
[00765] In some embodiments, individual template molecules in the second sub-population comprise the same second sequence of interest, the same second batch sequencing primer binding site sequence that corresponds to the second sequence of interest, and the same second batch barcode sequence that corresponds to the second sequence of interest.
[00766] In some embodiments, individual template molecules in the second sub-population comprise at least two different second sequences of interest, the same second batch sequencing primer binding site sequence that corresponds to the different second sequences of interest, and at least two different second batch barcode sequences where each second batch barcode sequence corresponds to a particular second sequence of interest.
[00767] In some embodiments, individual template molecules in the second sub-population comprise at least two different second sequences of interest, the same second batch sequencing primer binding site sequence that corresponds to the different second sequences of interest, and one second batch barcode sequence that corresponds to the different second sequences of interest.
[00768] In some embodiments, the template molecules in the second sub-population further comprise a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources. In some embodiments, template molecules in the second sub-population have the same sample index sequences. In some embodiments, the second sub-population comprises a mixture of template molecules having different sample index sequences. For example, some of the template molecules in the second sub-population comprise a second batch first sample index sequence, and some of the template molecules in the second sub-population comprise a second batch second sample index sequence.
[00769] In some embodiments, the second sub-population comprising a mixture of template molecules having different sample index sequences can be generated by conducting separate library preparation workflows to generate: (i) a first set of library molecules comprising a second sequence of interest from a first source, a second batch barcode sequence that corresponds to the second sequence of interest, a second batch sequencing primer binding site sequence that corresponds to the second sequence of interest, and a first sample index that corresponds to the first source of the second sequence of interest, and (ii) a second set of library molecules comprising the second sequence of interest from a second source, a second batch barcode sequence that corresponds to the second sequence of interest, a second batch sequencing primer binding site sequence that corresponds to the second sequence of interest, and a second sample index that corresponds to the second source of the second sequence of interest. The resulting first and second library preparations can be mixed together to generate a mixture of template molecules in the second sub-population having a mixture of different sample index sequences.
[00770] In some embodiments, the template molecules in the second sub-population further comprise at least one binding site for a compaction oligonucleotide (e.g., a universal binding site for a compaction oligonucleotide). In some embodiments, individual compaction oligonucleotides can hybridize to two different locations on the same template molecule to pull together distal portions of the template molecule causing compaction of the template molecule to form a DNA nanoball.
[00771] In some embodiments, the template molecules in the second sub-population further comprise a second batch capture primer binding site. In some embodiments, template molecules in the second sub-population have the same second batch capture primer binding site.
[00772] In some embodiments, the template molecules in the second sub-population further comprise a second batch surface pinning binding site sequence which can hybridize to a second surface pinning primer which is immobilized to a support, thereby pinning a portion of the template molecules of the second sub-population to the support. In some embodiments, template molecules in the second sub-population have the same second batch surface pinning binding site sequence.
[00773] In some embodiments, individual template molecules in the second sub-population of template molecules comprise second sub-population concatemer template molecules. In some embodiments, individual concatemer template molecules in the second sub-population comprise a single-stranded nucleic acid strand carrying a plurality of tandem copies of a polynucleotide unit, where each polynucleotide unit comprises (i) a second sequence of interest; and (ii) a second batch sequencing primer binding site sequence which corresponds to the second sequence of interest. In some embodiments, the polynucleotide unit of individual concatemer template molecules in the second sub-population further comprises any combination of (iii) a second batch barcode sequence which corresponds to the second sequence of interest; (iv) a sample index sequence that can be used in a multiplex assay to distinguish sequences of interest obtained from different sample sources; (v) a second batch capture primer binding site sequence; (vi) a second batch surface pinning primer binding site sequence; and/or (vii) a compaction oligonucleotide binding site.
[00774] In some embodiments, the plurality of nucleic acid template molecules are immobilized to a support at a density of about 10² - 10¹⁵ template molecules per mm², or any of the density ranges described herein. In some embodiments, the template molecules comprise one population or a mixture of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules.
[00775] In some embodiments, the plurality of template molecules are immobilized to the support at a density where at least some of the template molecules comprise nearest neighbor template molecules that do not touch each other and/or do not overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the template molecules have visible interstitial space between the template molecules at a given field of view (FOV) of the support.
[00776] In some embodiments, the plurality of template molecules are immobilized to the support at a high density where at least some of the template molecules comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support. In some embodiments, the high density template molecules have little or no visible interstitial space between the template molecules at a given field of view (FOV) of the support.
[00777] In some embodiments, the template molecules immobilized to the support are optically resolvable as discrete spots. In some embodiments, the template molecules are not optically resolvable as spots. In some embodiments, the template molecules comprise a mixture of template molecules that are, or are not, optically resolvable as discrete spots.
[00778] In some embodiments, about 20-75%, or about 25-65%, or about 30-55%, or about 35-45% of the template molecules immobilized to the support are optically resolvable as a discrete spot when viewed from any angle above, below or side view of the support.
[00779] In some embodiments, about 20-75%, or about 25-65%, or about 30-55%, or about 35-45% of the template molecules immobilized to the support have a nearest neighbor distance of 15-10 nm.
[00780] In some embodiments, about 20-75%, or about 25-65%, or about 30-55%, or about 35-45% of the template molecules immobilized to the support have a nearest neighbor distance of 10-5 nm.
[00781] In some embodiments, about 20-75%, or about 25-65%, or about 30-55%, or about 35-45% of the template molecules immobilized to the support have a nearest neighbor distance of 5-1 nm or smaller nearest neighbor distance.
[00782] In some embodiments, interstitial space between the template molecules immobilized to the support is about 15-10 nm, or about 10-5 nm, or about 5-1 nm, or smaller.
[00783] In some embodiments, the support comprises a plurality of template molecules immobilized at random (e.g., random and non-repeating positions) and non-pre-determined positions on the support. In some embodiments, the plurality of template molecules includes one population of template molecules, or a mixture of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules. In some embodiments, the support comprises features on the support that are located in a random and non-pre-determined manner, where the features are sites for attachment of the template molecules. In some embodiments, the support lacks any contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules. In some embodiments, the support lacks features arranged in a pre-determined pattern. In some embodiments, the support lacks features arranged in a pre-determined pattern where the features have a chemical functionality for tethering a nucleic acid template molecule to the support. In some embodiments, the support lacks interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
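The relationship between surface density and nearest neighbor distance for randomly positioned template molecules can be estimated with a short calculation. The sketch below rests on an assumption that is not stated in the disclosure: molecules are treated as points placed by a homogeneous two-dimensional Poisson process, for which the mean nearest neighbor distance is 1/(2√ρ) and the probability that a point's nearest neighbor lies farther than r is exp(-ρπr²), where ρ is the density. The function names are hypothetical.

```python
import math

NM_PER_MM = 1e6  # nanometers per millimeter


def mean_nn_distance_nm(density_per_mm2: float) -> float:
    """Mean nearest neighbor distance (nm) for a 2D Poisson point pattern."""
    return NM_PER_MM / (2.0 * math.sqrt(density_per_mm2))


def fraction_farther_than(density_per_mm2: float, r_nm: float) -> float:
    """Fraction of points whose nearest neighbor is farther away than r_nm."""
    r_mm = r_nm / NM_PER_MM
    return math.exp(-density_per_mm2 * math.pi * r_mm ** 2)


# Example: one million template molecules per mm² (an arbitrary value
# chosen from inside the 10² - 10¹⁵ per mm² range recited above).
density = 1e6
typical_spacing_nm = mean_nn_distance_nm(density)
isolated_fraction = fraction_farther_than(density, r_nm=500.0)
```

Real template molecules such as concatemers occupy a finite footprint, so this point model only indicates how spacing scales with density; it does not predict which molecules are optically resolvable on a given instrument.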
[00784] In some embodiments, the support comprises a plurality of template molecules immobilized at pre-determined positions on the support. For example, the template molecules can be immobilized on the support in a pre-determined pattern comprising stripes or spots arranged in rows and/or columns or other pre-determined patterns. In some embodiments, the pre-determined pattern has a repeating pattern. In some embodiments, the plurality of template molecules includes one population of template molecules, or a mixture of at least two sub-populations of template molecules including at least a first and second sub-population of template molecules. In some embodiments, the support comprises features on the support that are located in a pre-determined manner, where the features are sites for attachment of the template molecules. In some embodiments, the support includes contours (e.g., wells, protrusions, and the like) arranged in a pre-determined pattern where the contours have features that are sites for attachment of the nucleic acid template molecules. In some embodiments, the support includes features arranged in a pre-determined pattern where the features can be fabricated using photo-chemical, photo-lithography, electron beam lithography, micro- or nano-imprint lithography, ink-jet printing, or micron-scale or nanoscale printing. In some embodiments, the support includes features arranged in a pre-determined pattern where the features have a chemical functionality for tethering a nucleic acid template molecule to the support. In some embodiments, the support includes interstitial regions arranged in a pre-determined pattern where the interstitial regions are sites designed to have no attached template molecules.
Methods for Sequencing
[00785] The present disclosure provides methods for sequencing any of the template molecules (e.g., concatemer template molecules) described herein. Any of the methods for conducting rolling circle amplification reaction described herein can be used to generate a plurality of concatemer template molecules immobilized to a support, and the concatemer template molecules can be subjected to sequencing reactions using sequencing polymerases and nucleotide reagents which include nucleotides, nucleotide analogs and/or multivalent molecules. In some embodiments, the sequencing reactions employ nucleotide reagents comprising detectably labeled nucleotide analogs. In some embodiments, the sequencing reactions employ a two-stage sequencing reaction comprising binding detectably labeled multivalent molecules, and incorporating nucleotide analogs. In some embodiments, the sequencing reactions employ non-labeled nucleotide analogs. Exemplary methods for sequencing are described in WO2022266470, the contents of which are incorporated by reference herein in their entirety.
Methods for Sequencing using Nucleotide Analogs
[00786] The present disclosure provides methods for sequencing any of the concatemer template molecules described herein, the methods comprising step (a): contacting a sequencing polymerase to (i) a concatemer template molecule and (ii) a nucleic acid sequencing primer. In some embodiments, the contacting is conducted under a condition suitable to bind the sequencing polymerase to the nucleic acid concatemer template molecule which is hybridized to the nucleic acid primer. In some embodiments, the concatemer template molecule hybridized to the nucleic acid primer forms a nucleic acid duplex. In some embodiments, the sequencing polymerase comprises a recombinant mutant sequencing polymerase that can bind and incorporate nucleotide analogs. In some embodiments, the sequencing primer comprises a 3’ extendible end. In some embodiments, the concatemer template molecules are immobilized to a support.
[00787] In some embodiments, the sequencing primer comprises a 3’ extendible end or a 3’ non-extendible end. In some embodiments, the plurality of concatemer template molecules comprise amplified template molecules (e.g., clonally amplified template molecules). In some embodiments, the plurality of concatemer template molecules comprise one copy of a target sequence of interest. In some embodiments, the plurality of nucleic acid molecules comprise two or more tandem copies of a target sequence of interest (e.g., concatemers). In some embodiments, the concatemer template molecules in the plurality of concatemer template molecules comprise the same target sequence of interest or different target sequences of interest. In some embodiments, the plurality of concatemer template molecules and/or the plurality of nucleic acid primers are in solution or are immobilized to a support. In some embodiments, when the plurality of concatemer template molecules and/or the plurality of nucleic acid primers are immobilized to a support, the binding with the first sequencing polymerase generates a plurality of immobilized first complexed polymerases. In some embodiments, the plurality of concatemer template molecules and/or nucleic acid primers are immobilized to 10² - 10¹⁵ different sites on a support, for example between about 10² sites and about 10¹⁵ sites, between about 10⁵ sites and about 10¹⁵ sites, between about 10¹⁰ sites and about 10¹⁵ sites, between about 10³ sites and about 10¹⁴ sites, between about 10⁴ sites and about 10¹³ sites, between about 10⁵ sites and about 10¹² sites, between about 10⁶ sites and about 10¹¹ sites, between about 10⁷ sites and about 10¹⁰ sites, or between about 10⁸ sites and about 10¹⁰ sites, or any range therebetween, on the support. In some embodiments, the binding of the plurality of concatemer template molecules and nucleic acid primers with the plurality of first sequencing polymerases generates a plurality of first complexed polymerases immobilized to 10² - 10¹⁵ different sites on the support, e.g., the sites of the immobilized template molecules and/or nucleic acid primers. In some embodiments, the plurality of first complexed polymerases immobilized on the support are immobilized to pre-determined or to random sites on the support. In some embodiments, the plurality of first complexed polymerases are in fluid communication with each other to permit flowing a solution of reagents (e.g., enzymes including sequencing polymerases, multivalent molecules, nucleotides, and/or divalent cations) onto the support so that the plurality of complexed polymerases on the support are reacted with the solution of reagents in a massively parallel manner.
[00788] In some embodiments, the methods for sequencing further comprise step (b): contacting the sequencing polymerase with a plurality of nucleotides under a condition suitable for binding at least one nucleotide to the sequencing polymerase which is bound to the nucleic acid duplex and suitable for polymerase-catalyzed nucleotide incorporation. In some embodiments, the sequencing polymerase is contacted with the plurality of nucleotides in the presence of at least one catalytic cation comprising magnesium and/or manganese. In some embodiments, the plurality of nucleotides comprises at least one nucleotide analog having a chain terminating moiety at the sugar 2’ or 3’ position. In some embodiments, the chain terminating moiety is removable from the sugar 2’ or 3’ position to convert the chain terminating moiety to an OH or H group. In some embodiments, the plurality of nucleotides comprises at least one nucleotide that lacks a chain terminating moiety. In some embodiments, at least one nucleotide is labeled with a detectable reporter moiety (e.g., fluorophore). In some embodiments, step (b) further comprises removing the chain terminating moiety from the incorporated chain terminating nucleotide to generate an extendible 3’ OH group. In some embodiments, the sequencing of step (b) further comprises repeating at least once the steps of: (i) incorporating a detectably labeled chain terminating nucleotide into the terminal 3’ end of a hybridized first sequencing primer; (ii) detecting and identifying the incorporated chain terminating nucleotide; and (iii) removing the chain terminating moiety and/or the detectable label from the incorporated chain terminating
nucleotide to generate an extendible 3’ OH sugar group on the incorporated chain terminating nucleotide.
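The repeated sub-steps (i) to (iii) above can be illustrated with a toy model. The following Python sketch is illustrative only and is not part of the disclosed methods: the function name, the `COMPLEMENT` table, and the convention that the template string is written in the order the polymerase reads it are all assumptions made for this example, and the chemistry is reduced to string operations.

```python
# Toy model of repeated incorporate / detect / deblock cycles.
# All names are hypothetical; the template is assumed to be written
# in the order in which the polymerase reads it.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_reversible_termination(template, primer_length, cycles):
    """Return the bases called over up to `cycles` rounds."""
    calls = []
    position = primer_length  # first template base past the primer 3' end
    blocked = False           # True while a chain terminating moiety is present
    for _ in range(cycles):
        if position >= len(template):
            break
        # (i) incorporate one labeled chain terminating nucleotide;
        # only possible when the primer 3' end is extendible.
        assert not blocked
        incorporated = COMPLEMENT[template[position]]
        blocked = True
        # (ii) detect and identify the incorporated nucleotide.
        calls.append(incorporated)
        # (iii) remove the terminator and/or label to restore a 3' OH.
        blocked = False
        position += 1
    return "".join(calls)


print(sequence_by_reversible_termination("ACGTAC", primer_length=2, cycles=3))
```

In this sketch each cycle advances the primer by exactly one base because the terminator is modeled as blocking further extension until it is removed.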
[00789] In some embodiments, the methods for sequencing further comprise step (c): incorporating at least one nucleotide into the 3’ end of the extendible primer under a condition suitable for incorporating the at least one nucleotide. In some embodiments, the suitable conditions for binding the nucleotide to the polymerase and for incorporating the nucleotide can be the same or different. In some embodiments, conditions suitable for incorporating the nucleotide comprise inclusion of at least one catalytic cation comprising magnesium and/or manganese. In some embodiments, the at least one nucleotide binds the sequencing polymerase and incorporates into the 3’ end of the extendible primer. In some embodiments, the incorporating the nucleotide into the 3’ end of the primer in step (c) comprises a primer extension reaction. In some embodiments, a sequencing cycle comprises completion of steps (b) - (c).
[00790] In some embodiments, the methods for sequencing further comprise step (d): repeating the incorporating at least one nucleotide into the 3’ end of the extendible primer of steps (b) and (c) at least once. In some embodiments, the plurality of nucleotides comprises a plurality of nucleotides labeled with a detectable reporter moiety. The detectable reporter moiety comprises a fluorophore. In some embodiments, the fluorophore is attached to the nucleotide base. In some embodiments, the fluorophore is attached to the nucleotide base with a linker which is cleavable/removable from the base. In some embodiments, at least one of the nucleotides in the plurality is not labeled with a detectable reporter moiety. In some embodiments, a particular detectable reporter moiety (e.g., fluorophore) that is attached to the nucleotide can correspond to the nucleotide base (e.g., dATP, dGTP, dCTP, dTTP or dUTP) to permit detection and identification of the nucleotide base. In some embodiments, the method further comprises detecting the at least one incorporated nucleotide at step (c) and/or (d). In some embodiments, the method further comprises identifying the at least one incorporated nucleotide at step (c) and/or (d). In some embodiments, the sequence of the nucleic acid concatemer molecule can be determined by detecting and identifying the nucleotide that binds the sequencing polymerase, thereby determining the sequence of the concatemer template molecule. In some embodiments, the sequence of the concatemer template molecule can be determined by detecting and identifying the nucleotide that incorporates into the 3’ end of the primer, thereby determining the sequence of the concatemer template molecule.
[00791] In some embodiments, in the methods for sequencing, the plurality of sequencing polymerases that are bound to the nucleic acid duplexes comprise a plurality of complexed polymerases, having at least a first and second complexed polymerase. In some embodiments, the first complexed polymerase comprises a first sequencing polymerase bound to a first nucleic acid duplex comprising a first nucleic acid template sequence which is hybridized to a first nucleic acid primer. In some embodiments, the second complexed polymerase comprises a second sequencing polymerase bound to a second nucleic acid duplex comprising a second nucleic acid template sequence which is hybridized to a second nucleic acid primer. In some embodiments, the first and second nucleic acid template sequences comprise the same or different sequences. In some embodiments, the first and second nucleic acid concatemers are clonally-amplified. In some embodiments, the first and second primers comprise extendible 3’ ends or non-extendible 3’ ends. In some embodiments, the plurality of complexed polymerases are immobilized to a support. In some embodiments, the density of the plurality of complexed polymerases is about 10^2 - 10^15 complexed polymerases per mm^2 that are immobilized to the support, for example, between about 10^10 and about 10^15 complexed polymerases per mm^2, between about 10^5 and about 10^15 complexed polymerases per mm^2, between about 10^3 and about 10^14 complexed polymerases per mm^2, between about 10^4 and about 10^13 complexed polymerases per mm^2, between about 10^5 and about 10^12 complexed polymerases per mm^2, between about 10^6 and about 10^11 complexed polymerases per mm^2, between about 10^7 and about 10^10 complexed polymerases per mm^2, or between about 10^8 and about 10^10 complexed polymerases per mm^2, or any range therebetween.
Two-Stage Methods for Nucleic Acid Sequencing
[00792] The present disclosure provides a two-stage method for sequencing any of the immobilized concatemer template molecules described herein. In some embodiments, the first stage comprises binding multivalent molecules to complexed polymerases to form multivalent-complexed polymerases, and detecting the multivalent-complexed polymerases. In some embodiments, the second stage comprises nucleotide incorporation and extension of the sequencing primer. In some embodiments, one sequencing cycle comprises completion of a first and second stage. In some embodiments, any of the workflows that employ a two-stage sequencing method comprises conducting 5-25 sequencing cycles, or 25-50 sequencing cycles, or 50-75 sequencing cycles, or 75-100 sequencing cycles, or 100-200 sequencing
cycles, or 200-500 sequencing cycles, or 500-750 sequencing cycles, or 750-1000 sequencing cycles, or any range therebetween.
[00793] In some embodiments, the first stage comprises step (a): contacting a plurality of a first sequencing polymerase to (i) a plurality of nucleic acid concatemer template molecules and (ii) a plurality of nucleic acid sequencing primers. In some embodiments, the contacting is conducted under a condition suitable to bind the plurality of first sequencing polymerases to the plurality of concatemer template molecules and the plurality of nucleic acid primers thereby forming a plurality of first complexed polymerases each comprising a first sequencing polymerase bound to a nucleic acid duplex. In some embodiments, the nucleic acid duplex comprises a concatemer template molecule hybridized to a nucleic acid primer. In some embodiments, the first polymerase comprises a recombinant mutant sequencing polymerase. In some embodiments, the sequencing primer comprises a 3’ extendible end.
[00794] In some embodiments, in the methods for sequencing concatemer template molecules, the sequencing primer comprises a 3’ extendible end or a 3’ non-extendible end. In some embodiments, the plurality of concatemer template molecules comprise amplified template molecules (e.g., clonally amplified template molecules). In some embodiments, the plurality of concatemer template molecules comprise one copy of a target sequence of interest. In some embodiments, the plurality of nucleic acid molecules comprise two or more tandem copies of a target sequence of interest (e.g., concatemers). In some embodiments, the concatemer template molecules in the plurality of concatemer template molecules comprise the same target sequence of interest or different target sequences of interest. In some embodiments, the plurality of concatemer template molecules and/or the plurality of nucleic acid primers are in solution or are immobilized to a support.
In some embodiments, when the plurality of concatemer template molecules and/or the plurality of nucleic acid primers are immobilized to a support, the binding with the first sequencing polymerase generates a plurality of immobilized first complexed polymerases. In some embodiments, the plurality of concatemer template molecules and/or nucleic acid primers are immobilized to 10^2 - 10^15 different sites on a support, or any of the ranges described herein. In some embodiments, the binding of the plurality of concatemer template molecules and nucleic acid primers with the plurality of first sequencing polymerases generates a plurality of first complexed polymerases immobilized to 10^2 - 10^15 different sites on the support. In some embodiments, the plurality of immobilized first complexed polymerases on the support are immobilized to predetermined or to random sites on the support. In some embodiments, the plurality of immobilized first complexed polymerases are in fluid communication with each other to
permit flowing a solution of reagents (e.g., enzymes including sequencing polymerases, multivalent molecules, nucleotides, and/or divalent cations) onto the support so that the plurality of immobilized complexed polymerases on the support are reacted with the solution of reagents in a massively parallel manner.
[00795] In some embodiments, the methods for sequencing further comprise step (b): contacting the plurality of first complexed polymerases with a plurality of multivalent molecules to form a plurality of multivalent-complexed polymerases (e.g., binding complexes). In some embodiments, individual multivalent molecules in the plurality of multivalent molecules comprise a core attached to multiple nucleotide arms and each nucleotide arm is attached to a nucleotide (e.g., nucleotide unit) (e.g., FIGs. 1-4). In some embodiments, the contacting of step (b) is conducted under a condition suitable for binding complementary nucleotide units of the multivalent molecules to at least two of the plurality of first complexed polymerases thereby forming a plurality of multivalent-complexed polymerases. In some embodiments, the condition is suitable for inhibiting polymerase-catalyzed incorporation of the complementary nucleotide units into the primers of the plurality of multivalent-complexed polymerases. In some embodiments, the plurality of multivalent molecules comprise at least one multivalent molecule having multiple nucleotide arms (e.g., FIGs. 1-4) each attached with a nucleotide analog (e.g., nucleotide analog unit), where the nucleotide analog includes a chain terminating moiety at the sugar 2’ and/or 3’ position. In some embodiments, the plurality of multivalent molecules comprises at least one multivalent molecule comprising multiple nucleotide arms each attached with a nucleotide unit that lacks a chain terminating moiety. In some embodiments, at least one of the multivalent molecules in the plurality of multivalent molecules is labeled with a detectable reporter moiety. In some embodiments, the detectable reporter moiety comprises a fluorophore. In some embodiments, the contacting of step (b) is conducted in the presence of at least one non-catalytic cation comprising strontium, barium and/or calcium.
[00796] In some embodiments, the methods for sequencing further comprise step (c): detecting the plurality of multivalent-complexed polymerases. In some embodiments, the detecting includes detecting the multivalent molecules that are bound to the complexed polymerases, where the complementary nucleotide units of the multivalent molecules are bound to the primers but incorporation of the complementary nucleotide units is inhibited. In some embodiments, the multivalent molecules are labeled with a detectable reporter moiety to permit detection. In some embodiments, the labeled multivalent molecules comprise a fluorophore attached to the core, linker and/or nucleotide unit of the multivalent molecules.
[00797] In some embodiments, the methods for sequencing further comprise step (d): identifying the nucleo-base of the complementary nucleotide units that are bound to the plurality of first complexed polymerases, thereby determining the sequence of the concatemer molecule. In some embodiments, the multivalent molecules are labeled with a detectable reporter moiety that corresponds to the particular nucleotide units attached to the nucleotide arms to permit identification of the complementary nucleotide units (e.g., nucleotide base adenine, guanine, cytosine, thymine or uracil) that are bound to the plurality of first complexed polymerases.
[00798] In some embodiments, the second stage of the two-stage sequencing method generally comprises nucleotide incorporation. In some embodiments, the methods for sequencing further comprise step (e): dissociating the plurality of multivalent-complexed polymerases and removing the plurality of first sequencing polymerases and their bound multivalent molecules, and retaining the plurality of nucleic acid duplexes.
[00799] In some embodiments, the methods for sequencing further comprise step (f): contacting the plurality of the retained nucleic acid duplexes of step (e) with a plurality of second sequencing polymerases. In some embodiments, the contacting is conducted under a condition suitable for binding the plurality of second sequencing polymerases to the plurality of the retained nucleic acid duplexes, thereby forming a plurality of second complexed polymerases each comprising a second sequencing polymerase bound to a nucleic acid duplex. In some embodiments, the second sequencing polymerase comprises a recombinant mutant sequencing polymerase.
[00800] In some embodiments, the plurality of first sequencing polymerases of step (a) have an amino acid sequence that is 100% identical to the amino acid sequence of the plurality of the second sequencing polymerases of step (f). In some embodiments, the plurality of first sequencing polymerases of step (a) have an amino acid sequence that differs from the amino acid sequence of the plurality of the second sequencing polymerases of step (f).
[00801] In some embodiments, the methods for sequencing further comprise step (g): contacting the plurality of second complexed polymerases with a plurality of nucleotides. In some embodiments, the contacting is conducted under a condition suitable for binding complementary nucleotides from the plurality of nucleotides to at least two of the second complexed polymerases thereby forming a plurality of nucleotide-complexed polymerases. In some embodiments, the contacting of step (g) is conducted under a condition that is suitable for promoting polymerase-catalyzed incorporation of the bound complementary nucleotides
into the primers of the nucleotide-complexed polymerases thereby forming a plurality of nucleotide-complexed polymerases. In some embodiments, the incorporating the nucleotide into the 3’ end of the primer in step (g) comprises a primer extension reaction. In some embodiments, the contacting of step (g) is conducted in the presence of at least one catalytic cation comprising magnesium and/or manganese. In some embodiments, the plurality of nucleotides comprise native nucleotides (e.g., non-analog nucleotides) or nucleotide analogs. In some embodiments, the plurality of nucleotides comprise a 2’ and/or 3’ chain terminating moiety which is removable or is not removable. In some embodiments, the plurality of nucleotides comprises a plurality of nucleotides labeled with a detectable reporter moiety. The detectable reporter moiety comprises a fluorophore. In some embodiments, the fluorophore is attached to the nucleotide base. In some embodiments, the fluorophore is attached to the nucleotide base with a linker which is cleavable/removable from the base or is not removable from the base. In some embodiments, at least one of the nucleotides in the plurality is not labeled with a detectable reporter moiety. In some embodiments, the plurality of nucleotides comprises a plurality of non-labeled nucleotides. In some embodiments, a particular detectable reporter moiety (e.g., fluorophore) that is attached to the nucleotide can correspond to the nucleotide base (e.g., dATP, dGTP, dCTP, dTTP or dUTP) to permit detection and identification of the nucleotide base.
[00802] In some embodiments, the methods for sequencing further comprise step (h): when the nucleotides of step (g) are detectably labeled, then step (h) comprises detecting the complementary nucleotides which are incorporated into the primers of the nucleotide-complexed polymerases. In some embodiments, the plurality of nucleotides are labeled with a detectable reporter moiety to permit detection. In some embodiments, in the methods for sequencing concatemer template molecules, when the nucleotides of step (g) are non-labeled, then the detecting of step (h) is omitted.
[00803] In some embodiments, the methods for sequencing further comprise step (i): when the nucleotides of step (g) are detectably labeled, then step (i) comprises identifying the bases of the complementary nucleotides which are incorporated into the primers of the nucleotide-complexed polymerases. In some embodiments, the identification of the incorporated complementary nucleotides in step (i) can be used to confirm the identity of the complementary nucleotides of the multivalent molecules that are bound to the plurality of first complexed polymerases in step (d). In some embodiments, the identifying of step (i) can be used to determine the sequence of the nucleic acid concatemer template molecules. In
some embodiments, in the methods for sequencing concatemer template molecules, when the nucleotides of step (g) are non-labeled, then the identifying of step (i) is omitted.
[00804] In some embodiments, the methods for sequencing further comprise step (j): removing the chain terminating moiety from the incorporated nucleotide when step (g) is conducted by contacting the plurality of second complexed polymerases with a plurality of nucleotides that comprise at least one nucleotide having a 2’ and/or 3’ chain terminating moiety.
[00805] In some embodiments, the methods for sequencing further comprise step (k): repeating steps (a) - (j) at least once. In some embodiments, the sequence of the nucleic acid concatemer molecules can be determined by detecting and identifying the multivalent molecules that bind the sequencing polymerases but do not incorporate into the 3’ end of the primer at steps (c) and (d). In some embodiments, the sequence of the nucleic acid concatemer molecule can be determined (or confirmed) by detecting and identifying the nucleotide that incorporates into the 3’ end of the primer at steps (h) and (i).
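The flow of one two-stage sequencing cycle, as laid out in steps (a) through (k) above, can be sketched in code. The following Python example is a simplified illustration under stated assumptions and is not the disclosed implementation: the function name, the `COMPLEMENT` table, and the `labeled_nucleotides` flag are hypothetical, the template is assumed to be written in the order the polymerase reads it, and binding and incorporation are modeled as idealized, error-free events.

```python
# Toy model of the two-stage cycle: stage 1 binds and detects a multivalent
# molecule without extending the primer; stage 2 incorporates a nucleotide.
# All names are hypothetical and the chemistry is idealized.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def two_stage_sequencing(template, cycles, labeled_nucleotides=True):
    """Return (binding_calls, incorporation_calls) as strings."""
    primer_end = 0  # index of the next template base to be read
    binding_calls = []
    incorporation_calls = []
    for _ in range(cycles):
        if primer_end >= len(template):
            break
        # Stage 1, steps (a)-(d): a complementary nucleotide unit binds
        # under non-catalytic conditions; it is detected and identified.
        bound_unit = COMPLEMENT[template[primer_end]]
        binding_calls.append(bound_unit)
        # Step (e): the complex dissociates; the primer is NOT extended.
        # Stage 2, steps (f)-(g): a second polymerase incorporates the
        # complementary nucleotide under catalytic conditions.
        incorporated = COMPLEMENT[template[primer_end]]
        # Steps (h)-(i): detection and identification are performed only
        # when the incorporated nucleotides are labeled.
        if labeled_nucleotides:
            incorporation_calls.append(incorporated)
        # Step (j): terminator removal restores an extendible 3' end,
        # so the next cycle reads the next template position.
        primer_end += 1
    return "".join(binding_calls), "".join(incorporation_calls)


binding, incorporation = two_stage_sequencing("GATTACA", cycles=4)
print(binding, incorporation)
```

When labeled nucleotides are used, the second list can serve to confirm the first-stage calls; with non-labeled nucleotides the second list stays empty, mirroring the omission of steps (h) and (i).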
[00806] In some embodiments, in any of the methods for sequencing nucleic acid molecules, the binding of the plurality of first complexed polymerases with the plurality of multivalent molecules forms at least one avidity complex, the method comprising the steps: (a) binding a first nucleic acid primer, a first sequencing polymerase, and a first multivalent molecule to a first portion of a concatemer template molecule thereby forming a first binding complex, wherein a first nucleotide unit of the first multivalent molecule binds to the first sequencing polymerase; and (b) binding a second nucleic acid primer, a second sequencing polymerase, and the first multivalent molecule to a second portion of the same concatemer template molecule thereby forming a second binding complex, wherein a second nucleotide unit of the first multivalent molecule binds to the second sequencing polymerase, wherein the first and second binding complexes which include the same multivalent molecule form an avidity complex. In some embodiments, the first sequencing polymerase comprises any wild type or mutant polymerase described herein. In some embodiments, the second sequencing polymerase comprises any wild type or mutant polymerase described herein. The concatemer template molecule comprises tandem repeat sequences of a sequence of interest and at least one universal sequencing primer binding site. The first and second nucleic acid primers can bind to a sequencing primer binding site along the concatemer template molecule. Exemplary multivalent molecules are shown in FIGs. 1-4.
[00807] In some embodiments, in any of the methods for sequencing nucleic acid molecules, wherein the method includes binding the plurality of first complexed polymerases
with the plurality of multivalent molecules to form at least one avidity complex, the method comprising the steps: (a) contacting the plurality of sequencing polymerases and the plurality of nucleic acid primers with different portions of a concatemer template molecule to form at least first and second complexed polymerases on the same concatemer template molecule; (b) contacting a plurality of multivalent molecules to the at least first and second complexed polymerases on the same concatemer template molecule, under conditions suitable to bind a single multivalent molecule from the plurality to the first and second complexed polymerases, wherein at least a first nucleotide unit of the single multivalent molecule is bound to the first complexed polymerase which includes a first primer hybridized to a first portion of the concatemer template molecule thereby forming a first binding complex (e.g., first ternary complex), and wherein at least a second nucleotide unit of the single multivalent molecule is bound to the second complexed polymerase which includes a second primer hybridized to a second portion of the concatemer template molecule thereby forming a second binding complex (e.g., second ternary complex), wherein the contacting is conducted under a condition suitable to inhibit polymerase-catalyzed incorporation of the bound first and second nucleotide units in the first and second binding complexes, and wherein the first and second binding complexes which are bound to the same multivalent molecule form an avidity complex; and (c) detecting the first and second binding complexes on the same concatemer template molecule, and (d) identifying the first nucleotide unit in the first binding complex thereby determining the sequence of the first portion of the concatemer template molecule, and identifying the second nucleotide unit in the second binding complex thereby determining the sequence of the second portion of the concatemer template molecule. 
In some embodiments, the plurality of sequencing polymerases comprise any wild type or mutant sequencing polymerase described herein. The concatemer template molecule comprises tandem repeat sequences of a sequence of interest and at least one universal sequencing primer binding site. The plurality of nucleic acid primers can bind to a sequencing primer binding site along the concatemer template molecule. Exemplary multivalent molecules are shown in FIGs. 1-4.
Sequencing-by-Binding
[00808] The present disclosure provides methods for sequencing any of the immobilized concatemer molecules described herein, wherein the sequencing methods comprise a sequencing-by-binding (SBB) procedure which employs non-labeled chain-terminating nucleotides. In some embodiments, the sequencing-by-binding (SBB) method comprises the
steps of (a) sequentially contacting a primed template nucleic acid with at least two separate mixtures under ternary complex stabilizing conditions, wherein the at least two separate mixtures each include a polymerase and a nucleotide, whereby the sequentially contacting results in the primed template nucleic acid being contacted, under the ternary complex stabilizing conditions, with nucleotide cognates for first, second and third base types in the template; (b) examining the at least two separate mixtures to determine whether a ternary complex formed; and (c) identifying the next correct nucleotide for the primed template nucleic acid molecule, wherein the next correct nucleotide is identified as a cognate of the first, second or third base type if ternary complex is detected in step (b), and wherein the next correct nucleotide is imputed to be a nucleotide cognate of a fourth base type based on the absence of a ternary complex in step (b); (d) adding a next correct nucleotide to the primer of the primed template nucleic acid after step (b), thereby producing an extended primer; and (e) repeating steps (a) through (d) at least once on the primed template nucleic acid that comprises the extended primer. Exemplary sequencing-by-binding methods are described in U.S. Patent Nos. 10,246,744 and 10,731,141 (where the contents of both patents are hereby incorporated by reference in their entireties). In some embodiments, a sequencing cycle comprises completion of steps (a) - (d).
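The identification logic of step (c), in which a fourth base type is imputed from the absence of a ternary complex, can be shown with a short example. The following Python sketch is illustrative only: the function name, the dictionary input format, and the choice of A, C and G as the three examined nucleotide cognates are assumptions made for this example and are not specified by the method above.

```python
# Toy model of the step (c) call: three nucleotide cognates are examined,
# and the fourth is imputed when no ternary complex is detected.
# Names and the choice of examined bases are hypothetical.
def call_next_base(complex_detected, examined=("A", "C", "G"),
                   all_bases=("A", "C", "G", "T")):
    """Return the next correct nucleotide from ternary-complex observations.

    `complex_detected` maps each examined nucleotide to True/False.
    """
    detected = [b for b in examined if complex_detected.get(b, False)]
    if len(detected) > 1:
        raise ValueError("ambiguous: more than one ternary complex detected")
    if detected:
        return detected[0]
    # No complex detected: impute the single unexamined base type.
    remaining = [b for b in all_bases if b not in examined]
    if len(remaining) != 1:
        raise ValueError("imputation requires exactly one unexamined base")
    return remaining[0]


print(call_next_base({"A": False, "C": True, "G": False}))
print(call_next_base({"A": False, "C": False, "G": False}))
```

The sketch raises an error on ambiguous observations rather than guessing, which is a design choice of this example only.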
Sequencing Polymerases
[00809] The present disclosure provides methods for sequencing any of the template molecules described herein, where any of the sequencing methods described herein employ at least one type of sequencing polymerase and a plurality of nucleotides, or employ at least one type of sequencing polymerase and a plurality of nucleotides and a plurality of multivalent molecules. In some embodiments, the sequencing polymerase(s) is/are capable of incorporating a complementary nucleotide opposite a nucleotide in a template molecule. In some embodiments, the sequencing polymerase(s) is/are capable of binding a complementary nucleotide unit of a multivalent molecule opposite a nucleotide in a template molecule. In some embodiments, the plurality of sequencing polymerases comprise recombinant mutant polymerases.
[00810] Examples of suitable polymerases for use in sequencing with nucleotides and/or multivalent molecules include but are not limited to: Klenow DNA polymerase; Thermus aquaticus DNA polymerase I (Taq polymerase); KlenTaq polymerase; Candidatus Altiarchaeales archaeon; Candidatus Hadarchaeum yellowstonense; Hadesarchaea archaeon; Euryarchaeota archaeon; Thermoplasmata archaeon; Thermococcus polymerases such as
Thermococcus litoralis; bacteriophage T7 DNA polymerase; human alpha, delta and epsilon DNA polymerases; bacteriophage polymerases such as T4, RB69 and phi29 bacteriophage DNA polymerases; Pyrococcus furiosus DNA polymerase (Pfu polymerase); Bacillus subtilis DNA polymerase III; E. coli DNA polymerase III alpha and epsilon; 9 degree N polymerase; reverse transcriptases such as HIV type M or O reverse transcriptases; avian myeloblastosis virus reverse transcriptase; Moloney Murine Leukemia Virus (MMLV) reverse transcriptase; or telomerase. Further non-limiting examples of DNA polymerases include those from various Archaea genera, such as, Aeropyrum, Archaeoglobus, Desulfurococcus, Pyrobaculum, Pyrococcus, Pyrolobus, Pyrodictium, Staphylothermus, Stetteria, Sulfolobus, Thermococcus, and Vulcanisaeta and the like or variants thereof, including such polymerases as are known in the art such as 9 degrees N®, VENT®, DEEP VENT®, THERMINATOR®, Pfu, KOD, Pfx, Tgo and RB69 polymerases. Exemplary polymerases are described in U.S. Patent No. 11,859,241, the contents of which are incorporated by reference herein in their entirety.
Nucleotides
[00811] The present disclosure provides methods for sequencing any of the immobilized concatemer molecules described herein, where any of the sequencing methods described herein employ at least one nucleotide. The nucleotides comprise a base, sugar and at least one phosphate group. In some embodiments, at least one nucleotide in the plurality comprises an aromatic base, a five carbon sugar (e.g., ribose or deoxyribose), and one or more phosphate groups (e.g., 1-10 phosphate groups). The plurality of nucleotides can comprise at least one type of nucleotide selected from a group consisting of dATP, dGTP, dCTP, dTTP and dUTP. The plurality of nucleotides can comprise a mixture of any combination of two or more types of nucleotides selected from a group consisting of dATP, dGTP, dCTP, dTTP and/or dUTP. In some embodiments, at least one nucleotide in the plurality is not a nucleotide analog. In some embodiments, at least one nucleotide in the plurality comprises a nucleotide analog.
[00812] In some embodiments, at least one nucleotide in the plurality of nucleotides comprises a chain of one, two or three phosphorus atoms where the chain is typically attached to the 5’ carbon of the sugar moiety via an ester or phosphoramide linkage. In some embodiments, at least one nucleotide in the plurality is an analog having a phosphorus chain in which the phosphorus atoms are linked together with intervening O, S, NH, methylene or ethylene. In some embodiments, the phosphorus atoms in the chain include substituted side
groups including O, S or BH3. In some embodiments, the chain includes phosphate groups substituted with analogs including phosphoramidate, phosphorothioate, phosphorodithioate, and O-methylphosphoramidite groups.
[00813] In some embodiments, at least one nucleotide in the plurality of nucleotides comprises a terminator nucleotide analog having a chain terminating moiety (e.g., blocking moiety) at the sugar 2’ position, at the sugar 3’ position, or at the sugar 2’ and 3’ position. In some embodiments, the chain terminating moiety can inhibit polymerase-catalyzed incorporation of a subsequent nucleotide unit or free nucleotide in a nascent strand during a primer extension reaction. In some embodiments, the chain terminating moiety is attached to the 3’ sugar position where the sugar comprises a ribose or deoxyribose sugar moiety. In some embodiments, the chain terminating moiety is removable/cleavable from the 3’ sugar position to generate a nucleotide having a 3’ OH sugar group which is extendible with a subsequent nucleotide in a polymerase-catalyzed nucleotide incorporation reaction. In some embodiments, the chain terminating moiety comprises an alkyl group, alkenyl group, alkynyl group, allyl group, aryl group, benzyl group, azide group, amine group, amide group, keto group, isocyanate group, phosphate group, thio group, disulfide group, carbonate group, urea group, or silyl group. In some embodiments, the chain terminating moiety is cleavable/removable from the nucleotide, for example by reacting the chain terminating moiety with a chemical agent, pH change, light or heat. In some embodiments, the chain terminating moieties alkyl, alkenyl, alkynyl and allyl are cleavable with tetrakis(triphenylphosphine)palladium(0) (Pd(PPh3)4) with piperidine, or with 2,3-dichloro-5,6-dicyano-1,4-benzoquinone (DDQ). In some embodiments, the chain terminating moieties aryl and benzyl are cleavable with H2 Pd/C. In some embodiments, the chain terminating moieties amine, amide, keto, isocyanate, phosphate, thio, disulfide are cleavable with phosphine or with a thiol group including beta-mercaptoethanol or dithiothreitol (DTT).
In some embodiments, the chain terminating moiety carbonate is cleavable with potassium carbonate (K2CO3) in MeOH, with triethylamine in pyridine, or with Zn in acetic acid (AcOH). In some embodiments, the chain terminating moieties urea and silyl are cleavable with tetrabutylammonium fluoride, pyridine-HF, with ammonium fluoride, or with triethylamine trihydrofluoride. In some embodiments, the chain terminating moiety may be cleavable/removable with nitrous acid. In some embodiments, a chain terminating moiety may be cleavable/removable using a solution comprising nitrite, such as, for example, a combination of nitrite with an acid such as acetic acid, sulfuric acid, or nitric acid. In some further embodiments, said solution may comprise an organic acid.
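The pairings of chain terminating moieties and cleaving agents listed in this paragraph can be collected in a small lookup table. The following Python sketch is only an illustrative summary of those pairings. The dictionary layout and the function name `cleavage_agents` are hypothetical and are not part of the disclosure, and the table omits the nitrous acid and nitrite options, which the text does not tie to a specific moiety class in this paragraph.

```python
# Illustrative lookup of the moiety/agent pairings listed in the paragraph above.
# The names CLEAVAGE_AGENTS and cleavage_agents are hypothetical helpers.
CLEAVAGE_AGENTS = {
    ("alkyl", "alkenyl", "alkynyl", "allyl"): [
        "Pd(PPh3)4 with piperidine",
        "DDQ",
    ],
    ("aryl", "benzyl"): ["H2 with Pd/C"],
    ("amine", "amide", "keto", "isocyanate", "phosphate", "thio", "disulfide"): [
        "phosphine",
        "thiol (e.g., beta-mercaptoethanol or DTT)",
    ],
    ("carbonate",): [
        "K2CO3 in MeOH",
        "triethylamine in pyridine",
        "Zn in acetic acid",
    ],
    ("urea", "silyl"): [
        "tetrabutylammonium fluoride",
        "pyridine-HF",
        "ammonium fluoride",
        "triethylamine trihydrofluoride",
    ],
}


def cleavage_agents(moiety):
    """Return the agents listed for a moiety class (case-insensitive)."""
    key = moiety.strip().lower()
    for group, agents in CLEAVAGE_AGENTS.items():
        if key in group:
            return list(agents)  # copy so callers cannot mutate the table
    raise KeyError(f"no cleavage agent listed for moiety: {moiety!r}")


if __name__ == "__main__":
    print(cleavage_agents("allyl"))
```

A lookup of this kind only restates the listed options; it does not indicate which agent is preferred for any particular embodiment.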
[00814] In some embodiments, at least one nucleotide in the plurality of nucleotides comprises a terminator nucleotide analog having a chain terminating moiety (e.g., blocking moiety) at the sugar 2’ position, at the sugar 3’ position, or at the sugar 2’ and 3’ position. In some embodiments, the chain terminating moiety comprises an azide, azido or azidomethyl group. In some embodiments, the chain terminating moiety comprises a 3’-O-azido or 3’-O-azidomethyl group. In some embodiments, the chain terminating moieties azide, azido and azidomethyl group are cleavable/removable with a phosphine compound. In some embodiments, the phosphine compound comprises a derivatized tri-alkyl phosphine moiety or a derivatized tri-aryl phosphine moiety. In some embodiments, the phosphine compound comprises Tris(2-carboxyethyl)phosphine (TCEP) or bis-sulfo triphenyl phosphine (BS-TPP) or Tris(hydroxypropyl)phosphine (THPP). In some embodiments, the cleaving agent comprises 4-dimethylaminopyridine (4-DMAP). In some embodiments, the chain terminating moiety comprising one or more of a 3’-O-amino group, a 3’-O-aminomethyl group, a 3’-O-methylamino group, or derivatives thereof may be cleaved with nitrous acid, through a mechanism utilizing nitrous acid, or using a solution comprising nitrous acid. In some embodiments, the chain terminating moiety comprising one or more of a 3’-O-amino group, a 3’-O-aminomethyl group, a 3’-O-methylamino group, or derivatives thereof may be cleaved using a solution comprising nitrite. In some embodiments, for example, nitrite may be combined with or contacted with an acid such as acetic acid, sulfuric acid, or nitric acid. In some further embodiments, for example, nitrite may be combined with or contacted with an organic acid such as, for example, formic acid, acetic acid, propionic acid, butyric acid, isobutyric acid, or the like.
[00815] In some embodiments, the nucleotide comprises a chain terminating moiety which is selected from a group consisting of 3’-deoxy nucleotides, 2’,3’-dideoxynucleotides, 3’-methyl, 3’-azido, 3’-azidomethyl, 3’-O-azidoalkyl, 3’-O-ethynyl, 3’-O-aminoalkyl, 3’-O-fluoroalkyl, 3’-fluoromethyl, 3’-difluoromethyl, 3’-trifluoromethyl, 3’-sulfonyl, 3’-malonyl, 3’-amino, 3’-O-amino, 3’-sulfhydryl, 3’-aminomethyl, 3’-ethyl, 3’-butyl, 3’-tert-butyl, 3’-fluorenylmethyloxycarbonyl, 3’-tert-butyloxycarbonyl, 3’-O-alkyl hydroxylamino group, 3’-phosphorothioate, and 3’-O-benzyl, or derivatives thereof.
[00816] In some embodiments, the plurality of nucleotides comprises a plurality of nucleotides labeled with a detectable reporter moiety. The detectable reporter moiety comprises a fluorophore. In some embodiments, the fluorophore is attached to the nucleotide base. In some embodiments, the fluorophore is attached to the nucleotide base with a linker which is cleavable/removable from the base. In some embodiments, at least one of the nucleotides in
the plurality is not labeled with a detectable reporter moiety. In some embodiments, a particular detectable reporter moiety (e.g., fluorophore) that is attached to the nucleotide can correspond to the nucleotide base (e.g., dATP, dGTP, dCTP, dTTP or dUTP) to permit detection and identification of the nucleotide base.
[00817] In some embodiments, the cleavable linker on the nucleotide base comprises a cleavable moiety comprising an alkyl group, alkenyl group, alkynyl group, allyl group, aryl group, benzyl group, azide group, amine group, amide group, keto group, isocyanate group, phosphate group, thio group, disulfide group, carbonate group, urea group, or silyl group. In some embodiments, the cleavable linker on the base is cleavable/removable from the base by reacting the cleavable moiety with a chemical agent, pH change, light or heat. In some embodiments, the cleavable moieties alkyl, alkenyl, alkynyl and allyl are cleavable with tetrakis(triphenylphosphine)palladium(0) (Pd(PPh3)4) with piperidine, or with 2,3-dichloro-5,6-dicyano-1,4-benzoquinone (DDQ). In some embodiments, the cleavable moieties aryl and benzyl are cleavable with H2 and Pd/C. In some embodiments, the cleavable moieties amine, amide, keto, isocyanate, phosphate, thio, and disulfide are cleavable with phosphine or with a thiol group including beta-mercaptoethanol or dithiothreitol (DTT). In some embodiments, the cleavable moiety carbonate is cleavable with potassium carbonate (K2CO3) in MeOH, with triethylamine in pyridine, or with Zn in acetic acid (AcOH). In some embodiments, the cleavable moieties urea and silyl are cleavable with tetrabutylammonium fluoride, pyridine-HF, with ammonium fluoride, or with triethylamine trihydrofluoride.
[00818] In some embodiments, the cleavable linker on the nucleotide base comprises a cleavable moiety including an azide, azido or azidomethyl group. In some embodiments, the cleavable moieties azide, azido and azidomethyl group are cleavable/removable with a phosphine compound. In some embodiments, the phosphine compound comprises a derivatized tri-alkyl phosphine moiety or a derivatized tri-aryl phosphine moiety. In some embodiments, the phosphine compound comprises Tris(2-carboxyethyl)phosphine (TCEP) or bis-sulfo triphenyl phosphine (BS-TPP) or Tris(hydroxypropyl)phosphine (THPP). In some embodiments, the cleaving agent comprises 4-dimethylaminopyridine (4-DMAP).
[00819] In some embodiments, the chain terminating moiety (e.g., at the sugar 2’ and/or sugar 3’ position) and the cleavable linker on the nucleotide base have the same or different cleavable moieties. In some embodiments, the chain terminating moiety (e.g., at the sugar 2’ and/or sugar 3’ position) and the detectable reporter moiety linked to the base are chemically cleavable/removable with the same chemical agent. In some embodiments, the chain
terminating moiety (e.g., at the sugar 2’ and/or sugar 3’ position) and the detectable reporter moiety linked to the base are chemically cleavable/removable with different chemical agents.
Multivalent Molecules
[00820] The present disclosure provides methods for sequencing any of the immobilized concatemer molecules described herein, where the sequencing methods employ at least one multivalent molecule. In some embodiments, the multivalent molecule comprises a plurality of nucleotide arms attached to a core and having any configuration including a starburst, helter skelter, or bottle brush configuration (e.g., FIG. 1). The multivalent molecule comprises: (1) a core; and (2) a plurality of nucleotide arms which comprise (i) a core attachment moiety, (ii) a spacer comprising a PEG moiety, (iii) a linker, and (iv) a nucleotide unit. In some embodiments, the core is attached to the plurality of nucleotide arms. In some embodiments, the spacer is attached to the linker. In some embodiments, the linker is attached to the nucleotide unit. In some embodiments, the nucleotide unit comprises a base, sugar and at least one phosphate group, and the linker is attached to the nucleotide unit through the base. In some embodiments, the linker comprises an aliphatic chain or an oligo ethylene glycol chain, where each linker chain has 2-6 subunits. In some embodiments, the linker also includes an aromatic moiety. An exemplary nucleotide arm is shown in FIG. 5. Exemplary multivalent molecules are shown in FIGs. 1-4. An exemplary spacer is shown in FIG. 6 (top) and exemplary linkers are shown in FIG. 6 (bottom) and FIG. 7. Exemplary nucleotides attached to a linker are shown in FIGs. 8-10. An exemplary biotinylated nucleotide arm is shown in FIG. 11.
[00821] In some embodiments, a multivalent molecule comprises a core attached to multiple nucleotide arms. In some embodiments, the multiple nucleotide arms have the same type of nucleotide unit which is selected from a group consisting of dATP, dGTP, dCTP, dTTP and dUTP.
[00822] In some embodiments, a multivalent molecule comprises a core attached to multiple nucleotide arms, where each arm includes a nucleotide unit. The nucleotide unit comprises an aromatic base, a five carbon sugar (e.g., ribose or deoxyribose), and one or more phosphate groups (e.g., 1-10 phosphate groups). The plurality of multivalent molecules can comprise one type of multivalent molecule having one type of nucleotide unit selected from a group consisting of dATP, dGTP, dCTP, dTTP and dUTP. The plurality of multivalent molecules can comprise a mixture of any combination of two or more types of multivalent
molecules, where individual multivalent molecules in the mixture comprise nucleotide units selected from a group consisting of dATP, dGTP, dCTP, dTTP and/or dUTP.
[00823] In some embodiments, the nucleotide unit comprises a chain of one, two or three phosphorus atoms where the chain is typically attached to the 5’ carbon of the sugar moiety via an ester or phosphoramide linkage. In some embodiments, at least one nucleotide unit is a nucleotide analog having a phosphorus chain in which the phosphorus atoms are linked together with intervening O, S, NH, methylene or ethylene. In some embodiments, the phosphorus atoms in the chain include substituted side groups including O, S or BH3. In some embodiments, the chain includes phosphate groups substituted with analogs including phosphoramidate, phosphorothioate, phosphorodithioate, and O-methylphosphoramidite groups.
[00824] In some embodiments, the multivalent molecule comprises a core attached to multiple nucleotide arms. In some embodiments, individual nucleotide arms comprise a nucleotide unit which is a nucleotide analog having a chain terminating moiety (e.g., blocking moiety) at the sugar 2’ position, at the sugar 3’ position, or at the sugar 2’ and 3’ position. In some embodiments, the nucleotide unit comprises a chain terminating moiety (e.g., blocking moiety) at the sugar 2’ position, at the sugar 3’ position, or at the sugar 2’ and 3’ position. In some embodiments, the chain terminating moiety can inhibit polymerase-catalyzed incorporation of a subsequent nucleotide unit or free nucleotide in a nascent strand during a primer extension reaction. In some embodiments, the chain terminating moiety is attached to the 3’ sugar position where the sugar comprises a ribose or deoxyribose sugar moiety. In some embodiments, the chain terminating moiety is removable/cleavable from the 3’ sugar position to generate a nucleotide having a 3’-OH sugar group which is extendible with a subsequent nucleotide in a polymerase-catalyzed nucleotide incorporation reaction. In some embodiments, the chain terminating moiety comprises an alkyl group, alkenyl group, alkynyl group, allyl group, aryl group, benzyl group, azide group, amine group, amide group, keto group, isocyanate group, phosphate group, thio group, disulfide group, carbonate group, urea group, or silyl group. In some embodiments, the chain terminating moiety is cleavable/removable from the nucleotide unit, for example by reacting the chain terminating moiety with a chemical agent, pH change, light or heat. In some embodiments, the chain terminating moieties alkyl, alkenyl, alkynyl and allyl are cleavable with tetrakis(triphenylphosphine)palladium(0) (Pd(PPh3)4) with piperidine, or with 2,3-dichloro-5,6-dicyano-1,4-benzoquinone (DDQ).
In some embodiments, the chain terminating moieties aryl and benzyl are cleavable with H2 Pd/C. In some embodiments, the chain
terminating moieties amine, amide, keto, isocyanate, phosphate, thio, and disulfide are cleavable with phosphine or with a thiol group including beta-mercaptoethanol or dithiothreitol (DTT). In some embodiments, the chain terminating moiety carbonate is cleavable with potassium carbonate (K2CO3) in MeOH, with triethylamine in pyridine, or with Zn in acetic acid (AcOH). In some embodiments, the chain terminating moieties urea and silyl are cleavable with tetrabutylammonium fluoride, pyridine-HF, with ammonium fluoride, or with triethylamine trihydrofluoride.
[00825] In some embodiments, the nucleotide unit comprises a chain terminating moiety (e.g., blocking moiety) at the sugar 2’ position, at the sugar 3’ position, or at the sugar 2’ and 3’ position. In some embodiments, the chain terminating moiety comprises an azide, azido or azidomethyl group. In some embodiments, the chain terminating moiety comprises a 3’-O-azido or 3’-O-azidomethyl group. In some embodiments, the chain terminating moieties azide, azido and azidomethyl group are cleavable/removable with a phosphine compound. In some embodiments, the phosphine compound comprises a derivatized tri-alkyl phosphine moiety or a derivatized tri-aryl phosphine moiety. In some embodiments, the phosphine compound comprises Tris(2-carboxyethyl)phosphine (TCEP) or bis-sulfo triphenyl phosphine (BS-TPP) or Tris(hydroxypropyl)phosphine (THPP). In some embodiments, the cleaving agent comprises 4-dimethylaminopyridine (4-DMAP).
[00826] In some embodiments, the nucleotide unit comprises a chain terminating moiety which is selected from a group consisting of 3’-deoxy nucleotides, 2’,3’-dideoxynucleotides, 3’-methyl, 3’-azido, 3’-azidomethyl, 3’-O-azidoalkyl, 3’-O-ethynyl, 3’-O-aminoalkyl, 3’-O-fluoroalkyl, 3’-fluoromethyl, 3’-difluoromethyl, 3’-trifluoromethyl, 3’-sulfonyl, 3’-malonyl, 3’-amino, 3’-O-amino, 3’-sulfhydryl, 3’-aminomethyl, 3’-ethyl, 3’-butyl, 3’-tert-butyl, 3’-fluorenylmethyloxycarbonyl, 3’-tert-butyloxycarbonyl, 3’-O-alkyl hydroxylamino group, 3’-phosphorothioate, and 3’-O-benzyl, or derivatives thereof.
[00827] In some embodiments, the multivalent molecule comprises a core attached to multiple nucleotide arms. In some embodiments, the nucleotide arms comprise a spacer, linker and nucleotide unit. In some embodiments, the core, the linker and/or the nucleotide unit are labeled with a detectable reporter moiety. In some embodiments, the detectable reporter moiety comprises a fluorophore. In some embodiments, a particular detectable reporter moiety (e.g., fluorophore) that is attached to the multivalent molecule can correspond to the base (e.g., dATP, dGTP, dCTP, dTTP or dUTP) of the nucleotide unit to permit detection and identification of the nucleotide base.
[00828] In some embodiments, at least one nucleotide arm of a multivalent molecule has a nucleotide unit that is attached to a detectable reporter moiety. In some embodiments, the detectable reporter moiety is attached to the nucleotide base. In some embodiments, the detectable reporter moiety comprises a fluorophore. In some embodiments, a particular detectable reporter moiety (e.g., fluorophore) that is attached to the multivalent molecule can correspond to the base (e.g., dATP, dGTP, dCTP, dTTP or dUTP) of the nucleotide unit to permit detection and identification of the nucleotide base.
[00829] In some embodiments, the core of a multivalent molecule comprises an avidin-like or streptavidin-like moiety and the core attachment moiety comprises biotin. In some embodiments, the core comprises a streptavidin-type or avidin-type moiety which includes an avidin protein, as well as any derivatives, analogs and other non-native forms of avidin that can bind to at least one biotin moiety. Other forms of avidin moieties include native and recombinant avidin and streptavidin as well as derivatized molecules, e.g., nonglycosylated avidin and truncated streptavidins. For example, an avidin moiety includes deglycosylated forms of avidin, bacterial streptavidin produced by Streptomyces (e.g., Streptomyces avidinii), as well as derivatized forms, for example, N-acyl avidins, e.g., N-acetyl, N-phthalyl and N-succinyl avidin, and the commercially available products EXTRAVIDIN®, CAPTAVIDIN®, NEUTRAVIDIN® and NEUTRALITE AVIDIN®.
[00830] In some embodiments, any of the methods for sequencing any of the immobilized concatemer molecules described herein can include forming a binding complex, where the binding complex comprises (i) a polymerase, a nucleic acid concatemer molecule duplexed with a primer, and a nucleotide, or the binding complex comprises (ii) a polymerase, a nucleic acid concatemer molecule duplexed with a primer, and a nucleotide unit of a multivalent molecule. In some embodiments, the binding complex has a persistence time of greater than about 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1 second. The binding complex has a persistence time of greater than about 0.1-0.25 seconds, or about 0.25-0.5 seconds, or about 0.5-0.75 seconds, or about 0.75-1 second, or about 1-2 seconds, or about 2-3 seconds, or about 3-4 seconds, or about 4-5 seconds. In some embodiments, the method is or may be carried out at a temperature of at or above 15 °C, at or above 20 °C, at or above 25 °C, at or above 35 °C, at or above 37 °C, at or above 42 °C, at or above 55 °C, at or above 60 °C, at or above 72 °C, or at or above 80 °C, or within a range defined by any of the foregoing embodiments. The binding complex (e.g., ternary complex) remains stable until subjected to a condition that causes dissociation of interactions between any of the polymerase, template
molecule, primer and/or the nucleotide unit or the nucleotide. For example, a dissociating condition comprises contacting the binding complex with any one or any combination of a detergent, EDTA and/or water. In some embodiments, the present disclosure provides said method wherein the binding complex is deposited on, attached to, or hybridized to, a surface showing a contrast to noise ratio in the detecting step of greater than 20. In some embodiments, the present disclosure provides said method wherein the contacting is performed under a condition that stabilizes the binding complex when the nucleotide or nucleotide unit is complementary to a next base of the template nucleic acid, and destabilizes the binding complex when the nucleotide or nucleotide unit is not complementary to the next base of the template nucleic acid.
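The contrast to noise ratio threshold mentioned above can be illustrated with a short calculation. The disclosure does not define the formula at this point, so the Python sketch below assumes one common definition: mean signal intensity minus mean background intensity, divided by the standard deviation of the background. The function name and the pixel values are hypothetical.

```python
from statistics import mean, pstdev


def contrast_to_noise_ratio(signal_pixels, background_pixels):
    """CNR under an assumed definition: (mean signal - mean background) / background std."""
    noise = pstdev(background_pixels)  # population standard deviation of the background
    if noise == 0:
        raise ValueError("background has zero variance; CNR is undefined")
    return (mean(signal_pixels) - mean(background_pixels)) / noise


if __name__ == "__main__":
    # Hypothetical intensities for a polony region and a nearby empty region.
    signal = [1000, 1100, 1050]
    background = [100, 110, 90, 100]
    cnr = contrast_to_noise_ratio(signal, background)
    print(f"CNR = {cnr:.1f}; exceeds 20: {cnr > 20}")
```

Other definitions of CNR exist, and a different choice of noise term would change the numerical value.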
Supports with Low Non-Specific Binding Coatings
[00831] The present disclosure provides compositions and methods for use of a support having a plurality of surface primers immobilized thereon, for preparing any of the immobilized concatemers described herein. In some embodiments, the support is passivated with a low non-specific binding coating (e.g., FIG. 14C). The surface coatings described herein exhibit very low non-specific binding to reagents typically used for nucleic acid capture, amplification and sequencing workflows, such as dyes, nucleotides, enzymes, and nucleic acid primers. The surface coatings exhibit low background fluorescence signals or high contrast-to-noise (CNR) ratios compared to conventional surface coatings.
[00832] In general, the supports comprise a substrate (or support structure), one or more covalently or non-covalently attached low-binding chemical modification layers, e.g., silane layers or polymer films, and one or more covalently or non-covalently attached primer sequences that may be used for tethering single-stranded target nucleic acid(s) to the support surface. In some embodiments, the formulation of the surface, e.g., the chemical composition of one or more layers, the coupling chemistry used to cross-link the one or more layers to the support surface and/or to each other, and the total number of layers, may be varied such that non-specific binding of proteins, nucleic acid molecules, and other hybridization and amplification reaction components to the support surface is minimized or reduced relative to a comparable monolayer. Often, the formulation of the surface may be varied such that non-specific hybridization on the support surface is minimized or reduced relative to a comparable monolayer. The formulation of the surface may be varied such that non-specific amplification on the support surface is minimized or reduced relative to a
comparable monolayer. The formulation of the surface may be varied such that specific amplification rates and/or yields on the support surface are maximized. Amplification levels suitable for detection are achieved in no more than 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, or more than 30 amplification cycles in some cases disclosed herein.
[00833] The substrate or support structure that comprises one or more chemically- modified layers, e.g., layers of a low non-specific binding polymer, may be independent or integrated into another structure or assembly. For example, in some embodiments, the substrate or support structure may comprise one or more surfaces within an integrated or assembled microfluidic flow cell. The substrate or support structure may comprise one or more surfaces within a microplate format, e.g., the bottom surface of the wells in a microplate. As noted above, in some preferred embodiments, the substrate or support structure comprises the interior surface (such as the lumen surface) of a capillary. In alternate preferred embodiments the substrate or support structure comprises the interior surface (such as the lumen surface) of a capillary etched into a planar chip.
[00834] The attachment chemistry used to graft a first chemically-modified layer to a surface will generally be dependent on both the material from which the surface is fabricated and the chemical nature of the layer. In some embodiments, the first layer may be covalently attached to the surface. In some embodiments, the first layer may be non-covalently attached, e.g., adsorbed to the surface through non-covalent interactions such as electrostatic interactions, hydrogen bonding, or van der Waals interactions between the surface and the molecular components of the first layer. In either case, the substrate surface may be treated prior to attachment or deposition of the first layer. Any of a variety of surface preparation techniques known to those of skill in the art may be used to clean or treat the surface. For example, glass or silicon surfaces may be acid-washed using a Piranha solution (a mixture of sulfuric acid (H2SO4) and hydrogen peroxide (H2O2)), base treatment in KOH and NaOH, and/or cleaned using an oxygen plasma treatment method.
[00835] Silane chemistries constitute one non-limiting approach for covalently modifying the silanol groups on glass or silicon surfaces to attach more reactive functional groups (e.g., amines or carboxyl groups), which may then be used in coupling linker molecules (e.g., linear hydrocarbon molecules of various lengths, such as C6, C12, C18 hydrocarbons, or linear polyethylene glycol (PEG) molecules) or layer molecules (e.g., branched PEG molecules or other polymers) to the surface. Examples of suitable silanes that may be used in creating any of the disclosed low binding surfaces include, but are not limited to, (3-aminopropyl)trimethoxysilane (APTMS), (3-aminopropyl)triethoxysilane (APTES), any of a variety of
PEG-silanes (e.g., comprising molecular weights of 1K, 2K, 5K, 10K, 20K, etc.), amino-PEG silane (i.e., comprising a free amino functional group), maleimide-PEG silane, biotin-PEG silane, and the like.
[00836] Any of a variety of molecules known to those of skill in the art including, but not limited to, amino acids, peptides, nucleotides, oligonucleotides, other monomers or polymers, or combinations thereof may be used in creating the one or more chemically-modified layers on the surface, where the choice of components used may be varied to alter one or more properties of the surface, e.g., the surface density of functional groups and/or tethered oligonucleotide primers, the hydrophilicity/hydrophobicity of the surface, or the three-dimensional nature (i.e., “thickness”) of the surface. Examples of preferred polymers that may be used to create one or more layers of low non-specific binding material in any of the disclosed surfaces include, but are not limited to, polyethylene glycol (PEG) of various molecular weights and branching structures, streptavidin, polyacrylamide, polyester, dextran, poly-lysine, and poly-lysine copolymers, or any combination thereof. Examples of conjugation chemistries that may be used to graft one or more layers of material (e.g., polymer layers) to the surface and/or to cross-link the layers to each other include, but are not limited to, biotin-streptavidin interactions (or variations thereof), His-tag/Ni-NTA conjugation chemistries, methoxy ether conjugation chemistries, carboxylate conjugation chemistries, amine conjugation chemistries, NHS esters, maleimides, thiol, epoxy, azide, hydrazide, alkyne, isocyanate, and silane.
[00837] The low non-specific binding surface coating may be applied uniformly across the substrate. Alternately, the surface coating may be patterned, such that the chemical modification layers are confined to one or more discrete regions of the substrate. For example, the surface may be patterned using photolithographic techniques to create an ordered array or random pattern of chemically-modified regions on the surface. Alternately or in combination, the substrate surface may be patterned using, e.g., contact printing and/or ink-jet printing techniques. In some embodiments, an ordered array or random pattern of chemically-modified regions may comprise at least 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or 10,000 or more discrete regions.
[00838] In order to achieve low nonspecific binding surfaces, hydrophilic polymers may be nonspecifically adsorbed or covalently grafted to the surface. Typically, passivation is performed utilizing poly(ethylene glycol) (PEG, also known as polyethylene oxide (PEO) or polyoxyethylene) or other hydrophilic polymers with different molecular weights and end
groups that are linked to a surface using, for example, silane chemistry. The end groups distal from the surface can include, but are not limited to, biotin, methoxy ether, carboxylate, amine, NHS ester, maleimide, and bis-silane. In some embodiments, two or more layers of a hydrophilic polymer, e.g., a linear polymer, branched polymer, or multi-branched polymer, may be deposited on the surface. In some embodiments, two or more layers may be covalently coupled to each other or internally cross-linked to improve the stability of the resulting surface. In some embodiments, oligonucleotide primers with different base sequences and base modifications (or other biomolecules, e.g., enzymes or antibodies) may be tethered to the resulting surface layer at various surface densities. In some embodiments, for example, both surface functional group density and oligonucleotide concentration may be varied to target a certain primer density range. Additionally, primer density can be controlled by diluting oligonucleotide with other molecules that carry the same functional group. For example, amine-labeled oligonucleotide can be diluted with amine-labeled polyethylene glycol in a reaction with an NHS-ester coated surface to reduce the final primer density. Primers with different lengths of linker between the hybridization region and the surface attachment functional group can also be applied to control surface density. Examples of suitable linkers include poly-T and poly-A strands at the 5’ end of the primer (e.g., 0 to 20 bases), PEG linkers (e.g., 3 to 20 monomer units), and carbon-chain (e.g., C6, C12, C18, etc.). To measure the primer density, fluorescently-labeled primers may be tethered to the surface and a fluorescence reading then compared with that for a dye solution of known concentration.
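The dilution approach described above, in which an amine-labeled oligonucleotide is mixed with amine-labeled polyethylene glycol before reaction with an NHS-ester surface, can be sketched numerically. The Python example below assumes that both amine-labeled species couple to the surface with equal efficiency, so that the primer share of occupied sites equals its mole fraction in the mixture. That assumption, the function names, and the numerical values are illustrative only and are not taken from the disclosure.

```python
def primer_fraction(oligo_conc, diluent_conc):
    """Mole fraction of primer among all amine-labeled molecules in the mixture."""
    total = oligo_conc + diluent_conc
    if oligo_conc < 0 or diluent_conc < 0 or total == 0:
        raise ValueError("concentrations must be non-negative and not both zero")
    return oligo_conc / total


def estimated_primer_density(undiluted_density, oligo_conc, diluent_conc):
    """Scale an undiluted primer density by the primer mole fraction.

    Assumes equal coupling efficiency for primer and diluent (an assumption,
    not a statement from the source).
    """
    return undiluted_density * primer_fraction(oligo_conc, diluent_conc)


if __name__ == "__main__":
    # Hypothetical: 1 uM amine-oligonucleotide diluted with 3 uM amine-PEG,
    # on a surface that would otherwise carry 10,000 primers per square micrometer.
    print(estimated_primer_density(10_000, 1.0, 3.0))
```

In practice the two species may react at different rates, so a measured calibration such as the fluorescence comparison described above would be needed to confirm the resulting density.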
[00839] In order to scale primer surface density and add additional dimensionality to hydrophilic or amphoteric surfaces, surfaces comprising multi-layer coatings of PEG and other hydrophilic polymers have been developed. By using hydrophilic and amphoteric surface layering approaches that include, but are not limited to, the polymer/co-polymer materials described below, it is possible to increase primer loading density on the surface significantly. Traditional PEG coating approaches use monolayer primer deposition, which has been generally reported for single molecule applications, but does not yield high copy numbers for nucleic acid amplification applications. As described herein, “layering” can be accomplished using traditional crosslinking approaches with any compatible polymer or monomer subunits such that a surface comprising two or more highly crosslinked layers can be built sequentially. Examples of suitable polymers include, but are not limited to, streptavidin, polyacrylamide, polyester, dextran, poly-lysine, and copolymers of poly-lysine and PEG. In some embodiments, the different layers may be attached to each other through
any of a variety of conjugation reactions including, but not limited to, biotin-streptavidin binding, azide-alkyne click reaction, amine-NHS ester reaction, thiol-maleimide reaction, and ionic interactions between positively charged polymer and negatively charged polymer. In some embodiments, high primer density materials may be constructed in solution and subsequently layered onto the surface in multiple steps.
[00840] As noted, the low non-specific binding coatings of the present disclosure exhibit reduced non-specific binding of proteins, nucleic acids, and other components of the hybridization and/or amplification formulation used for solid-phase nucleic acid amplification. The degree of non-specific binding exhibited by a given support surface may be assessed either qualitatively or quantitatively. For example, in some embodiments, exposure of the surface to fluorescent dyes (e.g., cyanine dyes such as Cy3, or Cy5, etc., fluoresceins, coumarins, rhodamines, etc. or other dyes disclosed herein), fluorescently- labeled nucleotides, fluorescently-labeled oligonucleotides, and/or fluorescently-labeled proteins (e.g. polymerases) under a standardized set of conditions, followed by a specified rinse protocol and fluorescence imaging may be used as a qualitative tool for comparison of non-specific binding on supports comprising different surface formulations. In some embodiments, exposure of the surface to fluorescent dyes, fluorescently-labeled nucleotides, fluorescently-labeled oligonucleotides, and/or fluorescently-labeled proteins (e.g. polymerases) under a standardized set of conditions, followed by a specified rinse protocol and fluorescence imaging may be used as a quantitative tool for comparison of non-specific binding on supports comprising different surface formulations - provided that care has been taken to ensure that the fluorescence imaging is performed under a condition where fluorescence signal is linearly related (or related in a predictable manner) to the number of fluorophores on the support surface (e.g., under a condition where signal saturation and/or self-quenching of the fluorophore is not an issue) and suitable calibration standards are used. 
In some embodiments, other techniques known to those of skill in the art, for example, radioisotope labeling and counting methods may be used for quantitative assessment of the degree to which non-specific binding is exhibited by the different support surface formulations of the present disclosure.
[00841] Some surfaces disclosed herein exhibit a ratio of specific to nonspecific binding of a fluorophore such as Cy3 of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 75, 100, or greater than 100, or any intermediate value spanned by the range herein. Some surfaces disclosed herein exhibit a ratio of specific to nonspecific fluorescence of a fluorophore such as Cy3 of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 75, 100, or greater than 100, or any intermediate value spanned by the range herein.
[00842] As noted, in some embodiments, the degree of non-specific binding exhibited by the disclosed low-binding supports may be assessed using a standardized protocol for contacting the surface with a labeled protein (e.g., bovine serum albumin (BSA), streptavidin, a DNA polymerase, a reverse transcriptase, a helicase, a single-stranded binding protein (SSB), etc., or any combination thereof), a labeled nucleotide, a labeled oligonucleotide, etc., under a standardized set of incubation and rinse conditions, followed by detection of the amount of label remaining on the surface and comparison of the signal resulting therefrom to an appropriate calibration standard. In some embodiments, the label may comprise a fluorescent label. In some embodiments, the label may comprise a radioisotope. In some embodiments, the label may comprise any other detectable label known to one of skill in the art. In some embodiments, the degree of non-specific binding exhibited by a given support surface formulation may thus be assessed in terms of the number of non-specifically bound protein molecules (or other molecules) per unit area. In some embodiments, the low-binding supports of the present disclosure may exhibit non-specific protein binding (or non-specific binding of other specified molecules, e.g., cyanine dyes such as Cy3 or Cy5, fluoresceins, coumarins, rhodamines, or other dyes disclosed herein) of less than 0.001 molecule per µm², less than 0.01 molecule per µm², less than 0.1 molecule per µm², less than 0.25 molecule per µm², less than 0.5 molecule per µm², less than 1 molecule per µm², less than 10 molecules per µm², less than 100 molecules per µm², or less than 1,000 molecules per µm². Those of skill in the art will realize that a given support surface of the present disclosure may exhibit non-specific binding falling anywhere within this range, for example, of less than 86 molecules per µm².
For example, some modified surfaces disclosed herein exhibit nonspecific protein binding of less than 0.5 molecule per µm² following contact with a 1 µM solution of Cy3-labeled streptavidin (GE Amersham) in phosphate buffered saline (PBS) buffer for 15 minutes, followed by 3 rinses with deionized water. Some modified surfaces disclosed herein exhibit nonspecific binding of Cy3 dye molecules of less than 0.25 molecules per µm². In independent nonspecific binding assays, 1 µM Cy3-labeled SA (ThermoFisher), 1 µM Cy5-labeled SA (ThermoFisher), 10 µM Aminoallyl-dUTP - ATTO-647N (Jena Biosciences), 10 µM Aminoallyl-dUTP - ATTO-Rho11 (Jena Biosciences), 10 µM 7-Propargylamino-7-deaza-dGTP - Cy5 (Jena Biosciences), and 10 µM 7-Propargylamino-7-deaza-dGTP - Cy3 (Jena Biosciences) were incubated on the low binding substrates at 37°C for 15 minutes in a 384-well plate format.
Each well was rinsed 2-3 times with 50 µL deionized RNase/DNase-free water and 2-3 times with 25 mM ACES buffer, pH 7.4. The 384-well plates were imaged on a GE Typhoon® instrument using the Cy3, AF555, or Cy5 filter sets (according to the dye test performed) as specified by the manufacturer at a PMT gain setting of 800 and a resolution of 50-100 µm. For higher resolution imaging, images were collected on an Olympus® IX83 microscope (Olympus Corp., Center Valley, PA) with a total internal reflection fluorescence (TIRF) objective (100X, 1.5 NA, Olympus), a CCD camera (e.g., an Olympus EM-CCD monochrome camera, Olympus XM-10 monochrome camera, or an Olympus DP80 color and monochrome camera), an illumination source (e.g., an Olympus 100W Hg lamp, an Olympus 75W Xe lamp, or an Olympus U-HGLGPS fluorescence light source), and excitation wavelengths of 532 nm or 635 nm. Dichroic mirrors were purchased from Semrock (IDEX Health & Science, LLC, Rochester, New York), e.g., 405, 488, 532, or 633 nm dichroic reflectors/beamsplitters, and band pass filters were chosen as 532 LP or 645 LP concordant with the appropriate excitation wavelength. Some modified surfaces disclosed herein exhibit nonspecific binding of dye molecules of less than 0.25 molecules per µm².
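As an illustration of how a calibrated fluorescence reading might be expressed as non-specifically bound molecules per unit area, the following minimal sketch assumes a linear, non-saturating signal response as required above. The function name, the calibration factor (counts per molecule), and all numerical values are hypothetical and are not taken from the present disclosure.

```python
def molecules_per_um2(signal_counts, background_counts,
                      counts_per_molecule, area_um2):
    """Estimate the surface density of non-specifically bound label.

    Assumes signal is linear in fluorophore number, so that a single
    calibration factor (counts per molecule) applies over the range used.
    """
    if counts_per_molecule <= 0 or area_um2 <= 0:
        raise ValueError("calibration factor and area must be positive")
    # Net signal attributable to bound label; clamp at zero so that noise
    # below the background level does not give a negative density.
    net_counts = max(signal_counts - background_counts, 0.0)
    return net_counts / counts_per_molecule / area_um2


# Hypothetical reading: 12,500 counts over a 400 um^2 region, 500 counts of
# instrument background, and a calibration of 120 counts per molecule.
density = molecules_per_um2(12_500.0, 500.0, 120.0, 400.0)
meets_threshold = density < 0.5  # compare with a 0.5 molecule/um^2 limit
print(f"{density:.2f} molecules per um^2, meets threshold: {meets_threshold}")
```

In practice the calibration factor would be obtained from suitable calibration standards imaged under the same conditions.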
[00843] In some embodiments, the surfaces disclosed herein exhibit a ratio of specific to nonspecific binding of a fluorophore such as Cy3 of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 75, 100, or greater than 100, or any intermediate value spanned by the range herein. In some embodiments, the surfaces disclosed herein exhibit a ratio of specific to nonspecific fluorescence signals for a fluorophore such as Cy3 of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 75, 100, or greater than 100, or any intermediate value spanned by the range herein.
[00844] The low-background surfaces consistent with the disclosure herein may exhibit specific dye attachment (e.g., Cy3 attachment) to non-specific dye adsorption (e.g., Cy3 dye adsorption) ratios of at least 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 15:1, 20:1, 30:1, 40:1, 50:1, or more than 50 specific dye molecules attached per molecule nonspecifically adsorbed. Similarly, when subjected to an excitation energy, low-background surfaces consistent with the disclosure herein to which fluorophores, e.g., Cy3, have been attached may exhibit ratios of specific fluorescence signal (e.g., arising from Cy3-labeled oligonucleotides attached to the surface) to non-specific adsorbed dye fluorescence signals of at least 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 15:1, 20:1, 30:1, 40:1, 50:1, or more than 50:1.
[00845] In some embodiments, the degree of hydrophilicity (or “wettability” with aqueous solutions) of the disclosed support surfaces may be assessed, for example, through the measurement of water contact angles in which a small droplet of water is placed on the surface and its angle of contact with the surface is measured using, e.g., an optical tensiometer. In some embodiments, a static contact angle may be determined. In some embodiments, an advancing or receding contact angle may be determined. In some embodiments, the water contact angle for the hydrophilic, low-binding support surfaces disclosed herein may range from about 0 degrees to about 30 degrees. In some embodiments, the water contact angle for the hydrophilic, low-binding support surfaces disclosed herein may be no more than 50 degrees, 40 degrees, 30 degrees, 25 degrees, 20 degrees, 18 degrees, 16 degrees, 14 degrees, 12 degrees, 10 degrees, 8 degrees, 6 degrees, 4 degrees, 2 degrees, or 1 degree. In many cases the contact angle is no more than 40 degrees. Those of skill in the art will realize that a given hydrophilic, low-binding support surface of the present disclosure may exhibit a water contact angle having a value of anywhere within this range.
[00846] In some embodiments, the hydrophilic surfaces disclosed herein facilitate reduced wash times for bioassays, often due to reduced nonspecific binding of biomolecules to the low-binding surfaces. In some embodiments, adequate wash steps may be performed in less than 60, 50, 40, 30, 20, 15, 10, or less than 10 seconds. For example, in some embodiments adequate wash steps may be performed in less than 30 seconds.
[00847] The low-binding surfaces of the present disclosure exhibit significant improvement in stability or durability to prolonged exposure to solvents and elevated temperatures, or to repeated cycles of solvent exposure or changes in temperature. For example, in some embodiments, the stability of the disclosed surfaces may be tested by fluorescently labeling a functional group on the surface, or a tethered biomolecule (e.g., an oligonucleotide primer) on the surface, and monitoring fluorescence signal before, during, and after prolonged exposure to solvents and elevated temperatures, or to repeated cycles of solvent exposure or changes in temperature. In some embodiments, the degree of change in the fluorescence used to assess the quality of the surface may be less than 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, or 25% over a time period of 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 10 minutes, 20 minutes, 30 minutes, 40 minutes, 50 minutes, 60 minutes, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 7 hours, 8 hours, 9 hours, 10 hours, 15 hours, 20 hours, 25 hours, 30 hours, 35 hours, 40 hours, 45 hours, 50 hours, or 100 hours of exposure to solvents and/or elevated temperatures (or any combination of these percentages as measured over these time periods). In some embodiments, the degree of change in the fluorescence used to assess the quality of the surface may be less than 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, or 25% over 5 cycles, 10 cycles, 20 cycles, 30 cycles, 40 cycles, 50 cycles, 60 cycles,
70 cycles, 80 cycles, 90 cycles, 100 cycles, 200 cycles, 300 cycles, 400 cycles, 500 cycles, 600 cycles, 700 cycles, 800 cycles, 900 cycles, or 1,000 cycles of repeated exposure to solvent changes and/or changes in temperature (or any combination of these percentages as measured over this range of cycles).
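The stability criterion described above, a bounded percentage change in fluorescence over a series of exposures or cycles, can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the first reading is assumed to be the pre-exposure baseline, and the trace values are invented for the example.

```python
def max_percent_change(signals):
    """Largest absolute deviation from the first reading, as a percentage."""
    baseline = signals[0]
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return max(abs(s - baseline) / baseline * 100.0 for s in signals)


def within_spec(signals, limit_percent):
    """True if the fluorescence never drifts by limit_percent or more."""
    return max_percent_change(signals) < limit_percent


# Hypothetical fluorescence readings taken before exposure (first value)
# and after successive solvent/temperature cycles.
trace = [1000.0, 995.0, 990.0, 985.0, 980.0]
change = max_percent_change(trace)
print(f"maximum change: {change:.1f}%")
print("within 5% limit:", within_spec(trace, 5.0))
```

The same comparison applies whether the readings are indexed by elapsed time or by cycle number.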
[00848] In some embodiments, the surfaces disclosed herein may exhibit a high ratio of specific signal to nonspecific signal or other background. For example, when used for nucleic acid amplification, some surfaces may exhibit an amplification signal that is at least 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 75, 100, or greater than 100 fold greater than a signal of an adjacent unpopulated region of the surface. Similarly, some surfaces exhibit an amplification signal that is at least 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 75, 100, or greater than 100 fold greater than a signal of an adjacent amplified nucleic acid population region of the surface.
[00849] In some embodiments, fluorescence images of the disclosed low background surfaces when used in nucleic acid hybridization or amplification applications to create clusters of hybridized or clonally-amplified nucleic acid molecules (e.g., that have been directly or indirectly labeled with a fluorophore) exhibit contrast-to-noise ratios (CNRs) of at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, or greater than 250.
[00850] One or more types of primer (e.g., capture primers) may be attached or tethered to the support surface. In some embodiments, the one or more types of adapters or primers may comprise spacer sequences, adapter sequences for hybridization to adapter-ligated target library nucleic acid sequences, forward amplification primers, reverse amplification primers, sequencing primers, and/or molecular barcoding sequences, or any combination thereof. In some embodiments, 1 primer or adapter sequence may be tethered to at least one layer of the surface. In some embodiments, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 different primer or adapter sequences may be tethered to at least one layer of the surface.
[00851] In some embodiments, the tethered adapter and/or primer sequences may range in length from about 10 nucleotides to about 100 nucleotides. In some embodiments, the tethered adapter and/or primer sequences may be at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, or at least 100 nucleotides in length. In some embodiments, the tethered adapter and/or primer sequences may be at most 100, at most 90, at most 80, at most 70, at most 60, at most 50, at most 40, at most 30, at most 20, or at most 10 nucleotides in length. Any of the lower and upper values described in this paragraph may be combined to form a range included within the present disclosure, for
example, in some embodiments the length of the tethered adapter and/or primer sequences may range from about 20 nucleotides to about 80 nucleotides. Those of skill in the art will recognize that the length of the tethered adapter and/or primer sequences may have any value within this range, e.g., about 24 nucleotides.
[00852] In some embodiments, the resultant surface density of primers on the low binding support surfaces of the present disclosure may range from about 100 primer molecules per µm² to about 100,000 primer molecules per µm². In some embodiments, the resultant surface density of primers on the low binding support surfaces of the present disclosure may range from about 100,000 primer molecules per µm² to about 10¹⁵ primer molecules per µm². In some embodiments, the surface density of primers may be at least 1,000, at least 10,000, at least 100,000, or at least 10¹⁵ primer molecules per µm². In some embodiments, the surface density of primers may be at most 10,000, at most 100,000, at most 1,000,000, or at most 10¹⁵ primer molecules per µm². Any of the lower and upper values described in this paragraph may be combined to form a range included within the present disclosure, for example, in some embodiments the surface density of primers may range from about 10,000 molecules per µm² to about 10¹⁵ molecules per µm². Those of skill in the art will recognize that the surface density of primer molecules may have any value within this range, e.g., about 455,000 molecules per µm². In some embodiments, the surface density of target library nucleic acid sequences initially hybridized to adapter or primer sequences on the support surface may be less than or equal to that indicated for the surface density of tethered primers. In some embodiments, the surface density of clonally-amplified target library nucleic acid sequences hybridized to adapter or primer sequences on the support surface may span the same range as that indicated for the surface density of tethered primers.
[00853] Local densities as listed above do not preclude variation in density across a surface, such that a surface may comprise a region having an oligo density of, for example, 500,000 per µm², while also comprising at least a second region having a substantially different local density.
[00854] The low non-specific binding coating, which may comprise one or more layers of a multi-layered surface coating, may comprise a branched polymer or a linear polymer. Examples of suitable branched polymers include, but are not limited to, branched PEG, branched poly(vinyl alcohol) (branched PVA), branched poly(vinyl pyridine), branched poly(vinyl pyrrolidone) (branched PVP), branched poly(acrylic acid) (branched PAA), branched polyacrylamide, branched poly(N-isopropylacrylamide) (branched PNIPAM), branched poly(methyl methacrylate) (branched PMA), branched poly(2-hydroxyethyl methacrylate) (branched PHEMA), branched poly(oligo(ethylene glycol) methyl ether methacrylate) (branched POEGMA), branched polyglutamic acid (branched PGA), branched poly-lysine, branched poly-glucoside, and dextran.
[00855] In some embodiments, the branched polymers used to create one or more layers of any of the multi-layered surfaces disclosed herein may comprise at least 4 branches, at least 5 branches, at least 6 branches, at least 7 branches, at least 8 branches, at least 9 branches, at least 10 branches, at least 12 branches, at least 14 branches, at least 16 branches, at least 18 branches, at least 20 branches, at least 22 branches, at least 24 branches, at least 26 branches, at least 28 branches, at least 30 branches, at least 32 branches, at least 34 branches, at least 36 branches, at least 38 branches, or at least 40 branches.
[00856] Linear, branched, or multi-branched polymers used to create one or more layers of any of the multi-layered surfaces disclosed herein may have a molecular weight of at least 500, at least 1,000, at least 2,000, at least 3,000, at least 4,000, at least 5,000, at least 10,000, at least 15,000, at least 20,000, at least 25,000, at least 30,000, at least 35,000, at least 40,000, at least 45,000, or at least 50,000 daltons.
[00857] In some embodiments, e.g., wherein at least one layer of a multi-layered surface comprises a branched polymer, the number of covalent bonds between a branched polymer molecule of the layer being deposited and molecules of the previous layer may range from about one covalent linkage per molecule to about 32 covalent linkages per molecule. In some embodiments, the number of covalent bonds between a branched polymer molecule of the new layer and molecules of the previous layer may be at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, at least 26, at least 28, at least 30, or at least 32 covalent linkages per molecule.
[00858] Any reactive functional groups that remain following the coupling of a material layer to the surface may optionally be blocked by coupling a small, inert molecule using a high yield coupling chemistry. For example, in the case that amine coupling chemistry is used to attach a new material layer to the previous one, any residual amine groups may subsequently be acetylated or deactivated by coupling with a small amino acid such as glycine.
[00859] The number of layers of low non-specific binding material, e.g., a hydrophilic polymer material, deposited on the surface, may range from 1 to about 10. In some embodiments, the number of layers is at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10. In some embodiments, the number of layers may be at most 10, at most 9, at most 8, at most 7, at most 6, at most 5, at most 4, at most 3, at most 2, or at most 1. Any of the lower and upper values described in this paragraph may be combined to form a range included within the present disclosure, for example, in some embodiments the number of layers may range from about 2 to about 4. In some embodiments, all of the layers may comprise the same material. In some embodiments, each layer may comprise a different material. In some embodiments, the plurality of layers may comprise a plurality of materials. In some embodiments, at least one layer may comprise a branched polymer. In some embodiments, all of the layers may comprise a branched polymer.
[00860] One or more layers of low non-specific binding material may in some cases be deposited on and/or conjugated to the substrate surface using a polar protic solvent, a polar or polar aprotic solvent, a nonpolar solvent, or any combination thereof. In some embodiments the solvent used for layer deposition and/or coupling may comprise an alcohol (e.g., methanol, ethanol, propanol, etc.), another organic solvent (e.g., acetonitrile, dimethyl sulfoxide (DMSO), dimethyl formamide (DMF), etc.), water, an aqueous buffer solution (e.g., phosphate buffer, phosphate buffered saline, 3-(N-morpholino)propanesulfonic acid (MOPS), etc.), or any combination thereof. In some embodiments, an organic component of the solvent mixture used may comprise at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the total, with the balance made up of water or an aqueous buffer solution. In some embodiments, an aqueous component of the solvent mixture used may comprise at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% of the total, with the balance made up of an organic solvent. The pH of the solvent mixture used may be less than 6, about 6, 6.5, 7, 7.5, 8, 8.5, 9, or greater than pH 9.
[00861] Fluorescence imaging may be performed using any of a variety of fluorophores, fluorescence imaging techniques, and fluorescence imaging instruments known to those of skill in the art. Examples of suitable fluorescence dyes that may be used (e.g., by conjugation to nucleotides, oligonucleotides, or proteins) include, but are not limited to, fluorescein, rhodamine, coumarin, cyanine, and derivatives thereof, including the cyanine derivatives Cyanine dye-3 (Cy3), Cyanine dye-5 (Cy5), Cyanine dye-7 (Cy7), etc. Examples of fluorescence imaging techniques that may be used include, but are not limited to, fluorescence microscopy imaging, fluorescence confocal imaging, two-photon fluorescence, and the like. Examples of fluorescence imaging instruments that may be used include, but are not limited to, fluorescence microscopes equipped with an image sensor or camera, confocal fluorescence microscopes, two-photon fluorescence microscopes, or custom instruments that comprise a suitable selection of light sources, lenses, mirrors, prisms, dichroic reflectors, apertures, and image sensors or cameras, etc. A non-limiting example of a fluorescence microscope equipped for acquiring images of the disclosed low-binding support surfaces and clonally-amplified colonies (polonies) of template nucleic acid sequences hybridized thereon is the Olympus IX83 inverted fluorescence microscope equipped with a 20x, 0.75 NA objective, a 532 nm light source, a bandpass and dichroic mirror filter set optimized for 532 nm long-pass excitation and Cy3 fluorescence emission filter, a Semrock 532 nm dichroic reflector, and a camera (Andor sCMOS, Zyla 4.2) where the excitation light intensity is adjusted to avoid signal saturation. Often, the support surface may be immersed in a buffer (e.g., 25 mM ACES, pH 7.4 buffer) while the image is acquired.
[00862] In some instances, the performance of nucleic acid hybridization and/or amplification reactions using the disclosed reaction formulations and low non-specific binding supports may be assessed using fluorescence imaging techniques, where the contrast-to-noise ratio (CNR) of the images provides a key metric in assessing amplification specificity and non-specific binding on the support. CNR is commonly defined as: CNR = (Signal - Background) / Noise. The background term is commonly taken to be the signal measured for the interstitial regions surrounding a particular feature (diffraction limited spot, DLS) in a specified region of interest (ROI). While signal-to-noise ratio (SNR) is often considered to be a benchmark of overall signal quality, it can be shown that improved CNR can provide a significant advantage over SNR as a benchmark for signal quality in applications that require rapid image capture (e.g., sequencing applications for which cycle times must be minimized), as shown in the example below. The surfaces of the instant disclosure are also provided in co-pending International Application Serial No. PCT/US2019/061556, which is hereby incorporated by reference in its entirety.
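The CNR definition given above, CNR = (Signal - Background) / Noise, can be illustrated with a minimal sketch. The disclosure does not specify how the noise term is estimated; the sketch assumes, purely for illustration, that noise is the sample standard deviation of the interstitial pixel intensities, that signal is the mean intensity within the feature, and that background is the mean interstitial intensity. The function name and pixel values are hypothetical.

```python
from statistics import mean, stdev


def contrast_to_noise(feature_pixels, interstitial_pixels):
    """CNR = (Signal - Background) / Noise for one feature in an ROI.

    Signal: mean intensity inside the feature.
    Background: mean intensity of the surrounding interstitial pixels.
    Noise: sample standard deviation of the interstitial pixels (an
    assumption made for this sketch).
    """
    signal = mean(feature_pixels)
    background = mean(interstitial_pixels)
    noise = stdev(interstitial_pixels)
    if noise == 0:
        raise ValueError("interstitial noise is zero; CNR is undefined")
    return (signal - background) / noise


# Hypothetical pixel intensities for one feature and its surroundings.
feature = [520, 540, 500, 520]
interstitial = [98, 102, 100, 100]
cnr = contrast_to_noise(feature, interstitial)
print(f"CNR = {cnr:.1f}")
```

Other noise estimators (for example, one that also includes the variance within the feature) would give different absolute values, so CNR figures are only comparable when the same estimator and imaging conditions are used.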
[00863] In most ensemble-based sequencing approaches, the background term is typically measured as the signal associated with “interstitial” regions. In addition to “interstitial” background (Binter), “intrastitial” background (Bintra) exists within the region occupied by an amplified DNA colony. The combination of these two background signals dictates the achievable CNR, and subsequently directly impacts the optical instrument requirements, architecture costs, reagent costs, run-times, cost/genome, and ultimately the accuracy and data quality for cyclic array-based sequencing applications. The Binter background signal arises from a variety of sources; a few examples include auto-fluorescence from consumable flow cells, non-specific adsorption of detection molecules that yield spurious fluorescence signals that may obscure the signal from the ROI, and the presence of non-specific DNA amplification products (e.g., those arising from primer dimers). In typical next generation sequencing (NGS) applications, this background signal in the current field-of-view (FOV) is averaged over time and subtracted. The signal arising from individual DNA colonies (i.e., (S) - Binter in the FOV) yields a discernable feature that can be classified. In some instances, the intrastitial background (Bintra) can contribute a confounding fluorescence signal that is not specific to the target of interest, but is present in the same ROI, thus making it far more difficult to average and subtract.
[00864] The implementation of nucleic acid amplification on the low-binding substrates of the present disclosure may decrease the Binter background signal by reducing non-specific binding, may lead to improvements in specific nucleic acid amplification, and may lead to a decrease in non-specific amplification that can impact the background signal arising from both the interstitial and intrastitial regions. In some instances, the disclosed low-binding support surfaces, optionally used in combination with the disclosed hybridization buffer formulations, may lead to improvements in CNR by a factor of 2, 5, 10, 100, or 1000-fold over those achieved using conventional supports and hybridization, amplification, and/or sequencing protocols. Although described here in the context of using fluorescence imaging as the read-out or detection mode, the same principles apply to the use of the disclosed low non-specific binding supports and nucleic acid hybridization and amplification formulations for other detection modes as well, including both optical and non-optical detection modes.
[00865] The disclosed low-binding supports, optionally used in combination with the disclosed hybridization and/or amplification protocols, yield solid-phase reactions that exhibit: (i) negligible non-specific binding of protein and other reaction components (thus minimizing substrate background), (ii) negligible non-specific nucleic acid amplification product, and (iii) tunable nucleic acid amplification.
[00866] In some embodiments, fluorescence images of the disclosed low background surfaces when used in nucleic acid hybridization or amplification applications to create polonies of hybridized or clonally-amplified nucleic acid molecules (e.g., that have been directly or indirectly labeled with a fluorophore) exhibit contrast-to-noise ratios (CNRs) of at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, or greater than 250.
[00867] In some embodiments, a fluorescence image of the surface exhibits a contrast-to-noise ratio (CNR) of at least 20 when a sample nucleic acid molecule or complementary sequences thereof are labeled with a Cyanine dye-3 (Cy3) fluorophore, and when the fluorescence image is acquired using an inverted fluorescence microscope (e.g., Olympus IX83) with a 20x, 0.75 NA objective, a 532 nm light source, a bandpass and dichroic mirror filter set optimized for 532 nm excitation and Cy3 fluorescence emission, and a camera (e.g., Andor sCMOS, Zyla 4.2) under non-signal saturating conditions while the surface is immersed in a buffer (e.g., 25 mM ACES, pH 7.4 buffer).
Sample Indexes for Improved Base Calling
[00868] Generally, it is desirable to prepare nucleic acid libraries that will be distributed onto a support (e.g., a coated flow cell), where the library molecules are converted into template molecules that are immobilized at a high density to the support for massively parallel sequencing. For template molecules that are immobilized at high densities at random locations on the support, resolving high density fluorescent images for accurate base calling during sequencing runs becomes challenging.
[00869] The nucleotide diversity of a population of immobilized template molecules refers to the relative proportion of nucleotides A, G, C and T that are present in each sequencing cycle. An optimal high diversity library will generally include sequence-of-interest (insert) regions having approximately equal proportions of all four nucleotides represented in each cycle of a sequencing run. A low diversity library will generally include sequence-of-interest (insert) regions having a high proportion of certain nucleotides and a low proportion of other nucleotides. To overcome the problem of low diversity libraries, a small amount of a high diversity library prepared from PhiX bacteriophage (e.g., a PhiX spike-in library) is typically mixed with the library-of-interest and sequenced together on the same flow cell. While the PhiX spike-in library provides nucleotide diversity, it also occupies space on the flow cell, thereby displacing the target libraries carrying the sequence-of-interest and reducing the amount of sequencing data obtainable from the target libraries (e.g., reducing sequencing throughput). Another method to overcome the problem of low diversity libraries is to prepare library molecules having at least one sample index sequence that is designed to be color-balanced. However, it may be desirable to design a large number of sample index sets, for example a set of single index sample sequences or paired index sample sequences for 16-plex, 24-plex, 96-plex or larger plexy levels. It is challenging to design sample index sequences, as single or paired sample indexes, for large sample index sets where all of the sample index sequences are color-balanced (e.g., see FIG. 39).
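The notion of nucleotide diversity described above, the relative proportion of A, C, G and T observed at each sequencing cycle, can be sketched as a per-cycle base composition calculation. The function name and the example reads are hypothetical and serve only to contrast a cycle with no diversity against a balanced cycle.

```python
from collections import Counter


def per_cycle_composition(reads):
    """Return, for each cycle, the fraction of reads showing A, C, G and T."""
    n_cycles = min(len(read) for read in reads)
    composition = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads)
        total = sum(counts.values())
        composition.append({base: counts.get(base, 0) / total
                            for base in "ACGT"})
    return composition


# Hypothetical low diversity library: every read starts with "ACG", so the
# first three cycles each show a single base; only cycle 4 is balanced.
low_diversity_reads = ["ACGT", "ACGA", "ACGC", "ACGG"]
composition = per_cycle_composition(low_diversity_reads)
for cycle, fractions in enumerate(composition, start=1):
    print(f"cycle {cycle}: {fractions}")
```

A cycle in which one base accounts for nearly all reads produces a signal in essentially one color channel, which is the situation that spike-in libraries or color-balanced indexes are intended to avoid.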
[00870] An alternative method to overcome the challenges of sequencing low diversity library molecules (e.g., at high density on the support) is (1) to prepare libraries having at least one sample index sequence (e.g., (170) or (160)) that can be a batch sample index comprising a short random sequence (e.g., NNN) linked directly to a universal sample index sequence, where the short random sequence provides nucleotide diversity and color balance, and/or (2) to prepare libraries having at least one batch barcode sequence (e.g., (195)) comprising a short random sequence (e.g., NNN) linked directly to the batch barcode sequence, where the short random sequence provides nucleotide diversity and color balance. In a population of library molecules, each molecule comprising a batch sample index sequence (e.g., (170) or (160)) and/or a batch barcode sequence (e.g., (195)), the short random sequence (e.g., NNN) provides high nucleotide diversity, which includes approximately equal proportions of all four nucleotides (e.g., A, G, C, T and/or U) that will be represented in each cycle of a sequencing run (see FIG. 38). The high nucleotide diversity of the short random sequence also provides color balance during each cycle of the sequencing run. The advantage of designing batch sample indexes (e.g., (170) or (160)) and batch barcode sequences (e.g., (195)) to include a short random sequence (e.g., NNN) is that, in a low-plexy population of library molecules (e.g., 2-plex or 4-plex), the universal sample index sequences that identify the two or four different samples need not exhibit nucleotide diversity (e.g., see FIG. 39). Additionally, the nucleotide diversity of the short random sequence (e.g., NNN) can obviate the need to include a PhiX spike-in library, or can permit a reduced amount of PhiX spike-in library to be distributed onto the flow cell and sequenced.
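The effect of linking a short random sequence (e.g., NNN) to a universal sample index can be sketched as follows. The sketch makes several assumptions that are not stated in the disclosure: the universal index sequence shown is hypothetical, the random sequence is placed so that it is read before the universal index, and the population is idealized as an equimolar pool of all 64 possible NNN sequences.

```python
from collections import Counter
from itertools import product

# Hypothetical universal sample index, chosen to have no diversity of its own.
UNIVERSAL_INDEX = "GGTTGGTT"


def batch_sample_indexes(universal_index, n_random=3):
    """All batch indexes formed as a random N-mer followed by the index."""
    return ["".join(nmer) + universal_index
            for nmer in product("ACGT", repeat=n_random)]


def base_fractions(sequences, cycle):
    """Fraction of sequences showing each base at a zero-based cycle."""
    counts = Counter(seq[cycle] for seq in sequences)
    return {base: counts.get(base, 0) / len(sequences) for base in "ACGT"}


pool = batch_sample_indexes(UNIVERSAL_INDEX)
random_cycles = [base_fractions(pool, cycle) for cycle in range(3)]
first_index_cycle = base_fractions(pool, 3)

print("NNN cycles:", random_cycles)
print("first universal index cycle:", first_index_cycle)
```

Under these assumptions the three random-sequence cycles each show all four bases in equal proportion, while the following universal index cycle shows a single base, which is consistent with the statement above that the universal sample index itself need not exhibit nucleotide diversity.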
[00871] The library molecule can include at least one batch sample index sequence (e.g., (170) or (160)) and/or a batch barcode sequence (e.g., (195)) which include a short random sequence (e.g., NNN). In some embodiments, the sequencing data from the sample index sequence (e.g., (170) or (160)) and/or from the batch barcode sequence (e.g., (195)) can be used for polony mapping and template registration because the short random sequence (e.g., NNN) provides sufficient nucleotide diversity and color balance. The sequencing data from the sample index sequence (e.g., (170) and/or (160)), which can be a universal sample index sequence, can be used to distinguish sequences of interest obtained from different sample sources in a multiplex assay.
[00872] In some embodiments, the library molecule comprises two batch sample index sequences (e.g., (170) and (160)). In some embodiments, the sequencing data from only one of the sample index sequences (e.g., (170) or (160)) can be used for polony mapping and/or template registration because the short random sequence provides sufficient nucleotide diversity and color balance. The sequencing data from the right sample index sequence (e.g., (170)) and the left sample index sequence (e.g., (160)), which can be a universal sample
index sequence, can be used as dual sample indexes to distinguish sequences of interest obtained from different sample sources in a multiplex assay. In some embodiments, the second sample index sequence (e.g., (160)) may or may not include a second short random sequence (e.g., NNN).
[00873] The order of sequencing the sequence-of-interest region and the sample index region(s) can also be used to address the challenges of sequencing low diversity library molecules. For example, the batch sample index region (e.g., (170) or (160)) can be sequenced first before sequencing the sequence-of-interest region, and the batch sample index sequence (e.g., (170) or (160)) can be associated with the sequence-of-interest region. For example, the batch sample index sequence (e.g., (170) or (160)) can be sequenced first including sequencing the short random sequence (e.g., NNN) and optionally sequencing at least a portion of the universal sample index, and then sequencing the sequence-of-interest region. In some embodiments, the batch barcode sequence (e.g., (195)) can be sequenced first before sequencing the sequence-of-interest region, and the batch barcode sequence (e.g., (195)) can be associated with the sequence-of-interest region. For example, the batch barcode sequence (e.g., (195)) can be sequenced first including sequencing the short random sequence (e.g., NNN) and optionally sequencing at least a portion of the batch barcode, and then sequencing the sequence-of-interest region. In a population of library molecules, the short random sequence (e.g., NNN) provides nucleotide diversity which may not be provided by the sequence-of-interest regions of the library molecules. The short random sequence (e.g., NNN) provides improved nucleotide diversity and color balance for polony mapping and template registration.
[00874] Additionally, when sequencing the batch sample index region first, the length of the sequenced batch sample index region is relatively short (e.g., less than 30 nucleotides in length) so that de-hybridization of the product of the sequenced batch sample index region is more complete. In some embodiments, when sequencing the batch barcode region first, the length of the sequenced batch barcode region is relatively short (e.g., less than 30 nucleotides in length) so that de-hybridization of the product of the sequenced batch barcode region is more complete. Gentler de-hybridization conditions can be used to remove most or all of the product of the sequenced batch sample index region and batch barcode region which reduces the level of residual signals from any sequencing products remaining hybridized to the template molecules. By contrast, the sequence-of-interest region is typically much longer than the batch sample index region and the batch barcode region (e.g., more than 100 nucleotides in length). When the sequence-of-interest region is sequenced before the batch
sample index region and the batch barcode region, the product of the sequenced sequence-of-interest region must be subjected to harsher de-hybridization conditions to remove any products remaining hybridized to the template molecules, and these harsher conditions may damage the template molecules.
[00875] The present disclosure provides linear single stranded library molecules (100) each comprising at least one batch sample index sequence that can be used to distinguish sequences of interest obtained from different sample sources in a multiplex assay, where the at least one batch sample index sequence comprises a short random sequence (e.g., NNN) linked to a universal sample index sequence. In some embodiments, the left sample index (160) (e.g., left batch sample index) comprises a short random sequence (e.g., NNN) linked to a universal left sample index sequence (160) and/or the batch right sample index sequence (170) comprises a short random sequence (e.g., NNN) linked to a right universal sample index sequence. The at least one batch sample index sequence can include sequence diversity for improved base calling. The at least one sample index sequence can be used to improve base calling accuracy.
[00876] The present disclosure provides linear single stranded library molecules (100) each comprising at least one batch barcode sequence (195), where the at least one batch barcode sequence comprises a short random sequence (e.g., NNN) linked to the batch barcode sequence. The at least one batch barcode sequence can include sequence diversity for improved base calling. The at least one batch barcode sequence can be used to improve base calling accuracy.
[00877] In some embodiments, the short random sequence (e.g., NNN) is positioned upstream of the sample index sequence (e.g., (170) and/or (160)) so that during a sequencing run the random sequence portion is sequenced before the universal sample index sequence. In some embodiments, the short random sequence is positioned downstream of the universal sample index sequence so that during a sequencing run the random portion is sequenced after the universal sample index sequence.
[00878] In some embodiments, the short random sequence (e.g., NNN) is positioned upstream of the batch barcode sequence (e.g., (195)) so that during a sequencing run the random sequence portion is sequenced before the batch barcode sequence. In some embodiments, the short random sequence is positioned downstream of the batch barcode sequence so that during a sequencing run the random portion is sequenced after the batch barcode sequence.
[00879] In some embodiments, in the random sequence each base “N” at a given position is independently selected from A, G, C, T or U. In some embodiments, the random sequence lacks consecutive repeat sequences having 2 or 3 of the same nucleo-base, for example AA, TT, CC, GG, UU, AAA, TTT, CCC, GGG or UUU. In some embodiments, in a population of library molecules the sample index sequences (e.g., (170) and/or (160)) include a short random sequence having a high diversity sequence which includes approximately equal proportions of all four nucleotides (e.g., A, G, C, T and/or U) that will be represented in each cycle of a sequencing run. In some embodiments, in a population of library molecules the batch barcode sequences (e.g., (195)) include a short random sequence having a high diversity sequence which includes approximately equal proportions of all four nucleotides (e.g., A, G, C, T and/or U) that will be represented in each cycle of a sequencing run.
[00880] In some embodiments, the short random sequence (e.g., NNN) comprises 3-20 nucleotides, or 3-10 nucleotides, or 3-8 nucleotides, or 3-6 nucleotides, or 3-5 nucleotides, or 3-4 nucleotides.
[00881] In some embodiments, the short random sequence (e.g., NNN) includes, but is not limited to, AGC, AGT, GAC, GAT, CAT, CAG, TAG, TAC. The skilled artisan will recognize that many more random sequences can be prepared (e.g., 64 possible combinations) where each base “N” at a given position in the random sequence is independently selected from A, G, C, T or U.
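The count of 64 possible 3-mers, and the subset that lacks consecutive repeats of the same nucleo-base, can be illustrated with a minimal Python sketch. The function names are hypothetical and the sketch assumes a DNA alphabet (A, C, G, T) and a random sequence of three positions; it is offered only as an illustration and is not part of the disclosed methods.

```python
from itertools import product

BASES = "ACGT"  # assumption: DNA alphabet; substitute U for T for RNA


def all_random_kmers(k=3):
    """Enumerate every sequence of length k over the four bases."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def lacks_adjacent_repeat(seq):
    """True when no two neighboring positions carry the same base."""
    return all(a != b for a, b in zip(seq, seq[1:]))


kmers = all_random_kmers(3)
no_repeat = [s for s in kmers if lacks_adjacent_repeat(s)]

print(len(kmers))      # all possible NNN sequences
print(len(no_repeat))  # NNN sequences without AA, CC, GG, TT (or longer runs)
```

Under these assumptions the exemplary sequences AGC, AGT, GAC, GAT, CAT, CAG, TAG and TAC all fall within the repeat-free subset.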
[00882] In some embodiments, the universal sample index sequence comprises 5-20 nucleotides, or 7-18 nucleotides, or 9-16 nucleotides.
[00883] In some embodiments, in a population of library molecules the short random sequence (e.g., NNN) has an overall base composition of about 25% or about 20-30% of all four nucleotide bases (e.g., A, G, C and T/U) to provide nucleotide diversity at each sequencing cycle during sequencing the short random sequence (e.g., NNN).
[00884] In some embodiments, in the population of library molecules the proportion of adenine (A) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of guanine (G) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of cytosine (C) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%. In some embodiments, the proportion of thymine (T) or uracil (U) at any given position in the short random sequence is about 20-30% or about 15-35% or about 10-40%.
[00885] In some embodiments, in the population of library molecules the proportion of adenine (A) and thymine (T), or the proportion of adenine (A) and uracil (U), at any given position in the short random sequence is about 10-65%. In some embodiments, the proportion of guanine (G) and cytosine (C) at any given position in the short random sequence is about 10-65%.
[00886] In some embodiments, in the population of library molecules the sequence diversity of the short random sequences ensures that no sequencing cycle is presented with fewer than four different nucleotide bases during sequencing at least the short random sequence (e.g., NNN).
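The per-position base proportions described above can be checked with a short Python sketch. The helper names and the example populations are hypothetical; the 20-30% window is taken from the ranges recited above, and the sketch assumes all sequences in the population have the same length.

```python
from collections import Counter

BASES = "ACGT"  # assumption: DNA alphabet


def position_composition(seqs):
    """Return, for each position (sequencing cycle), the fraction of each base."""
    comps = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        comps.append({b: counts.get(b, 0) / total for b in BASES})
    return comps


def is_balanced(seqs, low=0.20, high=0.30):
    """True when every base falls within [low, high] at every position."""
    return all(
        low <= frac <= high
        for comp in position_composition(seqs)
        for frac in comp.values()
    )


def all_four_bases_each_cycle(seqs):
    """True when no cycle is presented with fewer than four different bases."""
    return all(all(frac > 0 for frac in comp.values())
               for comp in position_composition(seqs))


balanced_population = ["ACG", "CGT", "GTA", "TAC"]  # hypothetical example
low_diversity_population = ["AAA", "AAC"]           # hypothetical example

print(is_balanced(balanced_population))
print(all_four_bases_each_cycle(low_diversity_population))
```

In the first hypothetical population each base appears once at each position, so each proportion is 25%; the second population presents only one or two bases per cycle and fails both checks.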
[00887] Exemplary batch sample index sequences (e.g., (170) and/or (160)) that include a short random sequence NNN linked directly to a universal sample index sequence include but are not limited to: NNNGTAGGAGCC; NNNCCGCTGCTA; NNNAACAACAAG; NNNGGTGGTCTA; NNNTTGGCCAAC; NNNCAGGAGTGC; and NNNATCACACTA. The skilled artisan will recognize that the universal sample index can be any length and have any sequence that can be used to distinguish sequences of interest obtained from different sample sources in a multiplex assay. In a population of a given sample index, for example NNNGTAGGAGCC, the population contains a mixture of individual sample index molecules each carrying the same universal sample index sequence (e.g., GTAGGAGCC) and a different short random sequence (e.g., NNN) where up to 64 different short random sequences may be present in the population of the given sample index. In some embodiments, the batch barcode sequences can be designed in the same manner as the batch sample index sequences.
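As an illustration of how a batch sample index read of this form might be interpreted, the following Python sketch splits a read into its short random portion and its universal sample index portion. It assumes the random sequence is three bases long and positioned upstream of a 9-base universal sample index, as in the exemplary sequences above; the sample labels and the function name are hypothetical.

```python
RANDOM_LEN = 3  # assumption: NNN is a 3-mer sequenced before the universal index
INDEX_LEN = 9   # length of the exemplary universal sample index sequences

# Exemplary universal sample index sequences from the text; labels are hypothetical.
UNIVERSAL_INDEXES = {
    "GTAGGAGCC": "sample_1",
    "CCGCTGCTA": "sample_2",
    "AACAACAAG": "sample_3",
    "GGTGGTCTA": "sample_4",
    "TTGGCCAAC": "sample_5",
    "CAGGAGTGC": "sample_6",
    "ATCACACTA": "sample_7",
}


def split_batch_index(read):
    """Return (random_portion, sample_label); label is None if the index is unknown."""
    random_portion = read[:RANDOM_LEN]
    universal = read[RANDOM_LEN:RANDOM_LEN + INDEX_LEN]
    return random_portion, UNIVERSAL_INDEXES.get(universal)


# Two molecules with different random portions resolve to the same sample.
print(split_batch_index("AGCGTAGGAGCC"))
print(split_batch_index("TACGTAGGAGCC"))
```

The random portion contributes diversity during sequencing but is ignored when assigning the sample, so up to 64 distinct reads map to one universal sample index.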
[00888] In some embodiments, a sequencing reaction includes use of polymerases and nucleotides (e.g., nucleotide analogs) that are labeled with a different fluorophore that corresponds to the nucleo-base. In some embodiments, sequencing the short random sequence (e.g., NNN) using labeled nucleotides provides a balanced ratio of fluorescent colors that correspond to the nucleo-bases adenine, cytosine, guanine, thymine and/or uracil in each cycle of a sequencing run. In some embodiments, sequencing the short random sequence (e.g., NNN) and at least a portion of the universal sample index sequence using labeled nucleotides provides a balanced ratio of fluorescent colors that correspond to nucleo-bases adenine, cytosine, guanine, thymine and/or uracil (e.g., see FIG. 38). In some embodiments, sequencing the short random sequence (e.g., NNN) and at least a portion of the batch barcode sequence using labeled nucleotides provides a balanced ratio of fluorescent colors that correspond to nucleo-bases adenine, cytosine, guanine, thymine and/or uracil. The
labeled nucleotides emit fluorescent signals during the sequencing reactions. In some embodiments, the sequencing reaction is conducted on a sequencing apparatus having a detector that captures fluorescent images from sequencing reactions on the immobilized template molecules. The sequencing apparatus can be configured to relay the fluorescent imaging data captured by the detector to a computer system that is programmed to determine the location (e.g., mapping) of the immobilized template molecules on the flow cell. The computer system can generate a map of the locations of the immobilized template molecules based on the fluorescent imaging data of only the random sequence (e.g., NNN), or based on the random sequence (e.g., NNN) and at least a portion of the universal sample index sequence and/or the batch barcode sequence. Thus the small number of sequencing cycles used to sequence the random sequence (e.g., NNN) and optionally a portion of the universal sample index sequence and/or optionally a portion of the batch barcode sequence can be used to generate a map of the location of the immobilized template molecules. The computer system can be configured to extract the fluorescent color and intensity of only the random sequence (e.g., NNN), or from the random sequence (e.g., NNN) and at least a portion of the universal sample index sequence, or the random sequence (e.g., NNN) and at least a portion of the batch barcode sequence. The computer system can be configured to use the location of a given immobilized template molecule and the fluorescent color and intensity associated with the given template molecule (which were established while sequencing the random sequence) for base calling while sequencing the insert region (110). The computer system can be configured to detect phasing and pre-phasing while sequencing the random sequence (e.g., NNN) and the universal sample index sequence, and the insert region (110).
In some embodiments, the balanced ratio of fluorescent colors provided by the random sequence (e.g., NNN) at each sequencing cycle can improve the quality of the data which is processed from the fluorescent images captured by the detector, and can in turn improve the capability by the computer system to determine the location of the immobilized template molecules on the flow cell, and the color and intensity, all of which can improve base calling accuracy and quality scores of the sequenced insert region (110).
[00889] In some embodiments, a sequencing reaction includes use of polymerases and multivalent molecules that are labeled with a different fluorophore that corresponds to the nucleo-base (e.g., adenine, guanine, cytosine, thymine or uracil) of the nucleotide units that are attached to the nucleotide arms in a given multivalent molecule. In some embodiments, the core of individual multivalent molecules is attached to a fluorophore which corresponds to the nucleotide units (e.g., adenine, guanine, cytosine, thymine or uracil) that are attached to
the nucleotide arms in a given multivalent molecule (e.g., see FIGs. 1-4). In some embodiments, at least one of the nucleotide arms of the multivalent molecule comprises a linker and/or nucleotide base that is attached to a fluorophore. In some embodiments, the fluorophore which is attached to a given linker or nucleotide base corresponds to the nucleotide base (e.g., adenine, guanine, cytosine, thymine or uracil) of the nucleotide arm. In some embodiments, sequencing the random sequence (e.g., NNN) by conducting the two-stage sequencing method using labeled multivalent molecules provides a balanced ratio of fluorescent colors that correspond to the nucleo-bases adenine, cytosine, guanine, thymine and/or uracil in each cycle of a sequencing run. In some embodiments, sequencing the random sequence (e.g., NNN) and at least a portion of the universal sample index sequence and/or at least a portion of the batch barcode sequence using labeled multivalent molecules provides a balanced ratio of fluorescent colors that correspond to nucleo-bases adenine, cytosine, guanine, thymine and/or uracil (e.g., see FIG. 38). The labeled multivalent molecules emit fluorescent signals during the sequencing reactions. In some embodiments, the sequencing reaction is conducted on a sequencing apparatus having a detector that captures fluorescent images from sequencing reactions on the immobilized template molecules. The sequencing apparatus can be configured to relay the fluorescent imaging data captured by the detector to a computer system that is programmed to determine the location (e.g., mapping) of the immobilized template molecules (polonies) on the flow cell.
The computer system can generate a map of the locations of the immobilized template molecules based on the fluorescent imaging data of only the random sequence (e.g., NNN), or based on the random sequence (e.g., NNN) and at least a portion of the universal sample index sequence and/or at least a portion of the batch barcode sequence. Thus the small number of sequencing cycles used to sequence the random sequence (e.g., NNN) and optionally a portion of the universal sample index sequence can be used to generate a map of the location of the immobilized template molecules. The computer system can be configured to extract the fluorescent color and intensity of only the random sequence (e.g., NNN), or from the random sequence (e.g., NNN) and the universal sample index sequence and/or batch barcode sequence. The computer system can be configured to use the location of a given immobilized template molecule and the fluorescent color and intensity associated with the given template molecule (which were established while sequencing the random sequence) for base calling while sequencing the insert region (110). The computer system can be configured to detect phasing and pre-phasing while sequencing the random sequence (e.g., NNN) and the universal sample index sequence, and the insert region (110). In some embodiments, the
balanced ratio of fluorescent colors provided by the random sequence (e.g., NNN) at each sequencing cycle can improve the quality of the data which is processed from the fluorescent images captured by the detector, and can in turn improve the capability by the computer system to determine the location of the immobilized template molecules on the flow cell, and the color and intensity, all of which can improve base calling accuracy and quality scores of the sequenced insert region (110).
EXAMPLES
[00890] The following examples are meant to be illustrative and can be used to further understand embodiments of the present disclosure and should not be construed as limiting the scope of the present teachings in any way.
Example 1: Two-Plex Batch Sequencing of Concatemer Template Molecules Prepared with Single-stranded Splint or Double-Stranded Splint Adaptors
[00891] Covalently closed circular libraries containing phiX insert regions as the sequence of interest were prepared by hybridizing linear library molecules to either single-stranded splints (e.g., FIG. 21) or double-stranded adaptors (e.g., FIG. 27).
[00892] For the single-stranded splint workflow: linear single stranded library molecules (100) included the following components (e.g., FIG. 21): (i) a surface pinning primer binding site sequence (120); (ii) a left sample index sequence (160); (iii) a forward sequencing primer binding site sequence (140) (ss-Splint sequencing primer, SEQ ID NO: 1); (iv) a sequence of interest (e.g., an insert sequence) (110); (v) a reverse sequencing primer binding site sequence (150); (vi) a right sample index sequence (170); and (vii) a surface capture primer binding site sequence (130). The single-stranded library molecules were hybridized with single-stranded splint strands (200) to generate library-splint complexes (300) having one nick. The nick was ligated to generate covalently closed circular library molecules (400) as exemplified in FIG. 24A. The single-stranded splint strand (200) was removed enzymatically. The forward sequencing primer binding site sequence (140) was a ss-Splint sequencing primer having the sequence:
5’- CGTGCTGGATTGGCTCACCAGACACCTTCCGACAT -3’ (SEQ ID NO: 1).
[00893] For the double-stranded splint workflow: linear single stranded library molecules (100) included the following components (e.g., FIG. 27): (i) a surface pinning primer binding site sequence (120); (ii) a left sample index sequence (160); (iii) a forward sequencing primer binding site sequence (140) (ss-Splint sequencing primer); (iv) a sequence of interest (e.g., an
insert sequence) (110); (v) a reverse sequencing primer binding site sequence (150); (vi) a right sample index sequence (170); and (vii) a surface capture primer binding site sequence (130). The single-stranded library molecules were hybridized with double-stranded splint adaptors (500) to generate library-splint complexes (800) having two nicks. The nicks were ligated to generate covalently closed circular library molecules (900) as exemplified in FIG. 31A. The single stranded splint strand (600) was removed enzymatically. The forward sequencing primer binding site sequence (140) was TruSeq (HP10) having the sequence 5’- ACACTCTTTCCCTACACGACGCTCTTCCGATCT -3’ (SEQ ID NO: 2).
[00894] The two types of covalently closed circular library molecules (400) and (900) were mixed at a 1:1 ratio and 20 pM was distributed onto a flow cell having a plurality of surface capture and pinning primers immobilized thereon. The surface capture primer was designed to capture both types of covalently closed circular library molecules (e.g., (400) and (900)). The loaded covalently closed circular library molecules (400) and (900) were subjected to on-support rolling circle amplification using the immobilized surface capture primers as amplification primers, thereby generating two types of concatemer template molecules. The rolling circle amplification reaction was conducted in the presence of compaction oligonucleotides to generate compact DNA nanoballs. The pinning primer was designed to pin down both types of concatemer template molecules resulting from rolling circle amplification. In other experiments, 30 and 40 pM of covalently closed circular library molecules (400) and (900) were loaded onto a flow cell to increase the density of immobilized concatemer molecules.
[00895] A first batch sequencing reaction was conducted using the TruSeq (HP10) sequencing primer and the two-stage sequencing reaction. Thirty-one sequencing cycles were conducted and fluorescent images were acquired after reacting the concatemer template molecules with labeled multivalent molecules (FIG. 4) and non-labeled chain terminator nucleotides. After 31 sequencing cycles, the first batch sequencing read products were removed.
[00896] A second batch sequencing reaction was conducted using ss-Splint sequencing primer and the two-stage sequencing reaction. Thirty-one sequencing cycles were conducted and fluorescent images were acquired after reacting the concatemer template molecules with labeled multivalent molecules (exemplified in FIG. 4) and non-labeled chain terminator nucleotides. After 31 sequencing cycles, the second batch sequencing read products were removed.
[00897] Table 1 below shows the number of millions of reads, quality scores (%Q30), and percent error.
Table 1:
Example 2: Four-Plex and Eight-Plex Batch Sequencing of Concatemer Template Molecules Prepared with Single-Stranded Splints
[00898] Four sub-populations of linear single stranded library molecules (100) were prepared having a PhiX sequence as the sequence of interest where individual library molecules included: (i) a surface pinning primer binding site sequence (120), which can be universal; (ii) a left sample index sequence (160) (e.g., one of four different sample indexes which are 9 bases in length); (iii) a forward sequencing primer binding site sequence (140) (e.g., one of four different batch-specific forward sequencing primer binding site sequences); (iv) a sequence of interest (110) (e.g., PhiX); (v) a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); (vi) a right sample index sequence (170) (e.g., one of four different sample indexes having a random sequence 3-mer (e.g., NNN) and a 9-base sample index sequence); and (vii) a surface capture primer binding site sequence (130) that is a universal sequence. The linear library molecules did not include a unique molecular index (UMI). The arrangement of the linear single stranded library molecules (100) is shown in FIG. 21.
[00899] Universal single-stranded splint strands (200) were prepared having: (i) a first region (210) that hybridizes with the surface pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and (ii) a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the single-stranded library molecule (100). The arrangement of the universal single-stranded splint strands (200) is shown in FIG. 21.
[00900] In four separate reactions, the linear single stranded library molecules (100) were hybridized with universal single-stranded splint strands (200) to generate four sub-populations of library-splint complexes (300) with a nick (e.g., see FIG. 21). The library-splint complexes (300) in the four sub-populations carried one of four different forward sequencing primer binding site sequences.
[00901] The library-splint complexes (300) were subjected to separate ligation reactions to generate four sub-populations of covalently closed circular library molecules (400) where individual covalently closed circular library molecules included: (i) a surface pinning primer binding site sequence (120) which can be universal; (ii) a left sample index sequence (160) (e.g., one of four different sample indexes which are 9 bases in length); (iii) a forward sequencing primer binding site sequence (140) (e.g., one of four different batch-specific forward sequencing primer binding site sequences); (iv) a sequence of interest (110) (e.g., PhiX); (v) a reverse sequencing primer binding site sequence (150) (e.g., batch-specific reverse sequencing primer binding site sequence); (vi) a right sample index sequence (170) (e.g., one of four different sample indexes having a random sequence 3-mer (e.g., NNN) and a 9 base sample index sequence); and (vii) a surface capture primer binding site sequence (130) that is a universal sequence.
[00902] The four sub-populations of covalently closed circular library molecules (400) were mixed at a 1:1:1:1 ratio and 200 pM of the mixture was distributed onto a flow cell having a plurality of universal surface capture and pinning primers immobilized thereon. The loaded covalently closed circular library molecules (400) were subjected to on-support rolling circle amplification using the immobilized surface capture primers as amplification primers, thereby generating four sub-populations of concatemer template molecules, where the concatemers in the different sub-populations carried one of four different forward sequencing primer binding site sequences. The rolling circle amplification reaction was conducted in the presence of compaction oligonucleotides to generate compact concatemers (e.g., DNA nanoballs; polonies).
[00903] The density of each sub-population of polonies on the flow cell was measured to be about 400K/mm2 to about 450K/mm2, resulting in a total polony density of about 1600K/mm2.
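The relationship between the per-sub-population density and the total polony density can be illustrated with a small Python sketch. The function name is hypothetical and the sketch simply assumes that the total density is the sum of equally represented sub-populations; densities are expressed in thousands of polonies per mm2.

```python
def total_density_range(per_subpop_low_k, per_subpop_high_k, n_subpopulations):
    """Return the (low, high) total density, in K/mm2, for n equal sub-populations."""
    return (per_subpop_low_k * n_subpopulations,
            per_subpop_high_k * n_subpopulations)


# Four-plex values from this example: about 400K/mm2 to about 450K/mm2 each.
low_k, high_k = total_density_range(400, 450, 4)
print(low_k, high_k)
```

Under this assumption the lower bound of the computed range corresponds to the reported total of about 1600K/mm2.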
[00904] A first batch sequencing reaction was conducted using a first batch forward sequencing primer and the two-stage sequencing reaction to sequence the PhiX insert region. Thirty-one sequencing cycles were conducted and fluorescent images were acquired after reacting the concatemer template molecules with labeled multivalent molecules (e.g., FIG. 4) and non-labeled chain terminator nucleotides. After 31 sequencing cycles, the first batch sequencing read products were removed.
[00905] A second batch sequencing reaction was conducted using a second batch forward sequencing primer and the two-stage sequencing reaction to sequence the PhiX insert region. Thirty-one sequencing cycles were conducted and fluorescent images were acquired after
reacting the concatemer template molecules with labeled multivalent molecules (e.g., FIG. 4) and non-labeled chain terminator nucleotides. After 31 sequencing cycles, the second batch sequencing read products were removed.
[00906] The sequencing reactions were repeated using the third and fourth batch forward sequencing primers as described above. The quality scores of the sequencing reads were determined to be about 96% at Q30 and 85% at Q40.
[00907] In a similar manner, an 8-plex library prep, circularization and sequencing workflow was conducted using eight sub-populations of libraries that were prepared using eight different batch-specific forward sequencing primer binding site sequences (140) and eight different batch-specific reverse sequencing primer binding site sequences (150). The sequences of the eight different forward sequencing primer binding site sequences (140) in the linear library molecules are listed in Table 2 below.
Table 2:
[00908] The eight sub-populations of covalently closed circular library molecules (400) were mixed at equal ratio (e.g., each at 8.5 pM, 12.5 pM or 25 pM) and the mixture was distributed onto a flow cell having a plurality of universal capture and pinning primers immobilized thereon. Thus, 68 pM, 100 pM or 200 pM of the covalently closed circular library molecules were loaded onto a flow cell. Rolling circle amplification reaction was conducted. The density of each sub-population of polonies on the flow cell was measured to be about 270K/mm2 to about 290K/mm2, resulting in a total polony density of about 2100K/mm2. Eight rounds of batch sequencing were conducted in a manner similar to that described above using eight different batch sequencing primers and conducting 31
sequencing cycles for each batch. The quality scores of the sequencing reads of the eight different sub-populations are listed in Table 3 below.
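The loading concentrations stated above follow from multiplying the per-sub-population concentration by the number of sub-populations. A minimal Python sketch of that arithmetic is shown here; the function name is hypothetical and concentrations are in pM.

```python
def total_loading_pm(per_library_pm, n_subpopulations):
    """Total concentration (pM) when n sub-populations are mixed at equal ratio."""
    return per_library_pm * n_subpopulations


# Per-sub-population concentrations from the eight-plex example.
totals = [total_loading_pm(c, 8) for c in (8.5, 12.5, 25)]
print(totals)
```

The three results correspond to the 68 pM, 100 pM and 200 pM total loading concentrations recited for the eight-plex mixture.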
Table 3:
Example 3: Eight-Plex Batch Sequencing of Concatemer Template Molecules Prepared with Single-Stranded Splints
[00909] Eight sub-populations of linear single stranded library molecules (100) were prepared having an E. coli sequence of interest (e.g., insert region) where individual library molecules included: (i) a universal pinning primer binding site sequence (120); (ii) a left sample index sequence (160) (e.g., one of eight different sample indexes which are 9 bases in length); (iii) a forward sequencing primer binding site sequence (140) (e.g., one of eight different batch-specific forward sequencing primer binding site sequences); (iv) a sequence of interest (110) (e.g., from E. coli); (v) a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); (vi) a right sample index sequence (170) (e.g., one of eight different sample indexes having a random sequence 3-mer (e.g., NNN) and a 9-base sample index sequence); and (vii) a surface capture primer binding site sequence (130) that is a universal sequence. The library molecules did not include a unique molecular index (UMI). The arrangement of the linear single stranded library molecules (100) is shown in FIG. 21. The sequences of the eight different forward sequencing primer binding site sequences (140) in the linear library molecules are listed in Table 4 below.
[00910] Universal single-stranded splint strands (200) were prepared having: (i) a first region (210) that hybridizes with the surface pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and (ii) a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the linear single-stranded
library molecule (100). The arrangement of the universal single-stranded splint strands (200) is shown in FIG. 21.
Table 4:
[00911] In eight separate reactions, the linear single stranded library molecules (100) were hybridized with universal single-stranded splint strands (200) to generate eight sub-populations of library-splint complexes (300) with a nick (e.g., see FIG. 21). The library-splint complexes (300) in the eight sub-populations carried one of eight different forward sequencing primer binding site sequences.
[00912] The library-splint complexes (300) were subjected to separate ligation reactions to generate eight sub-populations of covalently closed circular library molecules (400) where individual covalently closed circular library molecules included: (i) a surface pinning primer binding site sequence (120) which can be universal; (ii) a left sample index sequence (160) (e.g., one of eight different sample indexes which are 9 bases in length); (iii) a forward sequencing primer binding site sequence (140) (e.g., one of eight different batch-specific forward sequencing primer binding site sequences); (iv) a sequence of interest (110) (e.g., from E. coli); (v) a reverse sequencing primer binding site sequence (150) (e.g., a batch-specific reverse sequencing primer binding site sequence); (vi) a right sample index sequence (170) (e.g., one of eight different sample indexes having a random sequence 3-mer (e.g., NNN) and a 9-base sample index sequence); and (vii) a surface capture primer binding site sequence (130) that is a universal sequence.
[00913] Equal amounts of the eight sub-populations of covalently closed circular library molecules (400) were mixed together and 56 pM of the mixture was distributed onto a flow cell having a plurality of universal capture and pinning primers immobilized thereon. The loaded covalently closed circular library molecules (400) were subjected to on-support rolling circle amplification using the immobilized surface capture primers as amplification primers, thereby generating eight sub-populations of concatemer template molecules, where the concatemers in the different sub-populations carried one of eight different forward sequencing primer binding site sequences. The rolling circle amplification reaction was conducted in the presence of compaction oligonucleotides to generate compact concatemers (e.g., DNA nanoballs; polonies).
[00914] The density of each sub-population of polonies on the flow cell was measured to be about 200 K/mm² to about 450 K/mm², resulting in a total polony density of about 3,500 K/mm².
[00915] A first batch sequencing reaction was conducted using a first batch forward sequencing primer and the two-stage sequencing reaction to sequence the E. coli insert region. Thirty-one sequencing cycles were conducted and fluorescent images were acquired after reacting the concatemer template molecules with labeled multivalent molecules (e.g., FIG. 4) and non-labeled chain terminator nucleotides. After 31 sequencing cycles, the first batch sequencing read products were removed.
[00916] A second batch sequencing reaction was conducted using a second batch forward sequencing primer and the two-stage sequencing reaction to sequence the E. coli insert region. Thirty-one sequencing cycles were conducted and fluorescent images were acquired after reacting the concatemer template molecules with labeled multivalent molecules (e.g., FIG. 4) and non-labeled chain terminator nucleotides. After 31 sequencing cycles, the second batch sequencing read products were removed.
[00917] The two-stage sequencing reactions were repeated separately using the third, fourth, fifth, sixth, seventh and eighth batch forward sequencing primers as described above.
[00918] The quality scores (Q30 and Q40) of the sequencing reads of the eight different sub-populations are listed in Table 5 below.
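The batch-by-batch procedure described above (prime one sub-population, run 31 cycles, remove the read products, move to the next batch primer) can be sketched as a simple loop. This Python sketch is a toy model under stated assumptions: the `Polony` class and function names are hypothetical, a read is represented as a copy of the first 31 bases of the insert rather than as a synthesized complementary strand, and no imaging or chemistry is modeled.

```python
from dataclasses import dataclass

CYCLES_PER_BATCH = 31  # number of sequencing cycles per batch, from the example


@dataclass
class Polony:
    """Hypothetical stand-in for one immobilized concatemer template molecule."""
    batch: int        # which batch-specific forward primer binding site it carries (1-8)
    insert: str       # placeholder insert sequence
    read: str = ""    # current read product; empty when none is present


def sequence_batch(polonies, batch, cycles=CYCLES_PER_BATCH):
    """Extend reads only on polonies whose primer binding site matches this batch."""
    reads = {}
    for position, polony in enumerate(polonies):
        if polony.batch == batch:
            polony.read = polony.insert[:cycles]  # toy model of a 31-cycle read
            reads[position] = polony.read
    return reads


def strip_reads(polonies):
    """Remove read products while retaining the template molecules."""
    for polony in polonies:
        polony.read = ""


def run_all_batches(polonies, n_batches=8):
    """Run the batches one after another on the same support."""
    results = {}
    for batch in range(1, n_batches + 1):
        results[batch] = sequence_batch(polonies, batch)
        strip_reads(polonies)
    return results


# Sixteen placeholder polonies, two per batch.
polonies = [Polony(batch=(i % 8) + 1, insert="ACGT" * 10) for i in range(16)]
results = run_all_batches(polonies)
```

Because each batch primer matches only one sub-population, every imaging pass in this model sees only a fraction of the polonies on the support, even though all eight sub-populations remain immobilized throughout.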
Table 5:
Example 4: Determining Polony Density on a Flow Cell using Eight-Plex Batch Sequencing of Concatemer Template Molecules Prepared with Single-Stranded Splints
[00919] Eight sub-populations of linear single stranded library molecules (100) were prepared having an E. coli sequence of interest (e.g., an insert region) where individual library molecules included: (i) a surface pinning primer binding site sequence (120) which can be universal; (ii) a left sample index sequence (160) (e.g., one of eight different sample indexes which are 9 bases in length); (iii) a forward sequencing primer binding site sequence (140) (e.g., one of eight different batch-specific forward sequencing primer binding site sequences); (iv) a sequence of interest (110) (e.g., from E. coli); (v) a reverse sequencing primer binding site sequence (150) (e.g., one of eight different batch-specific reverse sequencing primer binding site sequences); (vi) a right sample index sequence (170) (e.g., one of eight different sample indexes having a random sequence 3-mer (e.g., NNN) and a 9-base sample index sequence); and (vii) a surface capture primer binding site sequence (130) that is a universal sequence. The arrangement of the linear single stranded library molecules (100) is shown in FIG. 21. The library molecules did not include a unique molecular index (UMI). The sequences of the eight different batch-specific forward sequencing primers are listed in Table 6 below. The sequences of the eight different reverse sequencing primer binding sequences are listed in Table 7 below.
[00920] Universal single-stranded splint strands (200) were prepared having: (i) a first region (210) that hybridizes with the pinning primer binding site sequence (120) of the linear single-stranded library molecule (100), and (ii) a second region (220) that hybridizes with the surface capture primer binding site sequence (130) of the linear single-stranded library molecule (100). The arrangement of the universal single-stranded splint strands (200) is shown in FIG. 21.
Table 6:
Table 7:
[00921] In eight separate reactions, the linear single stranded library molecules (100) were hybridized with universal single-stranded splint strands (200) to generate eight sub-populations of library-splint complexes (300) with a nick (e.g., see FIG. 21). The library-splint complexes (300) in the eight sub-populations carried one of eight different forward sequencing primer binding site sequences and one of eight different reverse primer binding site sequences.
[00922] The library-splint complexes (300) were subjected to separate ligation reactions to generate eight sub-populations of covalently closed circular library molecules (400).
[00923] 3.75 pM of each of the eight sub-populations of covalently closed circular library molecules (400) were mixed together and the mixture containing 30 pM was distributed onto a flow cell having a plurality of universal capture and pinning primers immobilized thereon. The loaded covalently closed circular library molecules (400) were subjected to on-support rolling circle amplification using the immobilized surface capture primers as amplification primers, thereby generating eight sub-populations of immobilized concatemer template molecules, where the concatemers in the different sub-populations carried one of eight different forward sequencing primer binding site sequences and one of eight different reverse sequencing primer binding sites. The rolling circle amplification reaction was conducted in the presence of compaction oligonucleotides to generate compact concatemers (e.g., DNA nanoballs; polonies).
[00924] The insert regions were sequenced using a set of different batch-specific forward sequencing primers. For example, 31 sequencing cycles were conducted to sequence the insert regions using the two-stage sequencing reaction which employed labeled multivalent molecules and non-labeled chain terminator nucleotides. Fluorescent images were acquired after reacting the concatemer template molecules with labeled multivalent molecules. The sequencing images of the insert regions were used to (i) determine the location of the first batch sequencing read products on the support (e.g., template mapping), (ii) count the number of first batch sequencing read products on the support, and (iii) determine the density of the first batch sequencing read products on the support using the counted number of first batch sequencing read products. After 31 sequencing cycles, the first batch sample index sequencing read products were removed.
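Steps (i) through (iii) above (locate, count, convert the count to a density) can be illustrated with a short calculation. This Python sketch assumes that template mapping has already produced polony centroid coordinates in millimeters and that the imaged region is a rectangle of known size; the function name, the coordinate format, and the example values are hypothetical and are not taken from the source.

```python
def polony_density_k_per_mm2(centroids_mm, width_mm, height_mm):
    """Count polonies inside a rectangular imaged region and return (count, density).

    centroids_mm: iterable of (x, y) centroid positions in mm (assumed output of
    template mapping). Density is returned in thousands of polonies per mm^2.
    """
    if width_mm <= 0 or height_mm <= 0:
        raise ValueError("region dimensions must be positive")
    # Step (ii): count only the read products located inside the region.
    count = sum(
        1 for x, y in centroids_mm
        if 0 <= x < width_mm and 0 <= y < height_mm
    )
    # Step (iii): density from the counted number and the region area.
    area_mm2 = width_mm * height_mm
    return count, count / area_mm2 / 1000.0


# Hypothetical coordinates: two centroids fall inside a 1 mm x 1 mm region, one outside.
count, density = polony_density_k_per_mm2(
    [(0.1, 0.1), (0.5, 0.5), (1.5, 0.2)], width_mm=1.0, height_mm=1.0
)
```

Repeating the calculation for each batch primer yields one density per sub-population, all measured on the same region of the support.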
[00925] The two-stage sequencing reactions were repeated separately using the second, third, fourth, fifth, sixth, seventh and eighth batch forward sequencing primers as described above.
[00926] The resulting pass filter count (millions), % pass filter, and % Q30 are shown in FIGs. 35A and 35B, 36A and 36B, and 37A and 37B. In FIGs. 35B, 36B and 37B, the extrapolated pM is the estimated loading concentration corresponding to the number of batch sequencing primers used. Each of the 8 libraries was loaded at approximately 3.75 pM. Thus the total amount of libraries loaded was approximately 30 pM. The extrapolated pM value is the cumulative sum of the loading concentrations.
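The extrapolated pM described above is a running total of the per-library loading concentration. A minimal Python sketch follows, assuming exactly 3.75 pM per library (the text says "approximately") and using hypothetical variable names:

```python
from itertools import accumulate

PER_LIBRARY_PM = 3.75  # approximate loading concentration of each library, from the text
N_LIBRARIES = 8        # number of batch-specific libraries

# Extrapolated pM after 1, 2, ..., 8 batch sequencing primers have been used.
extrapolated_pm = list(accumulate([PER_LIBRARY_PM] * N_LIBRARIES))

for n_primers, pm in enumerate(extrapolated_pm, start=1):
    print(f"{n_primers} batch primer(s): {pm:.2f} pM")
```

The final entry is 30 pM, matching the stated total amount of libraries loaded.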
[00927] The range of densities of each sub-population of polonies on the flow cell was measured to be about 248 K/mm² (e.g., 484 million raw minimum) to about 577 K/mm² (e.g., 1125 million raw maximum). The range of pass filters of each sub-population of polonies on the flow cell was measured to be about 246 K/mm² (e.g., 480 million pass filter minimum) to about 565 K/mm² (e.g., 1101 million pass filter maximum).
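The reported counts and densities can be cross-checked against each other by dividing each count by its density. The Python sketch below does this for the four reported pairs; the imaged area it derives is an inference from those numbers, not a value stated in the text, and the dictionary keys are hypothetical labels.

```python
# (count in polonies, density in polonies per mm^2) pairs as reported in the text.
REPORTED = {
    "raw_min": (484e6, 248e3),
    "raw_max": (1125e6, 577e3),
    "pass_filter_min": (480e6, 246e3),
    "pass_filter_max": (1101e6, 565e3),
}


def implied_area_mm2(count, density_per_mm2):
    """Area over which a given count would produce the given density."""
    return count / density_per_mm2


implied_areas = {
    label: implied_area_mm2(count, density)
    for label, (count, density) in REPORTED.items()
}

for label, area in implied_areas.items():
    print(f"{label}: about {area:.0f} mm^2")
```

All four pairs give an area close to 1,950 mm², so the reported counts and densities are mutually consistent with a single imaged area of about that size.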
Claims
1. A method for nucleic acid sequencing comprising:
a) providing a support comprising a plurality of nucleic acid template molecules immobilized to the support, wherein the plurality of nucleic acid template molecules comprises at least a first and a second sub-population of template molecules,
• wherein individual template molecules in the first sub-population of template molecules comprise a first batch sequencing primer binding site, a first batch barcode sequence and at least one first sequence-of-interest,
• wherein the individual template molecules in the second sub-population of template molecules comprise a second batch sequencing primer binding site, a second batch barcode sequence and at least one second sequence-of-interest,
b) sequencing the first sub-population of template molecules using a plurality of first batch sequencing primers, thereby generating a plurality of first batch sequencing read products and imaging a region of the support to detect the first batch sequencing read products; and
c) sequencing the second sub-population of template molecules using a plurality of second batch sequencing primers, thereby generating a plurality of second batch sequencing read products and imaging the same region of the support to detect the second batch sequencing read products.
2. The method of claim 1, wherein the first batch sequencing primer binding site and the second batch sequencing primer binding site have different sequences.
3. The method of claim 1 or 2, wherein the first batch barcode sequence and the second batch barcode sequence are different.
4. The method of any one of claims 1-3, wherein sequencing the first sub-population of template molecules of step (b) comprises:
• Step (b1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first sub-population of template molecules to generate a plurality of first batch sequencing read products that comprise up to 1000 bases in length;
• Step (b2): stopping and/or blocking the short read sequencing of step (b1);
• Step (b3): removing the plurality of first batch sequencing read products and retaining the first sub-population of template molecules; and optionally
• Step (b4): repeating steps (b1) - (b3) at least once.
5. The method of any one of claims 1-4, wherein sequencing the second sub-population of template molecules of step (c) comprises:
• Step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second sub-population of template molecules to generate a plurality of second batch sequencing read products that comprise up to 1000 bases in length;
• Step (c2): stopping and/or blocking the short read sequencing of step (c1);
• Step (c3): removing the plurality of second batch sequencing read products and retaining the second sub-population of template molecules; and optionally
• Step (c4): repeating steps (c1) - (c3) at least once.
6. The method of any one of claims 1-5, wherein the first sub-population of template molecules have the same first batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
7. The method of any one of claims 1-6, wherein the individual template molecules of the second sub-population of template molecules have the same second batch sequencing primer binding site, and have the same sequence of interest or different sequences of interest.
8. The method of any one of claims 1-7, wherein the plurality of nucleic acid template molecules immobilized to the support are at a density of about 10² - 10¹⁵ template molecules per mm².
9. The method of any one of claims 1-7, wherein the plurality of nucleic acid template molecules are immobilized to the support at a high density.
10. The method of any one of claims 1-9, wherein at least some individual template molecules of the first and second sub-populations of template molecules comprise nearest neighbor template molecules that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
11. The method of any one of claims 1-10, wherein the support lacks partitions and/or barriers that separate regions of the support.
12. The method of any one of claims 1-11, wherein the plurality of template molecules are immobilized to the support at random and non-determined positions on the support.
13. The method of any one of claims 1-11, wherein the plurality of template molecules are immobilized to the support at pre-determined positions on the support (e.g., a patterned support).
14. The method of any one of claims 1-13, wherein the plurality of nucleic acid template molecules comprises concatemer template molecules comprising at least a first and second sub-population of concatemer template molecules.
15. The method of claim 14, wherein individual concatemer template molecules in the first sub-population of concatemer template molecules comprise a plurality of tandem polynucleotide units comprising a first sequence of interest, a first batch sequencing primer binding site sequence which corresponds to the first sequence of interest, and a first batch barcode sequence which corresponds to the first sequence of interest.
16. The method of claim 14 or 15, wherein individual concatemer template molecules in the second sub-population of concatemer template molecules comprise a plurality of tandem polynucleotide units comprising a second sequence of interest, a second batch sequencing primer binding site sequence which corresponds to the second sequence of interest, and a second batch barcode sequence which corresponds to the second sequence of interest.
17. The method of any one of claims 1-16, wherein the first batch sequencing read products comprise:
• the first batch barcode sequence; or
• the first batch barcode sequence and the first sequence of interest.
18. The method of any one of claims 1-17, wherein the second batch sequencing read products comprise:
• the second batch barcode sequence; or
• the second batch barcode sequence and the second sequence of interest.
19. A method for re-seeding a support, comprising:
a) providing a support comprising a plurality of surface capture primers immobilized to the support;
b) distributing on the support a first plurality of circularized library molecules under a condition suitable for hybridizing individual circularized library molecules to individual surface capture primers, and conducting a first rolling circle amplification reaction thereby generating a first plurality of concatemer template molecules immobilized to the support;
c) sequencing at least a subset of the first plurality of concatemer template molecules, thereby generating a first plurality of sequencing read products;
d) distributing on the support a second plurality of circularized library molecules under a condition suitable for hybridizing individual circularized library molecules of the second plurality to individual surface capture primers, and conducting a second rolling circle amplification reaction thereby generating a second plurality of concatemer template molecules immobilized to the support; and
e) sequencing at least a subset of the second plurality of concatemer template molecules, thereby generating a second plurality of sequencing read products.
20. The method of claim 19, wherein the first plurality of circularized library molecules comprises:
• circularized padlock probes;
• linear library molecules circularized using single-stranded splint strands;
• linear library molecules circularized using double-stranded adaptors; or
• a mixture of any combination of circularized padlock probes, linear library molecules circularized using single-stranded splint strands and/or linear library molecules circularized using double-stranded adaptors.
21. The method of claim 19 or 20, wherein the plurality of surface capture primers are immobilized to the support at random and non-pre-determined positions.
22. The method of claim 19 or 20, wherein the plurality of surface capture primers are immobilized to the support at pre-determined positions.
23. The method of any one of claims 19-22, wherein individual circularized library molecules in the first plurality of circularized library molecules comprise a first seeding batch sequencing primer binding site, a first seeding batch barcode sequence, and a first sequence of interest.
24. The method of any one of claims 19-23, wherein the first plurality of sequencing read products of step (c) comprises:
• a first seeding batch barcode sequence; or
• a first seeding batch barcode sequence and a first sequence of interest.
25. The method of any one of claims 19-24, wherein second individual circularized library molecules in the second plurality of circularized library molecules comprise a second seeding batch sequencing primer binding site, a second seeding batch barcode sequence, and a second sequence of interest.
26. The method of any one of claims 19-25, wherein the second plurality of sequencing read products of step (e) comprises:
• a second seeding batch barcode sequence; or
• a second seeding batch barcode sequence and a second sequence of interest.
27. The method of any one of claims 19-26, wherein sequencing at least the subset of the first plurality of concatemer template molecules of step (c) comprises:
• Step (c1): conducting short read sequencing by performing up to 1000 sequencing cycles of the first plurality of concatemer template molecules to generate a first plurality of sequencing read products that comprise up to 1000 bases in length;
• Step (c2): stopping and/or blocking the short read sequencing of step (c1);
• Step (c3): removing the first plurality of sequencing read products and retaining the first plurality of immobilized concatemer template molecules; and optionally
• Step (c4): repeating steps (c1) - (c3) at least once.
28. The method of any one of claims 19-27, wherein the sequencing at least the subset of the second plurality of concatemer template molecules of step (e) comprises:
• Step (e1): conducting short read sequencing by performing up to 1000 sequencing cycles of the second plurality of concatemer template molecules to generate a second plurality of sequencing read products that comprise up to 1000 bases in length;
• Step (e2): stopping and/or blocking the short read sequencing of step (e1);
• Step (e3): removing the second plurality of sequencing read products and retaining the second plurality of immobilized concatemer template molecules; and optionally
• Step (e4): repeating steps (e1) - (e3) at least once.
29. The method of any one of claims 19-28, wherein the plurality of surface capture primers immobilized to the support are at a density of about 10² - 10¹⁵ capture primers per mm².
30. The method of any one of claims 19-28, wherein at least some of the surface capture primers comprise nearest neighbor surface capture primers that touch each other and/or overlap each other when viewed from any angle of the support including above, below or side views of the support.
31. The method of any one of claims 19-30, wherein the support lacks partitions and/or barriers that separate regions of the support.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463573300P | 2024-04-02 | 2024-04-02 | |
| US63/573,300 | 2024-04-02 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025212654A1 | 2025-10-09 |
Family
ID=95519087
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2025/022547 Pending WO2025212654A1 (en) | 2024-04-02 | 2025-04-01 | Library molecule titration for tunable surface density in polony sequencing |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025212654A1 (en) |
Citations (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5558991A (en) | 1986-07-02 | 1996-09-24 | E. I. Du Pont De Nemours And Company | DNA sequencing method using acyclonucleoside triphosphates |
| US9994541B2 (en) | 2016-08-23 | 2018-06-12 | Institut Pasteur De Montevideo | Nitroalkene trolox derivatives and methods of use thereof in the treatment and prevention of inflammation related conditions |
| US10246744B2 (en) | 2016-08-15 | 2019-04-02 | Omniome, Inc. | Method and system for sequencing nucleic acids |
| US10731141B2 (en) | 2018-09-17 | 2020-08-04 | Omniome, Inc. | Engineered polymerases for improved sequencing |
| WO2021128441A1 (en) * | 2019-12-23 | 2021-07-01 | Mgi Tech Co.,Ltd. | Controlled strand-displacement for paired-end sequencing |
| WO2022266470A1 (en) | 2021-06-17 | 2022-12-22 | Element Biosciences, Inc. | Compositions and methods for pairwise sequencing |
| WO2023107719A2 (en) * | 2021-12-10 | 2023-06-15 | Element Biosciences, Inc. | Primary analysis in next generation sequencing |
| WO2023168444A1 (en) | 2022-03-04 | 2023-09-07 | Element Biosciences, Inc. | Single-stranded splint strands and methods of use |
| WO2023168443A1 (en) | 2022-03-04 | 2023-09-07 | Element Biosciences, Inc. | Double-stranded splint adaptors and methods of use |
| US11859241B2 (en) | 2021-06-17 | 2024-01-02 | Element Biosciences, Inc. | Compositions and methods for pairwise sequencing |
| WO2024011145A1 (en) | 2022-07-05 | 2024-01-11 | Element Biosciences, Inc. | Pcr-free library preparation using double-stranded splint adaptors and methods of use |
| WO2024040058A1 (en) | 2022-08-15 | 2024-02-22 | Element Biosciences, Inc. | Methods for preparing nucleic acid nanostructures using compaction oligonucleotides |
| WO2024059550A1 (en) | 2022-09-12 | 2024-03-21 | Element Biosciences, Inc. | Double-stranded splint adaptors with universal long splint strands and methods of use |
| WO2024064912A2 (en) * | 2022-09-23 | 2024-03-28 | Element Biosciences, Inc. | Increasing sequencing throughput in next generation sequencing of three-dimensional samples |
| US20240240249A1 (en) | 2022-12-07 | 2024-07-18 | Element Biosciences, Inc. | Cyanine derivatives and related uses |
| WO2024159166A1 (en) | 2023-01-27 | 2024-08-02 | Element Biosciences, Inc. | Compositions and methods for sequencing multiple regions of a template molecule using enzyme-based reagents |
| WO2025024465A1 (en) | 2023-07-24 | 2025-01-30 | Element Biosciences, Inc. | On-support circularization and amplification for generating immobilized nucleic acid concatemer molecules |
Patent Citations (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5558991A (en) | 1986-07-02 | 1996-09-24 | E. I. Du Pont De Nemours And Company | DNA sequencing method using acyclonucleoside triphosphates |
| US10246744B2 (en) | 2016-08-15 | 2019-04-02 | Omniome, Inc. | Method and system for sequencing nucleic acids |
| US9994541B2 (en) | 2016-08-23 | 2018-06-12 | Institut Pasteur De Montevideo | Nitroalkene trolox derivatives and methods of use thereof in the treatment and prevention of inflammation related conditions |
| US10731141B2 (en) | 2018-09-17 | 2020-08-04 | Omniome, Inc. | Engineered polymerases for improved sequencing |
| WO2021128441A1 (en) * | 2019-12-23 | 2021-07-01 | Mgi Tech Co.,Ltd. | Controlled strand-displacement for paired-end sequencing |
| US11859241B2 (en) | 2021-06-17 | 2024-01-02 | Element Biosciences, Inc. | Compositions and methods for pairwise sequencing |
| WO2022266470A1 (en) | 2021-06-17 | 2022-12-22 | Element Biosciences, Inc. | Compositions and methods for pairwise sequencing |
| US20240191278A1 (en) | 2021-06-17 | 2024-06-13 | Element Biosciences, Inc. | Compositions and methods for pairwise sequencing |
| WO2023107719A2 (en) * | 2021-12-10 | 2023-06-15 | Element Biosciences, Inc. | Primary analysis in next generation sequencing |
| WO2023168443A1 (en) | 2022-03-04 | 2023-09-07 | Element Biosciences, Inc. | Double-stranded splint adaptors and methods of use |
| WO2023168444A1 (en) | 2022-03-04 | 2023-09-07 | Element Biosciences, Inc. | Single-stranded splint strands and methods of use |
| WO2024011145A1 (en) | 2022-07-05 | 2024-01-11 | Element Biosciences, Inc. | Pcr-free library preparation using double-stranded splint adaptors and methods of use |
| WO2024040058A1 (en) | 2022-08-15 | 2024-02-22 | Element Biosciences, Inc. | Methods for preparing nucleic acid nanostructures using compaction oligonucleotides |
| WO2024059550A1 (en) | 2022-09-12 | 2024-03-21 | Element Biosciences, Inc. | Double-stranded splint adaptors with universal long splint strands and methods of use |
| WO2024064912A2 (en) * | 2022-09-23 | 2024-03-28 | Element Biosciences, Inc. | Increasing sequencing throughput in next generation sequencing of three-dimensional samples |
| US20240240249A1 (en) | 2022-12-07 | 2024-07-18 | Element Biosciences, Inc. | Cyanine derivatives and related uses |
| WO2024159166A1 (en) | 2023-01-27 | 2024-08-02 | Element Biosciences, Inc. | Compositions and methods for sequencing multiple regions of a template molecule using enzyme-based reagents |
| WO2025024465A1 (en) | 2023-07-24 | 2025-01-30 | Element Biosciences, Inc. | On-support circularization and amplification for generating immobilized nucleic acid concatemer molecules |
Non-Patent Citations (10)
| Title |
|---|
| ASLAM, M.; DENT, A., BIOCONJUGATION: PROTEIN COUPLING TECHNIQUES FOR THE BIOMEDICAL SCIENCES, 1998 |
| AUSUBEL ET AL.: "Current Protocols in Molecular Biology", 1992, GREENE PUBLISHING ASSOCIATES |
| ESCHENMOSSER, SCIENCE, vol. 284, 1999, pages 2118 - 2124 |
| FASMAN: "Practical Handbook of Biochemistry and Molecular Biology", 1989, CRC PRESS, pages: 385 - 394 |
| FERRAROGOTOR, CHEM. REV., vol. 100, 2000, pages 4319 - 48 |
| HERMANSON, G.: "Bioconjugate Techniques", 2008 |
| JOENG ET AL., J. MED. CHEM., vol. 36, 1993, pages 2627 - 2638 |
| LAKOWICZ: "Principles of Fluorescence Spectroscopy", 1999, PLENUM PRESS |
| MARTINEZ ET AL., BIOORGANIC & MEDICINAL CHEMISTRY LETTERS, vol. 7, 1997, pages 3013 - 3016 |
| MARTINEZ ET AL., NUCLEIC ACIDS RESEARCH, vol. 27, 1999, pages 1271 - 1274 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20250333787A1 (en) | | Methods and reagents for nucleic acid analysis |
| US11236388B1 (en) | | Compositions and methods for pairwise sequencing |
| US12421545B2 (en) | | Compositions and methods for preparing nucleic acid nanostructures using compaction oligonucleotides |
| US12359193B2 (en) | | Single-stranded splint strands and methods of use |
| US20240011022A1 (en) | | Pcr-free library preparation using double-stranded splint adaptors and methods of use |
| WO2022266470A1 (en) | | Compositions and methods for pairwise sequencing |
| US20230193354A1 (en) | | Compositions and methods for pairwise sequencing |
| EP4486916A1 (en) | | Double-stranded splint adaptors and methods of use |
| US20250019760A1 (en) | | Compositions and methods for sequencing multiple regions of a template molecule using enzyme-based reagents |
| AU2024298639A1 (en) | | On-support circularization and amplification for generating immobilized nucleic acid concatemer molecules |
| EP4355913A1 (en) | | Compositions and methods for pairwise sequencing |
| WO2025212654A1 (en) | | Library molecule titration for tunable surface density in polony sequencing |
| WO2025212655A1 (en) | | Multiple priming for on-support nucleic acid amplification |
| WO2025191535A1 (en) | | Partially double-stranded splint adaptors and methods of use |
| WO2025120579A1 (en) | | Compositions and methods for sequencing multiple regions of a template molecule using read-capping nucleotide analogs |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 25721382; Country of ref document: EP; Kind code of ref document: A1 |