WO2025133584A1

WO2025133584A1 - Tail sequence guided addition of adaptor

Info

Publication number: WO2025133584A1
Application number: PCT/GB2024/053124
Authority: WO
Inventors: Thomas DUNWELL; Simon DAILEY; Guoliang Fu
Original assignee: GENEFIRST Ltd
Current assignee: GENEFIRST Ltd
Priority date: 2023-12-17
Filing date: 2024-12-16
Publication date: 2025-06-26
Anticipated expiration: 2026-06-17
Also published as: GB202319353D0

Abstract

This invention relates to methods, compositions and kits for copying a target polynucleotide using a first primer and using the 3' ends or copies of the 3' ends for guided addition of an adaptor sequence for preparing a sequencing library of polynucleotides. The sequencing library is suitable for massively parallel sequencing and comprises a plurality of double-stranded nucleic acid molecules.

Description

TAIL SEQUENCE GUIDED ADDITION OF ADAPTOR

BACKGROUND OF THE INVENTION

The present invention is directed to methods and compositions for amplifying a population of target polynucleotides where the amplified portion is processed to generate epigenetic and/or genetic information. An oligonucleotide is used to generate copies of target polynucleotides, which may be enriched in regions containing epigenetic information, and adaptors are subsequently added to those copies.

Next-generation DNA sequencing (NGS) is continuing to revolutionize clinical medicine and basic research, especially in the rapidly advancing field of liquid biopsy testing. Both genetic and epigenetic biomarkers detectable from cell-free DNA extracted from liquid biopsy have the power to identify the presence of diseases, such as cancer. Of these two subsets of biomarkers, aberrant DNA methylation has been implicated in many disease processes, including cancer, because changes in methylation patterns alter the regulation of gene expression, which leads to dysregulation of normal cellular processes.

DNA methylation profiling using methylation sequencing (e.g., whole genome bisulfite sequencing (WGBS)) is recognized as a valuable diagnostic tool for detection, diagnosis, and/or monitoring of cancer. For example, specific patterns of differentially methylated regions may be useful as molecular markers for various diseases and disease stages. DNA methylation is enriched in regions of the human genome called CpG islands, which have a high GC content; however, these make up only approximately 1% of the human genome. As such, “whole genome” approaches are expensive and inefficient, with much of the generated data being uninformative because the sequenced region is either not differentially methylated in cancer or has a local CpG density too low to provide a robust signal. Because of this limitation, methods for enrichment of methylated targets will allow greater sensitivity for disease markers at lower cost.

DETAILED DESCRIPTION

While various embodiments of the compositions and methods have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the compositions and methods. It should be understood that various alternatives to the embodiments described herein may be employed. To facilitate an understanding of the invention, a number of terms are defined below. As used herein, a "sample" refers to any substance containing or presumed to contain nucleic acids and includes a sample of tissue or fluid isolated from an individual or individuals.

As used herein, the term "nucleotide sequence" refers to either a homopolymer or a heteropolymer of deoxyribonucleotides, ribonucleotides or other nucleic acids, or any combination of nucleic acids.

As used herein, the term "nucleotide" generally refers to the monomer components of nucleotide sequences even though the monomers may be nucleoside and/or nucleotide analogues, and/or modified nucleosides such as amino modified nucleosides in addition to nucleotides. In addition, "nucleotide" also includes “nucleoside triphosphate” and non-naturally occurring analogue structures which may be naturally occurring or have been developed in selective or targeted approaches.

As used herein, the term "nucleic acid" refers to at least two nucleotides covalently linked together. A nucleic acid of the present invention may generally contain phosphodiester bonds, although in some cases nucleic acid analogues are included that may have alternate backbones. Nucleic acids may be single-stranded or double-stranded, as specified, or contain portions of both double-stranded and single-stranded sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA, DNA and RNA mixtures, or DNA-RNA hybrids, where the nucleic acid contains any combination of deoxyribo- and ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, etc. Reference to a "DNA sequence" or “RNA sequence” can include both single-stranded and double-stranded DNA or RNA. A specific sequence, unless the context indicates otherwise, refers to the single-stranded DNA or RNA of such sequence, the duplex of such sequence with its complement (double-stranded DNA or RNA) and/or the complement of such sequence.

As used herein, the terms "polynucleotide" and "oligonucleotide" are types of "nucleic acid", and generally refer to primers or oligomer fragments to be detected. There is no intended distinction in length between the terms "nucleic acid", "polynucleotide" and "oligonucleotide", and these terms may be used interchangeably. "Nucleic acid", "DNA" and similar terms also include nucleic acid analogues. The oligonucleotide is not necessarily physically derived from any existing or natural sequence but may be generated in any manner, including chemical synthesis, enzymatic synthesis, DNA replication, reverse transcription or any combination thereof.

As used herein, the terms “original target polynucleotide”, "target sequence", "target nucleic acid", "target nucleic acid sequence" and "nucleic acids of interest" are used interchangeably and refer to a desired region which is to be either amplified, detected or both, or is the subject of hybridization with a complementary oligonucleotide or polynucleotide, e.g., a blocking oligomer, or the subject of a primer extension process. The target sequence can be composed of DNA, RNA, analogues thereof, or any combinations thereof. The target sequence can be single-stranded or double-stranded. In primer extension processes, the target nucleic acid which forms a hybridization duplex with the primer may also be referred to as a "template". A template serves as a pattern for the synthesis of a complementary polynucleotide. A target sequence for use with the present invention may be derived from any living or once living organism, including but not limited to prokaryotes, eukaryotes, plants, animals, and viruses, as well as synthetic and/or recombinant target sequences; it may also be a mixture of nucleic acids such that the target nucleic acid is a subset of the total nucleic acids.

"Primer" as used herein may be used interchangeably to describe one or more than one primer, or a set or plurality of multiple primers, and refers to an oligonucleotide(s), whether occurring naturally or produced synthetically. The multiple primers in a set may have different sequences and hybridise to multiple different locations. The terms “first primer”, “a set of first primers” and “a first set of primers” are interchangeable, and the same applies to the term “second primer”. A “primer” can be functionally described as a molecule capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase, at a suitable temperature and in a suitable buffer. Such conditions include the presence of one or more, two or more, three or more, or four or more different deoxyribonucleoside triphosphates, which may include but are not limited to deoxyadenosine triphosphate (dATP), deoxythymidine triphosphate (dTTP), deoxyguanosine triphosphate (dGTP) and deoxycytidine triphosphate (dCTP), or suitable additional or replacement nucleotides or unusual nucleotides, and a polymerization-inducing agent such as DNA polymerase and/or RNA polymerase and/or reverse transcriptase, in a suitable buffer ("buffer" includes substituents which are cofactors, or affect pH, ionic strength, etc.), and at a suitable temperature. The primer is preferably single-stranded for maximum efficiency in amplification. The primers herein are selected to be substantially complementary to a strand of each specific sequence to be amplified. This means that the primers must be sufficiently complementary to hybridize with their respective strands. One or more regions of non-complementary sequence may be attached to the 5'-end of the primer (5' tail portion) or placed within the primer (bulge portion), with the remainder of the primer sequence being complementary to the desired section of the target base sequence. Commonly, the primers are complementary, except when non-complementary nucleotides may be present at a predetermined primer terminus or middle region as described. In another expression, the primers herein are selected to be substantially identical to a strand of each specific sequence to be amplified. This means that the primers must be sufficiently identical to one strand, so that they can hybridize with their respective other strands.

As used herein, the term “adaptor” is used to describe an oligonucleotide which is designed to be used in a reaction as a substrate for a ligase or as a template for extension by a polymerase. The adaptor sequence may be complementary or identical to the oligonucleotide sequence. An adaptor may be comprised of functional components. The term “functional component” may be used interchangeably to describe any position(s) or nucleotide(s) in the adaptor.

As used herein, the term "complementary" refers to the ability of two nucleotide sequences, either randomly or by design, to bind to each other in a sequence-dependent manner by hydrogen bonding through their purine and/or pyrimidine bases according to the usual Watson-Crick rules for forming duplex nucleic acid complexes. It can also refer to the ability of nucleotide sequences that may include modified nucleotides or analogues of deoxyribonucleotides and ribonucleotides, or combinations thereof, to bind sequence-specifically to each other by other than the usual Watson-Crick rules to form alternative nucleic acid duplex structures.
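By way of illustration only, the usual Watson-Crick rules referred to above can be sketched in a few lines of Python. The function names `reverse_complement`, `is_complementary` and `paired_fraction` are hypothetical helpers introduced here, the sketch covers only unmodified DNA bases, and partial complementarity is scored only for two strands of equal length.

```python
# Illustrative sketch only: Watson-Crick pairing for unmodified DNA bases.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def is_complementary(a: str, b: str) -> bool:
    """True when `a` and `b` (both written 5'->3') form a fully paired duplex."""
    return reverse_complement(a) == b.upper()


def paired_fraction(a: str, b: str) -> float:
    """Fraction of paired positions when equal-length strands align antiparallel."""
    if len(a) != len(b):
        raise ValueError("this sketch only scores strands of equal length")
    partner = reverse_complement(a)
    matches = sum(1 for x, y in zip(partner, b.upper()) if x == y)
    return matches / len(a)
```

A fully complementary pair gives a paired fraction of 1.0, while a partially complementary pair, as contemplated in the definition of hybridization, gives a lower value.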

As used herein, the term "hybridization" and "annealing" are interchangeable, and refers to the process by which two nucleotide sequences complementary to each other, either partially or fully, bind together to form a duplex sequence or segment.

The terms "duplex" and "double-stranded" are interchangeable, meaning a structure formed as a result of hybridization between two complementary sequences of nucleic acids. Such duplexes can be formed by the complementary binding of two DNA segments to each other, two RNA segments to each other, or of a DNA segment to an RNA segment, or two segments composed of a mixture of RNA and DNA to one another, the latter structure being termed as a hybrid duplex. Either or both members of such duplexes can contain modified nucleotides and/or nucleotide analogues as well as nucleoside analogues. As disclosed herein, such duplexes can be formed as the result of binding of one or more blocking oligonucleotides to a sample sequence. The duplex may be partially or completely complementary and may be partially or fully double stranded.

As used herein, the terms "wild-type nucleic acid", "normal nucleic acid", "nucleic acid with normal nucleotides", “wild-type”, “normal”, "wild-type DNA" and "wild-type template" are used interchangeably and refer to a polynucleotide which has a nucleotide sequence that is considered to be normal or unaltered.

As used herein, the terms "mutant polynucleotide", "mutant nucleic acid", "variant nucleic acid", and "nucleic acid with variant nucleotides" refer to a polynucleotide which has a nucleotide sequence that is different from the expected nucleotide sequence of the corresponding wild-type polynucleotide. The difference in the nucleotide sequence of the mutant polynucleotide as compared to the wild-type polynucleotide is referred to as the nucleotide "mutation", "variant nucleotide", “variant” or "variation." The term "variant nucleotide(s)" also refers to one or more nucleotide(s) substitution(s), deletion(s), insertion(s), methylation(s), and/or modification changes.

"Amplification" as used herein denotes the use of any amplification procedures to increase the concentration or copy number of a particular nucleic acid sequence within a mixture of nucleic acid sequences. Amplification can be one or more rounds of linear amplification, one or more rounds of exponential amplification or a combination thereof. “Replication” or “replicate” as used herein denotes making a complementary copy of a polynucleotide which is a template for polymerase extension. Many rounds of replication result in amplification.

The terms "reaction mixture", "amplification mixture" or "PCR mixture" as used herein refer to a mixture of components necessary to amplify at least one product from nucleic acid templates. The mixture may comprise one or more nucleotides (dNTPs), a polymerase (thermostable or not thermostable), primer(s), and a plurality of nucleic acid templates and other unusual nucleotide(s) necessary for the disclosed invention. The mixture may further comprise a Tris buffer, a monovalent salt and Mg2+. The concentration of each component, apart from the unusual nucleotide as necessary for the disclosed invention, is well known in the art and can be further optimized by an ordinary skilled artisan.

The terms “amplified product” or “amplicon” refer to a fragment of DNA or RNA amplified by a polymerase using a primer, a pool of primers, a pair of primers, a pool of pairs of primers or any combination thereof in an amplification method.

The term “primer extension product” refers to a fragment of DNA or RNA extended by a polymerase using one or a pair of primers in a reaction, which may involve one-pass extension, for example first strand cDNA synthesis, or two-pass extension, for example double strand cDNA synthesis, or many cycles of extension, which may be a PCR, or isothermal amplification using a polymerase with strand displacement activity.

The term “compatible” refers to a primer sequence or a portion of a primer sequence which is identical, substantially identical, complementary, substantially complementary or similar to a PCR primer sequence or sequencing primer sequence used in a massively parallel sequencing platform.

The practice of the present invention may employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, recombinant DNA, and next generation sequencing techniques, which are within the skill of a person skilled in the art. All patents, patent applications, and publications mentioned herein, both supra and infra, are hereby incorporated by reference. Disclosed herein are methods, systems, and compositions that can significantly increase the amount of information which can be obtained from a single patient sample compared to other technologies, especially those employing whole sample non-enrichment workflows.

Methylation of cytosines to form 5-methylcytosine (5mC or mC), e.g., at cytosines followed by guanine residues (e.g., cytosine-phosphate-guanine motifs, or CpGs), can be an epigenetic mark with important roles in development and tissue specificity, genomic imprinting, and environmental responses. Dysregulation of 5mC can cause aberrant gene expression, and in some cases can affect cancer risk, progression or treatment response. 5-hydroxymethylcytosine (5hmC or hmC) can be an intermediate in the cell’s active DNA demethylation pathway with tissue-specific distribution affecting gene expression and carcinogenesis.

Gene body DNA methylation (as used herein, methylation can mean addition of or the presence of a methyl group on a base of a nucleic acid; the methyl group can be in an oxygenated or unoxygenated state; an unoxygenated methyl group can be e.g., methyl; an oxygenated methyl group can be a hydroxymethyl, a formyl group, a carboxylic acid group, or a salt of carboxylic acid) can play a role in repetitive DNA elements’ silencing and alternative splicing. DNA methylation can be associated with several biological processes such as genomic imprinting, transposon inactivation, stem cell differentiation, transcription repression, and inflammation. DNA methylation profiles can in some cases be inherited through cell division and sometimes through generations. Since methyl marks can play a very relevant role in both physiologic and pathologic conditions, there may be significant application for profiling DNA methylation to answer biological questions. Moreover, uncovering of DNA methylation genomic regions can be appealing to translational research because methyl sites can be modifiable by pharmacologic intervention.

Early cancer detection in patients is important as it generally allows for earlier intervention and therefore a greater chance for survival. As cancers develop, they often change the pattern of genomic methylation while also maintaining signatures of their cellular origin. These changes can also be identified by examining cell-free DNA (cfDNA) fragments, which can make early detection of cancer possible by providing a cost-effective and non-invasive method for analysing information relevant to cancer identification and classification. By using an enrichment approach targeted towards any methylated genomic region rather than sequencing all nucleic acids in a test sample, also known as “whole genome sequencing,” the disclosed method can increase sequencing depth of the target regions and lower costs compared to whole genome sequencing (WGS) or whole genome bisulfite sequencing (WGBS).

Use of the described invention for cancer assays can further provide information relevant to the identification of the presence of a cancer, estimation of the stage of a cancer, and information for determining the tissue of origin of the cancer. The present description also provides a method for diagnosis of cancer, wherein the diagnosis of cancer further includes a cancer type and/or cancer stage. Further provided herein are methods of identifying genomic sites having methylation patterns specific to cancer or various types of cancer.

Disclosed herein is an assay for enriching cfDNA molecules for cancer diagnosis. Disclosed herein is an assay for generating a sequencing library. The sequencing library is an enriched population of specific sequences, such as a CG-enriched library. Alternatively, the sequencing library is a random, global sequence population library prepared using random primers.

The method presented herein can be used to determine and identify a genetic change and/or an epigenetic modification present in an original target polynucleotide. In some cases, the target polynucleotide is processed and treated by an agent to differentially convert cytosine and methylated cytosines to produce a converted target polynucleotide where the converted target polynucleotide is amplified to generate amplified converted polynucleotides which are processed by workflows to generate NGS libraries. The final libraries derived from the original target polynucleotide are sequenced with any suitable method which includes but is not limited to Pacific Biosciences, Ion Torrent sequencing, Illumina, Nanopore, GenapSys, MGI, or Element Biosciences instruments. The sequencing data may be analysed to produce some or all of the following disease associated changes which include mutations, hypomethylation and/or hypermethylation which individually or in combination may be signatures of disease and/or epigenetic changes which may be used to identify the tissue of origin.

In some cases, epigenetic marks can be determined using a computer program (e.g., comprising instructions for the analysis of sequencing data and/or for performing one or more operations of a method presented herein). In some cases, such a computer program can be stored on a memory of a computer.

The present invention provides a method for adding adaptor sequences to target polynucleotides in a sample comprising: a) providing a reaction mixture comprising at least one first primer (short-tail primer) which comprises a 3’ priming portion and a 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotides and priming extension, wherein the 5’ universal tail portion or its complementary strand is capable of mediating guided addition of adaptor sequence to primer extension products, b) performing at least one round of extension reaction, wherein the extension reaction comprises hybridisation of primers to target polynucleotide and extension under extension condition to produce primer extension product, wherein the primers comprise short-tail primers, and c) adding adaptor sequence to the extension product, which is mediated by hybridising a specific oligonucleotide to 3’ end and/or the 5’ end of the extension product, wherein the specific oligonucleotide comprises sequence identical or complementary to the 5’ universal tail portion sequence of the short-tail primer. The at least one round of extension may be two or more rounds, wherein the first round produces extension products, the second or further round produces copies of extension products, wherein the specific oligonucleotide is capable of hybridising to 3’ end sequence of the extension product, which is the complementary strand of the 5’ universal tail sequence of the short-tail primer in the extension product.
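The relationship between the short-tail primer, the extension product and the copy of the extension product may be easier to follow with a small model. The following Python sketch is illustrative only: the class `ShortTailPrimer`, the function `extend`, and all sequences are hypothetical, strands are written 5'->3', priming requires a perfect match, and extension is assumed to run to the 5' end of the template.

```python
from dataclasses import dataclass
from typing import Optional

PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA strand written 5'->3'."""
    return "".join(PAIRS[b] for b in reversed(seq))


@dataclass(frozen=True)
class ShortTailPrimer:
    tail: str     # 5' universal tail portion (hypothetical 4-nt example below)
    priming: str  # 3' priming portion

    @property
    def sequence(self) -> str:
        return self.tail + self.priming


def extend(primer: ShortTailPrimer, template: str) -> Optional[str]:
    """One round of extension: anneal the 3' priming portion, copy to the template's 5' end."""
    site = revcomp(primer.priming)   # priming site as it appears on the template
    pos = template.find(site)
    if pos < 0:
        return None                  # no hybridisation, no extension product
    # The new strand carries the whole primer at its 5' end, then the copied template.
    return primer.sequence + revcomp(template[:pos])


primer = ShortTailPrimer(tail="GTCA", priming="CGCGA")   # made-up sequences
original = "GGCGCGATTCAGTCGCGAT"                         # made-up target strand
product = extend(primer, original)   # first round: tail sits at the 5' end
copy = extend(primer, product)       # second round: copy of the extension product
```

In this model only `copy` ends with the complement of the universal tail (`revcomp(primer.tail)`); neither `original` nor the first-round `product` does, which is the property the specific oligonucleotide relies on when it hybridises to the 3’ end.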

The at least one round of extension reaction may be multiple rounds of extension by isothermal amplification with a polymerase having strand displacement activity. The polymerase is selected from the group of Klenow, Bst polymerase and φ29 DNA polymerase. The at least one round of extension reaction may be multiple rounds of extension which is a PCR reaction.

The specific oligonucleotide may be an adaptor template oligonucleotide (ATO), wherein adding adaptor sequence to the extension product is performed by extension of the 3’ end of the extension product wherein the 3’ end of the extension product acts as primer, while the ATO acts as template.
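As a minimal sketch of ATO-templated adaptor addition, the Python below treats the ATO as a strand whose 3' part is identical to the universal tail (so that it hybridises to the tail complement at the 3’ end of the extension product) and whose 5' part is the template for the adaptor. The functions `make_ato` and `ato_extend` and the adaptor sequence are hypothetical; features of a practical ATO, such as a blocked 3' end, are not modelled.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA strand written 5'->3'."""
    return "".join(PAIRS[b] for b in reversed(seq))


def make_ato(adaptor: str, tail: str) -> str:
    """Hypothetical ATO layout, 5'->3': adaptor template, then the tail sequence."""
    return revcomp(adaptor) + tail


def ato_extend(strand: str, ato: str, anchor_len: int) -> str:
    """Extend the 3' end of `strand` using the ATO as template, if it hybridises."""
    anchor = ato[-anchor_len:]                 # 3' part of the ATO that hybridises
    if not strand.endswith(revcomp(anchor)):
        return strand                          # no hybridisation: strand unchanged
    # The strand's 3' end acts as primer and copies the 5' part of the ATO.
    return strand + revcomp(ato[:-anchor_len])


TAIL = "GTCA"              # made-up universal tail
ADAPTOR = "ACACGACGCTCT"   # made-up adaptor sequence
ato = make_ato(ADAPTOR, TAIL)
tagged = ato_extend("CCCTGAC", ato, len(TAIL))     # ends with the tail complement
untagged = ato_extend("CCCTGAA", ato, len(TAIL))   # does not, so it is left as is
```

Only the strand whose 3’ end is complementary to the tail gains the adaptor sequence, which is the guided behaviour described above.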

The specific oligonucleotide may be a splint oligonucleotide which is capable of hybridising to the 3’ end sequence of the extension product and an adaptor sequence, wherein adding adaptor to the extension product comprises hybridising splint oligonucleotide to the 3’ or 5’ end of the extension product and to the adaptor oligonucleotide, and performing ligation between the 3’ or 5’ end of the extension product and 5’ or 3’ end of an adaptor oligonucleotide.
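The splint-mediated alternative can be sketched in the same spirit. In the Python below, the splint bridges the junction between the 3’ end of the extension product and the 5’ end of the adaptor oligonucleotide, and ligation is modelled as simple concatenation when both sides pair with the splint. The functions `make_splint` and `splint_ligate`, the overlap length and all sequences are assumptions for illustration; only ligation at the 3’ end of the extension product is shown, and requirements such as a 5' phosphate on the adaptor are not modelled.

```python
from typing import Optional

PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA strand written 5'->3'."""
    return "".join(PAIRS[b] for b in reversed(seq))


def make_splint(tail: str, adaptor: str, overlap: int) -> str:
    """Hypothetical splint, 5'->3': complement of the adaptor 5' end, then the tail."""
    return revcomp(adaptor[:overlap]) + tail


def splint_ligate(strand: str, adaptor: str, splint: str, anchor_len: int) -> Optional[str]:
    """Join strand 3' end to adaptor 5' end when both hybridise to the splint."""
    strand_side = revcomp(splint[-anchor_len:])    # must match the strand's 3' end
    adaptor_side = revcomp(splint[:-anchor_len])   # must match the adaptor's 5' end
    if strand.endswith(strand_side) and adaptor.startswith(adaptor_side):
        return strand + adaptor                    # ligation across the nick
    return None                                    # nothing held in place to ligate


TAIL = "GTCA"              # made-up universal tail
ADAPTOR = "ACACGACGCTCT"   # made-up adaptor sequence
splint = make_splint(TAIL, ADAPTOR, overlap=6)
ligated = splint_ligate("CCCTGAC", ADAPTOR, splint, len(TAIL))
skipped = splint_ligate("CCCTGAA", ADAPTOR, splint, len(TAIL))
```

Here the splint comprises a sequence identical to the universal tail, so it pairs with the tail complement at the 3’ end of a copied extension product and with nothing else in this toy example.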

The 5’ universal tail portion may be between 2 to 12 nucleotides long, or between 2 to 9 nucleotides long, or between 2 to 6 nucleotides long, or between 2 to 4 nucleotides long. The short-tail primer may comprise degradable nucleotides. The degradable nucleotides may be Uracil nucleotide. The 3’ priming portion may comprise target specific sequence. The 3’ priming portion may comprise random sequence. The 3’ priming portion may comprise both random and target specific sequence.

The target polynucleotides may be bisulfite or enzymatically converted DNA, wherein the cytosines in the original DNA are converted to uracils.

The method further comprises d) performing amplification using primers which are capable of hybridising to the adaptor sequences.

A kit for preparing a sequencing library comprising, a) at least one primer (short-tail primer) with 3’ priming portion and 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotide and priming extension, wherein the 5’ universal part of the extension product or 3’ end of the extension product which is derived from 5’ universal tail portion of the primer is capable of mediating guided addition of adaptor sequence to the extension products, and b) adaptor template oligonucleotide (ATO) or splint oligonucleotide capable of hybridising to the 3’ end of the extension product, wherein the ATO or splint oligonucleotide comprises sequence identical or complementary to the 5’ universal tail portion of the short-tail primer.

For the purpose of enrichment of specific target sequences, the extension of a specific primer (first short-tail primer) generates copies of the target sequences of interest. However, the nucleic acid population is still a mixture of original sequences and the target sequences of interest. The original sequences may make up the majority of the nucleic acid population, especially when the extension is a one-pass extension or a low number of rounds of extension. To selectively analyse (for example, sequence) the target sequences of interest, an adaptor needs to be added to the target sequences of interest only, not to the original sequences. The 5’ tail sequence of the primer, or its complementary sequence at the 3’ end of the extension products, provides a guide that directs addition of an adaptor sequence to the target sequences of interest.
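The observation that original sequences can dominate after one or a few rounds of extension follows from simple arithmetic, sketched below in Python. The model is an assumption made for illustration: extension is treated as linear (only original strands act as template), and the parameters `target_fraction` (share of originals carrying a priming site) and `efficiency` (share of sites extended per round) are hypothetical.

```python
def copies_after_linear(originals: int, rounds: int,
                        target_fraction: float = 1.0,
                        efficiency: float = 1.0) -> float:
    """Copies made when only original strands with a priming site are templates."""
    return originals * rounds * target_fraction * efficiency


def original_fraction(originals: int, copies: float) -> float:
    """Share of the final population that is still original sequence."""
    return originals / (originals + copies)


# Hypothetical numbers: one million original strands, 1% of them targeted,
# a single pass of extension at perfect efficiency.
originals = 1_000_000
copies = copies_after_linear(originals, rounds=1, target_fraction=0.01)
share = original_fraction(originals, copies)
```

Under these assumed numbers roughly 99% of the molecules are still original sequence after one pass, which is why an adaptor-addition step that acts only on the copies is useful.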

The present invention provides a method for guided addition of adaptors to target polynucleotides in a sample comprising: a) providing a reaction mixture comprising at least one primer (short-tail primer) with 3’ priming portion and 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotide and priming extension, wherein the 5’ universal tail portion or its complementary strand is capable of mediating guided addition of adaptor sequence to the primer extension products, b) performing at least one round of extension reaction, wherein the extension reaction comprises hybridisation of primer to target polynucleotides which may be original polynucleotides or copies of the original polynucleotides generated from previous round of primer extension, and extension under extension condition to produce primer extension product (initial extension product), and c) adding adaptor to the extension product, which is mediated by hybridising a specific oligonucleotide to 3’ end and/or the 5’ end of the extension product.

The at least one round of extension may be two or more rounds, wherein the first round produces extension product, the second or further round produces copies of extension product, wherein the specific oligonucleotide is capable of hybridising to 3’ end of the extension product, which may be the complementary strand of the 5’ universal tail in the extension product.

The at least one round of extension reaction may be multiple rounds of extension by isothermal amplification with a polymerase having strand displacement activity.

The polymerase can be any suitable polymerase, which may be selected from the group of Klenow, Bst polymerase and φ29 DNA polymerase.

The multiple rounds of extension reaction may be a PCR reaction.

In one embodiment, the specific oligonucleotide is an adaptor template oligonucleotide (ATO), and adding adaptor to the extension product may comprise hybridising the adaptor template oligonucleotide (ATO) to the 3’ end of the extension product, and performing extension using the 3’ end of the extension product as primer and the ATO as template. Adding adaptor to the extension product may comprise hybridising splint oligonucleotide to the 3’ or 5’ end of the extension product, and performing ligation between 3’ or 5’ end of extension product and 5’ or 3’ end of an adaptor oligonucleotide, both of which are hybridised to the splint oligonucleotide.

The 5’ universal tail portion of the short-tail primer may be between 12 to 2 nucleotides long, or between 9 to 2 nucleotides long, or between 6 to 2 nucleotides long, or between 4 to 2 nucleotides long.

The 5’ universal tail portion may comprise degradable nucleotides, which for example may be uracil nucleotides, inosine nucleotides or RNA.

The 3’ priming portion may comprise target specific sequence, or may comprise random sequence, or may comprise combination of random and target specific sequence.

The target polynucleotides may be target polynucleotides which have in part or fully been processed by other workflows. Target polynucleotides may be processed in workflows such as an ATO-Reaction or ligation workflow such that their 3’ ends may have been extended which may include nucleotides which can be selectively destroyed. The extended target polynucleotides may be amplified generating copies by one pass or multi-pass linear or PCR amplification, whereby a nucleotide(s) is incorporated which may subsequently allow for the amplification products to be selectively destroyed, such as by incorporation of uracil nucleotides and treatment by UDG. A portion of the amplified extended target polynucleotides may be taken and used in other workflows such as targeted or whole sample enrichment by PCR, probe-based capture or other enrichment approaches. The remaining amplified extended target polynucleotides in the remaining portion may be selectively destroyed or rendered inert such as by treatment with UDG and the remaining target polynucleotides and/or extended target polynucleotides are then used in the disclosed invention.
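The selective destruction of uracil-containing amplification products can be pictured with the following Python sketch. It is a simplification under stated assumptions: copies are made with dUTP in place of dTTP (written as `U`), and UDG treatment is modelled as removing any strand that contains at least one uracil. The functions `copy_with_dutp` and `udg_treat` are hypothetical names.

```python
# Pairing used when dUTP replaces dTTP during copying: A on the template
# directs incorporation of U instead of T.
PAIRS_DUTP = {"A": "U", "T": "A", "U": "A", "C": "G", "G": "C"}


def copy_with_dutp(template: str) -> str:
    """Complementary copy (5'->3') synthesised with dUTP instead of dTTP."""
    return "".join(PAIRS_DUTP[b] for b in reversed(template.upper()))


def udg_treat(strands: list[str]) -> list[str]:
    """Keep only strands without uracil; uracil-containing strands are treated as destroyed."""
    return [s for s in strands if "U" not in s.upper()]


original = "GGCAT"                      # made-up strand without uracil
amplified = copy_with_dutp(original)    # copy carries uracil opposite each A
survivors = udg_treat([original, amplified])
```

In this model a copy made from a template with no adenine would contain no uracil and would survive treatment, so the sketch is only a rough picture of the selectivity described above.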

The present invention further provides a method for preparing a sequencing library comprising: a) providing a reaction mixture comprising at least one primer (short-tail primer) with 3’ priming portion and 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotide and priming extension, wherein the 5’ universal tail portion or its complementary strand is capable of mediating guided addition of adaptor sequence to the extension products, b) performing at least one round of extension reaction, wherein the extension reaction comprises hybridisation of primer to target polynucleotide and extension under extension condition to produce primer extension product, c) adding adaptor to the extension product, which is mediated by hybridising a specific oligonucleotide to 3’ end and/or the 5’ end of the extension product, and d) performing amplification using primer which is capable of hybridising to the adaptor sequence.

The present invention further provides a kit for preparing a sequencing library comprising, a) at least one primer (short-tail primer) with 3’ priming portion and 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotide and priming extension, wherein the 5’ universal part of the extension product which is equivalent to the 5’ universal portion of the primer or 3’ end of the extension product which is equivalent to the 5’ universal portion is capable of mediating guided addition of adaptor sequence to the extension products, and b) adaptor template oligonucleotide (ATO) or splint oligonucleotide capable of hybridising to the 3’ end of the extension product.

Methods and systems presented herein can comprise providing and/or processing (e.g., chemically or enzymatically) a single- or double-strand DNA polynucleotide. A single- or double-strand DNA polynucleotide can comprise an original target polynucleotide, described herein. In many cases, an original target polynucleotide may be converted by an agent to produce a converted polynucleotide. In many cases the original or converted polynucleotide is hybridised and extended or amplified by a short-tail primer to produce an extension product, which may be an initial extension product. In many cases the 3’ ends of the extension products are tagged with adaptors by hybridising a specific oligonucleotide to 3’ end and/or the 5’ end of the extension product.

Preferably, the method for generating a 3’ extension is an ATO reaction as provided in WO2018/193233 A1, which is incorporated by reference herein in its entirety.

In some embodiments target polynucleotides are modified by an agent, producing converted target polynucleotides. The agent chemically and/or enzymatically modifies the target polynucleotide to allow for the discrimination of methylated and unmethylated cytosines. The agent may be a bisulphite treatment, which converts unmethylated cytosine to uracil but not methylated cytosine (i.e., 5-methylcytosine, which is resistant to this treatment and remains as cytosine). The agent may be an enzymatic treatment such as the combination of a TET family member with APOBEC, or equivalent enzymes, which results in the conversion of unmethylated C to U but not of methylated cytosine. The agent may be chemical conversion by ‘TAPS chemistry’, which includes conversion of 5-methylcytosine (and 5-hydroxymethylcytosine) to 5-carboxylcytosine followed by the action of pyridine borane to selectively convert 5-carboxylcytosine residues to dihydrouracil.

The target polynucleotide is preferably fragmented either naturally or artificially. The target polynucleotide may be any nucleic acids such as DNA, cDNA, RNA, mRNA, small RNA, or microRNA, or any combination thereof. The target polynucleotide may comprise a plurality of target polynucleotides. Each of the target polynucleotides of the plurality may comprise different sequences or the same sequence. One or more of the target polynucleotides or plurality of target polynucleotides may comprise a variant sequence.

The target polynucleotides are obtained from naturally occurring sources or they can be synthetic. The naturally occurring sources are RNA and/or genomic DNA from a prokaryote or a eukaryote. For example and without limitation, the source can be an animal (including a human or mouse), a virus, a plant or bacteria. In various aspects, the target polynucleotide is tagged or extended at the 3’ end with an adaptor sequence for use in assays involving microarrays and creating libraries for next generation nucleic acid sequencing.

If the source of the target polynucleotide is genomic DNA or RNA or both, in some embodiments the genomic DNA or RNA or both is fragmented prior to its being extended. Fragmenting of genomic DNA/RNA is a general procedure known to those of skill in the art and is performed, for example and without limitation in vitro by shearing (nebulizing) the DNA/RNA, cleaving the DNA/RNA with an endonuclease, sonicating the DNA/RNA, by heating the DNA/RNA, by irradiation of DNA/RNA using alpha, beta, gamma or other radioactive sources, by light, by chemical cleavage of DNA/RNA in the presence of metal ions, by radical cleavage and combinations thereof. Fragmenting of genomic DNA/RNA can also occur in vivo, for example and without limitation due to apoptosis, radiation and/or exposure to asbestos. According to the methods provided herein, a population of target polynucleotides are not required to be of a uniform size. Thus, the methods of the disclosure are effective for use with a population of differently-sized target polynucleotide fragments.

In one embodiment, generating initial extension products comprises extension using a first primer (also referred to as a short-tail primer) and the target polynucleotide (converted or original), wherein the first primer is hybridised to the target polynucleotides and is extended by a polymerase. The first primer anneals to the target polynucleotides. If the first primer comprises a 3’ random sequence, the first primer anneals at all positions of the polynucleotides in a sample. The first primer may comprise additional sequence compatible with an NGS platform, for example a 5’ tail containing the necessary sequences. The first primer may comprise unusual nucleotides such as a uracil nucleotide or an inosine (deoxyinosine) nucleotide, which make all or a portion of the first primer uncopiable by a polymerase or degradable by an agent such as a glycosylase. In some cases, the first primer is a pool of target-specific primers, complementary to the forward or reverse strand of the target polynucleotides or converted target polynucleotides. In some embodiments, the first primer may be a short-tail primer which has a 3’ priming portion and a 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to a target polynucleotide and priming extension, wherein the 5’ universal tail portion or its complementary strand is capable of mediating guided addition of an adaptor sequence to the primer extension products.

In some cases, the first primer may have a 3’ end which is target specific. In some cases, the first primer may have a 3’ end which is randomly chosen from A, T, C and/or G nucleotides. In some cases, the first primer may have a 3’ end which is biased or has degenerate bases. In some cases, the bias is towards a higher content of G and C nucleotides. In some cases, the bias is towards a higher content of G and C, and A or T nucleotides. In some cases, the 3’ end may be a combination of fixed bases and random or degenerate bases. In some cases, the very 3’ end may be a target specific sequence which may be a CG dinucleotide. In some cases, the 3’ end may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more consecutive CG dinucleotides separated or not separated by other nucleotides. In some cases, the 5’ end of the first primer may be a conserved or universal sequence. In some cases, between the 3’ end and 5’ end of the first primer there may be a nucleotide which is degradable or not able to be used as a template by a polymerase. In some cases, the nucleotide which is not able to be used as a template by a polymerase may be an inosine or a uracil nucleotide. In some cases, the first primer may comprise a sample barcode (SBC) sequence and additional universal sequence(s) necessary for compatibility with an NGS platform.
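By way of illustration only, the following sketch enumerates the individual sequences represented by a short-tail primer with a 5’ universal tail and a degenerate 3’ priming portion ending in a CG dinucleotide. The tail sequence and the pattern "NNSCG" are hypothetical and are not taken from Table 1. The IUPAC codes used (N = any base, S = G or C, W = A or T) are standard.

```python
from itertools import product

# Standard IUPAC degenerate codes (subset sufficient for this sketch).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG",    # strong: G or C (GC-biased position)
    "W": "AT",    # weak: A or T
    "N": "ACGT",  # fully random position
}


def expand(degenerate: str) -> list:
    """List every concrete sequence encoded by a degenerate pattern."""
    choices = [IUPAC[code] for code in degenerate.upper()]
    return ["".join(bases) for bases in product(*choices)]


def short_tail_primers(tail: str, priming_pattern: str) -> list:
    """Join the 5' universal tail to each expanded 3' priming portion."""
    return [tail + priming for priming in expand(priming_pattern)]


TAIL = "ACACGACGCT"  # hypothetical 5' universal tail
PRIMING = "NNSCG"    # hypothetical: two random bases, one G/C, fixed 3' CG

primers = short_tail_primers(TAIL, PRIMING)
print(len(primers), primers[0])
```

Under these assumptions the pool contains 4 x 4 x 2 = 32 primers, all sharing the same tail and the same 3’ CG.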

In some cases, the initial extension product is generated with a one pass (round) extension using the first primer. The amplification may have 1 cycle, 2 cycles, 3 cycles, 4 cycles, 5 cycles, 1-30 cycles, 2-25 cycles, 3-24 cycles, 4-23 cycles, 5-22 cycles, 6-21 cycles, 7-20 cycles, 8-19 cycles, 9-18 cycles, 10-17 cycles, 31-40 cycles, 41-50 cycles, 51-100 cycles, or more cycles of amplification. In some cases, the initial extension product is generated by the process of isothermal amplification. In some cases, the initial extension product is generated by a PCR reaction. In some cases, multiple cycles of extension may be carried out through repeated cycling of temperatures: annealing, extension, and denaturing.

In some cases, generation of the initial extension product may be carried out by any polymerase and/or any reverse transcriptase, or mix of different polymerases. In many cases, the polymerase is a DNA polymerase. In some cases, the polymerase may have a 3’ to 5’ exonuclease activity. In some cases, the DNA polymerase may be active at low temperature. The polymerase may contain a mix of different polymerases which may have 3’ to 5’ exonuclease activity, 5’ to 3’ exonuclease activity, and/or strand displacement activity. The polymerases that may be used to practice the methods disclosed herein include but are not limited to Deep VentR™ DNA Polymerase, Long Amp™ Taq DNA Polymerase, Phusion™ High-Fidelity DNA Polymerase, Phusion™ Hot Start High-Fidelity DNA Polymerase, VentR® DNA Polymerase, DyNAzyme™ II Hot Start DNA Polymerase, Phire™ Hot Start DNA Polymerase, Crimson LongAmp™ Taq DNA Polymerase, DyNAzyme™ EXT DNA Polymerase, LongAmp™ Taq DNA Polymerase, Taq DNA Polymerase with Standard Taq (Mg-free) Buffer, Taq DNA Polymerase with Standard Taq Buffer, Taq DNA Polymerase with ThermoPol II (Mg-free) Buffer, Taq DNA Polymerase with ThermoPol Buffer, Crimson Taq™ DNA Polymerase, Crimson Taq™ DNA Polymerase with (Mg-free) Buffer, VentR® (exo-) DNA Polymerase, Hemo KlenTaq™, Deep VentR™ (exo-) DNA Polymerase, ProtoScript® AMV First Strand cDNA Synthesis Kit, ProtoScript® M-MuLV First Strand cDNA Synthesis Kit, Bst DNA Polymerase, Full Length, Bst DNA Polymerase, Large Fragment, Taq DNA Polymerase with ThermoPol Buffer, 9° Nm DNA Polymerase, Crimson Taq™ DNA Polymerase, Crimson Taq™ DNA Polymerase with (Mg-free) Buffer, Deep VentR™ (exo-) DNA Polymerase, Deep VentR™ DNA Polymerase, DyNAzyme™ EXT DNA Polymerase, DyNAzyme™ II Hot Start DNA Polymerase, Hemo KlenTaq™, Phusion™ High- Fidelity DNA Polymerase, Phusion™ Hot Start High-Fidelity DNA Polymerase, Sulfolobus DNA Polymerase IV, Therminator™ y DNA Polymerase, Therminator™ DNA Polymerase, Therminator™ II DNA Polymerase, 
Therminator™ III DNA Polymerase, VentR® DNA Polymerase, VentR® (exo-) DNA Polymerase, Bsu DNA Polymerase, Large Fragment, Bst DNA Polymerase, Large Fragment, DNA Polymerase I (E. coli), DNA Polymerase I, Large (Klenow) Fragment, Klenow Fragment (3’→5’ exo-), phi29 DNA Polymerase, T4 DNA Polymerase, T7 DNA Polymerase (unmodified), Reverse Transcriptases and RNA Polymerases, AMV Reverse Transcriptase, M-MuLV Reverse Transcriptase, phi6 RNA Polymerase (RdRP), SP6 RNA Polymerase, and T7 RNA Polymerase.

In some cases, the initial extension product once generated may itself be used as a template by the first primer for additional amplification, generating a copy of the initial extension product. The copy of the initial extension product, and all subsequent copies, may be used as a template by the first primer for further amplification generating further copies of the initial extension product.

In many cases, the initial extension product may have at its 3’ end a sequence equivalent to the first primer, which may be a short-tailed primer, and its 5’ end may be complementary to the template used to generate the initial extension product. Copies of the initial extension product and further copies of the initial extension product may have at their 3’ ends a sequence equivalent to the first primer or a second primer, which may be a short-tailed primer, and their 5’ ends may be complementary to the first primer or a second primer, and between these may be sequence the same as the target polynucleotide or complementary to the target polynucleotide. In many cases, multiple initial extension products may be generated from one target polynucleotide. When a first primer is extended to produce an initial extension product, a second first primer may bind upstream; when this is extended by a polymerase with strand displacement activity, the first initial extension product can be displaced by the second initial extension product. This process may continue until a primer anneals and extends at the very 5’ end of the converted target polynucleotide, at which point its extension product cannot itself be displaced.

In some cases, the 3’ and/or 5’ ends of any initial extension products may be modified by the addition of an adaptor in a ligation-independent approach. The specific oligonucleotide may be an adaptor template oligonucleotide (ATO). The adaptor may be generated by a 3’ extension ATO reaction using any combination of the ATO designs and ATO reaction methods described herein or in WO2018/193233 A1. In the adaptor generation, an endonuclease which is able to recognise and remove uracil or other modified nucleotides including 2'-deoxyxanthosine, such as Thermostable Endonuclease Q, may be used in place of or in addition to uracil-DNA glycosylase (UDG).

In many cases, a 3’ extension reaction may comprise the use of one or more unusual nucleotides which may include any combination or mixture of 5-methyl-2'-deoxycytidine-5'-triphosphate, 5-hydroxymethyl-2'-deoxycytidine-5'-triphosphate, 5-formyl-2'-deoxycytidine-5'-triphosphate, 5-carboxy-2'-deoxycytidine-5'-triphosphate, 2'-deoxyuridine-5'-triphosphate and/or 2'-deoxyinosine-5'-triphosphate. In some cases, unusual nucleotides may comprise an (α-thio)-triphosphate in place of a triphosphate to produce polymerase extension products which are resistant to exonucleases due to the substitution of an oxygen atom with a sulphur atom.

In many cases the ATO may have the following design:

(a) a 3’ end with a blocker, which renders ATO non-extendible;

(b) 5’ to (a), one of the following, or any combination or components thereof (5’ to 3’):

(i) a random sequence of 3 to 36 ‘N’ bases;

(ii) a target specific region of 4 to 10 bases followed by a random region of 3 to 32 bases;

(c) a universal sequence, 5’ to (b);

(d) one or more moieties which renders the ATO degradable.
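The ATO design above can be sketched as a small data structure that checks the stated length ranges and writes the oligonucleotide out 5’ to 3’. This is an illustrative sketch only: the class name, the universal sequence and the choice to place uracil at every T of the universal sequence are assumptions, and the 3’ blocker is carried as a label rather than as a specific chemistry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ATO:
    universal: str             # (c) universal sequence, 5' to (b)
    target_specific: str       # optional part of (b); "" when absent
    n_random: int              # number of random 'N' bases in (b)
    blocker: str = "3'-block"  # (a) label only; chemistry not specified here

    def __post_init__(self):
        if self.target_specific:
            # (b)(ii): 4-10 specific bases followed by 3-32 random bases
            if not 4 <= len(self.target_specific) <= 10:
                raise ValueError("target specific region must be 4 to 10 bases")
            if not 3 <= self.n_random <= 32:
                raise ValueError("random region must be 3 to 32 bases")
        elif not 3 <= self.n_random <= 36:
            # (b)(i): random sequence only, 3-36 bases
            raise ValueError("random sequence must be 3 to 36 bases")

    def sequence(self, degradable: bool = True) -> str:
        """Return the ATO 5'->3' (without the blocker label)."""
        universal = self.universal
        if degradable:
            # (d) assumption: every T in the universal sequence is made as dU
            universal = universal.replace("T", "U")
        return universal + self.target_specific + "N" * self.n_random


# Hypothetical universal sequence with a six-base random 3' portion.
ato = ATO(universal="GTTCAGAGTTC", target_specific="", n_random=6)
print(ato.sequence(), ato.blocker)
```

Writing uracil into the universal sequence corresponds to the embodiment in which a dU-glycosylase removes the ATO after the extension reaction; ribonucleotides or deoxyinosine could be modelled in the same way.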

In one embodiment, the moieties are uracil nucleotides, wherein the agent is a dU-glycosylase, or a dU-glycosylase and apurinic/apyrimidinic endonuclease, which is capable of digesting/removing the ATO following the first extension reaction.

In some cases, a method for extending an extension product comprises:

(i) incubating the extension product with adaptor template oligonucleotides (ATO) having:

(a) a 3’ portion comprising a random and/or specific motif sequence;

(b) a 3’ end with a blocker, which renders the ATO non-extendible; and

(c) a universal sequence, 5’ to the 3’ portion,

wherein the incubation buffer comprises dNTPs, and wherein the target polynucleotides hybridise to the 3’ portion of the ATO;

(ii) performing a polymerase extension of the initial extension product using the ATO as a template and incorporating dNTPs, thereby producing an extended initial extension product having a 3’ universal sequence;

(iii) treating with an agent capable of digesting the ATO, for example an enzyme having a dU-glycosylase or 3’ exonuclease activity to digest ATO, or ribonuclease or Endonuclease V; and

(iv) generating an amplified initial extension product, wherein generating the first amplified polynucleotides comprises polymerase extension from primers hybridised to the 3’ universal sequence using the target polynucleotides as templates.
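Steps (i) and (ii) above can be illustrated with a short in silico sketch, in which the 3’ end of an extension product pairs with the 3’ portion of the ATO and is then extended across the universal sequence. All sequences and function names are hypothetical. The 3’ portion is written as one concrete member of what would in practice be a random pool, and uracil in the ATO is treated as templating A.

```python
# Complement table; U in the ATO templates an A in the extended strand.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def revcomp(seq: str) -> str:
    """Reverse complement, returned as DNA written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def ato_extend(extension_product: str, ato_universal: str, ato_3p_portion: str) -> str:
    """Extend the 3' end of extension_product using the ATO as template.

    The ATO is ato_universal + ato_3p_portion, written 5'->3'. The product
    must end with the reverse complement of the ATO 3' portion in order to
    hybridise; the polymerase then copies the universal sequence.
    """
    anchor = revcomp(ato_3p_portion)
    if not extension_product.upper().endswith(anchor):
        raise ValueError("3' end of product does not hybridise to the ATO")
    return extension_product.upper() + revcomp(ato_universal)


product_strand = "GGATCCTTACG"  # hypothetical initial extension product
extended = ato_extend(product_strand, ato_universal="GUUCAGAG", ato_3p_portion="CGTAA")
print(extended)
```

The sequence added to the 3’ end is the reverse complement of the ATO universal sequence, which is the site to which the primers of step (iv) hybridise once the ATO has been digested.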

In another embodiment, the moieties are ribonucleotides, wherein ribonucleotides are incorporated during oligo synthesis into the ATO in the place of any nucleotides or all nucleotides; wherein the agent is a ribonuclease, which is capable of digesting/removing the ATO following the first extension reaction.

In another embodiment, the moieties are deoxyinosine, wherein deoxyinosine is incorporated during oligo synthesis into the ATO in the place of any nucleotides or all nucleotides; wherein the agent is an enzyme, which is capable of digesting/removing the ATO following the first extension reaction. The enzyme may be an endonuclease, which may be Endonuclease V, or a glycosylase, which may be human alkyladenine DNA glycosylase.

The ATO may be an RNA oligo, or a DNA oligo, or a combination of DNA and RNA oligos.

The ATO may be a combination of one or more different ATOs. The combined ATOs may vary in sequence, in design and in function. Herein the term ‘ATO’ may refer to a combination of one or more ATOs, to any sequence of ATO, to any design of ATO with any combination of ATO design features, and to any combination of ATOs with any combination of functions. When a combination of ATOs is used there may be variation within the universal sequence of the ATOs used; the term universal sequence is used in these cases as well.

The universal sequence of ATO may be double-stranded or partially double-stranded.

Having the universal sequence protected as a double-stranded region prevents hybridisation with the randomised or sequence-specific 3’ ends of the target polynucleotide or initial extension product and the ATO. In one embodiment, the ATO comprises a 5’ stem portion sequence which is complementary or partially complementary to all or part of the universal sequence, which are capable of forming a stem-loop structure or split stem-loop structure. Alternatively, the loop part may not comprise a non-copiable linkage. Alternatively, the stem part may comprise a non-copiable linkage. If the 5’ of the stem portion comprises an additional sequence, a non-copiable linkage may be present between the stem portion and the additional sequence. Alternatively, the stem part may not comprise a non-copiable linkage. In another embodiment, the 5’ stem portion comprises a non-copiable linkage. The non-copiable linkage may be selected from the group including, but not limited to, a C3 Spacer phosphoramidite, a triethylene glycol spacer, an 18-atom hexaethyleneglycol spacer, or 1’,2’-Dideoxyribose (dSpacer).

The double-stranded stem part may comprise non-complementary region(s), wherein the non-complementary region(s) in the universal sequence strand comprise a random sequence, a degenerate sequence, or a specially designed mismatch. The stem portion may form two or more split sections separated by one or more non-copiable linkage(s). The stem portion may form two or more split sections separated by one or more regions of mismatched base pairs.

The ATO may further comprise a specific sequence 5’ and/or 3’ of the random sequence, wherein the specific sequence is capable of hybridising to a specific place of the initial extension product, or is a specific sequence which is not designed for a specific target, and part of the 3’ random/degenerate sequence serves as a template on which the polynucleotide is extended by a polymerase.

In some cases, the 3’ ends of the converted target polynucleotide or initial extension product anneal with the ATO in a way which allows them to be immediately ligated to the 5’ or 3’ ends of an ATO without the need for an extension reaction. In some cases, the 5’ end of the ATO may comprise a phosphate group and the 3’ end of the initial extension product strand may comprise an OH group, where the ATO has a 3’ overhang with sequences capable of directing annealing to the 3’ or 5’ end of any initial extension product. In some cases, the 3’ end of the ATO may comprise an OH group and the 5’ end of the initial extension product strand may comprise a phosphate group, where the ATO has a 5’ overhang with sequences capable of directing annealing to the 3’ or 5’ end of any initial extension product. In the embodiments where the extension is used without ligation, the 5’ end of the initial extension product strand does not comprise a phosphate group and the 3’ end of the upper separate strand does not comprise a biotin, but the upper separate strand may comprise nucleotides which can be digested, such as uracil nucleotides. Any ATO reaction may comprise ligation, with or without an extension, wherein if the 3’ ends are extended, a polymerase extends the 3’ end of the initial extension product, and a ligase ligates the extended target sequence to the 5’ stem portion of the ATO or the upper separate strand of the ATO.

In one embodiment, the first ATO reaction is a primer extension reaction, wherein the converted target polynucleotide or initial extension product serves as a primer that is extended on the ATO template by a DNA polymerase. The DNA polymerase may comprise a strand displacement activity or 5’ to 3’ exonuclease activity, wherein during the extension the stem-loop structure is opened or the upper ATO strand is displaced or digested. Any polymerase can be used, for example Klenow exo-, Bst polymerase, or T4 DNA polymerase.

In another embodiment, the first reaction is an extension-ligation reaction, wherein a DNA polymerase extends the target and a DNA ligase ligates the extended target sequence to the 5’ stem portion of ATO or upper strand of ATO. Any DNA polymerase and DNA ligase can be used, for example, Klenow large fragment, T4 DNA ligase.

In another embodiment, the first reaction is a ligation reaction, wherein a DNA ligase ligates the target polynucleotide to the 5’ stem portion of ATO or upper strand of ATO. Any DNA ligase can be used, for example T4 DNA ligase.

In some embodiments, the specific oligonucleotide is a splint or bridge oligonucleotide; in other words, the adaptor comprises two strands: the top adaptor strand, which ligates to the single-stranded target sequence, and the splint, which bridges the top adaptor strand and the single-stranded target sequence. The ligation reaction uses a splint or bridge oligonucleotide to guide the ligation of a single-stranded adaptor to single-stranded target polynucleotides, which may be initial extension products comprising short-tail primer extension products and/or copies (complementary strands) of the short-tail primer extension products. The splint oligo may have at its 5’ or 3’ end a sequence capable of annealing to the 3’ or 5’ end of the target polynucleotides and at the opposite 3’ or 5’ end a sequence to which the 5’ or 3’ end of the adaptor can anneal. When ligation occurs at the 5’ end of the target polynucleotides, this 5’ end may have a phosphate group, which may be present at the 5’ end of a first primer or be added enzymatically. When ligation occurs at the 3’ end of the target polynucleotides, this 3’ end may have an OH group, which may be present on the initial extension product. When ligation occurs at the 5’ end of the target polynucleotides, the 3’ end of the adaptor may have an OH group, which may be present from the manufacturing of the oligo. When ligation occurs at the 3’ end of the target polynucleotides, the 5’ end of the adaptor may have a phosphate group, which may be present from the manufacturing of the oligo or can be added enzymatically.
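The splint-guided ligation at the 3’ end of a target can be sketched as a simple check that the splint is complementary to the junction formed by the target 3’ end and the adaptor 5’ end. This is a hedged illustration only: the sequences, the arm length and the function names are hypothetical, and perfect complementarity is assumed to be required.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def splint_ligate(target, adaptor, splint, arm=4, adaptor_5p_phosphate=True):
    """Return target + adaptor if the splint bridges the junction, else None.

    The 3' arm of the splint pairs with the last `arm` bases of the target
    (which carries a 3' OH); the 5' arm of the splint pairs with the first
    `arm` bases of the adaptor (which must carry a 5' phosphate).
    """
    if not adaptor_5p_phosphate:
        return None  # a ligase cannot join a 3' OH to an unphosphorylated 5' end
    junction = target[-arm:] + adaptor[:arm]
    if revcomp(junction) != splint.upper():
        return None  # splint does not anneal across the junction
    return target + adaptor


target = "ACGTTGCATCGA"  # hypothetical extension product, 5'->3'
adaptor = "GGCTAAGTCC"   # hypothetical single-stranded adaptor, 5'->3'
splint = revcomp(target[-4:] + adaptor[:4])
print(splint, splint_ligate(target, adaptor, splint))
```

In the tail-guided method, the target 3’ end recognised by the splint would correspond to the copy of the 5’ universal tail of the short-tail primer, so that a single splint sequence can serve all extension products.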

The method may further comprise extending a second primer hybridized to the extended initial extension product, thereby generating a copied extended initial extension product, wherein the second primer comprises a target-specific portion or universal sequence, or both 3’ target specific and 5’ universal sequence.

The method may further comprise generating a tagged copy of a copied extended initial extension product, wherein the 3’ end of the copied extended initial extension product anneals to the 3’ sequence of an adaptor template oligonucleotide (second ATO) in an enzymatic second ATO reaction, in which the 3’ end of the copied extended initial extension product is extended using the second ATO as a template, wherein the second ATO comprises a 5’ universal sequence different from that of the first ATO so that both ends are unique.

The method may further comprise exponential amplification using a first primer and a second primer. The first primer may be a universal primer which can anneal to the sequence, or copies thereof, added to the 3’ extended initial extension product; the second primer may be a universal primer which can anneal to the sequence, or copies thereof, added to the 3’ extended copied extended initial extension product. Alternatively, the second primer is a target specific primer annealing to a specific region of interest of the amplified product. The second primer may be a set of multiple primers targeting multiple sequence regions of interest. When the second primer is a target-specific primer, after linear or exponential amplification using the second primer, a nested target-specific third primer is used for a further amplification.

The first primer may comprise a sample barcode (SBC) sequence and additional 5’ universal sequence compatible for a NGS platform.

A kit comprises the composition described above.

A kit for generating a library of polynucleotides comprises a short-tail primer, an adaptor template oligonucleotide (ATO) or splint oligonucleotides described above, polymerase and primers compatible to NGS platform.

The methods of the present invention can substantially improve both the accuracy and sensitivity of testing of a single patient sample. The approach allows for enrichment of methylated DNA molecules from a population of DNA templates.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 depicts a schematic of an illustrative embodiment. A first primer, which may be a short-tailed primer, hybridises to a target polynucleotide which may be an original or a converted target polynucleotide. The first primer is then extended by a polymerase, creating an initial extension product. A first primer or a second primer then hybridises to an initial extension product and is extended, creating a copy of an initial extension product. An adaptor is then added to the 3’ end of the copy of the initial extension product. The 3’ end sequence of the extension product corresponding to the 5’ universal tail sequence of the short-tail primer serves as a guide to mediate the addition of the adaptor sequence, either by ligation or by extension using an ATO as template.

FIG. 2 depicts a schematic of an illustrative embodiment. A selection of designs of first primers is depicted. A first primer may be a short-tailed primer, which may comprise a 5’ universal tail portion and a 3’ priming portion. A 3’ priming portion may be a target specific sequence. A 3’ priming portion may be a random sequence. A 3’ priming portion may be a combination of random sequence and target specific sequence. A first primer may have uracil or inosine between the 5’ universal tail portion and 3’ priming portion.

FIG. 3 depicts a schematic of an illustrative embodiment. A) An example for the guided addition of an adaptor to the initial extension product or copy of the initial extension product using a 3’ extension reaction which is an ATOM-Seq reaction which may be completed by any detailed method. B) An example for the guided addition of an adaptor to the initial extension product or copy of the initial extension product using a splint or bridge oligo and a single strand adaptor which is ligated to the 5’ or 3’ end of the target polynucleotides which may be the initial extension product or the copies of the initial extension product.

FIG. 4 depicts results of an implementation of the invention. Three different masses of starting material, 50 ng, 100 ng and 200 ng, were used as described in example 1.

FIG. 5 depicts the global methylation patterns for 12 healthy samples and 5 lung cancer samples, visualised via clustering over all methylation patterns in a cluster dendrogram.

EXAMPLES

Table 1 : Details of all Oligos

Example 1

Deoxyribonucleic acid (DNA) is used as the target polynucleotide for conversion using an agent, followed by its use as a template for generating an initial extension product. The initial extension product is then used for guided adaptor addition by a 3’ extension ATO reaction. This is followed by generation of a copy of the extended initial extension product, whose 3’ end is also used for guided adaptor addition by a 3’ extension ATO reaction. Finally, whole sample amplification produces a library for NGS.

Materials

EpiTect Fast Bisulfite Kit (Qiagen)

Target polynucleotide, human gDNA

Klenow Fragment (NEB)

dNTP Solution (NEB)

NEBuffer 2 (NEB)

Oligos: Table 1

AMPure XP beads (Beckman Coulter)

NEBNext Q5 Master Mix (NEB)

Method

1. Bisulfite Conversion

A total of 50, 100, or 200 ng of DNA was bisulfite converted using an EpiTect Fast Bisulfite Kit following the manufacturer's recommended protocol. The final product was eluted in 19.5 µl of molecular biology grade water.

2. Generation of initial extension product

The converted target polynucleotide was used as a template for targeted enrichment of CG containing DNA. A mix was prepared comprising 17.5 µl of converted target polynucleotide, oligo 1-001 or 1-007, and buffer. The mixture was heated to 98°C for 2 min followed by 10°C for 1 min and held at 10°C. To this Klenow Fragment was added. This was then thermocycled as follows: 10°C for 1 minute, 26°C for 6 minutes, 30°C for 10 minutes, 65°C for 1 minute. The volume of the product was increased to 50 µl by the addition of 28 µl of water, followed by purification using AMPure XP beads with a 2x volume of beads following the manufacturer's recommendations, and was eluted in 18.5 µl of molecular biology grade water.

3. Guided adaptor addition on initial extension product

The purified initial extension product was then used for guided adaptor addition by mixing 16.5 µl of the purified initial extension product with 2.0 µl of oligo 1-003 and 1.5 µl of buffer from step 2, which was thermocycled as per step 2, followed by addition of enzymes and further cycling as per step 2. The reaction mix was then supplemented with 2.0 µl of treatment enzyme mix and incubated at 37°C for 15 minutes.

4. Generation of copied extended initial extension product

The entire volume was then combined with 2.0 µl of oligo 1-002 and 26 µl of Q5 Master Mix. This mix was then thermocycled at 98°C for 30 seconds followed by 6 cycles of 98°C for 10 seconds, 65°C for 75 seconds and 72°C for 2 minutes and finally 65°C for 2 minutes. The product was purified using AMPure XP beads with a 2x volume of beads following the manufacturer's recommendations and was eluted in 15 µl of molecular biology grade water.

5. Guided adaptor addition on copied initial extension product

The purified copied initial extension product was then used for guided adaptor addition. First, 13 µl of the purified copied initial extension product was mixed with oligo 1-006 and incubated at 65°C for 2.5 minutes and 10°C for 1 minute. To this was added 0.35 µl of dNTP, 2 µl of NEBuffer 2 and 1.0 µl of Klenow Fragment. This mix was incubated at 10°C for 1 minute, 26°C for 6 minutes, 30°C for 10 minutes, 65°C for 1 minute, 10°C for 1 minute, 26°C for 6 minutes, and 30°C for 10 minutes. The reaction mix was then supplemented with 2.0 µl of treatment enzyme mix and incubated at 37°C for 15 minutes.

6. Whole Sample PCR

The whole of the product of step 5 was mixed with 25 µl of Q5 Master Mix and 1.5 µl each of oligo 1-004 and 1-005 and incubated at 98°C for 30 seconds, followed by 15 cycles of 98°C for 10 seconds, 60°C for 30 seconds, 65°C for 75 seconds, and a final extension of 65°C for 2 minutes. The product was purified using AMPure XP beads with a 2x volume of beads following the manufacturer's recommendations and was eluted in 30 µl of molecular biology grade water. The final bead purified products were visualised on a Bioanalyzer high sensitivity DNA chip, as shown in figure 4.
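The cycling programme of step 6 can be written down as data, which makes the total programmed hold time easy to check. The following is a sketch only: the class and function names are hypothetical, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # block temperature in degrees Celsius
    seconds: int   # programmed hold time


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times, excluding instrument ramping."""
    return (
        sum(step.seconds for step in initial)
        + n_cycles * sum(step.seconds for step in cycle)
        + sum(step.seconds for step in final)
    )


# Whole sample PCR as described in step 6 of example 1.
WHOLE_SAMPLE_PCR = {
    "initial": [Step(98, 30)],
    "cycle": [Step(98, 10), Step(60, 30), Step(65, 75)],
    "n_cycles": 15,
    "final": [Step(65, 120)],
}

total = total_hold_seconds(**WHOLE_SAMPLE_PCR)
print(total, "s =", total / 60, "min")
```

For the programme as written this gives 30 + 15 x 115 + 120 = 1875 seconds, or 31.25 minutes of hold time.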

Example 2

Deoxyribonucleic acid (DNA) is used as the target polynucleotide for conversion using an agent, followed by its use as a template for generating an initial extension product. The initial extension product is then used for guided adaptor addition by splint ligation of an adaptor.

Materials

EpiTect Fast Bisulfite Kit (Qiagen)

Target polynucleotide, human gDNA

Klenow Fragment (NEB)

dNTP Solution (NEB)

NEBuffer 2 (NEB)

DNA ligase

Oligos: Table 1

AMPure XP beads (Beckman Coulter)

NEBNext Q5 Master Mix (NEB)

Method

1. Bisulfite Conversion

The same as in method 1.

2. Generation of initial extension product

The same as in method 1.

3. Guided adaptor addition on extension product

The purified extension product was then used for guided adaptor ligation by mixing 16.5 µl of the purified extension product with 2.0 µl of splint oligo 2-008 and single-stranded adaptor oligo 2-009 (5’ phosphorylated). The enzymatic ligation was carried out under conventional conditions. The 1x T4 DNA ligase buffer ([MgCl₂] = 10 mM, [ATP] = 500 µM, [DTT] = 10 mM, and [Tris-HCl] = 40 mM), provided by the manufacturer, was directly employed. The ligation was performed at 20°C for 12 h, and terminated by heating the mixture at 65°C for 10 min.

4. Amplification of the adaptor ligated product

The adaptor ligated product was mixed with a primer which hybridises to the adaptor sequence, and primer extension and amplification were performed using standard methods. The amplified products were subsequently prepared for sequencing.

Example 3

Deoxyribonucleic acid (DNA) from clinical samples is used as the target polynucleotide for conversion using an agent, followed by its use as a template for generating an initial extension product. The initial extension product is then used for guided adaptor addition by a 3’ extension ATO reaction. This is followed by generation of a copy of the extended initial extension product, whose 3’ end is also used for guided adaptor addition by a 3’ extension ATO reaction. Finally, whole sample amplification produces a library for NGS, which is sequenced and analysed for enrichment of CpG dinucleotides.

Materials

EpiTect Fast Bisulfite Kit (Qiagen)

Target polynucleotide, human gDNA

Klenow Fragment (NEB)

dNTP Solution (NEB)

NEBuffer 2 (NEB)

Oligos: Table 1

AMPure XP beads (Beckman Coulter)

NEBNext Q5 Master Mix (NEB)

Method

1. Bisulfite Conversion

A total of 200 ng of DNA from non-cancer or cancer samples was bisulfite converted as in example 1.

2. Generation of initial extension product

As in example 1.

3. Guided adaptor addition on initial extension product

As in example 1.

4. Generation of copied extended initial extension product

As in example 1.

5. Guided adaptor addition on copied initial extension product

As in example 1.

6. Whole Sample PCR

As in example 1.

7. Sequencing and Data analysis

Sequencing data were trimmed to remove adaptors using fastp and mapped to the human genome using Bismark. The mapping rate was determined using samtools, which showed that between 66.11% and 85.48% of reads could be mapped to the human genome. Deduplication showed that between 67.01% and 91.91% of reads were unique, with a total of between 6.9 M and 46.3 M reads for each sample. Intersecting reads with the genomic positions of CpG islands showed that between 1.69% and 3.94% of unique reads mapped to a CpG island, with between 91.29% and 98.6% of CpG islands covered. Reads were between 1.6% and 3.9% CpG dinucleotides which, given that approximately 1% of the human genome is CpG, results in an enrichment of between 1.6x and 2.9x relative to randomly captured DNA. Table 2 summarises these data for 23 samples. The global methylation patterns for 12 healthy samples and 5 lung cancer samples were visualised by clustering over all methylation patterns in a cluster dendrogram (Fig. 5). This demonstrated a clear separation of the healthy and cancer samples.
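The CpG enrichment figure described above can be illustrated with a minimal sketch. It assumes that the CpG content of a read set is measured as the fraction of all dinucleotide positions that are "CG", and that enrichment is that fraction divided by the approximately 1% genomic baseline quoted above. The function names and the toy reads are hypothetical, and the actual analysis may have been computed differently (for example on the reference sequence at mapped positions).

```python
# Illustrative CpG-content calculation; reads and names are hypothetical.
from typing import Iterable


def cpg_fraction(reads: Iterable[str]) -> float:
    """Fraction of dinucleotide positions across all reads that are 'CG'."""
    cpg = 0
    total = 0
    for read in reads:
        seq = read.upper()
        positions = max(len(seq) - 1, 0)
        total += positions
        cpg += sum(1 for i in range(positions) if seq[i:i + 2] == "CG")
    return cpg / total if total else 0.0


def fold_enrichment(observed: float, baseline: float = 0.01) -> float:
    """Observed CpG fraction relative to an assumed genomic baseline (about 1%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return observed / baseline


# Toy input: three CG dinucleotides in fourteen dinucleotide positions.
toy_reads = ["ACGTCGAA", "TTTTCGTT"]
toy_fraction = cpg_fraction(toy_reads)
toy_enrichment = fold_enrichment(toy_fraction)
```

The toy reads are deliberately CpG-rich to keep the arithmetic easy to follow; real libraries would give much smaller fractions.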

Table 2. Sequencing results showing enriched CG contents.

Claims

1. A method for adding adaptor sequences to target polynucleotides in a sample comprising: a) providing a reaction mixture comprising at least one primer (short-tail primer) which comprises a 3’ priming portion and a 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotides and priming extension, wherein the 5’ universal tail portion or its complementary strand is capable of mediating guided addition of adaptor sequence to primer extension products, b) performing at least one round of extension reaction, wherein the extension reaction comprises hybridisation of primers to target polynucleotide and extension under extension condition to produce primer extension product, wherein the primers comprise short-tail primers, and c) adding adaptor sequence to the extension product, which is mediated by hybridising a specific oligonucleotide to the 3’ end and/or the 5’ end of the extension product, wherein the specific oligonucleotide comprises sequence identical or complementary to the 5’ universal tail portion of the short-tail primer.

2. A method according to claim 1, wherein the at least one round of extension is two or more rounds, wherein the first round produces extension products, the second or further round produces copies of extension products, wherein the specific oligonucleotide is capable of hybridising to 3’ end sequence of the extension product, which is the complementary strand of the 5’ universal tail sequence of the short-tail primer in the extension product.

3. A method according to claim 1, wherein the at least one round of extension reaction is multiple rounds of extension by isothermal amplification with a polymerase having strand displacement activity.

4. A method according to claim 3, wherein the polymerase is selected from the group of Klenow, Bst polymerase, φ29 DNA polymerase.

5. A method according to claim 1, wherein the at least one round of extension reaction is multiple rounds of extension which is PCR reaction.

6. A method according to claim 1 or 2, wherein the specific oligonucleotide is an adaptor template oligonucleotide (ATO), wherein adding adaptor sequence to the extension product is performed by extension of the 3’ end of the extension product wherein the 3’ end of the extension product acts as primer using ATO as template.

7. A method according to claim 1 or 2, wherein the specific oligonucleotide is a splint oligonucleotide which is capable of hybridising to the 3’ end sequence of the extension product and an adaptor oligonucleotide, wherein adding adaptor to the extension product comprises hybridising splint oligonucleotide to the 3’ or 5’ end of the extension product and to the adaptor oligonucleotide, and performing ligation between 3’ or 5’ end of extension product and 5’ or 3’ end of the adaptor oligonucleotide.

8. A method according to claim 1, wherein the 5’ universal tail portion is between 2 and 12 nucleotides long.

9. A method according to claim 1, wherein the 5’ universal tail portion is between 2 and 9 nucleotides long.

10. A method according to claim 1, wherein the 5’ universal tail portion is between 2 and 6 nucleotides long.

11. A method according to claim 1, wherein the 5’ universal tail portion is between 2 and 4 nucleotides long.

12. A method according to claim 1, wherein the short-tail primer comprises degradable nucleotides.

13. A method according to claim 12, wherein the degradable nucleotides are uracil nucleotides.

14. A method according to claim 1, wherein the 3’ priming portion comprises target specific sequence.

15. A method according to claim 1, wherein the 3’ priming portion comprises random sequence.

16. A method according to claim 1, wherein the 3’ priming portion comprises random and target specific sequence.

17. A method according to claim 1, wherein the target polynucleotides are bisulfite or enzymatically converted DNA, wherein the cytosines in the original DNA are converted to uracils.

18. A method according to claim 1, further comprising d) performing amplification using primers which are capable of hybridising to the adaptor sequences.

19. A kit for preparing a sequencing library comprising, a) at least one primer (short-tail primer) with 3’ priming portion and 5’ universal tail portion, wherein the 3’ priming portion is capable of hybridising to target polynucleotide and priming extension, wherein the 5’ universal part of the extension product or 3’ end of the extension product which is derived from 5’ universal tail portion of the primer is capable of mediating guided addition of adaptor sequence to the extension products, and b) adaptor template oligonucleotide (ATO) or splint oligonucleotide capable of hybridising to the 3’ end of the extension product, wherein the ATO or splint oligonucleotide comprise sequence identical or complementary to the 5’ universal tail portion of the short-tail primer.