CN111748613A - Design method and preparation method of double-label joint - Google Patents
- Publication number
- CN111748613A
- Authority
- CN
- China
- Prior art keywords
- sequence
- double
- joint
- sequencing
- primer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
Abstract
The invention discloses a design method and a preparation method of a double-label joint (a sequencing adapter carrying two sample labels). The invention provides a kit for constructing a sequencing library of DNA molecules to be detected, comprising a double-sample label joint. The double-sample label joint is formed by annealing a joint sequence L and a joint sequence S, and one end of the joint is used for ligation to the DNA molecule to be detected. The invention has the following advantages: 1) by introducing new sample labels at both ends of the inserted DNA fragment, the number of sequencing primer additions is reduced, which lowers the sequencing cost; 2) double-sample labeling can also be achieved for single-ended sequencing projects without increasing the sequencing cost, which avoids the false positives caused by sample label crosstalk. The joint design therefore allows both sample labels to be read in single-ended sequencing and allows erroneous sequencing data generated by sample label crosstalk to be filtered out.
Description
Technical Field
The invention belongs to the technical field of biology, and particularly relates to a design method and a preparation method of a double-label joint.
Background
Currently, high-throughput sequencing has become an important gene detection technology and is widely applied in scientific research, medical testing, agricultural breeding, forensic identification and other fields. The current mainstream providers of high-throughput sequencing technology include Illumina, Thermo Fisher, PacBio, Oxford Nanopore in the UK, and BGI in China. To reduce the average sequencing cost per sample, libraries from multiple samples are in most cases pooled and sequenced together in one run. During library construction a sample label (index) is added to each sample, so that the sequencing data can be split by sample according to the label, which ultimately achieves high throughput and low cost. Sample labels have become an integral part of high-throughput sequencing.
In practice, sample label crosstalk (index cross-talk or index switching) is frequently encountered: reads from other samples are found in the data assigned to a given sample label, which reduces the accuracy of the sequencing data. For example, false positive results occur in pathogenic microorganism detection and in low-frequency tumor mutation detection, and RNA quantification becomes inaccurate. The main causes of sample label crosstalk include contamination during synthesis of the library joints, contamination during library construction, contamination during target-region capture, contamination during pre-amplification before sequencing, misreading of sample labels during sequencing, and sample residue left in the fluidic lines between two sequencing runs.
At present, the main solution to sample label crosstalk is double-label sequencing: sample labels are introduced at both ends of the DNA to be detected, and during data analysis only reads in which both labels are correct proceed to the next analysis step. In this way sample label crosstalk can be greatly reduced or even avoided.
The current double-sample label library structure is shown in FIG. 1. A sequencing primer binding region lies between each sample label and the DNA to be detected; Illumina uses this scheme to filter sample data. After the library is loaded onto the sequencing chip, read1 and index1 are sequenced from one end. When that sequencing is complete, the template for second-end sequencing is copied and synthesized, and the sequencing primers for read2 and index2 are then added separately with that strand as the template, so that both sample label sequences are finally read. If the two sample labels do not match the experimental design, the corresponding reads are deleted, which filters out the sample label crosstalk data.
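The filtering rule described above (keep a read only when both sample labels match the experimental design) can be sketched as follows. This is an illustrative sketch, not code from the patent: the `Read` type, the function name and the index sequences are all hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Read(NamedTuple):
    """One sequencing read with its two sample labels (hypothetical layout)."""
    name: str
    index1: str
    index2: str
    sequence: str


def filter_by_label_pair(reads: Iterable[Read],
                         expected_pairs: Set[Tuple[str, str]]) -> List[Read]:
    """Keep only reads whose (index1, index2) pair is part of the experimental design."""
    return [r for r in reads if (r.index1, r.index2) in expected_pairs]


# Hypothetical design: one sample carrying the label pair ("ACGTAC", "TTGCAA").
expected = {("ACGTAC", "TTGCAA")}
reads = [
    Read("r1", "ACGTAC", "TTGCAA", "GATTACA"),  # both labels as designed
    Read("r2", "ACGTAC", "CCGGTT", "GATTACA"),  # second label unexpected: crosstalk
]
kept = filter_by_label_pair(reads, expected)
print([r.name for r in kept])
```

A read with one correct and one unexpected label is discarded rather than reassigned, which is what removes crosstalk reads from downstream analysis.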
The existing double-sample label design has the following defects: 1) acquiring the data for both sample labels requires adding index sequencing primers twice, which increases the sequencing cost; 2) because the sequencing templates for the two sample labels are the two opposite strands of the DNA library, a template strand must be synthesized before the second sample label is sequenced, which increases the sequencing time; 3) current double-sample label designs are not compatible with single-ended sequencing.
Disclosure of Invention
To overcome the defects of the existing double-sample label design, the invention provides the following technical scheme.
The invention provides a kit for constructing a sequencing library of DNA molecules to be detected, comprising a double-sample label joint;
the double-sample label joint is formed by annealing a joint sequence L and a joint sequence S; one end of the double-sample label joint is used for ligation to the DNA molecule to be detected;
the double-sample label joints ligated to the two ends of the DNA molecule to be detected are the same.
The DNA molecule to be detected can be sticky-ended or blunt-ended; a blunt-ended DNA molecule to be detected can be ligated to the double-sample label joint after an A is added (A-tailing).
The joint sequence L comprises, in order from the end near the DNA molecule to be detected, a region A that is complementary to the joint sequence S and a region C that is not complementary to the joint sequence S;
the region A consists, in order from the end near the DNA molecule to be detected, of a second sample label sequence and a fragment B used for annealing;
the region C carries the binding region of the primer PF of the library building primer pair;
the joint sequence S comprises, in order from the end near the DNA molecule to be detected, a region D that is complementary to the joint sequence L and a region E that is not complementary to the joint sequence L;
the region D consists, in order from the end near the DNA molecule to be detected, of the complementary sequence of the second sample label sequence and the complementary sequence of the fragment B;
and the region E comprises the binding region of the primer PR of the library building primer pair.
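The region layout above can be sketched as a small assembly routine. This is a minimal sketch under stated assumptions: the end of L near the DNA molecule is taken to be its 5' end, so region D ends up at the 3' end of S, and all four input sequences are hypothetical placeholders rather than sequences from the patent.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_joint(tag2: str, fragment_b: str, region_c: str, region_e: str):
    """Assemble joint sequences L and S, both written 5'->3'.

    L = region A (second sample label + fragment B) followed by region C.
    S = region E followed by region D, where region D is the reverse
        complement of region A, so the complement of the label sits at
        the 3' end of S, next to the DNA molecule to be detected.
    """
    region_a = tag2 + fragment_b
    joint_l = region_a + region_c
    joint_s = region_e + reverse_complement(region_a)
    return joint_l, joint_s


# Hypothetical placeholder regions, for illustration only.
joint_l, joint_s = build_joint("CAGT", "GGATTCAA", "TTTTCCCC", "AAAAGGGG")
print(joint_l)
print(joint_s)
```

Annealing L and S then pairs region A with region D (the stem of the joint), while regions C and E remain single-stranded.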
The kit also comprises the library building primer pair;
the library building primer pair consists of the primer PF and the primer PR;
the primer PR comprises, from the 5' end, a first sample label sequence and a region that binds to the region E.
Alternatively, in the kit, the region E comprises, in order from the end near the DNA molecule to be detected, a first sample label sequence and the binding region of the primer PR of the library building primer pair;
the kit also comprises the library building primer pair;
the library building primer pair consists of the primer PF and the primer PR;
the primer PR contains a region that binds to the region E and does not contain the first sample label sequence.
In the kit, the length of the second sample label sequence is 3 nt or more;
or the length of the second sample label sequence is 3-10 nt. The second sample label may be any combination of 3 or more bases; more than 10 bases is not recommended because a large amount of sequencing data would be wasted.
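The trade-off behind the 3-10 nt range can be illustrated with simple arithmetic: a label of n bases offers at most 4^n distinct sequences, but every label base read in Reads1 is a base not spent on the insert. The sketch below assumes a 100-base single-ended read, as used in Example 2; the function names are hypothetical.

```python
def label_capacity(length: int) -> int:
    """Upper bound on the number of distinct labels of the given length."""
    return 4 ** length


def read_fraction_spent(label_length: int, read_length: int) -> float:
    """Fraction of a read consumed by the label rather than the insert."""
    return label_length / read_length


for n in (3, 4, 6, 10):
    print(n, label_capacity(n), f"{read_fraction_spent(n, 100):.0%}")
```

In practice the usable number of labels is lower than 4^n, because labels are normally chosen to differ from each other at several positions.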
In the above kit, the double-sample label joint has a bubble-shaped or Y-shaped structure, or may have another structure; any design that introduces new sample labels adjacent to both ends of the inserted DNA falls within the scope of the present invention.
In embodiments of the invention, the structure is Y-shaped: the complementary region of the two joint sequences forms the stem of the Y, and the non-complementary regions form the fork of the Y.
The other end of the double-sample label joint is bubble-shaped or is a free, non-complementary double-stranded structure; or, the terminal base of the joint sequence L at the end near the DNA molecule to be detected is phosphorylated.
In the kit, the double-sample label adaptor is formed by annealing an adaptor sequence L shown in a sequence 1 and an adaptor sequence S shown in a sequence 3;
or the double-sample label joint is formed by annealing a joint sequence L shown in a sequence 2 and a joint sequence S shown in a sequence 4;
or the pair of the library-establishing primers consists of a primer shown in a sequence 5 and a primer shown in a sequence 6 or 7.
Another purpose of the invention is to provide a method for constructing a sequencing library of DNA molecules to be detected using the kit.
The method provided by the invention has the following features:
when the double-sample label is introduced during library construction, the second sample label sequence of the double-sample label lies between the DNA molecule to be detected and the sequencing primer binding region;
or, when the double-sample label is introduced during library construction, the second sample label sequence of the double-sample label is adjacent to both ends of the DNA molecule to be detected.
The method comprises the following steps:
1) ligating the double-sample label joint to the DNA molecule to be detected to obtain a ligation product;
the DNA molecule to be detected can be sticky-ended or blunt-ended; a blunt-ended DNA molecule to be detected can be ligated to the double-sample label joint after A-tailing;
2) amplifying the ligation product with the library building primer pair to obtain the sequencing library of DNA molecules to be detected, in which the second sample label sequence is adjacent to both ends of the DNA molecule to be detected.
The application of the kit or the method in constructing a DNA molecule sequencing library to be detected is also within the protection scope of the invention;
or, the application of the kit or the method in single-ended sequencing of the DNA molecule to be detected is also within the protection scope of the invention;
or, the application of the kit or the method in double-end sequencing of the DNA molecule to be detected is also within the protection scope of the invention;
or, the application of the double-sample tag adaptor and the library-building primer corresponding to the double-sample tag adaptor in the construction of a DNA molecule sequencing library to be detected is also within the protection scope of the invention;
or, the application of the double-sample label joint and the corresponding library-establishing primer in single-ended sequencing of the DNA molecule to be detected is also within the protection scope of the invention;
or, the application of the double-sample tag adaptor and the corresponding library-establishing primer in double-end sequencing of the DNA molecule to be detected is also within the protection scope of the invention.
In the above application, the single-ended sequencing is non-invasive prenatal gene sequencing, pathogenic microorganism gene sequencing or RNA sequencing.
The invention realizes the sequencing of the double-sample label by adopting lower sequencing cost and sequencing time, and particularly realizes the high-efficiency filtration of the sample label crosstalk data on single-ended sequencing projects, such as noninvasive prenatal gene detection, pathogenic microorganism detection and the like, under the condition of not increasing the sequencing cost.
The invention has the following advantages: 1) by introducing new sample labels at two ends of the inserted DNA fragment, the adding times of the sequencing primer are reduced, and the sequencing cost is reduced; 2) the invention can also realize double-sample labeling for the single-ended sequencing project under the condition of not increasing the sequencing cost, thereby avoiding the false positive problem caused by sample label crosstalk. The design scheme of the joint can meet the requirement that double-sample labels can be realized by single-ended sequencing, and can realize filtering of wrong sequencing data generated by sample label crosstalk.
Drawings
FIG. 1 shows a common design scheme and sequencing method for a double-sample tag adapter.
FIG. 2 is a schematic diagram of library construction results of the double-sample tag adaptor of the present invention.
FIG. 3 illustrates a method for implementing the dual sample label adapter design of the present invention.
Detailed Description
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Example 1. Design of the double-sample label joint and its application to sequencing library construction
I. Double-sample label joint
1. Double-sample label joint A and preparation of its library building kit
The second sample label in the double-sample label joint is located between the sequencing primer binding region and the inserted DNA fragment.
The structure of the double-sample label joint shown in FIG. 3A is as follows:
the double-sample label joint A is a Y-shaped joint formed by annealing a joint sequence L-A and a joint sequence S-A;
at one end of the double-sample label joint A, the complementary regions of the joint sequence L-A and the joint sequence S-A form a complementary blunt end or a complementary sticky end, and at the other end the non-complementary regions of the two sequences form free, non-complementary strands; the complementary blunt end or sticky end is used for ligation to the DNA molecule to be detected, the sticky end being ligated to the DNA molecule to be detected by T-A ligation.
The joint sequence L-A consists, in order from the end near the DNA molecule to be detected, of a region A-A that is complementary to the joint sequence S-A and a region C-A that is not complementary to the joint sequence S-A;
the region A-A consists, in order from the end near the DNA molecule to be detected, of a second sample label sequence and a fragment B (7-15 nt in size);
the region C-A carries the binding region of the primer PF-A of the library building primer pair A (in the embodiment of the invention, this binding region is at the end of the region C-A distal to the region A-A);
the terminal base of the joint sequence L-A at the end near the DNA molecule to be detected is phosphorylated;
the length of the second sample label is 3 nt or more, specifically 3-10 nt.
The joint sequence S-A consists, in order from the end near the DNA molecule to be detected, of a region D-A that is complementary to the joint sequence L-A and a region E-A that is not complementary to the joint sequence L-A;
the region D-A consists, in order from the end near the DNA molecule to be detected, of the complementary sequence of the second sample label sequence and the complementary sequence of the fragment B;
the region E-A comprises, in order from the end near the DNA molecule to be detected, a first sample label sequence and the binding region of the primer PR-A of the library building primer pair A (in the embodiment of the invention, this binding region is at the end of the region E-A distal to the region D-A);
the length of the first sample label sequence is more than 6 nt and less than 12 nt.
The library building primer pair A consists of a library building primer PF-A and a library building primer PR-A.
The library building kit comprises the double-sample label joint A and the library building primer pair A.
2. Preparation of double-sample label joint B
The structure shown in FIG. 3B is as follows:
the double-sample label joint B is a Y-shaped joint formed by annealing a joint sequence L-B and a joint sequence S-B;
at one end of the double-sample label joint B, the complementary regions of the joint sequence L-B and the joint sequence S-B form a complementary blunt end or a complementary sticky end, and at the other end the non-complementary regions of the two sequences form free, non-complementary strands; the complementary blunt end or sticky end is used for ligation to the DNA molecule to be detected;
the joint sequence L-B consists, in order from the end near the DNA molecule to be detected, of a region A-B that is complementary to the joint sequence S-B and a region C-B that is not complementary to the joint sequence S-B;
the region A-B consists, in order from the end near the DNA molecule to be detected, of a second sample label sequence and a fragment B (7-15 nt in size);
the region C-B carries the binding region of the primer PF-B of the library building primer pair B (in the embodiment of the invention, this binding region is at the end of the region C-B distal to the region A-B);
the terminal base of the joint sequence L-B at the end near the DNA molecule to be detected is phosphorylated;
the length of the second sample label is 3 nt or more, specifically 3-10 nt.
The joint sequence S-B consists, in order from the end near the DNA molecule to be detected, of a region D-B that is complementary to the joint sequence L-B and a region E-B that is not complementary to the joint sequence L-B;
the region D-B consists, in order from the end near the DNA molecule to be detected, of the complementary sequence of the second sample label sequence and the complementary sequence of the fragment B;
the region E-B carries the binding region of the primer PR-B of the library building primer pair B (in the embodiment of the invention, this binding region is at the end of the region E-B distal to the region D-B) and does not contain the first sample label sequence.
The library building primer pair B consists of a library building primer PF-B and a library building primer PR-B;
the library building primer PR-B comprises, from the 5' end, a first sample label sequence and a region that binds to the region E-B.
The length of the first sample label sequence is more than 6 nt and less than 12 nt.
A schematic diagram of the library constructed with the double-sample label joint of the present invention is shown in FIG. 2.
In single-ended sequencing, index2, which lies immediately before the insert, is read first and the insert sequence is then read in the same pass (Reads1); the index1 sequencing primer is then added to read index1.
In double-end sequencing, index2, which lies immediately before the insert, is read first and the insert sequence is then read in the same pass (Reads1); the index1 sequencing primer is then added to read index1; the Reads2 sequencing primer is then added, and the resulting data likewise contain the sequence information of index3 followed by the Reads2 information of the inserted DNA.
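Because the second sample label is read in the same pass as the insert, it can be recovered by slicing the start of Reads1. The following is a minimal sketch, assuming the read begins directly with the label; the optional `spacer` parameter (for example one base contributed by T-A ligation) is an assumption, and the read shown is hypothetical.

```python
from typing import Tuple


def split_single_end_read(read: str, label_length: int,
                          spacer: int = 0) -> Tuple[str, str]:
    """Split a read into (second sample label, insert sequence).

    label_length: number of label bases at the start of the read.
    spacer: fixed bases between label and insert that are discarded
            (assumed parameter; 0 if the insert follows immediately).
    """
    if len(read) < label_length + spacer:
        raise ValueError("read is shorter than label plus spacer")
    return read[:label_length], read[label_length + spacer:]


# Hypothetical read: 4-base label, 1 spacer base, then the insert.
label, insert = split_single_end_read("CACGTGATTACAGG", label_length=4, spacer=1)
print(label, insert)
```

The recovered label can then be compared with the index1 read for the same cluster to decide whether the pair matches the experimental design.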
II. Constructing a sequencing library with the double-sample label joint
1. Preparation of the DNA molecule to be detected
The DNA molecules to be detected, with blunt or sticky ends, are prepared by PCR.
2. Ligation of the double-sample label joint
1) Ligation by the method of FIG. 3A
Principle: the double-sample label joint synthesized above is ligated directly to the inserted DNA fragment to construct a complete library (the PCR step in method A can be omitted, so the method is applicable to PCR-free library construction; the disadvantage is that the joint is relatively long and difficult to synthesize).
Scheme:
the DNA molecule to be detected, the double-sample label joint A and T4 DNA ligase are mixed for a ligation reaction to obtain a ligation product;
the ligation product is amplified with the library building primer pair to obtain a DNA sequencing library.
2) Ligation by the method of FIG. 3B
Principle: the new sample labels at both ends of the inserted DNA fragment are added by a ligation reaction, and a PCR amplification is then carried out to introduce the sample label of the conventional joint, finally yielding a complete double-sample label library (the joint in method B is short and easy to synthesize, and the index added afterwards can be chosen according to the application, but the PCR step is mandatory).
Scheme:
the DNA molecule to be detected, the double-sample label joint B and T4 DNA ligase are mixed for a ligation reaction to obtain a ligation product;
the ligation product is amplified with the library building primer pair to obtain a DNA sequencing library.
III. Sequencing
The library is sequenced on the sequencer.
Example 2. Design of the double-sample label joint and its application to sequencing library construction
I. Construction of the double-sample label joint
1. Design of the double-sample label joint sequences
Following the design scheme of Example 1, the joint sequences required for constructing the double-sample label joint B were designed as shown in Table 1;
the joint sequences in the table were synthesized by Beijing Liuhe Huada Gene Technology Co., Ltd. with C18 DSL purification at an order quantity of 5 OD.
TABLE 1 Sequences required for joint construction and amplification primer information
On receipt, the joint oligonucleotides were dissolved in TE buffer to a 100 μM stock solution.
2. Preparation of the double-sample label joints
Ad01L and Ad01S were mixed in equimolar amounts, made up to a 25 μM joint mixture with TE buffer, and left at room temperature for at least 30 minutes to anneal into a partially double-stranded Y-shaped structure, designated Ad01M (double-sample label joint B);
Ad02L and Ad02S were mixed in equimolar amounts, made up to a 25 μM joint mixture with TE buffer, and left at room temperature for at least 30 minutes to anneal into a partially double-stranded Y-shaped structure, designated Ad02M (double-sample label joint B).
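The volumes for the annealing mixture follow from C1·V1 = C2·V2. The sketch below assumes that "25 μM" refers to the concentration of each strand in the final mixture and uses a hypothetical final volume of 100 μL; neither detail is specified in the text.

```python
from typing import Tuple


def annealing_volumes(stock_um: float, final_um: float,
                      final_volume_ul: float) -> Tuple[float, float, float]:
    """Return (volume of strand L, volume of strand S, volume of TE buffer) in μL.

    Each strand is diluted from the same stock concentration to the same
    final concentration, so the two strands are equimolar in the mixture.
    """
    oligo_ul = final_um * final_volume_ul / stock_um  # C1*V1 = C2*V2
    buffer_ul = final_volume_ul - 2 * oligo_ul
    if buffer_ul < 0:
        raise ValueError("stock is too dilute for the requested final concentration")
    return oligo_ul, oligo_ul, buffer_ul


# 100 μM stocks, 25 μM per strand, 100 μL total (the total volume is an assumption).
print(annealing_volumes(100, 25, 100))
```

With these assumed numbers the mixture would be 25 μL of each 100 μM stock plus 50 μL of TE buffer.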
II. Constructing a sequencing library with the double-sample label joint
1. Preparation of the DNA molecule to be detected
Using lambda phage DNA as the template, the PCR primers shown in Table 2 were designed, and PCR amplification gave λP1 and λP2 (blunt-ended) as the DNA molecules to be detected.
TABLE 2 PCR primers for standards
The primers in the table were synthesized by Beijing Liuhe Huada Gene Technology Co., Ltd. with C18 DSL purification at an order quantity of 5 OD. On receipt, the primers were dissolved in TE buffer to a 100 μM stock solution and diluted to a 10 μM working solution.
rTaq DNA polymerase (Shenzhen Huazhi Zhi Science and Technology Co., Ltd., 01K01201MS) was used for the PCR amplification; the reaction system and program are shown in Table 3.
TABLE 3 PCR reaction system and program
The PCR products were purified with 1.5× AMPure XP magnetic beads (Beckman, A63880), quantified, and diluted to a concentration of 1 ng/μL.
2. Ligation of the double-sample label joint and library construction
The library was constructed using the KAPA Hyper Prep Kit (Kapa Biosystems, KR0961) as follows:
1) Ligation of the double-sample label joint
After purification, 10 ng of each of the PCR products λP1 and λP2 was A-tailed using the kit, and 1 μL of joint Ad01M or Ad02M (10 μM) was added, respectively. To simulate sample label crosstalk, 0.01 μL of Ad02M was additionally added to the Ad01M ligation reaction (the 1% cross-mixing for each library is listed in Table 4). This gave the joint Ad01M ligation product and the joint Ad02M ligation product.
2) Amplification with the library building primers
PF in Table 1 was mixed with an equal amount of PR01 or PR02, respectively, and made up with TE buffer to 10 μM primer working solutions, designated P01M and P02M, which served as the PCR primers for library construction.
The joint Ad01M ligation product was used as the template and amplified with P01M to give the λP1 sequencing library.
The joint Ad02M ligation product was used as the template and amplified with P02M to give the λP2 sequencing library.
The experimental design is shown in Table 4.
TABLE 4 Experimental design of the example
| Name of liberty | S1 | S2 |
| Insert DNA | λP1 | λP2 |
| Connecting joint | Ad01M | Ad02M |
| Mixing 1% of the linker | Ad02M | Ad01M |
| PCR primer | P01M | P02M |
Thirdly, sequencing
Each sample was sequenced on a BGISEQ-500 sequencer made by Huada Ching in single-end 100-base mode, producing more than 10,000 reads per sample.
The reads generated by sequencing were analyzed and counted, as shown in Table 5. The calculation shows that, for the simulated 1% sample label contamination, the measured contamination rates were 0.95% and 1.37%, respectively. In actual sample testing, the unexpected sequencing data (such as Ad01_P02 and Ad02_P01 in Table 5) only need to be filtered out and deleted, which avoids false positives caused by sample tag crosstalk. That is, only reads in which both sample labels match the expected combination need to be considered.
TABLE 5 number of sequencing reads
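The filtering rule described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the example: the record layout (`joint_tag`, `primer_tag`), the function names, and the definition of the crosstalk rate (the share of reads with the expected primer tag that carry an unexpected joint tag) are all assumptions, and the toy reads are made up rather than taken from Table 5.

```python
from collections import Counter

# Expected (joint tag, primer tag) pair for each library, following Table 4.
EXPECTED_PAIRS = {"S1": ("Ad01", "P01"), "S2": ("Ad02", "P02")}


def split_reads(reads, expected_pair):
    """Keep reads carrying the expected tag pair and count every observed pair."""
    kept = [r for r in reads if (r["joint_tag"], r["primer_tag"]) == expected_pair]
    counts = Counter(f'{r["joint_tag"]}_{r["primer_tag"]}' for r in reads)
    return kept, counts


def crosstalk_rate(counts, expected_pair):
    """Among reads with the expected primer tag, the fraction with an unexpected joint tag."""
    _, primer_tag = expected_pair
    same_primer = {k: v for k, v in counts.items() if k.endswith("_" + primer_tag)}
    total = sum(same_primer.values())
    unexpected = total - same_primer.get("_".join(expected_pair), 0)
    return unexpected / total if total else 0.0


# Made-up toy reads (NOT the counts of Table 5): 99 expected reads, 1 crosstalk read.
toy_reads = [{"joint_tag": "Ad01", "primer_tag": "P01"}] * 99
toy_reads += [{"joint_tag": "Ad02", "primer_tag": "P01"}]

kept, counts = split_reads(toy_reads, EXPECTED_PAIRS["S1"])
rate = crosstalk_rate(counts, EXPECTED_PAIRS["S1"])
print(len(kept), dict(counts), rate)
```

Only the `kept` reads would be passed on to downstream analysis; the counts of the discarded pairs serve as a per-library estimate of crosstalk.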
SEQUENCE LISTING
<110> Shenzhen Hua Dagen stock Limited Shenzhen Hua Dai clinical verification center
<120> design method and preparation method of double-label joint
<160>7
<170>PatentIn version 3.5
<210>1
<211>36
<212>DNA
<213>Artificial sequence
<400>1
cacgaagtcg gaggccaagc ggtcttagga agacaa 36
<210>2
<211>36
<212>DNA
<213>Artificial sequence
<400>2
gtgcaagtcg gaggccaagc ggtcttagga agacaa 36
<210>3
<211>30
<212>DNA
<213>Artificial sequence
<400>3
gaacgacatg gctacgatcc gacttcgtgt 30
<210>4
<211>30
<212>DNA
<213>Artificial sequence
<400>4
gaacgacatg gctacgatcc gacttgcact 30
<210>5
<211>17
<212>DNA
<213>Artificial sequence
<400>5
gaacgacatg gctacga 17
<210>6
<211>45
<212>DNA
<213>Artificial sequence
<400>6
tgtgagccaa ggagttgatc ggacctattg tcttcctaag accgc 45
<210>7
<211>45
<212>DNA
<213>Artificial sequence
<400>7
tgtgagccaa ggagttggat tccgtccttg tcttcctaag accgc 45
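The listed sequences can be cross-checked against their declared lengths and against the pairing stated in claim 6 (sequence 1 with sequence 3, sequence 2 with sequence 4). The sketch below uses hypothetical helper names and one assumption that is not stated in the listing: the final T of sequences 3 and 4 is treated as a single unpaired 3' base, which is inferred from the A-addition step of the example.

```python
# Sequences 1-4 copied from the sequence listing (5'->3', spaces removed).
SEQS = {
    1: "cacgaagtcggaggccaagcggtcttaggaagacaa",
    2: "gtgcaagtcggaggccaagcggtcttaggaagacaa",
    3: "gaacgacatggctacgatccgacttcgtgt",
    4: "gaacgacatggctacgatccgacttgcact",
}
DECLARED_LENGTHS = {1: 36, 2: 36, 3: 30, 4: 30}  # the <211> entries

COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def stem_length(seq_l: str, seq_s: str, overhang: int = 1) -> int:
    """Longest k such that the first k bases of L pair with the last k bases
    of S that precede the assumed 3' overhang."""
    paired_s = seq_s[:len(seq_s) - overhang]
    k = 0
    while (k < min(len(seq_l), len(paired_s))
           and revcomp(seq_l[:k + 1]) == paired_s[-(k + 1):]):
        k += 1
    return k


lengths_ok = all(len(SEQS[i]) == DECLARED_LENGTHS[i] for i in SEQS)
stems = {(1, 3): stem_length(SEQS[1], SEQS[3]), (2, 4): stem_length(SEQS[2], SEQS[4])}
# Positions at which sequences 1 and 2 differ, i.e. the candidate sample label.
tag_positions = [i for i, (a, b) in enumerate(zip(SEQS[1], SEQS[2])) if a != b]
print(lengths_ok, stems, tag_positions)
```

Under these assumptions both pairs show a 12-base paired stretch, and sequences 1 and 2 differ only at their first four bases, which is consistent with a 4 nt second sample label followed by a shared annealing fragment.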
Claims (10)
1. A kit for constructing a sequencing library of DNA molecules to be detected, comprising a double-sample label joint;
the double-sample label joint is formed by annealing a joint sequence L and a joint sequence S; one end of the double-sample label joint is used for connecting to a DNA molecule to be detected;
starting from the end close to the DNA molecule to be detected, the joint sequence L sequentially comprises a region A, which is complementary to the joint sequence S, and a region C, which is not complementary to the joint sequence S;
starting from the end close to the DNA molecule to be detected, the region A sequentially consists of a second sample label sequence and a fragment B used for annealing and complementation;
the region C carries a binding region for the primer PF of a library-building primer pair;
starting from the end close to the DNA molecule to be detected, the joint sequence S sequentially comprises a region D, which is complementary to the joint sequence L, and a region E, which is not complementary to the joint sequence L;
starting from the end close to the DNA molecule to be detected, the region D sequentially consists of the complementary sequence of the second sample label sequence and the complementary sequence of the fragment B;
and, starting from the end close to the DNA molecule to be detected, the region E comprises a binding region for the primer PR of the library-building primer pair.
2. The kit of claim 1, wherein:
the kit also comprises the library building primer pair;
the library building primer pair consists of the primer PF and the primer PR;
the primer PR comprises, from the 5' end, a first sample tag sequence and a region which binds to the region E.
3. The kit of claim 1, wherein: the region E comprises a first sample tag sequence and a binding region of a primer PR in the library building primer pair from the end close to the DNA molecule to be detected;
the kit also comprises the library building primer pair;
the library building primer pair consists of the primer PF and the primer PR;
the primer PR contains a region which binds to the region E and does not contain the first sample tag sequence.
4. The kit according to any one of claims 1 to 3, wherein:
the length of the second sample label sequence is more than 3 nt;
or the length of the second sample label sequence is 3-10 nt.
5. The kit according to any one of claims 1 to 4, wherein:
the double-sample label joint has a bubble or Y-shaped structure;
or, the last base of the joint sequence L, counted from the end near the DNA molecule to be detected, carries a phosphorylation modification.
6. The kit according to any one of claims 1 to 5, wherein:
the double-sample label joint is formed by annealing a joint sequence L shown in a sequence 1 and a joint sequence S shown in a sequence 3;
or the double-sample label joint is formed by annealing a joint sequence L shown in a sequence 2 and a joint sequence S shown in a sequence 4;
or the library-building primer pair consists of a primer shown in sequence 5 and a primer shown in sequence 6 or 7.
7. A method for constructing a sequencing library of test DNA molecules using the kit of claims 1-6, comprising the steps of:
when double-sample labels are introduced during library building, the second sample label sequence of the double-sample labels is located between the DNA molecule to be detected and the sequencing primer binding region;
or, when double-sample labels are introduced during library building, the second sample label sequence of the double-sample labels is adjacent to both ends of the DNA molecule to be detected.
8. The method of claim 7, wherein: the method comprises the following steps:
1) connecting the double-sample label joint of any one of claims 1-6 to the DNA molecule to be detected to obtain a ligation product;
2) amplifying the ligation product with the library-building primer pair of any one of claims 1-6 to obtain a sequencing library of the DNA molecules to be detected, in which the second sample label sequence is adjacent to both ends of the DNA molecule to be detected.
9. Use of a kit according to any one of claims 1 to 6 or a method according to claim 7 or 8 for constructing a sequencing library of test DNA molecules;
or, the use of a kit according to any one of claims 1 to 6 or a method according to claim 7 or 8 for single-ended sequencing of a DNA molecule to be tested;
or, the use of a kit according to any one of claims 1 to 6 or a method according to claim 7 or 8 for paired-end sequencing of a test DNA molecule;
or, the use of the double-sample label joint and the corresponding library-building primers of any one of claims 1-6 for constructing a sequencing library of test DNA molecules;
or, the use of the double-sample label joint and the corresponding library-building primers of any one of claims 1-6 in single-ended sequencing of a DNA molecule to be tested;
or, the use of the double-sample label joint and the corresponding library-building primers of any one of claims 1-6 in paired-end sequencing of a test DNA molecule.
10. Use according to claim 9, characterized in that: the single-ended sequencing is noninvasive prenatal gene sequencing, pathogenic microorganism gene sequencing or RNA sequencing.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910237765.XA CN111748613A (en) | 2019-03-27 | 2019-03-27 | Design method and preparation method of double-label joint |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN111748613A true CN111748613A (en) | 2020-10-09 |
Family
ID=72671011
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910237765.XA Pending CN111748613A (en) | 2019-03-27 | 2019-03-27 | Design method and preparation method of double-label joint |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111748613A (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101967476A (en) * | 2010-09-21 | 2011-02-09 | 深圳华大基因科技有限公司 | Joint connection-based deoxyribonucleic acid (DNA) polymerase chain reaction (PCR)-free tag library construction method |
| CN102181533A (en) * | 2011-03-17 | 2011-09-14 | 北京贝瑞和康生物技术有限公司 | Multi-sample mixed sequencing method and kit |
| CN105734048A (en) * | 2016-02-26 | 2016-07-06 | 武汉冰港生物科技有限公司 | PCR-free sequencing library preparation method for genome DNA |
| CN106086162A (en) * | 2015-11-09 | 2016-11-09 | 厦门艾德生物医药科技股份有限公司 | A kind of double label joint sequences for detecting Tumor mutations and detection method |
| CN106367485A (en) * | 2016-08-29 | 2017-02-01 | 厦门艾德生物医药科技股份有限公司 | Multi-locating double tag adaptor set used for detecting gene mutation, and preparation method and application of multi-locating double tag adaptor set |
| CN108148900A (en) * | 2018-01-24 | 2018-06-12 | 深圳因合生物科技有限公司 | Sequencing approach, kit and its application of sequencing mistake are reduced based on molecular label and the sequencing of two generations |
| CN108893466A (en) * | 2018-06-04 | 2018-11-27 | 苏州人人基因科技有限公司 | The detection method of sequence measuring joints, sequence measuring joints group and ultralow frequency mutation |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112410331A (en) * | 2020-10-28 | 2021-02-26 | 深圳市睿法生物科技有限公司 | Linker with molecular label and sample label and single-chain library building method thereof |
| CN114717662A (en) * | 2022-04-20 | 2022-07-08 | 深圳市易基因科技有限公司 | Micro free DNA methylation library building method, kit and sequencing method |
| CN114717662B (en) * | 2022-04-20 | 2025-12-19 | 深圳市易基因科技有限公司 | Micro free DNA methylation library construction method, kit and sequencing method |
| US20230392201A1 (en) * | 2022-06-06 | 2023-12-07 | Element Biosciences, Inc. | Methods for assembling and reading nucleic acid sequences from mixed populations |
| CN116064729A (en) * | 2022-07-08 | 2023-05-05 | 广州微远医疗器械有限公司 | Amplicon primer sets for library construction and their applications |
| CN115992205A (en) * | 2023-02-23 | 2023-04-21 | 华智生物技术有限公司 | Joint, kit and library construction method for multi-sample pooling library construction |
| CN115992205B (en) * | 2023-02-23 | 2025-05-09 | 华智生物技术有限公司 | A connector, a kit and a library construction method for multi-sample pooling library construction |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20201009 |




