CN113930490B - Method for evaluating STR slippage by using molecular specific bar code in second generation sequencing platform and application thereof
- Publication number
- CN113930490B (application CN202111141442.4A)
- Authority
- CN
- China
- Prior art keywords
- umi
- str
- sequence
- family
- sequences
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Analytical Chemistry (AREA)
- Biophysics (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The application relates to the field of second-generation sequencing, and in particular to a method for evaluating STR slippage in second-generation sequencing by using molecule-specific barcodes (unique molecular identifiers, UMIs), and to an application thereof. Both ends of the DNA fragments obtained by fragmenting the genome of the sample to be tested are ligated to an excess of molecule-specific barcode sequences, and PCR amplification yields original DNA template molecules carrying a sample-specific identification code and a UMI. STR target sequences are captured to obtain captured target fragments, and the captured target fragments are subjected to high-throughput sequencing to obtain sequencing data. STR typing is performed on the sequencing data according to the UMI linker sequence, the sample-specific identification code, and the reference genome sequence of the sample to be tested, giving the slippage rate of each STR locus, from which STR slippage on the second-generation sequencing platform is evaluated. UMIs make it possible to accurately identify and remove the slippage products (stutters) generated at STR loci during PCR, which achieves the intended reduction of STR slippage on a second-generation sequencing platform and provides technical support for the fine resolution of mixed samples in forensic identification.
Description
Technical Field
The application relates to the field of second-generation sequencing, and in particular to a method for evaluating STR slippage in second-generation sequencing by using molecule-specific barcodes, and to an application thereof.
Background
A short tandem repeat (STR) is a sequence in the human genome in which a core unit of 2 to 6 nucleotides is repeated in tandem. STR loci are generally 100 to 300 bp long, and variation among individuals in the copy number and sequence of the STR core unit makes them highly polymorphic.
At present, STR typing is widely used in forensic individual identification, parentage testing, and related fields.
The classical STR typing approach designs locus-specific primers in the conserved flanking regions on both sides of the STR, amplifies the locus by PCR, and detects the amplified products by electrophoresis. STR typing on the capillary electrophoresis platform has the highest accuracy, is widely used in human forensic identification, and is the current gold standard for individual identification.
However, in capillary electrophoresis STR typing profiles, a stutter peak (shadow band) with a weaker signal, caused by slipped-strand replication, may appear one repeat unit before the peak of the target allele (main band) or one repeat unit after it. In general, the peak area of the shadow band is less than 15% of that of the main band; the longer the repeat unit, the more easily slippage occurs; and the peak area of the shadow band decreases as the number of slipped repeat units increases (main band > main band minus 1 repeat unit > main band minus 2 repeat units). Shadow bands make the typing of mixed samples very difficult, and their formation is closely related to STR slippage during PCR amplification.
In recent years, second-generation sequencing (next-generation sequencing, NGS) has developed rapidly. With advantages such as high throughput, integration, and low cost, it has been widely applied in scientific research and clinical diagnosis, and has important application prospects in forensics and agriculture.
Disclosure of Invention
The application provides a method for evaluating STR slippage on a second-generation sequencing platform by using molecule-specific barcodes, and an application thereof, to solve the technical problem of how to evaluate STR slippage on a second-generation sequencing platform.
In a first aspect, the present application provides a method for assessing STR slippage using a molecular specific barcode in second generation sequencing, the method comprising the steps of:
Obtaining capture probes based on the STR loci;
Ligating the original DNA template molecules of the sample to be tested to a UMI linker sequence and a sample-specific identification code, respectively, to obtain DNA fragments to be captured, wherein the DNA fragments to be captured contain the UMI linker sequence and the sample-specific identification code;
Performing STR capture on the DNA fragments to be captured with the capture probes to obtain target capture fragments;
Performing high-throughput sequencing on the target capture fragments to obtain sequencing data;
Performing STR typing on the sequencing data according to the UMI linker sequence, the sample-specific identification code, and the reference genome of the sample to be tested, and calculating the slippage rate of each STR locus;
and evaluating STR slippage on the second-generation sequencing platform according to the slippage rate.
Optionally, STR typing is performed on the sequencing data according to the UMI linker sequence, the sample specific identifier code and the reference genome of the sample to be tested, and calculating the slip ratio of each STR locus includes:
Removing sequences in the sequencing data that do not meet preset criteria, and removing contaminating sequencing reads according to the sample-specific identification code, to obtain sequences to be aligned;
Aligning the sequences to be aligned to the reference genome of the sample to be tested to obtain original extracted sequences;
Determining, among the original extracted sequences, the members of each UMI family and their sequences according to the UMI linker sequence;
Reducing (collapsing) each UMI family to its original DNA template sequence according to the members of the family and their sequences;
Performing STR typing on all members of each UMI family sharing the same UMI linker sequence, and calculating the slippage percentage of each UMI family.
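The family-building step can be sketched in Python as follows. This is only an illustrative sketch, not the applicants' program: the record layout `(locus, umi, str_sequence)` and the function name `group_umi_families` are assumptions made for the example, while the default of 30 members follows the optional threshold given in the text.

```python
from collections import defaultdict

def group_umi_families(records, min_members=30):
    """Group extracted reads into UMI families.

    records: iterable of (locus, umi, str_sequence) tuples (assumed layout).
    Returns {(locus, umi): [str_sequence, ...]} for the families that have
    at least `min_members` members.
    """
    families = defaultdict(list)
    for locus, umi, str_sequence in records:
        # Reads sharing a locus and a UMI are taken to descend from one template.
        families[(locus, umi)].append(str_sequence)
    return {key: members for key, members in families.items()
            if len(members) >= min_members}
```

Keying on the locus as well as the UMI is a design choice of this sketch; it keeps an 8 bp UMI that happens to recur at two loci from being merged into one family.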
Optionally, said STR typing of all members of said UMI family containing the same said UMI linker sequence comprises: STR typing is performed on all members of the UMI family containing the same UMI linker sequence according to length polymorphism.
Optionally, the sequence that does not meet the preset standard includes:
Sequences in which the number of low-quality bases at the 3' end of a single-end read exceeds 1/3 of the total number of bases in the read, wherein a low-quality base is a base with a quality value less than or equal to 20;
Sequences shorter than 100 bp.
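The two criteria can be illustrated with a short Python sketch. It is a simplified reading, not the filter actually used: the text refers to the 3' end of the read without giving a window, so this sketch counts low-quality bases over the whole read, and the function name and parameters are hypothetical.

```python
def meets_preset_standard(quals, low_q=20, min_len=100):
    """Return True if a read passes both filters.

    quals: list of per-base quality values for one read (its length is the
    read length). Assumption: low-quality bases are counted over the whole
    read, because the text does not define a 3'-end window.
    """
    if len(quals) < min_len:
        return False  # shorter than 100 bp
    low_quality = sum(1 for q in quals if q <= low_q)
    # Reject when low-quality bases exceed one third of the read
    # (integer comparison avoids floating-point rounding).
    return 3 * low_quality <= len(quals)
```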
Optionally, aligning the sequences to be aligned to the reference genome of the sample to be tested to obtain original extracted sequences includes:
Aligning the sequences to be aligned to the reference genome, retaining the sequences that contain the STR conserved motif and the UMI linker sequence, and extracting the STR conserved motif sequence together with the UMI linker sequence to obtain the original extracted sequences.
Optionally, the method further comprises:
determining the primary typing and each type of stutters according to the member counts within the UMI family;
obtaining the STR slippage percentage from the total number of stutters members and the number of members of the UMI family;
and obtaining the ratio of each type of stutters from the number of members of that stutters type and the number of members of the primary typing.
Optionally, each UMI family has at least 30 members; the identification code sequence is at least 12 bp long, and the UMI linker sequence is at least 8 bp long.
Alternatively, the nucleotide sequence of the capture probe is as set forth in any one of SEQ ID NOS.1-45.
Alternatively, the high throughput sequencing comprises double-ended sequencing or single-ended sequencing.
In a second aspect, the application provides the use of a molecular specific barcode in second generation sequencing to assess slippage of STRs, the use comprising using the method of the first aspect in individual identification and authentication.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
In the method provided by the embodiments of the application, contaminating sequences are removed by means of the identification code sequence, and, by virtue of the molecular specificity of the UMI linker sequence, a unique UMI linker, supplied in excess, is added to each fragmented original DNA template molecule. After PCR amplification and target enrichment, the STR amplification products carrying the same UMI form a UMI family, and all members of each UMI family at each STR locus are reduced to the original DNA template molecule by means of the UMI. Comparing the slippage rate of each STR locus in each sample before and after reduction shows that the slippage rate after UMI reduction is markedly lower than before reduction: in all samples it falls below 5%, and in some cases to 0. UMIs can therefore accurately identify and remove the stutters generated by slippage at STR loci during PCR, achieving the intended reduction of STR slippage on a second-generation sequencing platform.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of a method for evaluating STR slippage using a molecular specific barcode in second generation sequencing according to an embodiment of the present application;
FIG. 2 is a diagram of construction, sequencing and data analysis of a STR high throughput library according to an embodiment of the present application;
FIG. 3 is a diagram illustrating the size of a family of molecular-specific barcodes provided in an embodiment of the application;
FIG. 4 is a graph showing the percent slippage of a family of molecular-specific barcodes provided in an embodiment of the application;
FIG. 5 is a scatter plot of the ratio of n-1 type stutters to the major typing for each STR locus provided by an embodiment of the present application;
FIG. 6 shows the genotyping results for the D2S441 loci from the same UMI family provided in the examples of the present application;
FIG. 7 is a graph showing the Pearson correlation coefficients between the major typing and stutters of each STR locus provided by an embodiment of the present application;
FIG. 8 is a plot of the relationship between the major typing of each UMI family and stutters for the TH01 locus provided by the examples of the present application;
FIG. 9 is a schematic representation of the reduction of family members containing the same UMI in the D2S1338 locus to the original DNA template molecule.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The following describes the advantages of the process according to the application in particular in connection with examples.
A method for assessing STR slippage using a molecular specific barcode in second generation sequencing, comprising the steps of:
S1, obtaining a capture probe according to an STR gene locus;
In the embodiment of the application, if the STR loci are human STR loci, probes can be designed against the nucleotide sequences of 22 human STR loci to capture the STR target sequences.
STR target probe design in the embodiment of the application: capture probes were designed for the 24 commonly used STR loci, and 45 probes were designed in total. The 45 probes cover 22 STR loci, are named STR0001 to STR0045 in order, and correspond to the nucleotide sequences SEQ ID NO. 1-45 in order. On average, 1 to 3 probes were designed per locus; details are given in Table 1.
Table 1 capture probes for 22 STR loci.
Because animals, plants, and microorganisms all have STR loci, the method of the application can also be applied to them.
S2, ligating the original DNA template molecules of the sample to be tested to the UMI linker sequence and a sample-specific identification code, respectively, to obtain DNA fragments to be captured;
In the embodiment of the application, the genomic DNA of the sample to be tested can be fragmented mechanically or enzymatically into DNA fragments of 100-500 bp to obtain the original DNA template molecules of the sample; both ends of the DNA fragments are then ligated to an excess of molecule-specific barcode sequences (UMIs), and a sample-specific identification code is introduced.
In the embodiment of the application, the experimental samples were filter-paper blood samples from 6 unrelated Chinese Han individuals. DNA was extracted from the 6 blood samples with an M48 magnetic bead extraction and purification kit (QIAGEN, U.S.) and quantified with a 2.0 Fluorometer (Invitrogen, USA); the quantified blood-sample DNA was used for the subsequent experiments.
S3, STR capturing is carried out on the DNA fragment to be captured by using the capturing probe, so as to obtain a target capturing fragment;
s4, performing high-throughput sequencing on the target capture fragment to obtain sequencing data;
Library construction, probe capture and high-throughput sequencing: the extracted blood-sample DNA was fragmented with a Covaris S220 sonicator (Covaris, Woburn, U.S.) to obtain DNA fragments of 100-300 bp. The sample DNA fragments were then ligated to UMI linker sequences that carry a sample-specific DNA identification code (Tag) and are specific to each DNA template molecule, as shown in FIG. 2; the linker sequences were in absolute excess to ensure that the UMI ligated to each original DNA molecule is unique. After end repair and addition of an A tail, a DNA library was constructed with the GenoBaits DNA library preparation kit (DL 002, Molbreeding Biotechnology Co., Ltd., China). The GenoBaits libraries (100 ng each), which are compatible with the Illumina sequencing platform, were pooled, and STR target capture was performed with the GenoBaits DNA library preparation reagent (DL 001, Molbreeding Biotechnology Co., Ltd., China) according to the library construction instructions. Finally, the enriched library was sequenced paired-end (2×150 bp) on an Illumina HiSeq X Ten platform (Illumina, Inc., San Diego, CA).
In the embodiment of the application, the sequencing read length is 150 bp. In addition to the STR core motif and its flanking sequences (each required to be at least 20 bp long, adjustable as needed), each read also contains the 8 bp UMI and the 12 bp sample-specific identification code, so the STR core motif of a probe-enriched target region cannot exceed 90 bp. STR loci with longer sequences therefore cannot be enriched effectively; the sequencing length can be increased as the study requires so that all target STR loci are enriched.
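The 90 bp limit follows from simple subtraction, shown here as a small Python sketch. The function and parameter names are hypothetical, and the sketch assumes two flanking sequences of 20 bp each, one on either side of the core motif.

```python
def max_core_motif_length(read_len=150, umi_len=8, tag_len=12, flank_len=20):
    """Longest STR core motif (bp) that still fits in a single read."""
    # Assumption: one flanking (boundary) sequence on each side of the motif.
    return read_len - umi_len - tag_len - 2 * flank_len

print(max_core_motif_length())  # 90
```

Calling the function with a longer `read_len` shows how increasing the sequencing length raises the limit by the same amount.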
High-throughput data analysis: the total sequencing data are demultiplexed according to the sample barcodes used in library construction to obtain the raw sequencing reads of each sample; sequences that do not meet the preset criteria are removed, and contaminating sequences are removed according to the Tag sequence; the remaining reads are then aligned to the human reference genome with alignment software, and the sequences containing an STR conserved motif and its flanking regions (also called boundary sequences) are retained; the STR conserved motif and the UMI are then extracted from these sequences as the original extracted sequences for subsequent analysis, as shown in FIG. 2.
STR locus enrichment, library construction and data analysis were performed for the 6 unrelated Chinese Han individuals as shown in FIG. 2, and sequencing yielded 6,503,692 reads per sample on average. For each sample, the reads carrying the corresponding Tag were retained and aligned to the human reference genome sequence (ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes). If a read aligns to an STR conserved motif of the reference genome and the regions flanking the motif contain boundary sequences of the specified threshold length (20 bp here), the read is considered to contain the target STR sequence, and the UMI and STR target sequence it contains are extracted for STR typing. The STR typing results for all samples showed that the target sequences enriched for the individual STR loci were unbalanced: some loci were well enriched in all 6 samples, such as TPOX, TH01, D13S317, DYS391, D2S441, D16S539, CSF1PO and D10S1248, while others were hardly enriched in any sample, such as FGA, D7S820, vWA, D22S1045, D18S51 and D2S1338.
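The boundary-sequence check can be illustrated with a minimal Python sketch. It is a deliberate simplification: exact string matching stands in for alignment to the reference genome, and the function name, arguments, and any boundary sequences passed to it are hypothetical rather than taken from the application.

```python
def extract_str_target(read, left_boundary, right_boundary, min_boundary=20):
    """Return the sequence lying between the two boundary sequences, or None.

    Assumption: each boundary must be present in full and be at least
    `min_boundary` bp long; exact matching replaces real alignment.
    """
    if len(left_boundary) < min_boundary or len(right_boundary) < min_boundary:
        return None
    left_at = read.find(left_boundary)
    if left_at < 0:
        return None
    start = left_at + len(left_boundary)
    # Search for the right boundary only downstream of the left one.
    end = read.find(right_boundary, start)
    if end < 0:
        return None
    return read[start:end]
```

A read that lacks either boundary returns `None` and would be discarded, which mirrors the rule that only reads spanning the whole repeat region are typed.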
S5, performing STR typing on the sequencing data according to the UMI linker sequence, the sample-specific identification code, and the reference genome of the sample to be tested, and calculating the slippage rate of each STR locus;
s6, evaluating the slippage of the STR in the second generation sequencing platform according to the slippage rate.
In the present embodiment, if no published reference genome is available, an STR reference sequence may be used instead; both are collectively referred to herein as the reference genome.
Analysis of UMI family size, STR slippage percentage, and the relationship between stutters and the major typing: because each fragmented original DNA template molecule is ligated to a unique UMI before PCR amplification, the sequences carrying the same UMI after PCR amplification form a UMI family. The size of a UMI family is the number of all members in that family, and the frequency distribution of UMI family sizes was counted. Each member of each UMI family was genotyped; because the subject here is STR slippage on a second-generation sequencing platform, only the length polymorphism of each member is considered and its sequence polymorphism is disregarded for now. Genotypes of the same length within a UMI family are merged and their members counted, and the genotype with the most members is the major typing. For example, suppose the raw genotyping results for all members of one UMI family are: (AAGG)13 with 5 members, (AAGG)12(ATGG)1 with 2 members, and (AAGG)12 with 4 members. The two genotypes of equal length, (AAGG)13 and (AAGG)12(ATGG)1, are merged to give 7 members, and (AAGG)13 is taken as the major typing of the family, analogous to the main band in capillary electrophoresis, where only length is considered; members of any other length are treated as stutters, analogous to the slippage peak (shadow band) in capillary electrophoresis. The STR slippage percentage of a UMI family is the number of stutters in the family as a percentage of the number of all its members. The distribution of STR slippage percentages across UMI family sizes (FIG. 3) and the distribution of UMI family counts and slippage percentages for each STR locus were calculated.
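The worked example above can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the applicants' program: members are represented only by their length in bp, and the function name is hypothetical.

```python
from collections import Counter

def family_major_typing_and_slippage(member_lengths):
    """Return (major typing length, slippage percentage) for one UMI family.

    member_lengths: STR length in bp of every member of the family.
    Members of equal length are merged; the most frequent length is the
    major typing and all other members count as stutters.
    """
    counts = Counter(member_lengths)
    major_length, major_members = counts.most_common(1)[0]
    total = sum(counts.values())
    slippage_pct = 100.0 * (total - major_members) / total
    return major_length, slippage_pct

# (AAGG)13 x 5 and (AAGG)12(ATGG)1 x 2 are both 52 bp; (AAGG)12 x 4 is 48 bp.
lengths = [52] * 5 + [52] * 2 + [48] * 4
major, pct = family_major_typing_and_slippage(lengths)
print(major, round(pct, 2))  # 52 36.36
```

For this family, 4 of the 11 members differ in length from the major typing, so the slippage percentage is 4/11, about 36.36%.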
To assess slippage at each STR locus, the number of UMI families with at least 30 members aligned to each STR locus was counted. The TH01, TPOX, D13S317, D16S539, D2S441, CSF1PO and D10S1248 loci each had more than 1000 such UMI families, the D7S820 locus had fewer than 50, and the D18S51, D22S1045, D2S1338, FGA and vWA loci had none. This result is consistent with the number of raw sequencing reads aligned to each STR locus. Analysis of the STR slippage percentage of all UMI families aligned to each STR locus shows that loci with more UMI families, such as TH01 and TPOX, have relatively low slippage percentages, whereas loci with few UMI families, such as D18S51 and D22S1045, have relatively high slippage percentages.
In addition, the relationship between the major typing (n_0) of each UMI family and the stutters that gain 1-2 tandem repeat units (n+1, n+2) or lose 1-4 tandem repeat units (n-1, n-2, n-3, n-4) was analyzed. Stutters are the products formed when DNA polymerase slips during PCR amplification of an STR; they are typically one repeat unit shorter than the major typing (n-1), but may also gain or lose other numbers of repeat units. The stutters ratio is the ratio of the number of members of a given stutters type to the number of major typing members in a UMI family. The ratios of n+1, n+2, n-1, n-2, n-3 and n-4 stutters to the major typing (n_0) members of each UMI family were calculated, requiring at least 60 UMI families per STR locus (pooled analysis of the 6 samples). The results show that, at each STR locus, n-1 stutters are the most common, followed by n+1 and n-2, and other types are rare. The scatter plot of the ratio of n-1 stutters to the major typing (FIG. 5) shows that this ratio lies between 0 and 1. UMI families with a ratio close to 1 exist but make up a very small proportion, which indicates that some STR loci can slip during the first round of PCR amplification. For example, one UMI family at the D2S441 locus in sample 2 (UMI barcode: AATCAGAG) has 30 members in total, 15 of length 48 bp and 15 of length 44 bp (FIG. 6); if the major typing is determined by the largest number of members, the major typing of this family cannot be determined. In addition, when the number of UMI families is greater than 500, the ratio of stutters to the major typing falls below 0.15 at all loci except D10S1248.
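The stutters ratio and the undetermined case can be sketched in Python as follows. The function name and the `unit_len` parameter (repeat unit length in bp, defaulting to 4 for a tetranucleotide repeat) are assumptions made for illustration; members whose length does not differ from the major typing by a whole number of repeat units are simply skipped in this sketch.

```python
from collections import Counter

def stutter_ratios(member_lengths, unit_len=4):
    """Return {repeat-unit offset: stutters/major ratio} for one UMI family.

    Offset -1 means n-1, +1 means n+1, and so on. Returns None when the two
    most frequent lengths are tied, because the major typing is then
    undetermined under the largest-member-count rule.
    """
    counts = Counter(member_lengths)
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    major_length, major_members = ranked[0]
    ratios = {}
    for length, members in counts.items():
        diff = length - major_length
        if diff != 0 and diff % unit_len == 0:
            ratios[diff // unit_len] = members / major_members
    return ratios

# The D2S441 family described above: 15 members of 48 bp and 15 of 44 bp.
print(stutter_ratios([48] * 15 + [44] * 15))  # None
```

Returning `None` for a tie makes the ambiguous families explicit instead of letting an arbitrary ordering pick a major typing.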
The present example also analyzes the relationship between the major typing (n_0) and the stutters types with relatively large member counts (n-1, n+1, n-2). The Pearson correlation coefficients between the major typing and stutters for all STR loci are shown in FIG. 7. For some STR loci, such as D19S433 (0.16) and D13S317 (0.21), the Pearson correlation coefficient between the major typing and n-1 type stutters is low, whereas for others, such as DYS391 (0.69) and TPOX (0.67), the two are clearly positively correlated. The overall Pearson correlation coefficient between n-1 and n-2 stutters is lower than that between n-1 type stutters and the major typing, but is related to that between n+1 stutters and the major typing. The relationship between the major typing of the TH01 locus and each stutters type is shown in FIG. 8. The diagonal shows the sequence coverage of the major typing and of each stutters type. The scatter plots show the numbers of sequences of the major typing and the stutters. r is the Pearson correlation coefficient between the major typing and the stutters. The red line is the trend between the read counts of the stutters and of the major typing, fitted by locally weighted scatterplot smoothing. As can be seen from FIG. 8, n-1 type stutters at this STR locus have the highest Pearson correlation coefficient with the major typing, n-2 and n-1 type stutters have the second highest, and there is no obvious positive correlation among the other stutters or between them and the major typing.
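For reference, the Pearson correlation coefficient r between per-family major typing counts and stutters counts can be computed as in the following Python sketch. The function name is hypothetical, and the text does not state which software the applicants used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences,
    e.g. major typing member counts and n-1 stutters counts per UMI family."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means that families with more major typing members also tend to have proportionally more stutters of the given type.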
The method can identify and remove stutters generated by STR slippage in the PCR process of the second generation sequencing platform, achieves the effect of reducing the STR slippage, and is beneficial to the accurate typing of mixed samples.
The stutters products in the application are slippage products of DNA polymerase. In theory, all members of a UMI family should be typed as a single allele; any other allele in the family arises from slippage. The conserved sequences of the members of each UMI family at each STR locus were extracted, and each UMI family was reduced as illustrated in FIG. 9. The unreduced raw sequencing reads of each STR locus and the reduced original DNA template molecules were each genotyped by length polymorphism, and the slippage rate of each STR locus was calculated. Because all members of a UMI family derive from the same original DNA template molecule and carry the same UMI, the UMI, together with a computer program developed by the team, can be used to reduce all members of a UMI family to the original DNA template molecule, and slippage at the STR loci of each sample can be studied by comparing the genotyping results of the raw sequencing reads before reduction with those of the original DNA template molecules after reduction. An individual usually has at most two alleles per STR locus, although three-allele genotypes do occur; since the aim here is to evaluate STR slippage in second-generation sequencing, such three-allele typing results are ignored.
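The text does not disclose how the team's program performs the reduction. As a hedged illustration only, the Python sketch below assumes a simple majority rule: each UMI family is collapsed to its most frequent STR length, and tied families are dropped. The function name and the `(umi, str_length)` input layout are hypothetical.

```python
from collections import Counter, defaultdict

def collapse_umi_families(reads):
    """Collapse reads to one inferred template per UMI.

    reads: iterable of (umi, str_length) pairs for a single STR locus.
    Returns {umi: consensus_length}. Assumption: the consensus is the most
    frequent length in the family; families with a tie are left out.
    """
    families = defaultdict(list)
    for umi, length in reads:
        families[umi].append(length)
    templates = {}
    for umi, lengths in families.items():
        ranked = Counter(lengths).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            continue  # major typing undetermined, skip this family
        templates[umi] = ranked[0][0]
    return templates
```

Genotyping the values of the returned dictionary, rather than the raw reads, gives the "after reduction" side of the comparison described above.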
Following previously published criteria for calling homozygosity and heterozygosity at each STR locus: when the most abundant allele at a locus accounts for at least 0.7 of all typed alleles at that locus, the locus is considered homozygous with a single allele (the major allele), and the remaining alleles are regarded as slippage products; when the most abundant and second most abundant alleles each account for at least 0.35 of all typed alleles at the locus, the locus is considered heterozygous with two alleles, the first being called the major allele and the second the minor allele, and the remaining alleles are slippage products of PCR amplification. The slippage rate is the percentage of stutters among all typed alleles at an STR locus. Whether UMIs reduce STR slippage on a second-generation sequencing platform is evaluated by analyzing the slippage rate of each STR locus in each sample before and after the sequencing reads are reduced by UMI.
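The calling rule and the slippage rate can be sketched in Python as follows. This is an illustrative reading of the thresholds, not the applicants' code: the function name is hypothetical, alleles are represented by their length in bp, and the 0.35 threshold is assumed to apply to each of the two most abundant alleles.

```python
from collections import Counter

def call_locus(allele_lengths, hom_thresh=0.7, het_thresh=0.35):
    """Return (called alleles, slippage rate in percent) for one STR locus.

    allele_lengths: one length per typed read (before reduction) or per
    inferred template (after reduction). Returns (None, None) when neither
    threshold is met.
    """
    counts = Counter(allele_lengths)
    total = sum(counts.values())
    ranked = counts.most_common(2)
    first_allele, first_n = ranked[0]
    if first_n / total >= hom_thresh:
        alleles = [first_allele]                  # homozygote
    elif len(ranked) == 2 and ranked[1][1] / total >= het_thresh:
        alleles = [first_allele, ranked[1][0]]    # heterozygote
    else:
        return None, None
    stutters = total - sum(counts[a] for a in alleles)
    return alleles, 100.0 * stutters / total

print(call_locus([48] * 80 + [44] * 20))  # ([48], 20.0)
```

Running the same function once on the raw reads and once on the reduced templates gives the before and after slippage rates that the evaluation compares.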
The analysis shows that, after reduction, the slippage rate of the D12S391 locus in one sample was 12.239%, while the slippage rates of the other STR loci fell to as low as 0; the unreduced raw sequencing reads gave a higher slippage rate at every STR locus than the reduced ones. For example, the average slippage rates of D10S1248 and D3S1358 before reduction were above 17%, and were reduced by 14% after the reads were reduced to the original DNA template molecules by UMI. Treating the 6 samples as 6 biological replicates, the genotyping results of each STR locus before and after reduction were consistent while the slippage rate decreased markedly, which shows that reducing high-throughput sequencing reads to the original DNA template molecules by UMI is clearly effective in lowering the STR slippage rate.
Another object of this application is as follows: because both the length polymorphism and the sequence polymorphism of STRs are considered, using UMIs to reduce sequencing data to the original DNA template molecules offers a potential advantage over traditional capillary electrophoresis (CE) in resolving mixture typing.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Sequence listing
<110> Jianghan University
<120> Method for evaluating slipping of STR using molecular specific bar code in second generation sequencing and application thereof
<160> 45
<170> SIPOSequenceListing 1.0
<210> 1
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 1
tatatttggt tccctagtga ttctatttct ctgaagaatc ctgactaaca caggactgaa 60
ggagaattgg gaaagaaagg aaattaaaaa taaacaaaga ccaggcgctt acagctgcta 120
<210> 2
<211> 80
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 2
gagaaggagg aggaggagtt ggtcttgctg tctcaggggt ggtagagatg gaagaaaatc 60
cccatataag ttcaagcctg 80
<210> 3
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 3
ggcctgtggg tccccccata gatcgtaagc ccaggaggaa gggctgtgtt tcagggctgt 60
gatcactagc acccagaacc gtcgactggc acagaacagg cacttaggga accctcactg 120
<210> 4
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 4
atgaatgaat gtttgggcaa ataaacgctg acaaggacag aagggcctag cgggaaggga 60
acaggagtaa gaccagcgca cagcccgact tgtgttcaga agacctggga ttggacctga 120
<210> 5
<211> 90
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 5
gagccctaat gcacccaaca ttctaacaaa aggctgtaac aagggctaca ggaatcatga 60
gccaggaact gtggctcatc tatgaaaact 90
<210> 6
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 6
ctatctatat cataacacca cagccactta gctccaattt aaaagattaa tcataaacat 60
ttgggaagga gagtgaagat ttttgtgatg ttaaataaga atgattatac taaaaaccaa 120
<210> 7
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 7
ccagaatgcc agtcccagag gcccttgtca gtgttcatgc ctacatccct agtacctagc 60
atggtacctg caggtggccc ataatcatga gttattcagt aagttaaagg attgcaggag 120
<210> 8
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 8
aggccaagcc atttctgttt ccaaatccac tggctccctc ccacagctgg attatgggcc 60
agtaggaatt gccattttca gggttttgct gtcactgtag tcaggaccat gaagtcttta 120
<210> 9
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 9
gcatgctggc catattcact tgcccacttc tgcccaggga tctatttttc tgtggtgtgt 60
attccctgtg cctttggggg catctcttat actcatgaaa tcaacagagg cttgcatgta 120
<210> 10
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 10
tcttcctttc ttttttgctg gcaattacag acaaatcact cagcagctac ttcaataacc 60
atattttcga tttcagaccg tgataatacc tacaaccgag tgtcagagga tctgagaagc 120
<210> 11
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 11
ctggcattca tggaaggctg cagggcataa cattatccaa aagtcaaatg ccccataggt 60
tttgaactca cagattaaac tgtaaccaaa ataaaattag gcatatttac aagctagttt 120
<210> 12
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 12
aagcaaaaaa gtaattgtct ctctcagagg aatgctttag tgctttttag ccaagtgatt 60
ccaatcatag ccacagttta caacatttgt atctttatct gtatccttat ttatacctct 120
<210> 13
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 13
tatctatcta tcttcaaaat attacataag gataccaaag aggaaaatca cccttgtcac 60
atacttgcta ttaaaatata cttttattag tacagattat ctgggacacc actttaatta 120
<210> 14
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 14
aggaagtact tagaacaggg tctgacacag gaaatgctgt ccaagtgtgc accaggagat 60
agtatctgag aaggctcagt ctggcaccat gtgggttggg tgggaacctg gaggctggag 120
<210> 15
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 15
agtctgccaa ggactagcag gttgctaacc accctgtgtc tcagttttcc tacctgtaaa 60
atgaagatat taacagtaac tgccttcata gatagaagat agatagatta gatagataga 120
<210> 16
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 16
cggacgcaag gcgcagcggc aaggacaagg ttctgtgctc gctgggctga cgcggtctcc 60
gcggtgtaag gaggtttata tatatttcta caacatctcc cctaccgcta tagtaacttg 120
<210> 17
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 17
gacagattga tagttttttt taatctcact aaatagtcta tagtaaacat ttaattacca 60
atatttggtg caattctgtc aatgaggata aatgtggaat cgttataatt cttaagaata 120
<210> 18
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 18
tgtatttttt ttagagacgg ggtttcacca tgttggtcag gctgactatg gagttatttt 60
aaggttaata tatataaagg gtatgataga acacttgtca tagtttagaa cgaactaacg 120
<210> 19
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 19
agccgttaaa agcatcaagg tagttaggta aagctgagtc tgaagtaagt aaaacattgt 60
tacaggatcc ttggggtgtc gcttttctgg ccagaaacct ctgtagccag tggcgccttt 120
<210> 20
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 20
tggcgccttt gcctgagttt tgctcaggcc cactgggctc tttctgccca cacggcctgg 60
caacttatat gtatttttgt atttcatgtg tacattcgta tctatctgtc tatctatcta 120
<210> 21
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 21
ttccccacag tgaaaataat ctacaggata ggtaaataaa ttaaggcata ttcacgcaat 60
gggatacgat acagtgatga aaatgaacta attatagcta cgtgaaacta tactcatgaa 120
<210> 22
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 22
taggattctt aatagctatt attaccaaag catgaacaat cagtaaaaag caaacctgag 60
cattagcccc aggaccaatc tggtcacaaa catattaatg aattgaacaa atgagtgagt 120
<210> 23
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 23
aatgaagaca atacaaccag agttgttcct ttaataacaa gacaagggaa aaagagaact 60
gtcagaataa gtgttaatta taatatccag gggtgggata cagaggtttt agcatctgct 120
<210> 24
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 24
gtgttaatta taatatccag gggtgggata cagaggtttt agcatctgct ctttgccaag 60
cactgcactt attcctgagg aatacctgag ggaaaaagta tggtttctca caggatctag 120
<210> 25
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 25
caccatggag tctgtgttcc ctgtgacctg cactcggaag ccctgtgtac aggggactgt 60
gtgggccagg ctggataatc gggagctttt cagcccacag gaggggtctt cggtgcctcc 120
<210> 26
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 26
caggctctag cagcagctca tggtgggggg tcctgggcaa atagggggca aaattcaaag 60
ggtatctggg ctctggggtg attcccattg gcctgttcct cccttatttc cctcattcat 120
<210> 27
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 27
caccctactc ccacctgccc ctgcctccct ctgccccagc tgccctagtc agcaccccaa 60
ccagcctgcc tgcttgggga ggcagcccca aggcccttcc caggctctag cagcagctca 120
<210> 28
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 28
caatgtgccg ggcactttgc ccttattatt ttgtgaactc ctcagactga tcctataagg 60
tagagttccc accttccaga agaagaaaca ggtctagagg atccaagttg acttggctga 120
<210> 29
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 29
tagtagtttc ttctggtgaa ggaagaaaag agaatgatat cagggaagat gaaaaaagag 60
actgtattag taaggcttct ccagagagaa agaatcaaca ggatcaatgg atgcataggt 120
<210> 30
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 30
ttattagagg aattagctca agtgatatgg aggctgaaaa atctcatgac agtccatctg 60
caagctggag acccagggac actaggagca tggctcagtc caggtctaaa agccaaaaaa 120
<210> 31
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 31
ggttgctgga catggtatca cagaagtctg ggatgtggag gagagttcat ttctttagtg 60
ggcatccgtg actctctgga ctctgaccca tctaacgcct atctgtattt acaaatacat 120
<210> 32
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 32
atcaatcaat catctatcta tctttctgtc tgtctttttg ggctgcctat ggctcaaccc 60
aagttgaagg aggagatttg accaacaatt caagctctct gaatatgttt tgaaaataat 120
<210> 33
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 33
aatctaaatg cagaaaagca ctgaaagaag aatccagaaa accacagttc ccatttttat 60
atgggagcaa acaaaggcag atcccaagct cttcctcttc cctagatcaa tacagacaga 120
<210> 34
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 34
atatcattga aagacaaaac agagatggat gatagataca tgcttacaga tgcacacaca 60
aacgctaaat ggtataaaaa tggaatcact ctgtaggctg ttttaccacc tactttacta 120
<210> 35
<211> 80
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 35
gaagtgcagt ggcatgaaca tggctcactg cagccttaac cttctgggct caagaactcc 60
tcctgcctca gccctgcaag 80
<210> 36
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 36
ctaccagcaa caacacaaat aaacaaaccg tcagcctaag gtggacatgt tggcttctct 60
ctgttcttaa catgttaaaa ttaaaattaa cttctctggt gtgtggagat gtcttacaat 120
<210> 37
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 37
aggattggaa gttggaatag tggttaccag gactgggggg aggaagggat ggtggatggt 60
gaacaaaagg accttggagg gctcctgggg ttctaggaat caatcttcct tctttccttc 120
<210> 38
<211> 100
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 38
aggaacaggt ggtgttggtt acatgaataa gttctttagc agtgatttct gatattttgg 60
tgcacccatt acccgaataa aaatcttctc tctttcttcc 100
<210> 39
<211> 80
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 39
tcaacagaat cttattctgt tgcccaggct ggagtgcagt ggtacaatta tagctttttg 60
cagcctcaac ctcctgggct 80
<210> 40
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 40
tgttctaagg gcttcagact tggacagcca cactgccagc ttccctgatt cttcagcttg 60
tagatggtct gttatgggac ttttctcagt ctccataaat atgtgagtca attccccaag 120
<210> 41
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 41
tatctatcgt ctatctatcc agtctatcta cctcctatta gtctgtctct ggagaacatt 60
gactaataca acatctttaa tatatcacag tttaatttca agttatatca taccacttca 120
<210> 42
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 42
gagactacta tcatcgggga aaatctagcc cccatagcag ctataagaag gctaggacag 60
ggtctatagg gtggagagga actggccagt ttggggaatt ccaacgatca ggtttaaggt 120
<210> 43
<211> 80
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 43
agtgctctca agagtgcccg gcacagtgtg agtgatcacg cgaatgtatg attggcaata 60
tttttataac aataatagta 80
<210> 44
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 44
ttgtctgtcc atttagctat ctatttatcc attcattcat tcctgtatac ctaacctatc 60
atccatcctt atctcttgtg tatctattca ttcaatcata cacccatatc tgtctgtctg 120
<210> 45
<211> 120
<212> DNA/RNA
<213> Artificial sequence (ARTIFICIAL SEQUENCE)
<400> 45
ctatctgcct atctgcctgc ctacctatcc ctctatggca attgcttgca accagggaga 60
ttttattccc aggagatatt tggctatgtc tgacaacaat ttttttggtt gtcacaaatg 120
Claims (3)
1. A method for evaluating STR slippage using a molecular specific barcode in second-generation sequencing, comprising the following steps:
obtaining capture probes according to STR loci, wherein the STR loci are derived from a human, and the nucleotide sequence of each capture probe is as shown in any one of SEQ ID NO. 1-45;
ligating an excess of UMI adapter sequences and a sample-specific identification code to both ends of the original DNA template molecules of a sample to be tested, to obtain DNA fragments to be captured;
performing STR capture on the DNA fragments to be captured using the capture probes, to obtain target capture fragments;
performing high-throughput sequencing on the target capture fragments to obtain sequencing data;
performing STR typing on the sequencing data according to the UMI adapter sequences, the sample-specific identification code, and the reference genome of the sample to be tested, and calculating the slip rate of each STR locus;
evaluating STR slippage on the second-generation sequencing platform according to the slip rate;
wherein performing STR typing on the sequencing data according to the UMI adapter sequences, the sample-specific identification code, and the reference genome of the sample to be tested, and calculating the slip rate of each STR locus, comprises:
removing sequences in the sequencing data that do not meet a preset standard, and removing contaminating sequences according to the sample-specific identification code, to obtain sequences to be aligned;
aligning the sequences to be aligned with the reference genome of the sample to be tested to obtain original extracted sequences;
in the original extracted sequences, determining the members of each UMI family and their sequences from the sequences to be aligned according to the UMI adapter sequences;
reducing the members of each UMI family and their sequences to obtain the original DNA template sequence of each UMI family;
performing STR typing on all members of each UMI family containing the same UMI adapter sequence, and calculating the slippage percentage of each UMI family;
wherein the sequences that do not meet the preset standard comprise:
sequences in which the 3' end of the single-end read contains preset-quality bases whose number exceeds 1/3 of the number of bases in the sequence, wherein the preset-quality bases are bases with a quality value less than or equal to 20;
sequences with a length of less than 100 bp;
the method further comprising the steps of:
determining the main type and each stutter according to the member counts of the UMI family;
obtaining the STR slippage percentage from the sum of the member counts of the stutters and the member count of the UMI family;
obtaining the proportion of each stutter from the number of that stutter and the number of the main type;
wherein the number of members of each UMI family is greater than or equal to 30; the length of the identification code sequence is greater than or equal to 12 bp, and the length of the UMI adapter sequence is greater than or equal to 8 bp;
wherein performing STR typing on all members of each UMI family containing the same UMI adapter sequence comprises: performing STR typing on all members of each UMI family containing the same UMI adapter sequence according to length polymorphism;
wherein aligning the sequences to be aligned with the reference genome of the sample to be tested to obtain original extracted sequences comprises:
aligning the sequences to be aligned with the reference genome, retaining the sequences containing the STR conserved motif and the sequences containing the UMI adapter sequence, and simultaneously extracting the STR conserved motif sequence and the UMI adapter sequence to obtain the original extracted sequences.
2. The method of claim 1, wherein the high-throughput sequencing comprises paired-end sequencing or single-end sequencing.
3. Use of a molecular specific barcode in second-generation sequencing to evaluate STR slippage, characterized in that the use comprises applying the method of any one of claims 1-2 in the field of individual identification and authentication.
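The read filter and the per-family stutter statistics recited in claim 1 can be sketched as follows. This is a hedged illustration, not the claimed implementation: the function and constant names are hypothetical, low-quality bases are counted over the whole read (the claim refers to the 3' end without fixing a window), and the slippage percentage is taken as the summed stutter members divided by the family size.

```python
from collections import Counter

MIN_READ_LEN = 100     # bp, from claim 1
LOW_QUALITY_MAX = 20   # quality value at or below which a base is low quality
MIN_FAMILY_SIZE = 30   # minimum number of members per UMI family


def passes_preset_standard(qualities):
    """Return True if a read meets the preset standard.

    qualities: per-base quality values of one read.
    """
    n = len(qualities)
    if n < MIN_READ_LEN:
        return False
    low = sum(1 for q in qualities if q <= LOW_QUALITY_MAX)
    # Reject when low-quality bases exceed 1/3 of the read (integer-safe).
    return low * 3 <= n


def family_stutter_stats(allele_lengths):
    """Summarize one UMI family typed by length polymorphism.

    allele_lengths: allele length typed for each family member.
    Returns (main_type, slippage_percent, {stutter: proportion_of_main}),
    or None when the family has fewer than MIN_FAMILY_SIZE members.
    """
    size = len(allele_lengths)
    if size < MIN_FAMILY_SIZE:
        return None
    counts = Counter(allele_lengths)
    main_type, main_n = counts.most_common(1)[0]
    stutters = {a: n for a, n in counts.items() if a != main_type}
    slippage_percent = 100.0 * sum(stutters.values()) / size
    proportions = {a: n / main_n for a, n in stutters.items()}
    return main_type, slippage_percent, proportions


# Illustrative family of 40 members: 32 main-type reads and 8 stutter reads.
stats = family_stutter_stats([15] * 32 + [14] * 6 + [16] * 2)
```

Comparing `low * 3` with `n` avoids floating-point division when testing the 1/3 limit.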
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111141442.4A CN113930490B (en) | 2021-09-27 | 2021-09-27 | Method for evaluating STR slippage by using molecular specific bar code in second generation sequencing platform and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113930490A (en) | 2022-01-14 |
CN113930490B (en) | 2024-09-10 |
Family
ID=79277122
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||