Corn chloroplast InDel molecular marker suitable for capillary electrophoresis detection platform
Technical Field
The invention belongs to the technical field of crop molecular biology, and particularly relates to a set of corn chloroplast InDel molecular markers suitable for a capillary electrophoresis detection platform.
Background
Chloroplasts are the organelles in which green plants carry out photosynthesis, and they possess their own complete genome, known as the chloroplast genome. The structure of the chloroplast genome is well conserved: the DNA is generally a double-stranded circular molecule, and in higher plants the chloroplast genome is generally 120-160 kb in size. The double-stranded circular DNA consists of 4 basic parts, namely a large single-copy region (LSC), a small single-copy region (SSC), and two inverted repeats (IRs).
Chloroplast genome information is widely used in research on and application of plant variety and germplasm resource identification, genetic relationship evaluation, phylogenetic evolution, cytoplasmic genetic characteristics and the like because of the following advantages. (1) The chloroplast genome is small and relatively conserved, and its complete sequence is easily obtained; (2) chloroplast genes are maternally inherited, gene exchange and fusion among different individuals rarely occur, and chloroplast genes show good collinearity; (3) apart from the inverted repeat regions, chloroplast genes are single-copy, so there is almost no interference from paralogous genes; (4) the coding and non-coding regions of the chloroplast genome evolve at clearly different rates, and some highly variable regions exist, so that problems at lower-level taxonomic units can be resolved.
At present, the identification of corn varieties, breeding materials and germplasm resources relies on the genetic information of the corn nuclear genome. Examples include the SSR marker method for identifying corn varieties, which uses 40 pairs of SSR primers (Wang Fengge et al., 2014); the maizeSNP3072 chip suitable for corn DNA fingerprinting (Tian et al., 2015); the commercial corn chip product MaizeSNP50K (Ganal et al., 2011); and GBS-based high-throughput sequencing, reduced-representation genome sequencing methods, and the like. The existing molecular identification methods for corn samples are all based on sequence information from the corn nuclear genome. In addition to nuclear inheritance, however, plant cells also have cytoplasmic inheritance, namely chloroplast and mitochondrial genome information. Cytoplasmic genome information, particularly chloroplast genome information, has the advantages described above and is better suited to applications such as maternal traceability of corn samples. The existing collections of maize genome polymorphic sites contain no polymorphic sites developed for the chloroplast genome.
Disclosure of Invention
The invention aims to provide a set of chloroplast InDel molecular markers suitable for maternal traceability of corn.
To achieve this purpose, 170 maize inbred lines of wide origin, rich in phenotypes and genotypes and strongly representative, were collected; the chloroplast genomes of these materials were sequenced and their nucleotide polymorphisms compared, and polymorphic sites of the maize chloroplast genome were developed. The general concept and steps are as follows.
(1) Selecting 170 representative maize test materials, covering all heterotic groups in China as well as three further types of samples: sweet and waxy corn, local varieties, and CMS sterile lines.
(2) Preparing high-concentration, high-quality total DNA.
(3) High-throughput sequencing on a second-generation sequencing platform, with a constructed library size of 500 bp, PE150 reads, and a sequencing depth of 5 times.
(4) Processing the whole-genome sequence data and assembling the chloroplast genome: assembling independently with two software packages, screening the contigs belonging to the chloroplast genome with the BLAST program against the maize B73 chloroplast genome sequence, then assembling them and verifying the accuracy of the sequence.
(5) Chloroplast genome annotation and polymorphic site determination, the chloroplast genome being annotated with DOGMA software.
(6) Aligning the chloroplast genome sequences of the 170 materials and screening out 11 InDel variation sites of the chloroplast genome; the information on the 11 InDel sites is shown in Table 1.
(7) Designing primers for the capillary electrophoresis platform using Primer 6 software.
(8) Primer evaluation and verification: selecting 28 representative samples of various types to evaluate and verify the primers on the capillary electrophoresis platform.
(9) Obtaining a set of InDel polymorphic primer combinations suitable for a capillary electrophoresis detection platform. Based on the test data of the 28 samples, the primers are evaluated on the following aspects: whether amplification succeeds, whether the marker reflects maternal inheritance, whether the result is consistent with the sequencing result, and whether the marker is polymorphic. The evaluation results show that 6 pairs of primers give a good amplification effect on the capillary electrophoresis platform; the amplification results of the 11 pairs of primers for the 11 InDel molecular markers are shown in Table 2.
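The four evaluation criteria in step (9) amount to a simple all-or-nothing filter. The following is a minimal illustrative sketch, not part of the described method: the record type, its field names and the marker names are hypothetical, and the values are invented for illustration rather than taken from Table 2.

```python
from dataclasses import dataclass

@dataclass
class PrimerEvaluation:
    """Outcome of testing one primer pair on the evaluation samples (hypothetical record)."""
    marker: str
    amplified: bool            # a PCR product was detected on capillary electrophoresis
    maternal: bool             # hybrids show the allele of their female parent
    matches_sequencing: bool   # fragment sizes agree with the sequenced InDel
    polymorphic: bool          # more than one allele is seen among the samples

def passes(ev: PrimerEvaluation) -> bool:
    """A primer pair is retained only if it meets all four criteria."""
    return ev.amplified and ev.maternal and ev.matches_sequencing and ev.polymorphic

# Illustrative values only; they are not taken from Table 2.
evaluations = [
    PrimerEvaluation("marker_A", True, True, True, True),
    PrimerEvaluation("marker_B", True, True, True, False),
    PrimerEvaluation("marker_C", False, False, False, False),
]
retained = [ev.marker for ev in evaluations if passes(ev)]
print(retained)
```

In this sketch a primer pair that amplifies well but shows no polymorphism (marker_B) is rejected just like one that fails to amplify.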
The invention provides chloroplast molecular markers for maternal traceability of corn, wherein the molecular markers are one or more of the following 11 InDel molecular markers, and the information of the 11 InDel molecular markers is shown in Table 1.
Table 1. Information on the 11 InDel molecular markers
The 6 successfully verified InDel molecular markers, CPMIDP01-04 and CPMID06-07, are amplified with the following primer pairs, respectively: SEQ ID NO.13-14, SEQ ID NO.15-16, SEQ ID NO.17-18, SEQ ID NO.19-20, SEQ ID NO.21-22, SEQ ID NO.23-24.
Table 2. Primers for the 11 InDel molecular markers and amplification verification results
Further, the invention provides a chloroplast molecular marker for corn maternal traceability, which is an InDel molecular marker numbered CPMIDP01-04 and CPMID06-07 in Table 1.
The 6 pairs of InDel primers provided by the invention can be used to obtain genotyping data on a fluorescence capillary electrophoresis platform. The specific scheme is as follows: the 5' end of one primer of each pair is labeled with a fluorescent group; a PCR reaction system is prepared containing DNA, primers, dNTP, MgCl2, Taq enzyme and Buffer; the reaction program is run; the amplification products are detected on a fluorescence capillary electrophoresis system; and the raw data are collected with the software supplied with the capillary electrophoresis system and imported into genotyping software for analysis, giving genotype data in fragment-length format.
The invention provides application of the chloroplast molecular marker in constructing a corn variety chloroplast DNA fingerprint database. The method specifically comprises the following steps: (1) extracting the total genome DNA of a variety for constructing a fingerprint database; (2) the chloroplast genome DNA fingerprint database is obtained by using the set of InDel primers provided by the invention and based on a fluorescence capillary electrophoresis platform.
The invention provides application of the chloroplast molecular marker in maternal traceability analysis of a corn sample. The primer for amplifying the InDel molecular marker is developed based on a corn chloroplast genome and is a strict maternal genetic marker, so that the primer can be applied to maternal traceability analysis of a corn sample. The application is based on the construction of a known maize inbred line variety chloroplast genome InDel-DNA fingerprint database.
The invention provides application of the chloroplast molecular marker in identifying direct and reciprocal crosses of a corn sample. Maize hybrids are typically produced from two inbred parents, and the parents typically belong to different heterotic groups. Analysis of the 170 corn samples shows that, except for a few heterotic patterns such as Reid/Lancaster, the female parent and the male parent of most heterotic patterns can be distinguished by chloroplast markers. Therefore, seeds produced from the same hybrid combination by direct and by reciprocal crossing can be told apart by extracting total DNA and typing the chloroplast marker loci. Compared with the traditional approach of extracting pericarp DNA and endosperm DNA separately, the method of the invention is simple and easy to carry out.
The invention provides application of the chloroplast molecular marker in maize molecular marker-assisted breeding.
The invention provides application of the chloroplast molecular marker in preparation of a corn genome chip.
The invention provides application of the chloroplast molecular marker in identifying the genotype of a maize filial generation.
The invention provides application of the chloroplast molecular marker in identification of maize germplasm resources.
Any of the applications described above, comprising the steps of:
1) extracting DNA of a corn sample to be detected;
2) performing PCR amplification with the primers for the InDel molecular marker, using the DNA extracted in step 1) as a template;
3) detecting the PCR product with a fluorescence capillary electrophoresis system.
In step 2) of the above application, the PCR reaction system is 20 μL and comprises 4 μL of DNA, 0.25 μmol/L of primer, 0.15 μmol/L of dNTP, 2.5 mmol/L of MgCl2, 1 unit of Taq enzyme (Genacea, USA), and 1× PCR Buffer. The PCR reaction program is: 94 °C for 5 min; 35 cycles of 94 °C for 40 s, 60 °C for 35 s and 72 °C for 45 s; 72 °C for 10 min; and storage at 4 °C.
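As a quick arithmetic check on the conditions just stated, the sketch below adds up the hold times of the thermal program and converts the primer concentration into an absolute amount per reaction. It is an illustrative calculation only; it assumes the stated times are pure hold times and ignores the instrument's ramping between temperatures, so the real run takes longer.

```python
# Thermal program as stated in the text: (step name, temperature in deg C, hold time in seconds).
INITIAL_DENATURATION = ("initial denaturation", 94, 5 * 60)
CYCLE_STEPS = [("denaturation", 94, 40), ("annealing", 60, 35), ("extension", 72, 45)]
CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 10 * 60)

def program_seconds() -> int:
    """Total hold time of the program, excluding ramping and the 4 deg C storage step."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + CYCLES * per_cycle + FINAL_EXTENSION[2]

def amount_pmol(conc_umol_per_l: float, volume_ul: float) -> float:
    """umol/L multiplied by uL gives pmol (1e-6 mol/L * 1e-6 L = 1e-12 mol)."""
    return conc_umol_per_l * volume_ul

total = program_seconds()
print(f"{total} s = {total / 60:.0f} min")              # 5100 s = 85 min
print(f"primer per reaction: {amount_pmol(0.25, 20)} pmol")  # 5.0 pmol
```

Each cycle holds for 120 s, so the 35 cycles account for 70 of the 85 minutes.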
Electrophoresis and fingerprint data acquisition: the PCR products are electrophoresed on a capillary fluorescence electrophoresis system, the ABI 3730XL DNA Analyzer (Applied Biosystems, USA). The PCR product, formamide, and an internal size standard (GeneScan™-500 LIZ, Applied Biosystems, USA) are added to each well of a 96-well electrophoresis plate. The mixed samples are denatured on a PCR instrument at 95 °C for 5 min, taken out, centrifuged at 1000 rpm for 1 min, and then electrophoresed on the ABI 3730XL DNA Analyzer. The raw data are collected with the Data Collection Ver. 1.0 software supplied with the electrophoresis instrument and imported into data analysis software to obtain genotype data in fragment-length format.
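Sizing against an internal standard typically yields fragment lengths with a decimal part, which then have to be assigned to discrete alleles. The following minimal sketch shows one way this binning could be done; it is not the analysis software referred to above, and the function name, the expected allele sizes and the 0.5 bp tolerance are all assumptions chosen for illustration.

```python
from typing import Optional, Sequence

def call_allele(observed_bp: float, expected_bp: Sequence[int],
                tolerance: float = 0.5) -> Optional[int]:
    """Return the expected allele size closest to the observed peak,
    or None if no expected allele lies within the tolerance."""
    best = min(expected_bp, key=lambda size: abs(size - observed_bp))
    return best if abs(best - observed_bp) <= tolerance else None

# Hypothetical marker with two alleles that differ by a 5 bp InDel.
expected = [212, 217]
print(call_allele(216.8, expected))  # 217
print(call_allele(214.5, expected))  # None: too far from either allele
```

Returning None instead of forcing a call keeps off-size peaks visible for manual review.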
The key points of the technology of the invention are as follows:
(1) Analysis of the maize whole-genome sequencing data. Because the maize genome is large and complex, processing the whole-genome sequencing data is the first difficulty encountered and a key point. The analysis comprises the following steps: evaluating the quality of the raw data, then assembling independently with the SPAdes and SOAPdenovo2 software to obtain high-quality contigs. Sequence assembly is the key step, and its parameter settings are relatively strict.
(2) Separation of the maize chloroplast genome data. Separating the chloroplast data from the total DNA data is a difficulty and a key point of the invention. Because the chloroplast genome sequence is relatively conserved and the sequencing quality and read length are sufficient, the invention obtains the chloroplast genome data by assembly combined with alignment against the maize chloroplast genome. The main steps are: screening the chloroplast genome contigs with the BLAST program, assembling the chloroplast genome contigs with Sequencher software, and comparing the assembled sequence against the maize chloroplast reference genome (maize variety B73, Version 3) for verification and confirmation, which ensures that an accurate and reliable chloroplast genome sequence is obtained.
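The contig screening step can be illustrated with a small filter over BLAST results. This is a hedged sketch rather than the procedure actually used: it assumes BLAST was run with tabular output (-outfmt 6, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), and the identity and length thresholds, the contig names and the hit values are hypothetical.

```python
def chloroplast_contigs(blast_tsv: str, min_identity: float = 95.0,
                        min_length: int = 500) -> list[str]:
    """Return IDs of contigs with at least one hit that passes both thresholds,
    in the order in which they first appear."""
    kept: list[str] = []
    for line in blast_tsv.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        contig, identity, aln_length = fields[0], float(fields[2]), int(fields[3])
        if identity >= min_identity and aln_length >= min_length and contig not in kept:
            kept.append(contig)
    return kept

# Made-up hits of assembled contigs against a chloroplast reference.
rows = [
    ("contig_1", "B73_cp", "99.8", "35000", "70", "2", "1", "35000", "100", "35099", "0.0", "64000"),
    ("contig_2", "B73_cp", "88.0", "300", "36", "0", "1", "300", "5000", "5299", "1e-90", "330"),
    ("contig_3", "B73_cp", "99.5", "12000", "60", "1", "1", "12000", "80000", "91999", "0.0", "21900"),
]
blast_text = "\n".join("\t".join(row) for row in rows)
print(chloroplast_contigs(blast_text))  # ['contig_1', 'contig_3']
```

The short, lower-identity hit (contig_2) is dropped, which is the kind of contig that the later assembly and reference comparison would otherwise have to exclude.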
(3) Determination of the chloroplast InDel polymorphic sites, which are the key data and result of the invention. Representative sample selection, high-quality sequencing data and accurate data analysis ensure that the chloroplast polymorphic sites are obtained accurately and efficiently. 170 materials of wide origin, rich in phenotypes and genotypes, are selected; the 170 assembled chloroplast genome sequences are searched for SSR sequences with MISA software, and the chloroplast InDel polymorphic sites are finally determined. Materials of different genetic backgrounds and high-quality gene sequences are important factors in obtaining accurate and reliable chloroplast polymorphic sites.
Based on high-quality re-sequencing data and the characteristics of the chloroplast genome, the invention develops a set of polymorphic InDel primers for the maize chloroplast genome. Compared with nuclear genome markers, these primers are better suited to constructing a maize chloroplast DNA fingerprint database, to maternal traceability, and to identifying direct and reciprocal crosses. The chloroplast InDel polymorphic sites and their primers extend the range of marker sites available for corn at the genome level, and provide a new idea and method for research on maize variety and germplasm resource identification, genetic relationship evaluation, cytoplasmic genetic characteristics and the like.
Drawings
FIG. 1 is a technical roadmap for the development and evaluation of 11 maize chloroplast INDEL loci in example 1 of the invention.
FIGS. 2A-2D and FIGS. 2E-2H are electropherograms of direct and reciprocal cross identification obtained with the primers for the InDel molecular markers CPMIDP02 (SEQ ID NO.15-16) and CPMIDP04 (SEQ ID NO.19-20), respectively. FIGS. 2A-2D show the maize samples Zheng 58 (A), Chang 7-2 (B), Zhengdan 958 direct-cross seed (C) and Zhengdan 958 reciprocal-cross seed (D); FIGS. 2E-2H show the maize samples Jing 724 (E), Jing 92 (F), Jingke 968 direct-cross seed (G) and Jingke 968 reciprocal-cross seed (H).
Detailed Description
The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention. Unless otherwise indicated, the examples follow conventional experimental conditions, such as the Molecular Cloning handbook, Sambrook et al (Sambrook J & Russell DW, Molecular Cloning: a Laboratory Manual,2001), or the conditions as recommended by the manufacturer's instructions. Those skilled in the art will appreciate that the details of the present invention not described in detail herein are well within the skill of those in the art.
Example 1 acquisition of InDel polymorphic sites and primers in maize chloroplast genome
Sample selection: 170 widely representative maize inbred lines were selected for whole-genome sequencing. The 170 samples cover maize types such as common corn, waxy corn, sweet corn and popcorn, and include all heterotic groups in China: Tangsipingtou, Lüda Red Cob, Reid, Lancaster, improved Reid, improved Lancaster, the P group, and local varieties.
Sample preparation: the 170 corn samples were grown in an incubator for 5 days, without light for the first 3 days and with light for the following 2 days (i.e., sufficient light was given after emergence). Leaves were taken from 30 seedlings of each sample, mixed, and ground thoroughly in liquid nitrogen. Total DNA was extracted by the CTAB method and RNA was removed. The quality of the extracted DNA was checked by UV spectrophotometry and by agarose electrophoresis. Agarose electrophoresis showed a single, undegraded DNA band; the UV spectrophotometer gave A260/280 between 1.8 and 2.0 (the DNA is free of protein contamination and the RNA content is low) and A260/230 between 1.8 and 2.0 (the salt ion content of the DNA is low); the DNA concentration was greater than 1000 ng/μL.
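The spectrophotometric acceptance limits given above can be written as a single check. This is a minimal sketch with a hypothetical function name; it covers only the three numeric criteria and does not replace the agarose gel inspection for a single, undegraded band.

```python
def dna_passes_qc(a260_280: float, a260_230: float, conc_ng_per_ul: float) -> bool:
    """Apply the numeric limits stated in the text:
    A260/280 and A260/230 both within 1.8-2.0, concentration above 1000 ng/uL."""
    return (1.8 <= a260_280 <= 2.0
            and 1.8 <= a260_230 <= 2.0
            and conc_ng_per_ul > 1000)

# Illustrative readings, not measured values.
print(dna_passes_qc(1.85, 1.92, 1250))  # True
print(dna_passes_qc(1.65, 1.92, 1250))  # False: A260/280 below 1.8
print(dna_passes_qc(1.85, 1.92, 800))   # False: concentration too low
```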
High-throughput sequencing of total DNA: the DNA of the 170 corn samples and the PCR products were fragmented by ultrasound, DNA fragments of 400-600 bp were recovered by gel excision, and a library with a size of 500 bp was constructed using a library construction kit. Sequencing was performed on a HiSeq 4000 platform in PE150 mode at a sequencing depth of 5 times, giving on average about 10 GB of data per sample.
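The stated data volume and depth can be related by a back-of-the-envelope calculation. The sketch below is illustrative only and rests on two assumptions that are not in the text: a maize nuclear genome size of roughly 2.3 Gb, and "about 10 GB" being read as about 10 billion sequenced bases.

```python
MAIZE_GENOME_BP = 2.3e9   # assumption: approximate size of the maize nuclear genome
DATA_BP = 10e9            # assumption: "about 10 GB" taken as ~10 billion bases
READ_LENGTH = 150         # PE150: two reads of 150 bases per fragment

depth = DATA_BP / MAIZE_GENOME_BP
read_pairs = DATA_BP / (2 * READ_LENGTH)
print(f"nuclear depth ~ {depth:.1f}x, read pairs ~ {read_pairs / 1e6:.0f} million")
```

Under these assumptions the nuclear depth comes out at roughly 4 to 5 times, in line with the stated depth. Because chloroplast DNA is typically present in many copies per cell, the chloroplast genome is usually covered far more deeply than the nuclear genome in such total-DNA data, which is what makes assembly from low-depth sequencing feasible.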
High-throughput sequencing data processing and chloroplast genome assembly: the high-throughput sequencing data were assembled independently with two software packages, SPAdes (Bankevich et al., 2012) and SOAPdenovo2 (Luo et al., 2012). From the contigs assembled by each software, the chloroplast genome contigs were screened out with the BLAST program (Altschul et al., 1997); the selected chloroplast genome contigs were assembled with Sequencher. All reads were then mapped to the assembled chloroplast genome sequence with Geneious 8.1 (Kearse et al., 2012) to verify that the assembled contig sequence was correct.
Chloroplast genome annotation and polymorphic site determination: the chloroplast genome was annotated with DOGMA (Dual Organellar GenoMe Annotator) (Wyman et al., 2004), and BLASTX and BLASTN searches were used to identify the locations of the coding genes. The 170 maize chloroplast genomes were aligned with MAFFT software (Katoh and Standley, 2013) and then adjusted manually with Se-Al software. Inversions occurring within the sequences were handled by pulling them apart in the alignment, so that they would not be scored as false polymorphisms. DnaSP 5.0 (Librado and Rozas, 2009) was used to count the variable sites and sequence polymorphism among the chloroplast genomes, and 11 InDel polymorphic sites were developed. The physical locations and flanking sequences of these 11 sites were determined from the chloroplast genome sequence of maize variety B73; the 11 sites are detailed in Table 1.
Primer design and evaluation: primers were designed for the capillary electrophoresis platform with Primer 6 software, and the 5' end of the forward primer was modified with the FAM fluorophore. For primer evaluation and verification, 28 representative samples of various types were selected and the primers were evaluated on the following aspects: whether amplification succeeds, whether the marker reflects maternal inheritance, whether the result is consistent with the sequencing result, and whether the marker is polymorphic. The evaluation results show that 6 pairs of primers give a good amplification effect on the capillary electrophoresis platform.
PCR amplification: the reaction system is 20 μL and comprises 4 μL of DNA, 0.25 μmol/L of primer, 0.15 μmol/L of dNTP, 2.5 mmol/L of MgCl2, 1 unit of Taq enzyme (Genaceae, USA), and 1× PCR Buffer. The amplification program is: 94 °C for 5 min; 35 cycles of 94 °C for 40 s, 60 °C for 35 s and 72 °C for 45 s; 72 °C for 10 min; and storage at 4 °C.
Electrophoresis and fingerprint data acquisition: the PCR products were electrophoresed on a capillary fluorescence electrophoresis system, the ABI 3730XL DNA Analyzer (Applied Biosystems, USA). The PCR product, formamide, and an internal size standard (GeneScan™-500 LIZ, Applied Biosystems, USA) were added to each well of a 96-well electrophoresis plate. The mixed samples were denatured on a PCR instrument at 95 °C for 5 min, taken out, centrifuged at 1000 rpm for 1 min, and then electrophoresed on the ABI 3730XL DNA Analyzer. The raw data were collected with the Data Collection Ver. 1.0 software supplied with the electrophoresis instrument and imported into data analysis software to obtain genotype data in fragment-length format.
The technical roadmap for the development and evaluation of the 11 maize chloroplast InDel loci of the invention is shown in FIG. 1. Amplification primers were designed for the 11 InDel molecular markers and their amplification effect was verified; the amplification results are shown in Table 2. The 6 successfully verified InDel molecular markers, CPMIDP01-04 and CPMID06-07, are amplified with the following primer pairs, respectively: SEQ ID NO.13-14, SEQ ID NO.15-16, SEQ ID NO.17-18, SEQ ID NO.19-20, SEQ ID NO.21-22, SEQ ID NO.23-24.
FIGS. 2A-2D and FIGS. 2E-2H show the direct and reciprocal cross identification results obtained with the primers for the molecular markers CPMIDP02 and CPMIDP04, respectively: FIGS. 2A-2D for the maize samples Zheng 58, Chang 7-2, Zhengdan 958 direct-cross seed and Zhengdan 958 reciprocal-cross seed; FIGS. 2E-2H for the maize samples Jing 724, Jing 92, Jingke 968 direct-cross seed and Jingke 968 reciprocal-cross seed.
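To make the idea of reading InDel sites off a multiple alignment concrete, the sketch below scans aligned sequences for runs of columns in which some, but not all, sequences carry a gap. It is a simplified illustration and not the DnaSP-based procedure described above; the function name and the toy sequences are hypothetical.

```python
def indel_regions(alignment: dict[str, str]) -> list[tuple[int, int]]:
    """Return 0-based, end-exclusive column ranges in which some but not all
    aligned sequences have a gap character ('-')."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    regions: list[tuple[int, int]] = []
    start = None
    for col in range(length):
        gaps = sum(s[col] == "-" for s in seqs)
        is_indel = 0 < gaps < len(seqs)
        if is_indel and start is None:
            start = col                      # a polymorphic gap run begins
        elif not is_indel and start is not None:
            regions.append((start, col))     # the run ended at the previous column
            start = None
    if start is not None:
        regions.append((start, length))      # run extends to the end of the alignment
    return regions

# Toy alignment: line_2 lacks a 5 bp stretch present in the other two lines.
toy = {
    "line_1": "ACGTTTTTAGC",
    "line_2": "ACG-----AGC",
    "line_3": "ACGTTTTTAGC",
}
print(indel_regions(toy))  # [(3, 8)]
```

One limitation of this simple scan is that adjacent or overlapping InDels with different boundaries are merged into a single region, so real data would still need the manual inspection described in the text.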
Example 2 construction of maize variety chloroplast DNA-InDel fingerprint database Using InDel polymorphic primers provided by the invention
DNA extraction from the corn varieties: each sample was sown and grown under light to obtain green seedlings. DNA was extracted from bulked plants, the green leaves of 30 individual plants being mixed; the specific DNA extraction steps follow the maize DNA molecular identification standard (Wang Fengge et al., 2014). The DNA was diluted to a working solution with a concentration of 20 ng/μL.
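The dilution to a 20 ng/μL working solution follows the usual relation C1 × V1 = C2 × V2. The sketch below works one case through; the stock concentration of 1000 ng/μL and the final volume of 100 μL are assumed values for illustration and are not stated in the text.

```python
def dilution_volumes(stock_ng_per_ul: float, target_ng_per_ul: float,
                     final_volume_ul: float) -> tuple[float, float]:
    """Return (stock volume, diluent volume) in uL, from C1 * V1 = C2 * V2."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul

# Assumed example: 1000 ng/uL stock diluted to 100 uL of 20 ng/uL working solution.
print(dilution_volumes(1000, 20, 100))  # (2.0, 98.0)
```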
Primer synthesis: primers were synthesized according to the 6 pairs of primer sequences successfully verified as provided in Table 2, i.e., the 6 pairs of primer nucleotide sequences determined in example 1 were SEQ ID NO.13-14, SEQ ID NO.15-16, SEQ ID NO.17-18, SEQ ID NO.19-20, SEQ ID NO.21-22, and SEQ ID NO.23-24, respectively; one of each pair of primers is labeled at its 5' end with a fluorescent group.
PCR amplification: the PCR reaction system is 20 μL and comprises 4 μL of DNA, 0.25 μmol/L of primer, 0.15 μmol/L of dNTP, 2.5 mmol/L of MgCl2, 1 unit of Taq enzyme (Genacea, USA), and 1× PCR Buffer. The PCR reaction program is: 94 °C for 5 min; 35 cycles of 94 °C for 40 s, 60 °C for 35 s and 72 °C for 45 s; 72 °C for 10 min; and storage at 4 °C.
Electrophoresis and fingerprint data acquisition: the PCR products were electrophoresed on a capillary fluorescence electrophoresis system, the ABI 3730XL DNA Analyzer (Applied Biosystems, USA). The PCR product, formamide, and an internal size standard (GeneScan™-500 LIZ, Applied Biosystems, USA) were added to each well of a 96-well electrophoresis plate. The mixed samples were denatured on a PCR instrument at 95 °C for 5 min, taken out, centrifuged at 1000 rpm for 1 min, and then electrophoresed on the ABI 3730XL DNA Analyzer. The raw data were collected with the Data Collection Ver. 1.0 software supplied with the electrophoresis instrument and imported into data analysis software to obtain genotype data in fragment-length format.
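Once fragment lengths have been obtained for every variety and marker, the fingerprint database of this example can be thought of as a table keyed by variety. The sketch below shows one possible in-memory representation; it is an assumption about data layout rather than the format actually used, and the variety names, marker names and fragment lengths are placeholders, not values from Table 3.

```python
from collections import defaultdict

def build_database(records):
    """Build {variety: {marker: fragment_length_bp}} from
    (variety, marker, fragment_length_bp) records."""
    database = defaultdict(dict)
    for variety, marker, size_bp in records:
        database[variety][marker] = size_bp
    return dict(database)

# Placeholder genotype records in fragment-length format.
records = [
    ("inbred_X", "M1", 212), ("inbred_X", "M2", 305),
    ("inbred_Y", "M1", 217), ("inbred_Y", "M2", 305),
]
database = build_database(records)
print(database["inbred_Y"])  # {'M1': 217, 'M2': 305}
```

Keying each fingerprint by marker name, rather than by position in a list, keeps comparisons valid when a sample is typed with only a subset of the primer pairs.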
Example 3 maternal traceability analysis was performed using the maize chloroplast genome InDel polymorphic primers provided by the invention
DNA was extracted from green seedling leaves of the corn to be identified, and PCR amplification and fluorescence capillary electrophoresis were carried out with the 6 pairs of successfully verified InDel polymorphic primers determined in Example 1 of the invention to obtain fingerprint data. The specific procedure is the same as in Example 2. The InDel fingerprint data of sample A (Jingke 968) were compared with the chloroplast InDel marker fingerprint database of known maize inbred lines; the chloroplast fingerprint of the sample was found to be identical to that of inbred line B (Jing 724), from which it is presumed that the female parent of hybrid A (Jingke 968) is B (Jing 724).
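The database comparison in this example is an exact-match lookup. The following minimal sketch illustrates it under stated assumptions: the database layout ({line: {marker: fragment length}}), the function name, and all line names, marker names and fragment lengths are hypothetical.

```python
def maternal_candidates(sample: dict, database: dict) -> list:
    """Return the inbred lines whose fingerprint equals the sample's
    at every marker typed in the sample, sorted by name."""
    return sorted(
        name for name, fingerprint in database.items()
        if all(fingerprint.get(marker) == size for marker, size in sample.items())
    )

# Placeholder fingerprints; the values are not taken from Table 3.
database = {
    "inbred_X": {"M1": 212, "M2": 305, "M3": 148},
    "inbred_Y": {"M1": 217, "M2": 305, "M3": 148},
    "inbred_Z": {"M1": 217, "M2": 311, "M3": 148},
}
hybrid = {"M1": 217, "M2": 305, "M3": 148}
print(maternal_candidates(hybrid, database))  # ['inbred_Y']
```

The function returns a list because several inbred lines may share the same chloroplast haplotype; a match therefore supports a presumed female parent, as the example words it, rather than proving one.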
Example 4 identification of maize samples for reciprocal crossing by using the maize chloroplast genome InDel polymorphic primers provided by the invention
To identify whether hybrid sample C (Zhengdan 958) comes from a direct or a reciprocal cross, DNA was extracted from green seedling leaves grown from the seed. Based on the DNA fingerprint data of the female and male parents of Zhengdan 958 in the chloroplast InDel marker fingerprint database of known maize inbred lines, those primers that are polymorphic between the two parents of Zhengdan 958 were selected from the 6 pairs of successfully verified InDel polymorphic primers of Example 1 and used for PCR amplification; the fingerprint of sample C was obtained after analysis of the fluorescence electrophoresis data, the specific method being the same as in Example 2. The fingerprint data of sample C were then compared with those of the female parent (sample D, Zheng 58) and the male parent (sample E, Chang 7-2). If the fingerprint data of sample C are the same as those of sample D, the female parent, sample C is from the direct cross; if the fingerprint data of sample C are the same as those of sample E, the male parent, sample C is from the reciprocal cross.
The fingerprint data of the 6 maize varieties involved in the invention for the chloroplast InDel polymorphic primers are shown in Table 3.
Table 3. Fingerprint data of the 6 maize varieties for the chloroplast polymorphic primers
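The decision rule of this example, including the selection of primers that are polymorphic between the two parents, can be sketched as follows. This is an illustration under assumptions: fingerprints are represented as {marker: fragment length} dictionaries, and the function names, marker names and fragment lengths are hypothetical.

```python
def informative_markers(female: dict, male: dict) -> list:
    """Markers typed in both parents at which the two parents differ."""
    return sorted(m for m in female if m in male and female[m] != male[m])

def cross_direction(sample: dict, female: dict, male: dict) -> str:
    """Classify a hybrid sample as 'direct' (chloroplast type of the nominal
    female parent), 'reciprocal' (chloroplast type of the nominal male parent),
    or 'undetermined'."""
    markers = informative_markers(female, male)
    if not markers:
        return "undetermined"  # parents are indistinguishable with these markers
    if all(sample.get(m) == female[m] for m in markers):
        return "direct"
    if all(sample.get(m) == male[m] for m in markers):
        return "reciprocal"
    return "undetermined"

# Placeholder parental fingerprints.
female_parent = {"M1": 212, "M2": 305}
male_parent = {"M1": 217, "M2": 305}
print(informative_markers(female_parent, male_parent))                    # ['M1']
print(cross_direction({"M1": 212, "M2": 305}, female_parent, male_parent))  # direct
print(cross_direction({"M1": 217, "M2": 305}, female_parent, male_parent))  # reciprocal
```

The explicit "undetermined" outcome covers parental pairs whose chloroplast fingerprints do not differ, which corresponds to the few heterotic patterns that the description notes cannot be distinguished by chloroplast markers.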
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.