Disclosure of Invention
The invention aims to provide an SNP variation in the upstream regulatory region of the upland cotton gene GhHRK and an application thereof. The gene has been reported to negatively regulate high temperature resistance in cotton, and the SNP variation can be used to efficiently distinguish high-temperature-resistant cotton germplasm from high-temperature-sensitive cotton germplasm.
The invention provides an SNP variation in the upstream regulatory region of the GhHRK gene, wherein the nucleotide sequence containing the SNP variation is shown as SEQ ID NO.1, the SNP variation is located at 9094715 bp on chromosome A01 of the upland cotton genome, and, compared with the reference genome, the SNP variation is a G->A allelic variation.
The invention also provides a KASP marker based on the SNP variation, wherein the nucleotide sequences of the upstream primer and the downstream primer of the KASP marker are shown as SEQ ID NO.3 and SEQ ID NO.4, respectively.
The invention also provides a kit comprising the primers of the KASP marker and PCR amplification reagents.
Preferably, the PCR amplification reagents include Taq DNA polymerase, dNTPs and buffer reagents.
The invention also provides an application of the SNP variation, the KASP marker or the kit described in the above technical schemes in cotton breeding.
Preferably, the cotton breeding includes identifying the high temperature resistance of cotton.
Preferably, the cotton comprises upland cotton.
The invention also provides a method for detecting the high temperature resistance of cotton, which comprises the following steps:
performing PCR amplification on the genomic DNA of the cotton germplasm to be tested using the primers described in the above technical scheme, to obtain a PCR amplification product;
performing electrophoresis detection on the PCR amplification product, wherein when a band is specifically amplified and the band pattern contains a 300 bp band, the cotton germplasm to be tested is high-temperature-resistant cotton germplasm;
and when no band is specifically amplified, or the amplified band pattern does not contain a 300 bp band, the cotton germplasm to be tested is high-temperature-sensitive cotton germplasm.
Preferably, the annealing temperature in the PCR amplification is 59-60 ℃.
Preferably, the PCR amplification program is: pre-denaturation at 95 ℃ for 5 min; 35 cycles of denaturation at 95 ℃ for 30 s, annealing at 59-60 ℃ for 30 s and extension at 70-72 ℃ for 30 s; and final extension at 72 ℃ for 30 s.
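As an illustration only, the program above can be written down as plain data and its total programmed hold time estimated. The sketch below assumes 60 ℃ annealing and 72 ℃ extension (single values picked from the stated ranges) and ignores ramping between steps; the data layout and the function name are hypothetical and not part of the invention.

```python
# Illustrative encoding of the PCR program described above.
# The dictionary layout and function name are assumptions for this sketch.
PROGRAM = {
    "pre_denaturation": (95, 300),            # (temperature in deg C, seconds)
    "cycle": [(95, 30), (60, 30), (72, 30)],  # denaturation, annealing, extension
    "cycles": 35,
    "final_extension": (72, 30),
}


def hold_time_seconds(program):
    """Sum the programmed hold times; ramping between steps is ignored."""
    per_cycle = sum(seconds for _, seconds in program["cycle"])
    return (
        program["pre_denaturation"][1]
        + program["cycles"] * per_cycle
        + program["final_extension"][1]
    )


print(hold_time_seconds(PROGRAM) / 60)  # 58.0 (minutes of hold time)
```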
Beneficial effects:
The invention provides an SNP variation in the upstream regulatory region of the upland cotton GhHRK gene, wherein the nucleotide sequence containing the SNP variation is shown as SEQ ID NO.1, the SNP variation is located at 9094715 bp on chromosome A01 of the upland cotton genome, and, compared with the reference genome, the SNP variation is a G->A allelic variation. The SNP variation lies in the upstream regulatory region of the GhHRK gene and is tightly linked with the high temperature resistance phenotype. A KASP marker developed from it enables rapid identification of the high temperature resistance of cotton germplasm: using the genomic DNA of the cotton germplasm to be tested as a template, the band pattern of the molecular marker indicates whether the germplasm has high temperature resistance. The method is convenient and efficient, and provides a technical foundation and support for cotton breeding.
Detailed Description
The invention provides an SNP variation in the upstream regulatory region of the upland cotton GhHRK gene, wherein the nucleotide sequence containing the SNP variation is shown as SEQ ID NO.1, the SNP variation is located at 9094715 bp on chromosome A01 of the upland cotton genome, and, compared with the reference genome, the SNP variation is a G->A allelic variation.
In the present invention, the nucleotide sequence shown in SEQ ID NO.1 is preferably the DNA sequence of the 2000 bp regulatory region upstream of the GhHRK gene, specifically
The position shown in bold and underlined is the SNP site, which carries a G->A variation compared with the reference genome and is located at 9094715 bp on chromosome A01 of the upland cotton genome. The cotton genome version is 'TM-1_HAU-AD1_v1.1', which is publicly available from the CottonGen database at https://www.cottongen.org/data/download/genome_tetraploid/AD1.
The invention also provides a KASP marker based on the SNP variation, wherein the nucleotide sequences of the upstream primer and the downstream primer of the KASP marker are shown as SEQ ID NO.3 and SEQ ID NO.4, respectively, specifically 5'-AGAAAGTTAGAAATCAACAATCAG-3' and 5'-TTCTAGATCTGACAATAGCGACGT-3'.
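A few basic properties of the two primers can be checked with a short script. This is a minimal sketch rather than part of the claimed method: the helper names are hypothetical, and only the two primer sequences come from the text. It confirms that the upstream primer ends in G at its 3' end, which is the allele-discriminating base described in Example 2, and prints the reverse complement of the downstream primer, which is the sequence expected at the 3' end of the amplified strand.

```python
FORWARD = "AGAAAGTTAGAAATCAACAATCAG"  # SEQ ID NO.3 (upstream primer)
REVERSE = "TTCTAGATCTGACAATAGCGACGT"  # SEQ ID NO.4 (downstream primer)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


print(FORWARD[-1])                  # G (3'-terminal base of the upstream primer)
print(reverse_complement(REVERSE))  # ACGTCGCTATTGTCAGATCTAGAA
print(gc_fraction(FORWARD), gc_fraction(REVERSE))
```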
The invention also provides a kit comprising the primers of the KASP marker and PCR amplification reagents. The PCR amplification reagents comprise Taq DNA polymerase, dNTPs and a buffer reagent. The buffer reagent preferably comprises 10× buffer. The source of the PCR amplification reagents is not particularly limited, and commercially available products may be used. The PCR amplification reagents preferably further comprise ultrapure water. The stock concentration of the upstream and downstream primers in the kit is preferably 10-20 mM, more preferably 10 mM. The total amounts of PCR amplification reagents and primers in the kit are not particularly limited and may be set according to the conventional requirements for kits.
The SNP variation lies in the upstream regulatory region of the GhHRK gene and is tightly linked with the high temperature resistance phenotype. The KASP marker developed from it enables identification of the high temperature resistance of cotton germplasm and efficiently distinguishes high-temperature-resistant cotton germplasm from high-temperature-sensitive cotton germplasm.
Based on these technical advantages, the invention also provides an application of the SNP variation, the KASP marker or the kit in cotton breeding. The cotton breeding preferably includes identifying the high temperature resistance of cotton, more preferably the high temperature resistance of cotton anthers, and particularly preferably identifying and distinguishing high-temperature-resistant cotton germplasm from high-temperature-sensitive cotton germplasm. The cotton preferably comprises upland cotton.
The invention also provides a method for detecting the high temperature resistance of cotton, which comprises the following steps:
performing PCR amplification on the genomic DNA of the cotton germplasm to be tested using the primers described in the above technical scheme, to obtain a PCR amplification product;
performing electrophoresis detection on the PCR amplification product, wherein when a band is specifically amplified and the band pattern contains a 300 bp band, the cotton germplasm to be tested is high-temperature-resistant cotton germplasm;
and when no band is specifically amplified, or the amplified band pattern does not contain a 300 bp band, the cotton germplasm to be tested is high-temperature-sensitive cotton germplasm.
Genomic DNA is preferably extracted from the cotton germplasm to be tested. The extraction method is not particularly limited; any genomic DNA extraction method or kit commonly used in the field may be adopted. The CTAB method is used in the examples of the invention.
After extraction, PCR amplification is performed on the genomic DNA of the cotton germplasm to be tested using the primers of the invention to obtain a PCR amplification product. The PCR amplification system, based on 20 μL, preferably comprises: 10× buffer 2.0 μL, dNTPs 0.3 μL, Taq DNA polymerase 0.2 μL, genomic DNA 1 μL (75-100 ng), upstream primer 0.5 μL, downstream primer 0.5 μL and sterile water 15.5 μL. The annealing temperature in the PCR amplification is preferably 59-60 ℃, more preferably 60 ℃. The PCR amplification program is particularly preferably: pre-denaturation at 95 ℃ for 5 min; 35 cycles of denaturation at 95 ℃ for 30 s, annealing at 59-60 ℃ for 30 s and extension at 70-72 ℃ for 30 s; and final extension at 72 ℃ for 30 s.
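The 20 μL system above can be checked and scaled with simple arithmetic. The following sketch verifies that the listed volumes add up to 20 μL and scales them into a master mix for several reactions. The 10% pipetting overage and the choice to add template DNA separately to each tube are common bench habits assumed here for illustration; neither is stated in the text, and the function name is hypothetical.

```python
# Per-reaction volumes in microlitres, taken from the 20 uL system above.
PER_REACTION_UL = {
    "10x buffer": 2.0,
    "dNTPs": 0.3,
    "Taq DNA polymerase": 0.2,
    "genomic DNA": 1.0,
    "upstream primer": 0.5,
    "downstream primer": 0.5,
    "sterile water": 15.5,
}


def master_mix(per_reaction, n_reactions, overage=0.10, exclude=("genomic DNA",)):
    """Scale shared components for n reactions.

    Assumptions of this sketch: 10% overage, and template DNA is added
    to each tube individually rather than to the master mix.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in per_reaction.items()
        if name not in exclude
    }


total = round(sum(PER_REACTION_UL.values()), 2)
print(total)  # 20.0
print(master_mix(PER_REACTION_UL, n_reactions=10))
```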
After the PCR amplification product is obtained, it is subjected to electrophoresis detection. When a band is specifically amplified and the band pattern contains a 300 bp band, the cotton germplasm to be tested is high-temperature-resistant cotton germplasm; when no band is specifically amplified, or the amplified band pattern does not contain a 300 bp band, the cotton germplasm to be tested is high-temperature-sensitive cotton germplasm. The sequence corresponding to the 300 bp band is preferably shown as SEQ ID NO.5, specifically 5'-AGAAAGTTAGAAATCAACAATCAGGATTTTGTTTTGCAATTTTCACCATTCCATTATTTTGGGGGGAAAATTGTTATGTAGATTTGAAAGTTTCATGGCGGTTTTATATAGCACCCAGAAAACAGGATTTCTGTTGAAAATGCAATAAATATTCAAGCATGATGGCTACCATGCAGTCAACAATTATGTAACATATTATGGTCAAAGCAGCCAATTTTCAATGTTATCTCTTCGTTTATATCTTTTGTCTGGTCCATATAAATAGGCTTATATCTAACGTCGCTATTGTCAGATCTAGAA-3'.
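The decision rule above can be expressed as a small function. This is a sketch under stated assumptions: band sizes are taken to be estimates read from a gel against a size ladder, and the ±20 bp tolerance is a hypothetical value chosen for illustration, since the text does not say how close to 300 bp a band must be.

```python
def call_germplasm(band_sizes_bp, expected_bp=300, tolerance_bp=20):
    """Classify a sample from its estimated gel band sizes.

    band_sizes_bp: estimated sizes of the bands seen on the gel;
                   an empty list means no amplification.
    tolerance_bp:  hypothetical allowance for gel size estimation.
    """
    has_target_band = any(
        abs(size - expected_bp) <= tolerance_bp for size in band_sizes_bp
    )
    if has_target_band:
        return "high-temperature resistant"
    return "high-temperature sensitive"


print(call_germplasm([300]))  # high-temperature resistant
print(call_germplasm([]))     # high-temperature sensitive
```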
The technical solutions provided by the present invention are described in detail below with reference to the drawings and examples for further illustrating the present invention, but they should not be construed as limiting the scope of the present invention.
The cotton germplasm used in the following examples was selected from the germplasm resources of a natural population of upland cotton. Research on this natural population has been published (DOI: 10.1038/ng.3807). The accessions in the population were purchased directly, introduced from germplasm resource banks, or originally introduced from abroad.
Example 1
Candidate gene association analysis based on GhHRK gene natural variation
First, a natural variation map of upland cotton was constructed from the published resequencing data of 376 upland cotton accessions (DOI: 10.1038/s41588-021-00844-9). Combined with the published pollen high-temperature resistance phenotypes of 218 accessions of the cotton natural population (DOI: 10.1111/nph.17325), the variation information at the GhHRK1 locus (including 1500 bp upstream of the start codon and 1500 bp downstream of the stop codon) was extracted for candidate gene association analysis. The analysis steps were as follows:
1. Variation information between 9,093,555 bp and 9,105,018 bp in the natural variation map was extracted (summarized in Table 1). According to Table 1, there are 25 SNP variations at the GhHRK locus, 24 in introns and 1 in the upstream regulatory region.
2. The cotton accessions were grouped by genotype at each variation, the mean pollen viability after high temperature stress was calculated for each group, and a grouped t-test was performed. The p value of the t-test measures the association between the variation and the phenotype, and p<0.01 was used as the significance threshold.
3. According to the analysis in step 2, 10 of the 25 SNP variations were significantly associated with the phenotype, 9 in introns and 1 in the upstream regulatory region (a in FIG. 1). Panel a shows the association between variations at different physical positions and the pollen viability phenotype; the X-axis is the physical position of each SNP on the genome (Table 1) and the Y-axis is -log(p), where p is the t-test result in Table 1.
4. The regulatory region plays a relatively more important role in changes of gene expression, and the G->A variation in the regulatory region is significantly negatively correlated with the phenotype (p=1.20E-09), as shown in b of FIG. 1. Panel b of FIG. 1 shows the distribution of the SNP variations on the GhHRK gene model, colored using p<0.01 as the significance threshold: blue marks variations with a p value greater than 0.01, green marks the significantly associated variation in the GhHRK upstream regulatory sequence (9094715 bp physical position), and red marks significantly associated variations in the GhHRK1 introns.
Therefore, in this example, marker development was carried out for the SNP in the regulatory region (at the 9094715 bp physical position, based on the 'TM-1_HAU-AD1_v1.1' genome).
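The thresholding and the Y-axis transform used in the steps above can be sketched as follows. Only the upstream SNP position and its p value come from the text; the second record is a hypothetical placeholder, and base-10 logarithms are an assumption, because the text writes -log(p) without naming the base.

```python
import math

SIGNIFICANCE_THRESHOLD = 0.01  # p < 0.01, as stated in step 2

variants = [
    {"pos": 9094715, "p": 1.20e-09},  # upstream regulatory SNP, values from the text
    {"pos": 9100000, "p": 0.35},      # hypothetical placeholder record
]


def neg_log10(p):
    """Y-axis value for the association plot, assuming a base-10 logarithm."""
    return -math.log10(p)


significant = [v["pos"] for v in variants if v["p"] < SIGNIFICANCE_THRESHOLD]

print(round(neg_log10(1.20e-09), 2))  # 8.92
print(significant)                    # [9094715]
```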
TABLE 1 summary of natural population variation information
Note: the column headers denote the following. Type: type of variation; pos: physical position on the genome; ref: reference genotype; alt: variant genotype; dist toATG: physical distance from the start codon (a negative value indicates the regulatory region upstream of the start codon); PV_ref_mean: mean pollen viability of cotton materials with the reference genotype; PV_alt_mean: mean pollen viability of cotton materials with the variant genotype; T.test_p: p value of the t-test on pollen viability between the two genotypes.
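The per-variant quantities in Table 1 (the two group means and the test statistic behind T.test_p) can be reproduced in outline as below. This is a sketch only: the pollen viability values are invented placeholders, and Welch's unequal-variance form of the t statistic is an assumption, since the text says only that a grouped t-test was used. Converting the statistic to a p value needs a t-distribution function from a statistics package, which is left out here.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical (genotype, pollen viability) records for one variant.
records = [
    ("G", 0.80), ("G", 0.75), ("G", 0.85), ("G", 0.78),
    ("A", 0.40), ("A", 0.50), ("A", 0.45), ("A", 0.35),
]


def group_by_genotype(rows):
    """Collect phenotype values per genotype."""
    groups = {}
    for genotype, value in rows:
        groups.setdefault(genotype, []).append(value)
    return groups


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom (an assumed variant)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


groups = group_by_genotype(records)
pv_ref_mean, pv_alt_mean = mean(groups["G"]), mean(groups["A"])
t_stat, dof = welch_t(groups["G"], groups["A"])
print(pv_ref_mean, pv_alt_mean, t_stat, dof)
```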
Example 2
Molecular marker development
2.1 Verification of molecular marker authenticity (taking the G->A variation at the 9094715 bp physical position as an example)
Based on the material information and the corresponding genotype information in the variation map of Example 1, one material of genotype G and one of genotype A at the 9094715 bp position were selected. In this example, two materials from the natural population, Emian 19 (G) and Jimian 15 (A), were used for molecular marker amplification. The main steps were as follows:
1. Based on the GhHRK upstream regulatory sequence in SEQ ID NO.1 and the marker position, PCR primers were designed; the upstream and downstream primers are shown as SEQ ID NO.2 (5'-GGTTGTATTGCACACCAAACTAGA-3') and SEQ ID NO.4, respectively.
2. Genomic DNA of Emian 19 and Jimian 15 was extracted by the CTAB method. Fresh leaves were placed in a 2 mL centrifuge tube with clean steel balls and 200 μL of extraction buffer (0.35 M glucose, 0.1 M Tris-HCl, 0.005 M Na2EDTA, 2% PVP K-30 and 0.1% DIECA, pH 7.5) and ground on a sample grinder (Shanghai Jingxin #Tissuelyser-192) for 60 s at 60 Hz. After grinding, 800 μL of lysis buffer (0.1 M Tris-HCl, 1.4 M NaCl, 0.02 M Na2EDTA, 2% CTAB, 2% PVP K-30 and 0.1% DIECA, pH 8.0) was added and the sample was incubated in a 65 ℃ water bath for 30 min. Then 800 μL of chloroform:isoamyl alcohol (24:1, v/v) was added, and the sample was gently inverted, extracted for 20 min and centrifuged at 12000 rpm for 8-10 min. The supernatant was transferred and mixed with an equal volume of isopropanol pre-cooled to -20 ℃, which gave a flocculent DNA precipitate, and the mixture was centrifuged for 8-10 min. The DNA was washed twice with 75% ethanol, dried on an ultra-clean bench and dissolved in ddH2O. The PCR amplification system was prepared according to the recipe in Table 2, and PCR amplification was performed according to the program in Table 3.
TABLE 2 PCR amplification System
TABLE 3 PCR amplification procedure
3. After PCR amplification, the PCR product was cloned into the entry vector pTOPO-T (Aidlab #CV2101) and transformed into E. coli DH5α competent cells by heat shock; 4-5 single colonies were picked and sequenced with the M13F universal primer to confirm the authenticity of the marker SNP in the sequence. The results are shown in FIG. 2.
As can be seen from FIG. 2, the genotype of Emian 19 at the 9094715 bp physical position is G and that of Jimian 15 is A, which confirms that the marker SNP variation exists. In FIG. 2, a and b show the genotype information of Emian 19 and Jimian 15 at 9094715 bp, respectively: G (the high-temperature-resistant genotype) in Emian 19 and A (the high-temperature-sensitive genotype) in Jimian 15. The genotype information was obtained by PCR amplification with the upstream and downstream primers shown in SEQ ID NO.2 and SEQ ID NO.4 (5'-GGTTGTATTGCACACCAAACTAGA-3' and 5'-TTCTAGATCTGACAATAGCGACGT-3', respectively), followed by cloning and Sanger sequencing.
2.2 Optimal primer sequence and annealing temperature design
After the authenticity of the marker was confirmed, the number of materials was increased to search for the optimal primer sequence and annealing temperature. In this example, primer design and marker development followed the design principle of an annealing-sensitive KASP marker (a in FIG. 3, taking detection of the G genotype as an example: with a G base placed at the 3' end of the primer (SEQ ID NO.3), at a specific annealing temperature the 3'-terminal G cannot anneal to an A base, so PCR amplification cannot proceed). The steps were as follows:
1. On the sequenced region, the reverse primer SEQ ID NO.4 was kept unchanged, and the forward primer was designed from the 9094715 bp physical position with G retained as its 3'-terminal base; the annealing temperature of the forward and reverse primers was designed to be about 55 ℃.
2. The range of materials was expanded. According to the genetic variation map in Example 1, five materials of genotype G (Emian 19, Zhemian 3, Xinluzhong 7, Xinluazao 11 and DELTAPINE SR-1) and five materials of genotype A (Jimian 15, Shaanmian 1, Hongyejijiaomian, Xinluzao 6 and Shaan-2786) were selected, and gradient annealing experiments were carried out at 1 ℃ intervals from 55 ℃ to 60 ℃ using the PCR amplification system of Table 2 and the PCR amplification program of Table 3 in Example 2. Panel b of FIG. 3 shows the molecular marker amplification of these 10 materials of different genotypes. At annealing temperatures of 59 ℃ and 60 ℃, the primer with a G base at its 3' end cannot anneal to the sequence of genotype A.
3. According to the PCR results, at annealing temperatures of 59 ℃ and 60 ℃ the materials of genotype G amplified normally into bands, while the materials of genotype A gave essentially no PCR product. To ensure the reliability of the amplification results, 60 ℃ was finally selected as the optimal annealing temperature.
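The selection logic in the steps above can be sketched as a filter over the gradient results. The band counts below are invented for illustration and are not read from FIG. 3; they are merely chosen to agree with the statement that genotype G amplifies and genotype A essentially does not at 59 ℃ and 60 ℃. Taking the highest discriminating temperature mirrors the choice of 60 ℃ in step 3; the function name is hypothetical.

```python
# Hypothetical gel readout, invented for illustration (not read from FIG. 3):
# number of materials, out of five per genotype, that gave a band.
GRADIENT = {
    55: {"G": 5, "A": 5},
    56: {"G": 5, "A": 4},
    57: {"G": 5, "A": 3},
    58: {"G": 5, "A": 1},
    59: {"G": 5, "A": 0},
    60: {"G": 5, "A": 0},
}


def discriminating_temperatures(gradient, n_per_genotype=5):
    """Temperatures at which every G material and no A material amplifies."""
    return sorted(
        temperature
        for temperature, bands in gradient.items()
        if bands["G"] == n_per_genotype and bands["A"] == 0
    )


candidates = discriminating_temperatures(GRADIENT)
best = max(candidates)  # prefer the most stringent discriminating temperature
print(candidates, best)  # [59, 60] 60
```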
Example 3
Molecular marker validation at natural population level
Example 2 confirmed the authenticity of the marker and identified the optimal annealing temperature and primer sequences. In this example, using the annealing temperature and primer sequences confirmed in Example 2, 47 high-temperature-resistant materials and 47 high-temperature-sensitive materials were selected based on the pollen viability phenotype data of the natural population, and their DNA was extracted. With the upland cotton reference genotype variety TM-1 as a control and a G base at the 3' end of the primer, marker amplification at the 9094715 bp physical position was performed on these materials at an annealing temperature of 60 ℃ according to the PCR amplification system of Table 2 and the PCR amplification program of Table 3.
The PCR results showed that, among the high-temperature-resistant materials, 40 amplified the specific band and 7 gave a faint band or no specific amplification (a in FIG. 4; red asterisks mark materials with faint or no specific amplification). Among the high-temperature-sensitive materials, 9 amplified the specific band and 38 gave a faint band or no specific amplification (b in FIG. 4; blue asterisks mark materials with clear or specific amplification). According to the chi-square test, the amplification efficiency differed significantly between the high-temperature-resistant and high-temperature-sensitive materials, with a p value of 4.51E-28.
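The counts reported above can also be summarized as agreement rates between the marker call and the phenotype. The sketch below uses only the four counts from the text; treating "band" as a resistant call and "no band" as a sensitive call follows the detection method, while the variable names and the choice of summary statistics are illustrative.

```python
# Counts from the PCR results above (47 materials per phenotype group).
COUNTS = {
    "resistant": {"band": 40, "no_band": 7},
    "sensitive": {"band": 9, "no_band": 38},
}

n_resistant = sum(COUNTS["resistant"].values())
n_sensitive = sum(COUNTS["sensitive"].values())

# Share of materials whose marker call matches the phenotype.
resistant_agreement = COUNTS["resistant"]["band"] / n_resistant
sensitive_agreement = COUNTS["sensitive"]["no_band"] / n_sensitive
overall_agreement = (
    COUNTS["resistant"]["band"] + COUNTS["sensitive"]["no_band"]
) / (n_resistant + n_sensitive)

print(round(100 * resistant_agreement, 1))  # 85.1
print(round(100 * sensitive_agreement, 1))  # 80.9
print(round(100 * overall_agreement, 1))    # 83.0
```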
The results of the examples show that the SNP variation provided by the invention and the primers developed from it enable detection of the high temperature resistance of upland cotton and effectively distinguish high-temperature-resistant cotton germplasm from high-temperature-sensitive cotton germplasm.
Although the foregoing describes the invention in detail, it covers only some, not all, embodiments of the invention. It should be understood that other embodiments may be derived from the present embodiments without departing from the spirit and scope of the invention.