CN117095748B - Method for constructing plant miRNA genetic regulation pathway - Google Patents
- Publication number
- CN117095748B; CN202311097229.7A
- Authority
- CN
- China
- Prior art keywords
- mirna
- population
- snp
- expression
- expression level
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
Abstract
The present invention provides a method for constructing a plant miRNA genetic regulatory pathway and relates to the field of molecular genetics. The method can accurately, rapidly and efficiently identify the upstream regulatory transcription factors and the downstream target genes of a miRNA. It is an effective supplement to previous research, which addressed only the downstream target genes of miRNAs, and it provides an important technical reference for dissecting the genetic regulatory pathways of complex plant traits. Using the method provided by the present invention, the upstream transcription factor and downstream target genes of miR393a in Populus tomentosa were obtained, and the genetic regulatory pathway by which miR393a affects wood-tissue-related traits was constructed.
Description
Technical Field
The present invention relates to the technical field of molecular genetics, and specifically to a method for constructing a plant miRNA genetic regulatory pathway.
Background
MicroRNAs (miRNAs) negatively regulate their downstream target genes through transcript cleavage or translational inhibition, and are widely involved in biological processes such as plant growth and development and the formation of important traits. Most previous studies have focused only on the genetic mechanism by which a miRNA and its downstream target genes affect important traits, and have ignored how upstream regulatory transcription factors control miRNA formation, which greatly reduces the precision with which important traits can be dissected. In particular, current studies have analyzed the functional mechanism of miRNAs in depth only at the level of the individual, and have not analyzed the mechanism of action of miRNA genetic regulatory networks at the population level. There are two main reasons: (1) studying miRNA genetic regulatory pathways with molecular biology techniques is time-consuming and low-throughput, so large-scale identification of important miRNA genetic regulatory pathways is difficult; (2) most bioinformatics prediction approaches consider the expression correlation between upstream and downstream genes only at the level of the individual, and ignore population-level expression patterns and the mechanism of action at the level of genomic allelic variation, which leads to low accuracy and a high false-positive rate when miRNA genetic pathways are analyzed. The prior art therefore lacks a method for constructing plant miRNA genetic regulatory pathways accurately, rapidly and efficiently.
Summary of the Invention
The purpose of the present invention is to provide a method for identifying a plant miRNA genetic regulatory pathway. The method can accurately, rapidly and efficiently identify the upstream regulatory transcription factors and downstream target genes of a specific miRNA in a plant, and can be used to analyze the genetic mechanism by which they affect important traits.
The present invention provides a method for identifying a plant miRNA genetic regulatory pathway, comprising the following steps:
1) obtaining the primary transcript sequence of a miRNA of a specific plant species, and delimiting the miRNA precursor sequence and the mature miRNA sequence that it contains;
2) predicting potential target genes of the miRNA with the psRNATarget online server (https://www.zhaolab.org/psRNATarget), with the Expectation value set to 3 and all other parameters left at their defaults;
3) obtaining expression data for the miRNA to be tested and its predicted potential target genes in a germplasm resource population of the specific plant species;
4) calculating the Pearson correlation coefficient r between the population expression level of the miRNA and the population expression level of each predicted potential target gene from step 3), and identifying candidate target genes whose expression is strongly negatively correlated with that of the miRNA;
the condition for identifying a candidate target gene includes: correlation coefficient r < -0.6;
5) based on the miRNA primary transcript sequence, defining the 2000 bp immediately upstream of it as the miRNA promoter region, and predicting potential upstream transcription factors that bind cis-acting elements in the miRNA promoter;
6) obtaining expression data for the potential upstream transcription factors from step 5) in the germplasm resource population of the specific plant species, calculating the expression correlation between each potential upstream transcription factor and the miRNA in the population, and identifying upstream candidate transcription factors whose expression is strongly correlated with that of the miRNA;
the condition for identification includes: correlation coefficient r < -0.6 or r > 0.6;
7) obtaining SNP genotype data for the miRNA precursor, the candidate transcription factors and the candidate target genes in the specific plant germplasm resource population;
8) following an expression quantitative trait locus (eQTL) mapping strategy, performing association analysis between the population SNP genotype data of the candidate transcription factors from step 7) and the population expression data of the miRNA to be tested, determining the SNPs within a transcription factor that are significantly associated with miRNA expression, and identifying that transcription factor as an upstream regulatory transcription factor of the miRNA;
the condition for determination includes: any SNP within the candidate transcription factor is significantly associated with the expression level of the miRNA;
9) performing association analysis between the SNP data of the miRNA precursor sequence from step 7) and the population expression level of each candidate target gene, determining the SNPs within the miRNA precursor sequence that are significantly associated with the population expression level of a candidate target gene, and identifying that target gene as a downstream target gene regulated by the miRNA;
the condition for determination includes: any SNP within the precursor sequence of the miRNA to be tested is significantly associated with the expression level of the candidate target gene;
10) combining the results of steps 8) and 9) to determine the upstream transcription factors and downstream target genes of the miRNA to be tested in the specific plant, and thereby the genetic regulatory pathway of the miRNA to be tested.
Preferably, step 1) places no limitation on how the primary transcript of the miRNA to be tested is obtained or on how its internal structure is delimited.
Preferably, the expression levels of the miRNA to be tested, the potential target genes and the potential transcription factors in steps 3) and 6) are measured in the same plant tissue or organ; the way in which the expression levels are obtained is not limited.
Preferably, the software used to calculate the Pearson correlation coefficient r in steps 4) and 6) includes SPSS v19.0.
Preferably, the method for predicting potential upstream transcription factors in step 5) includes, but is not limited to, the PlantRegMap (http://plantregmap.gao-lab.org/) online prediction tool; the parameters and screening criteria depend on the software and must meet biostatistical requirements.
Preferably, the population SNP data in step 7) are obtained by plant whole-genome resequencing.
Preferably, the SNPs used in the above steps have a genotype frequency greater than 10%.
Preferably, the specific plant germplasm resource population contains more than 200 individuals.
Preferably, the association analysis in steps 8) to 9) uses the mixed linear model in TASSEL v5.0; the software gives the significance of the association between each SNP site to be tested and the specific expression level as a P value; the P values are subjected to FDR multiple-testing correction to obtain Q values, and SNP sites with P ≤ 0.01 and Q ≤ 0.1 are taken to be significantly associated with the specific expression level.
Preferably, the gene expression data in steps 3) to 6) are expression levels in the plant tissue or organ in which the miRNA genetic regulatory pathway functions.
The present invention provides a method for constructing a plant miRNA genetic regulatory pathway. Previously, verifying miRNA genetic regulatory pathways by molecular biology experiments was time-consuming and imprecise, while predicting them by traditional bioinformatics gave many false positives and low accuracy, so plant miRNA genetic regulatory pathways were difficult to identify accurately and rapidly. The present invention therefore provides a method for constructing a plant miRNA genetic regulatory pathway that can accurately, rapidly and efficiently identify the upstream regulatory transcription factors and downstream target genes of a miRNA. It is an effective supplement to previous research, which addressed only the downstream target genes of miRNAs, and it provides an important technical reference for dissecting the genetic regulatory pathways of complex plant traits.
The results of the examples of the present invention show that, using the method provided by the present invention, the upstream transcription factor and downstream target genes of Populus tomentosa miR393a were obtained, and the genetic regulatory pathway by which miR393a affects wood-tissue-related traits was constructed.
Brief Description of the Drawings
Figure 1 shows the genetic regulatory network of miR393a in Populus tomentosa;
Figure 2 is a flow chart of the analysis in the identification method of the present invention.
Detailed Description
The present invention provides a method for identifying a plant miRNA genetic regulatory pathway, comprising the following steps:
1) obtaining the primary transcript sequence of a miRNA of a specific plant species, and delimiting the miRNA precursor sequence and the mature miRNA sequence that it contains;
2) predicting potential target genes of the miRNA with the psRNATarget online server (https://www.zhaolab.org/psRNATarget), with the Expectation value set to 3 and all other parameters left at their defaults;
3) obtaining expression data for the miRNA to be tested and its predicted potential target genes in a germplasm resource population of the specific plant species;
4) calculating the Pearson correlation coefficient r between the population expression level of the miRNA and the population expression level of each predicted potential target gene from step 3), and identifying candidate target genes whose expression is strongly negatively correlated with that of the miRNA;
the condition for identifying a candidate target gene includes: correlation coefficient r < -0.6;
5) based on the miRNA primary transcript sequence, defining the 2000 bp immediately upstream of it as the miRNA promoter region, and predicting potential upstream transcription factors that bind cis-acting elements in the miRNA promoter;
6) obtaining expression data for the potential upstream transcription factors from step 5) in the germplasm resource population of the specific plant species, calculating the expression correlation between each potential upstream transcription factor and the miRNA in the population, and identifying upstream candidate transcription factors whose expression is strongly correlated with that of the miRNA;
the condition for identification includes: correlation coefficient r < -0.6 or r > 0.6;
7) obtaining SNP genotype data for the miRNA precursor, the candidate transcription factors and the candidate target genes in the specific plant germplasm resource population;
8) following an expression quantitative trait locus (eQTL) mapping strategy, performing association analysis between the population SNP genotype data of the candidate transcription factors from step 7) and the population expression data of the miRNA to be tested, determining the SNPs within a transcription factor that are significantly associated with miRNA expression, and identifying that transcription factor as an upstream regulatory transcription factor of the miRNA;
the condition for determination includes: any SNP within the candidate transcription factor is significantly associated with the expression level of the miRNA;
9) performing association analysis between the SNP data of the miRNA precursor sequence from step 7) and the population expression level of each candidate target gene, determining the SNPs within the miRNA precursor sequence that are significantly associated with the population expression level of a candidate target gene, and identifying that target gene as a downstream target gene regulated by the miRNA;
the condition for determination includes: any SNP within the precursor sequence of the miRNA to be tested is significantly associated with the expression level of the candidate target gene;
10) combining the results of steps 8) and 9) to determine the upstream transcription factors and downstream target genes of the miRNA to be tested in the specific plant, and thereby the genetic regulatory pathway of the miRNA to be tested.
The present invention places no particular limitation on the plant species. In the examples of the present invention, the plant is preferably Populus tomentosa.
In the present invention, step 1) places no limitation on how the primary transcript of the miRNA to be tested is obtained or on how its internal structure is delimited.
In the present invention, the expression levels of the miRNA to be tested, the potential target genes and the potential transcription factors in steps 3) and 6) are measured in the same plant tissue or organ; the way in which the expression levels are obtained is not limited.
In the present invention, the software used to calculate the Pearson correlation coefficient r in steps 4) and 6) includes SPSS v19.0.
In the present invention, the method for predicting potential upstream transcription factors in step 5) includes, but is not limited to, the PlantRegMap (http://plantregmap.gao-lab.org/) online prediction tool; the parameters and screening criteria depend on the software and must meet biostatistical requirements.
In the present invention, the population SNP data in step 7) are obtained by plant whole-genome resequencing.
In the present invention, the SNPs used in the above steps have a genotype frequency greater than 10%.
In the present invention, the specific plant germplasm resource population contains more than 200 individuals.
In the present invention, the association analysis in steps 8) to 9) uses the mixed linear model in TASSEL v5.0; the software gives the significance of the association between each SNP site to be tested and the specific expression level as a P value; the P values are subjected to FDR multiple-testing correction to obtain Q values, and SNP sites with P ≤ 0.01 and Q ≤ 0.1 are taken to be significantly associated with the specific expression level.
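The significance screen described above can be sketched as follows. The text specifies only "FDR" correction, so the use of the Benjamini-Hochberg procedure here is an assumption, and the function names and the toy P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted Q values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values never decrease as P increases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def significant_snps(snp_p, p_cut=0.01, q_cut=0.1):
    """Keep SNPs with P <= p_cut and Q <= q_cut; returns {snp: (P, Q)}."""
    names = list(snp_p)
    q_values = benjamini_hochberg([snp_p[name] for name in names])
    return {
        name: (snp_p[name], q)
        for name, q in zip(names, q_values)
        if snp_p[name] <= p_cut and q <= q_cut
    }


# Toy P values (illustrative only, not results from the patent).
toy = {"snp1": 0.0005, "snp2": 0.008, "snp3": 0.04, "snp4": 0.6}
hits = significant_snps(toy)
```

With these toy values, `snp3` is dropped by the P cut-off even though its Q value is below 0.1, which shows that the two thresholds are applied jointly.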
In the present invention, the gene expression data in steps 3) to 6) are expression levels in the plant tissue or organ in which the miRNA genetic regulatory pathway functions.
The method for identifying a plant miRNA genetic regulatory pathway according to the present invention is described in further detail below with reference to specific examples. The technical solutions of the present invention include, but are not limited to, the following examples.
Example 1
Using the method for constructing a plant miRNA genetic regulatory pathway shown in Figure 2, the genetic regulatory pathway by which Populus tomentosa miR393a affects wood quality traits was constructed, and the genetic mechanism by which this pathway affects wood quality traits in Populus tomentosa was analyzed.
Step S1: the primary transcript sequence of Populus tomentosa miR393a (Pri-miR393a) was obtained and, according to the miRNA sequencing results, the miR393a precursor sequence (Pre-miR393a) and the mature sequence (miR393a, UCCAAAGGGAUCGCAUUGAUC) that it contains were delimited.
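As a minimal sketch of the delimiting in step S1, assuming the precursor is available as a plain string, the mature sequence can be located inside it by exact matching after converting DNA to RNA notation. Only the mature sequence comes from the text above; the flanking bases of `toy_precursor` are invented for illustration, and `locate_mature` is a hypothetical helper.

```python
# Mature miR393a sequence quoted in step S1.
MATURE_MIR393A = "UCCAAAGGGAUCGCAUUGAUC"


def locate_mature(precursor, mature):
    """Return the 0-based, end-exclusive (start, end) span of `mature`
    within `precursor`, or None if it is not found. T and U are treated
    as the same base so DNA and RNA notation can be mixed."""
    pre = precursor.upper().replace("T", "U")
    mat = mature.upper().replace("T", "U")
    start = pre.find(mat)
    if start == -1:
        return None
    return start, start + len(mat)


# Toy precursor in DNA notation: made-up flanks around the real mature sequence.
toy_precursor = "GGAGG" + MATURE_MIR393A.replace("U", "T") + "CTAAT"
span = locate_mature(toy_precursor, MATURE_MIR393A)
```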
Table 1. Predicted potential target genes of miR393a
Step S2: based on the mature miR393a sequence, the psRNATarget online server (https://www.zhaolab.org/psRNATarget) was used to predict miR393a target genes, with the gene database of Populus trichocarpa, a close relative of Populus tomentosa, as the target gene library. The Expectation value was set to 3 (range 1 to 5; the smaller the value, the more likely the gene is a miRNA target gene) and all other settings were left at their defaults. In total, 9 potential target genes of miR393a were predicted; see Table 1 for details.
Step S3: the xylem expression levels of miR393a and the 9 potential target genes were measured in a Populus tomentosa germplasm resource population (303 individuals). Specifically, mature xylem was collected from the 303 individuals of the population and placed in liquid nitrogen (-196 °C) immediately after collection. RNA was extracted from the mature xylem with the Plant Qiagen RNAeasy kit (Qiagen China, Shanghai, China) and, after passing quality control, was submitted to a commercial sequencing provider for miRNA and mRNA sequencing, giving the expression levels of miR393a and the 9 potential target genes in the Populus tomentosa germplasm resource population.
Step S4: SPSS v19.0 was used to calculate the Pearson correlation coefficient r between the population expression level of miR393a and that of each of the 9 potential target genes. The expression levels of 4 of these genes showed a strong negative correlation with the expression level of miR393a (r < -0.60), so these 4 genes were taken as candidate target genes of miR393a. They are Pto-AFB2.1, Pto-AFB2.2, Pto-TIR1.1 and Pto-TIR1.2; see Table 1 for details.
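The screen in step S4 was run in SPSS; the following is only a minimal standalone sketch of the same calculation, assuming expression values are held as plain lists ordered by individual. The gene names `geneA` and `geneB` and all numbers are toy placeholders, not data from the patent.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def select_candidate_targets(mirna_expr, target_expr, threshold=-0.6):
    """Keep predicted targets whose expression has r < threshold with the miRNA."""
    selected = {}
    for gene, values in target_expr.items():
        r = pearson_r(mirna_expr, values)
        if r < threshold:
            selected[gene] = r
    return selected


# Toy expression values across five individuals (illustrative only).
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
targets = {
    "geneA": [10.0, 8.0, 6.5, 4.0, 2.0],  # falls as the miRNA rises
    "geneB": [3.0, 3.5, 2.9, 3.2, 3.1],   # roughly flat
}
candidates = select_candidate_targets(mirna, targets)
```

The same function covers the two-sided screen used later for transcription factors if the condition is changed to `r < -0.6 or r > 0.6`.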
Step S5: based on the Populus tomentosa genome and the Pri-miR393a sequence, the 2000 bp immediately upstream of Pri-miR393a were extracted and defined as the promoter region of miR393a. PlantRegMap (http://plantregmap.gao-lab.org/) was used online to predict potential upstream transcription factors of miR393a, with P < 1E-06 and Q < 1E-02 as the screening conditions. In total, 5 potential upstream transcription factors were identified; see Table 2 for details.
Table 2. Predicted potential upstream transcription factors
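Extracting the 2000 bp promoter window of step S5 can be sketched as below. The patent does not state how the region was cut out, so the strand handling, the 0-based coordinate convention and the function names are all assumptions; the toy chromosome string is made up.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T, either case)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def upstream_region(chrom_seq, tss, strand, length=2000):
    """Return up to `length` bases immediately upstream of a transcript.

    tss    : 0-based forward-strand index of the first transcribed base
    strand : "+" or "-"
    The result is always written 5'->3' on the transcript's own strand
    and is shorter than `length` if the chromosome end is reached.
    """
    if strand == "+":
        start = max(0, tss - length)
        return chrom_seq[start:tss]
    if strand == "-":
        end = min(len(chrom_seq), tss + 1 + length)
        return reverse_complement(chrom_seq[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Toy 16 bp "chromosome" with a 4 bp window so the result is easy to check.
toy_chrom = "AAAACCCCGGGGTTTT"
plus_promoter = upstream_region(toy_chrom, tss=8, strand="+", length=4)
minus_promoter = upstream_region(toy_chrom, tss=3, strand="-", length=4)
```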
Step S6: using the genome-wide gene expression data obtained in step S3, the population expression levels of the 5 potential upstream transcription factors from step S5 were extracted, and the correlation between the expression of each of them and that of Pri-miR393a was calculated. The expression levels of 2 transcription factors were strongly correlated with the expression level of Pri-miR393a across the population (r < -0.60 or r > 0.60). These two genes, Pto-AGL20.2 and Pto-DREB26, were defined as candidate transcription factors.
步骤S7,获得Pri-miR393a、2个候选转录因子与4个候选靶基因在群体中的SNP基因型数据,具体流程如下:Step S7, obtaining the SNP genotype data of Pri-miR393a, two candidate transcription factors and four candidate target genes in the population, the specific process is as follows:
以毛白杨自然群体中435株个体为材料,提取所有个体的DNA用于重测序,以毛白杨参考基因组为基础,获得全基因组SNP数据及其所在基因组中的位置。利用bioedit软件将上述基因与参考基因组进行序列比对,获得上述基因的位置信息。结合全基因组SNP数据,获得上述基因的群体SNP基因型数据。筛选基因型频率大于10%的SNP,最终检测到7个基因内的134个SNP,详细信息见表3。435 individuals in the natural population of Populus tomentosa were used as materials, and DNA from all individuals was extracted for resequencing. Based on the reference genome of Populus tomentosa, genome-wide SNP data and their locations in the genome were obtained. The above genes were sequenced with the reference genome using bioedit software to obtain the location information of the above genes. Combined with the genome-wide SNP data, the population SNP genotype data of the above genes were obtained. SNPs with genotype frequencies greater than 10% were screened, and 134 SNPs in 7 genes were finally detected. Detailed information is shown in Table 3.
Table 3 SNP genotype data involved in this patent
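The selection of gene-level SNPs in step S7 can be sketched as an interval lookup followed by a frequency filter. The text does not define "genotype frequency" precisely; the sketch below assumes one possible reading, namely that the second most common genotype class among called individuals must exceed 10%. The names `Snp`, `second_genotype_frequency`, and `snps_in_genes` and the data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Snp(NamedTuple):
    chrom: str
    pos: int                              # 1-based position on the reference
    genotypes: Tuple[Optional[str], ...]  # one call per individual, None = missing


def second_genotype_frequency(genotypes: Iterable[Optional[str]]) -> float:
    """Frequency of the second most common genotype among called individuals."""
    called = [g for g in genotypes if g is not None]
    counts = Counter(called).most_common()
    if len(counts) < 2:
        return 0.0  # monomorphic or entirely missing
    return counts[1][1] / len(called)


def snps_in_genes(snps: Iterable[Snp],
                  gene_intervals: Dict[str, Tuple[str, int, int]],
                  min_freq: float = 0.10) -> Dict[str, List[Snp]]:
    """Assign SNPs passing the frequency filter to the genes that contain them.

    gene_intervals maps gene name -> (chromosome, start, end), inclusive.
    """
    selected: Dict[str, List[Snp]] = {gene: [] for gene in gene_intervals}
    for snp in snps:
        if second_genotype_frequency(snp.genotypes) <= min_freq:
            continue
        for gene, (chrom, start, end) in gene_intervals.items():
            if snp.chrom == chrom and start <= snp.pos <= end:
                selected[gene].append(snp)
    return selected
```

If the intended filter is a minor allele frequency instead, only `second_genotype_frequency` would need to be swapped out.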
Step S8: Following the expression quantitative trait loci (eQTL) strategy, the mixed linear model in TASSEL v5.0 was used to test the association between the population SNP genotype data within the two candidate transcription factors from step S7 and the population expression level of miR393a, in order to identify SNPs significantly associated with miR393a expression. The criterion was as follows: if any SNP within a candidate transcription factor is significantly associated with the miR393a expression level (P≤0.01, Q≤0.1), that candidate transcription factor is considered to exert genetic regulation on miR393a. Three SNPs within Pto-DREB26 were significantly associated with the miR393a expression level, indicating that Pto-DREB26 regulates miR393a expression as an upstream transcription factor; details are given in Table 4.
Table 4 Results of association analysis between the tested SNPs and candidate gene expression levels (P≤0.01, Q≤0.1)
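The significance rule applied to the association results (P≤0.01 and Q≤0.1) can be sketched as a post-processing step. The sketch does not reproduce the TASSEL mixed linear model; it assumes that per-SNP P values have already been exported from TASSEL. The text does not state how Q is computed, so the Benjamini-Hochberg false discovery rate adjustment used here is an assumption, as are the names `Association`, `bh_q_values`, and `significant_associations`.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Association(NamedTuple):
    snp_id: str    # hypothetical SNP identifier
    gene: str      # gene that contains the SNP
    trait: str     # expression trait that was tested
    p_value: float


def bh_q_values(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def significant_associations(
        assocs: Sequence[Association],
        p_cutoff: float = 0.01,
        q_cutoff: float = 0.1) -> List[Tuple[Association, float]]:
    """Return (association, q) pairs with P <= p_cutoff and Q <= q_cutoff."""
    q_values = bh_q_values([a.p_value for a in assocs])
    return [(a, q) for a, q in zip(assocs, q_values)
            if a.p_value <= p_cutoff and q <= q_cutoff]
```

A gene would then be treated as a regulator of the tested expression trait if at least one of its SNPs appears in the returned list.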
Step S9: Using the mixed linear model in TASSEL v5.0, the genotype data of Pri-miR393a from step S7 were tested for association with the population expression levels of the three candidate target genes, in order to determine whether SNPs within Pri-miR393a are significantly related to the expression levels of these candidate target genes. The criterion was as follows: if any SNP within Pri-miR393a is significantly associated with the expression level of a candidate target gene (P≤0.01, Q≤0.1), miR393a is considered to exert genetic regulation on that downstream target gene. Six SNPs within Pri-miR393a were significantly associated with the expression levels of two candidate target genes, Pto-AFB2.1 and Pto-TIR1.2, confirming that Pto-AFB2.1 and Pto-TIR1.2 act as downstream target genes of miR393a; details are given in Table 3.
Step S10: Combining the results of steps S8 and S9, the genetic regulation pathway of Populus tomentosa miR393a was determined: Pto-DREB26 is its upstream regulatory transcription factor, and Pto-AFB2.1 and Pto-TIR1.2 are its downstream target genes (Figure 1). This pathway participates in the genetic regulation of wood tissue in Populus tomentosa.
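The integration in step S10 amounts to joining the significant upstream and downstream associations into one directed pathway. A minimal Python sketch is given below; the function name `build_pathway`, the input tuples, and the SNP identifiers in the usage example are hypothetical, while the gene names are those reported in steps S8 and S9.

```python
from typing import Dict, Iterable, List, Tuple


def build_pathway(mirna: str,
                  tf_hits: Iterable[Tuple[str, str]],
                  target_hits: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Assemble a transcription factor -> miRNA -> target gene pathway.

    tf_hits:     (tf_gene, snp_id) pairs significantly associated with
                 the miRNA expression level.
    target_hits: (snp_id, target_gene) pairs where a SNP in the miRNA
                 precursor is significantly associated with target expression.
    """
    upstream: List[str] = sorted({tf for tf, _ in tf_hits})
    downstream: List[str] = sorted({target for _, target in target_hits})
    edges = [(tf, mirna) for tf in upstream]
    edges += [(mirna, target) for target in downstream]
    return {"upstream": upstream, "mirna": mirna,
            "downstream": downstream, "edges": edges}


# Usage with the genes reported above (SNP identifiers are placeholders).
pathway = build_pathway(
    "miR393a",
    tf_hits=[("Pto-DREB26", "snp_a"), ("Pto-DREB26", "snp_b"),
             ("Pto-DREB26", "snp_c")],
    target_hits=[("snp_d", "Pto-AFB2.1"), ("snp_e", "Pto-TIR1.2")],
)
```

Several significant SNPs in the same gene collapse into a single edge, since the pathway is defined at the gene level.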
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311097229.7A CN117095748B (en) | 2023-08-29 | 2023-08-29 | Method for constructing plant miRNA genetic regulation pathway |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117095748A (en) | 2023-11-21 |
CN117095748B (en) | 2024-04-23 |
Family
ID=88776864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311097229.7A Active CN117095748B (en) | 2023-08-29 | 2023-08-29 | Method for constructing plant miRNA genetic regulation pathway |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117095748B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111863127A (en) * | 2020-07-17 | 2020-10-30 | 北京林业大学 | A method for constructing the genetic regulation network of plant transcription factors on target genes |
CN113832177A (en) * | 2020-06-24 | 2021-12-24 | 中国科学院分子植物科学卓越创新中心 | The ER1-MAPK-DST molecular module and its application in improving the number of grains per panicle in plants |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |