
CN107630079B - Method for determining the sequence, insertion position and border sequence of foreign DNA fragments in transgenic organisms - Google Patents


Info

Publication number
CN107630079B
CN107630079B CN201610569874.8A CN201610569874A CN107630079B CN 107630079 B CN107630079 B CN 107630079B CN 201610569874 A CN201610569874 A CN 201610569874A CN 107630079 B CN107630079 B CN 107630079B
Authority
CN
China
Prior art keywords
sequence
transgenic organism
transgenic
exogenous dna
organism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610569874.8A
Other languages
Chinese (zh)
Other versions
CN107630079A (en)
Inventor
蒋炳军
孙石
韩天富
吴存祥
侯文胜
陈莉
武婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Original Assignee
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Application filed by Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority to CN201610569874.8A
Publication of CN107630079A
Application granted
Publication of CN107630079B

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for determining the sequence, insertion position and border sequence of an exogenous DNA fragment in a transgenic organism. The method comprises sequencing the whole genome of the transgenic organism and aligning sequence A against the resulting whole-genome sequence to determine the sequence, insertion position and border sequence of the exogenous DNA fragment. The transgenic organism is obtained by treating a recipient organism with a vector containing the exogenous DNA fragment. Sequence A is a1), a2) or a3): a1) the full-length sequence of the vector; a2) the sequence of any DNA fragment from the vector that contains the exogenous DNA fragment; a3) the sequence of the exogenous DNA fragment or of any fragment thereof. The method is not limited by species or by the target gene used to construct the transgenic organism, and it detects the transgene insertion position, the border sequence and even the copy number more quickly and accurately.

Description

Method for determining the sequence, insertion position and border sequence of exogenous DNA fragments in transgenic organisms

Technical field

The present invention belongs to the field of biotechnology and relates to a method for determining the sequence, insertion position and border sequence of exogenous DNA fragments in transgenic organisms.

Background

In recent years, the safety assessment of transgenic organisms, and of genetically modified foods in particular, has become an increasing focus of public concern. Determining the inserted sequence of a transgene in the recipient genome, its insertion position and its border sequences is a key step in assessing transgene safety. For this purpose, Southern hybridization can be used to determine the transgene copy number, and methods such as chromosome walking can be used to obtain, step by step, the insertion position and border sequences of the transgene in the recipient genome. However, these methods are often limited by factors such as transgene copy number, similarity to homologous genes, the transgene sequence, the genome sequence and uncertainty in the T-DNA region, and their overall efficiency is low. A fast, accurate and efficient method for determining the sequence and/or insertion position and/or border sequence of exogenous DNA fragments in transgenic organisms is therefore needed.

Summary of the invention

The technical problem to be solved by the present invention is how to determine the sequence and/or insertion position and/or border sequence of exogenous DNA fragments in transgenic organisms.

To solve this problem, the present invention first provides a method for determining the sequence and/or insertion position and/or border sequence of an exogenous DNA fragment in a transgenic organism.

The method provided by the present invention comprises the following 1) and 2):

1) sequencing the whole genome of the transgenic organism to obtain the whole-genome sequence of the transgenic organism;

2) aligning sequence A against the whole-genome sequence of the transgenic organism to determine the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism; the transgenic organism is obtained by treating a recipient organism with a vector containing the exogenous DNA fragment;

Sequence A is a1), a2) or a3):

a1) the full-length sequence of the vector;

a2) the sequence of any DNA fragment from the vector that contains the exogenous DNA fragment;

a3) the sequence of the exogenous DNA fragment or of any fragment thereof.

In the above method, step 2) may comprise the following 21) and 22):

21) aligning sequence A against the whole-genome sequence of the transgenic organism to obtain the sequence within it that matches sequence A, and naming that sequence sequence B;

22) aligning sequence B against the reference sequence of the recipient organism to determine the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism.

In the above method, step 22) may comprise the following 22a) and 22b):

22a) aligning sequence B against the reference sequence of the recipient organism to make a preliminary determination of the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism;

22b) designing primers from the known sequence in sequence B and from the reference sequence of the recipient organism, amplifying the genome of the transgenic organism with these primers, and further determining the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment from the sequence of the amplification product.

In step 22b), the amplification may be PCR amplification. The sequence of the amplification product may be obtained by first-generation sequencing of the amplification product.

Because sequencing depth is limited, the whole-genome sequence of the transgenic organism may contain unknown sequence. Sequence B may therefore also contain unknown sequence, or may not fully cover the exogenous DNA fragment and the border sequences at its insertion position. The amplification in step 22b) serves to amplify the unknown sequence within sequence B, together with any part of the exogenous DNA fragment and of its border sequences that sequence B does not cover, so that these sequences can be obtained by sequencing.

In step 22b), the further determination may be carried out by aligning the sequence of the amplification product against the reference sequence of the recipient organism.
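As a rough illustration of step 21), the search for sequence B can be sketched as exact k-mer sharing between sequence A and each assembled sequence. The function names, the value of k and the toy sequences are assumptions made for this sketch; the text does not prescribe a search algorithm, and a real analysis would use a dedicated alignment tool.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def find_sequence_b(sequence_a, assembly, k=20):
    """Map each assembled sequence that shares at least one k-mer with
    sequence A (on either strand) to its number of shared k-mers."""
    query = kmers(sequence_a, k) | kmers(reverse_complement(sequence_a), k)
    hits = {}
    for name, seq in assembly.items():
        shared = len(query & kmers(seq, k))
        if shared:
            hits[name] = shared
    return hits


# Toy data (hypothetical): two scaffolds carry sequence A, one does not.
sequence_a = "ATGGCTAGCTAGGATCCGATCGATTACG"
assembly = {
    "scaffold_1": "TTTT" + sequence_a + "CCCC",
    "scaffold_2": "GGGGGGGGGGAAAAAAAAAATTTTTTTTTTCCCCCCCCCC",
    "scaffold_3": "AAAA" + reverse_complement(sequence_a) + "GG",
}
hits = find_sequence_b(sequence_a, assembly, k=10)
print(sorted(hits))
```

Every scaffold reported here is a candidate sequence B; step 22) then compares each candidate with the recipient reference sequence.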

In the above method, the reference sequence of the recipient organism is the whole-genome sequence of the wild type of the recipient organism.

In the above method, the whole genome of the transgenic organism is sequenced on a high-throughput sequencing platform. The high-throughput sequencing platform may specifically be the HiSEQ2500 sequencing platform.

The above method may further comprise constructing a sequencing library from the genome of the transgenic organism before the whole genome is sequenced.

The sequencing library may have a fragment size of 500 bp to 2 kb, for example 1 kb.

In the above method, after the whole genome of the transgenic organism has been sequenced on a high-throughput sequencing platform, the whole-genome sequence may be obtained by assembling the reads with sequence assembly software and/or modules.

The sequence assembly software may be SOAPdenovo2, for example SOAPdenovo2 (Version 2.04: released on July 13th, 2012).

The method may further comprise, after the whole genome of the transgenic organism has been sequenced on a high-throughput sequencing platform, removing low-quality data, invalid data and/or adapter-contaminated data from the sequencing result to obtain clean data. The sequence assembly may then be carried out on the clean data.

In the above method, sequence B may be a contig or a scaffold. A contig is an assembled sequence in which every base is known, that is, it contains no unknown sequence. A scaffold is an assembled sequence in which some nucleotides (bases) remain unknown.
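The distinction between a contig and a scaffold can be made concrete with a short sketch. It assumes, as is conventional in assembly output but not stated in the text, that unknown bases are written as `N`; the function names and the toy scaffold are hypothetical.

```python
import re


def gap_intervals(scaffold):
    """Return 1-based inclusive (start, end) positions of every run of N."""
    return [(m.start() + 1, m.end())
            for m in re.finditer(r"N+", scaffold.upper())]


def split_into_contigs(scaffold):
    """Split a scaffold at runs of N and return the fully known pieces."""
    return [piece for piece in re.split(r"N+", scaffold.upper()) if piece]


def is_contig(sequence):
    """A sequence with no unknown bases is a contig in the sense used above."""
    return not gap_intervals(sequence)


toy_scaffold = "ACGTNNNNGGCCNTT"
print(gap_intervals(toy_scaffold))       # positions of the unknown stretches
print(split_into_contigs(toy_scaffold))  # the known pieces between them
```

The gap positions are what step 22b) targets: primers are placed in the known sequence on either side of a gap so that the unknown stretch can be amplified and sequenced.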

In the above method, the transgenic organism may be a transgenic plant or a transgenic animal. The recipient organism may be a plant or an animal.

Throughout the above, sequence alignment may be carried out with sequence alignment software and/or modules. The sequence alignment software may be Clustal software, such as ClustalW.

To solve the above technical problem, the present invention also provides a method for determining the copy number of an exogenous DNA fragment in a transgenic organism.

This method comprises determining the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism by the method described above, and deriving from that result the copy number of the exogenous DNA fragment in the transgenic organism.

In the above method, the transgenic organism may be a transgenic plant or a transgenic animal.

To solve the above technical problem, the present invention also provides a system for determining the sequence and/or insertion position and/or border sequence of exogenous DNA fragments in transgenic organisms.

The system comprises at least two of the following b1)-b4):

b1) reagents and/or instruments required for high-throughput sequencing;

b2) reagents and/or instruments required for amplification;

b3) sequence alignment software and/or modules;

b4) sequence assembly software and/or modules.

The system may also consist solely of at least two of b1)-b4).

In the above system, the transgenic organism may be a transgenic plant or a transgenic animal.

To solve the above technical problem, the present invention also provides any of the following uses:

P1) use of the method for determining the sequence and/or insertion position and/or border sequence of exogenous DNA fragments in transgenic organisms, or of the method for determining the copy number of exogenous DNA fragments in transgenic organisms, in biological breeding;

P2) use of either of these methods in evaluating whether a transgenic organism is safe;

P3) use of the system in determining the sequence and/or insertion position and/or border sequence of exogenous DNA fragments in transgenic organisms;

P4) use of the system in determining the copy number of exogenous DNA fragments in transgenic organisms;

P5) use of the system in biological breeding;

P6) use of the system in evaluating whether a transgenic organism is safe.

In the above uses, the transgenic organism may be a transgenic plant or a transgenic animal. The organism may be a plant or an animal.

The method of the present invention has the following advantages:

1) It is not limited by species. Whether the sequence of the vector used to construct the transgenic organism is known, or only the sequence of the target gene is known, the method can determine the sequence of the exogenous DNA fragment actually inserted into the transgenic organism, its insertion position and its border sequences.

2) It is not limited by the target gene used to construct the transgenic organism. It can detect more broadly the hitchhiking of non-target sequences from the transgenic vector, and at the same time determine which vector sequences have been inserted into the genome of the recipient organism.

3) It detects the insertion position, border sequences and even copy number of the exogenous gene more quickly and accurately.

4) Because sequencing depth is limited in high-throughput sequencing, the whole-genome sequence of the transgenic organism may contain unknown sequence. The method can determine the inserted sequence, insertion position and border sequences of the exogenous DNA fragment without requiring the complete genome sequence of the transgenic organism.

Brief description of the drawings

Figure 1 shows the alignment of the expression cassette sequence of the target gene AtD-CGS in the transgenic vector with scaffold99334.

Figure 2 shows the ClustalW alignment of the expression cassette sequence of the target gene AtD-CGS in the transgenic vector with scaffold99334 (before the unknown bases were filled in) and scaffold99334.fill (after the unknown bases were filled in).

Detailed description

The present invention is described in further detail below with reference to specific embodiments. The examples are given only to illustrate the invention and do not limit its scope.

Unless otherwise specified, the experimental methods in the following examples are conventional methods.

Unless otherwise specified, the materials and reagents used in the following examples are commercially available.

The method of the present invention for determining the sequence, insertion position and border sequence of an exogenous DNA fragment in a transgenic organism comprises the following 1)-5):

1) sequencing the whole genome of the transgenic organism on a high-throughput sequencing platform to obtain the whole-genome sequence of the transgenic organism;

2) aligning sequence A against the whole-genome sequence of the transgenic organism to determine the sequence, insertion position and border sequence of the exogenous DNA fragment in the transgenic organism;

Sequence A is a1), a2) or a3):

a1) the full-length sequence of the vector used to prepare the transgenic organism;

a2) the sequence of any DNA fragment of the vector that contains the exogenous DNA fragment;

a3) the sequence of the exogenous DNA fragment or of any fragment thereof.

3) aligning sequence A against the whole-genome sequence of the transgenic organism to obtain the sequence within it that matches sequence A, and naming that sequence sequence B;

4) aligning sequence B against the whole-genome sequence of the wild-type organism, used as the reference sequence, to make a preliminary determination of the sequence, insertion position and border sequence of the exogenous DNA fragment in the transgenic organism;

5) designing primers from the known sequence in sequence B and from the reference sequence, amplifying the genome of the transgenic organism with these primers, and further determining the sequence, insertion position and border sequence of the exogenous DNA fragment from the sequence of the amplification product.

The method is illustrated below using transgenic soybean as an example.

Example 1. Determination of the sequence, insertion position and border sequence of the exogenous DNA fragment in AtD-CGS high-methionine transgenic soybean

I. Preparation of the AtD-CGS high-methionine transgenic soybean CGS-ZG11

Using the vector pGPTV-Bar-DCGS, the AtD-CGS gene (also written AtDCGS) was introduced into the soybean cultivar Zigongdongdou (自贡冬豆) by Agrobacterium-mediated transformation, giving the AtD-CGS high-methionine transgenic soybean line CGS-ZG11 (Han Qingmei et al., Identification and genetic stability analysis of high-methionine transgenic soybean, Chinese Journal of Oil Crop Sciences, 2015, 37(6):789-796).

II. Determination of the sequence, insertion position and border sequence of the exogenous DNA fragment in CGS-ZG11

1. Preparation of the genomic DNA sample of CGS-ZG11

Genomic DNA of the AtD-CGS high-methionine transgenic soybean CGS-ZG11 was extracted with the plant genomic DNA extraction kit from Tiangen Biotech (Beijing) Co., Ltd.

2. Genome resequencing of the AtD-CGS high-methionine transgenic soybean

A small-fragment library for genome sequencing was constructed and sequenced on the HiSEQ2500 platform using paired-end sequencing to obtain the raw sequencing data. Adapter contamination and low-quality data were then removed to obtain clean data. The removal criteria were: a. when more than 10% of the bases in either read of a pair are undetermined, the read pair is removed; b. when the number of low-quality bases (Phred value below 5) in either read of a pair exceeds 50% of the read length, the read pair is removed. The clean reads and the proportion of high-quality (Q30) sequence in the resequencing data are shown in Table 1.
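The two removal criteria translate directly into a per-pair filter. The sketch below is illustrative only: the function names are hypothetical, and it assumes quality strings in Phred+33 encoding, which the text does not state.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def read_fails(seq, quality, max_n_frac=0.10, low_q=5, max_low_frac=0.50):
    """Apply criteria a and b to a single read of a pair."""
    if not seq:
        return True
    n_frac = seq.upper().count("N") / len(seq)                    # criterion a
    low = sum(1 for q in phred_scores(quality) if q < low_q)      # criterion b
    return n_frac > max_n_frac or low / len(seq) > max_low_frac


def keep_pair(read1, read2):
    """Keep a pair only if neither read fails; each read is (seq, quality)."""
    return not (read_fails(*read1) or read_fails(*read2))


good = ("ACGTACGTAC", "IIIIIIIIII")      # no N, all bases at Q40
too_many_n = ("ACGTNNACGT", "IIIIIIIIII")  # 20% undetermined bases
low_quality = ("ACGTACGTAC", "######IIII")  # 60% of bases at Q2
print(keep_pair(good, good), keep_pair(good, too_many_n), keep_pair(low_quality, good))
```

Both thresholds are applied as strict inequalities, matching the wording "more than 10%" and "exceeds 50%"; a read sitting exactly on a threshold is kept.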

Table 1. Genome resequencing data of the AtD-CGS high-methionine transgenic soybean

Library    Clean reads    Proportion of high-quality (Q30) sequence
500 bp     159,451,342    89.57%

1) The genome was assembled with SOAPdenovo2 (Version 2.04: released on July 13th, 2012). The assembly statistics are shown in Table 2.

Table 2. Genome assembly results for the AtD-CGS high-methionine transgenic soybean

Soybean reference genome size: 978495272 bp
Genome assembly size (including N): 964714736 bp
Genome assembly size (excluding N): 885873140 bp
Number of scaffolds: 1249045
Average scaffold length: 772 bp
Median scaffold length: 127 bp
Longest scaffold: 422777 bp
Shortest scaffold: 100 bp
Scaffold N50: 20840 bp
Number of contigs: 1985644
Average contig length: 469 bp
Median contig length: 137 bp
Longest contig: 51715 bp
Shortest contig: 100 bp
Contig N50: 2783 bp

In Table 2, the soybean reference genome is the whole-genome sequence of Williams 82, version Gmax_275_Wm82.a2.v1.
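For readers unfamiliar with the N50 values in Table 2, the statistic can be computed as in the following sketch. The function name and the toy lengths are hypothetical; the two assembly sizes are copied from Table 2 and are used only to derive the share of unknown bases.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L together
    cover at least half of the total assembled length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example: total 20, the two longest pieces (6 + 5) reach half of it.
toy_n50 = n50([2, 3, 4, 5, 6])

# Share of unknown (N) bases in the assembly, from the Table 2 sizes.
size_with_n = 964714736
size_without_n = 885873140
n_fraction = (size_with_n - size_without_n) / size_with_n
print(toy_n50, round(n_fraction, 4))
```

Roughly 8% of the assembly consists of unknown bases, which is why a scaffold of interest may contain gaps that need to be closed by PCR and sequencing.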

2) The scaffold sequences were searched with the expression cassette sequence of the target gene AtD-CGS in pGPTV-Bar-DCGS (in order: seed-specific promoter (LegB4), transit peptide (TP), target gene (AtD-CGS), terminator (TOCT)). The expression cassette sequence aligned specifically to scaffold99334 (55746 bp), as shown in Figure 1 (in Figure 1, "scaffo" denotes scaffold99334 and "DCGS-" denotes the AtD-CGS expression cassette sequence), and the matching region lies in the middle part of scaffold99334.

3) The scaffold99334 sequence was aligned to the soybean reference genome of Table 2 and found to map to chromosome 09, Chr09:39880882-39924565. scaffold99334 was then aligned carefully against this reference segment with ClustalW, which showed that the exogenous fragment (underlined) is inserted between Chr09:39917686 and Chr09:39917762, as shown in Figure 2.
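The comparison in step 3) can be pictured with a toy example. The sketch below is not the ClustalW procedure used in the example; it assumes the scaffold and the reference segment are on the same strand and free of sequencing errors, so that the insertion can be found from the longest shared prefix and suffix. All names and sequences are hypothetical.

```python
def locate_insertion(reference, sample):
    """Compare a sample region with its reference segment.

    Returns (left, right, inserted, deleted): left is the 1-based reference
    position of the last base matched from the left, right is the 1-based
    reference position where the match resumes, inserted is the extra sample
    sequence between them and deleted is the reference sequence it replaced.
    """
    limit = min(len(reference), len(sample))
    prefix = 0
    while prefix < limit and reference[prefix] == sample[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and reference[-1 - suffix] == sample[-1 - suffix]):
        suffix += 1
    inserted = sample[prefix:len(sample) - suffix]
    deleted = reference[prefix:len(reference) - suffix]
    return prefix, len(reference) - suffix + 1, inserted, deleted


reference = "AAACCCGGGTTT"
sample = "AAACCC" + "TATATATA" + "GTTT"  # insertion that also removes "GG"
left, right, inserted, deleted = locate_insertion(reference, sample)
print(left, right, inserted, deleted)
```

Reporting a pair of reference coordinates rather than a single point, as the example does for Chr09, accommodates the case where some recipient sequence is missing at the insertion site.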

4) Because of limited sequencing depth, the inserted exogenous fragment and the border sequences at the insertion position in scaffold99334 contain one stretch of unknown sequence (Figure 2). Primers CG-F and CG-R (Table 3) were designed from the sequences on either side of this stretch to amplify it, and the amplification product was subjected to first-generation sequencing.

Table 3. Primers for amplifying the unknown sequence

Primer name    Sequence
CG-F           5'-AGGGCTGCTAAAGGAAGCGGAACA-3'
CG-R           5'-CGATGTAGTGGTTGACGATGGTG-3'
GT-F           5'-GCCTGAAAATGAGGAAGAAACA-3'
GT-R           5'-CAATGAATCAACAACTCTCCTGGCG-3'
GA-F           5'-ACACTCAACCCTATCTCGGGCTATT-3'
GA-R           5'-GGTATCTTATGGCTGCTTGGAGTTG-3'
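A quick sanity check of primer strings, such as length and GC content, can catch transcription errors before primers are ordered. The sketch below uses the primer sequences as they appear in the original-language column of Table 3; the helper names are hypothetical and the check is not part of the described method.

```python
primers = {
    "CG-F": "AGGGCTGCTAAAGGAAGCGGAACA",
    "CG-R": "CGATGTAGTGGTTGACGATGGTG",
    "GT-F": "GCCTGAAAATGAGGAAGAAACA",
    "GT-R": "CAATGAATCAACAACTCTCCTGGCG",
    "GA-F": "ACACTCAACCCTATCTCGGGCTATT",
    "GA-R": "GGTATCTTATGGCTGCTTGGAGTTG",
}


def is_valid_dna(seq):
    """True if the string is non-empty and contains only A, C, G and T."""
    return len(seq) > 0 and set(seq.upper()) <= set("ACGT")


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in primers.items():
    print(f"{name}\t{len(seq)} nt\tGC {gc_fraction(seq):.2f}\t"
          f"valid={is_valid_dna(seq)}")
```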

5) The sequence obtained for the previously unknown stretch was aligned to the soybean reference genome of Table 2 to establish the insertion position and border sequences of the exogenous fragment in the AtD-CGS high-methionine transgenic soybean. As shown in Figure 2, the exogenous fragment (in italics) is inserted on soybean chromosome 09 between Chr09:39917686 and Chr09:39917762, and the border sequences on both sides of the insertion site can be read from the soybean reference sequence.

6) The details of the exogenous fragment were established at the same time. As shown in Figure 2, the inserted exogenous fragment is 9343 bp long and is given as Sequence 1 in the sequence listing. In Sequence 1, positions 171-423 are the NOS terminator sequence, positions 494-2305 the GUS gene sequence, positions 2350-3061 the TOCT terminator sequence, positions 3132-4535 the AtD-CGS gene sequence, positions 4719-7472 the LegB4 promoter sequence, positions 7513-8125 the pNOS promoter sequence, positions 8126-8714 the sequence of the herbicide resistance gene Bar, and positions 8715-8937 the transcript 7 terminator sequence. This information is consistent with the target vector. In addition, using the insertion position, two primer pairs (GT-F/GT-R and GA-F/GA-R, Table 3) were designed at the upstream and downstream junctions. PCR amplification followed by conventional sequencing confirmed the insertion position, showing that the sequence, insertion position and border sequence determined by the method of the present invention for the exogenous DNA fragment in the AtD-CGS high-methionine transgenic soybean are reliable.
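The element coordinates listed for Sequence 1 can be checked for internal consistency with a few lines of code. The sketch below only restates the positions given in the text; the data structure and helper names are hypothetical.

```python
INSERT_LENGTH = 9343  # bp, length of Sequence 1 as stated in the text

# (element, start, end) as 1-based inclusive positions within Sequence 1
features = [
    ("NOS terminator", 171, 423),
    ("GUS gene", 494, 2305),
    ("TOCT terminator", 2350, 3061),
    ("AtD-CGS gene", 3132, 4535),
    ("LegB4 promoter", 4719, 7472),
    ("pNOS promoter", 7513, 8125),
    ("Bar gene", 8126, 8714),
    ("transcript 7 terminator", 8715, 8937),
]


def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def features_consistent(features, total_length):
    """True if features are ordered, non-overlapping and within the insert."""
    previous_end = 0
    for _, start, end in features:
        if not (previous_end < start <= end <= total_length):
            return False
        previous_end = end
    return True


lengths = {name: feature_length(s, e) for name, s, e in features}
annotated = sum(lengths.values())
print(features_consistent(features, INSERT_LENGTH), annotated, lengths)
```

The listed elements are ordered and non-overlapping; the bases of the 9343 bp insert that they do not cover are spacers and ends that the text leaves unannotated.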

7) When the scaffold sequences were searched with the expression cassette sequence of the target gene AtD-CGS in pGPTV-Bar-DCGS (in order: seed-specific promoter (LegB4), transit peptide (TP), target gene (AtD-CGS), terminator (TOCT)), no sequence other than scaffold99334 aligned specifically to the expression cassette. The target gene is therefore present as a single copy in the genome of the AtD-CGS high-methionine transgenic soybean CGS-ZG11, in agreement with the Southern hybridization result (Han Qingmei et al., Identification and genetic stability analysis of high-methionine transgenic soybean, Chinese Journal of Oil Crop Sciences, 2015, 37(6):789-796). The method can thus also be used to determine copy number.
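The copy-number reasoning above, one matching scaffold meaning one copy, can be sketched as counting distinct matching loci. The hit coordinates below are invented for illustration, and the function name and `max_gap` parameter are assumptions; the count is only as reliable as the assembly, since tandem copies collapsed into one locus would be missed.

```python
from collections import defaultdict


def count_insertion_loci(hits, max_gap=0):
    """Count distinct loci among alignment hits.

    hits is a list of (scaffold, start, end) tuples; hits on the same
    scaffold that overlap, or lie within max_gap bases of each other,
    are merged into one locus.
    """
    by_scaffold = defaultdict(list)
    for scaffold, start, end in hits:
        by_scaffold[scaffold].append((start, end))
    loci = 0
    for intervals in by_scaffold.values():
        current_end = None
        for start, end in sorted(intervals):
            if current_end is None or start > current_end + max_gap:
                loci += 1          # a new, separate locus begins here
                current_end = end
            else:
                current_end = max(current_end, end)
    return loci


# Hypothetical hits: two overlapping matches on one scaffold form one locus.
single = [("scaffold99334", 100, 500), ("scaffold99334", 450, 900)]
print(count_insertion_loci(single))
```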

[Figures IDA0001053172150000011, IDA0001053172150000021 and IDA0001053172150000031 (images not reproduced)]

Claims (7)

1. A method for determining the sequence and/or insertion position and/or border sequence of an exogenous DNA fragment in a transgenic organism, comprising the following 1) and 2):
1) sequencing the whole genome of the transgenic organism to obtain the whole-genome sequence of the transgenic organism, wherein the transgenic organism is obtained by treating a recipient organism with a vector containing the exogenous DNA fragment;
2) comprising the following 21) and 22):
21) aligning sequence A with the whole-genome sequence of the transgenic organism to obtain the sequence in that whole-genome sequence which matches sequence A, and designating it sequence B;
22) comprising the following 22a) and 22b):
22a) aligning sequence B with the reference sequence of the recipient organism to preliminarily determine the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism;
22b) designing primers according to the known sequence in sequence B and the reference sequence of the recipient organism, amplifying the genome of the transgenic organism with the primers, and further determining the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism according to the sequence of the amplification product;
wherein sequence A is a1), a2) or a3):
a1) the full-length sequence of the vector;
a2) the sequence of any DNA fragment from the vector that contains the exogenous DNA fragment;
a3) the sequence of the exogenous DNA fragment or of any fragment thereof.

2. The method according to claim 1, characterized in that the whole genome of the transgenic organism is sequenced on a high-throughput sequencing platform.

3. The method according to claim 1, characterized in that the method further comprises constructing a sequencing library from the genome of the transgenic organism before the whole genome of the transgenic organism is sequenced.

4. The method according to claim 2, characterized in that, after the whole genome of the transgenic organism has been sequenced on a high-throughput sequencing platform, the reads are assembled with sequence assembly software and/or modules to obtain the whole-genome sequence of the transgenic organism.

5. The method according to claim 1, characterized in that the transgenic organism is a transgenic plant or a transgenic animal.

6. A method for determining the copy number of an exogenous DNA fragment in a transgenic organism, comprising: determining the sequence and/or insertion position and/or border sequence of the exogenous DNA fragment in the transgenic organism by the method of any one of claims 1-5, and thereby determining the copy number of the exogenous DNA fragment in the transgenic organism.

7. Any one of the following uses:
P1) use of the method of any one of claims 1-6 in biological breeding;
P2) use of the method of any one of claims 1-6 in evaluating whether a transgenic organism is safe.
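As an illustration only, the logic of steps 21) and 22a) of claim 1 can be sketched on toy data: a scaffold that matches the vector (sequence B) is anchored in the recipient reference, the match is extended until it breaks, and the breakpoint yields the insertion position and the host-side border sequence. Exact string matching stands in here for a real sequence aligner. The function name `locate_left_junction`, the `seed` parameter and all sequences are hypothetical, and only the left junction is handled.

```python
def locate_left_junction(scaffold, reference, vector, seed=10):
    """Find where a scaffold switches from host sequence to vector sequence.

    Returns None when the scaffold does not start in the reference or when
    the sequence after the breakpoint is not found in the vector.
    """
    anchor = reference.find(scaffold[:seed])  # 0-based start in reference
    if anchor == -1:
        return None

    # Extend the exact match base by base until host and scaffold diverge.
    j = seed
    while (j < len(scaffold) and anchor + j < len(reference)
           and scaffold[j] == reference[anchor + j]):
        j += 1

    # The sequence right after the breakpoint should come from the vector.
    if len(scaffold) - j < seed:
        return None
    vector_start = vector.find(scaffold[j:j + seed])
    if vector_start == -1:
        return None

    return {
        "insertion_after_ref_pos": anchor + j,   # 1-based last host base
        "border_sequence": scaffold[:j],         # host-side flank
        "vector_start_pos": vector_start + 1,    # 1-based first vector base
    }


# Toy data (hypothetical sequences, for illustration only).
reference = "GATTACAGGCTTAACCGGTTAACGTAGCTAGGATCCTTAGC"
vector = "CCCGGGAAATTTCATGCATGCAAGGTTCCAA"
scaffold = reference[5:25] + vector[3:20]

junction = locate_left_junction(scaffold, reference, vector)
print(junction)
```

The right-hand junction can be treated symmetrically by running the same search on the reverse of each sequence. Step 22b) then corresponds to designing primers across the predicted breakpoint and confirming it by PCR and sequencing.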

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610569874.8A CN107630079B (en) 2016-07-19 2016-07-19 Method for determining the sequence, insertion position and border sequence of foreign DNA fragments in transgenic organisms

Publications (2)

Publication Number Publication Date
CN107630079A CN107630079A (en) 2018-01-26
CN107630079B true CN107630079B (en) 2020-07-28

Family

ID=61113119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610569874.8A Active CN107630079B (en) 2016-07-19 2016-07-19 Method for determining the sequence, insertion position and border sequence of foreign DNA fragments in transgenic organisms

Country Status (1)

Country Link
CN (1) CN107630079B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109207569A (en) * 2018-09-29 2019-01-15 中国科学院遗传与发育生物学研究所 A kind of carrier insertion position detection method based on the sequencing of two generation of genome
CN110600079B (en) * 2019-08-12 2021-12-10 中国水稻研究所 Transgene identification method and identification device
CN110556165B (en) * 2019-09-12 2022-03-18 浙江大学 Method for rapidly identifying transgene or gene editing material and insertion site thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103270175A (en) * 2011-01-20 2013-08-28 深圳华大基因科技有限公司 Method and system for detecting the insertion sites of transgenic foreign fragments
CN103667481A (en) * 2013-12-06 2014-03-26 上海美吉生物医药科技有限公司 Method for sequencing unknown flanking sequence at both sides of known sequence
CN105492625A (en) * 2013-04-17 2016-04-13 先锋国际良种公司 Methods for characterizing DNA sequence composition in a genome
CN105631242A (en) * 2015-12-25 2016-06-01 中国农业大学 Method for identifying transgenic events through whole genome sequencing data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
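Methods and techniques for detecting the copy number and insertion sites of exogenous genes in plants; Luo Bin et al.; Journal of Henan Normal University (Natural Science Edition); 2012-11-30; Vol. 40, No. 6; pp. 111-116 *
Detection of insertion sites and copy number in Flammulina velutipes RNAi transformants by high-throughput sequencing; Chou Tiansheng et al.; Mycosystema; 2015-07-15; Vol. 34, No. 4; pp. 694-702 *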

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant