
CN104844696A - Design, synthesis and application of transcription activator like effector function protein - Google Patents


Info

Publication number
CN104844696A
Authority
CN
China
Prior art keywords
rvd
functional protein
residues
transcription activator
effector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410056380.0A
Other languages
Chinese (zh)
Inventor
席建忠
孙常宏
王干诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201410056380.0A
Publication of CN104844696A

Landscapes

  • Peptides Or Proteins (AREA)

Abstract

A transcription activator-like effector (TALE) functional protein whose repeat variable di-residue (RVD) is arginine (R) and valine (V); or valine (V) and isoleucine (I); or glycine (G) and aspartic acid (D); or arginine (R) and tryptophan (W); or threonine (T) and glycine (G); or cysteine (C) and glycine (G); or glutamic acid (E) and cysteine (C).

Description

Design, synthesis and application of a transcription activator-like effector functional protein

Technical Field

The present invention relates to the design and synthesis of a transcription activator-like effector (TALE) functional protein and to applications of that protein, and in particular to the functional protein itself.

Background Art

In the post-genomic era there is an urgent need for new bioengineering technologies for efficient gene manipulation and synthesis. For example, it is often necessary to suppress or silence the expression of a gene. Conventional gene knockout relies on homologous recombination occurring naturally in cells, and its efficiency is very low, typically on the order of 10^-6. RNAi is simple to perform, but it rarely achieves 100% suppression. Beyond suppressing or silencing a specific gene, it is often necessary to modify a few bases, or even a whole stretch of sequence, within a specific gene. Here too, relying on homologous recombination alone rarely gives satisfactory results.

TALEs (transcription activator-like effectors) were first discovered in the plant-pathogenic bacterium Xanthomonas. They bind DNA specifically and regulate plant genes during infection by the pathogen. TALE proteins can cross the nuclear membrane, enter the nucleus and bind specific DNA target sites, regulating the expression of disease- and resistance-related genes in the plant genome. In brief, a TALE consists of four or more tandem "protein modules" that specifically recognize DNA, flanked by N-terminal and C-terminal sequences. Each protein module contains 34 amino acids, of which the amino acids at positions 12 and 13 are the key sites for target recognition and are called the repeat variable di-residue (RVD). TALEs recognize DNA one base at a time: each nucleotide of the DNA target is recognized by the RVD of one repeat.

In theory, an RVD that binds specifically to each of the bases A, T, G and C can be found, so a TALE made up of the corresponding protein modules can readily be designed and synthesized for any DNA sequence. Several issues should be noted: 1) although RVDs can be designed for A, T, G and C, the correspondence between RVDs and bases still needs further optimization; 2) whether a designed and synthesized TALE binds efficiently to the intended position in the genome also depends on many other factors; 3) the composition of the protein modules also leaves room for further optimization.

Although many questions about TALEs remain to be resolved by further research, this does not prevent their application. One of the most promising applications is the TALEN. A TALEN is a fusion protein formed by fusing a TALE that specifically recognizes a given DNA sequence to an endonuclease (nuclease) that can generate a double-strand break (DSB) in DNA. TALENs act as heterodimers (two TALE-nuclease units acting together) and cut the DNA between two closely spaced specific recognition sequences.

A DSB produced by a TALEN can be repaired by two pathways: 1) Non-homologous end joining (NHEJ): NHEJ is a natural repair mechanism that can be exploited to introduce nucleotide deletions in order to inactivate or knock out a specific target gene; 2) Homologous recombination (HR): a DSB promotes homologous recombination, which, in the presence of a DNA template, can produce specific changes in the DNA sequence or integrate a transgene into it. The NHEJ pathway can be used for gene silencing, whereas HR can be used for gene editing or gene knock-in. With either pathway, repair of TALEN-induced breaks gives a much higher recombination efficiency than relying on homologous recombination alone. This provides a convenient technical means for genome customization and brings new hope for developing simpler genome-targeted modification technologies.

Another major application of TALEs is the TALEA (transcription activator-like (TAL) effector activator). A TALEA is a fusion protein: fusing a TALE that recognizes a specific DNA sequence to the VP64 transcriptional activation domain yields a transcriptional activator that recognizes a specific DNA sequence in a promoter. The fusion protein binds the specific DNA sequence near the gene promoter and, through the VP64 activation domain, binds RNA polymerase II, thereby activating transcription and increasing expression of the endogenous target gene. In practice, a target sequence (generally 12-18 bases) is selected upstream of the promoter of the target gene, and the TALE recognition modules are built to match it.

TALE technology has begun to make its mark in the life sciences. In 2011, a collaboration between a French group and an American group used TALENs to knock out IgM function in rats with an efficiency of up to 60%. Also in 2011, several groups, including groups in China, used TALENs to knock out genes such as hey2 in zebrafish with efficiencies above 30%.

TALEs have characteristic structural features: an N-terminal secretion signal, a central DNA-binding domain, a nuclear localization signal (NLS) and a C-terminal activation domain. In the DNA-binding domain, almost all TALE proteins discovered so far consist of a variable number (12 to 30) of highly conserved repeat units. These repeat units (generally 33 to 35 amino acids long) are highly conserved except for the amino acids at positions 12 and 13. These two variable amino acids are called the repeat variable di-residue (RVD). TALEs recognize DNA sequences mainly through the RVDs of their repeat units. Fourteen RVDs have been reported to date, including the following. NI (asparagine, isoleucine) specifically recognizes A; HD (histidine, aspartic acid) specifically recognizes C; NN (asparagine, asparagine) recognizes G and A; NK (asparagine, lysine) specifically recognizes G; NS (asparagine, serine) recognizes G, A, C and T; NG (asparagine, glycine) specifically recognizes T; NH (asparagine, histidine) specifically recognizes G; N* (asparagine, with position 13 absent) recognizes T, C, G and A; NP (asparagine, proline) recognizes T, A and C; HN (histidine, asparagine) recognizes G and A; NT (asparagine, threonine) recognizes G and A; SN (serine, asparagine) specifically recognizes G; and SH (serine, histidine) specifically recognizes G.
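The reported RVD-to-base relationships listed above can be captured in a small lookup table. The following Python sketch is illustrative only: the names KNOWN_RVD_TARGETS and rvds_for_base are hypothetical and do not come from the patent, and the table simply transcribes the 13 RVDs enumerated in the text.

```python
# Reported RVD specificities, transcribed from the background section.
# Each value is the set of bases the RVD is said to recognize.
KNOWN_RVD_TARGETS = {
    "NI": "A",
    "HD": "C",
    "NN": "GA",
    "NK": "G",
    "NS": "GACT",
    "NG": "T",
    "NH": "G",
    "N*": "TCGA",  # asparagine with position 13 absent
    "NP": "TAC",
    "HN": "GA",
    "NT": "GA",
    "SN": "G",
    "SH": "G",
}


def rvds_for_base(base, specific_only=False):
    """Return the RVDs reported to recognize `base`, sorted alphabetically.

    With specific_only=True, keep only RVDs reported to recognize that
    single base and no other.
    """
    base = base.upper()
    hits = [
        rvd
        for rvd, bases in KNOWN_RVD_TARGETS.items()
        if base in bases and (not specific_only or len(bases) == 1)
    ]
    return sorted(hits)
```

For example, `rvds_for_base("C", specific_only=True)` returns only HD, which illustrates how few base-specific choices the previously reported RVDs offer for some bases.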

Problems with the prior art: some TALEs are not specific enough, and some do not recognize their targets efficiently enough. Developing new, efficient TALE sequences (that is, new RVDs) has therefore become a technical problem that urgently needs to be solved in this field.

Summary of the Invention

The object of the present invention is to solve the above technical problem: to improve the TALE protein by design and optimize its RVD sequences, so that for each of the bases A, T, G and C there is an RVD sequence that binds it specifically.

RV specifically recognizes A; VI specifically recognizes C; GD specifically recognizes C; RW specifically recognizes C; TG recognizes T and also weakly recognizes C and G; CG recognizes G and also weakly recognizes T and C; EC strongly recognizes A, G and C and also weakly recognizes T.

The technical solution adopted by the present invention is as follows.

In the present invention, TALENs were designed and synthesized against the HHB gene sequence. The experimental methods used were the luciferase single-strand annealing recombination assay (SSA), which is widely used internationally, and the SURVEYOR assay, a method for measuring the efficiency of non-homologous recombination at an endogenous gene. The principles of the experiments are shown in Figures 1 and 2.

Although there is a general demand for new TALEs with better recognition specificity and higher recognition efficiency, and although methods for testing TALE recognition efficiency exist in the prior art (SSA and SURVEYOR, described below), TALEs with new structures have rarely been reported since TALEs were discovered. Among the many reasons, one of the most important is that the screening workload is prohibitively large.

As is well known, there are 20 standard amino acids, so an RVD of two amino acids has 20^2 = 400 possible combinations, and the screening workload is considerable. The present invention exploits plasmid incompatibility in Escherichia coli to screen useful RVDs out of this large number of combinations in E. coli. The specific steps are shown in Figure 1. The method first establishes a screening system, then builds a TALEN library covering the 400 pairwise combinations of the 20 standard amino acids. TALENs from the library are co-transformed into E. coli together with a reporter plasmid, positive clones are picked and amplified, and finally these are co-transformed with the reporter plasmid again for verification. This new method greatly reduces the experimental workload of screening, making it feasible to pick effective new TALEs out of a large number of possible combinations.
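The size of the search space (20 × 20 = 400 ordered pairs) can be checked by enumerating it. This is a minimal Python sketch; the names AMINO_ACIDS and all_rvd_candidates are hypothetical, and the code only enumerates candidates on paper. It says nothing about how the TALEN library itself was constructed.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_rvd_candidates():
    """Return every ordered pair of standard amino acids as a 2-letter string."""
    return ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
```

The pairs are ordered, so RV and VR count as different candidates, which is why the total is 400 rather than 210.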

Using the above method, we selected effective TALEs from the several hundred amino acid combinations. In these TALE functional proteins the repeat variable di-residue (RVD) is arginine (R) and valine (V); or valine (V) and isoleucine (I); or glycine (G) and aspartic acid (D); or arginine (R) and tryptophan (W); or threonine (T) and glycine (G); or cysteine (C) and glycine (G); or glutamic acid (E) and cysteine (C). We also determined the bases specifically recognized by these new TALEs, so that they can be applied to the specific recognition of those bases, as follows:

A TALE functional protein whose RVD is arginine (R) and valine (V) is used for specific recognition of adenine (A). A TALE functional protein whose RVD is valine (V) and isoleucine (I) is used for specific recognition of cytosine (C). A TALE functional protein whose RVD is glycine (G) and aspartic acid (D) is used for specific recognition of cytosine (C). A TALE functional protein whose RVD is arginine (R) and tryptophan (W) is used for specific recognition of cytosine (C). A TALE functional protein whose RVD is threonine (T) and glycine (G) is used for strong recognition of thymine (T) and can also weakly recognize cytosine (C) and guanine (G). A TALE functional protein whose RVD is cysteine (C) and glycine (G) is used for strong recognition of guanine (G) and can also weakly recognize thymine (T) and cytosine (C). A TALE functional protein whose RVD is glutamic acid (E) and cysteine (C) is used for strong recognition of adenine (A), guanine (G) and cytosine (C) and can also weakly recognize thymine (T).
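As an illustration of how the seven RVDs described above could be used to lay out a repeat array for a DNA target, here is a minimal Python sketch. The table NEW_RVD_RECOGNITION transcribes the strong and weak recognition stated in the text. The per-base DEFAULT_CHOICE and the function name design_rvd_array are assumptions made for this example; the patent does not prescribe which RVD to prefer when several recognize the same base.

```python
# Strong and weak base recognition of the seven RVDs, as stated in the text.
NEW_RVD_RECOGNITION = {
    "RV": {"strong": "A", "weak": ""},
    "VI": {"strong": "C", "weak": ""},
    "GD": {"strong": "C", "weak": ""},
    "RW": {"strong": "C", "weak": ""},
    "TG": {"strong": "T", "weak": "CG"},
    "CG": {"strong": "G", "weak": "TC"},
    "EC": {"strong": "AGC", "weak": "T"},
}

# Hypothetical choice of one RVD per base (an assumption, not from the patent).
DEFAULT_CHOICE = {"A": "RV", "C": "VI", "G": "CG", "T": "TG"}


def design_rvd_array(target, choice=DEFAULT_CHOICE):
    """Map each base of `target` to one RVD, one repeat per nucleotide."""
    rvds = []
    for base in target.upper():
        if base not in choice:
            raise ValueError(f"unsupported base: {base!r}")
        rvd = choice[base]
        # Guard against a choice table that contradicts the stated recognition.
        if base not in NEW_RVD_RECOGNITION[rvd]["strong"]:
            raise ValueError(f"{rvd} is not listed as strongly recognizing {base}")
        rvds.append(rvd)
    return rvds
```

Passing a different `choice` mapping, for example `{"A": "EC", "C": "GD", "G": "EC", "T": "TG"}`, selects other RVDs from the same table.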

Brief Description of the Drawings

Figure 1: Schematic diagram of the screening workflow.

Figure 2: Principle of the luciferase single-strand annealing recombination assay (SSA). The luciferase gene is first modified: a stop codon and the target gene sequence are inserted in the middle of the luciferase gene, followed, by molecular cloning, by a further segment of luciferase gene sequence. The sequences of the light gray regions in the figure (about 800 bp) are identical. When the TALEN is effective, it cuts the double-stranded DNA at the target sequence, producing a double-strand break. Because the two light gray regions (that is, the segment to the left of the "stop codon" and the segment to the right of the "target sequence" in Figure 1) are identical, homologous recombination then occurs and yields an active luciferase reporter gene. Luciferase expression can be measured in the experiment to gauge the strength of the TALEN's effect.

Figure 3: Principle of the SURVEYOR assay. The target gene fragment is first amplified from the genome by PCR; the product includes both sequences carrying deletions and unchanged sequences. The amplified fragments are then re-annealed in vitro, so that deleted and unchanged fragments anneal to form double-stranded structures with partially unpaired regions. The product is then digested with the SURVEYOR enzyme, which recognizes and cleaves unpaired regions. The amounts of bands a, b and c in the figure are measured by gel electrophoresis to determine the deletion efficiency, and hence the effect of the TALEN.
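The text does not state how the band amounts a, b and c are converted into an efficiency. One estimate commonly used for mismatch-cleavage assays is sketched below in Python, under two assumptions that are not from the patent: band a is the uncut PCR product, and bands b and c are the two cleavage products. The function name estimate_indel_percent is hypothetical.

```python
from math import sqrt


def estimate_indel_percent(uncut, cleaved_1, cleaved_2):
    """Estimate the percentage of modified alleles from gel band intensities.

    Assumes `uncut` is the full-length band and `cleaved_1`, `cleaved_2`
    are the two cleavage products. The square root corrects for random
    re-annealing: a duplex escapes cleavage only if both strands are unmodified.
    """
    if min(uncut, cleaved_1, cleaved_2) < 0:
        raise ValueError("band intensities must be non-negative")
    total = uncut + cleaved_1 + cleaved_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cleaved_1 + cleaved_2) / total
    return 100.0 * (1.0 - sqrt(1.0 - fraction_cleaved))
```

With this estimate, a gel in which half of the signal is in the cleaved bands corresponds to roughly 29% modified alleles rather than 50%, because one modified strand is enough to make a whole duplex cleavable.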

Figure 4: SSA results for RVD-RV.

Figure 5: SSA results for RVD-VI.

Figure 6: SSA results for RVD-GD.

Figure 7: SSA results for RVD-RW.

Figure 8: SSA results for RVD-TG.

Figure 9: SSA results for RVD-CG.

Figure 10: SSA results for RVD-EC.

Figure 11: SURVEYOR results.

Detailed Description

Example 1

The repeat variable di-residue (RVD) of the TALE functional protein is arginine (R) and valine (V), abbreviated RVD-RV. The corresponding SSA results are shown in Figure 4. The positive control is a TALE designed and synthesized according to the established rules in the field: NI (asparagine, isoleucine) recognizes A, HD (histidine, aspartic acid) recognizes C, NN (asparagine, asparagine) recognizes G, and NG (asparagine, glycine) recognizes T. RVD-RV recognizes adenine (A) efficiently: its efficiency relative to the positive control is 150%. This TALE can therefore be used for specific recognition of adenine (A).

Example 2

The repeat variable di-residue (RVD) of the TALE functional protein is valine (V) and isoleucine (I), abbreviated RVD-VI. The corresponding SSA results are shown in Figure 5. RVD-VI recognizes cytosine (C) very efficiently: its efficiency relative to the positive control is 198%. This TALE can therefore be used for specific recognition of cytosine (C).

Example 3

The repeat variable di-residue (RVD) of the TALE functional protein is glycine (G) and aspartic acid (D), abbreviated RVD-GD. The corresponding SSA results are shown in Figure 6. RVD-GD recognizes cytosine (C) very efficiently: its efficiency relative to the positive control is 103%. This TALE can therefore be used for specific recognition of cytosine (C).

Example 4

The repeat variable di-residue (RVD) of the TALE functional protein is arginine (R) and tryptophan (W), abbreviated RVD-RW. The corresponding SSA results are shown in Figure 7. RVD-RW recognizes cytosine (C) with an efficiency of 68.5% relative to the positive control. This TALE can therefore be used for specific recognition of cytosine (C).

Example 5

The repeat variable di-residue (RVD) of the TALE functional protein is threonine (T) and glycine (G), abbreviated RVD-TG. The corresponding SSA results are shown in Figure 8. RVD-TG recognizes thymine (T) efficiently, with an efficiency of 132% relative to the positive control. It recognizes cytosine (C) and guanine (G) more weakly, with relative efficiencies of 61% and 36% respectively. This TALE can therefore be used for strong recognition of thymine (T) and can also weakly recognize cytosine (C) and guanine (G).

Example 6

The repeat variable di-residue (RVD) of the TALE functional protein is cysteine (C) and glycine (G), abbreviated RVD-CG. The corresponding SSA results are shown in Figure 9. RVD-CG recognizes guanine (G) with an efficiency of 88.8% relative to the positive control. It recognizes thymine (T) and cytosine (C) more weakly, with relative efficiencies of 1% and 34% respectively. This TALE can therefore be used for strong recognition of guanine (G) and can also weakly recognize thymine (T) and cytosine (C).

Example 7

The repeat variable di-residue (RVD) of the TALE functional protein is glutamic acid (E) and cysteine (C), abbreviated RVD-EC. The corresponding SSA results are shown in Figure 10. RVD-EC recognizes adenine (A), guanine (G) and cytosine (C) efficiently, with efficiencies of 211%, 421% and 200% respectively relative to the positive control. It recognizes thymine (T) more weakly, with a relative efficiency of 13%. This TALE can therefore be used for strong recognition of adenine (A), guanine (G) and cytosine (C) and can also weakly recognize thymine (T).
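The relative efficiencies reported in Examples 1 to 7 can be gathered into one table for comparison. The Python sketch below only transcribes the numbers stated in the examples (percent of the positive control in the SSA assay). The names RELATIVE_EFFICIENCY and rank_rvds are hypothetical, and RVD and base pairs for which the text reports no number are left out rather than assumed to be zero.

```python
# Relative SSA efficiency (% of the positive control), from Examples 1-7.
RELATIVE_EFFICIENCY = {
    "RV": {"A": 150.0},
    "VI": {"C": 198.0},
    "GD": {"C": 103.0},
    "RW": {"C": 68.5},
    "TG": {"T": 132.0, "C": 61.0, "G": 36.0},
    "CG": {"G": 88.8, "T": 1.0, "C": 34.0},
    "EC": {"A": 211.0, "G": 421.0, "C": 200.0, "T": 13.0},
}


def rank_rvds(base):
    """Return (RVD, relative efficiency) pairs for `base`, highest first."""
    base = base.upper()
    scored = [
        (rvd, eff[base])
        for rvd, eff in RELATIVE_EFFICIENCY.items()
        if base in eff
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Ranking by efficiency alone favors EC for A, G and C, but EC recognizes three bases strongly, so a ranking like this is a starting point rather than a measure of specificity.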

Appendix:

(1)

Amino acid sequence of RVD-RV in Example 1:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEKI

KPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVG

KQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNL

TPEQVVAIASRVGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNGGGKQALETVQRLLP

VLCQAHGLTPAQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASRVGGKQAL

ETVQRLLPVLCQAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASH

DGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPD

QVVAIASRVGGKQALETVQRLLPVLCQAHGLTPDQVVAIASRVGGKQALETVQRLLPVLC

QAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETV

QRLLPVLCQAHGLTPAQVVAIASRVGGKQALETVQRLLPVLCQAHGLTPDQVVAIASHDG

GKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQV

VAIASRVGGKQALETVQRLLPVLCQAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQA

HGLTPAQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGL

PHAPALIKRTNRRIPERTSHRVA
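The RVDs of a sequence such as the one above can be read off programmatically, because each repeat has the same flanking residues around positions 12 and 13. The Python sketch below assumes the repeat context `LTP[A/D/E]QVVAIAS..GG`, which is a pattern observed in the listed sequence and not a rule stated by the patent. The names REPEAT_RVD and extract_rvds are hypothetical.

```python
import re

# Conserved context around the RVD, as observed in the listed repeats:
# 11 residues (LTPxQVVAIAS), then the 2-residue RVD, then GG.
REPEAT_RVD = re.compile(r"LTP[ADE]QVVAIAS([A-Z]{2})GG")


def extract_rvds(protein):
    """Return the RVDs of all repeats found in `protein`, in order."""
    # Drop line breaks and spaces so a wrapped sequence can be pasted in directly.
    sequence = "".join(protein.split()).upper()
    return REPEAT_RVD.findall(sequence)


# First three repeats of the RVD-RV sequence, one repeat per line.
fragment = """
LTPEQVVAIASRVGGKQALETVQRLLPVLCQAHG
LTPDQVVAIASNGGGKQALETVQRLLPVLCQAHG
LTPAQVVAIASHDGGKQALETVQRLLPVLCQAHG
"""
print(extract_rvds(fragment))  # ['RV', 'NG', 'HD']
```

Repeats whose flanking residues differ from this pattern would be missed, so the result should be checked against the expected number of repeats.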

(2)

Amino acid sequence of RVD-VI in Example 2:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEKI

KPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVG

KQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNL

TPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNGGGKQALETVQRLLPV

LCQAHGLTPAQVVAIASVIGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQALETV

QRLLPVLCQAHGLTPDQVVAIASVIGGKQALETVQRLLPVLCQAHGLTPAQVVAIASVIGGK

QALETVQRLLPVLCQAHGLTPAQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPDQVVAI

ASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNIGGKQALETVQRLLPVLCQAHGLT

PDQVVAIASVIGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVL

CQAHGLTPAQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASVIGGKQALETV

QRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGG

KQALETVQRLLPVLCQAHGLTPDQVVAIASVIGGKQALETVQRLLPVLCQAHGLTPAQVVA

IASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRT

NRRIPERTSHRVA

(3)

Amino acid sequence of RVD-GD in Example 3:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEK

IKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGV

GKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPL

NLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNNGGKQALETVQRL

LPVLCQAHGLTPDQVVAIASGDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGK

QALETVQRLLPVLCQAHGLTPAQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPDQVVA

IASGDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAH

GLTPAQVVAIASGDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASGDGGKQALETVQRL

LPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASGDGGK

QALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVA

IASGDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQALETVQRLLPVLCQAHG

LTPDQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNGGGRPALESIVAQLS

RPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVA

(4)

In Example 4, the RVD-RW amino acid sequence:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEK

IKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGV

GKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPL

NLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASRWGGKQALETVQRL

LPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASRWGGK

QALETVQRLLPVLCQAHGLTPAQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPDQVVA

IASRWGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAH

GLTPAQVVAIASRWGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQALETVQRL

LPVLCQAHGLTPDQVVAIASRWGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQ

ALETVQRLLPVLCQAHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAI

ASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHG

LTPAQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNIGGKQALETVQRLLP

VLCQAHGLTPDQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPAL

DAVKKGLPHAPALIKRTNRRIPERTSHRVA

(5)

In Example 5, the RVD-TG amino acid sequence:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEKI

KPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVG

KQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNL

TPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASTGGGKQALETVQRLLPV

LCQAHGLTPAQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQALET

VQRLLPVLCQAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASHD

GGKQALETVQRLLPVLCQAHGLTPAQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPDQ

VVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNIGGKQALETVQRLLPVLCQA

HGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASTGGGKQALETVQR

LLPVLCQAHGLTPAQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASHDGGKQ

ALETVQRLLPVLCQAHGLTPAQVVAIASTGGGKQALETVQRLLPVLCQAHGLTPAQVVAIA

SNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLT

PAQVVAIASTGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAP

ALIKRTNRRIPERTSHRVA

(6)

In Example 6, the RVD-CG amino acid sequence:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEK

IKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGV

GKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPL

NLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASHDGGKQALETVQRLL

PVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASHDGGKQ

ALETVQRLLPVLCQAHGLTPAQVVAIASCGGGKQALETVQRLLPVLCQAHGLTPDQVVAI

ASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHG

LTPAQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQALETVQRLLP

VLCQAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQAL

ETVQRLLPVLCQAHGLTPDQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIAS

NGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLT

PAQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNIGGKQALETVQRLLPVL

CQAHGLTPDQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDA

VKKGLPHAPALIKRTNRRIPERTSHRVA

(7)

In Example 7, the RVD-EC amino acid sequence:

MAPKKKRKVYPYDVPDYAGYPYDVPDYAGSYPYDVPDYAAHGTVDLRTLGYSQQQQEK

IKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGV

GKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPL

NLTPEQVVAIASECGGKQALETVQRLLPVLCQAHGLTPDQVVAIASECGGKQALETVQRL

LPVLCQAHGLTPDQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGK

QALETVQRLLPVLCQAHGLTPAQVVAIASECGGKQALETVQRLLPVLCQAHGLTPDQVVA

IASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHG

LTPAQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASHDGGKQALETVQRLL

PVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAIASHDGGKQ

ALETVQRLLPVLCQAHGLTPAQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPAQVVAI

ASHDGGKQALETVQRLLPVLCQAHGLTPAQVVAIASNIGGKQALETVQRLLPVLCQAHGL

TPDQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPDQVVAIASNGGGRPALESIVAQLSR

PDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVA
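The RVD content of a listed sequence can be read off programmatically. The following is a minimal sketch, not part of the patent: it assumes, based only on the pattern visible in the listings above, that each repeat carries its two RVD residues between the conserved stretch `VVAIAS` and a following `GG`. The function name `extract_rvds` and the regular expression are illustrative choices.

```python
import re
from typing import List

# Assumption inferred from the listings: in every repeat the two RVD residues
# sit directly after "VVAIAS" and directly before "GG".
RVD_PATTERN = re.compile(r"VVAIAS([A-Z]{2})GG")


def extract_rvds(protein_sequence: str) -> List[str]:
    """Return the RVDs found in a TALE protein sequence, in N-to-C order."""
    # Remove line breaks and stray spaces left over from the wrapped listing.
    cleaned = "".join(protein_sequence.split())
    return RVD_PATTERN.findall(cleaned)


# First repeat-containing line of the Example 7 (RVD-EC) listing.
fragment = "NLTPEQVVAIASECGGKQALETVQRLLPVLCQAHGLTPDQVVAIASECGGKQALETVQRL"
print(extract_rvds(fragment))  # ['EC', 'EC']
```

Because whitespace is stripped before matching, the wrapped lines of a full listing can be concatenated and passed in as a single string.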

Claims (8)

1. A transcription activator-like effector (TALE) functional protein, characterized in that the repeat variable di-residue (RVD) of the TALE functional protein consists of arginine (R) and valine (V); or valine (V) and isoleucine (I); or glycine (G) and aspartic acid (D); or arginine (R) and tryptophan (W); or threonine (T) and glycine (G); or cysteine (C) and glycine (G); or glutamic acid (E) and cysteine (C).

2. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of arginine (R) and valine (V) and is used for the specific recognition of adenine (A) in DNA.

3. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of valine (V) and isoleucine (I) and is used for the specific recognition of cytosine (C) in DNA.

4. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of glycine (G) and aspartic acid (D) and is used for the specific recognition of cytosine (C) in DNA.

5. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of arginine (R) and tryptophan (W) and is used for the specific recognition of cytosine (C) in DNA.

6. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of threonine (T) and glycine (G) and is used for strong recognition of thymine (T) in DNA; it can also weakly recognize cytosine (C) and guanine (G) in DNA.

7. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of cysteine (C) and glycine (G) and is used for strong recognition of guanine (G) in DNA; it can also weakly recognize thymine (T) and cytosine (C) in DNA.

8. Use of the functional protein according to claim 1, characterized in that the RVD of the TALE functional protein consists of glutamic acid (E) and cysteine (C) and is used for strong recognition of adenine (A), guanine (G) and cytosine (C) in DNA; it can also weakly recognize thymine (T) in DNA.
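The base preferences stated in claims 2 to 8 can be collected into a small lookup table. The sketch below is illustrative only: the dictionary layout and the helper `rvds_for_base` are hypothetical names, and the "specific recognition" wording of claims 2 to 5 is treated here as strong recognition, which is an assumption rather than something the claims state.

```python
from typing import Dict, List, Set

# RVD -> bases recognized, transcribed from claims 2-8.
# Assumption: "specific recognition" (claims 2-5) is filed under "strong".
RVD_RECOGNITION: Dict[str, Dict[str, Set[str]]] = {
    "RV": {"strong": {"A"}, "weak": set()},                 # claim 2
    "VI": {"strong": {"C"}, "weak": set()},                 # claim 3
    "GD": {"strong": {"C"}, "weak": set()},                 # claim 4
    "RW": {"strong": {"C"}, "weak": set()},                 # claim 5
    "TG": {"strong": {"T"}, "weak": {"C", "G"}},            # claim 6
    "CG": {"strong": {"G"}, "weak": {"T", "C"}},            # claim 7
    "EC": {"strong": {"A", "G", "C"}, "weak": {"T"}},       # claim 8
}


def rvds_for_base(base: str, include_weak: bool = False) -> List[str]:
    """List the claimed RVDs that recognize a base, sorted alphabetically."""
    hits = []
    for rvd, pref in RVD_RECOGNITION.items():
        if base in pref["strong"] or (include_weak and base in pref["weak"]):
            hits.append(rvd)
    return sorted(hits)


print(rvds_for_base("C"))                     # ['EC', 'GD', 'RW', 'VI']
print(rvds_for_base("T", include_weak=True))  # ['CG', 'EC', 'TG']
```

Keeping strong and weak recognition in separate sets makes it easy to exclude the weaker interactions when choosing an RVD for a target base.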

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410056380.0A CN104844696A (en) 2014-02-19 2014-02-19 Design, synthesis and application of transcription activator like effector function protein

Publications (1)

Publication Number Publication Date
CN104844696A true CN104844696A (en) 2015-08-19

Family

ID=53844721

Country Status (1)

Country Link
CN (1) CN104844696A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102770539A (en) * 2009-12-10 2012-11-07 Regents of the University of Minnesota TAL effector-mediated DNA modification
CN102787125A (en) * 2011-08-05 2012-11-21 Peking University Method for building TALE (transcription activator-like effector) repeated sequences
CN103025344A (en) * 2010-05-17 2013-04-03 Sangamo BioSciences, Inc. Novel DNA-binding proteins and uses thereof
WO2013082519A2 (en) * 2011-11-30 2013-06-06 The Broad Institute Inc. Nucleotide-specific recognition sequences for designer tal effectors
CN103146735A (en) * 2012-12-28 2013-06-12 Northwest A&F University Construction method of TALE repeat unit tetramer library, construction method of TALEN expression vector and application thereof
CN103435691A (en) * 2013-08-13 2013-12-11 Peking University TALE (Transcription Activator Like Effectors) protein and application thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ERNST WEBER: "Assembly of Designer TAL Effectors by Golden Gate Cloning", PLOS ONE *
JENS BOCH: "Breaking the Code of DNA Binding Specificity of TAL-Type III Effectors", Science *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108796040A (en) * 2018-06-27 2018-11-13 South-Central University for Nationalities Bioluminescence detection probe based on transcription activator-like effector and construction method and application thereof
CN108796040B (en) * 2018-06-27 2021-09-24 South-Central University for Nationalities Bioluminescence detection probe based on transcription activator-like effector and construction method and application thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150819
