CN101283089A

CN101283089A - Nucleic acids encoding modified cytochrome P450 enzymes and methods of use thereof

Info

Publication number: CN101283089A
Application number: CNA2006800373195A
Authority: CN
Inventors: M. C.-Y. Chang; R. Eachus; D.-K. Ro; Yasuo Yoshikuni; J. D. Keasling
Original assignee: University of California San Diego UCSD
Current assignee: University of California San Diego UCSD
Priority date: 2005-10-07
Filing date: 2006-10-05
Publication date: 2008-10-08
Also published as: ZA200803368B

Abstract

The present invention provides nucleic acids comprising nucleotide sequences encoding modified cytochrome P450 enzymes; as well as recombinant vectors and host cells comprising the nucleic acids. The present invention further provides methods of producing a functionalized compound in a host cell genetically modified with a nucleic acid comprising nucleotide sequences encoding a modified cytochrome P450 enzyme.

Description

Nucleic acids encoding modified cytochrome P450 enzymes and methods of use thereof

Cross-reference

This application claims priority to U.S. Provisional Patent Application No. 60/724,525, filed October 7, 2005, and U.S. Provisional Patent Application No. 60/762,700, filed January 27, 2006, both of which are incorporated herein by reference in their entirety.

Field of the invention

The invention relates to the field of isoprenoid compound production, and in particular to host cells genetically modified with nucleic acids encoding isoprenoid precursor-modifying enzymes.

Background of the invention

Isoprenoids constitute an extremely large and diverse group of natural products that share a common biosynthetic origin, namely a single metabolic precursor, isopentenyl diphosphate (IPP). Isoprenoid compounds are also referred to as "terpenes" or "terpenoids". More than 40,000 isoprenoids have been described. By definition, isoprenoids are made up of so-called isoprene (C5) units. The number of C atoms present in isoprenoids is typically divisible by five (C5, C10, C15, C20, C25, C30 and C40), although irregular isoprenoids and polyterpenes have also been reported. Important members of the isoprenoids include the carotenoids, sesquiterpenoids, diterpenoids, and hemiterpenes. Carotenoids include, for example, lycopene and β-carotene, many of which function as antioxidants. Sesquiterpenoids include, for example, artemisinin, a compound with antimalarial activity. Diterpenoids include, for example, paclitaxel, a cancer chemotherapeutic agent.

Isoprenoids comprise the most numerous and structurally diverse family of natural products. Within this family, terpenoids isolated from plants and other natural sources are used as commercial flavor and fragrance compounds as well as antimalarial and anticancer drugs. Most of the terpenoid compounds in use today are natural products or their derivatives. The source organisms of many of these natural products (e.g., trees, marine invertebrates) are neither amenable to the large-scale cultivation needed to produce commercially viable quantities nor amenable to genetic manipulation to increase production of these compounds or to derivatize them. These natural products must therefore be produced semi-synthetically from analogs or synthesized by conventional chemical methods. Furthermore, many natural products have complex structures and are, as a result, currently uneconomical or impossible to synthesize. Such natural products must either be extracted from their native sources, such as trees, sponges, corals, and marine microbes, or be produced synthetically or semi-synthetically from more abundant precursors. Extraction of a natural product from a native source is limited by the availability of that source, and synthetic or semi-synthetic production can suffer from low yield and/or high cost. These production problems and the limited availability of the natural source can restrict the commercial and clinical development of such products.

The biosynthesis of isoprenoid natural products in engineered (genetically modified) host cells, in vitro (e.g., in a fermentation system) or in vivo (e.g., in a genetically modified multicellular organism), could tap the commercial and therapeutic potential that these natural sources have left unrealized, and could yield less expensive and more widely available fine chemicals and pharmaceuticals. One obstacle to the production of isoprenoid or isoprenoid precursor compounds in genetically modified hosts is the efficient production of enzymes that modify the polyprenyl precursors of isoprenoid compounds, or that modify isoprenoid precursors.

One of the most important classes of enzymes in the biochemical transformation of many natural product targets is the cytochrome P450 (P450) superfamily, which takes part in an extremely wide range of metabolic reactions. In one striking example, P450s catalyze 8 of the approximately 20 steps in the biosynthesis of paclitaxel from its precursor, geranylgeranyl pyrophosphate.

There is a need in the art for improved isoprenoid-producing or isoprenoid precursor-producing host cells that provide for high-level production of isoprenoid compounds. The present invention addresses this need and provides related advantages.

References

U.S. Patent Publication No. 2004/005678; U.S. Patent Publication No. 2003/0148479; Martin et al. (2003) Nat. Biotech. 21(7):796-802; Polakowski et al. (1998) Appl. Microbiol. Biotechnol. 49:67-71; Wilding et al. (2000) J. Bacteriol. 182(15):4319-27; U.S. Patent Publication No. 2004/0194162; Donald et al. (1997) Appl. Env. Microbiol. 63:3341-3344; Jackson et al. (2003) Organ. Lett. 5:1629-1632; U.S. Patent Publication No. 2004/0072323; U.S. Patent Publication No. 2004/0029239; U.S. Patent Publication No. 2004/0110259; U.S. Patent Publication No. 2004/0063182; U.S. Patent No. 5,460,949; U.S. Patent Publication No. 2004/0077039; U.S. Patent No. 6,531,303; U.S. Patent No. 6,689,593; Hamano et al. (2001) Biosci. Biotechnol. Biochem. 65:1627-1635; T. Kuzuyama (2004) Biosci. Biotechnol. Biochem. 68(4):931-934; T. Kazuhiko (2004) Biotechnology Letters 26:1487-1491; Brock et al. (2004) Eur. J. Biochem. 271:3227-3241; Choi et al. (1999) Appl. Environ. Microbio. 65:4363-4368; Parke et al. (2004) Appl. Environ. Microbio. 70:2974-2983; Subrahmanyam et al. (1998) J. Bact. 180:4596-4602; Murli et al. (2003) J. Ind. Microbiol. Biotechnol. 30:500-509; Starai et al. (2005) J. Biol. Chem. 280:26200-26205; Starai et al. (2004) J. Mol. Biol. 340:1005-1012; Jennewein et al. Chem. Biol. 2004, 11, 379-387; Sowden et al. Org. Biomol. Chem. 2005, 3, 57-64; Luo et al. Plant J. 2001, 28, 95-104; Carter et al. Phytochem. 2003, 64, 425-433; Craft et al. Appl. Environ. Microbiol. 2003, 69, 5983-5991; Barnes et al. Proc. Natl. Acad. Sci. USA 1991, 88, 5597-5601; Schoch et al. Plant Physiol. 2003, 133, 1198-1208; Roosild et al. Science 2005, 307, 1317-1321.

Summary of the invention

The present invention provides nucleic acids comprising nucleotide sequences encoding modified cytochrome P450 enzymes, as well as recombinant vectors and host cells comprising the nucleic acids. The invention further provides methods of producing a functionalized compound in a host cell genetically modified with a nucleic acid comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme.

Brief description of the drawings

Figure 1 is a schematic diagram of the biosynthesis of 8-hydroxy-δ-cadinene in Escherichia coli (E. coli).

Figure 2 is a gas chromatography-mass spectrometry (GC-MS) trace of the organic layer extracted from E. coli expressing the CadOH biosynthetic pathway.

Figure 3 is a GC-MS trace of the organic layer extracted from mevalonate-fed E. coli expressing the CadOH biosynthetic pathway together with a portion of the mevalonate pathway (pMBIS).

Figures 4A and 4B depict various N-terminal modifications made to CadH (Figure 4A) and a time course of CadOH production by E. coli genetically modified with the various CadH constructs (Figure 4B).

Figure 5 depicts the amino acid sequence of Mistic.

Figure 6 depicts the amino acid sequence of limonene hydroxylase.

Figure 7 depicts the amino acid sequence of aristolochene dihydroxylase.

Figures 8A-D depict the amino acid sequences of cadinene hydroxylase containing the native transmembrane domain (underlined) (Figure 8A); cadinene hydroxylase containing a heterologous transmembrane domain (bold) (Figure 8B); cadinene hydroxylase containing a solubilization domain (bold) (Figure 8C); and cadinene hydroxylase containing a secretion domain and a heterologous transmembrane domain (bold) (Figure 8D).

Figures 9A and 9B depict the amino acid sequence of taxadiene hydroxylase.

Figure 10 depicts the amino acid sequence of ent-kaurene oxidase.

Figure 11A depicts a nucleotide sequence encoding cadinene hydroxylase (the start atg is shown in bold); Figure 11B depicts a variant nucleotide sequence encoding cadinene hydroxylase that is codon-optimized for expression in prokaryotic cells.

Figure 12A depicts the amino acid sequence of a cytochrome P450 reductase (CPR) from Taxus cuspidata; Figure 12B depicts the amino acid sequence of a CPR from Candida tropicalis; Figure 12C depicts the amino acid sequence of a CPR (ATR1) from Arabidopsis thaliana; Figure 12D depicts the amino acid sequence of a CPR (ATR2) from Arabidopsis thaliana; Figure 12E depicts a variant ATR2 amino acid sequence lacking the chloroplast-targeting sequence.

Figure 13 is a schematic diagram of two heme biosynthetic pathways.

Figure 14 is a schematic diagram of the biosynthesis of the exemplary isoprenoid products paclitaxel, artemisinin, and menthol.

Figure 15 is a schematic diagram of reaction schemes for producing exemplary isoprenoid compounds.

Figure 16 is a schematic diagram of the isoprenoid metabolic pathways that produce the isoprenoid biosynthetic pathway intermediates polyprenyl diphosphates, geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP) from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).

Figure 17 is a schematic diagram of the mevalonate (MEV) pathway for the production of IPP.

Figure 18 is a schematic diagram of the DXP pathway for the production of IPP and dimethylallyl pyrophosphate (DMAPP).

Figures 19A-C depict the amino acid sequences of various P450 enzymes that modify alkaloid pathway intermediates.

Figures 20A-C depict the amino acid sequences of various P450 enzymes that modify phenylpropanoid pathway intermediates.

Figures 21A and 21B depict the amino acid sequences of various P450 enzymes that modify polyketide pathway intermediates.

Figure 22 is a schematic representation of various amorphadiene oxidase (AMO) constructs: (1) nAMO, the native AMO sequence isolated from Artemisia annua; (2) sAMO, a synthetic AMO gene codon-optimized for expression in E. coli; (3) A13-AMO, a synthetic AMO gene in which the wild-type transmembrane sequence is replaced with the A13 N-terminal sequence from C. tropicalis; (4) A17-AMO, a synthetic AMO gene in which the wild-type transmembrane sequence is replaced with the A17 N-terminal sequence from C. tropicalis; (5) Bov-AMO, a synthetic AMO gene in which the wild-type transmembrane sequence is replaced with the bovine microsomal N-terminal sequence.

Figures 23A and 23B depict amorphadiene oxidation by various AMO constructs in E. coli.

Figures 24A and 24B depict a nucleotide sequence encoding wild-type AMO.

Figure 25 depicts the amino acid sequence translation of the nucleotide sequence shown in Figure 24.

Figures 26 and 27 depict a nucleotide sequence encoding A13-AMO and its amino acid sequence translation, respectively.

Figures 28 and 29 depict a nucleotide sequence encoding A17-AMO and its amino acid sequence translation, respectively.

Figures 30 and 31 depict a nucleotide sequence encoding Bov-AMO and its amino acid sequence translation, respectively.

Figure 32 depicts CadOH production in E. coli carrying the complete mevalonate pathway in addition to nucleotide sequences encoding CadOH, CPR, and CadS.

Figure 33 compares GC-MS chromatograms and mass spectra of artemisinic acid produced in E. coli expressing the complete amorphadiene pathway and containing either the pDUET-ctAACPR-A13AMO plasmid or the pCWori-A17AMO-ctAACPR plasmid.

Figure 34 depicts a GC-MS chromatogram showing the oxidation of artemisinic alcohol to artemisinic aldehyde in E. coli genetically modified with nucleic acids encoding the mevalonate pathway enzymes and amorphadiene synthase, and with the pCWori-A17AMO-ctAACPR plasmid.

Figures 35A and 35B depict a nucleotide sequence encoding acetoacetyl-CoA thiolase ("atoB"), HMGS, and truncated HMGR (tHMGR).

Figures 36A-D depict the nucleotide sequence of pMBIS.

Definitions

The terms "isoprenoid," "isoprenoid compound," "terpene," "terpene compound," "terpenoid," and "terpenoid compound" are used interchangeably herein. Isoprenoid compounds are made up of various numbers of so-called isoprene (C5) units. The number of carbon atoms present in an isoprenoid is typically evenly divisible by five (e.g., C5, C10, C15, C20, C25, C30 and C40). Irregular isoprenoids and polyterpenes have been reported and are also included in the definition of "isoprenoid." Isoprenoid compounds include, but are not limited to, monoterpenes, sesquiterpenes, triterpenes, polyterpenes, and diterpenes.
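The divisible-by-five rule in this definition can be illustrated with a short sketch. It is not part of the specification: the helper name `classify_isoprenoid`, the "unnamed class" fallback, and the choice to call everything above C40 a polyterpene are assumptions made for illustration, and the class names are standard terpene nomenclature.

```python
# Illustrative sketch: terpene class names keyed by the number of C5 isoprene units.
# The class names are standard nomenclature, not definitions taken from this document.
TERPENE_CLASSES = {
    1: "hemiterpene",    # C5
    2: "monoterpene",    # C10
    3: "sesquiterpene",  # C15
    4: "diterpene",      # C20
    5: "sesterterpene",  # C25
    6: "triterpene",     # C30
    8: "tetraterpene",   # C40
}


def classify_isoprenoid(carbon_count: int) -> str:
    """Name the terpene class for a carbon count (hypothetical helper)."""
    if carbon_count <= 0:
        raise ValueError("carbon_count must be positive")
    units, remainder = divmod(carbon_count, 5)
    if remainder != 0:
        # Not a whole number of C5 units: treated here as an irregular isoprenoid.
        return "irregular isoprenoid"
    if units in TERPENE_CLASSES:
        return TERPENE_CLASSES[units]
    # Assumption for this sketch: anything longer than C40 is called a polyterpene.
    return "polyterpene" if units > 8 else "unnamed class"


if __name__ == "__main__":
    for n in (10, 15, 20, 14):
        print(n, classify_isoprenoid(n))
```

Under these assumptions, a C15 skeleton such as cadinene is classified as a sesquiterpene and a C20 skeleton such as taxadiene as a diterpene.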

As used herein, the term "prenyl diphosphate" is used interchangeably with "prenyl pyrophosphate," and includes monoprenyl diphosphates having a single prenyl group (e.g., IPP and DMAPP), as well as polyprenyl diphosphates that include two or more prenyl groups. Monoprenyl diphosphates include isopentenyl pyrophosphate (IPP) and its isomer dimethylallyl pyrophosphate (DMAPP).

As used herein, the term "terpene synthase" refers to any enzyme that enzymatically modifies IPP, DMAPP, or a polyprenyl pyrophosphate such that a terpenoid precursor compound is produced. The term "terpene synthase" includes enzymes that catalyze the conversion of a prenyl diphosphate into an isoprenoid or isoprenoid precursor.

The word "pyrophosphate" is used interchangeably herein with "diphosphate." Thus, for example, the terms "prenyl diphosphate" and "prenyl pyrophosphate" are interchangeable; the terms "isopentenyl pyrophosphate" and "isopentenyl diphosphate" are interchangeable; the terms "farnesyl diphosphate" and "farnesyl pyrophosphate" are interchangeable; and so on.

The term "mevalonate pathway" or "MEV pathway" is used herein to refer to the biosynthetic pathway that converts acetyl-CoA to IPP. The mevalonate pathway comprises enzymes that catalyze the following steps: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA; (b) condensing acetoacetyl-CoA with acetyl-CoA to form HMG-CoA; (c) converting HMG-CoA to mevalonate; (d) phosphorylating mevalonate to mevalonate 5-phosphate; (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate; and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate. The mevalonate pathway is illustrated schematically in Figure 17. The "top half" of the mevalonate pathway refers to the enzymes responsible for the conversion of acetyl-CoA to mevalonate through MEV pathway intermediates.
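Steps (a) through (f) form a linear chain, and the "top half" is simply the prefix of that chain ending at mevalonate. The sketch below encodes this as data. The `Step` type and the helpers `is_linear` and `steps_through` are hypothetical names introduced for illustration, and each step lists only its main carbon-skeleton substrate and product (the second acetyl-CoA in step (b) and all cofactors are omitted).

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    label: str      # step letter as used in the definition above
    substrate: str  # main carbon-skeleton substrate only
    product: str


# Steps (a)-(f) of the mevalonate pathway as listed in the definition.
MEV_PATHWAY: List[Step] = [
    Step("a", "acetyl-CoA", "acetoacetyl-CoA"),
    Step("b", "acetoacetyl-CoA", "HMG-CoA"),
    Step("c", "HMG-CoA", "mevalonate"),
    Step("d", "mevalonate", "mevalonate 5-phosphate"),
    Step("e", "mevalonate 5-phosphate", "mevalonate 5-pyrophosphate"),
    Step("f", "mevalonate 5-pyrophosphate", "IPP"),
]


def is_linear(pathway: List[Step]) -> bool:
    """True if every step consumes the product of the step before it."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def steps_through(pathway: List[Step], end_product: str) -> List[Step]:
    """Return the leading steps up to and including the one that makes end_product."""
    for i, step in enumerate(pathway):
        if step.product == end_product:
            return pathway[: i + 1]
    raise ValueError(f"{end_product!r} is not produced by this pathway")


# The "top half": acetyl-CoA through mevalonate.
TOP_HALF = steps_through(MEV_PATHWAY, "mevalonate")

if __name__ == "__main__":
    print([s.label for s in TOP_HALF])
```

Representing the pathway as an ordered list keeps the "top half" definition a one-line slice rather than a separately maintained table.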

The term "1-deoxy-D-xylulose 5-diphosphate pathway" or "DXP pathway" is used herein to refer to the pathway that converts glyceraldehyde-3-phosphate and pyruvate to IPP and DMAPP through DXP pathway intermediates, where the DXP pathway comprises enzymes that catalyze the reactions depicted schematically in Figure 18.

As used herein, the term "prenyl transferase" is used interchangeably with the terms "isoprenyl diphosphate synthase" and "polyprenyl synthase" (e.g., "GPP synthase," "FPP synthase," "OPP synthase," etc.) to refer to an enzyme that catalyzes the consecutive 1'-4 condensation of isopentenyl diphosphate with allylic primer substrates, resulting in the formation of prenyl diphosphates of various chain lengths.

The terms "polynucleotide" and "nucleic acid," used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides. Thus, the term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.

The terms "peptide," "polypeptide," and "protein" are used interchangeably herein, and refer to a polymeric form of amino acids of any length, which can include coded and non-coded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones.

The term "naturally occurring," as used herein as applied to a nucleic acid, a cell, or an organism, refers to a nucleic acid, cell, or organism that is found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature, and that has not been intentionally modified by a human in the laboratory, is naturally occurring.

As used herein, the term "isolated" describes a polynucleotide, a polypeptide, or a cell that is in an environment different from that in which the polynucleotide, polypeptide, or cell naturally occurs. An isolated genetically modified host cell may be present in a mixed population of genetically modified host cells.

As used herein, the term "exogenous nucleic acid" refers to a nucleic acid that is not normally or naturally found in and/or produced by a given bacterium, organism, or cell in nature. As used herein, the term "endogenous nucleic acid" refers to a nucleic acid that is normally found in and/or produced by a given bacterium, organism, or cell in nature. An "endogenous nucleic acid" is also referred to as a "native nucleic acid" or a nucleic acid that is "native" to a given bacterium, organism, or cell. For example, the nucleic acids encoding HMGS, mevalonate kinase, and phosphomevalonate kinase are exogenous nucleic acids with respect to E. coli. These mevalonate pathway nucleic acids were cloned from Saccharomyces cerevisiae. In S. cerevisiae, the chromosomal gene sequences encoding HMGS, MK, and PMK are "endogenous" nucleic acids.

The term "heterologous nucleic acid," as used herein, refers to a nucleic acid for which at least one of the following is true: (a) the nucleic acid is foreign ("exogenous") to (i.e., not naturally found in) a given host microorganism or host cell; (b) the nucleic acid comprises a nucleotide sequence that is naturally found in (e.g., is "endogenous to") a given host microorganism or host cell (e.g., the nucleic acid comprises a nucleotide sequence endogenous to the host microorganism or host cell), but either the nucleic acid is produced in an unnatural amount in the cell (e.g., greater than expected or greater than naturally found), or the nucleic acid differs in sequence from the endogenous nucleotide sequence such that the same encoded protein (having the same or substantially the same amino acid sequence) as found endogenously is produced in an unnatural amount in the cell (e.g., greater than expected or greater than naturally found); (c) the nucleic acid comprises two or more nucleotide sequences or segments that are not found in the same relationship to each other in nature, e.g., the nucleic acid is recombinant.

The term "heterologous polypeptide," as used herein, refers to a polypeptide that is not naturally associated with a given polypeptide. For example, an isoprenoid precursor-modifying enzyme that comprises a "heterologous transmembrane domain" is an isoprenoid precursor-modifying enzyme comprising a transmembrane domain that is not normally associated with (e.g., not normally contiguous with; not normally found in the same polypeptide chain as) the isoprenoid precursor-modifying enzyme in nature. Similarly, an isoprenoid precursor-modifying enzyme that comprises one or more of a "heterologous secretion domain," a "heterologous membrane-inserting polypeptide," and a "heterologous solubilization domain" is an isoprenoid precursor-modifying enzyme comprising one or more of a secretion domain, a membrane-inserting polypeptide, and a solubilization domain that is not normally associated with (e.g., not normally contiguous with; not normally found in the same polypeptide chain as) the isoprenoid precursor-modifying enzyme in nature.

"Recombinant," as used herein, means that a particular nucleic acid (DNA or RNA) is the product of various combinations of cloning, restriction, and/or ligation steps resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems. Generally, DNA sequences encoding the structural coding sequence can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid that is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Such sequences can be provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5' or 3' of the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms (see "DNA regulatory sequences," below).

Thus, for example, the term "recombinant" polynucleotide or "recombinant" nucleic acid refers to one that is not naturally occurring, e.g., one made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished either by chemical synthesis or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. This is usually done to replace a codon with a redundant codon encoding the same or a conservative amino acid, typically while introducing or removing a sequence recognition site. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.
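Replacing a codon with a redundant (synonymous) codon, as described above, can be sketched in a few lines. This is an illustration only and not the codon optimization procedure used for any sequence in this document. The function names `translate` and `recode` and the `PREFERRED` mapping are hypothetical, and the two preferred codons are placeholders rather than a real codon-usage table. The codon table itself is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given."""
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    recoded = "".join(preferred.get(CODON_TABLE[c], c) for c in codons)
    # Synonymous substitution must leave the encoded protein unchanged.
    assert translate(recoded) == translate(dna)
    return recoded


# Placeholder preferences for illustration only, not a measured usage table.
PREFERRED = {"L": "CTG", "R": "CGT"}

if __name__ == "__main__":
    original = "ATGTTACGA"  # encodes Met-Leu-Arg
    print(original, "->", recode(original, PREFERRED))
```

The final assertion in `recode` captures the point of the definition: the nucleotide sequence changes while the encoded amino acid sequence stays the same.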

类似地，术语“重组”多肽指不是天然产生的，而是(例如)通过人工介入人为地组合两个分离的氨基酸序列节段而制备的多肽。因此，例如，含有异源氨基酸序列的多肽是重组多肽。Similarly, the term "recombinant" polypeptide refers to a polypeptide that does not occur in nature but has been prepared by the artificial combination of two separate segments of amino acid sequence, eg, by human intervention. Thus, for example, a polypeptide comprising a heterologous amino acid sequence is a recombinant polypeptide.

“构建物”或“载体”指重组核酸，通常是重组DNA，产生重组DNA的目的是表达和/或增殖特定的核苷酸序列，或用其构建其它重组核苷酸序列。"Construct" or "vector" refers to a recombinant nucleic acid, usually recombinant DNA, produced for the purpose of expressing and/or propagating a specific nucleotide sequence, or using it to construct other recombinant nucleotide sequences.

本文所用术语“操纵子”和“单转录单元”可互换使用，指受一个或多个控制元件(如启动子)协同调节的两个或多个毗连编码区(编码基因产物如RNA或蛋白质的核苷酸序列)。本文所用术语“基因产物”指DNA编码的RNA(反之亦然)或者RNA或DNA编码的蛋白质，其中基因一般包含编码蛋白质的一种或多种核苷酸序列，也可包括内含子和其它非编码核苷酸序列。As used herein, the terms "operon" and "single transcription unit" are used interchangeably to refer to two or more contiguous coding regions (encoding a gene product such as RNA or protein) that are cooperatively regulated by one or more control elements (such as a promoter). nucleotide sequence). The term "gene product" as used herein refers to RNA encoded by DNA (and vice versa) or protein encoded by RNA or DNA, where a gene generally comprises one or more nucleotide sequences encoding a protein and may also include introns and other non-coding nucleotide sequences.

本文中术语“DNA调控序列”、“控制元件”和“调控元件”可互换使用，指转录或翻译控制序列，如启动子、增强子、聚腺苷酸化信号、终止子、蛋白降解信号等，它们提供和/或调控宿主细胞中编码序列的表达和/或编码多肽的产生。The terms "DNA regulatory sequence", "control element" and "regulatory element" are used interchangeably herein to refer to transcriptional or translational control sequences such as promoters, enhancers, polyadenylation signals, terminators, protein degradation signals, etc. , which provide and/or regulate expression of a coding sequence and/or production of an encoded polypeptide in a host cell.

本文中术语“转化”与“遗传修饰”可互换使用，指引入新核酸(即细胞外源性DNA)后永久或瞬时诱导的细胞遗传改变。可通过将新DNA掺入宿主细胞基因组，或通过将新DNA瞬时或稳定地维持为外遗元件实现遗传改变(“修饰”)。当细胞是真核细胞时，通常通过将DNA引入细胞基因组实现永久遗传改变。在原核细胞中，可将永久改变引入染色体或通过染色体外元件如质粒和表达载体引入，染色体外元件可包含一个或多个可选择标记以帮助将它们维持在重组宿主细胞中。遗传修饰的合适方法包括病毒感染、转染、接合、原生质体融合、电穿孔、基因枪技术、磷酸钙沉淀、直接显微注射等。对方法的选择通常取决于待转化的细胞类型和发生转化的环境(即体外、离体或体内)。对这些方法的总体讨论可参见Ausubel等，Short Protocols in Molecular Biology(分子生物学简单方法)，第3版，维森出版集团(Wiley and Sons)，1995。The term "transformation" and "genetic modification" are used interchangeably herein to refer to a permanent or transient genetic change in a cell induced following the introduction of new nucleic acid (ie, DNA exogenous to the cell). Genetic alteration ("modification") can be achieved by the incorporation of new DNA into the genome of the host cell, or by the transient or stable maintenance of the new DNA as an epigenetic element. When the cells are eukaryotic, permanent genetic changes are usually achieved by introducing DNA into the genome of the cells. In prokaryotic cells, permanent changes can be introduced into the chromosome or through extrachromosomal elements such as plasmids and expression vectors, which can contain one or more selectable markers to aid in their maintenance in recombinant host cells. Suitable methods of genetic modification include viral infection, transfection, conjugation, protoplast fusion, electroporation, biolistic techniques, calcium phosphate precipitation, direct microinjection, and the like. The choice of method will generally depend on the type of cell to be transformed and the environment in which transformation occurs (ie, in vitro, ex vivo or in vivo). A general discussion of these methods can be found in Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., Wiley and Sons, 1995.

"Operably linked" refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of that coding sequence. As used herein, the terms "heterologous promoter" and "heterologous control regions" refer to promoters and other control regions that are not normally associated with a particular nucleic acid in nature. For example, a "transcriptional control region heterologous to a coding region" is a transcriptional control region that is not normally associated with the coding region in nature.

A "host cell," as used herein, denotes an in vivo or in vitro eukaryotic cell, a prokaryotic cell, or a cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, which eukaryotic or prokaryotic cell can be, or has been, used as a recipient for a nucleic acid (e.g., an expression vector comprising a nucleotide sequence encoding one or more biosynthetic pathway gene products, such as mevalonate pathway gene products). The term includes the progeny of the original cell that has been genetically modified with the nucleic acid. It is understood that the progeny of a single cell may not necessarily be completely identical in morphology or in genomic or total DNA complement to the original parent, owing to natural, accidental, or deliberate mutation. A "recombinant host cell" (also referred to as a "genetically modified host cell") is a host cell into which a heterologous nucleic acid, e.g., an expression vector, has been introduced. For example, a prokaryotic host cell is a genetically modified prokaryotic host cell (e.g., a bacterium) by virtue of the introduction into a suitable prokaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to (not normally found in nature in) the prokaryotic host cell, or a recombinant nucleic acid that is not normally found in the prokaryotic host cell. Likewise, a eukaryotic host cell is a genetically modified eukaryotic host cell by virtue of the introduction into a suitable eukaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to the eukaryotic host cell, or a recombinant nucleic acid that is not normally found in the eukaryotic host cell.

The term "conservative amino acid substitution" refers to the interchangeability in proteins of amino acid residues having similar side chains. For example, a group of amino acids having aliphatic side chains consists of glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains consists of serine and threonine; a group of amino acids having amide-containing side chains consists of asparagine and glutamine; a group of amino acids having aromatic side chains consists of phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains consists of lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains consists of cysteine and methionine. Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
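The grouping above can be expressed as a small lookup. The following is a minimal Python sketch, assuming one-letter residue codes and treating any two different residues from the same listed side-chain group as a conservative pair; the function names are illustrative and are not part of the disclosure.

```python
# Side-chain groups exactly as listed in the definition above (one-letter codes).
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}


def is_conservative(a: str, b: str) -> bool:
    """Return True if a and b are different residues from the same listed group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in SIDE_CHAIN_GROUPS.values())


def classify_substitutions(wild_type: str, variant: str):
    """List (position, wild-type residue, variant residue, conservative?) per change.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(wild_type) != len(variant):
        raise ValueError("sequences must have the same length")
    changes = []
    for pos, (w, v) in enumerate(zip(wild_type.upper(), variant.upper()), start=1):
        if w != v:
            changes.append((pos, w, v, is_conservative(w, v)))
    return changes
```

For instance, `classify_substitutions("MKVLA", "MRVDA")` reports a conservative lysine-to-arginine change at position 2 and a non-conservative leucine-to-aspartate change at position 4. Residues that appear in none of the listed groups (such as aspartate) are never scored as conservative by this sketch.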

"Synthetic nucleic acids" can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments, which are then enzymatically assembled to construct the entire gene. "Chemically synthesized," as applied to a DNA sequence, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. The nucleotide sequence of a nucleic acid can be modified for optimal expression by adjusting the sequence to reflect the codon bias of the host cell. The skilled artisan appreciates that successful expression is more likely if codon usage is biased toward the codons favored by the host. Preferred codons can be determined from a survey of genes derived from the host cell, where sequence information is available.
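The codon-bias adjustment described above can be sketched as a back-translation that assigns each amino acid one preferred codon. The Python example below is a minimal illustration only: the `PREFERRED_CODON` table is an assumed, illustrative set of codons for an E. coli-like host and is not taken from this disclosure; in practice the table would be derived from a survey of genes of the actual host cell.

```python
# Illustrative preferred-codon table (assumption, not from the disclosure).
# Each entry is a valid standard-genetic-code codon for that amino acid.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, table=PREFERRED_CODON, stop: str = "TAA") -> str:
    """Encode a protein with one preferred codon per residue and append a stop codon."""
    codons = []
    for pos, aa in enumerate(protein.upper(), start=1):
        if aa not in table:
            raise ValueError(f"no codon for residue {aa!r} at position {pos}")
        codons.append(table[aa])
    return "".join(codons) + stop


if __name__ == "__main__":
    # Hypothetical five-residue peptide used only to exercise the function.
    print(back_translate("MKVLA"))
```

Using a single codon per amino acid is the simplest possible strategy; it ignores factors such as mRNA secondary structure and repeated sequence, which a real design would also need to consider.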

A polynucleotide or polypeptide has a certain percent "sequence identity" to another polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same, and in the same relative position, when comparing the two sequences. Sequence similarity can be determined in a number of different ways. To determine sequence identity, sequences can be aligned using methods and computer programs, including BLAST, available over the Internet at ncbi.nlm.nih.gov/BLAST. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10. Another alignment algorithm is FASTA, available in the Genetics Computing Group (GCG) package, from Madison, Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other alignment techniques are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, California, USA. Of particular interest are alignment programs that permit gaps in the sequence. Smith-Waterman is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70:173-187 (1997). Also, the GAP program using the Needleman and Wunsch alignment method can be used to align sequences. See J. Mol. Biol. 48:443-453 (1970).
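As an illustration of percent identity computed from a gapped alignment, the following Python sketch implements a basic Needleman and Wunsch style global alignment. The scoring values (match +1, mismatch -1, linear gap -1) and the choice of alignment length as the denominator are assumptions made for this example; programs such as BLAST and GAP use substitution matrices and affine gap penalties and may report identity differently.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return the two gapped strings ('-' marks a gap)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    x, y = needleman_wunsch(a, b)
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)
```

With these settings, `needleman_wunsch("ACGT", "AGT")` aligns `ACGT` against `A-GT`, so three of the four alignment columns are identical.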

A nucleic acid is "hybridizable" to another nucleic acid, such as a cDNA, genomic DNA, or RNA, when a single-stranded form of the nucleic acid can anneal to the other nucleic acid under appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known; see, e.g., Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein; and Sambrook, J. and Russell, W., Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (2001). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, through to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Hybridization conditions and post-hybridization washes are used to obtain the desired stringency of hybridization. One illustrative set of post-hybridization washes is a series of washes starting with 6×SSC (where SSC is 0.15 M NaCl and 15 mM citrate buffer), 0.5% SDS at room temperature for 15 minutes, then repeated with 2×SSC, 0.5% SDS at 45°C for 30 minutes, and then repeated twice with 0.2×SSC, 0.5% SDS at 50°C for 30 minutes. Other stringent conditions are obtained by using higher temperatures, in which the washes are identical to those above except that the temperature of the final two 30-minute washes in 0.2×SSC, 0.5% SDS is increased to 60°C. Another set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65°C. Another example of stringent hybridization conditions is hybridization at 50°C or higher in 0.1×SSC (15 mM sodium chloride/1.5 mM sodium citrate). Another example of stringent hybridization conditions is overnight incubation at 42°C in a solution of 50% formamide, 5×SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1×SSC at about 65°C. Stringent hybridization conditions and post-hybridization wash conditions are hybridization conditions and post-hybridization wash conditions that are at least as stringent as the above representative conditions.
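The salt concentrations implied by the wash series can be worked out from the 1×SSC definition given above (0.15 M NaCl, 15 mM citrate). The Python sketch below is a minimal calculator under that definition; the 20× stock strength used in `stock_volume_ml` is an assumption for illustration and is not specified in this document.

```python
# 1x SSC as defined in the text: 0.15 M NaCl and 15 mM citrate (values in mM).
SSC_1X_MM = {"NaCl": 150.0, "citrate": 15.0}

# Illustrative wash series from the text: (SSC fold, % SDS, temperature, minutes).
WASH_SERIES = [
    (6.0, 0.5, "room temperature", 15),
    (2.0, 0.5, "45 C", 30),
    (0.2, 0.5, "50 C", 30),
    (0.2, 0.5, "50 C", 30),
]


def ssc_concentrations(fold: float) -> dict:
    """Millimolar NaCl and citrate in an SSC solution of the given fold strength."""
    return {name: round(mm * fold, 3) for name, mm in SSC_1X_MM.items()}


def stock_volume_ml(fold: float, final_volume_ml: float, stock_fold: float = 20.0) -> float:
    """Volume of concentrated SSC stock needed (C1*V1 = C2*V2); 20x stock is assumed."""
    return final_volume_ml * fold / stock_fold


if __name__ == "__main__":
    for fold, sds, temperature, minutes in WASH_SERIES:
        conc = ssc_concentrations(fold)
        print(f"{fold}x SSC, {sds}% SDS, {temperature}, {minutes} min: "
              f"{conc['NaCl']} mM NaCl, {conc['citrate']} mM citrate")
```

Scaling the 1× definition reproduces the "15 mM sodium chloride/1.5 mM sodium citrate" stated for 0.1×SSC. The same scaling gives 750 mM NaCl and 75 mM citrate for 5×SSC, so the parenthetical "150 mM NaCl, 15 mM trisodium citrate" given for the overnight hybridization solution corresponds to the 1× composition.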

Hybridization requires that the two nucleic acids contain complementary sequences, although, depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementarity, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the melting temperature (Tm) for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybrids decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). Typically, the length of a hybridizable nucleic acid is at least about 10 nucleotides. Illustrative minimum lengths for a hybridizable nucleic acid are: at least about 15 nucleotides; at least about 20 nucleotides; and at least about 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as the length of the probe.

Before the present invention is further described, it is to be understood that this invention is not limited to the particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

It must be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cytochrome P450 enzyme" includes a plurality of such enzymes, and reference to "a cytochrome P450 reductase" includes reference to one or more cytochrome P450 reductases and equivalents thereof known to those skilled in the art, and so forth. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for the use of exclusive terminology such as "solely," "only," and the like in connection with the recitation of claim elements, or for the use of a "negative" limitation.

The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may differ from the actual publication dates, which may need to be independently confirmed.

Detailed description of the invention

The present invention provides nucleic acids comprising nucleotide sequences encoding isoprenoid precursor-modifying enzymes, as well as recombinant vectors and host cells comprising the nucleic acids. The present invention provides methods of producing an enzymatically active isoprenoid precursor-modifying enzyme in a host cell. The present invention further provides methods of producing an isoprenoid compound in a host cell genetically modified with a nucleic acid comprising a nucleotide sequence encoding an isoprenoid precursor-modifying enzyme.

Nucleic acids, vectors, and host cells

The present invention provides nucleic acids comprising nucleotide sequences encoding modified cytochrome P450 enzymes, as well as recombinant vectors and host cells comprising the nucleic acids. The present invention likewise provides nucleic acids comprising nucleotide sequences encoding isoprenoid precursor-modifying enzymes, as well as recombinant vectors and host cells comprising those nucleic acids.

As used herein, the term "modified cytochrome P450 enzyme" refers to an enzyme that modifies (e.g., "functionalizes") an intermediate of a biosynthetic pathway. A modified cytochrome P450 enzyme encoded by a nucleic acid of the invention catalyzes one or more of the following reactions: hydroxylation, oxidation, epoxidation, dehydration, dehydrogenation, dehalogenation, isomerization, alcohol oxidation, aldehyde oxidation, dealkylation, and C-C bond cleavage. Such reactions are referred to collectively herein as "biosynthetic pathway intermediate modification reactions." For these reactions, see, e.g., Sono et al. (1996) Chem. Rev. 96:2841-2887; see, e.g., Figure 3 of Sono et al. for a schematic representation of such reactions.

In some embodiments, the modified cytochrome P450 enzyme is an isoprenoid precursor-modifying enzyme. As used herein, the term "isoprenoid precursor-modifying enzyme" is used interchangeably with "isoprenoid-modifying enzyme" and refers to an enzyme that modifies an isoprenoid precursor compound, e.g., an enzyme that uses an isoprenoid precursor compound as a substrate and catalyzes one or more of the following reactions: hydroxylation, epoxidation, oxidation, dehydration, dehydrogenation, dehalogenation, isomerization, alcohol oxidation, aldehyde oxidation, dealkylation, and C-C bond cleavage. Such reactions are referred to collectively herein as "isoprenoid precursor modification reactions." For these reactions, see, e.g., Sono et al. (1996) supra; see, e.g., Figure 3 of Sono et al. for a schematic representation of such reactions. In many embodiments, the isoprenoid precursor-modifying enzyme is a cytochrome P450 enzyme. See, e.g., Sono et al. (1996) supra.

Substrates of modified cytochrome P450 enzymes

As noted above, a substrate of a modified cytochrome P450 enzyme is an intermediate in a biosynthetic pathway. Exemplary intermediates include, but are not limited to: isoprenoid precursors; alkaloid precursors; phenylpropanoid precursors; flavonoid precursors; steroid precursors; polyketide precursors; macrolide precursors; sugar alcohol precursors; phenolic compound precursors; and the like. See, e.g., Hwang et al. (2003) Appl. Environ. Microbiol. 69:2699-2706; Facchini et al. (2004) TRENDS Plant Sci. 9:116.

Biosynthetic pathway products of interest include, but are not limited to: isoprenoid compounds, alkaloid compounds, phenylpropanoid compounds, flavonoid compounds, steroid compounds, polyketide compounds, macrolide compounds, sugar alcohols, phenolic compounds, and the like.

Alkaloid compounds are a large, diverse group of natural products found in about 20% of plant species. They are generally defined by the occurrence of a nitrogen atom in an oxidative state within a heterocyclic ring. Alkaloid compounds include benzylisoquinoline alkaloid compounds, indole alkaloid compounds, isoquinoline alkaloid compounds, and the like. Alkaloid compounds include monocyclic alkaloid compounds, bicyclic alkaloid compounds, tricyclic alkaloid compounds, tetracyclic alkaloid compounds, and alkaloid compounds having cage structures. Alkaloid compounds include: 1) pyridine group: piperine, coniine, trigonelline, arecaidine, guvacine, pilocarpine, cytisine, sparteine, pelletierine; 2) pyrrolidine group: hygrine, nicotine, cuscohygrine; 3) tropane group: atropine, cocaine, ecgonine, pelletierine, scopolamine; 4) quinoline group: quinine, dihydroquinine, quinidine, dihydroquinidine, strychnine, brucine, and the veratrum alkaloids (e.g., veratrine, cevadine); 5) isoquinoline group: morphine, codeine, thebaine, papaverine, narcotine, narceine, hydrastine, and berberine; 6) phenethylamine group: methamphetamine, mescaline, ephedrine; 7) indole group: tryptamines (e.g., dimethyltryptamine, psilocybin, serotonin), ergolines (e.g., ergine, ergotamine, lysergic acid, etc.), and β-carbolines (e.g., harmine, yohimbine, reserpine, emetine); 8) purine group: xanthines (e.g., caffeine, theobromine, theophylline); 9) terpenoid group: aconite alkaloids (e.g., aconitine) and steroids (e.g., solanine, samandarin); 10) betaine group: quaternary ammonium compounds (e.g., muscarine, choline, neurine); and 11) pyrazole group: pyrazole, fomepizole. Exemplary alkaloid compounds are morphine, berberine, vinblastine, vincristine, cocaine, scopolamine, caffeine, nicotine, atropine, papaverine, emetine, quinine, reserpine, codeine, serotonin, etc. See, e.g., Facchini et al. (2004) Trends Plant Science 9:116.

Isoprenoid-modifying enzyme substrates

The term "isoprenoid precursor compound" is used interchangeably with "isoprenoid precursor substrate" and refers to a compound that is the product of the reaction of a terpene synthase on a polyprenyl diphosphate. The product of a terpene synthase (also referred to as a "terpene cyclase") reaction is the so-called "terpene skeleton." In some embodiments, the isoprenoid-modifying enzyme catalyzes the modification of a terpene skeleton, or of a downstream product thereof. Thus, in some embodiments, the isoprenoid precursor is a terpene skeleton. Isoprenoid precursor substrates of an isoprenoid precursor-modifying enzyme include monoterpenes, diterpenes, triterpenes, and sesquiterpenes.

Monoterpene substrates of an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention include, but are not limited to, any monoterpene substrate that yields an oxidation product that is a monoterpene compound or is an intermediate in a biosynthetic pathway that gives rise to a monoterpene compound. Exemplary monoterpene substrates include, but are not limited to, monoterpene substrates that fall into any of the following families: acyclic monoterpenes, dimethyloctanes, menthanes, irregular monoterpenoids, cineols, camphanes, isocamphanes, monocyclic monoterpenes, pinanes, fenchanes, thujanes, caranes, ionones, iridanes, and cannabinoids. Exemplary monoterpene substrates, intermediates, and products include, but are not limited to, limonene, citronellol, geraniol, menthol, perillyl alcohol, linalool, and thujone.

Diterpene substrates of an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention include, but are not limited to, any diterpene substrate that yields an oxidation product that is a diterpene compound or is an intermediate in a biosynthetic pathway that gives rise to a diterpene compound. Exemplary diterpene substrates include, but are not limited to, diterpene substrates that fall into any of the following families: acyclic diterpenoids, bicyclic diterpenoids, monocyclic diterpenoids, labdanes, clerodanes, taxanes, tricyclic diterpenoids, tetracyclic diterpenoids, kaurenes, beyerenes, atiserenes, aphidicolins, grayanotoxins, gibberellins, macrocyclic diterpenes, and elizabethatrianes. Exemplary diterpene substrates, intermediates, and products include, but are not limited to, casbene, eleutherobin, paclitaxel, prostratin, and pseudopterosin.

Triterpene substrates of an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention include, but are not limited to, any triterpene substrate that yields an oxidation product that is a triterpene compound or is an intermediate in a biosynthetic pathway that gives rise to a triterpene compound. Exemplary triterpene substrates, intermediates, and products include, but are not limited to, abrusoside E, bruceantin, testosterone, progesterone, cortisone, and digitoxin.

Sesquiterpene substrates of an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention include, but are not limited to, any sesquiterpene substrate that yields an oxidation product that is a sesquiterpene compound or is an intermediate in a biosynthetic pathway that gives rise to a sesquiterpene compound. Exemplary sesquiterpene substrates include, but are not limited to, sesquiterpene substrates that fall into any of the following families: farnesanes, monocyclofarnesanes, monocyclic sesquiterpenes, bicyclic sesquiterpenes, bicyclofarnesanes, bisabolanes, santalanes, cuparanes, herbertanes, gymnomitranes, trichothecanes, chamigranes, carotanes, acoranes, antisatins, cadinanes, oplopananes, copaanes, picrotoxanes, himachalanes, longipinanes, longicyclanes, caryophyllanes, modhephanes, siphiperfolanes, humulanes, integrifolianes, lippifolianes, protoilludanes, illudanes, hirsutanes, lactaranes, sterpuranes, fomannosanes, marasmanes, germacranes, elemanes, eudesmanes, bakkanes, chilosyphanes, guaianes, pseudoguaianes, tricyclic sesquiterpenes, patchoulanes, trixanes, aromadendranes, gorgonanes, nardosinanes, brasilanes, pinguisanes, sesquipinanes, sesquicamphanes, thujopsanes, bicyclohumulanes, alliacanes, sterpuranes, lactaranes, africanes, integrifolianes, protoilludanes, aristolanes, and neolemnanes. Exemplary sesquiterpene substrates include, but are not limited to: amorphadiene, alloisolongifolene, (-)-α-trans-bergamotene, (-)-β-elemene, (+)-germacrene A, germacrene B, (+)-γ-gurjunene, (+)-ledene, neointermedeol, (+)-β-selinene, and (+)-valencene.

Modifications

A nucleic acid of the invention comprises a nucleotide sequence encoding a modified cytochrome P450 enzyme. In many embodiments, the modified cytochrome P450 enzyme encoded by a nucleic acid of the invention comprises a non-native (non-wild-type, non-naturally occurring, or variant) amino acid sequence. The encoded modified cytochrome P450 enzyme comprises one or more amino acid sequence modifications (deletions, additions, insertions, substitutions) that increase the level of activity of the modified cytochrome P450 enzyme in a host cell genetically modified with the nucleic acid and/or increase the level of a given product of a biosynthetic pathway produced by a host cell genetically modified with the nucleic acid.

In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a modified isoprenoid precursor-modifying enzyme. In many embodiments, the isoprenoid precursor-modifying enzyme encoded by a nucleic acid of the invention comprises a non-native (non-wild-type, non-naturally occurring, or variant) amino acid sequence. The encoded isoprenoid precursor-modifying enzyme will comprise one or more amino acid sequence modifications (deletions, additions, insertions, substitutions) that increase the level of activity of the isoprenoid precursor-modifying enzyme in a host cell genetically modified with the nucleic acid and/or increase the level of a given isoprenoid compound produced by a host cell genetically modified with the nucleic acid. In some embodiments, the encoded isoprenoid precursor-modifying enzyme will comprise one or more of the following modifications relative to a wild-type isoprenoid precursor-modifying enzyme: a) substitution of the native transmembrane domain with a non-native transmembrane domain; b) substitution of the native transmembrane domain with a secretion signal domain; c) substitution of the native transmembrane domain with a solubilization domain; d) substitution of the native transmembrane domain with a membrane insertion domain; e) truncation of the native transmembrane domain; and f) a change in the amino acid sequence of the native transmembrane domain.

In many embodiments, a nucleic acid of the invention comprises, in order from 5' to 3' and in operable linkage, a nucleotide sequence encoding a first domain and a nucleotide sequence encoding the catalytic domain of a modified P450 enzyme (e.g., an isoprenoid precursor-modifying enzyme). The first domain is selected from a transmembrane domain, a secretion domain, a solubilization domain, and a membrane-inserting protein, and is heterologous to the catalytic domain. In some embodiments, the first domain comprises a secretion signal and a transmembrane domain.

Non-native transmembrane domain

In some embodiments, the encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) will comprise a non-native (e.g., heterologous) transmembrane domain. Suitable non-native transmembrane domains are generally selected from transmembrane domains that are functional in a given host cell. In some embodiments, the non-native transmembrane domain is one that is functional in a prokaryotic host cell. In other embodiments, the non-native transmembrane domain is one that is functional in a eukaryotic host cell.

For example, in many embodiments, for expression in Escherichia coli, the non-native transmembrane domain comprises one of the following amino acid sequences:

NH₂-MWLLLIAVFLLTLAYLFWP-COOH (SEQ ID NO: 1);

NH₂-MALLLAVFLGLSCLLLLSLW-COOH (SEQ ID NO: 2);

NH₂-MAILAAIFALVVATATRV-COOH (SEQ ID NO: 3);

NH₂-MDASLLLSVALAVVLIPLSLALLN-COOH (SEQ ID NO: 4); and

NH₂-MIEQLLEYWYVVVPVLYIIKQLLAYTK-COOH (SEQ ID NO: 5).

Secretion signal

In some embodiments, the encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) will comprise a non-native amino acid sequence that provides for secretion of the fusion protein from the cell. Such secretion signal sequences are known to those skilled in the art. Secretion signals suitable for use in bacteria include, but are not limited to: the secretion signal of Braun's lipoprotein of Escherichia coli, Serratia marcescens, Erwinia amylovora, Morganella morganii, and Proteus mirabilis; the TraT protein of E. coli and Salmonella; the penicillinase (PenP) protein of Bacillus licheniformis, Bacillus cereus, and Staphylococcus aureus; the pullulanase proteins of Klebsiella pneumoniae and Klebsiella aerogenes; the E. coli lipoproteins 1pp-28, Pal, RplA, RplB, OsmB, NIpB, and Orl17; the chitinase protein of Vibrio harveyi; the β-1,4-endoglucanase protein of Pseudomonas solanacearum; the Pal and Pcp proteins of Haemophilus influenzae; the OprI protein of Pseudomonas aeruginosa; the MalX and AmiA proteins of Streptococcus pneumoniae; the 34 kDa antigen and TpmA protein of Treponema pallidum; the P37 protein of Mycoplasma hyorhinis; the neutral protease of Bacillus amyloliquefaciens; the 17 kDa antigen of Rickettsia rickettsii; the malE maltose-binding protein; the rbsB ribose-binding protein; phoA alkaline phosphatase; and the OmpA secretion signal (see, e.g., Tanji et al. (1991) J. Bacteriol. 173(6):1997-2005). Secretion signal sequences suitable for use in yeast are known in the art and can be used; see, e.g., U.S. Patent No. 5,712,113. For the rbsB, malE, and phoA secretion signals, see, e.g., Collier (1994) J. Bacteriol. 176:3013.

In some embodiments, e.g., for expression in a prokaryotic host cell such as E. coli, the secretion signal will comprise one of the following amino acid sequences:

NH₂-MKKTAIAIAVALAGFATVAQA-COOH (SEQ ID NO: 6);

NH₂-MKKTAIAIVVALAGFATVAQA-COOH (SEQ ID NO: 7);

NH₂-MKKTALALAVALAGFATVAQA-COOH (SEQ ID NO: 8);

NH₂-MKIKTGARILALSALTTMMFSASALA-COOH (SEQ ID NO: 9);

NH₂-MNMKKLATLVSAVALSATVSANAMA-COOH (SEQ ID NO: 10); and

NH₂-MKQSTIALALLPLLFTPVTKA-COOH (SEQ ID NO: 11).

In some embodiments, the encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) will comprise both a non-native secretion signal sequence and a heterologous transmembrane domain. Any combination of secretion signal sequence and heterologous transmembrane domain can be used.

As one non-limiting example, in some embodiments a heterologous domain comprising a non-native secretion signal sequence and a heterologous transmembrane domain has the following amino acid sequence: NH₂-MKKTAIAIAVALAGFATVAQALLEYWYVVVPVLYIIKQLLAYTK-COOH (SEQ ID NO: 12), where the transmembrane domain is the C-terminal portion LLEYWYVVVPVLYIIKQLLAYTK and the secretion signal is N-terminal to the transmembrane domain.
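The composition of SEQ ID NO: 12 can be checked with a short script. The following is a minimal Python sketch and not part of the original disclosure; the variable and function names are illustrative assumptions. It splits the combined domain at the end of the secretion signal of SEQ ID NO: 6 and compares the remainder with the transmembrane domain of SEQ ID NO: 5. The observation that the remainder matches SEQ ID NO: 5 without its first four residues (MIEQ) is derived from the listed sequences, not stated in the text.

```python
from typing import Tuple

# Sequences copied from the listings above (single-letter amino acid code).
SECRETION_SIGNAL_SEQ6 = "MKKTAIAIAVALAGFATVAQA"
TRANSMEMBRANE_SEQ5 = "MIEQLLEYWYVVVPVLYIIKQLLAYTK"
COMBINED_SEQ12 = "MKKTAIAIAVALAGFATVAQALLEYWYVVVPVLYIIKQLLAYTK"


def split_combined(combined: str, signal: str) -> Tuple[str, str]:
    """Split a combined domain into (secretion signal, remainder).

    Raises ValueError if the combined domain does not start with the signal.
    """
    if not combined.startswith(signal):
        raise ValueError("combined domain does not begin with the signal")
    return signal, combined[len(signal):]


signal_part, tm_part = split_combined(COMBINED_SEQ12, SECRETION_SIGNAL_SEQ6)

# The remainder is SEQ ID NO: 5 with its leading "MIEQ" removed.
remainder_matches_seq5_core = tm_part == TRANSMEMBRANE_SEQ5[4:]
```

The same helper could be applied to any other signal and transmembrane pairing, since the text permits any combination of the two.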

Solubilization domain

In some embodiments, the encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) will comprise a non-native domain that solubilizes the protein.

In some embodiments, the solubilization domain will comprise one of the following amino acid sequences:

NH₂-EELLKQALQQAQQLLQQAQELAKK-COOH (SEQ ID NO: 13);

NH₂-MTVHDIIATYFTKWYVIVPLALIAYRVLDYFY-COOH (SEQ ID NO: 14);

NH₂-GLFGAIAGFIEGGWTGMIDGWYGYGGGKK-COOH (SEQ ID NO: 15); and

NH₂-MAKKTSSKG-COOH (SEQ ID NO: 16).

Membrane insertion domain

In some embodiments, the encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) will comprise a non-native amino acid sequence that provides for insertion into a membrane. In some embodiments, the encoded modified cytochrome P450 enzyme is a fusion polypeptide comprising a heterologous fusion partner (e.g., a protein other than a cytochrome P450 enzyme) fused in frame at the amino terminus or the carboxyl terminus, where the fusion partner provides for insertion of the fusion protein into a biological membrane.

In some embodiments, the fusion partner is a mistic protein, e.g., a protein comprising the amino acid sequence depicted in Figure 5 (GenBank Accession No. AY874162). A nucleotide sequence encoding the mistic protein is also provided under GenBank Accession No. AY874162. Other polypeptides that provide for insertion into a biological membrane are known in the art; see, e.g., Woolhead et al. (J. Biol. Chem. 276(18):14607), describing PsbW; and Kuhn (FEMS Microbiology Reviews 17 (1992i) 285), describing the M12 procoat protein and the Pf3 procoat protein.

Cytochrome P450 enzymes

In many embodiments, the encoded isoprenoid precursor-modifying enzyme is a cytochrome P450 enzyme. The encoded cytochrome P450 enzyme will carry out one or more of the following reactions: hydroxylation, epoxidation, oxidation, dehydration, dehydrogenation, dehalogenation, isomerization, alcohol oxidation, aldehyde oxidation, dealkylation, and C-C bond cleavage. Such reactions are referred to collectively herein as "biosynthetic pathway intermediate modification reactions" or, in particular embodiments, as "isoprenoid precursor modification reactions." For these reactions see, e.g., Sono et al. (1996) supra, in particular Figure 3 of Sono et al., which gives a schematic of such reactions. As noted above, in many embodiments the encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) is a cytochrome P450 monooxygenase, a cytochrome P450 hydroxylase, a cytochrome P450 epoxidase, or a cytochrome P450 dehydrogenase. A wide variety of cytochrome P450 monooxygenases, hydroxylases, epoxidases, and dehydrogenases (collectively, "P450 enzymes") are known in the art, and the amino acid sequence of any known P450 enzyme, or a variant thereof, can be modified according to the invention.

Suitable sources of nucleic acids comprising a nucleotide sequence encoding a cytochrome P450 enzyme include, but are not limited to, a cell or organism of any of the six kingdoms, e.g., Bacteria (e.g., Eubacteria); Archaebacteria; Protista; Fungi; Plantae; and Animalia. Suitable sources of exogenous nucleic acids include plant-like members of the kingdom Protista, including, but not limited to, algae (e.g., green algae, red algae, glaucophytes, cyanobacteria); fungus-like members of Protista, e.g., slime molds, water molds, etc.; and animal-like members of Protista, e.g., flagellates (e.g., Euglena), amoeboids (e.g., amoeba), sporozoans (e.g., Apicomplexa, Myxozoa, Microsporidia), and ciliates (e.g., Paramecium). Suitable sources of exogenous nucleic acids include members of the kingdom Fungi, including, but not limited to, members of any of the following phyla: Basidiomycota (club fungi; e.g., members of Agaricus, Amanita, Boletus, Cantharellus, etc.); Ascomycota (sac fungi, including, e.g., yeast); Mycophycophyta (lichens); Zygomycota (conjugation fungi); and Deuteromycota. Suitable sources of exogenous nucleic acids include members of the kingdom Plantae, including, but not limited to, members of any of the following divisions: Bryophyta (e.g., mosses), Anthocerotophyta (e.g., hornworts), Hepaticophyta (e.g., liverworts), Lycophyta (e.g., club mosses), Sphenophyta (e.g., horsetails), Psilophyta (e.g., whisk ferns), Ophioglossophyta, Pterophyta (e.g., ferns), Cycadophyta, Ginkgophyta, Pinophyta, Gnetophyta, and Magnoliophyta (e.g., flowering plants). Suitable sources of exogenous nucleic acids include members of the kingdom Animalia, including, but not limited to, members of any of the following phyla: Porifera (sponges); Placozoa; Orthonectida (parasites of marine invertebrates); Rhombozoa; Cnidaria (corals, anemones, jellyfish, sea pens, sea pansies, box jellyfish); Ctenophora (comb jellies); Platyhelminthes (flatworms); Nemertina (ribbon worms); Gnathostomulida (jawed worms); Gastrotricha; Rotifera; Priapulida; Kinorhyncha; Loricifera; Acanthocephala; Entoprocta; Nematoda; Nematomorpha; Cycliophora; Mollusca (mollusks); Sipuncula (peanut worms); Annelida (segmented worms); Tardigrada (water bears); Onychophora (velvet worms); Arthropoda (including the subphyla Chelicerata, Myriapoda, Hexapoda, and Crustacea, where the Chelicerata include, e.g., arachnids, Merostomata, and Pycnogonida, where the Myriapoda include, e.g., Chilopoda (centipedes), Diplopoda (millipedes), Pauropoda, and Symphyla, where the Hexapoda include insects, and where the Crustacea include shrimp, krill, barnacles, etc.); Phoronida; Ectoprocta (moss animals); Brachiopoda; Echinodermata (e.g., starfish, sea daisies, feather stars, sea urchins, sea cucumbers, brittle stars, brittle baskets, etc.); Chaetognatha (arrow worms); Hemichordata (acorn worms); and Chordata. Suitable members of Chordata include any member of the following subphyla: Urochordata (sea squirts; including Ascidiacea, Thaliacea, and Larvacea); Cephalochordata (lancelets); Myxini (hagfish); and Vertebrata, where members of Vertebrata include, e.g., members of Petromyzontida (lampreys), Chondrichthyes (cartilaginous fish), Actinopterygii (ray-finned fish), Actinistia (coelacanths), Dipnoi (lungfish), Reptilia (reptiles, e.g., snakes, alligators, crocodiles, lizards, etc.), Aves (birds), and Mammalia (mammals). Suitable plants include any monocotyledon and any dicotyledon.

Thus, e.g., suitable sources include cells from organisms that include, but are not limited to, protozoa, plants, fungi, algae, yeast, reptiles, amphibians, mammals, marine microorganisms, marine invertebrates, arthropods, isopods, insects, arachnids, archaebacteria, and eubacteria.

Suitable prokaryotic sources include bacteria (e.g., Eubacteria) and archaebacteria. Suitable archaebacterial sources include methanogens, extreme halophiles, extreme thermophiles, and the like. Suitable archaebacterial sources include, but are not limited to, any member of the following groups: Crenarchaeota (e.g., Sulfolobus solfataricus, Desulfurococcus mobilis, Pyrodictium occultum, Thermofilum pendens, Thermoproteus tenax); Euryarchaeota (e.g., Thermococcus celer, Methanococcus thermolithotrophicus, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Methanobacterium formicicum, Methanothermus fervidus, Archaeoglobus fulgidus, Thermoplasma acidophilum, Haloferax volcanii, Methanosarcina barkeri, Methanosaeta concilii, Methanospirillum hungatei, Methanomicrobium mobile); and Korarchaeota. Suitable eubacterial sources include, but are not limited to, any member of the following groups: Hydrogenobacteria, Thermotogales, green non-sulfur bacteria, the Deinococcus group, cyanobacteria, purple bacteria, Planctomyces, spirochetes, green sulfur bacteria, Cytophagas, and Gram-positive bacteria (e.g., Mycobacterium sp., Micrococcus sp., Streptomyces sp., Lactobacillus sp., Helicobacterium sp., Clostridium sp., Mycoplasma sp., Bacillus sp., etc.).

In some embodiments, a P450 enzyme-encoding nucleic acid is isolated from a tissue taken from an organism, from a particular cell or group of cells isolated from an organism, etc. For example, in some embodiments, where the organism is a plant, the nucleic acid is isolated from xylem, phloem, cambium, leaves, roots, etc. In some embodiments, where the organism is an animal, the nucleic acid is isolated from a particular tissue (e.g., lung, liver, heart, kidney, brain, spleen, skin, fetal tissue, etc.) or from a particular cell type (e.g., neuronal cells, epithelial cells, endothelial cells, astrocytes, macrophages, glial cells, islet cells, T lymphocytes, B lymphocytes, etc.).

In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a P450 enzyme, where the sequence differs from a wild-type or naturally occurring nucleotide sequence encoding the P450 enzyme; e.g., the subject nucleic acid comprises a nucleotide sequence encoding a variant P450 enzyme. In some embodiments, the amino acid sequence of the variant P450 enzyme differs from that of the naturally occurring parent P450 enzyme by one, two, three, four, five, six, seven, eight, nine, or ten amino acids, or more. In some embodiments, the amino acid sequence of the variant P450 enzyme differs from that of the naturally occurring parent P450 enzyme by about 10-15 amino acids, about 15-20 amino acids, about 20-25 amino acids, about 25-30 amino acids, about 30-35 amino acids, about 35-40 amino acids, about 40-50 amino acids, about 50-60 amino acids, or more.

In many embodiments, as noted above, the encoded modified cytochrome P450 enzyme comprises a modification of the N-terminus of the parent (e.g., wild-type, naturally occurring, or native) sequence, e.g., a modification of the transmembrane domain and/or of the amino acid sequence N-terminal to the transmembrane domain. In some embodiments, the encoded modified cytochrome P450 enzyme further includes one or more amino acid sequence modifications in the catalytic portion of the enzyme, compared with the amino acid sequence of the wild-type cytochrome P450 enzyme.

A nucleic acid comprising a nucleotide sequence encoding a variant (e.g., modified) P450 enzyme is a synthetic nucleic acid. In some embodiments, the synthetic nucleic acid is one that hybridizes under suitable hybridization conditions to a nucleic acid comprising a nucleotide sequence encoding a naturally occurring P450 enzyme. In some embodiments, the synthetic nucleic acid is one that hybridizes under stringent hybridization conditions to a nucleic acid comprising a nucleotide sequence encoding a naturally occurring P450 enzyme. In some embodiments, the nucleotide sequence encoding the variant P450 enzyme has less than about 95% nucleotide sequence identity to a nucleotide sequence encoding a naturally occurring P450 enzyme; e.g., the nucleotide sequence identity is no greater than about 90%-95%, about 85%-90%, about 80%-85%, about 75%-80%, about 70%-75%, about 65%-70%, about 60%-65%, about 55%-60%, or about 50%-55%.

In some embodiments, the nucleotide sequence encoding a variant P450 enzyme encodes a P450 enzyme whose amino acid sequence identity to a naturally occurring P450 enzyme is about 50%-55%, about 55%-60%, about 60%-65%, about 65%-70%, about 70%-75%, about 75%-80%, about 80%-85%, about 85%-90%, or about 90%-95%. The amino acid sequences of many P450 enzymes are known in the art.
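Sequence identity percentages of this kind are computed over an alignment. The following is a minimal Python sketch, offered as an illustration and not as the method used in the disclosure: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and divides matching columns by all aligned columns. Alignment tools differ in how they choose the denominator, so the function names and the convention used here are assumptions. The toy inputs are the secretion signals of SEQ ID NO: 6 and SEQ ID NO: 7, which have equal length and need no gaps.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Count aligned columns where the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap.

    Columns where both sequences have a gap are ignored. A gap opposite a
    residue counts as a mismatch (one common convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy input: SEQ ID NO: 6 and SEQ ID NO: 7 from the secretion signal list.
seq6 = "MKKTAIAIAVALAGFATVAQA"
seq7 = "MKKTAIAIVVALAGFATVAQA"
n_diff = count_differences(seq6, seq7)
identity = percent_identity(seq6, seq7)
```

For full-length P450 sequences, the alignment step itself would come from a dedicated alignment program; only the counting is shown here.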

Suitable P450 enzymes that can be modified and encoded by a nucleotide sequence included in a subject nucleic acid include, but are not limited to: limonene-6-hydroxylase (see, e.g., Figure 6; GenBank Accession Nos. AY281025 and AF124815); 5-epi-aristolochene dihydroxylase (see, e.g., Figure 7; GenBank Accession No. AF368376); δ-cadinene-8-hydroxylase (see, e.g., Figure 8A; GenBank Accession No. AF332974); taxadiene-5α-hydroxylase (see, e.g., Figures 9A and 9B; GenBank Accession Nos. AY289209, AY959320, and AY364469); and ent-kaurene oxidase (see, e.g., Figure 10; GenBank Accession No. AF047719; see also Helliwell et al. (1998) Proc. Natl. Acad. Sci. USA 95:9019-9024).

Figures 8B-D depict exemplary P450 variants. Figure 8B depicts a cadinene hydroxylase with a heterologous transmembrane domain; Figure 8C depicts a cadinene hydroxylase with a solubilization domain; Figure 8D depicts a cadinene hydroxylase with a secretion domain and a heterologous transmembrane domain. Figure 22 depicts further exemplary P450 variants, including amorphadiene oxidases with various N-terminal sequences.

Cytochrome P450 enzymes that modify alkaloid pathway intermediates are known in the art. See, e.g., Facchini et al. (2004) supra; Pauli and Kutchan (1998) Plant J. 13:793-801; Collu et al. (2001) FEBS Lett. 508:215-220; and Schroder et al. (1999) FEBS Lett. 458:97-102. See also Figures 19A-C.

Cytochrome P450 enzymes that modify phenylpropanoid pathway intermediates are known in the art. See, e.g., Mizutani et al. (1997) Plant Physiol. 113:755-763; and Gang et al. (2002) Plant Physiol. 130:1536-1544. See also Figures 20A-C.

Figures 21A and 21B depict exemplary cytochrome P450 enzymes that modify polyketide pathway intermediates. See also Ikeda et al. (1999) Proc. Natl. Acad. Sci. USA 96:9509-9514; and Ward et al. (2004) Antimicrob. Agents Chemother. 48:4703-4712.

The encoded modified cytochrome P450 enzyme (e.g., isoprenoid precursor-modifying enzyme) is enzymatically active, e.g., it exhibits one or more of the following activities: a) modification of a biosynthetic pathway intermediate by one or more of the following reactions: oxidation, hydroxylation, epoxidation, dehydration, dehydrogenation, dehalogenation, isomerization, alcohol oxidation, aldehyde oxidation, dealkylation, or bond cleavage; b) modification of an isoprenoid precursor by one or more of the following reactions: oxidation, hydroxylation, epoxidation, dehydration, dehydrogenation, dehalogenation, isomerization, alcohol oxidation, aldehyde oxidation, dealkylation, or bond cleavage. Whether a subject nucleic acid encodes an enzymatically active cytochrome P450 enzyme can be readily determined by detecting the product of the reaction of the P450 enzyme on a substrate and/or a downstream product of that reaction. For example, whether a subject nucleic acid encodes an enzymatically active terpene oxidase or terpene hydroxylase can be readily determined with an appropriate substrate using standard assays for these enzymatic activities. The enzymatically modified product is generally analyzed by chromatography coupled to mass spectrometry. Likewise, whether a subject nucleic acid encodes a sesquiterpene oxidase or a sesquiterpene hydroxylase can be readily determined using standard assays for these enzymatic activities. See, e.g., U.S. Patent Publication No. 20050019882.

In some embodiments, the nucleotide sequence encoding a modified cytochrome P450 enzyme (e.g., a modified isoprenoid precursor-modifying enzyme) is modified to reflect the codon preference of a particular host cell. For example, in some embodiments the nucleotide sequence is modified for yeast codon preference; see, e.g., Bennetzen and Hall (1982) J. Biol. Chem. 257(6):3026-3031. As another non-limiting example, in other embodiments the nucleotide sequence is modified for E. coli codon preference; see, e.g., Gouy and Gautier (1982) Nucleic Acids Res. 10(22):7055-7074; Eyre-Walker (1996) Mol. Biol. Evol. 13(6):864-872. See also Nakamura et al. (2000) Nucleic Acids Res. 28(1):292. As one non-limiting example, Figure 11A depicts a wild-type nucleotide sequence encoding cadinene hydroxylase (the atg start codon is shown in bold), and Figure 11B depicts a codon-optimized variant of the sequence of Figure 11A in which the codons are optimized for expression in a prokaryotic cell such as E. coli.
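The idea of codon optimization can be illustrated with a simple back-translation. The following is a minimal Python sketch under stated assumptions, not the procedure used to produce Figure 11B: it assigns a single codon to each amino acid from an illustrative table and then translates the result back to confirm that the encoded protein is unchanged. The table `PREFERRED_CODON` is a hypothetical example of commonly used E. coli codons and is not taken from the cited references; a real design would draw on measured codon usage and would typically vary codons rather than use one per residue. The toy input is the short solubilization sequence of SEQ ID NO: 16.

```python
# Illustrative (assumed) one-codon-per-residue table; every entry is a valid
# codon for its amino acid under the standard genetic code.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Reverse lookup, used only to verify the round trip for this table.
CODON_TO_AA = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def back_translate(protein: str) -> str:
    """Return a coding sequence using one assumed preferred codon per residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein)
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None


def translate(dna: str) -> str:
    """Translate a sequence built from the table above back to protein."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


protein = "MAKKTSSKG"  # SEQ ID NO: 16
coding = back_translate(protein)
round_trip_ok = translate(coding) == protein
```

The round-trip check reflects the key constraint of codon optimization: the nucleotide sequence changes while the encoded amino acid sequence does not.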

Cytochrome P450 reductase

NADPH-cytochrome P450 oxidoreductase (CPR, EC 1.6.2.4) is the redox partner of many P450 monooxygenases. In some embodiments, a nucleic acid of the invention further comprises a nucleotide sequence encoding a cytochrome P450 reductase (CPR). A subject nucleic acid comprising a nucleotide sequence encoding a CPR is referred to as a "CPR nucleic acid." The CPR encoded by a subject CPR nucleic acid transfers electrons from NADPH to a cytochrome P450. For example, in some embodiments, the CPR encoded by a subject CPR nucleic acid transfers electrons from NADPH to an isoprenoid-modifying enzyme, e.g., a sesquiterpene oxidase, encoded by a subject isoprenoid-modifying enzyme-encoding nucleic acid.

In some embodiments, a nucleic acid of the invention comprises nucleotide sequences encoding both a modified cytochrome P450 enzyme (e.g., a modified isoprenoid precursor-modifying enzyme) and a CPR. In some embodiments, the subject nucleic acid comprises a nucleotide sequence encoding a fusion protein that comprises the amino acid sequence of a modified cytochrome P450 enzyme having isoprenoid precursor-modifying activity (e.g., a modified isoprenoid precursor-modifying enzyme, as described above) fused to a CPR polypeptide. In some embodiments, the encoded fusion protein has the formula NH₂-A-X-B-COOH, where A is the modified cytochrome P450 enzyme, X is an optional linker, and B is the CPR polypeptide. In other embodiments, the encoded fusion protein has the formula NH₂-A-X-B-COOH, where A is the CPR polypeptide, X is an optional linker, and B is the modified cytochrome P450 enzyme.

The linker peptide may have any of a variety of amino acid sequences. Proteins can be joined by a spacer peptide, generally of a flexible nature, although other chemical linkages are not excluded. The linker may be a cleavable linker. Suitable linker sequences are generally about 5-50 amino acids, or about 6-25 amino acids, in length. Peptide linkers with a degree of flexibility are generally used. The linking peptide may have essentially any amino acid sequence, bearing in mind that a preferred linker sequence will yield a generally flexible peptide. Small amino acids, such as glycine and alanine, are useful for creating a flexible peptide. Creating such sequences is routine for those skilled in the art. A variety of linkers are commercially available and are considered suitable for use in the invention.

Suitable linker peptides frequently include amino acid sequences rich in alanine and proline residues, which are known to impart flexibility to a protein structure. Exemplary linkers have a combination of glycine, alanine, proline, and methionine residues, such as AAAGGM (SEQ ID NO: 17); AAAGGMPPAAAGGM (SEQ ID NO: 18); AAAGGM (SEQ ID NO: 19); and PPAAAGGM (SEQ ID NO: 20). Other exemplary linker peptides include IEGR (SEQ ID NO: 21) and GGKGGK (SEQ ID NO: 22). However, any flexible linker generally about 5-50 amino acids in length can be used. Linkers can have essentially any sequence that results in a generally flexible peptide, including alanine-proline-rich sequences of the type exemplified above.
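As a rough illustration of the NH₂-A-X-B-COOH arrangement and the linker-length guidance above, the following Python sketch concatenates two polypeptide sequences with an optional linker. The function names and the placeholder P450 and CPR fragments are assumptions made for illustration only; the linker strings are the ones quoted in the text.

```python
# Linker sequences quoted in the text (one-letter amino acid codes).
EXAMPLE_LINKERS = {
    "SEQ ID NO: 18": "AAAGGMPPAAAGGM",
    "SEQ ID NO: 20": "PPAAAGGM",
    "SEQ ID NO: 22": "GGKGGK",
}

# The 20 standard amino acids, used only for input validation.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def build_fusion(a: str, b: str, linker: str = "") -> str:
    """Return the N-to-C sequence A-X-B, where X (the linker) is optional."""
    if not a or not b:
        raise ValueError("both A and B must be non-empty")
    for name, seq in (("A", a), ("X", linker), ("B", b)):
        bad = set(seq.upper()) - VALID_RESIDUES
        if bad:
            raise ValueError(f"{name} contains non-standard residues: {sorted(bad)}")
    return a.upper() + linker.upper() + b.upper()


def linker_in_typical_range(linker: str, low: int = 5, high: int = 50) -> bool:
    """Check a linker against the 'generally about 5-50 amino acids' guidance."""
    return low <= len(linker) <= high


if __name__ == "__main__":
    # Hypothetical short fragments standing in for real P450 and CPR sequences.
    p450_fragment = "MKTAYIAK"
    cpr_fragment = "MSDQLLR"
    linker = EXAMPLE_LINKERS["SEQ ID NO: 22"]
    print(build_fusion(p450_fragment, cpr_fragment, linker))  # P450 at the N-terminus
    print(build_fusion(cpr_fragment, p450_fragment, linker))  # CPR at the N-terminus
    print(linker_in_typical_range(linker))
```

Swapping the first two arguments gives the alternative orientation in which the CPR polypeptide occupies position A. The length check reflects only the "generally about 5-50" guidance; shorter sequences such as IEGR are still listed in the text as exemplary linkers.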

In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a CPR polypeptide that has at least about 45%, at least about 50%, at least about 55%, at least about 57%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% amino acid sequence identity to the amino acid sequence of a known or naturally occurring CPR polypeptide.
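To make the identity thresholds concrete, the sketch below computes percent identity for two sequences that have already been aligned to equal length. It is a minimal illustration, not the method specified by this document: the choice of denominator (all aligned columns, skipping columns where both sequences have a gap) is an assumption, and alignment tools may define identity differently. The function names and the short test sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are pre-aligned (equal length, gaps marked with `gap`).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 45.0) -> bool:
    """True if identity is at least `threshold` percent (45% is the lowest tier listed)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical five-residue alignment with one mismatch.
    print(percent_identity("MKTAY", "MKSAY"))
    print(meets_threshold("MKTAY", "MKSAY", threshold=90.0))
```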

CPR polypeptides, and nucleic acids encoding them, are known in the art, and any CPR-encoding nucleic acid, or a variant thereof, can be used in the invention. Suitable CPR-encoding nucleic acids include nucleic acids encoding CPRs found in plants and nucleic acids encoding CPRs found in fungi. Examples of suitable CPR-encoding nucleic acids include: GenBank Accession No. AJ303373 (Triticum aestivum CPR); GenBank Accession No. AY959320 (Taxus chinensis CPR); GenBank Accession No. AY532374 (Ammi majus CPR); GenBank Accession No. AG211221 (Oryza sativa CPR); GenBank Accession No. AF024635 (Petroselinum crispum CPR); Candida tropicalis cytochrome P450 reductase (GenBank Accession No. M35199); Arabidopsis thaliana cytochrome P450 reductase ATR1 (GenBank Accession No. X66016); Arabidopsis thaliana cytochrome P450 reductase ATR2 (GenBank Accession No. X66017); and Pseudomonas redoxin reductase and Pseudomonas redoxin (GenBank Accession No. J05406).

In some embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a CPR polypeptide that is specific for a given P450 enzyme. As one non-limiting example, a nucleic acid of the invention comprises a nucleotide sequence encoding Taxus cuspidata CPR (Figure 12A; GenBank AY571340). As another non-limiting example, a nucleic acid of the invention comprises a nucleotide sequence encoding Candida tropicalis CPR (Figure 12B). In other embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a CPR polypeptide that can serve as a redox partner for two or more different P450 enzymes. Figure 12C depicts one such CPR (Arabidopsis thaliana cytochrome P450 reductase ATR1). Figure 12D depicts another such CPR (Arabidopsis thaliana cytochrome P450 reductase ATR2). A modified or variant ATR2, as depicted in Figure 12D, is also suitable; the variant ATR2 lacks the chloroplast-targeting sequence.

In some embodiments, the encoded CPR comprises a heterologous amino acid sequence or a variant amino acid sequence (e.g., substitutions, deletions, insertions, additions). In some embodiments, the encoded CPR comprises, relative to a wild-type CPR, one or more of the following modifications: a) substitution of the native transmembrane domain with a non-native transmembrane domain; b) replacement of the native transmembrane domain with a secretion signal domain; c) replacement of the native transmembrane domain with a solubilization domain; d) replacement of the native transmembrane domain with a membrane insertion domain; e) truncation of the native transmembrane domain; and f) a change in the amino acid sequence of the native transmembrane domain.

In some embodiments, the nucleotide sequence encoding the CPR polypeptide is modified to reflect the codon preference of a particular host cell. For example, in some embodiments the nucleotide sequence is modified according to yeast codon preference. See, e.g., Bennetzen and Hall (1982) J. Biol. Chem. 257(6):3026-3031. As another non-limiting example, in other embodiments the nucleotide sequence is modified according to E. coli codon preference. See, e.g., Gouy and Gautier (1982) Nucleic Acids Res. 10(22):7055-7074; Eyre-Walker (1996) Mol. Biol. Evol. 13(6):864-872. See also Nakamura et al. (2000) Nucleic Acids Res. 28(1):292.
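The idea of rewriting a coding sequence to match host codon preference can be sketched as a simple back-translation from protein to DNA using one chosen codon per amino acid. The table below is an illustrative placeholder covering only a few residues. Each codon does encode the listed amino acid in the standard genetic code, but the choices are assumptions and are not taken from the codon-usage references cited above; a real design would draw on a full usage table for the intended host.

```python
# Illustrative one-codon-per-residue table (assumption, not from the cited
# references). Covers only the residues needed for the example linkers.
PREFERRED_CODONS = {
    "A": "GCG",  # alanine
    "G": "GGC",  # glycine
    "M": "ATG",  # methionine (only one codon exists)
    "P": "CCG",  # proline
    "K": "AAA",  # lysine
}


def back_translate(protein: str, table: dict = PREFERRED_CODONS) -> str:
    """Return a DNA sequence encoding `protein`, one table codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in table:
            raise ValueError(f"no codon listed for residue {residue!r}")
        codons.append(table[residue])
    return "".join(codons)


if __name__ == "__main__":
    # Linker AAAGGM, as quoted in the text.
    print(back_translate("AAAGGM"))
```

Picking a single codon for every occurrence of a residue is the simplest possible scheme. It ignores considerations such as repeated sequence and mRNA structure, which is one reason it should be read only as a sketch of the concept.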

Constructs

The invention further provides recombinant vectors ("constructs") comprising a nucleic acid of the invention. In some embodiments, a recombinant vector of the invention provides for amplification of a nucleic acid of the invention. In some embodiments, a recombinant vector of the invention provides for production of the encoded isoprenoid-modifying enzyme, or the encoded CPR, in a eukaryotic cell, in a prokaryotic cell, or in a cell-free transcription/translation system. Suitable expression vectors include, but are not limited to, baculovirus vectors, bacteriophage vectors, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes, viral vectors (e.g., viral vectors based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, and the like), P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any other vectors specific for particular hosts of interest (such as E. coli, yeast, and plant cells).

In some embodiments, a recombinant vector of the invention comprises a modified cytochrome P450-encoding nucleic acid of the invention and a CPR-encoding nucleic acid of the invention. In some embodiments, the recombinant vector is an expression vector that provides for production of both the encoded modified cytochrome P450 enzyme (e.g., a modified isoprenoid-modifying enzyme) and the encoded CPR in a eukaryotic cell, in a prokaryotic cell, or in a cell-free transcription/translation system.

Certain types of vectors allow the expression cassettes of the invention to be amplified. Other types of vectors are needed for efficient introduction of a nucleic acid of the invention into cells and for its expression once introduced. Any vector capable of accepting a nucleic acid of the invention is contemplated as a suitable recombinant vector for the purposes of the invention. The vector may be circular or linear DNA that either integrates into the host genome or is maintained in episomal form. Vectors may require additional manipulation or particular conditions to be efficiently incorporated into a host cell (e.g., many expression plasmids), or may be part of a self-integrating, cell-specific system (e.g., a recombinant virus). In some embodiments the vector is functional in a prokaryotic cell, where such vectors serve to propagate the recombinant vector and/or to express a nucleic acid of the invention. In some embodiments the vector is functional in a eukaryotic cell, where in many cases the vector is an expression vector.

Numerous suitable expression vectors are known to those of skill in the art, and many are commercially available. The following vectors are provided by way of example. For bacterial host cells: pBluescript (Stratagene, San Diego, Calif.), pQE vectors (Qiagen), pBluescript plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); pTrc (Amann et al., Gene, 69:301-315 (1988)); pTrc99a, pKK223-3, pDR540, and pRIT2T (Pharmacia). For eukaryotic host cells: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia). However, any other plasmid or other vector may be used so long as it is compatible with the host cell. In particular embodiments, the plasmid vector pSP19g10L is used for expression in a prokaryotic host cell. In other particular embodiments, the plasmid vector pCWori is used for expression in a prokaryotic host cell. For descriptions of pSP19g10L and pCWori see, e.g., Barnes ((1996) Methods Enzymol. 272:1-14).

In many embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding an isoprenoid-modifying enzyme, where the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to one or more transcriptional and/or translational control elements. In many embodiments, a nucleic acid of the invention comprises a nucleotide sequence encoding a CPR, where the CPR-encoding nucleotide sequence is operably linked to one or more transcriptional and/or translational control elements.

In some embodiments, as noted above, a recombinant vector of the invention comprises an isoprenoid-modifying enzyme-encoding nucleic acid of the invention and a CPR-encoding nucleic acid of the invention. In some of these embodiments, the nucleotide sequence encoding the isoprenoid-modifying enzyme and the nucleotide sequence encoding the CPR are operably linked to different transcriptional control elements. In other embodiments, the two nucleotide sequences are operably linked to the same transcriptional control element. In some embodiments, both nucleotide sequences are operably linked to the same inducible promoter. In some embodiments, both nucleotide sequences are operably linked to the same constitutive promoter.
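The two vector layouts described above (separate control elements versus one shared control element) can be represented with a small data model. This is only a bookkeeping sketch: the class name, field names, and promoter labels are hypothetical and do not correspond to any particular vector or software library.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Cassette:
    """One transcriptional unit: a promoter and the coding sequences it drives."""
    promoter: str                      # hypothetical label, e.g. "promoter_1"
    inducible: bool                    # True for inducible, False for constitutive
    coding_sequences: Tuple[str, ...]  # e.g. ("P450",) or ("P450", "CPR")


def shares_control_element(
    cassettes: Sequence[Cassette], gene_a: str = "P450", gene_b: str = "CPR"
) -> bool:
    """True if some single cassette drives both genes from the same promoter."""
    return any(
        gene_a in c.coding_sequences and gene_b in c.coding_sequences
        for c in cassettes
    )


# Layout 1: P450 and CPR under different transcriptional control elements.
separate = [
    Cassette("promoter_1", inducible=True, coding_sequences=("P450",)),
    Cassette("promoter_2", inducible=False, coding_sequences=("CPR",)),
]

# Layout 2: both coding sequences under the same inducible promoter.
shared = [Cassette("promoter_1", inducible=True, coding_sequences=("P450", "CPR"))]

if __name__ == "__main__":
    print(shares_control_element(separate))
    print(shares_control_element(shared))
```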

Promoters suitable for use in prokaryotic host cells include, but are not limited to: a bacteriophage T7 RNA polymerase promoter; a trp promoter; a lac operon promoter; a hybrid promoter, e.g., a lac/tac hybrid promoter, a tac/trc hybrid promoter, a trp/lac promoter, a T7/lac promoter; a trc promoter; a tac promoter, and the like; an araBAD promoter; in vivo regulated promoters, such as an ssaG promoter or a related promoter (see, e.g., U.S. Patent Publication No. 20040131637), a pagC promoter (Pulkkinen and Miller, J. Bacteriol., 1991: 173(1):86-93; Alpuche-Aranda et al., PNAS, 1992; 89(21):10079-83), a nirB promoter (Harborne et al. (1992) Mol. Micro. 6:2805-2813), and the like (see, e.g., Dunstan et al. (1999) Infect. Immun. 67:5133-5141; McKelvie et al. (2004) Vaccine 22:3243-3255; and Chatfield et al. (1992) Biotechnol. 10:888-892); a σ70 promoter, e.g., a consensus σ70 promoter (see, e.g., GenBank Accession Nos. AX798980, AX798961, and AX798183); a stationary phase promoter, e.g., a dps promoter, an spv promoter, and the like; a promoter derived from the pathogenicity island SPI-2 (see, e.g., WO96/17951); an actA promoter (see, e.g., Shetron-Rama et al. (2002) Infect. Immun. 70:1087-1096); an rpsM promoter (see, e.g., Valdivia and Falkow (1996) Mol. Microbiol. 22:367-378); a tet promoter (see, e.g., Hillen, W. and Wissmann, A. (1989) in Saenger, W. and Heinemann, U. (eds.), Topics in Molecular and Structural Biology, Protein-Nucleic Acid Interaction, Macmillan, London, UK, Vol. 10, pp. 143-162); an SP6 promoter (see, e.g., Melton et al. (1984) Nucl. Acids Res. 12:7035-7056); and the like.

Non-limiting examples of suitable eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retroviruses, and mouse metallothionein-I. In some embodiments, e.g., for expression in a yeast cell, a suitable promoter is a constitutive promoter such as an ADH1 promoter, a PGK1 promoter, an ENO promoter, a PYK1 promoter, and the like; or a regulatable promoter such as a GAL1 promoter, a GAL10 promoter, an ADH2 promoter, a PHO5 promoter, a CUP1 promoter, a GAL7 promoter, a MET25 promoter, a MET3 promoter, and the like. Selection of an appropriate vector and promoter is well within the level of ordinary skill in the art. The expression vector may also contain a ribosome binding site for translation initiation and a transcription terminator. The expression vector may also include appropriate sequences for amplifying expression.

In many embodiments, a recombinant vector of the invention contains one or more selectable marker genes that provide a phenotypic trait for selection of transformed host cells. Suitable selectable markers include, but are not limited to: dihydrofolate reductase or neomycin resistance for eukaryotic cell culture; and tetracycline or ampicillin resistance in prokaryotic host cells such as E. coli.

Generally, recombinant expression vectors will include an origin of replication and a selectable marker permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli, the Saccharomyces cerevisiae TRP1 gene, etc.; and a promoter derived from a highly expressed gene to direct transcription of the coding sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others.

In many embodiments, a nucleotide sequence encoding a modified cytochrome P450 enzyme (e.g., a modified isoprenoid-modifying enzyme) is operably linked to an inducible promoter. In many embodiments, a nucleotide sequence encoding a CPR is operably linked to an inducible promoter. Inducible promoters are well known in the art. Suitable inducible promoters include, but are not limited to: the pL of bacteriophage λ; Plac; Ptrp; Ptac (Ptrp-lac hybrid promoter); an isopropyl-β-D-thiogalactopyranoside (IPTG)-inducible promoter, e.g., a lacZ promoter; a tetracycline-inducible promoter; an arabinose-inducible promoter, e.g., P_BAD (see, e.g., Guzman et al. (1995) J. Bacteriol. 177:4121-4130); a xylose-inducible promoter, e.g., Pxyl (see, e.g., Kim et al. (1996) Gene 181:71-76); a GAL1 promoter; a tryptophan promoter; a lac promoter; an alcohol-inducible promoter, e.g., a methanol-inducible promoter or an ethanol-inducible promoter; a raffinose-inducible promoter; a heat-inducible promoter, e.g., a heat-inducible λ P_L promoter or a promoter controlled by a heat-sensitive repressor (e.g., CI857-repressed λ-based expression vectors; see, e.g., Hoffmann et al. (1999) FEMS Microbiol Lett. 177(2):327-34); and the like.

In yeast, a number of vectors containing constitutive or inducible promoters may be used. For a review see Current Protocols in Molecular Biology, Vol. 2, 1988, Ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, Ch. 13; Grant et al., 1987, Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu and Grossman, 1987, Acad. Press, N.Y., Vol. 153, pp. 516-544; Glover, 1986, DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3; Bitter, 1987, Heterologous Gene Expression in Yeast, Methods in Enzymology, Eds. Berger and Kimmel, Acad. Press, N.Y., Vol. 152, pp. 673-684; and The Molecular Biology of the Yeast Saccharomyces, 1982, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II. A constitutive yeast promoter such as ADH or LEU2, or an inducible promoter such as GAL, may be used (Cloning in Yeast, Ch. 3, R. Rothstein, in DNA Cloning Vol. 11, A Practical Approach, Ed. DM Glover, 1986, IRL Press, Wash., D.C.). Alternatively, vectors that promote integration of foreign DNA sequences into the yeast chromosome may be used.

In some embodiments, a nucleic acid of the invention or a vector of the invention comprises a promoter or other regulatory element(s) for expression in a plant cell. Non-limiting examples of suitable constitutive promoters that are functional in a plant cell are the cauliflower mosaic virus 35S promoter, a tandem 35S promoter (Kay et al., Science 236:1299 (1987)), the cauliflower mosaic virus 19S promoter, the nopaline synthase gene promoter (Singer et al., Plant Mol. Biol. 14:433 (1990); An, Plant Physiol. 81:86 (1986)), the octopine synthase gene promoter, and the ubiquitin promoter. Suitable inducible promoters that are functional in a plant cell include, but are not limited to: the phenylalanine ammonia-lyase gene promoter; the chalcone synthase gene promoter; the pathogenesis-related protein gene promoter; copper-inducible regulatory elements (Mett et al., Proc. Natl. Acad. Sci. USA 90:4567-4571 (1993); Furst et al., Cell 55:705-717 (1988)); tetracycline and chlortetracycline-inducible regulatory elements (Gatz et al., Plant J. 2:397-404 (1992); et al., Mol. Gen. Genet. 243:32-38 (1994); Gatz, Meth. Cell Biol. 50:411-424 (1995)); ecdysone-inducible regulatory elements (Christopherson et al., Proc. Natl. Acad. Sci. USA 89:6314-6318 (1992); Kreutzweiser et al., Ecotoxicol. Environ. Safety 28:14-24 (1994)); heat shock-inducible regulatory elements (Takahashi et al., Plant Physiol. 99:383-390 (1992); Yabe et al., Plant Cell Physiol. 35:1207-1219 (1994); Ueda et al., Mol. Gen. Genet. 250:533-539 (1996)); lac operon elements, which are used in combination with a constitutively expressed lac repressor to confer, for example, IPTG-inducible expression (Wilde et al., EMBO J. 11:1251-1259 (1992)); the nitrate-inducible promoter of the spinach nitrite reductase gene (Back et al., Plant Mol. Biol. 17:9 (1991)); light-inducible promoters, such as those associated with the small subunit of RuBP carboxylase or the LHCP gene families (Feinbaum et al., Mol. Gen. Genet. 226:449 (1991); Lam and Chua, Science 248:471 (1990)); light-responsive regulatory elements as described in U.S. Patent Publication No. 20040038400; salicylic acid-inducible regulatory elements (Uknes et al., Plant Cell 5:159-169 (1993); Bi et al., Plant J. 8:235-245 (1995)); plant hormone-inducible regulatory elements (Yamaguchi-Shinozaki et al., Plant Mol. Biol. 15:905 (1990); Kares et al., Plant Mol. Biol. 15:225 (1990)); and human hormone-inducible regulatory elements such as the human glucocorticoid response element (Schena et al., Proc. Natl. Acad. Sci. USA 88:10421 (1991)).

Plant tissue-selective regulatory elements can also be included in a nucleic acid of the invention or a vector of the invention. Tissue-selective regulatory elements suitable for ectopically expressing a nucleic acid in a single tissue or in a limited number of tissues include, but are not limited to: a xylem-selective regulatory element, a tracheid-selective regulatory element, a fiber-selective regulatory element, a trichome-selective regulatory element (see, e.g., Wang et al. (2002) J. Exp. Botany 53:1891-1897), a glandular trichome-selective regulatory element, and the like.

Vectors suitable for use in plant cells are known in the art, and any such vector can be used to introduce a nucleic acid of the invention into a plant host cell. Suitable vectors include, e.g., the Ti plasmid of Agrobacterium tumefaciens or the Ri₁ plasmid of A. rhizogenes. On infection by Agrobacterium, the Ti or Ri₁ plasmid is transferred into the plant cell and stably integrated into the plant genome. J. Schell, Science, 237:1176-83 (1987). Also suitable for use is a plant artificial chromosome, as described in, e.g., U.S. Patent No. 6,900,012.

Compositions

The invention further provides compositions comprising a nucleic acid of the invention, and compositions comprising a recombinant vector of the invention. In many embodiments, a composition comprising a nucleic acid or an expression vector of the invention includes one or more of the following: a salt, e.g., NaCl, MgCl₂, KCl, MgSO₄, etc.; a buffering agent, e.g., a Tris buffer, N-(2-hydroxyethyl)piperazine-N′-(2-ethanesulfonic acid) (HEPES), 2-(N-morpholino)ethanesulfonic acid (MES), 2-(N-morpholino)ethanesulfonic acid sodium salt (MES), 3-(N-morpholino)propanesulfonic acid (MOPS), N-tris[hydroxymethyl]methyl-3-aminopropanesulfonic acid (TAPS), etc.; a solubilizing agent; a detergent, e.g., a non-ionic detergent such as Tween-20, etc.; a nuclease inhibitor; and the like. In some embodiments, the nucleic acid or recombinant vector of the invention is lyophilized.

Host cells

The invention provides genetically modified host cells, e.g., host cells that have been genetically modified with a nucleic acid of the invention or a recombinant vector of the invention. In many embodiments, a genetically modified host cell of the invention is an in vitro host cell. In other embodiments, it is an in vivo host cell. In still other embodiments, it is part of a multicellular organism.

In many embodiments, the host cell is a unicellular organism, or is grown in vitro in culture as single cells. In some embodiments, the host cell is a eukaryotic cell. Suitable eukaryotic host cells include, but are not limited to, yeast cells, insect cells, plant cells, fungal cells, and algal cells. Suitable eukaryotic host cells include, but are not limited to: Pichia pastoris, Pichia finlandica, Pichia trehalophila, Pichia koclamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha, Kluyveromyces sp., Kluyveromyces lactis, Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium sp., Fusarium gramineum, Fusarium venenatum, Neurospora crassa, Chlamydomonas reinhardtii, and the like. In some embodiments, the host cell is a eukaryotic cell other than a plant cell.

In other embodiments, the host cell is a plant cell. Plant cells include cells of monocotyledons ("monocots") and dicotyledons ("dicots").

In other embodiments, the host cell is a prokaryotic cell. Suitable prokaryotic cells include, but are not limited to, any of a variety of laboratory strains of Escherichia coli, Lactobacillus sp., Salmonella sp., Shigella sp., and the like. See, e.g., Carrier et al. (1992) J. Immunol. 148:1176-1181; U.S. Patent No. 6,447,784; and Sizemore et al. (1995) Science 270:299-302. Examples of Salmonella strains that can be employed in the invention include, but are not limited to, Salmonella typhi and S. typhimurium. Suitable Shigella strains include, but are not limited to, Shigella flexneri, Shigella sonnei, and Shigella dysenteriae. Typically, the laboratory strain is one that is non-pathogenic. Non-limiting examples of other suitable bacteria include Bacillus subtilis, Pseudomonas putida, Pseudomonas aeruginosa, Pseudomonas mevalonii, Rhodobacter sphaeroides, Rhodobacter capsulatus, Rhodospirillum rubrum, Rhodococcus sp., and the like. In some embodiments, the host cell is Escherichia coli.

To generate a genetically modified host cell of the invention, a nucleic acid of the invention comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme (e.g., a modified isoprenoid-modifying enzyme) is introduced stably or transiently into a parent host cell using established techniques, including, but not limited to, electroporation, calcium phosphate precipitation, DEAE-dextran-mediated transfection, liposome-mediated transfection, and the like. For stable transformation, the nucleic acid will generally further include a selectable marker, e.g., any of several well-known selectable markers such as neomycin resistance, ampicillin resistance, tetracycline resistance, chloramphenicol resistance, kanamycin resistance, and the like.

In some embodiments, a genetically modified host cell of the invention is a plant cell. A genetically modified plant cell of the invention is useful for producing a selected isoprenoid compound in in vitro plant cell culture. Guidance on plant tissue culture can be found in, for example: Plant Cell and Tissue Culture, 1994, Vasil and Thorpe, eds., Kluwer Academic Publishers; and Plant Cell Culture Protocols (Methods in Molecular Biology 111), 1999, Hall, ed., Humana Press.

Genetically modified host cells

In some embodiments, a genetically modified host cell of the invention comprises an expression vector of the invention that comprises a nucleotide sequence encoding a modified cytochrome P450 enzyme. In some embodiments, a genetically modified host cell of the invention comprises an expression vector of the invention that comprises a nucleotide sequence encoding a modified isoprenoid precursor-modifying enzyme.

In some embodiments, a genetically modified host cell of the invention comprises a first expression vector of the invention and a second expression vector of the invention, where the first expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme, and the second expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding a CPR. In other embodiments, a genetically modified host cell of the invention comprises an expression vector of the invention, where the expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme and a nucleic acid of the invention comprising a nucleotide sequence encoding a CPR. In other embodiments, a genetically modified host cell of the invention comprises an expression vector of the invention, where the expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding a fusion polypeptide (e.g., a polypeptide that includes a modified cytochrome P450 enzyme and a CPR).

In some embodiments, a genetically modified host cell of the invention comprises a first expression vector and a second expression vector, where the first expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme, and the second expression vector comprises a nucleotide sequence encoding a CPR. In other embodiments, a genetically modified host cell of the invention comprises a subject expression vector, where the subject expression vector comprises a nucleic acid of the invention comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme and a nucleotide sequence encoding a CPR.

In some embodiments, a genetically modified host cell of the invention is further genetically modified to include one or more nucleic acids comprising nucleotide sequences encoding one or more enzymes that produce a substrate for a cytochrome P450 enzyme. Examples of such enzymes include, but are not limited to: terpene synthases; prenyltransferases; isopentenyl diphosphate isomerase; one or more enzymes in a mevalonate pathway; and one or more enzymes in a DXP pathway. In some embodiments, a genetically modified host cell of the invention is further genetically modified to include one or more nucleic acids comprising nucleotide sequences encoding one, two, three, four, five, six, seven, or eight or more of: a terpene synthase, a prenyltransferase, an IPP isomerase, an acetoacetyl-CoA thiolase, an HMGS, an HMGR, an MK, a PMK, and an MPD. In some embodiments, e.g., where a genetically modified host cell of the invention is further genetically modified to include one or more nucleic acids comprising nucleotide sequences encoding two or more of a terpene synthase, a prenyltransferase, an IPP isomerase, an acetoacetyl-CoA thiolase, an HMGS, an HMGR, an MK, a PMK, and an MPD, the nucleotide sequences are present in at least two operons, e.g., two separate operons, three separate operons, or four separate operons.

Terpene synthases

In some embodiments, a genetically modified host cell of the invention is further genetically modified to include a nucleic acid comprising a nucleotide sequence encoding a terpene synthase. In some embodiments, the terpene synthase is one that modifies FPP to generate a sesquiterpene. In other embodiments, the terpene synthase is one that modifies GPP to generate a monoterpene. In other embodiments, the terpene synthase is one that modifies GGPP to generate a diterpene.

A terpene synthase acts on a polyprenyl diphosphate substrate, modifying it by cyclization, rearrangement, or coupling to yield an isoprenoid precursor (e.g., limonene, amorphadiene, taxadiene, and the like). The isoprenoid precursor thus produced by the action of the terpene synthase on the polyprenyl diphosphate substrate is, in turn, the substrate for an isoprenoid precursor-modifying enzyme.

Nucleotide sequences encoding terpene synthases are known in the art, and any known terpene synthase-encoding nucleotide sequence can be used to genetically modify a host cell. For example, the following terpene synthase-encoding nucleotide sequences, followed by their GenBank accession numbers and the organisms in which they were identified, are known and can be used: (-)-germacrene D synthase mRNA (AY438099; Populus balsamifera subsp. trichocarpa x Populus deltoides); E,E-α-farnesene synthase mRNA (AY640154; Cucumis sativus); 1,8-cineole synthase mRNA (AY691947; Arabidopsis thaliana); terpene synthase 5 (TPS5) mRNA (AY518314; Zea mays); terpene synthase 4 (TPS4) mRNA (AY518312; Zea mays); myrcene/ocimene synthase (TPS10) (At2g24210) mRNA (NM_127982; Arabidopsis thaliana); geraniol synthase (GES) mRNA (AY362553; Ocimum basilicum); pinene synthase mRNA (AY237645; Picea sitchensis); myrcene synthase 1e20 mRNA (AY195609; Antirrhinum majus); (E)-β-ocimene synthase (0e23) mRNA (AY195607; Antirrhinum majus); E-β-ocimene synthase mRNA (AY151086; Antirrhinum majus); terpene synthase mRNA (AF497492; Arabidopsis thaliana); (-)-camphene synthase (AG6.5) mRNA (U87910; Abies grandis); (-)-4S-limonene synthase gene (e.g., genomic sequence) (AF326518; Abies grandis); δ-selinene synthase gene (AF326513; Abies grandis); amorpha-4,11-diene synthase mRNA (AJ251751; Artemisia annua); E-α-bisabolene synthase mRNA (AF006195; Abies grandis); γ-humulene synthase mRNA (U92267; Abies grandis); δ-selinene synthase mRNA (U92266; Abies grandis); pinene synthase (AG3.18) mRNA (U87909; Abies grandis); myrcene synthase (AG2.2) mRNA (U87908; Abies grandis); and the like.

Mevalonate pathway

In some embodiments, a genetically modified host cell of the invention is a host cell that does not normally synthesize isopentenyl pyrophosphate (IPP) or mevalonate via a mevalonate pathway. The mevalonate pathway comprises: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA; (b) condensing acetoacetyl-CoA with acetyl-CoA to form HMG-CoA; (c) converting HMG-CoA to mevalonate; (d) phosphorylating mevalonate to mevalonate 5-phosphate; (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate; and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate. The mevalonate pathway enzymes required for production of IPP may vary, depending on the culture conditions.
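Steps (a) through (f) above can be sketched as an ordered data structure. The following Python fragment is only an illustration: the assignment of an enzyme to each step follows the commonly described mevalonate pathway and uses the enzyme abbreviations found elsewhere in this description, while the names `Step`, `MEV_PATHWAY`, and `enzymes_from` are hypothetical helpers introduced here, not part of the disclosure.

```python
from typing import NamedTuple


class Step(NamedTuple):
    label: str      # step letter as used in the text, (a) to (f)
    substrate: str  # main substrate entering the step
    product: str    # product leaving the step
    enzyme: str     # enzyme usually assigned to the step (assumption: textbook mapping)


# Ordered pathway from acetyl-CoA to IPP.
MEV_PATHWAY = [
    Step("a", "acetyl-CoA", "acetoacetyl-CoA", "acetoacetyl-CoA thiolase"),
    Step("b", "acetoacetyl-CoA", "HMG-CoA", "HMGS"),
    Step("c", "HMG-CoA", "mevalonate", "HMGR"),
    Step("d", "mevalonate", "mevalonate 5-phosphate", "MK"),
    Step("e", "mevalonate 5-phosphate", "mevalonate 5-pyrophosphate", "PMK"),
    Step("f", "mevalonate 5-pyrophosphate", "IPP", "MPD"),
]


def enzymes_from(start: str) -> list[str]:
    """Return the enzymes needed to reach IPP when the cell is supplied with `start`."""
    for index, step in enumerate(MEV_PATHWAY):
        if step.substrate == start:
            return [s.enzyme for s in MEV_PATHWAY[index:]]
    raise ValueError(f"{start!r} is not a substrate of any listed step")


if __name__ == "__main__":
    print(enzymes_from("acetyl-CoA"))  # all six enzymes
    print(enzymes_from("mevalonate"))  # only the last three enzymes
```

Under these assumptions, a host supplied with mevalonate would need only the enzymes of steps (d) through (f), whereas a host starting from acetyl-CoA would need all six.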

As noted above, in some embodiments, a genetically modified host cell of the invention is a host cell that does not normally synthesize isopentenyl pyrophosphate (IPP) or mevalonate via a mevalonate pathway. In some embodiments, the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention that encodes an isoprenoid-modifying enzyme, and is also genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding acetoacetyl-CoA thiolase, hydroxymethylglutaryl-CoA synthase (HMGS), hydroxymethylglutaryl-CoA reductase (HMGR), mevalonate kinase (MK), phosphomevalonate kinase (PMK), and mevalonate pyrophosphate decarboxylase (MPD) (and optionally also IPP isomerase). In many of these embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR. In some embodiments, the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention that encodes an isoprenoid-modifying enzyme, and is also genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding MK, PMK, and MPD (and optionally also IPP isomerase). In many of these embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR.

In some embodiments, a genetically modified host cell of the invention is a host cell that does not normally synthesize IPP or mevalonate via a mevalonate pathway; the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention that encodes an isoprenoid-modifying enzyme; and the host cell is genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, IPP isomerase, and a prenyltransferase. In many of these embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR. In some embodiments, a genetically modified host cell of the invention is a host cell that does not normally synthesize IPP or mevalonate via a mevalonate pathway; the host cell is genetically modified with an expression vector of the invention comprising a nucleic acid of the invention that encodes an isoprenoid-modifying enzyme; and the host cell is genetically modified with one or more heterologous nucleic acids comprising nucleotide sequences encoding MK, PMK, MPD, IPP isomerase, and a prenyltransferase. In many of these embodiments, the host cell is genetically modified with an expression vector comprising a nucleotide sequence encoding a CPR.

In some embodiments, a genetically modified host cell of the invention is one that normally synthesizes IPP or mevalonate via a mevalonate pathway, e.g., the host cell comprises an endogenous mevalonate pathway. In some embodiments, the host cell is a yeast cell. In some embodiments, the host cell is Saccharomyces cerevisiae.

In some embodiments, a genetically modified host cell of the invention is also genetically modified with one or more nucleic acids comprising a nucleotide sequence encoding a dehydrogenase that further modifies an isoprenoid compound. The encoded dehydrogenase may be one that is naturally found in a prokaryotic or eukaryotic cell, or may be a variant of such a dehydrogenase. In some embodiments, the invention provides an isolated nucleic acid comprising a nucleotide sequence encoding such a dehydrogenase.

Mevalonate pathway nucleic acids

Nucleotide sequences encoding MEV pathway gene products are known in the art, and any known MEV pathway gene product-encoding nucleotide sequence can be used to generate a genetically modified host cell of the invention. For example, nucleotide sequences encoding acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, and IDI are known in the art. The following are non-limiting examples of known nucleotide sequences encoding MEV pathway gene products, with the GenBank accession number and organism for each MEV pathway enzyme given in parentheses: acetoacetyl-CoA thiolase: (NC_000913, region 2324131..2325315; Escherichia coli), (D49362; Paracoccus denitrificans), and (L20428; Saccharomyces cerevisiae); HMGS: (NC_001145, complement 19061..20536; Saccharomyces cerevisiae), (X96617; Saccharomyces cerevisiae), (X83882; Arabidopsis thaliana), (AB037907; Kitasatospora griseola), and (BT007302; Homo sapiens); HMGR: (NM_206548; Drosophila melanogaster), (NM_204485; Gallus gallus), (AB015627; Streptomyces sp. KO-3988), (AF542543; Nicotiana attenuata), (AB037907; Kitasatospora griseola), (AX128213, providing the sequence encoding a truncated HMGR; Saccharomyces cerevisiae), and (NC_001145, complement 115734..118898; Saccharomyces cerevisiae); MK: (L77688; Arabidopsis thaliana) and (X55875; Saccharomyces cerevisiae); PMK: (AF429385; Hevea brasiliensis), (NM_006556; Homo sapiens), and (NC_001145, complement 712315..713670; Saccharomyces cerevisiae); MPD: (X97557; Saccharomyces cerevisiae), (AF290095; Enterococcus faecium), and (U49260; Homo sapiens); and IDI: (NC_000913, 3031087..3031635; Escherichia coli) and (AF082326; Haematococcus pluvialis).

In some embodiments, the HMGR coding region encodes a truncated form of HMGR ("tHMGR") that lacks the transmembrane domain of wild-type HMGR. The transmembrane domain of HMGR contains the regulatory portions of the enzyme and has no catalytic activity.

The coding sequence of any known MEV pathway enzyme may be altered in various ways known in the art to generate targeted changes in the amino acid sequence of the encoded enzyme. The amino acid sequence of a variant MEV pathway enzyme will usually be substantially similar to the amino acid sequence of a known MEV pathway enzyme, i.e., it will differ by at least one amino acid, and may differ by at least two, at least 5, at least 10, or at least 20 amino acids, but typically by not more than about 50 amino acids. The sequence changes may be substitutions, insertions, or deletions. For example, as described below, the nucleotide sequence can be altered to match the codon bias of a particular host cell. In addition, one or more nucleotide sequence differences can be introduced that result in conservative amino acid changes in the encoded protein.

Prenyltransferases

In some embodiments, a genetically modified host cell of the invention is genetically modified to include a nucleic acid comprising a nucleotide sequence encoding an isoprenoid-modifying enzyme; and in some embodiments is also genetically modified to include one or more nucleic acids comprising nucleotide sequences encoding one or more mevalonate pathway enzymes, as described above, and a nucleic acid comprising a nucleotide sequence encoding a prenyltransferase.

Prenyltransferases constitute a broad group of enzymes that catalyze the consecutive condensation of IPP, resulting in the formation of prenyl diphosphates of various chain lengths. Suitable prenyltransferases include enzymes that catalyze the condensation of IPP with allylic primer substrates to form isoprenoid compounds having from about 2 isoprene units to about 6000 isoprene units or more, e.g., 2 isoprene units (geranyl pyrophosphate synthase), 3 isoprene units (farnesyl pyrophosphate synthase), 4 isoprene units (geranylgeranyl pyrophosphate synthase), 5 isoprene units, 6 isoprene units (hexadecyl pyrophosphate synthase), 7 isoprene units, 8 isoprene units (phytoene synthase, octaprenyl pyrophosphate synthase), 9 isoprene units (nonaprenyl pyrophosphate synthase), 10 isoprene units (decaprenyl pyrophosphate synthase), from about 10 to about 15 isoprene units, from about 15 to about 20 isoprene units, from about 20 to about 25 isoprene units, from about 25 to about 30 isoprene units, from about 30 to about 40 isoprene units, from about 40 to about 50 isoprene units, from about 50 to about 100 isoprene units, from about 100 to about 250 isoprene units, from about 250 to about 500 isoprene units, from about 500 to about 1000 isoprene units, from about 1000 to about 2000 isoprene units, from about 2000 to about 3000 isoprene units, from about 3000 to about 4000 isoprene units, from about 4000 to about 5000 isoprene units, or from about 5000 to about 6000 isoprene units or more.
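Because each isoprene unit contributes five carbon atoms, the chain length of a prenyl diphosphate follows directly from its unit count. The short Python sketch below illustrates that arithmetic; the function names `carbon_count` and `ipp_condensations` are hypothetical, and the sketch assumes a single C5 allylic primer (dimethylallyl diphosphate) extended one IPP at a time.

```python
CARBONS_PER_ISOPRENE_UNIT = 5  # each isoprene (C5) unit adds five carbon atoms


def carbon_count(isoprene_units: int) -> int:
    """Number of carbon atoms in a prenyl chain built from `isoprene_units` C5 units."""
    if isoprene_units < 1:
        raise ValueError("a prenyl chain has at least one isoprene unit")
    return CARBONS_PER_ISOPRENE_UNIT * isoprene_units


def ipp_condensations(isoprene_units: int) -> int:
    """IPP additions needed to reach the chain, assuming a C5 allylic primer."""
    return carbon_count(isoprene_units) // CARBONS_PER_ISOPRENE_UNIT - 1


if __name__ == "__main__":
    for name, units in [("geranyl", 2), ("farnesyl", 3), ("geranylgeranyl", 4)]:
        print(f"{name}: C{carbon_count(units)}, {ipp_condensations(units)} IPP condensation(s)")
```

On this reckoning the geranyl, farnesyl, and geranylgeranyl chains named above correspond to C10, C15, and C20 products, respectively.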

Suitable prenyltransferases include, but are not limited to, E-isoprenyl diphosphate synthases, including, but not limited to, geranyl diphosphate (GPP) synthase, farnesyl diphosphate (FPP) synthase, geranylgeranyl diphosphate (GGPP) synthase, hexaprenyl diphosphate (HexPP) synthase, heptaprenyl diphosphate (HepPP) synthase, octaprenyl diphosphate (OPP) synthase, solanesyl diphosphate (SPP) synthase, decaprenyl diphosphate (DPP) synthase, chicle synthase, and gutta-percha synthase; and Z-isoprenyl diphosphate synthases, including, but not limited to, nonaprenyl diphosphate (NPP) synthase, undecaprenyl diphosphate (UPP) synthase, dehydrodolichyl diphosphate synthase, eicosaprenyl diphosphate synthase, natural rubber synthase, and other Z-isoprenyl diphosphate synthases.

The nucleotide sequences of numerous prenyltransferases from a variety of species are known, and can be used, or modified for use, in generating a genetically modified host cell of the invention. Nucleotide sequences encoding prenyltransferases are known in the art. See, e.g., human farnesyl pyrophosphate synthetase mRNA (GenBank Accession No. J05262; Homo sapiens); farnesyl diphosphate synthetase (FPP) gene (GenBank Accession No. J05091; Saccharomyces cerevisiae); isopentenyl diphosphate:dimethylallyl diphosphate isomerase gene (J05090; Saccharomyces cerevisiae); Wang and Ohnuma (2000) Biochim. Biophys. Acta 1529:33-48; U.S. Patent No. 6,645,747; Arabidopsis thaliana farnesyl pyrophosphate synthetase 2 (FPS2)/FPP synthetase 2/farnesyl diphosphate synthase 2 (At4g17190) mRNA (GenBank Accession No. NM_202836); Ginkgo biloba geranylgeranyl diphosphate synthase (ggpps) mRNA (GenBank Accession No. AY371321); Arabidopsis thaliana geranylgeranyl pyrophosphate synthase (GGPS1)/GGPP synthetase/farnesyltransferase (At4g36810) mRNA (GenBank Accession No. NM_119845); Synechococcus elongatus gene for farnesyl, geranylgeranyl, geranylfarnesyl, hexaprenyl, heptaprenyl diphosphate synthase (SelF-HepPS) (GenBank Accession No. AB016095); and the like.

Codon usage

In some embodiments, a nucleotide sequence used to generate a genetically modified host cell of the invention is modified such that the nucleotide sequence reflects the codon preference of the particular host cell. For example, in some embodiments the nucleotide sequence is modified for yeast codon preference. See, e.g., Bennetzen and Hall (1982) J. Biol. Chem. 257(6):3026-3031. As another non-limiting example, in other embodiments the nucleotide sequence is modified for E. coli codon preference. See, e.g., Gouy and Gautier (1982) Nucleic Acids Res. 10(22):7055-7074; Eyre-Walker (1996) Mol. Biol. Evol. 13(6):864-872. See also Nakamura et al. (2000) Nucleic Acids Res. 28(1):292.
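Modifying a sequence for host codon preference amounts to replacing codons with synonymous ones while leaving the encoded protein unchanged. The Python sketch below illustrates this under stated assumptions: it uses the standard genetic code, and the `PREFERRED` table holds placeholder choices made up for the example rather than values taken from the cited references. The names `recode` and `translate` are hypothetical helpers.

```python
# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)

# Placeholder preferences (assumption, for illustration only): amino acid -> codon.
PREFERRED = {"L": "CTG", "K": "AAA", "R": "CGT"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)


if __name__ == "__main__":
    original = "ATGTTAAAGAGATAA"  # encodes M-L-K-R-stop
    recoded = recode(original, PREFERRED)
    print(recoded)
    print(translate(original) == translate(recoded))  # protein is unchanged
```

A real design would draw the preference table from codon usage data for the chosen host; the sketch only shows that the recoded sequence still encodes the same polypeptide.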

Other genetic modifications

In some embodiments, a genetically modified host cell of the invention is one that is genetically modified to include one or more nucleic acids comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme (e.g., a modified isoprenoid-modifying enzyme); and that is further genetically modified to achieve enhanced heme production and/or enhanced production of a terpene biosynthetic pathway intermediate, and/or that is further genetically modified such that an endogenous terpene biosynthetic pathway gene is functionally disabled. The term "functionally disabled," as used herein in reference to an endogenous terpene biosynthetic pathway gene, refers to a genetic modification of a terpene biosynthetic pathway gene that results in production of a gene product encoded by the gene at below-normal levels, and/or in a gene product that is non-functional.

Increased heme production

In some embodiments, a genetically modified host cell of the invention comprises one or more additional genetic modifications that increase heme production, e.g., that increase heme production by at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, or at least about 25-fold, or more, compared with a host cell that does not comprise the one or more additional genetic modifications.
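The thresholds above mix percentage increases with fold increases relative to an unmodified control. As a small illustration of how the two scales relate, the following Python sketch converts a pair of measurements into both forms; the function names and the sample values are hypothetical and do not represent data from this disclosure.

```python
def fold_change(modified: float, control: float) -> float:
    """Ratio of the modified-cell level to the control-cell level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return modified / control


def percent_increase(modified: float, control: float) -> float:
    """Increase over the control, expressed as a percentage of the control."""
    return (fold_change(modified, control) - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up example levels in arbitrary units.
    print(fold_change(5.0, 2.0))       # ratio of modified to control
    print(percent_increase(3.0, 2.0))  # increase as a percentage of control
```

On this convention a 2-fold level corresponds to a 100% increase over the control, so the percentage and fold thresholds listed above form one continuous scale.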

The rate-limiting step in heme production in a cell is the biosynthesis of aminolevulinic acid (ALA). As shown in Figure 13, there are two distinct ALA biosynthetic pathways: the C₄ pathway and the C₅ pathway. In some embodiments, a genetically modified host cell of the invention is further genetically modified to overexpress glutamyl-tRNA reductase (GTR reductase). In some embodiments, a genetically modified host cell of the invention is further genetically modified such that the level of GTR reductase activity is at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, or at least about 25-fold, or more, higher than the level of GTR reductase activity produced by a control host cell.

The level of GTR reductase activity in a cell can be increased in a number of ways, including, but not limited to: 1) increasing the strength of the promoter to which the GTR reductase coding region is operably linked; 2) increasing the copy number of the plasmid comprising a nucleotide sequence encoding GTR reductase; 3) increasing the stability of a GTR reductase mRNA (where a "GTR reductase mRNA" is an mRNA comprising a nucleotide sequence encoding GTR reductase); 4) modifying the codon usage of GTR reductase such that the level of translation of the GTR reductase mRNA is increased; 5) increasing the enzyme stability of GTR reductase; 6) increasing the specific activity (units of activity per unit of protein) of GTR reductase; and 7) reducing negative feedback regulation of GTR reductase.

In some embodiments, the genetic modification that results in an increased level of GTR reductase is one that reduces negative feedback regulation of GTR reductase. In some embodiments, negative feedback regulation of GTR reductase is reduced by inserting a positively charged KK sequence at or near the N-terminus.

In some embodiments, a genetically modified host cell of the invention is further genetically modified to overexpress ALA synthase. In some embodiments, a genetically modified host cell of the invention is further genetically modified such that the level of ALA synthase activity is at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 15-fold, at least about 20-fold, or at least about 25-fold, or more, higher than the level of ALA synthase activity produced in a control host cell.

The level of ALA synthase activity in a cell can be increased in a number of ways, including, but not limited to: 1) increasing the strength of the promoter to which the ALA synthase coding region is operably linked; 2) increasing the copy number of the plasmid comprising a nucleotide sequence encoding ALA synthase; 3) increasing the stability of an ALA synthase mRNA (where an "ALA synthase mRNA" is an mRNA comprising a nucleotide sequence encoding ALA synthase); 4) modifying the codon usage of ALA synthase such that the level of translation of the ALA synthase mRNA is increased; 5) increasing the enzyme stability of ALA synthase; and 6) increasing the specific activity (units of activity per unit of protein) of ALA synthase.

Increased production of an endogenous terpene biosynthetic pathway intermediate

Genetic modifications that increase production of an endogenous terpene biosynthetic pathway intermediate include, but are not limited to, genetic modifications that result in a reduced level and/or activity of a phosphotransacetylase in the host cell. The intracellular concentration of a terpene biosynthetic pathway intermediate is increased by increasing the intracellular concentration of acetyl-CoA. E. coli secretes a large amount of its intracellular acetyl-CoA into the medium in the form of acetate. Deleting the gene encoding phosphotransacetylase (pta, the first enzyme responsible for converting acetyl-CoA into acetate) reduces acetate secretion. Genetic modifications that reduce the level and/or activity of phosphotransacetylase in a prokaryotic host cell are particularly useful where the genetically modified host cell is one that is genetically modified with a nucleic acid comprising nucleotide sequences encoding one or more MEV pathway gene products.

In some embodiments, the genetic modification that results in a reduced level of phosphotransacetylase in a prokaryotic host cell is a genetic mutation that functionally disables the prokaryotic host cell's endogenous pta gene encoding the phosphotransacetylase. The pta gene can be functionally disabled in any of a variety of ways, including: insertion of a mobile genetic element (e.g., a transposon); deletion of all or part of the gene, such that the gene product is not made, or is truncated and non-functional in converting acetyl-CoA to acetate; mutation of the gene, such that the gene product is not made, or is truncated and non-functional in converting acetyl-CoA to acetate; deletion or mutation of one or more control elements that control expression of the pta gene, such that the gene product is not made; and the like.

In some embodiments, the endogenous pta gene of a genetically modified host cell is deleted. Any method for deleting a gene can be used. One non-limiting example of a method for deleting the pta gene is use of the λRed recombination system. Datsenko and Wanner (2000) Proc Natl Acad Sci USA 97(12):6640-5. In some embodiments, the pta gene is deleted in a host cell (e.g., E. coli) that is genetically modified with a nucleic acid comprising nucleotide sequences encoding MK, PMK, MPD, and IDI. In some embodiments, the pta gene is deleted in a host cell (e.g., E. coli) that is genetically modified with a nucleic acid comprising nucleotide sequences encoding MK, PMK, MPD, and IPP. In some embodiments, the pta gene is deleted in a host cell (e.g., E. coli) that is genetically modified with a nucleic acid comprising nucleotide sequences encoding MK, PMK, MPD, IPP, and a prenyltransferase.

Functionally disabled DXP pathway

In some embodiments, a genetically modified host cell of the invention is one that is genetically modified to include one or more nucleic acids comprising nucleotide sequences encoding MEV biosynthetic pathway gene products, and that is further genetically modified such that an endogenous DXP biosynthetic pathway gene is functionally disabled. In other embodiments, a genetically modified host cell of the invention is one that is genetically modified to include one or more nucleic acids comprising nucleotide sequences encoding DXP biosynthetic pathway gene products, and that is further genetically modified such that an endogenous MEV biosynthetic pathway gene is functionally disabled.

In some embodiments, where a genetically modified host cell of the invention is a prokaryotic host cell that is genetically modified with a nucleic acid comprising nucleotide sequences encoding one or more MEV pathway gene products, the host cell is further genetically modified such that one or more endogenous DXP pathway genes are functionally disabled. DXP pathway genes that can be functionally disabled include one or more of the genes encoding any of the following DXP gene products: 1-deoxy-D-xylulose-5-phosphate synthase, 1-deoxy-D-xylulose-5-phosphate reductoisomerase, 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase, 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase, 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, and 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase.

An endogenous DXP pathway gene can be functionally disabled in any of a variety of ways, including: insertion of a mobile genetic element (e.g., a transposon); deletion of all or part of the gene, such that the gene product is not made, or is truncated and enzymatically inactive; mutation of the gene, such that the gene product is not made, or is truncated and enzymatically non-functional; deletion or mutation of one or more control elements that control expression of the gene, such that the gene product is not made; and the like.

In other embodiments, where a genetically modified host cell of the invention is a prokaryotic host cell that is genetically modified with a nucleic acid comprising nucleotide sequences encoding one or more DXP pathway gene products, the host cell is further genetically modified such that one or more endogenous MEV pathway genes are functionally disabled. Endogenous MEV pathway genes that can be functionally disabled include one or more of the genes encoding any of the following MEV gene products: HMGS, HMGR, MK, PMK, MPD, and IDI. An endogenous MEV pathway gene can be functionally disabled in any of a variety of ways, including: insertion of a mobile genetic element (e.g., a transposon); deletion of all or part of the gene, such that the gene product is not made, or is truncated and enzymatically inactive; mutation of the gene, such that the gene product is not made, or is truncated and enzymatically non-functional; deletion or mutation of one or more control elements that control expression of the gene, such that the gene product is not made; and the like.

Compositions comprising a genetically modified host cell of the invention

The invention further provides compositions comprising a genetically modified host cell of the invention. A composition of the invention comprises a genetically modified host cell of the invention and, in some embodiments, one or more further components, which are selected based in part on the intended use of the genetically modified host cell. Suitable components include, but are not limited to: salts; buffers; stabilizers; protease inhibitors; nuclease inhibitors; cell membrane- and/or cell wall-preserving compounds, e.g., glycerol, dimethylsulfoxide, etc.; nutritional media appropriate to the cell; and the like. In some embodiments, the cells are lyophilized.

Transgenic plants

In some embodiments, a nucleic acid of the invention or an expression vector of the invention (e.g., a modified cytochrome P450 enzyme nucleic acid of the invention, or an expression vector of the invention comprising a modified cytochrome P450 enzyme nucleic acid) is used as a transgene to generate a transgenic plant that produces the encoded modified cytochrome P450 enzyme. Thus, the invention further provides a transgenic plant (or plant part, seed, tissue, etc.) comprising a transgene that comprises a nucleic acid of the invention, the nucleic acid comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme, as described above. In some embodiments, the genome of the transgenic plant comprises a nucleic acid of the invention. In some embodiments, the transgenic plant is homozygous for the genetic modification. In some embodiments, the transgenic plant is heterozygous for the genetic modification.

In some embodiments, a transgenic plant of the invention produces the transgene-encoded modified cytochrome P450 enzyme, and the product of the modified cytochrome P450 enzyme, in an amount that is at least about 50%, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 25-fold, at least about 50-fold, or at least about 100-fold or more, higher than the amount of the product produced by a control plant of the same species, e.g., a non-transgenic plant (a plant that does not include the transgene encoding the polypeptide).

In some embodiments, a transgenic plant of the invention is a transgenic version of a non-transgenic control plant that normally produces an isoprenoid compound that is generated by the transgene-encoded modified isoprenoid precursor-modifying enzyme, or that is a downstream product of the transgene-encoded polypeptide; the transgenic plant produces the isoprenoid compound in an amount that is at least about 50%, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 25-fold, at least about 50-fold, or at least about 100-fold or more, higher than the amount of the isoprenoid compound produced by a control plant of the same species, e.g., a non-transgenic plant (a plant that does not include the transgene encoding the polypeptide).

Methods of introducing exogenous nucleic acids into plant cells are well known in the art. Such plant cells are considered "transformed," as defined above. Suitable methods include viral infection (e.g., with double-stranded DNA viruses), transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, silicon carbide whisker technology, Agrobacterium-mediated transformation, and the like. The choice of method generally depends on the type of cell being transformed and the circumstances under which the transformation takes place (i.e., in vitro, ex vivo, or in vivo).

Transformation methods based on the soil bacterium Agrobacterium tumefaciens are particularly useful for introducing an exogenous nucleic acid molecule into a vascular plant. The wild-type form of Agrobacterium contains a Ti (tumor-inducing) plasmid that directs production of tumorigenic crown gall growth on host plants. Transfer of the tumor-inducing T-DNA region of the Ti plasmid to a plant genome requires the Ti plasmid-encoded virulence genes as well as the T-DNA borders, which are a set of direct DNA repeats that delineate the region to be transferred. An Agrobacterium-based vector is a modified form of a Ti plasmid in which the tumor-inducing functions are replaced by the nucleic acid sequence of interest to be introduced into the plant host.

Agrobacterium-mediated transformation generally employs cointegrate vectors or, preferably, binary vector systems, in which the components of the Ti plasmid are divided between a helper vector, which resides permanently in the Agrobacterium host and carries the virulence genes, and a shuttle vector, which contains the gene of interest bounded by T-DNA sequences. A variety of binary vectors are well known in the art and are commercially available, for example, from Clontech (Palo Alto, Calif.). Methods of coculturing Agrobacterium with cultured plant cells or wounded tissue such as leaf tissue, root explants, hypocotyls, stem pieces, or tubers, for example, are also well known in the art. See, e.g., Glick and Thompson (eds.), Methods in Plant Molecular Biology and Biotechnology, Boca Raton, Fla.: CRC Press (1993).

Agrobacterium-mediated transformation is useful for producing a variety of transgenic vascular plants (Wang et al., supra, 1995), including at least one species of Eucalyptus and forage legumes such as alfalfa (lucerne), birdsfoot trefoil, white clover, Stylosanthes, Lotononis bainessii, and sainfoin.

Microprojectile-mediated transformation can also be used to produce a transgenic plant of the invention. This method, first described by Klein et al. (Nature 327:70-73 (1987)), relies on microprojectiles such as gold or tungsten that are coated with the desired nucleic acid molecule by precipitation with calcium chloride, spermidine, or polyethylene glycol. The microprojectile particles are accelerated at high speed into angiosperm tissue using a device such as the BIOLISTIC PD-1000 (Biorad; Hercules, Calif.).

A nucleic acid of the invention is introduced into a plant in a manner such that the nucleic acid is able to enter a plant cell, e.g., via an in vivo or ex vivo protocol. "In vivo" means that the nucleic acid is administered to a living plant, e.g., by infiltration. "Ex vivo" means that cells or explants are modified outside of the plant, and then such cells or organs are regenerated into a plant. A number of vectors suitable for stable transformation of plant cells or for the establishment of transgenic plants have been described, including those described in Weissbach and Weissbach (1989) Methods for Plant Molecular Biology, Academic Press; and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer Academic Publishers. Specific examples include vectors derived from a Ti plasmid of Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. (1983) Nature 303:209; Bevan (1984) Nucl Acid Res. 12:8711-8721; and Klee (1985) Bio/Technology 3:637-642. Alternatively, non-Ti vectors can be used to transfer DNA into plants and cells by free DNA delivery techniques. Using these methods, transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9:957-962), and corn (Gordon-Kamm (1990) Plant Cell 2:603-618) can be produced. An immature embryo can also be a good target tissue in monocots for direct DNA delivery techniques using a particle gun (Weeks et al. (1993) Plant Physiol 102:1077-1084; Vasil (1993) Bio/Technology 10:667-674; Wan and Lemeaux (1994) Plant Physiol 104:37-48); for Agrobacterium-mediated DNA transfer, see Ishida et al. (1996) Nature Biotech 14:745-750. Exemplary methods for introducing DNA into chloroplasts are biolistic bombardment, polyethylene glycol transformation of protoplasts, and microinjection (Danieli et al., Nat. Biotechnol 16:345-348, 1998; Staub et al., Nat. Biotechnol 18:333-338, 2000; O'Neill et al., Plant J. 3:729-738, 1993; Knoblauch et al., Nat. Biotechnol 17:906-909; U.S. Pat. Nos. 5,451,513, 5,545,817, 5,545,818, and 5,576,198; International Application No. WO 95/16783; Boynton et al., Methods in Enzymology 217:510-536 (1993); Svab et al., Proc. Natl. Acad. Sci. USA 90:913-917 (1993); and McBride et al., Proc. Natl. Acad. Sci. USA 91:7301-7305 (1994)). Any vector suitable for biolistic bombardment, polyethylene glycol transformation of protoplasts, or microinjection can be used as a targeting vector for chloroplast transformation. Any double-stranded DNA vector can be used as a transformation vector, especially when the method of introduction does not use Agrobacterium.

Plants that can be genetically modified include grains, forage crops, fruits, vegetables, oilseed crops, palms, forestry plants, and vines. Specific examples of plants that can be modified are: corn, banana, peanut, field pea, sunflower, tomato, canola, tobacco, wheat, barley, oats, potato, soybean, cotton, carnation, sorghum, lupin, and rice. Other examples include Artemisia annua, or other plants known to produce isoprenoid compounds of interest.

The invention also provides transformed plant cells, and tissues, seeds, plants, and products containing the transformed plant cells. A feature of the transformed cells of the invention, and of the tissues and products that include them, is the integration of a nucleic acid of the invention into the genome and the production by the plant cells of a modified cytochrome P450 enzyme. Recombinant plant cells of the invention are useful as populations of recombinant cells, or as a tissue, seed, whole plant, stem, fruit, leaf, root, flower, tuber, grain, animal feed, a field of plants, and the like.

The invention also provides reproductive material of a transgenic plant of the invention, where the reproductive material includes seeds, progeny plants, and clonal material.

Methods of producing a biosynthetic pathway product

The invention provides a method of producing a biosynthetic pathway product. The method generally involves culturing a genetically modified host cell of the invention in a suitable medium. The genetically modified host cell of the invention is a cell that has been genetically modified with a nucleic acid comprising a nucleotide sequence encoding a modified cytochrome P450 enzyme, so as to produce the modified cytochrome P450 enzyme, where the modified cytochrome P450 enzyme is operably linked to a domain selected from a transmembrane domain, a secretion domain, a solubilization domain, and a membrane-inserting protein. In the presence of a biosynthetic pathway intermediate, production of the modified cytochrome P450 enzyme results in enzymatic modification of the intermediate and production of the biosynthetic pathway product. In other embodiments, the method generally involves maintaining a transgenic plant of the invention under conditions that favor production of the encoded modified cytochrome P450 enzyme. Production of the modified cytochrome P450 enzyme results in production of the biosynthetic pathway product. Typically, the method is carried out in vitro (e.g., in a living cell cultured in vitro), although in vivo production of the biosynthetic pathway product is also contemplated. In some embodiments, the host cell is a eukaryotic cell, e.g., a yeast cell. In other embodiments, the host cell is a prokaryotic cell. In some embodiments, the host cell is a plant cell. In some embodiments, the method is carried out in a transgenic plant of the invention.

A genetically modified host cell of the invention provides increased production of a biosynthetic pathway product compared to a control parent host cell. Thus, for example, production of the biosynthetic pathway product by the genetically modified host cell is increased by at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold or more, compared to the level of the product produced in the control parent host cell. The control parent host cell does not comprise the genetic modification present in the genetically modified host cell.
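As an illustration only, and not part of the disclosed method, the comparison against such thresholds can be sketched in a few lines of Python. The sketch assumes that an "X-fold" threshold refers to the ratio of the product level in the modified cell to that in the control, and that a percentage threshold refers to the increase relative to the control; both readings, the function names, and the example values are assumptions made for this example.

```python
def fold_change(modified: float, control: float) -> float:
    """Ratio of the product level in the modified cell to that in the control."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return modified / control


def percent_increase(modified: float, control: float) -> float:
    """Increase over the control, expressed as a percentage of the control level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return (modified - control) / control * 100.0


def meets_threshold(modified, control, *, percent=None, fold=None) -> bool:
    """Check one threshold, given either as a percentage or as a fold value."""
    if (percent is None) == (fold is None):
        raise ValueError("give exactly one of 'percent' or 'fold'")
    if percent is not None:
        return percent_increase(modified, control) >= percent
    return fold_change(modified, control) >= fold


if __name__ == "__main__":
    # Hypothetical product levels in arbitrary but identical units.
    control_level, modified_level = 10.0, 15.0
    print(percent_increase(modified_level, control_level))          # 50.0
    print(meets_threshold(modified_level, control_level, fold=2))   # False
```

Both levels must be measured in the same units and under the same culture conditions for the comparison to be meaningful.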

In some embodiments, a genetically modified host cell of the invention provides increased production of a biosynthetic pathway product compared to a control host cell. Thus, for example, production of the biosynthetic pathway product in the genetically modified host cell is increased by at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold or more, compared to the level of the product produced in the control host cell. In some embodiments, the control host cell does not comprise the genetic modification present in the genetically modified host cell; for example, the control host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid (e.g., a cytochrome P450 enzyme-encoding nucleic acid) operably linked to one or more of a native transmembrane domain, a native secretion domain, a native solubilization domain, and a native membrane-inserting polypeptide, whereas the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to one or more of a non-native (e.g., heterologous) transmembrane domain, a non-native secretion domain, a non-native solubilization domain, and a non-native membrane insertion domain. As one example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a non-native isoprenoid-modifying enzyme-encoding nucleic acid, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a heterologous secretion signal domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a heterologous solubilization domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a heterologous membrane insertion domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a variant transmembrane domain (e.g., a truncated native transmembrane domain, or a transmembrane domain whose amino acid sequence is altered relative to that of the native transmembrane domain), a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain.

The invention provides a method of producing an isoprenoid compound. The method generally involves culturing a genetically modified host cell of the invention in a suitable medium, where the genetically modified host cell is a cell that has been genetically modified with a nucleic acid comprising a nucleotide sequence encoding an isoprenoid precursor-modifying enzyme, so as to produce the isoprenoid precursor-modifying enzyme, the enzyme being operably linked to a domain selected from a transmembrane domain, a secretion domain, a solubilization domain, and a membrane-inserting protein. In the presence of an isoprenoid precursor compound, production of the isoprenoid precursor-modifying enzyme results in enzymatic modification of the isoprenoid precursor and production of the isoprenoid compound. In other embodiments, the method generally involves maintaining a transgenic plant of the invention under conditions that favor production of the encoded isoprenoid precursor-modifying enzyme. Production of the isoprenoid precursor-modifying enzyme results in production of the isoprenoid compound. For example, in some embodiments, the method generally involves culturing a genetically modified host cell in a suitable medium, where the host cell is genetically modified with a nucleic acid of the invention comprising a nucleotide sequence encoding a terpene-modifying enzyme, e.g., a terpene oxidase, a terpene hydroxylase, etc. Production of the terpene oxidase results in production of the isoprenoid compound. Typically, the method is carried out in vitro (e.g., in a living cell cultured in vitro), although in vivo production of the isoprenoid compound is also contemplated. In some embodiments, the host cell is a eukaryotic cell, e.g., a yeast cell. In other embodiments, the host cell is a prokaryotic cell. In some embodiments, the host cell is a plant cell. In some embodiments, the method is carried out in a transgenic plant of the invention.

A genetically modified host cell of the invention provides increased production of an isoprenoid compound compared to a control parent host cell. Thus, for example, production of the isoprenoid or isoprenoid precursor by the genetically modified host cell is increased by at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold or more, compared to the level of the product produced in the control parent host cell. The control parent host cell does not comprise the genetic modification present in the genetically modified host cell.

In some embodiments, a genetically modified host cell of the invention provides increased production of an isoprenoid compound compared to a control host cell. Thus, e.g., production of an isoprenoid or isoprenoid precursor in the genetically modified host cell is increased by at least about 10%, at least about 20%, at least about 50%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold, or more, compared to the level of product produced in the control host cell. In some embodiments, the control host cell does not comprise the genetic modification present in the genetically modified host cell. For example, in the control host cell the nucleic acid encoding the isoprenoid-modifying enzyme (e.g., a cytochrome P450 enzyme) is operably linked to one or more of a native transmembrane domain, a native secretion domain, a native solubilization domain, and a native membrane-inserting polypeptide, whereas the genetically modified host cell comprises a nucleic acid encoding the isoprenoid-modifying enzyme operably linked to one or more of a non-native (e.g., heterologous) transmembrane domain, a non-native secretion domain, a non-native solubilization domain, and a non-native membrane insertion domain. As one example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a non-native transmembrane domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a heterologous secretion signal domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a heterologous solubilization domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a heterologous membrane insertion domain, a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain. As another example, where the genetically modified host cell comprises an isoprenoid-modifying enzyme-encoding nucleic acid operably linked to a variant transmembrane domain (e.g., a truncated native transmembrane domain, or a transmembrane domain whose amino acid sequence is altered relative to that of the native transmembrane domain), a suitable control host cell comprises the isoprenoid-modifying enzyme-encoding nucleic acid operably linked to the native transmembrane domain.

Thus, in some embodiments, a genetically modified host cell of the invention produces, on a per-cell basis, a level of an isoprenoid compound that is at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 2-fold, at least about 2.5-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 30-fold, at least about 40-fold, at least about 50-fold, at least about 75-fold, at least about 100-fold, at least about 200-fold, at least about 300-fold, at least about 400-fold, or at least about 500-fold, or more, higher than the level produced by a control host cell that does not comprise the one or more genetic modifications present in the genetically modified host cell. Growth of genetically modified host cells is readily determined using well-known methods, e.g., the optical density (OD) of a liquid bacterial culture at 600 nm (OD₆₀₀), colony size, growth rate, and the like.
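The per-cell comparison described above can be illustrated with a short calculation. The following is only a minimal sketch: it assumes that OD₆₀₀ is used as a proxy for cell number, and the function names and all numeric values are hypothetical and do not come from the experiments reported herein.

```python
def per_cell_level(titer_mg_per_l: float, od600: float) -> float:
    """Normalize a product titer by culture density (OD600 as a proxy for cell number)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return titer_mg_per_l / od600


def relative_increase(modified: float, control: float) -> tuple[float, float]:
    """Return (percent increase, fold ratio) of the modified level over the control level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    fold = modified / control
    return (fold - 1.0) * 100.0, fold


# Hypothetical values, for illustration only.
control = per_cell_level(titer_mg_per_l=2.0, od600=4.0)    # 0.5 mg/L per OD unit
modified = per_cell_level(titer_mg_per_l=15.0, od600=3.0)  # 5.0 mg/L per OD unit
percent, fold = relative_increase(modified, control)
print(f"{percent:.0f}% increase, {fold:.1f}-fold")
```

Here the fold value is the ratio of the modified level to the control level, so the same comparison can be stated either as a percentage or as a fold ratio.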

In some embodiments, a genetically modified host cell of the invention produces an isoprenoid compound in a recoverable amount of at least about 1 mg/L, at least about 5 mg/L, at least about 10 mg/L, at least about 15 mg/L, at least about 20 mg/L, at least about 25 mg/L, at least about 30 mg/L, at least about 35 mg/L, at least about 40 mg/L, at least about 50 mg/L, at least about 75 mg/L, at least about 100 mg/L, at least about 125 mg/L, at least about 150 mg/L, at least about 200 mg/L, at least about 300 mg/L, at least about 500 mg/L, at least about 1000 mg/L, or at least about 5000 mg/L.

In some embodiments, a genetically modified host cell of the invention produces an isoprenoid compound in a recoverable amount of from about 1 mg/L to about 5000 mg/L, e.g., about 1-2 mg/L, about 2-5 mg/L, about 5-10 mg/L, about 10-15 mg/L, about 15-20 mg/L, about 20-25 mg/L, about 25-50 mg/L, about 50-75 mg/L, about 75-100 mg/L, about 100-150 mg/L, about 150-200 mg/L, about 200-250 mg/L, about 250-300 mg/L, about 300-350 mg/L, about 350-400 mg/L, about 400-450 mg/L, about 450-500 mg/L, about 500-1000 mg/L, about 1000-2000 mg/L, about 2000-3000 mg/L, about 3000-4000 mg/L, or about 4000-5000 mg/L. The isoprenoid produced can be recovered from the culture medium or from the host cells, e.g., from the culture medium, from a cell lysate, or from a fraction of a cell lysate. The recovery method may depend on various factors, e.g., the properties of the particular isoprenoid produced.

Figures 14 and 15 schematically depict the biosynthesis of exemplary isoprenoid products. A terpene synthase catalyzes the conversion of a linear polyprenyl diphosphate; the conversion product is a substrate for an isoprenoid precursor-modifying enzyme (e.g., a P450 enzyme). The carbon skeleton of the precursor is then specifically functionalized in a reaction catalyzed by the P450 and its redox partner, CPR.

In some embodiments, the genetically modified host cell is further genetically modified with a nucleic acid comprising a nucleotide sequence encoding a terpene synthase, which may be a heterologous terpene synthase (e.g., a terpene synthase not normally produced by the host cell). Thus, e.g., in some embodiments, the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding a terpene synthase and an isoprenoid-modifying enzyme (e.g., a sesquiterpene oxidase). Culturing such a host cell in a suitable medium results in production of the terpene synthase and the isoprenoid-modifying enzyme (e.g., a sesquiterpene oxidase). For example, the terpene synthase modifies farnesyl pyrophosphate to generate a sesquiterpene substrate for the sesquiterpene oxidase.

In some embodiments, the host cell is further genetically modified with a nucleic acid comprising a nucleotide sequence encoding a cytochrome P450 reductase (CPR). The nucleotide sequences of a variety of CPRs are known, and any known CPR-encoding nucleic acid can be used, provided the encoded CPR has the activity of transferring electrons from NADPH. In some embodiments, the CPR-encoding nucleic acid encodes a CPR that transfers electrons from NADPH to an isoprenoid-modifying enzyme, e.g., a sesquiterpene oxidase, encoded by an isoprenoid-modifying enzyme-encoding nucleic acid of the invention.

In some embodiments, the host cell is further genetically modified to produce a prenyltransferase and/or one or more enzymes of a biosynthetic pathway that produces isopentenyl pyrophosphate. Cells typically use one of two pathways to produce isoprenoids or isoprenoid precursors (e.g., IPP, polyprenyl diphosphates, etc.). Figures 16-18 illustrate the pathways that cells use to produce isoprenoid compounds or precursors such as polyprenyl diphosphates.

Figure 16 depicts isoprenoid pathways involving modification of isopentenyl diphosphate (IPP) and/or its isomer dimethylallyl diphosphate (DMAPP) by prenyltransferases to generate the polyprenyl diphosphates geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP). GPP and FPP are further modified by terpene synthases to produce monoterpenes and sesquiterpenes, respectively; GGPP is further modified by terpene synthases to form diterpenes and carotenoids. IPP and DMAPP are produced by one of two pathways: the mevalonate (MEV) pathway and the 1-deoxy-D-xylulose-5-phosphate (DXP) pathway.
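The relationship between the C5 building blocks and the polyprenyl diphosphates named above can be summarized in a small lookup table. This is merely an illustrative sketch: the number of C5 units per intermediate reflects standard isoprenoid chemistry, and the names `C5_UNIT`, `PRENYL_DIPHOSPHATES`, and `carbon_count` are hypothetical names introduced only for this illustration.

```python
C5_UNIT = 5  # carbon atoms in one IPP or DMAPP unit

# For each polyprenyl diphosphate: the number of C5 units it contains and
# the terpene classes that terpene synthases produce from it.
PRENYL_DIPHOSPHATES = {
    "GPP": (2, ["monoterpenes"]),
    "FPP": (3, ["sesquiterpenes"]),
    "GGPP": (4, ["diterpenes", "carotenoids"]),
}


def carbon_count(name: str) -> int:
    """Carbon atoms in the prenyl chain of the named polyprenyl diphosphate."""
    units, _products = PRENYL_DIPHOSPHATES[name]
    return units * C5_UNIT


for name, (_units, products) in PRENYL_DIPHOSPHATES.items():
    print(f"{name}: C{carbon_count(name)} -> {', '.join(products)}")
```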

Figure 17 schematically depicts the MEV pathway, in which acetyl-CoA is converted to IPP through a series of reactions.

Figure 18 schematically depicts the DXP pathway, in which pyruvate and D-glyceraldehyde-3-phosphate are converted to IPP and DMAPP through a series of reactions. Eukaryotic cells other than plant cells use the MEV isoprenoid pathway exclusively to convert acetyl coenzyme A (acetyl-CoA) to IPP, which is subsequently isomerized to DMAPP. Plants use both the MEV pathway and the mevalonate-independent, or DXP, pathway for isoprenoid synthesis. Prokaryotes, with some exceptions, use the DXP pathway to produce IPP and DMAPP separately through a branch point.

Depending on the medium in which the host cell is cultured, and on whether the host cell synthesizes IPP via the DXP pathway or via the mevalonate pathway, the host cell will in some embodiments comprise further genetic modifications. For example, in some embodiments, the host cell is one that does not have an endogenous mevalonate pathway, e.g., a host cell that does not normally synthesize IPP or mevalonate via a mevalonate pathway. For example, in some embodiments, the host cell is one that does not normally synthesize IPP via a mevalonate pathway, and the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding two or more enzymes of the mevalonate pathway, an IPP isomerase, a prenyltransferase, a terpene synthase, and an isoprenoid-modifying enzyme (e.g., an isoprenoid-modifying enzyme encoded by a nucleic acid of the invention). Culturing such a host cell results in production of the mevalonate pathway enzymes, the IPP isomerase, the prenyltransferase, the terpene synthase, and the isoprenoid-modifying enzyme (e.g., a sesquiterpene oxidase). Production of these enzymes in turn results in production of an isoprenoid compound. In many embodiments, the prenyltransferase is an FPP synthase, which generates a sesquiterpene substrate for a sesquiterpene oxidase encoded by a nucleic acid of the invention; production of the sesquiterpene oxidase results in oxidation of the sesquiterpene substrate in the host cell. Any nucleic acid encoding a mevalonate pathway enzyme, an IPP isomerase, a prenyltransferase, or a terpene synthase is suitable for use. For suitable nucleic acids see, e.g., Martin et al. (2003) supra.

In some of the above embodiments, where the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding two or more mevalonate pathway enzymes, the two or more mevalonate pathway enzymes include MK, PMK, and MPD, and the host cell is cultured in a medium that contains mevalonate. In other embodiments, the two or more mevalonate pathway enzymes include acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, and MPD.

In some embodiments, the host cell is one that does not normally synthesize IPP via a mevalonate pathway, the host cell is genetically modified as described above, and the host cell further comprises a functionally disabled DXP pathway.

The methods of the invention are useful for producing a variety of isoprenoid compounds, including, but not limited to, artemisinic acid (e.g., where the sesquiterpene substrate is amorpha-4,11-diene), isolongifolene alcohol (e.g., where the substrate is isolongifolene), (E)-trans-bergamota-2,12-dien-14-ol (e.g., where the substrate is (-)-α-trans-bergamotene), (-)-elema-1,3,11(13)-trien-12-ol (e.g., where the substrate is (-)-β-elemene), germacra-1(10),4,11(13)-trien-12-ol (e.g., where the substrate is (+)-germacrene A), germacrene B alcohol (e.g., where the substrate is germacrene B), 5,11(13)-guaiadien-12-ol (e.g., where the substrate is (+)-γ-gurjunene), ledene alcohol (e.g., where the substrate is (+)-ledene), 4β-H-eudesm-11(13)-ene-4,12-diol (e.g., where the substrate is neointermedeol), (+)-β-costol (e.g., where the substrate is (+)-β-selinene), and the like; and derivatives of any of the foregoing.

In many embodiments, a genetically modified host cell of the invention is cultured in vitro in a suitable medium at a suitable temperature. The culture temperature is generally from about 18°C to about 40°C, e.g., about 18°C-20°C, about 20°C-25°C, about 25°C-30°C, about 30°C-35°C, or about 35°C-40°C (e.g., about 37°C).

In some embodiments, a genetically modified host cell of the invention is cultured in a suitable medium (e.g., Luria-Bertani broth, optionally supplemented with one or more additional agents, such as an inducer where the nucleotide sequence encoding the isoprenoid-modifying enzyme is under the control of an inducible promoter), and the culture medium is overlaid with an organic solvent, e.g., dodecane, forming an organic layer. The isoprenoid compound produced by the genetically modified host cell partitions into the organic layer, from which it can be purified. In some embodiments, where the nucleotide sequence encoding the isoprenoid-modifying enzyme is operably linked to an inducible promoter, an inducer is added to the culture medium; after a suitable time, the isoprenoid compound is isolated from the organic layer overlaid on the culture medium.

In some embodiments, the isoprenoid compound is separated from other products that may be present in the organic layer. This separation is readily achieved using, e.g., standard chromatographic techniques.

In some embodiments, an isoprenoid compound synthesized by a method of the invention is further chemically modified in a cell-free reaction. For example, in some embodiments, artemisinic acid is isolated from the culture medium and/or a cell lysate and is further chemically modified in a cell-free reaction to generate artemisinin.

In some embodiments, the isoprenoid compound is pure, e.g., at least about 40% pure, at least about 50% pure, at least about 60% pure, at least about 70% pure, at least about 80% pure, at least about 90% pure, at least about 95% pure, at least about 98% pure, or more than 98% pure, where "pure" in the context of an isoprenoid compound refers to an isoprenoid compound that is free from other isoprenoid compounds, macromolecules, contaminants, and the like.

Examples

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention. They are not intended to limit the scope of what the inventors regard as their invention, nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to the numbers used (e.g., amounts, temperature, etc.), but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric. Standard abbreviations may be used, e.g., bp, base pair(s); kb, kilobase(s); pl, picoliter(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); i.m., intramuscular(ly); i.p., intraperitoneal(ly); s.c., subcutaneous(ly); and the like.

Example 1: Production of 8-hydroxy-δ-cadinene in E. coli

This example describes high-level production (up to 30 mg L⁻¹) of a hydroxylated product from a substrate generated in vivo, using a native P450 that participates in the biosynthetic pathway. δ-Cadinene-8-hydroxylase (CadH) is a plant-derived, membrane-bound P450 that hydroxylates the sesquiterpene δ-cadinene (Cad) to 8-hydroxy-δ-cadinene (CadOH) in the biosynthesis of the plant defense compound gossypol. Figure 1 schematically depicts the biosynthesis of CadOH in E. coli. In E. coli, the terpene synthase CadS generates the substrate (Cad) from endogenous farnesyl pyrophosphate (FPP). Cad is then hydroxylated to the product (CadOH) through the action of CadH and its redox partner (CPR).

The CadH expression vector includes the CadH gene together with a gene encoding a cytochrome P450 reductase (CPR) redox partner from Candida tropicalis. This construct was co-transformed into E. coli with a compatible expression vector for δ-cadinene synthase (CadS), which supplies the substrate for CadH. The strain was grown in rich medium and induced at 20°C for 48 hours in the presence of a heme supplement, after which the culture medium was extracted with organic solvent. The results, shown in Figure 2, indicate that the amount of CadOH produced in this system is clearly detectable (~100 μg L⁻¹) by GC-MS (gas chromatography-mass spectrometry).

Figure 2 is a GC-MS trace of the organic layer extracted from E. coli expressing the CadOH biosynthetic pathway. The inset is an enlargement of the region showing CadOH (peak 1) and the putative ketone species (peak 2). The upper trace corresponds to the sample expressing CadS, CadH, and CPR; the lower trace corresponds to the negative control expressing only CadS and CPR (no CadH).

In addition, a small amount of a putative ketone product ([M⁺]: m/z = 218) was also observed (Figure 2 inset, upper trace, peak 2), suggesting that the same enzyme may carry out multiple turnovers. The negative control plasmid, containing CPR but not CadH, gave no product peaks in the GC-MS trace (Figure 2 inset, lower trace). The mass spectrum of CadOH produced in vivo by E. coli with this system closely matches the literature spectrum of CadOH [4]. Previous attempts to produce functionalized natural products of a similar family of compounds in vivo using native P450s were unsuccessful, indicating a problem with substrate accessibility.
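The assignment of m/z = 218 to a ketone can be checked with simple nominal-mass arithmetic. The sketch below assumes the standard sesquiterpene formula C15H24 for δ-cadinene, C15H24O for its mono-hydroxylated product, and C15H22O for the corresponding ketone; these formulas are not stated in the text above and are assumptions of this illustration, as is the helper name `nominal_mass`.

```python
# Integer (nominal) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula: dict[str, int]) -> int:
    """Sum nominal atomic masses over an element -> count mapping."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Assumed molecular formulas (not given in the text).
cadinene = {"C": 15, "H": 24}        # sesquiterpene hydrocarbon substrate (Cad)
cad_oh = {"C": 15, "H": 24, "O": 1}  # hydroxylation inserts one oxygen atom (CadOH)
ketone = {"C": 15, "H": 22, "O": 1}  # oxidizing the alcohol removes two hydrogens

for label, formula in [("Cad", cadinene), ("CadOH", cad_oh), ("ketone", ketone)]:
    print(f"{label}: nominal mass {nominal_mass(formula)}")
```

Under these assumptions the ketone is two mass units lighter than CadOH, which is consistent with the observed molecular ion.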

CadOH production was increased significantly by raising the amount of FPP produced in E. coli using the plasmid pMBIS, which enables E. coli to produce FPP from mevalonate [6]. The nucleotide sequence of pMBIS is shown in Figures 36A-D (SEQ ID NO: 62). pMBIS is also described in U.S. Patent Publication Nos. 2003/0148479 and 2004/0005678; it contains nucleotide sequences encoding mevalonate kinase, phosphomevalonate kinase, mevalonate pyrophosphate decarboxylase, IPP isomerase, and FPP synthase. In these studies, E. coli was transformed with three expression plasmids: (1) pMBIS, (2) CadS, and (3) CadH/CPR. Mevalonate (20 mM) was added after induction. Inclusion of pMBIS increased CadOH production 74-fold compared to cells without pMBIS (Figure 3). Again, the negative control (no CadH) formed no product. These results suggest that P450 turnover may be limited by substrate production in vivo (e.g., in living cells in in vitro culture). The cells can also be grown in mixed aqueous/organic media; growing and inducing the strain in the presence of a dodecane overlay did not change CadOH production significantly (a difference of about 2-fold or less).

Figure 3 is a GC-MS trace of the organic layer extracted from mevalonate-fed E. coli expressing the CadOH biosynthetic pathway together with pMBIS. Cad and CadOH are indicated in the figure.

It was further demonstrated that production can be increased by engineering the P450 without loss of product specificity. In vivo production was compared for the native gene (nCadH) and a synthetic gene (sCadH) whose codon usage was optimized for expression in E. coli (Figure 4B). This comparison showed that the synthetic gene performed slightly better than the native gene.

The wild-type N-terminal transmembrane (TM) domain was replaced with sequences known to function in E. coli (Figure 4A). The N-terminal sequences tested were two P450 N-terminal leader sequences derived from Candida tropicalis, namely CYP52A13 (A13), which lacks a predicted TM domain, and CYP52A17 (A17), which contains a TM domain [7], as well as the bovine microsomal leader (Bov) [8].

The wild-type TM domain was also removed completely (truncated) and replaced with a secretion tag (OmpA), a solubilization domain (PD1) [9], or a membrane insertion protein (Mistic) [10]. Bov-CadH outperformed wild-type CadH by about 2-fold, producing ~30 mg L⁻¹ (Figure 4B).

References

1. M. Sono, M. P. Roach, E. D. Coulter and J. H. Dawson, Chem. Rev. 1996, 96, 2841-2887.

2. S. Jennewein, R. M. Long, R. M. Williams and R. Croteau, Chem. Biol. 2004, 11, 379-387.

3. R. J. Sowden, S. Yasmin, N. H. Rees, S. G. Bell and L.-L. Wong, Org. Biomol. Chem. 2005, 3, 57-64.

4. P. Luo, Y.-H. Wang, G.-D. Wang, M. Essenberg and X.-Y. Chen, Plant J. 2001, 28, 95-104.

5. O. A. Carter, R. J. Peters and R. Croteau, Phytochem. 2003, 64, 425-433.

6. V. J. J. Martin, D. J. Pitera, S. T. Withers, J. D. Newman and J. D. Keasling, Nature Biotech. 2003, 21, 796-801.

7. D. L. Craft, K. M. Madduri, M. Eshoo and C. R. Wilson, Appl. Environ. Microbiol. 2003, 69, 5983-5991.

8. H. J. Barnes, M. P. Arlotto and M. R. Waterman, Proc. Natl. Acad. Sci. USA 1991, 88, 5597-5601.

9. G. A. Schoch, R. Attias, M. Belghazi, P. M. Dansette and D. Werck-Reichhart, Plant Physiol. 2003, 133, 1198-1208.

10. T. P. Roosild, J. Greenwald, M. Vega, S. Castronovo, R. Riek and S. Choe, Science 2005, 307, 1317-1321.

Example 2: Oxidation of amorphadiene by amorphadiene oxidase (AMO)

This example describes the in vivo oxidation (e.g., in living cells in in vitro culture) of amorphadiene by amorphadiene oxidase (AMO), also known as CYP71AV1, isolated from Artemisia annua. Various constructs comprising a nucleotide sequence encoding AMO were generated and tested in order to optimize the yield of oxidized product. Figure 22 schematically depicts the AMO constructs: (1) nAMO, the native AMO sequence as isolated from Artemisia annua; (2) sAMO, a synthetic AMO gene codon-optimized for expression in E. coli; (3) A13-AMO, the synthetic AMO gene with the wild-type transmembrane domain replaced by the A13 N-terminal sequence from Candida tropicalis; (4) A17-AMO, the synthetic AMO gene with the wild-type transmembrane domain replaced by the A17 N-terminal sequence from Candida tropicalis; and (5) Bov-AMO, the synthetic AMO gene with the wild-type transmembrane domain replaced by the bovine microsomal N-terminal sequence. The nucleotide and amino acid sequences of the constructs are shown in Figures 24-31.

The AMO constructs were each co-expressed with (a) CPR, (b) amorphadiene synthase (ADS), and (c) the plasmid pMBIS. In the presence of mevalonate, oxidation of amorphadiene at the C12 position to the corresponding alcohol was observed. Figure 23A shows the relative in vivo production of artemisinic alcohol. The identity of the product was confirmed by comparison with an authentic artemisinic alcohol standard (Figure 23A, lower panel, and Figure 23B).

Figures 23A and 23B. In vivo oxidation of amorphadiene in E. coli by various AMO constructs. (A) GC-MS traces showing artemisinic alcohol production in E. coli by sAMO, A13-AMO, A17-AMO, and Bov-AMO (upper panel) compared to an authentic standard (lower panel). (B) EI-MS of artemisinic alcohol produced in E. coli (upper panel) compared to an authentic standard (lower panel).

Example 3: Substrate oxidation in cells expressing the complete mevalonate pathway

Substrate oxidation was also carried out in cells expressing the complete mevalonate pathway starting from acetyl-CoA. The following example of CadOH production used three plasmids: (1) pMevT, containing AtoB, HMGR and HMGS; (2) pMBIS, containing nucleotide sequences encoding MK, PMK, PMD, IDI (IPP isomerase) and IspA (FPP synthase); and (3) an expression vector containing CadH, CPR and CadS. Cells were cultured at 20°C in TB glycerol medium supplemented with a heme supplement, δ-aminolevulinic acid. The cells produced CadOH at titers of up to 60 mg/L. The data are shown in Figure 32.

In a second example, artemisinic acid was produced using the following two plasmids: (1) an expression vector containing nucleotide sequences encoding the MevT (AtoB, HMGR and HMGS) (see Figures 35A and B), MBIS (MK, PMK, PMD, IDI and IspA) and ADS operons; and (2) an expression vector containing nucleotide sequences encoding AMO and its redox partner, the CPR from A. annua (AACPR). After E. coli cells were cultured at 20°C in TB glycerol medium with a heme supplement, trace amounts of artemisinic acid were observed with the T7 promoter vector (Figure 33). When the vector was changed to pCWOri, AMO carried out the three-step oxidation of amorphadiene, producing artemisinic acid in E. coli at a titer of 20 mg/L (Figure 33). In addition, stepwise oxidation of the alcohol to the aldehyde was observed, with the aldehyde produced at titers of 40-80 mg/L (Figure 34).
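For readers who prefer molar units, a titer in mg/L can be converted to a concentration as in the minimal sketch below. It assumes the standard molecular formula of artemisinic acid, C15H22O2, and rounded atomic masses; the function names are illustrative and not part of the described method.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(formula: dict) -> float:
    """Sum atomic masses for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def titer_to_micromolar(mg_per_l: float, g_per_mol: float) -> float:
    """mg/L divided by g/mol gives mmol/L; multiply by 1000 for umol/L."""
    return mg_per_l / g_per_mol * 1000.0


ARTEMISINIC_ACID = {"C": 15, "H": 22, "O": 2}  # assumed formula C15H22O2
mw = molar_mass(ARTEMISINIC_ACID)
concentration_um = titer_to_micromolar(20.0, mw)  # the 20 mg/L titer reported above
print(f"{mw:.2f} g/mol, {concentration_um:.1f} uM")
```

Under these assumptions, 20 mg/L corresponds to roughly 85 µM artemisinic acid.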

While the present invention has been described with reference to specific embodiments, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, method, or process step to the objective, spirit and scope of the invention. All such modifications are intended to fall within the scope of the appended claims.
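The sequence listing that follows gives each protein in three-letter residue codes, with position markers on the row beneath the residues. A minimal sketch of converting such rows to a one-letter string and checking the result against the declared <211> length is shown here, using SEQ ID NO:1 as input. The helper name `to_one_letter` is an assumption made for illustration.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(rows):
    """Convert sequence-listing rows to a one-letter string.

    Purely numeric tokens are position markers and are skipped; an
    unknown residue code raises KeyError, which flags garbled input.
    """
    letters = []
    for row in rows:
        for token in row.split():
            if token.isdigit():
                continue
            letters.append(THREE_TO_ONE[token])
    return "".join(letters)


seq1_rows = [
    "Met Trp Leu Leu Leu Ile Ala Val Phe Leu Leu Thr Leu Ala Tyr Leu",
    "1 5 10 15",
    "Phe Trp Pro",
]
seq1 = to_one_letter(seq1_rows)
declared_length = 19  # the <211> value for SEQ ID NO:1
print(seq1, len(seq1) == declared_length)
```

A length mismatch or a KeyError on a row is a quick indication that a residue row is missing or that a residue code was corrupted during extraction.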

Sequence listing

<110> CHANG, MICHELLE CHIA-YU

EACHUS, RACHEL

RO, DAE-KYUN

YOSHIKUNI, YASUO

KEASLING, JAY D.

<120> Nucleic acids encoding modified cytochrome P450 enzymes and methods of use thereof

<130> BERK-053WO

<150> 60/724,525

<151> 2005-10-07

<150> 60/762,700

<151> 2006-01-27

<160> 63

<170> FastSEQ for Windows Version 4.0

<210> 1

<211> 19

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 1

Met Trp Leu Leu Leu Ile Ala Val Phe Leu Leu Thr Leu Ala Tyr Leu

1 5 10 15

Phe Trp Pro

<210> 2

<211> 20

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 2

Met Ala Leu Leu Leu Ala Val Phe Leu Gly Leu Ser Cys Leu Leu Leu

1 5 10 15

Leu Ser Leu Trp

20

<210> 3

<211> 18

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 3

Met Ala Ile Leu Ala Ala Ile Phe Ala Leu Val Val Ala Thr Ala Thr

1 5 10 15

Arg Val

<210> 4

<211> 24

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 4

Met Asp Ala Ser Leu Leu Leu Ser Val Ala Leu Ala Val Val Leu Ile

1 5 10 15

Pro Leu Ser Leu Ala Leu Leu Asn

20

<210> 5

<211> 27

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 5

Met Ile Glu Gln Leu Leu Glu Tyr Trp Tyr Val Val Val Pro Val Leu

1 5 10 15

Tyr Ile Ile Lys Gln Leu Leu Ala Tyr Thr Lys

20 25

<210> 6

<211> 21

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 6

Met Lys Lys Thr Ala Ile Ala Ile Ala Val Ala Leu Ala Gly Phe Ala

1 5 10 15

Thr Val Ala Gln Ala

20

<210> 7

<211> 21

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 7

Met Lys Lys Thr Ala Ile Ala Ile Val Val Ala Leu Ala Gly Phe Ala

1 5 10 15

Thr Val Ala Gln Ala

20

<210> 8

<211> 21

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 8

Met Lys Lys Thr Ala Leu Ala Leu Ala Val Ala Leu Ala Gly Phe Ala

1 5 10 15

Thr Val Ala Gln Ala

20

<210> 9

<211> 26

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 9

Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr

1 5 10 15

Thr Met Met Phe Ser Ala Ser Ala Leu Ala

20 25

<210> 10

<211> 25

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 10

Met Asn Met Lys Lys Leu Ala Thr Leu Val Ser Ala Val Ala Leu Ser

1 5 10 15

Ala Thr Val Ser Ala Asn Ala Met Ala

20 25

<210> 11

<211> 21

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 11

Met Lys Gln Ser Thr Ile Ala Leu Ala Leu Leu Pro Leu Leu Phe Thr

1 5 10 15

Pro Val Thr Lys Ala

20

<210> 12

<211> 44

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 12

1 5 10 15

Thr Val Ala Gln Ala Leu Leu Glu Tyr Trp Tyr Val Val Val Pro Val

20 25 30

Leu Tyr Ile Ile Lys Gln Leu Leu Ala Tyr Thr Lys

35 40

<210> 13

<211> 24

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 13

Glu Glu Leu Leu Lys Gln Ala Leu Gln Gln Ala Gln Gln Leu Leu Gln

1 5 10 15

Gln Ala Gln Glu Leu Ala Lys Lys

20

<210> 14

<211> 32

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 14

Met Thr Val His Asp Ile Ile Ala Thr Tyr Phe Thr Lys Trp Tyr Val

1 5 10 15

Ile Val Pro Leu Ala Leu Ile Ala Tyr Arg Val Leu Asp Tyr Phe Tyr

20 25 30

<210> 15

<211> 29

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 15

Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly

1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Tyr Gly Gly Gly Lys Lys

20 25

<210> 16

<211> 9

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 16

Met Ala Lys Lys Thr Ser Ser Lys Gly

1 5

<210> 17

<211> 6

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 17

Ala Ala Ala Gly Gly Met

1 5

<210> 18

<211> 14

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 18

Ala Ala Ala Gly Gly Met Pro Pro Ala Ala Ala Gly Gly Met

1 5 10

<210> 19

<211> 6

<212> PRT

<213> Artificial sequence

<220>

<223> Linker peptide

<400> 19

Ala Ala Ala Gly Gly Met

1 5

<210> 20

<211> 8

<212> PRT

<213> Artificial sequence

<220>

<223> Linker peptide

<400> 20

Pro Pro Ala Ala Ala Gly Gly Met

1 5

<210> 21

<211> 4

<212> PRT

<213> Artificial sequence

<220>

<223> Linker peptide

<400> 21

Ile Glu Gly Arg

1

<210> 22

<211> 6

<212> PRT

<213> Artificial sequence

<220>

<223> Linker peptide

<400> 22

Gly Gly Lys Gly Gly Lys

1 5

<210> 23

<211> 23

<212> PRT

<213> Artificial sequence

<220>

<223> Synthetic peptide

<400> 23

Met Leu Phe Pro Val Ala Leu Ser Phe Leu Val Ala Ile Leu Gly Ile

1 5 10 15

Ser Leu Trp His Val Trp Thr

20

<210> 24

<211> 110

<212> PRT

<213> Bacillus subtilis

<400> 24

Met Phe Cys Thr Phe Phe Glu Lys His His Arg Lys Trp Asp Ile Leu

1 5 10 15

Leu Glu Lys Ser Thr Gly Val Met Glu Ala Met Lys Val Thr Ser Glu

20 25 30

Glu Lys Glu Gln Leu Ser Thr Ala Ile Asp Arg Met Asn Glu Gly Leu

35 40 45

Asp Ala Phe Ile Gln Leu Tyr Asn Glu Ser Glu Ile Asp Glu Pro Leu

50 55 60

Ile Gln Leu Asp Asp Asp Thr Ala Glu Leu Met Lys Gln Ala Arg Asp

65 70 75 80

Met Tyr Gly Gln Glu Lys Leu Asn Glu Lys Leu Asn Thr Ile Ile Lys

85 90 95

Gln Ile Leu Ser Ile Ser Val Ser Glu Glu Gly Glu Lys Glu

100 105 110

<210> 25

<211> 496

<212> PRT

<213> Mentha x gracilis

<400> 25

Met Glu Leu Asp Leu Leu Ser Ala Ile Ile Ile Leu Val Ala Thr Tyr

1 5 10 15

Ile Val Ser Leu Leu Ile Asn Gln Trp Arg Lys Ser Lys Ser Gln Gln

20 25 30

Asn Leu Pro Pro Ser Pro Pro Lys Leu Pro Val Ile Gly His Leu His

35 40 45

Phe Leu Trp Gly Gly Leu Pro Gln His Val Phe Arg Ser Ile Ala Gln

50 55 60

Lys Tyr Gly Pro Val Ala His Val Gln Leu Gly Glu Val Tyr Ser Val

65 70 75 80

Val Leu Ser Ser Ala Glu Ala Ala Lys Gln Ala Met Lys Val Leu Asp

85 90 95

Pro Asn Phe Ala Asp Arg Phe Asp Gly Ile Gly Ser Arg Thr Met Trp

100 105 110

Tyr Asp Lys Asp Asp Ile Ile Phe Ser Pro Tyr Asn Asp His Trp Arg

115 120 125

Gln Met Arg Arg Ile Cys Val Thr Glu Leu Leu Ser Pro Lys Asn Val

130 135 140

Arg Ser Phe Gly Tyr Ile Arg Gln Glu Glu Ile Glu Arg Leu Ile Arg

145 150 155 160

Leu Leu Gly Ser Ser Gly Gly Ala Pro Val Asp Val Thr Glu Glu Val

165 170 175

Ser Lys Met Ser Cys Val Val Val Cys Arg Ala Ala Phe Gly Ser Val

180 185 190

Leu Lys Asp Gln Gly Ser Leu Ala Glu Leu Val Lys Glu Ser Leu Ala

195 200 205

Leu Ala Ser Gly Phe Glu Leu Ala Asp Leu Tyr Pro Ser Ser Trp Leu

210 215 220

Leu Asn Leu Leu Ser Leu Asn Lys Tyr Arg Leu Gln Arg Met Arg Arg

225 230 235 240

Arg Leu Asp His Ile Leu Asp Gly Phe Leu Glu Glu His Arg Glu Lys

245 250 255

Lys Ser Gly Glu Phe Gly Gly Glu Asp Ile Val Asp Val Leu Phe Arg

260 265 270

Met Gln Lys Gly Ser Asp Ile Lys Ile Pro Ile Thr Ser Asn Cys Ile

275 280 285

Lys Gly Phe Ile Phe Asp Thr Phe Ser Ala Gly Ala Glu Thr Ser Ser

290 295 300

Thr Thr Ile Ser Trp Ala Leu Ser Glu Leu Met Arg Asn Pro Ala Lys

305 310 315 320

Met Ala Lys Val Gln Ala Glu Val Arg Glu Ala Leu Lys Gly Lys Thr

325 330 335

Val Val Asp Leu Ser Glu Val Gln Glu Leu Lys Tyr Leu Arg Ser Val

340 345 350

Leu Lys Glu Thr Leu Arg Leu His Pro Pro Phe Pro Leu Ile Pro Arg

355 360 365

Gln Ser Arg Glu Glu Cys Glu Val Asn Gly Tyr Thr Ile Pro Ala Lys

370 375 380

Thr Arg Ile Phe Ile Asn Val Trp Ala Ile Gly Arg Asp Pro Gln Tyr

385 390 395 400

Trp Glu Asp Pro Asp Thr Phe Arg Pro Glu Arg Phe Asp Glu Val Ser

405 410 415

Arg Asp Phe Met Gly Asn Asp Phe Glu Phe Ile Pro Phe Gly Ala Gly

420 425 430

Arg Arg Ile Cys Pro Gly Leu His Phe Gly Leu Ala Asn Val Glu Ile

435 440 445

Pro Leu Ala Gln Leu Leu Tyr His Phe Asp Trp Lys Leu Pro Gln Gly

450 455 460

Met Thr Asp Ala Asp Leu Asp Met Thr Glu Thr Pro Gly Leu Ser Gly

465 470 475 480

Pro Lys Lys Lys Asn Val Cys Leu Val Pro Thr Leu Tyr Lys Ser Pro

485 490 495

<210> 26

<211> 473

<212> PRT

<213> Nicotiana tabacum

<400> 26

Met Gln Phe Phe Ser Leu Val Ser Ile Phe Leu Phe Leu Ala Phe Leu

1 5 10 15

Phe Leu Leu Arg Lys Trp Lys Asn Ser Asn Ser Gln Ser Lys Lys Leu

20 25 30

Pro Pro Gly Pro Trp Lys Ile Pro Ile Leu Gly Ser Met Leu His Met

35 40 45

Ile Gly Gly Glu Pro His His Val Leu Arg Asp Leu Ala Lys Lys Tyr

50 55 60

Gly Pro Leu Met His Leu Gln Leu Gly Glu Ile Ser Ala Val Val Val

65 70 75 80

Thr Ser Arg Asp Met Ala Lys Glu Val Leu Lys Thr His Asp Val Val

85 90 95

Phe Ala Ser Arg Pro Lys Ile Val Ala Met Asp Ile Ile Cys Tyr Asn

100 105 110

Gln Ser Asp Ile Ala Phe Ser Pro Tyr Gly Asp His Trp Arg Gln Met

115 120 125

Arg Lys Ile Cys Val Met Glu Leu Leu Asn Ala Lys Asn Val Arg Ser

130 135 140

Phe Ser Ser Ile Arg Arg Asp Glu Val Val Arg Leu Ile Asp Ser Ile

145 150 155 160

Arg Ser Asp Ser Ser Ser Gly Glu Leu Val Asn Phe Thr Gln Arg Ile

165 170 175

Ile Trp Phe Ala Ser Ser Met Thr Cys Arg Ser Ala Phe Gly Gln Val

180 185 190

Leu Lys Gly Gln Asp Ile Phe Ala Lys Lys Ile Arg Glu Val Ile Gly

195 200 205

Leu Ala Glu Gly Phe Asp Val Val Asp Ile Phe Pro Thr Tyr Lys Phe

210 215 220

Leu His Val Leu Ser Gly Met Lys Arg Lys Leu Leu Asn Ala His Leu

225 230 235 240

Lys Val Asp Ala Ile Val Glu Asp Val Ile Asn Glu His Lys Lys Asn

245 250 255

Leu Ala Ala Gly Lys Ser Asn Gly Ala Leu Glu Asp Met Phe Ala Ala

260 265 270

Gly Thr Glu Thr Ser Ser Thr Thr Thr Val Trp Ala Met Ala Glu Met

275 280 285

Met Lys Asn Pro Ser Val Phe Thr Lys Ala Gln Ala Glu Val Arg Glu

290 295 300

Ala Phe Arg Asp Lys Val Ser Phe Asp Glu Asn Asp Val Glu Glu Leu

305 310 315 320

Lys Tyr Leu Lys Leu Val Ile Lys Glu Thr Leu Arg Leu His Pro Pro

325 330 335

Ser Pro Leu Leu Val Pro Arg Glu Cys Arg Glu Asp Thr Asp Ile Asn

340 345 350

Gly Tyr Thr Ile Pro Ala Lys Thr Lys Val Met Val Asn Val Trp Ala

355 360 365

Leu Gly Arg Asp Pro Lys Tyr Trp Asp Asp Ala Glu Ser Phe Lys Pro

370 375 380

Glu Arg Phe Glu Gln Cys Ser Val Asp Phe Phe Gly Asn Asn Phe Glu

385 390 395 400

Phe Leu Pro Phe Gly Gly Gly Arg Arg Ile Cys Pro Gly Met Ser Phe

405 410 415

Gly Leu Ala Asn Leu Tyr Leu Pro Leu Ala Gln Leu Leu Tyr His Phe

420 425 430

Asp Trp Lys Leu Pro Thr Gly Ile Met Pro Arg Asp Leu Asp Leu Thr

435 440 445

Glu Leu Ser Gly Ile Thr Ile Ala Arg Lys Gly Asp Leu Tyr Leu Asn

450 455 460

Ala Thr Pro Tyr Gln Pro Ser Arg Glu

465 470

<210> 27

<211> 536

<212> PRT

<213> Gossypium arboreum

<400> 27

Met Leu Gln Ile Ala Phe Ser Ser Tyr Ser Trp Leu Leu Thr Ala Ser

1 5 10 15

Asn Gln Lys Asp Gly Met Leu Phe Pro Val Ala Leu Ser Phe Leu Val

20 25 30

Ala Ile Leu Gly Ile Ser Leu Trp His Val Trp Thr Ile Arg Lys Pro

35 40 45

Lys Lys Asp Ile Ala Pro Leu Pro Pro Gly Pro Arg Gly Leu Pro Ile

50 55 60

Val Gly Tyr Leu Pro Tyr Leu Gly Thr Asp Asn Leu His Leu Val Phe

65 70 75 80

Thr Asp Leu Ala Ala Ala Tyr Gly Pro Ile Tyr Lys Leu Trp Leu Gly

85 90 95

Asn Lys Leu Cys Val Val Ile Ser Ser Ala Pro Leu Ala Lys Glu Val

100 105 110

Val Arg Asp Asn Asp Ile Thr Phe Ser Glu Arg Asp Pro Pro Val Cys

115 120 125

Ala Lys Ile Ile Thr Phe Gly Leu Asn Asp Ile Val Phe Asp Ser Tyr

130 135 140

Ser Ser Pro Asp Trp Arg Met Lys Arg Lys Val Leu Val Arg Glu Met

145 150 155 160

Leu Ser His Ser Ser Ile Lys Ala Cys Tyr Gly Leu Arg Arg Glu Gln

165 170 175

Val Leu Lys Gly Val Gln Asn Val Ala Gln Ser Ala Gly Lys Pro Ile

180 185 190

Asp Phe Gly Glu Thr Ala Phe Leu Thr Ser Ile Asn Ala Met Met Ser

195 200 205

Met Leu Trp Gly Gly Lys Gln Gly Gly Glu Arg Lys Gly Ala Asp Val

210 215 220

Trp Gly Gln Phe Arg Asp Leu Ile Thr Glu Leu Met Val Ile Leu Gly

225 230 235 240

Lys Pro Asn Val Ser Asp Ile Phe Pro Val Leu Ala Arg Phe Asp Ile

245 250 255

Gln Gly Leu Glu Lys Glu Met Thr Lys Ile Val Asn Ser Phe Asp Lys

260 265 270

Leu Phe Asn Ser Met Ile Glu Glu Arg Glu Asn Phe Ser Asn Lys Leu

275 280 285

Ser Lys Glu Asp Gly Asn Thr Glu Thr Lys Asp Phe Leu Gln Leu Leu

290 295 300

Leu Asp Leu Lys Gln Lys Asn Asp Ser Gly Ile Ser Ile Thr Met Asn

305 310 315 320

Gln Val Lys Ala Leu Leu Met Asp Ile Val Val Gly Gly Thr Asp Thr

325 330 335

Thr Ser Thr Met Met Glu Trp Thr Met Ala Glu Leu Ile Ala Asn Pro

340 345 350

Glu Ala Met Lys Lys Val Lys Gln Glu Ile Asp Asp Val Val Gly Ser

355 360 365

Asp Gly Ala Val Asp Glu Thr His Leu Pro Lys Leu Arg Tyr Leu Asp

370 375 380

Ala Ala Val Lys Glu Thr Phe Arg Leu His Pro Pro Met Pro Leu Leu

385 390 395 400

Val Pro Arg Cys Pro Gly Asp Ser Ser Asn Val Gly Gly Tyr Ser Val

405 410 415

Pro Lys Gly Thr Arg Val Phe Leu Asn Ile Trp Cys Ile Gln Arg Asp

420 425 430

Pro Gln Leu Trp Glu Asn Pro Leu Glu Phe Lys Pro Glu Arg Phe Leu

435 440 445

Thr Asp His Glu Lys Leu Asp Tyr Leu Gly Asn Asp Ser Arg Tyr Met

450 455 460

Pro Phe Gly Ser Gly Arg Arg Met Cys Ala Gly Val Ser Leu Gly Glu

465 470 475 480

Lys Met Leu Tyr Ser Ser Leu Ala Ala Met Ile His Ala Tyr Asp Trp

485 490 495

Asn Leu Ala Asp Gly Glu Glu Asn Asp Leu Ile Gly Leu Phe Gly Ile

500 505 510

Ile Met Lys Lys Lys Lys Pro Leu Ile Leu Val Pro Thr Pro Arg Pro

515 520 525

Ser Asn Leu Gln His Tyr Met Lys

530 535

<210> 28

<211> 512

<212> PRT

<213> Artificial sequence

<220>

<223> Modified P450 monooxygenase

<400> 28

1 5 10 15

Leu Ser Leu Trp Ile Arg Lys Pro Lys Lys Asp Ile Ala Pro Leu Pro

20 25 30

Pro Gly Pro Arg Gly Leu Pro Ile Val Gly Tyr Leu Pro Tyr Leu Gly

35 40 45

Thr Asp Asn Leu His Leu Val Phe Thr Asp Leu Ala Ala Ala Tyr Gly

50 55 60

Pro Ile Tyr Lys Leu Trp Leu Gly Asn Lys Leu Cys Val Val Ile Ser

65 70 75 80

Ser Ala Pro Leu Ala Lys Glu Val Val Arg Asp Asn Asp Ile Thr Phe

85 90 95

Ser Glu Arg Asp Pro Pro Val Cys Ala Lys Ile Ile Thr Phe Gly Leu

100 105 110

Asn Asp Ile Val Phe Asp Ser Tyr Ser Ser Pro Asp Trp Arg Met Lys

115 120 125

Arg Lys Val Leu Val Arg Glu Met Leu Ser His Ser Ser Ile Lys Ala

130 135 140

Cys Tyr Gly Leu Arg Arg Glu Gln Val Leu Lys Gly Val Gln Asn Val

145 150 155 160

Ala Gln Ser Ala Gly Lys Pro Ile Asp Phe Gly Glu Thr Ala Phe Leu

165 170 175

Thr Ser Ile Asn Ala Met Met Ser Met Leu Trp Gly Gly Lys Gln Gly

180 185 190

Gly Glu Arg Lys Gly Ala Asp Val Trp Gly Gln Phe Arg Asp Leu Ile

195 200 205

Thr Glu Leu Met Val Ile Leu Gly Lys Pro Asn Val Ser Asp Ile Phe

210 215 220

Pro Val Leu Ala Arg Phe Asp Ile Gln Gly Leu Glu Lys Glu Met Thr

225 230 235 240

Lys Ile Val Asn Ser Phe Asp Lys Leu Phe Asn Ser Met Ile Glu Glu

245 250 255

Arg Glu Asn Phe Ser Asn Lys Leu Ser Lys Glu Asp Gly Asn Thr Glu

260 265 270

Thr Lys Asp Phe Leu Gln Leu Leu Leu Asp Leu Lys Gln Lys Asn Asp

275 280 285

Ser Gly Ile Ser Ile Thr Met Asn Gln Val Lys Ala Leu Leu Met Asp

290 295 300

Ile Val Val Gly Gly Thr Asp Thr Thr Ser Thr Met Met Glu Trp Thr

305 310 315 320

Met Ala Glu Leu Ile Ala Asn Pro Glu Ala Met Lys Lys Val Lys Gln

325 330 335

Glu Ile Asp Asp Val Val Gly Ser Asp Gly Ala Val Asp Glu Thr His

340 345 350

Leu Pro Lys Leu Arg Tyr Leu Asp Ala Ala Val Lys Glu Thr Phe Arg

355 360 365

Leu His Pro Pro Met Pro Leu Leu Val Pro Arg Cys Pro Gly Asp Ser

370 375 380

Ser Asn Val Gly Gly Tyr Ser Val Pro Lys Gly Thr Arg Val Phe Leu

385 390 395 400

Asn Ile Trp Cys Ile Gln Arg Asp Pro Gln Leu Trp Glu Asn Pro Leu

405 410 415

Glu Phe Lys Pro Glu Arg Phe Leu Thr Asp His Glu Lys Leu Asp Tyr

420 425 430

Leu Gly Asn Asp Ser Arg Tyr Met Pro Phe Gly Ser Gly Arg Arg Met

435 440 445

Cys Ala Gly Val Ser Leu Gly Glu Lys Met Leu Tyr Ser Ser Leu Ala

450 455 460

Ala Met Ile His Ala Tyr Asp Trp Asn Leu Ala Asp Gly Glu Glu Asn

465 470 475 480

Asp Leu Ile Gly Leu Phe Gly Ile Ile Met Lys Lys Lys Lys Pro Leu

485 490 495

Ile Leu Val Pro Thr Pro Arg Pro Ser Asn Leu Gln His Tyr Met Lys

500 505 510

<210>29<210>29

<211>516<211>516

<212>PRT<212>PRT

<213>人工序列<213> Artificial sequence

<220><220>

<223>修饰的P450单加氧酶<223> Modified P450 monooxygenase

<400>29<400>29

1 5 10 15

Gln Ala Gln Glu Leu Ala Lys Lys Ile Arg Lys Pro Lys Lys Asp Ile

20 25 30

Ala Pro Leu Pro Pro Gly Pro Arg Gly Leu Pro Ile Val Gly Tyr Leu

35 40 45

Pro Tyr Leu Gly Thr Asp Asn Leu His Leu Val Phe Thr Asp Leu Ala

50 55 60

Ala Ala Tyr Gly Pro Ile Tyr Lys Leu Trp Leu Gly Asn Lys Leu Cys

65 70 75 80

Val Val Ile Ser Ser Ala Pro Leu Ala Lys Glu Val Val Arg Asp Asn

85 90 95

Asp Ile Thr Phe Ser Glu Arg Asp Pro Pro Val Cys Ala Lys Ile Ile

100 105 110

Thr Phe Gly Leu Asn Asp Ile Val Phe Asp Ser Tyr Ser Ser Pro Asp

115 120 125

Trp Arg Met Lys Arg Lys Val Leu Val Arg Glu Met Leu Ser His Ser

130 135 140

Ser Ile Lys Ala Cys Tyr Gly Leu Arg Arg Glu Gln Val Leu Lys Gly

145 150 155 160

Val Gln Asn Val Ala Gln Ser Ala Gly Lys Pro Ile Asp Phe Gly Glu

165 170 175

Thr Ala Phe Leu Thr Ser Ile Asn Ala Met Met Ser Met Leu Trp Gly

180 185 190

Gly Lys Gln Gly Gly Glu Arg Lys Gly Ala Asp Val Trp Gly Gln Phe

195 200 205

Arg Asp Leu Ile Thr Glu Leu Met Val Ile Leu Gly Lys Pro Asn Val

210 215 220

Ser Asp Ile Phe Pro Val Leu Ala Arg Phe Asp Ile Gln Gly Leu Glu

225 230 235 240

Lys Glu Met Thr Lys Ile Val Asn Ser Phe Asp Lys Leu Phe Asn Ser

245 250 255

Met Ile Glu Glu Arg Glu Asn Phe Ser Asn Lys Leu Ser Lys Glu Asp

260 265 270

Gly Asn Thr Glu Thr Lys Asp Phe Leu Gln Leu Leu Leu Asp Leu Lys

275 280 285

Gln Lys Asn Asp Ser Gly Ile Ser Ile Thr Met Asn Gln Val Lys Ala

290 295 300

Leu Leu Met Asp Ile Val Val Gly Gly Thr Asp Thr Thr Ser Thr Met

305 310 315 320

Met Glu Trp Thr Met Ala Glu Leu Ile Ala Asn Pro Glu Ala Met Lys

325 330 335

Lys Val Lys Gln Glu Ile Asp Asp Val Val Gly Ser Asp Gly Ala Val

340 345 350

Asp Glu Thr His Leu Pro Lys Leu Arg Tyr Leu Asp Ala Ala Val Lys

355 360 365

Glu Thr Phe Arg Leu His Pro Pro Met Pro Leu Leu Val Pro Arg Cys

370 375 380

Pro Gly Asp Ser Ser Asn Val Gly Gly Tyr Ser Val Pro Lys Gly Thr

385 390 395 400

Arg Val Phe Leu Asn Ile Trp Cys Ile Gln Arg Asp Pro Gln Leu Trp

405 410 415

Glu Asn Pro Leu Glu Phe Lys Pro Glu Arg Phe Leu Thr Asp His Glu

420 425 430

Lys Leu Asp Tyr Leu Gly Asn Asp Ser Arg Tyr Met Pro Phe Gly Ser

435 440 445

Gly Arg Arg Met Cys Ala Gly Val Ser Leu Gly Glu Lys Met Leu Tyr

450 455 460

Ser Ser Leu Ala Ala Met Ile His Ala Tyr Asp Trp Asn Leu Ala Asp

465 470 475 480

Gly Glu Glu Asn Asp Leu Ile Gly Leu Phe Gly Ile Ile Met Lys Lys

485 490 495

Lys Lys Pro Leu Ile Leu Val Pro Thr Pro Arg Pro Ser Asn Leu Gln

500 505 510

His Tyr Met Lys

515

<210>30

<211>536

<212>PRT

<213> Artificial sequence

<220>

<223> Modified P450 monooxygenase

<400>30

1 5 10 15

20 25 30

Leu Tyr Ile Ile Lys Gln Leu Leu Ala Tyr Thr Lys Ile Arg Lys Pro

35 40 45

50 55 60

65 70 75 80

85 90 95

100 105 110

115 120 125

130 135 140

145 150 155 160

Leu Ser His Ser Ser Ile Lys Ala Cys Tyr Gly Leu Arg Arg Glu Gln

165 170 175

180 185 190

195 200 205

210 215 220

225 230 235 240

245 250 255

260 265 270

275 280 285

290 295 300

305 310 315 320

325 330 335

Thr Ser Thr Met Met Glu Trp Thr Met Ala Glu Leu Ile Ala Asn Pro

340 345 350

355 360 365

370 375 380

385 390 395 400

405 410 415

420 425 430

435 440 445

450 455 460

465 470 475 480

485 490 495

500 505 510

515 520 525

Ser Asn Leu Gln His Tyr Met Lys

530 535

<210>31

<211>502

<212>PRT

<213> Taxus cuspidata

<400>31

Met Asp Ala Leu Tyr Lys Ser Thr Val Ala Lys Phe Asn Glu Val Thr

1 5 10 15

Gln Leu Asp Cys Ser Thr Glu Ser Phe Ser Ile Ala Leu Ser Ala Ile

20 25 30

Ala Gly Ile Leu Leu Leu Leu Leu Leu Phe Arg Ser Lys Arg His Ser

35 40 45

Ser Leu Lys Leu Pro Pro Gly Lys Leu Gly Ile Pro Phe Ile Gly Glu

50 55 60

Ser Phe Ile Phe Leu Arg Ala Leu Arg Ser Asn Ser Leu Glu Gln Phe

65 70 75 80

Phe Asp Glu Arg Val Lys Lys Phe Gly Leu Val Phe Lys Thr Ser Leu

85 90 95

Ile Gly His Pro Thr Val Val Leu Cys Gly Pro Ala Gly Asn Arg Leu

100 105 110

Ile Leu Ser Asn Glu Glu Lys Leu Val Gln Met Ser Trp Pro Ala Gln

115 120 125

Phe Met Lys Leu Met Gly Glu Asn Ser Val Ala Thr Arg Arg Gly Glu

130 135 140

Asp His Ile Val Met Arg Ser Ala Leu Ala Gly Phe Phe Gly Pro Gly

145 150 155 160

Ala Leu Gln Ser Tyr Ile Gly Lys Met Asn Thr Glu Ile Gln Ser His

165 170 175

Ile Asn Glu Lys Trp Lys Gly Lys Asp Glu Val Asn Val Leu Pro Leu

180 185 190

Val Arg Glu Leu Val Phe Asn Ile Ser Ala Ile Leu Phe Phe Asn Ile

195 200 205

Tyr Asp Lys Gln Glu Gln Asp Arg Leu His Lys Leu Leu Glu Thr Ile

210 215 220

Leu Val Gly Ser Phe Ala Leu Pro Ile Asp Leu Pro Gly Phe Gly Phe

225 230 235 240

His Arg Ala Leu Gln Gly Arg Ala Lys Leu Asn Lys Ile Met Leu Ser

245 250 255

Leu Ile Lys Lys Arg Lys Glu Asp Leu Gln Ser Gly Ser Ala Thr Ala

260 265 270

Thr Gln Asp Leu Leu Ser Val Leu Leu Thr Phe Arg Asp Asp Lys Gly

275 280 285

Thr Pro Leu Thr Asn Asp Glu Ile Leu Asp Asn Phe Ser Ser Leu Leu

290 295 300

His Ala Ser Tyr Asp Thr Thr Thr Ser Pro Met Ala Leu Ile Phe Lys

305 310 315 320

Leu Leu Ser Ser Asn Pro Glu Cys Tyr Gln Lys Val Val Gln Glu Gln

325 330 335

Leu Glu Ile Leu Ser Asn Lys Glu Glu Gly Glu Glu Ile Thr Trp Lys

340 345 350

Asp Leu Lys Ala Met Lys Tyr Thr Trp Gln Val Ala Gln Glu Thr Leu

355 360 365

Arg Met Phe Pro Pro Val Phe Gly Thr Phe Arg Lys Ala Ile Thr Asp

370 375 380

Ile Gln Tyr Asp Gly Tyr Thr Ile Pro Lys Gly Trp Lys Leu Leu Trp

385 390 395 400

Thr Thr Tyr Ser Thr His Pro Lys Asp Leu Tyr Phe Asn Glu Pro Glu

405 410 415

Lys Phe Met Pro Ser Arg Phe Asp Gln Glu Gly Lys His Val Ala Pro

420 425 430

Tyr Thr Phe Leu Pro Phe Gly Gly Gly Gln Arg Ser Cys Val Gly Trp

435 440 445

Glu Phe Ser Lys Met Glu Ile Leu Leu Phe Val His His Phe Val Lys

450 455 460

Thr Phe Ser Ser Tyr Thr Pro Val Asp Pro Asp Glu Lys Ile Ser Gly

465 470 475 480

Asp Pro Leu Pro Pro Leu Pro Ser Lys Gly Phe Ser Ile Lys Leu Phe

485 490 495

Pro Glu Thr Ile Val Asn

500

<210>32

<211>502

<212>PRT

<213> Artificial sequence

<220>

<223> Modified taxadiene-5α-hydroxylase

<400>32

1 5 10 15

Gln Leu Asp Cys Ser Thr Glu Ser Phe Ser Ile Ala Leu Ser Ser Ile

20 25 30

35 40 45

50 55 60

65 70 75 80

85 90 95

100 105 110

115 120 125

130 135 140

145 150 155 160

Ala Leu Gln Ser Tyr Ile Gly Lys Met Asn Thr Glu Ile Gln Asn His

165 170 175

180 185 190

195 200 205

210 215 220

225 230 235 240

His Arg Ala Leu Gln Gly Arg Ala Thr Leu Asn Lys Ile Met Leu Ser

245 250 255

260 265 270

275 280 285

290 295 300

305 310 315 320

325 330 335

340 345 350

355 360 365

370 375 380

385 390 395 400

Thr Thr Tyr Ser Thr His Pro Lys Asp Leu Tyr Phe Ser Glu Pro Glu

405 410 415

420 425 430

435 440 445

450 455 460

465 470 475 480

485 490 495

Pro Glu Thr Ile Val Asn

500

<210>33

<211>509

<212>PRT

<213> Arabidopsis thaliana

<400>33

Met Ala Phe Phe Ser Met Ile Ser Ile Leu Leu Gly Phe Val Ile Ser

1 5 10 15

Ser Phe Ile Phe Ile Phe Phe Phe Lys Lys Leu Leu Ser Phe Ser Arg

20 25 30

Lys Asn Met Ser Glu Val Ser Thr Leu Pro Ser Val Pro Val Val Pro

35 40 45

Gly Phe Pro Val Ile Gly Asn Leu Leu Gln Leu Lys Glu Lys Lys Pro

50 55 60

His Lys Thr Phe Thr Arg Trp Ser Glu Ile Tyr Gly Pro Ile Tyr Ser

65 70 75 80

Ile Lys Met Gly Ser Ser Ser Leu Ile Val Leu Asn Ser Thr Glu Thr

85 90 95

Ala Lys Glu Ala Met Val Thr Arg Phe Ser Ser Ile Ser Thr Arg Lys

100 105 110

Leu Ser Asn Ala Leu Thr Val Leu Thr Cys Asp Lys Ser Met Val Ala

115 120 125

Thr Ser Asp Tyr Asp Asp Phe His Lys Leu Val Lys Arg Cys Leu Leu

130 135 140

Asn Gly Leu Leu Gly Ala Asn Ala Gln Lys Arg Lys Arg His Tyr Arg

145 150 155 160

Asp Ala Leu Ile Glu Asn Val Ser Ser Lys Leu His Ala His Ala Arg

165 170 175

Asp His Pro Gln Glu Pro Val Asn Phe Arg Ala Ile Phe Glu His Glu

180 185 190

Leu Phe Gly Val Ala Leu Lys Gln Ala Phe Gly Lys Asp Val Glu Ser

195 200 205

Ile Tyr Val Lys Glu Leu Gly Val Thr Leu Ser Lys Asp Glu Ile Phe

210 215 220

Lys Val Leu Val His Asp Met Met Glu Gly Ala Ile Asp Val Asp Trp

225 230 235 240

Arg Asp Phe Phe Pro Tyr Leu Lys Trp Ile Pro Asn Lys Ser Phe Glu

245 250 255

Ala Arg Ile Gln Gln Lys His Lys Arg Arg Leu Ala Val Met Asn Ala

260 265 270

Leu Ile Gln Asp Arg Leu Lys Gln Asn Gly Ser Glu Ser Asp Asp Asp

275 280 285

Cys Tyr Leu Asn Phe Leu Met Ser Glu Ala Lys Thr Leu Thr Lys Glu

290 295 300

Gln Ile Ala Ile Leu Val Trp Glu Thr Ile Ile Glu Thr Ala Asp Thr

305 310 315 320

Thr Leu Val Thr Thr Glu Trp Ala Ile Tyr Glu Leu Ala Lys His Pro

325 330 335

Ser Val Gln Asp Arg Leu Cys Lys Glu Ile Gln Asn Val Cys Gly Gly

340 345 350

Glu Lys Phe Lys Glu Glu Gln Leu Ser Gln Val Pro Tyr Leu Asn Gly

355 360 365

Val Phe His Glu Thr Leu Arg Lys Tyr Ser Pro Ala Pro Leu Val Pro

370 375 380

Ile Arg Tyr Ala His Glu Asp Thr Gln Ile Gly Gly Tyr His Val Pro

385 390 395 400

Ala Gly Ser Glu Ile Ala Ile Asn Ile Tyr Gly Cys Asn Met Asp Lys

405 410 415

Lys Arg Trp Glu Arg Pro Glu Asp Trp Trp Pro Glu Arg Phe Leu Asp

420 425 430

Asp Gly Lys Tyr Glu Thr Ser Asp Leu His Lys Thr Met Ala Phe Gly

435 440 445

Ala Gly Lys Arg Val Cys Ala Gly Ala Leu Gln Ala Ser Leu Met Ala

450 455 460

Gly Ile Ala Ile Gly Arg Leu Val Gln Glu Phe Glu Trp Lys Leu Arg

465 470 475 480

Asp Gly Glu Glu Glu Asn Val Asp Thr Tyr Gly Leu Thr Ser Gln Lys

485 490 495

Leu Tyr Pro Leu Met Ala Ile Ile Asn Pro Arg Arg Ser

500 505

<210>34

<211>1896

<212>DNA

<213> Gossypium arboreum

<400>34

ccacttcgca gcaatattat tgcagttcct ggttggctac ctctgagttt tcaacttaaa 60

atttcttggt tttcctcaag aaggaagaag atgttgcaaa tagctttcag ctcgtattca 120

tggctgttga ctgctagcaa ccagaaagat ggaatgttgt tcccagtagc tttgtcattt 180

ttggtagcca tattgggaat ttcactgtgg cacgtatgga ccataaggaa gccaaagaaa 240

gacatcgccc cattaccgcc gggtccccgt gggttgccaa tagtgggata tcttccatat 300

cttggaactg ataatcttca cttggtgttt acagatttgg ctgcagctta cggtcccatc 360

tacaagcttt ggctaggaaa caaattatgc gtagtcatta gctcggcacc actggcgaaa 420

gaagtggttc gtgacaacga catcacattt tctgaaaggg atcctcccgt ttgtgcaaag 480

attattacct ttggcctcaa tgatattgta tttgattctt acagtagtcc agattggaga 540

atgaagagaa aagtgctggt acgtgaaatg cttagccata gtagcattaa agcttgttat 600

ggtctaagga gggaacaagt gcttaaaggc gtacaaaatg ttgctcaaag tgctggcaag 660

ccaattgatt ttggtgaaac ggcattttta acatcaatca atgcgatgat gagcatgctg 720

tggggtggca aacagggagg agagcggaaa ggggccgacg tttggggcca atttcgagat 780

ctcataaccg aactaatggt gatacttgga aaaccaaacg tttctgatat tttcccggtg 840

cttgcaaggt ttgacataca gggattggag aaggaaatga ctaaaatcgt taattctttc 900

gataagcttt tcaactccat gattgaagaa agagagaact ttagcaacaa attgagcaaa 960

gaagatggaa acactgaaac aaaagacttc ttgcagcttc tgttggacct caagcagaag 1020

aacgatagcg gaatatcgat aacaatgaat caagtcaagg ccttgctcat ggacattgtg 1080

gtcggtggaa ctgatacaac atcaaccatg atggaatgga caatggctga actaattgca 1140

aatcctgaag caatgaaaaa ggtgaagcaa gaaatagacg atgttgtcgg ttcggatggc 1200

gccgtcgatg agactcactt gcctaagttg cgctatctag atgctgcagt aaaggagacc 1260

ttccgattgc acccaccgat gccactcctt gtaccccgtt gcccgggcga ctcaagcaac 1320

gttggtggct atagcgtacc aaagggcacc agggtcttct taaacatttg gtgtattcag 1380

agggatccac agctttggga aaatccttta gaattcaagc ctgagaggtt cttgactgat 1440

catgagaagc tcgattattt aggaaacgat tcccggtaca tgccgtttgg ttctggaagg 1500

agaatgtgtg ccggagtatc tctcggtgaa aagatgttgt attcctcctt ggcagcaatg 1560

atccatgctt atgattggaa cttggccgac ggtgaagaaa atgacttgat tggcttattt 1620

ggaattatta tgaagaaaaa gaagccttta attcttgttc ctacaccaag accatcaaat 1680

ctccagcact atatgaagta actttactat tgtatttctt ttataccact ttattgcctc 1740

tttgtcatgt ttaggcaaca attctaagta ataagtttgg ctatatggtg aacaataatg 1800

tgtttattat acatcataag caatgagctc ttcccgaccc tagggcaata caatgatact 1860

gtgtattaag tgaaatcaac aaatctttta ttctaa 1896

<210>35

<211>1677

<212>DNA

<213> Artificial sequence

<220>

<223> Synthetic polynucleotide

<400>35

atgctgcaga ttgctttttc ttcttattct tggctgctga ccgcttctaa ccagaaagac 60

ggcatgctgt tcccggtggc gctgagcttc ctggtggcaa tcctgggcat tagcctgtgg 120

cacgtgtgga ctatccgtaa accgaagaaa gatatcgcac cgctgccacc gggtccgcgt 180

ggcctgccga tcgttggcta cctgccgtat ctgggcaccg acaacctgca cctggtgttc 240

accgacctgg cagccgcgta cggtccgatc tacaaactgt ggctgggcaa taaactgtgc 300

gtagttatct cctctgctcc tctggcgaag gaggtggttc gcgacaacga catcaccttc 360

tccgaacgtg acccaccggt ctgtgctaaa atcatcacct tcggcctgaa cgacatcgta 420

ttcgactcct atagctctcc tgactggcgt atgaaacgta aggttctggt acgcgagatg 480

ctgtcccaca gctccattaa ggcatgctac ggcctgcgtc gcgaacaggt actgaaaggc 540

gtacaaaacg tagcgcagtc cgcgggcaaa ccgatcgatt tcggcgaaac ggccttcctg 600

actagcatca acgctatgat gtccatgctg tggggtggta aacagggcgg cgagcgtaaa 660

ggcgccgacg tatggggcca gtttcgtgac ctgatcaccg aactgatggt gattctgggc 720

aaaccgaacg tcagcgacat cttcccggtt ctggctcgct tcgacatcca gggcctggaa 780

aaagaaatga ccaagatcgt caactctttc gacaaactgt ttaactccat gatcgaagaa 840

cgcgaaaatt tctctaacaa actgagcaaa gaagatggca acaccgaaac taaagatttc 900

ctgcagctgc tgctggacct gaaacaaaag aacgattctg gtatctccat taccatgaac 960

caagtgaaag cgctgctgat ggacattgtt gtgggtggta ctgacaccac ttctaccatg 1020

atggaatgga cgatggcaga actgattgct aatccggaag cgatgaagaa agtgaaacaa 1080

gaaattgatg atgtagtggg ctctgatggt gcggtagacg agacgcacct gcctaagctg 1140

cgttatctgg acgcagccgt gaaagaaacc ttccgtctgc atccgcctat gccgctgctg 1200

gttccacgtt gcccaggcga ttccagcaac gttggtggct atagcgtacc gaagggtacc 1260

cgtgtgttcc tgaatatctg gtgcattcag cgcgacccgc agctgtggga aaacccgctg 1320

gagttcaaac ctgaacgctt cctgaccgac catgaaaagc tggactacct gggcaacgat 1380

tcccgttaca tgccgttcgg ttctggccgt cgtatgtgcg caggcgtctc cctgggcgag 1440

aaaatgctgt actctagcct ggctgccatg atccacgctt acgactggaa cctggcagat 1500

ggtgaagaga acgacctgat cggcctgttc ggcatcatta tgaaaaagaa aaagccgctg 1560

atcctggtgc cgactccgcg tccaagcaac ctgcagcact acatgaaact ggtgccgcgt 1620

ggctctaaag aaaccgctgc tgcaaaattc gaacgtcagc acatggacag ctaataa 1677

<210>36

<211>717

<212>PRT

<213> Taxus cuspidata

<400>36

Met Gln Ala Asn Ser Asn Thr Val Glu Gly Ala Ser Gln Gly Lys Ser

1 5 10 15

Leu Leu Asp Ile Ser Arg Leu Asp His Ile Phe Ala Leu Leu Leu Asn

20 25 30

Gly Lys Gly Gly Asp Leu Gly Ala Met Thr Gly Ser Ala Leu Ile Leu

35 40 45

Thr Glu Asn Ser Gln Asn Leu Met Ile Leu Thr Thr Ala Leu Ala Val

50 55 60

Leu Val Ala Cys Val Phe Phe Phe Val Trp Arg Arg Gly Gly Ser Asp

65 70 75 80

Thr Gln Lys Pro Ala Val Arg Pro Thr Pro Leu Val Lys Glu Glu Asp

85 90 95

Glu Glu Glu Glu Asp Asp Ser Ala Lys Lys Lys Val Thr Ile Phe Phe

100 105 110

Gly Thr Gln Thr Gly Thr Ala Glu Gly Phe Ala Lys Ala Leu Ala Glu

115 120 125

Glu Ala Lys Ala Arg Tyr Glu Lys Ala Val Phe Lys Val Val Asp Leu

130 135 140

Asp Asn Tyr Ala Ala Asp Asp Glu Gln Tyr Glu Glu Lys Leu Lys Lys

145 150 155 160

Glu Lys Leu Ala Phe Phe Met Leu Ala Thr Tyr Gly Asp Gly Glu Pro

165 170 175

Thr Asp Asn Ala Ala Arg Phe Tyr Lys Trp Phe Leu Glu Gly Lys Glu

180 185 190

Arg Glu Pro Trp Leu Ser Asp Leu Thr Tyr Gly Val Phe Gly Leu Gly

195 200 205

Asn Arg Gln Tyr Glu His Phe Asn Lys Val Ala Lys Ala Val Asp Glu

210 215 220

Val Leu Ile Glu Gln Gly Ala Lys Arg Leu Val Pro Val Gly Leu Gly

225 230 235 240

Asp Asp Asp Gln Cys Ile Glu Asp Asp Phe Thr Ala Trp Arg Glu Gln

245 250 255

Val Trp Pro Glu Leu Asp Gln Leu Leu Arg Asp Glu Asp Asp Glu Pro

260 265 270

Thr Ser Ala Thr Pro Tyr Thr Ala Ala Ile Pro Glu Tyr Arg Val Glu

275 280 285

Ile Tyr Asp Ser Val Val Ser Val Tyr Glu Glu Thr His Ala Leu Lys

290 295 300

Gln Asn Gly Gln Ala Val Tyr Asp Ile His His Pro Cys Arg Ser Asn

305 310 315 320

Val Ala Val Arg Arg Glu Leu His Thr Pro Leu Ser Asp Arg Ser Cys

325 330 335

Ile His Leu Glu Phe Asp Ile Ser Asp Thr Gly Leu Ile Tyr Glu Thr

340 345 350

Gly Asp His Val Gly Val His Thr Glu Asn Ser Ile Glu Thr Val Glu

355 360 365

Glu Ala Ala Lys Leu Leu Gly Tyr Gln Leu Asp Thr Ile Phe Ser Val

370 375 380

His Gly Asp Lys Glu Asp Gly Thr Pro Leu Gly Gly Ser Ser Leu Pro

385 390 395 400

Pro Pro Phe Pro Gly Pro Cys Thr Leu Arg Thr Ala Leu Ala Arg Tyr

405 410 415

Ala Asp Leu Leu Asn Pro Pro Arg Lys Ala Ala Phe Leu Ala Leu Ala

420 425 430

Ala His Ala Ser Asp Pro Ala Glu Ala Glu Arg Leu Lys Phe Leu Ser

435 440 445

Ser Pro Ala Gly Lys Asp Glu Tyr Ser Gln Trp Val Thr Ala Ser Gln

450 455 460

Arg Ser Leu Leu Glu Ile Met Ala Glu Phe Pro Ser Ala Lys Pro Pro

465 470 475 480

Leu Gly Val Phe Phe Ala Ala Ile Ala Pro Arg Leu Gln Pro Arg Tyr

485 490 495

Tyr Ser Ile Ser Ser Ser Pro Arg Phe Ala Pro Ser Arg Ile His Val

500 505 510

Thr Cys Ala Leu Val Tyr Gly Pro Ser Pro Thr Gly Arg Ile His Lys

515 520 525

Gly Val Cys Ser Asn Trp Met Lys Asn Ser Leu Pro Ser Glu Glu Thr

530 535 540

His Asp Cys Ser Trp Ala Pro Val Phe Val Arg Gln Ser Asn Phe Lys

545 550 555 560

Leu Pro Ala Asp Ser Thr Thr Pro Ile Val Met Val Gly Pro Gly Thr

565 570 575

Gly Phe Ala Pro Phe Arg Gly Phe Leu Gln Glu Arg Ala Lys Leu Gln

580 585 590

Glu Ala Gly Glu Lys Leu Gly Pro Ala Val Leu Phe Phe Gly Cys Arg

595 600 605

Asn Arg Gln Met Asp Tyr Ile Tyr Glu Asp Glu Leu Lys Gly Tyr Val

610 615 620

Glu Lys Gly Ile Leu Thr Asn Leu Ile Val Ala Phe Ser Arg Glu Gly

625 630 635 640

Ala Thr Lys Glu Tyr Val Gln His Lys Met Leu Glu Lys Ala Ser Asp

645 650 655

Thr Trp Ser Leu Ile Ala Gln Gly Gly Tyr Leu Tyr Val Cys Gly Asp

660 665 670

Ala Lys Gly Met Ala Arg Asp Val His Arg Thr Leu His Thr Ile Val

675 680 685

Gln Glu Gln Glu Ser Val Asp Ser Ser Lys Ala Glu Phe Leu Val Lys

690 695 700

Lys Leu Gln Met Asp Gly Arg Tyr Leu Arg Asp Ile Trp

705 710 715

<210>37<210>37

<211>680<211>680

<212>PRT<212>PRT

<213>热带念珠菌(Candida tropicalis)<213> Tropical Candida (Candida tropicalis)

<400>37<400>37

Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile Ile Thr Leu Val Val

1 5 10 15

Ala Ile Ala Ala Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Gln Gln

20 25 30

Asp Thr Gly Phe Leu Asn Thr Asp Ser Gly Asp Gly Asn Ser Arg Asp

35 40 45

Ile Leu Gln Ala Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe

50 55 60

Gly Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg

65 70 75 80

Glu Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp Phe Ala

85 90 95

Asp Tyr Asp Phe Glu Asn Phe Gly Asp Ile Thr Glu Asp Ile Leu Val

100 105 110

Phe Phe Ile Val Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn Ala

115 120 125

Asp Glu Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr Leu Ser Thr

130 135 140

Leu Lys Tyr Thr Val Phe Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe

145 150 155 160

Asn Ala Ile Gly Arg Lys Phe Asp Arg Leu Leu Gly Glu Lys Gly Gly

165 170 175

Asp Arg Phe Ala Glu Tyr Gly Glu Gly Asp Asp Gly Thr Gly Thr Leu

180 185 190

Asp Glu Asp Phe Leu Ala Trp Lys Asp Asn Val Phe Asp Ser Leu Lys

195 200 205

Asn Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro Asn Val

210 215 220

Lys Leu Thr Glu Arg Asp Asp Leu Ser Gly Asn Asp Pro Asp Val Ser

225 230 235 240

Leu Gly Glu Pro Asn Val Lys Tyr Ile Lys Ser Glu Gly Val Asp Leu

245 250 255

Thr Lys Gly Pro Phe Asp His Thr His Pro Phe Leu Ala Arg Ile Val

260 265 270

Lys Thr Lys Glu Leu Phe Thr Ser Glu Asp Arg His Cys Val His Val

275 280 285

Glu Phe Asp Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His

290 295 300

Leu Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala

305 310 315 320

Lys Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu Lys

325 330 335

Ala Leu Asp Ser Thr Tyr Ser Ile Pro Phe Pro Asn Pro Ile Thr Tyr

340 345 350

Gly Ala Val Ile Arg His His Leu Glu Ile Ser Gly Pro Val Ser Arg

355 360 365

Gln Phe Phe Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu Thr Lys

370 375 380

Lys Ser Phe Thr Arg Ile Gly Gly Asp Lys Gln Glu Phe Ala Ser Lys

385 390 395 400

Val Thr Arg Arg Lys Phe Asn Ile Ala Asp Ala Leu Leu Phe Ala Ser

405 410 415

Asn Asn Arg Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn

420 425 430

Val Gln His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu

435 440 445

Ser Glu Lys Gln Thr Ile Asn Val Thr Ala Val Val Glu Ala Glu Glu

450 455 460

Glu Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn Leu Leu Lys

465 470 475 480

Asn Ile Glu Ile Glu Gln Asn Lys Thr Gly Glu Thr Pro Met Val His

485 490 495

Tyr Asp Leu Asn Gly Pro Arg Gly Lys Phe Ser Lys Phe Arg Leu Pro

500 505 510

Val His Val Arg Arg Ser Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr

515 520 525

Pro Val Ile Leu Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly

530 535 540

Phe Val Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly

545 550 555 560

Lys Thr Val Leu Phe Tyr Gly Cys Arg Asn Ser Glu Gln Asp Phe Leu

565 570 575

Tyr Lys Gln Glu Trp Ser Glu Tyr Ala Ser Val Leu Gly Glu Asn Phe

580 585 590

Glu Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Thr Lys Lys Val Tyr

595 600 605

Val Gln Asp Lys Ile Leu Glu Asn Ser Ala Leu Val Asp Glu Leu Leu

610 615 620

Ser Ser Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser Arg Met Ala

625 630 635 640

Arg Asp Val Gln Ala Ala Ile Ala Lys Ile Val Ala Lys Ser Arg Asp

645 650 655

Ile His Glu Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln

660 665 670

Asn Arg Tyr Gln Glu Asp Val Trp

675 680

<210>38

<211>692

<212>PRT

<213>Arabidopsis thaliana

<400>38

Met Thr Ser Ala Leu Tyr Ala Ser Asp Leu Phe Lys Gln Leu Lys Ser

1 5 10 15

Ile Met Gly Thr Asp Ser Leu Ser Asp Asp Val Val Leu Val Ile Ala

20 25 30

Thr Thr Ser Leu Ala Leu Val Ala Gly Phe Val Val Leu Leu Trp Lys

35 40 45

Lys Thr Thr Ala Asp Arg Ser Gly Glu Leu Lys Pro Leu Met Ile Pro

50 55 60

Lys Ser Leu Met Ala Lys Asp Glu Asp Asp Asp Leu Asp Leu Gly Ser

65 70 75 80

Gly Lys Thr Arg Val Ser Ile Phe Phe Gly Thr Gln Thr Gly Thr Ala

85 90 95

Glu Gly Phe Ala Lys Ala Leu Ser Glu Glu Ile Lys Ala Arg Tyr Glu

100 105 110

Lys Ala Ala Val Lys Val Ile Asp Leu Asp Asp Tyr Ala Ala Asp Asp

115 120 125

Asp Gln Tyr Glu Glu Lys Leu Lys Lys Glu Thr Leu Ala Phe Phe Cys

130 135 140

Val Ala Thr Tyr Gly Asp Gly Glu Pro Thr Asp Asn Ala Ala Arg Phe

145 150 155 160

Ser Lys Trp Phe Thr Glu Glu Asn Glu Arg Asp Ile Lys Leu Gln Gln

165 170 175

Leu Ala Tyr Gly Val Phe Ala Leu Gly Asn Arg Gln Tyr Glu His Phe

180 185 190

Asn Lys Ile Gly Ile Val Leu Asp Glu Glu Leu Cys Lys Lys Gly Ala

195 200 205

Lys Arg Leu Ile Glu Val Gly Leu Gly Asp Asp Asp Gln Ser Ile Glu

210 215 220

Asp Asp Phe Asn Ala Trp Lys Glu Ser Leu Trp Ser Glu Leu Asp Lys

225 230 235 240

Leu Leu Lys Asp Glu Asp Asp Lys Ser Val Ala Thr Pro Tyr Thr Ala

245 250 255

Val Ile Pro Glu Tyr Arg Val Val Thr His Asp Pro Arg Phe Thr Thr

260 265 270

Gln Lys Ser Met Glu Ser Asn Val Ala Asn Gly Asn Thr Thr Ile Asp

275 280 285

Ile His His Pro Cys Arg Val Asp Val Ala Val Gln Lys Glu Leu His

290 295 300

Thr His Glu Ser Asp Arg Ser Cys Ile His Leu Glu Phe Asp Ile Ser

305 310 315 320

Arg Thr Gly Ile Thr Tyr Glu Thr Gly Asp His Val Gly Val Tyr Ala

325 330 335

Glu Asn His Val Glu Ile Val Glu Glu Ala Gly Lys Leu Leu Gly His

340 345 350

Ser Leu Asp Leu Val Phe Ser Ile His Ala Asp Lys Glu Asp Gly Ser

355 360 365

Pro Leu Glu Ser Ala Val Pro Pro Pro Phe Pro Gly Pro Cys Thr Leu

370 375 380

Gly Thr Gly Leu Ala Arg Tyr Ala Asp Leu Leu Asn Pro Pro Arg Lys

385 390 395 400

Ser Ala Leu Val Ala Leu Ala Ala Tyr Ala Thr Glu Pro Ser Glu Ala

405 410 415

Glu Lys Leu Lys His Leu Thr Ser Pro Asp Gly Lys Asp Glu Tyr Ser

420 425 430

Gln Trp Ile Val Ala Ser Gln Arg Ser Leu Leu Glu Val Met Ala Ala

435 440 445

Phe Pro Ser Ala Lys Pro Pro Leu Gly Val Phe Phe Ala Ala Ile Ala

450 455 460

Pro Arg Leu Gln Pro Arg Tyr Tyr Ser Ile Ser Ser Cys Gln Asp Trp

465 470 475 480

Ala Pro Ser Arg Val His Val Thr Ser Ala Leu Val Tyr Gly Pro Thr

485 490 495

Pro Thr Gly Arg Ile His Lys Gly Val Cys Ser Thr Trp Met Lys Asn

500 505 510

Ala Val Pro Ala Glu Lys Ser His Glu Cys Ser Gly Ala Pro Ile Phe

515 520 525

Ile Arg Ala Ser Asn Phe Lys Leu Pro Ser Asn Pro Ser Thr Pro Ile

530 535 540

Val Met Val Gly Pro Gly Thr Gly Leu Ala Pro Phe Arg Gly Phe Leu

545 550 555 560

Gln Glu Arg Met Ala Leu Lys Glu Asp Gly Glu Glu Leu Gly Ser Ser

565 570 575

Leu Leu Phe Phe Gly Cys Arg Asn Arg Gln Met Asp Phe Ile Tyr Glu

580 585 590

Asp Glu Leu Asn Asn Phe Val Asp Gln Gly Val Ile Ser Glu Leu Ile

595 600 605

Met Ala Phe Ser Arg Glu Gly Ala Gln Lys Glu Tyr Val Gln His Lys

610 615 620

Met Met Glu Lys Ala Ala Gln Val Trp Asp Leu Ile Lys Glu Glu Gly

625 630 635 640

Tyr Leu Tyr Val Cys Gly Asp Ala Lys Gly Met Ala Arg Asp Val His

645 650 655

Arg Thr Leu His Thr Ile Val Gln Glu Gln Glu Gly Val Ser Ser Ser

660 665 670

Glu Ala Glu Ala Ile Val Lys Lys Leu Gln Thr Glu Gly Arg Tyr Leu

675 680 685

Arg Asp Val Trp

690

<210>39

<211>712

<212>PRT

<213>Arabidopsis thaliana

<400>39

Met Ser Ser Ser Ser Ser Ser Ser Thr Ser Met Ile Asp Leu Met Ala

1 5 10 15

Ala Ile Ile Lys Gly Glu Pro Val Ile Val Ser Asp Pro Ala Asn Ala

20 25 30

Ser Ala Tyr Glu Ser Val Ala Ala Glu Leu Ser Ser Met Leu Ile Glu

35 40 45

Asn Arg Gln Phe Ala Met Ile Val Thr Thr Ser Ile Ala Val Leu Ile

50 55 60

Gly Cys Ile Val Met Leu Val Trp Arg Arg Ser Gly Ser Gly Asn Ser

65 70 75 80

Lys Arg Val Glu Pro Leu Lys Pro Leu Val Ile Lys Pro Arg Glu Glu

85 90 95

Glu Ile Asp Asp Gly Arg Lys Lys Val Thr Ile Phe Phe Gly Thr Gln

100 105 110

Thr Gly Thr Ala Glu Gly Phe Ala Lys Ala Leu Gly Glu Glu Ala Lys

115 120 125

Ala Arg Tyr Glu Lys Thr Arg Phe Lys Ile Val Asp Leu Asp Asp Tyr

130 135 140

Ala Ala Asp Asp Asp Glu Tyr Glu Glu Lys Leu Lys Lys Glu Asp Val

145 150 155 160

Ala Phe Phe Phe Leu Ala Thr Tyr Gly Asp Gly Glu Pro Thr Asp Asn

165 170 175

Ala Ala Arg Phe Tyr Lys Trp Phe Thr Glu Gly Asn Asp Arg Gly Glu

180 185 190

Trp Leu Lys Asn Leu Lys Tyr Gly Val Phe Gly Leu Gly Asn Arg Gln

195 200 205

Tyr Glu His Phe Asn Lys Val Ala Lys Val Val Asp Asp Ile Leu Val

210 215 220

Glu Gln Gly Ala Gln Arg Leu Val Gln Val Gly Leu Gly Asp Asp Asp

225 230 235 240

Gln Cys Ile Glu Asp Asp Phe Thr Ala Trp Arg Glu Ala Leu Trp Pro

245 250 255

Glu Leu Asp Thr Ile Leu Arg Glu Glu Gly Asp Thr Ala Val Ala Thr

260 265 270

Pro Tyr Thr Ala Ala Val Leu Glu Tyr Arg Val Ser Ile His Asp Ser

275 280 285

Glu Asp Ala Lys Phe Asn Asp Ile Thr Leu Ala Asn Gly Asn Gly Tyr

290 295 300

Thr Val Phe Asp Ala Gln His Pro Tyr Lys Ala Asn Val Ala Val Lys

305 310 315 320

Arg Glu Leu His Thr Pro Glu Ser Asp Arg Ser Cys Ile His Leu Glu

325 330 335

Phe Asp Ile Ala Gly Ser Gly Leu Thr Met Lys Leu Gly Asp His Val

340 345 350

Gly Val Leu Cys Asp Asn Leu Ser Glu Thr Val Asp Glu Ala Leu Arg

355 360 365

Leu Leu Asp Met Ser Pro Asp Thr Tyr Phe Ser Leu His Ala Glu Lys

370 375 380

Glu Asp Gly Thr Pro Ile Ser Ser Ser Leu Pro Pro Pro Phe Pro Pro

385 390 395 400

Cys Asn Leu Arg Thr Ala Leu Thr Arg Tyr Ala Cys Leu Leu Ser Ser

405 410 415

Pro Lys Lys Ser Ala Leu Val Ala Leu Ala Ala His Ala Ser Asp Pro

420 425 430

Thr Glu Ala Glu Arg Leu Lys His Leu Ala Ser Pro Ala Gly Lys Asp

435 440 445

Glu Tyr Ser Lys Trp Val Val Glu Ser Gln Arg Ser Leu Leu Glu Val

450 455 460

Met Ala Glu Phe Pro Ser Ala Lys Pro Pro Leu Gly Val Phe Phe Ala

465 470 475 480

Gly Val Ala Pro Arg Leu Gln Pro Arg Phe Tyr Ser Ile Ser Ser Ser

485 490 495

Pro Lys Ile Ala Glu Thr Arg Ile His Val Thr Cys Ala Leu Val Tyr

500 505 510

Glu Lys Met Pro Thr Gly Arg Ile His Lys Gly Val Cys Ser Thr Trp

515 520 525

Met Lys Asn Ala Val Pro Tyr Glu Lys Ser Glu Lys Leu Phe Leu Gly

530 535 540

Arg Pro Ile Phe Val Arg Gln Ser Asn Phe Lys Leu Pro Ser Asp Ser

545 550 555 560

Lys Val Pro Ile Ile Met Ile Gly Pro Gly Thr Gly Leu Ala Pro Phe

565 570 575

Arg Gly Phe Leu Gln Glu Arg Leu Ala Leu Val Glu Ser Gly Val Glu

580 585 590

Leu Gly Pro Ser Val Leu Phe Phe Gly Cys Arg Asn Arg Arg Met Asp

595 600 605

Phe Ile Tyr Glu Glu Glu Leu Gln Arg Phe Val Glu Ser Gly Ala Leu

610 615 620

Ala Glu Leu Ser Val Ala Phe Ser Arg Glu Gly Pro Thr Lys Glu Tyr

625 630 635 640

Val Gln His Lys Met Met Asp Lys Ala Ser Asp Ile Trp Asn Met Ile

645 650 655

Ser Gln Gly Ala Tyr Leu Tyr Val Cys Gly Asp Ala Lys Gly Met Ala

660 665 670

Arg Asp Val His Arg Ser Leu His Thr Ile Ala Gln Glu Gln Gly Ser

675 680 685

Met Asp Ser Thr Lys Ala Glu Gly Phe Val Lys Asn Leu Gln Thr Ser

690 695 700

Gly Arg Tyr Leu Arg Asp Val Trp

705 710

<210>40

<211>667

<212>PRT

<213>Arabidopsis thaliana

<400>40

Met Leu Ile Glu Asn Arg Gln Phe Ala Met Ile Val Thr Thr Ser Ile

1 5 10 15

Ala Val Leu Ile Gly Cys Ile Val Met Leu Val Trp Arg Arg Ser Gly

20 25 30

Ser Gly Asn Ser Lys Arg Val Glu Pro Leu Lys Pro Leu Val Ile Lys

35 40 45

Pro Arg Glu Glu Glu Ile Asp Asp Gly Arg Lys Lys Val Thr Ile Phe

50 55 60

Phe Gly Thr Gln Thr Gly Thr Ala Glu Gly Phe Ala Lys Ala Leu Gly

65 70 75 80

Glu Glu Ala Lys Ala Arg Tyr Glu Lys Thr Arg Phe Lys Ile Val Asp

85 90 95

Leu Asp Asp Tyr Ala Ala Asp Asp Asp Glu Tyr Glu Glu Lys Leu Lys

100 105 110

Lys Glu Asp Val Ala Phe Phe Phe Leu Ala Thr Tyr Gly Asp Gly Glu

115 120 125

Pro Thr Asp Asn Ala Ala Arg Phe Tyr Lys Trp Phe Thr Glu Gly Asn

130 135 140

Asp Arg Gly Glu Trp Leu Lys Asn Leu Lys Tyr Gly Val Phe Gly Leu

145 150 155 160

Gly Asn Arg Gln Tyr Glu His Phe Asn Lys Val Ala Lys Val Val Asp

165 170 175

Asp Ile Leu Val Glu Gln Gly Ala Gln Arg Leu Val Gln Val Gly Leu

180 185 190

Gly Asp Asp Asp Gln Cys Ile Glu Asp Asp Phe Thr Ala Trp Arg Glu

195 200 205

Ala Leu Trp Pro Glu Leu Asp Thr Ile Leu Arg Glu Glu Gly Asp Thr

210 215 220

Ala Val Ala Thr Pro Tyr Thr Ala Ala Val Leu Glu Tyr Arg Val Ser

225 230 235 240

Ile His Asp Ser Glu Asp Ala Lys Phe Asn Asp Ile Asn Met Ala Asn

245 250 255

Gly Asn Gly Tyr Thr Val Phe Asp Ala Gln His Pro Tyr Lys Ala Asn

260 265 270

Val Ala Val Lys Arg Glu Leu His Thr Pro Glu Ser Asp Arg Ser Cys

275 280 285

Ile His Leu Glu Phe Asp Ile Ala Gly Ser Gly Leu Thr Tyr Glu Thr

290 295 300

Gly Asp His Val Gly Val Leu Cys Asp Asn Leu Ser Glu Thr Val Asp

305 310 315 320

Glu Ala Leu Arg Leu Leu Asp Met Ser Pro Asp Thr Tyr Phe Ser Leu

325 330 335

His Ala Glu Lys Glu Asp Gly Thr Pro Ile Ser Ser Ser Leu Pro Pro

340 345 350

Pro Phe Pro Pro Cys Asn Leu Arg Thr Ala Leu Thr Arg Tyr Ala Cys

355 360 365

Leu Leu Ser Ser Pro Lys Lys Ser Ala Leu Val Ala Leu Ala Ala His

370 375 380

Ala Ser Asp Pro Thr Glu Ala Glu Arg Leu Lys His Leu Ala Ser Pro

385 390 395 400

Ala Gly Lys Asp Glu Tyr Ser Lys Trp Val Val Glu Ser Gln Arg Ser

405 410 415

Leu Leu Glu Val Met Ala Glu Phe Pro Ser Ala Lys Pro Pro Leu Gly

420 425 430

Val Phe Phe Ala Gly Val Ala Pro Arg Leu Gln Pro Arg Phe Tyr Ser

435 440 445

Ile Ser Ser Ser Pro Lys Ile Ala Glu Thr Arg Ile His Val Thr Cys

450 455 460

Ala Leu Val Tyr Glu Lys Met Pro Thr Gly Arg Ile His Lys Gly Val

465 470 475 480

Cys Ser Thr Trp Met Lys Asn Ala Val Pro Tyr Glu Lys Ser Glu Asn

485 490 495

Cys Ser Ser Ala Pro Ile Phe Val Arg Gln Ser Asn Phe Lys Leu Pro

500 505 510

Ser Asp Ser Lys Val Pro Ile Ile Met Ile Gly Pro Gly Thr Gly Leu

515 520 525

Ala Pro Phe Arg Gly Phe Leu Gln Glu Arg Leu Ala Leu Val Glu Ser

530 535 540

Gly Val Glu Leu Gly Pro Ser Val Leu Phe Phe Gly Cys Arg Asn Arg

545 550 555 560

Arg Met Asp Phe Ile Tyr Glu Glu Glu Leu Gln Arg Phe Val Glu Ser

565 570 575

Gly Ala Leu Ala Glu Leu Ser Val Ala Phe Ser Arg Glu Gly Pro Thr

580 585 590

Lys Glu Tyr Val Gln His Lys Met Met Asp Lys Ala Ser Asp Ile Trp

595 600 605

Asn Met Ile Ser Gln Gly Ala Tyr Leu Tyr Val Cys Gly Asp Ala Lys

610 615 620

Gly Met Ala Arg Asp Val His Arg Ser Leu His Thr Ile Ala Gln Glu

625 630 635 640

Gln Gly Ser Met Asp Ser Thr Lys Ala Glu Gly Phe Val Lys Asn Leu

645 650 655

Gln Thr Ser Gly Arg Tyr Leu Arg Asp Val Trp

660 665

<210>41

<211>560

<212>PRT

<213>Eschscholzia stolonifera

<400>41

Met Glu Lys Pro Ile Leu Leu Gln Leu Gln Ala Gly Ile Leu Gly Leu

1 5 10 15

Leu Ala Leu Ile Cys Phe Leu Tyr Tyr Val Ile Lys Val Ser Leu Ser

20 25 30

Thr Arg Asn Cys Asn Gln Leu Val Lys His Pro Pro Glu Ala Ala Gly

35 40 45

Ser Trp Pro Ile Val Gly His Leu Pro Gln Leu Val Gly Ser Gly Lys

50 55 60

Pro Leu Phe Arg Val Leu Gly Asp Met Ala Asp Lys Phe Gly Pro Ile

65 70 75 80

Phe Met Val Arg Phe Gly Val Tyr Pro Thr Leu Val Val Ser Thr Trp

85 90 95

Glu Met Ala Lys Glu Cys Phe Thr Ser Asn Asp Lys Phe Leu Ala Ser

100 105 110

Arg Pro Pro Ser Ala Ala Ser Ser Tyr Met Thr Tyr Asp His Ala Met

115 120 125

Phe Gly Phe Ser Phe Tyr Gly Pro Tyr Trp Arg Glu Ile Arg Lys Ile

130 135 140

Ser Thr Leu His Leu Leu Ser His Arg Arg Leu Glu Leu Leu Lys His

145 150 155 160

Val Pro His Thr Glu Ile His Asn Phe Ile Lys Gly Leu Phe Gly Ile

165 170 175

Trp Lys Asp His Gln Lys Gln Gln Gln Pro Thr Gly Arg Glu Asp Arg

180 185 190

Asp Ser Val Met Leu Glu Met Ser Gln Leu Phe Gly Tyr Leu Thr Leu

195 200 205

Asn Val Val Leu Ser Leu Val Val Gly Lys Arg Val Cys Asn Tyr His

210 215 220

Ala Asp Gly His Leu Asp Asp Gly Glu Glu Ala Gly Gln Gly Gln Lys

225 230 235 240

Leu His Gln Thr Ile Thr Asp Phe Phe Lys Leu Ser Gly Val Ser Val

245 250 255

Ala Ser Asp Ala Leu Pro Leu Leu Gly Leu Phe Asp Leu Gly Gly Lys

260 265 270

Lys Glu Ser Met Lys Arg Val Ala Lys Glu Met Asp Phe Phe Ala Glu

275 280 285

Arg Trp Leu Gln Asp Lys Lys Leu Ser Leu Ser Leu Ser Ser Glu ThrArg Trp Leu Gln Asp Lys Lys Leu Ser Leu Ser Leu Ser Ser Glu Thr

290 295 300290 295 300

Asn Asn Lys Gln Asn Asp Ala Gly Glu Gly Asp Gly Asp Asp Phe MetAsn Asn Lys Gln Asn Asp Ala Gly Glu Gly Asp Gly Asp Asp Phe Met

305 310 315 320305 310 315 320

Asp Val Leu Met Ser Ile Leu Pro Asp Asp Asp Asp Ser Leu Phe ThrAsp Val Leu Met Ser Ile Leu Pro Asp Asp Asp Asp Ser Leu Phe Thr

325 330 335325 330 335

Lys Tyr Ser Arg Asp Thr Val Ile Lys Ala Thr Ser Leu Ser Met ValLys Tyr Ser Arg Asp Thr Val Ile Lys Ala Thr Ser Leu Ser Met Val

340 345 350340 345 350

Val Ala Ala Ser Asp Thr Thr Ser Val Ser Leu Thr Trp Ala Leu SerVal Ala Ala Ser Asp Thr Thr Ser Val Ser Leu Thr Trp Ala Leu Ser

355 360 365355 360 365

Leu Leu Leu Asn Asn Ile Gln Val Leu Arg Lys Ala Gln Asp Glu LeuLeu Leu Leu Asn Asn Asn Ile Gln Val Leu Arg Lys Ala Gln Asp Glu Leu

370 375 380370 375 380

Asp Thr Lys Val Gly Arg Asp Arg His Val Glu Glu Lys Asp Ile AspAsp Thr Lys Val Gly Arg Asp Arg His Val Glu Glu Lys Asp Ile Asp

385 390 395 400385 390 395 400

Asn Leu Val Tyr Leu Gln Ala Ile Val Lys Glu Thr Leu Arg Met TyrAsn Leu Val Tyr Leu Gln Ala Ile Val Lys Glu Thr Leu Arg Met Tyr

405 410 415405 410 415

Pro Ala Gly Pro Leu Ser Val Pro His Glu Ala Ile Glu Asp Cys AsnPro Ala Gly Pro Leu Ser Val Pro His Glu Ala Ile Glu Asp Cys Asn

420 425 430420 425 430

Val Gly Gly Tyr His Ile Lys Thr Gly Thr Arg Leu Leu Val Asn IleVal Gly Gly Tyr His Ile Lys Thr Gly Thr Arg Leu Leu Val Asn Ile

435 440 445435 440 445

Trp Lys Leu Gln Arg Asp Pro Arg Val Trp Ser Asn Pro Ser Glu PheTrp Lys Leu Gln Arg Asp Pro Arg Val Trp Ser Asn Pro Ser Glu Phe

450 455 460450 455 460

Arg Pro Glu Arg Phe Leu Asp Asn Gln Ser Asn Gly Thr Leu Leu AspArg Pro Glu Arg Phe Leu Asp Asn Gln Ser Asn Gly Thr Leu Leu Asp

465 470 475 480465 470 475 480

Phe Arg Gly Gln His Phe Glu Tyr Ile Pro Phe Gly Ser Gly Arg ArgPhe Arg Gly Gln His Phe Glu Tyr Ile Pro Phe Gly Ser Gly Arg Arg

485 490 495485 490 495

Met Cys Pro Gly Val Asn Phe Ala Thr Leu Ile Leu His Met Thr LeuMet Cys Pro Gly Val Asn Phe Ala Thr Leu Ile Leu His Met Thr Leu

500 505 510500 505 510

Ala Arg Leu Leu Gln Ala Phe Asp Leu Ser Thr Pro Ser Ser Ser ProAla Arg Leu Leu Gln Ala Phe Asp Leu Ser Thr Pro Ser Ser Ser Pro

515 520 525515 520 525

Val Asp Met Thr Glu Gly Ser Gly Leu Thr Met Pro Lys Val Thr ProVal Asp Met Thr Glu Gly Ser Gly Leu Thr Met Pro Lys Val Thr Pro

530 535 540530 535 540

Leu Lys Val Leu Leu Thr Pro Arg Leu Pro Leu Pro Leu Tyr Asp TyrLeu Lys Val Leu Leu Thr Pro Arg Leu Pro Leu Pro Leu Tyr Asp Tyr

545 550 555 560545 550 555 560

<210>42

<211>493

<212>PRT

<213>Periwinkle (Catharanthus roseus)

<400>42

Met Asp Tyr Leu Thr Ile Ile Leu Thr Leu Leu Phe Ala Leu Thr Leu

1 5 10 15

Tyr Glu Ala Phe Ser Tyr Leu Ser Arg Arg Thr Lys Asn Leu Pro Pro

20 25 30

Gly Pro Ser Pro Leu Pro Phe Ile Gly Ser Leu His Leu Leu Gly Asp

35 40 45

Gln Pro His Lys Ser Leu Ala Lys Leu Ser Lys Lys His Gly Pro Ile

50 55 60

Met Ser Leu Lys Leu Gly Gln Ile Thr Thr Ile Val Ile Ser Ser Ser

65 70 75 80

Thr Met Ala Lys Glu Val Leu Gln Lys Gln Asp Leu Ala Phe Ser Ser

85 90 95

Arg Ser Val Pro Asn Ala Leu His Ala His Asn Gln Phe Lys Phe Ser

100 105 110

Val Val Trp Leu Pro Val Ala Ser Arg Trp Arg Ser Leu Arg Lys Val

115 120 125

Leu Asn Ser Asn Ile Phe Ser Gly Asn Arg Leu Asp Ala Asn Gln His

130 135 140

Leu Arg Thr Arg Lys Val Gln Glu Leu Ile Ala Tyr Cys Arg Lys Asn

145 150 155 160

Ser Gln Ser Gly Glu Ala Val Asp Val Gly Arg Ala Ala Phe Arg Thr

165 170 175

Ser Leu Asn Leu Leu Ser Asn Leu Ile Phe Ser Lys Asp Leu Thr Asp

180 185 190

Pro Tyr Ser Asp Ser Ala Lys Glu Phe Lys Asp Leu Val Trp Asn Ile

195 200 205

Met Val Glu Ala Gly Lys Pro Asn Leu Val Asp Phe Phe Pro Leu Leu

210 215 220

Glu Lys Val Asp Pro Gln Gly Ile Arg His Arg Met Thr Ile His Phe

225 230 235 240

Gly Glu Val Leu Lys Leu Phe Gly Gly Leu Val Asn Glu Arg Leu Glu

245 250 255

Gln Arg Arg Ser Lys Gly Glu Lys Asn Asp Val Leu Asp Val Leu Leu

260 265 270

Thr Thr Ser Gln Glu Ser Pro Glu Glu Ile Asp Arg Thr His Ile Glu

275 280 285

Arg Met Cys Leu Asp Leu Phe Val Ala Gly Thr Asp Thr Thr Ser Ser

290 295 300

Thr Leu Glu Trp Ala Met Ser Glu Met Leu Lys Asn Pro Asp Lys Met

305 310 315 320

Lys Lys Thr Gln Asp Glu Leu Ala Gln Val Ile Gly Arg Gly Lys Thr

325 330 335

Ile Glu Glu Ser Asp Ile Asn Arg Leu Pro Tyr Leu Arg Cys Val Met

340 345 350

Lys Glu Thr Leu Arg Ile His Pro Pro Val Pro Phe Leu Ile Pro Arg

355 360 365

Lys Val Glu Gln Ser Val Glu Val Cys Gly Tyr Asn Val Pro Lys Gly

370 375 380

Ser Gln Val Leu Val Asn Ala Trp Ala Ile Gly Arg Asp Glu Thr Val

385 390 395 400

Trp Asp Asp Ala Leu Ala Phe Lys Pro Glu Arg Phe Met Glu Ser Glu

405 410 415

Leu Asp Ile Arg Gly Arg Asp Phe Glu Leu Ile Pro Phe Gly Ala Gly

420 425 430

Arg Arg Ile Cys Pro Gly Leu Pro Leu Ala Leu Arg Thr Val Pro Leu

435 440 445

Met Leu Gly Ser Leu Leu Asn Ser Phe Asn Trp Lys Leu Glu Gly Gly

450 455 460

Met Ala Pro Lys Asp Leu Asp Met Glu Glu Lys Phe Gly Ile Thr Leu

465 470 475 480

Gln Lys Ala His Pro Leu Arg Ala Val Pro Ser Thr Leu

485 490

<210>43

<211>496

<212>PRT

<213>Periwinkle (Catharanthus roseus)

<400>43

Met Leu Leu Phe Cys Phe Ile Leu Ser Lys Thr Thr Lys Lys Phe Gly

1 5 10 15

Gln Asn Ser Gln Tyr Ser Asn His Asp Glu Leu Pro Pro Gly Pro Pro

20 25 30

Gln Ile Pro Ile Leu Gly Asn Ala His Gln Leu Ser Gly Gly His Thr

35 40 45

His His Ile Leu Arg Asp Leu Ala Lys Lys Tyr Gly Pro Leu Met His

50 55 60

Leu Lys Ile Gly Glu Val Ser Thr Ile Val Ala Ser Ser Pro Gln Ile

65 70 75 80

Ala Glu Glu Ile Phe Arg Thr His Asp Ile Leu Phe Ala Asp Arg Pro

85 90 95

Ser Asn Leu Glu Ser Phe Lys Ile Val Ser Tyr Asp Phe Ser Asp Met

100 105 110

Val Val Ser Pro Tyr Gly Asn Tyr Trp Arg Gln Leu Arg Lys Ile Ser

115 120 125

Met Met Glu Leu Leu Ser Gln Lys Ser Val Gln Ser Phe Arg Ser Ile

130 135 140

Arg Glu Glu Glu Val Leu Asn Phe Ile Lys Ser Ile Gly Ser Lys Glu

145 150 155 160

Gly Thr Arg Ile Asn Leu Ser Lys Glu Ile Ser Leu Leu Ile Tyr Gly

165 170 175

Ile Thr Thr Arg Ala Ala Phe Gly Glu Lys Asn Lys Asn Thr Glu Glu

180 185 190

Phe Ile Arg Leu Leu Asp Gln Leu Thr Lys Ala Val Ala Glu Pro Asn

195 200 205

Ile Ala Asp Met Phe Pro Ser Leu Lys Phe Leu Gln Leu Ile Ser Thr

210 215 220

Ser Lys Tyr Lys Ile Glu Lys Ile His Lys Gln Phe Asp Val Ile Val

225 230 235 240

Glu Thr Ile Leu Lys Gly His Lys Glu Lys Ile Asn Lys Pro Leu Ser

245 250 255

Gln Glu Asn Gly Glu Lys Lys Glu Asp Leu Val Asp Val Leu Leu Asn

260 265 270

Ile Gln Arg Arg Asn Asp Phe Glu Ala Pro Leu Gly Asp Lys Asn Ile

275 280 285

Lys Ala Ile Ile Phe Asn Ile Phe Ser Ala Gly Thr Glu Thr Ser Ser

290 295 300

Thr Thr Val Asp Trp Ala Met Cys Glu Met Ile Lys Asn Pro Thr Val

305 310 315 320

Met Lys Lys Ala Gln Glu Glu Val Arg Lys Val Phe Asn Glu Glu Gly

325 330 335

Asn Val Asp Glu Thr Lys Leu His Gln Leu Lys Tyr Leu Gln Ala Val

340 345 350

Ile Lys Glu Thr Leu Arg Leu His Pro Pro Val Pro Leu Leu Leu Pro

355 360 365

Arg Glu Cys Arg Glu Gln Cys Lys Ile Lys Gly Tyr Thr Ile Pro Ser

370 375 380

Lys Ser Arg Val Ile Val Asn Ala Trp Ala Ile Gly Arg Asp Pro Asn

385 390 395 400

Tyr Trp Ile Glu Pro Glu Lys Phe Asn Pro Asp Arg Phe Leu Glu Ser

405 410 415

Lys Val Asp Phe Lys Gly Asn Ser Phe Glu Tyr Leu Pro Phe Gly Gly

420 425 430

Gly Arg Arg Ile Cys Pro Gly Ile Thr Phe Ala Leu Ala Asn Ile Glu

435 440 445

Leu Pro Leu Ala Gln Leu Leu Phe His Phe Asp Trp Gln Ser Asn Thr

450 455 460

Glu Lys Leu Asn Met Lys Glu Ser Arg Gly Val Thr Val Arg Arg Glu

465 470 475 480

Asp Asp Leu Tyr Leu Thr Pro Val Asn Phe Ser Ser Ser Ser Pro Ala

485 490 495

<210>44

<211>505

<212>PRT

<213>Arabidopsis thaliana

<400>44

Met Asp Leu Leu Leu Leu Glu Lys Ser Leu Ile Ala Val Phe Val Ala

1 5 10 15

Val Ile Leu Ala Thr Val Ile Ser Lys Leu Arg Gly Lys Lys Leu Lys

20 25 30

Leu Pro Pro Gly Pro Ile Pro Ile Pro Ile Phe Gly Asn Trp Leu Gln

35 40 45

Val Gly Asp Asp Leu Asn His Arg Asn Leu Val Asp Tyr Ala Lys Lys

50 55 60

Phe Gly Asp Leu Phe Leu Leu Arg Met Gly Gln Arg Asn Leu Val Val

65 70 75 80

Val Ser Ser Pro Asp Leu Thr Lys Glu Val Leu Leu Thr Gln Gly Val

85 90 95

Glu Phe Gly Ser Arg Thr Arg Asn Val Val Phe Asp Ile Phe Thr Gly

100 105 110

Lys Gly Gln Asp Met Val Phe Thr Val Tyr Gly Glu His Trp Arg Lys

115 120 125

Met Arg Arg Ile Met Thr Val Pro Phe Phe Thr Asn Lys Val Val Gln

130 135 140

Gln Asn Arg Glu Gly Trp Glu Phe Glu Ala Ala Ser Val Val Glu Asp

145 150 155 160

Val Lys Lys Asn Pro Asp Ser Ala Thr Lys Gly Ile Val Leu Arg Lys

165 170 175

Arg Leu Gln Leu Met Met Tyr Asn Asn Met Phe Arg Ile Met Phe Asp

180 185 190

Arg Arg Phe Glu Ser Glu Asp Asp Pro Leu Phe Leu Arg Leu Lys Ala

195 200 205

Leu Asn Gly Glu Arg Ser Arg Leu Ala Gln Ser Phe Glu Tyr Asn Tyr

210 215 220

Gly Asp Phe Ile Pro Ile Leu Arg Pro Phe Leu Arg Gly Tyr Leu Lys

225 230 235 240

Ile Cys Gln Asp Val Lys Asp Arg Arg Ile Ala Leu Phe Lys Lys Tyr

245 250 255

Phe Val Asp Glu Arg Lys Gln Ile Ala Ser Ser Lys Pro Thr Gly Ser

260 265 270

Glu Gly Leu Lys Cys Ala Ile Asp His Ile Leu Glu Ala Glu Gln Lys

275 280 285

Gly Glu Ile Asn Glu Asp Asn Val Leu Tyr Ile Val Glu Asn Ile Asn

290 295 300

Val Ala Ala Ile Glu Thr Thr Leu Trp Ser Ile Glu Trp Gly Ile Ala

305 310 315 320

Glu Leu Val Asn His Pro Glu Ile Gln Ser Lys Leu Arg Asn Glu Leu

325 330 335

Asp Thr Val Leu Gly Pro Gly Val Gln Val Thr Glu Pro Asp Leu His

340 345 350

Lys Leu Pro Tyr Leu Gln Ala Val Val Lys Glu Thr Leu Arg Leu Arg

355 360 365

Met Ala Ile Pro Leu Leu Val Pro His Met Asn Leu His Asp Ala Lys

370 375 380

Leu Ala Gly Tyr Asp Ile Pro Ala Glu Ser Lys Ile Leu Val Asn Ala

385 390 395 400

Trp Trp Leu Ala Asn Asn Pro Asn Ser Trp Lys Lys Pro Glu Glu Phe

405 410 415

Arg Pro Glu Arg Phe Phe Glu Glu Glu Ser His Val Glu Ala Asn Gly

420 425 430

Asn Asp Phe Arg Tyr Val Pro Phe Gly Val Gly Arg Arg Ser Cys Pro

435 440 445

Gly Ile Ile Leu Ala Leu Pro Ile Leu Gly Ile Thr Ile Gly Arg Met

450 455 460

Val Gln Asn Phe Glu Leu Leu Pro Pro Pro Gly Gln Ser Lys Val Asp

465 470 475 480

Thr Ser Glu Lys Gly Gly Gln Phe Ser Leu His Ile Leu Asn His Ser

485 490 495

Ile Ile Val Met Lys Pro Arg Asn Cys

500 505

<210>45

<211>512

<212>PRT

<213>Basil (Ocimum basilicum)

<400>45

Met Ala Ala Leu Leu Leu Leu Leu Leu Leu Pro Leu Leu Leu Pro Ala

1 5 10 15

Ile Phe Leu Leu His His Leu Tyr Tyr Arg Leu Arg Phe Arg Leu Pro

20 25 30

Pro Gly Pro Arg Pro Leu Pro Ile Val Gly Asn Leu Tyr Asp Val Lys

35 40 45

Pro Val Arg Phe Arg Cys Phe Ala Asp Trp Ala Gln Ser Tyr Gly Pro

50 55 60

Ile Ile Ser Val Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn

65 70 75 80

Thr Glu Leu Ala Lys Glu Val Leu Lys Glu Lys Asp Gln Gln Leu Ala

85 90 95

Asp Arg His Arg Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Gln

100 105 110

Asp Leu Ile Trp Ala Asp Tyr Gly Pro His Tyr Val Lys Val Arg Lys

115 120 125

Val Cys Thr Leu Glu Leu Phe Ser Pro Lys Arg Leu Glu Ala Leu Arg

130 135 140

Pro Ile Arg Glu Asp Glu Val Thr Ala Met Val Glu Ser Ile Tyr His

145 150 155 160

Asp Cys Thr Ala Pro Asp Asn Ala Gly Lys Ser Leu Leu Val Lys Lys

165 170 175

Tyr Leu Gly Ala Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly

180 185 190

Lys Arg Phe Val Asn Ser Glu Gly Ile Ile Asp Lys Gln Gly Leu Glu

195 200 205

Phe Lys Ala Ile Val Ser Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala

210 215 220

Met Ala Glu His Ile Pro Ser Leu Arg Trp Met Phe Pro Leu Asp Glu

225 230 235 240

Asp Ala Phe Ala Lys His Gly Ala Arg Arg Asp Gln Leu Thr Arg Glu

245 250 255

Ile Met Glu Glu His Thr Arg Ala Arg Glu Glu Ser Gly Gly Ala Lys

260 265 270

Gln His Phe Phe Asp Ala Leu Leu Thr Leu Lys Asp Lys Tyr Asp Leu

275 280 285

Ser Glu Asp Thr Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly

290 295 300

Met Asp Thr Thr Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile

305 310 315 320

Lys Asn Pro Arg Val Gln Gln Lys Ala Gln Glu Glu Leu Asp Arg Val

325 330 335

Ile Gly Tyr Glu Arg Val Met Thr Glu Leu Asp Phe Ser Asn Leu Pro

340 345 350

Tyr Leu Gln Cys Val Ala Lys Glu Ala Leu Arg Leu His Pro Pro Thr

355 360 365

Pro Leu Met Leu Pro His Arg Ser Asn Ser Asn Val Lys Ile Gly Gly

370 375 380

Tyr Asp Ile Pro Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val

385 390 395 400

Ala Arg Asp Pro Ala Val Trp Lys Asn Pro Cys Glu Phe Arg Pro Glu

405 410 415

Arg Phe Leu Glu Glu Asp Val Asp Met Lys Gly His Asp Phe Arg Leu

420 425 430

Leu Pro Phe Gly Ala Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly

435 440 445

Ile Asn Leu Val Thr Ser Met Ile Gly His Leu Leu His His Phe Asn

450 455 460

Trp Ala Pro Pro Ser Gly Val Ser Ser Asp Glu Leu Asp Met Gly Glu

465 470 475 480

Asn Pro Gly Leu Val Thr Tyr Met Arg Thr Pro Leu Glu Ala Val Pro

485 490 495

Thr Pro Arg Leu Pro Ser Asp Leu Tyr Lys Arg Ile Ala Val Asp Leu

500 505 510

<210>46

<211>521

<212>PRT

<213>Cultivated soybean (Glycine max)

<400>46

Met Leu Leu Glu Leu Ala Leu Gly Leu Phe Val Leu Ala Leu Phe Leu

1 5 10 15

His Leu Arg Pro Thr Pro Ser Ala Lys Ser Lys Ala Leu Arg His Leu

20 25 30

Pro Asn Pro Pro Ser Pro Lys Pro Arg Leu Pro Phe Ile Gly His Leu

35 40 45

His Leu Leu Lys Asp Lys Leu Leu His Tyr Ala Leu Ile Asp Leu Ser

50 55 60

Lys Lys His Gly Pro Leu Phe Ser Leu Ser Phe Gly Ser Met Pro Thr

65 70 75 80

Val Val Ala Ser Thr Pro Glu Leu Phe Lys Leu Phe Leu Gln Thr His

85 90 95

Glu Ala Thr Ser Phe Asn Thr Arg Phe Gln Thr Ser Ala Ile Arg Arg

100 105 110

Leu Thr Tyr Asp Asn Ser Val Ala Met Val Pro Phe Gly Pro Tyr Trp

115 120 125

Lys Phe Val Arg Lys Leu Ile Met Asn Asp Leu Leu Asn Ala Thr Thr

130 135 140

Val Asn Lys Leu Arg Pro Leu Arg Thr Gln Gln Ile Arg Lys Phe Leu

145 150 155 160

Arg Val Met Ala Gln Ser Ala Glu Ala Gln Lys Pro Leu Asp Val Thr

165 170 175

Glu Glu Leu Leu Lys Trp Thr Asn Ser Thr Ile Ser Met Met Met Leu

180 185 190

Gly Glu Ala Glu Glu Ile Arg Asp Ile Ala Arg Glu Val Leu Lys Ile

195 200 205

Phe Gly Glu Tyr Ser Leu Thr Asp Phe Ile Trp Pro Leu Lys Tyr Leu

210 215 220

Lys Val Gly Lys Tyr Glu Lys Arg Ile Asp Asp Ile Leu Asn Lys Phe

225 230 235 240

Asp Pro Val Val Glu Arg Val Ile Lys Lys Arg Arg Glu Ile Val Arg

245 250 255

Arg Arg Lys Asn Gly Glu Val Val Glu Gly Glu Ala Ser Gly Val Phe

260 265 270

Leu Asp Thr Leu Leu Glu Phe Ala Glu Asp Glu Thr Met Glu Ile Lys

275 280 285

Ile Thr Lys Glu Gln Ile Lys Gly Leu Val Val Asp Phe Phe Ser Ala

290 295 300

Gly Thr Asp Ser Thr Ala Val Ala Thr Glu Trp Ala Leu Ala Glu Leu

305 310 315 320

Ile Asn Asn Pro Arg Val Leu Gln Lys Ala Arg Glu Glu Val Tyr Ser

325 330 335

Val Val Gly Lys Asp Arg Leu Val Asp Glu Val Asp Thr Gln Asn Leu

340 345 350

Pro Tyr Ile Arg Ala Ile Val Lys Glu Thr Phe Arg Met His Pro Pro

355 360 365

Leu Pro Val Val Lys Arg Lys Cys Thr Glu Glu Cys Glu Ile Asn Gly

370 375 380

Tyr Val Ile Pro Glu Gly Ala Leu Val Leu Phe Asn Val Trp Gln Val

385 390 395 400

Gly Arg Asp Pro Lys Tyr Trp Asp Arg Pro Ser Glu Phe Arg Pro Glu

405 410 415

Arg Phe Leu Glu Thr Gly Ala Glu Gly Glu Ala Gly Pro Leu Asp Leu

420 425 430

Arg Gly Gln His Phe Gln Leu Leu Pro Phe Gly Ser Gly Arg Arg Met

435 440 445

Cys Pro Gly Val Asn Leu Ala Thr Ser Gly Met Ala Thr Leu Leu Ala

450 455 460

Ser Leu Ile Gln Cys Phe Asp Leu Gln Val Leu Gly Pro Gln Gly Gln

465 470 475 480

Ile Leu Lys Gly Asp Asp Ala Lys Val Ser Met Glu Glu Arg Ala Gly

485 490 495

Leu Thr Val Pro Arg Ala His Ser Leu Val Cys Val Pro Leu Ala Arg

500 505 510

Ile Gly Val Ala Ser Lys Leu Leu Ser

515 520

<210>47

<211>456

<212>PRT

<213>Streptomyces avermitilis

<400>47

Met Met Ser Gln Ser Thr Ser Ser Ile Pro Glu Ala Pro Gly Ala Trp

1 5 10 15

Pro Val Val Gly His Val Pro Pro Leu Met Arg Gln Pro Leu Glu Phe

20 25 30

Leu Arg Ser Ala Ala Asp His Gly Asp Leu Leu Lys Leu Arg Leu Gly

35 40 45

Pro Lys Thr Ala Tyr Leu Ala Thr His Pro Asp Leu Val Arg Thr Met

50 55 60

Leu Val Ser Ser Gly Ser Gly Asp Phe Thr Arg Ser Lys Gly Ala Gln

65 70 75 80

Gly Ala Ser Arg Phe Ile Gly Pro Ile Leu Val Ala Val Ser Gly Glu

85 90 95

Thr His Arg Arg Gln Arg Arg Arg Met Gln Pro Gly Phe His Arg Gln

100 105 110

Arg Leu Glu Ser Tyr Val Ala Thr Met Ala Ala Ala Ala Gln Glu Thr

115 120 125

Ala Asp Ser Trp Ser Ala Gly Gln Val Val Asp Val Glu Gln Ala Ala

130 135 140

Cys Asp Leu Ser Leu Ala Met Ile Thr Lys Thr Leu Phe Phe Ser Asp

145 150 155 160

Leu Gly Ala Lys Ala Glu Ala Ala Leu Arg Lys Thr Gly His Asp Ile

165 170 175

Leu Lys Val Ala Arg Leu Ser Ala Leu Ala Pro Thr Leu Tyr Glu Val

180 185 190

Leu Pro Thr Ala Gly Lys Arg Ser Val Gly Arg Thr Ser Ala Thr Ile

195 200 205

Arg Glu Ala Ile Thr Ala Tyr Arg Ala Asp Gly Arg Asp His Gly Asp

210 215 220

Leu Leu Ser Thr Met Leu Arg Ala Thr Asp Ala Glu Gly Ala Ser Met

225 230 235 240

Thr Asp Gln Glu Val His Asp Glu Val Met Gly Ile Ala Val Ala Gly

245 250 255

Ile Gly Gly Pro Ala Ala Ile Thr Ala Trp Ile Phe His Glu Leu Gly

260 265 270

Gln Asn Ala Glu Ile Glu Ser Arg Leu His Ala Glu Leu Asp Thr Val

275 280 285

Leu Gly Gly Arg Leu Pro Thr His Glu Asp Leu Pro Arg Leu Pro Tyr

290 295 300

Thr Gln Asn Leu Val Lys Glu Ala Leu Arg Lys Tyr Pro Gly Trp Val

305 310 315 320

Gly Ser Arg Arg Thr Val Arg Pro Val Arg Leu Gly Gly His Asp Leu

325 330 335

Pro Ala Asp Val Glu Val Met Tyr Ser Ala Tyr Ala Ile Gln Arg Asp

340 345 350

Pro Arg Trp Tyr Pro Glu Pro Glu Arg Leu Asp Pro Gly Arg Trp Glu

355 360 365

Thr Lys Gly Ser Ser Arg Gly Val Pro Lys Gly Ala Trp Val Pro Phe

370 375 380

Ala Leu Gly Thr Tyr Lys Cys Ile Gly Asp Asn Phe Ala Leu Leu Glu

385 390 395 400

Thr Ala Val Thr Val Ala Val Val Ala Ser His Trp Arg Leu His Ala

405 410 415

Leu Pro Gly Asp Glu Val Arg Pro Lys Thr Lys Ala Thr His Val Phe

420 425 430

Pro Asn Arg Leu Arg Met Ile Ala Glu Pro Arg Ser Val Val Arg Leu

435 440 445

Glu Glu Pro Ala Ala Met Gly Ala

450 455

<210>48

<211>401

<212>PRT

<213>Streptomyces bikiniensis

<400>48

Met Gly Leu Pro Leu Thr Ser Thr Lys Thr Ala Pro Val Ser Tyr Pro

1 5 10 15

Phe Gly Arg Pro Glu Gly Leu Asp Leu Asp Glu Ala Tyr Glu Gln Ala

20 25 30

Arg Lys Ser Glu Gly Leu Leu Trp Val His Met Pro Tyr Gly Glu Pro

35 40 45

Gly Trp Leu Val Ser Arg Tyr Asp Asp Ala Arg Phe Val Leu Gly Asp

50 55 60

Arg Arg Phe Ser His Ala Ala Glu Ala Glu Asn Asp Ala Pro Arg Met

65 70 75 80

Arg Glu Leu Arg Thr Pro Asn Gly Ile Ile Gly Met Asp Ala Pro Asp

85 90 95

His Thr Arg Leu Arg Gly Leu Val Thr Lys Ala Phe Thr Pro Arg Arg

100 105 110

Val Glu Ala Met Arg Pro His Val Arg Arg Met Thr Ala Ser Leu Leu

115 120 125

Arg Asp Met Thr Ala Leu Gly Ser Pro Val Asp Leu Val Asp His Tyr

130 135 140

Ala Val Pro Leu Pro Val Ala Val Ile Cys Gly Leu Leu Gly Val Pro

145 150 155 160

Glu Glu Asp Arg Asp Leu Phe Arg Gly Trp Cys Glu Ile Ala Met Ser

165 170 175

Thr Ser Ser Leu Thr Ala Glu Asp His Val Arg Leu Ala Gly Glu Leu

180 185 190

Thr Gly Tyr Leu Ala Asp Leu Ile Thr Ala Arg Arg Ala Ala Pro Arg

195 200 205

Asp Asp Leu Val Ser Ala Leu Val Glu Ala Arg Asp Ala Gln Gly Arg

210 215 220

Leu Ser Gln Glu Glu Leu Val Asp Leu Ile Val Phe Leu Leu Phe Ala

225 230 235 240

Gly His Glu Thr Thr Ala Ser Gln Ile Ser Asn Phe Val Leu Val Leu

245 250 255

Leu Glu Gln Pro Asp Gln Leu Ala Leu Leu Arg Asp Arg Pro Asp Leu

260 265 270

Leu Asp Asn Ala Val Glu Glu Leu Thr Arg Phe Val Pro Leu Gly Ser

275 280 285

Gln Ala Gly Phe Pro Arg Tyr Ala Thr Glu Asp Val Glu Val Gly Gly

290 295 300

Thr Leu Val Arg Ala Gly Asp Pro Val Leu Val Gln Met Asn Ala Ala

305 310 315 320

Asn Arg Asp Ala Leu Arg Phe Arg Ser Pro Gly Val Leu Asp Ile Thr

325 330 335

Arg Asp Asp Ala Gly Arg His Leu Gly Tyr Gly His Gly Pro His His

340 345 350

Cys Leu Gly Ala Ser Leu Ala Arg Leu Glu Leu Gln Glu Ala Leu Arg

355 360 365

Thr Leu Leu Asp Glu Leu Pro Gly Leu His Leu Ala Gln Pro Val Glu

370 375 380

Trp Lys Thr Glu Met Val Val Arg Gly Pro Arg Thr Met Leu Val Gly

385 390 395 400

Trp

<210> 49

<211> 1497

<212> DNA

<213> Artemisia annua

<400> 49

catatgaagt ctattctgaa agcaatggct ctgtctctga ccactagcat cgccctggcg 60

actatcctgc tgtttgtgta caaattcgcg acccgttcta aaagcactaa gaaatctctg 120

ccggaaccgt ggcgtctgcc aatcatcggt cacatgcacc acctgatcgg caccaccccg 180

caccgtggcg tacgcgacct ggcgcgtaag tacggctctc tgatgcatct gcagctgggc 240

gaggtaccta ctatcgtcgt ttcctccccg aagtgggcca aagaaatcct gactacctat 300

gacatcactt tcgccaaccg cccggaaacg ctgaccggcg aaattgtcct gtaccataac 360

acggatgtgg ttctggcccc gtacggtgag tactggcgcc agctgcgcaa aatttgtact 420

ctggaactgc tgagcgttaa aaaggttaaa tccttccaga gcctgcgtga agaggaatgc 480

tggaacctgg tgcaggagat taaagcgtct ggcagcggtc gtccagttaa cctgtctgag 540

aatgttttta aactgatcgc tactatcctg tctcgcgcgg cattcggtaa aggtatcaaa 600

gatcagaaag aactgaccga aatcgttaag gaaatcctgc gccagactgg tggcttcgac 660

gttgcggaca tcttcccgtc caaaaagttc ctgcaccatc tgtctggcaa acgcgctcgt 720

ctgacctccc tgcgtaagaa aattgataac ctgattgaca acctggtcgc tgagcacact 780

gtgaacacct cttctaaaac caacgaaacc ctgctggacg tactgctgcg cctgaaggac 840

tctgccgaat ttccactgac tagcgacaat atcaaagcaa tcatcctgga catgttcggc 900

gccggtaccg atacgtcctc ttccacgatt gagtgggcta tttccgaact gatcaaatgc 960

ccgaaggcga tggaaaaagt gcaggcggaa ctgcgtaaag cgctgaacgg taaagagaaa 1020

attcatgaag aggacatcca ggaactgtcc tacctgaata tggtaatcaa agaaactctg 1080

cgtctgcatc cgccgctgcc actggttctg ccgcgtgaat gccgtcagcc ggttaacctg 1140

gccggctaca acattccgaa caaaacgaag ctgatcgtca acgttttcgc gatcaaccgc 1200

gatcctgaat actggaaaga cgcggaagcg ttcattccgg aacgctttga gaactcctct 1260

gccaccgtta tgggcgctga atacgagtac ctgccgttcg gtgcgggtcg ccgtatgtgc 1320

ccgggtgctg cactgggcct ggcgaacgtt caactgccac tggcgaacat cctgtaccac 1380

ttcaactgga aactgcctaa cggcgtatct tatgatcaaa tcgacatgac cgaaagctcc 1440

ggcgcgacca tgcagcgtaa aaccgaactg ctgctggttc cgtcctttta acctagg 1497

<210> 50

<211> 1497

<212> DNA

<213> Artificial sequence

<220>

<223> Modified amorphadiene oxidase nucleic acid

<400> 50

<210> 51

<211> 1488

<212> DNA

<213> Artificial sequence

<220>

<400> 51

atgaagtcta ttctgaaagc aatggctctg tctctgacca ctagcatcgc cctggcgact 60

atcctgctgt ttgtgtacaa attcgcgacc cgttctaaaa gcactaagaa atctctgccg 120

gaaccgtggc gtctgccaat catcggtcac atgcaccacc tgatcggcac caccccgcac 180

cgtggcgtac gcgacctggc gcgtaagtac ggctctctga tgcatctgca gctgggcgag 240

gtacctacta tcgtcgtttc ctccccgaag tgggccaaag aaatcctgac tacctatgac 300

atcactttcg ccaaccgccc ggaaacgctg accggcgaaa ttgtcctgta ccataacacg 360

gatgtggttc tggccccgta cggtgagtac tggcgccagc tgcgcaaaat ttgtactctg 420

gaactgctga gcgttaaaaa ggttaaatcc ttccagagcc tgcgtgaaga ggaatgctgg 480

aacctggtgc aggagattaa agcgtctggc agcggtcgtc cagttaacct gtctgagaat 540

gtttttaaac tgatcgctac tatcctgtct cgcgcggcat tcggtaaagg tatcaaagat 600

cagaaagaac tgaccgaaat cgttaaggaa atcctgcgcc agactggtgg cttcgacgtt 660

gcggacatct tcccgtccaa aaagttcctg caccatctgt ctggcaaacg cgctcgtctg 720

acctccctgc gtaagaaaat tgataacctg attgacaacc tggtcgctga gcacactgtg 780

aacacctctt ctaaaaccaa cgaaaccctg ctggacgtac tgctgcgcct gaaggactct 840

gccgaatttc cactgactag cgacaatatc aaagcaatca tcctggacat gttcggcgcc 900

ggtaccgata cgtcctcttc cacgattgag tgggctattt ccgaactgat caaatgcccg 960

aaggcgatgg aaaaagtgca ggcggaactg cgtaaagcgc tgaacggtaa agagaaaatt 1020

catgaagagg acatccagga actgtcctac ctgaatatgg taatcaaaga aactctgcgt 1080

ctgcatccgc cgctgccact ggttctgccg cgtgaatgcc gtcagccggt taacctggcc 1140

ggctacaaca ttccgaacaa aacgaagctg atcgtcaacg ttttcgcgat caaccgcgat 1200

cctgaatact ggaaagacgc ggaagcgttc attccggaac gctttgagaa ctcctctgcc 1260

accgttatgg gcgctgaata cgagtacctg ccgttcggtg cgggtcgccg tatgtgcccg 1320

ggtgctgcac tgggcctggc gaacgttcaa ctgccactgg cgaacatcct gtaccacttc 1380

aactggaaac tgcctaacgg cgtatcttat gatcaaatcg acatgaccga aagctccggc 1440

gcgaccatgc agcgtaaaac cgaactgctg ctggttccgt ccttttaa 1488

<210> 52

<211> 495

<212> PRT

<213> Artificial sequence

<220>

<223> Modified amorphadiene oxidase

<400> 52

Met Lys Ser Ile Leu Lys Ala Met Ala Leu Ser Leu Thr Thr Ser Ile

1 5 10 15

Ala Leu Ala Thr Ile Leu Leu Phe Val Tyr Lys Phe Ala Thr Arg Ser

20 25 30

Lys Ser Thr Lys Lys Ser Leu Pro Glu Pro Trp Arg Leu Pro Ile Ile

35 40 45

Gly His Met His His Leu Ile Gly Thr Thr Pro His Arg Gly Val Arg

50 55 60

Asp Leu Ala Arg Lys Tyr Gly Ser Leu Met His Leu Gln Leu Gly Glu

65 70 75 80

Val Pro Thr Ile Val Val Ser Ser Pro Lys Trp Ala Lys Glu Ile Leu

85 90 95

Thr Thr Tyr Asp Ile Thr Phe Ala Asn Arg Pro Glu Thr Leu Thr Gly

100 105 110

Glu Ile Val Leu Tyr His Asn Thr Asp Val Val Leu Ala Pro Tyr Gly

115 120 125

Glu Tyr Trp Arg Gln Leu Arg Lys Ile Cys Thr Leu Glu Leu Leu Ser

130 135 140

Val Lys Lys Val Lys Ser Phe Gln Ser Leu Arg Glu Glu Glu Cys Trp

145 150 155 160

Asn Leu Val Gln Glu Ile Lys Ala Ser Gly Ser Gly Arg Pro Val Asn

165 170 175

Leu Ser Glu Asn Val Phe Lys Leu Ile Ala Thr Ile Leu Ser Arg Ala

180 185 190

Ala Phe Gly Lys Gly Ile Lys Asp Gln Lys Glu Leu Thr Glu Ile Val

195 200 205

Lys Glu Ile Leu Arg Gln Thr Gly Gly Phe Asp Val Ala Asp Ile Phe

210 215 220

Pro Ser Lys Lys Phe Leu His His Leu Ser Gly Lys Arg Ala Arg Leu

225 230 235 240

Thr Ser Leu Arg Lys Lys Ile Asp Asn Leu Ile Asp Asn Leu Val Ala

245 250 255

Glu His Thr Val Asn Thr Ser Ser Lys Thr Asn Glu Thr Leu Leu Asp

260 265 270

Val Leu Leu Arg Leu Lys Asp Ser Ala Glu Phe Pro Leu Thr Ser Asp

275 280 285

Asn Ile Lys Ala Ile Ile Leu Asp Met Phe Gly Ala Gly Thr Asp Thr

290 295 300

Ser Ser Ser Thr Ile Glu Trp Ala Ile Ser Glu Leu Ile Lys Cys Pro

305 310 315 320

Lys Ala Met Glu Lys Val Gln Ala Glu Leu Arg Lys Ala Leu Asn Gly

325 330 335

Lys Glu Lys Ile His Glu Glu Asp Ile Gln Glu Leu Ser Tyr Leu Asn

340 345 350

Met Val Ile Lys Glu Thr Leu Arg Leu His Pro Pro Leu Pro Leu Val

355 360 365

Leu Pro Arg Glu Cys Arg Gln Pro Val Asn Leu Ala Gly Tyr Asn Ile

370 375 380

Pro Asn Lys Thr Lys Leu Ile Val Asn Val Phe Ala Ile Asn Arg Asp

385 390 395 400

Pro Glu Tyr Trp Lys Asp Ala Glu Ala Phe Ile Pro Glu Arg Phe Glu

405 410 415

Asn Ser Ser Ala Thr Val Met Gly Ala Glu Tyr Glu Tyr Leu Pro Phe

420 425 430

Gly Ala Gly Arg Arg Met Cys Pro Gly Ala Ala Leu Gly Leu Ala Asn

435 440 445

Val Gln Leu Pro Leu Ala Asn Ile Leu Tyr His Phe Asn Trp Lys Leu

450 455 460

Pro Asn Gly Val Ser Tyr Asp Gln Ile Asp Met Thr Glu Ser Ser Gly

465 470 475 480

Ala Thr Met Gln Arg Lys Thr Glu Leu Leu Leu Val Pro Ser Phe

485 490 495

<210> 53

<211> 3018

<212> DNA

<213> Artificial sequence

<220>

<400> 53

catatgaccg tacacgacat catcgcaacg tacttcacta aatggtacgt aattgtgccg 60

ctggcactga ttgcgtatcg cgtgctggat tatttctacg cgacccgttc taaaagcact 120

aagaaatctc tgccggaacc gtggcgtctg ccaatcatcg gtcacatgca ccacctgatc 180

ggcaccaccc cgcaccgtgg cgtacgcgac ctggcgcgta agtacggctc tctgatgcat 240

ctgcagctgg gcgaggtacc tactatcgtc gtttcctccc cgaagtgggc caaagaaatc 300

ctgactacct atgacatcac tttcgccaac cgcccggaaa cgctgaccgg cgaaattgtc 360

ctgtaccata acacggatgt ggttctggcc ccgtacggtg agtactggcg ccagctgcgc 420

aaaatttgta ctctggaact gctgagcgtt aaaaaggtta aatccttcca gagcctgcgt 480

gaagaggaat gctggaacct ggtgcaggag attaaagcgt ctggcagcgg tcgtccagtt 540

aacctgtctg agaatgtttt taaactgatc gctactatcc tgtctcgcgc ggcattcggt 600

aaaggtatca aagatcagaa agaactgacc gaaatcgtta aggaaatcct gcgccagact 660

ggtggcttcg acgttgcgga catcttcccg tccaaaaagt tcctgcacca tctgtctggc 720

aaacgcgctc gtctgacctc cctgcgtaag aaaattgata acctgattga caacctggtc 780

gctgagcaca ctgtgaacac ctcttctaaa accaacgaaa ccctgctgga cgtactgctg 840

cgcctgaagg actctgccga atttccactg actagcgaca atatcaaagc aatcatcctg 900

gacatgttcg gcgccggtac cgatacgtcc tcttccacga ttgagtgggc tatttccgaa 960

ctgatcaaat gcccgaaggc gatggaaaaa gtgcaggcgg aactgcgtaa agcgctgaac 1020

ggtaaagaga aaattcatga agaggacatc caggaactgt cctacctgaa tatggtaatc 1080

aaagaaactc tgcgtctgca tccgccgctg ccactggttc tgccgcgtga atgccgtcag 1140

ccggttaacc tggccggcta caacattccg aacaaaacga agctgatcgt caacgttttc 1200

gcgatcaacc gcgatcctga atactggaaa gacgcggaag cgttcattcc ggaacgcttt 1260

gagaactcct ctgccaccgt tatgggcgct gaatacgagt acctgccgtt cggtgcgggt 1320

cgccgtatgt gcccgggtgc tgcactgggc ctggcgaacg ttcaactgcc actggcgaac 1380

atcctgtacc acttcaactg gaaactgcct aacggcgtat cttatgatca aatcgacatg 1440

accgaaagct ccggcgcgac catgcagcgt aaaaccgaac tgctgctggt tccgtccttt 1500

taacctaggc atatgaccgt acacgacatc atcgcaacgt acttcactaa atggtacgta 1560

attgtgccgc tggcactgat tgcgtatcgc gtgctggatt atttctacgc gacccgttct 1620

aaaagcacta agaaatctct gccggaaccg tggcgtctgc caatcatcgg tcacatgcac 1680

cacctgatcg gcaccacccc gcaccgtggc gtacgcgacc tggcgcgtaa gtacggctct 1740

ctgatgcatc tgcagctggg cgaggtacct actatcgtcg tttcctcccc gaagtgggcc 1800

aaagaaatcc tgactaccta tgacatcact ttcgccaacc gcccggaaac gctgaccggc 1860

gaaattgtcc tgtaccataa cacggatgtg gttctggccc cgtacggtga gtactggcgc 1920

cagctgcgca aaatttgtac tctggaactg ctgagcgtta aaaaggttaa atccttccag 1980

agcctgcgtg aagaggaatg ctggaacctg gtgcaggaga ttaaagcgtc tggcagcggt 2040

cgtccagtta acctgtctga gaatgttttt aaactgatcg ctactatcct gtctcgcgcg 2100

gcattcggta aaggtatcaa agatcagaaa gaactgaccg aaatcgttaa ggaaatcctg 2160

cgccagactg gtggcttcga cgttgcggac atcttcccgt ccaaaaagtt cctgcaccat 2220

ctgtctggca aacgcgctcg tctgacctcc ctgcgtaaga aaattgataa cctgattgac 2280

aacctggtcg ctgagcacac tgtgaacacc tcttctaaaa ccaacgaaac cctgctggac 2340

gtactgctgc gcctgaagga ctctgccgaa tttccactga ctagcgacaa tatcaaagca 2400

atcatcctgg acatgttcgg cgccggtacc gatacgtcct cttccacgat tgagtgggct 2460

atttccgaac tgatcaaatg cccgaaggcg atggaaaaag tgcaggcgga actgcgtaaa 2520

gcgctgaacg gtaaagagaa aattcatgaa gaggacatcc aggaactgtc ctacctgaat 2580

atggtaatca aagaaactct gcgtctgcat ccgccgctgc cactggttct gccgcgtgaa 2640

tgccgtcagc cggttaacct ggccggctac aacattccga acaaaacgaa gctgatcgtc 2700

aacgttttcg cgatcaaccg cgatcctgaa tactggaaag acgcggaagc gttcattccg 2760

gaacgctttg agaactcctc tgccaccgtt atgggcgctg aatacgagta cctgccgttc 2820

ggtgcgggtc gccgtatgtg cccgggtgct gcactgggcc tggcgaacgt tcaactgcca 2880

ctggcgaaca tcctgtacca cttcaactgg aaactgccta acggcgtatc ttatgatcaa 2940

atcgacatga ccgaaagctc cggcgcgacc atgcagcgta aaaccgaact gctgctggtt 3000

ccgtcctttt aacctagg 3018

<210> 54

<211> 1500

<212> DNA

<213> Artificial sequence

<220>

<400> 54

atgaccgtac acgacatcat cgcaacgtac ttcactaaat ggtacgtaat tgtgccgctg 60

gcactgattg cgtatcgcgt gctggattat ttctacgcga cccgttctaa aagcactaag 120

aaatctctgc cggaaccgtg gcgtctgcca atcatcggtc acatgcacca cctgatcggc 180

accaccccgc accgtggcgt acgcgacctg gcgcgtaagt acggctctct gatgcatctg 240

cagctgggcg aggtacctac tatcgtcgtt tcctccccga agtgggccaa agaaatcctg 300

actacctatg acatcacttt cgccaaccgc ccggaaacgc tgaccggcga aattgtcctg 360

taccataaca cggatgtggt tctggccccg tacggtgagt actggcgcca gctgcgcaaa 420

atttgtactc tggaactgct gagcgttaaa aaggttaaat ccttccagag cctgcgtgaa 480

gaggaatgct ggaacctggt gcaggagatt aaagcgtctg gcagcggtcg tccagttaac 540

ctgtctgaga atgtttttaa actgatcgct actatcctgt ctcgcgcggc attcggtaaa 600

ggtatcaaag atcagaaaga actgaccgaa atcgttaagg aaatcctgcg ccagactggt 660

ggcttcgacg ttgcggacat cttcccgtcc aaaaagttcc tgcaccatct gtctggcaaa 720

cgcgctcgtc tgacctccct gcgtaagaaa attgataacc tgattgacaa cctggtcgct 780

gagcacactg tgaacacctc ttctaaaacc aacgaaaccc tgctggacgt actgctgcgc 840

ctgaaggact ctgccgaatt tccactgact agcgacaata tcaaagcaat catcctggac 900

atgttcggcg ccggtaccga tacgtcctct tccacgattg agtgggctat ttccgaactg 960

atcaaatgcc cgaaggcgat ggaaaaagtg caggcggaac tgcgtaaagc gctgaacggt 1020

aaagagaaaa ttcatgaaga ggacatccag gaactgtcct acctgaatat ggtaatcaaa 1080

gaaactctgc gtctgcatcc gccgctgcca ctggttctgc cgcgtgaatg ccgtcagccg 1140

gttaacctgg ccggctacaa cattccgaac aaaacgaagc tgatcgtcaa cgttttcgcg 1200

atcaaccgcg atcctgaata ctggaaagac gcggaagcgt tcattccgga acgctttgag 1260

aactcctctg ccaccgttat gggcgctgaa tacgagtacc tgccgttcgg tgcgggtcgc 1320

cgtatgtgcc cgggtgctgc actgggcctg gcgaacgttc aactgccact ggcgaacatc 1380

ctgtaccact tcaactggaa actgcctaac ggcgtatctt atgatcaaat cgacatgacc 1440

gaaagctccg gcgcgaccat gcagcgtaaa accgaactgc tgctggttcc gtccttttaa 1500

<210> 55

<211> 499

<212> PRT

<213> Artificial sequence

<220>

<223> Modified amorphadiene oxidase

<400> 55

1 5 10 15

20 25 30

Ala Thr Arg Ser Lys Ser Thr Lys Lys Ser Leu Pro Glu Pro Trp Arg

35 40 45

Leu Pro Ile Ile Gly His Met His His Leu Ile Gly Thr Thr Pro His

50 55 60

Arg Gly Val Arg Asp Leu Ala Arg Lys Tyr Gly Ser Leu Met His Leu

65 70 75 80

Gln Leu Gly Glu Val Pro Thr Ile Val Val Ser Ser Pro Lys Trp Ala

85 90 95

Lys Glu Ile Leu Thr Thr Tyr Asp Ile Thr Phe Ala Asn Arg Pro Glu

100 105 110

Thr Leu Thr Gly Glu Ile Val Leu Tyr His Asn Thr Asp Val Val Leu

115 120 125

Ala Pro Tyr Gly Glu Tyr Trp Arg Gln Leu Arg Lys Ile Cys Thr Leu

130 135 140

Glu Leu Leu Ser Val Lys Lys Val Lys Ser Phe Gln Ser Leu Arg Glu

145 150 155 160

Glu Glu Cys Trp Asn Leu Val Gln Glu Ile Lys Ala Ser Gly Ser Gly

165 170 175

Arg Pro Val Asn Leu Ser Glu Asn Val Phe Lys Leu Ile Ala Thr Ile

180 185 190

Leu Ser Arg Ala Ala Phe Gly Lys Gly Ile Lys Asp Gln Lys Glu Leu

195 200 205

Thr Glu Ile Val Lys Glu Ile Leu Arg Gln Thr Gly Gly Phe Asp Val

210 215 220

Ala Asp Ile Phe Pro Ser Lys Lys Phe Leu His His Leu Ser Gly Lys

225 230 235 240

Arg Ala Arg Leu Thr Ser Leu Arg Lys Lys Ile Asp Asn Leu Ile Asp

245 250 255

Asn Leu Val Ala Glu His Thr Val Asn Thr Ser Ser Lys Thr Asn Glu

260 265 270

Thr Leu Leu Asp Val Leu Leu Arg Leu Lys Asp Ser Ala Glu Phe Pro

275 280 285

Leu Thr Ser Asp Asn Ile Lys Ala Ile Ile Leu Asp Met Phe Gly Ala

290 295 300

Gly Thr Asp Thr Ser Ser Ser Thr Ile Glu Trp Ala Ile Ser Glu Leu

305 310 315 320

Ile Lys Cys Pro Lys Ala Met Glu Lys Val Gln Ala Glu Leu Arg Lys

325 330 335

Ala Leu Asn Gly Lys Glu Lys Ile His Glu Glu Asp Ile Gln Glu Leu

340 345 350

Ser Tyr Leu Asn Met Val Ile Lys Glu Thr Leu Arg Leu His Pro Pro

355 360 365

Leu Pro Leu Val Leu Pro Arg Glu Cys Arg Gln Pro Val Asn Leu Ala

370 375 380

Gly Tyr Asn Ile Pro Asn Lys Thr Lys Leu Ile Val Asn Val Phe Ala

385 390 395 400

Ile Asn Arg Asp Pro Glu Tyr Trp Lys Asp Ala Glu Ala Phe Ile Pro

405 410 415

Glu Arg Phe Glu Asn Ser Ser Ala Thr Val Met Gly Ala Glu Tyr Glu

420 425 430

Tyr Leu Pro Phe Gly Ala Gly Arg Arg Met Cys Pro Gly Ala Ala Leu

435 440 445

Gly Leu Ala Asn Val Gln Leu Pro Leu Ala Asn Ile Leu Tyr His Phe

450 455 460

Asn Trp Lys Leu Pro Asn Gly Val Ser Tyr Asp Gln Ile Asp Met Thr

465 470 475 480

Glu Ser Ser Gly Ala Thr Met Gln Arg Lys Thr Glu Leu Leu Leu Val

485 490 495

Pro Ser Phe

<210> 56

<211> 2988

<212> DNA

<213> Artificial sequence

<220>

<223> Modified amorphadiene oxidase

<400> 56

catatgatcg aacaactgct ggaatactgg tacgtggttg tgcctgttct gtatattatc 60

aaacagctgc tggcgtacac taaagcgacc cgttctaaaa gcactaagaa atctctgccg 120

gcgaccatgc agcgtaaaac cgaactgctg ctggttccgt ccttttaacc taggcatatg 1500

atcgaacaac tgctggaata ctggtacgtg gttgtgcctg ttctgtatat tatcaaacag 1560

ctgctggcgt acactaaagc gacccgttct aaaagcacta agaaatctct gccggaaccg 1620

tggcgtctgc caatcatcgg tcacatgcac cacctgatcg gcaccacccc gcaccgtggc 1680

gtacgcgacc tggcgcgtaa gtacggctct ctgatgcatc tgcagctggg cgaggtacct 1740

actatcgtcg tttcctcccc gaagtgggcc aaagaaatcc tgactaccta tgacatcact 1800

ttcgccaacc gcccggaaac gctgaccggc gaaattgtcc tgtaccataa cacggatgtg 1860

gttctggccc cgtacggtga gtactggcgc cagctgcgca aaatttgtac tctggaactg 1920

ctgagcgtta aaaaggttaa atccttccag agcctgcgtg aagaggaatg ctggaacctg 1980

gtgcaggaga ttaaagcgtc tggcagcggt cgtccagtta acctgtctga gaatgttttt 2040

aaactgatcg ctactatcct gtctcgcgcg gcattcggta aaggtatcaa agatcagaaa 2100

gaactgaccg aaatcgttaa ggaaatcctg cgccagactg gtggcttcga cgttgcggac 2160

atcttcccgt ccaaaaagtt cctgcaccat ctgtctggca aacgcgctcg tctgacctcc 2220

ctgcgtaaga aaattgataa cctgattgac aacctggtcg ctgagcacac tgtgaacacc 2280

tcttctaaaa ccaacgaaac cctgctggac gtactgctgc gcctgaagga ctctgccgaa 2340

tttccactga ctagcgacaa tatcaaagca atcatcctgg acatgttcgg cgccggtacc 2400

gatacgtcct cttccacgat tgagtgggct atttccgaac tgatcaaatg cccgaaggcg 2460

atggaaaaag tgcaggcgga actgcgtaaa gcgctgaacg gtaaagagaa aattcatgaa 2520

gaggacatcc aggaactgtc ctacctgaat atggtaatca aagaaactct gcgtctgcat 2580

ccgccgctgc cactggttct gccgcgtgaa tgccgtcagc cggttaacct ggccggctac 2640

aacattccga acaaaacgaa gctgatcgtc aacgttttcg cgatcaaccg cgatcctgaa 2700

tactggaaag acgcggaagc gttcattccg gaacgctttg agaactcctc tgccaccgtt 2760

atgggcgctg aatacgagta cctgccgttc ggtgcgggtc gccgtatgtg cccgggtgct 2820

gcactgggcc tggcgaacgt tcaactgcca ctggcgaaca tcctgtacca cttcaactgg 2880

aaactgccta acggcgtatc ttatgatcaa atcgacatga ccgaaagctc cggcgcgacc 2940

atgcagcgta aaaccgaact gctgctggtt ccgtcctttt aacctagg 2988

<210> 57

<211> 1485

<212> DNA

<213> Artificial sequence

<220>

<400> 57

atgatcgaac aactgctgga atactggtac gtggttgtgc ctgttctgta tattatcaaa 60

cagctgctgg cgtacactaa agcgacccgt tctaaaagca ctaagaaatc tctgccggaa 120

ccgtggcgtc tgccaatcat cggtcacatg caccacctga tcggcaccac cccgcaccgt 180

ggcgtacgcg acctggcgcg taagtacggc tctctgatgc atctgcagct gggcgaggta 240

cctactatcg tcgtttcctc cccgaagtgg gccaaagaaa tcctgactac ctatgacatc 300

actttcgcca accgcccgga aacgctgacc ggcgaaattg tcctgtacca taacacggat 360

gtggttctgg ccccgtacgg tgagtactgg cgccagctgc gcaaaatttg tactctggaa 420

ctgctgagcg ttaaaaaggt taaatccttc cagagcctgc gtgaagagga atgctggaac 480

ctggtgcagg agattaaagc gtctggcagc ggtcgtccag ttaacctgtc tgagaatgtt 540

tttaaactga tcgctactat cctgtctcgc gcggcattcg gtaaaggtat caaagatcag 600

aaagaactga ccgaaatcgt taaggaaatc ctgcgccaga ctggtggctt cgacgttgcg 660

gacatcttcc cgtccaaaaa gttcctgcac catctgtctg gcaaacgcgc tcgtctgacc 720

tccctgcgta agaaaattga taacctgatt gacaacctgg tcgctgagca cactgtgaac 780

acctcttcta aaaccaacga aaccctgctg gacgtactgc tgcgcctgaa ggactctgcc 840

gaatttccac tgactagcga caatatcaaa gcaatcatcc tggacatgtt cggcgccggt 900

accgatacgt cctcttccac gattgagtgg gctatttccg aactgatcaa atgcccgaag 960

gcgatggaaa aagtgcaggc ggaactgcgt aaagcgctga acggtaaaga gaaaattcat 1020

gaagaggaca tccaggaact gtcctacctg aatatggtaa tcaaagaaac tctgcgtctg 1080

catccgccgc tgccactggt tctgccgcgt gaatgccgtc agccggttaa cctggccggc 1140

tacaacattc cgaacaaaac gaagctgatc gtcaacgttt tcgcgatcaa ccgcgatcct 1200

gaatactgga aagacgcgga agcgttcatt ccggaacgct ttgagaactc ctctgccacc 1260

gttatgggcg ctgaatacga gtacctgccg ttcggtgcgg gtcgccgtat gtgcccgggt 1320

gctgcactgg gcctggcgaa cgttcaactg ccactggcga acatcctgta ccacttcaac 1380

tggaaactgc ctaacggcgt atcttatgat caaatcgaca tgaccgaaag ctccggcgcg 1440

accatgcagc gtaaaaccga actgctgctg gttccgtcct tttaa 1485

<210> 58

<211> 494

<212> PRT

<213> Artificial sequence

<220>

<223> Modified amorphadiene oxidase

<400> 58

Met Ile Glu Gln Leu Leu Glu Tyr Trp Tyr Val Val Val Pro Val Leu

1 5 10 15

Tyr Ile Ile Lys Gln Leu Leu Ala Tyr Thr Lys Ala Thr Arg Ser Lys

20 25 30

Ser Thr Lys Lys Ser Leu Pro Glu Pro Trp Arg Leu Pro Ile Ile Gly

35 40 45

His Met His His Leu Ile Gly Thr Thr Pro His Arg Gly Val Arg Asp

50 55 60

Leu Ala Arg Lys Tyr Gly Ser Leu Met His Leu Gln Leu Gly Glu Val

65 70 75 80

Pro Thr Ile Val Val Ser Ser Pro Lys Trp Ala Lys Glu Ile Leu Thr

85 90 95

Thr Tyr Asp Ile Thr Phe Ala Asn Arg Pro Glu Thr Leu Thr Gly Glu

100 105 110

Ile Val Leu Tyr His Asn Thr Asp Val Val Leu Ala Pro Tyr Gly Glu

115 120 125

Tyr Trp Arg Gln Leu Arg Lys Ile Cys Thr Leu Glu Leu Leu Ser Val

130 135 140

Lys Lys Val Lys Ser Phe Gln Ser Leu Arg Glu Glu Glu Cys Trp Asn

145 150 155 160

Leu Val Gln Glu Ile Lys Ala Ser Gly Ser Gly Arg Pro Val Asn Leu

165 170 175

Ser Glu Asn Val Phe Lys Leu Ile Ala Thr Ile Leu Ser Arg Ala Ala

180 185 190

Phe Gly Lys Gly Ile Lys Asp Gln Lys Glu Leu Thr Glu Ile Val Lys

195 200 205

Glu Ile Leu Arg Gln Thr Gly Gly Phe Asp Val Ala Asp Ile Phe Pro

210 215 220

Ser Lys Lys Phe Leu His His Leu Ser Gly Lys Arg Ala Arg Leu Thr

225 230 235 240

Ser Leu Arg Lys Lys Ile Asp Asn Leu Ile Asp Asn Leu Val Ala Glu

245 250 255

His Thr Val Asn Thr Ser Ser Lys Thr Asn Glu Thr Leu Leu Asp Val

260 265 270

Leu Leu Arg Leu Lys Asp Ser Ala Glu Phe Pro Leu Thr Ser Asp Asn

275 280 285

Ile Lys Ala Ile Ile Leu Asp Met Phe Gly Ala Gly Thr Asp Thr Ser

290 295 300

Ser Ser Thr Ile Glu Trp Ala Ile Ser Glu Leu Ile Lys Cys Pro Lys

305 310 315 320

Ala Met Glu Lys Val Gln Ala Glu Leu Arg Lys Ala Leu Asn Gly Lys

325 330 335

Glu Lys Ile His Glu Glu Asp Ile Gln Glu Leu Ser Tyr Leu Asn Met

340 345 350

Val Ile Lys Glu Thr Leu Arg Leu His Pro Pro Leu Pro Leu Val Leu

355 360 365

Pro Arg Glu Cys Arg Gln Pro Val Asn Leu Ala Gly Tyr Asn Ile Pro

370 375 380

Asn Lys Thr Lys Leu Ile Val Asn Val Phe Ala Ile Asn Arg Asp Pro

385 390 395 400

Glu Tyr Trp Lys Asp Ala Glu Ala Phe Ile Pro Glu Arg Phe Glu Asn

405 410 415

Ser Ser Ala Thr Val Met Gly Ala Glu Tyr Glu Tyr Leu Pro Phe Gly

420 425 430

Ala Gly Arg Arg Met Cys Pro Gly Ala Ala Leu Gly Leu Ala Asn Val

435 440 445

Gln Leu Pro Leu Ala Asn Ile Leu Tyr His Phe Asn Trp Lys Leu Pro

450 455 460

Asn Gly Val Ser Tyr Asp Gln Ile Asp Met Thr Glu Ser Ser Gly Ala

465 470 475 480

Thr Met Gln Arg Lys Thr Glu Leu Leu Leu Val Pro Ser Phe

485 490

<210> 59

<211> 2946

<212> DNA

<213> Artificial sequence

<220>

<400> 59

catatggctc tgctgctggc tgtcttcctg ggtctgtcct gcctgctgct gctgtccctg 60

tgggcgaccc gttctaaaag cactaagaaa tctctgccgg aaccgtggcg tctgccaatc 120

atcggtcaca tgcaccacct gatcggcacc accccgcacc gtggcgtacg cgacctggcg 180

cgtaagtacg gctctctgat gcatctgcag ctgggcgagg tacctactat cgtcgtttcc 240

tccccgaagt gggccaaaga aatcctgact acctatgaca tcactttcgc caaccgcccg 300

gaaacgctga ccggcgaaat tgtcctgtac cataacacgg atgtggttct ggccccgtac 360

ggtgagtact ggcgccagct gcgcaaaatt tgtactctgg aactgctgag cgttaaaaag 420

gttaaatcct tccagagcct gcgtgaagag gaatgctgga acctggtgca ggagattaaa 480

gcgtctggca gcggtcgtcc agttaacctg tctgagaatg tttttaaact gatcgctact 540

atcctgtctc gcgcggcatt cggtaaaggt atcaaagatc agaaagaact gaccgaaatc 600

gttaaggaaa tcctgcgcca gactggtggc ttcgacgttg cggacatctt cccgtccaaa 660

aagttcctgc accatctgtc tggcaaacgc gctcgtctga cctccctgcg taagaaaatt 720

gataacctga ttgacaacct ggtcgctgag cacactgtga acacctcttc taaaaccaac 780

gaaaccctgc tggacgtact gctgcgcctg aaggactctg ccgaatttcc actgactagc 840

gacaatatca aagcaatcat cctggacatg ttcggcgccg gtaccgatac gtcctcttcc 900

acgattgagt gggctatttc cgaactgatc aaatgcccga aggcgatgga aaaagtgcag 960

gcggaactgc gtaaagcgct gaacggtaaa gagaaaattc atgaagagga catccaggaa 1020

ctgtcctacc tgaatatggt aatcaaagaa actctgcgtc tgcatccgcc gctgccactg 1080

gttctgccgc gtgaatgccg tcagccggtt aacctggccg gctacaacat tccgaacaaa 1140

acgaagctga tcgtcaacgt tttcgcgatc aaccgcgatc ctgaatactg gaaagacgcg 1200

gaagcgttca ttccggaacg ctttgagaac tcctctgcca ccgttatggg cgctgaatac 1260

gagtacctgc cgttcggtgc gggtcgccgt atgtgcccgg gtgctgcact gggcctggcg 1320

aacgttcaac tgccactggc gaacatcctg taccacttca actggaaact gcctaacggc 1380

gtatcttatg atcaaatcga catgaccgaa agctccggcg cgaccatgca gcgtaaaacc 1440

gaactgctgc tggttccgtc cttttgacct aggcatatgg ctctgctgct ggctgtcttc 1500

ctgggtctgt cctgcctgct gctgctgtcc ctgtgggcga cccgttctaa aagcactaag 1560

aaatctctgc cggaaccgtg gcgtctgcca atcatcggtc acatgcacca cctgatcggc 1620

accaccccgc accgtggcgt acgcgacctg gcgcgtaagt acggctctct gatgcatctg 1680

cagctgggcg aggtacctac tatcgtcgtt tcctccccga agtgggccaa agaaatcctg 1740

actacctatg acatcacttt cgccaaccgc ccggaaacgc tgaccggcga aattgtcctg 1800

taccataaca cggatgtggt tctggccccg tacggtgagt actggcgcca gctgcgcaaa 1860

atttgtactc tggaactgct gagcgttaaa aaggttaaat ccttccagag cctgcgtgaa 1920

gaggaatgct ggaacctggt gcaggagatt aaagcgtctg gcagcggtcg tccagttaac 1980

ctgtctgaga atgtttttaa actgatcgct actatcctgt ctcgcgcggc attcggtaaa 2040

ggtatcaaag atcagaaaga actgaccgaa atcgttaagg aaatcctgcg ccagactggt 2100

ggcttcgacg ttgcggacat cttcccgtcc aaaaagttcc tgcaccatct gtctggcaaa 2160

cgcgctcgtc tgacctccct gcgtaagaaa attgataacc tgattgacaa cctggtcgct 2220

gagcacactg tgaacacctc ttctaaaacc aacgaaaccc tgctggacgt actgctgcgc 2280

ctgaaggact ctgccgaatt tccactgact agcgacaata tcaaagcaat catcctggac 2340

atgttcggcg ccggtaccga tacgtcctct tccacgattg agtgggctat ttccgaactg 2400

atcaaatgcc cgaaggcgat ggaaaaagtg caggcggaac tgcgtaaagc gctgaacggt 2460

aaagagaaaa ttcatgaaga ggacatccag gaactgtcct acctgaatat ggtaatcaaa 2520

gaaactctgc gtctgcatcc gccgctgcca ctggttctgc cgcgtgaatg ccgtcagccg 2580

gttaacctgg ccggctacaa cattccgaac aaaacgaagc tgatcgtcaa cgttttcgcg 2640

atcaaccgcg atcctgaata ctggaaagac gcggaagcgt tcattccgga acgctttgag 2700

aactcctctg ccaccgttat gggcgctgaa tacgagtacc tgccgttcgg tgcgggtcgc 2760

cgtatgtgcc cgggtgctgc actgggcctg gcgaacgttc aactgccact ggcgaacatc 2820

ctgtaccact tcaactggaa actgcctaac ggcgtatctt atgatcaaat cgacatgacc 2880

gaaagctccg gcgcgaccat gcagcgtaaa accgaactgc tgctggttcc gtccttttaa 2940

cctagg 2946

<210> 60

<211> 1464

<212> DNA

<213> Artificial sequence

<220>

<400> 60

atggctctgc tgctggctgt cttcctgggt ctgtcctgcc tgctgctgct gtccctgtgg 60

gcgacccgtt ctaaaagcac taagaaatct ctgccggaac cgtggcgtct gccaatcatc 120

ggtcacatgc accacctgat cggcaccacc ccgcaccgtg gcgtacgcga cctggcgcgt 180

aagtacggct ctctgatgca tctgcagctg ggcgaggtac ctactatcgt cgtttcctcc 240

ccgaagtggg ccaaagaaat cctgactacc tatgacatca ctttcgccaa ccgcccggaa 300

acgctgaccg gcgaaattgt cctgtaccat aacacggatg tggttctggc cccgtacggt 360

gagtactggc gccagctgcg caaaatttgt actctggaac tgctgagcgt taaaaaggtt 420

aaatccttcc agagcctgcg tgaagaggaa tgctggaacc tggtgcagga gattaaagcg 480

tctggcagcg gtcgtccagt taacctgtct gagaatgttt ttaaactgat cgctactatc 540

ctgtctcgcg cggcattcgg taaaggtatc aaagatcaga aagaactgac cgaaatcgtt 600

aaggaaatcc tgcgccagac tggtggcttc gacgttgcgg acatcttccc gtccaaaaag 660

ttcctgcacc atctgtctgg caaacgcgct cgtctgacct ccctgcgtaa gaaaattgat 720

aacctgattg acaacctggt cgctgagcac actgtgaaca cctcttctaa aaccaacgaa 780

accctgctgg acgtactgct gcgcctgaag gactctgccg aatttccact gactagcgac 840

aatatcaaag caatcatcct ggacatgttc ggcgccggta ccgatacgtc ctcttccacg 900

attgagtggg ctatttccga actgatcaaa tgcccgaagg cgatggaaaa agtgcaggcg 960

gaactgcgta aagcgctgaa cggtaaagag aaaattcatg aagaggacat ccaggaactg 1020

tcctacctga atatggtaat caaagaaact ctgcgtctgc atccgccgct gccactggtt 1080

ctgccgcgtg aatgccgtca gccggttaac ctggccggct acaacattcc gaacaaaacg 1140

aagctgatcg tcaacgtttt cgcgatcaac cgcgatcctg aatactggaa agacgcggaa 1200

gcgttcattc cggaacgctt tgagaactcc tctgccaccg ttatgggcgc tgaatacgag 1260

tacctgccgt tcggtgcggg tcgccgtatg tgcccgggtg ctgcactggg cctggcgaac 1320

gttcaactgc cactggcgaa catcctgtac cacttcaact ggaaactgcc taacggcgta 1380

tcttatgatc aaatcgacat gaccgaaagc tccggcgcga ccatgcagcg taaaaccgaa 1440

ctgctgctgg ttccgtcctt ttaa 1464

<210> 61

<211> 487

<212> PRT

<213> Artificial sequence

<220>

<223> Modified amorphadiene oxidase

<400> 61

Met Ala Leu Leu Leu Ala Val Phe Leu Gly Leu Ser Cys Leu Leu Leu

1 5 10 15

Leu Ser Leu Trp Ala Thr Arg Ser Lys Ser Thr Lys Lys Ser Leu Pro

20 25 30

Glu Pro Trp Arg Leu Pro Ile Ile Gly His Met His His Leu Ile Gly

35 40 45

Thr Thr Pro His Arg Gly Val Arg Asp Leu Ala Arg Lys Tyr Gly Ser

50 55 60

Leu Met His Leu Gln Leu Gly Glu Val Pro Thr Ile Val Val Ser Ser

65 70 75 80

Pro Lys Trp Ala Lys Glu Ile Leu Thr Thr Tyr Asp Ile Thr Phe Ala

85 90 95

Asn Arg Pro Glu Thr Leu Thr Gly Glu Ile Val Leu Tyr His Asn Thr

100 105 110

Asp Val Val Leu Ala Pro Tyr Gly Glu Tyr Trp Arg Gln Leu Arg Lys

115 120 125

Ile Cys Thr Leu Glu Leu Leu Ser Val Lys Lys Val Lys Ser Phe Gln

130 135 140

Ser Leu Arg Glu Glu Glu Cys Trp Asn Leu Val Gln Glu Ile Lys Ala

145 150 155 160

Ser Gly Ser Gly Arg Pro Val Asn Leu Ser Glu Asn Val Phe Lys Leu

165 170 175

Ile Ala Thr Ile Leu Ser Arg Ala Ala Phe Gly Lys Gly Ile Lys Asp

180 185 190

Gln Lys Glu Leu Thr Glu Ile Val Lys Glu Ile Leu Arg Gln Thr Gly

195 200 205

Gly Phe Asp Val Ala Asp Ile Phe Pro Ser Lys Lys Phe Leu His His

210 215 220

Leu Ser Gly Lys Arg Ala Arg Leu Thr Ser Leu Arg Lys Lys Ile Asp

225 230 235 240

Asn Leu Ile Asp Asn Leu Val Ala Glu His Thr Val Asn Thr Ser Ser

245 250 255

Lys Thr Asn Glu Thr Leu Leu Asp Val Leu Leu Arg Leu Lys Asp Ser

260 265 270

Ala Glu Phe Pro Leu Thr Ser Asp Asn Ile Lys Ala Ile Ile Leu Asp

275 280 285

Met Phe Gly Ala Gly Thr Asp Thr Ser Ser Ser Thr Ile Glu Trp Ala

290 295 300

Ile Ser Glu Leu Ile Lys Cys Pro Lys Ala Met Glu Lys Val Gln Ala

305 310 315 320

Glu Leu Arg Lys Ala Leu Asn Gly Lys Glu Lys Ile His Glu Glu Asp

325 330 335

Ile Gln Glu Leu Ser Tyr Leu Asn Met Val Ile Lys Glu Thr Leu Arg

340 345 350

Leu His Pro Pro Leu Pro Leu Val Leu Pro Arg Glu Cys Arg Gln Pro

355 360 365

Val Asn Leu Ala Gly Tyr Asn Ile Pro Asn Lys Thr Lys Leu Ile Val

370 375 380

Asn Val Phe Ala Ile Asn Arg Asp Pro Glu Tyr Trp Lys Asp Ala Glu

385 390 395 400

Ala Phe Ile Pro Glu Arg Phe Glu Asn Ser Ser Ala Thr Val Met Gly

405 410 415

Ala Glu Tyr Glu Tyr Leu Pro Phe Gly Ala Gly Arg Arg Met Cys Pro

420 425 430

Gly Ala Ala Leu Gly Leu Ala Asn Val Gln Leu Pro Leu Ala Asn Ile

435 440 445

Leu Tyr His Phe Asn Trp Lys Leu Pro Asn Gly Val Ser Tyr Asp Gln

450 455 460

Ile Asp Met Thr Glu Ser Ser Gly Ala Thr Met Gln Arg Lys Thr Glu

465 470 475 480

Leu Leu Leu Val Pro Ser Phe

485

<210> 62

<211> 10633

<212> DNA

<213> Artificial sequence

<220>

<223> Recombinant plasmid

<400> 62

accttcggga gcgcctgaag cccgttctgg acgccctggg gccgttgaat cgggatatgc 60

aggccaaggc cgccgcgatc atcaaggccg tgggcgaaaa gctgctgacg gaacagcggg 120

aagtccagcg ccagaaacag gcccagcgcc agcaggaacg cgggcgcgca catttccccg 180

aaaagtgcca cctgacgtct aagaaaccat tattatcatg acattaacct ataaaaatag 240

gcgtatcacg aggccctttc gtcttcaaga attctcatgt ttgacagctt atcatcgata 300

agctttaatg cggtagttta tcacagttaa attgctaacg cagtcaggca ccgtgtatga 360

aatctaacaa tgcgctcatc gtcatcctcg gcaccgtcac cctggatgct gtaggcatag 420

gcttggttat gccggtactg ccgggcctct tgcgggatat cgtccattcc gacagcatcg 480

ccagtcacta tggcgtgctg ctagcgctat atgcgttgat gcaatttcta tgcgcacccg 540

ttctcggagc actgtccgac cgctttggcc gccgcccagt cctgctcgct tcgctacttg 600

gagccactat cgactacgcg atcatggcga ccacacccgt cctgtggatc ctctacgccg 660

gacgcatcgt ggccggcatc accggcgcca caggtgcggt tgctggcgcc tatatcgccg 720

acatcaccga tggggaagat cgggctcgcc acttcgggct catgagcgct tgtttcggcg 780

tgggtatggt ggcaggcccc gtggccgggg gactgttggg cgccatctcc ttgcatgcac 840

cattccttgc ggcggcggtg ctcaacggcc tcaacctact actgggctgc ttcctaatgc 900

aggagtcgca taagggagag cgtcgaccga tgcccttgag agccttcaac ccagtcagct 960

ccttccggtg ggcgcggggc atgactatcg tcgccgcact tatgactgtc ttctttatca 1020

tgcaactcgt aggacaggtg ccggcagcgc tctgggtcat tttcggcgag gaccgctttc 1080

gctggagcgc gacgatgatc ggcctgtcgc ttgcggtatt cggaatcttg cacgccctcg 1140

ctcaagcctt cgtcactggt cccgccacca aacgtttcgg cgagaagcag gccattatcg 1200

ccggcatggc ggccgacgcg ctgggctacg tcttgctggc gttcgcgacg cgaggctgga 1260

tggccttccc cattatgatt cttctcgctt ccggcggcat cgggatgccc gcgttgcagg 1320

ccatgctgtc caggcaggta gatgacgacc atcagggaca gcttcaagga tcgctcgcgg 1380

ctcttaccag cctaacttcg atcactggac cgctgatcgt cacggcgatt tatgccgcct 1440

cggcgagcac atggaacggg ttggcatgga ttgtaggcgc cgccctatac cttgtctgcc 1500

tccccgcgtt gcgtcgcggt gcatggagcc gggccacctc gacctgaatg gaagccggcg 1560

gcacctcgct aacggattca ccactccaag aattggagcc aatcaattct tgcggagaac 1620

tgtgaatgcg caaatgcgcc caatacgcaa accgcctctc cccgcgcgtt ggccgattca 1680

ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc gcaacgcaat 1740

taatgtgagt tagctcactc attaggcacc ccaggcttta cactttatgc ttccggctcg 1800

tatgttgtgt ggaattgtga gcggataaca atttcacaca ggaaacagct atgaccatga 1860

ttacgccaag cgcgcaatta accctcacta aagggaacaa aagctgggta ccgggccccc 1920

cctcgaggtc gacggtatcg ataagcttga tatcgaattc ctgcagtagg aggaattaac 1980

catgtcatta ccgttcttaa cttctgcacc gggaaaggtt attatttttg gtgaacactc 2040

tgctgtgtac aacaagcctg ccgtcgctgc tagtgtgtct gcgttgagaa cctacctgct 2100

aataagcgag tcatctgcac cagatactat tgaattggac ttcccggaca ttagctttaa 2160

tcataagtgg tccatcaatg atttcaatgc catcaccgag gatcaagtaa actcccaaaa 2220

attggccaag gctcaacaag ccaccgatgg cttgtctcag gaactcgtta gtcttttgga 2280

tccgttgtta gctcaactat ccgaatcctt ccactaccat gcagcgtttt gtttcctgta 2340

tatgtttgtt tgcctatgcc cccatgccaa gaatattaag ttttctttaa agtctacttt 2400

acccatcggt gctgggttgg gctcaagcgc ctctatttct gtatcactgg ccttagctat 2460

ggcctacttg ggggggttaa taggatctaa tgacttggaa aagctgtcag aaaacgataa 2520

gcatatagtg aatcaatggg ccttcatagg tgaaaagtgt attcacggta ccccttcagg 2580

aatagataac gctgtggcca cttatggtaa tgccctgcta tttgaaaaag actcacataa 2640

tggaacaata aacacaaaca attttaagtt cttagatgat ttcccagcca ttccaatgat 2700

cctaacctat actagaattc caaggtctac aaaagatctt gttgctcgcg ttcgtgtgtt 2760

ggtcaccgag aaatttcctg aagttatgaa gccaattcta gatgccatgg gtgaatgtgc 2820

cctacaaggc ttagagatca tgactaagtt aagtaaatgt aaaggcaccg atgacgaggc 2880

tgtagaaact aataatgaac tgtatgaaca actattggaa ttgataagaa taaatcatgg 2940

actgcttgtc tcaatcggtg tttctcatcc tggattagaa cttattaaaa atctgagcga 3000

tgatttgaga attggctcca caaaacttac cggtgctggt ggcggcggtt gctctttgac 3060

tttgttacga agagacatta ctcaagagca aattgacagc ttcaaaaaga aattgcaaga 3120

tgattttagt tacgagacat ttgaaacaga cttgggtggg actggctgct gtttgttaag 3180

cgcaaaaaat ttgaataaag atcttaaaat caaatcccta gtattccaat tatttgaaaa 3240

taaaactacc acaaagcaac aaattgacga tctattattg ccaggaaaca cgaatttacc 3300

atggacttca taggaggcag atcaaatgtc agagttgaga gccttcagtg ccccagggaa 3360

agcgttacta gctggtggat atttagtttt agatacaaaa tatgaagcat ttgtagtcgg 3420

attatcggca agaatgcatg ctgtagccca tccttacggt tcattgcaag ggtctgataa 3480

gtttgaagtg cgtgtgaaaa gtaaacaatt taaagatggg gagtggctgt accatataag 3540

tcctaaaagt ggcttcattc ctgtttcgat aggcggatct aagaaccctt tcattgaaaa 3600

agttatcgct aacgtattta gctactttaa acctaacatg gacgactact gcaatagaaa 3660agttatcgct aacgtattta gctactttaa acctaacatg gacgactact gcaatagaaa 3660

cttgttcgtt attgatattt tctctgatga tgcctaccat tctcaggagg atagcgttac 3720cttgttcgtt attgatattt tctctgatga tgcctaccat tctcaggagg atagcgttac 3720

cgaacatcgt ggcaacagaa gattgagttt tcattcgcac agaattgaag aagttcccaa 3780cgaacatcgt ggcaacagaa gattgagttt tcattcgcac agaattgaag aagttcccaa 3780

aacagggctg ggctcctcgg caggtttagt cacagtttta actacagctt tggcctcctt 3840aacagggctg ggctcctcgg caggtttagt cacagtttta actacagctt tggcctcctt 3840

ttttgtatcg gacctggaaa ataatgtaga caaatataga gaagttattc ataatttagc 3900

acaagttgct cattgtcaag ctcagggtaa aattggaagc gggtttgatg tagcggcggc 3960

agcatatgga tctatcagat atagaagatt cccacccgca ttaatctcta atttgccaga 4020

tattggaagt gctacttacg gcagtaaact ggcgcatttg gttgatgaag aagactggaa 4080

tattacgatt aaaagtaacc atttaccttc gggattaact ttatggatgg gcgatattaa 4140

gaatggttca gaaacagtaa aactggtcca gaaggtaaaa aattggtatg attcgcatat 4200

gccagaaagc ttgaaaatat atacagaact cgatcatgca aattctagat ttatggatgg 4260

actatctaaa ctagatcgct tacacgagac tcatgacgat tacagcgatc agatatttga 4320

gtctcttgag aggaatgact gtacctgtca aaagtatcct gaaatcacag aagttagaga 4380

tgcagttgcc acaattagac gttcctttag aaaaataact aaagaatctg gtgccgatat 4440

cgaacctccc gtacaaacta gcttattgga tgattgccag accttaaaag gagttcttac 4500

ttgcttaata cctggtgctg gtggttatga cgccattgca gtgattacta agcaagatgt 4560

tgatcttagg gctcaaaccg ctaatgacaa aagattttct aaggttcaat ggctggatgt 4620

aactcaggct gactggggtg ttaggaaaga aaaagatccg gaaacttatc ttgataaata 4680

ggaggtaata ctcatgaccg tttacacagc atccgttacc gcacccgtca acatcgcaac 4740

ccttaagtat tgggggaaaa gggacacgaa gttgaatctg cccaccaatt cgtccatatc 4800

agtgacttta tcgcaagatg acctcagaac gttgacctct gcggctactg cacctgagtt 4860

tgaacgcgac actttgtggt taaatggaga accacacagc atcgacaatg aaagaactca 4920

aaattgtctg cgcgacctac gccaattaag aaaggaaatg gaatcgaagg acgcctcatt 4980

gcccacatta tctcaatgga aactccacat tgtctccgaa aataactttc ctacagcagc 5040

tggtttagct tcctccgctg ctggctttgc tgcattggtc tctgcaattg ctaagttata 5100

ccaattacca cagtcaactt cagaaatatc tagaatagca agaaaggggt ctggttcagc 5160

ttgtagatcg ttgtttggcg gatacgtggc ctgggaaatg ggaaaagctg aagatggtca 5220

tgattccatg gcagtacaaa tcgcagacag ctctgactgg cctcagatga aagcttgtgt 5280

cctagttgtc agcgatatta aaaaggatgt gagttccact cagggtatgc aattgaccgt 5340

ggcaacctcc gaactattta aagaaagaat tgaacatgtc gtaccaaaga gatttgaagt 5400

catgcgtaaa gccattgttg aaaaagattt cgccaccttt gcaaaggaaa caatgatgga 5460

ttccaactct ttccatgcca catgtttgga ctctttccct ccaatattct acatgaatga 5520

cacttccaag cgtatcatca gttggtgcca caccattaat cagttttacg gagaaacaat 5580

cgttgcatac acgtttgatg caggtccaaa tgctgtgttg tactacttag ctgaaaatga 5640

gtcgaaactc tttgcattta tctataaatt gtttggctct gttcctggat gggacaagaa 5700

atttactact gagcagcttg aggctttcaa ccatcaattt gaatcatcta actttactgc 5760

acgtgaattg gatcttgagt tgcaaaagga tgttgccaga gtgattttaa ctcaagtcgg 5820

ttcaggccca caagaaacaa acgaatcttt gattgacgca aagactggtc taccaaagga 5880

ataactgcag cccgggagga ggattactat atgcaaacgg aacacgtcat tttattgaat 5940

gcacagggag ttcccacggg tacgctggaa aagtatgccg cacacacggc agacacccgc 6000

ttacatctcg cgttctccag ttggctgttt aatgccaaag gacaattatt agttacccgc 6060

cgcgcactga gcaaaaaagc atggcctggc gtgtggacta actcggtttg tgggcaccca 6120

caactgggag aaagcaacga agacgcagtg atccgccgtt gccgttatga gcttggcgtg 6180

gaaattacgc ctcctgaatc tatctatcct gactttcgct accgcgccac cgatccgagt 6240

ggcattgtgg aaaatgaagt gtgtccggta tttgccgcac gcaccactag tgcgttacag 6300

atcaatgatg atgaagtgat ggattatcaa tggtgtgatt tagcagatgt attacacggt 6360

attgatgcca cgccgtgggc gttcagtccg tggatggtga tgcaggcgac aaatcgcgaa 6420

gccagaaaac gattatctgc atttacccag cttaaataac ccgggggatc cactagttct 6480

agagcggccg ccaccgcgga ggaggaatga gtaatggact ttccgcagca actcgaagcc 6540

tgcgttaagc aggccaacca ggcgctgagc cgttttatcg ccccactgcc ctttcagaac 6600

actcccgtgg tcgaaaccat gcagtatggc gcattattag gtggtaagcg cctgcgacct 6660

ttcctggttt atgccaccgg tcatatgttc ggcgttagca caaacacgct ggacgcaccc 6720

gctgccgccg ttgagtgtat ccacgcttac tcattaattc atgatgattt accggcaatg 6780

gatgatgacg atctgcgtcg cggtttgcca acctgccatg tgaagtttgg cgaagcaaac 6840

gcgattctcg ctggcgacgc tttacaaacg ctggcgttct cgattttaag cgatgccgat 6900

atgccggaag tgtcggaccg cgacagaatt tcgatgattt ctgaactggc gagcgccagt 6960

ggtattgccg gaatgtgcgg tggtcaggca ttagatttag acgcggaagg caaacacgta 7020

cctctggacg cgcttgagcg tattcatcgt cataaaaccg gcgcattgat tcgcgccgcc 7080

gttcgccttg gtgcattaag cgccggagat aaaggacgtc gtgctctgcc ggtactcgac 7140

aagtatgcag agagcatcgg ccttgccttc caggttcagg atgacatcct ggatgtggtg 7200

ggagatactg caacgttggg aaaacgccag ggtgccgacc agcaacttgg taaaagtacc 7260

taccctgcac ttctgggtct tgagcaagcc cggaagaaag cccgggatct gatcgacgat 7320

gcccgtcagt cgctgaaaca actggctgaa cagtcactcg atacctcggc actggaagcg 7380

ctagcggact acatcatcca gcgtaataaa taagagctcc aattcgccct atagtgagtc 7440

gtattacgcg cgctcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 7500

tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 7560

ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggaaattgta 7620

agcgttaata ttttgttaaa attcgcgtta aatttttgtt aaatcagctc attttttaac 7680

caataggccg actgcgatga gtggcagggc ggggcgtaat ttttttaagg cagttattgg 7740

tgcccttaaa cgcctggtgc tacgcctgaa taagtgataa taagcggatg aatggcagaa 7800

attcgaaagc aaattcgacc cggtcgtcgg ttcagggcag ggtcgttaaa tagccgctta 7860

tgtctattgc tggtttaccg gtttattgac taccggaagc agtgtgaccg tgtgcttctc 7920

aaatgcctga ggccagtttg ctcaggctct ccccgtggag gtaataattg acgatatgat 7980

catttattct gcctcccaga gcctgataaa aacggtgaat ccgttagcga ggtgccgccg 8040

gcttccattc aggtcgaggt ggcccggctc catgcaccgc gacgcaacgc ggggaggcag 8100

acaaggtata gggcggcgag gcggctacag ccgatagtct ggaacagcgc acttacgggt 8160

tgctgcgcaa cccaagtgct accggcgcgg cagcgtgacc cgtgtcggcg gctccaacgg 8220

ctcgccatcg tccagaaaac acggctcatc gggcatcggc aggcgctgct gcccgcgccg 8280

ttcccattcc tccgtttcgg tcaaggctgg caggtctggt tccatgcccg gaatgccggg 8340

ctggctgggc ggctcctcgc cggggccggt cggtagttgc tgctcgcccg gatacagggt 8400

cgggatgcgg cgcaggtcgc catgccccaa cagcgattcg tcctggtcgt cgtgatcaac 8460

caccacggcg gcactgaaca ccgacaggcg caactggtcg cggggctggc cccacgccac 8520

gcggtcattg accacgtagg ccgacacggt gccggggccg ttgagcttca cgacggagat 8580

ccagcgctcg gccaccaagt ccttgactgc gtattggacc gtccgcaaag aacgtccgat 8640

gagcttggaa agtgtcttct ggctgaccac cacggcgttc tggtggccca tctgcgccac 8700

gaggtgatgc agcagcattg ccgccgtggg tttcctcgca ataagcccgg cccacgcctc 8760

atgcgctttg cgttccgttt gcacccagtg accgggcttg ttcttggctt gaatgccgat 8820

ttctctggac tgcgtggcca tgcttatctc catgcggtag ggtgccgcac ggttgcggca 8880

ccatgcgcaa tcagctgcaa cttttcggca gcgcgacaac aattatgcgt tgcgtaaaag 8940

tggcagtcaa ttacagattt tctttaacct acgcaatgag ctattgcggg gggtgccgca 9000

atgagctgtt gcgtaccccc cttttttaag ttgttgattt ttaagtcttt cgcatttcgc 9060

cctatatcta gttctttggt gcccaaagaa gggcacccct gcggggttcc cccacgcctt 9120

cggcgcggct ccccctccgg caaaaagtgg cccctccggg gcttgttgat cgactgcgcg 9180

gccttcggcc ttgcccaagg tggcgctgcc cccttggaac ccccgcactc gccgccgtga 9240

ggctcggggg gcaggcgggc gggcttcgcc ttcgactgcc cccactcgca taggcttggg 9300

tcgttccagg cgcgtcaagg ccaagccgct gcgcggtcgc tgcgcgagcc ttgacccgcc 9360

ttccacttgg tgtccaaccg gcaagcgaag cgcgcaggcc gcaggccgga ggcttttccc 9420

cagagaaaat taaaaaaatt gatggggcaa ggccgcaggc cgcgcagttg gagccggtgg 9480

gtatgtggtc gaaggctggg tagccggtgg gcaatccctg tggtcaagct cgtgggcagg 9540

cgcagcctgt ccatcagctt gtccagcagg gttgtccacg ggccgagcga agcgagccag 9600

ccggtggccg ctcgcggcca tcgtccacat atccacgggc tggcaaggga gcgcagcgac 9660

cgcgcagggc gaagcccgga gagcaagccc gtagggcgcc gcagccgccg taggcggtca 9720

cgactttgcg aagcaaagtc tagtgagtat actcaagcat tgagtggccc gccggaggca 9780

ccgccttgcg ctgcccccgt cgagccggtt ggacaccaaa agggaggggc aggcatggcg 9840

gcatacgcga tcatgcgatg caagaagctg gcgaaaatgg gcaacgtggc ggccagtctc 9900

aagcacgcct accgcgagcg cgagacgccc aacgctgacg ccagcaggac gccagagaac 9960

gagcactggg cggccagcag caccgatgaa gcgatgggcc gactgcgcga gttgctgcca 10020

gagaagcggc gcaaggacgc tgtgttggcg gtcgagtacg tcatgacggc cagcccggaa 10080

tggtggaagt cggccagcca agaacagcag gcggcgttct tcgagaaggc gcacaagtgg 10140

ctggcggaca agtacggggc ggatcgcatc gtgacggcca gcatccaccg tgacgaaacc 10200

agcccgcaca tgaccgcgtt cgtggtgccg ctgacgcagg acggcaggct gtcggccaag 10260

gagttcatcg gcaacaaagc gcagatgacc cgcgaccaga ccacgtttgc ggccgctgtg 10320

gccgatctag ggctgcaacg gggcatcgag ggcagcaagg cacgtcacac gcgcattcag 10380

gcgttctacg aggccctgga gcggccacca gtgggccacg tcaccatcag cccgcaagcg 10440

gtcgagccac gcgcctatgc accgcaggga ttggccgaaa agctgggaat ctcaaagcgc 10500

gttgagacgc cggaagccgt ggccgaccgg ctgacaaaag cggttcggca ggggtatgag 10560

cctgccctac aggccgccgc aggagcgcgt gagatgcgca agaaggccga tcaagcccaa 10620

gagacggccc gag 10633

<210> 63

<211> 4263

<212> DNA

<213> Artificial sequence

<220>

<223> Recombinant polynucleotide

<400> 63

cttgatatcg aattcctgca gcccggggat cctctagagt cgactaggag gaatataaaa 60

tgaaaaattg tgtcatcgtc agtgcggtac gtactgctat cggtagtttt aacggttcac 120

tcgcttccac cagcgccatc gacctggggg cgacagtaat taaagccgcc attgaacgtg 180

caaaaatcga ttcacaacac gttgatgaag tgattatggg taacgtgtta caagccgggc 240

tggggcaaaa tccggcgcgt caggcactgt taaaaagcgg gctggcagaa acggtgtgcg 300

gattcacggt caataaagta tgtggttcgg gtcttaaaag tgtggcgctt gccgcccagg 360

ccattcaggc aggtcaggcg cagagcattg tggcgggggg tatggaaaat atgagtttag 420

ccccctactt actcgatgca aaagcacgct ctggttatcg tcttggagac ggacaggttt 480

atgacgtaat cctgcgcgat ggcctgatgt gcgccaccca tggttatcat atggggatta 540

ccgccgaaaa cgtggctaaa gagtacggaa ttacccgtga aatgcaggat gaactggcgc 600

tacattcaca gcgtaaagcg gcagccgcaa ttgagtccgg tgcttttaca gccgaaatcg 660

tcccggtaaa tgttgtcact cgaaagaaaa ccttcgtctt cagtcaagac gaattcccga 720

aagcgaattc aacggctgaa gcgttaggtg cattgcgccc ggccttcgat aaagcaggaa 780

cagtcaccgc tgggaacgcg tctggtatta acgacggtgc tgccgctctg gtgattatgg 840

aagaatctgc ggcgctggca gcaggcctta cccccctggc tcgcattaaa agttatgcca 900

gcggtggcgt gccccccgca ttgatgggta tggggccagt acctgccacg caaaaagcgt 960

tacaactggc ggggctgcaa ctggcggata ttgatctcat tgaggctaat gaagcatttg 1020

ctgcacagtt ccttgccgtt gggaaaaacc tgggctttga ttctgagaaa gtgaatgtca 1080

acggcggggc catcgcgctc gggcatccta tcggtgccag tggtgctcgt attctggtca 1140

cactattaca tgccatgcag gcacgcgata aaacgctggg gctggcaaca ctgtgcattg 1200

gcggcggtca gggaattgcg atggtgattg aacggttgaa ttaaggagga cagctaaatg 1260

aaactctcaa ctaaactttg ttggtgtggt attaaaggaa gacttaggcc gcaaaagcaa 1320

caacaattac acaatacaaa cttgcaaatg actgaactaa aaaaacaaaa gaccgctgaa 1380

caaaaaacca gacctcaaaa tgtcggtatt aaaggtatcc aaatttacat cccaactcaa 1440

tgtgtcaacc aatctgagct agagaaattt gatggcgttt ctcaaggtaa atacacaatt 1500

ggtctgggcc aaaccaacat gtcttttgtc aatgacagag aagatatcta ctcgatgtcc 1560

ctaactgttt tgtctaagtt gatcaagagt tacaacatcg acaccaacaa aattggtaga 1620

ttagaagtcg gtactgaaac tctgattgac aagtccaagt ctgtcaagtc tgtcttgatg 1680

caattgtttg gtgaaaacac tgacgtcgaa ggtattgaca cgcttaatgc ctgttacggt 1740

ggtaccaacg cgttgttcaa ctctttgaac tggattgaat ctaacgcatg ggatggtaga 1800

gacgccattg tagtttgcgg tgatattgcc atctacgata agggtgccgc aagaccaacc 1860

ggtggtgccg gtactgttgc tatgtggatc ggtcctgatg ctccaattgt atttgactct 1920

gtaagagctt cttacatgga acacgcctac gatttttaca agccagattt caccagcgaa 1980

tatccttacg tcgatggtca tttttcatta acttgttacg tcaaggctct tgatcaagtt 2040

tacaagagtt attccaagaa ggctatttct aaagggttgg ttagcgatcc cgctggttcg 2100

gatgctttga acgttttgaa atatttcgac tacaacgttt tccatgttcc aacctgtaaa 2160

ttggtcacaa aatcatacgg tagattacta tataacgatt tcagagccaa tcctcaattg 2220

ttcccagaag ttgacgccga attagctact cgcgattatg acgaatcttt aaccgataag 2280

aacattgaaa aaacttttgt taatgttgct aagccattcc acaaagagag agttgcccaa 2340

tctttgattg ttccaacaaa cacaggtaac atgtacaccg catctgttta tgccgccttt 2400

gcatctctat taaactatgt tggatctgac gacttacaag gcaagcgtgt tggtttattt 2460

tcttacggtt ccggtttagc tgcatctcta tattcttgca aaattgttgg tgacgtccaa 2520

catattatca aggaattaga tattactaac aaattagcca agagaatcac cgaaactcca 2580

aaggattacg aagctgccat cgaattgaga gaaaatgccc atttgaagaa gaacttcaaa 2640

cctcaaggtt ccattgagca tttgcaaagt ggtgtttact acttgaccaa catcgatgac 2700

aaatttagaa gatcttacga tgttaaaaaa taaggaggat tacactatgg ttttaaccaa 2760

taaaacagtc atttctggat cgaaagtcaa aagtttatca tctgcgcaat cgagctcatc 2820

aggaccttca tcatctagtg aggaagatga ttcccgcgat attgaaagct tggataagaa 2880

aatacgtcct ttagaagaat tagaagcatt attaagtagt ggaaatacaa aacaattgaa 2940

gaacaaagag gtcgctgcct tggttattca cggtaagtta cctttgtacg ctttggagaa 3000

aaaattaggt gatactacga gagcggttgc ggtacgtagg aaggctcttt caattttggc 3060

agaagctcct gtattagcat ctgatcgttt accatataaa aattatgact acgaccgcgt 3120

atttggcgct tgttgtgaaa atgttatagg ttacatgcct ttgcccgttg gtgttatagg 3180

ccccttggtt atcgatggta catcttatca tataccaatg gcaactacag agggttgttt 3240

ggtagcttct gccatgcgtg gctgtaaggc aatcaatgct ggcggtggtg caacaactgt 3300

tttaactaag gatggtatga caagaggccc agtagtccgt ttcccaactt tgaaaagatc 3360

tggtgcctgt aagatatggt tagactcaga agagggacaa aacgcaatta aaaaagcttt 3420

taactctaca tcaagatttg cacgtctgca acatattcaa acttgtctag caggagattt 3480

actcttcatg agatttagaa caactactgg tgacgcaatg ggtatgaata tgatttctaa 3540

aggtgtcgaa tactcattaa agcaaatggt agaagagtat ggctgggaag atatggaggt 3600

tgtctccgtt tctggtaact actgtaccga caaaaaacca gctgccatca actggatcga 3660

aggtcgtggt aagagtgtcg tcgcagaagc tactattcct ggtgatgttg tcagaaaagt 3720

gttaaaaagt gatgtttccg cattggttga gttgaacatt gctaagaatt tggttggatc 3780

tgcaatggct gggtctgttg gtggatttaa cgcacatgca gctaatttag tgacagctgt 3840

tttcttggca ttaggacaag atcctgcaca aaatgttgaa agttccaact gtataacatt 3900

gatgaaagaa gtggacggtg atttgagaat ttccgtatcc atgccatcca tcgaagtagg 3960

taccatcggt ggtggtactg ttctagaacc acaaggtgcc atgttggact tattaggtgt 4020

aagaggcccg catgctaccg ctcctggtac caacgcacgt caattagcaa gaatagttgc 4080

ctgtgccgtc ttggcaggtg aattatcctt atgtgctgcc ctagcagccg gccatttggt 4140

tcaaagtcat atgacccaca acaggaaacc tgctgaacca acaaaaccta acaatttgga 4200

cgccactgat ataaatcgtt tgaaagatgg gtccgtcacc tgcattaaat cctaagtcga 4260

cct 4263
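
The listing above is laid out with up to six blocks of ten bases per line followed by a running base count, and field <211> states the total length. The following is a minimal Python sketch, assuming exactly that layout, of how an extracted listing could be checked for dropped or duplicated bases. The function name `check_listing` and the toy input are hypothetical illustrations and are not part of the patent.

```python
def check_listing(lines, declared_length):
    """Concatenate listing lines after verifying each running count.

    Assumes each non-blank line is: base blocks separated by spaces,
    then the cumulative number of bases up to the end of that line.
    """
    bases = []
    total = 0
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # blank separator line
        *blocks, count = parts
        chunk = "".join(blocks)
        if set(chunk) - set("acgtn"):
            raise ValueError(f"unexpected symbol in line: {raw!r}")
        total += len(chunk)
        if total != int(count):
            raise ValueError(f"running count {count} but {total} bases read")
        bases.append(chunk)
    if total != declared_length:
        raise ValueError(f"<211> declares {declared_length}, found {total}")
    return "".join(bases)


# Hypothetical toy input, not taken from the listing above.
sample = ["acgtacgtac gtacgtacgt 20", "", "acg 23"]
sequence = check_listing(sample, 23)
```

A check of this kind catches a single inserted or deleted base on the line where it occurs, because the running count stops matching from that line onward.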

Claims

1. A nucleic acid comprising, in order from 5' to 3' and in operable linkage, a nucleotide sequence encoding a domain and a nucleotide sequence encoding a cytochrome P450 enzyme, wherein the domain is selected from a transmembrane domain, a secretion domain, a solubilization domain, and a membrane-inserting protein, and wherein the domain is heterologous to the cytochrome P450 enzyme.

2. The nucleic acid of claim 1, wherein the transmembrane domain is functional in a prokaryotic host cell.

3. The nucleic acid of claim 1, wherein the nucleotide sequence encoding the cytochrome P450 enzyme is codon-optimized for expression in a prokaryotic host cell.

4. The nucleic acid of claim 1, wherein the cytochrome P450 enzyme is an isoprenoid precursor-modifying enzyme that catalyzes the modification of an isoprenoid precursor.

5. The nucleic acid of claim 4, wherein the modification is selected from oxidation, hydroxylation, and epoxidation.

6. The nucleic acid of claim 1, further comprising a nucleotide sequence encoding a cytochrome P450 reductase.

7. An expression vector comprising the nucleic acid of claim 1.

8. A host cell comprising the expression vector of claim 7.

9. The host cell of claim 8, wherein the host cell is a cell that does not normally produce isopentenyl pyrophosphate through the mevalonate pathway.

10. The host cell of claim 9, wherein the host cell is a prokaryotic cell.

11. The host cell of claim 8, further comprising a nucleic acid comprising a nucleotide sequence encoding a heterologous terpene synthase.

12. The host cell of claim 8, further comprising a nucleic acid comprising a nucleotide sequence encoding a cytochrome P450 reductase.

13. The host cell of claim 9, wherein the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding two or more mevalonate pathway enzymes.

14. A method of producing a biosynthetic pathway product in a host cell, the method comprising:

culturing a genetically modified host cell in a suitable medium to produce an enzymatically active modified cytochrome P450 enzyme, wherein the host cell is genetically modified with a nucleic acid comprising a nucleotide sequence encoding a cytochrome P450 enzyme operably linked to a domain selected from a transmembrane domain, a secretion domain, a solubilization domain, and a membrane-inserting protein,

wherein production of the modified cytochrome P450 enzyme in the presence of a biosynthetic pathway intermediate results in enzymatic modification of the biosynthetic pathway intermediate and production of the biosynthetic pathway product.

15. The method of claim 14, wherein the cytochrome P450 enzyme is an isoprenoid precursor-modifying enzyme, and wherein production of the isoprenoid precursor-modifying enzyme in the presence of an isoprenoid precursor compound results in enzymatic modification of the isoprenoid precursor and production of an isoprenoid compound.

16. The method of claim 14, wherein the host cell is a eukaryotic host cell.

17. The method of claim 16, wherein the host cell is a yeast cell.

18. The method of claim 16, wherein the host cell is a plant cell.

19. The method of claim 14, wherein the host cell is a prokaryotic cell.

20. The method of claim 15, wherein said host cell is further genetically modified with a nucleic acid comprising a nucleotide sequence encoding a heterologous terpene synthase, wherein said culturing produces said terpene synthase, and wherein the terpene synthase is capable of modifying a polyprenyl pyrophosphate to produce a substrate for the isopentenoid precursor modifying enzyme.

21. The method of claim 20, wherein the polyprenyl pyrophosphate is selected from farnesyl pyrophosphate, geranyl pyrophosphate, and geranylgeranyl pyrophosphate.

22. The method of claim 14, wherein the host cell is further genetically modified with a nucleic acid comprising a nucleotide sequence encoding a cytochrome P450 reductase (CPR).

23. The method of claim 15, wherein the host cell is a cell that does not normally synthesize isopentenyl pyrophosphate (IPP) through the mevalonate pathway, wherein the host cell is genetically modified with one or more nucleic acids comprising nucleotide sequences encoding two or more enzymes of the mevalonate pathway, an IPP isomerase, an isopentenyltransferase, and a terpene synthase, wherein said culturing produces the mevalonate pathway enzymes, and wherein production of said two or more mevalonate pathway enzymes, said IPP isomerase, said isopentenyltransferase, said terpene synthase, and said isopentenoid precursor modifying enzyme results in production of the isopentenoid compound.

24. The method of claim 23, wherein the two or more mevalonate pathway enzymes comprise mevalonate kinase, phosphomevalonate kinase, and pyrophosphomevalonate decarboxylase, and wherein said host cell is cultured in the presence of mevalonate.

25. The method of claim 23, wherein the two or more mevalonate pathway enzymes comprise acetoacetyl-CoA thiolase, hydroxymethylglutaryl-CoA synthase, hydroxymethylglutaryl-CoA reductase, mevalonate kinase, phosphomevalonate kinase, and pyrophosphomevalonate decarboxylase.

26. The method of claim 14, wherein the nucleotide sequence encoding a cytochrome P450 enzyme is operably linked to an inducible promoter.

27. The method of claim 15, wherein the yield of the isopentenoid compound is at least about 10 mg/L.

28. The method of claim 14, further comprising isolating the biosynthetic pathway product.