MXPA98008509A

MXPA98008509A - Compositions and methods for taxol biosynthesis

Info

Publication number: MXPA98008509A
Application number: MXPA/A/1998/008509A
Authority: MX
Inventors: R Wildung Mark; B Croteau Rodney
Original assignee: Washington State University Research Foundation
Priority date: 1996-04-15
Filing date: 1998-10-14
Publication date: 1999-02-24

Abstract

The gene for taxadiene synthetase from the Pacific yew has been cloned, and its polypeptide and nucleic acid sequences are presented. Truncation or removal of the transit peptide increases expression of the cloned taxadiene synthetase gene in E. coli cells.

Description

COMPOSITIONS AND METHODS FOR TAXOL BIOSYNTHESIS

CROSS-REFERENCE TO RELATED CASE

This application claims the benefit of U.S. Provisional Application No. 60/015,993, filed on April 15, 1996, incorporated herein by reference.

TECHNICAL FIELD

This invention relates to the field of detection of diterpenoid biosynthesis, particularly to the biosynthesis of taxoid compounds such as taxol.

ACKNOWLEDGMENT OF GOVERNMENT SUPPORT

This invention was made with government support under National Institutes of Health grant No. CA-55254. The government has certain rights in this invention.

TECHNICAL BACKGROUND

The highly functionalized diterpenoid taxol (Wani et al., J. Am. Chem. Soc. 93:2325-2327, 1971) is well established as a potent chemotherapeutic agent (Holmes et al., in Taxane Anticancer Agents: Basic Science and Current Status, Georg et al., eds., pp. 31-57, American Chemical Society, Washington, DC, 1995; Arbuck and Blaylock, in Taxol: Science and Applications, Suffness, ed., pp. 379-415, CRC Press, Boca Raton, FL, 1995). (Paclitaxel is the generic name for taxol, a registered trademark of Bristol-Myers Squibb.) The supply of taxol from the original source, the bark of the Pacific yew (Taxus brevifolia Nutt.; Taxaceae), is limited. As a result, intensive efforts have been made to develop alternative means of production, including isolation from the foliage and other renewable tissues of plantation-grown Taxus species, biosynthesis in tissue culture systems, and semisynthesis of taxol and its analogs from advanced taxane diterpenoid metabolites (taxoids) that are more readily available (Cragg et al., J. Nat. Prod. 56:1657-1668, 1993). At present, the total synthesis of taxol is not commercially viable (Borman, Chem. Eng. News 72(7):32-34, 1994), and it is clear that for the foreseeable future the supply of taxol and its synthetically useful progenitors must depend on biological methods of production, either in Taxus plants or in cell cultures derived from them (Suffness, in Taxane Anticancer Agents: Basic Science and Current Status, Georg et al., eds., American Chemical Society, Washington, DC, 1995, pp. 1-17).

Taxol biosynthesis involves the initial cyclization of geranylgeranyl diphosphate, the universal precursor of diterpenoids (West, in Biosynthesis of Isoprenoid Compounds, Porter and Spurgeon, eds., vol. 1, pp. 375-411, Wiley & Sons, New York, NY, 1981), to taxa-4(5),11(12)-diene (Koepp et al., J. Biol. Chem. 270:8686-8690, 1995), followed by extensive oxidative modification of this olefin (Koepp et al., J. Biol. Chem. 270:8686-8690, 1995; Croteau et al., in Taxane Anticancer Agents: Basic Science and Current Status, Georg et al., eds., pp. 72-80, American Chemical Society, Washington, DC, 1995) and elaboration of the side chains (Figure 1) (Floss and Mocek, in Taxol: Science and Applications, Suffness, ed., pp. 191-208, CRC Press, Boca Raton, FL, 1995). Taxa-4(5),11(12)-diene synthetase ("taxadiene synthetase"), the enzyme responsible for the initial cyclization of geranylgeranyl diphosphate to delineate the taxane skeleton, has been isolated from stem tissue of T. brevifolia, partially purified, and characterized (Hezari et al., Arch. Biochem. Biophys. 322:437-444, 1995). Although taxadiene synthetase resembles other plant terpenoid cyclases in general enzymatic properties (Hezari et al., Arch. Biochem. Biophys. 322:437-444, 1995), it has proven extremely difficult to purify in quantities sufficient for antibody preparation or microsequencing, which has thwarted this approach to cloning the cDNA.

BRIEF DESCRIPTION OF THE INVENTION

The inventors have cloned and sequenced the gene for taxadiene synthetase from the Pacific yew.

One embodiment of the invention includes isolated polynucleotides comprising at least 15 consecutive nucleotides, preferably at least 20, more preferably at least 25, and most preferably at least 30 consecutive nucleotides of a native gene for taxadiene synthetase, for example, the gene for taxadiene synthetase from the Pacific yew. Such polynucleotides are useful, for example, as probes and primers for obtaining homologs of the Pacific yew taxadiene synthetase gene by contacting a nucleic acid of a taxoid-producing organism with the probe or primer under stringent hybridization conditions, so that the probe or primer hybridizes with a taxadiene synthetase gene of the organism, and then isolating the taxadiene synthetase gene with which the probe or primer hybridizes.

Another embodiment of the invention includes isolated polynucleotides comprising a sequence encoding a polypeptide having taxadiene synthetase biological activity. Preferably, the sequence encoding the polypeptide has at least 70%, more preferably at least 80%, and most preferably at least 90% nucleotide sequence similarity with a native Pacific yew taxadiene synthetase gene. In preferred embodiments of such polynucleotides, the coding sequence encodes a polypeptide having only conservative amino acid substitutions relative to the native Pacific yew taxadiene synthetase polypeptide, except, in some embodiments, for amino acid substitutions at one or more of: cysteine residues 329, 650, 719 and 777; histidine residues 370, 415, 579 and 793; a DDXXD motif; a DXXDD motif; a conserved arginine; and an RWWK element. Preferably, the encoded polypeptide has only conservative amino acid substitutions relative to, or is completely homologous with, the native Pacific yew taxadiene synthetase polypeptide. In addition, the encoded polypeptide preferably lacks at least part of the transit peptide. Also included are cells, particularly plant cells, and transgenic plants that include such polynucleotides and the encoded polypeptides.

Another embodiment of the invention includes isolated polypeptides having taxadiene synthetase activity, preferably having at least 70%, more preferably at least 80%, and most preferably at least 90% homology with a native taxadiene synthetase polypeptide. Also included are isolated polypeptides comprising at least 10, preferably at least 20, more preferably at least 30 consecutive amino acids of a native Pacific yew taxadiene synthetase, and most preferably the mature Pacific yew taxadiene synthetase polypeptide (i.e., lacking only the transit peptide).

Another embodiment of the invention includes antibodies specific for a native Pacific yew taxadiene synthetase polypeptide.

Another embodiment of the invention includes methods for expressing a taxadiene synthetase polypeptide in a cell, e.g., a taxoid-producing cell, by culturing a cell that includes an expressible polynucleotide encoding a taxadiene synthetase polypeptide under conditions suitable for expression of the polypeptide, preferably resulting in production of the taxoid at levels higher than would be expected from an otherwise similar cell lacking the expressible polynucleotide.

The foregoing and other objects and advantages of the invention will become more apparent from the following detailed description and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the steps in taxol biosynthesis, including the initial cyclization of geranylgeranyl diphosphate to taxa-4(5),11(12)-diene, followed by extensive oxidative modification and elaboration of the side chains.

Figure 2 shows the nucleotide sequence and the predicted amino acid sequence of the Pacific yew taxadiene synthetase clone pTb 42.1. The start and stop codons are underlined. The locations of the regions used for primer synthesis are double underlined. The DDMAD and DSYDD motifs are in bold type. Conserved histidines (H) and cysteines (C) and the RWWK element are indicated by boxes. Truncation sites for removing all or part of the transit peptide are indicated by a triangle (▼).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

A homology-based cloning strategy using the polymerase chain reaction (PCR) was used to isolate a cDNA encoding taxadiene synthetase. A set of degenerate primers was constructed based on consensus sequences of monoterpene, sesquiterpene and diterpene cyclases. Two of these primers amplified an 83 base pair (bp) fragment that was cyclase-like in sequence and that was used as a hybridization probe to screen a cDNA library constructed from poly(A)+ RNA extracted from Pacific yew stems. Twelve independent clones with insert sizes in excess of two kilobase pairs (kb) were isolated and partially sequenced. One of these cDNA isolates was functionally expressed in Escherichia coli, yielding a protein that was catalytically active in converting geranylgeranyl diphosphate to a diterpene olefin, which was confirmed to be taxa-4(5),11(12)-diene by combined capillary gas chromatography-mass spectrometry (Satterwhite and Croteau, J. Chromatography 452:61-73, 1988).

The taxa-4(5),11(12)-diene synthetase cDNA sequence specifies an open reading frame of 2586 nucleotides. The deduced polypeptide sequence contains 862 amino acid residues and has a molecular weight of 98,303, compared with about 79,000 previously determined for the mature native enzyme. It therefore appears to be full-length and to include a long presumptive plastid targeting peptide. Sequence comparisons with monoterpene, sesquiterpene and diterpene cyclases of plant origin indicate a significant degree of similarity between these enzymes; taxadiene synthetase most closely resembles (46% identity, 67% similarity) abietadiene synthetase, a diterpene cyclase from grand fir.
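The length figures above are internally consistent, which a short calculation can confirm. The following is only an illustrative sketch: the constants are the values quoted in this description, while the function names are hypothetical and introduced here for illustration.

```python
# Figures quoted in the description; nothing here is newly measured.
ORF_NUCLEOTIDES = 2586     # open reading frame of the cDNA
DEDUCED_MASS_DA = 98_303   # deduced molecular weight of the full-length polypeptide
NATIVE_MASS_DA = 79_000    # approximate mass reported for the mature native enzyme


def residues_from_orf(orf_nucleotides: int) -> int:
    """Return the number of codons (residues) in an ORF of the given length."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 nucleotides")
    return orf_nucleotides // 3


def mean_residue_mass(mass_da: float, residues: int) -> float:
    """Average mass per residue, a quick plausibility check on a deduced mass."""
    return mass_da / residues


if __name__ == "__main__":
    n = residues_from_orf(ORF_NUCLEOTIDES)
    print("residues:", n)
    print("mean residue mass (Da):", round(mean_residue_mass(DEDUCED_MASS_DA, n), 1))
    # Mass not accounted for by the mature enzyme, attributed in the text
    # to the presumptive plastid targeting peptide.
    print("mass difference (Da):", DEDUCED_MASS_DA - NATIVE_MASS_DA)
```

The mass difference is consistent with the description's reading that the deduced polypeptide carries a long targeting peptide; it is not a precise estimate of that peptide's length.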

Uses of the gene for taxadiene synthetase

Increase of taxol biosynthesis in transformed cells. The committed step of taxol (paclitaxel) biosynthesis is the initial cyclization of geranylgeranyl diphosphate, a ubiquitous isoprenoid intermediate, catalyzed by taxadiene synthetase, a diterpene cyclase. The product of this reaction is the parent olefin with a taxane skeleton, taxa-4(5),11(12)-diene. For a review of taxoids and taxoid biochemistry, see, for example, Kingston et al., "The Taxane Diterpenoids", Progress in the Chemistry of Organic Natural Products, vol. 61, Springer Verlag, New York, 1993, pp. 1-206.

The committed cyclization step of the pathway is a slow step in the extended biosynthetic sequence leading to taxol and related taxoids (Koepp et al., J. Biol. Chem. 270:8686-8690, 1995; Hezari et al., Arch. Biochem. Biophys. 322:437-444, 1995). The yield of taxol and related taxoids (for example, cephalomannine, baccatins and taxinines, among others) in cells of an organism capable of taxoid biosynthesis is increased by the expression, in such cells, of a recombinant gene for taxadiene synthetase.

This approach to increasing taxoid biosynthesis can be used in any organism that is capable of taxoid biosynthesis. Taxol synthesis is known to occur, for example, in the Taxaceae, including Taxus species from all over the world (including, but not limited to, T. brevifolia, T. baccata, T. x media, T. cuspidata, T. canadensis and T. chinensis), as well as in certain microorganisms. Taxol can also be produced by a fungus, Taxomyces andreanae (Stierle et al., Science 260:214, 1993).

The transformation of Taxus species mediated by Agrobacterium tumefaciens has been described, and the resulting callus cultures have been shown to produce taxol (Han et al., Plant Sci. 95:187-196, 1994).

Taxol can be isolated from cells transformed with the gene for taxadiene synthetase by conventional methods. The production of Taxus callus and suspension cultures has been described, as has the isolation of taxol and related compounds from such cultures (for example, in Fett-Neto et al., Bio/Technology 10:1572-1575, 1992).

Biosynthesis of taxoids in microorganisms. As described below, taxadiene synthetase activity was observed in transformed E. coli host cells expressing recombinant taxadiene synthetase. Taxadiene synthetase does not require extensive post-translational modification, such as that provided, for example, in mammalian cells, to be enzymatically functional. As a result, functional taxadiene synthetase can be expressed in a wide variety of host cells. Geranylgeranyl diphosphate, a substrate of taxadiene synthetase, is produced in a wide variety of organisms, including bacteria and yeasts that synthesize carotenoid pigments (e.g., Serratia spp. and Rhodotorula spp.). The introduction of vectors capable of expressing taxadiene synthetase into such microorganisms allows the production of large quantities of taxa-4(5),11(12)-diene and related compounds having the taxane backbone. The taxane backbone so produced is useful as a chemical feedstock. Simple taxoids, for example, would be useful as perfume fixatives.

Cloning of homologs of taxadiene synthetase and related genes. The availability of the gene for taxadiene synthetase of the Pacific yew makes it possible to clone homologs of taxadiene synthetase from other organisms capable of taxoid biosynthesis, particularly Taxus spp. Although the proportion of common taxoids varies with the species or cultivar of yew tested, apparently all Taxus species synthesize taxoids, including taxol, to some degree (see, for example, Mattina and Paiva, J. Environ. Hort. 10:187-191, 1992; Miller, J. Natural Products 43:425-437, 1980). Taxol can also be produced by a fungus, Taxomyces andreanae (Stierle et al., Science 260:214, 1993). A gene for taxadiene synthetase can be isolated from any organism capable of producing taxol or related taxoids by conventional methods, using primers or probes based on the Pacific yew taxadiene synthetase gene sequence or antibodies specific for taxadiene synthetase.

Modified forms of the gene for taxadiene synthetase and of the taxadiene synthetase polypeptide. Knowledge of the gene sequence for taxadiene synthetase allows the sequence to be modified, as described in more detail below, to produce variant forms of the gene and of its polypeptide product. For example, the plastid transit peptide can be removed and/or replaced by other transit peptides to allow the gene product to be directed to various intracellular compartments or exported from a host cell.
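At the sequence level, removing a transit peptide amounts to dropping the leading codons of the coding sequence. The sketch below illustrates that operation only; it is not the construct actually made. The function name and the toy sequence are hypothetical, and re-adding an ATG start codon to the truncated sequence is an assumption about how such a construct would be made expressible, not a step stated in this description. Elsewhere the description gives the transit peptide as about 137 residues.

```python
START_CODON = "ATG"


def truncate_transit_peptide(cds: str, transit_codons: int, add_start: bool = True) -> str:
    """Drop the first `transit_codons` codons of a coding sequence.

    If `add_start` is true and the remaining sequence does not already begin
    with ATG, a start codon is prepended (an assumption of this sketch).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 <= transit_codons < len(cds) // 3:
        raise ValueError("transit_codons must leave at least one codon")
    mature = cds[3 * transit_codons:]
    if add_start and not mature.startswith(START_CODON):
        mature = START_CODON + mature
    return mature


if __name__ == "__main__":
    # Toy sequence, NOT the taxadiene synthetase cDNA:
    # Met + 4 "transit" codons, then 3 "mature" codons and a stop codon.
    toy_cds = "ATG" + "GCT" * 4 + "TCTGATGAT" + "TAA"
    print(truncate_transit_peptide(toy_cds, transit_codons=5))
```

For the real sequence of Figure 2, `transit_codons` would be chosen at one of the truncation sites marked in that figure.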

DEFINITIONS AND METHODS

The following definitions and methods are provided to better define the present invention and to guide those skilled in the art in the practice of the present invention. Definitions of common terms in molecular biology can also be found in Rieger et al., Glossary of Genetics: Classical and Molecular, 5th edition, Springer-Verlag, New York, 1991; and Lewin, Genes V, Oxford University Press, New York, 1994.

The term "plant" includes any plant or progeny thereof. The term also covers parts of plants, including seeds, cuttings, tubers, fruits, flowers, etc. A "reproductive unit" of a plant is any totipotent part or tissue of the plant from which progeny of the plant can be obtained, including, for example, seeds, cuttings, buds, bulbs, somatic embryos, cultured cells (e.g., callus or suspension cultures), etc.

Nucleic acids

Nucleic acids (a term used interchangeably with "polynucleotides" herein) that are useful in the practice of the present invention include the isolated gene for taxadiene synthetase, its homologs in other plant species, and fragments and variants thereof. The term "gene for taxadiene synthetase" refers to a nucleic acid containing a taxa-4(5),11(12)-diene synthetase sequence, preferably a nucleic acid encoding a polypeptide having taxadiene synthetase enzymatic activity. This term refers primarily to the isolated full-length cDNA for taxadiene synthetase from the Pacific yew described above, and the corresponding genomic sequence (including flanking or internal sequences operably linked thereto, including regulatory elements and/or intron sequences). This term also encompasses alleles of the gene for taxadiene synthetase from the Pacific yew.

"Native". The term "native" refers to a nucleic acid or polypeptide that occurs naturally ("wild-type").

"Homolog". A "homolog" of the gene for taxadiene synthetase is a gene sequence encoding a taxadiene synthetase isolated from an organism other than the Pacific yew.

"Isolated". An "isolated" nucleic acid is a nucleic acid that has been substantially separated or purified from other nucleic acid sequences in the cell of the organism in which the nucleic acid occurs naturally, i.e., other chromosomal and extrachromosomal DNA and RNA, by conventional methods of nucleic acid purification. The term also encompasses recombinant nucleic acids and chemically synthesized nucleic acids.

Fragments, probes and primers. According to the present invention, a fragment of a taxadiene synthetase nucleic acid is a portion of the nucleic acid that is less than full length and comprises at least a minimum length capable of hybridizing specifically with the taxadiene synthetase nucleic acid of Figure 2 under stringent hybridization conditions. The length of such a fragment is preferably 15 to 17 nucleotides or more.

Nucleic acid probes and primers can be prepared based on the taxadiene synthetase gene sequence provided in Figure 2. A "probe" is an isolated DNA or RNA attached to a detectable label or reporter molecule, e.g., a radioactive isotope, ligand, chemiluminescent agent, or enzyme. "Primers" are isolated nucleic acids, generally DNA oligonucleotides 15 nucleotides or more in length, that are annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a polymerase, e.g., a DNA polymerase. Primer pairs can be used to amplify a nucleic acid sequence, for example by the polymerase chain reaction (PCR) or other conventional nucleic acid amplification methods.

Methods for preparing and using probes and primers are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., vols. 1-3, ed. Sambrook et al., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; Current Protocols in Molecular Biology, ed. Ausubel et al., Greene Publishing and Wiley-Interscience, New York, 1987 (with periodic updates); and Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, San Diego, 1990. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose, such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge, MA).
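To make the length criterion concrete, the sketch below enumerates every window of a given length (15 nucleotides by default, the minimum named above) in a sequence and computes the reverse complement needed for a primer that anneals to the opposite strand. It is a minimal illustration with hypothetical function names; it is not the Primer program cited above and applies none of the melting-temperature or specificity checks that real primer design requires.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_oligos(seq: str, length: int = 15) -> list[tuple[int, str]]:
    """List (0-based start, subsequence) for every window of `length` nucleotides."""
    if length <= 0:
        raise ValueError("length must be positive")
    seq = seq.upper()
    return [(i, seq[i:i + length]) for i in range(len(seq) - length + 1)]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases, a common first filter when picking oligos."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    toy = "ACGT" * 5  # toy 20-nt sequence, not taken from Figure 2
    for start, oligo in candidate_oligos(toy):
        print(start, oligo, reverse_complement(oligo), gc_fraction(oligo))
```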

Nucleotide sequence similarity. Nucleotide sequence "similarity" is a measure of the degree to which two polynucleotide sequences have identical nucleotide bases at corresponding positions in their sequence when optimally aligned (with appropriate nucleotide insertions or deletions). Sequence similarity can be determined using sequence analysis software such as the Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, Madison, WI. Preferably, a variant form of a taxadiene synthetase polynucleotide has at least 70%, more preferably at least 80%, and most preferably at least 90% nucleotide sequence similarity with a native gene for taxadiene synthetase, particularly with a native Pacific yew taxadiene synthetase, as provided in Figure 2.
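The definition above can be illustrated with a short calculation on two sequences that have already been aligned, with "-" marking an insertion or deletion. This is a sketch under stated assumptions, not the method of the software package named above: it does not perform the alignment, and dividing by the number of alignment columns is one convention among several. The function names are hypothetical.

```python
GAP = "-"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must be pre-aligned and therefore of equal length; columns
    containing a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != GAP)
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """Check a pair against one of the 70 / 80 / 90 percent levels in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    a, b = "ATGC-A", "ATGCTA"  # toy alignment with a single gap column
    print(round(percent_identity(a, b), 1), meets_threshold(a, b, 80.0))
```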

Operably linked. A first nucleic acid sequence is "operably linked" to a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For example, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein coding regions, in reading frame.

"Recombinant". A "recombinant" nucleic acid is a nucleic acid obtained by an artificial combination of two otherwise separated segments of sequence, for example, by chemical synthesis or by the manipulation of isolated segments of nucleic acids by genetic engineering techniques. Techniques for nucleic acid manipulation are generally described, for example, in Sambrook et al. (1989) and Ausubel et al. (1987, with periodic updates). Methods for chemical synthesis of nucleic acids are described, for example, in Beaucage and Caruthers, Tetra. Letts. 22:1859-1862, 1981, and Matteucci et al., J. Am. Chem. Soc. 103:3185, 1981. Chemical synthesis of nucleic acids can be carried out, for example, on commercial automated oligonucleotide synthesizers.

Preparation of recombinant or chemically synthesized nucleic acids; vectors, transformation and host cells. In accordance with the present invention, natural or synthetic nucleic acids can be incorporated into recombinant nucleic acid constructs, typically DNA constructs, capable of being introduced into and replicated in a host cell. Such a construct is preferably a vector that includes a replication system and sequences capable of transcribing and translating a polypeptide-encoding sequence in a given host cell. For the practice of the present invention, conventional compositions and methods are used to prepare and use vectors and host cells, as described, inter alia, in Sambrook et al., 1989, or Ausubel et al., 1987.

A "transformed" or "transgenic" cell, tissue, organ, or organism is one into which a foreign nucleic acid has been introduced. A "transgenic" or "transformed" cell or organism also includes (1) progeny of the cell or organism, and (2) progeny produced from a breeding program that uses a "transgenic" plant as a parent in a cross and that exhibit an altered phenotype resulting from the presence of the "transgene", i.e., the recombinant taxadiene synthetase nucleic acid.

Nucleic acid hybridization; "stringent conditions"; "specific". The nucleic acid probes and primers of the present invention hybridize under stringent conditions with a target DNA sequence, for example, with the gene for taxadiene synthetase. The term "stringent conditions" is defined functionally with respect to the hybridization of a nucleic acid probe to a target nucleic acid (i.e., to a particular nucleic acid sequence of interest) by the hybridization procedure described in Sambrook et al., 1989, at 9.52-9.55. See also Sambrook et al., 1989, at 9.47-9.52 and 9.56-9.58; Kanehisa, Nucl. Acids Res. 12:203-213, 1984; and Wetmur and Davidson, J. Mol. Biol. 31:349-370, 1968.

Regarding the amplification of a target nucleic acid sequence (e.g., by PCR) using a particular pair of amplification primers, stringent conditions are conditions that permit the primer pair to hybridize only with the target nucleic acid sequence to which a primer having the corresponding wild-type sequence (or its complement) would bind, and preferably to produce a single amplification product. The term "specific for (a target sequence)" indicates that a probe or primer hybridizes under stringent conditions only to the target sequence in a sample comprising that sequence.

Nucleic acid amplification. As used herein, "amplified DNA" refers to the product of nucleic acid amplification of a target nucleic acid sequence. Nucleic acid amplification can be accomplished by any of the nucleic acid amplification methods known in the art, including the polymerase chain reaction (PCR). Various amplification methods are known in the art and are described, inter alia, in U.S. Patent Nos. 4,683,195 and 4,683,202, and in PCR Protocols: A Guide to Methods and Applications, Innis et al., eds., Academic Press, San Diego, 1990.

Methods for obtaining cDNA clones encoding taxadiene synthetase or homologs thereof. Given the availability of the cDNA for taxadiene synthetase described herein, other genes for taxadiene synthetase (e.g., alleles and homologs of taxadiene synthetase) can readily be obtained from a wide variety of plants by cloning methods known in the art. One or more primer pairs based on the taxadiene synthetase sequence can be used to amplify such taxadiene synthetase genes or their homologs by the polymerase chain reaction (PCR) or other conventional amplification methods. Alternatively, the described cDNA for taxadiene synthetase, or fragments thereof, can be used to probe a cDNA or genomic library obtained from a given plant species by conventional methods.

Cloning of the genomic sequence of taxadiene synthetase and homologs thereof. The availability of the cDNA sequence for taxadiene synthetase allows those skilled in the art to obtain a genomic clone corresponding to the cDNA for taxadiene synthetase (including the promoter and other regulatory regions and intron sequences) and to determine its nucleotide sequence by conventional methods. Virtually all Taxus species synthesize taxoids, including taxol, to some degree (see, for example, Mattina and Paiva, J. Environ. Hort. 10:187-191, 1992; Miller, J. Natural Products 43:425-437, 1980). Any organism that produces taxoids would be expected to express a homolog of taxadiene synthetase. Genes for taxadiene synthetase can be obtained by hybridization of a Pacific yew taxadiene synthetase probe with a cDNA or genomic library of a target species. Such a homolog can also be obtained by PCR or another amplification method from genomic DNA or RNA of a target species, using primers based on the taxadiene synthetase sequence shown in Figure 2. Genomic and cDNA libraries can be prepared from yew or other plant species by conventional methods.

Primers and probes based on the sequence shown in Figure 2 can be used to confirm (and, if necessary, to correct) the taxadiene synthetase sequence by conventional methods.

Sequence variants of the cDNA for taxadiene synthetase, and amino acid sequence variants of the taxadiene synthetase protein. Using the nucleotide and amino acid sequence of the taxadiene synthetase protein described herein, those skilled in the art can create DNA molecules and polypeptides having minor variations in their nucleotide or amino acid sequence.

"Variant" DNA molecules are DNA molecules that contain minor changes in the native taxadiene synthetase sequence, that is, changes in which one or more nucleotides of a native taxadiene synthetase sequence are deleted, added, or substituted, preferably while substantially maintaining taxadiene synthetase activity. Variant DNA molecules can be produced, for example, by standard DNA mutagenesis techniques or by chemically synthesizing the variant DNA molecule or a portion thereof. Preferably, such variants do not change the reading frame of the protein-coding region of the nucleic acid and preferably encode a protein having no change, only a minor reduction, or an increase in the biological function of taxadiene synthetase.

Amino acid substitutions are preferably substitutions of single amino acid residues. DNA insertions are preferably of about 1 to 10 contiguous nucleotides, and deletions are preferably of about 1 to 30 contiguous nucleotides. Insertions and deletions are preferably made at an end of the protein-coding or non-coding sequence and are preferably made in adjacent base pairs. Substitutions, deletions, insertions, or any combination thereof can be combined to arrive at a final construct.

Preferably, the variant nucleic acids according to the present invention are "silent" or "conservative" variants. "Silent" variants are variants of a native taxadiene synthetase sequence or a homolog thereof in which there has been a substitution of one or more base pairs but no change in the amino acid sequence of the polypeptide encoded by the sequence. "Conservative" variants are variants of the native taxadiene synthetase sequence, or a homolog thereof, in which at least one codon in the protein-coding region of the gene has been changed, resulting in a conservative change in one or more amino acid residues of the polypeptide encoded by the nucleic acid sequence, i.e., an amino acid substitution. The following is a list of several conservative amino acid substitutions. In addition, one or more codons encoding cysteine residues can be substituted, resulting in the loss of a cysteine residue and affecting the disulfide bonds in the taxadiene synthetase polypeptide.
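The distinction between silent and conservative variants can be sketched as a small classifier. The substitution table in the code is transcribed from the list of conservative substitutions given in this description, written with one-letter amino acid codes, and the codon table is the standard genetic code. The function names and toy sequences are hypothetical, and the sketch handles substitution variants of equal length only.

```python
# Standard genetic code, built from the conventional TCAG-ordered string.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Conservative substitutions as listed in the description (one-letter codes).
CONSERVATIVE = {
    "A": "S", "R": "K", "N": "QH", "D": "E", "C": "S", "Q": "N", "E": "D",
    "G": "P", "H": "NQ", "I": "LV", "L": "IV", "K": "RQE", "M": "LI",
    "F": "MLY", "S": "T", "T": "S", "W": "Y", "Y": "WF", "V": "IL",
}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) codon by codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def classify_variant(native_cds: str, variant_cds: str) -> str:
    """Return 'identical', 'silent', 'conservative' or 'non-conservative'."""
    if len(native_cds) != len(variant_cds):
        raise ValueError("this sketch handles substitution variants only")
    if native_cds.upper() == variant_cds.upper():
        return "identical"
    native_aa, variant_aa = translate(native_cds), translate(variant_cds)
    if native_aa == variant_aa:
        return "silent"
    changes = [(n, v) for n, v in zip(native_aa, variant_aa) if n != v]
    if all(v in CONSERVATIVE.get(n, "") for n, v in changes):
        return "conservative"
    return "non-conservative"


if __name__ == "__main__":
    native = "ATGGCTGAT"  # toy sequence encoding Met-Ala-Asp
    for variant in ("ATGGCCGAT", "ATGTCTGAT", "ATGGCTGGT"):
        print(variant, classify_variant(native, variant))
```

Whether a given conservative variant actually retains enzyme activity is an experimental question that this classification does not answer.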

Original waste Substitutions conservati as Ala Ser Arg Lys Asn 1 n, His Asp Glu Cys Ser Gln Asn Glu Asp Gly Pro His Asn; Gl lie Leu; Val Leu lie; Val Lys Arg; Gln; Glu Met Leu; lie Phe Met; Leu; Tyr Ser Thr Thr Ser Trp Tyr Tyr Trp; Phe Val lie; Leu Substantial changes are made in function by selecting substitutions that are less conservative than those previously included "for example" causing changes in: (a) the structure of the base structure of the polypeptide in the area of the substitution; (b) the load or hydrophobic character of the polypeptide at the target site; or (c) the volume of a side chain of amino acids. Substitions that are generally expected to produce the largest changes in the properties of proteins are those in which: (a) a hydrophilic residue, eg, seryl or threonyl, "is replaced by (or by) a hydrophobic residue. »For example» leucyl »isoleucyl» feni lalani lo, valyl or alanyl; (b) a cysteine or proline is substituted by (or by) any other residue; (c) a residue having an electropositive side chain "eg, lysyl, arginyl" or histadyl "is replaced by (or by) an electronegative residue" eg "glutamyl or aspartyl; or (d) a residue having a bulky side chain, for example, phenylalanine, is replaced by (or by) one that does not have a side chain, for example, glycine. The gene sequence for taxadiene synthetase can be modified as follows: (1) To improve the efficiency of expression and redirect the orientation of the expressed polypeptide: For expression in non-host plants (or to direct the expressed polypeptide into a different intracellular compartment in a host plant), the sequence of the native gene can be truncated from the end 5r to remove the plasmid transit peptide coding sequence of about 137 amino acids (ie up to about 13BS), leaving the sequence coding for the mature taxadiene synthetase polypeptide of about 725 amino acids. 
In addition, one or more codons can be changed, for example, to conform the gene to the predilection for use of the codon of the host cell to improve expression. Enzymatic stability can be altered by stirring or adding one or more cysteine residues, stirring to thereby adding one or more disulfide bonds. (2) To alter catalytic efficiency: As described below, the aspartate-rich fraction that plays a role in substrate binding is also present in the taxadiene synthetase, since it is a related DXXDD motif (Figure 2). The histidine and cysteine residues have been implicated as the active sites of several terpenoid cyclase of plant origin. The histidine residues 370, 415 and 793, and the steins at residues 329, 650 and 777 of the taxadiene synthetase, are conserved among the genes for terpenoid plant cyclase. It can induce mutagenesis or alter kinetics 1 enzymatic of one or more conserved histidine and cistern residues (as described below), or of semi-conserved residues "such as conserved cysteine residues (eg, residues 329» 650 »719 and 777) and histidine residues (eg, .example »waste 370» 415, 579 and 793). In addition, residues adjacent to these conserved histidine and cysteine residues can also be altered to increase the content of cysteine or histidine to improve charge stabilization. By increasing the aspartate content of the DDXXD and DXXDD motifs (where D is aspartate and X is any amino acid), which are probably involved in the binding of the substrate / intermediate compound, it is also possible to increase the enzymatic velocity (ie » the limiting ionization step of the enzymatic speed of the enzymatic reaction). Arginines have been implicated in binding or catalysis "and conserved arginine residues are also good targets for mutagenesis. 
Changing the conserved DDXXD and/or DXXDD motifs (for example, their aspartate residues) by conventional site-directed mutagenesis so that they match those of other known enzymes may also lead to changes in the kinetics or substrate specificity of taxadiene synthetase. Additionally, product formation can be altered by mutagenesis of the RWWK element (residues 564 to 567), which includes aromatic residues that may play a role in stabilizing the carbocationic reaction intermediates. (3) To modify substrate utilization: The enzyme, particularly the active site, can be modified to allow it to bind chains that are shorter (e.g., C10) or longer (e.g., C25) than geranylgeranyl diphosphate. Substrate size utilization can be altered by increasing or decreasing the size of the hydrophobic patches to modify the size of the hydrophobic pocket of the enzyme. Similar effects can be achieved by domain swapping. (4) To change product outcome: Directed mutagenesis of conserved aspartate and arginine residues can be used to allow the enzyme to produce different diterpene skeletons with, for example, one, two or three rings. See, e.g., Cane et al., Biochemistry 34: 2480-2488, 1995; Joly and Edwards, J. Biol. Chem. 268: 26983-26989, 1993; Marrero et al., J. Biol. Chem. 267: 533-536, 1992; and Song and Poulter, Proc. Natl. Acad. Sci. USA 91: 3044-3048, 1994.
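
As an illustration of how the DDXXD and DXXDD motifs discussed above might be located in a protein sequence before planning mutagenesis, the following is a minimal sketch. The function name and the toy sequence are hypothetical; the toy sequence is not the taxadiene synthetase sequence of Figure 2, and the sketch makes no claim about how the motifs were originally identified.

```python
import re

# Aspartate-rich motifs named in the text; X (any amino acid) becomes "." in the regex.
# The lookahead form (?=(...)) reports overlapping occurrences as well.
MOTIFS = {
    "DDXXD": re.compile(r"(?=(DD..D))"),
    "DXXDD": re.compile(r"(?=(D..DD))"),
}

def find_motifs(protein):
    """Return {motif name: [1-based start positions]} for a one-letter protein string."""
    seq = protein.upper()
    return {
        name: [match.start() + 1 for match in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }

# Hypothetical toy sequence used only to exercise the function.
toy = "MAKDDLVDGSRWWKLDTADDQ"
hits = find_motifs(toy)
print(hits)
```

Positions are reported 1-based so that they can be compared directly with residue numbers of the kind quoted in the text.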

Expression of taxadiene synthetase nucleic acids in host cells. DNA constructs according to the present invention that incorporate a taxadiene synthetase gene or fragment thereof preferably place the taxadiene synthetase protein coding sequence under the control of an operably linked promoter that is capable of expression in a host cell. Various promoters suitable for the expression of heterologous genes in plant cells are known in the art, including constitutive promoters, e.g., the cauliflower mosaic virus (CaMV) 35S promoter, which is expressed in many plant tissues; organ- or tissue-specific promoters; and promoters that are inducible by chemicals such as methyl jasmonate, salicylic acid, or safeners. A variety of other promoters and other sequences useful for constructing expression vectors are available for expression in bacterial, yeast, mammalian, insect, amphibian, avian and other host cells.

Nucleic acids bound to a solid support. The nucleic acids of the present invention may be in solution or may be attached by conventional means to a solid support such as a hybridization membrane (e.g., nitrocellulose or nylon), a bead, or another solid support known in the art.

Polypeptides. The term "taxadiene synthetase protein (or polypeptide)" refers to a protein encoded by a taxadiene synthetase gene, including, for example, alleles, homologs and variants thereof. A taxadiene synthetase polypeptide can be produced by the expression of a recombinant taxadiene synthetase nucleic acid, or it can be chemically synthesized. Techniques for the chemical synthesis of polypeptides are described, for example, in Merrifield, J. Amer. Chem. Soc. 85: 2149-2156, 1963.

Identity and similarity of the polypeptide sequence. Ordinarily, the taxadiene synthetase polypeptides encompassed by the present invention have at least about 70% amino acid sequence "identity" (or homology) compared with a native taxadiene synthetase polypeptide, preferably at least about 80% identity, and more preferably at least about 90% identity with a native taxadiene synthetase polypeptide. Preferably, such polypeptides also possess the structural features and biological activity characteristic of a native taxadiene synthetase polypeptide. Amino acid sequence "similarity" is a measure of the degree to which aligned amino acid sequences have identical amino acids or conservative amino acid substitutions at corresponding positions. The "biological activity" of a taxadiene synthetase includes taxadiene synthetase enzyme activity as determined by conventional protocols (for example, the protocol described in Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995, incorporated herein by reference). Other biological activities of taxadiene synthetase include, but are not limited to, substrate binding, immunological activity (including the ability to elicit the production of antibodies that are specific for taxadiene synthetase), etc. Polypeptide identity (homology) or similarity is typically analyzed using sequence analysis software, such as the Sequence Analysis Software Package of the Genetics Computer Group (University of Wisconsin Biotechnology Center, Madison, WI). Polypeptide sequence analysis programs match polypeptide sequences using measures of identity assigned to various substitutions, deletions and other modifications.
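
The distinction between identity and similarity can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: it takes two sequences that have already been aligned (gaps written as "-"), counts only columns without gaps, and treats a pair as "similar" when it is identical or appears in the table of conservative substitutions given earlier in this description. The function name and the toy alignment are hypothetical, and sequence analysis packages such as the one cited above use scoring matrices and their own conventions, so their percentages may differ.

```python
# Conservative substitutions as listed in this description, in one-letter codes.
CONSERVATIVE = {
    "A": "S", "R": "K", "N": "QH", "D": "E", "C": "S", "Q": "N", "E": "D",
    "G": "P", "H": "NQ", "I": "LV", "L": "IV", "K": "RQE", "M": "LI",
    "F": "MLY", "S": "T", "T": "S", "W": "Y", "Y": "WF", "V": "IL",
}

def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # columns containing a gap are skipped (an assumption of this sketch)
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif b in CONSERVATIVE.get(a, "") or a in CONSERVATIVE.get(b, ""):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared

# Hypothetical toy alignment, not taken from any sequence in this document.
ident, simil = identity_and_similarity("MKT-LLVA", "MRTAIL-S")
print(ident, simil)
```

Because every identical pair also counts as similar, the similarity figure is always at least as large as the identity figure, which matches the pattern of the pairwise values reported in Example 1.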

"Isolated", "purified" and "homogeneous" polypeptides. A polypeptide is "isolated" if it has been separated from the cellular components (nucleic acids, lipids, carbohydrates and other polypeptides) that naturally accompany it. Such a polypeptide can also be referred to as "pure" or "homogeneous" or "substantially" pure or homogeneous. Thus, a polypeptide that is chemically synthesized or that is recombinant (i.e., the product of the expression of a recombinant nucleic acid, even if expressed in a homologous cell type) is considered isolated. A monomeric polypeptide is isolated when at least 60 to 90% by weight of a sample is composed of the polypeptide, preferably 95% or more, and more preferably more than 99%. Protein purity or homogeneity is indicated, for example, by polyacrylamide gel electrophoresis of a protein sample followed by visualization of a single polypeptide band upon staining the polyacrylamide gel; by high-pressure liquid chromatography; or by other conventional methods.

Purification of the protein. The polypeptides of the present invention can be purified by any of the methods known in the art. Various methods of protein purification are described, for example, in Guide to Protein Purification, ed. Deutscher, Meth. Enzymol. 185, Academic Press, San Diego, 1990; and Scopes, Protein Purification: Principles and Practice, Springer Verlag, New York, 1982.

Variant forms of taxadiene synthetase polypeptides; labeling. Encompassed by the taxadiene synthetase polypeptides according to one embodiment of the present invention are variant polypeptides in which there have been substitutions, deletions, insertions or other modifications of a native taxadiene synthetase polypeptide. The variants substantially retain structural and/or biological characteristics and preferably involve silent or conservative substitutions of one amino acid residue or a small number of contiguous amino acid residues. Preferably, such variant polypeptides are at least 70%, more preferably at least 80%, and most preferably at least 90% homologous to a native taxadiene synthetase polypeptide. The native taxadiene synthetase polypeptide sequence can be modified by conventional methods, for example, by acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination and labeling, whether accomplished by in vivo or in vitro enzymatic treatment of a taxadiene synthetase polypeptide or by synthesis of such a polypeptide using modified amino acids. There are a variety of conventional methods and reagents for labeling polypeptides and fragments thereof. Typical labels include radioactive isotopes, ligands or ligand receptors, fluorophores, chemiluminescent agents, and enzymes. Methods for labeling, and guidance in the choice of labels appropriate for various purposes, are described, for example, in Sambrook et al. (1989) and Ausubel et al. (1987, with periodic updates).

Fragments of polypeptides. The present invention also encompasses fragments of taxadiene synthetase polypeptides that lack at least one residue of a full-length native taxadiene synthetase polypeptide while retaining at least one of the characteristic biological activities of taxadiene synthetase, for example, taxadiene synthetase enzymatic activity or possession of a characteristic immunological determinant. As an additional example, an immunologically active fragment of a taxadiene synthetase polypeptide is capable of eliciting antibodies specific for taxadiene synthetase in a target immune system (e.g., murine or rabbit), or of competing with taxadiene synthetase for binding to antibodies specific for taxadiene synthetase, and is thus useful in immunoassays for detecting the presence of taxadiene synthetase polypeptides in a biological sample. Such immunologically active fragments typically have a minimum size of 7 to 17 amino acids. The fragments preferably comprise at least 10, more preferably at least 20, and most preferably at least 30 consecutive amino acids of a native taxadiene synthetase polypeptide.

Fusion polypeptides. The present invention also provides fusion polypeptides including, for example, heterologous fusion polypeptides, i.e., a taxadiene synthetase polypeptide sequence or fragment thereof fused to a heterologous polypeptide sequence, e.g., a sequence from a different polypeptide. Such heterologous fusion polypeptides thus exhibit biological properties (such as ligand binding, catalysis, secretion signals, antigenic determinants, etc.) derived from each of the fused sequences. Fusion partners include, for example, immunoglobulins, beta-galactosidase, trpE, protein A, beta-lactamase, alpha-amylase, alcohol dehydrogenase, yeast alpha mating factor, and various signal and leader sequences which, for example, can direct the secretion of the polypeptide. Fusion polypeptides are typically made by the expression of recombinant nucleic acids or by chemical synthesis.

Determination of the polypeptide sequence. The sequence of a polypeptide of the present invention can be determined by various methods known in the art. To determine the sequence of a polypeptide, the polypeptide is typically fragmented, the fragments are separated, and the sequence of each fragment is determined. To obtain fragments of a taxadiene synthetase polypeptide, the polypeptide can be digested with an enzyme such as trypsin, clostripain, or Staphylococcus protease, or with chemical agents such as cyanogen bromide, o-iodosobenzoate, hydroxylamine or 2-nitro-5-thiocyanobenzoate. Peptide fragments can be separated, for example, by reversed-phase high-performance liquid chromatography (HPLC) and analyzed by gas-phase sequencing.
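
The fragmentation step can be previewed computationally. The sketch below predicts fragments for two of the agents named above using commonly cited simplified rules, which are assumptions of this example rather than statements from this description: trypsin is taken to cleave after lysine (K) or arginine (R) unless the next residue is proline (P), and cyanogen bromide is taken to cleave after methionine (M). The function name and the toy sequence are hypothetical.

```python
def cleave_after(protein, residues, blocked_next=""):
    """Split a one-letter protein string after each residue in `residues`.

    No cut is made when the following residue is in `blocked_next`,
    and no cut is made after the final residue.
    """
    seq = protein.upper()
    fragments = []
    start = 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in residues and nxt and nxt not in blocked_next:
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments

# Hypothetical toy sequence, not derived from taxadiene synthetase.
toy = "MASKRPLLMGRWTK"
tryptic = cleave_after(toy, "KR", blocked_next="P")  # simplified trypsin rule
cnbr = cleave_after(toy, "M")                        # simplified cyanogen bromide rule
print(tryptic)
print(cnbr)
```

Real digests are rarely complete, so predicted fragment lists of this kind are best read as an upper bound on what separation by HPLC might resolve.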

Antibodies. The present invention also encompasses polyclonal and/or monoclonal antibodies that are specific for taxadiene synthetase, i.e., that bind to taxadiene synthetase and are capable of distinguishing the taxadiene synthetase polypeptide from other polypeptides under standard conditions. Such antibodies are produced and tested by conventional methods. For the preparation and use of antibodies according to the present invention, including various immunoassay techniques and applications, see, for example, Goding, Monoclonal Antibodies: Principles and Practice, 2nd ed., Academic Press, New York, 1986; and Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1988. Antibodies specific for taxadiene synthetase are useful, for example, in: purification of taxadiene synthetase polypeptides; cloning of taxadiene synthetase homologs from the Pacific yew or other plant species from an expression library; antibody probes for protein blots and immunoassays; etc. Taxadiene synthetase polypeptides and antibodies can be labeled by conventional techniques. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent agents, chemiluminescent agents, magnetic particles, etc.

Transformation and regeneration of plants. In the practice of the present invention, any well-known method can be used for the transformation, culture and regeneration of plant cells. Methods for introducing foreign DNA into plant cells include, but are not limited to: transfer involving the use of Agrobacterium tumefaciens and appropriate Ti vectors, including binary vectors; chemically induced transfer (for example, with polyethylene glycol); biolistics; and microinjection. See, for example, An et al., Plant Molecular Biology Manual A3: 1-19, 1988.

The invention will be better understood by reference to the following examples, which are intended merely to illustrate the best mode now known for practicing the invention. The scope of the invention is not to be considered limited thereto, however.

EXAMPLE 1

Cloning and sequencing of a cDNA encoding taxa-4(5),11(12)-diene synthetase

MATERIALS AND METHODS

Plants, substrates and standards. Four-year-old T. brevifolia saplings in active growth were maintained in a greenhouse. [1-3H]Geranylgeranyl diphosphate (120 Ci/mol) was prepared as described previously (LaFever et al., Arch. Biochem. Biophys. 313: 139-149, 1994), and authentic (±)-taxa-4(5),11(12)-diene was prepared by total synthesis (Rubenstein, J. Org. Chem. 60: 7215-7223, 1995).

Library construction. Total RNA was extracted from T. brevifolia stem using the procedures of Lewinsohn and associates (Lewinsohn et al., Plant Mol. Biol. Rep. 12: 20-25, 1991) developed for woody gymnosperm tissue. Poly(A)+ RNA was purified by chromatography on oligo(dT)-cellulose (Pharmacia), and 5 μg of the resulting mRNA was used to construct a lambda ZAP II cDNA library according to the manufacturer's instructions (Stratagene).

PCR-based probe generation and library screening. Comparison of six available sequences for monoterpene, sesquiterpene and diterpene cyclases from higher plants (Facchini and Chappell, Proc. Natl. Acad. Sci. USA 89: 11088-11092, 1992; Colby et al., J. Biol. Chem. 268: 23016-23024, 1993; Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994; Back and Chappell, J. Biol. Chem. 270: 7375-7381, 1995; Sun and Kamiya, Plant Cell 6: 1509-1518, 1994; and Bensen et al., Plant Cell 7: 75-84, 1995) allowed the definition of eleven homologous regions for which degenerate consensus primers were synthesized. All twenty primers (the most carboxy-terminal primer, the most amino-terminal primer, and nine internal primers in both directions) were deployed in all possible combinations under a broad range of amplification conditions, using CsCl-purified T. brevifolia stem library phage DNA as template (Innis and Gelfand, in PCR Protocols (Innis et al., eds.), pp. 3-12, 253-258, Academic Press, San Diego, CA, 1990; Sambrook et al., 1989). Analysis of the PCR products by gel electrophoresis (Sambrook et al., 1989) indicated that only the combination of primers CC7.2F and CC3R (see Fig. 2) generated a specific DNA fragment (~80 bp). This DNA fragment was cloned into pT7Blue (Novagen) and sequenced (DyeDeoxy Terminator Cycle Sequencing, Applied Biosystems), and was shown to be 83 base pairs in length. PCR was used to prepare approximately 1 μg of this material for random hexamer labeling with [α-32P]dATP (Tabor et al., in Current Protocols in Molecular Biology, Ausubel et al., Sections 3.5.9-3.5.10, 1987), and the labeled fragment was used as a hybridization probe to screen filter lifts of 3 × 10^5 plaques grown in E. coli LE392 using standard protocols (Britten and Davidson, in Nucleic Acid Hybridization, Hames and Higgins, eds., pp. 3-14, IRL Press, Oxford, 1988).
Of the plaques yielding positive signals (102 in total), 50 were purified through two additional rounds of hybridization. Thirty-eight pure clones were excised in vivo as Bluescript phagemids. Insert size was determined by PCR using T3 and T7 promoter primers, and the 12 largest clones (insert > 2 kb) were partially sequenced.
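
Degenerate consensus primers of the kind described above contain ambiguity codes that each stand for several bases, so a single primer string represents a pool of concrete oligonucleotides. The following minimal sketch computes the size of that pool and expands it, using the standard IUPAC nucleotide ambiguity codes. The example primer is hypothetical and is not the sequence of CC7.2F, CC3R or any other primer used in this work.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical degenerate primer used only for illustration.
example = "GAYGAYTGGN"
pool_size = degeneracy(example)
pool = expand(example)
print(pool_size, pool[0])
```

Higher degeneracy dilutes each individual sequence in the primer pool, which is one reason a wide range of amplification conditions may need to be tried.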

Expression of the cDNA in E. coli. All of the partially sequenced full-length inserts were either out of frame or bore premature stop sites immediately upstream of the presumptive methionine start codon. The latter complication probably resulted from hairpin-primed second-strand cDNA synthesis (Old and Primrose, Principles of Gene Manipulation, 4th ed., pp. 34-35, Blackwell Scientific, London, 1989). The 2.7 kb insert of pTb42 was cloned into frame by PCR using the thermostable, high-fidelity, blunt-end-generating Pfu polymerase (Stratagene), together with the FRM42 primer (downstream of the false stop codons) and the T7 promoter primer. The resulting blunt-ended fragment was ligated into EcoRV-digested pBluescript SK(-) (Stratagene), yielding pTb42.1, and transformed into E. coli XL1-Blue (Stratagene). To evaluate the functional expression of terpene cyclase activity, E. coli XL1-Blue cells harboring pTb42.1 were grown (to A600 = 0.4) in 5 ml of LB medium supplemented with 100 μg/ml ampicillin and 12.5 μg/ml tetracycline before induction with 200 μM IPTG and subsequent growth for 4 hours at 25 °C. Bacteria were harvested by centrifugation (1800 g, 10 min), resuspended in taxadiene synthetase assay buffer (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995) and disrupted by brief sonication at 0-4 °C, and the resulting suspension was centrifuged (18,000 g, 10 min) to pellet debris. The supernatant was assayed for taxadiene synthetase activity by an established protocol (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995) in the presence of 15 μM [1-3H]geranylgeranyl diphosphate and 1 mM MgCl2, with incubation at 31 °C for 4 hours.
The reaction products were extracted with pentane, and the extract was purified by column chromatography on silica gel as described previously (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995) to yield the olefin fraction, an aliquot of which was counted by liquid scintillation spectrometry to determine the incorporation of 3H. Control experiments were also carried out with transformed E. coli bearing the plasmid with out-of-frame inserts. The identity of the olefin product of the recombinant enzyme was verified by capillary radio-gas chromatography ("capillary radio-GC") (Croteau and Satterwhite, J. Chromatogr. 500: 349-354, 1990), as well as by capillary gas chromatography/mass spectrometry ("capillary GC-MS") using previously described methods (Koepp et al., J. Biol. Chem. 270: 8686-8690, 1995) and authentic taxa-4(5),11(12)-diene (Rubenstein, J. Org. Chem. 60: 7215-7223, 1995). For GC-MS analysis (Hewlett-Packard 6890 GC-MSD), selected diagnostic ions were monitored: m/z 272 [P+]; 257 [P+ - 15 (CH3)]; 229 [P+ - 43 (C3H7)]; 121, 122, 123 [C-ring fragment cluster]; and 107 [m/z 122 base peak - 15 (CH3)]. The origin of the highly characteristic C-ring double cleavage fragment ion [base peak, m/z 122 (C9H14)] has been described (Koepp et al., J. Biol. Chem. 270: 8686-8690, 1995).

RESULTS AND DISCUSSION

Isolation and characterization of the cDNA. In its general properties (molecular weight, divalent metal ion requirement, kinetic constants, etc.), taxadiene synthetase resembles other terpenoid cyclases from higher plants; however, the low tissue titers of the enzyme and its instability under a broad range of fractionation conditions prevented purification of the protein to homogeneity (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995). A 10 μg sample of the electrophoretically purified cyclase, prepared by standard analytical procedures (Schagger and von Jagow, Anal. Biochem. 166: 368-379, 1987; Towbin et al., Proc. Natl. Acad. Sci. USA 76: 4350-4354, 1979), failed to yield an amino-terminal sequence by Edman degradation. Repeated attempts at trypsin digestion and CNBr cleavage of comparable protein samples also failed to provide sequenceable peptides, due in large part to very low recoveries. As an alternative to screening the cDNA library with protein-based oligonucleotide probes, a PCR-based strategy was developed that relied on a series of degenerate PCR primers designed to recognize highly conserved regions of six higher plant terpene cyclases of known nucleotide sequence. Three of these cyclases, (-)-limonene synthetase (a monoterpene cyclase from spearmint) (Colby et al., J. Biol. Chem. 268: 23016-23024, 1993), epi-aristolochene synthetase (a sesquiterpene cyclase from tobacco) (Facchini and Chappell, Proc. Natl. Acad. Sci. USA 89: 11088-11092, 1992; Back and Chappell, J. Biol. Chem. 270: 7375-7381, 1995) and casbene synthetase (a diterpene cyclase from castor bean) (Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994), use reaction mechanisms similar to that of taxadiene synthetase in the cyclization of their respective geranyl (C10), farnesyl (C15) and geranylgeranyl (C20) diphosphate substrates (Lin et al., Biochemistry, in press). Kaurene synthetase A from Arabidopsis thaliana (Sun and Kamiya, Plant Cell 6: 1509-1518, 1994) and from maize (Bensen et al., Plant Cell 7: 75-84, 1995), and (-)-abietadiene synthetase from grand fir (Abies grandis; Stofer Vogel, Wildung, Vogel and Croteau, manuscript in preparation), use a quite different mechanism, involving protonation of the terminal double bond of geranylgeranyl diphosphate to initiate cyclization to the intermediate copalyl diphosphate, followed, in the case of abietadiene synthetase, by the more typical ionization of the diphosphate ester function to initiate a second cyclization sequence to the olefin product (LaFever et al., Arch. Biochem. Biophys. 313: 139-149, 1994). The latter represents the only gymnosperm terpene cyclase sequence currently available. Comparison of the deduced amino acid sequences of all the cyclases targeted 11 regions for PCR primer construction. Testing the twenty primers in all combinations under a broad range of amplification conditions, followed by gel electrophoretic analysis of the products, revealed that only the combination of primers CC7.2 (forward) and CC3 (reverse) (see Fig. 2 for locations) yielded a specific DNA fragment (83 bp) using T. brevifolia library phage as template. Primer CC3 delineates a region of strong homology among (-)-limonene synthetase (Colby et al., J. Biol. Chem. 268: 23016-23024, 1993), epi-aristolochene synthetase (Facchini and Chappell, Proc. Natl. Acad. Sci. USA 89: 11088-11092, 1992) and casbene synthetase (Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994). Primer CC7.2 was selected from the sequence comparison of the angiosperm diterpene cyclases (Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994; Sun and Kamiya, Plant Cell 6: 1509-1518, 1994; Bensen et al., Plant Cell 7: 75-84, 1995) with a recently acquired cDNA clone encoding a gymnosperm diterpene cyclase, (-)-abietadiene synthetase from grand fir (Stofer Vogel, Wildung, Vogel and Croteau, manuscript in preparation). The 83 bp fragment was cloned and sequenced, and was thus shown to be cyclase-like. This PCR product was labeled with 32P for use as a hybridization probe and employed in high-stringency screening of 3 × 10^5 plaques, which yielded 102 positive signals. Fifty of these clones were purified through two additional rounds of screening and excised in vivo, and the inserts were sized. The twelve clones bearing the largest inserts (> 2.0 kb) were partially sequenced, indicating that they were all representations of the same gene. Four of these inserts appeared to be full-length.

Expression of the cDNA in E. coli. The four purified full-length clones were either out of frame or bore stop sites immediately upstream of the methionine start codon, resulting from hairpin-primed second-strand cDNA synthesis. The insert of pTb42 was cloned into frame by PCR; the blunt-ended fragment was ligated into the EcoRV site of pBluescript SK(-), yielding pTb42.1, and transformed into E. coli XL1-Blue. Transformed E. coli cells were grown in LB medium supplemented with antibiotics and induced with IPTG. The cells were harvested and homogenized, and the extracts were assayed for taxadiene synthetase activity using standard protocols with [1-3H]geranylgeranyl diphosphate as substrate (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995). The olefin fraction isolated from the reaction mixture contained a radioactive product (~1 nmol) that coincided on capillary radio-GC with authentic taxa-4(5),11(12)-diene (Rt = 19.40 ± 0.13 min). Identification of this diterpene olefin was confirmed by capillary GC-MS analysis. The retention time (12.73 min versus 12.72 min) and the selected-ion mass spectrum (Table 1) of the diterpene olefin product were identical to those of authentic (±)-taxa-4(5),11(12)-diene (Rubenstein, J. Org. Chem. 60: 7215-7223, 1995).

The origin of the selected diagnostic ions shown in Table 1, which account for most of the total spectral abundance, is described herein and elsewhere (Koepp et al., J. Biol. Chem. 270: 8686-8690, 1995). Because of differing sample sizes, the total abundance of the authentic standard (2.96 E8) was approximately twice that of the biosynthetic olefin (1.42 E8). This, together with background variation between runs, probably explains the minor differences in the relative abundances of the high-mass fragments.

TABLE 1

GC-MS analysis of the diterpene olefin synthesized by recombinant taxadiene synthetase ("product"), compared with authentic taxa-4(5),11(12)-diene ("standard")

        Relative abundance (%)
m/z     Standard    Product
107     15.3        15.3
121     14.3        14.3
122     58.1        57.8
123     10.2        10.3
229     0.56        0.71
257     0.35        0.45
272     1.19        1.17

Since identically prepared extracts of control cultures of E. coli transformed with pBluescript bearing an out-of-frame insert were unable to transform geranylgeranyl diphosphate into detectable levels of diterpene olefin, these results confirm that clone pTb42.1 encodes the taxadiene synthetase of the Pacific yew.
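
The agreement between the two columns of Table 1 can be checked with a short calculation. The sketch below copies the relative abundances from the table and reports, for each monitored ion, the difference between product and standard; the variable and function names are hypothetical, and the calculation is an illustration rather than part of the original analysis.

```python
# Relative abundances (%) from Table 1: m/z -> (standard, product).
TABLE_1 = {
    107: (15.3, 15.3),
    121: (14.3, 14.3),
    122: (58.1, 57.8),
    123: (10.2, 10.3),
    229: (0.56, 0.71),
    257: (0.35, 0.45),
    272: (1.19, 1.17),
}

def abundance_differences(table):
    """Return {m/z: product minus standard}, rounded to two decimals."""
    return {mz: round(prod - std, 2) for mz, (std, prod) in table.items()}

diffs = abundance_differences(TABLE_1)
# Ion with the largest absolute difference in percentage points.
largest = max(diffs, key=lambda mz: abs(diffs[mz]))
# Ion with the highest relative abundance in the standard (the base peak).
base_peak = max(TABLE_1, key=lambda mz: TABLE_1[mz][0])
print(diffs)
print(largest, base_peak)
```

In absolute terms no ion differs by more than 0.3 percentage points. In relative terms the largest deviations fall on the low-abundance ions at m/z 229 and 257, which is consistent with the remark in the text about minor differences among the high-mass fragments.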

Sequence analysis. Both strands of the inserts of pTb42 and pTb42.1 were sequenced. No errors were introduced by the Pfu polymerase. The taxadiene synthetase cDNA of pTb42.1 is 2700 nucleotides in length and contains a complete open reading frame of 2586 nucleotides (Figure 2). The deduced amino acid sequence indicates the presence of a putative plastid transit peptide of approximately 137 amino acids and a mature protein of approximately 725 residues (~82.5 kDa). This assignment is based on the size of the native (mature) enzyme (~79 kDa), determined by gel permeation chromatography and sodium dodecyl sulfate-polyacrylamide gel electrophoresis ("SDS-PAGE") (Hezari et al., Arch. Biochem. Biophys. 322: 437-444, 1995); on the amino acid content and structural features characteristic of such amino-terminal targeting sequences and their cleavage sites (Keegstra et al., Annu. Rev. Plant Physiol. Plant Mol. Biol. 40: 471-501, 1989; von Heijne et al., Eur. J. Biochem. 180: 535-545, 1989); and on the fact that diterpene biosynthesis is localized exclusively within plastids (West et al., Rec. Adv. Phytochem. 13: 163-198, 1979; Kleinig, Annu. Rev. Plant Physiol. Plant Mol. Biol. 40: 39-59, 1989). The transit peptide/mature protein junction, and thus the exact lengths of both portions, are unknown, because the amino terminus of the mature protein is apparently blocked and has not yet been identified. Pairwise sequence comparison (Feng and Doolittle, Methods Enzymol. 183: 375-387, 1990; Genetics Computer Group, Program Manual for the Wisconsin Package, Version 8, Genetics Computer Group, Madison, WI, 1994) with other terpene cyclases from higher plants revealed a significant degree of sequence similarity at the amino acid level. The yew taxadiene synthetase showed 32% identity and 55% similarity to (-)-limonene synthetase from spearmint (Colby et al., J. Biol. Chem. 268: 23016-23024, 1993), 30% identity and 54% similarity to epi-aristolochene synthetase from tobacco (Facchini and Chappell, Proc. Natl. Acad. Sci. USA 89: 11088-11092, 1992), 31% identity and 56% similarity to casbene synthetase from castor bean (Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994), 33% identity and 56% similarity to kaurene synthetase A from Arabidopsis thaliana and maize (Sun and Kamiya, Plant Cell 6: 1509-1518, 1994; Bensen et al., Plant Cell 7: 75-84, 1995), and 45% identity and 67% similarity to (-)-abietadiene synthetase from grand fir (Stofer Vogel, Wildung, Vogel and Croteau, manuscript in preparation). Pairwise comparison of other members of this group shows roughly comparable levels of identity (30-40%) and similarity (50-60%). These terpenoid synthetases represent a broad range of cyclase types from diverse plant families, supporting the suggestion of a common origin for this class of enzymes (Colby et al., J. Biol. Chem. 268: 23016-23024, 1993; Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994; Back and Chappell, J. Biol. Chem. 270: 7375-7381, 1995; McGarvey and Croteau, Plant Cell 7: 1015-1026, 1995; Chappell, Annu. Rev. Plant Physiol. Plant Mol. Biol. 46: 521-547, 1995). The amino acid sequence of taxadiene synthetase does not closely resemble (approximately 20% identity, approximately 40% similarity) that of any of the microbial sesquiterpene cyclases whose sequences have recently been determined (Hohn and Beremand, Gene (Amst.) 79: 131-136, 1989; Proctor and Hohn, J. Biol. Chem. 268: 4543-4548, 1993; Cane et al., Biochemistry 33: 5846-5857, 1994), nor does the taxadiene synthetase sequence resemble any of the published sequences for prenyltransferases (Chen et al., Protein Sci. 3: 600-607, 1994; Scolnik and Bartley, Plant Physiol. 104: 1469-1470, 1994; Attucci et al., Arch. Biochem. Biophys. 321: 493-500, 1995), a group of enzymes that, like the terpenoid cyclases, use allylic diphosphate substrates and similar electrophilic reaction mechanisms (Poulter and Rilling, in Biosynthesis of Isoprenoid Compounds, Porter and Spurgeon, eds., Vol. 1, pp. 161-224, Wiley & Sons, New York, NY, 1981). The aspartate-rich (I,L,V)XDDXX(XX)D motifs, which are present in most prenyltransferases and terpenoid cyclases (Facchini and Chappell, Proc. Natl. Acad. Sci. USA 89: 11088-11092, 1992; Colby et al., J. Biol. Chem. 268: 23016-23024, 1993; Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994; Back and Chappell, J. Biol. Chem. 270: 7375-7381, 1995; Hohn and Beremand, Gene (Amst.) 79: 131-136, 1989; Proctor and Hohn, J. Biol. Chem. 268: 4543-4548, 1993; Cane et al., Biochemistry 33: 5846-5857, 1994; Chen et al., Protein Sci. 3: 600-607, 1994; Scolnik and Bartley, Plant Physiol. 104: 1469-1470, 1994; Attucci et al., Arch. Biochem. Biophys. 321: 493-500, 1995; Abe and Prestwich, J. Biol. Chem. 269: 802-804, 1994) and which are thought to play a role in substrate binding (Chen et al., Protein Sci. 3: 600-607, 1994; Abe and Prestwich, J. Biol. Chem. 269: 802-804, 1994; Marrero et al., J. Biol. Chem. 267: 21873-21878, 1992; Joly and Edwards, J. Biol. Chem. 268: 26983-26989, 1993; Tarshis et al., Biochemistry 33: 10871-10877, 1994), are also present in taxadiene synthetase, as is a related DXXDD motif (Figure 2). Histidine and cysteine residues have been implicated at the active sites of several terpenoid cyclases of plant origin (Rajaonarivony et al., Arch. Biochem. Biophys. 299: 77-82, 1992; Savage et al., Arch. Biochem. Biophys. 320: 257-265, 1995). A search of the aligned sequences revealed that three histidines (at positions 370, 415 and 793) and three cysteines (at positions 329, 650 and 777) of taxadiene synthetase are conserved among the plant terpenoid cyclase genes.
The yew taxadiene synthetase more closely resembles the abietadiene synthetase of grand fir than the casbene synthetase of castor bean (Mau and West, Proc. Natl. Acad. Sci. USA 91: 8497-8501, 1994), which catalyzes a similar type of cyclization reaction but is phylogenetically quite distant. The abietadiene synthetase of grand fir is the only terpenoid cyclase sequence from a gymnosperm now available (Stofer Vogel, Wildung, Vogel and Croteau, in preparation), and these two diterpene cyclases from the Coniferales share several regions of significant sequence homology, one of which was fortuitously chosen for primer construction and proved instrumental in acquiring the PCR-derived probe that led to the cloning of the taxadiene synthetase.
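The percent identity and percent similarity figures used in the sequence comparisons above can be illustrated with the following Python sketch, which scores two sequences that have already been aligned. The grouping of "similar" residues, the treatment of gap columns and the function names are assumptions made for illustration; the text does not state which alignment program or scoring scheme produced the quoted percentages, and the toy alignment is hypothetical.

```python
# Conservative-substitution groups: an illustrative assumption, not the
# scoring scheme behind the percentages quoted in the text.
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]

def are_similar(a, b):
    """True if two residues are identical or fall in the same group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)

def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Assumption: only columns in which both sequences carry a residue are
    counted; columns containing a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0, 0.0
    identical = sum(a == b for a, b in columns)
    similar = sum(are_similar(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)

# Hypothetical toy alignment, not taken from any of the cited sequences.
seq_a = "MKVLDDEWG-S"
seq_b = "MRILDPDWGAT"
identity, similarity = identity_and_similarity(seq_a, seq_b)
print(f"identity {identity:.0f}%, similarity {similarity:.0f}%")
```

Because similarity counts identical residues as well as conservative replacements, it is always at least as large as identity, which is consistent with the paired values reported above (for example, 45% identity and 67% similarity).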

EXAMPLE 2 Expression of truncated genes for taxadiene synthetase to remove transit peptide sequences

The native gene sequence for taxadiene synthetase was truncated from the 5' end to remove all or part of the sequence encoding the plastid transit peptide, which is approximately 137 amino acids long (the mature taxadiene synthetase polypeptide is approximately 725 amino acids). Deletion mutants were produced that remove amino acid residues from the amino terminus up to residue 31 (Glu), 39 (Ser), 49 (Ser), 54 (Gly), 79 (Val) or 82 (Ile). These mutants were expressed in E. coli cells, and the cell extracts were tested for taxadiene synthetase activity as described above. In preliminary experiments, expression of the truncation mutants was increased over that of the wild-type taxadiene synthetase by up to about 50%, and further truncation beyond residues 83 to 84 apparently decreased taxadiene synthetase activity. Truncation of at least part of the plastid transit peptide therefore improves expression of the taxadiene synthetase. In addition, removing this sequence improves purification of the taxadiene synthetase, because the transit peptide is recognized by E. coli chaperonins, which co-purify with the enzyme and complicate purification, and because the taxadiene synthetase preprotein tends to form inclusion bodies when expressed in E. coli.
The actual cleavage site for removal of the transit peptide may not be at the predicted cleavage site between residue 136 (Ser) and residue 137 (Pro). A transit peptide of 136 residues appears to be quite long, and other (monoterpene) synthetases have a tandem pair of arginines (Arg-Arg) at approximately residue 60 (counted from the Met). Truncation immediately amino-terminal to the tandem pair of arginines of these synthetases has resulted in excellent expression in E. coli. The taxadiene synthetase lacks an Arg-Arg element. In addition, truncation beyond residues 83 to 84 leads to lower activity.
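The amino-terminal deletion series of Example 2 can be sketched at the protein level as follows. This is a minimal illustration under stated assumptions: the listed residue is treated as the last residue removed, a start Met is prepended to each construct so that it could be translated, the function names are hypothetical, and the input is a placeholder string rather than the taxadiene synthetase preprotein.

```python
# Truncation points listed in Example 2 (31, 39, 49, 54, 79, 82).
# Assumption: each number is the last residue removed from the amino terminus.
TRUNCATION_POINTS = [31, 39, 49, 54, 79, 82]

def truncate_n_terminus(preprotein, last_removed, add_start_met=True):
    """Delete residues 1..last_removed (1-based) from the amino terminus."""
    if not 0 <= last_removed < len(preprotein):
        raise ValueError("last_removed must leave at least one residue")
    remainder = preprotein[last_removed:]
    # Assumption: a Met is prepended so the construct begins with a start residue.
    if add_start_met and not remainder.startswith("M"):
        remainder = "M" + remainder
    return remainder

def truncation_series(preprotein, points=TRUNCATION_POINTS):
    """Return {truncation point: truncated sequence} for every point."""
    return {point: truncate_n_terminus(preprotein, point) for point in points}

# Hypothetical 91-residue placeholder, NOT the taxadiene synthetase preprotein.
toy_preprotein = "M" + "ACDEFGHIKL" * 9
for point, construct in truncation_series(toy_preprotein).items():
    print(point, len(construct), construct[:8])
```

In practice the truncations would be made in the coding sequence rather than in the protein string; the protein-level view is used here only to make the residue bookkeeping explicit.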
This invention has been detailed both by example and by direct description. It should be apparent that any person skilled in the relevant art would be able to devise equivalents of the invention as described in the following claims that would nonetheless be within the spirit of the foregoing description. Such equivalents are to be included within the scope of this invention.

Claims

NOVELTY OF THE INVENTION CLAIMS

1. An isolated polynucleotide, characterized in that it comprises at least 15 consecutive nucleotides of a native gene for taxadiene synthetase.

2. An isolated polynucleotide, characterized in that it comprises at least 30 consecutive nucleotides of a native gene for taxadiene synthetase.

3. The polynucleotide according to claim 1, characterized in that it comprises a sequence coding for a polypeptide having biological activity of taxadiene synthetase.

4. A cell, characterized in that it comprises the polynucleotide according to claim 1.

5. A plant cell, characterized in that it comprises the polynucleotide according to claim 1.

6. A transgenic plant, characterized in that it comprises the polynucleotide according to claim 1.

7. An isolated polynucleotide, characterized in that it comprises a polypeptide coding sequence coding for a polypeptide with biological activity of taxadiene synthetase, wherein the polypeptide coding sequence has at least 70% nucleotide sequence similarity with a native gene for taxadiene synthetase of the Pacific yew.

8. The polynucleotide according to claim 7, characterized in that the polypeptide coding sequence has at least 80% nucleotide sequence similarity with the native gene for taxadiene synthetase of the Pacific yew.

9. The polynucleotide according to claim 8, characterized in that the polypeptide coding sequence has at least 90% nucleotide sequence similarity with the native gene for taxadiene synthetase of the Pacific yew.

10. The polynucleotide according to claim 7, characterized in that the polypeptide coding sequence codes for a polypeptide having only conservative amino acid substitutions with the taxadiene synthetase polypeptide sequence of Figure 2, except for an amino acid substitution in at least one location selected from the group consisting of: cysteine residues 329, 650, 719 and 777; histidine residues 370, 415, 579 and 793; a DDXXD motif; a DXXDD motif; a conserved arginine; and an RWWK element.

11. The polynucleotide according to claim 7, characterized in that the polypeptide coding sequence encodes a polypeptide having only conservative amino acid substitutions with a native taxadiene synthetase polypeptide sequence of the Pacific yew.

12. The polynucleotide according to claim 7, characterized in that the polypeptide coding sequence encodes a polypeptide that lacks at least part of a transit peptide sequence of a native taxadiene synthetase polypeptide of the Pacific yew.

13. The polynucleotide according to claim 7, characterized in that the polypeptide coding sequence codes for a polypeptide that is completely homologous with a native taxadiene synthetase polypeptide.

14. A cell, characterized in that it comprises the polynucleotide according to claim 7.

15. A plant cell, characterized in that it comprises the polynucleotide according to claim 7.

16. A transgenic plant, characterized in that it comprises the polynucleotide according to claim 7.

17. An isolated polypeptide, characterized in that it has taxadiene synthetase activity.

18. The polypeptide according to claim 17, characterized in that it has at least 70% amino acid sequence identity with a native taxadiene synthetase polypeptide of the Pacific yew.

19. The polypeptide according to claim 18, characterized in that it has at least 80% amino acid sequence identity with the taxadiene synthetase polypeptide.

20. The polypeptide according to claim 19, characterized in that it has at least 90% amino acid sequence identity with the taxadiene synthetase polypeptide.

21. The polypeptide according to claim 17, characterized in that it has only conservative amino acid substitutions with the sequence of the native taxadiene synthetase polypeptide of the Pacific yew, except for an amino acid substitution at one or more locations of the native Pacific yew taxadiene synthetase polypeptide sequence selected from the group consisting of: cysteine residues 329, 650, 719 and 777; histidine residues 370, 415, 579 and 793; a DDXXD motif; a DXXDD motif; a conserved arginine; and an RWWK element.

22. The polypeptide according to claim 17, characterized in that it has only conservative amino acid substitutions with the sequence of the native taxadiene synthetase polypeptide of the Pacific yew.

23. The polypeptide according to claim 17, characterized in that it is completely homologous to the sequence of the native taxadiene synthetase polypeptide of the Pacific yew.

24. The polypeptide according to claim 17, characterized in that it lacks all or part of the transit peptide.

25. An isolated polypeptide, characterized in that it comprises at least 10 consecutive amino acids of a native taxadiene synthetase polypeptide of the Pacific yew.

26. An isolated, mature native taxadiene synthetase polypeptide of the Pacific yew.

27. An antibody specific for a native taxadiene synthetase polypeptide of the Pacific yew.

28. A method for expressing a taxadiene synthetase polypeptide in a cell, characterized in that the method comprises the steps of: providing a cell comprising an expressible polynucleotide encoding a taxadiene synthetase polypeptide in accordance with the indicated claim; and culturing the cell under conditions suitable for expression of the polypeptide.

29. The method according to claim 28, characterized in that the cell is a taxoid-producing cell.

30. The method according to claim 29, characterized in that expression of the polynucleotide causes the cell to produce a higher level of a taxoid than an otherwise similar cell lacking the expressible polynucleotide.

31. A method for obtaining a gene for taxadiene synthetase, characterized in that it comprises the steps of: contacting a nucleic acid of a taxoid-producing organism with a probe or primer comprising a polynucleotide according to claim 1 under stringent hybridization conditions, thereby causing the probe or primer to hybridize with a gene for taxadiene synthetase of the organism; and isolating the gene for taxadiene synthetase from the organism.