
MXPA01003143A - Novel plant acyltransferases - Google Patents

Novel plant acyltransferases

Info

Publication number
MXPA01003143A
Authority
MX
Mexico
Prior art keywords
leu
val
ser
lys
phe
Prior art date
Application number
MXPA/A/2001/003143A
Other languages
Spanish (es)
Inventor
Michael W Lassner
Diane M Ruezinsky
Robin A Emig
Alison Van Eenennaam
Original Assignee
Calgene Llc
Application filed by Calgene Llc


Abstract

By this invention, novel nucleic acid sequences encoding for acyltransferase related proteins are provided, wherein said acyltransferase-like protein is active in the transfer of a fatty acyl group from a fatty acyl donor to a fatty acyl acceptor. Also considered are amino acid and nucleic acid sequences obtainable from AT-like nucleic acid sequences and the use of such sequences to provide transgenic host cells capable of producing modified lipid content and composition.

Description

NOVEL PLANT ACYLTRANSFERASES
RELATED APPLICATION
This application claims the benefit of U.S. provisional application Serial No. 60/101,939, filed September 25, 1998.
TECHNICAL FIELD
The present invention is directed to nucleic acid and amino acid sequences and constructs, and to methods related thereto.
BACKGROUND OF THE INVENTION
Through the development of plant genetic engineering techniques, it is now possible to produce transgenic varieties of plant species that have novel and desirable characteristics. For example, plants can now be genetically modified for tolerance to environmental stresses, such as resistance to pathogens and tolerance to herbicides, and for improved quality characteristics, for example improved fatty acid compositions. However, the number of nucleotide sequences useful for modifying such characteristics is quite limited, and the rate at which new useful nucleotide sequences for engineering novel characteristics are identified is slow. The characterization of various acyltransferase proteins is useful for the further study of plant fatty acid synthesis systems and for the development of new and/or alternative oil sources. Studies of plant mechanisms may provide means to further enhance, control, modify, or otherwise alter the total fatty acid composition of triglycerides and oils. In addition, it is desirable to elucidate the factor(s) critical to the natural production of fatty acids in plants, including the purification of such factors and the characterization of the element(s) and/or cofactors that enhance the efficiency of the system. Of particular interest are the nucleic acid sequences of genes encoding proteins that may be useful for applications in genetic engineering.
BRIEF DESCRIPTION OF THE INVENTION
The present invention provides nucleic acids encoding amino acid sequences of a class of proteins related to acyltransferase proteins. Such proteins are referred to herein as acyltransferase-related or acyltransferase-like proteins. By this invention, the nucleic acid sequences encoding these acyltransferase-related proteins can now be characterized with respect to enzymatic activity. In particular, the identification and isolation of nucleic acid sequences encoding acyltransferase-related proteins from Arabidopsis, yeast, corn, and soybean are provided. Thus, this invention encompasses acyltransferase-related nucleic acid sequences and the corresponding amino acid sequences, and the use of these nucleic acid sequences in the preparation of oligonucleotides containing such acyltransferase-related coding sequences for the analysis and recovery of plant acyltransferase-related gene sequences. The sequences encoding acyltransferase-related proteins may encode a complete or partial sequence, depending on the intended use. All or a portion of the genomic sequence or cDNA sequence is intended. Of special interest are recombinant DNA constructs that provide acyltransferase-related sequences for transcription, or transcription and translation (expression), in host cells. In particular, constructs capable of transcription, or transcription and translation, in plant host cells are preferred. For some applications it may be desirable to reduce the expression of acyltransferase-related sequences. Thus, recombinant constructs can be designed with the acyltransferase-related sequences in a reverse orientation for expression of an antisense sequence; the use of co-suppression constructs, also known as "transwitch" constructs, may also be useful.
Such constructs may contain a variety of regulatory regions, including transcription initiation regions obtained from genes preferentially expressed in plant seed tissue. For some uses, it may be desirable to use the transcriptional and translational initiation regions of the acyltransferase-related genes, either with the sequences encoding acyltransferase-related proteins or to direct the transcription and translation of a heterologous sequence. Plants and seeds containing the constructs and polynucleotides of this invention are also considered part of this invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figures 1A-F provide the profile of the conserved 204 amino acid sequence identified from the comparison of glycerol-3-phosphate acyltransferases and various lysophosphatidic acid acyltransferases using PSI-BLAST.
Figures 2A-C provide an alignment of the amino acid sequences of the acyltransferase sequences. The alignment shown covers the regions of the protein extending from about 30 amino acids before the conserved H in the conserved sequence HXXXXD to 100 amino acids after, or toward the 3' end of, the P in the conserved PEG motif of the acyltransferase-like sequences.
Figures 3A-B provide schemes showing the relationship of the identified acyltransferases. The relationships shown are derived from an alignment of the regions of the protein extending from about 30 amino acids before the conserved H in the conserved sequence HXXXXD to 100 amino acids after, or toward the 3' end of, the P in the conserved PEG motif of the acyltransferase-like sequences. Figure 3A provides a phylogenetic tree showing the relationship of several acyltransferases. Figure 3B provides a table showing the percent similarity and percent divergence of the novel acyltransferases and known acyltransferases, determined using the Clustal method with the PAM250 residue weight table.
DETAILED DESCRIPTION OF THE INVENTION
In accordance with the subject invention, nucleotide sequences are provided that are capable of encoding amino acid sequences, such as a protein, polypeptide or peptide, and that are related to the nucleic acid sequences encoding acyltransferase proteins, referred to herein as acyltransferase-like or acyltransferase-related. The novel nucleic acid sequences find use in the preparation of constructs to direct their expression in a host cell. In addition, the novel nucleic acid sequences can be used in the preparation of plant expression constructs to modify the fatty acid composition of a plant cell. In one embodiment of the present invention, nucleic acid sequences related to acyltransferases, also referred to herein as polynucleotides, are identified from databases.
Isolated proteins, polypeptides and polynucleotides
A first aspect of the present invention relates to isolated acyltransferase polynucleotides. The polynucleotide sequences of the present invention include isolated polynucleotides encoding the polypeptides of the invention having a deduced amino acid sequence selected from the group of sequences described in the sequence listing, other polynucleotide sequences closely related to such sequences, and variants thereof. The invention provides a polynucleotide sequence identical over its entire length to each coding sequence set forth in the sequence listing. The invention also provides the coding sequence for the mature polypeptide or a fragment thereof, as well as the coding sequence for the mature polypeptide or a fragment thereof in reading frame with other coding sequences, such as those encoding a leader or secretory sequence, or a pre-, pro-, or prepro-protein sequence.
The polynucleotide can also include non-coding sequences, including, but not limited to, 5' and 3' non-coding sequences, such as transcribed but untranslated sequences, termination signals, ribosome binding sites, mRNA stabilizing sequences, introns, and polyadenylation signals, as well as additional coding sequences that encode additional amino acids. For example, a marker sequence can be included to facilitate purification of the fused polypeptide. The polynucleotides of the present invention also include polynucleotides that comprise a structural gene and the naturally associated sequences that control gene expression. The invention also includes polynucleotides of the formula: X-(R1)n-(R2)-(R3)n-Y wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3 are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and 1000, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence selected from the group described in the sequence listing, and preferably SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22 and 226-233. In the formula, R2 is oriented so that its 5' end residue is on the left, joined to R1, and its 3' end residue is on the right, joined to R3. Any stretch of nucleic acid residues denoted by either R group, wherein R is greater than 1, may be a heteropolymer or a homopolymer, preferably a heteropolymer. The invention also relates to variants of the polynucleotides described herein that encode variants of the polypeptides of the invention. Variants that are fragments of the polynucleotides of the invention can be used to synthesize full-length polynucleotides of the invention. Preferred embodiments are polynucleotides encoding polypeptide variants in which 5 to 10, 1 to 5, 1 to 3, 2, 1, or no amino acid residues of a polypeptide sequence of the invention are substituted, added or deleted, in any combination.
Substitutions, additions and deletions that are silent, in that they do not alter the properties or activities of the polynucleotide or polypeptide, are particularly preferred. The nucleotide sequences encoding acyltransferases can be obtained from natural sources or synthesized partially or wholly artificially. They may correspond directly to an endogenous acyltransferase from a natural source or may contain modified amino acid sequences, such as sequences that have been mutated, truncated, extended or the like. Acyltransferases can be obtained by a variety of methods, including but not limited to partial purification or purification to homogeneity from protein extracts, protein modeling, nucleic acid probes, antibody preparations and sequence comparisons. Typically an acyltransferase will be derived in whole or in part from a natural source. Natural sources include, but are not limited to, prokaryotic and eukaryotic sources, including bacteria, yeasts, plants, algae, and the like. Of special interest are acyltransferases obtainable from eukaryotic sources, including those obtainable from plants, and acyltransferases obtainable through the use of these sequences. "Obtainable" refers to acyltransferases having sequences sufficiently similar to the sequences provided herein to provide a biologically active protein of the present invention. Preferred embodiments of the invention are polynucleotides that are at least 50%, 60% or 70% identical over their entire length to a polynucleotide encoding a polypeptide of the invention, and polynucleotides that are complementary to such polynucleotides. More preferred are polynucleotides that comprise a region at least 80% identical over its entire length to a polynucleotide encoding a polypeptide of the invention, and polynucleotides that are complementary thereto.
In this regard, polynucleotides at least 90% identical over their entire length are particularly preferred, and those at least 95% identical are especially preferred. Furthermore, those with at least 97% identity are highly preferred, those with at least 98% and 99% identity are particularly highly preferred, and those with at least 99% identity are the most preferred. Preferred embodiments are polynucleotides encoding polypeptides that retain substantially the same biological function or activity as the mature polypeptides encoded by the polynucleotides described in the sequence listing. In addition, the invention relates to polynucleotides that hybridize to the above-described sequences. In particular, the invention relates to polynucleotides that hybridize under stringent conditions to the above-described polynucleotides. As used herein, the terms "stringent conditions" and "stringent hybridization conditions" mean that hybridization will generally occur only if there is at least 95%, and preferably at least 97%, identity between the sequences. An example of stringent hybridization conditions is overnight incubation at 42 °C in a solution comprising 50% formamide, 5x SSC (1x SSC being 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate and 20 micrograms/milliliter of sheared salmon sperm DNA, followed by washing the hybridization support in 0.1x SSC at approximately 65 °C. Other hybridization and wash conditions are well known and are exemplified in Sambrook et al., "Molecular Cloning: A Laboratory Manual", Second Edition, Cold Spring Harbor, New York (1989), particularly chapter 11.
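As a worked illustration of the salt concentrations in the example hybridization conditions, the following minimal sketch scales the SSC components for the incubation and wash steps. It assumes that the concentrations given in parentheses (150 mM NaCl, 15 mM trisodium citrate) describe 1x SSC, which is the standard composition; the function and variable names are hypothetical and not part of the specification.

```python
# Assumption: 1x SSC = 150 mM NaCl and 15 mM trisodium citrate.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold: float) -> dict:
    """Return the salt concentrations (in mM) of an n-fold SSC solution."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}


# 5x SSC is used for the overnight incubation, 0.1x SSC for the wash.
hybridization_salts = ssc_concentrations(5)
wash_salts = ssc_concentrations(0.1)

if __name__ == "__main__":
    print("5x SSC:", hybridization_salts)
    print("0.1x SSC:", wash_salts)
```

The much lower salt concentration of the wash step, together with the higher temperature, is what makes the wash the more stringent of the two steps.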
The invention also provides a polynucleotide consisting essentially of a polynucleotide sequence obtainable by screening an appropriate library containing the complete gene for a polynucleotide sequence described in the sequence listing, under stringent hybridization conditions, with a probe having the sequence of said polynucleotide or a fragment thereof, and isolating said polynucleotide sequence. Fragments useful for obtaining such a polynucleotide include, for example, probes and primers as described herein. As discussed herein regarding polynucleotide assays of the invention, the polynucleotides of the invention can be used, for example, as hybridization probes for RNA, cDNA or genomic DNA, to isolate full-length cDNA or genomic clones encoding a polypeptide, and to isolate cDNA or genomic clones of other genes that have a high sequence similarity to a polynucleotide described in the sequence listing. Such probes will generally comprise at least 15 bases. Preferably, such probes will have at least 30 bases and may have at least 50 bases. Particularly preferred probes will have between 30 bases and 50 bases, inclusive. The coding region of each gene comprising a polynucleotide sequence described in the sequence listing can be isolated by screening, using a DNA sequence provided in the sequence listing to synthesize an oligonucleotide probe. A labeled oligonucleotide having a sequence complementary to that of a gene of the invention is then used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library that hybridize to the probe. For example, synthetic oligonucleotides corresponding to the N-terminal sequence of the polypeptide are prepared. The partial sequences so prepared can be used as primers to obtain acyltransferase clones from a library prepared from a cell source of interest.
Alternatively, where oligonucleotides of low degeneracy can be prepared from particular peptides, such probes can be used directly to screen libraries for gene sequences. In particular, the screening of cDNA libraries in phage vectors is useful in such methods because of the lower levels of background hybridization. Typically, a sequence obtainable through the use of nucleic acid probes will show 60-70% sequence identity between the target acyltransferase sequence and the coding sequence used as a probe. However, long sequences with as little as 50-60% sequence identity can also be obtained.
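The probe design step described above (an oligonucleotide complementary to a gene sequence, at least 15 bases and preferably 30 to 50 bases long) can be sketched as follows. This is only an illustration: the function names are hypothetical, the input is assumed to be a coding-strand DNA string over A, C, G and T, and no attempt is made to score probes for melting temperature or uniqueness.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_probes(coding_strand: str, length: int = 30) -> list:
    """List every probe of the given length complementary to the input.

    The 15-base minimum follows the text; longer probes (30 to 50 bases)
    are described there as preferred.
    """
    if length < 15:
        raise ValueError("probes should comprise at least 15 bases")
    if len(coding_strand) < length:
        return []
    return [
        reverse_complement(coding_strand[i:i + length])
        for i in range(len(coding_strand) - length + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 17-base sequence, not taken from the sequence listing.
    example = "ATGCATGCATGCATGCA"
    for probe in candidate_probes(example, length=15):
        print(probe)
```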
The nucleic acid probes can be a long fragment of the nucleic acid sequence, or they can be a shorter oligonucleotide probe. When longer nucleic acid fragments (greater than about 100 bp) are used as probes, screening can be carried out at lower stringency to obtain sequences from the target sample that have 20-50% deviation (that is, 50-80% sequence homology) from the sequences used as a probe. Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence encoding an acyltransferase enzyme, but should be at least about 10, preferably at least about 15, and more preferably at least about 20 nucleotides. When shorter regions are used instead of longer regions, a higher degree of sequence identity is desired. It may therefore be desirable to identify regions of highly conserved amino acid sequence to design oligonucleotide probes for detecting and recovering other related genes. Shorter probes are often particularly useful for polymerase chain reactions (PCR), especially when highly conserved sequences can be identified (see Gould et al., PNAS USA (1989) 86: 1934-1938). The skilled artisan will appreciate that, in many cases, an isolated cDNA sequence will be incomplete, in that the region encoding the polypeptide is truncated at the 5' end of the cDNA. This is a consequence of reverse transcriptase, an enzyme with inherently low "processivity" (a measure of the ability of the enzyme to remain attached to the template during the polymerization reaction), being used during first-strand cDNA synthesis. Several methods are available and well known to those skilled in the art for obtaining full-length cDNAs or extending short cDNAs, for example those based on the method of rapid amplification of cDNA ends (RACE) (see, for example, Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85: 8998-9002). Recent modifications of the technique, exemplified by the Marathon™ technology (Clontech Laboratories, Inc.), for example, have significantly simplified the search for full-length cDNA sequences.
Another aspect of the present invention relates to isolated acyltransferase polypeptides. Such polypeptides include isolated polypeptides described in the sequence listing, as well as polypeptides and fragments thereof, particularly those polypeptides that exhibit acyltransferase activity and also those polypeptides having at least 50%, 60% or 70% identity, preferably at least 80% identity, more preferably at least 90% identity, and most preferably at least 95% identity, to a polypeptide sequence selected from the group of sequences described in the sequence listing. Portions of such polypeptides are also included, wherein such a portion preferably includes at least 30 amino acids, and more preferably at least 50 amino acids. "Identity", as is well understood in the art, is a relationship between two or more polypeptide sequences, or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of relatedness between polypeptide or polynucleotide sequences, as determined by the match between strings of such sequences. "Identity" can be readily calculated by known methods including, without limitation, those described in "Computational Molecular Biology", Lesk A.M., ed., Oxford University Press, New York (1988); "Biocomputing: Informatics and Genome Projects", Smith D.W., ed., Academic Press, New York (1993); "Computer Analysis of Sequence Data", part I, Griffin A.M. and Griffin H.G., eds., Humana Press, New Jersey (1994); "Sequence Analysis in Molecular Biology", von Heijne G., Academic Press (1987); "Sequence Analysis Primer", Gribskov M. and Devereux J., eds., Stockton Press, New York (1991); and Carillo H. and Lipman D., SIAM J Applied Math, 48: 1073 (1988).
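To make the notion of percent identity concrete, the sketch below counts matching positions in two sequences that have already been aligned to the same length. It is an illustration only: the function name is hypothetical, the gap character is assumed to be "-", and the choice of denominator (all alignment columns that are not gap-against-gap) is one of several conventions in use, so values from published programs may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, not from the sequence listing.
    print(percent_identity("MKLV-SF", "MKIVQSF"))
```

A value computed this way could then be compared against the thresholds named in the text (for example at least 50%, 80%, or 95% identity).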
Methods for determining identity are designed to give the largest match between the sequences tested. In addition, methods for determining identity are codified in publicly available programs. Computer programs that can be used to determine the identity between two sequences include, without limitation, GCG (Devereux J. et al., Nucleic Acids Research 12(1): 387 (1984)) and the suite of five BLAST programs, three designed for nucleotide sequence queries (BLASTN, BLASTX and TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren et al., Genome Analysis, 1: 543-559 (1997)). The BLASTX program is publicly available from NCBI and other sources (BLAST Manual, Altschul S. et al., NCBI NLM NIH, Bethesda, Maryland 20894; Altschul et al., J. Mol. Biol. 215: 403-410 (1990)). The well-known Smith-Waterman algorithm can also be used to determine identity. Parameters for polypeptide sequence comparison typically include the following: algorithm, Needleman and Wunsch, J. Mol. Biol. 48: 443-453 (1970); comparison matrix, BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915-10919 (1992); gap penalty, 12; gap length penalty, 4. A program that can be used with these parameters is publicly available as the "gap" program from Genetics Computer Group, Madison, Wisconsin. The above parameters, along with no penalty for end gaps, are the default parameters for peptide comparisons. Parameters for polynucleotide sequence comparison include the following: algorithm, Needleman and Wunsch, J. Mol. Biol. 48: 443-453 (1970); comparison matrix, matches = +10, mismatches = 0; gap penalty, 50; gap length penalty, 3.
A program that can be used with these parameters is publicly available as the "gap" program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the default parameters for nucleic acid comparisons. The invention also includes polypeptides of the formula: X-(R1)n-(R2)-(R3)n-Y wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is an amino acid sequence of the invention, particularly an amino acid sequence selected from the group described in the sequence listing, and preferably SEQ ID NOs: 2, 4, 6, 8, 11, 13, 15, 17, 19, 21, 23 and 218-225. In the formula, R2 is oriented so that its amino terminal residue is on the left, joined to R1, and its carboxyl terminal residue is on the right, joined to R3. Any stretch of amino acid residues denoted by either R group, wherein R is greater than 1, may be a heteropolymer or a homopolymer, preferably a heteropolymer. The polypeptides of the present invention include isolated polypeptides encoded by a polynucleotide comprising a sequence selected from the group of sequences contained in SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22 and 226-233. The polypeptides of the present invention can be a mature protein or can be part of a fusion protein.
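The polynucleotide comparison parameters listed above can be illustrated with a small global alignment scorer. This is a minimal sketch of Needleman-Wunsch scoring with affine gaps (the Gotoh recurrences), not the GCG implementation. It assumes that a gap of length k costs the gap penalty plus k times the gap length penalty, and it penalizes end gaps like internal gaps; both are assumptions, and the function name is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0,
                           gap_open=50, gap_extend=3):
    """Best global alignment score of sequences a and b.

    Assumption: a gap of length k costs gap_open + k * gap_extend.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]; X: a[i-1] against a gap; Y: b[j-1] against a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays gap_open plus the first extension.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # Hypothetical short sequences, not taken from the sequence listing.
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "ACT"))
```

With a gap penalty this large relative to the match score, the scorer strongly prefers mismatches over gaps, which is the practical effect of the nucleotide defaults quoted in the text.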
Fragments and variants of the polypeptides are also considered part of the invention. A fragment is a variant polypeptide having an amino acid sequence that is entirely the same as part, but not all, of the amino acid sequence of the polypeptides described above. Fragments may be free-standing or comprised within a larger polypeptide of which the fragment forms a part or a region, most preferably as a single continuous region. Preferred fragments are biologically active fragments, which are those fragments that mediate activities of the polypeptides of the invention, including those with similar activity, improved activity or reduced activity. Also included are fragments that are antigenic or immunogenic in an animal, particularly in a human. Variants of the polypeptides also include polypeptides that vary from the sequences described in the sequence listing by conservative amino acid substitutions, that is, substitution of a residue by another with similar characteristics. In general, such substitutions are among Ala, Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10, 1 to 5, 1 to 3, or one amino acid is substituted, deleted or added, in any combination. Variants that are fragments of the polypeptides of the invention can be used to produce the corresponding full-length polypeptide by peptide synthesis. Therefore, these variants can be used as intermediates in the production of the full-length polypeptides of the invention. The polynucleotides and polypeptides of the invention can be used, for example, in the transformation of host cells, such as plant host cells, as further described herein.
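The conservative substitution groups named above lend themselves to a simple lookup. The following sketch, with hypothetical names and three-letter residue codes (isoleucine written as Ile), reports whether a given substitution falls within one of those groups; it encodes only the groups listed in the text and makes no claim about other substitution schemes.

```python
# Groups of residues treated as interchangeable in the text.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},
    {"Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` by `replacement` stays within one group."""
    if original == replacement:
        return False  # an unchanged residue is not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    for pair in [("Leu", "Ile"), ("Lys", "Arg"), ("Leu", "Ser")]:
        print(pair, is_conservative(*pair))
```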
The invention also provides polynucleotides that encode a polypeptide that is a mature protein plus additional amino- or carboxyl-terminal amino acids, or amino acids within the mature polypeptide (for example, when the mature form of the protein has more than one polypeptide chain). Such sequences may, for example, play a role in the processing of a protein from a precursor to a mature form, allow transport of the protein, shorten or lengthen the half-life of the protein, or facilitate manipulation of the protein in assays or production. It is contemplated that cellular enzymes may be used to remove any additional amino acids from the mature protein. A precursor protein, having the mature form of the polypeptide fused to one or more prosequences, may be an inactive form of the polypeptide. Inactive precursors are generally activated when the prosequences are removed. Some or all of the prosequences may be removed before activation. Such precursor proteins are generally called proproteins. The polynucleotide and polypeptide sequences can also be used to identify additional sequences that are homologous to the sequences of the present invention. The most preferred method is to store the sequence on a computer-readable medium, for example a floppy disk, CD ROM, hard disk, external disk drive or DVD, and then to use the stored sequence to search a sequence database with well-known search tools. Examples of public databases include the DNA Database of Japan (DDBJ, http://www.ddbj.nig.ac.jp/); GenBank (http://www.ncbi.nlm.nih.gov/web/Genbank/lndex.htlm); and the Nucleic Acid Sequence Database of the European Molecular Biology Laboratory (EMBL, http://www.ebi.ac.uk/ebi docs / embl db.html). Several search algorithms are available to the skilled artisan, an example of which is the suite of programs referred to as BLAST programs.
There are five implementations of BLAST, three designed for nucleotide sequence queries (BLASTN, BLASTX and TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren et al., Genome Analysis, 1: 543-559 (1997)). Additional programs are available in the art for the analysis of identified sequences, such as sequence alignment programs, programs for the identification of more distantly related sequences, and the like, and are well known to the skilled artisan.
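The division of the five BLAST programs by query type can be written down as a small lookup table. The sketch below is only an illustration with hypothetical names; it records, for each program, the type of query sequence it accepts and the type of database it searches, and selects the programs that fit a given query.

```python
# program name -> (query sequence type, database sequence type)
BLAST_PROGRAMS = {
    "BLASTN": ("nucleotide", "nucleotide"),
    "BLASTX": ("nucleotide", "protein"),      # query translated in six frames
    "TBLASTX": ("nucleotide", "nucleotide"),  # both sides translated
    "BLASTP": ("protein", "protein"),
    "TBLASTN": ("protein", "nucleotide"),     # database translated
}


def programs_for_query(query_type: str) -> list:
    """Return the BLAST programs that accept the given query type."""
    if query_type not in ("nucleotide", "protein"):
        raise ValueError("query_type must be 'nucleotide' or 'protein'")
    return [
        name
        for name, (query, _database) in BLAST_PROGRAMS.items()
        if query == query_type
    ]


if __name__ == "__main__":
    print(programs_for_query("nucleotide"))
    print(programs_for_query("protein"))
```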
Plant constructs and methods of use
Of particular interest in the present invention is the use of the nucleotide or polynucleotide sequences in recombinant DNA constructs to direct the transcription, or transcription and translation (expression), of the acyltransferase sequences of the present invention in a host plant cell.
Expression constructs generally comprise a promoter functional in a host plant cell operably linked to a nucleic acid sequence encoding an acyltransferase of the present invention, and a transcription termination region functional in a host plant cell. By "host cell" is meant a cell that contains a vector and supports the replication, and/or transcription or transcription and translation (expression), of the expression construct. Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. Preferably, the host cells are cells of monocotyledonous or dicotyledonous plants. Of particular interest in the present invention is the use of the polynucleotides of the present invention for the preparation of constructs to direct the transcription, or transcription and translation, of nucleotide sequences encoding an acyltransferase in a host plant cell. Plant expression constructs generally comprise a promoter functional in a host plant cell operably linked to a nucleic acid sequence of the present invention, and a transcription termination region functional in a host plant cell. Those skilled in the art will recognize that a number of promoters functional in plant cells have been described in the literature. Chloroplast- and plastid-specific promoters, chloroplast- or plastid-functional promoters, and chloroplast- or plastid-operable promoters are also contemplated. One set of promoters are constitutive promoters, such as the CaMV 35S or FMV 35S promoters, which yield high levels of expression in most plant organs. Enhanced or duplicated versions of the CaMV 35S and FMV 35S promoters are useful in the practice of this invention (Odell et al. (1985) Nature 313: 810-812; Rogers, U.S. Patent No. 5,378,619). In addition, it may be preferable to bring about expression of the protein of interest in specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter chosen should have the desired tissue and developmental specificity.
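The basic layout of an expression construct described above (a promoter, operably linked to a coding sequence, followed by a transcription termination region) can be modeled as an ordered list of elements. The sketch below is purely illustrative: the class, the element kinds, and the example names are hypothetical labels rather than terms defined in the specification, and the check covers only the order of the three elements.

```python
from dataclasses import dataclass

# Order of elements in a basic expression construct, 5' to 3'.
EXPECTED_ORDER = ["promoter", "coding_sequence", "terminator"]


@dataclass(frozen=True)
class Element:
    name: str  # free-text label, for example "CaMV 35S"
    kind: str  # one of the kinds in EXPECTED_ORDER


def is_basic_expression_construct(elements: list) -> bool:
    """True if the elements are promoter, coding sequence, terminator in order."""
    return [element.kind for element in elements] == EXPECTED_ORDER


if __name__ == "__main__":
    construct = [
        Element("CaMV 35S", "promoter"),
        Element("acyltransferase coding sequence", "coding_sequence"),
        Element("transcription termination region", "terminator"),
    ]
    print(is_basic_expression_construct(construct))
```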
Of particular interest is the expression of the nucleic acid sequences of the present invention from transcription initiation regions that are preferentially expressed in a plant seed tissue. Examples of such seed-preferential transcription initiation sequences include those derived from sequences of genes encoding plant storage proteins or genes involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' regulatory regions of such genes as napin (Kridl et al., Seed Sci. Res. 1: 209-219 (1991)), phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, the soybean α' subunit of β-conglycinin (soy 7s, Chen et al., Proc. Natl. Acad. Sci., 83: 8560-8564 (1986)) and oleosin. It may be desirable to direct the localization of the acyltransferase proteins to a particular subcellular compartment, for example to the mitochondria, endoplasmic reticulum, vacuoles, chloroplast, or other plastid compartment. For example, when the genes of interest of the present invention are to be targeted to plastids, such as chloroplasts, for expression, the constructs will also employ sequences to direct the gene to the plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid transit peptides (PTP). In this way, when the gene of interest is not inserted directly into the plastid, the expression construct will additionally contain a gene encoding a transit peptide to direct the gene of interest to the plastid. Chloroplast transit peptides can be derived from the gene of interest, or can be derived from a heterologous sequence that has a CTP. Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991) Plant Mol. Biol. Rep. 9: 104-126; Clark et al. (1989) J. Biol. Chem. 264: 17544-17550; della-Cioppa et al. (1987) Plant Physiol. 84: 965-968; Romer et al. (1993) Biochem. Biophys. Res Commun. 196: 1414-1421; and Shah et al. (1986) Science 233: 478-481.
Additional transit peptides for the translocation of the protein to the endoplasmic reticulum (ER), or vacuole, may also be used in the constructs of the present invention. Depending on the intended use, the constructs may contain the nucleic acid sequence encoding the entire acyltransferase protein or a portion thereof. For example, where antisense inhibition of a given acyltransferase protein is desired, the entire acyltransferase sequence is not required. In addition, when the acyltransferase sequences used in the constructs are intended for use as probes, it may be advantageous to prepare constructs containing only a particular portion of an acyltransferase coding sequence, for example a sequence that is found to encode a highly conserved acyltransferase region. The skilled artisan will recognize that there are several methods for inhibiting the expression of endogenous sequences in a host cell. Such methods include, but are not limited to, antisense suppression (Smith et al. (1988) Nature 334: 724-726), cosuppression (Napoli et al. (1989) Plant Cell 2: 279-289), ribozymes (PCT publication WO 97/10328), and combinations of sense and antisense, such as those described by Waterhouse et al. (1998) Proc. Natl. Acad. Sci. USA 95: 13959-13964. Methods for the suppression of endogenous sequences in a host cell generally employ the transcription, or transcription and translation, of at least a portion of the sequence to be suppressed. Such sequences can be homologous to the coding regions as well as to the non-coding regions of the endogenous sequence. Transcription termination regulatory regions can also be provided in the plant expression constructs of this invention.
The transcription termination regions can be provided by the DNA sequence encoding the acyltransferase or by a convenient transcription termination region derived from a different gene source, for example, the transcription termination region that is naturally associated with the transcription initiation region. The skilled artisan will recognize that any convenient transcription termination region capable of terminating transcription in a plant cell can be employed in the constructs of the present invention. Alternatively, constructs can be prepared to direct the expression of the acyltransferase sequences directly from the plastid of the host plant cell. Such constructs and methods are known in the art and are generally described, for example, in Svab et al. (1990) Proc. Natl. Acad. Sci. USA 87: 8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci. USA 90: 913-917 and in U.S. Patent No. 5,693,507. A plant cell, tissue, organ, or plant into which recombinant DNA constructs containing the expression constructs have been introduced is considered transformed, transfected or transgenic. A transgenic or transformed cell or plant also includes the progeny of the cell or plant and the progeny produced by a breeding program that employs such a transgenic plant as a parent in a cross, and that exhibit an altered phenotype resulting from the presence of an acyltransferase nucleic acid sequence. The term "introduced", in the context of inserting a nucleic acid sequence into a cell, means "transfection", "transformation" or "transduction" and includes reference to the incorporation of a nucleic acid sequence into a prokaryotic or eukaryotic cell, in which the nucleic acid sequence can be incorporated into the genome of the cell (for example, chromosomal, plasmid, plastid, or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (for example, transfected mRNA).
Plant expression or transcription constructs having an acyltransferase as the DNA sequence of interest, for increased or decreased expression thereof, can be used with a wide variety of plant life, particularly plant life involved in the production of vegetable oils for edible and industrial uses. Plants of interest in the present invention include monocotyledonous and dicotyledonous plants.
Most especially preferred are temperate oilseed crops. Plants of interest include, but are not limited to, rapeseed (canola and high erucic acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Depending on the method used to introduce the recombinant constructs into the host cell, other DNA sequences may be required. Importantly, this invention is applicable to dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or improved transformation and regulation techniques. As used herein, the term "plant" refers to whole plants, plant organs (e.g., leaves, stems, roots, etc.), seeds, and plant cells and progeny thereof. Plant cells, as used herein, include, without limitation, seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores. The class of plants that can be used in the methods of the present invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants. Particularly preferred plants of interest include, but are not limited to, rapeseed (canola and high erucic acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. The most especially preferred plants include Brassica, soybean, and corn.
As used herein, "transgenic plant" refers to a plant that comprises within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide is stably integrated into the genome so that the polynucleotide is passed on to successive generations. The heterologous polynucleotide can be integrated into the genome alone or as part of a recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the presence of heterologous nucleic acid, including those transgenics initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic. Thus, a plant having within its cells a heterologous polynucleotide is referred to herein as a transgenic plant. The heterologous polynucleotide can be either stably integrated into the genome or extrachromosomal. Preferably, the polynucleotide of the present invention is stably integrated into the genome such that it is passed on to successive generations. As used herein, "heterologous" in reference to a nucleic acid means a nucleic acid that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially modified from their original form. A heterologous protein can originate from a foreign species, or, if from the same species, is substantially modified from its original form by deliberate human intervention. As used herein, a "recombinant expression cassette" is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit the transcription of a particular nucleic acid in a target cell. The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid sequence to be transcribed and a promoter. It is contemplated that the gene sequence can be synthesized, either completely or in part, especially where it is desired to provide plant-preferred sequences. Thus, all or a portion of the desired structural gene (that portion of the gene encoding the acyltransferase protein) can be synthesized using codons preferred by a selected host. Host-preferred codons can be determined, for example, from the codons used most frequently in the proteins expressed in a desired host species. One skilled in the art will readily recognize that antibody preparations, nucleic acid probes (DNA and RNA) and the like can be prepared and used to screen for, and thereby recover, "homologous" or "related" transferases from a variety of plant sources.
Homologous sequences are found when there is sequence identity, which can be determined by comparing sequence information, nucleic acid or amino acid, or through hybridization reactions between a known acyltransferase and a candidate source. Conservative changes, such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn, can also be considered in determining sequence homology. Nucleic acid sequences are considered to be homologous when they have at least 25% sequence identity between the two complete mature proteins (see generally, Doolittle, R.F., OF URFS and ORFS (University Science Books, CA, 1986)). Acyltransferases can be obtained from the specific exemplified sequences provided herein. In addition, it will be apparent that one can obtain natural and synthetic sequences, including modified amino acid sequences and starting materials for synthetic protein modeling, from the exemplified sequences and from the acyltransferases that are obtained through the use of said exemplified sequences. Modified amino acid sequences include sequences that have been mutated, truncated, extended, and the like, whether such sequences are partially or wholly synthesized. Sequences that are purified from plant preparations, or that are identical to or encode proteins identical to them, regardless of the method used to obtain the protein or sequence, are also considered naturally derived. For immunological screening, antibodies to the acyltransferase protein can be prepared by injecting rabbits or mice with the purified protein or a portion thereof; such methods of preparing antibodies are well known in the art. Either monoclonal or polyclonal antibodies can be produced, although polyclonal antibodies are typically more useful for gene isolation. Western analysis can be conducted to determine that a related protein is present in a crude extract of a desired plant species, as determined by cross-reactivity with antibodies to the acyltransferase protein.
When cross-reactivity is observed, the gene(s) encoding the related proteins are isolated by screening expression libraries that represent the desired plant species. Expression libraries can be constructed in a variety of commercially available vectors, including lambda gt11, as described in Sambrook, et al. (Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold Spring Harbor, New York). The nucleic acid sequences associated with the acyltransferase proteins will find many uses. For example, recombinant constructs can be prepared which can be used as probes, or which will provide for the expression of the acyltransferase protein in host cells to produce a ready source of the enzyme and/or to modify the composition of the triglycerides found therein. Other useful applications may be found when the host cell is a plant host cell, either in vitro or in vivo. Modifications of fatty acid compositions can also affect the fluidity of plant membranes; for example, different lipid concentrations have been observed in cold-hardened plants. By this invention, one may be able to introduce traits that lead to cold tolerance. Constitutive or temperature-inducible transcription initiation regulatory control regions may have particular application for such uses. As discussed above, the nucleic acid sequences encoding acyltransferases of this invention can include genomic, cDNA or mRNA sequences. By "coding" is meant that the sequence corresponds to a particular amino acid sequence in either sense or antisense orientation. By "extrachromosomal" is meant that the sequence is external to the plant genome with which it is naturally associated. By "recombinant" is meant that the sequence contains a genetically engineered modification made through manipulation via mutagenesis, restriction enzymes, and the like.
Once the nucleic acid sequence of the desired acyltransferase is obtained, it can be manipulated in a variety of ways. Where the sequence involves non-coding flanking regions, the flanking regions may be subjected to resection, mutagenesis, etc. Thus, transitions, transversions, deletions, and insertions can be carried out on a naturally occurring sequence. In addition, all or part of the sequence can be synthesized. In the structural gene, one or more codons can be modified to provide a modified amino acid sequence, or one or more codon mutations can be introduced to provide a convenient restriction site or for another purpose related to construction or expression. The structural gene can also be modified by employing synthetic adapters or linkers to introduce one or more convenient restriction sites or the like. The nucleic acid or amino acid sequences encoding the acyltransferases of this invention can be combined with other non-native, or "heterologous", sequences in a variety of ways. By "heterologous" sequences is meant any sequence that is not naturally found joined to the acyltransferase, including, for example, combinations of nucleic acid sequences from the same plant that are not naturally found joined together. The DNA sequence encoding an acyltransferase of this invention can be used in conjunction with all or part of the gene sequences that are normally associated with the acyltransferase. In its component parts, a DNA sequence encoding an acyltransferase is combined in a DNA construct having, in the 5' to 3' direction of transcription, a transcription initiation control region capable of promoting transcription and translation in a host cell, the DNA sequence encoding the plant acyltransferase, and a transcription and translation termination region. Potential host cells include both prokaryotic cells, such as E. coli, and eukaryotic cells, such as yeast, insect, amphibian, or mammalian cells.
A host cell may be unicellular or located in a differentiated or undifferentiated multicellular organism, depending on the intended use. Preferably, the host cells of the present invention include plant cells, both monocotyledonous and dicotyledonous. The cells of this invention can be distinguished by having present therein a sequence foreign to the wild-type cell, for example, a recombinant nucleic acid construct encoding an acyltransferase. The methods used for the transformation of host plant cells are not critical to the present invention. The transformation of the plant is preferably permanent, that is, by integration of the introduced expression constructs into the host plant genome, so that the introduced constructs are passed on to successive plant generations. The person skilled in the art will recognize that a wide variety of transformation techniques exist in the field, and new techniques continually become available. Any technique that is suitable for the target host plant may be employed within the scope of the present invention. For example, constructs can be introduced in a variety of forms including, but not limited to, a strand of DNA, a plasmid, or an artificial chromosome. The introduction of the constructs into the target plant cells can be carried out by a variety of techniques, including, but not limited to, calcium-phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium infection, liposomes or microprojectile transformation. The person skilled in the art can refer to the literature for details and select suitable techniques for use in the methods of the present invention. Normally, included within the DNA construct will be a structural gene that has the regulatory regions necessary for expression in a host and that provides for selection of transformant cells.
The gene can provide resistance to a cytotoxic agent, for example an antibiotic, heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral immunity or the like. Depending on the number of different host species into which the expression construct or components thereof are introduced, one or more markers can be used, where different selection conditions are used for the different hosts. When Agrobacterium is used for the transformation of plant cells, a vector can be used which can be introduced into the Agrobacterium host for homologous recombination with the T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid containing the T-DNA for recombination can be armed (capable of causing gall formation) or disarmed (incapable of causing gall formation), the latter being permissible so long as the vir genes are present in the Agrobacterium host. Armed plasmids can give rise to a mixture of normal plant cells and galls. In some instances where Agrobacterium is used as the vehicle for the transformation of host plant cells, the expression or transcription construct bordered by the T-DNA border region(s) will be inserted into a broad host range vector capable of replication in E. coli and Agrobacterium; such broad host range vectors are described in the literature. pRK2 or derivatives thereof are commonly used. See, for example, Ditta, et al. (Proc. Natl. Acad. Sci. U.S.A. (1980) 77: 7347-7351) and EPA 0 120 515, which are incorporated herein by reference. Alternatively, one can insert the sequences to be expressed in plant cells into a vector containing separate replication sequences, one of which stabilizes the vector in E. coli and the other in Agrobacterium. See, for example, McBride and Summerfelt (Plant Mol. Biol. (1990) 14: 269-276), where the pRiHRI origin of replication (Jouanin, et al., Mol. Gen. Genet.
(1985) 201: 370-374) is used and provides added stability to plant expression vectors in Agrobacterium host cells. Included with the expression construct and the T-DNA are one or more markers, which allow the selection of transformed Agrobacterium and transformed plant cells. A number of markers have been developed for use with plant cells, such as resistance to chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The particular marker used is not essential to this invention; one marker or another may be preferred depending on the particular host and the manner of construction. For the transformation of plant cells using Agrobacterium, explants can be combined and incubated with the transformed Agrobacterium for a time sufficient for transformation, the bacteria are killed, and the plant cells are cultured in an appropriate selective medium. Once callus forms, shoot formation can be promoted by employing the appropriate plant hormones according to known methods, and the shoots are transferred to growth medium for the regeneration of plants. The plants can then be grown to seed, and the seed used to establish repeated generations and for the isolation of vegetable oils. There are several possible ways to obtain plant cells of this invention that contain multiple expression constructs. Any means for producing a plant comprising a construct having a nucleic acid sequence of the present invention and at least one other construct having another DNA sequence encoding an enzyme is encompassed by the present invention. For example, the expression construct can be used to transform a plant at the same time as the second construct, either by including both expression constructs in a single transformation vector or by using separate vectors, each of which expresses the desired genes.
The second construct can be introduced into a plant that has already been transformed with the first expression construct, or alternatively, transformed plants, one having the first construct and one having the second construct, can be crossed to bring the constructs together in the same plant. In general, the acyltransferase proteins are active in the transfer of acyl groups from a donor to a variety of different substrates. For example, diacylglycerol acyltransferases add acyl groups to diacylglycerol to form triacylglycerol (TAG), or acyl-CoA:cholesterol acyltransferases use an acyl-CoA as a donor to transfer an acyl group to a sterol to form a sterol ester. Typical substrates include, but are not limited to, glycerides, including mono- and diglycerides, sterols, tinols, phosphatides, and the like. Donors include, but are not limited to, acyl-CoA and acyl-ACP molecules. The invention now being generally described, it will be more readily understood by reference to the following examples, which are included for purposes of illustration only and are not intended to limit the present invention.
EXAMPLES

EXAMPLE 1 RNA isolation

Total RNA is isolated from inflorescences and developing seeds of Arabidopsis thaliana for use in the construction of complementary DNA (cDNA) libraries. The procedure is an adaptation of the DNA isolation protocol of Webb and Knapp (D.M. Webb and S.J. Knapp (1990) Plant Molec. Reporter 8, 180-185). The following description assumes the use of 1 g fresh weight of tissue. The frozen seed tissue is powdered by grinding under liquid nitrogen. The powder is added to 10 ml of REC buffer (50 mM Tris-HCl, pH 9, 0.8 M NaCl, 10 mM EDTA, 0.5% w/v CTAB (cetyltrimethylammonium bromide)), together with 0.2 g of insoluble polyvinylpolypyrrolidone. The homogenate is centrifuged for 5 minutes at 12,000 x g to pellet the insoluble material. The resulting supernatant fraction is extracted with chloroform and the upper phase is recovered. The RNA is then precipitated by the addition of 1 volume of RecP (50 mM Tris-HCl, pH 9, 10 mM EDTA and 0.5% (w/v) CTAB), and collected by brief centrifugation as before. The RNA pellet is redissolved in 0.4 ml of 1 M NaCl. The RNA is then redissolved in water and extracted with phenol/chloroform. Sufficient 3 M potassium acetate (pH 5) is added to make the mixture 0.3 M in acetate, followed by the addition of two volumes of ethanol to precipitate the RNA. After washing with ethanol, this final RNA precipitate is dissolved in water and stored frozen.
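The amount of 3 M potassium acetate needed to bring a sample to 0.3 M follows from a simple dilution calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the sample contains no acetate beforehand and that volumes are additive, neither of which is stated in the protocol.

```python
def stock_volume_to_add(sample_ml, stock_molar=3.0, target_molar=0.3):
    """Volume of stock (ml) to add so the final mixture is at target_molar.

    Solves stock_molar * v = target_molar * (sample_ml + v) for v.
    Assumes the sample starts with no acetate and volumes are additive.
    """
    if not 0 < target_molar < stock_molar:
        raise ValueError("target must be positive and below the stock molarity")
    return sample_ml * target_molar / (stock_molar - target_molar)


def ethanol_volume(sample_ml, acetate_ml, volumes=2.0):
    """Ethanol (ml) for the precipitation step: `volumes` times the mixture."""
    return volumes * (sample_ml + acetate_ml)


if __name__ == "__main__":
    sample = 0.4  # ml, hypothetical aqueous RNA volume
    acetate = stock_volume_to_add(sample)
    print(f"add {acetate:.3f} ml acetate, then {ethanol_volume(sample, acetate):.3f} ml ethanol")
```

With a 3 M stock and a 0.3 M target the added volume is always one ninth of the sample volume.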
EXAMPLE 2 Identification of sequences homologous to acyltransferases

The searches were carried out on a Silicon Graphics Unix computer using bioaccelerator hardware and GenWeb software supplied by Compugen Ltd. This software and hardware enable the use of the Smith-Waterman algorithm in searching DNA and protein databases with profiles as queries. The program used to search protein databases is the profile search. This is a search in which the query is not a single sequence but a profile based on a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a sequence data set, that is, a sequence database. The profile contains all the information needed to score each position in a sequence, in effect replacing the "scoring matrix" used for standard query searches. A profile search is also the program used to search nucleotide databases with a protein profile; in this case nucleic acid databases are searched using amino acid profiles as queries. As the search runs, the sequences in the database are translated into amino acid sequences in 6 reading frames. The output file for the nucleotide database search is identical to the output file for the protein database search except that it has an additional column indicating the frame in which the best alignment occurs. The Smith-Waterman algorithm (Smith and Waterman (1981), supra) is used to look for similarities between a query sequence and a group of sequences contained in the database. The E scores, as well as other sequence information such as the conserved peptide sequences HXXXXD and PEG, are used to identify related sequences. When using the information from the conserved peptide sequences, E scores greater than E-12 and E-8 are considered. For example, the EST sequence originally used to identify ATAT2 had an E score of 0.J0094, while the EST sequence originally used to identify ATLPAAT1 had an E score of 0.0868.
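The translated search described above can be sketched in a few lines. This is a minimal illustration, not the Compugen implementation: it scores a single protein query rather than a profile, the match, mismatch and gap values are arbitrary assumptions, and all function names are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate one reading frame; stops become '*', unknown codons 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return {frame: protein} for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    reverse = dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(reverse[offset:])
    return frames


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (linear gap penalty, assumed scoring)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_frame(query_protein, dna):
    """Return (frame, score) for the reading frame that aligns best."""
    scored = {
        frame: smith_waterman_score(query_protein, protein)
        for frame, protein in six_frame_translations(dna).items()
    }
    frame = max(scored, key=scored.get)
    return frame, scored[frame]
```

Reporting the frame alongside the score mirrors the extra output column mentioned in the text; a real profile search would replace the fixed match and mismatch values with position-specific scores.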
A glycerol-3-phosphate acyltransferase protein sequence from E. coli (Swiss-Prot accession P00482) is used to query the NCBI non-redundant protein database using BLAST. In the first round of searching, other membrane forms of G3PAAT are identified. In subsequent PSI-BLAST searches (Altschul, et al. (1997) Nucleic Acids Res. 25: 3389-3402), LPAATs and other acyltransferases are identified. Using sequence alignment software, the G3PAAT and various LPAAT amino acid sequences are aligned, and a profile is generated using a region of sequence homology between amino acids 256 and 459 of the E. coli sequence. The 204 amino acids identified are used to search the protein database using PSI-BLAST. After 5 iterations of PSI-BLAST, the profile generated from this new search (Figure 1) identifies a soluble form of G3PAAT. Prior to this identification, no sequence homology had been detected between the membrane form and the soluble form of G3PAAT.
EXAMPLE 3 PSI-BLAST profile excision

The profile generated from the PSI-BLAST searches is excised from the hypertext markup language (html) file. The world wide web (www)/html interface to PSI-BLAST at NCBI stores the currently generated profile matrix in a hidden field in the html file that is returned after each PSI-BLAST iteration. However, this matrix is encoded in string62 (s62) format for ease of transport through html. The string62 format is a simple conversion of the matrix values into html-legal ASCII characters. The encoded matrix has a width (x axis) of 26 characters and comprises the consensus character, the probabilities of each amino acid in the order A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, V, W, X, Y, Z (where B represents D and N, Z represents Q and E, and X represents any amino acid), the gap creation value, and the gap length value. The length (y axis) of the matrix corresponds to the length of the sequence identified by PSI-BLAST. The order of the amino acids corresponds to the conserved amino acid sequence of the sequence identified using PSI-BLAST, with the N-terminus at the top of the matrix. The probabilities of the other amino acids at that position are represented for each amino acid along the x axis, below the single-letter abbreviation of the respective amino acid. Thus, each row of the profile consists of the highest-scoring (consensus) amino acid, followed by the score for each possible amino acid at that position in the sequence matrix, the score for creating a gap at that position, and the score for lengthening a gap at that position.
The string62 file is converted back into a profile for use in subsequent searches. The gap-open fields are set to 11 and the gap-length fields are set to 1 along the x axis. The gap creation and gap length values are known, based on the settings given by the PSI-BLAST algorithm. The matrix is exported in the standard GCG profile form, a format that can be read by GenWeb. The algorithm used to convert the string62-formatted file to the matrix is outlined in Table 1.
TABLE 1

1. If the encoded character is z, then the value is the minimum BLAST score.
2. If the encoded character is Z, then the value is the maximum BLAST score.
3. If the encoded character is an upper-case letter, then the value is (64 - (ASCII # of the character)).
4. If the encoded character is a digit, then the value is ((ASCII # of the character) - 48).
5. If the encoded character is a lower-case letter, then the value is ((ASCII # of the character) - 87).
6. All B positions are set to the minimum of amino acids D and N in that row of the sequence matrix.
7. All Z positions are set to the minimum of amino acids Q and E in that row of the sequence matrix.
8. All X positions are set to the minimum of all amino acids in that row of the sequence matrix.
9. kBLAST_evaluation_MAX = 999;
10. kBLAST_evaluation_MIN = 999;
11. All gap-open values are set to 11.
12. All gap-length values are set to 1.

EXAMPLE 4 Identification of amino acid sequences related to novel acyltransferases

The profile (Figure 1) is used in further searches to identify a number of previously unidentified yeast proteins as novel acyltransferases. One protein (ATAT1) (SEQ ID NO: 2) is identified from an Arabidopsis protein sequence database. Sequences are also identified from the nucleic acid databases (Table 2).
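The character-decoding rules of Table 1 (rules 1 to 5) can be sketched as follows. This is an illustration under stated assumptions, not the code used for the searches: the function names are hypothetical, the minimum score is assumed to be -999 (Table 1 prints 999 for both limits, which is taken here to be a lost minus sign), and the B, Z and X column rules (6 to 8) are not implemented.

```python
K_BLAST_SCORE_MAX = 999
K_BLAST_SCORE_MIN = -999  # assumption: the table prints 999, sign presumed lost

GAP_OPEN = 11    # rule 11
GAP_LENGTH = 1   # rule 12


def decode_string62_char(ch):
    """Decode one string62 character into an integer score (rules 1-5)."""
    if ch == "z":
        return K_BLAST_SCORE_MIN
    if ch == "Z":
        return K_BLAST_SCORE_MAX
    if "A" <= ch <= "Y":
        return 64 - ord(ch)   # 'A' -> -1, 'B' -> -2, ...
    if "0" <= ch <= "9":
        return ord(ch) - 48   # '0' -> 0 ... '9' -> 9
    if "a" <= ch <= "y":
        return ord(ch) - 87   # 'a' -> 10, 'b' -> 11, ...
    raise ValueError(f"not a string62 character: {ch!r}")


def decode_string62_row(encoded):
    """Decode a run of string62 characters into a list of scores."""
    return [decode_string62_char(ch) for ch in encoded]
```

The three ranges together explain the name: ten digits, 26 lower-case and 26 upper-case letters give 62 symbols, with z and Z reserved for the extreme scores.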
TABLE 2

gi number     Description                                   Log probability   SEQ ID NO
gi 320748     Putative LPAAT from Limnanthes                e-19              219
gi 2506920    Suppresses CTR1 (choline transport mutant)                      220
gi 549627     Similar to CTR1                               e-118             221
gi 2133031    Unidentified                                                    222
gi 2132939    Unidentified                                                    223
gi 2132299    TAFAZZIN                                      e-14              224

In Table 2, the gi number is the database identifier, the middle column shows the results of the BLAST search against the NCBI NR protein database, and the log probability number shown represents the logarithm of the probability that such a match occurs by chance. These proteins, including the ATAT1 protein sequence, are identified using the original PSI-BLAST search of the NCBI NR protein database. Thus, these proteins are novel acyltransferase-related proteins with unidentified activities. The Arabidopsis acyltransferase sequence, referred to here as ATAT1, is also identified using the original PSI-BLAST search of the NCBI NR protein database, and has no assigned function. Additional Arabidopsis amino acid sequences related to acyltransferases are identified from the databases, referred to as ATAT2est, ATAT3est, ATAT4est, ATAT5est, ATAT6est, ATAT7est, ATAT8est, ATAT9, ATAT10, and ATAT11est. In addition, an Arabidopsis amino acid sequence is identified that shows sequence similarity to the known lysophosphatidic acid acyltransferases, referred to as ATLPAAT1. The ATAT9 and ATAT10 sequences are identified from the database as genomic sequences; all other Arabidopsis sequences are identified as ESTs.
EXAMPLE 5 Sequence analysis of the novel acyltransferases

To obtain the complete coding region corresponding to the Arabidopsis acyltransferase sequences, synthetic oligonucleotide primers were designed to amplify the 5' and 3' ends of the partial cDNA clones containing acyltransferase-related sequences. The primers were designed according to the respective Arabidopsis acyltransferase-related sequence (Table 3) and were used in rapid amplification of cDNA ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85: 8998-9002) using the Marathon cDNA amplification kit (Clontech Laboratories Inc., Palo Alto, CA). Primers with an R designation were used for the 5' RACE reactions and primers with an F designation were used for the 3' RACE reactions.

TABLE 3

From the nucleic acid sequences obtained from the RACE reactions, the protein sequence is predicted for each nucleic acid sequence using MacVector software. Nucleic acid sequences are provided for ATAT1 (SEQ ID NO: 1), ATAT2 (SEQ ID NO: 3), ATAT3 (SEQ ID NO: 5), ATAT4 (SEQ ID NO: 7), ATAT5 (SEQ ID NO: 9), ATAT6 (SEQ ID NO: 10), ATAT7 (SEQ ID NO: 12), ATAT8 (SEQ ID NO: 14), ATAT9 (SEQ ID NO: 16), ATAT10 (SEQ ID NO: 18), ATAT11 (SEQ ID NO: 20) and ATLPAAT1 (SEQ ID NO: 22), respectively. The protein sequence derived from the Arabidopsis ATAT1 nucleic acid sequence (SEQ ID NO: 2) has a predicted molecular mass of 32.5 kDa and a pI of 9.74. Alignment of the Arabidopsis acyltransferases with several LPAATs and G3PAATs shows that some of the domains conserved between LPAATs and G3PAATs are also conserved in the novel acyltransferase proteins.
The ATAT2 nucleic acid sequence is predicted to encode a protein of 312 amino acids (SEQ ID NO: 4), with a molecular weight of 34.6 kD and a pI of 9.99. The ATAT2 protein may also contain 2 to 3 transmembrane domains. However, the protein encoded by the ATAT2 nucleic acid sequence may be larger than predicted, because no stop codon is present within the reading frame upstream (toward the 5' end) of the ATG start codon used. The ATAT3 nucleic acid sequence is predicted to encode a protein of 398 amino acids (SEQ ID NO: 6), with a molecular weight of 44.7 kD and a pI of 5.62. The ATAT3 protein can contain from 1 to 4 transmembrane domains. The ATAT4 nucleic acid sequence is predicted to encode a protein of 317 amino acids (SEQ ID NO: 8), with a molecular weight of 36.5 kD and a pI of 9.67. The ATAT4 protein is predicted to have 2 to 5 transmembrane domains. The ATLPAAT1 nucleic acid sequence is predicted to encode a protein of 389 amino acids (SEQ ID NO: 23), with a molecular weight of 43.7 kD and a pI of 9.52. The ATLPAAT1 protein is predicted to have more than 3 transmembrane domains. The protein predicted from the ATLPAAT1 nucleic acid sequence is similar to LPAATs reported for Brassica, corn, and Limnanthes douglasii (described in PCT publication WO 94/13814). The ATAT11 nucleic acid sequence is predicted to encode a protein of 375 amino acids (SEQ ID NO: 21), with a molecular weight of 43.5 kD and a pI of 9.45. The deduced amino acid sequences of ATAT6 (SEQ ID NO: 11), ATAT7 (SEQ ID NO: 13), ATAT8 (SEQ ID NO: 15), ATAT9 (SEQ ID NO: 17), and ATAT10 (SEQ ID NO: 19) are also provided.
A sequence region extending from about 30 amino acids upstream to approximately 100 amino acids downstream of the conserved amino acid sequences HXXXXD (Heath and Rock (1998) J. Bacteriol. 180(6):1425-1430) and PEG (Neuwald (1997) Curr. Biol. 7:R465-R466) in the predicted amino acid sequences of ATAT1, ATAT2, ATAT3, ATAT4, ATAT6, ATAT7, ATAT8, ATAT9, ATAT10, ATLPAAT1, and ATAT11 was compared to the amino acid sequences of lysophosphatidic acid acyltransferases (jojoba AT (SEQ ID NO: 162; the nucleic acid sequence is provided in SEQ ID NO: 161), corn AT (PCT publication WO 94/13814), coconut PLSC (GenBank accession 1098605), Limnanthes PLSC (GenBank accession 1209507), E. coli PLSC (GenBank accession 1209507), and yeast PLSC (GenBank accession 464422)) and glycerol-3-phosphate acyltransferases (E. coli PLSB (GenBank accession 130326) and mouse PLSB (GenBank accession 2498786)) (Figure 2), and similarities were identified (Figure 2 and Figure 3).
The sequence comparison revealed that several classes of acyltransferases exist, based on the conserved amino acid sequences identified in the comparisons in Figure 2. For example, ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved amino acid sequences VTYSXS (SEQ ID NO: 128), VXLTRXR (SEQ ID NO: 129), and LXXGDLV (SEQ ID NO: 132) between the HXXXXD and PEG sequences. In addition, ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved sequence CPEGT (SEQ ID NO: 130), which comprises the PEG sequence, as well as IVPVA (SEQ ID NO: 131) and VANXXQ (SEQ ID NO: 134) (Figure 2) downstream of the PEG sequence. The sequences corresponding to ATAT1, ATAT7, and ATAT9 are the most closely related within this class, with similarities of 67.0% between ATAT1 and ATAT9, 58.2% between ATAT1 and ATAT7, and 63.9% between ATAT9 and ATAT7 (Figure 3B). Sequence comparisons also show that the ATLPAAT1 sequence is most closely related to the LPAATs of jojoba (82.3% similarity) and maize (78% similarity). In addition, the sequence analysis shows that ATAT4 is the most divergent sequence, with its highest similarity being to ATAT10 (18.5%). Its greatest similarity to a known sequence (15.3%) is to the LPAAT of Limnanthes douglasii. However, ATAT4 and ATAT10 share several conserved peptide sequences with the amino acid sequences of ATAT2 and ATAT3 (Figure 2): VXNHXS (SEQ ID NO: 127), where H is the conserved H of the HXXXXD sequence, and FXXGAF (SEQ ID NO: 133), downstream of the PEG sequence.
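The conserved peptide sequences named above can be located in a protein sequence with a simple pattern search, in which X stands for any amino acid. The following is a minimal sketch and not the alignment method used to produce Figure 2. The fragment is residues 97 to 192 of the ATAT1 protein (SEQ ID NO: 2) as read from the sequence listing and converted to one-letter code, which is an assumption of this example, and the function names are hypothetical.

```python
import re

# Residues 97-192 of SEQ ID NO: 2, read from the sequence listing and
# converted to one-letter code (an assumption of this example).
ATAT1_FRAGMENT = (
    "LNHRTALDPIIVAIALGRKICCVTYSVSRLSLMLSPIPAVALTRDRAT"
    "DAANMRKLLEKGDLVICPEGTTCREEYLLRFSALFAELSDRIVPVAMN"
)

# Conserved sequences named in the text; X means "any amino acid".
MOTIFS = ["HXXXXD", "VTYSXS", "VXLTRXR", "LXXGDLV", "CPEGT", "IVPVA"]


def motif_to_regex(motif):
    """Turn a motif such as 'HXXXXD' into a compiled regular expression."""
    return re.compile(motif.replace("X", "."))


def find_motifs(sequence, motifs):
    """Return {motif: (1-based start, matched text)}, or None if absent."""
    hits = {}
    for motif in motifs:
        match = motif_to_regex(motif).search(sequence)
        hits[motif] = (match.start() + 1, match.group()) if match else None
    return hits


for motif, hit in find_motifs(ATAT1_FRAGMENT, MOTIFS).items():
    print(motif, hit)
```

In this fragment the VTYSXS, VXLTRXR and LXXGDLV matches fall between the HXXXXD match and the CPEGT match, and IVPVA follows CPEGT, in agreement with the order described for ATAT1.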
EXAMPLE 6 Identification of additional acyltransferase sequences
The novel Arabidopsis sequences identified above are used to search databases containing soybean and maize EST sequences. The results of these searches identify EST sequences from soybean (SEQ ID NO: 24 to SEQ ID NO: 85) as well as from maize (SEQ ID NO: 86 to SEQ ID NO: 126) as encoding acyltransferase-related proteins. Comparison between the various EST sequences and the complete Arabidopsis sequences reveals that the identified EST sequences demonstrate greater similarity to the various Arabidopsis sequences, as determined by the BLAST scores. The expressed sequence tag (EST) sequences from the soybean and maize databases that are most closely related, by BLAST score, to each Arabidopsis sequence were identified: ATAT1 (SEQ ID NOS: 24-29 and SEQ ID NOS: 86-88, respectively), ATAT2 (SEQ ID NO: 30 and SEQ ID NO: 89, respectively), ATAT3 (SEQ ID NOS: 31-35 and SEQ ID NOS: 90-94, respectively), ATAT4 (SEQ ID NOS: 36-44 and SEQ ID NOS: 95-100, respectively), ATAT6 (SEQ ID NOS: 45-49 and SEQ ID NO: 101, respectively), ATAT7 (SEQ ID NOS: 50-54 and SEQ ID NOS: 102-103, respectively), ATAT8 (SEQ ID NOS: 55-56 and SEQ ID NO: 104, respectively), ATAT9 (SEQ ID NOS: 57-79 and SEQ ID NOS: 105-111, respectively), ATAT10 (SEQ ID NOS: 80-81 and SEQ ID NO: 112, respectively), ATAT11 (SEQ ID NOS: 82-85 and SEQ ID NOS: 123-126, respectively), and ATLPAAT1 (SEQ ID NOS: 113-122).
EXAMPLE 7 Preparation of Expression Constructs
A series of synthetic oligonucleotide primers was prepared for use in polymerase chain reactions (PCR) to amplify the entire DNA sequences encoding the several acyltransferase sequences identified above. The primer sequences are listed in Table 4.
TABLE 4
Primer    Sequence (listed 5' - 3')    SEQ ID NO:
ATAT1F    AAGCTTGCATGCGTCGACACAATGGTTCATGCGACCAAGTCAG    163
ATAT1R    GGTACCGTCGACTCACTTCTTGGTGTTGTTGATAG    164
ATAT2F    GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT    165
ATAT2R    GGATCCCCTCCAG riTAGAGATCCATTGATTCTGCAAT    166
ATAT3F    GGATCCGCGGCCGCATAATGGAATCAG? GCTCAAAGAT    167
ATAT3R    GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC    168
ATAT4F    OGATCCGC < CCGCACA? TGACTCG ^ rCACAAGATGrrTC TO    169
ATAT4R    GGATCC (XTGCAGGTCACTTCTCITCC ^    170
ATAT6F    GGATCCGCGGCCGCACAATGTCCGGTAATAAGATCTCGACTCTTCA    171
ATAT6R    GGATCCCCTG ^ GGTTATTTTTTCTTG ^ TACCGG    172
ATAT7F    ATATCCGCGGCCGCACAATGGTTATGGAGCAAGCTGGAA    173
ATAT7R    GGATCCCCTGCAGGTCAATGGAGACAAGGCTCGA? AGT    174
ATAT8F    GGATCCGCGGCCGC? O ^ ATGTCXXSCCAAGATTTCAATATTCC    175
ATAT8R    < K3Atc cct? ^ GG tAA t ttc rAAcrAcp ^    176
ATAT9F    GGATÍXG MCCGCACAATGGGAGCTCAGGAGAAACGGCGCC    177
ATAT9R    GGATCCCCTC ^ C; TCACGTCGTCTCCGTCTTCACCGG    178
ATAT10F    GGATCCGCGGCCGCACAATH3GCGGATCCTGATCTGTCTTCTCCT    179
ATAT10R    X3ATCCCCTX3CAGGTTATGTTCK3GGCCAAGTCAGGTGCAAAGAT    180
ATAT11F    GGATCCGCGGCCGCAAAATGGAAAAAAAGAGTGTACCAAATTCT    181
ATAT11R    GGATCCCCTGCAGGTTATTTGTTTACTAATTTGAGGGAATTTTTTG    182
ATLPAAT1F    TCGACCTGCAGGAAGCTTAAGGATGGTGATTGCTGC    183
ATLPAAT1R    GGATCCGCGGCCGCTTACTTCTCCTTCTCCG    184
YSCAT1F    GGATCCGC8CCGCACAATGTCTTTTAGGGATGTCCTAG    185
YSCAT1R    GGATCCCCTGCAGGTCAATCATCCTTACCCTTTGGTTTACC    186
YSCAT1 KO F    ATGTCTTTT? GGGATGTCCTAGAAAGAGGAGATGAATTTTCTGTGCGGTATTTCACACCG    187
YSCAT1 KO R    TO? TC? TCCrTACCCTTTGGTTTACCCrrCTGGAGGCAGA? GATTGTACTGAGAGTGCAC    188
YSCAT2F    GGATCCGCGGCCGCACAATGAAGCATTCCCA ?? AAT? CCGTAGG    189
YSCAT2R    GCATCCXCTGCaAGGTCAATGATTTTTT rCATCACAAATAC    190
YSCAT2 KO F    ATGAAGCATTCCC? AAAATACCGTAGGTATGGAATTTATGCTGTGCGGTATTTCACACCG    191
YSCAT2 KO R    TCAATGA "rtrrTTTCATCACAAATACAAGAATAAGAAAAAGATTGTACTGAG, AGTGCAC    192
YSCAT3F    GGATCCGCGCCCGCACAATGGGTTTTGTTGATTTCTTCGAAAC    193
YSCAT3R    GGATCCC TCX: AGGTTATTTGGTCTCAATTTTAATATTTTTTTGC    194
YSCAT3 KO F    ATGGGTTTTGTTGATTTC TCGAAACATATATGGTCGGTTCTGTGCGGTATTTCACACCG    195
YSCAT3 KO R    TTATTTGGTCGCAATGTGAATATTTTGGGGCAAGGACTCGAGATGGTACTGAGAGTGCAC    196
YSCAT4F    GGATCCGCGGCCGCACAATGGAAAAGTACACCAATTGGAGAGAC    197
YSCAT4R    GGATCCCCTGCAGGCTACITCC, l \, lTACGTTGATCGCTG    198
YSCAT4 KO F    ATGGAAAAGTACACCAATTGGAGAGACAATGGTACGGGAACTGTGCGGT? TTTCACACCG    199
YSCAT4 KO R    CTACTTCCTCI I l'ACGTTGATCGCTGATATATTCCTTCAGATTGTACTGAGAGTGCAC    200
YSCAT5F    GGATCCG «? 3CCGCACAATGCCWXACCjftAAA rCÁCGGAG    201
YSCAT5R    GGATCCCCTGCAGG ^ rrACGCATCTCCTTCTTTCCCTCTC    202
YSCAT5 KO F    ATGCCTGCACCA? AACTCACGGAGAAATCTGCCTCTTCCACTGTGCGGTATTTCACACCG    203
YSCAT5 KO R    CTACGCATCrc TTCTTTCCCTTCTTCTTCI CTTC ^ ? CATTGT? CTGAG? GTGCAC    204
YSCAT6F    GGATCCO GCCGCACAATGTCTGCTCCCGCTGCCGATCATAACGC    205
YSCAT6R    GG? TCCCXrTGCAGGTCATTCrTTCrTTrcGTCTG    206
YSCAT6 KO F    ATGTCTGCTCCCGCTGCCGATCATAACGCTGCCAAACCTACTGTGCGGTATTTCACACCG    207
YSCAT6 KO R    CAT CTT CT TTCGTGT crcTrr CTGTcAGATTGTACTGAGAGTGCAC    208
YSCAT7F    GiATCCGCGGCCC ?: ACAATGCTGCATC? AAAAATAGCTCATAAAGTTCG    209
YSCAT7R    GGATCC < XTr? CAC «TCAAAAAATAAAACAATAAAGTTTATAAACTAACC    210
YSCAT7 KO F    ATGCTGC? TCAAAA? ATAGCTCAT ??? GTTCG ??? AGTCGCTGTGCGGTATTTCACACCG    211
YSCAT7 KO R    TCAAAAAATAAAACAATAAAGTTTATAAACTAACCAAATTAGATTGTACTGAGAGTGCAC    212
YSCAT8F    GG? T (G x? X »CACMTGAGTGTGATAGGTAC? RtCTTG    213
YSCAT8R    GGATCCCCTGCAGGTTAATGCAT l 'i I ACAGATGAACC    214
YSCAT8 KO F    ATGAGTGTGATAGGTAGGTTCTTGTATtACTTGAGGTCCGCTGTGCGGTATTTCACACCG    215
YSCAT8 KO R    TTAATGCATX TTTTTACAGATGAACCTTCGTTATGGGTAAGATTGTACTGAGAGTGCAC    216
The complete coding regions for each of the acyltransferase sequences were amplified using the respective primers listed in Table 4 above, were cloned into the vector pCR2.1Topo (Invitrogen) or pZero (Invitrogen), and were designated pCGN8558 (ATAT1), pCGN8564 (ATAT2), pCGN8565 (ATAT3), pCGN8566 (ATAT4), pCGN8918 (ATAT6), pCGN8913 (ATAT7), pCGN8904 (ATAT8), pCGN9970 (ATAT9), pCGN9940 (ATAT10), pCGN8567 (ATAT11), pCGN8632 (ATLPAAT1), pCGN9901 (YSCAT1, also referred to as gi2132299), pCGN9902 (YSCAT2, also referred to as gi1078509), pCGN9903 (YSCAT3, also referred to as gi2132939), pCGN9904 (YSCAT4, also referred to as gi2133031), pCGN9905 (YSCAT5, also referred to as gi320748), pCGN9906 (YSCAT6, also referred to as gi549627), pCGN9907 (YSCAT7, also referred to as gi586485), and pCGN9908 (YSCAT8, also referred to as gi466622). Nucleic acid sequences for the respective yeast acyltransferases are provided for YSCAT1 (SEQ ID NO: 225), YSCAT2 (SEQ ID NO: 226), YSCAT3 (SEQ ID NO: 227), YSCAT4 (SEQ ID NO: 228), YSCAT5 (SEQ ID NO: 229), YSCAT6 (SEQ ID NO: 230), YSCAT7 (SEQ ID NO: 231), and YSCAT8 (SEQ ID NO: 232).
7A. Baculovirus expression constructs
Constructs were prepared to direct the expression of the Arabidopsis ATAT sequences in cultured insect cells. The complete coding regions of ATAT1, 2, 3, 4, 6, 7, 8, 9, 10 and 11 were cloned into the vector pFastBac1 (Gibco-BRL, Gaithersburg, MD) digested with NotI and PstI. The respective coding sequences were cloned as NotI/Sse8387I fragments. Double-stranded DNA sequence was obtained to verify that no errors were introduced by the PCR amplification.
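The design of the primers in Table 4 can be checked against the coding sequences in the sequence listing. The sketch below is offered only as an illustration under stated assumptions: it uses the ATAT3R primer (SEQ ID NO: 168) and the last 27 bases of the ATAT3 coding sequence (SEQ ID NO: 5) as printed, the recognition sequences of BamHI, NotI and Sse8387I, and hypothetical function names. It locates the restriction sites in the primer and tests whether the remaining 3' portion is the reverse complement of the end of the coding sequence.

```python
# Recognition sequences of the restriction sites used in the cloning scheme.
SITES = {"BamHI": "GGATCC", "NotI": "GCGGCCGC", "Sse8387I": "CCTGCAGG"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_sites(primer):
    """Return {site name: 0-based position} for each site found in the primer."""
    primer = primer.upper()
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def gene_specific_part(primer):
    """Portion of the primer 3' of the last recognized restriction site."""
    primer = primer.upper()
    end = 0
    for site in SITES.values():
        pos = primer.rfind(site)
        if pos != -1:
            end = max(end, pos + len(site))
    return primer[end:]


ATAT3R = "GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC"  # Table 4, SEQ ID NO: 168
ATAT3_CDS_END = "cgtgatttccatcagaaagaagaatga"       # last 27 bases of SEQ ID NO: 5 as printed

sites = locate_sites(ATAT3R)
tail = gene_specific_part(ATAT3R)
anneals = ATAT3_CDS_END.upper().endswith(reverse_complement(tail))
print(sites)
print(tail, anneals)
```

For a forward primer the same check would compare the portion following the NotI site directly with the start of the coding sequence, without taking the reverse complement.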
The resulting baculovirus expression plasmids were designated pCGN9723 (ATAT1), pCGN9724 (ATAT2), pCGN9725 (ATAT3), pCGN9726 (ATAT4), pCGN9727 (ATAT5), pCGN9728 (ATAT7), pCGN9729 (ATAT8), pCGN9730 (ATAT10), and pCGN9731 (ATAT11).
7B. Preparation of plant expression constructs
A plasmid containing the napin cassette derived from pCGN3223 (described in USPN 5,639,790, the entirety of which is incorporated herein by reference) was modified to make it more useful for cloning large DNA fragments containing multiple restriction sites, and to allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An adapter comprising the self-annealing oligonucleotide of sequence CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT (SEQ ID NO: 233) was ligated into the cloning vector pBC SK+ (Stratagene) after digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resulting vector, pCGN7770, contains the pCGN7765 backbone with the napin seed-specific expression cassette from pCGN3223.
The cloning cassette pCGN7787 contains essentially the same regulatory elements as pCGN7770, with the exception that the regulatory regions of pCGN7770 have been replaced with the double CaMV 35S promoter and the tml polyadenylation and transcription termination region.
A binary vector for plant transformation, pCGN5139, was constructed from pCGN1558 (McBride and Summerfelt (1990) Plant Molecular Biology 14:269-276). The polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI. The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.
A series of turbo binary vectors was constructed to allow rapid cloning of DNA sequences into binary vectors containing transcription initiation regions (promoters) and transcription termination regions.
Plasmid pCGN8618 was constructed by ligating the oligonucleotides 5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO: 234) and 5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO: 235) into pCGN7770 digested with SalI/XhoI. A fragment containing the napin promoter, polylinker and napin 3' region was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment and then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter was closest to the Asp718I site of pCGN5139 and the napin 3' region was closest to the HindIII site was subjected to sequence analysis to confirm both the insert orientation and the integrity of the cloning junctions. The resulting plasmid was designated pCGN8622.
Plasmid pCGN8619 was constructed by ligating the oligonucleotides 5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO: 236) and 5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO: 237) into pCGN7770 digested with SalI/XhoI. A fragment containing the napin promoter, polylinker and napin 3' region was excised from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment and then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter was closest to the Asp718I site of pCGN5139 and the napin 3' region was closest to the HindIII site was subjected to sequence analysis to confirm both the insert orientation and the integrity of the cloning junctions.
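The oligonucleotide pairs used to build the polylinkers can be examined in the same way. The following sketch, offered as an illustration with hypothetical function names, takes the oligonucleotides of SEQ ID NO: 234 and SEQ ID NO: 235 as given in the text, checks that they are complementary apart from their first four bases, and reports the resulting 5' overhangs. That a TCGA overhang is compatible with SalI- and XhoI-cut ends is general knowledge about those enzymes and is not a statement taken from this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealed_overhangs(top, bottom, overhang_len=4):
    """Return the 5' overhangs left when two oligonucleotides anneal.

    The oligonucleotides are assumed to pair over everything except their
    first `overhang_len` bases; a ValueError is raised if they do not.
    """
    core_top = top.upper()[overhang_len:]
    core_bottom = bottom.upper()[overhang_len:]
    if reverse_complement(core_top) != core_bottom:
        raise ValueError("oligonucleotides are not complementary outside the overhangs")
    return top.upper()[:overhang_len], bottom.upper()[:overhang_len]


SEQ_234 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"
SEQ_235 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"

print(annealed_overhangs(SEQ_234, SEQ_235))

# Restriction sites carried by the double-stranded core of the linker.
core = SEQ_234[4:]
for name, site in (("BamHI", "GGATCC"), ("NotI", "GCGGCCGC"),
                   ("HindIII", "AAGCTT"), ("Sse8387I", "CCTGCAGG")):
    print(name, core.find(site))
```

Because both overhangs are identical, the annealed linker can ligate in either orientation, which is consistent with the text describing sequence analysis of the final plasmids to confirm insert orientation.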
The plasmid resulting from the pCGN8619 construction was designated pCGN8623.
Plasmid pCGN8620 was constructed by ligating the oligonucleotides 5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO: 238) and 5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO: 239) into pCGN7787 digested with SalI/XhoI. A fragment containing the d35S promoter, polylinker and tml 3' region was removed from pCGN8620 by complete digestion with Asp718I and partial digestion with NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment and then ligated into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the d35S promoter was closest to the Asp718I site of pCGN5139 and the tml 3' region was closest to the HindIII site was subjected to sequence analysis to confirm both the insert orientation and the integrity of the cloning junctions. The resulting plasmid was designated pCGN8624.
Plasmid pCGN8621 was constructed by ligating the oligonucleotides 5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO: 240) and 5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO: 241).
The coding regions of the several acyltransferase sequences were cloned as NotI/Sse8387I fragments into pCGN8622, pCGN8623, pCGN8624 and pCGN8625, for expression in the sense or antisense orientation from a tissue-preferential promoter, napin, or from the 35S promoter. The fragments that were cloned into the vector pCGN8622 created the constructs pCGN8901 (ATAT1), pCGN8571 (ATAT2), pCGN8909 (ATAT3), pCGN8596 (ATAT4), pCGN8919 (ATAT6), pCGN8914 (ATAT7), pCGN8905 (ATAT8), pCGN9973 (ATAT9), pCGN9942 (ATAT10), pCGN8575 (ATAT11), and pCGN8633 (ATLPAAT1) for sense expression of the respective coding sequences from the napin promoter.
Fragments that were cloned into vector pCGN8623 created the constructs pCGN8900 (ATAT1), pCGN8572 (ATAT2), pCGN8910 (ATAT3), pCGN8597 (ATAT4), pCGN8920 (ATAT6), pCGN8915 (ATAT7), pCGN8906 (ATAT8), pCGN9972 (ATAT9), pCGN9943 (ATAT10), pCGN8576 (ATAT11), and pCGN8634 (ATLPAAT1) for antisense expression of the respective coding sequences from the napin promoter. Fragments that were cloned into vector pCGN8624 created the constructs pCGN8903 (ATAT1), pCGN8573 (ATAT2), pCGN8911 (ATAT3), pCGN8598 (ATAT4), pCGN8921 (ATAT6), pCGN8916 (ATAT7), pCGN8907 (ATAT8), pCGN9971 (ATAT9), pCGN9944 (ATAT10), pCGN8577 (ATAT11), and pCGN8635 (ATLPAAT1) for sense expression of the respective coding sequences from the 35S promoter. The fragments that were cloned into the vector pCGN8625 created the constructs pCGN8902 (ATAT1) and pCGN9974 (ATAT9) for antisense expression of the respective coding sequences from the 35S promoter.
In addition, the sequences encoding the yeast acyltransferases were cloned into the vector pCGN8624, creating the constructs pCGN9926 (YSCAT1), pCGN9927 (YSCAT2), pCGN9928 (YSCAT3), pCGN9929 (YSCAT4), pCGN9930 (YSCAT5), pCGN9931 (YSCAT6), pCGN9932 (YSCAT7), and pCGN9933 (YSCAT8). These constructs allow sense expression of the respective acyltransferase coding sequences from the 35S promoter in plant cells.
EXAMPLE 8 Plant transformation
A variety of methods have been developed to insert a DNA sequence of interest into the genome of a plant host to obtain transcription, or transcription and translation, of the sequence to effect phenotypic changes. Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports (1992) 11:499-505). Transgenic Arabidopsis thaliana plants can be obtained by Agrobacterium-mediated transformation as described by Valvekens et al. (Proc. Nat. Acad. Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994) Science 265:1856-1860), Bechtold et al. ((1993) C.R. Acad. Sci., Life Sciences 316:1194-1199) or Clough et al. ((1998) Plant J. 16:735-43). Other plant species can similarly be transformed using related techniques. Alternatively, microprojectile bombardment methods, such as those described by Klein et al. (Bio/Technology 10:286-291), can also be used to obtain plants with nuclear transformations.
The above results demonstrate that the identified nucleic acid sequences encode proteins that are related to the sequences of acyltransferase proteins. Said acyltransferase sequences are used in preparing expression constructs for plant transformation.
All publications and patent applications mentioned in this specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.
SEQUENCE LISTING
< 110 > Lassner, Michael W
Emig, Robin A
Ruezinsky, Diane
Van Eenennaam, Alison
< 120 > Novel Plant Acyltransferases
< 130 > 17029/00/WO
< 140 >
< 141 >
< 150 > 60/101,939
< 151 > 1998-09-25
< 160 > 241
< 170 > PatentIn Ver. 2.0
< 210 > 1 < 211 > 869 < 212 > DNA < 213 > Arabidopsis sp.
< 400 > 1
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata 60
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac 120
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac 180
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc 240
atcgtcctcc acctccttcc cctggaactc ttggcaacct ctatgtcctt aaccaccgta 300
ccgcgcttga tcccatcatc gtcgctattg ctcttggacg taagatctgt tgcgtcactt 360
acagtgtctc tcgtctctcc cttatgcttt ctcctattcc tgctgttgcc ctcacccgtg 420
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat 480
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag 540
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga 600
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa 660
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca 720
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg 780
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaac gacggcaagg 840
tggagtctat caacaacacc aagaagtga 869
< 210 > 2 < 211 > 289 < 212 > PRT < 213 > Arabidopsis sp.
< 400 > 2
Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu 1 5 10 15
Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr 20 25 30
Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile 35 40 45
Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe 50 55 60
Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly 65 70 75 80
His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val 85 90 95
Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu 100 105 110
Gly Arg Lys Ile Cys Cys Val Thr Tyr Ser Val Ser Arg Leu Ser Leu 115 120 125
Met Leu Ser Pro Ile Pro Ala Val Ala Leu Thr Arg Asp Arg Ala Thr 130 135 140
Asp Ala Ala Asn Met Arg Lys Leu Leu Glu Lys Gly Asp Leu Val Ile 145 150 155 160
Cys Pro Glu Gly Thr Thr Cys Arg Glu Glu Tyr Leu Leu Arg Phe Ser 165 170 175
Ala Leu Phe Ala Glu Leu Ser Asp Arg Ile Val Pro Val Ala Met Asn 180 185 190
Cys Lys Gln Gly Met Phe Asn Gly Thr Thr Val Arg Gly Val Lys Phe 195 200 205
Trp Asp Pro Tyr Phe Phe Phe Met Asn Pro Arg Pro Ser Tyr Glu Ala 210 215 220
Thr Phe Leu Asp Arg Leu Pro Glu Glu Met Thr Val Asn Gly Gly Gly 225 230 235 240
Lys Thr Pro Ile Glu Val Ala Asn Tyr Val Gln Lys Val Ile Gly Ala 245 250 255
Val Leu Gly Phe Glu Cys Thr Glu Leu Thr Arg Lys Asp Lys Tyr Leu 260 265 270
Leu Leu Gly Gly Asn Asp Gly Lys Val Glu Ser Ile Asn Asn Thr Lys 275 280 285
Lys
< 210 > 3 < 211 > 939 < 212 > DNA < 213 > Arabidopsis sp.
< 400 > 3 atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60 agacgtactg gcattcaatg gtctaaccgc tctttaagac atgatcatta cagatttctt 120 gataagaaat cacctagatc aagtcaattg gcaagagata tcactgtgag agcagatctt 180 tcaggagctg caacccctga ctcttctttt cctgaaccag agattaagtt gagctcaaga 240 ctcagaggga tattcttttg tgttgttgct ggcatttcgg ctacttttct cattgtcctg 300 atgattattg ggcatccgtt cgtccttctc ttcgatccct ataggagaaa attccaccac 360 ttcattgcta aactttgggc ttccataagc atttatccgt tttacaaaat caacatcgag 420 ggtttggaga atctgccatc atcagacact cctgctgtat atgtttcaaa ccaccaaagt 480 tttctggata totacacact tcttagtctt ggaaaaagct ttaagttcat cagcaagaca 540 gggatattcg taattcccat catcggttgg gccatgtcca tgatgggtgt cgttcccttg 600 aagcggatgg acccaagaag ccaagtggat tgcttaaaac gctgcatgga acttttaaag 660 aagggagcat ctgtgttttt cttcccagaa ggaacacgga gtaaggatgg tcggttaggt 720 i tctttcaaga aaggcgcatt tacagtggct gcgaagaccg gagttgcagt agttccaata 780 gaacaggcaa acgctaatgg tatcatgcca acgggtagtg aaggtatact gaaccatggg 840 i aatgtgagag tt atcatcca taaaccaata catggaagca aagcggatgt tctttgcaas 900 gaggccagaa gcaagattgc agaatcaatg gatctctaa 939 < 210 > 4 < 211 > 312 < 212 > PRT < 213 > Arabidopsis sp. 
< 400 > 4
Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Pro Ser Glu Lys Phe 1 5 10 15
Met Gly Glu Thr Arg Arg Thr Gly Ile Gln Trp Ser Asn Arg Ser Leu 20 25 30
Arg His Asp Pro Tyr Arg Phe Leu Asp Lys Lys Ser Pro Arg Ser Ser 35 40 45
Gln Leu Ala Arg Asp Ile Thr Val Arg Ala Asp Leu Ser Gly Ala Ala 50 55 60
Thr Pro Asp Ser Ser Phe Pro Glu Pro Glu Ile Lys Leu Ser Ser Arg 65 70 75 80
Leu Arg Gly Ile Phe Phe Cys Val Val Ala Gly Ile Ser Ala Thr Phe 85 90 95
Leu Ile Val Leu Met Ile Ile Gly His Pro Phe Val Leu Leu Phe Asp 100 105 110
Pro Tyr Arg Arg Lys Phe His His Phe Ile Ala Lys Leu Trp Ala Ser 115 120 125
Ile Ser Ile Tyr Pro Phe Tyr Lys Ile Asn Ile Glu Gly Leu Glu Asn 130 135 140
Leu Pro Ser Ser Asp Thr Pro Ala Val Tyr Val Ser Asn His Gln Ser 145 150 155 160
Phe Leu Asp Ile Tyr Thr Leu Leu Ser Leu Gly Lys Ser Phe Lys Phe 165 170 175
Ile Ser Lys Thr Gly Ile Phe Val Ile Pro Ile Ile Gly Trp Ala Met 180 185 190
Ser Met Met Gly Val Val Pro Leu Lys Arg Met Asp Pro Arg Ser Gln 195 200 205
Val Asp Cys Leu Lys Arg Cys Met Glu Leu Leu Lys Lys Gly Ala Ser 210 215 220
Val Phe Phe Phe Pro Glu Gly Thr Arg Ser Lys Asp Gly Arg Leu Gly 225 230 235 240
Ser Phe Lys Lys Gly Ala Phe Thr Val Ala Ala Lys Thr Gly Val Ala 245 250 255
Val Val Pro Ile Thr Leu Met Gly Thr Gly Lys Ile Met Pro Thr Gly 260 265 270
Ser Glu Gly Ile Leu Asn His Gly Asn Val Arg Val Ile Ile His Lys 275 280 285
Pro Ile His Gly Ser Lys Ala Asp Val Leu Cys Asn Glu Ala Arg Ser 290 295 300
Lys Ile Ala Glu Ser Met Asp Leu 305 310
< 210 > 5 < 211 > 1197 < 212 > DNA < 213 > Arabidopsis sp.
< 400 > 5 atggaatcag agctcadaga tttgaattcg aattcgaatc ctccgtcgag caaagaggac 60 cggccgttac tgaaatcaga atccgatttg gcggctgcca ttgaagagtt agacaaaaag 120 ttcgcacctt acgcgaggac cgatttgtat gggacgatgg gtttgggtcc tttcccgatg 190 acggagaata ttaaattggc ggttgcattg gtgactcttg ttccattgcg gtttcttctc 240 tcgatgagca tcttgcttct ctattacttg atttgtaggg tatttacgct gttttctgct 300 ccttatcgtg ggccagagga agaggaagat gaaggtggag ttgtttttca ggaagattat 360 gctcacatgg aaggttggaa acggactgtt atcgtccggt ctgggaggtt tctctctagg 420 gttttgcttt tcgtttttgg gttttattgg attcacgaga tcgagattca gctgtccaga 480 gacatggatt ctaatcctaa aactacttct acagagatta accagaaagg ggaagccgcc 540 acggaggaac ctgaaagacc tggagccatt gtgtccaatc atgtttcgta cttggacatt 600 ttgtatcata tgtctgcttc ttttccaagt tttgttgcca agagatcagt gggcaaactt 660 cctcttgttg gcctcattag caaatgcctt ggttgtgtct atgttcaaag agaagcaaaá 720 tcgcctgatt tcaagggtgt atctggcaca gtaaatgaaa gagttcgaga agctcatage 780 aataaatctg ctccaactat tatgcttttt ccagaaggaa caactaccaa tggagactac 840 ttacttacat tcaagacagg tgcatttttg gctggaactc cagttcttcc ggtaatatta 900 aaatatccgt atgagcgctt tgggatacca cagtgtggca tatccggggc acgccacatt 960 I ttattccttc tctgtcaagt cgtaaatcac ttggaagtca tacggttacc tgtatactac 1020 10 aagagaaaga ccatcccaag ctttatgcta cgatcccaaa gcaatgttcg gaaattaatg 1080 gccaccgagg gtaacttgat tctatcggag ttgggactta gcgacaaaag gatatatcac 1140 gcaactctca atggtaatct tagtcaaacc cgtgatttcc atcagaaaga agaatga 1197 < 210 > 6 < 211 > 398 < 212 > PRT < 213 > Arabidopsis sp. 
< 400 > 6
Met Glu Ser Glu Leu Lys Asp Leu Asn Ser Asn Ser Asn Pro Pro Ser 1 5 10 15
Ser Lys Glu Asp Arg Pro Leu Leu Lys Ser Glu Ser Asp Leu Ala Ala 20 25 30
Ala Ile Glu Glu Leu Asp Lys Lys Phe Ala Pro Tyr Ala Arg Thr Asp 35 40 45
Leu Tyr Gly Thr Met Gly Leu Gly Pro Phe Pro Met Thr Glu Asn Ile 50 55 60
Lys Leu Ala Val Ala Leu Val Thr Leu Val Pro Leu Arg Phe Leu Leu 65 70 75 80
Ser Met Ser Ile Leu Leu Leu Tyr Tyr Leu Ile Cys Arg Val Phe Thr 85 90 95
Leu Phe Ser Ala Pro Tyr Arg Gly Pro Glu Glu Glu Glu Asp Glu Gly 100 105 110
Gly Val Val Phe Gln Glu Asp Tyr Ala His Met Glu Gly Trp Lys Arg 115 120 125
Thr Val Ile Val Arg Ser Gly Arg Phe Leu Ser Arg Val Leu Leu Phe 130 135 140
Val Phe Gly Phe Tyr Trp Ile His Glu Ser Cys Pro Asp Arg Asp Ser 145 150 155 160
Asp Met Asp Ser Asn Pro Lys Thr Thr Ser Thr Glu Ile Asn Gln Lys 165 170 175
Gly Glu Ala Ala Thr Glu Glu Pro Glu Arg Pro Gly Ala Ile Val Ser 180 185 190
Asn His Val Ser Tyr Leu Asp Ile Leu Tyr His Met Ser Ala Ser Phe 195 200 205
Pro Ser Phe Val Ala Lys Arg Ser Val Gly Lys Leu Pro Leu Val Gly 210 215 220
Leu Ile Ser Lys Cys Leu Gly Cys Val Tyr Val Gln Arg Glu Ala Lys 225 230 235 240
Ser Pro Asp Phe Lys Gly Val Ser Gly Thr Val Asn Glu Arg Val Arg 245 250 255
Glu Ala His Ser Asn Lys Ser Ala Pro Thr Ile Met Leu Phe Pro Glu 260 265 270
Gly Thr Thr Thr Asn Gly Asp Tyr Leu Leu Thr Phe Lys Thr Gly Ala 275 280 285
Phe Leu Ala Gly Thr Pro Val Leu Pro Val Ile Leu Lys Tyr Pro Tyr 290 295 300
Glu Arg Phe Ser Val Ala Trp Asp Thr Ile Ser Gly Ala Arg His Ile 305 310 315 320
Leu Phe Leu Leu Cys Gln Val Val Asn His Leu Glu Val Ile Arg Leu 325 330 335
Pro Val Tyr Tyr Pro Ser Gln Glu Glu Lys Asp Asp Pro Lys Leu Tyr 340 345 350
Ala Ser Asn Val Arg Lys Leu Met Ala Thr Glu Gly Asn Leu Ile Leu 355 360 365
Ser Glu Leu Gly Leu Ser Asp Lys Arg Ile Tyr His Ala Thr Leu Asn 370 375 380
Gly Asn Leu Ser Gln Thr Arg Asp Phe His Gln Lys Glu Glu 385 390 395
< 210 > 7 < 211 > 1131 < 212 > DNA < 213 > Arabidopsis sp.
< 400 > 7 atgagcagta cggcegggag gctcgtgact tcaaaatccg agcttgacct cgatcaccct 60 aacategaag attaccttcc ttctggttct tccatcaatg aacctcgcgg caagctcagc 120 ctgcgtgatt tgctagacat ctctccaacg ctcactgaag ctgctggtgc cattgttgat 190 gactcgttca caagatgttt caaatcaaat cctccagaac cttggaactg gaatatttac 240 ttattcccac tatactgctt tggggttgtt gttagatact gtatcctctt tcccttgagg 300 tgcttcactt tagcttttgg gtggattatt ttcctttcat tgtttatccc tgtaaatgcg 360 ttgctgaaag gtcaagatag gttgaggaaa aagatagaga gggtcttggt ggaaatgatti 420 tgcagctttt ttgtcgcctc atggaccgga gttgtcaaat atcacgggcc acgtcctagc 480 i atccgtccta agcaggtcta tgttgccaac catacttcaa tgattgattt catcgtattg 540 gagcagatga ccgcatttgc tgttataatg cagaagcatc ctggttgggt tggtcttctg 600 caaagcacaa tattagagag tgtgggatgt atctggttca atcgttcaga ggcaaaggat 660 cgtgaaattg tagcaaaaaa gttaagggac catgtccaag gagctgacag taatcctctt! 720 ctcatatttc ccgaagggac atgtgtaaat aataattaca cagtgatgtt taagaagggt: 780 gcttttgaat tggactgcac attgcaatta tgtttgtcca gatttttgtt aatacaacaa '840 gacgccttct ggaatageag aaaacaatca acttgctgca tttactatgc actcatgaca 900 tcatgggctg ttgtatgtga agtgtggtac ttggaaccac aaaccataag gcccggtgaa 960 aatttgcaga acaggaattg gagggtcaga gacatgatat ctcttcgggc gggtctcaaa 1020 aaggtccctt gggatggata cttgaagtat tcgagaccaa gccccaagca tagtgaacgc 1080 aagcaacaga gtttcgcaga gtcgatcctg gctagattgg aagagaagtg to 1131 < 210 > 8 < 211 > 376 < 212 > PRT < 213 > Arabidopsis sp. 
< 400 > 8 Met Ser Ser Thr Wing Gly Arg Leu Val Thr Ser Lys Ser Glu Leu Asp 1 5 10 15 Leu Asp His Pro Asn lie Glu Asp Tyr Lau Pro Ser Gly Ser Ser lie 20 25 30 Asn Glu Pro Arg Gly Lys Leu Ser Leu Arg Asp Leu Leu Asp He Ser 40 45 Pro Thr Leu Thr Glu Ala Ala Gly Ala lie Val Asp Asp Ser Phe Thr 50 55 60 Arg Cys Phe Lys Ser Asn Pro Pro Glu Pro Trp Asn Trp Asn lie Tyr 65 70 75 80 Leu Phe Pro Leu Tyr Cys Phe Gly Val Val Val Tyr Ar Cys He Leu 85 90 95 Phe Pro Leu Arg Cys Phe Thr Leu Wing Phe Gly Trp lie He Phe Leu 100 105 110 Ser Leu Phe He Pro Val Asn Ala Leu Leu Lys Gly Gln Asp Arg Leu 115 120 125 Arg Lys Lys lie Glu Arg Val Leu Val Glu Met lie Cys Ser Phe Phe 130 135 140 Val Wing Ser Trp Thr Gly Val Val Lys Tyr Hís Gly Pro Arg Pro Ser 145 150 155 160 lie Arg Pro Lys Gln Val Tyr Val Wing Asn His Thr Ser Met lie Asp 165 170 175 Phe He Val Leu Glu Gln Met Thr Wing Phe Wing Val He Met Gln Lys 180 185 190 His Pro Gly Trp Val Gly Leu Leu Gln Ser Thr He Leu Glu Ser Val 195 200 205 Gly Cys He Trp Phe Asn Arg Ser Glu Wing Lys Asp Arg Glu He Val 210 215 220 Wing Lys Lys Leu Arg Asp His Val Gln Gly Wing Asp Ser Asn Pro Leu 225 230 235 240 Leu He Phe Pro Glu Gly Thr Cys Val Asn Asn Asn Tyr Thr Val Met 245 250 255 Phe Lys Lys Gly Wing Phe Glu Leu Asp Cys Thr Val Cys Pro He Wing 260 265 270 He Lys Tyr Even Lys He Phe Val Asp Wing Phe Trp Asn Ser Arg Lys 275 280 285 Gln Ser Phe Thr Met His Leu Leu Gln Leu Met Thr Ser Trp Wing Val 290 295 300 Val Cys Glu Val Trp Tyr Leu Glu Pro Gln Thr He Arg Pro Gly Glu 305 310 315 320 Thr Gly He Glu Phe Wing Glu Arg Val Arg Asp Met He Ser Leu Arg 325 330 335 Wing Gly Leu Lys Lys Val Pro Trp Asp Gly Tyr Leu Lys Tyr Ser Arg 340 345 350 Pro Ser Pro Lys His Ser Glu Arg Lys Gln Gln Ser Phe Ala Glu Ser 355 360 365 He Leu Ala Arg Leu Glu Glu Lys 370 375 < 210 > 9 < 211 > 965 < 212 > DNA < 213 > Arabidopsis sp. 
< 400 > 9 gttgttaagt tacaggtctc ttcaaaaaca tctctcttca cacacacacg cagccaatca 60 tcgatcacag ctcgattttc ctttattgtt ccgttggttt tcttgagnat ttttctttct 120 tgggatcatc aaactngtcg gtaaggwaac ttcacggacg gatcttcaat gttgagctgt 190 tctaatggta ccgtcgtgat cgcaaccgcc atggtttgct caagcaccgc tctgtttctc 240 gtcaattcca gccatggctc caaaatccta tggaaatcat tcagactcta aggttcttga 300 cgacccattc tccgttcttg tctatcttca gaggaaacga agaaacaggg gaagaagata 360 ggttcgcgga aagaaagtgc taatgtgaaa gatacgaaag gtaacggggc agagtaccgg 420 aggagggaat tgaaccggaa aagcgtaccg aagccagtga ctaaaceggg aaagaccggt 480 tctatgtgta gaatctctac catgccagcg aaccggatgg ctctgtacaa tgggattctt 540 agagaccgag atcacagagt tcaatattct tattgacttt ttcttcttga ttagtcaata 600 gatttaggtt ttgtaaatct ttcttttgtt tttcggtaat attagatttt ttcttggaaa 660 tttcagatat tgtagacttt gtagttgggt ggtcttcttt ttctcccttt ttgtgtctca 720 tagtagtagg tggttttctt atgctccact tatctactta cttgttttaa atcaagtgat 780 gatgtaaata attgacatgt aagtagtcat tagaaatttg aaaaggcaaa tgadagaata 840 taaatttgta aaaacata gt gtgcctattg tacatataaa ctctcttttg ttggggatat 900 ctatggaatt tatattgatt gtgttgaaaa aacaaaaaaa aaaaaaaaaa aaaaaaaaaa 960 aaaaa 965 < 210 > 10 < 211 > 1593 < 212 > DNA < 213 > Arabidopsis sp. < 400 > 10 atgtccggta ataagatctc gactcttcaa gctcttgtct tcttcttgta ccggtttttc 60 attctccgtc gttggtgtca tcgtagccct adacaaaaat accaaaaatg cccttctcac 120 ggcctccacc aatatcaaga cctatcgaat cacactttga tattcaacgt cgaaggagct 180 ctactcaaat caaactcttt attcccttac ttcatggttg tggcattcga agccggaggg i 240 gtgataaggt cacttttcct cttagttctt tatccattta taagcttgat gagctacgaa (300 atgggcttga agacgatggt gatgctgagc ttctttggag ttaaaaagga aagettccga 1360 gtggggaaat cagttttgcc taagtatttt ctagaagatg ttgggctcga gatgttccag '420 gttttgaaaa gaggaggcat gagagttgct gtgagtgatt taccacaagt tatgattgat! 
480 gtattcttgc gagattactt ggagatagad gttgtggtcg gaagagacat gaaaatggtc 540 ggtggttact acctaggcat cgtggaggat aagaagaacc ttgaaattgc ttttgataaa¡ 600 gtggttcaag aagaaagact tggtagtggt cgtcgtctta ttggcatcac ttcctttaac 660 tcgccaagtc acagatctct cttctctcaa ttttgccagg aaatttactt cgtcagaaat 720 aaagttggca tcagacaaga aacectacca caagatcant accctaaacc attgattttc 780 i cacgatggtc gtttagccgt taagccaaca cctttaaaca cactcgtatt attcangtgg 840 gccccattcg ccgccgtctt agccgctgca agactcgtct tcggcctaaa cttacettac 900 tccctagcca atcccttcct cgccttttcc ggtatccacc ttactctcac cgtcaacaac 960 cacaacgacc taatatccgc cgacagaaala agaggttgtc tctttgtgtg taaccataga 1020 acgttattgg acccacttta catttcatac gctctaagaa agaaaaacat gaaagccgtg 1080 acgtatagtc taageagatt atctgagctt ctggctccga tcaagaccgt tagattgact¡1140 i cgtgatcgag tcaaagatgg tcaagccatg gagaaattgc tgagccaggg agatctcgtg I aagggactac gtgtagagag 1200 gtttgtccgg ccttacttgc ttcggtttag tccacttttc 1260 tctgaggttt gtgacgtcat cgtacctgtt gctattgact cacacgtgac tttcttctat 1320 ggcacgacgg ctagtggtct taaggcattt gatcccattt tcttcctttt gaatcctttc 1380 cct tcctaca ccgtcaaatt gcttgaccct gtctctggaa gtagctcgtc cacgtgtcgá 1440 ggagtccctg acaatggaaa agttaacttc gaggtggcta atcacgtgca gcatgagatc 1500 gggaatgcct tggggtttga gtgcaccaac ctcacgagaa gagataagta cttgatchg 1560 gccggtaata acggagttgt caagaaaaaa taa 1593 < 21O > 11 < 211 > 530 < 212 > PRT < 213 > Arabidopsis sp < 400 > 11 Met Ser Gly Asn Lys Be Ser Thr Leu Gln Ala Leu Val Phe Phe Leu 1 5 10 15 Tyr Arg Phe Phe He Leu Arg Arg Trp Cys His Arg Ser Pro Cys Gln 20 25 30 Lys Tyr Gln Lys Cys Pro Ser His Gly Leu His Gln Tyr Gln Asp Leu 35 40 45 Ser Asn His Thr Leu He Phe Asn Val Glu Gly Ala Leu Leu Lys Ser 50 55 60 Asn Ser Leu Phe Pro Tyr Phe Met Val Val Wing Phe Glu Wing Gly Gly 65 70 75 80 Val lie Arg Ser Leu Phe Leu Leu Val Leu Tyr Pro Phe He Ser Leu 85 90 95 Met Ser Tyr Glu Met Gly Leu Lys Thr Met Val Met Leu Ser Phe Phe 100 105 110 Gly Val Lys Lys Glu Ser Phe Arg Val Gly Lys Ser Val Leu Pro Lys 115 120 125 Tyr Phe Lau 
Glu Asp Val Gly Leu Glu Met Phe Gln Val Leu Lys Arg 130 135 140 Gly Gly Lys Arg Val Wing Val Ser Asp Leu Pro Gln Val Met lie Asp 145 150 155 160 Val Phe Leu Arg Asp Tyr Leu Glu lie Glu Val Val Val Gly Arg Asp 165 170 175 Met Lys Met Val Gly Gly Tyr Tyr Leu Gly He Val Glu Asp Lys Lys 180 185 190 Asn Leu Glu He Ala Phe Asp Lys Val Val Gin Glu Glu Arg Leu Gly 195 200 205 Ser Gly Arg Arg Leu He Gly He Thr Ser Phe Asn Ser Pro Ser His 210 215 220 Arg Ser Leu Phe Ser Gln Phe Cys Gln Glu He Tyr Phe Val Arg Asn 225 230 235 240 Be Asp Lys Lys Ser Trp Gln Thr Leu Pro Gln Asp Gln Tyr Pro Lys 245 250 255 Pro Leu He Phe His Asp Gly Arg Leu Wing Val Lys Pro Thr Pro Leu 260 265 270 Asn Thr Leu Val Leu Phe Met Trp Wing Pro Phe Wing Wing Val Leu Wing 275 280 285 Wing Wing Arg Leu Val Phe Gly Leu Asn Tyr Pro Tyr Ser Leu Wing Asn 290 295 300 Pro Phe Leu Wing Phe Ser Gly He His Leu Thr Leu Thr Val Asn Asn 305 310 315 320 His Asn Asp Leu He Be Wing Asp Arg Lys Arg Gly Cys Leu Phe Val 325 330 335 Cys Asn His Arg Thr Leu Leu Asp Pro Leu Tyr He Ser Tyr Ala Leu 340 345 350 Arg Lys Lys Asn Met Lys Wing Val Thr Tyr Ser Leu Ser Arg Leu Ser 355 360 365 Glu Leu Leu Wing Pro He Lys Thr Val Arg Leu Thr Arg Asp Arg Val 370 375 380 Lys Asp Gly Gln Wing Met Glu Lys Leu Leu Ser Gln Gly Asp Leu Val 385 390 395 400 Val Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe 405 410 415 Ser Pro Leu Phe Ser Glu Val Cys Asp Val Val Val Pro Wing He 420 425 430 Asp Ser His Val Thr Phe Phe Tyr Gly Thr Thr Wing Ser Gly Leu Lys 435 440 445 Wing Phe Asp Pro He Phe Phe Leu Leu Asn Pro Phe Pro Ser Tyr Thr 450 455 460 Val Lys Leu Leu Asp Pro Val Ser Gly Ser Ser Ser Thr Cys Arg 465 470 475 480 Gly Val Pro Asp Asn Gly Lys Val Asn Phe Glu Val Wing Asn His Val 485 490 495 Gln His Glu He Gly Asn Wing Leu Gly Phe Glu Cys Thr Asn Leu Thr 500 505 510 Arg Arg Asp Lys Tyr Leu He leu Wing Gly Asn Asn Gly Val Val Lys 515 520 525 Lys Lys 530 < 210 > 12 < 211 > 1509 < 212 > DNA < 213 > Arabidopsis sp. 
<400> 12
atggttatgg agcaagctgg aacgacatcg tattcggtcg tgtcagagtt tgaaggaaca 60
atactgaaga acgcagattc attctcttac ttcatgctcg tagccttcga agcagctggt 120
ctaattcgtt tcgctatctt gttgtttcta tggcccgtaa tcacactcct tgacgttttc 180
agctacaaaa acgcagctct caagctcaag atttttgtag ccactgttgg tctacgtgaa 240
ccggagatcg aatcagtggc tagagccgtt ctgccaaaat tctacatgga cgacgtaag? 300
atggacatcgt ggagggtttt cagctcgtgt aagaagaggg tcgtggtcac gagaatgcct 360
cgagttatgg tggagaggtt tgctaaggag catcttagag cagatgaggt catcggtacg 420
gaactgattg taaaccggtt cggttttgtc accggtttga ttcgcgaaac ggatgttgat 480
cagtctgctt tgaaccgtgt cgctaatttg tttgttggtc ggaggcctca actaggtctt 540
ggaaaaccgg ctttgaccgc ctctacaaat ttcttatcgt tatgtgaggc geatatteat 600
gcaccaatcc cggagaacta caaccacggt gaccaacaac ttcagctacg tccacttccg 660
gtgatatttc acgacggeog actagtgaag cggccaacgc cggccaccgc tctcatcatc 720
ctcctttgga tcccatttgg aatcattetc gccgtgatcc ggatctttct tggagccgtc 780
ctcccattgt gggccacacc ttacgtctct cagatatteg gtggccatat catcgtcaaa 840
ggaaagcctc ctcagccacc ggcggctgga aaatccggcg tgctctttgt gtgtactcac 900
agdaccctaa tggaccctgt ggtattatct tatgtcctcg gacgtagcat cccagccgtt 960
aettaetcaa tctcgcgctt atcagagatc ttacctccca ttccaaccgt ccgattgaca 1020
agaatccgag atgtggatgc ggctcagatc aaacaacaac tgtcaaaagg agatctagtg 1080
gtttgtcctg agggaaccac ttgtcgtgaa ccgtttttgt taagattcag cgcgcttttc 1140
gctgdgttaa cggataggat tgttccggtt gcgatgaact acagagtcgg attcttccac 1200
gcgactacag cgagaggctg gaagggtttg gacccaattt tcttcttcat gaacccaaga 1260
ccggtttacg agattacgtt cttgaaccag cttcctatgg aggcaacatg ttcgtccggg 1320
aagagcccgc atgacgtggc gaactatgtt cagagaatct tggcggctac gttagggttt 1380
gagtgcacca acttcacaag aaaagataag tatagggttc tcgctggaaa cgatggaaca 1440
gtgtcgtact tgtcgttgct agaccaattg aagaaggtgg ttagcacttt cgagccttgt 1500
ctccattga 1509

<210> 13
<211> 502
<212> PRT
<213> Arabidopsis sp.
< 400 > 13 Met Val Met Glu Gln Ala Gly Thr Thr Ser Tyr Ser Val Val Ser Glu 1 5 10 15 Phe Glu Gly Thr He Leu Lys Asn Wing Asp Ser Phe Ser Tyr Phe Met 25 30 Lau Val Ala Phe Glu Ala Ala Gly Leu He Arg Phe Ala He Leu Leu 35 40 45 Phe Leu Trp Pro Val He Thr Leu Leu Asp Val Phe Ser Tyr Lys Asn 50 55 60 Ala Ala Leu Lys Leu Lys He Phe Val Wing Thr Val Gly Leu Arg Glu 65 70 75 80 Pro Glu He Glu Ser Val Ala Arg Ala Val Leu Pro Lys Phe Tyr Met 85 90 95 Asp Asp Val Ser Met Asp Thr Trp Arg Val Phe Ser Ser Cys Lyx Lys 100 105 110 Arg Val Val Val Thr Arg Met Pro Arg Val Met Val Glu Arg Phe Wing 115 120 125 Lys Glu My Leu Arg Wing Asp Glu Val He Gly Thr Glu Leu He Val 130 135 140 Asn Arg Phe Gly Phe Val Thr Gly Leu He Arg Glu Thr Asp Val Asp 145 150 155 160 Gln Ser Ala Leu Aun Arg Val Ala Asn Leu Phe Val Gly Arg Arg Pro 165 170 175 Gln Lau Gly Leu Gly Lys Pro Wing Leu Thr Wing Being Thr Asn Phe Lau 180 185 190 Ser Leu Cys Glu Glu His He His Pro Wing Pro Glu Asn Tyr Even 195 200 205 His Gly Asp Gln Gln Leu Gln Leu Arg Pro Leu Pro Val lla Phe His 210 215 220 Asp Gly Arg Leu Val Lys Arg Pro Thr Pro Ala Thr Ala Leu Lie He 225 230 235 240 Leu Leu Trp He Pro Phe Gly lla He Leu Ala Val He Arg He Phe 245 250 255 Leu Gly Ala Val Leu Pro Leu Trp Wing Thr Pro Tyr Val Ser Gln He 260 265 270 Phe Gly Gly His He He Val Lys Gly Lys Pro Pro Gln Pro Pro Wing 275 230 295 Wing Gly Lys Ser Gly Val Leu Phe Val Cys Thr His Arg Thr Leu Met 290 295 300 Asp Pro Val Val Leu Ser Tyr Val Leu Gly Arg Ser He Pro Wing Val 305 310 315 320 Thr Tyr Ser He Ser Arg Leu Ser Glu He Leu Ser Pro Pro Thr 325 330 335 Val Arg Leu Thr Arg He Arg Asp Val Asp Ala Ala Lys He Lys Gln 340 345 350 Gln Leu Ser Lys Gly Asp Leu Val Val Cys Pro Glu Gly Thr Thr Cys 355 360 365 Arg Glu Pro Phe Leu Leu Arg Phe Be Ala Leu Phe Ala Glu Leu Thr 370 375 380 Asp Arg He Val Pro Val Ala Met Asn Tyr Arg Val Gly Phe Phe His 385 390 395 400 Wing Thr Thr Wing Arg Gly Trp Lys Gly Leu Asp Pro He Phe Phe Phe 405 410 415 Met Asn Pro Arg Pro Val Tyr Glu 
He Thr Phe Leu Asn Gln Leu Pro 420 425 430 Met Glu Wing Thr Cys Ser Ser Gly Lys Ser Pro His Asp Val Ala Asn 435 440 445 Tyr Val Gln Arg He Leu Wing Wing Thr Leu Gly Phe Glu Cys Thr Asn 450 455 460 Phe Thr Arg Lys Aip Lys Tyr Arg Val Leu Wing Gly Asn Asp Gly Thr 465 470 475 480 Val Ser Tyr Leu Ser Leu Leu Asp Gln Leu Lys Lys Val Val Ser Thr 485 490 495 Phe Glu Pro Cys Leu His 500 < 210 > 14 < 211 > 1563 < 212 > DNA i < 213 > Arabidopsis sp. < 400 > 14 ti atgtccgcca agatttcaat attccaagct cttgtctttc tattctaccg gtttatcctc 60 cggcgatatc ggaactctaa accaaaatac caaaatggcc cttcttctct cctccaatcc 120 I gacctatcac gccacacatt gatcttcaac gtagaaggag ctcttctcaa atccgactct 180 ctcttccctt acttcatgtt agtagcattt gaggcgggag gcgtaataag gtcatttctc 240 ctcttcattc tctatccatt gataagcttg atgagccatg agatgggtgt caaagtgatg 300 gtaatggtga gcttcttcgg gatcaaaaaa gaaggttttc gagcggggag agcggttttg 360 cctaaatact ttctagaaga tgtcggactc gagatcttcg aagtgttgaa gagaggaggg 420 aagaaaatcg gagtgagtga tgatcttcct caagttatga tcgaagggtt cttgagagat 480 tacttggaga ttgacgttgt ggtcgggaga gaaatgaaag tcgttggagg ttattatcta 540 ggtatcatgg aggataaaac caaacatgat cttgtetttg atgagttagt tcgtaaagag 600 agactadaca ccggtcgtgt tattggcatc acttccttca atacatctct tcaccgatat 660 agttttgcca ctattctctc ttcgtgaaga ggaaatttat aatcagacaa gcgaagctgg 720 cacgaagcca caaaccctac gtaccctaaa ccattgattt tccatgatgg ccgtctcgcg 780 ccctaatgaa atcaaaccaa cactttggtc ttgttcatgt ggggtccttt cgcagccgca 84¡0 gccgcag cag ccagactctt cgtctctctt tgcatccctt actctttatc aatcccgatc 900 ctcgcctttt ccggttgcag actaaccgtc actadcgdct acgttteate tcaaaaacaa 960 aaaccaagtc aacgcaaagg ttgtctcttt gtatgtaacc ataggacttt attggaccct 1020 I ctctatgttg cattcgcttt gagaaagaaa aacatcaaaa ctgtaacgta tagtttgagt 1080 agggtatctg agattttggc tccgatcaag acggtgagac tgacccgtga tcgggtgagc 1140 gacggtcaag ccatggagaa attgttaacc gaaggagcttc tcgttgtttg tcctgaagga 1200 i accacttgta gagaacctta cctgcttagg tttagccctt tgttcaccga ggttagtgat 1260 gtcatcgttc ccgtggctgt gacggtacac gtgaccttct tctacggtac 
aacggcgagt 1320 ggtcttaagg cacttgaccc gcttttcttc ctcttggatc cttatcctac ctacaccatc 1380 I caatttctcg accctgtctc cggtgccacg tgccaagatc ctgatggaaa gttgaagttt ' 1440 gaggtggcca acaatgttcn gagtgatatt gggaaggcgc tggatttcga gtgcacaagt 1500 ctcactagaa aagacaagta tttgatcttg gccggtaata atggagtagt taagaaaaat 1560 taa 1563 < 210 > 15 < 211 > 520 < 212 > PRT I < 213 > Arabidopsis sp. < 400 > 15 Met Be Ala Lys Be Ser He Phe Gln Ala Leu Val Phe Leu Phe Tyr 1 5 10 15 Arg Phe He Leu Arg Arg Tyr Arg Asn Ser Lys Pro Lys Tyr Gln Asn 20 25 30 Gly Pro Ser Ser Leu Lau Gln Ser Asp Leu Ser Arg His Thr Leu He 35 40 45 Phe Asn Val Glu Gly Ala Leu Leu Lys Ser Asp Ser Leu Phe Pro Tyr 50 55 60 Phe Met Leu Val Wing Phe Glu Wing Gly Gly Val lla Arg Ser Phe Leu 65 70 75 80 Leu Phe He Leu Tyr Pro Leu He Ser Leu Met Ser His Glu Met Gly 85 90 95 Val Val Val Met Val Val Met Val Met Val Ser Phe Phe Gly He Lys Lys Glu Gly 100 105 110 Phe Arg Ala Gly Krg Ala Val Leu Pro Lys Tyr Phe Leu Glu Asp Val 115 120 125 Gly Leu Glu He Phe Glu Val Leu Lys Arg Gly Gly Lys Lys He Gly 130 135 140 Val Ser Asp Asp Leu Pro Gln Val Met He Glu Gly Phe Leu Arg Asp 145 150 155 160 Tyr Leu Glu He Asp Val Val Val Gly Arg Glu Met Lys Val Val Gly 165 170 175 i Gly Tyr Tyr Leu Gly He Met Glu Asp Lys Thr Lys His Asp Leu Val 180 185 190 Phe Asp Glu Leu Val Arg Lys Glu Arg Leu Asn Thr Gly Arg Val ¡He i 195 200 205 Gly He Thr Ser Phe Asn Thr Ser Leu His Arg Tyr Leu Phe Ser Gln 210 215 220 Phe Cys Gln Glu He Tyr Phe Val Lys Lys Ser Asp Lys Arg Ser Trp 225 230 235 240 Gln Thr Leu Pro Arg Ser Gln Tyr Pro Lys Pro Leu He Phe His Asp 245 150 255 Gly Arg Leu Wing He Lys Pro Thr Leu Met Asn Thr Leu Val Phe 260 265 270 Met Trp Gly Pro Phe Wing Wing Wing Wing Wing Wing Wing Arg Leu Phe Val 275 280 285 Ser Leu Cys He Pro Tyr Ser Leu Ser He Pro He Leu Ala Phe Ser 290 295 300 Gly Cys Arg Leu Thr Val Thr Asn Asp Tyr Val Ser Ser Gln Lys Gln 305 310 315 320! Lys Pro Ser Gln Arg Lys Gly Cys Leu Phe Val Cys Asn His Arg¡ Thr 325 330 335! 
Leu Leu Asp Pro Leu Tyr Val Ala Phe Ala Leu Arg Lys Lys Asn He 340 345 350 I Lys Thr Val Thr Tyr Ser Leu Ser Arg Val Ser Glu He Leu Ala Pro 355 360 365 He Lys Thr Val Arg Leu Thr Arg Asp Arg Val Ser Asp Gly Gln Wing 370 375 380 Met Glu Lys Leu Leu Thr Glu GIV Asp Leu Val Val Cys Pro Glu Gly 385 390 395 400 Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe Ser Pro Leu Phe Thr 405 410 415 Glu Val Ser Asp Val lie Val Pro Val Wing Val Thr Val His Val Thr 420 425 430 Phe Phe Tyr Gly Thr Thr Wing Ser Gly Leu Lys Wing Leu Asp Pro Leu 435 440 445 Phe Phe Leu Leu Aap Pro Tyr Pro Thr Tyr Thr He Gln Phe Leu Asp 450 455 460 Pro Val Ser Gly Wing Thr Cys Gln Arp Pro Asp Gly Lys Leu Lys Phe 465 470 475 480 Glu Val Wing Asn Asn Val Gln Ser Asp He Gly Lys Wing Leu Asp Phe 485 490 495 'Glu Cys Thr Ser Leu Thr Arg Lys Asp Lys Tyr Leu He Leu Wing Gly 500 505 510 Asn Asn Gly Val Val Lys Lys Asn 515 520 < 210 > 16 I < 211 > 1506 'I < 212 > DNA < 213 > Arabidopsis sp. < 400 > 16 atgggagctc aggagaaacg gcgccgtttc gagcagatat caaagtgcga tgttaaggap 60 cggtccaacc ataccgtggc cgctgatcta gacggaacac tactaatctc tcgtagcgcc 120 ttcccttact atttcctcgt agccctcgag gcagggagct tgctccgagc gttgatccta 1801 cttgtgtccg taccattcgt ttatettacg tacttgacca tctccgagac tttagccatc 240 aacgtatttg tcttcatcac gttcgcgggt ctcaagatcc gagacgttga gctagtggtc 300 cgttccgtcc tcccgaggtt ctatgcggag gacgtgaggc ccgatacctg gcgtatcttc 360 aacacgttcg ggaaacggta cataataact gcgagccctc gaattatggt cgagccattc 420 gtgaaaacat tcctoggagt tgataaagtt cttggaacag agctagaggt ctccaaatcg 480 ggtcgggcaa ccgggttcac cagaaaacca ggtattcteg tcggtcagta caaacgtgac 540 gtcgttttga gagagtttgg tggcctagcg tctgatttac ctgatttggg gctcggcgat 600, Agcaagacgg accacgactt catgtccatc tgcaaggaag gttacatggt gccacgtacg 660 aaatgcgaac cattaccaag aaacaaactc ttaagcccca taatattecc cgagggcaga 720 ttagtccaac gcccaacgcc gttagttgct ctgttaactt tcctctggct tcccgtcggt 780 ttcgtcctct ctatcatccg cgtctacacg aatattccgt taccggaacg tatcgcccgt 8 ^ 0 tacaactaca agcttactgg catcaagcta gtcgtcaacg 
gccaccctcc tccgccgcca1 900 aaacctggcc agccaggcca tcttttggtc tgcaaccacc gcaccgttct cgatcctgtg¡ 960 gtcacagctg tcgcactcgg ccggaaaatc agctgcgtca cttacagcat cagcaagttc • jwitfc- 1020 \ tctgagctaa tctcaccaat caaagccgtt gcgttgactc ganagacgca gtcaacgtga 1080 gcgaacatco agcgtctttt ggaggaaggc gatctcgtga tatgtcccga gggaaccacg 1140 'tgccgtgagc ctttccttct ccggtttagt gctcttttcg ctgagctcac ggaccggatc 1200 gttcccgtgg cgatcaacac aaagcagagc atgttcaatg gtaccaccac acgtggatac 1260 aagcgttcttg atccttactt tgcgttcatg aacccgaggc cgacgtatga gatcacgttc 1320 ttccagctga ctcaaacaga gctgacgtgt aaaggaggcc aatctccgat agaggttgcg 1380 aattacatac agagggtttt gggaggaacc ttaggttttg agtgcaccaa tttcacaaga 1440 aaggataagt acgccatgct tgctggtact gacggtaggg ttccggtgad gaaggagaag 1500 acgtga 1506 < 210 > 17 < 211 > 500 < 212 > PRT < 213 > Arabidopsis sp. < 400 > 17 Met Gly Ala Gln Glu Lys Arg Arg Arg Phe Glu Gln He Ser Lys Cys 1 5 10 15 Asp Val Lys Asp Arg Ser Asn His Thr Val Wing Wing Asp Leu Asp¡ Gly 20 25 30 Thr Leu Leu He Ser Arg Be Wing Phe Pro Tyr Tyr Phe Leu Val Wing 35 40 45 Leu Glu Wing Gly Be Leu Leu Arg Wing Leu Leu Leu Val Ser Val 50 55 60 Pro Phe Val Tyr Leu Thr Tyr Leu Thr He Ser Glu Thr Leu Allah He 65 70 75 80 Asn Val Phe Val Phe lla Thr Phe Wing Gly Leu Lys He Arg Asp Val 85 90 95, Glu Leu Val Val Arg Ser Val Leu Pro Arg Phe Tyr Ala Glu Asp Val 100 105 110 Arg Pro Asp Thr Trp Arg He Phe Asn Thr Phe Gly Lys Axg Tyr He 115 120 125 He Thr Ala Ser Pro Arg He Met Val Glu Pro Phe Val Lys Thr | Phe 130 135 140 t Leu Gly Val Asp Lys Val Leu Gly Thr Glu Leu Glu Val Ser Lys Ser 145 150 155! 
160 Gly Arg Ala Thr Gly Phe Thr Arg Lys Pro Gly lla Leu Val Gly Gln 165 170 175 Tyr Lys Arg Asp Val Val Lau Arg Glu Phe Gly Gly Leu Wing As As 180 185 190 Leu Pro Asp Leu Gly Leu Gly Asp Ser Lys Thr Asp His Asp Phe Met 195 200 205 Ser He Cys Lys Glu Gly Tyr Met Val Pro Arg Thr Lys Cys Glu Prp 210 215 220 i Leu Pro Arg Still Lys Leu Leu Ser Pro He He Phe My Glu Gly Arg 225 230 235 240 Leu Val Gln Arg Pro Thr Pro Leu Val Ala Leu Leu Thr Phe Leu Trp 245 250 255 l Leu Pro Val Gly Phe Val Leu Ser He He Arg Val Tyr Thr Asn He 260 265 270 Pro Leu Pro Glu Arg He Wing Arg Tyr Asn Tyr Lys Leu Thr Gly He 275 280 285 Lys Leu Val Val Asn Gly His Pro Pro Pro Pro Pro Lys Pro Gly Gln ' 290 295 300 Pro Gly My Leu Leu Val Cys Asn My Arg Thr Val Leu Asp Pro Val '305 310 315 320 Val Thr Ala Val Ala Leu Gly Arg Lys He Ser Cys Val Thr Tyr Ser 325 330 335 lla Ser Lys Phe Ser Glu Leu lie Ser Pro He Lys Ala Val Ala Leu 340 345 350! Thr Arg Gln Arg Glu Lys Asp Ala Ala Asn He Lys Arg Leu Leu Glu 355 360 365 Glu Gly Asp Leu Val He Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro 370 375 380 Phe Leu Leu Arg Phe Be Ala Leu Phe Ala Glu Leu Thr Asp Arg He 385 390 395 400 Val Pro Val Ala He Asn Thr Lys Gln Ser Met Phe Aun Gly Thr Thr 405 410 415 Thr Arg Gly Tyr Lys Leu Leu Asp Pro Tyr Phe Wing Phe Met Asn Pro 420 425 430 Arg Pro Thr Tyr Glu lla Thr Phe Leu Lys r.ln He Pro Wing Glu Leu 435 440 445 Thr Cys Lys Gly GIV Lys Ser Pro He Glu Val Ala Asn Tyr He Gln 450 455 460 Arg Val Leu Gly Gly Thr Leu Gly Phe Glu Cys Thr Asn Phe Thr Arg 465 470 475 480 Lys Asp Lys Tyr Wing Met Leu Wing Gly Thr Asp Gly Arg Val Pro Val 495 490 495 Lys Lys Glu Lys 500 < 210 > 18 < 211 > 1620 < 212 > DNA < 213 > Arabidopsis sp. 
<400> 18
atggcggatc ctgatctgtc ttctcctttg atccaccatc aatcctccga tcaacctgaa 60
gttgttatct ctatcgccga cgacgacgac gacgagtcag gactcaatct tcttccagcc 120
gttgttgacc ctcgtgtttc acgaggtttt gagtttgacc atcttaatcc ttatggcttt 180
ctcagcgagt cagagcctcc ggttctcggt ccgacgacgg tggatccatt ccggaacaat 240
acacctggag ttagcggatt gtaccaagcg attaagctcg tgatttgtct tccgattgct 300
ctgattagac ttgttctctt tgctgctagc ttagctgttg gttacttggc tacaaaattg 360
gcacttgctg gctggaaaga taaagagaac cctatgcctc tttggagatg cagaatcatg 420
tggattactc ggatctgtac cagatgtatc ctcttctctt ttggctatca gtggataaga 480
aggaaaggga aacctgctcg gagagagatt gctccgattg ttgtatcaaa tcatgtttct 540
caatcttcta tatattgaac ttatcaccga cttctatgaa atcggagtcal ccattgttgc 600
catgattcac ttccatttgt tggaactatt atcagggcaa tgcaggtgat atatgtgaat 660
agattctcac agacatcaag gaagaatgct gtgcatgaaa taaagagaaa agcttcctgc 720
gatagatttc ctcgtctgct gttattcccc gaaggaacca cgactaatgg gaaagttctt 780
atttccttcc aactcggtgc tttcatccct ggttacccta ttcaacctgt agtagtccgg 840
tatccccatg tacattttga tcaatcctgg ggaaatatct ctttgttgac gctcatgttt 900
agaatgttca ctcagtttca caatttcatg gaggttgaat atcttcctgt aatctatccc 960
agtgaaaagc aaaagcagaa tgctgtgcgt ctctcacaga agactagtca tgcaattgca 1020
acatctttga atgtcgtcca aacatcccct tcttttgcgg acttgatgct cctcaacaaa 1080
gcaactgagt gaacccctca taaagctgga aattacatgg ttgaaatggc aagagttgag 1140
tcgctattcc atgtaagcag cttagaggca acgcgatttt tggatacctt tgtttccatg 1200
attccggact cgagtggacg tgttaggcta catgactttc ttcggggtct taaactgaaa 1260
ccttgccctc tttctaaaag gatatttgag ttcatcgatg tggagaaggt cggatcaatc 1320
actttcaaac agttcttgtt tgcctcgggc cacgtgttga cacagccgct ttttaagcaa 1380
acatgcgagc tagccttttc ccattgcgat gcagatggag atggctatat tacaattcaa 1440
gaactcggag aagctctcaa aaacacaatc ccaaacttga acaaggacga gattcgagga 1500
atgtaccatt tgctagacga cgaccaagat cadagaatca gccaaaatga cttgttgtcc 1560
tgcttaagaa gaaaccctct tetcatagcc atctttgcac ctgdcttggc cccaacataa 1620

<210> 19
<211> 539
<212> PRT
<213> Arabidopsis sp.
< 400 > 19 Met Ala Aap Pro Asp Leu Be Ser Pro Leu He His His Gln Be Ser 1 5 10 15 Asp Gln Pro Glu Val Val He Ser He Wing Aap Ap Asp Asp Asp Glu 20 25 30 Be Gly Leu Aan Leu Leu Pro Wing Val Val Asp Pro Arg Val Ser Arg 40 45 Gly Phe Glu Phe Asp His Leu Asn Pro Tyr Gly Phe Lau Ser Glu Ser 50 55 60 Glu Pro Pro Val Leu Gly Pro Thr Thr Val Asp Pro Phe Arg Asn Asn 65 70 75 80 Thr Pro Gly Val Ser GIV Leu Tyr Glu Wing He Lys Leu Val He Cys 85 90 95 Leu Pro He Ala leu He Arg Leu Val Leu Phe Ala Ala Ser Leu Ala 100 105 110 Val Gly Tyr Leu Ala Thr Lys Leu Ala Leu Ala Gly Trp Lys Asp Lys 115 120 125 Glu Asn Pro Met Pro Leu Trp Arg Cys Arg He Met Trp He Thr Arg 130 135 140 He Cys Thr Arg Cya He Leu Phe Ser Phe Gly Tyr Gln Trp He Arg 145 150 155 160 Arg Lys Gly Lys Pro Wing Arg Arg Glu He Wing Pro He Val Val Ser 165 170 175 Asn His Val Ser Tyr He Glu Pro He Phe Tyr Phe Tyr Glu Leu Ser 180 185 190 Pro Thr He Val Wing Ser Glu Be My Asp Ser Leu Pro Phe Val Gly 195 200 205 Thr He He Arg Wing Met Gln Val He Tyr Val Asn Arg Phe Ser Gln 210 215 220 Thr Ser Arg Lys Asn Wing Val His Glu He Lys Arg Lys Wing Ser Cys 225 230 235 240 Asp Arg Phe Pro Arg Leu Leu Leu Phe Pro Glu Gly Thr Thr Thr Asn 245 250 255 Gly Lys Val Leu He Ser Phe Gln Leu Gly Wing Phe He Pro Gly Tyr 260 265 270 Pro He Gln Pro Val Val Val Arg Tyr Pro His Val His Phe Asp Gln 275 230 225 Ser Trp Gly Asn He Ser Leu Leu Thr Leu Met Phe Arg Met Phe Thr 290 295 300 Gln Phe My Asn Phe Met Glu Val Glu Tyr Leu Pro Val He Tyr Pro 305 310 315 320 Ser Glu Lys Gln Lys Gln Asn Wing Val Arg Leu Ser Gln Lys Thr Ser 325 330 335 His Wing He Wing Thr Ser Leu Asn Val Val Gln Thr Ser His Ser Phe 340 345 350 Wing Asp Leu Met Leu Leu Asn Lys Wing Thr Glu Leu Lys Leu Olu A-in 355 360 365; Pro Ser Asn Tyr Met Val Glu Met Wing Arg Val Glu Ser Leu Phe His 370 375 380 Val Ser Ser Leu Glu Wing Thr Arg Phe Leu Asp Thr Phe Val Ser Met 385 390 395 400 He Pro Asp Being Ser Gly Arg Val Arg Leu My Asp Phe Leu Arg Gly 405 410 415 Leu Lys Leu Lys Pro Cys Pro 
Leu Ser Lys Arg He Phe Glu Phe lio 420 425 430 Asp Val Glu Lys Val Gly Ser lie Thr Phe Lys Gln Phe Leu Phe Wing 435 440 445 Ser Gly His Val Leu Thr Gln Pro Leu Phe Lys Gln Thr Cys Glu Leu 450 455 460 Wing Phe Ser His Cys Asp Wing Asp Gly Asp Gly Tyr He Thr He Gln 465 470 475 480 Glu Leu Gly Glu Wing Leu Lys Asn Thr He Pro Asn Leu Asn Lys Asp 485 490 495 Glu He Arg Gly Met Tyr His Leu Leu Asp Asp Asp Gln Asp Gln Arg 500 505 510 lla Ser Gln Asn Asp Leu Leu Ser Cys Leu Arg Arg Asan Pro Leu Leu! i 515 520 525 He Wing He Phe Wing Pro Asp Leu Wing Pro Thr 530 535 < 210 > 20 < 211 > 1128 < 212 > DNA < 213 > Arabidopsis sp. < 400 > twenty atggaaaaaa agagtgtacc aaattctgat aagttgtctc tgatt &gagt gttaagaggt 60 ataatatgte tgatggtgtt agtttcaaca gcttttat < ja tgttgatatt ctgggggttc 120, ttateagctg tagtgttgag gctttteage attcgctata gccgtaaatg tgtttccttc 180 i ttetttgget cgtggetege ettgt < jgcct ttcctctttg agaagattaa caaaaccaaa 24¡0 gttatcttet ctggtgataa ggtteettgc gaggatcgag tettgctcat tgcaelaccac 30ß egadeaga¿Lg ttgattgget gtacttetgg gatettgcac tgcgtaaagg ceagattggg! 
360 aatatcaaat atgtgctta¿t gagtagtttg atgaaattac ctctctttgg ttgggcgttt 420 'cacctctteg agtttattee tgttgagagg agatgggaag tcgatgaagc aaacttgaga 480 cagatagttt egagttttaa ggatccccga gacgctttat ggettgctct ttteccegag 540 ggcacagatt acacagaggc taaatgccaa aggagtaaga aatttgctgc tgaaaatggc 600 cttccgatac tgaacaacgt gctgcttecc aggacaaaag gtttcgtcte ctgcttgcaa 660 gaactgagtt getcacttg * cgcagtttat gatgtgacca tcggttataa aacccgctgc 720 ccatctttct tagacaacgt ttatggaatt gagceatcag aagtteacat ecacatccgt 780 cgtatcaacc tgacccaaat cccaaatca "gaaaaggaca tcaatgcttg gttaargaajc 840 l acettccage tcaaagacca gctgctcaat gacttttact ccaatggtca tttecetaac §00 i gaaggaacag agaaagagtt caacacaaag aagtacetea taaactgttt ggcagtga¡tt 960 gcctteacea ecatctgtac acatctcacc ttettetcat caatgatttg cjttcaggatt ' 1020 tatgcctctt tggcctgtgt ctaettgacc tctgctacgc atttcaatct tcgttctgtt 1080 ccacttgttg agactgcaaa adattccctc adattagtaa acaaataa 1128 < 210 > 21 < 211 > 375 < 212 > PRT < 213 > Arabidopsis sp. < 400 > twenty-one Met Glu Lys Lys Ser Val Pro Asn Ser Asp Lys Leu Ser Leu He Arg 1 5 10 15 Val Leu Arg Gly He He Cys Leu Met Val Zeu Val Ser Thr Wing Phe 20 25 30 Met Met Leu He Phe Trp Gly Phe Leu Ser Ala Val Val Leu > -rg Leu 35 40 45 Phe Ser lla Arg Tyr Ser Arg Lys Cys Val Ser Phe Phe Phe Gly Ser 50 55 60 Trp Leu Ala Leu Trp Pro Phe Leu Phe Glu Lys He Asn Ly.% Thr Lys 65 70 75 80 Val He Phe Ser Gly Up Lys Val Pro Cys Glu Aap Arg Val Leu Lau 85 90 95 He Wing Agn His Arg Thr Glu Val Asp Tr-p Met Tyr Phe Trp Asp Leu 100 105 110 Ala Leu Arg Lys GIV Gln lla Gly Asn lla Lys Tyr Val Leu Lys Ser 115 120 125 Ser Leu Met Lys Leu Pro Leu Phe Gly Trp Wing Phe His Leu Phe Glu 130 135 140 Phe He Pro Val Glu Arg Arg Trp Glu Val Asp Glu Ala Aen Lau Axg 145 150 155 160 Gln He Val Being Ser Phe LYS Asp Pro Arg Asp Wing Leu Trp Leu Wing 165 170 175 Lau Phe Pro Glu Gly Thr Asp Tyr Thr Glu Wing Lys Cys Gln Arg Ser 180 185 190 Lys Lys Phe Ala Ala Glu Aun Gly Leu Pro He Leu Aun Aun Val Leu 195 200 205 Leu Pro Arg Thr Lyg Gly 
Phe Val Ser Cys Leu Gln Glu Leu Ser Cys 210 215 220 Ser Leu Asp Ala Val Tyr Asp Val Thr He Gly Tyr Lys Thr Axg Cys 225 230 235 240 Pro Ser Phe Leu Asp Even Val Tyr Gly He Glu Pro Ser Glu Val His 245 250 255 He He He Arg Arg He Still Leu Thr Gln He Pro Even Gln Glu Lys 260 265 270 Asp He Aun Wing Trp Leu Met Even Thr Phe Gln Leu Lys Asp Gln Leu 275 230 285 Leu Aun Asp Phe Tyr Ser Aun Gly His Phe Pro Even Glu Gly Thr Glu 290 295 300 Lys Glu Phe Even Thr Lys Lys Tyr Leu He Still Cys Leu Ala Val He 305 310 315 320 Wing Phe Thr Thr lio ay & Thr His Lau Thr Phe Phe Ser Ser Met lio 325 330 335 Trp Phe Arg He Tyr Val Ser Leu Wing Cys Val Tyr Leu Thz Ser Wing 340 345 350 Thr His Phe Even Leu Arg Ser Val Pro Leu Val Glu Thr Ala Lys Aun 355 360 365 Ser Leu Lys Leu Val Asn Lys 370 375 < 210 > 22 < 211 > 1170 < 212 > DNA < 213 > Arabidopsis sp. < 400 > 22 atggtgattg ctgcagctgt catcgtgcct ttgggcctte tettcttcat atctggtctc 60 gctgtc & atc tettteagge agtttgctat gtacteattc gaceactgtc taagmacaca 120 tacagaaaaa ttaacegggt ggttgcagaa acettgtggt tggagcttgt atggatagtt 80 gactggtggg ctggagttaa gatccaagtg tttgctgata atgagacett caategaatg 240 i ggcaaagaac atgctcttgt cgtttgtaat cacegaagtg atattgattg gettgtggga 3Q0 tggattctgg ctcagcggte aggttgcctg g < jaagcgcat tagctgtaat gaagaagtet 360 tccaaattcc ttccagteat aggctggtea atgtggttet cggagtatet etttctggaa, 420 eceaggatga agaaattggg dagcactcta aagtcaggtc ttcagcgctt gagcgactte: 480 cctcgacctt tctggttagc cetttttgtg gagggaactc gctttacaga agccaaactt 540 aaagccgcec aagagtatgc agcctcctct gaattgccta tccctcgaaa tgtgttgatt 600 cctcgcacca aaggtttcgt gteagctgtt agtaatatgc gttcatttgt cccagcaatt 660 atgatatga cagtgactat tccaaaaacc tctccaccec ccacgatgct aagactattc 7J20 aaaggacaac ettcagtggt gcatgtteac atcaagtgtc actegatgaa agacttacet 700 gaateagatg acgcaattgc acagtggtgc a < lagatcagt ttgtggctaa ggatgctctg j 840 ttagaca "c acatagctgc agacacttte eceggtcaac aagaacag & & cattggccgt 900 cceataaagt ecettgcggt ggttetatca tgggcatgcg tactaactet tggagcaata 960 i 
aagttectac actgggcaca actettttct teatggaaag gtatcacgat ateggcgctt 1020 ggtctaggt tcatcactct ctgtatgcag atectgatac gctcgtctca gteagagcgt 1080 tegaccceag ccaaagtcgt cceagccaag ccaaaagaca ateaceacce agadtcatcc 1140 tcccaaacag? adacggagaa ggagaagtaa 1170 <210> 23 <211> 389 <212> PRT <213> Arabidopsis sp. < 400 > 23 Met Val He Ala Ala Ala Val He Val Pro Leu Gly Leu Leu Phe Phe 1 5 10 15 lio Ser Gly Leu Ala Val Asn @, your Phe Gln Ala Val Cys Tyr Val Leu 25 30 He Axg Pro Leu Ser Lys Asn Thr Tyr Arg Lys He Asn Arg Val Val 40 45 Wing Glu Thr Leu Trp Leu Clu Leu Val Trp He Val Asp Trp Trp Wing 50 55 60 Gly Val Lys lla Gln Val Phe Wing Asp Asn Glu Thr Phe Asn Arg met 65 70 75 85 GIV Lys Glu Bis Ala Leu Val Val Cys Asn Bis Arg Ser Asp lla Asp 85 90 95 Trp Leu Val Gly Trp lla Leu Ala Gln Arg Ser GIV Cys Leu Gly Ser 100 105 110 Ala Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val He Gly 115 120 125 Trp Ser Met Trp? He Ser Glu Tyr Leu Phe Leu Glu Arg Asn Trp Wing 130 135 140 Lys Asp Glu Ser Thr Leu Lys Ser Gly Leu Gln > -rg Leu Ser Asp Phe 145 150 155 160 Pro Axg Pro Pho Trp Lau Wing Leu Phe Val Glu Gly Thr Arg Phe Thr 165 170 175 Glu Ala Lys Leu Lys Ala Ala Gln Glu Tyr Ala Ala Ser Ser Glu Leu 190 185 190 Pro He Pro Arg Asn Val Leu He Pro Arg Thr Lys Gly Phe Val Ser 195 200 205 Wing Val Ser Aen Net Arg Ser Phe Val Pro Wing Tyr Aup Met Thr 210 215 220 Val Thr lio Pro Lys Thr Ser Pro Pro Pro Thz Met Leu Arg Leu Phe 225 230 235 240 Lys Gly Gln Pro Ser Val Val Ni * Val Bis lla Lys Cys Bis Ser Met 245 250 255 Lys Asp Leu Pro Glu Ser Aap Asp Ala He Wing Gln Trp Cys Arg A-sp 260 265 270 Gln Phe Val Wing Lys Asp Wing Leu Leu Asp Lys Bis Wing Wing Asp 275 280 285 Thr Phe Pro Gly Gln Gln Gln Gln Asn He Gly Arg Pro He Lys Ser 290 295 300 Leu Wing Val Val Leu Ser Trp Wing Cys Val Leu Thr Leu Gly Wing He 305 310 315 320 Lys Phe Leu My Trp Wing Gln Leu Phe Ser Ser Trp Lys Gly lla Thr 325 330 335 I Have To Be Wing Lau Gly Leu Gly He He Thr Leu Cys Met Gln He Leu 340 345 350 He Arg Be Ser 
Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Val Pro
355 360 365
Ala Lys Pro Lys Asp Asn His His Pro Glu Ser Ser Ser Gln Thr Glu
370 375 380
Thr Glu Lys Glu Lys
385

<210> 24
<211> 269
<212> DNA
<213> Glycine max

<400> 24
gacccactga acgctctcat caccttcacg tggctcccct tcggcttcat cctctccatc 60
ataagggtct acttcaacct ccctctccca gaacncattg tccgctacac ctacgagatg 120
ctcggcatca acctcgtcat ccgcggccac cgccctcctc cgccttcccc cggcaccccc 180
ggcaacctct acgtctgcaa ccaccgcacc gctctcgacc ccatcgtcat cgccattgcc 240
ctcggccgca aggtctcctg cgtcaccta 269

<210> 25
<211> 242
<212> DNA
<213> Glycine max

<400> 25
tgatcttcca cgacggccgt ttcgtgcaga ggccagaccc actgaacgct ctcatcacct 60
tcacgtggct cccctteggc ttcatcctct ccatcatcag ggtctacttc aaccttcctc 120
tcccagaacg cattgtccgc tacacctacg agatgctcgg catcadicctc gtcatccgcg 180
geodec Ecc tCotec cet eccceggcaa teecccggca cctetacgtc tgcaaccacc 240
ge? 242

<210> 26
<211> 272
<212> DNA
<213> Glycine max

<400> 26
gtttgttcaa aggccaactc ctctagcagc cctcttgacc ttcctatggt tgccaattgg 60
catcatactc tecatnetta agggtctacc ttaacatccc tttgcctgaa agaattgctt 120
ggtataacta taagctatta ggaatcagag ttattgtgaa gggtacccct ccaccacccc 180
caaagaaggg tcaaagtggt gtcctatttg tttgtaacca ccgcacagtt ttagaccctg 240
tggttactgc agttgcactt ggaagaaaaa tt 272

<210> 27
<211> 218
<212> DNA
<213> Glycine max

<400> 27
atagcacagg agggttacat ggtgcctccg agcaaatcag caaaggcagt cccacaggag 60
cgtctgaaga gcagaatgat cttccacgac gggcgtttcg tgcagaggcc agacccaatg 120
aatgccctca tcaccttcac atggctccct ttgggtttcg tcctctccat cataagggtc 180
tacttcaacc tccctctccc agaacgcatc gtccgcta 218

<210> 28
<211> 270
<212> DNA
<213> Glycine max

<400> 28
gtgcctgttg ctgtgaactg caagcagaac atgttctttg gaaccaccgt tcgtggcgtc 60
aagttctggg acccttact tacttcttac atgaacccta ggcctgtgta cgaggttacc 120
ttaccttgat acctttgccg aggagatgtc ggttaaggct ggggggaagt cgtccattga 180
s ggtggccaac cacgtggcag aaggtgctgg gggatgtgtt agggtttgag tgcaccgggt 240 tgactaggaa ggataagtat atgttgttgg 270 i < 210 > 29 I < 211 > 252 < 212 > DNA < 213 > Glicine max < 400 > 29 catgagggta ggtttgctca aaggccaact cctctagctg ccctattgac cttcctatgg 60 ctgccaattg gcatcatact ctccatctta agggtctacc ttaacatccc tttgcctgaa 120 agaattgttg gtacaactac aagctcttag gaatcagagt tattgtgaag ggtacccctc 180! caccgccccc aaagaagggt caaagtggtg tctatttgtt tgtaaccacc gcacagtatt 240 j! agaccctgtt gt 252 i < 210 > 30 < 211 > 272 I < 212 > DNA | ! < 213 > Glicine max < 400 > 30! ctgggactgc cttaaacgat gcatggatct tatcaagaaa ggagcctctg tttttttctt 60 tocagagggc acacgcagta aagatggaag actaggcaca ttcaagaagg gtgctttcag 120 tgttgctgca aagacaaatg caccagtagt accaattacc cttattggaa ctggtcaaat 190! a i- - catgcctgca ggaaaggagg gaatagtgaa cataggttct gtgaaagtgg ttatacataa 240 I acctattgtt ggaaaggatc ctgacatgtt at 272! I < 210 > 31 '< 211 > 239 i < 212 > DNA i < 213 > Glicine max < 400 > 31 egggaatcaa ggteatcaga cttcaagggt gtttcagctg ttgtcactga cagdattcga 60 gaagctcatc agaatgagtc tgctccatta atgatgttat ttccagaagg tacaaccaca 120 aatggagagt tcctccttcc attcaagact ggtggttttt tggcaaggc accggtcctt 130 cctgtgatat tacgatatca ttaccagaga tttagccctg cctgggattc catatctgg 239 < 210 > 32 < 211 > 242 < 212 > DNA < 213 > GHcine max < 400 > 32 gaacggcaac ggcaacagcg ttcgcgatga ccgtcctctg ctgaagccgg agcctccggt 6Q cttccgccga cagcatcgcc gatatggaga agaagttcgc cgcttacgtc cgccgctacg 120 tgtacggcac catgggacgc ggcgagttgc ctcccaagga gaagctcttg ctcggtttcg 180 tottotoocca "cgagcgttocgccgtcaccaa" gcto ".a" 240 cttctcccca cgttggtcac a * ac < 210 > 33 < 212 > DNA < 213 > Glicine ax < 400 > 33 ttcttcttct ctcactctct aaaaccctaa ctctatacat ggaagggddd nctcaaatct 60 natgactaat taattaatcc atcgatcaag catggagtcc gaactcaaag acctcaattc 120 gaagccgccg aacggcaacg gcaacagcgt tcgcgatgac cgtcctctg tgaagccgga 180 gcctccggtc tccgccgaca gcatcgccga tatggagaag aagttcgccg cttacgtccg 240248 ccgcgacg < 210 > 34 < 211 > 217 < 212 > DNA < 213 > Glicine ax twenty 
gcatcgccga tatggagaag aagttcgccg cttacgt 217 < 210 > 35 < 210 > 35 < 211 > 257 < 212 > DNA < 213 > Glicine max < 400 > 35 atctctgtct ctgcatttcc ctccctaaaa ccctaattct acatttcggaa aggaaatctc 60 aaatctaatg actaattact caatcaatcg tattaataat ccatcgatca agtatggagt 120 'ccgaactcaa agacctcaat tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt 180 gcgacgaccg tcctctgctg aagccggagc ctccggcctc ctccgacagc atcgccgaga 240 257 tggagaagaa gttcgcc < 210 > 36 < 211 > 284 < 212 > DNA! < 213 > Glicine max < 400 > 36! cccgaccaaa accggttttt gtggccaatc atacttccat gattgatttc attatcttag 60! aacagatgac tgcdtttgct gttattatgc agaagcatcc tggatgggtt ggattattgc 120 agagcaccat tntggagagt gtagggtgta tctggttcaa ccgtacagag gcaaaggatc 180 gagaagttgt ggcaaggaaa ttgagggatc atgtcctggg agctaacaac aaccctcttc 240 ttatatttcc tgaaggaact tgtgtaaata atcactactc OEWG 284 I < 210 > 37 < 211 > 246 | < 212 > DNA < 213 > Glicine max < 400 > 37 i ggagatccgc ataagcaaat caatcatcct gttccttcct tatctctgtc tctgcatttc 60 cctccctaaa accctaattc tacatttcgga aaggaantct caaatctaat gataattaat 120, caatcaatcg tattaataat ccatcgatca agtatggagt ccgaactcaa agacctcaat 180 tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt gcgacgaccg tcctctgctg ctagccg 240 246 < 210 > 38 < 211 > 278 < 212 > DNA | < 213 > Glicine max! < 400 > 38 gttttctatt gacacgttcit ggaagcgtaa cgaagatgaa tggcattggg aaactcaaat.60 cgtcgagttc tgaattggac cttcacattg aagattacct accttctgga tccagtgttc 120 aacaagaacg gcatggcaag ctccgactgt gtgatttgct agacctttct cctagtctat 180 ctgaggcagc aegtgccatt gtagatgata cattcacaag gtccttcaag caaatcctcc 240 agaaccttgg aactggaatg tttatttgtt tcctttgt 278 < 210 > 39 < 211 > 312 < 212 > DNA < 213 > Glicine max < 400 > 39 ttaactttgg cacattctcc ttttgttcat caatgtgtgt tgtaaattgt neatttcctt 60 i cagaggtctt tggtaganat gatgtgcagt ttctgtggtg catcttggac tgnggntgtt 120! 
aagnatcatg gacccaggcc tagcaggaga ccaaagcagg tttttgtagc cacccatact 180 tcatgattga tnteattatn tnagaacaga tgactgcttt tgcngttatn atgcagaagc 240 atcctggatg ggttggtaag cntacagnat gtcaacngtg tatnaaatat gntacacna 300 acttgcgtct tc 312 < 210 > 40 < 211 > 255 < 212 > DNA < 213 > Glicine max < 400 > 40 ggattattgn ngcanatgca gtcatctgtt ctaagataat ganatcnatc atggaagtat 60 gattggncac anaaacctgt yttttggttg gatactaggt cttggcccat ggtacttgac 120 naccccagtc catgatgcaa canaganact gnacatcatc tccaccaaac ccctctgcna 180 ganacgagaa ttgagcaatt tagagtacct tggtttgatg cnagtcagta tattcaagtt 240 tctattcatc aaagg 255 < 210 > 41 < 211 > 291 < 212 > DNA < 213 > Glicine max < 400 > 41 caacctccca tgcaatcgct cacectetcc gtcaectgaa tctgttttet attccetccg 60 tcgcgtaaca aggatgaatg gcattgggaa actcaaatcg tcgagttctg acttggacct 120 tcacattgaa gattacctgc cttctggatc cagtgttcaa caagaacggc atggcaagct 180 ccgcctgtgt gatttgctag acatttctcc tagtctatct gaggcagcac gtgccattgt 240 ttcacaaggt agatgataca gcttcaagtc aaatcctcca gaaccttgga 291 < 210 > 42 < 211 > 284 < 212 > DNA < 213 > Glicine max < 400 > 42 ctgcaaccta ccatgcaatt cctcacctga atccgttttc tattgccacg ttgtggaagc 60 gtaacgaaga tgaatggcat tgggaaactc aaatcgtega gttctgaatt ggaccttcac 120 attgcagatt acctaccttc tggatccagt gttcaacaag aacggcatgg caagctccga 180 ctgtgtgatt tgctagacat ttctcctagt ctatctgagg cagcacgtgc catgtagatg 240 aggtgctcaa atacatcaca gtcaaatctc cagaaccttg 284 gaat < 210 > 43 < 211 > 268 < 212 > DNA < 213 > Glicine max < 400 > 43 ctgaagtatt ctcgtcctag cccaaagcat agagaaaggn agedacagna etttgctgeg 60 tcagtgctgc ggcgatggga ggaaaagtga tgtgtccett tatgtggtgt tgttcttaat 120! tattcttagt aatgccattg cttcgacccc tttttttgct tttgttttgt cattgctaac 180 tatttatttt tancactttt attaaagata tggcatatat ncactteagt anacaaagtt 240 gtnccagtaa tttnttttee 268 < 210 > 44 < 211 > 241 < 212 > DNA < 213 > Glicine max < 400 > 44! 
gancaaaatt gccctccatc actttccttg ttagagttgg tttctgcncc ctaccatgca 60 attccctcac ctgaatccgt tttetattgc cacgttgtgg aagcgtaacg aagatgaatg 120 gcattgggaa actcaaatcg tcgagttctg aattggacct tcacattgaa gcttccctac 180 'cttctggatc cagtgttcaa caagaacggc atggcaagct ccgactgtgt gatttgctag 240 i 241 s < 210 > 45 < 211 > 247 'i < 212 > DNA < 213 > Glicine max < 400 > 45 I i I gtaggatgtc tgagatcctt gceccaatca aaacggtgcg gttaactaga aaccgcgacg 60 i i I aggatgcgaa catgatgaaa aatttgctgg ggcaaagggcja cctggtggtt tgtcctgaag 120 ggaccacatg tagagdacct tatttattga ggttcagccc tctgttctca gagatgtgcg 180 atgagattgt ccccgttggc agttgattcc cagttatatg ttccacggaa ccactgctgg tgganta 240 247 < 210 > 46 < 211 > 271 < 212 > DNA < 213 > Glicine max < 400 > 46 tgcagggggg ettgttagag ccatagtttt ggttcttcta tacccttttg tttgtgtcgt 60 aggaaaagag atggggttga agataatggt catggcatgc ttcttcggga tcaaagcatc 120 gagettcaga gttggaaggt ccgttttgcc cnaattettc tnggaggacg 180 ttngtgcaga ii elatgtttgag aaggagggaa gcactcaaaa gacagtggga gttaccaatt taccccacgt 240 agcttettga gatggtggaa gagagtattt g January 2 < 210 > 47 < 211 > 242 < 212 > DNA < 213 > Glicine max < 400 > 47 [ | ttcacagctg tcacgccgtn aacggaaaat ggcaacggcg agacgcagtt tecegeatat 60¡ aacggaacga caccgaatgc cnccgtgcga ntctgtngne Iccgacctcg agggtacgct 120 cctcatctcc cgtngetcgt tcccgtactt catgctcgtc gccgtegaag ceggcagent 180 cctccgcgge ctcatgctnc tcctctccct tccgttcgtc atnatcgcct acctctteat 240 et 242 < 210 > 48 < 211 > 244, < 212 > DNA < 213 > Glicine max < 400 > 48 acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 etetetetet gteatggtea ttggaggaga ettecetcgt ttegacccaa tcaccaaatg 120! 
tagacecaag acegetecea ceagaceatc gcctcggacc tcgatggcdc cctccttgtc 1 O I tcccggagtg cettececta etaettcctc gtcgccctcg aagccggcag cgtettccga 240 j gcet 244 I < 210 > 49 < 211 > 230 < 212 > DNA < 213 > Glicine max < 400 > 49 caacatteca ectagetece caatcacatc ttcaccacac cataaacctt cttagatttet 60 atetteattt tetectetat tgtcataatc atggggacct tccctcgctt cggtcccaatc 120 I I accacccaag accggtccaa ecagaccgtg gcctccgacc ttgacggcac cctcctcgte 180 teceggagcg cettececta etacctecte gttgccctcg aagccggcag 230 < 210 > 50 I! < 211 > 265 i < 212 > DNA | < 213 > Glicine max I < 400 > 50 'I caacatteca ectagetece caatcacatc ttcaccacac cataaacctt cttagatttet 60 atetteattt tetectetat tgtcataatc atggggacct tccctcgctt cggtcccaatc 120 accacccaag accggtccaa ecagaccgtg gcctccgacc ttgacggcac cctcctcgte 180 teceggagcg cettececta etacctecte gttgccctcg aagccggcag 230 < 21 O > 51 < 211 > 252 < 212 > DNA < 213 > Glicine max < 400 > 51 ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 tgcgttctca aggacaagga tactteatgt tggttgcgtt tgaagcttea ggtttggtte 120 gttgctaaca gtttcgcctt ctattgcccg tgatteggtt cettgacatg gttggcatga 130 acgatgcatc tctcaagcta atggtettcg tggctgtggc tgggttccaa agtccgaget 240 tgaatcagtg ge 252 < 210 > 52 < 211 > 212 < 212 > DNA < 213 > Glicine max < 400 > 54 gcaactacaaa caacattcat tcattcacda ctgtcacgcc gtgaacggaa aatggcaacg 60 i gegagaegea gtttcccgcc tatcacegaa tgcaacggaa cgacgccgtg egagtctgtg 120! 
gccgccgacc tcgacggtac getecteate tcccgtagnc cgttcccgta ettcatgctc 180 gtngccgtcg aagccggcag cctcctccgc gg 2 2 < 210 > 55 < 211 > 273 < 212 > DNA < 213 > Glicine max '< 400 > 55 catggttttc ttgagcttct ttggcctcag adaggacaca ttcagaacag gateagctgt 60 I tctggcaaag ttcttettag aagatgttgg attggaglggc tttgaggccg taatatgttg 120 tgagagaaaa gtggcatcta gtaagttgcc aagggtcatg gttgaaaatt tcctcaagga 190 ctatttaggg gttgatgctg ttatagcaag agaattgaag tcctttagtg gcttcttttt 240 i gggagttttt gagagtaaaga agecaattaa aat 273 < 210 > 56 < 211 > 257 < 212 > DNA < 213 > Glicine max < 400 > 56 ctctcaaaaa aggagggaag acagtgggag tcaccaatet aceceatgtg atggtggaaa 60 gcttcttgag agagtatttg gacattgatt tcgttgtggg cagggagctg aaagttttct 120 'gtggatacta cgtaggattg atggatgaca caaaaactat gcatgccttg gagctggtta 180 aagaaggaaa aggatgctcc gacatgatcg gaatcacaag gtttcgcaac atacgcgace 240 257 atgatgattt tttctcc < 210 > 57 < 211 > 240 < 212 > DNA < 213 > Glicine max < 400 > 57 gaactaagtg tgaaccacta ccaagaaaca agettttaag tccaattatt ttteatgagg 60 gtaggtttgc tcaaaggcca actcctetag ctgnnctctt gaccttccta tggctgccea 120 ttggeateat actctccatc ttaagggtct accttadcat ccctttgcct gaaagaattg 190 cttggtacaa ctacaagctc ttaggadttca gagttattgt gaagggtacc cctccaccgc 240 < 210 > 58 I i < 211 > 254 < 212 > DNA < 213 > Glicine max < 400 > 58! I cttggaataa gggtcattag gaagggtate ectecacece cagcnaagaa gggccaaagt 60 ggagtcctat ttgtatgcaa ecacaggaca gttttagace ctgtggttac agctcgttgca 120 [ttaggaagga aaattagctg tgtcacatat agcataagca aatteactga aataatttea 180 ccaatcaaag ctgtggcact etctagggag agggacaaadg atgctgccad catcadgang ttgcttgagg dagg 240 254 < 210 > 59 < 211 > 267 < 212 > DNA < 213 > Glicine max < 400 > 59 gccaganaga acttgcttggt acaactacaa gcttcttgga ataagggtca ttaggaaggg 60, tatccctcca cececagcaa agaagggcca aagtggagtc ctatttgtat gcaaccacag 120 gacagtttta gaccctgtgg ttacagctgt tgcattagga aggaaaatta gctgtgteac 180 atatagcata agcaaattca ctgaaataat teaccaatca aagctgtggc actctctagg 240! 
gagagggacc nagatgctgc cnacatc 267 < 210 > 60 < 211 > 261 < 212 > DNA < 213 > Glicine max < 400 > 60 gtaaceacag ggtctaaaac tgtgcggtgg ttactgcagt tgcacttgnc nagaaaaatt 60 tgcttatgct atatgtgaca cagetaatte actgnaataa tttcaccaat taaagctgtg 120 gcactctcaa ggganngaga gaaagatgct gccaatatcc ngagactact tgaggaaggg 180 gacttggtga tttgccctga cggcacaact tgtagagagc cttectcttg 240 aggttcagtg j cactatttgc tgaactcact g 261! < 210 > 61 < 211 > 258 < 212 > DNA < 213 > Glicine max < 400 > 61 caaggagctc acatgcagtg gagggaaatc agetattgaa gttgcaaact acattcaaag 60 ggttettgca gggactttgg gatttgagtg cacaaatttg actaggaala gcaaatatgc 120; catgcttgca ggcacagatg ggacagttcc atetaaggag aaggettgan aagggagaga 180 aattaagttc tcccttttga ttattctgte ttggtgccca atgtgtttcc adadcaetta 240 gaattatgat agaaataa 258 < 210 > 62 < 211 > 258 < 212 > DNA < 213 > Glicine max < 400 > 62 attggcataa tectetecat ectaag < lgte tatetcaaca tccctctgcc agaaagactt 60 ¡gcttgntaca actacaaget tettggaata agggteatta ggaagggtat cectecaccc 1201 ceagcaaaga agggccaaag tggagcctat ttgtatgcaa ecacaggaca gttttagacc 130 ctgtggttac agctgttgca ttaggaagga aaattageto tgtcacatat agcataagca 240, aattcactga aataattt 258 < 210 > 63 < 211 > 239 < 212 > DNA < 213 > Glicine max < 400 > 63 cactheacea ecacaecaca accetaccet ctctctctgt catggteatt ggaggagcct 60 tccctcgttt egacccaate accaaatgta geacccaaga cegetccaac cagaceateg 120 ccteggacct ctccttgtc ctccttgtc ctggagtcc ettececctcc taettcctcg 180 tcgccctcga agccggcage gtettccgag ecetecttet ottaacette gteccette 239 < 210 > 64 < 211 > 531 < 212 > DNA < 213 > Glicine max < 400 > 64 cegagaaceg gtctaaccaa accgtggect cggacttgga eggeacecte ctggtgtccc 60 ceagcgcatt ccettactac atgctggteg ecatcgaagc cggcagette ctccgtggcc 120 ttgtectect tgcctccgte cetttcgtgt attcacgtac atattectet cegagaccgc 180 ggccatcaag tccctgatct tcatcgcett cgcgggcctg daggteaggg acgttgagat 240 ggtcgcgtgc teggtgctgc cgccgacata ccaagtteta agctccccca ttottcagtt 500 i acetatacae tteacedeca caccacaace etacectete tctctgteat ggtcattgga 360¡ ctcgtttega 
ggagccttcc cccaatcace aaatgtagea cccaagaceg etccaaccag 420 aceatcgcct eggacctcga tggcaccetc cttgtctecc ggagtgcctt cccetactac 480 ttcctcgteg ecetegaage cggcagcgte ttccgagccc tccttctctt a 531 < 210 > 65 < 211 > 256 < 212 > DNA < 213 > Glicine max < 400 > 65 acatattctt cagttagctc ccccaaccta tacacttcac cacedeacea caaccctace 60 ctetctctct gtcatqgtca ttclgalgagc ettccctcgt ttcgacccaa tcaccaaatg 120 • * - * - - * - '- • i tagcacccaa gaccgctcca accagaccat cgcctcggac ctcgatggca ccctccttgt 180 ctcccggagt gccttcccct actacttcct cgtcgccctc gadgccggea gcgtcttccg 240 agccctcctt ctotta 256 < 210 > 66 < 211 > 260 < 212 > DNA < 213 > Glicine max < 400 > 66 ccatccaaca tattctteag ttagctcccc caacctatac aettcaccac cacaccacaa 60 ccecacccte tctctctgte atggtcattg gaggagcctt gacccaatca ccctcgttte 120 ccaaatgtag cacceadgac cgctccaacc agactatcgc cteggacctc gatggcaccc 180 tccttgtete ceggagtgcc ttecactact acttcctcgt cgccetegaa gccggcagcg 240 260 tettccgage cctccttctc < 210 > 67 < 211 > 248 < 212 > DNA < 213 > Glicine max < 400 > 67 ^ ittjAi i! caecadccaa acetcactet ccctttctcc cctgaccctc tccctgccat ggtcatggga 60 gcetttggcc aettcgaacc ggtctccaaa tgcagcaccg agaaccggtc taaccdelacc 120 I gtggcctcgg acttggacgg caccctcctg gtgtccccea gcgcatttcc ttactacatg 180 '! 
I ctgggcgcca tcgaagccgg cagcttecte cgtggccttg tcctccttgc ctccgtccct 240 ttcgtgta 248 < 210 > 68 < 211 > 283 < 212 > DNA < 213 > Glicine max < 400 > 68 ttetteccea ecateacace aancaaacet cactetnect ggccatggtc atgnnngeft 60 ttccgceact tcgaaccggt ttccaaatgc ageaccgaaa accggtttaa ccaaaccgtg 120 'I gceteggact tggacggcac cctcctggtg tecectagcg cotttcctta ctacatgctc 180 gtcgccatcg aagccggeag cttcctccgt ggccttgtcc tccttggatc cgtccctttc 240 gtgtacttca cgtacatatt cttctccgag tea accgcggcca 283 < 210 > 69 < 211 > 258 < 212 > DNA < 213 > Glicine max < 400 > 69 t ctcttetteo ecaceatenn aecaaccaaa cetcaetcte cctgaccatg gtcatgggag 60 cctttcgcca cttcgaaccg gtttccaaat gcagcaccga aaaccggttt aaccaaaceg 120 I tggcctcgga cttggacgge accetcctgg tgtcccctag cgcctttect tactacatgc 160 i i tcgtcgccat cgaagccggc agettectec gtggccttgt ecteettgga tccgtecctt 240 cacgtaca tcgtgtactt 258, ! < 210 > 70 < 211 > 256 < 212 > DNA < 213 > Glicine max ' < 400 > 70 I tgcaactaca tteatteaca acaacattea gctgteacgc cgtgaacgga aaatggcaac 60 ggegagaege agttteccgc etateacega atgcaacgga acgacaccgt cgcgagtctgt 120 ggccgccgac ctcgacggta cgctccteat ctcccgtagc tcgttcccgt aetteatget 180 cgtcgccgtc gadgccggca gcntcctccg cggcctcatc ctcctcctng ccantccgtt 240 256 cgteatcanc geetac < 210 > 71 < 211 > 259 < 212 > DNA < 213 > Glicine max < 400 > 71 ettccecace ateacacean ggenaacctc antctecett tetecaenga cectetecct so gccatngtca tgggancett tggccacttc gaaceggtct ecaaatgcag cacegagaac 120 cggnetaace aaaccgtgge cteggacttg gacggcaccc tcctggtote cencagcgca 1ß0 tttccttect acatgctgge ngccategaa gceggcagct tcctccgtgg ecttgtcctc 240 259 cttgcctccg tcccttteg < 210 > 72 < 211 > 249 < 212 > DNA < 213 > Glicine max < 400 > 72 ccaacatatt ettcagttag etcccccaac ctatacactt caceaceaca ccacaaccet 60 accctctctc tctgtcatgg teattggagg agccttccct cgtttcgacc caatcaccaa 120 atgtagcacc caagaccgct eceaceagac catcgcctcg gacctcgatg gcaccctnet 18¡0 tgtctcccgg agtgccttee ectactaett cctcgtcgcc ctegaagccg gcagcgtctt ncgagccct 240 249 < 210 > 73 < 211 > 257 < 212 > 
DNA < 213 > Glicine max! i < 400 > 73 caaccctctt cttccccacc atcacaccaa ncaaacetca ctetecettt cteccctgac 60 cctctccctg ccatggtcat gggagccttt ggccactteg aaccggtctc caaatgcagc 120 accgagaace ggtctaacca aaccgtggcc tcggacttgg acggcaccct cctggtgtee 180 cccagcgcat ntccttacta catgctggte gccategaag ccggcagctt cctccgtggc cttgtcctcc ttgcctg 2401 - 2571 < 210 > 74 < 211 > 255 < 212 > DNA < 213 > Glicine max < 400 > 74 gccgeagacg tgcacccgga gagttggaga gtgttcaact Ctttcgggaa gcgttacatt 60 gtcacggcta gtcctagggt gatggtggag ccgtttgtta aggcgtttct eggggctgaa 120 aaggtgcttg ggactgaact tgaggccace aaategggga cgtteactgg gtttgttaag 180 aagcctggtg tgcttgttgg ggagcataag aaagtggetc tggtgaagga gtttcagggt 240 255 aattacctga cttgg < 210 > 75 < 211 > 244 < 212 > DNA < 213 > Glicine max < 400 > 75! ! i caacaacatt catteattca cagctgtcac gccgtgaacg gaaaatggca acggcgagac 60 gcagtttece gcetateace gaatgcaacg gaacgacacc gtgcgagtct gtggccgceg 120 acctegacgg tacgctcctc atencccgta gctcgttcce gtacttcatg ctcgtcgccg 180 > tcgaagccgg cagoctecte cgcggcctea tgcnttcctg ggtttanttt gagnacecet 240 I gagg 244 < 210 > 76 < 211 > 240! 
< 212 > DNA < 213 > Glycine max < 400 > 76 gctggctace ctcttettcc ecaecatcac aecaatcaaa cetcaetcta ccctggccat 60 ggtcatggga gcctttncgc cacttcgaac eggtttccaa atgcagcacc gaanaccggt 120 ttneccanae cgtggccteg gnettggacg gcaccetcct ggtgtcccct agcgcctttc 180 cttactacot gctcgtcgcc atcgaagccg gcagcttcct ccgtggcttg tecteettgg 240 < 210 > 77 < 211 > 263 < 212 > DNA < 213 > Glycine max < 400 > 77 gtttctcggg gctgacaagg tgcttgggac tgaacttgag gccaccaaat cggggacgtt 60 cactgggttt gttaagaaggc ctggtgtgct tgttggggag catatagaaag tggctctggt 120 gaaggagttt cagggtaatt tacctgactt gggtctaggt gatagtaaaa gtgattatggt 180 cttcatgtea atttgcaagg dagggtecat ggtgccaaga actaagtgtg aaccactacc 240 aagaaacaag ettttaagte caa 263 < 210 > 78 < 211 > 258 < 212 > DNA < 213 > Glycine max < 400 > 78 ggccacgaaaa tcggggaggt tcactgggtt tgttaaggag cctggtgtgc ttgttgggga 60 gcacaagaaa gtggctgttg tgaaggagtt tcagggtaat ttacctgact tgggactagg 120 agatagtaaa agtgattatg actteatgte aatttgcaag gaagggtaca tggtgccddg 180 gactaagtgt gaaceactac caagaaacaa aettttaagt ceaattattt ntcatgagog 240 taggtttgtt caaaggcc 258 < 210 > 79 < 211 > 260 < 212 > DNA < 213 > Glycine max < 400 > 79 ctettcttcc ccaccatcac accaancaaa ecteactete cetttetece ctgaccctet 60 ccctgccatg gtcatgggag cctttggcca cttcgaaceg gtctccaaat geageacega 120 gaaccggtct aaceaadccg tggcctcgga cttggacggc accctcctgg tgtcccccag 180 cgcatttect tactacdtgc tggtcgccat cgaagccgge agettcctcc gtgggccttg 240 tcctccttgc ctccgtccct 260 < 210 > 80 < 211 > 257 < 212 > DNA < 213 > Glycine max < 400 > 80 gggaaceaca acaaatggca ngaaccttat ctccttccaa ettggtgcat ttatccctgg 60 atacccaatc cagcctgtaa ttgtacgcta tcctcatgtg cactttgacc aatcctgggg 120
{ tcatgtntct ttgggaaagc ttdtgtteag aatgttcgtct caatttcaca aettttttga 180 ggtagaatat cttcctgtca tttatcccet ggatgataag gaaactgctg tancttnteg 240 ggagaggact agccggg 257 < 210 > 81 < 211 > 272 < 212 > DNA < 213 > Glicine max < 400 > 81 l catacctttt gttggcacca ttattagage aatgcaggtc atatatgtta acagattctt 60 'aceatcatca aggaagcagg ctgttaggga aataaaggaa ctgaataaca gagaagggcc | 120 tcttgtgata aatttecteg agtactatta tttecegagg gaacaacaac taatggcagg 180 > aacettatet cetteedact tggtgcattt atccctggat acccaatcca gactgtaatt 240 atacgctate cteatgteca ctttgaccaa tc 272 < 210 > 82 < 211 > 245 < 212 > DNA < 213 > Glicine max < 400 > 82 gggeatttca catactagag tteateceag tgaaaagaaa gtgggagget gatgaatcaa 60! teatgcgcca tatgctttet acattcaagg atecacaacja tcctctctgg cttgcgcttt 120 tcccagaagg cactgattte actgagcaaa agtgcctteg gagtcaaaaa tatgctgctg 180 aacataagtt accggttctg aaaaatgttt tacttccaag gacaaagggg ettctgtgcc 240 gettg 245 < 210 > 83 < 211 > 268 < 212 > DNA < 213 > Glicine max, < 400 > 83 I cagtgtcctt cctttctgga cuatgttttt ggtgttgacc ettcagaagt gcacetgcat 60 gtgcggcgta ttccggtgga ggagatteca gcttctgaaa ttettggtte ccaaagctgc 120 ategelcacat tccagatcaa ggaceaattg etttcggatt tcaagattea aggccattte 180 taaatgaaaa cctaaccaac tgaaatttet agettteaga gcctactete ttttatggtg 240 atagtttett ttactgccat gtttattt 268 < 210 > 84 < 211 > 265 < 212 > DNA i < 213 > Glicine max! < 400 > 84 gadagagact gggcaaaaga tgaaacatea ctgaagtcag gttttaggea tetagacjcac 60 atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgttteac geagacaaag 120 cttttacaag cteaagagtt tgctgcttea aaagggctge ctatacctag aaatgttttg 190 etaagggttt attcctcgta tgtcacagea gnaca "SCFA tteggccatt tcgttecage 240 catttatgat tgcacatatg cegtt 265 < 210 > 85 < 211 > 265 < 212 > DNA < 213 > Glicine max < 400 > 85,! 
I gaaagagact gggcaaaaga tgaaacatea ctgaagteag gttttaggea tetagageac 60 atgecattee etttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcegacaaag 120 cttttacaag etcaagagtt tgctgcttea aaagggctgc etatacetag aaatgttttg 180 attcctcgta etaagggttt tgtcacagca gnacaaagce tteggccatt tcgttccage 240, 265 catttatgat tgcacatatg cagtt < 210 > 86 < 211 > 301 < 212 > DNA < 213 > Zea mays < 400 > 86 ctcgtcgtea agggeacece gccgccgccg cccaagaagg gceacccggg cgtcctcttc 60 g-! tctgcaace acegeacegt getcgaccce gtcga < jgtgg ccgtggcgct gcgccgcaag 120 cteagctgcg teacetacag catctccaag ttctccgacjc teatctcgcc catcaaggcc 180, gtcgcgctgt cgcgggagge gaceaggacg ccgagaacat ccgcegcctg ctggaggagg 240 gcgacctggt catctgcccc gagggnaaca actgccgcga gcccttcctg ctgcgttcag 300g soj < 210 > 87! < 211 > 309! < 212 > DNA i < 213 > Zea mays i < 400 > 87 I cgctcatgcg gtgtacatea acctgccgct gcccgagcgc atcgtctact acacetacaa 60¡ getcatggge atcaggeteg tcgtcaaggg caecccgccg ccgccgccca agaagggcca > 120 cccgggcgtc ctcttcgtet gcaaccaccg caccgtgctc gaccccgtcg aggtggccgt 13¡0 ggcgctgcgc cgcaaggtca gctgcgtcac ctacagcatc tccaagttct ccgagctcatt 240 ctcgcccate aaggccgtcg cgctgteggg gaggcgacaa ggacgccgag aacatccgce j 300 gcctgctgg; 309 < 210 > 88 < 211 > 304 < 212 > DNA < 213 > Zea mays l < 400 > 88 tggctgtgca ggaggcctac ctggtgacgt caaggaagta cagcccggtg cecaggaacc 60 agctgctgag cccgctgatt cgtgcacgac ggccgcctg tgcagcgccc gacgccgctc 120 gtcgcgctcg tcaccttect ctggatgccg ttcggetteg cgctggcgct catgcgcgtg 180 tacatcaacc tgccgctgcc cgagcgcatc gtetactaca ectacaagct catgggcatc 240 aggctcgtcg tcaagggcae cccgccgccg ccgcccaaga agggccaccc gggcgtecte TGCTs 300 304! 
I < 210 > 89 < 211 > 312 < 212 > DNA < 213 > Zea mays < 400 > 89 ggtteateca ettgtgttgc tattngaceg gtaccgtagg agageacagc actancateg 60 caaagatttn gggctacggt gacaatctcc atgttctaca atettnaggt cgaaggaatg 120 gagaatctgc etccaaatag ctgtcctggt gtctatgttg ataaceatea gagcttettg 180 gatatttata ccettctaac tetagggagg tgcttcaaat ttataagcaa gaceageate 240 tttatgttee etattatagg gtgggcaatg tatctettgg gtgtgattcc tctgcggcgt gg 312 300 atggacagca < 210 > 90 < 211 > 264 < 212 > DNA < 213 > Zea mays < 400 > 90 ggtgctgtat ctgaaagaat ecatcgtgct catcaacaga aaaatgc & CC aatgatgcta 60 gagggeacela ctcttcccct ctacaaatgg ggettatctc cttecattea aaacaggtgc 120 ttttettgca aaggcaccag ttcaaccagt tateettaca cattttgaga LAO aaagatttaa tgcageatgg gattecatgt caggggcacg tcatgtattt ctgctgctct gtcaatttgt 240 aaattaceta gaggtggtcc gctt 264 I < 210 > 91! < 211 > 212 < 212 > DNA < 213 > Zea mays < 400 > 91! aaatgtcttg gatgcatttt tgttcagcgg gagtegaaaa caccagattt caaaggtgtt 60 teaggtgctg tatttgaaag aatccatcgt getcatcaac agaaaaatgc accaatgatg 120 ctactettee ctgagggcac aactacaaat ggggattete teettccatt caaaacaaggt 180 gcttttettg caaaggcacc agttcaacea gt 212 < 210 > 92 < 211 > 267 i < 212 > DNA < 213 > Zea mays < 400 > 92 gtctaaagaa etngaaaggc gtggggnadt tgtgtetaat catgtntett atgtggatat 60 'tetttatcan atgtcagcet ettttectag ttttgttgct aagagatcag tggntagatt 120! 
gcctetagtt ggteteataa gcaaatgtet tggatgcatt tttgtteage gggagtnnaa 180 aatncatt tcaaaggtgt ttaaggtgtg gnatetgaaa gaatccatcg tgctcatcaa 240 cagaaaaatg caccaatgat gctactc 267 < 210 > 93 < 211 > 152 < 212 > DNA < 213 > Zea mays < 400 > 93 ctdcddfttgg ggdttacctt ettecattta agactggagc etttnttgca ggtgcaccag 60 tccagccagt cattttgaaa taccettaca ggagatttag tecagcatgg gattcaatgg 120 atggagcacg tcatgtgtta ttgctgctct gt 152 < 210 > 94 < 211 > 274 < 212 > DNA < 213 > Zea mays < 400 > 94 aaaatataaa ttaatatggt cttaatccca ecatataaat aacgttctct ttctgcaggg 60 caatttagtt etttetaata ttgggctgge agagaagcgc gtgtaceatg ca < jcactgac 120 tggtagtagt ctacctggcg ctagacatga gaaagatgat tgaaagacgt tgcgtcgctt 180 tttctgtaac agacagcega ggaacactta aaaatgtaac tgtgtgcgtg tttttatace 240 tgtaatgtgg cagtttattt gtttgaggag gctg 274 < 210 > 95 < 211 > 295 < 212 > DNA < 213 > Zea mays < 400 > 95 actagctate aagtacaata aaatatttgt tgatgccttt tggaacLgta agaagcaatc 60 ttttacaatg caettggtcc ggctgatgac atcatgggct gttgtgtgtg atgtttggta 120 cttacctect caatatctga gggagggaga gacggcaatt gcatttgctg agagagtaag 180 ggacatgata gctgctagag ctggactaaa gaaggttcct tgggatggct atctgaaaca 240
caaccgtcct agtcccaaac acactgaaga gaacaacgca tattgccgat ctgtc 295 < 210 > 96 < 211 > 273 < 212 > DNA < 213 > Zea mays < 400 > 96 gngccatctc accggcggcn ggcctgcggc cggcaacegg aggcgatggc gagetngtct 60 gtggtggcgg acatggagca ntaccgcccc aacctggagg actacctcce gccegactc0 120 ctcccgcagg aggcgcccag gaatctecat ctgcgcgatc tgcttgacat ctcgccggtg 180 ctaaccgagg cagcgggtgc catagtcgat gattcattca cccgttgctt taagtcgaat 240 tctccagaac catggaatgg aacacatatt tgt 273 < 210 > 97 < 211 > 127 < 212 > DNA < 213 > Zea mays < 400 > 97 ctcaatatct ganggaggga gagactgcae ttgcgtttgc tgagagagte agggacatga 60 tagcagctag agctggtett dagaag ° tec cgtgggatgg ctatctgaag cacaaccgcc 120 ctagtcc 127 < 210 > 98 < 211 > 286 < 212 > DNA < 213 > Zea mays < 400 > 98 gaaccgtacg cgcctcatta cgcccatcca cgtgctcgcc tctccccatc cjcataatttt 60 nctcggcggc gtcgccatct ccancggcng cnggcctgcn gccggcaacc ggaggcgatg 120 gcgagctcgt ctgtggcggc ggacatggag ctggaccgce ccaacctgga ggctct 180 ccgcccgant cgctcccgca ggaggcgacc aggaatctee etctgngcga tctgcttgan 240 atctcgccgg tgctaaccga ggcagcgggt gccatagtcg atgatt 286 < 210 > 99 < 211 > 308 < 212 > DNA < 213 > Zea mays < 400 > 99 cgccatctea tcggcggcgg gcgtgcggcc ggcggcngag gcgaggngcg attggcgagc 60 tcgtctgtgg cgceggacat ggagctggac cgcccanace tggaggacta netcccgccc 120 gactegnnec egeagaggcg ccccggaatc tccanctgcg cgatctgctg gacatcncgc 180 eggtgctcac cgaggcagcg ggtgccattg tcgatgacte ctteacacgg ngctttaagt 240 caaattetcc agagccatgg nattggaaca tatatctgtt cecettatgt getttggtgt 300 ataataag 308 < 210 > 100 < 211 > 282 < 212 > DNA < 213 > Zea mays < 400 > 100 cagaaactag angttagtea cagcatg ° ca ttaaattgtc atagtaaaca acancncact 60 gagcaactat gcaatttaat gccatgctgt gactaactte tagtttctgg cattaaatta 120 ctgtttgget actaggaaga gaagcaaata cogaggtaga ctccaacgca taagaatacc 180 canccaaatg acagagtaaa tgaaggtagg gtteacette ttgaacatga ccgtatactg 240 gttgttaaca caagtteetc tgggaaaatc agagagggtt tt 282 < 210 > 101 < 211 > 282 < 212 > DNA < 213 > Zea mays < 400 > 101
ggcgcggctg gccgtggcgc tggtcctgcc gtacagtact cgacgccgat cctggcngcg 601 acnggcatgt cgtggcggct caaagggtng cgcccngnge ttgcnnngce gtgctccgge 120 gggcgctgne agctgttcgt gtgcaacnae cggacgctga tcgacccngt gtacgtgtcc 180 gtagcgtgga ccgggaaatg cgcgncgtgt nctacagnct gangcggntn teggagetca 240 tcteccecat ngncggaang tgcacctgan accgggaacg 282 gg I < 210 > 102 < 211 > 290 < 212 > DNA < 213 > Zea mays < 400 > 102 ggacgcggea ccatgcgcgc cgagctggcc agtggcgacg tggccgtgtg ccccgagggc 60 accacgtgcc gggagccctt cctgctccgc ttctccaagc tettegcgga gctcagcgac 120 i aggatcgtgc ccgtggcgat gaactacege gtggggetet tccacccgac gacggcgcgc 180! gggtggaaaq ceatggaccc catcttcttc ttcatgaacn gcggcccgtg tacgaggtga 24? cgttcctgaa ecantecceg caaagcgacg tgcgcggcgg ggaagagccc 290 I I < 210 > 103 < 211 > 279 < 212 > DNA I I < 213 > Zea mays < 400 > 103 Acgaggtgac gttcctgaac cagctccccg cagaggcgac gtgcgcggcg gggaagagcc 60 I ccgttgatgt agccaactac gttcagcgga tactcgctgc cacgctcggg ttcgagtgcaj 120 I cceccetcae aaggaaggac aaatacacgg tgctcgccgg caacgacggc gtcctgaacg 190 ccaagccgge ggcggcccgg aagccggctt ggcagagccg cgtgaaggaa gtcctcgggt 240 tctgctccac taaceattac acettgccca gatctggac 279 < 210 > 104 < 211 > 315 < 212 > DNA < 213 > Zea mays < 400 > 104; gcccgagcgc atcgtetact acacctacaa getcatggge atcaggctcg tcgtcaaggg 60! 
caccccgccg agaagggcca ccgccgccca ecegggcgte etettcgtct gcaacc & CC9 120 caccgtgctc gaccccgtcg dggtggccgt ggcgctgcgc cgc-ngtea gctgcgtcac 180 tacagcatct ccaagttctc cgagctcatc tcgcccatea aggccgtagc agnaaagcag 240 gtcgcaaatg gagcagnagc gagtegatgg aagngaattg atetgenega gcgactggtc 300 aggnacactg cggag 315 < 210 > 105 < 211 > 314 < 212 > DNA i < 213 > Zea mays < 400 > 105 cgagacaceg agcacgtact accageaaga tggtggcgtc tcccagattc aagcccatcg 60 aggagtgetg ctcggagggg cggtcggage agacggtgge cgccgacctg gacggcacgc1 120 tgctcatctc caggagcgcg tteccctact acctcctcgt ggctctcgag gccggcagcg 180 'tcctccgcgc cgcgctgctg etcctgtccg tgccgttcgt etacgtcacc tacgccttct 240 tctccgagtc gctggccatc agcacgctgg tgtacatctc cgtggcgggg ctcaaggtgc 300 gcanatcgag ATGG 314 < 210 > 106 < 211 > 291 < 212 > DNA < 213 > Zea mays < 400 > 106 ctctgggtet ggggccgaga caccgagcac gtactaccag caagatggtg gcgtctccca 60 I gattcaagee catcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg 120 acctggacgg cacgctgctc atntccagga gcgcgttecc ctactacctc etcgtggctc 180 tcgaggccgg cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg 240 tcacctacgc cttcttctcc gagtcgctgg ccatcagcac gctggtgtac 291 < 210 > 107 < 211 > 300 < 212 > DNA < 213 > Zea mays! < 400 > 107 geaegeagea gtacgacgtc tctcctctgg gtctggggce gcacgtacta gagacacega 601 i ceageaagat ggtggcgtet cccagattca agcccatcga ggagtgctgc teggaggggc 12¡0 I ggteggagea gacggtggcc gcegacctgg acggeacgct gctcatctec aggagegegt 180 tcccctacta cctcctcgtg gctetcgagg ccggcagcgt cctccgcgce gcgctgctgc 240 tcctgtccgt gccgttcgtc tacgtcacct acgccttett ctccgagtcg ctggccatca 300 < 210 > 108 < 211 > 284 < 212 > DNA < 213 > Zea mays < 400 > 108! 
gnggccgaga caccgagcac gtactaccag cangatggtg gcgtctccea gattcangcc 60 antegaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg acctggacgg 120 atetccagga cacgctgctc gcgcgttecc ctacnacctc ctcgtggctc tegaggccgg 180, cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg tcactacgcc 240 ttcttctccg agtcgctggc catcaanacg ctggtgtaca tête 284 < 210 > 109 < 211 > 280 < 212 > DNA < 213 > Zea mays < 400 > 109 ctcctctggg tctggggccg agacacegag cacgtactac cagcaagatg gtggcgtctc 60! ceagattcaa gcecategag gagtgctgct eggaggggcg gtcggagcag acggtggccg 120 ccgacctgga cggcacgctg cteatctcca ggagcgcgtt ecnetactac etectcgtgg 1801 ctctcgagge cggcagcgte ctccgcgccg cgctgctgct cctgtccgtn ccgttcgtet 240 acgtcaccta cgcnttnttc tccgagtcgc tggccatcag 280 < 210 > 110 < 211 > 287 < 212 > DNA < 213 > Zea mays < 400 > 110 cgtotetect ctgggtctgg ggccgagaca ccgagcacgt actaccagca agatggtgge 60 ttcaagccca gtctcccaga tegaggagtg ctgctcggag gggcggtegg ageagacggt 120 ctggacggca ggccgccgac gctgctcatc tccaggagcg cgttccccta ctacctecte 1201 gtggctctcg aggccggcag cgtectccgc gccgcgctgc tgctectgtc cgtgccgtte 240 ctacggette gtctacgtca ttctccgagt cgctggccat cageacg 287 < 210 > 111 < 211 > 286 < 212 > DNA < 213 > Zea mays < 400 > 111 cgcacagtta cgacgtctct cctctgggtc tggggccgag acacegagen egeactacea 6 ^ i gcaagatggt ggcgtctccc agattcaagc ccetegagga gtgctgctcg gaggggcggt 120 cggagcagac ggtggccgcc gacctggacg gcacgctgct catetccagg agcgcgttee 190 ectactacte ctcgtgctct egaggccggc aggtectccg cgccgcgctg tgctectgte 240 | gtgcgttcgt ctagteacta egettttctc gancgtggea ataaa 286 < 210 > 112 < 211 > 323 < 212 > DNA < 213 > Zea mays < 400 > 112 i! i! 
gttattccct gaaggtacca caacaaatgg gagattcctg atttcgttcc aacatggtgc 6¡0 atteatacet ggctacectg ttcaacctgt tgttgtccgt tatecacat0 tgcactttga 120 'tcaatcatgg gggnatat-dt cgttattaad gctcatgttt aagatgttea cecaatttea 130 taattteatg gaggtagagt acettcctgt tgtct & CCCT cctgagatea agcaagagaa 240 tgccctteat tttgcggagg ataceagcta tgctatggca cgtgccctca atgtettgcc 300 323 att aacttectat teatatggtg < 210 > 113 < 211 > 312 < 212 > DNA < 213 > Zea mays < 400 > 113 egataaggcc ettttegaag agcttetace gteggatcaa cagattcttg gccgagctgc 60 tgtggcttea gcttgtctgg gtggtggact ggtgggcagg tgttaaggtd caactgcatg 120, cagatgagga aacttacega tcaatgggtaa aagagcatgc acteatcata tcaaatcatc 160 ggagtgatat tgattggctc attggatgga tattggccca gcgttcaggg tgccttggaa 240 gtacaettgc tgtcatgaag aagtcatcca agttccttcc acgttattggc tggtcaatgt 300, 312 ggtttgcaga gt < 210 > 114 < 211 > 279 < 212 > DNA Éü ^^^ lÉítetel i < 213 > Zea mays i < 400 > 114 i I agtggggtct ceamaggttg aaagacttcc ctagaceatt ttggctagct etttttgttg 60 ¡5 agggtactcg etttactaca gcaaagctte tcgcagctca ggagtatgcg gcttcccaagg 120 gctteccege tcctagaaat gtacttattc cacgtaccaa gggatttgta tctgccgtaa 180 gtattatgcg aggattttgtt ecagccattt acgatacaac tgtaatagtt cetaaagatt. 240 cccctcaacc daceatgctg cggattttga aagggcaat 279 < 210 > 115 < 211 > 304 < 212 > DNA < 213 > Zea mays < 400 > 115 ' cgtcaglcgcc atecaggccg tectatttgt gacgataagg ecettttcga agagetteta 60! ccgtcggatc macagettet tggccgagct gctgtggctt cagettgtct gggtggtgga 120 ctggtgggca ggtgttaagg tacadctgca tgcagatgag gaaacttaca gatcaatggg 190 gcactcatca taaaaagcat tataaaatca tcggagtgat ettgattggc tcatggatgg 24Q 20 atattggcce agcgttcagg gtgccttgga agtacattgc tgtcatgaag aagtcateca 300 AGTT 304 < 210 > 116 | < 211 > 259! < 212 > DNA I | < 213 > Zea mays! 
i < 400 > 116 ettcctcctg tccggcctea tcgtcaacgc catecaggcc gtcctatttg tgacgata & September 60 gccentttcg bagagettct aacgtcggat caacagattc ntggccgagc tgctgtggct 120 tcagcttgtc tgggtggtgg acnggtgggc aggtgttaag gtacaactgc atgcngatga 180 ggaaacttac agatcnatgg gtanagagea tgcaetcatc atatcaaatc atcggagtga 240 259 encattgga cattgattgg < 210 > 117 < 211 > 235 < 212 > DNA < 213 > Zea mays < 400 > 117! i attccacgte ccaagggatt tgtatctgct gtaagtatta tgcgagattt tgttccagcc 60 atttátgata caactgtaat agttcctaaa gattcccctc aaccaacaat gctgcggatt 120 ttgaaaggqc aateatcagt gatacatgte cgcatgaaac gtcatgcaat gagtgagatg 180 ccaeaatcag atgaggatgt ttcaaaatgg tgtaaagaca tttttgtgge aaagg 235 < 210 > 118 < 211 > 282 < 212 > DNA! < 213 > Zea mays < 400 > 118 ' tgagatgcca amatcagatg atgacgtttc aaaatggtgt deagacattt ttgtgacaaa 60 ggatgcctta ctggacaaac atttggcaac aggcactttc gatgaggaga ttagacetat 120 cggccgccca gtgaaatcat tgctggtgac cctgttttgg tcgtgcctgc tgttgtttgg! 180 tgccatcgag ttettcaagt ggacgcagct cctategaca tggagaggag tggcattcac 240 tgccgcagga tggcgctcgt gacaggggtc atgcacgtet tc 282 < 21O > 119 < 211 > 166 < 212 > DNA < 213 > Zea mays < 400 > 119 ctggtgggca ggcgttmagg tacaactaca tgcggatgag gacaettacc gatcaatggg 60 taaagagaat gcactcgtea tatcaaatea tcgaagtgat ettgattggc ttattggatg 120 gatattggce cagcgcteag ggtgccttgg eeotacgctc gctgte 166 < 210 > 120 < 211 > 234 I < 212 > DNA | < 213 > Zea mays < 400 > 120 agtcanccaa gntccttcca gtcattggct ggtcaatgtg gttegeagag tacctctttt 60 nggagaggag ctgggccaag gatgeaaaga cactaaagtg gggtctccaa aggttgaaag 120 aettecctag accatttngg ctattctcttn tttgtngagg ntcgctt tactccagea 180 angnttntng aggnnneagn agrtnncgggn ttcccanggg ttaacagncc cna 234, < 21O > 121 < 211 > 210 < 212 > DNA < 213 > Zea mays < 400 > 121! 
gtgagatgcn aaaateagat gatgacgttt caaaatggtg taaagacatt tttgtggdc & 60 aaggatgcct tactggacaa acatttggea acaggcactt tegatgagge gattagacet 120 atcggccgcc cagtgaaatc atngctggtg accctgtnnt ggtcgtgcct gctgttgttti 180 ggtgccatcg agntcttcaa gtggacgcag 210 < 210 > 122 < 211 > 274 < 212 > DNA! < 213 > Zea mays < 400 > 122 i acncccgaat ccgccgcgcg cgcnccgtcc tcgtcgccgg cggaggcgcc cgcnaccgce [60 tatcgccgga cacagcagcc gaaggaacgc cgcggggagc ttttccacng ceatctcccg 120 tctgacccct ccgagatcgn aagcggcggc catggcgatc ccgctcgtgc tcgtcgtgct 130i cccgctcggc ctcctctteo tcctgtccgg ectcatcgtc dacaceatcc aggccatcct 240 atttgtgaca ataaggccct tttccaagag CTTG 274 < 210 > 123 < 211 > 305 < 212 > DNA < 213 > Zea mays! < 400 > 123 ttgcactgag gaaaggccat tagggatata tcaagtacat acataagagc agcttgatga 60 agttgcctat ttttagctgg gcattteaca tttttgagtt tatcccg < jta gaacggaaat 120 gggagattga tgaagesatt attcagaaca agctatcaaa atttaagaac cegagagatc 180 etatctggtt ggcggttttt cctgaaggea eggattatac tgagaagaaa tgcatcatga 240 gtcaagagta tgcttengad catggcttgc ctatgctaga acatgtcctc ettccaaaga 300 caagg 305 < 210 > 124 < 211 > 279 ^ MiMh ^ rita I < 212 > DNA ' < 213 > Zea mays | < 400 > 124 ' I ccagattttc tggacaatgt gtatggcgtt gateettetg aagtecacat ccacgtcaga 60 I atggtteage tecetcacat ccccacaaca gaagacaaga taacagaatg gatggnegag 120 I aggtttaggc gctcctggca agaaggacca tgaaggggca gatttettca tttcctgatg 180 aaaggaactg amaggagatc tgtcgacgcc gagtgcctgg caaactttet taaceagtag 240 tatgcttgac ggccnatctg gtttgtacct daactcttt 279 < 210 > 125 < 211 > 219 < 212 > DNA! < 213 > Zea mays < 400 > 125 agattttntg gacaatgtgt atggngttga tecttntgaa gtncacatcc acgtriagaat 60 ggtteadetc catcacatee ecacaacagn agacaagata acagaangga tggtagagag 120 I gtttaggcag aaggaceage tcctggcaga tttettcatg aaggggcact ttcctgatgai 180 aggaactgad ggagatctgt egacgecgaa gtgcctgge 219 < 210 > 126 < 211 > 293 i < 212 > DNA < 213 > Zea mays I < 400 > 126 ? 
5 taceatagat gctgtgtacg acatcacgat egentac "caccggenge ngacatttct 60 ngacaacgte taengcgtgg ntccttegga agtccacatc cacateanca gcatccaggjt 120 ctccgacata neggcgtccg aaa¿kacgggg tggctggcng gntnngtgge gcggttcaag 180 gtngattta acgagctnge tgtteggggc tttctaccgc ggctggggcc aatttcnece1 240 cgaacgaaag ggaaaaaggg gaacegaagg ggggaacctg ttngaacggg neck 293 10 < 210 > 127 < 211 > 6 < 212 > PRT < 213 > preserved sequence, < 400 > 127 Val Xaa Asn His Xaa Ser i 1 5 < 210 > 128 < 211 > 6 20 < 212 > PRT < 213 > preserved sequence < 400 > 128 Val Thr Tyr Ser Xaa Ser 1 5 < 210 > 129 < 211 > 7 < 212 > PRT < 213 > preserved sequence < 400 > 129 Val Xaa Leu Thr Arg Xaa Arg 1 5 < 210 > 130 < 211 > 5 < 212 > PRT < 213 > preserved sequence < 400 > 130 Cys Pro Glu Gly Thr 1 5 < 210 > 131 < 211 > 5 < 212 > PRT < 213 > preserved sequence < 400 > 131 He Val Val val Ala 1 5 < 210 > 132 < 211 > 7 < 212 > PRT < 213 > preserved sequence < 400 > 132 Leu Xaa Xaa Gly Asp Leu Val 1 5 < 210 > 133 < 211 > 6 < 212 > PRT < 213 > preserved sequence < 400 > 133 Phe Xaa Xam Gly Ala Phe 1 5 < 210 > 134 < 211 > 6 < 212 > PRT < 213 > Synthetic Oligonucleotide < 400 > 134 Vdl Wing Asn Me Xad Gln < 210 > 135 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 135 ccatccgctt caagggaacg acacccatea 30 < 210 > 136 < 211 > 31 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 136 tccctgtett gcttgatgaa ettaaagctt g 31 < 210 > 137 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 137 acageaggag tgtctgatga tggeagatte 30 < 210 > 138 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 138 actggagtte cagccaaaaa tgcacctgte 30 < 210 > 139 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 139 gatacaccct tgaamtcagg cgattttgct 30 < 210 > 140 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 140 ttgcaaatte aattcctgtt teacegggcc 30 < 210 > 141 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 141 gttttctgct attecagaag gcgtcaacaa 30 < 210 > 142 < 
211 > 32 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 142 ccggtttacg agalttacgtt cttgaaccag 30 < 210 > 143 < 211 > 30 < 212 > DNA 5 < 213 > Synthetic Oligonucleotide < 400 > 143 tcgagctgtg atcgatgatt ggctgtgaag 30 < 210 > 144 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 144 15 gtetettcaa aaacacacac acacgtetet 30 < 210 > 145 < 211 > 30 < 212 > DNA 20 < 213 > Synthetic Oligonucleotide < 400 > 145 gtetettead aaacacacac acacgctct 30 a _ ^ áa __? ^ - < 210 > 146 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 146 gtagagagcc ttaettgctt eggtttagte 30 < 210 > 147 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 147 acgtcatcgt acctgttgct attgactcae 30 < 210 > 148 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 148 aettttecat tgteagggac tectegacac 30 < 210 > 149 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 149 acggtgtagg aagggaaagg attcaaaagg 30 < 210 > 150 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 150 gcgatgaact acagagtegg attettcctc 30 < 210 > 151 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 151 ccggtttacg agalttacgtt cttgaaccag 30 < 21O > 152 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 152 caatggagac aaggetegaa agtgctaace 30 < 210 > 153 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 153 attctctgaa catdgttege cacggtcatg 30 < 210 > 154 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 154 gaaatecaac gccttcccaa tatcactctg 30 < 210 > 155 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 155 etccaacttt ccateaggat cttggcacgt 30 < 210 > 156 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 156 aecacttgtt agagacetta cctgcttagg 30 < 210 > 157 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 157 tcctacctac accatccaat ttctcgaccc 30 < 210 > 158 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 158 ctgcgtcaag tgagcaactc agttcttgca 
30 < 210 > 159 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 159 tgggaageag cacgttgtte agtateggaa 30 < 210 > 160 < 211 > 30 < 212 > DNA < 213 > Synthetic Oligonucleotide < 400 > 160 tagcctctgt gtaatctgtg ccctcgggga 30 < 210 > 161 < 211 > 1702 < 212 > DNA < 213 > simmondsia chinensis < 400 > 161 I ! gaattctagc ctctctcctc ctgcaattct acttgctttc tacgatcttt cectetetet! 60 etctaaaacc ttaaaattgg datggaateg tttaaaaata tgatcttttt gtaattgaat i 120 I tagtataatt atatctgggt aatcttgaat ttcjttggtga ggccatgggg atcccagctg | 180 cggctgtgat tgtaccgctt ggcttctctct tcttettctc tggtetettc atcaacttca j 240 ttcaggcaat ttgttttgtg etcgtgcgge cactgtcaaa gnntacatac agaaggatta i 300 acagggtgct ggtggaattg ttgtggcttg agctgatatg gctcgtagat tggtgggcaa; agctgggcca 360 gtgtteagat caagttgtte acagatcctg atacettteg getaatgggt aaagagcatgi 420 caettgtgat atcaaaccae agaagtgara ttgattgget tgttggatgg gtgttggccc 480 i agagatcagg ctgcctggga agcacactgg ctgtcatgaa gaaatcatca aagtttctcc 540 cggtcatagg ttggtctatg tggttttctg agtacetttt tcttgagaga 600 aggatgaaag cacattgaag ttaggtettc aacgcctcaa ggactaccct ctgcctttet 660 ggttggctct tttcgtagaa ggaacacgat ttacccaagc taaactttta gcagctcaag! 720 aetatgetac ttcaatggga ttgccagttc ctagametac tttgatccct cgtactaa < jg 780 gatttgtttc agccgtgage entatgcgtt cgtttgtecc ggeciltatat gatgtaacgg 840 tggccatccc taaatcttet tcgcagccta caatgetcag acztttcaaa ggccagccatj 900 ccacggttea tgtacacate aatgcgccgct cgatgaaaga tctccctgaa gcagcagatg 960 atggtgtega atgttgcaca gacacatteg tcgcaaagga tgcactcctg gacaageata 1020! atgtagatga caetttegga gatgagtate tgcaggacac tggccggcct ttgaaatctc! 
1080 tctttgtagc agtetcttgg geattgattc teatectggg a < jgtttgaaa ttectacgat 1140 ggtcgtccct tctatcatea tggaaggggg tcgccttctc agccgcatgc cttgtgctcg '1200 tcaccettct tatgcagatc ttaazccaat tttctcaatc cgdgcgctcg actcctgcta ¡1260; aggtagcece aggaaagccc aagaacatgg tateagaace cacggaaacg caacgacata 1320 aaagtatata agcagcacta tggaccccaa etaagaagat teagacgcaa cjccacagttg 1380 attcaactgt tcagaatgtc aaatatagtt tgagaaacaa aagatcaaga ttagctgatg 1440 i l aagagcctaa tgaacctaca tacttggatc tgtcgtcgcc accgtctgct gctagctcgt i 1500 tatcagdatt cgtgattccg ggaccgatcc cggatcttag ccttotatgc atggattatg 1560 i i atagtatett aaatttcttt aatgatgtac cggaattata atgttagtta attaggggga I 1620 | tgagcattgt ttgggtttat atcgtggtaa atccttgtdt tgtttataag atttgaagaa 1680: aettcgatte gagtgctctg aa 1702 < 210 > 162 < 211 > 387 < 212 > PRT < 213 > simmondsia chinensis < 400 > 162 Met Gly lla Pro Ala Ala Ala Val lla Val Pro Leu Gly Leu Leu Phe 1 5 10 15 Phe Phe Ser Gly Leu Phe He Aan Phe He Gln Ala He ee Phe Val 20 25 30 Leu Val Axg Pro Leu Ser Lys Thr Tyr Arg Axg He Asn Arg Val Leu 35 40 45 Val Glu Leu Leu Trp Leu Glu Leu He Trp Leu Val Asp Trp Trp Wing 50 55 60 Ser Val Lys He Lys Leu Phe Thr Asp Pro Asp Thr Phe Arg Leu Met 65 70 75 80 GIV Lys Glu Bis Ala Leu Val He Ser Aen His Arg Ser Asp lla Asp 85 90 95 Trp Leu Val Gly Trp Val Lau Wing Gln Arg Ser Gly Cys Leu Gly Ser 100 105 110 Thr Leu Wing Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val lio Gly 115 120 125 Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Ser Trip Wing ^^^^ MriflMgaAM ^ ri ^ ri ^ HBriita ^^ H ^^^ IM ^ l ^ ß ^ H ^^ ^ -a * m "? * I * m * £ á. 
130 135 140 Lys Asp Glu Ser Thr "u Lys Leu Gly Leu Gln Arg Leu Lys Asp Tyr 145 150 155 160 Pro Leu Pro Phe Trp L-su Ala Leu Phe Val Glu Gly Thr Arg Phe Thr 165 170 175 Gln Ala Lys Leu Leu Ala Ala Gln Olu Tyr Ala Thr Ser Met Gly Leu 180 185 190 Pro Val Pro Arg Xan Thr Leu lla Pro Arg Thr Lys Gly Phe Val Ser 195 200 205 Wing Val Ser His Met Arg Ser Phe Val Pro Wing He Tyr Asp Val Thr 210 215 220 Val Ala Pro Pro Lys Ser Ser Gln Pro Thr Met Leu Arg Leu Phe 225 230 235 240 Lys Gly Gln Pro Ser Thr Val Hia Val His He Lys Arg Arg Ser Met 245 250 255 Lys Asp Leu Pro Glu Ala Wing Asp Asp Val Wing Gln Trp Cys Arg Asp 260 265 270 Thr Phe Val Wing Lys Asp Wing Leu Leu Asp Lys His Asn Val Asp> Asp 275 280 285 Thr Phe GIV Asp Glu Tyr Leu Gln Asp Thr Gly Arg Pro Leu Lys Ser 290 295 300 Leu The Val Ala Val Ser Trp Ala Leu He Leu He Lau Gly Gly Leu ^^^ He 305 310 315 320 Lys Phe Leu Arg Trp Ser Ser Leu Lau Ser Ser Trp Lys Gly Val Ala 325 330 335 Phe Ser Ala Ala Cys Leu Val Leu Val Thr He Leu Met Gln He Leu 340 345 350 He Gln Phe Ser Gln Ser Glu Arg Ser Thr Pro Wing Lys Val Wing Pro 355 360 365 Oly Lys Pro Lys Asn Met Val Ser Glu Pro Thr Glu Thr Gln Arg His 370 375 380 Lys Gln Mig 385 < 210 > 163 < 211 > 43 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence Synthetic Onyigonucleotide. < 400 > 163 aagcttgcat gcgtcgacdc aatggtteat gcgaccaagt cag 43 < 210 > 164 < 211 > 35 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence Synthetic Oligonucleotide. < 400 > 164 ggtaccgtcg ActcftcttCt tggtgttgtt gatag 35 < 210 > 165 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificia Sequence Synthetic Onyilonucleotide. < 400 > 165 ggatccgcgg ccgcacaatg acgagcttta etaettecet teat 44 < 210 > 166 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. 
< 400 > 166 ggatcccctg caggttagag atccattgat tctgcaat 38 < 210 > 167 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 167 ggatccgggg ccgcataatg gaatcagagc tcaaagat 38 < 210 > 168 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 168 ggatcccctg caggtcattc ttctttctga tggaaatc 38 < 210 > 169 < 211 > 41 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 169 ggatccgcgg ccgcacaatg actcgttcac aagatgtttc a 41 < 210 > 170 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 170 ggatcccctg caggtcmctt ctcttccaat ctagccag 38 < 210 > 171 < 211 > 46 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 171 ggatccgcgg ccgcacaatg tccggtaata agatctcgac tcttca 46 < 210 > 172 < 211 > 46 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 172 ggatcccctg caggttattt tttcttgaca actccgttat taccgg 46 < 210 > 173 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 173 atatccgcgg ccgcacaatg gttatggagc aagctggaa 39 < 210 > 174 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 174 ggatcccctg caggtcaatg gagacaaggc tcgaaagt 38 < 210 > 175 < 211 > 42 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide.
< 400 > 175 ggatccgcgg ccgcacaatg tccgccaaga tttcaatatt cc 42 < 210 > 176 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 176 ggatcccctg caggttaatt tttcttaact actccatt 38 < 210 > 177 < 211 > 42 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 177 ggatccgcgg ccgcacaatg ggagctcagg agaaacggcg cc 42 < 210 > 178 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 178 ggatcccctg caggtcacgt cttctccttc ttcaccgg 38 < 210 > 179 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 179 ggatccgcgg ccgcacaatg gcggatcctg atctgtcttc tcct 44 < 210 > 180 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 180 ggatcccctg caggttatgt tggggcccag tcaggtgcaa agat 44 < 210 > 181 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 181 ggatccgcgg ccgcaaaialtg gaaaaaaaga gtgtaccaaa ttct 44 < 210 > 182 < 211 > 46 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 182 ggatcccctg caggttattt gtttactaat ttgagggaat tttttg 46 < 210 > 183 < 211 > 36 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 183 tcgacctgca ggcagcttaa ggatggtgat tgctgc 36 < 210 > 184 < 211 > 31 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide.
< 400 > 184 ggatccgcgg ccgcttactt ctccttctcc g 31 < 210 > 185 < 211 > 39 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 185 ggatccgcgg ccgcacaatg tcttttaggg atgtcctag 39 < 210 > 186 < 211 > 41 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 186 ggatcccctg caggtcaatc atccttaccc tttggtttac c 41 < 210 > 187 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 187 atgtctttta gggatgtcct agaaagagga gatgaatttt ctgtgcggta tttcacaccg 60 < 210 > 188 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 188 tcaatcatcc ttaccctttg gtttaccctc tggaggcaga agattgtact gagagtgcac 60 < 210 > 189 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 189 ggatccgcgg ccgcacaatg aagcattccc aaaaataccg tagg 44 < 210 > 190 < 211 > 41 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 190 ggatcccctg caggtcaatg attttttttc atcacaaata c 41 < 210 > 191 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 191 atgaagcatt cccaaaaata ccgtaggtct ggaatttatg ctgtgcggta tttcacaccg 60 < 210 > 192 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 192 tcaatgattt tttttcatca caaatacaag aataagaaaa agattgtact gagagtgcac 60 < 210 > 193 < 211 > 43 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide.
< 400 > 193 ggatccgcgg ccgcacaatg ggttttgttg atttcttcga aac 43 < 210 > 194 < 211 > 45 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 194 ggatcccctg caggttcttt ggtctcaatt ttaatatttt tttgc 45 < 210 > 195 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 195 atgggttttg ttgatttctt cgaaacatat atggtcggtt ctgtgcggta tttcacaccg 60 < 210 > 196 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 196 ttatttggtc tcaattttaa tatttttttg caaggactcg agattgtact gagagtgcac 60 < 210 > 197 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 197 ggatccgcgg ccgcacaatg gaaaagtaca ccaattggag agac 44 < 210 > 198 < 211 > 42 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 198 ggatcccctg caggctactt cctcttttta cgttgatcgc tg 42 < 210 > 199 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 199 atggaaaagt acaccnattg gagagacaat ggtacgggaa ctgtgcggta tttcacaccg 60 < 210 > 200 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 200 ctacttcctc tttttacgtt gatcgctgat atattccttc agattgtact gagagtgcac 60 < 210 > 201 < 211 > 41 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 201 ggatccgcgg ccgcacaatg cctgcaccaa aactcacgga g 41 < 210 > 202 < 211 > 38 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide.
< 400 > 202 ggatcccctg caggctacgc atctccttct ttcccttc 38 < 210 > 203 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 203 atgcctgcac caaaactcac ggagaaatct gcctcttcca ctgtgcggta tttcacaccg 60 < 210 > 204 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 204 ctacgcatct ccttctttcc cttcttcttc ttcttcctct agattgtact gagagtgcac 60 < 210 > 205 < 211 > 46 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 205 ggatccgcgg ccgcacaatg tctgctcccg ctgccgatca taacgc 46 < 210 > 206 < 211 > 44 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 206 ggatcccctg caggtcattc tttcttttcg tgttctcttt tctg 44 < 210 > 207 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 207 atgtctgctc ccgctgccga tcataacgct gccaaaccta ctgtgcggta tttcacaccg 60 < 210 > 208 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 208 tcattctttc ttttcgtgtt ctcttttctg tcttaccagc agattgtact gagagtgcac 60 < 210 > 209 < 211 > 49 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 209 ggatccgcgg ccgcacaatg ctgcatcaaa aaatagctca taaagctcg 49 < 210 > 210 < 211 > 49 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 210 ggatcccctg caggtcaaaa aataaaacaa taaagtttat caactaacc 49 < 210 > 211 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide.
< 400 > 211 atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg ctgtgcggta tttcacaccg 60 < 210 > 212 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 212 tcaaaaaata aaacaataaa gtttataaac taaccaaatt agattgtact gagagtgcac 60 < 210 > 213 < 211 > 41 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 213 ggatccgcgg ccgcacaatg agtgtgatag gtaggttctt g 41 < 210 > 214 < 211 > 41 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 214 ggatcccctg caggttaatg catctttttt acagatgaac c 41 < 210 > 215 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 215 atgagtgtga taggtaggtt cttgtattac ttgaggtccg ctgtgcggta tttcacaccg 60 < 210 > 216 < 211 > 60 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificial Sequence: Synthetic Oligonucleotide. < 400 > 216 ttaatgcatc ttttttacag atgaaccttc gttatgggta agattgtact gagagtgcac 60 < 210 > 217 < 211 > 381 < 212 > PRT < 213 > Saccharomyces sp.
< 220 > < 400 > 217 Met Ser Phe Arg Asp Val Leu Glu Arg Gly A-sp Glu Phe Lau Glu Ala 1 5 10 15 Tyr Pro Arg Arg Ser Pro Leu Trp Arg Phe Leu Ser Tyr Ser Thr Ser 20 25 30 Leu Leu Thr Phe GIV Val Ser Lys Leu Leu Leu Phe Thr Cys Tyr Asn 40 45 Val Lys Lau Asn Gly Phe Glu Lys Leu Glu Thr Ala Leu Glu Arg Ser 50 55 60 Lys Arg Glu Asn Arg Gly Leu Met Thr Val Met Asn My Met Ser Met 65 70 75 80 Val Asp Asp Pro Leu Val Trp Wing Thr Leu Pro Tyr Lys Leu Phe Thr 85 90 95 Ser Leu Asp Asn He Arg Trp Ser Leu Gly Wing My Asn He Cys Phe 100 105 110 Gln Asn Lys Phe Leu Wing Asn Phe Phe Ser Leu Gly Gln Val Leu Ser 115 120 125 Thr Glu Arg Phe Gly Val Gly Pro Phe Gln Gly Ser He Asp Wing Ser 130 135 140 He Arg Leu Leu Ser Pro Asp Asp Thr Leu Asp Leu Glu Tf Thr Pro 145 150 155 160 His Ser Glu Val Being Ser Leu Lys Lys Ala Tyr Ser Pro Pro He 165 170 175 He Arg Ser Lys Pro Ser Trp Val His Val Tyr Pro Glu Gly Phe Val 180 185 190 Leu Gln Leu Tyr Pro Pro Phe Glu kan Ser Met Krg Tyr Phe Lys Trp 195 200 205 Gly He Thr Arg Met He Leu Glu Wing Thr Lyg Pro Pro He Val Val 210 215 220 Pro He Phe Wing Thr Gly Phe Glu Lys lie Wing Ser Glu Wing Val Thr 225 230 235 240 Asp Ser Met Phe Arg Gln He Leu Pro Arg Asn Phe Gly Ser Glu lla 245 250 255 Asn Val Thr He Gly Amp Pro Leu Asn Asp Asp Leu As Asp Arg Tyr 260 265 270 Arg Lys Glu Trp Thr My Leu Val Glu Lys Tyr Tyr Asp Pro Lys Asn 275 280 285 Pro Asn Asp Leu As Asp Glu Leu Lys Tyr Gly Lys Glu Wing Gln Asp 290 295 300 Leu Arg Be Arg Leu Wing Wing Glu Leu Arg Wing His Val Wing Glu He 305 310 315 320 Axg Asn Glu Val Arg Lys Leu Pro Arg Glu Asp Pro Arg Phe Lys Ser 325 330 335 Pro Be Trp Trp Lys Arg Phe Asn Thr Thr Glu Gly Lys Ser Asp Pro 340 345 350 Asp Val Lys Val lla Gly Glu Asn Trp Wing He Arg Arg Met Gln Lys 355 360 365 Phe Leu Pro Pro Glu Gly Lys Pro Lys Gly Lys Asp Asp 370 375 380 < 211 > 396 ¿¡¡¡¡¡¡¡^ ^ ^ 212 > PRT < 213 > Saccharomyces sp. 
< 220 > < 400 > 218 Met Lys His Ser Gln Lys Tyr Arg Arg Tyr Gly lla Tyr Glu Lys Thr 1 5 10 15 Gly Asn Pro Phe He Lys Gly Leu Gln Arg Leu Leu call Ala Cys Leu 20 25 30 Phe He Ser Gly Ser Leu Ser He Val Val Phe Gln lie Cys Lau Gln 35 40 45 Val Leu Leu Pro Trp Ser Lys He Arg Phe Gln Asn Gly He Asn Gln 50 55 60 Ser Lys Lys Wing Phe He Val Leu Leu Cys Met lio Leu Asn Met Val 65 70 75 80 Wing Pro Ser Ser Leu Asn Val Thr Phe Glu Thr Ser Arg Pro Leu Lys 85 90 95 Asn Being Being Asn Wing Lys Pro Cys Phe Arg Phe Lys Asp Arg Wing He 100 105 110 lie He Wing Asn His Gln Met Tyr Wing A3p Trp lla Tyr Leu Trp Trp 115 120 125 Ler Ser Phe Val Ser Asn Leu Gly Gly Asn Val Tyr lio He Leu Lys 130 135 140 Lys Ala Lau Gln Tyr lie Pro Leu Leu Gly Phe Gly Ret Arg Asn Phe 145 150 155 160 LyS Phe He Phe Leu Ser Arg Asn Trp Gln Lys Asp Glu Lys Ala Leu 165 170 175 Thr Asn Ser Leu Val Ser Met Asp Leu Asn Wing Arg Cys Lys Gly Pro 180 185 190 Leu Thr Asn Tyr Lys Ser Cys Tyr Ser Tys Thr Asn Glu Ser He Ala 195 200 205 Wing Tyr Asn Leu He Met Phe Pro Glu Gly Thr Asn Leu Ser Leu Lys 210 215 220 Thr Arg Glu Lys Ser Glu Wing Phe Cys Gln Arg Wing His Leu Asp His 225 230 235 240 Val Gln Leu Arg His Lau Leu Leu Pro His Ser Lys Gly Leu Lys Phe 245 250 255 Wing Val Glu Lys Leu Pro Wing Ser Leu Asp Wing Tyr Asp Val Thr 260 265 270 He Gly Tyr Ser Pro Wing Leu Arg Thr Glu Tyr Val Gly Thr Lys Phe 275 220 285 Thr Leu Lys Lys He Phe Leu Met Gly Val Tyr Pro Glu Lys Val Asp 290 295 300 Phe Tyr He Arg Glu Phe Arg Val Asn Glu He Pro Leu Gln Asp Asp 305 310 315 320 Glu Val Phe Phe Asn Trp Leu Leu Gly Val Trp Lys Glu Lys Asp Gln 325 330 335 Leu Leu Glu Asp Tyr Tyr Asn Thr Gly Gln Phe Lys Ser Asn Ala Lys 340 345 350 Asn Asp Asn Gln Be He Val Val Thr Thr Gln Thr Thr Gly Phe Gln 355 360 365 His Glu Thr Leu Thr Pro Arg He Leu Ser Tyr Tyr Gly Phe Phe Wing 370 375 380 Phe Leu He bcu Val Phe Val Met Lys Lys Asn His 395 390 395 < 210 > 219 < 211 > 479 < 212 > PRT < 213 > Saccharomyces sp. 
< 220 > < 400 > 219 Met Gly Phe Val Aip Phe Phe Glu Thr Tyr Met Val Gly Ser Arg Val 1 5 10 15 Gln Phe Lys Gln Leu Asp He Ser Asp Trp Lau Ser Leu Thr Pro Arg 20 25 30 Leu Leu He Leu Phe Gly Tyr Phe Tyr Leu His Be Phe Phe Thr Ala 35 40 45 10 He Asn Gln Phe Leu Gln Phe Lie Even Thr Asn Being Phe Cys Leu Arg 50 55 60 Leu His Leu Leu Tyr Asp Arg Phe Trp Ser His Val Pro He He Gly 65 lo 75 80 Glu Tyr Lyg He Arg Leu Leu Ser Arg Ala Lau Thr Tyr Ser Lys Leu 85 90 95 Lys He He Pro Thr Leu Asp Lys Val Leu alu Ala lie Glu He Trp 100 105 110 Phe Gln Lau His Leu Val Glu Met Thr Phe Clu Lys Lys Lys Asn Val 115 120 125 Gln lla Phe lla Thr Glu Gly Ser Asp Asp Leu Asn Phe Phe Lys Asp 130 135 140 Ser Lys Phe Gla Thr Thr Lau Met He Cys Asn His Arg Ser Val Asn 145 150 155 160 Asp Tyr Thr teu He Asn Tyr Leu Phe Leu Lys Ser Cys Pro Thr Lys 165 170 175 Phe Tyr Thr Lys Trp Glu Phe Leu Gln Lys Leu Arg Lys Gly Glu Asp 180 135 190 Leu Wing Glu Trp Pro Gln Leu Lys Phe Leu Gly Trp Gly Lys Met Phe 195 200 205 Asn Phe Pro Arg Leu Asp Leu Leu Lys Asn He Phe Phe Lys Asp Glu 210 215 220 Thr Leu Ala Lau Being Ser Asn Glu Leu Arg Asp He Leu Glu Arg Gln 225 230 235 240 Asn Asn Gln Wing He Thr He Phe Pro Glu Val Asn lie Met Ser Leu 245 250 255 Glu Leu Ser He He Gln Arg Lys Leu His Gln Asp Phe Pro Phe Val 260 265 270 He Asn Phe Tyr Asn Leu Leu Tyr Pro Arg Phe Lys Asn Phe Thr Thr 275 280 285 Leu Met Ala Ala Phe Ser Ser He Lys Asn lio Lys. 
Arg Lys Lys Asn 290 295 300 Arg Asn Asn He He Lys Glu Wing Arg Tyr Leu Phe His Arg Glu Leu 305 310 315 320 Asp Lys Leu Val His Lys Ser Met Lys Met Glu Ser Ser Lys Val Ser 325 330 335 Asp Lys Thr Thr Pro Pro Met He Val Amp Asn Ser Tyr Leu Leu Thr 340 345 350 Lys Lys Glu Glu He Ser Ser Gly Lys Pro Lys Val Val Arg He Asn 355 360 365 Pro Tyr He Tyr Asp Val Thr lio lio Tyr Tyr Arg Val Lys Tyr Thr 370 375 380 Asp Ser Gly His Asp His Thr Asn Gly Asp Leu Arg Leu His Lys Gly 385 390 395 400 Tyr Gln Leu Glu Gln lla Ser Pro Thr He Phe Glu Met He Gln Pro 405 410 415 Glu Met Glu Ser Glu Asn Asn He Lys Asp Lys Asp Pro He Val Val 420 425 430 Met Val Asn Val Lys Lys His Gln He Gln Pro Leu Leu Wing Tyr Asn 43S 440 445 Asp Glu Ser Lau Glu Lys Trp Leu Glu Asn Arg Trp He Glu Lys Asp 450 455 460 Arg Leu lla Glu Ser Lau Gln Lys Asn He Lys lla Glu Thr Lys 465 470 475 < 210 > 220 < 211 > 300 < 212 > PRT < 213 > Saccharomyces sp. < 400 > 220 Met Glu Lys Tyr Thr Asn Trp Arg Asp Asn Gly Thr Gly He Ala Pro 1 5 10 15 Phe Lau Pro Asa Thr He Arg Lys Pro Ser Lys Val Met Thr Ala Cys 20 25 30 Leu Leu Gly He Lau GIV Val Lys Thr He Met Met Leu Pro Leu He 35 40 45 Met Leu Tyr Leu Lau Thx Gly Gln Asn Asn Leu Leu Gly Leu He Leu so 55 60 Lys Phe Thr Phe Ser Trp Lys Glu Glu He Thr Val Gln Gly lla Lys 65 70 75 80 Lys Arg Asp Val Arg Lys Ser Lys His Tyr Pro Gln Lys Gly Lys Leu 85 90 95 Tyr He Cyg Asn Cys Thr Ser Pro Leu Asp Wing Phe Ser Val Val Leu 100 105 He Leu Wing Gln Gly Pro Val Thr Leu Leu Val Pro Ser As Asp He Val 115 120 125 Tyr Lys Val Ser He Arg Glu Phe He Asn Phe He Leu Wing Gly Gly 130 135 140 Leu Asp lla Lys Leu Tyr Gly His Glu Val Ala Glu Leu Ser Gln Leu 145 150 155 160 Gly Asn Thr Val Asn Phe Met Phe Wing Glu Gly Thr Ser Cys Asn Gly 165 170 175 Lys Ser Val Leu Pro Phe Ser He Thr Gly Lys Lyg Lau Lys Glu Phe 180 185 190 He Asp Pro Ser He Thr Thr Met Asn Pro Ala Met Ala Lys Thr Lys 195 200 205 Lys Phe Glu Leu Gln Thr He Gln He Lys Thr Asn Lys Thr Ala He 210 215 220 Thr Thr. 
Leu Pro Be Ser Asn Met Glu Tyr Leu Ser Arg Phe Leu Asn 225 230 235 240 Lys Gly He Asn Val Lys Cys Lys He Asn Glu Pro Gln Val Leu Ser 245 250 255 Asp Asn Leu Glu Glu Leu Arg Val Wing Leu Asn Gly Gly Asp Lys Tyr 260 265 270 Lys Leu Val Ser Arg Lys Leu Asp Val Glu Ser Lys Arg Asn Phe Val 275 290 285 Lys Glu Tyr He Ser Asp Gln Arg Lys Lys Arg Lys 290 295 300 < 210 > 221 < 211 > 759 < 212 > PRT < 213 > Saccharomyces sp. < 400 > 221 Met Pro Pro Lys Leu Thr Glu Lys Phe Wing Ser Ser Lys Ser Thr 1 5 10 15 Gln Lys Thr Thr Asn Tyr Ser Ser He Glu Wing Lys Ser Val Lys Thr 20 25 30 Wing Wing Asp Gln Wing Tyr He Tyr Gln Glu Pro Be Wing Thr Lys Lys 35 40 45 He Leu Tyr Ser He Wing Thr Trp Leu Leu Tyr Asn He Phe His Cys 50 55 60 Phe Phe Arg Glu He Arg Gly Arg Gly Ser Phe Lys Val Pro Gln Gln - i. - i_ ^ ¡^^ jg 65 70 75 80 Cly Pro Val He Phe Val Ala Ala Pro His Ala Asn Gln Phe Val Asp as 90 95 Pro Val He Leu Met Gly Glu Val Lys Lys Ser Val Asn Arg Arg Val 100 105 110 Being Phe Leu He Wing Glu Being Ser Leu Lys Gln Pro Pro calling Gly Phe 115 120 125 Leu Wing Being Phe Phe Met Wing He Gly Val Val Arg Pro Gln Asp Asn 130 135 140 Leu Lys Pro Wing Glu Gly Thr He Arg Val Asp Pro Thr Asp Tyr Lys 145 150 155 160 Arg Val He Gly His Asp Thr His Phe Lau Thr Asp Cys Met Pro Lys 165 170 175 Gly Lau He Gly Leu Pro Lys Ser Met Gly Phe Gly Glu He Gln Ser 180 185 190 He Glu Ser Aap Thr Ser Leu Thr Leu Arg Lys Glu Phe Lys Met Wing 195 200 205 Lys Pro Glu He Zys Thr Wing Leu Leu Thr Gly Thr Thr Tyr Lys Tyr 210 215 220 Wing Wing Lys Val Asp Gln Ser Cys Val Tyr His Arg Val Phe Glu His 225 230 235 240 Leu Ala His Asn Asn Cys He Gly He Phe Pro Glu Gly Gly Ser His 245 250 255 - '* »*'» - * Asp Arg Thr Asn Leu Leu Pro Leu Lya Wing Gly Val Wing He Met Wing 260 265 2.70 Leu Gly Cys Met Asp Lys Without Pro Aup Val Asn Val Lys He Val Pro 275 280 285 Cys Gly Met Asn Tyr Phe His Pro His Lys Phe Arg Ser Arg Ala Val 290 295 300 Val Glu Phe Gly Asp Pro He Glu He Pro Lys Glu Leu Val Ala Lys 305 310 315 320 Tyr His Asn 
Pro Glu Thr Asn Arg Asp Ala Val Lys Glu Leu Leu Asp 325 330 335 Thr He Ser Lys Gly Leu Gln Ser Val Thr Val Thr Cya Ser Asp Tyr 340 345 350 Glu Thr Leu Met Val Val Gln Thr He Arg Arg Leu Tyr Met Thz, Gln 355 360 365 Phe Ser Thr Lys Leu Pro Leu Pro Leu He Val Glu Met Asn Arg Arg 370 375 380 Met Val Lys Gly Tyr Glu Phe Tyr Arg Asn Asp Pro Lys He Wing Asp 385 390 395 400 Leu Thr Lys Asp He Met Wing Wing Tyr Asn Wing Wing Leu Arg His Tyr Acn 405 410 415 Leu Pro Asp Without Leu Val Glu Glu Ala Lys Val Ann Phe Ala Lys Asn 420 425 430 Leu Gly Leu Val Phe Phe Arg Ser He Gly Leu Cyg He Leu Phe Ser ^ uH ^ 435 440 445 Leu Ala Met Pro Gly He Met Met Phe Ser Pro Val Phe He Leu, Ala 450 455 460 Lys Arg lla Ser Gln Olu Lys Wing Arg Thr Wing Leu Ser Lys Ser Thr 465 470 475 480 Val Lys He Lys Wing Asn Asp Val He Wing Thr Tf Lys He Lau He 485 490 495 Gly Met Gly Phe Wing Pro Leu Leu Tyr He Phe Trp Ser Val Leu He 500 505 510 Thr Tyr Tyr Leu Arg His Lys Pro Trp Even Lys He Tyr Val Phe Ser 515 520 525 Gly Ser Tyr He Ser Cys Val He Val Thr Tyr Ser Ala Leu He Val 530 535 540 Gly Asp He Gly Met Asp Gly Phe Lys Ser Leu Arg Pro Leu Val Leu 545 550 555 560 Ser Leu Thr Ser Pro Lyg Gly Leu Gln Lys Leu Gln Lys Asp Arg Arg 565 570 575 Asn Leu Ala Glu Arg He He Glu Val Val Asn Asn Phe Gly Ser Glu 580 585 590 Leu Phe Pro Asp Phe Asp Be Ala Ala Leu Arg Glu Glu Phe Asp Val 595 600 605 He Asp Glu Glu Glu Glu Asp Arg Lys Thr Ser Glu Leu Aun Arg Arg 610 615 620 Lys Met Leu Arg Lys Gln Lys lie Lys Arg Gln Glu Lys Asp Ser Ser 625 630 635 640 Ser Pro He He Ser Gln Arg Asp Asn His Asp Ala Tyr Glu His His 645 650 655 Asn Gln Asp Ser Asp Gly Val Ser Lau Val Asn Ser Asp Asn Ser Leu 660 665 670 Ser Asn lla Pro Pro Leu Phe Ser Ser Thr Phe His Arg Lys Ser Glu Ser 675 680 685 Ser Leu Ala Ser Thr Ser Val Ala Pro Ser Ser Ser Ser Glu Phe Glu 690 695 700 Val Glu Asn Glu lio Leu Glu Glu Lys Aun Gly Leu Ala Ser Lys He 705 710 715 720 Ala Gln Ala Val Lau Asn Lys Arg He Gly Glu Even Thr Ala Arg Glu 725 730 735 Glu Glu Glu 
Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu 740 745 750 Glu Gly Lys Glu Gly Asp Ala 755 < 210 > 222 < 211 > 743 < 212 > PRT < 213 > Saccharomyces sp. < 400 > 222 Met Be Wing Pro Wing Wing Asp His Asn Wing Wing Lys Pro He Pro His 1 5 10 15 Val Pro Gln Wing Being Arg Arg Tyr Lys Asn Being Tyr Asn Gly Phe Val 20 25 30 Tyr Asn He His Thr Trp Lau Tyr Asp Val Ser Val Phe Leu Phe Asn 35 40 45 lie Leu Phe Thr He Phe Phe Arg Glu He Lys Val Arg Gly Ala Tyr so 55 60 Asn Val Pro Glu Val Gly Val Pro Thr He Lau Val Cys Ala Pro His 65 70 75 80 Wing Asn Gln Phe He Asp Pro Wing Leu Val Met Ser Gln Thr Arg Leu 85 90 95 Leu Lys Thr Ser Wing Gly Lys Ser Arg Ser Arg Met Pro Cys Phe Val 100 105 110 Thr Wing Glu Ser Ser Phe Lys Lys Arg Phe He Ser Phe Phe Gly His 115 120 125 Wing Met Gly Gly He Pro Val Pro Arg He Gln Asp Asn Leu Lys Pro 130 135 gone Val Asp Glu Asn Leu Glu He Tyr Ala Pro Asp Leu Lys Asn His Pro 145 150 155 160 Glu He He Lys Gly Arg Ser Lys Asn Pro Gln Thr Thr Pro Val Asn 'a. ^ Jfajj'? * 165 170 175 Phe Thr Lys Arg Phe Ser Wing Lys Ser Leu Leu Gly Leu Pro Asp Tyr 180 185 190 Leu Ser Asn Wing Gln He Lys Glu He Pro Asp Asp Glu Thr He He 195 200 205 Leu Be Ser Pro Phe Arg Thr Ser Lys Ser Lya Val Val Glu Leu Leu 210 215 220 Th.r Asn Gly Thr Asn Phe Lys Tyr Wing Glu Lys He Asp Asn Thr Glu 225 230 235 240 Thr Phe Gln Ser Val Phe Asp His Leu His Thr Lyg Gly Cys Val GLy 245 250 255 lie Phe Pro Glu Gly Gly Ser His Asp Arg Pro Ser Leu Leu Pro He 260 265 270 Lys Ala Gly Val Ala Wing Met Wing Ala Leu Gly Wing Val Wing Wing Asp Pro 275 280 225 Thr Mat Lys Val Ala Val Val Cys Gly Leu His Tyr Phe His Arg 290 295 300 Asn Lys Phe Arg Ser Arg Ala Val Leu Glu Tyr Gly Glu Pro He Val 305 310 315 320 Val Asp Gly Lys Tyr Gly Glu Met Tyr Lys Asp Ser Pro Arg Glu Thr 325 330 335 Val Ser Lys Leu Leu Lys Lys He Thr Asn Ser Leu Phe Ser Val Thr 340 345 350 Glu Asn Wing Pro Amp Tyr Asp Thr Leu Met Val He Gln Wing Wing Arg 355 360 365 Arg Leu Tyr Gln Pro Val Lys Val Arg Leu Pro Leu Pro Wing Val 
370 375 380 Glu He Asn Arg Arg Leu Leu Phe Gly Tyr Ser Lys Phe Lys Asp Asp 385 390 395 400 Pro Arg He He His Leu Lys Lys Leu Val Tyr Asp Tyr Asn Arg Lys 405 410 415 Leu Asp Ser Val Gly Leu Lys Asp His Gln Val Met Gln Leu Lys Thr 420 425 430 Thr Lys Leu Glu Wing Leu Arg Cys Phe Val Thr Leu He Val Arg Leu 435 440 445 He Lys Phe Ser Val Phe Wing He Leu Ser Leu Pro Gly Ser He Leu 450 455 460 Phe Thr Pro He Phe He He Cys Arg Val Tyr Ser Glu Lys Lys Wing 465 470 475 480 Lys Glu Gly Leu Lys Lys Ser Leu Val Lys He Lys Gly Thr Asp Leu 485 490 495 Leu Wing Thr Trp Lys Leu He Val Wing Leu He Leu Wing Pro He Leu 500 505 510 Tyr Val Thr Tyr Ser He Leu Leu He He Leu Ala Arg Lys Gln His 515 520 525 Tyr Cys Arg He Trp Val Pro Ser Asn Asn Ala Phe lla Gln Phe Val • * - • "- * -_i M * m * _? J ^ _ ^^^^ _ ^ r¿ ^ - ^^ 530 535 540 Tyr Phe Tyr Ala Leu Leu Val Phe Thr Thr Tyr Ser Ser Leu Lys Thr 545 550 555 560 Gly Glu He Gly Val Asp Leu Phe Lys Ser Leu Arg Pro Leu Phe Val 565 570 575 Ser He Val Tyr Pro Gly Lys Lys lie Glu He Gln Thr Thr Arg seo sas 590 Lys Asn Leu Ser Leu Glu Leu Thr Wing Val Cys Asn Asp Leu Gly Pro 595 600 605 Leu Val Phe Pro Asp Tyr Asp Lys Leu Wing Thr Glu He Phe Ser Lys 610 615 620 Arg Asp Gly Tyr Asp Val Ser Ser Asp Wing Glu Ser Ser He Ser Arg 625 630 635 640 Met Ser Val Gln Ser Arg Ser Arg Ser Ser He He Ser He Gly 645 650 655 Ser Leu Ala Ser Asn Ala Leu Ser Arg Val Asn Ser Arg Gly Ser Leu 660 665 670 Thr Asp He Pro He Phe Ser Asp Ala Lys Gln Gly Gln Trp Lys Ser 675 680 685 Glu Gly Glu Thr Ser Glu Asp Glu Asp Glu Phe Asp Glu Lys Asn Pro 690 695 700 Ala He Val Gln Thr Ala Arg Ser Ser Asp Leu Asn Lys Glu Asn Ser 705 710 715 720 -.- afafaSfe Arg Asn Thr Asn He Ser Ser Lys He Wing Ser Leu Val Arg Gln Lys 725 730 735 Arg Glu His Glu Lys Lys Glu 740 < 210 > 223 < 211 > 397 < 212 > PRT < 213 > Saccharomyces sp. 
< 400 > 223 Met Leu My Gln Lys I Wing My Lys Val Arg Lys Val Val Val Pro 1 5 10 15 Gly lio Ser L-su Leu He Phe Phe Gln Gly Cys Leu lio Leu Leu Phe 20 25 30 Leu Gln Leu Thr Tyr Ly3 Thr Leu Tyr Cys Arg Asn Asp lio Az-g Lys 35 40 45 Gln lla Gly Lau Ann Lyg Thr Lys Arg Leu Phe lio Val Leu Val¡ Ser 50 55 60 Ser lie Leu Mis Val Val Ala Pro Ser Ala Val Arg lla Thr Thr Glu 65 70 75 80 Asn Ser Ser Val Pro Lys Gly Thr Phe Phe Leu Asp Leu Lys Lys Lys . «« * «& 85 90 95 Arg lla Lau Be My Leu Lys Ser Asn Ser Val Ala lio Cys Asn Mis 100 105 110 Gln lla Tyr Thr Asp Trp He Phe Leu Trp Trp Leu Wing Tyr Thr¡ Ser 115 120 125 Aen Lau Gly Wing Mn Val Phe lie He Lau Lyg Lys Ser Leu Wing Ser 130 135 140 He Pro He Leu Gly Phe Gly Met Arg Asn Tyr Asn Phe He Phe Met 145 150 155 160 Ser Azg Lyg Trp Wing Gln Asp Lys He Thr Leu Ser Mn Ser Leu | Wing 165 170 175! Gly Leu Asp Ser Ann Wing Axg Gly Wing Gly Ser Leu Wing Gly Lysl Ser 180 185 190 Pro Glu Arg lla Thr Glu Glu < 31 and Glu Ser He Trp Asn Pro Glu Val 195 200 205 lla Asp Pro Lys Gln lio My Trp Pro Tyr Aan Leu lla L-eu Phe Pro 210 215 220 Glu Gly Thr Asn Leu Ser Wing Asp Thr Azg Gln Lyi Ser Wing Lyis Tyr 225 230 235 240 Ala Ala Lys He Gly Lys Lys Pro Phe Lys Asn Val Leu Leu Pro! His 245 250 255 Be Thr Gly Leu Arg Tyr Ser Leu Gln Lys Leu Lys Pro Ser Hei Glu 260 265 270 Ser Leu Tyr Aup lio Thr He Gly Tyr Ser GIV Val Lys Gla Glu Glu 275 220 285 Tyr Gly Glu Leu lla Tyr Gly Leu Lys Ser He Phe Leu Glu Gly Lys 290 295 300 Tyr Pro Lys Lau Val > sp lla Mis lio Arg Ala Phe Mp Val Lys Asp 305 310 315 320 He Pro Leu Glu Asp Glu Asn Glu Pha Saz Glu Trp Leu Tyr Lys He 325 330 335 Trp Ser Glu Lys Asp Wing Leu Met Glu Axg Tyr Tyr Ser Thr Gly Ser 340 345 350 Phe Val Ser Asp Pro Glu Thr Asn Hig Ser Val Thr Asp Ser Phe Lys 355 360 365 lio Asn Arg He Glu Leu Thr Glu Val Leu He Leu Pro Thr Lau Thr 370 375 380 lla He Trp Leu Val Tyr Lys Leu Tyr Cys Phe He Phe 385 390 395 < 210 > 224 < 211 > 303 < 212 > PRT < 213 > Saccharomyces sp. 
< 400 > 224 Met Ser Val He Gly Arg Phe Leu Tyr Tyr Leu Arg Ser Val Leu Va (^^ aa | áM ^^ U | UaMa ^ aM ^^? a? Haa | UaH 1 1 5 10 15 Val Leu Ala Leu Ala Gly Cya Gly Phe Tyr Oly Val lio Ala be lla 25 30 ¡t Leu Cya Thr Leu lla Gly Lys Gln His Leu Ala Gln Trp He Thr! Wing I 35 40 45 Arg Cys Phe Tyr His Val Met Lyg Leu Met Leu Gly Leu Asp Val! Lys 50 55 60 Val Val Gly Clu Glu Aan Leu Ala Lys Lys Pro Tyr lla Met lla Ala 65 70 75 80 Asn Laugh Gln Ser Thr Leu Aup lie Phe Met Leu GIV Arg lie Phe¡ Pro 85 90 95! Pro GIV Cyg Thr Val Thr Ala Lys 1, and x Ser Leu Lys Tyr Val Pro! Phe 100 105 110 | Leu Gly Trp Phe Met Wing Leu Ser Gly Thr Tyr Phe Leu Asp Arg Ser 115 120 125 Lys Arg Gln Glu Wing He Asp Thr Lau Asn Lys Gly Leu Glu Asnl Val 130 135 140 Lys Lys kan Lyg Arg Ala Lau Trp Val Phe Pro Glu Gly Thr Arg Ser 145 150 155 160 Tyr Thr Saz Glu Lau Thr Met Leu Pro Phe Lys Lyg Gly Wing Phe Mis 165 170 175 Leu Wing Gln Gln GIV Lys lla Pro lio Val Pro Val Val Val Ser Asn 180 185 190 Thr Ser Thr Lau Val Pro Pro Lys Tyr GIV Val Phe A = Arg Gly Cys I 195 200. 205! Met lio Val Arg lio 1, cu Lyx Pro He Ser Thr Glu A-gn Leu Thr Lys 210 215 220 t Lys lla Gly Glu Phe Ala Glu Lys Val Arg Asp Gln > Let Val Asp 225 230 235 240 Thr Leu Lys Glu He Gly Tyr Ser Pro XI? l lio Asn Asp Thr Thr Leu 245 250 255 Pro Pro Gln Wing Wing Glu Tyr Wing Wing Ala Leu Gln My Asp Lys Val 260 265 270 Even Lys Lys He Lys Still Glu Pro Val Pro Ser Val Ser lla Ser Asn 275 280 285 Asp Val Even Thr His Aun Glu GIV Ser Ser Val Lys Lys Met His 290 295 300 < 210 > 225 < 211 > 1146 < 212 > DNA < 213 > Saccharomyces sp. i < 400 > 225 atgtetttte gggatgtcct agaaa < jagga gatga¿ttttt tagaagccta tcccagaaga 60 agcceccttt ggagatttet ttcatacagt acateattac tgaccttegg tgtatcaaaa 120 i . AeJ &rfJí - *. 
- "* - '* jj i ctgcttettt tcacatgcta taatgtcaaa ttgaatggtt ttgaaftaatt ageabctgcc 180' ttggaacgtt ccaaaaggga aaatagaggc cttatgacgg tcatgaacca tatgagtatg 240 I gtcgatgate cgttagtttg ggcaacacta ccetataagt tatttacgte tttggacaac 300 i ataagatggt etttgggtgc acata¿ktatt tgctttcaaa ataaatttct ggccaacttt 360 ttetcacttg gccaagtcct ttcaacagam agatttgggg tgggcccatt tcaaggttet ! 420 atagatgctt caataagatt gttaagccct gacgacactt tagacttgga atggacccct 480 caetctgagg tctettettc getaaaaaaa gcetaetcce cgcccataat aaggtcgaag 540 ccatcttggg tccatgttta tecagaagga tttgtactac aamtatatee gcettttgaa 600 aattcgatg * ggtattttaa atggg < gtatt aecagaatga tectagaagc aacaaagccg 660 cceattgtag tmecamtett tgctaceggg tttgaaaaaa agcagteaca tageatccga 720 gettcaatgt ttagacaaat aactttggct tctaccaaga ctgaaataaa tgttaccata 780 taaatgatga ggggatectt tttaategac aggtetagaa magaatggac acetttggtt 840 gaaaaatact atgatcccaa aaatectaac gacctctctg degaattgaa atatggtaaa 900 gaggcgcaag etttaagaag cagattagcc gc tgaactga gagcccatgt tgctgaaatt 960 agaaatgaag ttcgcaaatt accacgcgaa gacectaggt tcaaatcccc ctcatggtgg 1020 aagcggttca acaccacgga aggtaaatcg gacceagatg ttaaagtcat tggcgaaaat 1080 tgggcaataa ggaggatgca aaagtttctg cetecagagg gtaaaccaadl gggr- aagget 1140 gattga 1146 < 210 > 226 < 211 > 1191 < 212 > DNA < 213 > Saccharomyces sp. < 400 > 226 atgaagcatt cccaaaaata ccgtaggtat 9gaatttatg aaaagactgg taatcccttt 60 ataaaagggz tgcaaagget gcttatcgct tgcttgttca ttteaggete gctgagtett 120 gtcgttttte agatctgtct acaggtgctt etecettgga gcaagattag atttcaaaat leo¡ ggtatadate aaagtaagaa ggcttttatc gtttt¿tttat gcatgatett gaacatggtg 240 getecetett etttgaatgt caettttgaa acatcgcggc cattga¿kgaa etettetaac 300 gccaagccat getttagatt tamagacagg gct¿ttaataa ttgcaaatca tcaaatgtat 360 i geagactgge tttatctetg gtggetttcc tttgtttca «atttgggtgg taacgtttat 420! 
atcatcctga agaaagotct geagtac¿Lta ecattactgg getttggcat gcgadatttt 80 aagtttatat ttttaagtag gaactggcaa aaggatgaga aagetttaac aaatagtttg 540 gtttctatgg acttaaacgc gaggtgcaag gggcccctta caaattataa gagttgttat 600 tccaagacaa atgaatccat tgccgcttat aatttaatea tgttccctga gggtacaaat 660 etnALgcetea agacaagaga aaaaagcgag gcattctgte aaagagcaca tttggaccat 720 gtecaettma gacatttgtt attaccgcae tetaaagget tgaagtttgc agtagaaaaa 780 ctagctccta gtttagatgc tatctacgat gteactattg gatattetee cgccttgaga 840 i deggaatacg tcggcaccaa ettcaecttg eagaaaatat tcttaatggg tgtctatccg 900 gagaaagtag atttttatat ta < jggeattt agagttaatg agatecettt gcaagatgac 960 i gaagtttttt teaattggtt actgggcgtg tggaaag ~ aagatcaact getagangac! 1020 tactacaaca caggccaatt taaaagtaat gctaaaaatg acaacceatc catcgttgtt 1060 ftegacacada egactggatt teagcacgaa ac¿Lttgacae cccgtatcct ttcatattac! 1140 gggttettcg cttttettat tottgtattt gtgatgaaaa aaaatcattg to 1191 < 210 > 227 < 211 > 1440 < 212 > DNA < 213 > Saccharomyces sp. < 400 > 227 atgggttttg ttgatttett egaaacatat atggtcggtt etagggtcoa gttca¿Lacag 6o ttagatattt ctgattggtt gagtctgacc ccaaggttgc ttattetttt tggetatttt 120 tacetteatt etttttttac tgcaatcaat caattcctac agttc¿ittea cacgaattcc 180 ttctgtctta gactgcattt actatatgac acjattttggt cgcatgt < lcc cataataggt 240 gagtacaaaa ttcggctgct etcgagggea ctgacatata gtaaactgaa aataatacca 300 i aetttagaca aggtgctgga ggcgattg * e atttggtttc agetacattt agttgaaatg 360 acettegaaa aaaaaaaaaa cgtccaaatt tteataaccg agggaagtga tgacctaaa? 
420 i ttttttaaag atagcaaatt ccaaaccaca ttaatgatat gtaatcatcg atc¿tgtgaat 480 gactacacet tgattaatta cetttttctc aaaagttgte ccaccaagtt ttatact¿iaa 540 | tgggaattte tacaaaagct gaggaagggg gaagatctag ctgaatggcc tcagttae * aj 600 tttcttggtt ggggaaaaat gtttaacttt cctegattgg atctactaaa gaacatatte 660 ttcaaagatg aaacactcgc acteteateg aatgagttad gagatatttt agaaagacaa 720 aacaatcaag etattectat ttttecegaa gtcaatatea tgagtttgggt actatcaatt 780 mattacacea attcaaagaa agattttece tttgttataa aettetataa tttattatac 340 ccaagattte aamactttac caetttgatg < jctgcttttt catenactaa & amp; amp; & & amp; amp 900 accgtaacaa agaaagaaaa tataatca to gaggcccgat acctgtttca cagagaactt 960 gacaaattag ttcacaagag catgaaaatg gagtctteca aggtatccga taagacgacg 1020 ccgcccatg * tcgtegataa ttcataetta ettacaaaaa aggaagadat cagcagcggc 1080 aagcccaagg tggtacgaat caetecatac atatatgatg teaccataat ttattacega 1140 GTC &? Aatata ctgatagtgg gcatgatcat accaacggag atttgagact teataaaggt 1200 tatcaattag mgcaaatatc tccgacaatc tttgagatga ttcaaccaga aatggagtet 1260 taaaggataa gaaaacaaca ggaccecatt gttgtgatgg taaatgtaaa aaagcatcaa 1320 attcaaccat tactcgcata caatcjatgag agtttagada agtggcttga aa¿ktaggtgg 1380 atagaa "ag cttagatt? at cgagteettg & & & amp; ata ttaaaattga gaccaaataa 1440 < 210 > 228 < 211 > 903 < 212 > DNA < 213 > Saccharomyces sp. < 400 > 228 atggaaaagt acaccaattg gagagacaet ggtacgggaa tagctccatt tctaccaaac 60 aceateagga aacetagtaa ggtgatgaca gcgtgtttgt tgggtatcct aggggtgadaj 120 aecattataa tgctaccatt gattatgctg taccttctaa ctggccagad caacttactg 130 ggtttgatat tgaagtttac attcagttgg ftaagaggaaa ttaccgtgca aggeatcaag 240 aaacgtgacg taaggedate cadgeattat ecacagaagg gcaagcttta tatttgcaat 300 tgtacetc & c etttagatgc tttttcagtg gtgttattag ctcaagggcc tgttacgttg 360! 
ttggtcccat ccaatgaeat tgtatacada gtttecatea gagaattcat caectteatc.420 I ctcgccggtg ggttagatat aaaactctat ggccacgagg tagcagagct atctc-ttg 430 ggcaataccg tgaattttat gtttgctgag ggtacctcat gtaatggtaa aagcgtctta S4¡0 ccgtttagta taaccgggaa aaaacttaaa gaattcatag acecttcaat aaccacaatg 600 I aaccccgcaa tggccaaaac taaaaaattt gaattgcaga ecatccaaat caaaactaat 660 aaaactgcc "tcaceacett gcccatctcc aatatggagt atttatctag atttctgaac 720 dagggcatta atgttaaatg ceagatemae gagccacaag tactetegga taatttagag 780 gaattacgcg ttgcattaaa eggtggcgac «matataaac tagtctcaeg gaagttagat 840 gttgaatcta agaggaattt tgtgaaggaa tatatcagcg atcaacgtaa aaagag < jaag 900 tag 903 < 210 > 229 < 211 > 2280 < 212 > DNA < 213 > Saccharomyces sp. < 400 > 229 atgcctgcac caaadtctcac ggagaaattt gcetetteca agagaacaca gaamactacg 60 dattacagtt ccatcgaggc caeaagcgte aagacgtegg ctgatcaggc atacatctac 120 caagagccta gcgctaccaa gaagatactt tactccatcg ccacatggct gttotacade; 190 dtcttccbct gcttctttag agaaatcaga ggccggggea gtttcaaggt accgcaecag j 240 ggaceggtga tctttgttgc ggetccgcat gcteaccagt tcgtegacce tgtaatcctt 300 atgggcgagg tgaagamate tgtcaacaga cgtgtgtcct tcttgattgc ggagagctca j 360 ttaaa < gcaac cecceatagg gtttttg ° ct agtttettea tggccatagg cgtggtaagg 420 cegeaggata atttgaaacc ggcagaaggt actatccgcg tagatccaac agactacaag 480 aga < jttatcg gccacgacac gcattzcttg actgattgta tgccaaaggg tcteatcggg 540 ttacccaaat eaatgggatt tggagaaatc cagtccatag aaagtgacac gagtttgacc 600 ctaagaaaag agttcaaaat ggccaaacca gagattemad ctgctttact caceggcactl 660 acttataaat atgcegetaa agtegaccaa tettgcgttt aceatagagt ttttgagcat 720 ttggcccata acaactgcat tgggatcttt cctgglaggtg ggtcccacga cagaacaaac 780 ttgttgccee tgaaagcagg tgtggcgatt atggctettg gttgcatgga taageatect 840 gacgtcaatg ttaagattgt tccctgcggt atgaattatt tccatccaca taagttcagg 900 tcgagagcgg ttgttgaatt eggtgaccee attgaaatac cgaaggaact agtcgccaag 960 taccacaacc cggaaacgaa cagagatgca gtgaaagaat tattagatac catatcgeag «** t M jp-; tejí4sab ** 1020 ggtttacaat ccgttaccgt tacatgttct gattatgaa¿k 
ctttgatggt ggttcaaacg 1090 ataagaagac tatatatgac acaatttagc accaagttac cgttgccett gattgtggaa 1140! atgaaca < gaa gaatggtcaa ttctatagaa aggttacgaa acgatecta "aatagcggác 1200 ttgaccaaag atataatgge atataatgcc gccttgagac actataatet tcctgateac 1260 cttgtggagg aggcaaaggt aaatttcgca aaaaaceteg gacttgtttt ttttagatcc 1320 ategggetct gcatcctctt ttcgttagcc atgccaggta tcattatgtt ctcacctgtc 1380 ttcatattag ccaagagaat ttctcaagaa aaggcccgta caagtctaca ccgctttgte 1440! gttaaaatae aggctaacga tgtcattgcc acgtggaaaa tcttgattgg gatgggattt i 1500 gcgcccttgc tttacatett ttggtccgtt ttaatcactt attacctcag acataaacca 1560 tggaataaaa tatatgtttt ttccgggtet tacatctcgt gtgttatagt cacgtattee 1620 gcettaatcg tgggtgatat tggtatggat ggtttcaaat ctttgagace actggtttta tctcttacet ctccaaaggg ettgcaaaag ctacaaaagg atcgtagaaa tctggcagaa 1740 agmataateg aagttgtaaa taactttgga Sgcgaattat tccccgattt cgatagtgcc 1800 gccctacgtg aagaattega cgtcatcgat gaagaggaag aagatcgaaa aacetcagaa 1860 ttgaatcgca ggaaaatgct aagaaaacag aaaataaaaa gacaagaaaa agattcgtea 1920 tcaectatca tcagccaacg tgacaaccac gatgcctatg ameaccataa ccaagatteé 1900 gatggcgtct cattggtcaa tagtgacaat tecetcteta acattccgltt attctcttet 2040 gtaagtcaga acttttcatc gtettcctta gcttegacat ccgttgcacc ttcttcttcc 2100 tccgaatttg aggtaga & & amp; amp; egaaatcttg gaggaaaaaa atggattage aagtaaaatc 2160 'geacaggccg tettaaacaa gagmattggt gadeatactg ceagggeaga ggaagaggaa 2220 gaagaagagg dagaagaaga agaggaagaa gaagaagaag ggaaagaagg agatgcgtag 2280 < 210 > 230! < 211 > 2232 < 212 > DNA < 213 > Saccharomyces sp. 
< 400 > 230 atgtctgctc ccgctgccga tcataacgct gccaaaccta ttcctcatgt acctcaagcg 60 tecegacggt acaaa "ttc ttcgtetaca ataczlatgga atatacatac atggctgtat 120 gatgtgtctg tatttctgtt taatattttg ttcactattt tetteagaga aatt¿taggta 180 cgtggtgcat ataacgttec cgaagttggg gtgccaacca teettgtgtg tgcccctcat 240 gcaaatcagt teategacce ggctttggt" atgtcgcaaa gaagacatca cccgtttgct 300 gcggga "gt cccgmtcc & g dbtgccttgt tttgttactg ctgagtcgag ttttaagaaa 360 agatttatct etttetttgg tcacgcaatg ggcggtatte ccgtwcctag aatteaggac¡ 420 aacttgaagc cagtggatga gaatcttgag atttacgctc eggacttgaa gaacoacccg 480 gaaatcatea agggccgctc caagaaccca cagactacae cagtgaactt tdcgaaaagg 540 ttttctgcca agtccttoct tggattgccc gtaatgctca &amp gactaettaa; Atcaaggaa 600 atcccggatg atgaaacgat aatcttgtec tetecattea gaacategam atcaaaagtg 660 gtggagctcz tgact & ATGG tactaetttt aaatatgcag agaaftatega caatacggaa 720 actttecaga gtgtttttga tcacttgcat acgaaggget gtgtaggtat tttcccccjag 780 ggtggttete atgaccgtcc ttcgttacta cccatcaagg caggtgttgc cattatgget 340 ctgggcgcag t¿kgccgctga tcctaccatg aaagttgctg ttgtaccctg tggtttgcati 900 tatttecaca gaaataaatt cagatctaga gctgttttag aatacggcga acetatagtg 960 I gtggatggga aatatggcga aatgtataag gactececae gtgagaccgt ttecamacta 1020! I ctaaaaaaga tcaccaatte tttgttttet gttacegada atgctccaga ttacgatact 1080 i ttgatggtea tteaggctgc cagaagacta tatcaaccgg taaaagteag getacctttg i i 1140 cctgccattg tagaaatcaa cagaaggtta ettttcggtt attccaagtt taaagatgat i 1200 ¡ce? agaatta ttcacttada aa? 
Lactggta tatgactaca acaggaaatt agattcagtg i 1260 ¡t ggtttaaaag aecatcaggt gatgcaatta aaaactacea aattagaagc attgaggtgc 1320 tttgtaactt tgatcgttcg attgattada ttttctgtet ttgctatact atcgttaccg 1380 ggttetattc tcttcactcc adlttttcatt atttgtcgcg tatacteaga aaagaaggce ' 1440 aaagagggtt taaagaaatc attggttagra ettaagggta cegatttgtt ggccacatgg 1500 aaacttateg tggcgttaat attggcacca attttetacg tt &Cttacte gatcttgttg I 1560 ettattttgg cadgaaaaca acactattgt cgcatctggg tteettecam taacgcattc 1620 atacaatttg tctattttta tgcgttettg gtttteacea ct tattecte tttadagacc 1680 ggtgaaatcg gtgttgacct tttcaaatct ttaagaceao tttttgttte tattgtttac 1740 ü ^ í? i &i? ii ^ sítti ecoggtaaga agatcgaaga aatccaaaca acaagaaaaga dtttaagtet agagttgact 1800 gctgtttgt & acgatttagg acctttggtt ttccctgatt acgataafttt agcgactgag 1860 atattetcta agagagacgg ttatgatgtc tcttctgatg cagagtettc tataagtcgt 1920 atgagtgtac aatctagaag ccgctcttct tctatacett etattggete gctagcttct 1980 aacgccctat caagagtgaa ttcaagagge tcgttgaceg atattccaat tttttctgat 2040 I gcaaagcaag gtcaatggaa aagtgaaggt gaaactagtg aggatgagga tgaatttgdt 2100! gagaaaaatc ctgccatagt acaaaccgca cgaagttctg atctaaataa ggaaaacagt 2160 cgcaacacaa atatatcttc gaagattgct tcgctggtaa gacagaaaag agaaccgagaa 2220 aagamagmat ga I 2232 < 210 > 231 < 211 > 1194 < 212 > DNA < 213 > Saccharomyces sp. < 400 > 231 atgctgcatc "aaaatagc teataaagtt egaaaagtcg tcgtcccagg tatttectta 60 ttgettetet tccagggatg cettettott ttgtttetec dectcaccta taagactett 120 i tactgtagaa atgatatdag ga¿tacaaatt ggtetedata aaaccaaaag attatttatt 1J30 gtettggtat catccmtttt gcatgttgte gcaccatctg qagtgagaat taecactgaa 240 aattccagtg ttecZamag9 t4cttttttt ttagacttga agaagaamag gettetttet 300 catctaaagt ccaatteg ° t ggceatttgc aatcacc¿kaa tataemegga ttggatattt 360 ttatggtggt tggcttacac etcgaactta ggggetaatg tctteattet tttaaaaaaa 42 |! 
tcgttggctt ccattcctat cctcggttte ggtatgagaa actataattt catttttatg 480 agtagaaagt gggcacaaga eaaaataacc ctaagcaaca gccttgctgg cettgattcg 540 I aatgcaaggg gcgccggctc acttgctgga eegtcacctg agcgcatade tgaggaagga 600 gagagcatat ggaatccgga ggttattgat ccaaaacaaa tccattggcc atacaatctt 660 atcctattee ctgaaggtac aaatctcagt gctgatacta ggcaaaaaag tgctaaatat 720 taggeaaaaa getgeeaaaa geeattc &ag aatgtgcrac tgcctcatte tacaggccta ¡780 agatactegt tacaaaagtt gaagccaagt attgaaagtc tttatgatat tacgatcggc 840 Tactical Tactics < jga ° < laatatggt gagcttatac atgggctgaa gagedeattt 900 ttagaaggaa aataccega¿k gttagtcgat ettcacatca gageatttga tgttaaagat! 960 attccatteg aggacgagaa tgaattttca gaatggctgt atamaatttg gagtgagaag 1020 gatgctctaa tggaaaggta etattecact ggatcatteg taagtgatcc tgaaacaaac 1080 cattcagtta ccgatagttt cadgatcaat cgtattgagt ta4kctgaagt getaatatta 1140 ccaactctaa cadltaatttg gttagtttat aaactttatt gttttatttt ttga 1194 < 210 > 232 < 211 > 912 < 212 > DNA < 213 > Saccharomyces sp. < 400 > 232 atgagtgtga taqgtaggtt ettgtattac ttgaggtccg tgttggtcgt actggcgctt 60 geaggctgtg gettttacgg tgtaatcgcc tctatccttt gcacgttaat cggtaagcaa 120 catttggctc agtggattac tgcgcgttgt ttttaccatg tcatgaaatt gatgcttgge 180! 
cttgacgtea aggtcgttgg cgaggagaat ttggccmaga agccatacat tatgattgce 240 amtcacceat ecacettgga tatettcatg ttaggtegge ttttcccccc tggttgcaca 300 gttactgcca agaagtcttt gaaatacgtc ccctttctgg gttggttcat ggctttgagt 360 ggtacatatt tcttagacag atctaaaagg caftgaagcca ttgacacctt gaataaaggt 420 tt¿Lgaaaatg ttadgaaaaa caagcgtgct ctatgggttt ttcctgaggg taecagcjtct 480 tacacgagtg agctgacaat gttgccttte aagaagggtg ctttccattt ggcacaacag 540 ggtaagatcc ccattgtteo agtggttgtt tccaatacca gtactttagt dagtectaaft 600 tatggggtct tcadeagagg ctgtatgatt gttagaattt tamaacctat ttcaaccgag 660 aacttaacad aggacaaaat tggtgaattt gctgaaaaag ttegagatea aatggttgac 720 actttgaagg agettggcta ctctcccgcc atcaacgata caaccctccc accacaagct 780 ccgctcttca ttgagtatg acatgacaag aaagtgaaca agaaaatcaa gaatgagcct i 840 gtgccttctg tcagcattag esacgatgte aatacccata atctgtaaaa acgaaggtte aagatgcatt 900 aa 912 < 210 > 233 < 211 > 54 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificia Sequence Synthetic Onyilonucleotide. < 400 > 233 cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgciag ggcgcgccat ttca 54 < 210 > 234 < 211 > 32 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificia Sequence Synthetic Onyilonucleotide. < 400 > 2. 3. 4 tcgaggatcc gcggccgcaa gcttcctgca gg 32 < 210 > 235 < 211 > 32 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificia Sequence Synthetic Onyilonucleotide. < 400 > 235 tcgacctgca ggaagcttgc ggccgcggat ce 32 < 210 > 236 < 211 > 32 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of the Artificiah Sequence Synthetic Onyilonucleotide. < 400 > 236 tcgacctgca ggaagcttgc ggccgcggat ce 32 < 210 > 237 < 211 > 32 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificia Sequence Synthetic Onyilonucleotide. < 400 > 237 tcgaggatcc gcggccgcaa gcttcctgca gg 32 < 210 > 238 < 211 > 36 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificia Sequence Synthetic Onyilonucleotide. 
< 400 > 238 tcgacc gcggccgcaa gcttcctgca ct 36 < 210 > 239 < 211 > 28 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificial Sequence: Synthetic Oligonucleotide. < 400 > 239 cctgca gcttgcggcc gccc 28 < 210 > 240 < 211 > 36 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificial Sequence: Synthetic Oligonucleotide. < 400 > 240 tcgacctgca gcttgc ggccgc ccagct 36 < 210 > 241 < 211 > 28 < 212 > DNA < 213 > Artificial Sequence < 220 > < 223 > Description of Artificial Sequence: Synthetic Oligonucleotide. < 400 > 241 ¿G * ¡ccgcgg ccgcaagctt cctgcagg 28

Claims (23)

NOVELTY OF THE INVENTION CLAIMS
1. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 127 (VxNHxS), where H is the conserved histidine residue in the HXXXXD peptide sequence of said acyltransferase-like protein, x representing any amino acid.
2. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 128 (VTYSxS) located about 30 amino acids toward the 3' end of the conserved amino acid sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.
3. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 129 (VxLTRxR) located about 60 amino acids toward the 3' end of the conserved amino acid sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid.
4. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 132 (LxxGDLV) located about 20 amino acids toward the 3' end of the conserved amino acid sequence PEG of said acyltransferase-like protein, x representing any amino acid.
5. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 130 (CPEGT), which contains the conserved amino acid sequence PEG of said acyltransferase-like protein, x representing any amino acid.
6. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 133 (FxxGAF) located about 20 amino acids toward the 3' end of the conserved amino acid sequence PEG of said acyltransferase-like protein, x representing any amino acid.
7. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 131 (IVPVA) located about 40 amino acids toward the 3' end of the conserved amino acid sequence PEG of said acyltransferase-like protein, x representing any amino acid.
8. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, wherein said enzyme includes the amino acid sequence of SEQ ID NO: 134 (VANxxQ) located about 110 amino acids toward the 3' end of the conserved amino acid sequence PEG of said acyltransferase-like protein, x representing any amino acid.
9. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, said DNA sequence being obtained by steps comprising: (a) using a profile of Figure 1 to search a nucleic acid sequence database; (b) obtaining a probability score for the nucleic acid sequences in said sequence database using the Smith-Waterman algorithm; and (c) selecting a nucleic acid sequence having a probability score of less than about 1.
10. The DNA coding sequence according to claim 9, further characterized in that said DNA sequence is a coding sequence.
11. The DNA coding sequence according to claim 9, further characterized in that said DNA sequence is an EST.
12. The DNA coding sequence according to any of claims 1 to 11, further characterized in that said acyltransferase-like protein is from a plant.
13. The DNA coding sequence according to any of claims 1 to 11, further characterized in that it is linked to a heterologous transcription initiation and translation initiation region functional in a host cell.
14. The construct according to claim 13, further characterized in that said host cell is a plant cell.
15. A plant cell comprising a DNA construct according to claim 13.
16. A plant comprising a cell according to claim 13.
17. The DNA coding sequence according to any of claims 1 to 11, further characterized in that said acyltransferase-like protein is from Arabidopsis thaliana.
18. The DNA coding sequence according to any of claims 1 to 11, further characterized in that said acyltransferase-like protein is from corn.
19. The DNA coding sequence according to claim 18, further characterized in that said sequence comprises an EST selected from the group consisting of SEQ ID NO: 24 to SEQ ID NO: 85.
20. The DNA coding sequence according to any of claims 1 to 11, further characterized in that said acyltransferase-like protein is from soybean.
21. The DNA coding sequence according to claim 20, further characterized in that said sequence comprises an EST selected from the group consisting of SEQ ID NO: 24 to SEQ ID NO: 85.
22. The DNA coding sequence according to any of claims 2, 3, 4, 5, 7 and 8, further characterized in that said acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14 and SEQ ID NO: 16.
23. The DNA coding sequence according to either claim 1 or claim 6, further characterized in that said acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7 and SEQ ID NO: 18.
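
The motifs recited in claims 1 to 8 can be read as simple patterns in which x (or X) stands for any amino acid. The following is a minimal illustrative sketch, not part of the claimed subject matter, of how a protein sequence written in one-letter code might be scanned for those motifs. The function names, the dictionary `CLAIM_MOTIFS`, and the toy sequence `TOY` are all hypothetical and were chosen only for illustration; the sketch does not check the spacing conditions (for example "about 30 amino acids") and does not implement the profile search of claim 9.

```python
import re

# Motifs as quoted in claims 1-8; "x" means "any amino acid".
CLAIM_MOTIFS = {
    "SEQ ID NO: 127": "VxNHxS",
    "SEQ ID NO: 128": "VTYSxS",
    "SEQ ID NO: 129": "VxLTRxR",
    "SEQ ID NO: 130": "CPEGT",
    "SEQ ID NO: 131": "IVPVA",
    "SEQ ID NO: 132": "LxxGDLV",
    "SEQ ID NO: 133": "FxxGAF",
    "SEQ ID NO: 134": "VANxxQ",
}


def motif_to_regex(motif):
    """Turn a claim-style motif into a regex; x/X becomes a one-residue wildcard.

    The pattern is wrapped in a lookahead so overlapping hits are also reported.
    """
    body = "".join("." if ch in "xX" else re.escape(ch) for ch in motif)
    return re.compile("(?=" + body + ")")


def find_motifs(protein, motifs=CLAIM_MOTIFS):
    """Return {label: [0-based start offsets]} for each motif present in protein."""
    hits = {}
    for label, motif in motifs.items():
        starts = [m.start() for m in motif_to_regex(motif).finditer(protein)]
        if starts:
            hits[label] = starts
    return hits


def histidine_in_hxxxxd(protein):
    """True if the H of some VxNHxS match also begins an HXXXXD match (claim 1)."""
    hxxxxd = motif_to_regex("HXXXXD")
    for m in motif_to_regex("VxNHxS").finditer(protein):
        # H is the fourth residue of VxNHxS, i.e. offset 3 from the match start.
        if hxxxxd.match(protein, m.start() + 3):
            return True
    return False


# Hypothetical toy sequence, invented for this example only.
TOY = "MKVANHASGGDLLCPEGTAA"
print(find_motifs(TOY))
print(histidine_in_hxxxxd(TOY))
```

For the toy sequence, the VxNHxS pattern matches at offset 2 and CPEGT at offset 13, and the histidine at offset 5 is followed five residues later by aspartate, so the claim 1 condition on HXXXXD is met in this invented example.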
MXPA/A/2001/003143A 1998-09-25 2001-03-23 Novel plant acyltransferases MXPA01003143A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US60/101,939 1998-09-25

Publications (1)

Publication Number Publication Date
MXPA01003143A true MXPA01003143A (en) 2002-05-09
