WO1998059034A2 - Treponema pallidum polynucleotides and sequences - Google Patents

Info

Publication number
WO1998059034A2
Authority
WO
WIPO (PCT)
Prior art keywords
pallidum
sequence
fragments
seq
nos
Prior art date
Application number
PCT/US1998/013041
Other languages
French (fr)
Other versions
WO1998059034A3 (en)
Inventor
Claire M. Fraser
Original Assignee
Human Genome Sciences, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Human Genome Sciences, Inc.
Priority to EP98931511A (patent EP0990022A1)
Priority to CA002296814A (patent CA2296814A1)
Priority to AU81623/98A (patent AU8162398A)
Publication of WO1998059034A2
Publication of WO1998059034A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/20 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Spirochaetales (O), e.g. Treponema, Leptospira
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies

Definitions

  • the present invention relates to the field of molecular biology.
  • it relates to, among other things, nucleotide sequences of Treponema pallidum, contigs, ORFs, fragments, probes, primers and related polynucleotides thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof, such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.
  • Spirochetes are a family of motile, unicellular, spiral-shaped bacteria which share a number of structural characteristics.
  • Three genera of the spirochetes are pathogenic in humans: (a) Treponema, which includes the pathogens that cause syphilis (T. pallidum), yaws (T. pertenue), and pinta (T. carateum); (b) Borrelia, which includes the pathogens that cause epidemic and endemic relapsing fever and Lyme disease; and (c) Leptospira, which includes a wide variety of small spirochetes that cause mild to serious systemic human illness (Koff, A. B. and Rosen, T., J. Am. Acad. Dermatol. 29:519-535 (1993)). In 1986, more than 27,000 cases of early infectious syphilis were diagnosed in the United States alone. Such statistics indicate that infection with T. pallidum is the largest source of human disease resulting from the spirochetes.
  • T. pallidum is morphologically indistinguishable from several other pathogenic spirochetes, but, in general, treponemes and other spirochetes are easily identifiable when compared to other bacteria.
  • a key morphological characteristic of T. pallidum, and other spirochetes, is the presence of a central protoplasmic cylinder composed primarily of peptidoglycan and one or more adjacent axial fibrils (also designated periplasmic flagella or endoflagella; Charon, N. W., et al., Res. Microbiol. 143:597-603 (1992)). These structures provide a source of corkscrew-like motion to the treponemes.
  • treponemes move in an apparently random fashion and, unlike the majority of motile bacteria, continue to move in a more viscous medium.
  • treponemes are highly moldable to intercellular spaces; a characteristic which is thought to be mediated by the interactions of bacterial adhesins and cellular fibronectins.
  • Syphilis is the primary clinical manifestation of infection with T. pallidum.
  • the clinical manifestations of syphilis can resemble many diseases.
  • Syphilis is typically transmitted by sexual contact, but can also be transmitted transplacentally.
  • the infecting organism multiplies at the site of infection within 10 to 60 days postinfection and results in a primary ulcer-like lesion termed a chancre.
  • a small number of organisms move from the primary lesion to the regional lymph nodes and establish small infectious centers termed satellite buboes. Organisms from these locations enter the blood stream and result in a systemic infection (Goens, J. L., et al, Am. Fam. Physician 50:1013-1020 (1994)).
  • The secondary stage of syphilis manifests itself as a widespread skin rash and begins between two and twelve weeks following the primary infection. During this stage, the infected individual often experiences a low grade fever coupled with swollen lymph nodes. Also during this period, lesions of various degrees of severity may develop in a number of physical locations including bone, liver, kidney, central nervous system (CNS), and other organs (Neeravahu, M. Arch. Intern. Med. 145:132-134 (1985)). Such secondary infections are highly infectious, but will, in time, subside spontaneously.
  • a third stage of syphilis occurs in approximately 30% of infected, but not treated, individuals. The third stage occurs several years following the first and second stages.
  • The lesions which characterize the third stage of infection are minor in terms of the number of organisms, but may be severe in terms of tissue damage. Such lesions may result in necrosis, scar formation, general paresis, damage to aortic valves, permanent blindness, and other extensive tissue damage, all probably related to a delayed type hypersensitivity reaction by the host to the T. pallidum organisms (Scheck, D. N. and Hook, E. W., 3rd, Infect. Dis. Clin. North Am. 8:769-795 (1994)).
  • T. pallidum has a remarkable ability to evade both the humoral and cellular components of the immune system. It was originally thought that the ability of T. pallidum to evade the immune system of the host organism was due to the presence of an outer coat of mucopolysaccharides.
  • T. pallidum make use of the organization of the relative immunogenicity of its complement of outer membrane proteins to evade the immune system (Radolf, J. D. Mol. Microbiol. 16: 1067-1073 (1995)).
  • the T. pallidum outer membrane contains a scarcity of immunogenic transmembrane proteins (with regard to T. pallidum, these are termed "rare outer membrane proteins").
  • T. pallidum also secretes a number of small, but immunogenic proteins which may induce an immune response (Hindersson, P. et al., Res. Microbiol. 143:629-639 (1992)). It is clear that the etiology of diseases mediated or exacerbated by T.
  • T. pallidum genes and that characterizing the genes and their patterns of expression would add dramatically to our understanding of the organism and its host interactions.
  • Knowledge of T. pallidum genes and genomic organization would dramatically improve understanding of disease etiology and lead to improved and new ways of preventing, ameliorating, arresting and reversing diseases.
  • characterized genes and genomic fragments of T. pallidum would provide reagents for, among other things, detecting, characterizing and controlling T. pallidum infections. There is a need therefore to characterize the genome of T. pallidum and for polynucleotides and sequences of this organism.
  • the present invention is based on the sequencing of fragments of the T. pallidum genome.
  • the primary nucleotide sequences which were generated are provided in SEQ ID NOS: 1-744.
  • the present invention provides the nucleotide sequence of several thousand contigs of the T. pallidum genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
  • the present invention is provided as contiguous strings of primary sequence information corresponding to the nucleotide sequences depicted in SEQ ID NOS: 1-744.
  • the present invention further provides nucleotide sequences which are at least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-744.
  • the nucleotide sequence of SEQ ID NOS: 1-744 may be provided in a variety of mediums to facilitate its use.
  • the sequences of the present invention are recorded on computer readable media.
  • Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
  • the present invention further provides systems, particularly computer-based systems which contain the sequence information herein described stored in a data storage means.
  • Such systems are designed to identify commercially important fragments of the T. pallidum genome.
  • Another embodiment of the present invention is directed to fragments of the T. pallidum genome having particular structural or functional attributes.
  • Such fragments of the T. pallidum genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter referred to as open reading frames or ORFs, fragments which modulate the expression of an operably linked ORF, hereinafter referred to as expression modulating fragments or EMFs, and fragments which can be used to diagnose the presence of T. pallidum in a sample, hereinafter referred to as diagnostic fragments or DFs.
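  • The notion of an ORF as a fragment which encodes a peptide can be illustrated with a minimal sketch. It assumes a simplified rule (an ATG start codon followed in frame by one of the three standard stop codons, forward strand only); `find_orfs` and `min_codons` are hypothetical names, and this is not the program used to annotate the contigs. Real bacterial gene finders also consider alternative start codons and the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) tuples, 1-based and inclusive, for every
    ATG...stop stretch on the forward strand. `min_codons` counts the start
    and stop codons. Simplified illustration only."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # 0-based index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None
    return orfs


# Toy sequence: ATG AAA TTT TAG begins at position 3 (reading frame 3).
print(find_orfs("CCATGAAATTTTAGCC"))
```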
  • Each of the ORFs in fragments of the T. pallidum genome disclosed in Tables 1, 2 and 3, and the EMFs found 5' to the ORFs can be used in numerous ways as polynucleotide reagents.
  • The sequences can be used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.
  • the present invention further includes recombinant constructs comprising one or more fragments of the T. pallidum genome of the present invention.
  • The recombinant constructs of the present invention comprise vectors, such as a plasmid or viral vector, into which a fragment of the T. pallidum genome has been inserted.
  • the present invention further provides host cells containing any of the isolated fragments of the T. pallidum genome of the present invention.
  • the host cells can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.
  • The present invention is further directed to isolated polypeptides and proteins encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and proteins of the present invention from cells which have been altered to express them.
  • the invention further provides methods of obtaining homologs of the fragments of the T. pallidum genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
  • the invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
  • Such antibodies include both monoclonal and polyclonal antibodies.
  • the invention further provides hybridomas which produce the above-described antibodies.
  • a hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.
  • the present invention further provides methods of identifying test samples derived from cells which express one of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one or more of the antibodies of the present invention, or one or more of the DFs of the present invention, under conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.
  • kits are provided which contain the necessary reagents to carry out the above-described assays.
  • the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the antibodies, or one of the DFs of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of bound antibodies or hybridized DFs.
  • the present invention further provides methods of obtaining and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present invention.
  • agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical agents and the like.
  • Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.
  • the present genomic sequences of T. pallidum will be of great value to all laboratories working with this organism and for a variety of commercial purposes. Many fragments of the T. pallidum genome will be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value to T. pallidum researchers and for immediate commercial value for the production of proteins or to control gene expression.
  • sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative genomic and molecular phylogeny.
  • FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems of the present invention.
  • FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit and annotate the contigs of the T. pallidum genome of the present invention.
  • Both Macintosh and Unix platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al, Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society Press, Washington D.C. (1993).
  • Factura (AB) is a Macintosh program designed for automatic vector sequence removal and end-trimming of sequence files.
  • the program Loadis runs on a Macintosh platform and parses the feature data extracted from the sequence files by Factura to the Unix based T. pallidum relational database.
  • Assembly of contigs is accomplished by retrieving a specific set of sequence files and their associated features using Extrseq, a Unix utility for retrieving sequences from an SQL database.
  • The resulting sequence file is processed to trim portions of the sequences with a high rate of ambiguous nucleotides.
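  • As a rough illustration of this trimming step, the sketch below removes leading and trailing regions in which a sliding window contains too many ambiguous base calls. The function name, the window size, and the threshold are assumptions chosen for the example; the source does not state the criteria actually used.

```python
def trim_ambiguous_ends(seq: str, window: int = 20, max_ambiguous: int = 2) -> str:
    """Keep the span from the first to the last sliding window that contains
    at most `max_ambiguous` non-ACGT characters; return "" if no window passes.
    Window size and threshold are illustrative assumptions."""
    seq = seq.upper()

    def ambiguous_count(chunk: str) -> int:
        return sum(base not in "ACGT" for base in chunk)

    if len(seq) < window:
        # Sequence shorter than one window: accept or reject it as a whole.
        return seq if ambiguous_count(seq) <= max_ambiguous else ""

    clean_starts = [
        start
        for start in range(len(seq) - window + 1)
        if ambiguous_count(seq[start:start + window]) <= max_ambiguous
    ]
    if not clean_starts:
        return ""
    return seq[clean_starts[0]:clean_starts[-1] + window]


read = "NNNNN" + "ACGT" * 5 + "NNNNN"
print(trim_ambiguous_ends(read, window=5, max_ambiguous=0))
```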
  • The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic Research (TIGR) for rapid and accurate assembly of thousands of sequence fragments.
  • the collection of contigs generated by the assembly step is loaded into the database with the lassie program.
  • Identification of open reading frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against T.
  • the present invention is based on the sequencing of fragments of the T. pallidum genome and analysis of the sequences.
  • the primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID NOS: 1-744.
  • The "primary sequence" refers to the nucleotide sequence represented by the IUPAC nomenclature system.
  • the present invention provides the nucleotide sequences of SEQ ID NOS: 1-744, ORF IDs and ORFs within, or representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
  • a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS: 1-744" refers to any portion of the SEQ ID NOS: 1-744 which is not presently represented within a publicly available database.
  • Preferred representative fragments of the present invention are T. pallidum open reading frames (ORFs), expression modulating fragments (EMFs), and fragments which can be used to diagnose the presence of T. pallidum in a sample (DFs).
  • SEQ ID NOS: 1-744 and in Tables 1-3 together with routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence all "representative fragments" of interest, including open reading frames encoding a large variety of T. pallidum proteins.
  • the present invention is further directed to nucleic acid molecules encoding portions or fragments of the nucleotide sequences described herein.
  • Fragments include portions of the nucleotide sequences of SEQ ID NOS: 1-744, at least 10 contiguous nucleotides in length selected from any two integers, one of which representing a 5' nucleotide position and a second of which representing a 3' nucleotide position, where the first nucleotide for each nucleotide sequence in SEQ ID NOS: 1-744 is position 1. That is, every combination of a 5' and 3' nucleotide position that a fragment at least 10 contiguous nucleotides in length could occupy is included in the invention.
  • a fragment may be 10 contiguous nucleotide bases in length or any integer between 10 and the length of an entire nucleotide sequence of SEQ ID NOS: 1-744 minus 1. Therefore, included in the invention are contiguous fragments specified by any 5' and 3' nucleotide base positions of a nucleotide sequences of SEQ ID NOS: 1-744 wherein the contiguous fragment is any integer between 10 and the length of an entire nucleotide sequence minus 1.
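  • The way fragments are specified by 5' and 3' positions can be sketched as follows. The helper names `fragment_positions` and `fragment` are hypothetical; positions are 1-based and inclusive, with the first nucleotide at position 1, and fragment sizes run from 10 up to the sequence length minus 1, as described above.

```python
from typing import Iterator


def fragment_positions(seq_len: int, min_len: int = 10) -> Iterator[tuple[int, int]]:
    """Yield every (five_prime, three_prime) pair, 1-based and inclusive,
    for fragments of min_len .. seq_len - 1 contiguous nucleotides."""
    for size in range(min_len, seq_len):
        for five_prime in range(1, seq_len - size + 2):
            yield five_prime, five_prime + size - 1


def fragment(seq: str, five_prime: int, three_prime: int) -> str:
    """Extract the fragment between two 1-based, inclusive positions."""
    return seq[five_prime - 1:three_prime]


toy = "ACGTACGTACGT"  # 12 nucleotides, so fragment sizes 10 and 11 qualify
pairs = list(fragment_positions(len(toy)))
print(len(pairs), pairs)
print(fragment(toy, 2, 11))
```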
  • the invention includes polynucleotides comprising fragments specified by size, in nucleotides, rather than by nucleotide positions.
  • the invention includes any fragment size, in contiguous nucleotides, selected from integers between 10 and the length of an entire ORF ID, ORF, or SEQ ID NO:, minus 1.
  • Preferred sizes of contiguous nucleotide fragments include 20 nucleotides, 30 nucleotides, 40 nucleotides, 50 nucleotides.
  • Other preferred sizes of contiguous nucleotide fragments, which may be useful as diagnostic probes and primers include fragments 50-300 nucleotides in length which include, as discussed above, fragment sizes representing each integer between 50-300.
  • the present invention also provides for the exclusion of any fragment, specified by 5' and 3' base positions or by size in nucleotide bases as described above for any ORF ID or SEQ ID NOS: 1-744. Any number of fragments of nucleotide sequences in ORF IDs or SEQ ID NOS: 1-744, specified by 5' and 3' base positions or by size in nucleotides, as described above, may be excluded from the present invention.
  • While the presently disclosed sequences of SEQ ID NOS: 1-744 are highly accurate, sequencing techniques are not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS: 1-744. However, once the present invention is made available (i.e., once the information in SEQ ID NOS: 1-744 and Tables 1-3 has been made available), resolving a rare sequencing error in SEQ ID NOS: 1-744 will be well within the skill of the art. The present disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to be obtained readily by straightforward application of routine techniques.
  • polynucleotides of the present invention readily may be obtained by routine application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining libraries and for sequencing are provided below, for instance.
  • A wide variety of T. pallidum strains, which are known in the art, can be used to prepare T. pallidum genomic DNA for cloning and for obtaining polynucleotides of the present invention.
  • The nucleotide sequences of the genomes from different strains of T. pallidum differ somewhat. However, the nucleotide sequences of the genomes of all T. pallidum strains will be at least 95% identical, in corresponding part, to the nucleotide sequences provided in SEQ ID NOS: 1-744 and the ORF IDs and ORFs within. Nearly all will be at least 99% identical and the great majority will be 99.9% identical.
  • the present application is further directed to nucleic acid molecules at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleic acid sequence shown in SEQ ID NOS: 1-744, the ORF IDs and ORFs within.
  • The above nucleic acid sequences are included irrespective of whether they encode a polypeptide having T. pallidum activity. This is because even where a particular nucleic acid molecule does not encode a polypeptide having T. pallidum activity, one of skill in the art would still know how to use the nucleic acid molecule, for instance, as a hybridization probe. Uses of the nucleic acid molecules of the present invention that do not encode a polypeptide having T. pallidum activity include, inter alia, isolating a T. pallidum gene or allelic variants thereof from a DNA library, and detecting T. pallidum mRNA expression in samples, such as environmental samples, suspected of containing T. pallidum by Northern blot, PCR, or similar analysis.
  • nucleic acid molecules having sequences at least 90%, 95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in SEQ ID NOS: 1-744, the ORF IDs, and the ORF within each ORF ID, which do, in fact, encode a polypeptide having T. pallidum protein activity
  • By "a polypeptide having T. pallidum activity" is intended polypeptides exhibiting activity similar, but not necessarily identical, to an activity of the T. pallidum protein of the invention, as measured in a particular biological assay suitable for measuring activity of the specified protein.
  • nucleic acid molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99% identical to the nucleic acid sequences shown in SEQ ID NOS: 1-744, the ORF IDs, and the ORF within each ORF ID will encode a polypeptide having T. pallidum protein activity.
  • Since degenerate variants of these nucleotide sequences all encode the same polypeptide, this will be clear to the skilled artisan even without performing the above described comparison assay.
  • The biological activity or function of the polypeptides of the present invention is expected to be similar or identical to that of polypeptides from other bacteria that share a high degree of structural identity/similarity.
  • Tables 1-3 list accession numbers and descriptions for the closest matching sequences of polypeptides available through GenBank. It is therefore expected that the biological activity or function of the polypeptides of the present invention will be similar or identical to those polypeptides from other bacterial genera, species, or strains listed in Tables 1-3.
  • nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the T. pallidum polypeptide.
  • nucleotide sequence at least 95% identical to a reference nucleotide sequence
  • up to 5% of the nucleotides in the reference sequence may be deleted, inserted, or substituted with another nucleotide.
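  • The arithmetic behind this definition can be sketched in a few lines. The helper names are hypothetical; the sketch simply converts a percent identity threshold into the maximum number of deleted, inserted, or substituted nucleotides allowed for a reference of a given length, using integer arithmetic to avoid rounding surprises.

```python
def max_differences(reference_length: int, percent_identity: int = 95) -> int:
    """Largest number of differing nucleotides that still leaves the sequence
    at least `percent_identity` percent identical to the reference."""
    return reference_length * (100 - percent_identity) // 100


def meets_identity(n_differences: int, reference_length: int,
                   percent_identity: int = 95) -> bool:
    """True if `n_differences` changes stay within the identity threshold."""
    return n_differences <= max_differences(reference_length, percent_identity)


print(max_differences(100))        # five point mutations per 100 nucleotides
print(meets_identity(6, 100))      # six differences fall below 95% identity
```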
  • the query sequence may be an entire sequence shown in SEQ ID NOS: 1-744, the ORF IDs, or the ORF within each ORF ID, or any fragment specified as described herein.
  • Whether any particular nucleic acid molecule or polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs.
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. See Brutlag et al. (1990) Comp. App. Biosci. 6:237-245.
  • In a sequence alignment, the query and subject sequences are both DNA sequences.
  • An RNA sequence can be compared by first converting U's to T's.
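  • The conversion mentioned above is a plain character substitution. A minimal sketch, using a hypothetical helper name and preserving letter case:

```python
def rna_to_dna(seq: str) -> str:
    """Rewrite an RNA sequence in the DNA alphabet by converting U's to T's."""
    return seq.replace("U", "T").replace("u", "t")


print(rna_to_dna("AUGGCUuaa"))
```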
  • the result of said global sequence alignment is in percent identity.
  • the percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention.
  • Only nucleotides outside the 5' and 3' nucleotides of the subject sequence are calculated for the purposes of manually adjusting the percent identity score. For example, a 90 nucleotide subject sequence is aligned to a 100 nucleotide query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the FASTDB alignment does not show a match/alignment of the first 10 nucleotides at the 5' end.
  • the 10 unpaired nucleotides represent 10% of the sequence (number of nucleotides at the 5' and 3' ends not matched/total number of nucleotides in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 nucleotides were perfectly matched the final percent identity would be 90%.
  • a 90 nucleotide subject sequence is compared with a 100 nucleotide query sequence. This time the deletions are internal deletions so that there are no nucleotides on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected.
  • Only nucleotides 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.
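  • The manual correction described above reduces to one subtraction. The sketch below assumes the FASTDB percent identity and the counts of unmatched query bases at each end have already been read off the alignment; the function and parameter names are hypothetical and no FASTDB interface is implied.

```python
def corrected_percent_identity(fastdb_identity: float,
                               unmatched_5prime: int,
                               unmatched_3prime: int,
                               query_length: int) -> float:
    """Subtract from the FASTDB percent identity the share of query bases
    lying 5' and 3' of the subject sequence that are not matched/aligned."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang_percent = 100.0 * (unmatched_5prime + unmatched_3prime) / query_length
    return fastdb_identity - overhang_percent


# First worked example: 90-nt subject, 100-nt query, 10 unmatched 5' bases,
# remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 10, 0, 100))

# Second worked example: internal deletions only, so nothing is subtracted.
print(corrected_percent_identity(90.0, 0, 0, 100))
```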
  • The nucleotide sequences provided in SEQ ID NOS: 1-744 including ORF IDs and corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to said polynucleotide sequences may be "provided" in a variety of mediums to facilitate use thereof.
  • "Provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention. Such a manufacture provides a large portion of the T. pallidum genome and parts thereof (e.g., a T.
  • a nucleotide sequence of the present invention can be recorded on computer readable media.
  • "Computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media.
  • a variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention.
  • the choice of the data storage structure will generally be based on the means chosen to access the stored information.
  • a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium.
  • The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like.
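  • As one possible illustration of the ASCII-file option, the sketch below writes and reads sequences in a FASTA-style layout. FASTA is an assumption made for the example and is not named in the source; the function names and record identifiers are likewise hypothetical.

```python
import os
import tempfile


def write_fasta(path: str, records: dict[str, str], width: int = 60) -> None:
    """Write named sequences to an ASCII file, one '>' header line per record
    followed by the sequence wrapped at `width` characters."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path: str) -> dict[str, str]:
    """Read the file back into a name -> sequence mapping."""
    parts: dict[str, list[str]] = {}
    name = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                parts[name] = []
            elif name is not None:
                parts[name].append(line)
    return {key: "".join(chunks) for key, chunks in parts.items()}


with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "contigs.fasta")
    write_fasta(example_path, {"contig_1": "ACGT" * 40, "contig_2": "GGCCN"})
    print(sorted(read_fasta(example_path)))
```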
  • nucleotide sequence information of the present invention.
  • Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium.
  • nucleotide sequences of SEQ ID NOS: 1-744 including ORF IDs and corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to said polynucleotide sequences, the present invention enables the skilled artisan routinely to access the provided sequence information for a wide variety of purposes.
  • the present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify, among other things, commercially important fragments of the T. pallidum genome.
  • a computer-based system refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention.
  • the minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means.
  • any one of the currently available computer-based systems is suitable for use in the present invention.
  • the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means.
  • data storage means refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.
  • search means refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the present genomic sequences which match a particular target sequence or target motif.
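A minimal sketch of such a search means is given below, assuming the simplest possible comparison: exact matching of a nucleotide target against both strands of a stored sequence. The function names and the example bases are hypothetical, and real search programs use far more sensitive scoring than exact matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target(stored, target):
    """Return (strand, 1-based start) for every exact match, overlaps included.

    Matches on the reverse strand are reported at the forward-strand
    coordinate where the reverse complement of the target begins.
    """
    hits = []
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        start = stored.find(query)
        while start != -1:
            hits.append((strand, start + 1))
            start = stored.find(query, start + 1)
    return hits


# Made-up stored sequence containing the target once on each strand.
example_hits = find_target("TTGACAGGATCCTTTGTCAA", "TTGACA")
```

A palindromic target would be reported once per strand at the same coordinate, which a caller may wish to de-duplicate.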
  • a variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).
  • a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids.
  • a skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database.
  • the most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues.
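The relationship between target length and chance matches can be made concrete with a back-of-the-envelope estimate. The sketch below assumes independent, equally frequent residues (a simplification that real genomes do not satisfy) and a hypothetical database of 1,000,000 nucleotides searched on both strands; neither assumption comes from the source.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4, strands=2):
    """Expected number of chance exact matches under a uniform random model.

    There are (database_length - target_length + 1) start positions per strand,
    and each matches a fixed target with probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return strands * positions / alphabet_size ** target_length


# Hypothetical 1,000,000-nucleotide database, both strands searched.
short_target = expected_random_hits(6, 1_000_000)    # hundreds of chance hits
long_target = expected_random_hits(30, 1_000_000)    # far fewer than one
```

For amino acid targets the same function can be called with `alphabet_size=20` and `strands=1`.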
  • searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
  • a target structural motif refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif.
  • Protein target motifs include, but are not limited to, enzymic active sites and signal sequences.
  • Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
  • a variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.
  • a preferred format for an output means ranks fragments of the T. pallidum genomic sequences possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.
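One way to picture the ranked output format is the sketch below, which scores every window of a stored sequence against a target by ungapped percent identity and lists the best windows first. This is a deliberately simple stand-in, not the BLAST scoring scheme; the function names and sequences are hypothetical.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length strings."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def rank_fragments(genome, target, top=3):
    """Return the `top` windows as (percent identity, 1-based start), best first."""
    n = len(target)
    scored = [
        (percent_identity(target, genome[i:i + n]), i + 1)
        for i in range(len(genome) - n + 1)
    ]
    # Sort by descending identity, then by ascending position for ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


# Made-up sequence with one perfect and one 3-of-4 match to the target.
ranking = rank_fragments("GGGGACGTGGGGACGAGGGG", "ACGT", top=2)
```

Each entry pairs the degree of match with the position of the fragment, which mirrors the ranking-plus-homology presentation described above.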
  • comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the T. pallidum genome.
  • software implementing the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990), is used to identify open reading frames within the T. pallidum genome.
  • any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known to those of skill also may be employed in this regard.
  • FIG. 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present invention.
  • the computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114.
  • the removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc.
  • a removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114.
  • the computer system 102 includes appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once it is inserted into the removable medium storage device 114.
  • a nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116.
  • software for accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the requirements and operating parameters of the operating system, the hardware system and the software program or programs.
  • embodiments of the present invention are directed to isolated fragments of the T. pallidum genome.
  • the fragments of the T. pallidum genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and fragments which can be used to diagnose the presence of T. pallidum in a sample, hereinafter diagnostic fragments (DFs).
  • an "isolated nucleic acid molecule" or an "isolated fragment of the T. pallidum genome" refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification means to reduce, from the composition, the number of compounds which are normally associated with the composition.
  • the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-744, to representative fragments thereof as described above including ORF IDs and ORFs, to polynucleotides at least 95%, preferably at least 96%, 97%, 98%, or 99% and especially preferably at least 99.9% identical in sequence thereto, also as set out above.
  • T. pallidum DNA can be enzymatically sheared to produce fragments of 15-20 kb in length. These fragments can then be used to generate a T. pallidum library by inserting them into lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated in the ORF IDs of Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS: 1-744.
  • the isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and double stranded DNA, and single stranded RNA.
  • the entire sequence of each sequence of SEQ ID NOS: 1-744 is included with the first nucleotide being position 1. Therefore, for reference purposes the numbering used in the present invention is that provided in the sequence listing for SEQ ID NOS: 1-744.
  • an open reading frame means a series of nucleotide triplets coding for amino acid residues without any termination codons and is a sequence translatable into protein. Further, unless specified, the term "ORF" for each ORF ID is defined by the termination codon at the 3' end and the 5' most methionine codon, at the 5' end, in frame with said 3' termination codon.
  • ORF also refers to a particular polypeptide sequence defined by the ORF polynucleotide sequence, wherein the N-terminus is defined by the 5' most methionine codon in frame with the termination codon at the 3' end of the ORF ID and the C-terminus is defined by the last codon before the said 3' termination codon.
  • an ORF ID represents a sequence without any internal termination codons flanked by termination codons.
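The ORF definition above (the 5'-most in-frame methionine codon through the 3' termination codon, with no internal termination codon) can be sketched as a simple scan. The code below is an illustration only: it examines the three forward frames of one strand, assumes the standard termination codons TAA, TAG and TGA, and uses a made-up sequence. It is not the GeneMark procedure used to build Tables 1-3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (frame, start, end) tuples with 1-based inclusive coordinates.

    `start` is the first base of the 5'-most ATG since the previous in-frame
    stop codon; `end` is the last base of the terminating stop codon.
    `min_codons` counts coding codons, excluding the stop codon.
    """
    results = []
    for frame in range(3):
        first_atg = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                if first_atg is not None and (i - first_atg) // 3 >= min_codons:
                    results.append((frame + 1, first_atg + 1, i + 3))
                first_atg = None  # a new stop-to-stop stretch begins
            elif codon == "ATG" and first_atg is None:
                first_atg = i
    return results


# Made-up sequence: TAA GCC ATG AAA ATG CCC TGA GG
example_orfs = find_orfs("TAAGCCATGAAAATGCCCTGAGG")
```

The second in-frame ATG is ignored because the definition anchors the ORF at the 5'-most methionine codon; scanning the reverse complement in the same way would cover the other strand.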
  • Tables 1, 2, and 3 list ORF IDs in the T. pallidum genomic contigs of the present invention that were identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical methods, such as those discussed herein, to generate more inclusive, more restrictive, or more selective lists.
  • Table 1 sets out ORF IDs in the T. pallidum contigs of the present invention that over a continuous region of at least 50 bases are 95% or more identical (by BLAST analysis) to a nucleotide sequence available through GenBank in June, 1997.
  • Table 2 sets out ORF IDs in the T. pallidum contigs of the present invention that are not in Table 1 and match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through GenBank in July, 1996.
  • Table 3 sets out ORF IDs in the T. pallidum contigs of the present invention that do not match significantly, by BLASTP analysis, a polypeptide sequence available through GenBank in July, 1996.
  • the first and second columns identify the ORF ID by, respectively, contig number and ORF ID number within the contig; the third column indicates the first nucleotide of the ORF ID, counting from the 5' end of the contig strand; and the fourth column indicates the last nucleotide of the ORF ID, counting from the 5' end of the contig strand.
  • In Tables 1 and 2, column six lists the Reference for the closest matching sequence available through GenBank. These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be familiar with their denominators. Descriptions of the nomenclature are available from the National Center for Biotechnology Information. Column seven in Tables 1 and 2 provides the gene name of the matching sequence; column eight provides the BLAST identity score from the comparison of the ORF and the homologous gene; and column nine indicates the length in nucleotides of the highest scoring segment pair identified by the BLAST identity analysis.
  • Tables 1 and 2 herein enumerate the percent identity of the highest scoring segment pair in each ORF and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided below and are described in the pertinent literature highlighted by the citations provided below.
  • an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates the expression of an operably linked ORF or EMF.
  • EMFs include, but are not limited to, promoters, and promoter modulating sequences (inducible elements).
  • One class of EMFs are fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.
  • EMF sequences can be identified within the contigs of the T. pallidum genome by their proximity to the ORF IDs provided in Tables 1-3 and ORFs within each ORF ID.
  • An intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably linked ORF in a fashion similar to that found with the naturally linked ORF sequence.
  • an "intergenic segment" refers to fragments of the T. pallidum genome which are between two ORF(s) herein described.
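Given ORF coordinates of the kind listed in the tables (first and last nucleotide on a contig), candidate intergenic segments can be located by taking the gaps between consecutive ORFs. The sketch below is an assumption-laden illustration: the coordinates are invented, and filtering gaps to 10 to 200 nucleotides is one reading of the length range mentioned above, not a rule stated by the source.

```python
def intergenic_segments(orf_coords, min_len=10, max_len=200):
    """Return (start, end) gaps between consecutive ORFs on one contig.

    `orf_coords` holds (first, last) 1-based inclusive positions; pairs may be
    given in either orientation. Only gaps whose length falls within
    [min_len, max_len] are kept.
    """
    ordered = sorted((min(s, e), max(s, e)) for s, e in orf_coords)
    segments = []
    prev_end = ordered[0][1] if ordered else 0
    for start, end in ordered[1:]:
        gap_start, gap_end = prev_end + 1, start - 1
        length = gap_end - gap_start + 1
        if min_len <= length <= max_len:
            segments.append((gap_start, gap_end))
        prev_end = max(prev_end, end)  # tolerate overlapping or nested ORFs
    return segments


# Invented coordinates: gaps of 50, 4 and 499 nucleotides.
gaps = intergenic_segments([(1, 300), (900, 351), (905, 1500), (2000, 2600)])
```

Whether a given gap actually modulates expression would still have to be tested experimentally, for example with the EMF trap vector described in this section.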
  • EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems of the present invention. Further, the two methods can be combined and used together.
  • An EMF trap vector contains a cloning site linked to a marker sequence.
  • a marker sequence encodes an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap vector is placed within an appropriate host under appropriate conditions.
  • an EMF will modulate the expression of an operably linked marker sequence.
  • a sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction sites upstream from the marker sequence in the EMF trap vector.
  • the vector is then transformed into an appropriate host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
  • an EMF will modulate the expression of an operably linked marker sequence.
  • a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize to T. pallidum sequences. DFs can be readily identified by identifying unique sequences within contigs of the T. pallidum genome, such as by using well-known computer analysis software, and by generating and testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines amplification or hybridization selectivity.
  • the sequences falling within the scope of the present invention are not limited to the specific sequences herein described, but also include allelic and species variations thereof.
  • allelic and species variations can be routinely determined by comparing the polynucleotide sequences provided in SEQ ID NOS: 1-744, ORF IDs and ORFs within, a representative fragment thereof, or a nucleotide sequence at least 99% and preferably 99.9% identical to said polynucleotide sequences, with a sequence from another isolate of the same species.
  • the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
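Synonymous codon substitution can be checked mechanically by translating two coding sequences and comparing the results. The sketch below assumes the standard genetic code and uses made-up sequences; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an uppercase DNA coding sequence; '*' marks a termination codon."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_polypeptide(seq_a, seq_b):
    """True when two different nucleotide sequences translate identically."""
    return seq_a != seq_b and translate(seq_a) == translate(seq_b)


# Made-up example: GCT -> GCC (Ala) and AAA -> AAG (Lys) are synonymous changes.
same = encodes_same_polypeptide("ATGGCTAAATAA", "ATGGCCAAGTAA")
# GCT -> GAT changes Ala to Asp, so this pair is not synonymous.
different = encodes_same_polypeptide("ATGGCTAAATAA", "ATGGATAAATAA")
```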
  • any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, such as an ORF, in both directions (i.e., sequence both strands).
  • error screening can be performed by sequencing corresponding polynucleotides of T. pallidum origin isolated by using part or all of the fragments in question as a probe or primer.
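Sequencing both strands gives two independent reads of the same fragment, so a discrepancy between them flags a possible error. A minimal sketch, assuming both reads cover exactly the same region and are written 5' to 3'; the reads shown are invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strand_discrepancies(forward_read, reverse_read):
    """Compare a forward read with the read of the complementary strand.

    The reverse read is reverse-complemented so both are on forward-strand
    coordinates. Returns (1-based position, forward base, reverse-derived base)
    for each disagreement.
    """
    from_reverse = reverse_read.translate(COMPLEMENT)[::-1]
    if len(from_reverse) != len(forward_read):
        raise ValueError("reads must cover the same region")
    return [
        (i + 1, f, r)
        for i, (f, r) in enumerate(zip(forward_read, from_reverse))
        if f != r
    ]


# Invented reads: the second reverse read disagrees at forward position 3.
clean = strand_discrepancies("ATGGCTAAA", "TTTAGCCAT")
flagged = strand_discrepancies("ATGGCTAAA", "TTTAGCGAT")
```

Positions that disagree would be candidates for resequencing; the comparison by itself cannot say which strand is correct.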
  • Each of the ORFs of the T. pallidum genome within the ORF IDs of Tables 1, 2 and 3, and the EMFs found 5' to the ORFs can be used as polynucleotide reagents in numerous ways.
  • the sequences can be used as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample, particularly T. pallidum.
  • Particularly preferred are ORFs such as those of Table 3, which do not match previously characterized sequences from other organisms and thus are most likely to be highly selective for T. pallidum. Also particularly preferred are ORFs that can be used to distinguish between strains of T. pallidum, particularly those that distinguish medically important strains, such as drug-resistant strains.
  • fragments of the present invention can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA.
  • Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide.
  • Information from the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
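As an illustration of the antisense case, an oligonucleotide complementary to a region of an mRNA can be derived by reverse-complementing the corresponding coding-strand DNA. The sketch below assumes a DNA oligonucleotide of 20 to 40 bases, the length range given in this section; the function name and the sequence are hypothetical, and no claim is made about hybridization efficiency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_dna, start, length=20):
    """Return a DNA oligo complementary to sense_dna[start : start + length - 1].

    `start` is 1-based on the coding (sense) strand. The result is written
    5' to 3', so it reads as the reverse complement of the target region.
    """
    if not 20 <= length <= 40:
        raise ValueError("length outside the 20 to 40 base range used here")
    region = sense_dna[start - 1:start - 1 + length]
    if start < 1 or len(region) < length:
        raise ValueError("requested region runs off the sequence")
    return region.translate(COMPLEMENT)[::-1]


# Made-up coding sequence; the oligo covers the first 20 bases, including ATG.
oligo = antisense_oligo("ATGGCTAAAGGTACCGATCTGAAATAA", start=1, length=20)
```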
  • Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition. Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al, Nucl. Acids Res. 3:113 (1979); Cooney et al, Science 241:456 (1988); and Dervan et al,
  • the present invention further provides recombinant constructs comprising one or more fragments of the T. pallidum genomic fragments and contigs of the present invention.
  • Certain preferred recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the T. pallidum genome has been inserted, in a forward or reverse orientation.
  • the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF.
  • the vector may further comprise a marker sequence or heterologous ORF operably linked to the EMF.
  • Useful bacterial vectors include phagescript, PsiX174, pBluescript SK, pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia).
  • Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).
  • Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers.
  • Two appropriate vectors are pKK232-8 and pCM7.
  • Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc.
  • Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-1. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
  • the present invention further provides host cells containing any one of the isolated fragments of the T. pallidum genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the host cell using known methods.
  • the host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or a prokaryotic cell, such as a bacterial cell.
  • a polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in, for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).
  • a host cell containing one of the fragments of the T. pallidum genomic fragments and contigs of the present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.
  • the present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention.
  • a degenerate variant is a nucleotide fragment which differs from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encodes an identical polypeptide sequence.
  • Preferred nucleic acid fragments of the present invention are the ORF IDs depicted in Tables 2 and 3 and the ORFs within which encode proteins.
  • the amino acid sequence can be synthesized using commercially available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides. Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies against the native polypeptide, as discussed further below.
  • the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein.
  • polypeptides and proteins of the present invention also can be purified from cells which have been altered to express the desired polypeptide or protein.
  • a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level.
  • Those skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention.
  • the polypeptides of the present invention are preferably provided in an isolated form, and preferably are substantially purified.
  • A recombinantly produced version of the T. pallidum polypeptide can be substantially purified by the one-step method described by Smith et al. (1988) Gene 67:31-40.
  • Polypeptides of the invention also can be purified from natural or recombinant sources using antibodies directed against the polypeptides of the invention in methods which are well known in the art of protein purification.
  • the invention further provides for isolated T. pallidum polypeptides comprising an amino acid sequence selected from the group including: (a) the amino acid sequence of a full-length T. pallidum polypeptide having the complete amino acid sequence from the first methionine codon to the termination codon of each sequence listed in SEQ ID NOS: 1-744, wherein said termination codon is at the end of each SEQ ID NO: and said first methionine is the first methionine in frame with said termination codon; and (b) the amino acid sequence of a full-length T. pallidum polypeptide having the complete amino acid sequence in (a) excepting the N-terminal methionine.
  • polypeptides of the present invention also include polypeptides having an amino acid sequence at least 80% identical, more preferably at least 90% identical, and still more preferably 95%, 96%, 97%, 98% or 99% identical to those described in (a) and (b) above.
  • the present invention is further directed to polynucleotides encoding portions or fragments of the amino acid sequences described herein as well as to portions or fragments of the isolated amino acid sequences described herein. Fragments include portions of the amino acid sequences described herein at least 5 contiguous amino acids in length and selected from any two integers, one of which represents an N-terminal position and the other a C-terminal position.
  • the initiation codon of the ORFs of the present invention is position 1.
  • the initiation codon (position 1) for purposes of the present invention is the first methionine codon of each ORF ID which is in frame with the termination codon at the end of each said sequence.
  • Every combination of an N-terminal and C-terminal position that a fragment at least 5 contiguous amino acid residues in length could occupy on any given ORF is included in the invention, i.e., from initiation codon up to the termination codon. "At least" means a fragment may be 5 contiguous amino acid residues in length or any integer between 5 and the number of residues in an ORF, minus 1. Therefore, included in the invention are contiguous fragments specified by any N-terminal and C-terminal positions of amino acid sequence set forth in SEQ ID NOS: 1-744 or
  • the invention includes polypeptides comprising fragments specified by size, in amino acid residues, rather than by N-terminal and C-terminal positions.
  • the invention includes any fragment size, in contiguous amino acid residues, selected from integers between 5 and the number of residues in an ORF, minus 1.
  • Preferred sizes of contiguous polypeptide fragments include about 5 amino acid residues, about 10 amino acid residues, about 20 amino acid residues, about 30 amino acid residues, about 40 amino acid residues, about 50 amino acid residues, about 100 amino acid residues, about 200 amino acid residues, about 300 amino acid residues, and about 400 amino acid residues.
  • the preferred sizes are, of course, meant to exemplify, not limit, the present invention as all size fragments representing any integer between 5 and the number of residues in a full length sequence minus 1 are included in the invention.
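To give a sense of how many fragments this definition covers, the sketch below counts and enumerates every contiguous fragment of an n-residue polypeptide whose length runs from 5 up to n minus 1. The function names are hypothetical. For each length k there are n - k + 1 possible N-terminal positions, so the total is the sum of (n - k + 1) for k from 5 to n - 1.

```python
def count_fragments(n_residues, min_len=5):
    """Number of contiguous fragments with min_len <= length <= n_residues - 1."""
    return sum(n_residues - k + 1 for k in range(min_len, n_residues))


def enumerate_fragments(n_residues, min_len=5):
    """List (N-terminal position, C-terminal position) pairs, 1-based inclusive."""
    return [
        (start, start + k - 1)
        for k in range(min_len, n_residues)
        for start in range(1, n_residues - k + 2)
    ]


# A 10-residue polypeptide has 6 + 5 + 4 + 3 + 2 = 20 such fragments.
total_for_ten = count_fragments(10)
pairs_for_six = enumerate_fragments(6)
```

The count grows roughly with the square of the polypeptide length, which is why the text specifies fragments by position pairs or by size rather than listing them.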
  • the present invention also provides for the exclusion of any fragments specified by N-terminal and C-terminal positions or by size in amino acid residues as described above. Any number of fragments specified by N-terminal and C-terminal positions or by size in amino acid residues as described above may be excluded.
  • the above fragments need not be active since they would be useful, for example, in immunoassays, in epitope mapping, epitope tagging, to generate antibodies to a particular portion of the protein, as vaccines, and as molecular weight markers.
  • polypeptides of the present invention include polypeptides which have at least 90% similarity, more preferably at least 95% similarity, and still more preferably at least 96%, 97%, 98% or 99% similarity to those described above.
  • a further embodiment of the invention relates to a polypeptide which comprises the amino acid sequence of a T. pallidum polypeptide having an amino acid sequence which contains at least one conservative amino acid substitution, but not more than 50 conservative amino acid substitutions, not more than 40 conservative amino acid substitutions, not more than 30 conservative amino acid substitutions, and not more than 20 conservative amino acid substitutions. Also provided are polypeptides which comprise the amino acid sequence of a T. pallidum polypeptide, having at least one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 conservative amino acid substitutions.
  • by a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
  • the amino acid sequence of the subject polypeptide may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
  • up to 5% of the amino acid residues in the subject sequence may be inserted or deleted (indels), or substituted with another amino acid.
  • These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence.
  • whether any particular polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to the ORF amino acid sequences encoded by the sequences of SEQ ID NOS: 1-744, as described herein, can be determined conventionally using known computer programs.
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence uses the FASTDB computer program based on the algorithm of Brutlag et al., (1990) Comp. App. Biosci. 6:237-245.
  • in a sequence alignment, the query and subject sequences are both amino acid sequences.
  • the result of said global sequence alignment is in percent identity.
  • Preferred parameters used in a FASTDB amino acid alignment are:
  • the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention.
  • a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity.
  • the deletion occurs at the N-terminus of the subject sequence and therefore, the FASTDB alignment does not match/align with the first 10 residues at the N-terminus.
  • the 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C- termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%.
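The terminal correction described above reduces to one subtraction. The sketch below takes the percent identity reported by the alignment program as an input (it does not run FASTDB) and reproduces the 90-residue example; the function and parameter names are hypothetical.

```python
def corrected_percent_identity(program_percent_identity, unmatched_terminal_residues,
                               query_length):
    """Subtract the share of unmatched N- and C-terminal query residues.

    `unmatched_terminal_residues` counts query residues lying beyond the ends
    of the subject sequence that were not matched/aligned.
    """
    penalty = 100.0 * unmatched_terminal_residues / query_length
    return program_percent_identity - penalty


# 90-residue subject against a 100-residue query: 10 N-terminal query residues
# are unmatched, and the remaining 90 residues match perfectly.
final_score = corrected_percent_identity(100.0, 10, 100)
```

Only unmatched residues at the termini enter the penalty; internal mismatches or gaps are assumed to be reflected already in the percent identity reported by the alignment program.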
  • a 90 residue subject sequence is compared with a 100 residue query sequence.
  • uses of polypeptides of the present invention that do not have T. pallidum activity include, inter alia, use as epitope tags, in epitope mapping, and as molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration columns using methods known to those of skill in the art.
  • polypeptides of the present invention can also be used to raise polyclonal and monoclonal antibodies, which are useful in assays for detecting T. pallidum protein expression or as agonists and antagonists capable of enhancing or inhibiting T. pallidum protein function.
  • polypeptides can be used in the yeast two-hybrid system to "capture" T. pallidum protein binding proteins which are also candidate agonists and antagonists according to the present invention. See, e.g., Fields et al. (1989) Nature 340:245-246.
  • Any host/vector system can be used to express one or more of the ORFs of the present invention.
  • These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
  • the most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level.
  • Recombinant means that a polypeptide or protein is derived from recombinant (e.g., microbial or mammalian) expression systems.
  • Microbial refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems.
  • recombinant microbial defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.
  • Nucleotide sequence refers to a heteropolymer of deoxyribonucleotides.
  • DNA segments encoding the polypeptides and proteins provided by this invention are assembled from fragments of the T. pallidum genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon.
  • Recombinant expression vehicle or vector refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence.
  • the expression vehicle can comprise a transcriptional unit comprising an assembly of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the beginning of the desired coding region and terminate translation at its end.
  • Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell.
  • recombinant protein may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.
  • "Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
  • Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention.
  • Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby incorporated by reference in its entirety.
  • recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S.
  • heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium.
  • the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
  • Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter.
  • the vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, when desirable, provide amplification within the host.
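The requirement that a structural sequence sit "in operable reading phase" with its translation initiation and termination signals can be illustrated with a minimal sketch. The function name `in_reading_phase`, the restriction to an ATG start codon, and the use of the standard stop codons are assumptions made for illustration only; they are not taken from the specification.

```python
# Standard stop codons (assumption: standard genetic code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_reading_phase(insert: str) -> bool:
    """Return True if `insert` reads as one uninterrupted coding frame.

    Checks, under the assumptions stated above, that the sequence
    (1) has a length divisible by three,
    (2) begins with the initiation codon ATG,
    (3) ends with a stop codon, and
    (4) contains no stop codon before the final one.
    """
    seq = insert.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon between the first and last codon breaks the frame.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


if __name__ == "__main__":
    print(in_reading_phase("ATGGCTTAA"))     # start, one codon, stop
    print(in_reading_phase("ATGGCTTA"))      # length not a multiple of three
    print(in_reading_phase("ATGTAAGCTTAA"))  # premature internal stop
```

A check of this kind only confirms that the coding frame is intact; it says nothing about promoter function or expression level, which must be established experimentally.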
  • Suitable prokaryotic hosts for transformation include strains of E. coli, B. subtilis, Salmonella typhimurium and various species within the genera Pseudomonas and Streptomyces. Others may also be employed as a matter of choice.
  • useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017).
  • Such commercial vectors include, for example, pKK223-3 (available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
  • The selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product. Thereafter cells are typically harvested, generally by centrifugation, and disrupted to release expressed protein, generally by physical or chemical means, and the resulting crude extract is retained for further purification.
  • mammalian cell culture systems can also be employed to express recombinant protein.
  • mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:115 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.
  • Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences.
  • DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites, may be used to provide the required nontranscribed genetic elements.
  • Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
  • Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
  • the present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are substantially equivalent to those herein described.
  • substantially equivalent can refer both to nucleic acid and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between reference and subject sequences.
  • sequences having equivalent biological activity, and equivalent expression characteristics are considered substantially equivalent.
  • truncation of the mature sequence should be disregarded.
  • The invention further provides methods of obtaining, from other strains of T. pallidum, homologs of the fragments of the T. pallidum genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
  • A sequence or protein of T. pallidum is defined as a homolog of one of the T. pallidum fragments or contigs, or of a protein encoded by one of the ORFs of the present invention, if it shares significant homology to one of the fragments of the T. pallidum genome of the present invention or to a protein encoded by one of the ORFs of the present invention.
  • Using a sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
  • two nucleic acid molecules or proteins are said to "share significant homology" if the two contain regions which possess greater than 85% sequence (amino acid or nucleic acid) homology.
  • Preferred homologs in this regard are those with more than 90% homology.
  • Especially preferred are those with 93% or more homology.
  • those with 95% or more homology are particularly preferred.
  • Very particularly preferred among these are those with 97% or more homology, and even more particularly preferred among those are homologs with 99% or more homology.
  • the most preferred homologs among these are those with 99.9% homology or more. It will be understood that, among measures of homology, identity is particularly preferred in this regard.
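The homology thresholds recited above can be summarized in a short sketch. It assumes that identity is the measure of homology (which the text states is particularly preferred), that the two sequences have already been aligned to equal length with "-" marking gaps, and that the 97% tier means 97% or more. The function names and tier labels are illustrative only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap ('-') are ignored;
    a residue aligned against a gap counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Tiers as recited in the text, strongest first.
# Each entry: (threshold in percent, strict "greater than"?, label).
TIERS = [
    (99.9, False, "most preferred"),
    (99.0, False, "even more particularly preferred"),
    (97.0, False, "very particularly preferred"),
    (95.0, False, "particularly preferred"),
    (93.0, False, "especially preferred"),
    (90.0, True, "preferred"),
    (85.0, True, "significant homology"),
]


def homology_tier(pct: float) -> str:
    """Map a percent-identity value onto the strongest tier it satisfies."""
    for threshold, strict, label in TIERS:
        if pct > threshold or (not strict and pct == threshold):
            return label
    return "below threshold"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, homology_tier(pct))
```

Because the 85% and 90% tiers are worded as "greater than" and "more than", a value of exactly 85% or 90% falls into the tier below in this sketch.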
  • Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS: 1-744 or from a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ ID NOS: 1-744 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have been described in great detail in many publications such as, for example, Innis et al., PCR Protocols, Academic Press, San Diego, CA (1990).
  • When using primers derived from SEQ ID NOS: 1-744 or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-744, one skilled in the art will recognize that by employing high stringency conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to the primer will be amplified.
  • When using DNA probes derived from SEQ ID NOS: 1-744, or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-744, for colony/plaque hybridization, one skilled in the art will recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45% homologous to the probe will be obtained.
  • Any organism can be used as the source for homologs of the present invention so long as the organism naturally expresses such a protein or contains genes encoding the same.
  • The most preferred organisms for isolating homologs are bacteria which are closely related to T. pallidum.
  • Each ORF corresponding to the ORF IDs provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
  • One skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and industrial purposes consistent with the type of putative identification of the polypeptide.
  • Such identifications permit one skilled in the art to use the T. pallidum ORFs in a manner similar to the known type of sequences for which the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite.
  • Open reading frames encoding proteins involved in mediating the catalytic reactions involved in intermediary and macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions include those encoding enzymes involved in the degradation of the intermediary products of metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis. Such open reading frames can be used for industrial biosynthesis.
  • The various metabolic pathways present in T. pallidum can be identified based on absolute nutritional requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS: 1-744. Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.
  • Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to give unicellular fruits.
  • A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts et al., Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).
  • The metabolism of sugars is an important aspect of the primary metabolism of T. pallidum.
  • Enzymes involved in the degradation of sugars such as, particularly, glucose, galactose, fructose and xylose can be used in industrial fermentation.
  • Some of the important sugar transforming enzymes include sugar isomerases such as glucose isomerase.
  • Other metabolic enzymes have found commercial use, such as glucose oxidases, which produce ketogulonic acid (KGA).
  • KGA is an intermediate in the commercial production of ascorbic acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press, Weinheim, Germany (1984).
  • Glucose oxidase is commercially available and has been used in purified form as well as in an immobilized form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 7:21 (1979). The most important application of glucose oxidase (GOD) is the industrial-scale fermentation of gluconic acid. There is a market for gluconic acids, which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI; Benett et al., Eds., Academic Press, New York (1985).
  • Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).
  • Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:221 (1985) and Poserke, Journal of the American Oil Chemists' Society 67:1758 (1984).
  • a major use of lipases is in the fat and oil industry for the production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides.
  • Applications of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of washing procedures.
  • the following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides, reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming reactions such as the aldol reaction.
  • Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in nucleic acid synthesis, repair, and recombination.
  • proteins of the present invention can be used in a variety of procedures and methods known in the art which are currently applied to other proteins.
  • the proteins of the present invention can further be used to generate an antibody which selectively binds the protein.
  • T. pallidum protein-specific antibodies for use in the present invention can be raised against the intact T. pallidum protein or an antigenic polypeptide fragment thereof, which may be presented together with a carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if it is long enough (at least about 25 amino acids), without a carrier.
  • As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is meant to include intact molecules, single chain whole antibodies, and antibody fragments.
  • Antibody fragments of the present invention include Fab and F(ab')2 and other fragments including single-chain Fvs (scFv) and disulfide-linked Fvs (sdFv). Also included in the present invention are chimeric and humanized monoclonal antibodies and polyclonal antibodies specific for the polypeptides of the present invention.
  • the antibodies of the present invention may be prepared by any of a variety of methods.
  • cells expressing a polypeptide of the present invention or an antigenic fragment thereof can be administered to an animal in order to induce the production of sera containing polyclonal antibodies.
  • a preparation of T. pallidum polypeptide or fragment thereof is prepared and purified to render it substantially free of natural contaminants. Such a preparation is then introduced into an animal in order to produce polyclonal antisera of greater specific activity.
  • the antibodies of the present invention are monoclonal antibodies or binding fragments thereof.
  • Such monoclonal antibodies can be prepared using hybridoma technology. See, e.g., Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling, et al., in: MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS 563-681 (Elsevier, N.Y., 1981).
  • Fab and F(ab')2 fragments may be produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments).
  • T. pallidum polypeptide-binding fragments, chimeric, and humanized antibodies can be produced through the application of recombinant DNA technology or through synthetic chemistry using methods known in the art.
  • additional antibodies capable of binding to the polypeptide antigen of the present invention may be produced in a two-step procedure through the use of anti-idiotypic antibodies.
  • T. pallidum polypeptide-specific antibodies are used to immunize an animal, preferably a mouse.
  • the splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the T. pallidum polypeptide-specific antibody can be blocked by the T. pallidum polypeptide antigen.
  • Such antibodies comprise anti-idiotypic antibodies to the T. pallidum polypeptide-specific antibody and can be used to immunize an animal to induce formation of further T. pallidum polypeptide-specific antibodies.
  • Antibodies and fragments thereof of the present invention may be described by the portion of a polypeptide of the present invention recognized or specifically bound by the antibody.
  • Antibody binding fragments of a polypeptide of the present invention may be described or specified in the same manner as for polypeptide fragments discussed above, i.e., by N-terminal and C-terminal positions or by size in contiguous amino acid residues. Any number of antibody binding fragments of a polypeptide of the present invention, specified by N-terminal and C-terminal positions or by size in amino acid residues, as described above, may also be excluded from the present invention. Therefore, the present invention includes antibodies that specifically bind a particularly described fragment of a polypeptide of the present invention and allows for the exclusion of the same.
  • Antibodies and fragments thereof of the present invention may also be described or specified in terms of their cross-reactivity. Antibodies and fragments that do not bind polypeptides of any other species of Borrelia other than T. pallidum are included in the present invention. Likewise, antibodies and fragments that bind only species of Borrelia, i.e., antibodies and fragments that do not bind bacteria from any genus other than Borrelia, are included in the present invention.
  • Antibodies can be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well known in the art; for example, see Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976).
  • the labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the T. pallidum genome is expressed.
  • the present invention further provides the above-described antibodies immobilized on a solid support.
  • Solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34 Academic Press, N. Y. (1974)).
  • the immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immunoaffinity purification of the proteins of the present invention.
  • The invention provides peptides and polypeptides comprising epitope-bearing portions of the T. pallidum polypeptides of the present invention.
  • These epitopes are immunogenic or antigenic epitopes of the polypeptides of the present invention.
  • An "immunogenic epitope" is defined as a part of a protein that elicits an antibody response when the whole protein or polypeptide is the immunogen. These immunogenic epitopes are believed to be confined to a few loci on the molecule.
  • A region of a protein molecule to which an antibody can bind is defined as an "antigenic determinant" or "antigenic epitope."
  • The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, e.g., Geysen, et al. (1983) Proc. Natl. Acad. Sci. USA 81:3998-4002. Amino acid residues comprising antigenic epitopes may be determined by algorithms such as the Jameson-Wolf analysis or similar algorithms or by in vivo testing for an antigenic response using the methods described herein or those known in the art.
  • As to peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein molecule to which an antibody can bind), relatively short synthetic peptides that mimic part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein. See, e.g., Sutcliffe, et al., (1983) Science 219:660-666.
  • Peptides capable of eliciting protein-reactive sera are frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or carboxyl terminals. Peptides that are extremely hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies that bind to the mimicked protein; longer peptides, especially those containing proline residues, usually are effective. See, Sutcliffe, et al., supra, p. 661.
  • For instance, 18 of 20 peptides designed according to these guidelines, containing 8-39 residues and covering 75% of the sequence of the influenza virus hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and 18/18 from the rabies glycoprotein induced antibodies that precipitated the respective proteins.
  • Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention.
  • a high proportion of hybridomas obtained by fusion of spleen cells from donors immunized with an antigen epitope-bearing peptide generally secrete antibody reactive with the native protein. See Sutcliffe, et al., supra, p. 663.
  • the antibodies raised by antigenic epitope-bearing peptides or polypeptides are useful to detect the mimicked protein, and antibodies to different peptides may be used for tracking the fate of various regions of a protein precursor which undergoes post-translational processing.
  • the peptides and anti-peptide antibodies may be used in a variety of qualitative or quantitative assays for the mimicked protein, for instance in competition assays since it has been shown that even short peptides (e.g., about 9 amino acids) can bind and displace the larger peptides in immunoprecipitation assays. See, e.g., Wilson, et al., (1984) Cell 37:767-778.
  • the anti-peptide antibodies of the invention also are useful for purification of the mimicked protein, for instance, by adsorption chromatography using methods known in the art.
  • Antigenic epitope-bearing peptides and polypeptides of the invention designed according to the above guidelines preferably contain a sequence of at least seven, more preferably at least nine and most preferably between about 10 and about 50 amino acids (i.e., any integer between 7 and 50) contained within the amino acid sequence of a polypeptide of the invention.
  • peptides or polypeptides comprising a larger portion of an amino acid sequence of a polypeptide of the invention, containing about 50 to about 100 amino acids, or any length up to and including the entire amino acid sequence of a polypeptide of the invention also are considered epitope-bearing peptides or polypeptides of the invention and also are useful for inducing antibodies that react with the mimicked protein.
  • the amino acid sequence of the epitope-bearing peptide is selected to provide substantial solubility in aqueous solvents (i.e., the sequence includes relatively hydrophilic residues and highly hydrophobic sequences are preferably avoided); and sequences containing proline residues are particularly preferred.
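The selection guidelines above (a length of 7 to 50 residues, avoidance of highly hydrophobic sequences, and a preference for proline-containing sequences) can be sketched as a simple sliding-window screen. The use of the Kyte-Doolittle hydropathy scale, the cutoff of a mean hydropathy no greater than zero, and the function name `candidate_peptides` are assumptions chosen for illustration; the specification does not prescribe a particular scale or cutoff.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def candidate_peptides(protein: str, length: int = 10,
                       max_mean_hydropathy: float = 0.0):
    """List windows of `length` residues that pass the hydropathy screen.

    Returns tuples (1-based start, peptide, mean hydropathy, has_proline),
    sorted so that proline-containing and more hydrophilic peptides
    come first. Raises KeyError on non-standard residue letters.
    """
    if not 7 <= length <= 50:
        raise ValueError("length must be between 7 and 50 residues")
    protein = protein.upper()
    candidates = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        mean = sum(KYTE_DOOLITTLE[r] for r in peptide) / length
        if mean <= max_mean_hydropathy:
            candidates.append((start + 1, peptide, round(mean, 2),
                               "P" in peptide))
    # Proline-containing first (False sorts before True), then by hydropathy.
    candidates.sort(key=lambda c: (not c[3], c[2]))
    return candidates


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the function.
    for hit in candidate_peptides("MKDEPSKDERKLLLLLVVVVIIII", length=10):
        print(hit)
```

A screen of this kind only narrows the list of candidates; whether a given peptide actually elicits protein-reactive antibodies must be determined by immunization.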
  • the epitope-bearing peptides and polypeptides of the present invention may be produced by any conventional means for making peptides or polypeptides including recombinant means using nucleic acid molecules of the invention.
  • an epitope-bearing amino acid sequence of the present invention may be fused to a larger polypeptide which acts as a carrier during recombinant production and purification, as well as during immunization to produce anti-peptide antibodies.
  • Epitope-bearing peptides also may be synthesized using known methods of chemical synthesis. For instance, Houghten has described a simple method for synthesis of large numbers of peptides, such as 10-20 mg of 248 different 13 residue peptides representing single amino acid variants of a segment of the HA1 polypeptide which were prepared and characterized (by ELISA-type binding studies) in less than four weeks (Houghten, R. A. Proc. Natl. Acad. Sci.
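The figure of 248 peptides is consistent with one simple accounting, offered here as an assumption rather than as a statement of how the cited study was designed: a 13-residue parent peptide has 13 × 19 = 247 single amino acid variants, and the parent itself brings the total to 248. The sketch below enumerates such variants; the parent sequence is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_substitution_variants(parent: str) -> list:
    """Return every peptide differing from `parent` at exactly one position."""
    variants = []
    for position, original in enumerate(parent):
        for residue in AMINO_ACIDS:
            if residue != original:
                variants.append(
                    parent[:position] + residue + parent[position + 1:]
                )
    return variants


if __name__ == "__main__":
    parent = "ACDEFGHIKLMNP"  # hypothetical 13-residue parent peptide
    variants = single_substitution_variants(parent)
    print(len(variants))      # 13 positions x 19 alternatives
    print(len(variants) + 1)  # variants plus the parent peptide
```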
  • Epitope-bearing peptides and polypeptides of the invention are used to induce antibodies according to methods well known in the art. See, e.g., Sutcliffe, et al., supra; Wilson, et al., supra; and Bittle, et al. (1985) J. Gen. Virol. 66:2347-2354.
  • Animals may be immunized with free peptide; however, anti-peptide antibody titer may be boosted by coupling of the peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid.
  • peptides containing cysteine may be coupled to carrier using a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be coupled to carrier using a more general linking agent such as glutaraldehyde.
  • Animals such as rabbits, rats and mice are immunized with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg peptide or carrier protein and Freund's adjuvant.
  • booster injections may be needed, for instance, at intervals of about two weeks, to provide a useful titer of anti-peptide antibody which can be detected, for example, by ELISA assay using free peptide adsorbed to a solid surface.
  • the titer of anti-peptide antibodies in serum from an immunized animal may be increased by selection of anti-peptide antibodies, for instance, by adsorption to the peptide on a solid support and elution of the selected antibodies according to methods well known in the art.
  • Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response when the whole protein is the immunogen, are identified according to methods known in the art. For instance, Geysen, et al., supra, discloses a procedure for rapid concurrent synthesis on solid supports of hundreds of peptides of sufficient purity to react in an ELISA. Interaction of synthesized peptides with antibodies is then easily detected without removing them from the support. In this manner a peptide bearing an immunogenic epitope of a desired protein may be identified routinely by one of ordinary skill in the art.
  • the immunologically important epitope in the coat protein of foot-and-mouth disease virus was located by Geysen et al. supra with a resolution of seven amino acids by synthesis of an overlapping set of all 208 possible hexapeptides covering the entire 213 amino acid sequence of the protein. Then, a complete replacement set of peptides in which all 20 amino acids were substituted in turn at every position within the epitope were synthesized, and the particular amino acids conferring specificity for the reaction with antibody were determined.
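The count of 208 hexapeptides follows from the window arithmetic: a sequence of N residues contains N − k + 1 overlapping peptides of length k, so 213 − 6 + 1 = 208. A minimal sketch of generating such an overlapping set is given below; the function name and the placeholder sequence are illustrative assumptions, not the actual coat protein sequence.

```python
def overlapping_peptides(sequence: str, k: int = 6) -> list:
    """Return all overlapping peptides of length k, offset by one residue."""
    if k <= 0 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # Placeholder 213-residue sequence used only to check the count.
    placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 11)[:213]
    hexapeptides = overlapping_peptides(placeholder, k=6)
    print(len(placeholder), len(hexapeptides))
```

Each residue of the protein, apart from those near the two ends, is covered by six successive hexapeptides, which is what allows the reactive region to be localized by comparing neighbouring peptides.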
  • peptide analogs of the epitope-bearing peptides of the invention can be made routinely by this method.
  • U.S. Patent No. 4,708,781 to Geysen (1987) further describes this method of identifying a peptide bearing an immunogenic epitope of a desired protein.
  • U.S. Patent No. 5,194,392, to Geysen (1990) describes a general method of detecting or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of interest. More generally, U.S. Patent No. 4,433,092, also to Geysen (1989), describes a method of detecting or determining a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A.
  • polypeptides of the present invention and the epitope-bearing fragments thereof described above can be combined with parts of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides.
  • These fusion proteins facilitate purification and show an increased half-life in vivo. This has been shown, e.g., for chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins (EPA 0,394,827; Traunecker et al. (1988) Nature 331:84-86).
  • Fusion proteins that have a disulfide-linked dimeric structure due to the IgG part can also be more efficient in binding and neutralizing other molecules than a monomeric T. pallidum polypeptide or fragment thereof alone. See Fountoulakis et al. (1995) J. Biochem. 270:3958-3964. Nucleic acids encoding the above epitopes of T. pallidum polypeptides can also be recombined with a gene of interest as an epitope tag to aid in detection and purification of the expressed polypeptide.
  • the present invention further relates to methods for assaying Borrelia infection in an animal by detecting the expression of genes encoding Borrelia polypeptides of the present invention.
  • the methods comprise analyzing tissue or body fluid from the animal for Borrelia-specific antibodies, nucleic acids, or proteins. Nucleic acid specific to Borrelia is assayed by PCR or hybridization techniques using nucleic acid sequences of the present invention as either hybridization probes or primers. See, e.g., Sambrook et al. Molecular cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2nd ed., 1989, page 54 reference); Eremeeva et al. (1994) J. Clin. Microbiol.
  • the present invention is useful for monitoring progression or regression of the disease state whereby patients exhibiting enhanced Borrelia gene expression will experience a worse clinical outcome relative to patients expressing these gene(s) at a lower level.
  • "biological sample" means any biological sample obtained from an animal, cell line, tissue culture, or other source which contains Borrelia polypeptide, mRNA, or DNA.
  • Biological samples include body fluids (such as saliva, blood, plasma, urine, mucus, synovial fluid, etc.), tissues (such as muscle, skin, and cartilage), and any other biological source suspected of containing Borrelia polypeptides or nucleic acids. Methods for obtaining biological samples such as tissue are well known in the art.
  • the present invention is useful for detecting diseases related to Borrelia infections in animals.
  • Preferred animals include monkeys, apes, cats, dogs, birds, cows, pigs, mice, horses, rabbits and humans. Particularly preferred are humans.
  • Total RNA can be isolated from a biological sample using any suitable technique such as the single-step guanidinium-thiocyanate-phenol-chloroform method described in Chomczynski et al. (1987) Anal. Biochem. 162: 156-159.
  • mRNA encoding Borrelia polypeptides having sufficient homology to the nucleic acid sequences identified in SEQ ID NOS: 1-744 to allow for hybridization between complementary sequences is then assayed using any appropriate method. These include Northern blot analysis, S1 nuclease mapping, the polymerase chain reaction (PCR), reverse transcription in combination with the polymerase chain reaction (RT-PCR), and reverse transcription in combination with the ligase chain reaction (RT-LCR).
  • RNA is prepared from a biological sample as described above.
  • an appropriate buffer such as glyoxal/dimethyl sulfoxide/sodium phosphate buffer
  • the filter is prehybridized in a solution containing formamide, SSC, Denhardt's solution, denatured salmon sperm, SDS, and sodium phosphate buffer.
  • T. pallidum polynucleotide sequence shown in SEQ ID NOS: 1-744, or portion thereof, labeled according to any appropriate method is used as probe. After hybridization overnight, the filter is washed and exposed to x-ray film.
  • DNA for use as probe according to the present invention is described in the sections above and will preferably be at least 15 nucleotides in length.
  • S1 mapping can be performed as described in Fujita et al. (1987) Cell 49:357-367.
  • to prepare probe DNA for use in S1 mapping, the sense strand of an above-described T. pallidum DNA sequence of the present invention is used as a template to synthesize labeled antisense DNA.
  • the antisense DNA can then be digested using an appropriate restriction endonuclease to generate further DNA probes of a desired length.
  • Such antisense probes are useful for visualizing protected bands corresponding to the target mRNA (i.e., mRNA encoding Borrelia polypeptides).
  • RNA encoding Borrelia polypeptides is assayed, e.g., using the RT-PCR method described in Makino et al. (1990) Technique 2:295-301.
  • the radioactivities of the "amplicons" in the polyacrylamide gel bands are linearly related to the initial concentration of the target mRNA. Briefly, this method involves adding total RNA isolated from a biological sample in a reaction mixture containing a RT primer and appropriate buffer. After incubating for primer annealing, the mixture can be supplemented with a RT buffer, dNTPs, DTT, RNase inhibitor and reverse transcriptase.
  • the RT products are then subjected to PCR using labeled primers.
  • a labeled dNTP can be included in the PCR reaction mixture.
  • PCR amplification can be performed in a DNA thermal cycler according to conventional techniques. After a suitable number of rounds to achieve amplification, the PCR reaction mixture is electrophoresed on a polyacrylamide gel. After drying the gel, the radioactivity of the appropriate bands (corresponding to the mRNA encoding the Borrelia polypeptides of the present invention) is quantified using an imaging analyzer.
  • RT and PCR reaction ingredients and conditions, reagent and gel concentrations, and labeling methods are well known in the art.
  • PCR PRIMER: A LABORATORY MANUAL (C.W. Dieffenbach et al. eds., Cold Spring Harbor Lab Press, 1995).
  • the polynucleotides of the present invention may be used to detect polynucleotides of the present invention or Borrelia species including T. pallidum using bio chip technology.
  • the present invention includes both high density chip arrays (>1000 oligonucleotides per cm²) and low density chip arrays (<1000 oligonucleotides per cm²).
  • Bio chips comprising arrays of polynucleotides of the present invention may be used to detect Borrelia species, including T. pallidum, in biological and environmental samples and to diagnose an animal, including humans, with a T. pallidum or other Borrelia infection.
  • the bio chips of the present invention may comprise polynucleotide sequences of other pathogens including bacterial, viral, parasitic, and fungal polynucleotide sequences, in addition to the polynucleotide sequences of the present invention, for use in rapid differential pathogenic detection and diagnosis.
  • the bio chips can also be used to monitor a T. pallidum or other Borrelia infection and to monitor the genetic changes (deletions, insertions, mismatches, etc.) in response to drug therapy in the clinic and drug development in the laboratory.
  • the bio chip technology comprising arrays of polynucleotides of the present invention may also be used to simultaneously monitor the expression of a multiplicity of genes, including those of the present invention.
  • the polynucleotides used to comprise a selected array may be specified in the same manner as for the fragments, i.e., by their 5' and 3' positions or length in contiguous base pairs and include from.
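  • As an illustration of specifying an array polynucleotide by its 5' and 3' positions or by its length in contiguous base pairs, the following is a minimal sketch only. The function name `select_fragment`, the 1-based inclusive coordinate convention, and the example sequence are assumptions made for this sketch and are not taken from the specification or from SEQ ID NOS: 1-744.

```python
def select_fragment(sequence, five_prime, three_prime=None, length=None):
    """Return the sub-sequence named by a 5' start and either a 3' end or a length.

    Positions are 1-based and inclusive (an assumption made for this sketch).
    """
    # Exactly one of the two ways of naming the fragment must be used.
    if (three_prime is None) == (length is None):
        raise ValueError("give exactly one of three_prime or length")
    if length is not None:
        three_prime = five_prime + length - 1
    if not 1 <= five_prime <= three_prime <= len(sequence):
        raise ValueError("positions fall outside the sequence")
    return sequence[five_prime - 1:three_prime]


# Hypothetical 20-base sequence used only to exercise the function.
example = "ATGCGTACCGGATTACGCTA"
by_positions = select_fragment(example, 5, three_prime=19)
by_length = select_fragment(example, 5, length=15)
```

  • Under these assumptions both calls name the same 15-base fragment, so either form of specification could be used interchangeably when listing the members of an array.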
  • Methods and particular uses of the polynucleotides of the present invention to detect Borrelia species, including T. pallidum, using bio chip technology include those known in the art and those of: U.S. Patent Nos. 5510270, 5545531, 5445934, 5677195, 5532128, 5556752, 5527681, 5451683, 5424186, 5607646, 5658732 and World Patent Nos.
  • Biosensors using the polynucleotides of the present invention may also be used to detect, diagnose, and monitor T. pallidum or other Borrelia species and infections thereof. Biosensors using the polynucleotides of the present invention may also be used to detect particular polynucleotides of the present invention. Biosensors using the polynucleotides of the present invention may also be used to monitor the genetic changes (deletions, insertions, mismatches, etc.) in response to drug therapy in the clinic and drug development in the laboratory.
  • Methods and particular uses of the polynucleotides of the present invention to detect Borrelia species, including T. pallidum, using biosensors include those known in the art and those of: U.S. Patent Nos. 5721102, 5658732, 5631170, and World Patent Nos. WO97/35011, WO/9720203, each incorporated herein in their entireties.
  • the present invention includes both bio chips and biosensors comprising polynucleotides of the present invention and methods of their use.
  • Assaying Borrelia polypeptide levels in a biological sample can occur using any art-known method, such as antibody-based techniques.
  • Borrelia polypeptide expression in tissues can be studied with classical immunohistological methods.
  • the specific recognition is provided by the primary antibody (polyclonal or monoclonal) but the secondary detection system can utilize fluorescent, enzyme, or other conjugated secondary antibodies.
  • an immunohistological staining of tissue section for pathological examination is obtained.
  • Tissues can also be extracted, e.g., with urea and neutral detergent, for the liberation of Borrelia polypeptides for Western-blot or dot/slot assay. See, e.g., Jalkanen, M. et al. (1985) J.
  • Borrelia polypeptide-specific monoclonal antibodies can be used both as an immunoabsorbent and as an enzyme-labeled probe to detect and quantify a Borrelia polypeptide.
  • the amount of a Borrelia polypeptide present in the sample can be calculated by reference to the amount present in a standard preparation using a linear regression computer algorithm.
  • Such an ELISA is described in Iacobelli et al. (1988) Breast Cancer Research and Treatment 11:19-30.
  • two distinct specific monoclonal antibodies can be used to detect Borrelia polypeptides in a body fluid. In this assay, one of the antibodies is used as the immunoabsorbent and the other as the enzyme-labeled probe.
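  • The calculation of polypeptide amount by reference to a standard preparation using linear regression can be sketched as follows. This is an illustrative sketch, not the algorithm of any particular instrument or of the cited ELISA: the function names, the units, and the standard-curve values are hypothetical, and an ordinary least-squares straight line is assumed to describe the standards over the working range.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the fitted standard curve to estimate the amount in a sample."""
    return (signal - intercept) / slope


# Hypothetical standard preparation: amount per well vs. measured signal.
standard_amounts = [0.0, 1.0, 2.0, 4.0, 8.0]
standard_signals = [0.05, 0.25, 0.45, 0.85, 1.65]

slope, intercept = fit_line(standard_amounts, standard_signals)
sample_amount = amount_from_signal(0.65, slope, intercept)
```

  • The sample estimate is only meaningful when the sample signal falls within the range spanned by the standards; extrapolating beyond the standard curve is not supported by this sketch.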
  • the above techniques may be conducted essentially as a "one-step" or "two-step" assay.
  • the "one-step" assay involves contacting the Borrelia polypeptide with immobilized antibody and, without washing, contacting the mixture with the labeled antibody.
  • the "two-step" assay involves washing before contacting the mixture with the labeled antibody.
  • Other conventional methods may also be employed as suitable. It is usually desirable to immobilize one component of the assay system on a support, thereby allowing other components of the system to be brought into contact with the component and readily removed from the sample. Variations of the above and other immunological methods included in the present invention can also be found in Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988).
  • Suitable enzyme labels include, for example, those from the oxidase group, which catalyze the production of hydrogen peroxide by reacting with substrate.
  • Glucose oxidase is particularly preferred as it has good stability and its substrate (glucose) is readily available.
  • Activity of an oxidase label may be assayed by measuring the concentration of hydrogen peroxide formed by the enzyme-labeled antibody/substrate reaction.
  • radioisotopes such as iodine (125I, 121I), carbon (14C), sulphur (35S), tritium (3H), indium (112In), and technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and biotin.
  • suitable labels for the Borrelia polypeptide-specific antibodies of the present invention are provided below.
  • suitable enzyme labels include malate dehydrogenase, Borrelia nuclease, delta-5-steroid isomerase, yeast-alcohol dehydrogenase, alpha-glycerol phosphate dehydrogenase, triose phosphate isomerase, peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase, and acetylcholine esterase.
  • suitable radioisotopic labels include 3H, 111In, 125I, 131I, 32P, 35S, 14C, 51Cr,
  • 111In is a preferred isotope where in vivo imaging is used since it avoids the problem of dehalogenation of the 125I or 131I-labeled monoclonal antibody by the liver.
  • this radionuclide has a more favorable gamma emission energy for imaging. See, e.g., Perkins et al. (1985) Eur. J. Nucl. Med. 10:296-301; Carasquillo et al. (1987) J. Nucl. Med.
  • examples of suitable fluorescent labels include a 152Eu label, a fluorescein label, an isothiocyanate label, a rhodamine label, a phycoerythrin label, a phycocyanin label, an allophycocyanin label, an o-phthaldehyde label, and a fluorescamine label.
  • suitable toxin labels include Pseudomonas toxin, diphtheria toxin, ricin, and cholera toxin.
  • chemiluminescent labels include a luminol label, an isoluminol label, an aromatic acridinium ester label, an imidazole label, an acridinium salt label, an oxalate ester label, a luciferin label, a luciferase label, and an aequorin label.
  • nuclear magnetic resonance contrasting agents include heavy metal nuclei such as Gd, Mn, and iron.
  • Typical techniques for binding the above-described labels to antibodies are provided by Kennedy et al. (1976) Clin. Chim. Acta 70:1-31, and Schurs et al. (1977) Clin. Chim. Acta 81:1-40. Coupling techniques mentioned in the latter are the glutaraldehyde method, the periodate method, the dimaleimide method, the m-maleimidobenzyl-N-hydroxy-succinimide ester method, all of which methods are incorporated by reference herein.
  • the invention includes a diagnostic kit for use in screening serum containing antibodies specific against T. pallidum infection.
  • a kit may include an isolated T. pallidum antigen comprising an epitope which is specifically immunoreactive with at least one anti-T. pallidum antibody.
  • Such a kit also includes means for detecting the binding of said antibody to the antigen.
  • the kit may include a recombinantly produced or chemically synthesized peptide or polypeptide antigen. The peptide or polypeptide antigen may be attached to a solid support.
  • the detecting means of the above-described kit includes a solid support to which said peptide or polypeptide antigen is attached.
  • a kit may also include a non-attached reporter-labeled anti-human antibody.
  • binding of the antibody to the T. pallidum antigen can be detected by binding of the reporter labeled antibody to the anti-T. pallidum polypeptide antibody.
  • a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the DFs or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound DF or antibody.
  • a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another.
  • Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or DF.
  • the invention includes a method of detecting T. pallidum infection in a subject.
  • This detection method includes reacting a body fluid, preferably serum, from the subject with an isolated T. pallidum antigen, and examining the antigen for the presence of bound antibody.
  • the method includes a polypeptide antigen attached to a solid support, and serum is reacted with the support. Subsequently, the support is reacted with a reporter-labeled anti-human antibody. The support is then examined for the presence of reporter-labeled antibody.
  • the solid surface reagent employed in the above assays and kits is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, 96-well plates or filter material. These attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group, to a chemically reactive group on the solid support, such as an activated carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin coated plates can be used in conjunction with biotinylated antigen(s).
  • the polypeptides and antibodies of the present invention, including fragments thereof, may be used to detect Borrelia species including T. pallidum using bio chip and biosensor technology.
  • Bio chips and biosensors of the present invention may comprise the polypeptides of the present invention to detect antibodies which specifically recognize Borrelia species, including T. pallidum.
  • Bio chips and biosensors of the present invention may also comprise antibodies which specifically recognize the polypeptides of the present invention to detect Borrelia species, including T. pallidum, or specific polypeptides of the present invention.
  • Bio chips or biosensors comprising polypeptides or antibodies of the present invention may be used to detect Borrelia species, including T. pallidum, in biological and environmental samples and to diagnose an animal, including humans, with a T. pallidum or other Borrelia infection.
  • the present invention includes both bio chips and biosensors comprising polypeptides or antibodies of the present invention and methods of their use.
  • the bio chips of the present invention may further comprise polypeptide sequences of other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition to the polypeptide sequences of the present invention, for use in rapid differential pathogenic detection and diagnosis.
  • the bio chips of the present invention may further comprise antibodies or fragments thereof specific for other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition to the antibodies or fragments thereof of the present invention, for use in rapid differential pathogenic detection and diagnosis.
  • the bio chips and biosensors of the present invention may also be used to monitor a T.
  • the bio chips and biosensors comprising polypeptides or antibodies of the present invention may also be used to simultaneously monitor the expression of a multiplicity of polypeptides, including those of the present invention.
  • the polypeptides used to comprise a bio chip or biosensor of the present invention may be specified in the same manner as for the fragments, i.e., by their N-terminal and C-terminal positions or length in contiguous amino acid residues. Methods and particular uses of the polypeptides and antibodies of the present invention to detect Borrelia species, including T.
  • bio chip and biosensor technology examples include those known in the art, those of the U.S. Patent Nos. and World Patent Nos. listed above for bio chips and biosensors using polynucleotides of the present invention, and those of: U.S. Patent Nos. 5658732, 5135852, 5567301, 5677196, 5690894 and World Patent Nos. WO9729366, WO9612957, each incorporated herein in their entireties.
  • the present invention further provides methods of obtaining and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the fragments and the T. pallidum fragment and contigs herein described.
  • such methods comprise steps of:
  • agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents.
  • the agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.
  • agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention.
  • agents may be rationally selected or designed.
  • an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of the particular protein.
  • one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, for example see Hurby et al, "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al, Biochemistry 25:9230-8 (1989), or pharmaceutical agents, or the like.
  • one class of agents of the present invention can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control.
  • One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.
  • Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al, Nucl. Acids Res. 3:173 (1979); Cooney et al, Science 241:456 (1988); and Dervan et al, Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)).
  • Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
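  • The design of an antisense oligonucleotide of 20 to 40 bases complementary to a region of an mRNA can be sketched as below. This is a simplified illustration only: it computes the reverse complement as a DNA oligonucleotide and does not model binding affinity, secondary structure, or specificity. The function name `antisense_oligo`, the 0-based start index, and the example mRNA are assumptions made for this sketch.

```python
# Map each RNA base of the target to the complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length):
    """Return a DNA oligo (written 5'->3') complementary to mrna[start:start+length].

    start is a 0-based index into the mRNA (an assumption of this sketch);
    length is restricted to the 20-40 base range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("target region runs past the end of the mRNA")
    if set(target) - set("ACGU"):
        raise ValueError("mRNA may contain only A, C, G and U")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


# Hypothetical mRNA used only to exercise the function.
example_mrna = "A" * 10 + "C" * 10 + "G" * 5
oligo = antisense_oligo(example_mrna, 0, 20)
```

  • For the hypothetical input above, the target region is ten A residues followed by ten C residues, so the resulting oligonucleotide is ten G residues followed by ten T residues.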
  • the present invention further provides pharmaceutical agents which can be used to modulate the growth or pathogenicity of T. pallidum, or another related organism, in vivo or in vitro.
  • a "pharmaceutical agent" is defined as a composition of matter which can be formulated using known techniques to provide a pharmaceutical composition.
  • the "pharmaceutical agents of the present invention" refers to the pharmaceutical agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are identified using the herein described assays.
  • a pharmaceutical agent is said to "modulate the growth or pathogenicity of T. pallidum or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the organism in question.
  • the pharmaceutical agents of the present invention can modulate the growth or pathogenicity of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth by binding to an important protein thus blocking the biological activity of the protein, while other agents may bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone to attack by the body's natural immune system.
  • the agent may comprise a protein encoded by one of the ORFs of the present invention and serve as a vaccine. The development and use of a vaccine based on outer membrane components are well known in the art.
  • a "related organism" is a broad term which refers to any organism whose growth can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As such, related organisms do not need to be bacterial but may be fungal or viral pathogens.
  • the pharmaceutical agents and compositions of the present invention may be administered in a convenient manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes.
  • the pharmaceutical compositions are administered in an amount which is effective for treating and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most cases, the dosage is from about 0J mg/kg to about 10 g/kg body weight daily, taking into account the routes of administration, symptoms, etc.
  • the agents of the present invention can be used in native form or can be modified to form a chemical derivative.
  • a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.
  • such moieties may change an immunological character of the functional derivative, such as affinity for a given antibody.
  • Such changes in immunomodulation activity are measured by the appropriate assay, such as a competitive type immunoassay.
  • Modifications of such protein properties as redox or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers also may be effected in this way and can be assayed by methods well known to the skilled artisan.
  • the therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single or multiple injections.
  • the dosage of the administered agent will vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered.
  • the therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.
  • two or more compounds or agents are said to be administered "in combination" with each other when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can be measured at the same time.
  • the composition of the present invention can be administered concurrently with, prior to, or following the administration of the other agent.
  • the agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to decrease the rate of growth (as defined above) of the target organism.
  • the administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
  • the agent(s) are provided in advance of any symptoms indicative of the organism's growth.
  • the prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of any subsequent infection.
  • the agent(s) are provided at (or shortly after) the onset of an indication of infection.
  • the therapeutic administration of the compound(s) serves to attenuate the pathological symptoms of the infection and to increase the rate of recovery.
  • the agents of the present invention are administered to a subject, such as a mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically effective concentration.
  • a composition is said to be "pharmacologically acceptable" if its administration can be tolerated by a recipient patient.
  • Such an agent is said to be administered in a "therapeutically effective amount" if the amount administered is physiologically significant.
  • An agent is physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.
  • the agents of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby these materials, or their functional derivatives, are combined in a mixture with a pharmaceutically acceptable carrier vehicle.
  • compositions suitable for effective administration will contain an effective amount of one or more of the agents of the present invention, together with a suitable amount of carrier vehicle. Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
  • the controlled delivery may be effectuated by a variety of well known techniques, including formulation with macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired time course of release.
  • Another possible method to control the duration of action by controlled release preparations is to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers.
  • microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions.
  • the invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention.
  • Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.
  • the agents of the present invention may be employed in conjunction with other therapeutic compounds.
  • the present invention further demonstrates that a large sequence can be sequenced using a random shotgun approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.
  • the probability that any given base has not been sequenced is the same as the probability that any region of the whole sequence L has not been determined and, therefore, is equivalent to the fraction of the whole sequence that has yet to be determined.
  • at 1X coverage, approximately 37% of a polynucleotide of size L (in nucleotides) has not been sequenced.
  • when coverage is 5X for a 2.8 Mb sequence, the unsequenced fraction drops to 0.0067, or 0.67%.
  • 5X coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with an average sequence read length of 410 bp.
  • 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.
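The coverage figures above can be reproduced with a short calculation. The following is a minimal sketch, assuming the usual Poisson model of random shotgun sequencing, in which the unsequenced fraction is e^(-coverage), the expected number of gaps is n·e^(-coverage) for n reads, and the average gap size is taken as L/n. The function and constant names are illustrative and do not come from the source.

```python
import math


def coverage(num_reads, read_length, genome_length):
    """Average fold coverage: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length


def unsequenced_fraction(fold_coverage):
    """Poisson estimate of the fraction of bases hit by no read: e^(-coverage)."""
    return math.exp(-fold_coverage)


def expected_gaps(num_reads, fold_coverage):
    """Expected number of gaps between contigs: n * e^(-coverage)."""
    return num_reads * math.exp(-fold_coverage)


def mean_gap_length(genome_length, num_reads):
    """Average gap size, taken here as L / n (an assumption of this sketch)."""
    return genome_length / num_reads


GENOME_BP = 2_800_000   # 2.8 Mb sequence, from the text
NUM_READS = 2 * 17_000  # 17,000 clones, each read from both insert ends
READ_BP = 410           # average read length, from the text

fold = coverage(NUM_READS, READ_BP, GENOME_BP)
print(f"coverage: {fold:.2f}X")
print(f"unsequenced fraction at 1X: {unsequenced_fraction(1):.2f}")
print(f"unsequenced fraction at 5X: {unsequenced_fraction(5):.4f}")
print(f"expected gaps: {expected_gaps(NUM_READS, fold):.0f}")
print(f"mean gap length: {mean_gap_length(GENOME_BP, NUM_READS):.0f} bp")
```

With these inputs the model gives slightly under 5X coverage and roughly 234 gaps of about 82 bp, which is in line with the rounded figures quoted in the text.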
  • T. pallidum DNA is prepared by phenol extraction. A mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The sheared DNA is ethanol precipitated and redissolved in 500 µl TE buffer.
  • a 100 µl aliquot of the resuspended DNA is digested with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200 µl BAL31 buffer.
  • the digested DNA is phenol-extracted, ethanol-precipitated, redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting temperature agarose gel.
  • the section containing DNA fragments 1.6-2.0 kb in size is excised from the gel, the LGT agarose is melted, and the resulting solution is extracted with phenol to separate the agarose from the DNA.
  • DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for ligation to vector.
  • a two-step ligation procedure is used to produce a plasmid library with 97% inserts, of which >99% were single inserts.
  • the first ligation mixture (50 µl) contains 2 µg of DNA fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and is incubated at 14°C for 4 hr.
  • the ligation mixture then is phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in 20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel.
  • Discrete bands in a ladder are visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc.
  • the portion of the gel containing v+i DNA is excised and the v+i DNA is recovered and resuspended into 20 µl TE.
  • the v+i DNA then is blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+i linears, 500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), under recommended buffer conditions.
  • the repaired v+i linears are dissolved in 20 µl TE.
  • the final ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+i linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture is stored at -20°C.
  • This two-stage procedure results in a molecularly random collection of single-insert plasmid recombinants with minimal contamination from double-insert chimeras (<1%) or free vector (<3%).
  • E. coli host cells deficient in all recombination and restriction functions are used to prevent rearrangements, deletions, and loss of clones by restriction.
  • transformed cells are plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells. Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a chilled Falcon 2059 tube on ice.
  • a 1.7 µl aliquot of 1.42 M beta-mercaptoethanol is added to the aliquot of cells to a final concentration of 25 mM.
  • Cells are incubated on ice for 10 min.
  • a 1 µl aliquot of the final ligation is added to the cells and incubated on ice for 30 min.
  • the cells are heat pulsed for 30 sec. at 42°C and placed back on ice for 2 min.
  • the outgrowth period in liquid culture is eliminated from this protocol in order to minimize the preferential growth of any given transformed cell.
  • the transformation mixture is plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media).
  • the 5 ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar.
  • the 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar.
  • the 15 ml top layer is poured just prior to plating.
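The stock additions in the plating protocol can be checked with a simple dilution calculation. This is a minimal sketch using the volumes and stock concentrations given above; the helper name is illustrative, and it assumes the added stock volume counts toward the final volume.

```python
def final_concentration(stock_conc, stock_volume, base_volume):
    """Concentration after adding stock_volume of stock to base_volume.

    Volumes must share one unit; the result is in the unit of stock_conc.
    """
    return stock_conc * stock_volume / (stock_volume + base_volume)


# 1.7 µl of 1.42 M beta-mercaptoethanol added to 100 µl of cells
bme_mM = final_concentration(1.42, 1.7, 100.0) * 1000

# 0.4 ml of 50 mg/ml ampicillin added to 100 ml of SOB agar
amp_ug_per_ml = final_concentration(50.0, 0.4, 100.0) * 1000

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {amp_ug_per_ml:.0f} µg/ml")
```

The beta-mercaptoethanol result comes out a little below the nominal 25 mM stated in the protocol, which is expected for a rounded target value.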
  • High quality double stranded DNA plasmid templates are prepared using a "boiling bead" method developed in collaboration with Advanced Genetic Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991); Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a 96-well format for all stages of DNA preparation from bacterial growth through final DNA purification. Template concentration is determined using Hoechst Dye and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding templates are identified where possible and not sequenced.
  • T. pallidum DNA (> 100 kb) is partially digested in a reaction mixture (200 µl) containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested DNA is phenol-extracted and electrophoresed on a 0.5% low melting agarose gel at 2 V/cm for 7 hours.
  • Fragments from 15 to 25 kb are excised and recovered in a final volume of 6 µl.
  • One µl of fragments is used with 1 µl of DASH II vector (Stratagene) in the recommended ligation reaction.
  • One µl of the ligation mixture is used per packaging reaction following the recommended protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage are plated directly without amplification from the packaging mixture (after dilution with 500 µl of recommended SM buffer and chloroform treatment). Yield is about 2.5 x 10^3 pfu/µl.
  • the amplified library is prepared essentially as above except the lambda GEM-12 vector is used.
  • 3.5 x 10^4 pfu are plated on the restrictive NM539 host.
  • the lysate is harvested in 2 ml of SM buffer and stored frozen in 7% dimethylsulfoxide.
  • the phage titer is approximately 1 x 10^9 pfu/ml.
  • Liquid lysates (100 µl) are prepared from randomly selected plaques (from the unamplified library) and template is prepared by long-range PCR using T7 and T3 vector-specific primers.
  • Sequencing reactions are carried out on plasmid and/or PCR templates using the AB Catalyst LabStation with Applied Biosystems PRISM Ready Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers (Adams et al., Nature 368:414 (1994)).
  • Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits.
  • T7 and SP6 primers are used to sequence the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are performed by eight individuals using an average of fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read distance. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
  • the Catalyst robot is a publicly available sophisticated pipetting and temperature control robot which has been developed specifically for DNA sequencing reactions.
  • the Catalyst combines pre-aliquoted templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and reaction buffer. Reaction mixes and templates are combined in the wells of an aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear amplification (i.e., one primer synthesis) steps are performed including denaturation, annealing of primer and template, and extension; i.e., DNA synthesis.
  • a heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for an oil overlay.
  • Two sequencing protocols are used: one for dye-labelled primers and a second for dye-labelled dideoxy chain terminators.
  • the shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye, permitting the four individual reactions to be combined into one lane of the 373 DNA Sequencer for electrophoresis, detection, and base-calling.
  • ABI currently supplies pre-mixed reaction mixes in bulk packages containing all the necessary non-template reagents for sequencing.
  • Sequencing can be done with both plasmid and PCR-generated templates with both dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable sequences. Thirty-two reactions are loaded per AB373 Sequencer each day, for a total of 960 samples. Electrophoresis is run overnight following the manufacturer's protocols, and the data is collected for twelve hours. Following electrophoresis and fluorescence detection, the ABI 373 performs automatic lane tracking and base-calling. The lane tracking is confirmed visually. Each sequence electropherogram (or fluorescence lane trace) is inspected visually and assessed for quality.
  • Trailing sequences of low quality are removed and the sequence itself is loaded via software to a Sybase database (archived daily to 8mm tape).
  • Leading vector polylinker sequence is removed automatically by a software program.
  • Average edited lengths of sequences from the standard ABI 373 are around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
  • ABI 373 Sequencers converted to Stretch Liners provide a longer electrophoresis path prior to fluorescence detection and increase the average number of usable bases to 500-600 bp.
  • TIGR Assembler, developed for the rapid and accurate assembly of thousands of sequence fragments, was employed to generate contigs.
  • the TIGR assembler simultaneously clusters and assembles fragments of the genome.
  • the algorithm builds a hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
  • TIGR Assembler extends the current contig by attempting to add the best matching fragment based on oligonucleotide content.
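The hash-table step described above can be illustrated with a small k-mer index. This is a minimal sketch, not the TIGR Assembler implementation: it indexes every 12 bp subsequence, counts how many 12-mers each pair of fragments shares, and tallies the number of potential overlap partners per fragment. It ignores reverse-complement matches for brevity, and all names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length used for the hash table, from the text


def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment names that contain it."""
    index = defaultdict(set)
    for name, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def potential_overlaps(fragments, k=K):
    """Count shared k-mers for every pair of fragments sharing at least one."""
    shared = defaultdict(int)
    for names in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_partner_counts(shared):
    """Number of potential overlap partners per fragment.

    Unusually high counts hint that a fragment lies in a repetitive element.
    """
    counts = defaultdict(int)
    for a, b in shared:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


# Toy fragments: f1 and f2 share a 17 bp region; f3 is unrelated.
core = "ATCGATCCGTAAGGCTA"
fragments = {
    "f1": "TTGACCGG" + core,
    "f2": core + "GGCATTCAGT",
    "f3": "CCCCCCAAAAAATTTTTTGGGGGG",
}
shared = potential_overlaps(fragments)
print(shared)
print(overlap_partner_counts(shared))
```

Ranking candidate fragments by shared k-mer count is one plausible reading of "best matching fragment based on oligonucleotide content"; the source does not specify the ranking rule.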
  • the contig and candidate fragment are aligned using a modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., Methods in Enzymology 164:165 (1988)).
  • the contig is extended by the fragment only if strict criteria for the quality of the match are met.
  • the match criteria include the minimum length of overlap, the maximum length of an unmatched end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal coverage and raised in regions with a possible repetitive element.
  • Fragments representing the boundaries of repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of alignments and excluded from the current contig.
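The alignment and acceptance test described above can be sketched as follows. This is a plain textbook Smith-Waterman local alignment with a linear gap penalty, not the modified version used by TIGR Assembler, and the scoring values and the three thresholds (minimum overlap, maximum unmatched end, minimum identity) are hypothetical values chosen only for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment; returns (aligned_a, aligned_b, a_start, a_end, b_start, b_end).

    Coordinates are 0-based and end-exclusive.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_i, best_j = 0, 0, 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_i, best_j = score[i][j], i, j
    # Trace back from the best-scoring cell until the score drops to zero.
    i, j, out_a, out_b = best_i, best_j, [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), i, best_i, j, best_j


def accept_overlap(contig, fragment, min_overlap=15, max_unmatched_end=15,
                   min_identity=0.95):
    """Apply the three match criteria named in the text (thresholds are hypothetical)."""
    al_c, al_f, c_start, c_end, f_start, f_end = smith_waterman(contig, fragment)
    overlap = len(al_c)
    if overlap < min_overlap:
        return False
    identity = sum(x == y for x, y in zip(al_c, al_f)) / overlap
    # An unmatched end is sequence that both reads carry beyond the alignment.
    left_unmatched = min(c_start, f_start)
    right_unmatched = min(len(contig) - c_end, len(fragment) - f_end)
    return (identity >= min_identity
            and left_unmatched <= max_unmatched_end
            and right_unmatched <= max_unmatched_end)


core = "ATCGATCCGTAAGGCTA"
print(accept_overlap("TTGACCGG" + core, core + "GGCATTCAGT"))
print(accept_overlap("GGGTTTGGGTTTGGGTTTGG" + core, "CACACACACACACACACACA" + core))
```

The second pair mimics a chimeric or repeat-boundary fragment: the shared region aligns perfectly, but both sequences carry 20 unaligned bases on the same side, so the unmatched-end criterion rejects it. Lowering or raising the thresholds per region, as the text describes, would amount to passing different keyword arguments.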
  • TIGR Assembler is designed to take advantage of clone size information coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from two ends of the same template point toward one another in the contig and are located within a certain range of base pairs (definable for each clone based on the known clone size range for a given library). The process resulted in 744 contigs as represented by SEQ ID NOs: 1-744.
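The clone-end constraint described above can be expressed as a small consistency check. This is a minimal sketch under stated assumptions: reads are represented by their contig coordinates and strand, the clone span is measured from the start of the left read to the end of the right read, and the 1.6-2.0 kb range is the plasmid insert size given earlier in the text. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class PlacedRead:
    start: int      # leftmost contig coordinate (0-based)
    end: int        # rightmost contig coordinate (exclusive)
    forward: bool   # True if the read runs left-to-right in the contig


def mates_consistent(read1, read2, min_clone, max_clone):
    """Check that two end reads of one clone point toward each other and
    span a distance within the clone size range for their library."""
    left, right = (read1, read2) if read1.start <= read2.start else (read2, read1)
    if not (left.forward and not right.forward):
        return False
    span = right.end - left.start
    return min_clone <= span <= max_clone


# Plasmid library inserts were size-selected at 1.6-2.0 kb.
a = PlacedRead(start=100, end=510, forward=True)
b = PlacedRead(start=1500, end=1910, forward=False)
print(mates_consistent(a, b, 1600, 2000))
```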
  • the predicted coding regions of the T. pallidum genome were initially defined with the program GeneMark, which finds ORFs using a probabilistic classification technique.
  • the predicted coding region sequences were used in searches against a database of all nucleotide sequences from GenBank (June, 1997), using the BLASTN search method to identify overlaps of 50 or more nucleotides with at least a 95% identity.
  • Those ORFs with nucleotide sequence matches are shown in Table 1.
  • the ORFs without such matches were translated to protein sequences and compared to a non-redundant database of known proteins generated by combining the Swiss-prot, PIR and GenPept databases.
  • ORFs that matched a database protein with BLASTP probability less than or equal to 0.01 are shown in Table 2.
  • the table also lists assigned functions based on the closest match in the databases. ORFs that did not match protein or nucleotide sequences in the databases at these levels are shown in Table 3.
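The sorting of ORFs into Tables 1, 2 and 3 follows directly from the thresholds stated above. The following is a minimal sketch of that decision rule; the record fields are hypothetical names for values that would be parsed from BLASTN and BLASTP reports, and no BLAST software is invoked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrfHits:
    name: str
    blastn_overlap_nt: int = 0          # length of best nucleotide match
    blastn_identity_pct: float = 0.0    # percent identity of that match
    blastp_probability: Optional[float] = None  # best protein match, if any


def assign_table(orf):
    """Return 1, 2 or 3 according to the criteria described in the text."""
    if orf.blastn_overlap_nt >= 50 and orf.blastn_identity_pct >= 95.0:
        return 1  # nucleotide match in GenBank
    if orf.blastp_probability is not None and orf.blastp_probability <= 0.01:
        return 2  # protein match in the non-redundant protein database
    return 3      # no match at these levels


orfs = [
    OrfHits("orf_a", blastn_overlap_nt=120, blastn_identity_pct=98.5),
    OrfHits("orf_b", blastn_overlap_nt=40, blastn_identity_pct=99.0,
            blastp_probability=1e-20),
    OrfHits("orf_c", blastp_probability=0.5),
]
for orf in orfs:
    print(orf.name, "-> Table", assign_table(orf))
```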
  • Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the methods known in the art.
  • the protein can also be produced in a recombinant prokaryotic expression system, such as E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml.
  • Monoclonal or polyclonal antibody to the protein can then be prepared as follows.
  • Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media).
  • the successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
  • Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular Biology, Elsevier, New York. Section 21-2 (1989).
  • Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).
  • Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, Wier, D., ed, Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap.
  • Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample.
  • antibodies are useful in various animal models of pneumococcal disease as a means of evaluating the protein used to make the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic or immunoprophylactic reagent.
  • SEQ ID NOS: 1-744 can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses.
  • the PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same.
  • the PCR primers and amplified DNA of this Example find use in the Examples that follow.
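The primer selection guidance above (length of at least 15 bases and matched G/C content so that melting temperatures are similar) can be checked programmatically. This is a minimal sketch: the melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C), a rough estimate for short oligonucleotides that is not taken from the source, and the 10% tolerance on G/C difference is a hypothetical threshold.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in °C by the Wallace rule: 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def primer_pair_ok(forward, reverse, min_length=15, max_gc_difference=0.10):
    """Check minimum length and similar G/C content for a primer pair."""
    if len(forward) < min_length or len(reverse) < min_length:
        return False
    return abs(gc_fraction(forward) - gc_fraction(reverse)) <= max_gc_difference


fwd = "ATGCATGCATGCATGCAT"  # arbitrary 18-mer for illustration
rev = "TACGTACGTACGTACGTA"  # arbitrary 18-mer for illustration
print(gc_fraction(fwd), wallace_tm(fwd))
print(gc_fraction(rev), wallace_tm(rev))
print(primer_pair_ok(fwd, rev))
```

Setting `min_length=18` applies the more preferred length stated in the text.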
  • T. pallidum strain B3 IPU has been deposited as a convenient source for obtaining a T. pallidum strain, although a wide variety of T. pallidum strains known in the art can be used.
  • T. pallidum genomic DNA is prepared using the following method.
  • a 20 ml overnight bacterial culture grown in a rich medium (e.g., Trypticase Soy Broth, Brain Heart Infusion broth or Super broth)
  • TES (Tris pH 8.0, 25 mM EDTA, 50 mM NaCl)
  • TES high salt TES
  • Lysostaphin is added to a final concentration of approximately 50 µg/ml and the mixture is rotated slowly for 1 hour at 37°C to make protoplast cells.
  • the solution is then placed in an incubator (or in a shaking water bath) and warmed to 55°C.
  • a plasmid is directly isolated by screening a plasmid T. pallidum genomic DNA library using a polynucleotide probe corresponding to a polynucleotide of the present invention.
  • a specific polynucleotide with 30-40 nucleotides is synthesized using an Applied Biosystems DNA synthesizer according to the sequence reported.
  • the oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide kinase and purified according to routine methods.
  • the library is transformed into a suitable host, as indicated above (such as XL-1 Blue (Stratagene)) using techniques known to those of skill in the art. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989).
  • the transformants are plated on 1.5% agar plates (containing the appropriate selection agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. These plates are screened using Nylon membranes according to routine methods for bacterial colony screening. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989) or other techniques known to those of skill in the art.
  • two primers of 15-25 nucleotides derived from the 5' and 3' ends of a polynucleotide of SEQ ID NOS: 1-744 are synthesized and used to amplify the desired DNA by PCR using a T. pallidum genomic DNA prep as a template.
  • PCR is carried out under routine conditions, for instance, in 25 µl of reaction mixture with 0.5 µg of the above DNA template.
  • a convenient reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq polymerase.
  • overlapping oligos of the DNA sequences of SEQ ID NOS: 1-744 can be chemically synthesized and used to generate a nucleotide sequence of desired length using PCR methods known in the art.
  • the bacterial expression vector pQE60 is used for bacterial expression of some of the polypeptide fragments of the present invention (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). pQE60 encodes ampicillin antibiotic resistance ("Ampr") and contains a bacterial origin of replication ("ori"), an IPTG inducible promoter, a ribosome binding site ("RBS"), six codons encoding histidine residues that allow affinity purification using nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin (QIAGEN, Inc., supra) and suitable single restriction enzyme cleavage sites.
  • the DNA sequence encoding the desired portion of a T. pallidum protein of the present invention is amplified from T. pallidum genomic DNA using PCR oligonucleotide primers which anneal to the 5' and 3' sequences coding for the portions of the T. pallidum polynucleotide shown in SEQ ID NOS: 1-744. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' sequences, respectively.
  • the 5' primer has a sequence containing an appropriate restriction site followed by nucleotides of the amino terminal coding sequence of the desired T. pallidum polynucleotide sequence in SEQ ID NOS: 1-744.
  • the 3' primer has a sequence containing an appropriate restriction site followed by nucleotides complementary to the 3' end of the polypeptide coding sequence of SEQ ID NOS: 1-744, excluding a stop codon, with the coding sequence aligned with the restriction site so as to maintain its reading frame with that of the six His codons in the pQE60 vector.
  • the amplified T. pallidum DNA fragment and the vector pQE60 are digested with restriction enzymes which recognize the sites in the primers and the digested DNAs are then ligated together.
  • the T. pallidum DNA is inserted into the restricted pQE60 vector in a manner which places the T. pallidum protein coding region downstream from the IPTG-inducible promoter and in-frame with an initiating AUG and the six histidine codons.
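The primer layout described above (restriction site followed by coding sequence at the 5' end; restriction site followed by the reverse complement of the 3' end, with the stop codon excluded) can be sketched in code. This is an illustration only: the two restriction site strings are placeholders that would have to match the cloning sites actually used in the vector, the annealing length is a hypothetical parameter, and any extra bases needed to keep the insert in frame with the six His codons are not modelled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(coding_sequence, site_5prime, site_3prime, anneal_length=20):
    """Return (5' primer, 3' primer) for cloning a coding sequence.

    The stop codon, if present, is dropped so that translation can continue
    into a C-terminal tag encoded by the vector.
    """
    cds = coding_sequence.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    forward = site_5prime + cds[:anneal_length]
    reverse = site_3prime + reverse_complement(cds[-anneal_length:])
    return forward, reverse


# Toy coding sequence and placeholder sites, for illustration only.
toy_cds = "ATGGCTAGCAAAGATCTGCCGTAA"
fwd, rev = design_primers(toy_cds, site_5prime="GGATCC", site_3prime="AAGCTT",
                          anneal_length=9)
print(fwd)
print(rev)
```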
  • E. coli strain M15/rep4 containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein.
  • This strain which is only one of many that are suitable for expressing a T. pallidum polypeptide, is available commercially (QIAGEN, Inc., supra).
  • Transformants are identified by their ability to grow on LB agar plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.
  • Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 µg/ml) and kanamycin (25 µg/ml).
  • the O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250.
  • the cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6.
  • Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours.
  • the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the T. pallidum polypeptide is eluted with 6 M guanidine-HCl, pH 5.
  • the purified protein is then renatured by dialyzing it against phosphate-buffered saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl.
  • the protein could be successfully refolded while immobilized on the Ni-NTA column.
  • the recommended conditions are as follows: renature using a linear 6M-1M urea gradient in 500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors.
  • the renaturation should be performed over a period of 1.5 hours or more. After renaturation the proteins can be eluted by the addition of 250 mM imidazole.
  • Imidazole is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer plus 200 mM NaCl.
  • the purified protein is stored at 4°C or frozen at -80°C.
  • the polypeptides of the present invention are also prepared using a non-denaturing protein purification method.
  • Absorbance at 550 nm is approximately 10-20 O.D./ml.
  • the suspension is then put through three freeze/thaw cycles from -70°C (using an ethanol-dry ice bath) up to room temperature.
  • the cells are lysed via sonication in short 10 sec bursts over 3 minutes at approximately 80W while kept on ice.
  • the sonicated sample is then centrifuged at 15,000 RPM for 30 minutes at 4°C.
  • the supernatant is passed through a column containing 1.0 ml of CL-4B resin to pre-clear the sample of any proteins that may bind to agarose non-specifically, and the flow-through fraction is collected.
  • Buffer B: 50 mM Na-Phosphate, 300 mM NaCl, 10% Glycerol, 10 mM 2-mercaptoethanol, 500 mM Imidazole; the pH of the final buffer should be 7.5.
  • the protein is eluted off of the column with a series of increasing Imidazole solutions made by adjusting the ratios of Lysis Buffer A to Buffer B. Three different concentrations are used: 3 volumes of 75 mM Imidazole, 3 volumes of 150 mM Imidazole, 5 volumes of 500 mM Imidazole.
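The step elution described above is made by blending Lysis Buffer A with Buffer B (500 mM imidazole). The mixing ratio for each step follows from linear interpolation between the two imidazole concentrations. This is a minimal sketch: the imidazole concentration of Lysis Buffer A is not stated in this section, so the 10 mM value used below is purely a hypothetical placeholder, and the function name is illustrative.

```python
def fraction_buffer_b(target_mM, buffer_a_mM, buffer_b_mM=500.0):
    """Volume fraction of Buffer B needed to reach target_mM imidazole
    when mixing Buffer A (buffer_a_mM) with Buffer B (buffer_b_mM)."""
    if buffer_b_mM <= buffer_a_mM:
        raise ValueError("Buffer B must have more imidazole than Buffer A")
    if not buffer_a_mM <= target_mM <= buffer_b_mM:
        raise ValueError("target must lie between the two buffer concentrations")
    return (target_mM - buffer_a_mM) / (buffer_b_mM - buffer_a_mM)


ASSUMED_BUFFER_A_MM = 10.0  # hypothetical; not given in this section

# (target imidazole in mM, column volumes) for the three steps in the text
STEPS = [(75.0, 3), (150.0, 3), (500.0, 5)]

for target, volumes in STEPS:
    frac_b = fraction_buffer_b(target, ASSUMED_BUFFER_A_MM)
    print(f"{target:.0f} mM, {volumes} volumes: "
          f"{frac_b:.1%} Buffer B, {1 - frac_b:.1%} Buffer A")
```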
  • the fractions containing the purified protein are analyzed using 8%, 10% or 14% SDS-PAGE depending on the protein size.
  • the purified protein is then dialyzed 2X against phosphate-buffered saline (PBS) in order to place it into an easily workable buffer.
  • the purified protein is stored at 4°C or frozen at -80°C.
  • the following alternative method may be used to purify a T. pallidum polypeptide expressed in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C. Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm.
  • On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.
  • the cells are then lysed by passing the solution through a microfluidizer (Microfluidics Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi.
  • the homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min.
  • the resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.
  • the resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the T. pallidum polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction. Following high speed centrifugation (30,000 x g) to remove insoluble particles, the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
  • the refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps.
  • a previously prepared tangential filtration unit equipped with 0.16 µm membrane filter with appropriate surface area e.g.,
  • Fractions containing the T. pallidum polypeptide are then pooled and mixed with 4 volumes of water.
  • the diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak cation (Poros CM-20, Perseptive Biosystems) exchange resins.
  • the columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl.
  • the CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 monitoring of the effluent. Fractions containing the T. pallidum polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
  • T. pallidum polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed from Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
  • the purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.
  • The vector pQE10 is alternatively used to clone and express some of the polypeptides of the present invention for use in the soft tissue and systemic infection models discussed below. The difference is that an inserted DNA fragment encoding a polypeptide expresses that polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus of that polypeptide.
  • the bacterial expression vector pQE10 (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311) was used in this example.
  • the components of the pQE10 plasmid are arranged such that the inserted DNA sequence encoding a polypeptide of the present invention expresses the polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus.
  • the DNA sequences encoding the desired portions of a polypeptide of SEQ ID NOS: 1-744 were amplified using PCR oligonucleotide primers from genomic T. pallidum DNA.
  • the PCR primers anneal to the nucleotide sequences encoding the desired amino acid sequence of a polypeptide of the present invention.
  • Additional nucleotides containing restriction sites to facilitate cloning in the pQE10 vector were added to the 5' and 3' primer sequences, respectively.
  • the 5' and 3' primers were selected to amplify their respective nucleotide coding sequences.
  • the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention.
  • the 5' primer was designed so the coding sequence of the 6 X His tag is aligned with the restriction site so as to maintain its reading frame with that of the T. pallidum polypeptide.
  • the 3' primer was designed to include a stop codon. The amplified DNA fragment was then cloned, and the protein expressed, as described above for the pQE60 plasmid.
  • the DNA sequences encoding the amino acid sequences of SEQ ID NOS: 1-744 may also be cloned and expressed as fusion proteins by a protocol similar to that described directly above, wherein the pET-32b(+) vector (Novagen, 601 Science Drive, Madison, WI 53711) is preferentially used in place of pQE10.
  • the above methods are not limited to the polypeptide fragments actually produced. The above method, like the methods below, can be used to produce either full length polypeptides or desired fragments thereof.
  • the bacterial expression vector pQE60 is used for bacterial expression in this example (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). However, in this example, the polypeptide coding sequence is inserted such that translation of the six His codons is prevented and, therefore, the polypeptide is produced with no 6 X His tag.
  • the DNA sequence encoding the desired portion of the T. pallidum amino acid sequence is amplified from a T. pallidum genomic DNA prep or the deposited DNA clones using PCR oligonucleotide primers which anneal to the 5' and 3' nucleotide sequences corresponding to the desired portion of the T. pallidum polypeptides. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' primer sequences.
  • 5' and 3' primers are selected to amplify their respective nucleotide coding sequences.
  • the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention.
  • the 5' and 3' primers contain appropriate restriction sites followed by nucleotides complementary to the 5' and 3' ends of the coding sequence, respectively.
  • the 3' primer is additionally designed to include an in-frame stop codon.
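The primer layout described in the preceding items (restriction site, then nucleotides matching the 5' or 3' end of the coding sequence, with an in-frame stop codon carried by the 3' primer) can be sketched roughly as follows. This is only an illustration: the helper names, the example restriction sites (BamHI, GGATCC; HindIII, AAGCTT), the 18-nucleotide annealing length, and the made-up coding sequence are assumptions and are not taken from this document.

```python
# Illustrative sketch only. Function names, restriction sites, annealing
# length, and the example sequence are hypothetical, not from the source.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(coding_seq, site_5, site_3, anneal_len=18, stop_codon="TAA"):
    """Build a 5' (forward) and a 3' (reverse) primer, both written 5'->3'.

    forward = restriction site + first `anneal_len` nt of the coding sequence
    reverse = restriction site + reverse complement of
              (last `anneal_len` nt of the coding sequence + stop codon)
    """
    coding_seq = coding_seq.upper()
    forward = site_5 + coding_seq[:anneal_len]
    reverse = site_3 + reverse_complement(coding_seq[-anneal_len:] + stop_codon)
    return forward, reverse


if __name__ == "__main__":
    orf = "ATGAAACCCGGGTTTAAACCCGGGTTTGCA"  # made-up coding sequence
    fwd, rev = design_primers(orf, "GGATCC", "AAGCTT")
    print("5' primer:", fwd)
    print("3' primer:", rev)
```

Reading the reverse primer back on the sense strand gives the end of the coding sequence, then the stop codon, then the restriction site, which is the arrangement the items above call for.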
  • the amplified T. pallidum DNA fragments and the vector pQE60 are digested with restriction enzymes recognizing the sites in the primers and the digested DNAs are then ligated together. Insertion of the T. pallidum DNA into the restricted pQE60 vector places the T. pallidum protein coding region including its associated stop codon downstream from the IPTG-inducible promoter and in-frame with an initiating AUG. The associated stop codon prevents translation of the six histidine codons downstream of the insertion point.
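To illustrate why an in-frame stop codon keeps the downstream histidine codons from being translated, the following minimal sketch scans a construct codon by codon from the initiating ATG and returns the part that would be translated. The function name and the toy construct are assumptions made for illustration; real constructs should be checked against the actual vector sequence.

```python
# Minimal sketch; the function name and toy sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translated_region(construct: str, start: int = 0) -> str:
    """Return the nucleotides read in frame from `start` up to, but not
    including, the first in-frame stop codon (or to the end if none)."""
    construct = construct.upper()
    for i in range(start, len(construct) - 2, 3):
        if construct[i:i + 3] in STOP_CODONS:
            return construct[start:i]
    return construct[start:]


if __name__ == "__main__":
    insert = "ATGGCTGAA"      # made-up coding region starting at ATG
    his_tag = "CAT" * 6       # six histidine codons downstream
    print(translated_region(insert + "TAA" + his_tag))  # stops before the tag
    print(translated_region(insert + his_tag))          # reads through the tag
```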
  • the ligation mixture is transformed into competent E. coli cells using standard procedures such as those described by Sambrook et al.
  • E. coli strain M15/rep4 containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein.
  • This strain, which is only one of many that are suitable for expressing T. pallidum polypeptide, is available commercially (QIAGEN, Inc., supra).
  • Transformants are identified by their ability to grow on LB plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.
  • Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 ⁇ g/ml) and kanamycin (25 ⁇ g/ml).
  • the O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250.
  • the cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor.
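The inoculation and induction volumes implied above follow from simple dilution arithmetic (C1·V1 = C2·V2). The sketch below is a rough aid under stated assumptions: the 1 M IPTG stock and the 1 L culture volume are hypothetical, a dilution of "1:25" is read as one part overnight culture per 25 parts final volume, and the small volume added to the culture is ignored.

```python
# Rough dilution arithmetic; stock concentration and culture volume are
# hypothetical examples, not values from the source.
def stock_volume_ml(final_conc_mM: float, culture_volume_ml: float,
                    stock_conc_mM: float) -> float:
    """Volume of stock needed to reach `final_conc_mM` in the culture
    (C1*V1 = C2*V2, ignoring the added volume)."""
    return final_conc_mM * culture_volume_ml / stock_conc_mM


def inoculum_volume_ml(culture_volume_ml: float, dilution: float) -> float:
    """Overnight-culture volume for a 1:`dilution` inoculation."""
    return culture_volume_ml / dilution


if __name__ == "__main__":
    culture = 1000.0  # mL, hypothetical large culture
    print(inoculum_volume_ml(culture, 25), "mL O/N culture at 1:25")
    print(inoculum_volume_ml(culture, 250), "mL O/N culture at 1:250")
    print(stock_volume_ml(1.0, culture, 1000.0), "mL of 1 M IPTG for 1 mM")
```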
  • the cells are then stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8.
  • the cell debris is removed by centrifugation, and the supernatant containing the T. pallidum polypeptide is dialyzed against 50 mM Na-acetate buffer pH 6, supplemented with 200 mM NaCl.
  • the protein can be successfully refolded by dialyzing it against 500 mM NaCl, 20% glycerol, 25 mM Tris/HCl pH 7.4, containing protease inhibitors. After renaturation the protein can be purified by ion exchange, hydrophobic interaction and size exclusion chromatography. Alternatively, an affinity chromatography step such as an antibody column can be used to obtain pure T. pallidum polypeptide.
  • the purified protein is stored at 4°C or frozen at -80°C.
  • the following alternative method may be used to purify T. pallidum polypeptides expressed in E. coli when they are present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C.
  • Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.
  • the cells are then lysed by passing the solution through a microfluidizer (Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi.
  • the homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min.
  • the resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.
  • the resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the T. pallidum polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction.
  • the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
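As a quick check on the refolding step, the residual GuHCl concentration after mixing one volume of extract with 20 volumes of buffer can be estimated as below. This sketch assumes the extract is at the 1.5 M GuHCl used for solubilization, that the refolding buffer contains no GuHCl, and that volumes are additive; the function name is hypothetical.

```python
# Simple dilution estimate under the assumptions stated above.
def diluted_concentration(initial_molar: float, extract_volumes: float,
                          buffer_volumes: float) -> float:
    """Concentration after mixing extract with denaturant-free buffer."""
    return initial_molar * extract_volumes / (extract_volumes + buffer_volumes)


if __name__ == "__main__":
    residual = diluted_concentration(1.5, 1.0, 20.0)
    print(f"Residual GuHCl after 1:20 dilution: {residual:.3f} M")
```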
  • the refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps.
  • a previously prepared tangential filtration unit equipped with 0.16 µm membrane filter with appropriate surface area e.g.,
  • the diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak cation (Poros CM-20, Perseptive Biosystems) exchange resins.
  • the columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl.
  • the CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 monitoring of the effluent. Fractions containing the T. pallidum polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
  • T. pallidum polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed on a Coomassie blue-stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
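For orientation, the nominal composition of the eluent at any point in the 10 column volume linear gradient can be interpolated between the stated start and end buffers. The sketch below assumes an ideal linear gradient with no delay volume or mixing effects; the function name is hypothetical.

```python
# Ideal linear-gradient interpolation; ignores system dead volume and mixing.
def gradient_value(start: float, end: float, column_volumes_elapsed: float,
                   total_column_volumes: float = 10.0) -> float:
    """Linearly interpolate a buffer parameter (e.g. NaCl molarity or pH)."""
    if not 0 <= column_volumes_elapsed <= total_column_volumes:
        raise ValueError("elapsed column volumes outside the gradient")
    fraction = column_volumes_elapsed / total_column_volumes
    return start + (end - start) * fraction


if __name__ == "__main__":
    for cv in (0, 2.5, 5, 7.5, 10):
        nacl = gradient_value(0.2, 1.0, cv)   # M NaCl
        ph = gradient_value(6.0, 6.5, cv)     # nominal pH
        print(f"{cv:>4} CV: {nacl:.2f} M NaCl, pH {ph:.2f}")
```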
  • the purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.
  • T. pallidum polypeptides can also be produced in T. pallidum using the methods of S.
  • a T. pallidum expression plasmid is made by cloning a portion of the DNA encoding a T. pallidum polypeptide into the expression vector pDNAI/Amp or pDNAIII (which can be obtained from Invitrogen, Inc.).
  • the expression vector pDNAI/Amp contains: (1) an E. coli origin of replication effective for propagation in E. coli and other prokaryotic cells; (2) an ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; (3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV promoter, a polylinker, an SV40 intron; (5) several codons encoding a hemagglutinin fragment (i.e., an "HA" tag to facilitate purification) followed by a termination codon and polyadenylation signal arranged so that a DNA can be conveniently placed under expression control of the CMV promoter and operably linked to the SV40 intron and the polyadenylation signal by means of restriction sites in the polylinker.
  • the HA tag corresponds to an epitope derived from the influenza hemagglutinin protein described by Wilson et al. 1984 Cell 37:767.
  • the fusion of the HA tag to the target protein allows easy detection and recovery of the recombinant protein with an antibody that recognizes the HA epitope.
  • pDNAIII contains, in addition, the selectable neomycin marker.
  • a DNA fragment encoding a T. pallidum polypeptide is cloned into the polylinker region of the vector so that recombinant protein expression is directed by the CMV promoter.
  • the plasmid construction strategy is as follows. The DNA from a T. pallidum genomic DNA prep is amplified using primers that contain convenient restriction sites, much as described above for construction of vectors for expression of T. pallidum in E. coli.
  • the 5' primer contains a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the T. pallidum polypeptide.
  • the 3' primer contains nucleotides complementary to the 3' coding sequence of the T. pallidum DNA, a stop codon, and a convenient restriction site.
  • the PCR amplified DNA fragment and the vector, pDNAI/Amp, are digested with appropriate restriction enzymes and then ligated.
  • the ligation mixture is transformed into an appropriate E. coli strain such as SURE™ (Stratagene Cloning Systems, La Jolla, CA 92037), and the transformed culture is plated on ampicillin media plates which then are incubated to allow growth of ampicillin resistant colonies. Plasmid DNA is isolated from resistant colonies and examined by restriction analysis or other means for the presence of the fragment encoding the T. pallidum polypeptide.
  • COS cells are transfected with an expression vector, as described above, using DEAE-dextran, as described, for instance, by Sambrook et al. (supra). Cells are incubated under conditions for expression of T. pallidum by the vector.
  • T. pallidum-HA fusion protein is detected by radiolabeling and immunoprecipitation, using methods described in, for example, Harlow et al., supra. To this end, two days after transfection, the cells are labeled by incubation in media containing 35S-cysteine for 8 hours. The cells and the media are collected, and the cells are washed and then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by Wilson et al. (supra).
  • Proteins are precipitated from the cell lysate and from the culture media using an HA-specific monoclonal antibody. The precipitated proteins then are analyzed by SDS-PAGE and autoradiography. An expression product of the expected size is seen in the cell lysate, which is not seen in negative controls.
  • Plasmid pC4 is used for the expression of T. pallidum polypeptide in this example.
  • Plasmid pC4 is a derivative of the plasmid pSV2-dhfr (ATCC Accession No. 37146).
  • the plasmid contains the mouse DHFR gene under control of the SV40 early promoter.
  • Chinese hamster ovary cells or other cells lacking dihydrofolate reductase activity that are transfected with these plasmids can be selected by growing the cells in a selective medium (alpha minus MEM, Life Technologies) supplemented with the chemotherapeutic agent methotrexate.
  • amplification of the DHFR genes in cells resistant to methotrexate (MTX) has been well documented.
  • Cells grown in increasing concentrations of MTX develop resistance to the drug by overproducing the target enzyme, DHFR, as a result of amplification of the DHFR gene. If a second gene is linked to the DHFR gene, it is usually co-amplified and over-expressed. It is known in the art that this approach may be used to develop cell lines carrying more than 1,000 copies of the amplified gene(s). Subsequently, when the methotrexate is withdrawn, cell lines are obtained which contain the amplified gene integrated into one or more chromosome(s) of the host cell.
  • Plasmid pC4 contains the strong promoter of the long terminal repeat (LTR) of the Rous Sarcoma Virus, for expressing a polypeptide of interest, Cullen, et al. (1985) Mol. Cell. Biol. 5:438-447; plus a fragment isolated from the enhancer of the immediate early gene of human cytomegalovirus (CMV), Boshart, et al., 1985, Cell 41:521-530. Downstream of the promoter are the following single restriction enzyme cleavage sites that allow the integration of the genes: Bam HI, Xba I, and Asp 718. Behind these cloning sites the plasmid contains the 3' intron and polyadenylation site of the rat preproinsulin gene.
  • β-actin promoter e.g., the human β-actin promoter, the SV40 early or late promoters or the long terminal repeats from other retroviruses, e.g., HIV and HTLVI.
  • Clontech's Tet-Off and Tet-On gene expression systems and similar systems can be used to express the T. pallidum polypeptide in a regulated way in mammalian cells (Gossen et al., 1992, Proc. Natl. Acad. Sci. USA 89:5547-5551).
  • Other signals, e.g., from the human growth hormone or globin genes, can be used as well.
  • Stable cell lines carrying a gene of interest integrated into the chromosomes can also be selected upon co-transfection with a selectable marker such as gpt, G418 or hygromycin. It is advantageous to use more than one selectable marker in the beginning, e.g., G418 plus methotrexate.
  • the plasmid pC4 is digested with the restriction enzymes and then dephosphorylated using calf intestinal phosphatase by procedures known in the art.
  • the vector is then isolated from a 1% agarose gel.
  • the DNA sequence encoding the T. pallidum polypeptide is amplified using PCR oligonucleotide primers corresponding to the 5' and 3' sequences of the desired portion of the gene.
  • a 5' primer containing a restriction site, a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the T. pallidum polypeptide is synthesized and used.
  • a 3' primer, containing a restriction site, stop codon, and nucleotides complementary to the 3' coding sequence of the T. pallidum polypeptides is synthesized and used.
  • the amplified fragment is digested with the restriction endonucleases and then purified again on a 1% agarose gel.
  • the isolated fragment and the dephosphorylated vector are then ligated with T4 DNA ligase.
  • E. coli HB101 or XL-1 Blue cells are then transformed and bacteria are identified that contain the fragment inserted into plasmid pC4 using, for instance, restriction enzyme analysis. Chinese hamster ovary cells lacking an active DHFR gene are used for transfection.
  • lipid-mediated transfection agent such as Lipofectin™ or LipofectAMINE™ (Life Technologies).
  • the plasmid pSV2-neo contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that confers resistance to a group of antibiotics including G418.
  • the cells are seeded in alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
  • single clones are trypsinized and then seeded in 6-well petri dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM, 200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of methotrexate are then transferred to new 6-well plates containing even higher concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same procedure is repeated until clones are obtained which grow at a concentration of 100-200 µM. Expression of the desired gene product is analyzed, for instance, by SDS-PAGE and Western blot or by reversed phase HPLC analysis.
  • GACAGCATCC GCTATCATGC CAATGGTGGT GACAACCAGG GGTTTCCCGT CCGCTGCGGC 1020
  • CTCCATATCA ATGACACTCA CGCGCGTCTT ACGCTTTCCT ATGCAGTGTT TGACCTTCCG 5280
  • ATGTCATGCA CCTTTTGGAA GAAATGCACG AGCACAATGA ACGAGAAAGG CTATCGTGAA 7020
  • TCAcTACGCC ATCCATACCG CGAAGAAACA TAGTACCGAC TGCGAAGAGG AGCACGAAAC 7860
  • AATCGTTCTA AGGAGATCTG ATGCGCCGCC GCTATAGACG AAAACGTATC GCCGTTTTTT 8160
  • ACGGTATATA AAATGCCGTC CACTGAGGGG ATTTTTAGTA GCTGTCCAAC TTGGAGCGCC 8220
  • TTTTACGCGC GTTTTGTTCG GTGGCAAAAT CGGGAATTGG CTCAACCCCA TAGCGCTTGC 12300
  • TGAATGGTGA CAATGGTGCC ACCCCTGTTT TTCGTATGCG CCCTTTTCTT TGCCGAGGGC 14040
  • CTCAGTCTCG AACAGGtGCG CACGAGACGA GTATCGCACG CCGTTACCTG GAGGCGCTCG 840

Abstract

The present invention provides polynucleotide sequences of the genome of T. pallidum, polypeptide sequences encoded by the polynucleotide sequences, corresponding polynucleotides and polypeptides, vectors and hosts comprising the polynucleotides, and assays and other uses thereof. The present invention further provides polynucleotide and polypeptide sequence information stored on computer readable media, and computer-based systems and methods which facilitate its use.

Description

Treponema pallidum Polynucleotides and Sequences
FIELD OF THE INVENTION
The present invention relates to the field of molecular biology. In particular, it relates to, among other things, nucleotide sequences of Treponema pallidum, contigs, ORFs, fragments, probes, primers and related polynucleotides thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof, such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.
BACKGROUND OF THE INVENTION
Spirochetes are a family of motile, unicellular, spiral-shaped bacteria which share a number of structural characteristics. Three genera of the spirochetes are pathogenic in humans: (a) Treponema, which includes the pathogens that cause syphilis (T. pallidum), yaws (T. pertenue), and pinta (T. carateum); (b) Borrelia, which includes the pathogens that cause epidemic and endemic relapsing fever and Lyme disease; and (c) Leptospira, which includes a wide variety of small spirochetes that cause mild to serious systemic human illness (Koff, A. B. and Rosen, T. J. Am. Acad. Dermatol. 29:519-535 (1993)). In 1986, more than 27,000 cases of early infectious syphilis were diagnosed in the United States alone. Such statistics indicate that infection with T. pallidum is the largest source of human disease resulting from the spirochetes.
T. pallidum is morphologically indistinguishable from several other pathogenic spirochetes, but, in general, treponemes and other spirochetes are easily identifiable when compared to other bacteria. A key morphological characteristic of T. pallidum, and other spirochetes, is the presence of a central protoplasmic cylinder composed primarily of peptidoglycan and one or more adjacent axial fibrils (also designated periplasmic flagella or endoflagella; Charon, N. W., et al., Res. Microbiol. 143:597-603 (1992)). These structures provide a source of corkscrew-like motion to the treponemes. In aqueous media, treponemes move in an apparently random fashion and, unlike the majority of motile bacteria, continue to move in a more viscous medium. In tissues, treponemes are highly moldable to intercellular spaces, a characteristic which is thought to be mediated by the interactions of bacterial adhesins and cellular fibronectins.
Syphilis is the primary clinical manifestation of infection with T. pallidum. The clinical manifestations of syphilis can resemble many diseases. Syphilis is typically transmitted by sexual contact, but can also be transmitted transplacentally. The infecting organism multiplies at the site of infection within 10 to 60 days postinfection and results in a primary ulcer-like lesion termed a chancre. A small number of organisms move from the primary lesion to the regional lymph nodes and establish small infectious centers termed satellite buboes. Organisms from these locations enter the blood stream and result in a systemic infection (Goens, J. L., et al., Am. Fam. Physician 50:1013-1020 (1994)).
The secondary stage of syphilis manifests itself as a widespread skin rash and begins between two and twelve weeks following the primary infection. During this stage, the infected individual often experiences a low grade fever coupled with swollen lymph nodes. Also during this period, lesions of various degrees of severity may develop in a number of physical locations including bone, liver, kidney, central nervous system (CNS), and other organs (Neeravahu, M. Arch. Intern. Med. 145:132-134 (1985)). Such secondary infections are highly infectious, but will, in time, subside spontaneously. A third stage of syphilis occurs in approximately 30% of infected, but not treated, individuals. The third stage occurs several years following the first and second stages. The lesions which characterize the third stage of infection are minor in terms of the number of organisms, but may be severe in terms of tissue damage. Such lesions may result in necrosis, scar formation, general paresis, damage to aortic valves, permanent blindness, and other extensive tissue damage, all probably related to a delayed type hypersensitivity reaction by the host to the T. pallidum organisms (Scheck, D. N. and Hook, E. W. 3rd Infect. Dis. Clin. North Am. 8:769-795 (1994)).
A further, and increasingly common, complication of syphilis infection is coinfection with the human immunodeficiency virus (HIV). In fact, a recent study indicates that ulcerous genital diseases such as those exhibited during the primary stages of infection with syphilis may facilitate the transmission of HIV (Rufli, T. Dermatologica 179:113-117 (1989)). In addition, it is clear that the CNS is regularly involved in the early stages of syphilis. In the timespan between the introduction of penicillin and other antibiotics and the spread of HIV, early neurosyphilis was an exceptionally uncommon development. However, since the standard antibiotic dosage used to treat syphilis is not exceptionally high and since a successful treatment requires an adequate host immune response, individuals infected with HIV often exhibit a highly increased occurrence of many neurosyphilis-related sequelae including asymptomatic neurosyphilis, syphilitic meningitis, cranial nerve abnormalities, or cerebrovascular problems (Musher, D. M., et al., Ann. Intern. Med. 113:872-881 (1990)).
T. pallidum has a remarkable ability to evade both the humoral and cellular components of the immune system. It was originally thought that the ability of T. pallidum to evade the immune system of the host organism was due to the presence of an outer coat of mucopolysaccharides. However, recent evidence suggests it is more likely that T. pallidum makes use of the organization of the relative immunogenicity of its complement of outer membrane proteins to evade the immune system (Radolf, J. D. Mol. Microbiol. 16:1067-1073 (1995)). Unlike most other bacterial outer membranes characterized thus far, the T. pallidum outer membrane contains a scarcity of immunogenic transmembrane proteins (with regard to T. pallidum, these are termed "rare outer membrane proteins"). Among the highly immunogenic proteins of treponemes are a number of lipoproteins anchored to the periplasmic leaflet of the cytoplasmic membrane. As a result of their physical location, the lipoproteins may be less susceptible to typical immunologic surveillance (Norris, J. Microbiol. Rev. 57:750-779 (1993)). In addition to the periplasmic lipoproteins, T. pallidum also secretes a number of small, but immunogenic proteins which may induce an immune response (Hindersson, P. et al., Res. Microbiol. 143:629-639 (1992)).
It is clear that the etiology of diseases mediated or exacerbated by T. pallidum involves T. pallidum genes, and that characterizing the genes and their patterns of expression would add dramatically to our understanding of the organism and its host interactions. Knowledge of T. pallidum genes and genomic organization would dramatically improve understanding of disease etiology and lead to improved and new ways of preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of T. pallidum would provide reagents for, among other things, detecting, characterizing and controlling T. pallidum infections. There is a need therefore to characterize the genome of T. pallidum and for polynucleotides and sequences of this organism.
SUMMARY OF THE INVENTION
The present invention is based on the sequencing of fragments of the T. pallidum genome. The primary nucleotide sequences which were generated are provided in SEQ ID NOS: 1-744.
The present invention provides the nucleotide sequence of several thousand contigs of the T. pallidum genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the nucleotide sequences depicted in SEQ ID NOS: 1-744. The present invention further provides nucleotide sequences which are at least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-744.
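As an illustration of what a "95% identical" comparison involves, the sketch below computes percent identity over two sequences that have already been aligned to equal length. This is only one simple convention, assumed here for illustration; this passage does not specify how identity is to be calculated, and the function name and example sequences are hypothetical.

```python
# Simple percent-identity calculation over a pre-computed alignment.
# The convention used (matches / alignment length, gaps never match) is an
# assumption for illustration, not the definition used by the source.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"   # made-up 20-nt sequence
    variant = "ACGTACGTACTTACGTACGT"     # one mismatch
    print(percent_identity(reference, variant))
```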
The nucleotide sequence of SEQ ID NOS: 1-744, a representative fragment thereof, or a nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ ID NOS: 1-744 may be provided in a variety of media to facilitate its use. In one application of this embodiment, the sequences of the present invention are recorded on computer readable media. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
The present invention further provides systems, particularly computer-based systems which contain the sequence information herein described stored in a data storage means. Such systems are designed to identify commercially important fragments of the T. pallidum genome.
Another embodiment of the present invention is directed to fragments of the T. pallidum genome having particular structural or functional attributes. Such fragments of the T. pallidum genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter referred to as open reading frames or ORFs, fragments which modulate the expression of an operably linked ORF, hereinafter referred to as expression modulating fragments or EMFs, and fragments which can be used to diagnose the presence of T. pallidum in a sample, hereinafter referred to as diagnostic fragments or DFs.
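To make the notion of a computer-based search for ORFs concrete, the following minimal sketch scans the three forward reading frames of a sequence for stretches that run from an ATG to the first in-frame stop codon. It is a simplified illustration under stated assumptions: it ignores the reverse strand and alternative start codons, the function name and minimum-length parameter are hypothetical, and it is not the ORF-calling procedure used to produce the tables referred to in this document.

```python
# Simplified forward-strand ORF scan (ATG ... stop). Illustrative only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2):
    """Return (start, end) index pairs, end exclusive and including the stop
    codon, for ATG-initiated ORFs in the three forward frames."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it and keep scanning
    return orfs


if __name__ == "__main__":
    contig = "CCATGAAATTTTAGGG"  # made-up sequence
    for begin, end in find_orfs(contig):
        print(begin, end, contig[begin:end])
```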
Each of the ORFs in fragments of the T. pallidum genome disclosed in Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.
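When short fragments are considered as probes or amplification primers, a first-pass estimate of melting temperature is often useful. The sketch below applies the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is a general approximation for short oligonucleotides and is not taken from this document; the function name and the example oligonucleotide are hypothetical.

```python
# Wallace rule of thumb for short oligonucleotides: Tm = 2(A+T) + 4(G+C).
# A rough screening estimate only; not a method described in the source.
def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature in degrees Celsius."""
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print(wallace_tm("ACGTACGTACGTACGTAC"))  # made-up 18-mer
```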
The present invention further includes recombinant constructs comprising one or more fragments of the T. pallidum genome of the present invention. The recombinant constructs of the present invention comprise vectors, such as a plasmid or viral vector, into which a fragment of the T. pallidum genome has been inserted.
The present invention further provides host cells containing any of the isolated fragments of the T. pallidum genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a prokaryotic cell such as a bacterial cell.
The present invention is further directed to isolated polypeptides and proteins encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and proteins of the present invention from cells which have been altered to express them.
The invention further provides methods of obtaining homologs of the fragments of the T. pallidum genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention. Such antibodies include both monoclonal and polyclonal antibodies.
The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.
The present invention further provides methods of identifying test samples derived from cells which express one of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one or more of the antibodies of the present invention, or one or more of the DFs of the present invention, under conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.
In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the above-described assays. Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the antibodies, or one of the DFs of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of bound antibodies or hybridized DFs.
Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.
The present genomic sequences of T. pallidum will be of great value to all laboratories working with this organism and for a variety of commercial purposes. Many fragments of the T. pallidum genome will be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value to T. pallidum researchers and of immediate commercial value for the production of proteins or the control of gene expression.
The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes have enhanced, and will continue to greatly enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to perform comparative genomics and molecular phylogeny.
DESCRIPTION OF THE FIGURES
FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems of the present invention.
FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit and annotate the contigs of the T. pallidum genome of the present invention. Both Macintosh and Unix platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature data extracted from the sequence files by Factura to the Unix based T. pallidum relational database. Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and their associated features using Extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting sequence file is processed to trim portions of the sequences with a high rate of ambiguous nucleotides. The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic Research (TIGR) for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs generated by the assembly step is loaded into the database with the lassie program. Identification of open reading frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against T. pallidum sequences from GenBank and against all protein sequences using the BLASTN and BLASTP programs (using default parameters), described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
The present invention is based on the sequencing of fragments of the T. pallidum genome and analysis of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID NOS: 1-744. As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC nomenclature system. In addition to the aforementioned T. pallidum polynucleotide and polynucleotide sequences, the present invention provides the nucleotide sequences of SEQ ID NOS: 1-744, ORF IDs and ORFs within, or representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS: 1-744" refers to any portion of SEQ ID NOS: 1-744 which is not presently represented within a publicly available database. Preferred representative fragments of the present invention are T. pallidum open reading frames (ORFs), expression modulating fragments (EMFs) and fragments which can be used to diagnose the presence of T. pallidum in a sample (DFs). A non-limiting identification of preferred representative fragments is provided in Tables 1-3. As discussed in detail below, the information provided in SEQ ID NOS: 1-744 and in Tables 1-3 together with routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence all "representative fragments" of interest, including open reading frames encoding a large variety of T. pallidum proteins. The present invention is further directed to nucleic acid molecules encoding portions or fragments of the nucleotide sequences described herein. Fragments include portions of the nucleotide sequences of SEQ ID NOS: 1-744, at least 10 contiguous nucleotides in length, selected from any two integers, one of which represents a 5' nucleotide position and the second of which represents a 3' nucleotide position, where the first nucleotide for each nucleotide sequence in SEQ ID NOS: 1-744 is position 1. That is, every combination of a 5' and 3' nucleotide position that a fragment at least 10 contiguous nucleotides in length could occupy is included in the invention. "At least" means a fragment may be 10 contiguous nucleotide bases in length or any integer between 10 and the length of an entire nucleotide sequence of SEQ ID NOS: 1-744 minus 1.
Therefore, included in the invention are contiguous fragments specified by any 5' and 3' nucleotide base positions of a nucleotide sequence of SEQ ID NOS: 1-744 wherein the length of the contiguous fragment is any integer between 10 and the length of an entire nucleotide sequence minus 1.
Further, the invention includes polynucleotides comprising fragments specified by size, in nucleotides, rather than by nucleotide positions. The invention includes any fragment size, in contiguous nucleotides, selected from integers between 10 and the length of an entire ORF ID, ORF, or SEQ ID NO:, minus 1. Preferred sizes of contiguous nucleotide fragments include 20 nucleotides, 30 nucleotides, 40 nucleotides, 50 nucleotides. Other preferred sizes of contiguous nucleotide fragments, which may be useful as diagnostic probes and primers, include fragments 50-300 nucleotides in length which include, as discussed above, fragment sizes representing each integer between 50-300. Larger fragments are also useful according to the present invention corresponding to most, if not all, of the nucleotide sequences shown in Tables 1-3 (ORF IDs) and SEQ ID NOS: 1-744. The preferred sizes are, of course, meant to exemplify not limit the present invention as all size fragments, representing any integer between 10 and the length of an entire nucleotide sequence minus 1, of each ORF ID, ORF, and SEQ ID NO:, are included in the invention.
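The position-based and size-based fragment definitions above can be illustrated with a short sketch. It is an illustration only, under stated assumptions: the function names `count_fragments` and `fragment` and the 12-nucleotide example string are hypothetical and are not taken from the sequence listing. Positions are 1-based and inclusive, with the first nucleotide at position 1, as described in the text.

```python
def count_fragments(seq_len: int, min_len: int = 10) -> int:
    """Count the (5', 3') position pairs that define a contiguous fragment
    whose length is between min_len and seq_len - 1, inclusive."""
    total = 0
    for size in range(min_len, seq_len):   # sizes min_len .. seq_len - 1
        total += seq_len - size + 1        # possible 5' positions for this size
    return total


def fragment(seq: str, five_prime: int, three_prime: int) -> str:
    """Return the fragment spanning 1-based, inclusive positions."""
    if not (1 <= five_prime <= three_prime <= len(seq)):
        raise ValueError("positions must satisfy 1 <= 5' <= 3' <= len(seq)")
    return seq[five_prime - 1:three_prime]


example = "ACGTACGTACGT"                 # hypothetical 12-nt sequence
print(count_fragments(len(example)))     # 5 (three 10-mers and two 11-mers)
print(fragment(example, 2, 11))          # CGTACGTACG
```

For a sequence of length L, the count is the sum of (L - k + 1) over every fragment size k from 10 to L - 1, which is why the number of fragments grows roughly with the square of the sequence length.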
The present invention also provides for the exclusion of any fragment, specified by 5' and 3' base positions or by size in nucleotide bases as described above for any ORF ID or SEQ ID NOS: 1-744. Any number of fragments of nucleotide sequences in ORF IDs or SEQ ID NOS: 1-744, specified by 5' and 3' base positions or by size in nucleotides, as described above, may be excluded from the present invention.
While the presently disclosed sequences of SEQ ID NOS: 1-744 are highly accurate, sequencing techniques are not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS: 1-744. However, once the present invention is made available (i.e., once the information in SEQ ID NOS: 1-744 and Tables 1-3 has been made available), resolving a rare sequencing error in SEQ ID NOS: 1-744 will be well within the skill of the art. The present disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing effort, also of a routine nature, to the region containing the potential error.
Even if all of the very rare sequencing errors in SEQ ID NOS: 1-744 were corrected, the resulting nucleotide sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS: 1-744.
As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining libraries and for sequencing are provided below, for instance. A wide variety of T. pallidum strains which are known in the art can be used to prepare T. pallidum genomic DNA for cloning and for obtaining polynucleotides of the present invention.
The nucleotide sequences of the genomes from different strains of T. pallidum differ somewhat. However, the nucleotide sequences of the genomes of all T. pallidum strains will be at least 95% identical, in corresponding part, to the nucleotide sequences provided in SEQ ID NOS: 1-744 and the ORF IDs and ORFs within. Nearly all will be at least 99% identical and the great majority will be 99.9% identical.
The present application is further directed to nucleic acid molecules at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleic acid sequence shown in SEQ ID NOS: 1-744, the ORF IDs and ORFs within. The above nucleic acid sequences are included irrespective of whether they encode a polypeptide having T. pallidum activity. This is because even where a particular nucleic acid molecule does not encode a polypeptide having T. pallidum activity, one of skill in the art would still know how to use the nucleic acid molecule, for instance, as a hybridization probe. Uses of the nucleic acid molecules of the present invention that do not encode a polypeptide having T. pallidum activity include, inter alia, isolating a T. pallidum gene or allelic variants thereof from a DNA library, and detecting T. pallidum mRNA expression in samples (e.g., environmental samples) suspected of containing T. pallidum by Northern Blot, PCR, or similar analysis.
Preferred are nucleic acid molecules having sequences at least 90%, 95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in SEQ ID NOS: 1-744, the ORF IDs, and the ORF within each ORF ID, which do, in fact, encode a polypeptide having T. pallidum protein activity. By "a polypeptide having T. pallidum activity" is intended polypeptides exhibiting activity similar, but not necessarily identical, to an activity of the T. pallidum protein of the invention, as measured in a particular biological assay suitable for measuring activity of the specified protein.
Due to the degeneracy of the genetic code, one of ordinary skill in the art will immediately recognize that a large number of the nucleic acid molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99% identical to the nucleic acid sequences shown in SEQ ID NOS: 1-744, the ORF IDs, and the ORF within each ORF ID, will encode a polypeptide having T. pallidum protein activity. In fact, since degenerate variants of these nucleotide sequences all encode the same polypeptide, this will be clear to the skilled artisan even without performing the above described comparison assay. It will be further recognized in the art that, for such nucleic acid molecules that are not degenerate variants, a reasonable number will also encode a polypeptide having T. pallidum protein activity. This is because the skilled artisan is fully aware of amino acid substitutions that are either less likely or not likely to significantly affect protein function (e.g., replacing one aliphatic amino acid with a second aliphatic amino acid), as further described below.
The biological activity or function of the polypeptides of the present invention is expected to be similar or identical to that of polypeptides from other bacteria that share a high degree of structural identity/similarity. Tables 1-3 list accession numbers and descriptions for the closest matching sequences of polypeptides available through GenBank. It is therefore expected that the biological activity or function of the polypeptides of the present invention will be similar or identical to those polypeptides from other bacterial genera, species, or strains listed in Tables 1-3.
By a polynucleotide having a nucleotide sequence at least, for example, 95% "identical" to a reference nucleotide sequence of the present invention, it is intended that the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the T. pallidum polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted, inserted, or substituted with another nucleotide. The query sequence may be an entire sequence shown in SEQ ID NOS: 1-744, the ORF IDs, or the ORF within each ORF ID, or any fragment specified as described herein.
As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, uses the FASTDB computer program based on the algorithm of Brutlag et al. See Brutlag et al. (1990) Comp. App. Biosci. 6:237-245. In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by first converting U's to T's. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequence, whichever is shorter. If the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for 5' and 3' truncations of the subject sequence when calculating percent identity. For subject sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment.
This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention. Only nucleotides outside the 5' and 3' nucleotides of the subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with the query sequence, are calculated for the purposes of manually adjusting the percent identity score. For example, a 90 nucleotide subject sequence is aligned to a 100 nucleotide query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the FASTDB alignment does not show a match/alignment of the first 10 nucleotides at the 5' end. The 10 unpaired nucleotides represent 10% of the sequence (number of nucleotides at the 5' and 3' ends not matched/total number of nucleotides in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 nucleotides were perfectly matched the final percent identity would be 90%. In another example, a 90 nucleotide subject sequence is compared with a 100 nucleotide query sequence. This time the deletions are internal deletions so that there are no nucleotides on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only nucleotides 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.
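The manual correction described above amounts to a single subtraction, sketched below. This is a minimal illustration, not part of the FASTDB program: the function name `corrected_percent_identity` and its parameters are hypothetical, and the counts of unmatched query bases at the 5' and 3' ends are assumed to have been read from the FASTDB alignment beforehand.

```python
def corrected_percent_identity(fastdb_identity: float,
                               query_len: int,
                               unmatched_5prime: int = 0,
                               unmatched_3prime: int = 0) -> float:
    """Subtract the terminal, unaligned portion of the query (as a percent of
    the total query length) from the percent identity reported by FASTDB.
    Internal deletions are not counted here, so they leave the score as is."""
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    overhang_percent = 100.0 * (unmatched_5prime + unmatched_3prime) / query_len
    return fastdb_identity - overhang_percent


# First example from the text: 90-nt subject, 100-nt query, 10 query bases
# unmatched at the 5' end, remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 100, unmatched_5prime=10))   # 90.0

# Second example: the deletions are internal, so no terminal bases are
# unmatched and the FASTDB value is returned unchanged (the input value
# 90.0 is a hypothetical FASTDB result, not one given in the text).
print(corrected_percent_identity(90.0, 100))                         # 90.0
```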
COMPUTER RELATED EMBODIMENTS
The nucleotide sequences provided in SEQ ID NOS: 1-744, including ORF IDs and corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to said polynucleotide sequences may be "provided" in a variety of mediums to facilitate use thereof. As used herein, "provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention. Such a manufacture provides a large portion of the T. pallidum genome and parts thereof (e.g., a T. pallidum open reading frame (ORF)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the T. pallidum genome or a subset thereof as it exists in nature or in purified form. In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise, it will be clear to those of skill how additional computer readable media that may be developed also can be used to create analogous manufactures having recorded thereon a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for storing information on computer readable medium.
A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention. Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID NOS: 1-744, including ORF IDs and corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to said polynucleotide sequences, the present invention enables the skilled artisan routinely to access the provided sequence information for a wide variety of purposes.
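As one illustration of the two storage options just mentioned (an ASCII text file and a database application), the sketch below formats sequence records as FASTA-style text and also loads them into a relational table. Several assumptions apply: Python's built-in `sqlite3` module is used purely as a stand-in for the database applications named above, the FASTA layout is one common ASCII convention rather than a format specified in the text, and the record names and residues are hypothetical placeholders, not sequences from SEQ ID NOS: 1-744.

```python
import sqlite3


def to_fasta(records, width=60):
    """Format (name, residues) pairs as FASTA-style ASCII text."""
    lines = []
    for name, residues in records:
        lines.append(f">{name}")
        for i in range(0, len(residues), width):
            lines.append(residues[i:i + width])
    return "\n".join(lines) + "\n"


def store_in_database(records):
    """Load (name, residues) pairs into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records)
    conn.commit()
    return conn


# Hypothetical placeholder records (not real T. pallidum sequence data).
records = [("example_contig_1", "ATGGCNTTAGGCTAA"), ("example_contig_2", "TTGACA")]

ascii_text = to_fasta(records)
conn = store_in_database(records)
row = conn.execute(
    "SELECT residues FROM sequences WHERE name = ?", ("example_contig_2",)
).fetchone()
print(row[0])   # TTGACA
```

The choice between the two follows the point made in the text: a flat ASCII file suits sequential reading by search programs, while a database table suits retrieval of individual records by name or feature.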
The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system was used to identify open reading frames (ORFs) within the T. pallidum genome which contain homology to ORFs or proteins from both T. pallidum and from other organisms. Among the ORFs discussed herein are protein encoding fragments of the T. pallidum genome useful in producing commercially important proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.
The present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify, among other things, commercially important fragments of the T. pallidum genome.
As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention. As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means.
As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.
As used herein, "search means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the present genomic sequences which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems. As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length. As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites and signal sequences.
Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
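The relationship between target length and random occurrence noted above can be made concrete with a short sketch. It rests on simplifying assumptions that are not part of the text: the four nucleotides are treated as equally frequent and independent, only one strand is scanned, and matching is exact rather than by homology. The function names and the database length of 1,000,000 nucleotides are hypothetical.

```python
def expected_random_hits(target_len: int, database_len: int,
                         alphabet_size: int = 4) -> float:
    """Expected number of exact chance matches of one target of the given
    length in a random sequence, assuming equally frequent residues."""
    positions = max(database_len - target_len + 1, 0)
    return positions / alphabet_size ** target_len


def find_target(target: str, sequence: str) -> list:
    """Return the 1-based start positions of every exact occurrence of the
    target in the sequence, including overlapping occurrences."""
    hits = []
    start = sequence.find(target)
    while start != -1:
        hits.append(start + 1)
        start = sequence.find(target, start + 1)
    return hits


print(find_target("ACG", "ACGTACG"))          # [1, 5]

# A 6-nt target is expected to occur by chance hundreds of times in a
# hypothetical 1,000,000-nt database; a 30-nt target essentially never.
for n in (6, 30):
    print(n, expected_random_hits(n, 1_000_000))
```

Under these assumptions the expected number of chance matches falls by a factor of four with each added nucleotide, which is consistent with the preference stated above for nucleotide targets of about 30 to 300 residues.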
A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the T. pallidum genomic sequences possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.
A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the T. pallidum genome. In the present examples, software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215: 403-410 (1990), is used to identify open reading frames within the T. pallidum genome. A skilled artisan can readily recognize that any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known to those of skill also may be employed in this regard.
Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116, once it is inserted into the removable medium storage device 114.
A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the requirements and operating parameters of the operating system, the hardware system and the software program or programs.
BIOCHEMICAL EMBODIMENTS
Other embodiments of the present invention are directed to isolated fragments of the T. pallidum genome. The fragments of the T. pallidum genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs) and fragments which can be used to diagnose the presence of T. pallidum in a sample, hereinafter diagnostic fragments (DFs).
As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the T. pallidum genome" refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification means to reduce, from the composition, the number of compounds which are normally associated with the composition. Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-744, to representative fragments thereof as described above including ORF IDs and ORFs, to polynucleotides at least 95%, preferably at least 96%, 97%, 98%, or 99% and especially preferably at least 99.9% identical in sequence thereto, also as set out above.
A variety of purification means can be used to generate the isolated fragments of the present invention. These include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size. In one embodiment, T. pallidum DNA can be enzymatically sheared to produce fragments of 15-20 kb in length. These fragments can then be used to generate a T. pallidum library by inserting them into lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated in the ORF IDs of Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS: 1-744. Well known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library or T. pallidum genomic DNA. Thus, given the availability of SEQ ID NOS: 1-744, the information in Tables 1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS: 1-744 using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or other nucleic acid fragment of the present invention.
The isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and double stranded DNA, and single stranded RNA. For purposes of numbering and reference to polynucleotide and polypeptide sequences the entire sequence of each sequence of SEQ ID NOS: 1-744 is included with the first nucleotide being position 1. Therefore, for reference purposes the numbering used in the present invention is that provided in the sequence listing for SEQ ID NOS: 1-744.
As used herein, an open reading frame (ORF) means a series of nucleotide triplets coding for amino acid residues without any termination codons and is a sequence translatable into protein. Further, unless specified, the term "ORF" for each ORF ID is defined by the termination codon at the 3' end and the 5' most methionine codon at the 5' end, in frame with said 3' termination codon. Unless specified, the term "ORF" also refers to a particular polypeptide sequence defined by the ORF polynucleotide sequence, wherein the N-terminus is defined by the 5' most methionine codon in frame with the termination codon at the 3' end of the ORF ID and the C-terminus is defined by the last codon before the said 3' termination codon. As used herein, an ORF ID represents a sequence without any internal termination codons flanked by termination codons.
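By way of illustration only, the relationship between an ORF ID and the ORF within it can be sketched in a few lines of Python. The sketch assumes the standard termination codons (TAA, TAG, TGA), treats only ATG as the methionine codon, and scans a single forward reading frame; the function name `find_orfs` and the tuple layout are hypothetical and are not taken from the present disclosure or from the GeneMark software.

```python
# Illustrative sketch only: names and return layout are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)


def find_orfs(seq, frame=0):
    """Scan one forward reading frame of a DNA string.

    Returns a list of (orf_id_start, orf_start, stop_end) tuples using
    0-based, half-open coordinates:
      orf_id_start: first base after the upstream flanking stop codon
      orf_start:    first base of the 5'-most in-frame ATG
      stop_end:     one past the last base of the 3' stop codon
    Stop-free runs that contain no ATG, or that lack an upstream stop
    codon, are not reported.
    """
    seq = seq.upper()
    results = []
    prev_stop_end = None  # end of the upstream flanking stop codon
    first_atg = None      # 5'-most ATG seen since that stop codon
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            if prev_stop_end is not None and first_atg is not None:
                results.append((prev_stop_end, first_atg, i + 3))
            prev_stop_end = i + 3
            first_atg = None
        elif codon == "ATG" and first_atg is None and prev_stop_end is not None:
            first_atg = i
    return results
```

For the sequence TAA GCT ATG AAA ATG CCC TGA, the sketch reports one ORF ID beginning immediately after the TAA codon, with the ORF beginning at the first of the two in-frame ATG codons and ending with the TGA codon. A full analysis would repeat the scan for the other two frames and for the reverse complement.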
Tables 1, 2, and 3 list ORF IDs in the T. pallidum genomic contigs of the present invention that were identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical methods, such as those discussed herein, to generate more inclusive, more restrictive, or more selective lists.
Table 1 sets out ORF IDs in the T. pallidum contigs of the present invention that over a continuous region of at least 50 bases are 95% or more identical (by BLAST analysis) to a nucleotide sequence available through GenBank in June, 1997.
Table 2 sets out ORF IDs in the T. pallidum contigs of the present invention that are not in Table 1 and match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through GenBank in July, 1996.
Table 3 sets out ORF IDs in the T. pallidum contigs of the present invention that do not match significantly, by BLASTP analysis, a polypeptide sequence available through GenBank in July, 1996.
In each table, the first and second columns identify the ORF ID by, respectively, contig number and ORF ID number within the contig; the third column indicates the first nucleotide of the ORF ID, counting from the 5' end of the contig strand; and the fourth column indicates the last nucleotide of the ORF ID, counting from the 5' end of the contig strand.
In Tables 1 and 2, column six lists the reference for the closest matching sequence available through GenBank. These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be familiar with their denominations. Descriptions of the nomenclature are available from the National Center for Biotechnology Information. Column seven in Tables 1 and 2 provides the gene name of the matching sequence; column eight provides the BLAST identity score from the comparison of the ORF and the homologous gene; and column nine indicates the length in nucleotides of the highest scoring segment pair identified by the BLAST identity analysis.
In Table 3, the last column, column six, indicates the length of each ORF ID in amino acid residues.
The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art. For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions 1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar" (i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus, for instance, Tables 1 and 2 herein enumerate the percent identity of the highest scoring segment pair in each ORF and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided below and are described in the pertinent literature highlighted by the citations provided below.
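The arithmetic of the 10-residue example above can be reproduced with a short Python sketch. The two peptide sequences, the function names, and the single "similar" pair (phenylalanine and tyrosine) are hypothetical choices made purely for illustration; programs such as FASTA and BLAST rely on scoring matrices rather than a hand-written set of pairs.

```python
# Illustrative sketch only: sequences and the "similar" pair are assumptions.

def percent_identity(a, b):
    """Percent of aligned positions with identical residues (equal lengths, no gaps)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def percent_similarity(a, b, similar_pairs):
    """Percent of aligned positions that are identical or listed as similar."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    hits = sum(
        1 for x, y in zip(a, b)
        if x == y or frozenset((x, y)) in similar_pairs
    )
    return 100.0 * hits / len(a)


# Two hypothetical 10-residue peptides differing at positions 1, 3 and 5.
PEPTIDE_A = "ACDEFGHIKL"
PEPTIDE_B = "WCLEYGHIKL"
# Assumption: F and Y (both aromatic) are treated as "similar" at position 5.
SIMILAR = {frozenset(("F", "Y"))}
```

With these inputs, `percent_identity(PEPTIDE_A, PEPTIDE_B)` evaluates to 70.0 and `percent_similarity(PEPTIDE_A, PEPTIDE_B, SIMILAR)` to 80.0, matching the figures given in the text.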
It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled artisan can readily identify ORFs in contigs of the T. pallidum genome other than those specified for Tables 1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those ascertainable using the computer-based systems of the present invention. As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates the expression of an operably linked ORF or EMF.
As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters and promoter modulating sequences (inducible elements). One class of EMFs comprises fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.
EMF sequences can be identified within the contigs of the T. pallidum genome by their proximity to the ORF IDs provided in Tables 1-3 and ORFs within each ORF ID. An intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic segment" refers to fragments of the T. pallidum genome which are between two ORF(s) herein described. EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems of the present invention. Further, the two methods can be combined and used together.
The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided below. A sequence which is suspected of being an EMF is cloned in all three reading frames in one or more restriction sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize to T. pallidum sequences. DFs can be readily identified by identifying unique sequences within contigs of the T. pallidum genome, such as by using well-known computer analysis software, and by generating and testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines amplification or hybridization selectivity.
The sequences falling within the scope of the present invention are not limited to the specific sequences herein described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined by comparing the polynucleotide sequences provided in SEQ ID NOS: 1-744, ORF IDs and ORFs within, a representative fragment thereof, or a nucleotide sequence at least 99% and preferably 99.9% identical to said polynucleotide sequences, with a sequence from another isolate of the same species. Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by sequencing corresponding polynucleotides of T. pallidum origin isolated by using part or all of the fragments in question as a probe or primer.
Each of the ORFs of the T. pallidum genome within the ORF IDs of Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used as polynucleotide reagents in numerous ways.
For example, the sequences can be used as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample, particularly T. pallidum. Especially preferred in this regard are ORFs such as those of Table 3, which do not match previously characterized sequences from other organisms and thus are most likely to be highly selective for T. pallidum. Also particularly preferred are ORFs that can be used to distinguish between strains of T. pallidum, particularly those that distinguish medically important strains, such as drug-resistant strains.
In addition, the fragments of the present invention, as broadly described, can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides. Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition. Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al, Nucl. Acids Res. 6:3013 (1979); Cooney et al, Science 241:456 (1988); and Dervan et al, Science 251:1360 (1991). Antisense techniques in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988).
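As a non-limiting illustration of the antisense design principle described above, the following Python sketch derives a DNA oligonucleotide complementary to a chosen window of an mRNA, with the mRNA represented by its coding-strand DNA sequence. The function names and the strict enforcement of the 20 to 40 base range are assumptions made for this example; a practical design would also weigh target accessibility, melting temperature, and specificity, none of which is modeled here.

```python
# Illustrative sketch only: function names and range check are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, length=20):
    """Return a DNA oligo complementary to coding_strand[start:start + length].

    `start` is a 0-based offset into the coding strand. The 20 to 40 base
    range reflects the usual oligonucleotide length stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    target = coding_strand[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)
```

For a hypothetical coding strand beginning ATGGCTAAGCTTGGATCCGA, `antisense_oligo(seq, 0, 20)` returns TCGGATCCAAGCTTAGCCAT, which pairs base for base with the first 20 nucleotides of the corresponding mRNA.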
The present invention further provides recombinant constructs comprising one or more fragments of the T. pallidum genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the T. pallidum genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further comprise a marker sequence or heterologous ORF operably linked to the EMF.
Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available for generating the recombinant constructs of the present invention. The following vectors are provided by way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK, pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from Pharmacia). Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
The present invention further provides host cells containing any one of the isolated fragments of the T. pallidum genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.
A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in, for instance, Davis, L. et al, BASIC METHODS IN MOLECULAR BIOLOGY (1986).
A host cell containing one of the fragments of the T. pallidum genomic fragments and contigs of the present invention, can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.
The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.
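The notion of a degenerate variant can be made concrete with a brief Python sketch that translates two nucleotide fragments and compares the encoded polypeptides. The sketch assumes the standard genetic code and complete in-frame codons; the helper names `translate` and `is_degenerate_variant` are hypothetical and are offered only as one way such a check might be written.

```python
# Illustrative sketch only: assumes the standard genetic code.
from itertools import product

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Map each of the 64 codons (TTT, TTC, TTA, ...) to a one-letter amino acid;
# "*" marks a termination codon.
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(fragment_a, fragment_b):
    """True if the fragments differ in nucleotide sequence but encode the same polypeptide."""
    different = fragment_a.upper() != fragment_b.upper()
    return different and translate(fragment_a) == translate(fragment_b)
```

For instance, ATGGCTAAA and ATGGCCAAG both encode Met-Ala-Lys and would be reported as degenerate variants of one another, whereas ATGGCTAAC, which encodes Met-Ala-Asn, would not.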
Preferred nucleic acid fragments of the present invention are the ORF IDs depicted in Tables 2 and 3 and the ORFs within which encode proteins.
A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides. Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies against the native polypeptide, as discussed further below. In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain, or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity chromatography.
The polypeptides and proteins of the present invention also can be purified from cells which have been altered to express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention. The polypeptides of the present invention are preferably provided in an isolated form, and preferably are substantially purified. A recombinantly produced version of the T. pallidum polypeptide can be substantially purified by the one-step method described by Smith et al. (1988) Gene 67:31-40. Polypeptides of the invention also can be purified from natural or recombinant sources using antibodies directed against the polypeptides of the invention in methods which are well known in the art of protein purification.
The invention further provides for isolated T. pallidum polypeptides comprising an amino acid sequence selected from the group including: (a) the amino acid sequence of a full-length T. pallidum polypeptide having the complete amino acid sequence from the first methionine codon to the termination codon of each sequence listed in SEQ ID NOS: 1-744, wherein said termination codon is at the end of each SEQ ID NO: and said first methionine is the first methionine in frame with said termination codon; and (b) the amino acid sequence of a full-length T. pallidum polypeptide having the complete amino acid sequence in (a) excepting the N-terminal methionine. The polypeptides of the present invention also include polypeptides having an amino acid sequence at least 80% identical, more preferably at least 90% identical, and still more preferably 95%, 96%, 97%, 98% or 99% identical to those described in (a) and (b) above.
The present invention is further directed to polynucleotides encoding portions or fragments of the amino acid sequences described herein as well as to portions or fragments of the isolated amino acid sequences described herein. Fragments include portions of the amino acid sequences described herein at least 5 contiguous amino acids in length and selected from any two integers, one of which represents an N-terminal position and the other a C-terminal position. The initiation codon of the ORFs of the present invention is position 1. The initiation codon (position 1) for purposes of the present invention is the first methionine codon of each ORF ID which is in frame with the termination codon at the end of each said sequence. Every combination of an N-terminal and C-terminal position that a fragment at least 5 contiguous amino acid residues in length could occupy on any given ORF is included in the invention, i.e., from initiation codon up to the termination codon. "At least" means a fragment may be 5 contiguous amino acid residues in length or any integer between 5 and the number of residues in an ORF, minus 1. Therefore, included in the invention are contiguous fragments specified by any N-terminal and C-terminal positions of amino acid sequence set forth in SEQ ID NOS: 1-744 or Tables 1-3 wherein the contiguous fragment is any integer between 5 and the number of residues in an ORF minus 1.
Further, the invention includes polypeptides comprising fragments specified by size, in amino acid residues, rather than by N-terminal and C-terminal positions. The invention includes any fragment size, in contiguous amino acid residues, selected from integers between 5 and the number of residues in an ORF, minus 1. Preferred sizes of contiguous polypeptide fragments include about 5 amino acid residues, about 10 amino acid residues, about 20 amino acid residues, about 30 amino acid residues, about 40 amino acid residues, about 50 amino acid residues, about 100 amino acid residues, about 200 amino acid residues, about 300 amino acid residues, and about 400 amino acid residues. The preferred sizes are, of course, meant to exemplify, not limit, the present invention as all size fragments representing any integer between 5 and the number of residues in a full length sequence minus 1 are included in the invention. The present invention also provides for the exclusion of any fragments specified by N-terminal and C-terminal positions or by size in amino acid residues as described above. Any number of fragments specified by N-terminal and C-terminal positions or by size in amino acid residues as described above may be excluded.
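To make the combinatorics of the fragment definition explicit, the following Python sketch enumerates every pair of N-terminal and C-terminal positions for fragments of at least 5 contiguous residues and at most one residue fewer than the full length. Positions are 1-based and inclusive, with position 1 being the initiating methionine; the function names are hypothetical and the sketch is offered only as an aid to reading the definition.

```python
# Illustrative sketch only: function names are assumptions.

def fragment_positions(n_residues, min_len=5):
    """Yield (n_terminal, c_terminal) pairs, 1-based and inclusive.

    Fragment lengths run from min_len up to n_residues - 1, i.e. every
    contiguous fragment shorter than the full-length sequence.
    """
    for length in range(min_len, n_residues):
        for n_term in range(1, n_residues - length + 2):
            yield n_term, n_term + length - 1


def count_fragments(n_residues, min_len=5):
    """Number of distinct (N-terminal, C-terminal) position pairs."""
    return sum(1 for _ in fragment_positions(n_residues, min_len))
```

For a hypothetical 7-residue polypeptide the sketch yields five fragments: positions 1-5, 2-6 and 3-7 (5 residues each), and positions 1-6 and 2-7 (6 residues each).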
The above fragments need not be active since they would be useful, for example, in immunoassays, in epitope mapping, epitope tagging, to generate antibodies to a particular portion of the protein, as vaccines, and as molecular weight markers.
Further polypeptides of the present invention include polypeptides which have at least 90% similarity, more preferably at least 95% similarity, and still more preferably at least 96%, 97%, 98% or 99% similarity to those described above.
A further embodiment of the invention relates to a polypeptide which comprises the amino acid sequence of a T. pallidum polypeptide having an amino acid sequence which contains at least one conservative amino acid substitution, but not more than 50 conservative amino acid substitutions, not more than 40 conservative amino acid substitutions, not more than 30 conservative amino acid substitutions, and not more than 20 conservative amino acid substitutions. Also provided are polypeptides which comprise the amino acid sequence of a T. pallidum polypeptide, having at least one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 conservative amino acid substitutions.
By a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted (indels), or substituted with another amino acid. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
As a practical matter, whether any particular polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to the ORF amino acid sequences encoded by the sequences of SEQ ID NOS: 1-744, as described herein, can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, uses the FASTDB computer program based on the algorithm of Brutlag et al., (1990) Comp. App. Biosci. 6:237-245. In a sequence alignment the query and subject sequences are both amino acid sequences. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB amino acid alignment are:
Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence, whichever is shorter. If the subject sequence is shorter than the query sequence due to N- or C-terminal deletions, not because of internal deletions, the results, in percent identity, must be manually corrected. This is because the FASTDB program does not account for N- and C-terminal truncations of the subject sequence when calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query amino acid residues outside the farthest N- and C-terminal residues of the subject sequence are considered.
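The manual correction just described reduces to a single subtraction, sketched below in Python. The function and parameter names are hypothetical; the sketch does not run or emulate FASTDB, and it assumes that the FASTDB percent identity and the counts of unmatched terminal query residues have already been read from the alignment output.

```python
# Illustrative sketch only: does not invoke FASTDB; all names are assumptions.

def corrected_percent_identity(fastdb_percent_identity,
                               query_length,
                               unmatched_n_terminal=0,
                               unmatched_c_terminal=0):
    """Apply the manual N-/C-terminal truncation correction.

    fastdb_percent_identity: percent identity reported by the alignment
    query_length:            total residues in the query sequence
    unmatched_n_terminal:    query residues N-terminal of the subject that
                             are not matched/aligned
    unmatched_c_terminal:    query residues C-terminal of the subject that
                             are not matched/aligned
    Internal gaps are not counted; only terminal overhangs are.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    unmatched = unmatched_n_terminal + unmatched_c_terminal
    penalty = 100.0 * unmatched / query_length
    return fastdb_percent_identity - penalty
```

With a 100-residue query, 10 unmatched N-terminal residues and a reported identity of 100%, the corrected value is 90%. With no unmatched terminal residues, as when all deletions are internal, the reported value is returned unchanged.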
For example, a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity. The deletion occurs at the N-terminus of the subject sequence and therefore, the FASTDB alignment does not match/align with the first 10 residues at the N-terminus. The 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C-termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%. In another example, a 90 residue subject sequence is compared with a 100 residue query sequence. This time the deletions are internal so there are no residues at the N- or C-termini of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in the FASTDB alignment, which are not matched/aligned with the query sequence are manually corrected. No other manual corrections are to be made for the purposes of the present invention.
The above polypeptide sequences are included irrespective of whether they have their normal biological activity. This is because even where a particular polypeptide molecule does not have biological activity, one of skill in the art would still know how to use the polypeptide, for instance, as a vaccine or to generate antibodies. Other uses of the polypeptides of the present invention that do not have T. pallidum activity include, inter alia, as epitope tags, in epitope mapping, and as molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration columns using methods known to those of skill in the art.
As described below, the polypeptides of the present invention can also be used to raise polyclonal and monoclonal antibodies, which are useful in assays for detecting T. pallidum protein expression or as agonists and antagonists capable of enhancing or inhibiting T. pallidum protein function. Further, such polypeptides can be used in the yeast two-hybrid system to "capture" T. pallidum protein binding proteins which are also candidate agonists and antagonists according to the present invention. See, e.g., Fields et al. (1989) Nature 340:245-246.
Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level.
"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.
"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the polypeptides and proteins provided by this invention are assembled from fragments of the T. pallidum genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon.
"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a genetic regulatory element or elements necessary for gene expression in the host, including elements required to initiate and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.
"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby incorporated by reference in its entirety. Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, when desirable, provide amplification within the host.
Suitable prokaryotic hosts for transformation include strains of E. coli, B. subtilis, Salmonella typhimurium and various species within the genera Pseudomonas and Streptomyces. Others may also be employed as a matter of choice.
As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product. Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally by physical or chemical means, and the resulting crude extract is retained for further purification.
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:115 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.
Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites, may be used to provide the required nontranscribed genetic elements.
Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps. Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid and amino acid sequences, for example a mutant sequence, that vary from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological activity and equivalent expression characteristics are considered substantially equivalent. For purposes of determining equivalence, truncation of the mature sequence should be disregarded.
The invention further provides methods of obtaining, from other strains of T. pallidum, homologs of the fragments of the T. pallidum genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. As used herein, a sequence or protein of T. pallidum is defined as a homolog of one of the T. pallidum fragments or contigs, or of a protein encoded by one of the ORFs of the present invention, if it shares significant homology with one of the fragments of the T. pallidum genome of the present invention or with a protein encoded by one of the ORFs of the present invention. Specifically, by using the sequences disclosed herein as probes or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred among these are those with 97% and even more particularly preferred among those are homologs with 99% or more homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood that, among measures of homology, identity is particularly preferred in this regard.
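The preference tiers above can be sketched in code. The following Python fragment is illustrative only and is not part of the disclosed methods: it assumes the two sequences have already been aligned to equal length, uses identity as the measure of homology, and treats the bare "97%" tier as "97% or more". The function names `percent_identity` and `homology_tier` are hypothetical.

```python
# Illustrative sketch: classify a pre-aligned sequence pair by the homology
# tiers listed in the text. Assumptions: sequences are already aligned to the
# same length, and identity is used as the measure of homology.

# (threshold, strict, label): strict=True means "greater than", else "or more".
TIERS = [
    (99.9, False, "99.9% or more (most preferred)"),
    (99.0, False, "99% or more"),
    (97.0, False, "97% or more"),  # assumption: text says only "97%"
    (95.0, False, "95% or more"),
    (93.0, False, "93% or more (especially preferred)"),
    (90.0, True, "more than 90% (preferred)"),
    (85.0, True, "greater than 85% (significant homology)"),
]


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which the two sequences are identical."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def homology_tier(pct: float) -> str:
    """Return the highest tier from TIERS that the percentage satisfies."""
    for threshold, strict, label in TIERS:
        if pct > threshold or (not strict and pct == threshold):
            return label
    return "below the significant-homology threshold"


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pct, homology_tier(pct))
```

Note that the first two tiers are stated in the text as strict inequalities ("greater than 85%", "more than 90%"), so a pair at exactly 90% falls in the 85% tier under this reading.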
Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS: 1-744 or from a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ ID NOS: 1-744 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have been described in great detail in many publications such as, for example, Innis et al., PCR Protocols, Academic Press, San Diego, CA (1990). When using primers derived from SEQ ID NOS: 1-744 or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-744, one skilled in the art will recognize that by employing high stringency conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences which are greater than 40-50% homologous to the primer will also be amplified.
When using DNA probes derived from SEQ ID NOS: 1-744, or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-744, for colony/plaque hybridization, one skilled in the art will recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45% homologous to the probe will be obtained.
Any organism can be used as the source for homologs of the present invention so long as the organism naturally expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs are bacteria which are closely related to T. pallidum.
ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION
Each ORF corresponding to the ORF IDs provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide. As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one skilled in the art to use the T. pallidum ORFs in a manner similar to the known type of sequences for which the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al, Eds., Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar aspects of the present invention are discussed below.
1. Biosynthetic Enzymes
Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used for industrial biosynthesis. Such proteins include enzymes involved in the degradation of the intermediary products of metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis.
The various metabolic pathways present in T. pallidum can be identified based on absolute nutritional requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS: 1-744. Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.
Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a number of industrial processes including the processing of flax and other vegetable fibers, the extraction, clarification and depectinization of fruit juices, the extraction of vegetable oil and the maceration of fruits and vegetables to give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts et al., Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).
The metabolism of sugars is an important aspect of the primary metabolism of T. pallidum. Enzymes involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose oxidases which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press, Weinheim, Germany (1984).
Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 7:21 (1979). The most important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids, which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Benett et al., Eds., Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysica. Acta. 572:83 (1986), for instance.
The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).
Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).
Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:221 (1985) and Poserke, Journal of the American Oil Chemist Society 67:1758 (1984). A major use of lipases is in the fat and oil industry for the production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of washing procedures.
The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)). The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides, reduction of alkanones and oxoalkanoates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming reactions such as the aldol reaction.
When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation and organic synthesis, it is sometimes necessary to consider the respective advantages and disadvantages of using a microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an isolated, partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain (1987), p. 127.
Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic production of amino acids. The advantage of using microbial based enzyme systems is that the amino transferase enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods of Enzymology 136:419 (1987).
Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in nucleic acid synthesis, repair, and recombination.
2. Generation of Antibodies
As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of procedures and methods known in the art which are currently applied to other proteins. The proteins of the present invention can further be used to generate an antibody which selectively binds the protein.
T. pallidum protein-specific antibodies for use in the present invention can be raised against the intact T. pallidum protein or an antigenic polypeptide fragment thereof, which may be presented together with a carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if it is long enough (at least about 25 amino acids), without a carrier.
As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is meant to include intact molecules, single chain whole antibodies, and antibody fragments. Antibody fragments of the present invention include Fab and F(ab')2 and other fragments including single-chain Fvs (scFv) and disulfide-linked Fvs (sdFv). Also included in the present invention are chimeric and humanized monoclonal antibodies and polyclonal antibodies specific for the polypeptides of the present invention.
The antibodies of the present invention may be prepared by any of a variety of methods. For example, cells expressing a polypeptide of the present invention or an antigenic fragment thereof can be administered to an animal in order to induce the production of sera containing polyclonal antibodies. For example, a preparation of T. pallidum polypeptide or fragment thereof is prepared and purified to render it substantially free of natural contaminants. Such a preparation is then introduced into an animal in order to produce polyclonal antisera of greater specific activity.
In a preferred method, the antibodies of the present invention are monoclonal antibodies or binding fragments thereof. Such monoclonal antibodies can be prepared using hybridoma technology. See, e.g., Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling, et al., in: MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS 563-681 (Elsevier, N.Y., 1981).
Fab and F(ab')2 fragments may be produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively, T. pallidum polypeptide-binding fragments, chimeric, and humanized antibodies can be produced through the application of recombinant DNA technology or through synthetic chemistry using methods known in the art.
Alternatively, additional antibodies capable of binding to the polypeptide antigen of the present invention may be produced in a two-step procedure through the use of anti-idiotypic antibodies. Such a method makes use of the fact that antibodies are themselves antigens, and that, therefore, it is possible to obtain an antibody which binds to a second antibody. In accordance with this method, T. pallidum polypeptide-specific antibodies are used to immunize an animal, preferably a mouse. The splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the T. pallidum polypeptide-specific antibody can be blocked by the T. pallidum polypeptide antigen. Such antibodies comprise anti-idiotypic antibodies to the T. pallidum polypeptide-specific antibody and can be used to immunize an animal to induce formation of further T. pallidum polypeptide-specific antibodies.
Antibodies and fragments thereof of the present invention may be described by the portion of a polypeptide of the present invention recognized or specifically bound by the antibody. Antibody binding fragments of a polypeptide of the present invention may be described or specified in the same manner as for polypeptide fragments discussed above, i.e., by N-terminal and C-terminal positions or by size in contiguous amino acid residues. Any number of antibody binding fragments of a polypeptide of the present invention, specified by N-terminal and C-terminal positions or by size in amino acid residues, as described above, may also be excluded from the present invention. Therefore, the present invention includes antibodies that specifically bind a particularly described fragment of a polypeptide of the present invention and allows for the exclusion of the same.
Antibodies and fragments thereof of the present invention may also be described or specified in terms of their cross-reactivity. Antibodies and fragments that do not bind polypeptides of any species of Treponema other than T. pallidum are included in the present invention. Likewise, antibodies and fragments that bind only species of Treponema, i.e., antibodies and fragments that do not bind bacteria from any genus other than Treponema, are included in the present invention.
The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well known in the art; see, for example, Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976).
The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the T. pallidum genome is expressed.
The present invention further provides the above-described antibodies immobilized on a solid support. Examples of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, and acrylic resins such as polyacrylamide and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34 Academic Press, N. Y. (1974)). The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immunoaffinity purification of the proteins of the present invention.
3. Epitope-Bearing Portions
In another aspect, the invention provides peptides and polypeptides comprising epitope-bearing portions of the T. pallidum polypeptides of the present invention. These epitopes are immunogenic or antigenic epitopes of the polypeptides of the present invention. An "immunogenic epitope" is defined as a part of a protein that elicits an antibody response when the whole protein or polypeptide is the immunogen. These immunogenic epitopes are believed to be confined to a few loci on the molecule. On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic determinant" or "antigenic epitope." The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, e.g., Geysen, et al. (1983) Proc. Natl. Acad. Sci. USA 81:3998-4002. Amino acid residues comprising antigenic epitopes may be determined by algorithms such as the Jameson-Wolf analysis or similar algorithms, or by in vivo testing for an antigenic response using the methods described herein or those known in the art.
As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein. See, e.g., Sutcliffe, et al., (1983) Science 219:660-666. Peptides capable of eliciting protein-reactive sera are frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or carboxyl terminals. Peptides that are extremely hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies that bind to the mimicked protein; longer peptides, especially those containing proline residues, usually are effective. See, Sutcliffe, et al., supra, p. 661. For instance, 18 of 20 peptides designed according to these guidelines, containing 8-39 residues covering 75% of the sequence of the influenza virus hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and 18/18 from the rabies glycoprotein induced antibodies that precipitated the respective proteins.
Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. Thus, a high proportion of hybridomas obtained by fusion of spleen cells from donors immunized with an antigen epitope-bearing peptide generally secrete antibody reactive with the native protein. See Sutcliffe, et al., supra, p. 663. The antibodies raised by antigenic epitope-bearing peptides or polypeptides are useful to detect the mimicked protein, and antibodies to different peptides may be used for tracking the fate of various regions of a protein precursor which undergoes post-translational processing. The peptides and anti-peptide antibodies may be used in a variety of qualitative or quantitative assays for the mimicked protein, for instance in competition assays since it has been shown that even short peptides (e.g., about 9 amino acids) can bind and displace the larger peptides in immunoprecipitation assays. See, e.g., Wilson, et al., (1984) Cell 37:767-778. The anti-peptide antibodies of the invention also are useful for purification of the mimicked protein, for instance, by adsorption chromatography using methods known in the art.
Antigenic epitope-bearing peptides and polypeptides of the invention designed according to the above guidelines preferably contain a sequence of at least seven, more preferably at least nine and most preferably between about 10 to about 50 amino acids (i.e. any integer between 7 and 50) contained within the amino acid sequence of a polypeptide of the invention. However, peptides or polypeptides comprising a larger portion of an amino acid sequence of a polypeptide of the invention, containing about 50 to about 100 amino acids, or any length up to and including the entire amino acid sequence of a polypeptide of the invention, also are considered epitope-bearing peptides or polypeptides of the invention and also are useful for inducing antibodies that react with the mimicked protein. Preferably, the amino acid sequence of the epitope-bearing peptide is selected to provide substantial solubility in aqueous solvents (i.e., the sequence includes relatively hydrophilic residues and highly hydrophobic sequences are preferably avoided); and sequences containing proline residues are particularly preferred. The epitope-bearing peptides and polypeptides of the present invention may be produced by any conventional means for making peptides or polypeptides including recombinant means using nucleic acid molecules of the invention. For instance, an epitope-bearing amino acid sequence of the present invention may be fused to a larger polypeptide which acts as a carrier during recombinant production and purification, as well as during immunization to produce anti-peptide antibodies. Epitope-bearing peptides also may be synthesized using known methods of chemical synthesis. 
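The selection guidelines above (length of at least seven residues, relatively hydrophilic sequence, proline preferred) can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the method of the invention: the text names no hydrophobicity scale, so the Kyte-Doolittle hydropathy values are used here as a stand-in, the cutoff of 0.0 is arbitrary, and the function names `mean_hydropathy` and `candidate_peptides` are hypothetical.

```python
# Illustrative sketch: enumerate fixed-length windows of a protein sequence
# and keep the relatively hydrophilic ones, listing proline-containing
# peptides first. Assumptions: Kyte-Doolittle hydropathy as the scale
# (the text does not name one) and an arbitrary cutoff of 0.0.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy; lower values are more hydrophilic."""
    # A KeyError here means the peptide contains a non-standard residue code.
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def candidate_peptides(protein: str, length: int = 10, max_hydropathy: float = 0.0):
    """Return (start, peptide, mean hydropathy, has_proline) tuples.

    `start` is the 1-based N-terminal position of the window in `protein`.
    """
    if not 7 <= length <= 50:
        raise ValueError("length must be between 7 and 50 residues")
    hits = []
    for i in range(len(protein) - length + 1):
        peptide = protein[i:i + length]
        score = mean_hydropathy(peptide)
        if score <= max_hydropathy:
            hits.append((i + 1, peptide, round(score, 2), "P" in peptide))
    # Proline-containing peptides first, then the most hydrophilic.
    hits.sort(key=lambda hit: (not hit[3], hit[2]))
    return hits


if __name__ == "__main__":
    # Made-up sequence used only to exercise the function.
    for hit in candidate_peptides("MKLLVLAADEKRPSDEKRGLLIVA", length=10):
        print(hit)
```

A real selection would also consider the predicted antigenicity of each window (for example by the Jameson-Wolf analysis mentioned earlier), which this sketch does not attempt.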
For instance, Houghten has described a simple method for synthesis of large numbers of peptides, such as 10-20 mg of 248 different 13-residue peptides representing single amino acid variants of a segment of the HA1 polypeptide which were prepared and characterized (by ELISA-type binding studies) in less than four weeks (Houghten, R. A. Proc. Natl. Acad. Sci. USA 82:5131-5135 (1985)). This "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No. 4,631,211 to Houghten and coworkers (1986). In this procedure the individual resins for the solid-phase synthesis of various peptides are contained in separate solvent-permeable packets, enabling the optimal use of the many identical repetitive steps involved in solid-phase methods. A completely manual procedure allows 500-1000 or more syntheses to be conducted simultaneously (Houghten et al. (1985) Proc. Natl. Acad. Sci. 82:5131-5135 at 5134).
Epitope-bearing peptides and polypeptides of the invention are used to induce antibodies according to methods well known in the art. See, e.g., Sutcliffe, et al., supra; Wilson, et al., supra; and Bittle, et al. (1985) J. Gen. Virol. 66:2347-2354. Generally, animals may be immunized with free peptide; however, anti-peptide antibody titer may be boosted by coupling of the peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid. For instance, peptides containing cysteine may be coupled to carrier using a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be coupled to carrier using a more general linking agent such as glutaraldehyde. Animals such as rabbits, rats and mice are immunized with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 μg peptide or carrier protein and Freund's adjuvant. Several booster injections may be needed, for instance, at intervals of about two weeks, to provide a useful titer of anti-peptide antibody which can be detected, for example, by ELISA assay using free peptide adsorbed to a solid surface. The titer of anti-peptide antibodies in serum from an immunized animal may be increased by selection of anti-peptide antibodies, for instance, by adsorption to the peptide on a solid support and elution of the selected antibodies according to methods well known in the art.
Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response when the whole protein is the immunogen, are identified according to methods known in the art. For instance, Geysen, et al., supra, discloses a procedure for rapid concurrent synthesis on solid supports of hundreds of peptides of sufficient purity to react in an ELISA. Interaction of synthesized peptides with antibodies is then easily detected without removing them from the support. In this manner a peptide bearing an immunogenic epitope of a desired protein may be identified routinely by one of ordinary skill in the art. For instance, the immunologically important epitope in the coat protein of foot-and-mouth disease virus was located by Geysen et al., supra, with a resolution of seven amino acids by synthesis of an overlapping set of all 208 possible hexapeptides covering the entire 213 amino acid sequence of the protein. Then, a complete replacement set of peptides in which all 20 amino acids were substituted in turn at every position within the epitope was synthesized, and the particular amino acids conferring specificity for the reaction with antibody were determined. Thus, peptide analogs of the epitope-bearing peptides of the invention can be made routinely by this method. U.S. Patent No. 4,708,781 to Geysen (1987) further describes this method of identifying a peptide bearing an immunogenic epitope of a desired protein.
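The arithmetic of the overlapping-peptide scan described above can be checked with a brief Python sketch: a protein of 213 residues yields 213 - 6 + 1 = 208 overlapping hexapeptides. The sketch is illustrative only; the function names and the placeholder sequences are hypothetical and are not taken from the cited work. The replacement set here places each of the 20 standard amino acids at every position, so the parent peptide recurs once per position.

```python
# Illustrative sketch of the two peptide sets described in the text:
# (1) all overlapping peptides of length k across a protein, and
# (2) a replacement set in which every position of an epitope is substituted
#     in turn with each of the 20 standard amino acids.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def overlapping_peptides(protein: str, k: int = 6) -> list[str]:
    """All len(protein) - k + 1 overlapping windows of length k."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def replacement_set(epitope: str) -> list[str]:
    """Peptides with each position replaced in turn by all 20 amino acids.

    The parent sequence appears once per position (when the substituted
    residue equals the original), so the list has 20 * len(epitope) entries.
    """
    return [
        epitope[:pos] + aa + epitope[pos + 1:]
        for pos in range(len(epitope))
        for aa in AMINO_ACIDS
    ]


if __name__ == "__main__":
    placeholder_protein = "A" * 213  # stand-in for a 213-residue sequence
    print(len(overlapping_peptides(placeholder_protein)))  # 208
    variants = replacement_set("GDLQVL")  # arbitrary example hexapeptide
    print(len(variants), len(set(variants)))  # 120 115
```

For a hexapeptide the replacement set therefore comprises 120 syntheses, of which 114 are distinct single-residue variants and the remainder are copies of the parent peptide that can serve as controls.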
Further still, U.S. Patent No. 5,194,392, to Geysen (1990), describes a general method of detecting or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of interest. More generally, U.S. Patent No. 4,433,092, also to Geysen (1989), describes a method of detecting or determining a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the epitope-bearing peptides of the invention also can be made routinely by these methods. The entire disclosure of each document cited in this section on "Polypeptides and Fragments" is hereby incorporated herein by reference. As one of skill in the art will appreciate, the polypeptides of the present invention and the epitope-bearing fragments thereof described above can be combined with parts of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins facilitate purification and show an increased half-life in vivo. This has been shown, e.g., for chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins (EPA 0,394,827; Traunecker et al. (1988) Nature 331:84-86).
Fusion proteins that have a disulfide-linked dimeric structure due to the IgG part can also be more efficient in binding and neutralizing other molecules than a monomeric T. pallidum polypeptide or fragment thereof alone. See Fountoulakis et al. (1995) J. Biochem. 270:3958-3964. Nucleic acids encoding the above epitopes of T. pallidum polypeptides can also be recombined with a gene of interest as an epitope tag to aid in detection and purification of the expressed polypeptide.
3. Diagnostic Assays and Kits
The present invention further relates to methods for assaying Borrelia infection in an animal by detecting the expression of genes encoding Borrelia polypeptides of the present invention. The methods comprise analyzing tissue or body fluid from the animal for Borrelia-specific antibodies, nucleic acids, or proteins. Analysis of nucleic acid specific to Borrelia is assayed by PCR or hybridization techniques using nucleic acid sequences of the present invention as either hybridization probes or primers. See, e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2nd ed., 1989, page 54 reference); Eremeeva et al. (1994) J. Clin. Microbiol. 32:803-810 (describing differentiation among spotted fever group Rickettsiae species by analysis of restriction fragment length polymorphism of PCR-amplified DNA) and Chen et al. (1994) J. Clin. Microbiol. 32:589-595 (detecting T. pallidum nucleic acids via PCR).
Where diagnosis of a disease state related to infection with Borrelia has already been made, the present invention is useful for monitoring progression or regression of the disease state whereby patients exhibiting enhanced Borrelia gene expression will experience a worse clinical outcome relative to patients expressing these gene(s) at a lower level.
By "biological sample" is intended any biological sample obtained from an animal, cell line, tissue culture, or other source which contains Borrelia polypeptide, mRNA, or DNA. Biological samples include body fluids (such as saliva, blood, plasma, urine, mucus, synovial fluid, etc.), tissues (such as muscle, skin, and cartilage) and any other biological source suspected of containing Borrelia polypeptides or nucleic acids. Methods for obtaining biological samples such as tissue are well known in the art.
The present invention is useful for detecting diseases related to Borrelia infections in animals. Preferred animals include monkeys, apes, cats, dogs, birds, cows, pigs, mice, horses, rabbits and humans. Particularly preferred are humans.
Total RNA can be isolated from a biological sample using any suitable technique such as the single-step guanidinium-thiocyanate-phenol-chloroform method described in Chomczynski et al. (1987) Anal. Biochem. 162:156-159. mRNA encoding Borrelia polypeptides having sufficient homology to the nucleic acid sequences identified in SEQ ID NOS: 1-744 to allow for hybridization between complementary sequences is then assayed using any appropriate method. These include Northern blot analysis, S1 nuclease mapping, the polymerase chain reaction (PCR), reverse transcription in combination with the polymerase chain reaction (RT-PCR), and reverse transcription in combination with the ligase chain reaction (RT-LCR).
Northern blot analysis can be performed as described in Harada et al. (1990) Cell 63:303-312. Briefly, total RNA is prepared from a biological sample as described above. For the Northern blot, the RNA is denatured in an appropriate buffer (such as glyoxal/dimethyl sulfoxide/sodium phosphate buffer), subjected to agarose gel electrophoresis, and transferred onto a nitrocellulose filter. After the RNAs have been linked to the filter by a UV linker, the filter is prehybridized in a solution containing formamide, SSC, Denhardt's solution, denatured salmon sperm DNA, SDS, and sodium phosphate buffer. A T. pallidum polynucleotide sequence shown in SEQ ID NOS: 1-744, or portion thereof, labeled according to any appropriate method (such as the 32P-multiprimed DNA labeling system (Amersham)) is used as probe. After hybridization overnight, the filter is washed and exposed to x-ray film. DNA for use as probe according to the present invention is described in the sections above and will preferably be at least 15 nucleotides in length.
S1 mapping can be performed as described in Fujita et al. (1987) Cell 49:357-367. To prepare probe DNA for use in S1 mapping, the sense strand of an above-described T. pallidum DNA sequence of the present invention is used as a template to synthesize labeled antisense DNA. The antisense DNA can then be digested using an appropriate restriction endonuclease to generate further DNA probes of a desired length. Such antisense probes are useful for visualizing protected bands corresponding to the target mRNA (i.e., mRNA encoding Borrelia polypeptides).
Levels of mRNA encoding Borrelia polypeptides are assayed, e.g., using the RT-PCR method described in Makino et al. (1990) Technique 2:295-301. By this method, the radioactivities of the "amplicons" in the polyacrylamide gel bands are linearly related to the initial concentration of the target mRNA. Briefly, this method involves adding total RNA isolated from a biological sample to a reaction mixture containing a RT primer and appropriate buffer. After incubating for primer annealing, the mixture can be supplemented with a RT buffer, dNTPs, DTT, RNase inhibitor and reverse transcriptase. After incubation to achieve reverse transcription of the RNA, the RT products are then subjected to PCR using labeled primers. Alternatively, rather than labeling the primers, a labeled dNTP can be included in the PCR reaction mixture. PCR amplification can be performed in a DNA thermal cycler according to conventional techniques. After a suitable number of rounds to achieve amplification, the PCR reaction mixture is electrophoresed on a polyacrylamide gel. After drying the gel, the radioactivity of the appropriate bands (corresponding to the mRNA encoding the Borrelia polypeptides of the present invention) is quantified using an imaging analyzer. RT and PCR reaction ingredients and conditions, reagent and gel concentrations, and labeling methods are well known in the art. Variations on the RT-PCR method will be apparent to the skilled artisan. Other PCR methods that can detect the nucleic acid of the present invention can be found in PCR PRIMER: A LABORATORY MANUAL (C.W. Dieffenbach et al. eds., Cold Spring Harbor Lab Press, 1995).
The polynucleotides of the present invention, including both DNA and RNA, may be used to detect polynucleotides of the present invention or Borrelia species including T. pallidum using bio chip technology. The present invention includes both high density chip arrays (>1000 oligonucleotides per cm2) and low density chip arrays (<1000 oligonucleotides per cm2). Bio chips comprising arrays of polynucleotides of the present invention may be used to detect Borrelia species, including T. pallidum, in biological and environmental samples and to diagnose an animal, including humans, with a T. pallidum or other Borrelia infection. The bio chips of the present invention may comprise polynucleotide sequences of other pathogens including bacterial, viral, parasitic, and fungal polynucleotide sequences, in addition to the polynucleotide sequences of the present invention, for use in rapid differential pathogenic detection and diagnosis. The bio chips can also be used to monitor a T. pallidum or other Borrelia infection and to monitor the genetic changes (deletions, insertions, mismatches, etc.) in response to drug therapy in the clinic and drug development in the laboratory. The bio chip technology comprising arrays of polynucleotides of the present invention may also be used to simultaneously monitor the expression of a multiplicity of genes, including those of the present invention. The polynucleotides used to comprise a selected array may be specified in the same manner as for the fragments, i.e., by their 5' and 3' positions or length in contiguous base pairs and include from. Methods and particular uses of the polynucleotides of the present invention to detect Borrelia species, including T. pallidum, using bio chip technology include those known in the art and those of: U.S. Patent Nos. 5510270, 5545531, 5445934, 5677195, 5532128, 5556752, 5527681, 5451683, 5424186, 5607646, 5658732 and World Patent Nos. WO/9710365, WO/9511995, WO/9743447, WO/9535505, each incorporated herein in their entireties.
Biosensors using the polynucleotides of the present invention may also be used to detect, diagnose, and monitor T. pallidum or other Borrelia species and infections thereof. Biosensors using the polynucleotides of the present invention may also be used to detect particular polynucleotides of the present invention. Biosensors using the polynucleotides of the present invention may also be used to monitor the genetic changes (deletions, insertions, mismatches, etc.) in response to drug therapy in the clinic and drug development in the laboratory. Methods and particular uses of the polynucleotides of the present invention to detect Borrelia species, including T. pallidum, using biosensors include those known in the art and those of: U.S. Patent Nos. 5721102, 5658732, 5631170, and World Patent Nos. WO97/35011, WO/9720203, each incorporated herein in their entireties. Thus, the present invention includes both bio chips and biosensors comprising polynucleotides of the present invention and methods of their use.
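Two conventions from the bio chip discussion above, the density threshold separating high and low density arrays and the specification of an array polynucleotide by its 5' and 3' positions, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, positions are assumed to be 1-based and inclusive, and because the text does not classify an array of exactly 1000 oligonucleotides per cm2, the sketch reports that case as unspecified.

```python
# Illustrative sketch only; function names and conventions are hypothetical.

def chip_density_class(oligo_count, area_cm2):
    """Classify an array as high or low density using the 1000 per cm2 threshold in the text."""
    if area_cm2 <= 0:
        raise ValueError("array area must be positive")
    density = oligo_count / area_cm2
    if density > 1000:
        return "high"
    if density < 1000:
        return "low"
    return "unspecified"  # the text leaves exactly 1000 per cm2 unclassified


def fragment_by_positions(sequence, five_prime, three_prime):
    """Return the fragment spanning the given 5' and 3' positions (assumed 1-based, inclusive)."""
    if not 1 <= five_prime <= three_prime <= len(sequence):
        raise ValueError("positions must satisfy 1 <= 5' <= 3' <= sequence length")
    return sequence[five_prime - 1:three_prime]
```

For example, `fragment_by_positions("ACGTACGT", 2, 5)` selects the four bases at positions 2 through 5 of the example string.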
Assaying Borrelia polypeptide levels in a biological sample can occur using any art-known method, such as antibody-based techniques. For example, Borrelia polypeptide expression in tissues can be studied with classical immunohistological methods. In these, the specific recognition is provided by the primary antibody (polyclonal or monoclonal) but the secondary detection system can utilize fluorescent, enzyme, or other conjugated secondary antibodies. As a result, an immunohistological staining of tissue section for pathological examination is obtained. Tissues can also be extracted, e.g., with urea and neutral detergent, for the liberation of Borrelia polypeptides for Western-blot or dot/slot assay. See, e.g., Jalkanen, M. et al. (1985) J. Cell. Biol. 101:976-985; Jalkanen, M. et al. (1987) J. Cell. Biol. 105:3087-3096. In this technique, which is based on the use of cationic solid phases, quantitation of a Borrelia polypeptide can be accomplished using an isolated Borrelia polypeptide as a standard. This technique can also be applied to body fluids. Other antibody-based methods useful for detecting Borrelia polypeptide gene expression include immunoassays, such as the ELISA and the radioimmunoassay (RIA). For example, Borrelia polypeptide-specific monoclonal antibodies can be used both as an immunoabsorbent and as an enzyme-labeled probe to detect and quantify a Borrelia polypeptide. The amount of a Borrelia polypeptide present in the sample can be calculated by reference to the amount present in a standard preparation using a linear regression computer algorithm. Such an ELISA is described in Iacobelli et al. (1988) Breast Cancer Research and Treatment 11:19-30. In another ELISA assay, two distinct specific monoclonal antibodies can be used to detect Borrelia polypeptides in a body fluid. In this assay, one of the antibodies is used as the immunoabsorbent and the other as the enzyme-labeled probe.
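The calculation of a sample amount by reference to a standard preparation using linear regression can be sketched as below. This is a minimal illustration, not the algorithm of Iacobelli et al.: the function names and the standard readings are made up for the example, and the sketch assumes the assay signal is linear in the amount of polypeptide over the range of the standards.

```python
# Illustrative sketch only; function names and example readings are hypothetical.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_amount(signal, slope, intercept):
    """Invert the fitted standard curve to convert an assay signal into an amount."""
    if slope == 0:
        raise ValueError("a flat standard curve cannot be inverted")
    return (signal - intercept) / slope


# Made-up standards: known amounts (arbitrary units) and their measured signals.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.1, 0.3, 0.5, 0.9]
slope, intercept = fit_line(standard_amounts, standard_signals)
sample_amount = estimate_amount(0.7, slope, intercept)
```

Estimates are only meaningful for signals that fall within the range spanned by the standards; extrapolating beyond them relies on linearity that the standards do not confirm.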
The above techniques may be conducted essentially as a "one-step" or "two-step" assay. The "one-step" assay involves contacting the Borrelia polypeptide with immobilized antibody and, without washing, contacting the mixture with the labeled antibody. The "two-step" assay involves washing before contacting the mixture with the labeled antibody. Other conventional methods may also be employed as suitable. It is usually desirable to immobilize one component of the assay system on a support, thereby allowing other components of the system to be brought into contact with the component and readily removed from the sample. Variations of the above and other immunological methods included in the present invention can also be found in Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988).
Suitable enzyme labels include, for example, those from the oxidase group, which catalyze the production of hydrogen peroxide by reacting with substrate. Glucose oxidase is particularly preferred as it has good stability and its substrate (glucose) is readily available. Activity of an oxidase label may be assayed by measuring the concentration of hydrogen peroxide formed by the enzyme-labeled antibody/substrate reaction. Besides enzymes, other suitable labels include radioisotopes, such as iodine (125I, 121I), carbon (14C), sulphur (35S), tritium (3H), indium (112In), and technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and biotin.
Further suitable labels for the Borrelia polypeptide-specific antibodies of the present invention are provided below. Examples of suitable enzyme labels include malate dehydrogenase, Borrelia nuclease, delta-5-steroid isomerase, yeast-alcohol dehydrogenase, alpha-glycerol phosphate dehydrogenase, triose phosphate isomerase, peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase, and acetylcholine esterase. Examples of suitable radioisotopic labels include 3H, 111In, 125I, 131I, 32P, 35S, 14C, 51Cr, 57To, 58Co, 59Fe, 75Se, 152Eu, 90Y, 67Cu, 217Ci, 211At, 212Pb, 47Sc, 109Pd, etc. 111In is a preferred isotope where in vivo imaging is used since it avoids the problem of dehalogenation of the 125I- or 131I-labeled monoclonal antibody by the liver. In addition, this radionuclide has a more favorable gamma emission energy for imaging. See, e.g., Perkins et al. (1985) Eur. J. Nucl. Med. 10:296-301; Carasquillo et al. (1987) J. Nucl. Med. 28:281-287. For example, 111In coupled to monoclonal antibodies with 1-(p-isothiocyanatobenzyl)-DPTA has shown little uptake in non-tumorous tissues, particularly the liver, and therefore enhances specificity of tumor localization. See, Esteban et al. (1987) J. Nucl. Med. 28:861-870. Examples of suitable non-radioactive isotopic labels include 157Gd, 55Mn, 162Dy, 52Tr, and 56Fe.
Examples of suitable fluorescent labels include a 152Eu label, a fluorescein label, an isothiocyanate label, a rhodamine label, a phycoerythrin label, a phycocyanin label, an allophycocyanin label, an o-phthaldehyde label, and a fluorescamine label. Examples of suitable toxin labels include Pseudomonas toxin, diphtheria toxin, ricin, and cholera toxin.
Examples of chemiluminescent labels include a luminol label, an isoluminol label, an aromatic acridinium ester label, an imidazole label, an acridinium salt label, an oxalate ester label, a luciferin label, a luciferase label, and an aequorin label. Examples of nuclear magnetic resonance contrasting agents include heavy metal nuclei such as Gd, Mn, and iron.
Typical techniques for binding the above-described labels to antibodies are provided by Kennedy et al. (1976) Clin. Chim. Acta 70:1-31, and Schurs et al. (1977) Clin. Chim. Acta 81:1-40. Coupling techniques mentioned in the latter are the glutaraldehyde method, the periodate method, the dimaleimide method, the m-maleimidobenzyl-N-hydroxy-succinimide ester method, all of which methods are incorporated by reference herein.
In a related aspect, the invention includes a diagnostic kit for use in screening serum containing antibodies specific against T. pallidum infection. Such a kit may include an isolated T. pallidum antigen comprising an epitope which is specifically immunoreactive with at least one anti-T. pallidum antibody. Such a kit also includes means for detecting the binding of said antibody to the antigen. In specific embodiments, the kit may include a recombinantly produced or chemically synthesized peptide or polypeptide antigen. The peptide or polypeptide antigen may be attached to a solid support.
In a more specific embodiment, the detecting means of the above-described kit includes a solid support to which said peptide or polypeptide antigen is attached. Such a kit may also include a non-attached reporter-labeled anti-human antibody. In this embodiment, binding of the antibody to the T. pallidum antigen can be detected by binding of the reporter-labeled antibody to the anti-T. pallidum polypeptide antibody.
Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the DFs or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound DF or antibody. In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or DF.
In a related aspect, the invention includes a method of detecting T. pallidum infection in a subject. This detection method includes reacting a body fluid, preferably serum, from the subject with an isolated T. pallidum antigen, and examining the antigen for the presence of bound antibody. In a specific embodiment, the method includes a polypeptide antigen attached to a solid support, and serum is reacted with the support. Subsequently, the support is reacted with a reporter-labeled anti-human antibody. The support is then examined for the presence of reporter-labeled antibody.
The solid surface reagent employed in the above assays and kits is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, 96-well plates or filter material. These attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group, to a chemically reactive group on the solid support, such as an activated carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin coated plates can be used in conjunction with biotinylated antigen(s). The polypeptides and antibodies of the present invention, including fragments thereof, may be used to detect Borrelia species including T. pallidum using bio chip and biosensor technology. Bio chips and biosensors of the present invention may comprise the polypeptides of the present invention to detect antibodies which specifically recognize Borrelia species, including T. pallidum. Bio chips and biosensors of the present invention may also comprise antibodies which specifically recognize the polypeptides of the present invention to detect Borrelia species, including T. pallidum, or specific polypeptides of the present invention. Bio chips or biosensors comprising polypeptides or antibodies of the present invention may be used to detect Borrelia species, including T. pallidum, in biological and environmental samples and to diagnose an animal, including humans, with a T. pallidum or other Borrelia infection. Thus, the present invention includes both bio chips and biosensors comprising polypeptides or antibodies of the present invention and methods of their use.
The bio chips of the present invention may further comprise polypeptide sequences of other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition to the polypeptide sequences of the present invention, for use in rapid differential pathogenic detection and diagnosis. The bio chips of the present invention may further comprise antibodies or fragments thereof specific for other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition to the antibodies or fragments thereof of the present invention, for use in rapid differential pathogenic detection and diagnosis. The bio chips and biosensors of the present invention may also be used to monitor a T. pallidum or other Borrelia infection and to monitor the genetic changes (amino acid deletions, insertions, substitutions, etc.) in response to drug therapy in the clinic and drug development in the laboratory. The bio chips and biosensors comprising polypeptides or antibodies of the present invention may also be used to simultaneously monitor the expression of a multiplicity of polypeptides, including those of the present invention. The polypeptides used to comprise a bio chip or biosensor of the present invention may be specified in the same manner as for the fragments, i.e., by their N-terminal and C-terminal positions or length in contiguous amino acid residues. Methods and particular uses of the polypeptides and antibodies of the present invention to detect Borrelia species, including T. pallidum, or specific polypeptides using bio chip and biosensor technology include those known in the art, those of the U.S. Patent Nos. and World Patent Nos. listed above for bio chips and biosensors using polynucleotides of the present invention, and those of: U.S. Patent Nos. 5658732, 5135852, 5567301, 5677196, 5690894 and World Patent Nos. WO9729366, WO9612957, each incorporated herein in their entireties.
4. Screening Assay for Binding Agents
Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the fragments and the T. pallidum fragment and contigs herein described. In general, such methods comprise steps of:
(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated fragment of the T. pallidum genome; and
(b) determining whether the agent binds to said protein or said fragment.
The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, for example see Hurby et al, "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al, Biochemistry 25:9230-8 (1989), or pharmaceutical agents, or the like.
In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control. One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.
Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
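The design of an antisense oligonucleotide complementary to a region of an mRNA can be sketched as a reverse-complement operation. The following is a minimal illustration only: the function name, the 0-based start convention, and the example sequence are assumptions made for this sketch, and the 20 to 40 base limit is taken from the text. Real design would also weigh factors such as target accessibility and specificity, which the sketch does not address.

```python
# Illustrative sketch only; names, conventions, and the example sequence are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_oligo(mrna, start, length):
    """Return a DNA oligo, written 5' to 3', complementary to mrna[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("the text describes agents of 20 to 40 bases")
    if start < 0 or start + length > len(mrna):
        raise ValueError("target region lies outside the sequence")
    target = mrna[start:start + length].upper()
    # Reverse the target and complement each base so the oligo pairs antiparallel to it.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Made-up 30-base mRNA fragment, used only to exercise the function.
example_mrna = "AUGGCUAGCUUAGGCAUCGAUCGGAUCCGA"
oligo = antisense_oligo(example_mrna, 0, 20)
```

Because the oligo is the reverse complement of the target, reading the oligo 3' to 5' against the mRNA 5' to 3' gives a Watson-Crick pair at every position.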
5. Pharmaceutical Compositions and Vaccines
The present invention further provides pharmaceutical agents which can be used to modulate the growth or pathogenicity of T. pallidum, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical agent" is defined as a composition of matter which can be formulated using known techniques to provide a pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are identified using the herein described assays.
As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of T. pallidum or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth by binding to an important protein thus blocking the biological activity of the protein, while other agents may bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs of the present invention and serve as a vaccine. The development and use of a vaccine based on outer membrane components are well known in the art.
As used herein, a "related organism" is a broad term which refers to any organism whose growth can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As such, related organisms do not need to be bacterial but may be fungal or viral pathogens.
The pharmaceutical agents and compositions of the present invention may be administered in a convenient manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of administration, symptoms, etc.
The agents of the present invention can be used in native form or can be modified to form a chemical derivative. As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.
For example, such moieties may change an immunological character of the functional derivative, such as affinity for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers also may be effected in this way and can be assayed by methods well known to the skilled artisan.
The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single or multiple injections.
In providing a patient with one of the agents of the present invention, the dosage of the administered agent will vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent. As used herein, two or more compounds or agents are said to be administered "in combination" with each other when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can be measured at the same time. The composition of the present invention can be administered concurrently with, prior to, or following the administration of the other agent. The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to decrease the rate of growth (as defined above) of the target organism.
The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose. When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological symptoms of the infection and to increase the rate of recovery.
The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be "pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is physiologically significant if its presence results in a detectable change in the physiology of a recipient patient. The agents of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby these materials, or their functional derivatives, are combined in a mixture with a pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins, e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable for effective administration, such compositions will contain an effective amount of one or more of the agents of the present invention, together with a suitable amount of carrier vehicle. Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired time course of release. Another possible method to control the duration of action by controlled release preparations is to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).
The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.
6. Shot-Gun Approach to Megabase DNA Sequencing
The present invention further demonstrates that a large sequence can be sequenced using a random shotgun approach. This procedure, described in detail in the examples that follow, eliminates the up-front cost of isolating and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.
Certain aspects of the present invention are described in greater detail in the examples that follow. The examples are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the inventors, as will be clear to those of skill in the art from reading the present disclosure.
ILLUSTRATIVE EXAMPLES
LIBRARIES AND SEQUENCING
1. Shotgun Sequencing Probability Analysis
The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman (Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution. According to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation $P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not been sequenced is the same as the probability that any region of the whole sequence L has not been determined and, therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage, approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence has been generated, coverage is 5X for a 2.8 Mb genome and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with an average sequence read length of 410 bp.
Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the equation $g = L/n$, where n here is the number of random sequence reads rather than the number of nucleotides sequenced. Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.
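The estimates above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming 17,000 clones read from both ends (34,000 reads) at 410 bp each; the function name `shotgun_stats` and the dictionary keys are illustrative and are not part of the disclosure.

```python
import math

def shotgun_stats(genome_len, n_reads, read_len):
    """Lander-Waterman style estimates for a random shotgun project."""
    total_bases = n_reads * read_len      # nucleotides of random sequence generated
    coverage = total_bases / genome_len   # fold coverage, m
    unsequenced = math.exp(-coverage)     # P0 = e^-m, fraction of bases not sequenced
    total_gap = genome_len * unsequenced  # G = L * e^-m
    avg_gap = genome_len / n_reads        # g = L / (number of reads)
    return {
        "coverage": coverage,
        "unsequenced_fraction": unsequenced,
        "total_gap_bp": total_gap,
        "avg_gap_bp": avg_gap,
        "n_gaps": total_gap / avg_gap,
    }

# 17,000 clones sequenced from both insert ends, 410 bp average read length
stats = shotgun_stats(genome_len=2_800_000, n_reads=34_000, read_len=410)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```

With these inputs the coverage comes out just under 5X, so the computed gap count is slightly below the rounded figure of about 240 quoted in the text, while the average gap size agrees with the quoted 82 bp.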
The treatment above is essentially that of Lander and Waterman, Genomics 2:231 (1988).
2. Random Library Construction
In order to approximate the random model described above during actual sequencing, a nearly ideal library of cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.
T. pallidum DNA is prepared by phenol extraction. A mixture containing 200 μg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The sheared DNA is ethanol precipitated and redissolved in 500 μl TE buffer.
To create blunt-ends, a 100 μl aliquot of the resuspended DNA is digested with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200 μl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated, redissolved in 100 μl TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size is excised from the gel, and the LGT agarose is melted and the resulting solution is extracted with phenol to separate the agarose from the DNA. DNA is ethanol precipitated and redissolved in 20 μl of TE buffer for ligation to vector.
A two-step ligation procedure is used to produce a plasmid library with 97% inserts, of which >99% are single inserts. The first ligation mixture (50 μl) contains 2 μg of DNA fragments, 2 μg pUC18 DNA (Pharmacia) cut with SmaI and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and is incubated at 14°C for 4 hr. The ligation mixture then is phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in 20 μl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder are visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc. The portion of the gel containing v+i DNA is excised and the v+i DNA is recovered and resuspended into 20 μl TE. The v+i DNA then is blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture (50 μl) containing the v+i linears, 500 μM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears are dissolved in 20 μl TE. The final ligation to produce circles is carried out in a 50 μl reaction containing 5 μl of v+i linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture is stored at -20°C.
This two-stage procedure results in a molecularly random collection of single-insert plasmid recombinants with minimal contamination from double-insert chimeras (<1%) or free vector (<3%).
Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) are used to prevent rearrangements, deletions, and loss of clones by restriction. Furthermore, transformed cells are plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells. Plating is carried out as follows. A 100 μl aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 μl aliquot of 1.42 M beta-mercaptoethanol is added to the aliquot of cells to a final concentration of 25 mM. Cells are incubated on ice for 10 min. A 1 μl aliquot of the final ligation is added to the cells and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation mixture is plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4/100 ml SOB agar. The 15 ml top layer is poured just prior to plating. Our titer is approximately 100 colonies/10 μl aliquot of transformation. All colonies are picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA or deleterious gene products are deleted from the library, resulting in a slight increase in gap number over that expected.
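The reagent additions in the plating protocol above are simple dilutions, and the stated final concentrations can be checked as follows. This is a sketch only; the helper name `final_concentration` is hypothetical, and the calculation assumes the volumes are additive.

```python
def final_concentration(stock_conc, stock_volume, base_volume):
    """Concentration after adding stock_volume of stock to base_volume.

    Both volumes must be in the same unit; the result has the unit of stock_conc.
    """
    return stock_conc * stock_volume / (stock_volume + base_volume)

# 1.7 ul of 1.42 M beta-mercaptoethanol added to 100 ul of competent cells
bme_mM = 1000 * final_concentration(1.42, 1.7, 100.0)

# 0.4 ml of 50 mg/ml ampicillin added per 100 ml of SOB agar
amp_ug_per_ml = 1000 * final_concentration(50.0, 0.4, 100.0)

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin: {amp_ug_per_ml:.0f} ug/ml")
```

The beta-mercaptoethanol value comes out slightly under the nominal 25 mM given in the text, which is consistent with that figure being rounded.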
3. Random DNA Sequencing
High quality double stranded DNA plasmid templates are prepared using a "boiling bead" method developed in collaboration with Advanced Genetic Technology Corp. (Gaithersburg, MD) (Adams et al, Science 252:1651 (1991); Adams et al, Nature 355:632 (1992)). Plasmid preparation is performed in a 96-well format for all stages of DNA preparation from bacterial growth through final DNA purification. Template concentration is determined using Hoechst Dye and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding templates are identified where possible and not sequenced.
Templates are also prepared from two T. pallidum lambda genomic libraries. An amplified library is constructed in the vector Lambda GEM-12 (Promega) and an unamplified library is constructed in Lambda DASH II (Stratagene). In particular, for the unamplified lambda library, T. pallidum DNA (> 100 kb) is partially digested in a reaction mixture (200 μl) containing 50 μg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested DNA is phenol-extracted and electrophoresed on a 0.5% low melting agarose gel at 2 V/cm for 7 hours. Fragments from 15 to 25 kb are excised and recovered in a final volume of 6 μl. One μl of fragments is used with 1 μl of DASH II vector (Stratagene) in the recommended ligation reaction. One μl of the ligation mixture is used per packaging reaction following the recommended protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage are plated directly without amplification from the packaging mixture (after dilution with 500 μl of recommended SM buffer and chloroform treatment). Yield is about 2.5 × 10^3 pfu/μl. The amplified library is prepared essentially as above except the Lambda GEM-12 vector is used. After packaging, about 3.5 × 10^4 pfu are plated on the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1 × 10^9 pfu/ml.
Liquid lysates (100 μl) are prepared from randomly selected plaques (from the unamplified library) and template is prepared by long-range PCR using T7 and T3 vector-specific primers.
Sequencing reactions are carried out on plasmid and/or PCR templates using the AB Catalyst LabStation with Applied Biosystems PRISM Ready Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers (Adams et al., Nature 368:414 (1994)). Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are performed by eight individuals using an average of fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read distance. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
Richards et al, Chapter 28 in AUTOMATED DNA SEQUENCING AND ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press, London, (1994) described the value of using sequence from both ends of sequencing templates to facilitate ordering of contigs in shotgun assembly projects of lambda and cosmid clones. We balance the desirability of both-end sequencing (including the reduced cost of lower total number of templates) against shorter read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer compared to the M13-21 (forward) primer. Approximately one-half of the templates are sequenced from both ends. Random reverse sequencing reactions are done based on successful forward sequencing reactions. Some M13RP1 sequences are obtained in a semi-directed fashion: M13-21 sequences pointing outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to specifically order contigs.
4. Protocol for Automated Cycle Sequencing
The sequencing is carried out using ABI Catalyst robots and AB 373 Automated DNA Sequencers. The Catalyst robot is a publicly available sophisticated pipetting and temperature control robot which has been developed specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and reaction buffer. Reaction mixes and templates are combined in the wells of an aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear amplification (i.e., one primer synthesis) steps are performed including denaturation, annealing of primer and template, and extension; i.e., DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for an oil overlay. Two sequencing protocols are used: one for dye-labelled primers and a second for dye-labelled dideoxy chain terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye, permitting the four individual reactions to be combined into one lane of the 373 DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently supplies pre-mixed reaction mixes in bulk packages containing all the necessary non-template reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable sequences. Thirty-two reactions are loaded per AB 373 Sequencer each day, for a total of 960 samples. Electrophoresis is run overnight following the manufacturer's protocols, and the data is collected for twelve hours. Following electrophoresis and fluorescence detection, the ABI 373 performs automatic lane tracking and base-calling. The lane-tracking is confirmed visually.
Each sequence electropherogram (or fluorescence lane trace) is inspected visually and assessed for quality. Trailing sequences of low quality are removed and the sequence itself is loaded via software to a Sybase database (archived daily to 8 mm tape). Leading vector polylinker sequence is removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 are around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. ABI 373 Sequencers converted to Stretch Liners provide a longer electrophoresis path prior to fluorescence detection and increase the average number of usable bases to 500-600 bp.
INFORMATICS
1. Data Management
A number of information management systems for a large-scale sequencing lab have been developed. (For review see, for instance, Kerlavage et al, Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, IEEE Computer Society Press, Washington D. C., 585 (1993).) The system used to collect and assemble the sequence data was developed using the Sybase relational database management system and was designed to automate data flow wherever possible and to reduce user error. The database stores and correlates all information collected during the entire operation from template preparation to final analysis of the genome. Because the raw output of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.
2. Assembly
An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., Methods in Enzymology 164:165 (1988)). The contig is extended by the fragment only if strict criteria for the quality of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from two ends of the same template point toward one another in the contig and are located within a certain range of base pairs (definable for each clone based on the known clone size range for a given library). The process resulted in 744 contigs as represented by SEQ ID NOs: 1-744.
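The hash-table step described above can be illustrated with a short sketch. This is not the TIGR Assembler code; it is a simplified illustration that indexes 12 bp subsequences on the forward strand only and counts how many distinct 12-mers each pair of fragments shares. The function names and the toy fragments are hypothetical, and a real assembler would also index reverse complements and follow this step with gapped alignment.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length named in the text

def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index

def candidate_overlaps(fragments, k=K):
    """Count distinct shared k-mers for every fragment pair sharing at least one."""
    shared = defaultdict(int)
    for frag_ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(frag_ids), 2):
            shared[(a, b)] += 1
    return dict(shared)

# Toy fragments: the last 15 bases of f1 are the first 15 bases of f2
fragments = {
    "f1": "ACGTACGGTCAATGCCGTAGGCTA",
    "f2": "CAATGCCGTAGGCTATTGACCGGA",
    "f3": "TTTTTTTTTTTTGGGGGGGGGGGG",
}
overlaps = candidate_overlaps(fragments)
print(overlaps)
```

A fragment that appears in an unusually large number of candidate pairs would, in the scheme described above, be flagged as likely to lie in a repetitive element.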
3. Identifying Genes
The predicted coding regions of the T. pallidum genome were initially defined with the program GeneMark, which finds ORFs using a probabilistic classification technique. The predicted coding region sequences were used in searches against a database of all nucleotide sequences from GenBank (June, 1997), using the BLASTN search method to identify overlaps of 50 or more nucleotides with at least a 95% identity. Those ORFs with nucleotide sequence matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and compared to a non-redundant database of known proteins generated by combining the Swiss-prot, PIR and GenPept databases. ORFs that matched a database protein with BLASTP probability less than or equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases. ORFs that did not match protein or nucleotide sequences in the databases at these levels are shown in Table 3.
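The three-way sorting of ORFs into Tables 1-3 follows a simple decision rule, sketched below under stated assumptions. The function `assign_table` and its argument layout are hypothetical; the thresholds (50 nucleotides, 95% identity, probability of 0.01) are the ones given in the text, and the search results are assumed to have been obtained separately.

```python
def assign_table(best_blastn=None, best_blastp_prob=None):
    """Return the table number (1, 2 or 3) for one predicted ORF.

    best_blastn: (overlap_nt, percent_identity) of the best nucleotide match, or None.
    best_blastp_prob: BLASTP probability of the best protein match, or None.
    """
    if best_blastn is not None:
        overlap_nt, percent_identity = best_blastn
        if overlap_nt >= 50 and percent_identity >= 95.0:
            return 1  # nucleotide match to a GenBank sequence
    if best_blastp_prob is not None and best_blastp_prob <= 0.01:
        return 2      # protein match in the non-redundant database
    return 3          # no match at these levels

print(assign_table(best_blastn=(120, 98.5)))                        # Table 1
print(assign_table(best_blastn=(40, 99.0), best_blastp_prob=1e-20))  # Table 2
print(assign_table(best_blastn=(60, 90.0), best_blastp_prob=0.5))    # Table 3
```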
ILLUSTRATIVE APPLICATIONS
1. Production of an Antibody to a T. pallidum Protein
Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can then be prepared as follows.
2. Monoclonal Antibody Production by Hybridoma Fusion
Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al, Basic Methods in Molecular Biology, Elsevier, New York. Section 21-2 (1989).
3. Polyclonal Antibody Production by Immunization
Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al, J. Clin. Endocrinol. Metab. 33:988-991 (1971).
Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al, Chap. 19 in: Handbook of Experimental Immunology, Wier, D., ed, Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For Microbiology, Washington, D. C. (1980). Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample. In addition, antibodies are useful in various animal models of pneumococcal disease as a means of evaluating the protein used to make the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic or immunoprophylactic reagent.
4. Preparation of PCR Primers and Amplification of DNA
Various fragments of the T. pallidum genome, such as those of Tables 1-3 and SEQ ID NOS: 1-744, can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.
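The primer selection criteria above (minimum length, matched G/C ratio) can be checked programmatically. The sketch below is illustrative only: the melting temperature uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C) in °C, which is a common approximation for short oligonucleotides and is not a method stated in this disclosure. The 10% G/C tolerance, the function names, and the example primers are assumptions.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)

def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C: 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

def compatible_pair(forward, reverse, min_len=15, max_gc_diff=0.10):
    """True if both primers meet the length floor and have similar G/C ratios."""
    if len(forward) < min_len or len(reverse) < min_len:
        return False
    return abs(gc_fraction(forward) - gc_fraction(reverse)) <= max_gc_diff

# Hypothetical 18-base primers, not taken from SEQ ID NOS: 1-744
fwd = "ATGCGTACCGGTTAAGCT"
rev = "TTAGCCGGATCGATCCAA"
print(gc_fraction(fwd), wallace_tm(fwd))
print(gc_fraction(rev), wallace_tm(rev))
print(compatible_pair(fwd, rev))
```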
5. Isolation of a Selected DNA Clone From T. pallidum
Three approaches are used to isolate a T. pallidum clone comprising a polynucleotide of the present invention from any T. pallidum genomic DNA library. The T. pallidum strain B3 IPU has been deposited as a convenient source for obtaining a T. pallidum strain, although a wide variety of T. pallidum strains known in the art can be used.
T. pallidum genomic DNA is prepared using the following method. A 20 ml overnight bacterial culture is grown in a rich medium (e.g., Trypticase Soy Broth, Brain Heart Infusion broth or Super broth), pelleted, washed two times with TES (30 mM Tris-pH 8.0, 25 mM EDTA, 50 mM NaCl), and resuspended in 5 ml high salt TES (2.5 M NaCl). Lysostaphin is added to a final concentration of approximately 50 μg/ml and the mixture is rotated slowly for 1 hour at 37°C to make protoplast cells. The solution is then placed in an incubator (or in a shaking water bath) and warmed to 55°C. Five hundred microliters of 20% sarcosyl in TES (final concentration 2%) is then added to lyse the cells. Next, guanidine HCl is added to a final concentration of 7 M (3.69 g in 5.5 ml). The mixture is swirled slowly at 55°C for 60-90 min (solution should clear). A CsCl gradient is then set up in SW41 ultra clear tubes using 2.0 ml 5.7 M CsCl and overlaying with 2.85 M CsCl. The gradient is carefully overlaid with the DNA-containing GuHCl solution. The gradient is spun at 30,000 rpm, 20°C for 24 hr and the lower DNA band is collected. The volume is increased to 5 ml with TE buffer. The DNA is then treated with proteinase K (10 μg/ml) overnight at 37°C, and precipitated with ethanol. The precipitated DNA is resuspended in a desired buffer.
In the first method, a plasmid is directly isolated by screening a plasmid T. pallidum genomic DNA library using a polynucleotide probe corresponding to a polynucleotide of the present invention. Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized using an Applied Biosystems DNA synthesizer according to the sequence reported. The oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide kinase and purified according to routine methods. (See, e.g., Maniatis et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY (1982).) The library is transformed into a suitable host, as indicated above (such as XL-1 Blue (Stratagene)) using techniques known to those of skill in the art. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989). The transformants are plated on 1.5% agar plates (containing the appropriate selection agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. These plates are screened using Nylon membranes according to routine methods for bacterial colony screening. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989) or other techniques known to those of skill in the art.
Alternatively, two primers of 15-25 nucleotides derived from the 5' and 3' ends of a polynucleotide of SEQ ID NOS: 1-744 are synthesized and used to amplify the desired DNA by PCR using a T. pallidum genomic DNA prep as a template. PCR is carried out under routine conditions, for instance, in 25 μl of reaction mixture with 0.5 ug of the above DNA template. A convenient reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 μM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product is analyzed by agarose gel electrophoresis and the DNA band with expected molecular weight is excised and purified. The PCR product is verified to be the selected sequence by subcloning and sequencing the DNA product.
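Deriving a primer pair from the 5' and 3' ends of a target, as described above, amounts to taking the leading bases directly and the reverse complement of the trailing bases. The following is a minimal sketch; the function names and the 42-base example target are hypothetical, and a real design would also screen for secondary structure and primer-dimer formation.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def end_primers(target, length=20):
    """Return (forward, reverse) primers taken from the two ends of target."""
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 nucleotides")
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    target = target.upper()
    forward = target[:length]                       # identical to the 5' end
    reverse = reverse_complement(target[-length:])  # anneals to the 3' end
    return forward, reverse

# Hypothetical 42-base target, not taken from SEQ ID NOS: 1-744
target = "ATGAAACCCGGGTTTACGTACGTAGCTAGCTAGGATCCTTAA"
forward, reverse = end_primers(target, length=15)
print(forward)
print(reverse)
```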
Finally, overlapping oligos of the DNA sequences of SEQ ID NOS: 1-744 can be chemically synthesized and used to generate a nucleotide sequence of desired length using PCR methods known in the art.
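One way to picture the overlapping-oligo approach is the sketch below, which tiles a sequence into fixed-length oligos that share a fixed overlap with their neighbors. The oligo length, overlap, and function name are assumptions chosen for illustration. For simplicity every oligo is written on the same strand; in an actual assembly, alternate oligos would typically be synthesized as reverse complements.

```python
def tile_overlapping_oligos(sequence: str, oligo_len: int = 40, overlap: int = 20) -> list:
    """Split a sequence into oligos; each shares `overlap` bases with the next."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_len])
        # Stop once the current oligo reaches the end of the sequence.
        if start + oligo_len >= len(sequence):
            break
        start += step
    return oligos


def reassemble(oligos: list, overlap: int = 20) -> str:
    """Rebuild the original sequence by dropping each overlap once."""
    return oligos[0] + "".join(oligo[overlap:] for oligo in oligos[1:])


if __name__ == "__main__":
    toy_sequence = "ACGT" * 25  # 100-nt placeholder, not a sequence from the specification
    pieces = tile_overlapping_oligos(toy_sequence)
    print(len(pieces), "oligos")
    print(reassemble(pieces) == toy_sequence)
```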
6(a). Expression and Purification of T. pallidum polypeptides in E. coli
The bacterial expression vector pQE60 is used for bacterial expression of some of the polypeptide fragments of the present invention. (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). pQE60 encodes ampicillin antibiotic resistance ("Ampr") and contains a bacterial origin of replication ("ori"), an IPTG inducible promoter, a ribosome binding site ("RBS"), six codons encoding histidine residues that allow affinity purification using nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin (QIAGEN, Inc., supra) and suitable single restriction enzyme cleavage sites. These elements are arranged such that an inserted DNA fragment encoding a polypeptide expresses that polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the carboxyl terminus of that polypeptide.
The DNA sequence encoding the desired portion of a T. pallidum protein of the present invention is amplified from T. pallidum genomic DNA using PCR oligonucleotide primers which anneal to the 5' and 3' sequences coding for the portions of the T. pallidum polynucleotide shown in SEQ ID NOS: 1-744. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' sequences, respectively.
For cloning the mature protein, the 5' primer has a sequence containing an appropriate restriction site followed by nucleotides of the amino terminal coding sequence of the desired T. pallidum polynucleotide sequence in SEQ ID NOS: 1-744. One of ordinary skill in the art would appreciate that the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of the complete protein shorter or longer than the mature form. The 3' primer has a sequence containing an appropriate restriction site followed by nucleotides complementary to the 3' end of the polypeptide coding sequence of SEQ ID NOS: 1-744, excluding a stop codon, with the coding sequence aligned with the restriction site so as to maintain its reading frame with that of the six His codons in the pQE60 vector.
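The primer layout described above can be sketched as follows. This is a minimal illustration, assuming placeholder restriction sites (GGATCC and AAGCTT), an 18-nucleotide annealing region, and a hypothetical function name; none of these values is taken from the specification. The sketch removes a terminal stop codon so that translation can continue into a downstream tag, but it does not check the reading frame against any particular vector.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(coding_seq: str, site_5p: str, site_3p: str, anneal_len: int = 18):
    """Build a 5' primer (site + start of coding sequence) and a 3' primer
    (site + reverse complement of the end of the coding sequence, stop codon excluded)."""
    cds = coding_seq.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # drop the stop codon so the tag codons can be translated
    forward = site_5p + cds[:anneal_len]
    reverse = site_3p + reverse_complement(cds[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Toy coding sequence for illustration only; not taken from SEQ ID NOS: 1-744.
    toy_cds = "ATGGCTAAAGGTCTGGAAACCTTCGATTAA"
    fwd, rev = design_primers(toy_cds, "GGATCC", "AAGCTT", anneal_len=12)
    print("5' primer:", fwd)
    print("3' primer:", rev)
```

Varying where the annealing regions begin, as the text notes, amplifies a shorter or longer portion of the coding sequence.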
The amplified T. pallidum DNA fragment and the vector pQE60 are digested with restriction enzymes which recognize the sites in the primers and the digested DNAs are then ligated together. The T. pallidum DNA is inserted into the restricted pQE60 vector in a manner which places the T. pallidum protein coding region downstream from the IPTG-inducible promoter and in-frame with an initiating AUG and the six histidine codons.
The ligation mixture is transformed into competent E. coli cells using standard procedures such as those described by Sambrook et al., supra. E. coli strain M15/rep4, containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein. This strain, which is only one of many that are suitable for expressing a T. pallidum polypeptide, is available commercially (QIAGEN, Inc., supra). Transformants are identified by their ability to grow on LB agar plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.
Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 μg/ml) and kanamycin (25 μg/ml). The O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250. The cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours. Cells then are harvested by centrifugation. The cells are then stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8. The cell debris is removed by centrifugation, and the supernatant containing the T. pallidum polypeptide is loaded onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (QIAGEN, Inc., supra). Proteins with a 6 X His tag bind to the Ni-NTA resin with high affinity and are purified in a simple one-step procedure (for details see: The QIAexpressionist, 1995, QIAGEN, Inc., supra). Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8, the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the T. pallidum polypeptide is eluted with 6 M guanidine-HCl, pH 5.
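The inoculation and induction volumes implied above follow from simple dilution arithmetic. The sketch below is illustrative only: the 1-liter culture volume and the 1 M IPTG stock are assumptions not stated in the specification, and a "1:25" dilution is read here as one part inoculum in 25 parts total.

```python
def inoculum_volume_ml(culture_volume_ml: float, dilution: float) -> float:
    """Overnight culture needed for a 1:`dilution` inoculation (1 part in `dilution` parts total)."""
    return culture_volume_ml / dilution


def stock_volume_ml(final_conc_mm: float, final_volume_ml: float, stock_conc_mm: float) -> float:
    """Solve C1 * V1 = C2 * V2 for the stock volume V1."""
    return final_conc_mm * final_volume_ml / stock_conc_mm


if __name__ == "__main__":
    culture_ml = 1000.0       # assumed culture volume
    iptg_stock_mm = 1000.0    # assumed 1 M IPTG stock
    print(inoculum_volume_ml(culture_ml, 25), "ml O/N culture at 1:25")
    print(inoculum_volume_ml(culture_ml, 250), "ml O/N culture at 1:250")
    print(stock_volume_ml(1.0, culture_ml, iptg_stock_mm), "ml IPTG stock for 1 mM final")
```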
The purified protein is then renatured by dialyzing it against phosphate-buffered saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the protein could be successfully refolded while immobilized on the Ni-NTA column. The recommended conditions are as follows: renature using a linear 6 M-1 M urea gradient in 500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors. The renaturation should be performed over a period of 1.5 hours or more. After renaturation the proteins can be eluted by the addition of 250 mM imidazole. Imidazole is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C.
The polypeptides of the present invention are also prepared using a non-denaturing protein purification method. For these polypeptides, the cell pellet from each liter of culture is resuspended in 25 ml of Lysis Buffer A at 4°C (Lysis Buffer A = 50 mM Na-phosphate, 300 mM NaCl, 10 mM 2-mercaptoethanol, 10% Glycerol, pH 7.5 with 1 tablet of Complete EDTA-free protease inhibitor cocktail (Boehringer Mannheim #1873580) per 50 ml of buffer). Absorbance at 550 nm is approximately 10-20 O.D./ml. The suspension is then put through three freeze/thaw cycles from -70°C (using an ethanol-dry ice bath) up to room temperature. The cells are lysed via sonication in short 10 sec bursts over 3 minutes at approximately 80 W while kept on ice. The sonicated sample is then centrifuged at 15,000 RPM for 30 minutes at 4°C. The supernatant is passed through a column containing 1.0 ml of CL-4B resin to pre-clear the sample of any proteins that may bind to agarose non-specifically, and the flow-through fraction is collected.
The pre-cleared flow-through is applied to a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (QIAGEN, Inc., supra). Proteins with a 6 X His tag bind to the Ni-NTA resin with high affinity and can be purified in a simple one-step procedure. Briefly, the supernatant is loaded onto the column in Lysis Buffer A at 4°C, and the column is first washed with 10 volumes of Lysis Buffer A until the A280 of the eluate returns to the baseline. Then, the column is washed with 5 volumes of 40 mM Imidazole (92% Lysis Buffer A / 8% Buffer B) (Buffer B = 50 mM Na-Phosphate, 300 mM NaCl, 10% Glycerol, 10 mM 2-mercaptoethanol, 500 mM Imidazole, pH of the final buffer should be 7.5). The protein is eluted off of the column with a series of increasing Imidazole solutions made by adjusting the ratios of Lysis Buffer A to Buffer B. Three different concentrations are used: 3 volumes of 75 mM Imidazole, 3 volumes of 150 mM Imidazole, 5 volumes of 500 mM Imidazole. The fractions containing the purified protein are analyzed using 8%, 10% or 14% SDS-PAGE depending on the protein size. The purified protein is then dialyzed 2X against phosphate-buffered saline (PBS) in order to place it into an easily workable buffer. The purified protein is stored at 4°C or frozen at -80°C.
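Because Lysis Buffer A contains no imidazole and Buffer B contains 500 mM, each imidazole step above corresponds to a fixed mixing ratio of the two buffers. The following sketch reproduces the stated 92% / 8% ratio for the 40 mM wash and derives the ratios for the elution steps; the function names are hypothetical and the calculation assumes simple volumetric mixing.

```python
BUFFER_B_IMIDAZOLE_MM = 500.0  # imidazole in Buffer B; Lysis Buffer A has none

# (target imidazole in mM, column volumes) for the wash and the three elution steps
STEPS = [(40.0, 5), (75.0, 3), (150.0, 3), (500.0, 5)]


def percent_buffer_b(target_mm: float) -> float:
    """Percentage of Buffer B needed to reach the target imidazole concentration."""
    if not 0.0 <= target_mm <= BUFFER_B_IMIDAZOLE_MM:
        raise ValueError("target must lie between 0 and the Buffer B concentration")
    return 100.0 * target_mm / BUFFER_B_IMIDAZOLE_MM


def mixing_schedule(steps):
    """Return (target mM, volumes, % Lysis Buffer A, % Buffer B) for each step."""
    schedule = []
    for target_mm, volumes in steps:
        pct_b = percent_buffer_b(target_mm)
        schedule.append((target_mm, volumes, 100.0 - pct_b, pct_b))
    return schedule


if __name__ == "__main__":
    for target_mm, volumes, pct_a, pct_b in mixing_schedule(STEPS):
        print(f"{target_mm:g} mM imidazole: {pct_a:g}% A / {pct_b:g}% B, {volumes} volumes")
```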
The following alternative method may be used to purify T. pallidum polypeptides expressed in E. coli when they are present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C. Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.
The cells are then lysed by passing the solution through a microfluidizer (Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.
The resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the T. pallidum polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction. Following high speed centrifugation (30,000 x g) to remove insoluble particles, the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
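The effect of the rapid dilution step on the denaturant concentration can be estimated as below. This is a minimal sketch assuming that the refolding buffer contains no GuHCl and that volumes are additive; the function name is hypothetical.

```python
def diluted_concentration(initial: float, diluent_volumes: float) -> float:
    """Concentration after mixing 1 volume of sample with `diluent_volumes`
    volumes of solute-free diluent (volumes assumed additive)."""
    if diluent_volumes < 0:
        raise ValueError("diluent_volumes must be non-negative")
    return initial / (1.0 + diluent_volumes)


if __name__ == "__main__":
    residual = diluted_concentration(1.5, 20)  # 1.5 M GuHCl extract into 20 volumes of buffer
    print(f"residual GuHCl: {residual:.4f} M")
```

Under these assumptions the 1.5 M extract falls to roughly 0.07 M GuHCl, a 21-fold reduction.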
The refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps. To clarify the refolded T. pallidum polypeptide solution, a previously prepared tangential filtration unit equipped with 0.16 μm membrane filter with appropriate surface area (e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored. Fractions are collected and further analyzed by SDS-PAGE.
Fractions containing the T. pallidum polypeptide are then pooled and mixed with 4 volumes of water. The diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak cation (Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 monitoring of the effluent. Fractions containing the T. pallidum polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
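For a linear gradient such as the one used to elute the CM-20 column, the nominal NaCl concentration at any point is a linear interpolation between the two end buffers. The sketch below is an idealized calculation with hypothetical names; it ignores system dead volume and mixing delay.

```python
def gradient_nacl_molar(column_volumes_elapsed: float,
                        start_m: float = 0.2,
                        end_m: float = 1.0,
                        total_cv: float = 10.0) -> float:
    """Nominal NaCl concentration (M) after a given number of column volumes
    of a linear gradient from `start_m` to `end_m` over `total_cv` column volumes."""
    if not 0.0 <= column_volumes_elapsed <= total_cv:
        raise ValueError("elapsed volume must lie within the gradient")
    fraction = column_volumes_elapsed / total_cv
    return start_m + fraction * (end_m - start_m)


if __name__ == "__main__":
    for cv in (0, 2.5, 5, 7.5, 10):
        print(f"{cv:>4} CV: {gradient_nacl_molar(cv):.2f} M NaCl")
```

A fraction eluting halfway through the gradient would therefore be expected at about 0.6 M NaCl.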
The resultant T. pallidum polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed from Coomassie blue stained 16% SDS-PAGE gel when 5 μg of purified protein is loaded. The purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.
6(b). Alternative Expression and Purification of T. pallidum polypeptides in E. coli
The vector pQE10 is alternatively used to clone and express some of the polypeptides of the present invention for use in the soft tissue and systemic infection models discussed below. The difference is that an inserted DNA fragment encoding a polypeptide expresses that polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus of that polypeptide. The bacterial expression vector pQE10 (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311) was used in this example. The components of the pQE10 plasmid are arranged such that the inserted DNA sequence encoding a polypeptide of the present invention expresses the polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus.
The DNA sequences encoding the desired portions of a polypeptide of SEQ ID NOS: 1-744 were amplified using PCR oligonucleotide primers from genomic T. pallidum DNA. The PCR primers anneal to the nucleotide sequences encoding the desired amino acid sequence of a polypeptide of the present invention. Additional nucleotides containing restriction sites to facilitate cloning in the pQE10 vector were added to the 5' and 3' primer sequences, respectively. For cloning a polypeptide of the present invention, the 5' and 3' primers were selected to amplify their respective nucleotide coding sequences. One of ordinary skill in the art would appreciate that the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention. The 5' primer was designed so the coding sequence of the 6 X His tag is aligned with the restriction site so as to maintain its reading frame with that of the T. pallidum polypeptide. The 3' primer was designed to include a stop codon. The amplified DNA fragment was then cloned, and the protein expressed, as described above for the pQE60 plasmid. The DNA sequences encoding the amino acid sequences of SEQ ID NOS: 1-744 may also be cloned and expressed as fusion proteins by a protocol similar to that described directly above, wherein the pET-32b(+) vector (Novagen, 601 Science Drive, Madison, WI 53711) is preferentially used in place of pQE10. The above methods are not limited to the polypeptide fragments actually produced. The above method, like the methods below, can be used to produce either full length polypeptides or desired fragments thereof.
6(c). Alternative Expression and Purification of T. pallidum polypeptides in E. coli
The bacterial expression vector pQE60 is used for bacterial expression in this example (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). However, in this example, the polypeptide coding sequence is inserted such that translation of the six His codons is prevented and, therefore, the polypeptide is produced with no 6 X His tag.
The DNA sequence encoding the desired portion of the T. pallidum amino acid sequence is amplified from a T. pallidum genomic DNA prep or the deposited DNA clones using PCR oligonucleotide primers which anneal to the 5' and 3' nucleotide sequences corresponding to the desired portion of the T. pallidum polypeptides. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' primer sequences.
For cloning a T. pallidum polypeptide of the present invention, 5' and 3' primers are selected to amplify their respective nucleotide coding sequences. One of ordinary skill in the art would appreciate that the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention. The 5' and 3' primers contain appropriate restriction sites followed by nucleotides complementary to the 5' and 3' ends of the coding sequence, respectively. The 3' primer is additionally designed to include an in-frame stop codon.
The amplified T. pallidum DNA fragments and the vector pQE60 are digested with restriction enzymes recognizing the sites in the primers and the digested DNAs are then ligated together. Insertion of the T. pallidum DNA into the restricted pQE60 vector places the T. pallidum protein coding region including its associated stop codon downstream from the IPTG-inducible promoter and in-frame with an initiating AUG. The associated stop codon prevents translation of the six histidine codons downstream of the insertion point.
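The role of the in-frame stop codon can be illustrated with a toy translation. The sketch below uses the standard genetic code and short placeholder sequences that are not taken from the specification or from the actual pQE60 vector (which also contains linker codons not modeled here); it shows that six histidine codons are translated only when the insert lacks a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

HIS6_CODONS = "CAC" * 6  # six histidine codons (placeholder for the vector's tag)


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first in-frame stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    insert_without_stop = "ATGGCTAAA"     # toy insert: Met-Ala-Lys
    insert_with_stop = "ATGGCTAAATAA"     # same insert followed by a TAA stop codon
    print(translate(insert_without_stop + HIS6_CODONS))  # tagged product
    print(translate(insert_with_stop + HIS6_CODONS))     # untagged product
```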
The ligation mixture is transformed into competent E. coli cells using standard procedures such as those described by Sambrook et al. E. coli strain M15/rep4, containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein. This strain, which is only one of many that are suitable for expressing T. pallidum polypeptide, is available commercially (QIAGEN, Inc., supra). Transformants are identified by their ability to grow on LB plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.
Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 μg/ml) and kanamycin (25 μg/ml). The O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250. The cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours. Cells then are harvested by centrifugation.
To purify the T. pallidum polypeptide, the cells are then stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8. The cell debris is removed by centrifugation, and the supernatant containing the T. pallidum polypeptide is dialyzed against 50 mM Na-acetate buffer pH 6, supplemented with 200 mM NaCl. Alternatively, the protein can be successfully refolded by dialyzing it against 500 mM NaCl, 20% glycerol, 25 mM Tris/HCl pH 7.4, containing protease inhibitors. After renaturation the protein can be purified by ion exchange, hydrophobic interaction and size exclusion chromatography. Alternatively, an affinity chromatography step such as an antibody column can be used to obtain pure T. pallidum polypeptide. The purified protein is stored at 4°C or frozen at -80°C.
The following alternative method may be used to purify T. pallidum polypeptides expressed in E. coli when they are present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C.
Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.
The cells are then lysed by passing the solution through a microfluidizer (Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.
The resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the T. pallidum polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction.
Following high speed centrifugation (30,000 x g) to remove insoluble particles, the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
The refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps.
To clarify the refolded T. pallidum polypeptide solution, a previously prepared tangential filtration unit equipped with 0.16 μm membrane filter with appropriate surface area (e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored. Fractions are collected and further analyzed by SDS-PAGE. Fractions containing the T. pallidum polypeptide are then pooled and mixed with 4 volumes of water. The diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak cation (Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 monitoring of the effluent. Fractions containing the T. pallidum polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
The resultant T. pallidum polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed from Coomassie blue stained 16% SDS-PAGE gel when 5 μg of purified protein is loaded. The purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.
6(d). Cloning and Expression of T. pallidum in Other Bacteria
T. pallidum polypeptides can also be produced in: T. pallidum using the methods of S. Skinner et al., (1988) Mol. Microbiol. 2:289-297 or J. I. Moreno (1996) Protein Expr. Purif. 8(3):332-340; Lactobacillus using the methods of C. Rush et al., 1997 Appl. Microbiol. Biotechnol. 47(5):537-542; or in Bacillus subtilis using the methods of Chang et al., U.S. Patent No. 4,952,508.
7. Cloning and Expression in COS Cells
A T. pallidum expression plasmid is made by cloning a portion of the DNA encoding a T. pallidum polypeptide into the expression vector pDNAI/Amp or pDNAIII (which can be obtained from Invitrogen, Inc.). The expression vector pDNAI/Amp contains: (1) an E. coli origin of replication effective for propagation in E. coli and other prokaryotic cells; (2) an ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; (3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV promoter, a polylinker, an SV40 intron; (5) several codons encoding a hemagglutinin fragment (i.e., an "HA" tag to facilitate purification) followed by a termination codon and polyadenylation signal arranged so that a DNA can be conveniently placed under expression control of the CMV promoter and operably linked to the SV40 intron and the polyadenylation signal by means of restriction sites in the polylinker. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein described by Wilson et al. 1984 Cell 37:767. The fusion of the HA tag to the target protein allows easy detection and recovery of the recombinant protein with an antibody that recognizes the HA epitope. pDNAIII contains, in addition, the selectable neomycin marker.
A DNA fragment encoding a T. pallidum polypeptide is cloned into the polylinker region of the vector so that recombinant protein expression is directed by the CMV promoter. The plasmid construction strategy is as follows. The DNA from a T. pallidum genomic DNA prep is amplified using primers that contain convenient restriction sites, much as described above for construction of vectors for expression of T. pallidum polypeptides in E. coli. The 5' primer contains a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the T. pallidum polypeptide. The 3' primer contains nucleotides complementary to the 3' coding sequence of the T. pallidum DNA, a stop codon, and a convenient restriction site.
The PCR amplified DNA fragment and the vector, pDNAI/Amp, are digested with appropriate restriction enzymes and then ligated. The ligation mixture is transformed into an appropriate E. coli strain such as SURE™ (Stratagene Cloning Systems, La Jolla, CA 92037), and the transformed culture is plated on ampicillin media plates which then are incubated to allow growth of ampicillin resistant colonies. Plasmid DNA is isolated from resistant colonies and examined by restriction analysis or other means for the presence of the fragment encoding the T. pallidum polypeptide.
For expression of a recombinant T. pallidum polypeptide, COS cells are transfected with an expression vector, as described above, using DEAE-dextran, as described, for instance, by Sambrook et al. (supra). Cells are incubated under conditions for expression of T. pallidum by the vector.
Expression of the T. pallidum-HA fusion protein is detected by radiolabeling and immunoprecipitation, using methods described in, for example, Harlow et al., supra. To this end, two days after transfection, the cells are labeled by incubation in media containing 35S-cysteine for 8 hours. The cells and the media are collected, and the cells are washed and then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by Wilson et al. (supra). Proteins are precipitated from the cell lysate and from the culture media using an HA-specific monoclonal antibody. The precipitated proteins then are analyzed by SDS-PAGE and autoradiography. An expression product of the expected size is seen in the cell lysate, which is not seen in negative controls.
8. Cloning and Expression in CHO Cells
The vector pC4 is used for the expression of T. pallidum polypeptide in this example. Plasmid pC4 is a derivative of the plasmid pSV2-dhfr (ATCC Accession No. 37146). The plasmid contains the mouse DHFR gene under control of the SV40 early promoter. Chinese hamster ovary cells or other cells lacking dihydrofolate reductase activity that are transfected with these plasmids can be selected by growing the cells in a selective medium (alpha minus MEM, Life Technologies) supplemented with the chemotherapeutic agent methotrexate. The amplification of the DHFR genes in cells resistant to methotrexate (MTX) has been well documented. See, e.g., Alt et al., 1978, J. Biol. Chem. 253:1357-1370; Hamlin et al., 1990, Biochem. et Biophys. Acta, 1097:107-143; Page et al., 1991, Biotechnology 9:64-68. Cells grown in increasing concentrations of MTX develop resistance to the drug by overproducing the target enzyme, DHFR, as a result of amplification of the DHFR gene. If a second gene is linked to the DHFR gene, it is usually co-amplified and over-expressed. It is known in the art that this approach may be used to develop cell lines carrying more than 1,000 copies of the amplified gene(s). Subsequently, when the methotrexate is withdrawn, cell lines are obtained which contain the amplified gene integrated into one or more chromosome(s) of the host cell.
Plasmid pC4 contains the strong promoter of the long terminal repeat (LTR) of the Rous Sarcoma Virus, for expressing a polypeptide of interest, Cullen, et al. (1985) Mol. Cell. Biol. 5:438-447; plus a fragment isolated from the enhancer of the immediate early gene of human cytomegalovirus (CMV), Boshart, et al., 1985, Cell 41:521-530. Downstream of the promoter are the following single restriction enzyme cleavage sites that allow the integration of the genes: Bam HI, Xba I, and Asp 718. Behind these cloning sites the plasmid contains the 3' intron and polyadenylation site of the rat preproinsulin gene. Other high efficiency promoters can also be used for the expression, e.g., the human β-actin promoter, the SV40 early or late promoters or the long terminal repeats from other retroviruses, e.g., HIV and HTLVI. Clontech's Tet-Off and Tet-On gene expression systems and similar systems can be used to express the T. pallidum polypeptide in a regulated way in mammalian cells (Gossen et al., 1992, Proc. Natl. Acad. Sci. USA 89:5547-5551). For the polyadenylation of the mRNA, other signals, e.g., from the human growth hormone or globin genes, can be used as well. Stable cell lines carrying a gene of interest integrated into the chromosomes can also be selected upon co-transfection with a selectable marker such as gpt, G418 or hygromycin. It is advantageous to use more than one selectable marker in the beginning, e.g., G418 plus methotrexate.
The plasmid pC4 is digested with the restriction enzymes and then dephosphorylated using calf intestinal phosphatase by procedures known in the art. The vector is then isolated from a 1% agarose gel. The DNA sequence encoding the T. pallidum polypeptide is amplified using PCR oligonucleotide primers corresponding to the 5' and 3' sequences of the desired portion of the gene. A 5' primer containing a restriction site, a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the T. pallidum polypeptide is synthesized and used. A 3' primer, containing a restriction site, stop codon, and nucleotides complementary to the 3' coding sequence of the T. pallidum polypeptides is synthesized and used. The amplified fragment is digested with the restriction endonucleases and then purified again on a 1% agarose gel. The isolated fragment and the dephosphorylated vector are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then transformed and bacteria are identified that contain the fragment inserted into plasmid pC4 using, for instance, restriction enzyme analysis.
Chinese hamster ovary cells lacking an active DHFR gene are used for transfection. Five μg of the expression plasmid pC4 is cotransfected with 0.5 μg of the plasmid pSV2-neo using a lipid-mediated transfection agent such as Lipofectin™ or LipofectAMINE™ (Life Technologies, Gaithersburg, MD). The plasmid pSV2-neo contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that confers resistance to a group of antibiotics including G418. The cells are seeded in alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418. After about 10-14 days single clones are trypsinized and then seeded in 6-well petri dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM, 200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of methotrexate are then transferred to new 6-well plates containing even higher concentrations of methotrexate (1 μM, 2 μM, 5 μM, 10 μM, 20 μM). The same procedure is repeated until clones are obtained which grow at a concentration of 100-200 μM. Expression of the desired gene product is analyzed, for instance, by SDS-PAGE and Western blot or by reversed phase HPLC analysis.
The disclosures of all publications (including patents, patent applications, journal articles, laboratory manuals, books, or other documents) cited herein are hereby incorporated by reference in their entireties. SEQ ID NOS: 1-744 are hereby incorporated into the specification by reference.
The present invention is not to be limited in scope by the specific embodiments described herein, which are intended as single illustrations of individual aspects of the invention. Functionally equivalent methods and components are within the scope of the invention, in addition to those shown and described herein, and will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Such modifications are intended to fall within the scope of the appended claims.
TABLE 1.
Treponema pallidum - Coding regions containing known sequences
Figure imgf000067_0001 through Figure imgf000083_0001
TABLE 2.
Treponema pallidum - Putative coding regions of novel proteins similar to known proteins
Figure imgf000084_0001 through Figure imgf000163_0001 (pages 111, 113, 114, 131, and 157 each carry a second image, numbered _0002)
TABLE 3.
Treponema pallidum - Putative coding regions of novel proteins not similar to known proteins
Figure imgf000164_0001 through Figure imgf000178_0001
(1) GENERAL INFORMATION:
(i) APPLICANT: Human Genome Sciences, Inc., et al.
(ii) TITLE OF INVENTION: Treponema pallidum Polynucleotides and Sequences
(iii) NUMBER OF SEQUENCES: 744
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Human Genome Sciences, Inc.
(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville
(D) STATE: Maryland
(E) COUNTRY: USA
(F) ZIP: 20850
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
(B) COMPUTER: HP Vectra 486/33
(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: Unassigned
(B) FILING DATE: June 23, 1998
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/050,667
(B) FILING DATE: June 24, 1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Brookes, A. Anders
(B) REGISTRATION NUMBER: 36,373
(C) REFERENCE/DOCKET NUMBER: PB387PCT
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (301) 309-8504
(B) TELEFAX: (301) 309-8512
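The sequence rows in this listing are printed as groups of ten bases followed by a running base count at the end of each row. As a minimal sketch of how such rows might be joined back into a single sequence, the Python function below strips the trailing count and checks it against the number of bases read so far. The function name and the error handling are hypothetical; the two example rows are the first two rows of SEQ ID NO: 1, with letter case preserved as printed.

```python
def join_listing_rows(rows):
    """Join listing rows ('GROUP GROUP ... COUNT') into one sequence string.

    Raises ValueError when a row's trailing running count does not match the
    number of bases accumulated so far, which flags dropped or fused rows.
    """
    pieces = []
    total = 0
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # skip blank lines
        *groups, count = fields
        if not count.isdigit():
            raise ValueError(f"row has no trailing base count: {row!r}")
        bases = "".join(groups)
        total += len(bases)
        if total != int(count):
            raise ValueError(
                f"running count {count} does not match {total} bases read"
            )
        pieces.append(bases)
    return "".join(pieces)


rows = [
    "AAGGTTGTTT GTTGATAATC TCTGCCATAT TACTGTTCCC TTCTTTTCGT TCATCGGGTA 60",
    "AGAGCCGTCA GCGGTGAGCG CGCCACcTCC TTCACTAATC ACcACGTCGT GAACAGGCGC 120",
]
sequence = join_listing_rows(rows)
print(len(sequence))
```

Checking the running count on every row is a deliberate choice: a row that lost a character or was fused with its neighbour during text extraction is reported at the point where it occurs rather than silently shifting the rest of the sequence.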
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14063 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
AAGGTTGTTT GTTGATAATC TCTGCCATAT TACTGTTCCC TTCTTTTCGT TCATCGGGTA 60
AGAGCCGTCA GCGGTGAGCG CGCCACcTCC TTCACTAATC ACcACGTCGT GAACAGGCGC 120
TGCcGTAcAG CCGACCACCA ACGGCTCTTC TACCCACTCA ACTCTACATT CCACGCTAGC 180
GAGCCAACGC AGCAACGATC GTATCGATAT TGTGTGTTAC CATCCCTACG TAGGTACCCT 240
CGCTCGTACC CGCATCCCCC ATCGCATCAG AAAACAACTC GCCTCCAATn CTGnTACGTG 300
CCCTCTTGCC TGCACCGCAT CCCTTAACGC TTCAACGTTT TTGTGCGGAA TAGAACTCTC 360
AATAAAGATA GCAGGGAGTT mTACnTGCGC AATAAACGCT GCCAGTTCCT GCATATCATG 420
CGCACTGGCT TCCGAAGCGG TGCTCACCCC TTGCAACCCC TTCACCTCAA AACCATACGC 480
ACGGCTAAAA TAGCCGAACG CATCATGAGC GGTCACCAAC ACACGCcTTT CAGCAGGCAG 540
CGACTGCGCC TTGCGCCGAA CGTACGCGTC AAGCTTATCC AACTGCTGCT GGTACGCCTG 600
ATAACGTTGA GTAAATTCGC GAGTTTTTCC CGGCAACAGC TTGCACAAGC TTTCGTACAC 660
TGCCTTCACC GAATAAGACC ACAGCTTTAC ATCAAACCAC ACATGCGGAT CGAACTCTGC 720
TTCCTCAAGA GAAAGACGCT GAGACACCGG AATAGTCTCA GAAACTGCAA CTACCAAGCG 780
GCTCCCGCGC AGTTTGGAAA ACACCTCGCC CATCTTGGTT TCCAGGTGCA ACCCGTTGTA 840
CAGGATGAGA TCCGCATTCC CGAGCCATTC CACATCCCCC GCAGTAGCCG TGTACAGGTG 900
CGGGTCAACA CCAGGACCCA TCAACCCCTT TAGATGCACA TCACCTTGAG CGATGTTTTT 960
GACAGCATCC GCTATCATGC CAATGGTGGT GACAACCAGG GGTTTCCCGT CCGCTGCGGC 1020
ATCCTTGCTA CCGAATGCGT GCGTAAAACC GGTCAGCATG CCAAGCGCGA GCACGCAGGC 1080
ACATATTCTT TCACGTATCA AGTGACACTC CTTGGGTGAA TTTgATGCAT CAAAGTAGCG 1140
GATCACGCAG GAAACGTCAA TTTTTCAGTA CCcTTTCAGG AAAAGAAAAC GGCACTCTGG 1200
CGCGGACCTA CTCGAGCTAC ATGATAAAGA AGCATTTGAT CTTCCTCGCA TAAGCAGCTA 1260
CAGTTGCGCC CTCTTATGGA TCACACCGCG TCACTGAGTC CTGTGCGCCC TGAAGCACAA 1320
CCTACAGACG ATCGCGAGCG TGCGCTCAGA CCGCGCCTCC TGAAAGACTT TCTAGGTCAG 1380
GAGAAAACAA AACGCAACTT ACGTCTTTTC ATTCAGGCAG CGCGCGATCG CAACGAAAGC 1440
TTAGATCACC TGTTCCTCAT CGGCCCCCCG GGGCTcGGCA AAACGACGCT CGCGCATATC 1500
ACTGCATGCG AGCTGGGCGT TGAGTGCAAG GTTACAGGCG CACCGGCGCT TGATAAACCA 1560
AAAGATTTAG CGGGTATCCT CACTGCGCTG AGTGAGCGAA GGCnTTCTTC GTGGATGAAA 1620
TCCACCGCCT CAAACCAGCC ATAGAAGAGA TGCTGTACAT TGCCATGGAG GACTACGAAC 1680
TGGATTGGGT TATCGGTCAG GGACCGTCCG CGCGCACGGT GCGCATCCCA CTCCCCCCGT 1740
TTACCCTCAT TGGTGCAACC ACTCGCGCGG GTATGGTTTC AAGCCCGCTG ATTAGCCGCT 1800
TTGGAATCGT AGAGCGCTTC GAGTTCTATA CCCCTGAGGA GCTTGCTGCC ATTGTGCAAC 1860
GCTCAGCGCG GCTTCTAGAT ATCACGCTCG ACGCACGCGC AGTTnAGCCC TTGCGCGGTG 1920
TTCGCGAGGA ACACCCCGGG TGGCCAACCG GCTTTTGCGC CGTATACGCG ATTTTGCCCA 1980
AGTTGCGGGG TCTGCACACA TCAGCGAGAC GATAGTACGC GCAGGcTTGC CCACCTAAAG 2040
ATCGACGAAT TAGGGCTAGA ACTGCACGAC ATACAGCTGC TGCGCGTCAT GaTTGAGCAC 2100
TTCGGCGGAG GGCCAGTGGG CGCAGAAACG CTGGCGATCT CCCTCGGGGA ATCACCGGAA 2160
ACACTTGAGG ATTACTACGA GCCCTACCTT ATCCAAATTG GGCTCATGCA GCGCACCCCC 2220
CGCGGGCGCA TGGCCACCGC GCGTGCCTAT GCGCACCTAG GTCTCCCTGT CCCCGAGGCA 2280
CGCACGCTCA CCCCGCACTC CCCAGAACAA GGAACGCTTC TTTAGCAAAG ATGCGGACAC 2340
CTTGTCAGAG CTGTCGGCAG tCCGCTCAGA CCGGGTAAGA CACAAGAGTC TGAAAAAGGC 2400
ATATACTACG CGCGGGAGGG GTGCGTTTCG TGAAGATGTC TGCGTTTTTT GCACCAACCt 2460
GCGGTCTGCA CCTGCTGATG CAACCATCGC AAGCCACCAG CTGCTCATGC GCGCAGGGTA 2520
CGTCAGAAAA ATCGCCAACG GCCTGTTTGC GTACCTTCCC CTGGGCcTGC GCGTTCGACA 2580
CAAAATTGAA GCGATTATTC GGGAAGAACT CGAGGCTATC GGGTGTTTGG AGTGCACCGC 2640
GCCTGTCGTG ACTCCTGCAG AGTTGTGGAA GGAATCTGGC CGCTGGTACC GCATGGGCGC 2700
AGAGCTTTTG CGCGCCAAAA ATCGGCTCGA TCACGAGCTC CTTTTCAGTC CGACTGCAGA 2760
AGAATCCTTC ACCGCTTTGG TGCGCGGCGA CTGTACTTCC TACAAACATT TTCCCCTCAG 2820
TCTCTACCAA ATCAACGCAA AATATCGCGA TGAAATCCGT CCGCGTTACG GACTGATGCG 2880
CGCGCGCGAG TTCACCATGG CCGACGCCTA TTCTTTCCAC ACAGACTGCG CATGCCTTGC 2940
GCGCACGTAC GAAAAGTTTG CGCACGCGTA TCGCGCCATT TTCCGTCGCA TCGGCCTATC 3000
AGTCATTGCA GTACATGCAC ACcTCGGTGC GATGGGGGGG CAGGAATCCG AGGAATTCAT 3060
GGTAGAGTCC GCGGTGGGCG ACAACACGCT CCTGTTGTGT CCCCACTGcA CCTACGCTGg 3120
CAAATTGCGA AAAGGCCGTC GGACAGCGCC CCCTCCCAGA CACGCATGAC ACTCATCTAA 3180
AAGACGAACA CGAAGGgTCA GATCTCAAGA CGCCTGCAGC AATGCGCGAG GTGCACACCC 3240
CGCACGTGAA AACTATTGAG GAACTTGAAC ACTTCTTGCA CGTACCTGCA CATCGCTGCA 3300
TCAAGACGCT TATTTACCGC ATTGACACGG TGCCCCAGGC GGCTGGGCAT TTTGTGGCAG 3360
TGTGCATCCG CGGCGACCTA GAACTCAACG AGTCAAAGCT CGAAGCGCTC CTGCGCGTGC 3420
CATCTGTAGT ACTGGCAACT GAACAAGAGG TGTATGCACT CAGCGGCACC CCCGTAGGAT 3480
TCATTGGTCC GGTAGGAcTT GCACAGCGTG CTGCAGCTGC GTATGCCGCT CGCACCCtGC 3540
GTTCTTCCCC TCCGCTGCTG AGCCTGCATC CGTCACTTCT GACATTCCAT TTTTTTCCCT 3600
CGTTGCAGAT CAGTCCGTGA TGGCTATGCA CAACGCTATC ACCGGTGCGT TGAAAGTTGA 3660
CACGCATCTT GTGCAGGTAG AACCGGGTCG AGACTTTGTT CCTGACGCAg TTGCAGATCT 3720
CATGCTCGTG CGCGCCGGCG ACCGGTGCAT ACACTGTGGA GCGCCCCTAT ACGAAAAAAA 3780
GGGTAACGAA CTAGGTCACC TCTTTAAATT AGGGGACAAA TACACGCGCA gcATGcACCT 3840
TACCTTTACT GATGAGCAGG GTGTACGACA GTTCCCCCTG ATGGGCTGCT ATGGCATTGG 3900
CCTTGATCGC ACGCTTGCCT CTGTGGTGGA AAACCACCAT GACACGCGGG GTATCAGCTG 3960
GCCGCTTGCG ATCAGCCCCT ATGCAGTTGT GCTCATACCC ATCCCTCACA CGCAGGCCCC 4020
CTATGCAGCA GCAGAGGCAC TGTACGTGCA GCTGCGGACA CGGGGAGTTG AGGTACTGTT 4080
TGATGATCGT GCAGAGCGAC CCGGAGTAAA GTTCGCAGAC GCTGATTTAA TCGGTATTCC 4140
CTTCGTGTGG TACTGAGTGC GAAAAnCTAC CGCGCGTTGA ATGCaCAACA CGGTGTGGTG 4200
CGCACACGTA TTTTTTTACG CAAGAAGAGG CGTCCGAGCA CATTGCACGC CTGCTCGAAC 4260
AACTCGCTTC CCCGGAAAGT TCGTAAGAAC GGGAATGCCG GAGCGGGATC CAGCGCATGC 4320
AGTGCTGAGA CCTGCGCATA ATAGCACAGT GTACGGCACC CGTGGTTTAG AAAAAAATGA 4380
CGAAGGAGAA AAGGAAAAAC GGTGTACATA AAGGTAGCGC TCGTGTGTCT TTTCAGCATG 4440
GGAGCGCGGT GTCTTTTGGC CACAGAACCG GCGCCAGTCT CTGGAGATTA CGTATTGTAT 4500
CGCGACTATT CGTGGAAATC GCCCACATGG GTTGGCTTTT TGTGCTACGA CGCACACACG 4560
TACGGTGCGC TGCTGTGTAC TCCGGCAGAA AGCCGCAGGA TCACAATTCT CTTCACGGGT 4620
ACTGAAAAGC ACGGCCGCTT TGAGCTGACC GGACAACGCA TCACCTCACC GGTGCGCACA 4680
GAGGATCTGA CTGGCATAAA TTATCTCATG GATCTTTTTC CTCAACTACA GCGCTGGAAG 4740
CATTTTCCCC GGGATACACA CACCCTTGTT GCGCGGCATA CCGATCGGAG TAAAAAGAGC 4800
ACACAATTCT CAGGGGCAGT CGAACTGCAG TTCGCTTCTT TTGTCCCCCT CTTCCACCTA 4860
GAAATACTCC GTGATAAGCA GCAGCGCGTC ATGCTCCAGC TAAGCGAGAT AGGGAAGATC 4920
GACCACACCA GTGACGCAGC CTTCTTTCAA TTCACCCCCA TGCCCCCGTC CACGCCCACT 4980
GATGCACCGc CAGCAACGCT TAATCAGACC CTGACACGCA CGGAGTATGT CATCGATGAC 5040
GTGTGCATTG CACTTGATCC GCAGTGGAAA AGAATTGCAG AAAATTCCTT TCTTTCAGAC 5100
TTTGCCTTTC TCACCGTACA CCAGGTGCCT GCACCGCGCG CGCACGACTA TTCTGCGCTC 5160
CGTGCATTGC TGCAACTCTT TCTGTATTCA GGCCCTCAGG GAAAAAACAT TCTTGAACAA 5220
CTCCATATCA ATGACACTCA CGCGCGTCTT ACGCTTTCCT ATGCAGTGTT TGACCTTCCG 5280
TCAAAAACAG TTAAAAAGAC ATGGAAGATA TTCATCCGCC ACTCTGATAC GCACTACTCT 5340
ATACTTAGTC TCACGGCGGA CCAgCGCACA GCGCAGsGTT ACGCGCGCTA CTTTGACACG 5400
CTCATTGAAA CTATCCGTAC AAAAAACTAA AAAATGCTGA ATTGGAGCAT ACCCGTGATT 5460
AGACACATAT TATTTGACAT AGACAACACG CTGTACTCCT GTACAAATCC CATTGAAATG 5520
GCTATCACGC AGCGCATACA CACATTTGTT GCACATTTTC TCCACGTATC TTGTGAGGAG 5580
GCGCGCGCGT TACGCCAGCG CACAAAGCAC CTCTATGCTA CCACCTTTGA GTGGTTAAAG 5640
GCAGAGCACA ATCTCATTCA CGATGAACAC TACTTTCGTG CCGTATATCC TCCCACCGAA 5700
ATACAGGAGT TGCAGTACGA TCCGATGcTC CGCCCTTTTT TACAGTCACT GCACATGCCA 5760
CTGACGGCAT TAACTAACGC ACCGCGCGTG CACGCACAAC GCGTATTGGA TTTTTTTCAT 5820
CTGTCAGACC TTTTTTTAGA TGTCTTTGAC ATCACGTATC ATGCAGGCAA GGGAAAACCA 5880
CACCACAGCT GCTTTGTACG TACGCTTGAA GCGGTACACA AAACTGTGCA GGAAACGCTT 5940
TTTGTCGATG ACTGTCTCAT GCACGTGCGT GCcTTTATTG CGCTTGGCGG ACATGCCGTG 6000
CTGGTTGACG AACGTGACTG TCATGCAGAA CTGCCTCCTT CTGCACGCAT GACACGCGTA 6060
AAAACAATTT ATGAATTGCC CGCACACCTT GCACGCCTCG CCCAAGGAGA CAATCAGTGA 6120
GTATACATTC GTTGCAGCAG ACTTTTAGCG ACATCGTCCC GCTCCTGGAG CAGTATACGC 6180
GCGCAGACCG CTTCATGCGG GAGGATAATT TGTTACACGA GAGAAACGAA CCTATCCGGC 6240
GTATCGTTGA GTCCCTCGTC GCCCGCATAT TACTCCCCGG CTCCACAATG CGCGGAAATG 6300
AGCAAATCGC ATCCTTTTTA CATAAAACCA ATGAAGGGAA ACGGGGACTC ATTCTTGCGG 6360
AACACTACAG CAATTTTGAC TTACCCTGTC TGCTCTACCT TATGGAACAA GGAAGTAGTG 6420
CCGGGCGCAT GCTTTCAGAA AAAATCGTAT CTATTGCCGG TATTAAACTT CGTGAAGAAA 6480
ATCGCATCCT GGCAATGCTC ACCGAAGgAT ATGATCACCT GGTGATATAT CCCAGTAGGA 6540
GTTTGGCCAC CATCACTGAT GCGCACTGTC TTGCAAGAGA GACAAAGCGC AGCnGAGCAC 6600
TGAATCGTGC AGCTATGAAG TATTTAGAGG AACTGCGCAA CGCGGGAAAG GTGATTCTCG 6660
TGTTTCCTGC AGGGACACGC TACCGACCCG GGAGACCGGA AACAAAGCGA GGGGTGCGCG 6720
AAGTATACTC CTACATAAAA CACGCCGAGG TACTGCTCCT TATTTCAATC AATGGGAATT 6780
GTTTGCGCGT TGCAGAACGT TCAACTGATA TGACGGAAGA CGCGGTGCAT CCGGATGTCG 6840
TGCTTCTTGA AGCGCGCACT GTAGACGAcT GCGCCCTTTT TCGAGAAAAA GCGCTGGACT 6900
GGCACCGCAC ACACAACGTG GCGGCACCGT CAGAGGATAA AAAACAAATC GTAGTCGACT 6960
ATGTCATGCA CCTTTTGGAA GAAATGCACG AGCACAATGA ACGAGAAAGG CTATCGTGAA 7020
TTTTTCGCTG GAATTCCCCG TAAGATCCTA TGAGCTAGAC GGATACGGAC ACGTGAACAA 7080
TGCGGTATAT CTCCAATATT TTGAATATGC GCGCGCCGCT TTTTTGCTCC ACATAGGGTT 7140
CGACCTCAAA CAGTTGCACG AAGCAGGTTA CGCTTTCTAC GTAACCCAGG CGCACATTCA 7200
CTAcCGCACT GCAGTGCATC TATTCGATAC GTTGCGCGCC CGGGTAAAAC CATTAAAGCT 7260
CGGAAAAGCT TCCGGCGTCT TTTCACAGAC GCTGGAGAAC CAGCATCACG TGCTATGCGC 7320
GGATGCGGAA ATTACCTGGG TGTGCGTTTC GCGCACAAGC GGCAAACCAA CTAAGATTCC 7380
CCCCGAGTAT CTGGTACCTG CGCTGTATCC GAACTACTAG TCCTCCCTTC TTTCCCCTTT 7440
ACTCTCCCAA GGACATCACA CTACGGAAGG GTACGCATAC GCAGTAGGGA GGTAGGGTTT 7500
ATCGCGGAGC CATTCTTATA GATTGTAAAA TGCAGGTGTG GTCCCGTGCT GCGTCCTGTT 7560
TTTCCCAATA ATCCGATTTT tGTCGCGCTG GTGACGCGCG TACCTGCTGA AACCAACACC 7620
GTCTGCAGAT GCCCATACAG GGTCTGATAC CCCGCGTGGT GCCCCACAAT CAGGTAATTA 7680
CCATACACTG CACTGTATCC AACCGTGCGT ACAATCCCTC CGAGCGCCGA ATATACTGGG 7740
GTACCCCGCC GACTCACCAT ATCCAAACCA TTGTGAAAAC TTCTGGCACC GGTAAACGGA 7800
TCAcTACGCC ATCCATACCG CGAAGAAACA TAGTACCGAC TGCGAAGAGG AGCACGAAAC 7860
AAGTCACCAT TAATTTCCTG CAACGCGCGT GCGCTTAAAT GTGCACCGGG CAAAAACAGT 7920
ACGCGTGCAG GCTGCAATGG CTGCAcTGCG TCAAACGACG TATTTTCCCT CCACTGTTTC 7980
GCAGAAGAAA ACGGAAAAGG CACGCAGgAC TCCCGTGCAG CTGAATTATA GAACGGAGAA 8040
ACCAGCGTAC GCACTGAAGG AGGTGACTCC TTTGAAGAAG ACGGCGTGTT AAGCAGCACC 8100
AATCGTTCTA AGGAGATCTG ATGCGCCGCC GCTATAGACG AAAACGTATC GCCGTTTTTT 8160
ACGGTATATA AAATGCCGTC CACTGAGGGG ATTTTTAGTA GCTGTCCAAC TTGGAGCGCC 8220
CGTtGyTGCG CAATTTATTC AAACTAATGA TTGCATCCTG ACTGATGTCA TAgcgCtGCG 8280
CAATCCTTCC TACCACATCA CCTTCACGCA tTCGTACACT GTGTAGTACA GTGCAGGcTC 8340
CGCATCTTCC TGCACGATAC GTGCACGGAG CAAGGAAGAC ACGTACCCCG ACGCCTGACG 8400
TGGTTCCTGC TCAGTGAGCG TGAGGGCAGG TGTCAATGGT TCCACCTGAG CACCAAAGTA 8460
CGCAAGGGCA AGAGCAAGGA GCAACAGTGT TACGAACAGT AACAGTnGCC tACGGGGACA 8520
GGTCTACACG GTTCTCGCAC AGTCTGTTTG GAACTTCGAC AGTACACGCT CACACCGGCT 8580
ATCCTTCAGG TGTACACACT GCCGTATCtG CGGGCAGGTT GCGTCTGTAC CTAACGCACC 8640
GTCTAGAGCG TCCACGCACG CAcGCGCGCG CGCGAGGGAG TCCGGCGGAA AAGAAGTTAA 8700
CACCCGCATG AATGCAGGGC TCGGTGTTGT CCACACCTGT GCAGGCAGCG CCTGAAGACG 8760
CGATGAAGGA AAACGCACAC ACCAAGCAAG CAAAAAAAAG nGCGTGTAAC GCGCATTCCC 8820
GTGcTGACTC GCGCCACCGT ACGTGCGATC TCCTACCAGG GGGAATCCCT GTGCAGCGCA 8880
ATAACGGCGA ATCTGATGCT TTTTCCCcGT AACCGGCACA ATCACGCGGA GCACCAGCGC 8940
GCTATCACAG CTATGTAACA CTGTTTGCAC ATGCGTTACC TCTCCTGGAC GCACCAGCGT 9000
GCGCGCCGCA GCAGCGGTGC gCGCAGGGGC GGCGGTGATC GCAAGATAAA ACTTGCGCAA 9060
TGTATGCTGC TGCAACGCGG CAGAAAACCA CTGGGCACCG CGTAACGAGC GCGAAAAAGC 9120
AATCAGTCCC TCTGTCCCTC GGTCCAAGCG GTGCAACGGT CCAGGGCGGA ATGACAAAGC 9180
AGGGGGAACG TGCGCACGCC CTTGTCCCCT CACCCAGGCA TCCAGGctGC GCGGACCGTG 9240
CACAcAcAAC tGCGGGTTTA TGAAAAAAAA GCAAATCTTG TGTTTTAAAT ACCACCGAAG 9300
cAACACGCGC ATTcGGTGTT CCAGGCATCT TCGAAAGACG ACTGGATGCT GCACGCGCCC 9360
TACACAGGGA TTCAGGTAAA GAAAGCACAT CCCCCACCTG CACCCGCTTT GCAGGCTGCA 9420
CCGGACGACC ATTGAGCCGG ATAGCGGTGC GGCGCACGCG GGCATACACC CCAACACGCG 9480
GACAGGCAGG CAACAATATT CGCAAAACAC GATCTACTCG TCTACCTGCA TCGTTTTTGG 9540
TGCAGCGAAA ACACTCAACA GCCGCTCCAC CATGGGGTCT CACGGTGGAA ACAGGAGGGA 9600
CGACATCCAT ATGCACAGTG TGGGGAACGT TAGACGAGAC CCACCTTTTC ACGCGACGAA 9660
CACCTACTTT CATACACGGT GCGTCCCCGT GCCGGATACC AGTTGCTCCT CCCAAACGTC 9720
CTCCCCGTCT TTCCCAGTAC GACACCACAG CCCATGGCGG GTACAGCCGC CCCAGTATAG 9780
CGCACACAAC GCTCCCTTGA CAAAGGTTTA GAGAGTATAG GAGACTGCCC CGCGATGGAC 9840
GGTGGCTATT TTCTTGGCCA GCTGCATGCG GTGTTCAGTG GTGAAGTCTT CCTCTCTGCC 9900
ACCTGTAGTT GGCTTGCAAG TCAGGTGATT AAAGTGGCTA TCGCATGCCG AAGtCGGCTA 9960
TACGGTCGGT GCACGGCTTT TTTGATTTTG CTGTTTGGCG CACCGGCGGC ATGCCTTCGA 10020
GTCACTCTGC TCTTGTGTCG GCGCTCACGC TCTCTTTTGC GCTCAAGTGC GGGTTGCATT 10080
CGGATCTGTT CATCTTTTCC TTTTTCTCTG CCATCATTGT CGTGCGCGAC GCGCTCGGTG 10140
TGCGCCGTTC AAGCGGCCTG CAGGCCGAGG CGCTCAATAG CCTCGGTGCG CGTGTTTCGG 10200
AGAAACTTGA TTTTTCTTTC AGACCAGTGC GAGAGATTCA TGGACATAAA CCGCTGGAAG 10260
TTGTCGTTGG CGTGGCAGTG GGCATCGTCA CGAGCGCTTT GTTCTACAGC TCCATGAGCC 10320
CTTGAGTCTC CGGTGGACGT GCATGCAATG CGGcGGACCC CTCCACACAG AGGAAGAGGC 10380
GGTGCTGTGC GCGTTCTCTC GTGTGTCCCT GCCGCGTCGG GAGGCGCAGA CCTTTTCTGT 10440
ACCGTACAGA GGGCACACCA ATGATAGAGC GCCTACGGAG CAGTCGCGGG AAACTCACCC 10500
TCACCCACCA GATTTTCCCC CTCAGCTTTG GGGGGAATGC TTTTTTGCCT GCGCGCGCGC 10560
TCGTTCCGTT CTCCGTTGAT GCTGGAGAGC CAGCCGCCGT CGCTGTGGTA AAGGTTGGGG 10620
A ACGGTCCG AGAAGGTCAG CTGATCGCAC GCGCCGCGCA CGCCGGTGCT GcTCACGCAC 10680
ATGCCTCCGT CCCCGGTGTC GTCACCCGCT TGGTAAGTGC TAATTTTCTC GCCGGTAGTG 10740
CCCTGCGCGC TGTCGAGATT CGTACACGCG GTTCCTTCGA ACATCTTGGC AAGGTCCAAC 10800
CAAATCGCCC GTGGCAGCAC AGCACCGCTT CAGAATTGct GCGCCTAGTT ACAGATGCAG 10860
GAGTAGTGGC CACACGCCTA CATCCGCACG CCCAGATCAC GAGCACCGCA ACGGGCACGC 10920
ACGCGGGTGC ACAGCACACG TACGCGAAAG ACTACGGACA GAAGAGAAGG GCTGAAGCGC 10980
ACACGCTGCG TCTCATGCGC GCGGCGTGGG AAAGCGGCAA TGCGCTCGCC ACGCACCTCC 11040
ACCTGCACGT GCGTAAGGGT GTACGGAAAC TTACGCTCTA CCTTTGTGAC GACGACGCTA 11100
CCTGCCCTTT GAGTTCGTTC CTTGCGCAGG AGTTTCCAGA ACCTGTTGCT ACCGGTACCG 11160
CCATTATTGC ACGGATACTG GACGCTACGT ATACCcGCGT GTCTCCACAC GCTGCCAAAA 11220
CGCTCCCCCG GTCTTGCAAG GATGCGCGCT GTCTTTCCAT TCAACGAGAT GCACGACGCG 11280
TATAGACGAC ATTATCCTTT TAGCAATCTA TGTGCCCCAC GCTATCGTGC AGGTTGCACA 11340
ATCGATGCAC TCACTGCAGT GCACGTGTAT GAGGCAGTGG TACTCAGTCA GCCGCAAATC 11400
AGTTCCTACA TTGCTCTGAC AGGCGCTGGA TTAAAATCAC CGCAGGTACT CCGCGCGCGT 11460
ATCGGCACCC CCCTTGGCGC GCTCATCGAG GAGTGTGGAG GGTTTCGCAC ACGCCCCGGG 11520
CATCTCATCA TCAATGGACT GCTCAAGGGT AGTGTTTTAG AGTCGTTGGA CCTGCCTTTC 11580
TCAAAGGGGA TCAAATCGCT CCACGTCACC GGTAAAGCGC TTTCAAGCTC TGCGTCCTGT 11640
ACCTCCTGTC AAAACTGTGG TGATTGCGCG CGCATTTGCC CAGTATATCT TGACCCAATA 11700
AAAATTGCGC GTGCCGCACA CCGTAATCAG TTTACTGAAG AAGTGCTCCA ATCCcTGcGG 11760
ATTTGCCACC AATGCGGTCT GTGTTCTGCC GCCTGTACTG CGCGTATTCC TCTTGCAAAA 11820
CTTTTGCACG ATGCACAAGA ACGCGCACTG CATCTTTCCC GTGCTCCAGT CACCAAAATA 11880
GAACCCCACT CCACACAAAG CGTCGGGAAA ACTATCCGCG AGGCACCTGC CAATGCGCAC 11940
CGCTGAGTAC AAACACGCAC CCTTCCTTTA CACCGGCTTA AGTGCTGGAC AGAACAACAG 12000
TGTACTGTTG GCGCTGCTTG TTGCGCACGT GTTCGTCGTT GCAGCcaTkc gCGACACGGT 12060
CGcGCTTTTT TCCATCGTCA GTACCGAACT CGGCGCACTG AGCGCCGCGC TCGTTCAAAC 12120
AcTACGCACA CCACATGTGC CCCTGAGCGA CTCTCTCGTA CTGGGCCTGC TCATCGGTGC 12180
AGTACTCCCC GCACACAAcT CTTTTTTGAA cACATTTTGT GTCGCGTTCT GTGCCgTATT 12240
TTTTACGCGC GTTTTGTTCG GTGGCAAAAT CGGGAATTGG CTCAACCCCA TAGCGCTTGC 12300
CCCTGTCCTC CTCCGTCTGT GCACGGAGGG AACTTCCCTC CCAACGTCTG GGCGTGTCTC 12360
TGTTGTACAG GGAGCGATGT CTTATCCTCT TTTCTATTCT GCGCTTGTCG AGTGGGACGC 12420
CGCCGTGCGT ACGTGGTGCA ATACGCAGGT GTTCCAACCA CTTGGTCTTA CCCTCCCTGA 12480
GGGAGCGTTG AGCGCCTGTG TGTTCACTCA GGCTGCAGCG CCTGGGTTTC GCTATCCAGT 12540
ACTTACCCTT CTTGCTGCAC TGTGTGTATA CGCAgTGCGG GCGCGACGCT ACATCTGTTC 12600
GTGCGCGTTC CTTGTGGTGT ACAGCACACT GTTTTTTTTa CCCGCACACG CACACCCTGC 12660
AmCCCTTGTT TCCCTCATAA AAAGCGGCGC GCTGTTTACT GCATTCTTTG TACTCCCTGA 12720
GCCAGATACG TCAATGCGCA CAAATGGCGG GGCTTGGATC TCAGGGGGAC TCTGTGcTAT 12780
GTGCGCGTTT TTTCTTGCAA AAAAGAATAG TTCTGCCCCA GATATGTGGG GTGCACACGA 12840
CATGCACTTG TGGAGTGCGA TACTACTCAC CAACATCGTA CAGCCACTCA TTCTACGCGC 12900
AGAATCCTGG TACTACTATG TGCGGAGGCG TCGCTATGAC GTACAACACT AACACGAGTC 12960
TTTCATCCTA CGCAGgATTG AGCGCATTTG CGTTGTCAGT CTTTTGCATT CTATGGGGCA 13020
CCGCGCGCAC TGGTTCTTTT TTAAAAGAAA AGGCGCTCAT CACyTGCGCC GCAGATATCC 13080
TTGCAAGGCA AGCCCCAGAA CTTGGGGTCA CGTCACGCAC CCTGCGCATG GTACCGAGCT 13140
CCCCCATACC GCAGgCTGAG GTGCTTCGGG GAAAAAAGAA TACGGGAGAG GAAATATTCC 13200
TATACTTTTT CCCACTCAGG GGAATGTACG GTTCGTTTCC TACCCTTTTT TTGTACGATA 13260
AAAAAGATGG TGCaCGCTTT TGCCaTCTCA TAGGTAATCA CCCTACACCG CGTGATGCAC 13320
GCTTTTATGG CATATCGAgT tGCGCGCATC GCkyTTCAGT GTAGAAAAAT AGAACACCTC 13380
CATCAAACAG TCGCATATGA GTAAGTACAC GGTTAAGCGC GCGAGTGTAT TGTGCATTTT 13440
TGGCATAGGA CTATTTGTTC CTGCAACCGG AACCTTTGCc TGCGGTCTAC TACTCGTACT 13500
TGGCTTTTGG GTTCTATTTT TTTCCTCGCT GCTGGCGAGA TTTCTCTCAC AGTTTTTTAT 13560
GCGCACGCGC AGCgcTCCTT TGTTCGAGGT CTGTCTTACC CTCTCAGCCA CCATTATGTA 13620
TGACAACTTG ATCCAAGGCT TTTTCCCGCT TGTGCGTATG ATGCTGTGTC CTTACCTTTT 13680
CATTA CsCG CTTTCGCGCA CACTCGATCT CTGTCTTACC GCATACGATG CAGATGCCGA 13740
ATCGCTCGAA TGCGTAGGTG TCTTCGGCAT CATGATTGCG GGAATTTCTC TTGTACGTGA 13800
ATTAGTTGCC TTCGGGTGCG TTTCGCTACC GGCCCCGTCG GGGTTCTTGC GCATCATCTC 13860
TTTTCCACCC AGCAATGTAA TACGCTTTGC AGCCACCGGC GCAGGGACCC TCATAAGCTG 13920
TGGTATTGTT CTTTGGATAT TCCGCAGTGC AGGTAACGAC CACACGCCCT CTTTAAGGAG 13980
TGAATGGTGA CAATGGTGCC ACCCCTGTTT TTCGTATGCG CCCTTTTCTT TGCCGAGGGC 14040
ATCGGATTAG ATCGCCTGGT AnC 14063
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14244 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CTCTGCTTGC CCCCTATGAG AAGACGGAGG CGCTTTCTCA CTCTTTGCGC GTGTGCTGTG 60
CACCTTCTTC CTCTTTTCCC TCAGACGATT ACAACCGCTT CTGTCTTTTT CGCCTCAGTT 120
TCTGAGCATT ATGCCCGGTT AAAGGATTAC GCTGCTGATT TGGCCATGAG CACCGGGTCA 180
GGAACCCGCG CGCACCTTAT GCGCGCAAAG GTTATTTTTA AATATCCAGA CCGTCTGCGT 240
TTGGATTTCT CAAGCCCTGC TGAACAAACT ATTGTCTTCA CGGGAGATAG CCTGACCATC 300
TACTTGCCCA CCTCCCGCGT CGCGCTTGTA CAATCGGTAG CAAAAGATGA CACAGTAAGT 360
GCTGCTTCTC TAGCTTCGCC TCATGGTCTT GCGCTTATGA AGCGGTTCTA CACGATAGCC 420
TACGAGACGA GTTCTTCTCC TGTTCCCCTG GGTCCGGACA GTGGGGAGAT GGtCGTTGCA 480
CTGGTGCTCA ATCGTAAGTC TGCAGCAGAA ACATTTAAAT CTTTGCGCGT GCTTGTCTCG 540
GCACATACCA AGCTTATCCG TCGCATTGAA GCGTGGCCTC TTTCGGGGGA AAAAATAACA 600
TTTGATTTCA GCCACTATCG TTTGAACGTC GGTATTCCAG ACACACGGTT CCTCTACGAT 660
GTGCCCCCAA CCGCAAATGT GGTGCACAAT TTTCTCTTTG CTGATTGACC GCTGCCCCCA 720
AAGGACGTGA CGATGCCAGA TATTGGAGAG CTGCTAAAGA CGACGCGCGA ACGCAAACAC 780
CTCAGTCTCG AACAGGtGCG CACGAGACGA GTATCGCACG CCGTTACCTG GAGGCGCTCG 840
AGAACGATGA GTATGATGTT TTTCCCGGCG AACCCTACAT CCTTGGCTTT TTGCGCAATT 900
ACTGCGAGTA CCTCCAGCTG GATACGGAGC AGTGCATCGC TCGCTATAAA CATTTAAAAA 960
TTCAAGAAAT GTCGCTGCCA ACGGAGACCC TCCTACCGAG TAAACGGTGG GGTTCATTTC 1020
CCCTGtTAAA rGGAGTTGCC TGTGTGCTCT TCCTGGGTGG GGTGCTGGGT GTGTATTACG 1080
CGCGGCACCG CnCnTnGGGT TTTCTATCCC GtATTGTGTT CTTTGGCAGA GCACAGCGTA 1140
CCCCAAGGGA GCTGTCTCCC CCCGATGCAA CGGGGGCGGT GCGCGAAACA GTGTCGCTGT 1200
CTTCTGCACA ACATGAGGAG CGTGCGCGAC GCACCGTATA CGAGCGCATC TCGCTATACG 1260
CTTGCTGAGG AAAAGTTTGA ACACACGGTC TTTCCAGGAG ATGTGTTGGT TATCAGTTCC 1320
GGGGGGAATG CGTACGAGCT AACCGTCAGC CGCACTACGC CGCACCTGTA TCTGGACACG 1380
CCCATTGGTA CACAGGTGAT CTCTCTTGGT CAGCGCCtAG TGATGGATTT GAATACAGAT 1440
GTGCAGCCGG ACGTAGAAAT AAGTGTGGAA GACATTGAAG CACATCAGGC GGACGGGGGC 1500
GCGCkTGTTC GCGTGTTTAC AGGTaGTCTG GTGCAGACGC TCCGTGAtCG CAgTGCTCAG 1560
AGCTTTGTGC CTACAAGTGG GGTAAATGTC TCTGGTCAGA CGGGAGTCGC TGCCGGCGCG 1620
CGATATCAAG TTTTGTTTGA AGGCGGTGTT GCGTACCCGG TGACAATGAA CGCAACGTTT 1680
CGCTCGTACT GTTTGTTCCG GTACGAAGCA GATCGCACGC GGCGGGAGGA GCGGTATTAC 1740
CAAAAGGGCG AGCAGCTGAC GGTGCAAGCA AACAACGGGA TTCGGGTGTG GGCATCTAAC 1800
GGGAATGTGG TGCAGCTGCA AATTGTCGCA GGCGGTAAGA CGGTGGATGT AGGCCTCAGC 1860
CGTCCGGGGG AAGTGCTGGT CAAAGACATC AAATGGATCA AAGATGAGGA CGCCGGGCGG 1920
TTCAAGTTCG TGGTCATGGA AGTAGACTAG CGCGCGGCGG CAGCAATCGC GTACGCkTTC 1980
CAGAGCGCGT GGACTGCAGT GCACAGTGCG CTTGCgCGCG CGCGGGAGCC GCTTCTTTTT 2040
TTCTCTCTTA CAAAAAGTAC CCGTAgCGCT GCGCCCGCAG CTcCTGCAAA CAGCGTGGcG 2100
CTGCcTGCGG GCCGGTGTGC AAGAGCAAAG AGAAGGACTG ACAGTACCTC gCCACAGGcG 2160
CGTGCAGTGC AGGAAGTGGC ATGGTGGCAC AGACGCTCAG GTATATAGGC GCGAAAGAGC 2220
ACTTCTTCGC TCAGAGCATT TAAAAAAAGC CGTACATAAA AAGCTCCCCC TTCCCCTTCT 2280
GGGAAGGGAA ACGGTGAGGG AAGAGAAAAA CAGAAGGAAG GAAATACTAG CGTGCTGAGT 2340
ACGAACGCGT ATTCGGTAAC AGCCGCCGCA GCGTGnCAGT ACGTGTGGCG TGTTGCGCAA 2400
AGGGGAGAGG TGCATCAGCG TGGAACAGTG TAACAGAATC GGGCAGAGGG GGTACAGAGC 2460
GCATATATCC CCTGCAGTGG TGATGGCCAT TGCTGCGGTG GGAGGTTTTT ATGGGACTCA 2520
CGTGTATGAG GTACCGTTTC CGTATGCATT CTTTGGTGCA GTACAGGCGT GTGTGCTGTG 2580
TATTGGGTGT TTGTTGGTCC GCAGTGGTGT GCGGTTCTTT TCTCGTTGGG GTGCTGTCCG 2640
TATCTGGAGG AGGTGGGGAA TCGCATACAC CAGCGTATGT CGGTGTTGTA ATACGCTTTT 2700
TTTCGTGTTC TGTGGTCTGT GTGTTGCCTG CGTTGCGCGA ACCTCCCTCA TGGTACAACA 2760
AGCTCCGTTG CAAACACTTG CACAACCCCA AAAACTACGC GTTTTGACTA TACACCTTTT 2820
GCaAGAGCCA AAGCCTGCAG GCaCGCGCTT TCGTGTTCGG GCGCGCGTAT TGGGTGCAGG 2880
TTACATAGAC GGTGCTTCCT TTTCTGCACG TGGGGTGTGC ACTGTATTAT TTCCTGCAGA 2940
GGTAATTTTG CAGCAGTACG CTACCGATAT GACGGACGAC gcGGATGCCC GCGTCTGTCA 3000
GTATTACGCG CGTGGGTTGC GCTGTCAGAT TCGTGGGCGC TTTGCATCTT CTGCACCGAA 3060
GCTTTTTATC AGTAGTTCTA CACCACCACG CTTTGTTGGC TGGAGTTCCT ATTTTGCACA 3120
GATGCGCGCA CAGATGCGGG TTGCACTCAT GAGGTTTTTA TCTCCATGGG GGCGTGCAGG 3180
GGGATTGTTA CTCGCGCTCC TTTCTGCAGA TAGTGTTTTT CTTTCGGATG AAATGCGTGT 3240
CGCGTTTCGC CATGCAGGAC TTGCTCACGT GTTGGCACTC TCTGGCATGC ACTTGTCTTT 3300
GGTAGGGGCG AGTGCAACGT tTTTGGGCCG TTTCATCGGC ACAAGGCACA GAGGTATGCA 3360
GGGGGCGTTT TTTGCGATGC TTGTCTTTGT GTGGTTTGCA GGTATATCGC CTTCCCTTGC 3420
GCGTGCACTT GGTATGACTT TAGTGCTGAT GGGAGGACAG ATGGCATACG TGCGCGTAGG 3480
ACTTTTTTCT GTACTGTGTG CTGTACTTAG CATACATATG CTCATTGCGC CGCATGATGT 3540
ACAGACGTTA AGTTTCATGT TGTCATACGG AGCGCTTGCA GGTATTGTGT TGCTTGGCTC 3600
TGAGATTACT GAAATGATGT CGGGTTTGAT TCCTCGGCCA CTTGCATCGC TGCTTGGAAC 3660
GTCCTGTAGT GCGCAGTTTT TTACAGCACC GATAGTGCTT TCGGTCATTG GATATTTTGC 3720
CCCCATTGGG GTACTTGCCT CGTGTGTGGT TAGTCCGCTT ATCGCCTTAT TTTTGATAGG 3780
GGGGAGCGTG GCGCTGTGCT GCTCTTTGGC AGTGCCTGCT GTTGCGCCTT TTTTAAGTTG 3840
GGGTGTGTAC TTTTTTGGTG AAGGACTCTG TGCGGTTGTG CGTTTTTTTG CGTGTGCGCC 3900
GCTTGTGTAT GTACAGAGTG CCTGCGGACA TGTGTGTGCT GCATTATTTT CTTTTTTACT 3960
CGGTGGGGGA CTACTAGAGG CGGCGCGTCG CGTGcGTGTT CACAAGGATA CATATGTGTT 4020
GCCCGAATTA TAATTCGGCG CGTGCACTTG CACAGTTTTT GACGGAGCGC GGTTTGCGGA 4080
TGCATAAAAA GTGGGGGCAG AATTTTCTGC TCGATCCGGT GTTACGTACG CAGCTTGTTA 4140
AGATATTGGC GCCGGAGCGT GGGGAACGTG TATGGGAAAT tGGTGCAGGC ATTGGTGCGA 4200
TGACCGCACT TTTGGTGCAA AACAGTGATT TTTTAACAGT GTTTGAAATT GATCGCGGCT 4260
TTGTGCAGAC ATTGCGCAAA CTTTTTGATG CACACGTCCG TGTGATAGAA GGGGATGTGT 4320
TGCAACAGTG GCATGCTGCA GCAGCACAGG AACAACCTGC GTGTGTTCTA GGAAATTTAC 4380
CCTACAATAT TGCTGCCCGT TTTATTGGAA ACACGATCGA ATCAGGCTAT ATTTTTAAGC 4440
GTATGGTGGT GACCGTTCAA AAAGAAATCG GGTTGAGAAT GACTGCGCTC CCTGCACAAA 4500
AATGGTATTC ATACTTTTCA GTACTCTGTC AGTGGCAGTA TGAAGTGCGT GTGATTCGTA 4560
ACGTTGCGCC TGTCTGTTTT TGGCCGCGTC CTCATGTAGT TTCTCAAGCA TTGGTACTCA 4620
CCAAGCGTAA TGCGGTGCCT TCTTGTGTGG ATCCTGCGCT TTTTCTGCAC GTGACGAAAA 4680
CTTTGTTTTC TGCGCGGCGT AAAACGGTAA GAAATAATTy ACTCACGTGG CAAAAAAGGA 4740
TGCCAGGCGG TGCAGCTGTG TGTGTAGAAG AACTCTGCGC ACGTGCAGGT ATTGACGCGC 4800
GTGCGCkTGC AGAGCAACTG AGCATCTATG ATTTTATTAC GCTTTCTGaT aCgctGCGCG 4860
CGCTACTGTA GTCCGGTGTG GGTGTTGAAT GGCGCGTGTC TATATTCTTT TTTTCAGTGT 4920
GTTTTTTGTT TTTCCGCTCT TTTCTGAAGA CGCCGCGCGC GATGTGGAAC CTAGCGATGC 4980
GCCTGTGCCC TATGAGGACA CAGAATTTTC CTTATGGCAG AAAGAATTGT ATCGTTTTGA 5040
AGCGCTGTCC ATCGGTGCAT TCCCGATAGT AACGCTGCTC TCTTTTATCA CGTATGACAT 5100
CATACGTCTT ATTCAGCAAT GGTCGACAAA GCCTCCGACA TGGTGGGCGC TGATTATTCC 5160
TGGCGCGGAc TAcCGCCAcT GAGTACGAAG GAGCGCGCGA TAGTTTTTGG TGTGGCAGTG 5220
GGGATTTCTG TGACGATTGG ATTAATTGAC GTGACGTATC GTGCAGTGAA GCGTGCAATA 5280
CACCGGCGTA GTCTTGcaGC GTTCGCAGTT AGTACCAGAC CCGATAGAAC TGGTGCCACT 5340
TGATTCTTTT GTTGAGGGGA CTGACGATAG CACGTGAAGG TGCACAGGGT GTCGTATTCT 5400
TGCAGTGCAG AAAGCACGTC GGTGTGCAGT GTTCATTTTT CCGTTTTTAG AATACCGGGC 5460
GCGACGTGTC GGTGTGATGT GTTCTGCGCA GAACATGTTT TTTTTTGTGC ACGATCTCAG 5520
CTCAAGAGGA TCGTGCGTGC ATTGTGTGTG AATGGGCATA CGGCAAAGTT TTCAAGACCT 5580
CTTCACGTGC GCGATCGGGT GTCTTTTGAG TGGGTACGCT CAGTGCCCCC GGCGCTCATT 5640
CCTGAGAATA TATCGCTTTC TATTCTGTTT GAAAACGAAG ACATTATTGC GGTGAACAAA 5700
GCGCAGGGCA TGATAGTACA TCCTGGGGCA GGCCACTGGA CGGGAACACT TGTTCAGGCG 5760
CTCAGTTTCT ACCGGGTGTA TCGTGCACGT TTTGAGGATG AGTTTTCTCG TCAATTTCAG 5820
AAAGGATTTC CCGATTTTTT CAGTACCCTG cGTCAGGGTA TTGTGCACCG TTTGGATAAA 5880
GATACATCGG GCGTACTCCT CACTTCGCGC AACATGCATG CTCATGAGGC ACTTGTACGT 5940
TCGTTTAAAA AAAGACAAGT AAGAAAAGTA TATCTTGCGT TATTGCAGGG TGTTCCTGCA 6000
CGCGGGGTTG GGGTGATTGA AACAACAATC GTGCGAGATA GAAGACGACG CACGCGGTTT 6060
GTTGCGTCTG AAGATTTTTC AAAAGGAAAG TACGCACGTA CGCGATACAA GGTGATGAAA 6120
ATATGTGGGG CGTGCGCTTT TGTCCAGTTT CTATTGGATA CTGGTCGTAC CCATCAGATA 6180
CGTGTGCACG CGCGATACCT AGGATGTCCC GTTGTAGGAG ATCCGTTGTA TGGTTCCCGG 6240
AATATCTGTG GCATACCCAC AACACTCATG CTTCATGCGT ACGCAGTACG GTTTGTTCTT 6300
CCGAGAACGA AAAAACGCAT AACGCTGGTA GCGCCCATAC CGCTTCGTTT TGTTCGACTG 6360
ATACACCGAT TATCGGTTAG GTAGGGTGTG GCAGGTGCGT GCGTATATGC GTTTTACTTC 6420
AGCAGCTAAA TAGAAAGAAC CGATGGCAAG GAGCGCTTTA CGTTGCGCAA AACTGGCATA 6480
CAGTGCACGG GAAATAATTG ACGCAAAGTC TTCGCTCCAA AAAATTGGGA CTGTTTGGTG 6540
AAATGTCGTA CGAAACGCGT GGTATGTTTT TTGTATATCT GCATGTTTAG ATGTGCCCGG 6600
TATGGTTAAA AAAATTTCGC TGGCGGCATG TGAAAAAAGA GGGGGGAACT GGGACACTGC 6660
TTTGTCCGCC GCGCACGCAA AGAGTAAAAT GTATTGTGCA GAAGGTAGTA AAGAAGAGAA 6720
CGTACGACAT GCGCACCGTA TACTCTGcGT AgTGTGCGCA CCGTCAATCA CTATGaGTGG 6780
ATCTTCCTGc ATAATTTcAA AGCGTGcTGG cACGTATGCA CGGGaCAGTC CCCGCTCGAT 6840
TAACGTTTCG CTCACGGTAG GAAATAAATA TTTTGCCGCG CACGCAGCCA GTGCTGCATT 6900
TTTTGCCTGA ACAATATCGC ATAACGCGAG TGTGCAGTGT ATATTtCGAG CGAATAATCT 6960
GCCAACAGGA TGCGCGGCGT TAAAACTGAG AGTTGCaGTG TGTGTGAAGT GTTTTATtGA 7020
ACTTTCAATA TGTGTGACCA TATCtGGTAA GTAAAAGAAG GGAGCATGTT TTTCTCGCGC 7080
GATATGTTTA AAAACGTGCA ATGCATCTTC TGGCTGATCA AAACAAAAAA TAGGCGTATA 7140
GGGTTTGATA ATGCCGCCCT TTTCTTTTGC AATACTTTTT ATACGTGTTC CTAATATGCG 7200
CGTGTGTTCT TGTTCTATGG GGAGAAGGAG ACAGATACTA GGACAAATGA TGTTTGTTGC 7260
ATCTAGTCTT CCTCCAAGTC CTACTTCAAA AACGGACCAT TCCATGCGTT GTTGTGCAAA 7320
TAGCATGAAC GCCAGTAGCG TTATAAGCTC AAACCACGTC GCCTGGCCGT AGTCGCGCAG 7380
ATTCTCTGTT TTTTTCACCG TGTGGTATAC GTGTGTGCAC GCGCTTGCAT ACTCGGCAGG 7440
TGAAAAAAAC ACACCCGCGC GTGTTATTCT CTCTCTCGGA TCCATAACGT GAGGAGAAGC 7500
GTATAGCCCG GTGTTGAATC CAATTTCATT GAGTATCGCT GCAAgCATAC GTGCGCTGGA 7560
ACCTTTTCCC TTCGTGCCCG CAACATGGAT GCTCTGATAT GCGTTGTGTG GATTACAAAG 7620
CGCGCGTGCA AGTGCAGTCA TCCTGTGCAG AGGTGGTGTG CCGCTTGGGG GCATTTTCTC 7680
AAGCGTGCGA ATGCGCTCAA CCCAGGCGTA AAAATCTTGA AAAGAATGCA CCGGTATATG 7740
TGAAAnTCCG TGTGCGCTCG GTCGCACTAT AATATGCGGT ACGGAAGAGG CAGCAATCCT 7800
TGCCGGGAAA GAGAACTGAT GTACATTGCT AAGGTCTGAC ATTTGAGCTA AAATCCGCCC 7860
ATGAAGCGGG GGACTCTACC AAAAGATGTG TCAGGTATCA AGATTCACAT GATTGGTATC 7920
AAGGGCACTG GCATGTCTGC GCTTGCAGAG CTACTGTGTG CACGGGGTGC CCGTGTGTCA 7980
GGTAGTGATG TTGCAGATGT GTTTTACACG GATAGGATTC TCGCCCGTTT GGGTGTTCCC 8040
GTGCGTACTC CCTTTTCTTG CCAGAACCTT GCTGACGCTC CCGATGTGGT TATCCACTCT 8100
GCAGCCTATG TGCCTGAAGA AAACGACGAG TTGGCAGAGG CGTACCGGCG GGGTATTCCT 8160
ACCCTTACCT ACCCAGAAGC GCTGGGGGAC ATTTCCTGTG CGCGGTTTTC GTGTGGTATT 8220
GCAGGTGTTC ATGGAAAGAC GACCACGACC GCGATGATTG CTCAAATGGT AAAGGAGCTG 8280
CGCCTTGATG CGTCCGTCCT TGTGGGGAGC GCTGTTTCGG GAAACAATGA TTCTTGTGTG 8340
GTTCTTAACG GAGATACCTT TTTTATCGCA GAAACGTGCG AGTACCGTCG GCATTTCCTG 8400
CATTTTCATC CTCAAAAGAT TGTCCTCACC AGTGTTGAGC ACGATCACCA GGATTATTAC 8460
TCCTCGTACG AGGATATACT CGCGGCATAC TTTCATtACA TAGATAGGCT TCCTCAATTT 8520
GGTGAGTTAT TTTATTGCGT GGATGACCAG GGCGTGCGGG AGGTAGTGCa GCTTGCGTTT 8580
TTCAGTAGAC CGGACCTGGT GTATGTTCCT TATGGGGAAC GTGCGTGGGG CGATTATGGG 8640
GTCAGTATTC ACGGTGTTCA AGACCGGAAG ATAAGCTTCT CATTGCGGGG TTTTGCAGGT 8700
GAGTTTTATG TTGCGCTCCC CGGTGAGCaT AGTGTTTTGA ATGCAACCGG TGCGCTCGCA 8760
TTAGCACTGA GTTTAGTGAA GAAGCAGTAT GGAGAGGTTA CCGTTGAGCA CCTCAcGCTC 8820
TGCgGAAGGT ACTCGCTCTT TTTCAGGGAT GCCGGCGAAG GAGTGAAGTT CTTGGGGAAG 8880
TGCGCGGTAT TTTGTTCATG GACGATTATG GACATCATCC GACTGCAATT AAAAAGaCTC 8940
CGCGGGTTAA AAACGTTCTT TCCGGAAAGA AGAATTGTCG TCGATTTTAT GTCCCATACA 9000
TATTCGcgTA CCGCAGCCCT CCTCACCGAA TTTGCTGAGT CTTTTCAGGA TGCGGATGTA 9060
GTTATTTTGC ATGAGATTTA CGCCTCTGCT CGGGAAGTGT ATCAGGGCGA GGTGAACGGT 9120
GAACATCTTT TTGAATTAAC TAAACGGAAG CACCGGCGGG TGTATTATTA CGAGGCTGTC 9180
ATGCAGGCAG TGCCTTTTTT GCAGGCTGAA TTGAAAGAGG GCGACCTGTT CGTTACGCTC 9240
GGCGCTGGAG ACAATTGCAA ATTGGGTGAG GTGTTGTTCA ATTATTTTAA AGAGGAGGTG 9300
TAAAGTTCGG TTGCGGTTTC GCCAACATGG TGGTGGTGCC gGCTGTGGAT CTGGTGGATA 9360
TAGGGTGAAG TGAGACAGGC TGCGAATGGA TGGTGATGCA AAGAGCGGAG TGCGGAGGGG 9420
TGCGTGAGTA ATCGGTGCGA TGTGTCTGGA AATAAGGCGG TACgCATAGC AGTTTCAGGC 9480
GCGTCAGGGT GTGGTAATAC CACCGTGTCT GCATTGCTTG CGGAAAGACT GGGACTTCCC 9540
CTAGTGAATT ATACGTTTAG GAATATTGCC CGGGAGTTGG GTATCTCTCT TAGTGAGGTG 9600
CTCGAGCGTG CGCGGACGGA TAATCATTTT GATAAAGCAG TTGATGCGCG GCAGCTCTGT 9660
CTTGCGATGC GTTCTTCCTG CGTGGTAGGG TCGCGCCTGG CCATTTGGTT GGTGAAAGAT 9720
GCCGCGCTGA AGGTATATCT TTTGGCTTCA TTAAAAGAGC GGGTGAAACG TGTTCTCCAA 9780
AGGGAGGGAr GGGACGTACA GGATGTTGAG CGATTCACGT CTATGCGTGA CGCTGAAGAT 9840
ATGAGTCGCT ACAAAAAGTT GTATCGTATT GATAACACGA ATTACAGTTT TGCAGATCTT 9900
GTTCTAAACA CAGAAGGGTG CGATCAAGAA ACAGTGGTGA GTATTATTAT TGAAATGTTA 9960
CGCGCTAGAG GGATAGCTTG GTAGGGCTGA GCCAATCTGC GGGTGATATA GAAAAGTTTC 10020
AAAACGCCAT ATTGGATTTT TATGCACAGC AGGGCAGGGA TTTTCCGTGG AGAAGTACTT 10080
GCGACGCGTA TGnaTACTGG TGTCTGAGTT TATGTTACAA CAGACACAGA CGGAGCGGGT 10140
GTGTCCGAAG TATGCAGAAT GGCTTCATCG TTTTCCTTCT TTGGAGTCTC TTGCGTGCGC 10200
TCCATTTGCG CACGTGCTCC AAGCGTGGAT TGGATTAGGA TACAACAGGC GCGCTCGTTT 10260
TTTGCATCAG TCGGCAAAAC TCATTGTTGA AAGGTATTGT GCAGTAGTTC CTGATGACCC 10320
GAGTGAACTA AAGAAGCTCC CCGGTGTCGG TGACTATACT GCCGCTGCAG TTGCTTGCTT 10380
TGCGTACAAT AAGGCCACCG TGTTTTTAGA AACAAACATC CGTGCAGTGT TTATACGCTT 10440
TTTCTTTCCC GATACGCACC AGGTCAGTGA TCGGGAGTTG CTCTCGCTGG TCCGGTGCAC 10500
CCTGTATGAG GAAAATCCTC GGCGTTGGTA CTACGCACTG ATGGATTATG GGGCAGTTCT 10560
AAAAAGGAAG ATTACAAATC CTAATCGTCG CAGCAAGCAT TACGTGAAGC AGTCACCGTT 10620
TGAAGGTTCT CTGAGGCAGG TGCGTGGAGC GGTTTTAAGA GAGATAAGCG GCATGCAACA 10680
CGCGGTGCGC GAGAAAACGC TTTtCGCAAA GCTGTCCTTt GAGCACGAAA GATTGAGCCG 10740
CGCTCTAGAC TCGCTGGTAA GCGAGGGACT GGTAGTAAAA ACAGAGGCTG GGTATTCCAT 10800
CGCTGATTGA TTCTTTATGA CTCAAGACGC TTGAGTATTT CACAAATAAA GATGCCTTTC 10860
TCTTTTATTT CAATGGCATC TGATGTGAGG ATCAGCATGT TGGGCTGCTT TGCGTCAAGG 10920
CGAACGCGAT CGTTGGAGGT CTGTACCAGG CGCaGTATCT TTTTGAAGGC AGTGTCGCAG 10980
ACCTTGTCAA ACTCAATCAA TAAGCTTTCG TGTGTTTCTT TGAGTGAGAG AATGGCGAGC 11040
TTTTCGCACC GCACTTTGAT TTCTGCCACG GTAAACAACC CAGCTGCTTC CTCaGGGATA 11100
GGACCGAACC GGGTGATAGT TTCCGTGCGT ATGCGCTCAA GCTCCTCATG CGTATGAGCT 11160
GCAGCGATTT TTTTATACAG TTCCATTTTA ATTTCATCTG CGGCAATGTA CGTATGGGGG 11220
ATGAACCCTC GGTAATTAAG ATCGATGACG GTTTCTATCC TTTGCTCGTT TGGAGCATGT 11280
TGGAGGCGTT CTATTGCCTC TTCTAACAGC TGTACATACA GGTCGAATCC GACTGAATAG 11340
ATATCTCCTG aTTGTTCTTT GCCTAATAGA TTTCCTACCC CGCGAATCTC CATATCTTTT 11400
AAGGCGACTT TGAAACCCGC CCCAAGGTCA GTAAAGTCAG AGATCACCTG TAAACGTTTT 11460
ATTGCAAGGT CTGAAAGTGC CACGTCGTGA TAGTACAGCA GATACGCATA TGCTTTTTTG 11520
TCAGACCGAC CAACGCGTCC CCTGAGTTGG TAGAGCTGGG AAACCCCGTA CATATCAGCT 11580
CTATCTATGA TGATAGTATT TGCATTGGGA ACGTCGATAC CATTTTCAAT AATGGTGGTA 11640
GAAAGCAGGA GCTGGAACGT TTTTTGATAA AACCTTTCAA AAATGTCTTC CAGTTCTTCT 11700
GACCCCATGA GACTGTGGGC AACGCATATG GATAGCTCAG GCACGAGTTT TTGGAGCATA 11760
CACTTTACGG ATTCTAAGTT TTCGATTCTG TTATGTAGGT AAAAAATCTG CCCCTCACGA 11820
TCTAGCTCTT TTCTGATTGC AGTGGCAACA AGGTTTGGAT CAAACTGCTG GATAACCGTT 11880
TCTATAGGTA GGCGGCCTTC AGGAGGGGTG GTGAGCAAGC TCATGTCTCT GATTTTGAGC 11940
ATACCCATGT GAAGCGTTCG GGGAATGGGC GTTGCACTGA GGGAGAGACA ATCTACATTA 12000
GTTTTCATCT GCTTTAATTT TTCTTTATCC TGCACACCGA AACGTTGTTC CTCATCGAGG 12060
ATCATCAACC CAAGATCCTT GAAGGACACG TCCTTTTGGA TAAGCCGGTG GGTACCCACA 12120
ATAAGATCGA TATCTCCATG CGCGAGTTTG GCGAGTATGT CCTTTTGTTC AGATTTAGGA 12180
ACAAAGCGTG AGAGCTTCTC GATTCTGACG GGAAAGTGTT TAAACCGATT GCAGATTGTG 12240
CGAAAGTGTT GTTCCACTAG TAAGGTGGTA GGGGTGAGGA ACACCACTTG TTTTCCTCCC 12300
ATTACCGCCT TAAATGCCGC GCGCATTGCA ATCTCTGTTT TTCCGTATCC GACATCTCCG 12360
CACACCAGCC GATCCATGGG GACGGCTTCT TGCATATCCT GTTTGACTTC TTCAATGCAT 12420
ATGCGCTGAT CGTCTGTTTC TTCGTAGGGG AATGCTGCTT CAAACGCATA CTGCCATTCG 12480
TCATCTTTTG GGAAGGCGTG GCCGCGCGTA GTTTTTCGCA GAGAGTAGAG TTCCACTAGT 12540
TTTTGCGCGA TGTTTTCAAC AGATTTTTTG ACACGTGCTT TTCTCGTTTC CCATGACTTT 12600
GACCCAAGGC TATCTAAGTG AGGTTTGTTC CCTTCATTTC CAATGTAACG TTGCACCAGA 12660
TGTGCCTGCT CAATAGGGAT AAGGATCGTT TCTTCCTGTG CATAGAGGAG GTTTACGTAA 12720
TCACGTTCTG ACTGTGCTGT TTTTATGCGC TCTATTCCCT TAAATAAACC GATGCCGTAC 12780
TGCGCATGCA CCACGTAATC CCCGGGATTT AATTCCACAA ATGTGTCGAT AGGCGTGCTC 12840
CGTGCGCGTT GCACTGATTG AGGAGTTTTT CTGCGGCGAC CGAAGATTTC GCCTTCTTGA 12900
ACGATCAGTA TTTTGAGAGC AGGAATGCTA AATCCTGCAG AAAGCGCGCA AGGTAGCACA 12960
GTGACGTCGC AACCTTTGAC TAGTGCTCTG ATGCGmrTGC CTGCTGCTCA CTTTCTGCAA 13020
AGACGAAAAC GTGCCATCCG TCTTTTGAAA GACGGAGTAG CTCTTCTTTG AAGTAAGGAA 13080
TGTTACCGAA GAAGCTGCGT GCAGGATCGC TTGCCAAGCA TATACTTTCG CACGCTGGCA 13140
GCTGTGGAAA AAAGTGAGTG AAATACACCG TGTGCAGGTG GAGCGCGCAG ACAGCGGAAA 13200
AATCGAGCAC TATGTGTTCT GGTTGAGGAT ACCAGCGCGC AGtACATGTT CGTGCGCGAG 13260
TTGCATTTTA TGGTAGAGGT TCCGACACTC GTCTTGGAGC GCGCGTGCAC CGTTGTGCTG 13320
GCGTTCGTAG TCAAGATAAA AGACGCTTGG GGGTGAAGGG CTGTGGCGAA AATATTCGAG 13380
AACGCAGgTG GGACGTTCAA AGCACAGTGG ATAGAACATT TCCTCCCCTT CATACGTTTT 13440
TCTGTGGGTG AGTTCTTCGA TACACGGGAC GCAGTGGGCA GGACATTCGG ACAGTTTTTG 13500
GAGATTTTGG TGGAGGAACG CTATACGCTC CTCACTCCAA AGAATTTCTT TTGCAGCGTA 13560
CAGTGTGCAC GCAGATACCT CTTGCAGGAC GGCACACGTG GACACCGCCA GTATATGGAT 13620
ACGTTCTATG GTGTTAAAAT CACACACGAT TCGGTACGCT TGTGTGTTGT CAGCAGCCTG 13680
CGCTGCGGCA GCGATATCGA GAATTTCTCC CCGGAGAGAA AACTCTGCGC AAgcGcTGaC 13740
GTGGTCGACA CGTGCATATC CCCATTGCAT AAGCTGGGCA GCGAGCGTGT GGATCTCGAT 13800
GTGCTCTCCC ACACGGAAGG AGCGTTTGAG GGTACGCACA TAATCGAGGG GAGGAACGGG 13860
GGTGAGCAGT GCACGCTGGG TGAAAACGaA CrcGCATGgC gGTaTGCATC GcGCTGTGCG 13920
AGTGCGCACA nnCTnCTnAC CCGGTGAGAG AACACGTGTG CGTTAGGTGA GACAGGGCGG 13980
TAGGGCAGCG ACCCCCACCA GGGGCAGCAC GCGCGTAGGA ACTGCTGCAT GTGCAAGGTC 14040
GGTGCAGACG GCGGCGACGT CCTGTTTCGG TAnGGACTAC GAGCACTATG TGTGCGCAAC 14100
ACGTGCGCAC GTTATTCGCC AAAAAAGTAG GACCGCAGTC CAACGGTGCA CCCTTTCAAA 14160
CGCGGTAGGG AAAAGCGTGC GGCACCAGCG AnnGCAGCAA TTGCTGGAAG CTCATTTCCA 14220
AGAAATGGAG TATGCCACGC AACA 14244
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2109 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
AATCGACGAA AGGTATTGTA GCAGAGAGAA TACTCAAGCC ATGCGTAAGG AGAAAAGTAA 60
ATGGCAAGTT TAGATCTACC TAAGAGTCCC AATGTGTTTC ATCCCGAAAA GCCGAGTGCG 120
GTTGGGTCAA GGAATTCACT GGCGCAGGAC TGTCGTGACC AGCAGCAGGA GGTGAACCAG 180
CTAATAGAGG AAGAGACAAA CAAGATTCTG CACCACCTGA ACACTAAACT GCCGAAGaGG 240
TTCTCGAGCG TCTGGACGTA ATGGGTGGGT TGAAGGAAAA GTTGTATAAC TACTTCAACC 300
AGAATTACCA GAACATGTTC AACCGGTACA TGGTGACTGC GGAAGACGAA ATGCTGAAGA 360
AGGTCCGTGG TTTCATCGAC CGAGAGGAAA TGAAGGTGTT GAACCGTTAC ACGCCGAAGG 420
AGATTGCCAT CCTACTGGAT GAGGTTGCGG GAGCGGATAA GTTCAACACC GGAGAGATCG 480
AGAAATCGAT GGTGAATATG TACGGGCACT TGCAGGGTCA TATACAGCGG GGTGTGAATG 540
AGCTTGAGAC GCACACCAAT TCTTTGCTGC GTCAGAAGGT TGATGTGGGT GCTTTTGTCC 600
GCGGAGAGAA TGCGTATGCG GTAGTCAAGT GTGCGTTCAA GGACAATCTT GCGCGTCCTA 660
AGACCGTCAC TGACGTGAAG TTGTCTATCA ATATTCTGGA CTCAGAGTTA GTTAGCCCTA 720
TCTTCCATTA CCAGACGACG GTAGCGTACC TTATTAAGGA TCTCATCTCC AATCACTACA 780
TAGATGCCAT CGACAAAGAA ATTGATCGCG TGAAGGACGA GCTTATCGAC CAGGGTAAGG 840
AAGAGATGTC TGATAGCAGT ATCATCTTCG AAAAGATGAA GATGGTGAGC GATTTCACCG 900
ACGATGACTG CGAGAAmCCT GACAGCAAGC GCTACGAGCT TATTTCGCGG GAGTTGATGG 960
AAAGAATCAG CAATTTGCGC GCGGAAATTG ATCCGGAAAC TTTCGACCAA TTGAATGTTC 1020
GCGAGAATAT CAAAAAAATC GTTGACCTTG AGAACATAAG GAATCGTGGC TTTAACACGG 1080
CTATCAATTC GATTACATCT ATCCTTGATA CGTCGAGGAT GGGGTACCAG TATATCGAGA 1140
ACTTCAAGAA TGCGCGCGAg CTTATCCTTC GTGAGTATGA TGACACAGAT ATTTCGAATC 1200
TTCCTGaTGA GCGTTACCAG TTGCGCTTAA AGTACCTCGA TAATGCTCAG TTGATTGAGG 1260
AGCGTAAGGG GTATGAGGTG ATGCTTCGTT CTTTTGAGAC GGAGGTGGAT CATCTATGGG 1320
ATGTGCTGCG TACTAAGTAC GATAAGTCTA AGGCGTCTAG GTTCATGGCG AAGATTACCG 1380
ACTTTGATGA CCTTGCTAAG GTGTACAAGA AGCATATAAA GAAGCATTAC AAGGATAAGA 1440
CTGGTGAGCC CGTGTACGAG GATATTGCGA AGGTATGGGA CGAGATTGCT TTTGTGAAGC 1500
CTGCTGAGAC CGAGGTGGAG CGGATGAATC GTACGTTTGT GTACGAGAAA GACAAGATGC 1560
GAAGGAAGCT TATTCTGATG CGTGGGAAGT TAAAGGGTAT GTATGATTAC CAGTATCCTA 1620
TTGAGCGTCG GGTTATGGAG GAGCGTCTCG CGTTCTTGGA ATCCGAGTTT AACCGTTTCG 1680
ATTACTTGGT GAATCCTTTT CACTTGCAGC CGGGCTTACT GCTCGATATC GACATCACGT 1740
CTATAAAGCG CAAGAAGGCG ACGCTCGACG GTATGGCTAA CGTGCTTAAT GAGTTCTTGC 1800
ATGGTATCTC TAAAGGATTT GCGGACGCTG CCTTTGCTTC GTTTAGTCGT CGTCGTTCAA 1860
CGGTGCGTGC TGATATCGGT CAGAGTTTTG CTAGTGACGG CAgTGCCGAC CAGAAGGAGT 1920
CCAGCGGTAG GGTGGCTTTT ATGGATATGG TAAATGAGAC TCCTGCGCTT GAGTCTTCCG 1980
TGGCCGCTGA GCAGGTGGAT GTGCGCTCGG ATGTTGGAAT GAAGACGAGA AAGGTGGCGC 2040
GGTGGATGCA GGCAAGGGTC GACGTGGTAG ACGGTCTGCC ATTCGCGAAt CTAGCGAGAT 2100
TGTAGATAC 2109
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9848 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CTGGACATGT TTCCTGCCTC TGAACCTTGG GTGAGGGAGT TTGCACAGAG GGTGGGGATT 60
CACGTGCAAG AAGGTGCACG GCTCGTGAAT TTGCCTCGTC ACCCTAGCCA AATCTATGAA 120
GCTTTTTTTG AAGGAATTGT GCTGTGGTGT ATTTTGTGGT GTGCGCGTCG GGTAAAAACG 180
TATAACGGCT TTTTGGTGTG TTTGTATGTG GTGGGGTACG GAGTGTTTCG TTTTTTTATT 240
GAGTATTTTC GTCAGCCTGA TGCGCATTTG GGGTACAGGT TTTCCGCCAC GCAATCGTCT 300
CCGATTTACC TTTTCCAGTC ATGGAGTGAT GTTTCCACCG GGCAGATTCT GTGTGTTCTA 360
ATGATTCTCG CAGGTTTGGG TGGGATGTTC GCACTTTCGG CGTATCACAA GCGGGATAGT 420
GTGCGGAAAG CGCGTGTATG AAAATGAAAA GAATGCACCG ACTGGTCCAT CAGCCGAGAT 480
GGGGTGGCGC GATGTACCTT GCGTATAATG CGCAAAGGTG AGCCGATTcT TCTTCCGTCG 540
TCCTAATGAG CTGGTCTTCT TCTGGTGTAC CAAGCACCGC ATTGCGCTCC AGAGGGTTTC 600
GGTATGTGAT TTCTCGAAAG GGGTTCAGAT AGTCTTTGCA TCTCGTTTTG TGAAtTTGTG 660
TTACTCTGTA CTTTTCGATG GTAATTGCGT GTTGCACGTA GGTGCAAAGT AGCGGTGCGT 720
CTATCCTTGT TCGTGCAAGA GAACATAACA CTGCATAGGC GTGGAGTGCT TTATCCGTTT 780
CGTACGTGGC AGCAAGCGCG TCGTATTCTT TGTGGATGTG GTGCCAACTT TTGAACTTTC 840
CCGTTTCAAT GTTTTGTAGA AGTGTCTCAA ACTTATCTTC AGGTACGAGT TGTCCTCCCA 900
TGTTTACCCA GTTGTGTGTG ATAGTTTTAG GATCGTGCGA AGTAAAGTGG GAGATGCAGC 960
GTGTTGTTTG CTCAAAGAAA GACAGGAGCG TTTTTGTGCA GTACCATATG AGTATGTCTC 1020
GGTATGCTTT GCGGGCTTCT AAGGGTTTTA GCACAAGTGT TGATCGGGTG GAGTGTTCTA 1080
TGCCGGTGAG CAGTACGGGG ATTGTCTGTG CTTTGGTGGG ATGTTTGAGG ACAATGTCTT 1140
CGGCAGTGAG CGCGCTGTTT CCCGCGTTAA CCCACGCGCG CTCTATCGCG CTGTCTAGTA 1200
AAGCGAGTGC GTTTTCAATT TCTCCTATTG TGTCGGGAGC GAAGACAGAG ATTTCTACTG 1260
TTTGTGTTTT CGTTTTTCGT TTGTCTCGCG CAGAAtTTTT TTTCGTTGCG TTCAATCGCA 1320
TACATGTTGT AGAGCCAGTA GTACGCGGGC ATGATTTCTA ATCTGTTTTC CCGTTCATTG 1380
TTGGTGACAA GACTAAAAGG AAATGGAATA TCAAGTTCTG CGGGGTAATT TTTCCCCGTG 1440
ATTAATACGA AAGAAGCAAA ACGACAATTA TGTTTGAGGG TACTTGCAAG TCCTGACCAG 1500
AACCCTCGTC CCGCAATAAT TTCTCCGTCA TTTTTGCGTG TATTGTGATT TGACCCAATG 1560
GTGGAGCCAG CAGCAATATT TGACTGACCA CGTATGAGTG CTGCGATGAG GAAAGAGTTG 1620
TTGTGGTGTT GTTCATGGTA GGGAAAGATG AGTGCGTTAA GAACCTCACA ACACGAAATC 1680
GTGGAGTTAT CACCTAAAAC TGAGTGAATG AGTCTGGTAC CATATTTAAG TGCGCAgTTA 1740
TTTCCCAGTA CAAAGCGCAC CGCCTTTACC CCATAGAACA CACGGCAACC ATATCCGATC 1800
ACCCCGTTCA CTAACTCTAC TCCTTCTCCT ATTTGCGTAG GTTCCTGGAG AGATGACTGA 1860
ACAGTAAGGT TTTTTAGTTT GTTTGCTCCT TTTACGTATG ATCCAGGACC GAAGcACACA 1920
TCCTTGATGA TACGGCaGCT TTTGATAACG GATTGTGTTT CaATCGTTCC gTAGTATCCg 1980
CgGCGAGTGT CGTGTTGTTG TTGAGTCATT GACTCGAAGC GTTGCATGAG CAATGTTCTG 2040
TCTCGGTGGC ATGCCCACAA AAACGCGTCT GCTGCGATCA TCCCTACGAA GGGAAATATT 2100
TTCCGTCCGC CTGTTTCGTT GAGAGGATCA ATGGTGATGC GGACTGCTTC TTGTTCTCCG 2160
TCTTTTATAA TTCCTGCGCC GAATTTTGCG TGGTTAGTGG TACACAGCTC ATCAATCCTG 2220
CTGAGGATGA CATGATTTCC AATTATGTAG TGAGAAATAT ATGCGCAGTG ATGGATGGCG 2280
CAATTTTCTC CGACGTCGCA CGAAaTGAGT GTACTGTGGG TAATACCGGT TGGTACGGTA 2340
AAGTCGTGAT ATCGCAGAAA GgCTCGCTCG AGCGwasCGA TGCGTACGAG CCCTGCAAAT 2400
GATGAATTAC GTATGAGTGA CGCGTCGAAC GGATCTGCTA CTAAAACGTC GTGCCAGGTA 2460
TCGCAGTGAT TGCCCTTTTG TATAAGGGTG TGAATTTCCT CCTTAGACAA TGGTCTCCAT 2520
GCACGTGGGG GTTCGGCGCT CTGGGAAAAG CGGAGATAGT ACTCGTCTTT TCCGCGAGGG 2580
ATATGGGTTT GTGTGATGAA GTGGTATCCA AAAGAGGGTA GGTCTAAAAT TTGCACACGC 2640
ATTCTCCCTT TTGGATGCCC ACTATAGGTG GTGAATTTTT ACATGTAAAT AAGGAATTGG 2700
GGTTGTGATG GGGATGGTGA TTTCCTGCAT GTTTACTTGA CATGACATAT TAGGAATGGC 2760
TAGATTGGGG CCCaGTCTTG TTTTTTAGCG TGCATTAGAA GTGGATGTaC TGGGGAGGAT 2820
CgTTGGCGGA TAACAAAAGC TTGCGGATTA ATGGAAGTAT TCGGGTACGA GAAGTGAGGT 2880
TGGTTGACGC TGTAGGGCAG CAGTGTkGGG TGGTGCCCAC CCCTGAGGCG CTGAGAATGG 2940
CACGGGATAT CAATCTTGAT TTAGTAGAGG TcgcTCCGCA GgCGAGTCCG CCGGTGTGCA 3000
AGATCCTGGA CTATGGGAAG TATCGCTTTG AGATGGGCAA AAAGTTGCGT GACTCGAAAA 3060
AGCGACAGAG ATTGCAGACG CTCAAGGAGG TGCGTATGCA ACCGAAGATC AACGACCATG 3120
ACATGGCGTT TAAGGCCAAG CATATACAGC GGTTTCTCGA TGAAGGGGAT AAGGTGAAAG 3180
TGACTATCCG CTTTCGTGGA AGGGAGCTTG CGCATACCGA TCTGGGTTTT AACGTGTTAC 3240
AGAATGTGCT TGGCCGTCTG GTGTGTGGGT ATAGTGTTGA GAAGCAGGCA GCAATGGAAG 3300
GTCGGTCTAT GTCCATGACG CTCACTCCGA AGTCAAAGAA ATGATGGAGT GTCGGGTAAC 3360
TGCAGTTCGT GTTGTTGGAT AAAGGGGAGA AAGTATATGG CTAAGATGAA AACGAAAAGC 3420
GCAcAGCAAA GCGTTTTAGT GTAACCGGGG CTGGTAAGGT AAAGTTCAAA AAGATGAACC 3480
TGCGTCACAT TTTGACGAAA AAGGCCCCGA AACGCAAAAG GAAATTACGT CATGCGGGTT 3540
TTCTGTCAAA AGTTGAGCTT AAAGTGGTGA AGCGGAAGCT GTTGCCTTAC GCGTAGgTGG 3600
CAAGCGTGAG AGGACGGAGG AGCGTGG AT GTCTCGATCG TTGAGTAGTA ACGGCAGAGT 3660
GCGCCGGAGA AAGAGGATTT TAAAGTTAGC CAAGGGCTTT CGGGGTAGGT GTGGCACGAA 3720
TTACAAGGCG GCGAAGGATG CGGTCTCGAA GGCTCTTGCG CATAGCTATG TTGCGCGGAG 3780
GGATAGGAAG GGGAGTATGC GCAGtTGTGG ATCAGTCGCA TCAATGCATC GGTTCGTACG 3840
CAGGGtTGAG CTATTCTCGC TTTATGAATG GTCTCTTGCA GGCTGGGATT GCGCTTAATC 3900
GCAAGGTTCT CTCCAATATG GCAATTGAGG ATCCAGGTGC GTTTCAGACG GTGATCGATG 3960
CTTCTAAGAA AGCTTTGGGG GGTGGAGCGT GCTAAACCTC GGTCAGGTAA AAGTGCTGGA 4020
GGAGAAGGTT GCGAAGgCGG TGCACCTTGT CCAAATGTTG AAGGAAGAAA ATGCCGcgTT 4080
GCGGgCTGAA ATTGATGGAC GTGGTAAGCG TATTACGGAG CTGGAGCAGC TGGTGCTTGs 4140
CTTTCAGGAT GATCAGACGA AGATAGAGGA AGGAATTCTT AAGGCACTGA ACCACCTGAG 4200
TACATTTGAG GATTCTGcGT ATGGAGAAGC GCTTACGCAA CACGCGGCGA AGgTTCTAGA 4260
AAACCGGGAG CATGCGGGGC TGTCTGAAGA ACTTACCAGC CGTACCCAGA TGGAAATTTT 4320
TTAGTGGTCA GTGTAAAGGG GCAgTTGCAC ATCGATCTGT TGGGAGCGTC TTTTTCCATC 4380
CAGGCTGACG AGGACTCCTC GTATCTGCGT GTGTTGTATG AgCATTACAA GATGGTGGTG 4440
TTGCaGGTGG AGAAGACGTC aGGGGTCCGC GATCCcTTAA AGGTcGCGGT GATTGCgGGT 4500
GTGCTTCTCG CGGATGAACT GCATAAAGAG AAGAGGAGAC GTCTTGTACA GTCCGAGGAA 4560
GATCTGCTGG AAATAGGGGA GTCTAnCCGA GCGTATGCTC GAATCCATCA GCAAAGTGGT 4620
GGACGAGGGG TTTGTGTGCG GGCGCGATTG AGGGTTGTGT CCTTCTTTGT GTACGGGACG 4680
TCCTGCGGTG ACGCTGTGGG TGGACGCGGA CTCATGCCCC GCGCgcGTCC GCGTACTTGT 4740
CGCGAGAGCG GCAGCGCGCC TGGGGTGTGT GGCTCGATTT GTGGCCAACC GTCCTATCCC 4800
TCTCGTGCAA AGCCCGCATT GTATCATGGT CGAGACTCAA CCTGTTGACC AGGCTGCGGA 4860
CCGTCACATg CATCGCGTAT GCGCGAGCGG GTGATTTGGT CGTCACGCGT GATATCGTGC 4920
TTGCAAAGGC AATTGTAGAC GCGCGCATCT CTGTTATCAA CGACCGGGGT GATGTGTATA 4980
CGGAGGAGAA CATACGCGAG CGACTCTCGG TGCGTAACTT CATGTACGAC tGCGAGGGCA 5040
GGGACTCGCC CCTGAAACAA CGTCACCGTT CGGCAGGAGG GATGCCGCAC GCTTCGCAGA 5100
CTCCCTAGAT AGGGAAACCG CGAAgcTCCT GCGGCTTGCC AGGCGGCGGG AGGCGAAGAC 5160
AGGGGAGGAG CAGTGCGACT GGCCCTCCGC GCAAGGGAAA AGCCAAACCG GCCGCCGGTG 5220
ACCGCACGCA AGACACTAAG AGTCCAAGGC CGGGCGGGTG GACTCCTAGT GTCTTCTACC 5280
GCTTCTGCGA GATGAACTTA AGCAAATCCA CCACACGGTT TGAGTAACCC CACTCGTTGT 5340
CATACCAGGA CACTACCTTG AAGAAGCGCT TCTCGTTCGG GAGGTTGTTC TGCAGCGTCG 5400
CCCTGCTGTC GTAGATGGAG GAGTACTGGT TGTGGATGAC GTCCGCGGAT ACAATATCCT 5460
CGTCGCAATA CTGCAGGACA CCCCGCAGAT AGGACTCCGA CGCCTTCTTG AGCATCGCGT 5520
TGAGGTCCGC AACGCTCGTC TCTTTTTCCG TGCGGAAGGT TAGATCCACC ACGGAACCGG 5580
TTGGTGTCGG GACACGGAAG GCCATCCCCG TCAACTTACC TCTCGTAGAC GGCAGCACTT 5640
CGCCTACCGC TTTCGCAGCT CCAGTGGTGG AAGGGATAAT GTTAACCGCT GCAGCGCGGC 5700
CTCCGCgCCA GTCCTTCAAA GAAACCCCAT CTACAGTTTT TTGCGTTGCG GTATAGGAGT 5760
GGATAGTCGT CATCAGTCCC GTTTCAA AC CGACTCCCTC TTTGAGAAAG ACGTGCACTA 5820
CCGGCGCGAG ACAGTTGGTA GTGCAGCTCG CGTTGGAGAC GACCTTGTGC TCAGCAGGAT 5880
CGAACTCATG CTCGTTCACC CCCATTACAA TAGTCTTCAC CGGCTTAGAC GCATCCGAGC 5940
TCTTAGCCGG AGCACTGATG ATGACTCGCT TTGCTCCTGC TTCAAGGTGA CCGTATGAAG 6000
ACTCATTCGC GTAAATGCCG GTGGscTCAA TAACCACCTC AATACCAAGA TCCTTCCAGG 6060
GaAGTTGGGA AGGCTTTAAG CCGCGACCGC AGACACACTT GATCCGATGC CCGCCCACCT 6120
CGAGGATATC CTCGGCAGGA GCACTGAGAC TAGAACCCAT TTTGCCCTGC ACGGAGTCAT 6180
ACTTTAGCTG ATAGGCAAAG TAGCGCGCAT CGGTGGAAAG GTCTACAACT GCCGCCACGT 6240
CGAACTCTTT CCCCAACAGC TTCTGtCCGC CATGGCCTGG AGTACGAGAC GCCCGATACG 6300
CCCAAAACCA TTGATTGCAA CTCTCATTTG CCCAACCTCC TCTAAAAAGA GCACACATCC 6360
CGCGCAACGC TATCTGAAAA AAGATCGGCA CGTCAATCCC TCTTTGCTGT AGGGCTCCCT 6420
TGCATTTTTC TATGTGCCCA GATACCATGG CCTCGCCTTG GAAGGTCTGG CCTCTAGTGG 6480
AAGATTATTA CCGCGTGCTT GGTGTGTCGC ACCGTGCCTC GACCCCTGAA ATTAAGTGTG 6540
CCTTCAGAAA GAAGGCAAAG GCGTTACATC CGGATCTCGT TTCCCA ACT GCAGAACTTG 6600
AGTGCGAGGC GGTAgCgCGC GAGCGCgCTC TTCGCCGTAT ACTCACCGCA TACGAGGTGC 6660
TCTCTGATCC GGGGCGTCGC GCGAAATTTG ACCTCCTCTA CGCGCGTTTC TGCGCACGTC 6720
CTGCTCCAGC GGGCTTTGAC TACCGCGTGT AmCTGCGTGC GCAGGtACGC TCTGCGCGAT 6780
GGTGGAGCTT ATCTTGTTTG ATCTCTTTCA CGGTTTTGAG TGTGACGCTG TCCGCGCGTA 6840
CTTGTCCCTC AAGTGTCGGC CAGAAGGGTT CAACCTCGCC ACTCACCTTA CACGAGAGGA 6900
TTTTATGGAC TGTGGCTTTG TGCTCGCAGA GGAATTGCAT GTACGGGGAG AGTGCTATGA 6960
ATGCTTTACT TTGCTCCAGG ACATCGTTTT TGAAGAATTG CGGTGCGCGT ATTTTCGTCA 7020
TTTTTTTCCT GAAGTACTGA AGCTCGCTGA GCATATCGCG CTCGGTAcTG CGTCTGTGCG 7080
TGGTCGCAAC GGTAAATCCT GCGTATACTG CGCGCGCGCC ATGCCTGCTT GCCTGCGCAA 7140
GAAATTGTCA CCTTCTACGC GTGCTTAGTT GAGTATTACG AACGTACGGG AGACCGCAGc 7200
GTGCGCGTGG CTATGCCCAG AAGATGGATT CTGTCAGGTG AATGTTTGAC TGCACCCTGG 7260
CGGAGGAGTA CCGTGGTCCT GGGGGACCTC CGAAGGcTGG AGGTCCCCCT GCAGCTAGTG 7320
AACGGACAGA GGAGGGACGC TTGAGCAGGA AGGAAAGGAC CTCATGATCC GCATTAAAAC 7380
ACCAGAACAA ATCGACGGTA TCCGTGCCTC TTGCAAGGCA TTGGCGCGCC TTTTCGACGT 7440
TCTTATTCCG CTTGTCAAAC CGGGCGTTCA AACCCAGGAG CTTGATGCGT TTTGCCAACG 7500
CTTCATCCGC TCAGTCGGTG GTGTTCCTGC CTGGTTCTCG GAAGGTTTTC CTGCCGCTGC 7560
TTGCATTTCA ATCAACGAAG AGGTCATCCA TGGTTTACCT TCAGCGCGTG TGATTCAGGA 7620
CGGGGATCTT GTTTCCCTTG ATGTTGGTAT CAACCTCAAT GGATACATTT CTGACGCGTG 7680
TCGTACTGTT CCTGTCGGTG GAGTTGCACA CGAGCGACTA GAACTTTTGC GTGTAACCAC 7740
TGAGTGCCTC CGTGCGGGCA TTAAAGCGTG CCGTGCCgGA gCnCGyGCGC GCtgTTTCTC 7800
GCGCTGTATA CGCTGTTGCA GCACGGCACC GCTTTGGCGT GGTGTACGAA TATTGCGGAC 7860
ATGGCGTGGG GCTTGCCGTG CATGAGGAGC CGAACATCCC CAATGTGCCT GGCTTGGAAG 7920
GGCCTAATCC ACGTTTTTTG CCCGGTATGG TAGTCGCGAT AGAACCCATG TTGACGCTTG 7980
GCACAGACGA GGTGCGCACC AGTGCAGATG GCTGGACGGT GGTAACGGCA GACGGATCGT 8040
GTGCCTGCCA TGTGGAGCAC ACTGTGGCAG TTTTTGCAGA CCACACGGAG GTTTTAACAG 8100
AACtACGGAA GTAGAGCGTA CCGGCTAGTC AGCTATCTTA AGTGTGCGCG GTGTGCTGAT 8160
AGTACATGCA GGGAGCAGTT TGTGCACGGT AGGCAGCGTG TAAGTGTACG TGGCGGGCAC 8220
AGGTGAAGAG GGGATAAACT CGTAACCATA TCGCTGTGTG CTGCTTTTAA CCCGGGCTGT 8280
GTCGGTAGGG GTTTGGGTAC GCGCAgGGAC GTGGAGGGAC TCATGAACAT ATTGTTTACC 8340
TCGTTTGTGT GTGGGGTACA TGCGGTATGC CGCAGTTTTT TTACAGCAGC GGCGTTGCTC 8400
GTTTTTATCT GCTGCTCTGG TCATCCAAGT TCTGCGCGTG TGCCCtCTGC AGACACGATA 8460
GCTCGGCGCG TTGCCGGAGA CAGTGGGAAC gCTGGGGGGC GGACATTACT TCCTGTGGGG 8520
GTTTCgCGTG AATCGGTGCA GCTGTTAGAA CGGCTGCAAA ACGCGAACCG TCAGGTAACT 8580
GCCGAAGTGC TGCCTTCAGT AGTGACGCTG GATGTGGTGG AGACCAGAAA GGTTCGGGTA 8640
CGTGATCCGT TTGGCGGTTT TCCGTGGTTT TTCTTTCGTG GTCCTGAAGG TCCGGGTGCG 8700
GGGnCTGGCG GTGGTTCTGG AAACAAAGGG GAAGCTGAGG AACGGGAGTA CAAAACGGAG 8760
GGACTTGGTT CTGGAGTCAT TGTAAAGAAG ACAGGGAAGA CGCATTACGT GCTTACCAAC 8820
TATCACGTGG CGGGTAAGGC TAATGAGATA GAGATTAAAC TGCACGATGG CAGAATCGTA 8880
AAAGGTAAAC TTGTCGGTGG TGACCAGCGC AAGGACATCG CGCTGGTCTC CTTTGAGGAC 8940
GCAGACCCAA ATATCCGTGT TGCCGTCCTT GGTGACTCGG ATGCAGTACG GGTAGGAGAC 9000
ATTGTGTTCG CAGTTGGCTC TCCTCTTGGG TACACTTCCA CTGTAACGCA GGGGATTATC 9060
AGTGCGCTGG GTCGCTTTGG GGGACCGGGC AACAATATTA ATGATTTTAT TCAAACAGAT 9120
GCGGCCATAA ACCAGGGCAA TTCCGGGGGA CCAATGGTCA ATATTTATGG CGAAGTGATT 9180
GGGATTAACG CGTGGATTGC CTCCTCAAGT GGGGGATCGC AAGGGATTGG TTTTTCAATT 9240
CCTATCAATA ATGTGAAGTC GGATATCGAA TCATTTATCC AGTACGGGCA GGTGAAGTAC 9300
GGGTGGTTAG GCGTGCAGCT GGTGGCAACG GATGCGGACA CCGTAGCATC GCTTGGTATT 9360
GCAAAGGGTA CAAAAGGGGT GCTTGCGGCG GAAATTTTCT TAGGTTCTCC TGCGCACAAG 9420
GGGGGACTGA AACCGGGCGA TTACTGTGTA AAACTGAACG GAAAAGAAGT AAAGGATGTA 9480
AATCAGTTTG TGCGGGATGT CGGCGCGCTG CGCATTGGGC AAACAGCAGT ATTCGATTTA 9540
ATTCGCGGTG GTGTGCCGAT GACGCTTTCG GTGCGCATTA CGGAGCGTGA TGAAAAAATA 9600
GTAAATGACT ACTCAAAGCT TTGGCCTGGG TTCATCCCAC TGCCGCTTAC GGAGGCCGTG 9660
CGTAAACGTT TGGATTTGAA AGCGTCGGTG CGTGGTGTGC TAGTTAGCAA CGCGCAGAGC 9720
AAAAGCCCTG nCGGCGCTGA TGGGATTGAA GTCGGCGGAC ATAGTAGTGG CGGTCAATGA 9780
TCAAAGAGTC TCGAGCGTGC GTGAGTTTTA CGCGGnGCTT GCACGTCAGA CGAGGGAAGG 9840
TGTGGnTT 9848
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CAAAAGGAGA TGTCATTCTA TGTGGAGCAT GCCATTCGCG TATGCACCTG GGTTGTGCGC 60
AGACACTTCC GCCTGATGTG CGCACCAGAT AGAGTATGTT GTGGAACGCA CTCGGTCTCA 120
TCGGGGAACT ACTGTTGCGC TTGCTATAAA TTATGGGGGA AAAGATGAAA TTTTACGTGC 180
GGTAAAAAAG GTTTTGTGCA GCACTTCGTG CCCGGATGGT GAGCTTCTCA CCGAAGAAGC 240
TTTCGGCGCG TGCCTTGATG CGCCGCAGTT GCCGAGTGTC GACTTTCTCA TCAGAACAGG 300
GGGTCAGCAA CGCATGAGTA ATTTTTTGCT TTGGCAAAGC GCGTACGcGG AGTTCTATTT 360
TACCGATATC CTGTGGCCTG ACTTTCGGGT AGAAGACATG cTGCGCGCCC TGGATGAGTA 420
TCGCCTGCGC ACGCGTACCT TTGGGGGTTT GGAATGAGCG CGGAAATAAA GAGGCTGTTA 480
ATCTTTTTTT TCGGCGtTCC AACTATTCTT ATGTTGGTAT ATGCGGCACC GCATGcACAC 540
TTCCTAGCGT TCCATTTGGT TATCTTCGGA TCAGTTATGG GTGCGGTATG GGAAATGCAT 600
GCGATGGTGT CgcGcAGGAT GTGCACGTAC CCACTGGTTT TGTTGATCCC TTTCAGTCTT 660
GTGCTTCCGC TTTTAGGATA TGCAGCGCTG TGGCAGCCTG CACGGGGCGC TGAATCTGTC 720
CTTTTTATTG GAGCACTGGG CACGCTGCTC ATGAGTGTTT TTTTCACCGA ATTGGTGTAT 780
TCGTTTTCTG CTTCTTTTGA AAACGCCCTT GAGCGTATGG CCTCGGCACT GTTGCTTGTT 840
TTGTATCCAG GTATCTTTAG CCTTTTTTTT TCGCTCATTA CGCGGTGGCG TCATGCAGAG 900
ATCGCaTTGG TAATTTTTTT yCTCATGGTT TTTACGTGCG ACTCTTGTGC ATGGTTCTGT 960
GGGACGCTCT GGGGAGTCAA CAACAGAGGG ATAATTCCTG CAAtCCTAAA AAGAGTATgC 1020
AGtTTTATgG AGGTTTTgCC GGTTCGGTAG GTGCAGGgTG TTTTGGCTCA CTtGTATTTG 1080
GTTCGCgTGT GACGCTCTCT TTGGGGATGC TCATGGGTGT TGGAGCCTTG GTAGGACTGA 1140
CTGCCATTGT AGGCGATCTA GTCGAGTCGG TGATGAAACG TTCGGCTCAG GTAAAGGATT 1200
CAGGATTTTT TACCCCCGGG CGGGGCGGAA TTATGGATAA CCTGGATTCG tTGCGCCGTC 1260
ACTGGGGACT TTTTACATTG CATGTGAGTG TTTTGGGATC GCTGCAGTAT GAGTGTGCGA 1320
CGTGTGGTAG TGCTGGGCAT TACTGGTTCT ATTGGAGCTG CAGCACTCAA ACTTCTGCGT 1380
CGGTTTCCCG ATCGGTTCTT GCTGGTGGGC GCTTCAGGTC ACCGGCAGAC CGAGTACGCG 1440
CGGGCGTTGG CGCGCGAGTT CTCTTTATCA GATATCACTA TGACTGGCTC ATGTTCTGAG 1500
CAAGAAGGTC GCGCACGCAT AAAGCGTCTG CTTTCTTCCT GTGAAGCAGA GGTGGTGGTA 1560
AACGGTATTG CCGGCGCTGC TGGTCTTTTT GCCTCTCTTG AGGTGCTCAA GACGCGTTGT 1620
ACGCTCGCGT TAGCAAATAA AGAAAGTGTG GTACTTGCAG CTTCTCTTTT GCATGCTGCG 1680
GCACGCGAAA GTGGGGCAAC AATCGTTCCT GTAGATTCAG AGCATGCTGC TATTTTTCAA 1740
CTTATTGCAG cGCACGGCGC GCATGCGGTG GCGCAGGTAG TGCTCAcTGC GTCAGGTGGT 1800
CCATTTAGAA CCTTTTCAAA GGAGTGCTTA GCGCATGTCA CGGTGGAAGA TGCGCTTCAA 1860
CATCCGACGT GGCGTATGGG GAAGAAGATT TCTGTTGATT CTGCAACACT TGCAAATAAG 1920
GCACTGGAAG TTATAGAAGC AGTGCAGTTT TTTCGTATAC CGGTGGATCG GGTCaCGGTG 1980
GTGGTGCaCC CTCAGAGCaT AGTGCATGCg CTGGTGcAAT GTCATTCGGG AGAAACGTAT 2040
GCGCAGCTTT CTGTCCCTGA TATGGCGTCG CCGTTACTGT ATGCGTTGCT GTACCCTGAT 2100
GCGCCTCtGC GTATCAAACT CCGCTTGATT TTACATCGGG ACTGTCTTTG CATTTTGAAC 2160
CTCCGAGGGT AGATGACTTT CCGCTGTTGC GTATGGGTTT TGATGTTGCA CGGGCGCAGC 2220
GTGCGTATCC TATTGCCTTT AATGCAGCAA ATGAGGAGGC GGTGCGTGCG TTCTTGCAAA 2280
GAAACATTGG GTTTTTAGAT ATCGCACACG TGACTGCACA GGCGTTGCAA GAAGATTGGC 2340
GCGCAATTCC CCAAACGTTT GAAgAAGTTA TGGCgTGCGA TAmGCGTGCg CGGATGTGTG 2400
CgCGGACGTG CATTGCACAG AGGTGGAGAG AGAGGTGATT AAGATAATTA TTGGCGTTGT 2460
GGTGCTTGGT ATTGTGGTGT TGTTTCATGA ACTGGGGCAT TTTGTCGCCG CGCTTTGGTG 2520
TCGAGTGGAG GTGCTCAGTT TTTCTGTCGG TATGGGGCCG GTCCTGTTTC GAAAGAAATT 2580
TGGAAAAACG GAATATCGCC TTTCGATGCT TCCTCTTGGG GGGTATTGCG GTATGAAGGG 2640
AGAGCAAGCG TTTCAAACGG CGCTTGATCA AAAACTTTCC CGTATTCCCG TTGAGCCCGG 2700
TTCACTGTAT GCaGTAGGAC CGCTCAAACG CATGGGTATT GCCTTTGCAG GACCGCTGGC 2760
GAATGTGCTT ATGGCGGTAA TGGTATTGGC ATTGGTTAGT GCGCTTGGCT CGCGTGTACA 2820
CACATTTGGA AACCGTATTT CACCGGTGTA TGTATACGAT AGTTCTGATA ACTCGCCTGC 2880
ACGCCGCGTG GGACTTCAGG ACGGGGATAC AATcCTGCGC ATTGGTGACC AGCCGATACG 2940
CTATTTCAGT GATATTCAAA AAATTGTATC ACAGCATGCG CAGCGTGCAT TGCCATTTGT 3000
GATCGAACGG AGGGGGCAGC TTATGCACGT GACCATTACG CCTGATAGAG ATGCGCATAC 3060
TGGCATGGGG AGGGTTGGTA TTTACCATTA CGTACCGCTA GTTGTTGCGG CGGTTGATGC 3120
ACACGGTGCT GCATCGCGGG CAGGTCTTGA ACCTGAAGAT AAAATTCTTG CAGTAGCAGG 3180
ACGCCGTGTG CAACACcAGT ACAGCTCCTT GCGCTGCTCA AGGAATTTCG AAAAAAGTCA 3240
GTCGTATTGA CTGTGCTGCG TTCAGGGAAG AGGCGATATC ATACCATTGC GTTAGTGCGC 3300
ACAGAAAACG GGGCAATAGA TGTTGGTATC GAATGGAAAG CTCACACCGT GGTTATACCG 3360
GGAACTTCTT TTTTTGCAAG TGTCCGTGCG GGCATTGCAG AAACGTTGCG TATGTGTGTA 3420
TTGACGGTGA AGGGTATTGG TATGCTCTTT CGGGGCCTGC AATTTCAGCA GGCTATCTCA 3480
GGCCCATTAA GGATTACGCA TGTGATAGGA GATGTGGCCC AGCATGGTTT TCAGGAGAGT 3540
TTTTTAACGG GACTGTCACA ATTATGCGAG TTTGTGGCAC TCGTGTGCGT CTCTCTCTTT 3600
ATTATGAATC TACTCCCCAT TCCGATCCTG GACGGCGGTT TGATTTTATT CGCATGTGTT 3660
GAATTGTTTA TGCAAAGAAG CATACACCCG CGTGTGTTGT ACTATCTGCA GTTTGTAGGT 3720
TTTGCGTTTG TTGCATTGAT ATTTTTATGT GCGTTTTGGA ACGACGTGAA TTTTTTGTTT 3780
CACTAGGAGT GAGTGATGCA GTTACGGTGT GCGTGTGAGC GGGTGTTCGA TATTGAACAT 3840
GAGACGGTAA TTTCGCTTGA TGAGCACCCG GAATTTGTTG CGCGTATACA GCAGGGGGAT 3900
TTTTTAAGTT ACCAGTGTCC GGCATGTGGT GCGCGTATTC GTGCCGAAAT AAAAACAGAA 3960
TTTGTGTGGC ATGCGAAGAA TGTGCATTTG CTTTTGGTTC CTGaGCGAGA GCGTTTGCGG 4020
TGTTTGGCTT TTTGTGCCGG TATGCATATG AGCGACGGAG ATAGTGCTGA CTTTTGTGAA 4080
CCCTTTGTCT TACGGGAGCA CCAGACACCC GTGATTGGCT ACGCAGAACT TGCTGATCGT 4140
GTTGCAATAC TAGCATGGGA TTTGAACCCT GAAATTGTTG AAGCAGTAAA GTTTTTTGTG 4200
TTGGAAGGGG CACCGCATCT AGGAGACAAG AGAGTTTCGT GTTTTTTTGA ACGTTGTGTC 4260
GGGGACACCG GATCGCGCGT GATGGAGTTG CACGTGTACG GTATCAGAGA ACAACAAACG 4320
GCAATTATGC CGGTTCCCAT GAATGTGTAT GAACGCGTTG AGCGAGAGCr mGGTAAACAA 4380
GCGGAGTTGT TTGAGGCGCT GTATGTTGGG GCGTATCTTT CATACAAGAA TGTTTTTACT 4440
GACGCGTAGC sCCGCACAGC GAGCAGCATC TGGTGTGCGT GGTGTATGGG GTGTTGTACG 4500
CTTGGGCTTT TTGCAGACAG TGTAGAGAAG CGCGCAgcgA AGGATGTGTT TACTGAACCG 4560
GCGCGCTTTT ATCCCTCACA AAAATCAACG CTTGAATCTG CCCGGTCTGA TACATCTGAA 4620
TCTGAGAATG CATCTTCTTC CGTTCCTTCC CACAGTCAGC AGGAGTTGGC GCCAGACTCT 4680
GCCGCGCCTG CGCGTAACTC TGTGTTGTCC CCTGCTCCTC CTGAAAGGAG AGAGAAGCAG 4740
GGGACTGCGG TGCATGGGGC GGAAGTGACG CGGGCGGGAG CTGTCAGCCC GCGTTTTGTA 4800
GGGGGGCTGA CAAAAATACT GGCCGCCTCT GACCATACAT TCTTCGCTGC AGGAAATGAT 4860
GGGTTTCTCA CCCAGTACAC GTATCCGGAT TATAAACCGG ATACGTGGCA GATCACCCCT 4920
GTTTCTATCA AACACTGTGC AGTGCATCCG GACCGCGCGC GTATTGCCGT ATATGAAACA 4980
GATGGACGCA ATTACCACCG AGTCAGTGTG TGGAATTGGC GCACGAAAGA AATACTTTTT 5040
GCAAAGCGTT TTACCGCATC GGTTGTGTCA CTCTCGTGGA TTGTGCAGGG AAGTTTTTTG 5100
AGTGTGGGAA CAGCATCGCG CGAAGgTGTG ACGGTGTTAG ATGGGAGTGG AAATACAGTT 5160
TCTCTATTTT CGGAAGAGCC TGGGGTGGTG TTGTTGACTG CGAGTGGACC GCGCCTTGTG 5220
CTCAGTTATG CAGAATCTGG ACGCCTCACG TACGTAGATT ACAGCAAAAA GACAACCGTC 5280
AAACGTCTTC TTACCGAAAA GAATCTCCTG TCTCCCATGT TAATACATAA CGGTGCACAT 5340
CTTGTCGGTT ATAGAGACCA ACGTGTGTAT GTCATCCAGT CTTCAAGTGG CGCGGTGCTC 5400
ACCGAGTACC CTGCACGGAG tGcATGtTTT GCGCATACAT TCAGCGATAG TCTTCCTGTG 5460
TGGATAGAGC CTGCTGAGTT GAAGTATCAC TGGCGTATAC GGAAAGcTGC GCAGCGTTCT 5520
GCTGATTTTA TGCTTCCTGA CAATGCTCGC ATAACAAGTG CGTGCTCGGT TCGCACGCGG 5580
GTCATCGTAG GAACCGATCG CGGGATCCTC TATGAATTGC AGCAGGGAGA TGACAGGCGC 5640
GTAACTATCC GCGCACTCAA TGGCGAGCGT CAGATATACG CAAGCGATGT ACATGGTGCA 5700
GATGAGGGCG CGTATTTTTT AGCAGACGGA TCCCTATATC ACAGCATGGC GTCCGGGGGA 5760
CCGTATCGTG TTTTGGTGCG CGGAGTAAAA GGAACTCGGT TTCTGCCTTA TCGTGATGGT 5820
TTTATTGTGT GGTCTGCAGG GAAAGAAACA GAGTTTCTTC ATTGTGCGCA AAAGACGAGT 5880
CAACACAGGA TGATATATCG CGCGCGTTCC ACGGTAAGCG GCGTGTCGGT GTATGGGCGT 5940
ATGTTGGTGA TTACTGAACC TTTCTCTGGA GTATCGGTGG TGGATATTGA GCGGGGGATA 6000
CgAGTTTTTT TTCACAAAGC GATTGGTATG CAGGATTCGC TATTGATTAC TGATGACGTA 6060
ATTGTAGCCA CTCAAAGCGG TTTGCAGCCA CTTGTCCTGC TGCATATGCG TACGGGGGAG 6120
ACATATACGC AGCGGTGGGA GGCGATTTGC CTTGGCGTCC GCGCGCATGA TACACAGCAT 6180
GTATATTTTT TTTCGTTGGA TACGAATGCG GGCACGACTG ATTTGATCCA TTTCGTCTGC 6240
AACTGCAGCA ACCCACAGAA AGTGTTGTGC GACGCATCCT CTCTTAtAAG GATGAGGATA 6300
TAGATGCGCA TATGGTGATG CGGCGTTCAC TGTTGGTAAC TAATTTAGGA AAAGGGGCGC 6360
TTGTCGGACA TCGCGTGCAA CAGTCGCAGG TGTATCGTAT GTCCCGTGCG TATGCGTTAC 6420
CAAAAGTTGC TGCAATCACG TCGAACGGAG TTGTCAGCGT GAATTACGAT GGTTCAGTTT 6480
CGTGGTATGA AGGCGACGGT GCGACATTGA AAGCAACCGA ATTTATCCGG ACCGAAGATT 6540
TTTGAACGGG TACACAAGGT GCGGTGTATT TTGTAATTCG GCACGGTGGT ATGAATGCTT 6600
CCTAGTTGGT CTTGACAGGG AGCTCCTTCT CGGGGGAGGA TGGGCGGGGT AGATGTTGGT 6660
TCGCTACAGT TACGATGCAA AGGGAAGGCG GTTGGGGCGT GCGCTGGTGT ACACTGAGTC 6720
GGAGCACGGT ATACCTCGGC AGAGCGTTGA CGCTGGGGCG ATAAGGgTTG TAGAAGCGCT 6780
GGTGGGTGCG GGGTATGAAA CCTATATCGT CGgTGGGGCG gTAAGGGACc TGGTTGCGGG 6840
AAGGACACCA AAAGATTTTG ACATTGTTAC AGGCGCAGTT CCCTCTAGGA TTCGTAGGTT 6900
GTTCAGGAAC TCGCGCATTA TCGGCAGGCG CTTCCGCATT GTTCATGTGT CGTGTGGCTC 6960
GCAGCTGTAC GAGGTTTCCA CCTTTCGCTC TCGTGTGGGG GAAGgTTCGG TGTGTGTTCC 7020
TGGCACGTTG GAGGAAGATG CATGGCGGAG GGACTTTAGT GTCAATGCCT TGTACTATGA 7080
TCCTCTGAGA AATGTGGTGA TCGATTGTGT CGGTGGAATG GTTGATCTGA AGAGGCGTCG 7140
CGTGCGGCCG CTCATACCTC TGCGGTCCAT CTTTGTAGAG GACCCAGTGC GCATGCTCCG 7200
GGCATTGAAG TGCTCGGTGA TGTGCGAGTC TTCCATCCCT TTTTCTGTCC GCCGCAtATT 7260
CGCCGCAtGT TTCCCTTCTt GGGGGGTGCT CTCCCTCCCG GTTGACCGAC GAATTtGTAA 7320
AAATCCTCTT TtCCGGTCGG AGCGcCsCGC TTGTGCGCGC CCTATGTGGG TAmCAGCTCC 7380
TTCTGTACTT GCAGCCGTCT GTGCACTACT TTATG 7415
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5271 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CTTTGATTGG TAAGGTAAGA GAACCGTCAA GAAATAATCC TTTTAAAGTT TTTAGTTCAA 60
AAATAAAACG CGTTCCACTT ATTTCAATAG TTTTCGTTTC TCCTTCAGTT CTAGAATAGC 120
GAACGTTTCC TGTGGAATAC AGAATTTGGG CAGAGGTGTT GTAGACAGTT TGATCTGAAA 180
GAATCGTTGC GGTAACGcTC CCATCATCAA TGGAAATGGA AACATTCCCG GTAAAGACTA 240
CCAACTGATC TTGCAAATCA AAGGGGGACA GCGGCACGCG gCCGGTACTT TCCAATATAG 300
GTTGTCCAGT TTCAGAAAGG CGCGTAGTTT CCTGTGCCGA ATTAATAATA ATTCTTAATT 360
TGCGTAGACC ACTTTCACCA AAGAGGGGAC AAAAAAGAAA GAATATGCCC CAAATTGGAT 420
ACCATGCTCT CATGTCTTTT TACACAGCAC CTCTTTGATG TACCAACCGG TGCGAGACTC 480
AGTTATCTGG GATACTGCTT CAGGGCTTCC CTGTGCAACG ATAGTTCCAC CGTGCATTCC 540
TCCTTCAGGA CCTAAATCGA TAACACAGTC TGCCTGAACA ATAACATCCA TGTTATGTTC 600
GATCATCACA ACCGTATTTC CCTGATCTAC CAAGCGTTGA ACAACCTCCA TTAATTGGAT 660
GATATCGGCA AAATGCAATC CGGTAGTAGG TTCGTCAAAG ATATAGAGAG TTTTTCCTGT 720
CGCACGCTTT GAAAGCTCAA GTGCAAGTTT AACGCGCTGG GCTTCTCCCC CTGACAACGT 780
CAGAGcAGAC TGTCCTAAGC GCACATACCC AAGCCCCACC GAGcAGAGAG CTTCTAGCTT 840
TCGTACTATA GGrGGaACAG CAGAAAAAAA AgAACGsGCT TCTTCGATCG TCATGTCCAG 900
CACATGGGAA ATGTTCTTGc CCTTATAAAA CACAGCTAAT GTCTCCCGGT TAAACCGGGT 960
GCCGTGACAC ACATCACAGG TAATGTACAC ATCAGGTAAA AAATTCATTT CAATAGTGAT 1020
AACGCCATCA CCTTTACAAT GCTCACACCG TCCTCCAGGA ACATTGAAAG AAAAACGTCC 1080
TGGTTTATAT CCCCGCATTT TTGCTTCAGG AACcTGGGAG AACaGCATTC TAaTATCTGT 1140
AAACACACCC ACATAaGTTG CAGGATTTGA AcGAGGAGTT CTCCCGATAG GACTTTGGTC 1200
TACATAAATT ACTTTATCTA AATGCTCCGT CCCCTCAATC GAGGAAAATT TTCCTTcAGG 1260
AAGTCTGCCG TTCATCACAC GGTTGTATAA CGCAGGATAT AGCACATCAA TTAAAAGCGT 1320
TGATTTACCC GAGCCGGATA CTCCGGTAAT GCAGGTAAAA GTACCGAGTC GAATACGTAC 1380
AGAAATGTGT TGCAAGTTAT GTTCATGGAC GTCATGCACC GTAAGAACAT TTCCATTTCC 1440
CGTTCTCCGT ACTGCAGGAA TGGGTAATGT AATTGCACCG GCAAGATACT GACCAGTAAG 1500
ACTTGCTTGC ACCTGCATAA CTTCAGGTGG ACykCtGCGG CGACAACATA TCCTCCGTGA 1560
ACACCCGCAC CGGGGCCGAG ATCTACAATA TAATCTGCTA CGCGGAGcgT TTgCTCATCG 1620
TGCTCTACCA CAAGCACTGT GTTTCCCAAA TCACGCAAAT GAAGAAGCGT TTGGATCAGT 1680
CGTTCATTAT CCCGCTGATG CAAACCAATA GACGGTTCGT CCAGTATGTA CAAAACCCCT 1740
GTAAGGCGCG AACCTATCTG GGTTGCCAGT CTAATTCGTT GTGCTTCTCC GCCGGATAAC 1800
GTGGCAGCAG CCCGTTCCAA GGTGAGATAT CCAAGACCCA CGTTCTGAAG AAACTCTAGG 1860
CGATCGGTAA TTTCTTTCAG GATCTGTTGC GCAATTGTCG CTTCTACTTC TGTCAGATGG 1920
AGAGTTTTAA AAAACTCACA CGAATCATCT ACAGACAGCg CACTGAGTGC GTGGATGTTT 1980
TTTTTTTCTA TAGTCACCGC AAGCGACTCT GGCTTTAAGC GCATCCCTCG ACACGCTTCA 2040
CATGTACGCA CCGATAAATA CCGTTCATAT ACCTCGCGCT GTGAGTGAGT ACATGACTCT 2100
GCGTATCTCC TGTGCAGCTC GCTAAAAATT CCCGGCCACG GCTTAATGTA GCGTGCGGTA 2160
CGAGAGCCAT CTTTTCGTTC ATGGGAAAAC TCAAGAGCCT CGCTGCCACT TCCATGCAGG 2220
ATAATATCCA GTGCGTGTTT TGACAGATTG CGTACCGGAT CATCGAGAGA AAAATGGTAC 2280
TTTTCTGCGA GTGCAGCAAA CCGCACACGG TTCCACTCAT GCTCAGGTTT AAATGGCAAA 2340
AAAGCACCCT CGTTAAAAGA ACGGTTTTGA TCAGGGACAA TGCGATCTAA ATCAAATGTC 2400
TGCATAATCC CCAGTCCTGC ACAGCTCGGA CAGGCACCAA AAGGTGCGTT AAAAGAGAAC 2460
AAGCGAGGCT GCAATTCGGG TACGGAGACA TTACAGTGCG CGCACGCGTT TTTTTGCGAA 2520
AAAAATAACT CAGACGGCAG GAGAGCAGAT GTTTCTATCT TTCCGGAAAC GGTCCCAGAA 2580
TTCTCTCCCT GCACTAAGAC GGTCAACAGC CCATCTGCAT AGCCTAGCGT CGTCTCTACT 2640
GATTCTGTTA ATCGTTTACG TACTGTATCT GACAATTGAA TTCTATCGAC AACTATATCG 2700
ATAGAATGCT TTTTTTGCTT ATCCAACGAA ATGCGCTCGT GTAAGTGGAG CAAAGCCCCG 2760
TCAATACGAG CTCGTACAAA ACCATCTTTG CGTGCAGCTT CCAAGACCTT GTGGTGTGTA 2820
CCTTTTTTTC CTCGCACCAC CGGGGCAAGC AACTGAATTC TGCTTCCCGA CGGCACGGTC 2880
ATGAGGGTAT CAACAATTtG GTCAACGGTT TGTTCCTtGA TCTCCCGCGC ACAGTGCGGA 2940
CAATGCGCGC GTCCTATGCG GGCAAACAGC AGACGATAGT AGTCATAAAT TTCTGTGACA 3000
GTACCAACCG TTGAGCGAGG GTTACGCTGc GTAGTTTTTT GCTCGATGGC AATCGCAGGA 3060
GAAAGACCCT CGATAGAGTC AACATCCGGC TTATCTAACC GACCTAAAAA CTGGCGAGCG 3120
TATGCAGAAA GGGACTCCAC ATACCGACGC TGTCCTTCTG CAAAAATAGT ATCAAACGCA 3180
AGCGAACTCT TGCCTGAACC AGAAAGACCG GAGATCACCA CAAGCGCATC TCGCGGCAAC 3240
ATAACATCAA TATTCTTCAG ATTATGCTCA CGCGCACCCT TTATACACAG ATTACGAGCA 3300
GCAGAGCCCA CGCGAACGTC CTGCGACACG TTCTCCCTTT CTACGCGATC CCCATCCATA 3360
AGAGGGCAGA CTATGCGCAA TTTTTATGCT TTATGCAATT CCACTCCCTT CGCGGCAGAC 3420
CGCATTACAC CGTTCCTCCA AGCAACGCGA CAATTTTTTC ACTCTGCGCA GACTGTACGA 3480
TGAACATCAG CACACTTCCC TCAAGCAGAA CCGTATCTCC TGAAGGAATG AAAGAGCCAC 3540
GCACCGTTGA GATGAGCAGC ACCAAGAAGC TTCCGTGCAC AGCGATATCC TTCAGGCGCT 3600
TGCCTACCAG GGGGGACTGC GCGGAAATAG CGAACTCAAC GATTTCTAGC GATCCGTCGC 3660
CGATAGTATG TATGCCAGTG ACGTGGGAAC CGGCCAAATG GCTCATAATA GCGTCAACCA 3720
cTACGTCTTG ATAAGAAACA GCAACATCGA TGCCAATTTT CCCCGCAATA TCCTCCATAA 3780
GGGAGCTGTG TACCAATGCA ACAGCCCGAG GCACTCCGAG CGTCTTCATG TATGCGGCTG 3840
TAATCATATT CAGCTCATAG TTATTAGTGG TGGTAATCAC CAGATCAAAC GTGTCCGGCG 3900
TAATCTCTGC GAAAAAAGCC TCATCTGTGA CATCACCATG ATAGGCAGTA ACGTGCGGAA 3960
ATTGAGCACA CACTGCCTGG GTTGcCtTTC ACTCTTATCA ACCAATACAA GACTCGCACG 4020
CTCCCTTGGA GAAAGACTGA AGGCACTACT GAAAAAGTGC GGCTTGCATT TTTCTGCTAC 4080
ATCCTGTGCC ACGAGCGTAC CTACCGCGCT CATGCCAATG AGTGCAATTT TTTTTACCGG 4140
ATGTATTTTA AAACCCGCCA GCTCATAAAA ACGTCCCATA TGTTCAGGCG CACAGAGTAC 4200
TGACAGGCGC ATACCAGAAG CGAGCATGGT CTCCCCTGAG GGAATTACAC TCCTCCCCCG 4260
AACTTCAAAA GCAACGGCAA CAAAAGAAAT TTTTACCAGA CGACGCATAT CAGAGAGCGT 4320
GATACCATCG AGGCCGCTGC CCTTTGCAAT AGGAAAACGG GCAATTTCAT ACGGTGCATT 4380
TTTCAATGGG ATGACATCGC TGATGGCACC CTGCTCGACG GTGCTCACTA CCGCACGCAT 4440
CGCTTCCTTA TCCgCAGATA TGAGAAAGTC AATACCAAAA ATACAGCGCG ACTCACGACA 4500
CACCGCGTGA GcGTAGTGGT CATCGTGCGT TTGGGCTATT TTAATCACTC CGGCATTCAA 4560
GTCGGCGGCT ATACCACACA GTACTATGTT AAGTTCGTCA ACCTCGGTGA CCGCAACAAA 4620
CGCCTGTGCC TTTGCGATAC CTGCyTCACC CAGGGTAGCg CGGTGAtCTT TTTGATGACG 4680
CaCGAGCGCT TGTGCCCCCG CGGGAATTTT ACGGGGGACA GGCTCATGCG CGGCAGCAAC 4740
AAGCGTAACC TGATGTCCCC TCGCGCTCAA ACGACGCGTA AGTTCACGCC CATTCGTACC 4800
GCATCCAACA ACAATAACCC TCATGTCCGA AGGCCATAGT AGCACGAAAT TTTTTTGCAT 4860
GGCCAGCGCG cAGAACaCGg CGcACAACGC CTGCCACTCA TATCTTTTTC AAAAGTACCA 4920
CTACCTGTGC GGTAACCGCC GCACCAGATC CAACAGGTCC GAGGCGTTCG GCAGTCTTTG 4980
CCTTAACAAA AACACGTGTT ACGTGCGTGT CCAGGGCCTG CGycAaGcGA TGCGCGCATC 5040
GCTTCCCGAA ATGGGTGTAA TGCAGGCTGC TCAAGACAGA CAACAGCATC GAGATTCACC 5100
AGCCGCCAGn CACTGCGCGC ACCAGTTGCC AGGTATGGCG GAGCAACGCG CAAGAATGTG 5160
CGTCTTTCCA TCGTCCGTCA CAAGAGGGGA AAAACGTGCC AATATCCCCC AGGCCCTTTC 5220
GCCAGCTGGC GTAATAGCGA AGAGGCCCGC ACCGATCGCC CTTCCCAACA G 5271
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 646 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AAGTCCTTCT CGTAGCGCTT ACTGTCGGTG AAGCGGGAGT GGAGGACTTG GCCCGGCAGA 60
TGCAGCAGCC CTCTATGCTC CGGTACATCT TTGGCTTGTG GAAACCAAAC ATCTTCGCTT 120
CTGGCCAGGT CTGGAAAGAC AAGGCAACAG ACACAGCTCT GCCGAGGCCC TCACAAGAAA 180
CCATCCGGGG AGCCTGTGCT GCTCCACCCC GGGCCCACCC AGCGCCGTCA GACAGCAGGA 240
CTCTGTGGAA GGTGGCCTGG ACCCGCCTCC GCTCCTGGCG CTGGCACGGC AAGTGTATGA 300
CACACAGAAG AGCTCAGGTG TTCAGGGAGG CCCCCCGCTC TCAGCACTCC CCCACCCCTG 360
CCCAGCAAAC ATCCTTTCTG AAAATGAGGA AGGGGAGGCT GGTTGGTTTG TTGGCAGGGA 420
GCCAAGCACT TGAGCCATCA TCTGCTGCCT CCCAGGGTCC ACGTGAGAAG GAAGCTGGAA 480
TCGGGAGTGG ATCAAGGATT GGAACCCAGG CACTTGCGTA CAGGATATGC TACAAGCTCT 540
CCTGATAATC CTGTAAAATG ATGAAATCAT TTAGGATGTA TCCTGAAATC TGAGACaAGG 600
CATACCTTTT CTTCTTGCAT CTTTGAAAGT GaACCCCCCC CCACGC 646
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28295 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GTTCCCTAAA GAAGGGAATG TTTTCTCCtG TGTGTGCAgT CAATGTGCCG GCGTATATyC 60
CGTTAGAATG GGTGCGCTCC AGTGTCAAGG CGATATCCTG ATAGCGCATT CTGAGCGCTC 120
GGATATTTTT TTTAAACACA TTAGTCATAT GCTGAGCTTG TTCCTAAGTT GAACGATTTC 180
ATGGCGCACC TTGTGAATGA GAGTTTGTGT TCCCTGAGAA GGGTTTTTTC TCAGGTGAAG 240
TAGCTCAGCG TATGCAAGAA ACTCAAAAGC CTCTTTTTCT ATGCCGGAGC TGGCTATACT 300
GGATGCGATA CGTGCATGGT GTGsACTGCG TGTGATGCGT TCAAAGTAGT GGCGGAgCAC 360
TGCCTTTTTG TGGATGCGCG AAATCGGCGG CCCTTCTGAA AAGGTGAATG TATCCTTTTT 420
GGCGAAGGCG TGTTGGGAGA AATCTTCATA CGTGATGGTA CGTACATTGG AAGGTGGTGC 480
GTGTTCTGTA TTTAGCCGTG TAAGTTTCTG GATTTtCTCT TGTGGGAGAC TGTAAAACCA 540
GTGTGCGTAT GTTTCTAGTG AGCGATTGTC AAAATTTCCT GCAATGCAGt CGCAAGTGGC 600
TCAGTTCTCT TGGGATTAAA ACGTTTGAAG GAAGCATGTG GACGTGCGTG TTGGAATCCC 660
TTTCCCGGGT GCAAATCGAG TCCGGCCAAA TAAATAGAAT GTACCCCACA CTGTAAAAGA 720
TATTCGAGTG CAGTCCCCAT GACACTCCCA TGTCTTTGTG CAGAAACCGA AGGaTGTGCA 780
GGTGTTGCAG AAAGAAGTGT TCTGTATGCG AATGGTAGTT GAGCAAAGAA ATTGGAGAGT 840
ATTTAAACAC ACGTGGAGGA ATGCACGCTT CTAAGGGAAA CAACACCGGA AGGGTAGGCG 900
CTGGCGGAAA ATGCTCTGCG GCCCAAAAAC TGCCGTCCGT GCTCATGCAG ATATCAGGAG 960
AAATATTGCG GTAGAGAAGT GCCTGCAGTG CTGAGGAGAC CGCAACTATG GGAAGTCCGA 1020
CTCGGGTGAT TTGCTCGAGT CCGAAACCTG cAGCCACACA CAACACTTCC GGTGCGGTCA 1080
GGCACACCgT ACCGGGcGCT CAAGAAAAAA TACATTTCTC AGTGTATTCA GTAACCAACG 1140
TTTTCCGAAG TATATGCGCG TGGCGATTTC ACTCTGAATT ACTTTGATGG TATAgGTGAT 1200
TTCTTGCCAT GTGGACTGAG CTTGTGCGCG CCATATTCGG TTTGCTGGCT CCCAGGGAAC 1260
AAACGCGGTT TGCCCGAGCA ATTCGTCAGG GATATGATTA ATTAAGAAAG AATGCAAACT 1320
GCcGCAGTCT GGTCTCCAGA CTGCGTCCCA CTGGATATCG GATGAAGTGA ACGCATCGTT 1380
TGTGTATCGA ACTGCGATAA GCTTTGCATG TGGAAAGCGT GCCCGCAAAA ACTCCGCGGT 1440
GTACGACTCA CCCGGCTCTG TTATTACCAC AATGCGGCGA CCTTCTAGGA TCGCAGCGAC 1500
AAAACGCTCC GCCTCCCTTC TTGGGTTATA TTTTGAGTGG AGGTTCAGCG ACACGTCTCC 1560
CCGGTTTCGT ACTGCAAGAA ACGGAGGACG TCCGTCTGCG CGCGAGTAGT TCCACTGCTT 1620
GTGATGAGAA GTGAATTCTC CGCGTGGTTA ACAAAGTACG CGCTGACGCC ACGGGTAAGT 1680
TGGTTTAGGT ACAGCTCAGG ATCGACGGGA AAGGAAGAAA AGACCACAAC GCGCActTCC 1740
CGGGAAGCGA CGCATGAGGC AAGGTTGTCG TTCCCTACGA GCGTTAATAG CTGGTTGCTC 1800
AGCGCACAGC AGCGGCGTAC CATACTCAGC TCTTCCTGAA GAACGGCGGG GGGCAACGCA 1860
CTcTGGTTGA TGCTGATAGT GAACAGGTTT GCCATCCGCT GATTGTTCAG GGCGAACTCG 1920
AGCTGGGAGC AGCTCGGTGC CAACTCCGGC TGTcGCGGAT ACACGGGCCT ACTCGCCTCG 1980
ATAAACGAGA AGCGCGCCTG TACGGCaAGG ATAAAGGAAC GCACGCGCTC CTCGCCTGAT 2040
GCaTCACCGT GAAGGTCTGC aGCGAGGGCA CCTTCCTCGA CGTCGGTTCG GGTATCAAAG 2100
AGCACTAGGT ATACGCGCAC GCTTCCAGCG GAAAACGGAT ACAAGTGGAG CGCACGCAGC 2160
GTGCTGAATA GATGGGAAGA GAAGTAACCC TGCAATTCGG TAAGTTCCTG CCTGAACAGG 2220
GTCGTCCACT GCTCCGCAGg CAGGCCGGGC GAAAGGAGGG AGGTGGGAAC ACGCATACGG 2280
TAAAACGTGG TGGTGTCAAT CCCGAAGGGa AACGCAGgTC AAACATGCGC TCTTCAGTGA 2340
GAAAGAGAAA ACCCGCGCGG GGAATTCCGA GAGATTCAAC CAGCTGTACG AACTGCAGCT 2400
CCAGGCGGTA GTACTCGCAG GAGACACCCG AGCTACGCCG GATGCGGGAA GCGCGCGCGA 2460
TAAGACCCAT TAGGAAATCC CTAGCTcGTC AAAGAGGTGT TTGTACGTGT GGAAATACTC 2520
CGATCGCGCA AACTCCTCGA TCTTCTCCTC GGGCAGACTT TCGAGCAGCC GGTCCATATA 2580
CCCGAGCACA GAGCGCACCT CGTCGGCAAG GCTCTTTGAG AGGGGAGGTG CAAGGTTGGC 2640
AGAATCCCCC TGCGCAGGAC GGGCCGCTTC CGATTGGGGC GGGGACTCGA TTGCCACCGG 2700
CGTGGCAACC TCCGAAGACT CAGAGGCGGC GGTTTCAATC GCGTGGGGAG CAGAAAACTC 2760
TAGCGAGGGA ACGGAGACGT CCAACTCCTG CTCTTCTGGC AGTGGGAACG ATGTATCTTC 2820
GCGCTGGGTG TCTTCGTCGT CGAAGACATT TTCAGTATTG AGGGGCTGCT CGGCACCGGC 2880
TGCATCGAGG CCGGTGGAGG AAACGTGCGC CCGTTGAGCC TCAGGCGATG CGTCATCTTT 2940
GGCAACAGGA GACGGATCGG CCGCAGCGTC ACGATCCGCG TGAGCCCCCT GCTCTTTCAC 3000
ATCGAAAAnT TCGTTTCCAA GAGTAGCACG CCCCTGTTGT ACCCACTGCT CCtGcgCGAC 3060
GCgCGCTGcA GCGTGGTCGT CCTCCTCTAG GTAGCTGAAA TCGCCTGCAG CGGACCCGAT 3120
GTGAGAGGAC TCAAAACCAC TGAGTTCACC ACCAAATGCG TCGGCCGCAG AGTCTTTCCC 3180
ATCTTCCTCG GTAAACTCAG AGGTGATGAG GATATTGTTC AGCTCATCGT TGGTAAGCGC 3240
AATCGTCTCG TCTGGATCAT CGTCACAGAA AAACCCGGAG TAGGCAGCCT CTGTTCCTTC 3300
CGGAGCCGGA GCGGTGGCGG GAGCCTGAGC AGGCTCGGCG CCAGCTTGCC GGGAGAAGGT 3360
TCCCTTCAGC TGGTCTAGAT CAGCGCGGAT ACTGCTGATT TcCTGTGCAA TTTTCAGAAG 3420
CAAATCGGTG GAAGCATCGC CTGaGGTACT GGCTGCGCCC CTGCGCGGCC ACCTGGGCAT 3480
CACCCATGAG GGACTGCGCa ATGCGTCCAC GTCACTGAAT GGGGCGCCGC GTACCGGTCC 3540
TTCCGAAGAA GGCTCCGAAG CaGaGTAArT GATAGGGTCG GAATCGACCG GGTATTCCGG 3600
TGCGTCGTGG CTCAAACTCG AGAGCAAGTC GTCGAACTCT GTGGTGTCGC ACGACCCGCT 3660
GGGGCAGGGC ATGCTGCACC CGCGCGGCTC TGGCGGCTCT TCGGTCACAA CGGAGCTCAT 3720
CGCTGCCGAT AGATCAACTC TGTCGTCCTG AGACTGCGCG GTGCGTGTGA CCAAATCGTC 3780
GAACGACGGC ACCTCCGAGT AAGCGGCATC TCCGGtGGCG CATCTGCCGC CTGGTCGGAG 3840
GAGGACTCGC CACCCGGAGC GTGGCTGCAC CCGCGAGCGA GCACCTCGTC CTCCACGGAG 3900
AGGTCAACAA AGCCACGAGA ATCCTCTGCA GAGGACTGCA CAGGAAGCGC ACGGTACACA 3960
TGGTGCCGCA CCGCATCCAC GCGCGCACCC GGGGGAGCGT GTTGGGCAGG CGCCTCTGGA 4020
TCCTCTGGGC CACCTGAGCT GACGGGTGGC TCCTGCGCTT CCGGCGGTAC CTTTACCCAC 4080
ACGCCACACG CATCCCAGGC ACGGTCTTCC CCCTGCTCCT GAGGGCGTGC AGTTGCGTGT 4140
GGACTGTCCA TTTCTGCTGA CTCTGATCCC ATAGCATCTT CCTCTGTGCC GTGAACTCCC 4200
TCACACATCC TGTGTCGGCA GCCGTGCCGC AAAGCATGAG CATAGCGTGT GCGGTCCGTT 4260
TTTTCAACAA TTCGTAAAAG CACCCCGTTC TTACGCCGTA AATGACGGCA ACGCGCGGGT 4320
GCGGAAGGCA CCTGCAGTGC CTGTGTCATT ACCTCCTGGC GGGGGGGGGT ACGGACAAAA 4380
ACAGGGGTGA AGATGTGGAA GACAGTGCAG ACACGCAGTC AGAGGAAGCG GTGAGGACTC 4440
TATGCGCATT TACTTGAGGG TAGTACTTCC CCTGTCTCTT GCGCTGAACA GCTACGGTGT 4500
ACTCGCCTTT TTCTGGGGAG AGCGGGGGGT GTGTGCCATG CGGCTACTGG AACGTGAGAA 4560
AAAGGAGCTC GTCCATCACA TCCAGACGCT CGCAGAGCGT GGGCGCGACT TGGCTGCGGT 4620
GGTGGACGCT CTATCCTTTG ACGAAGAGAC TATCGGTGCG TATGCGCGTc AGCTAGGATA 4680
TGTCCGCGCG GGGGATGTGT TAGTGAGGCC GGTAAACTTT ACCnTTGCGC ACATGCATAC 4740
CCTgATTCTG GGGATGCACG TCCGCTtGTT GCACCTGCAT GTTTTAGCGA CACGCGTTGA 4800
CAAGGTGTAC GCGCTGTkCs TcGGCTTcTT TgTCGTCCTG TTACAGCTGC TGTGGGGTAG 4860
CGCGCGTGCG TATTTTAAAA CATGAGGCGC AGTCTGCaCG CTnTGCgCgT GCGCTCAAGG 4920
CCGGTGCGCT CGTAGCGTTG CCGACAGATA CGGTGTACGG TTTCTCTGGC CTTGTGCCAC 4980
ACGCTGTTCC GGATCTCATA TGTCTGAAGG CGCGTGGGTG CACAGAGACG GAAGGGAACC 5040
GGAGAGAGGG CTATCCGTTC ATTGCACTGC TTGCAGATCC ACAGGACGTG GTTGTCTATA 5100
CCGGGACGCG GCTTCTGCGG AGTTTCGTGC GCTGTGGCCT GGCCCGTATA CGTTTGTntG 5160
CGCATGCAAG ACGGCGCGAC GCAGGcgTTC CGCTGTCCTG CTGACCTGTG cTGCGCTCAG 5220
TGATACGGGC AGTTGGGGGA GCGATCTTTT CCACGAGTGC AAATCGGCAC GGCGAGCCGC 5280
CGCTGCAAGA TGCACAGGAC ATCGACCACA TCTTTGGAAA GCATCTTGCG CTGACCGTAG 5340
ACGCAGGACC ACTGACCGGC TCCCCAAGCG CGGTGATAGA CCTCACGCAC CCCGTGcCGC 5400
GTGTGctCCG CGCTGGTGCG GCGCCGTTGC CTCTTGCAGG ACTGGAAAGG CGTGACTCTC 5460
CTTCCCTCCC TCATGTGGGG GAAGTATGTA AAGAATGAGG TCAGTTGCGG ACCCACTCCT 5520
CAGGCACCTC TTCATACTCA ACAGAAGCGG TGTGCCCTGC GCCCCGCACG ACAGGGGCGG 5580
TTCTGACGCC TGAAAGCCGG TTGTTTCCTG AGATCTTTGG GAACCTCCAC TGGAGCGGAC 5640
GCGCCTCcTG CGCGAGCTTC TGAGCGATCT CAGGGGGAAA ATGAGAAAAC TGTCGCGCAT 5700
TGACTGCCCC CTTCTGTGAC ACCACGAGCA CTCCGTCTGA AATTTTCAGC TTGAGGGGAA 5760
CGGAAAGTCC GGAGCGCAAC ACcGCGATGC CGCGGCCTTC GCTCACGATG ACGATTCTTT 5820
CCCCTTCCTC TGTGCTGTGC CAGGAACCGG CGAGGGCATC TAGGGACGAA ACGGATTCTT 5880
CACCTGCGCG CGTGTGCCGC ATGCTAGACG TCTCTGTCTG ATTTCCTGTC AGCGGGACAG 5940
AACGGTCAAA CACATCTCGG ACCAGGTGGC GCGAGTCAAG CAGAATGCGC GCTGCCGTCT 6000
CATATGTTTT GGAGAGCAGC CGCGTGGCGT TGTGATCTTT AmCCTTGAGT GCAACAGCCA 6060
GTCTAATCCC CTCAGGGGTG ArGTCCATCG CGCCGCAAAA TATGTAGTCA AGATTCCCTT 6120
TTTCGGGAAA ACGGTGCGGC ACCGCTTGCT CTCTGCAATC TACAAmGTGA TACCCACGCA 6180
ACTCCCGAAT GAAAgAAAAG AGCGCGTCGT TGATGGTAGT TTCTGTGTGC GCAGGCACAC 6240
CAgACACTTc TAGCCTGTAG ACGCCAACAC GAGGAGCTGC GTGAACAGCA TGCGCGAACA 6300
GCACGAACAG GATAAGCAGC CGAGAAGAAC GGACACCTTT TTTCATGAGA CTAGTGGTGT 6360
CGCTCACAGA GGCTGCGGGA CAGCTCCCGT GCGTTGTCGC GAGCTTTGAT CTGCGCGCGC 6420
TTGTCAAAAA GCTTCTTGCC CTTGCAGATT CCCAGCGCTA CCTTCACCCG CCCTGCTTTT 6480
AGGTAAAACT CCAGGGGGAC CAGAGTATAG CCTTTCTCTT CAACCTTGCG CTTCAAGCGC 6540
GCAATCTGGT CCCGATGTGC CAGTAACTTC CGCATCCGAT CCGGATTGGG GGCAAAGGAG 6600
CAAGCATGCA CGTACTCCGC AATATGCACA TTCTTTAGCC ACAGCTCGCC TCCGCGCATC 6660
TCTGcAAATG CGTCAGGAAA AGAAAGATGC CCCGCGCGCA CAGACTTCAC CTCCGTGCCT 6720
TCAAGCGCGA TGCCACACTC TAGACGGTCT TCCACATGGT AATTGAAAAA AGCcTTGCGG 6780
TTCTTTGCAA TGAGATGGGT TCCTGTGCCC CTCATGGCGC CGGATGCTAC CGGATAGGCA 6840
CTTCCCTTGT CAATTCGATT ATCGCCGTGT TAGGCTGCCG TGTCTGGGAG GGACGCCGTT 6900
TTATGTTTGC GCGGTGGAGA AGGTACTCAT ATTTGGCGCG GCGCGAAGCA CGGCGGAATG 6960
CGACCGCAGT TTGTAGTgCT GGGGTGGGCT TCTTTCTGTT CTATCTTTTT ATCACTACGC 7020
ATGTGGTTGC AGCGTATCGC ATTCAGgCGG ACTCGATGCA GCCGACCCTG AGCGCAGGGG 7080
ATTGCGTTCT TGCCTCGTCC CTGTTTCGCT TTGCCCGCAT CAAGCGGGGG GATTTGGTGC 7140
TTGCAACTCC CCTTGAGAAA GAGGATATAG GCCTGTTTAA AAGGGCGATG AATGCTGTGT 7200
TnAGgnTTCG CAAGCCTTCA ATTGTACCGG CCGTTTGGCG CGGCAGATCG CATGTTTTCG 7260
CGGCCGCAAA TGCGCAGGGT GGTGGGCCTT CCAGGGGACA CTGTCTATAT GCGCGATTTT 7320
GTGCTGTACG TTAAGCCCCA CGGTCAGCAA CACTTCCTCA CGGAATTTGA AGTGAGTGCA 7380
GTTAGCTACG ACGTGCGTAA GGGGGTGCTT CCTGAGCATT GGTCTGAACG GCTTCCCTTT 7440
TCTGGTTTCA TGGAAGAGAT GCAGTTGGAC GAGCACTCCT ATTCGTGCTG TGCGATAATC 7500
GAATTGTCTC CAGTGATTCT CGTCTGTGGG GTGCCATCGA CGGTAGTACG CAGATAAAAG 7560
CAAAGGCATT CATGCGTTAT TTCCCTTTCG GAGCATTTGG TGTCTTGTAG TGTGTAGGCG 7620
CCGCATTTGT GGTGCGTGtG CGCATCGTGC TGTTCCTTTT ATCATGTCTT CTGAGGTCGG 7680
TGCGTCTTTG TACGTGCACA TCCCCTTCTG TGCGCAACGC TGTGCTTACT GCGATTTTTA 7740
CTCCCTGGTG CGTTCAACCT ATTTTAGGCC TCATCAGCCT TGTCCGCATT TTATCGATCG 7800
GCTGCTACAG GATGTGGCAT TGCAGCGGGA GTGCTTTGGG GTCCAGGGkT GGCAGACAGT 7860
GTATATGGGT GGAGgTACCC CTTCGCTATT GGCACCGCAG GACATTCGTC ATTTTTGCGT 7920
AGCGTTACGC GCCGCGCAGC GGTATCCGAT TCAGGAGTTC ACTCTTGAGG TGAATCCTGA 7980
GGATGTGACC GAAGAGTTTT TGTGTGCGTG TGCAGAAGGC GGAGTAAACC GTTTATCCCT 8040
TGGGGTACAA AGTCTGCGTG ATGAGGTGTT GCGTGCGGAG CGTCGTGCAG CCTCTGCTGA 8100
ATGTGCTCGT ACCCGcTCCG CGTGATGACG GCAAATGCGC GCTTTTTCTC TGGCGGGGTG 8160
CGTATTTCAG CAGATCTCAT CGCTGGATTG CGCGGGCAAA CGGCGCGAAT GGTGCGTGAG 8220
GATaTAGATG AGCTTTTGTC TTTTGGGCTG AGACACGTGT CGCTATATGG GTTGTGTGTA 8280
CCGCATCCGA CTGAAACGCA AGAGGAGCGA ATTGCAGCGC TTTGGGCACA CGGCAGCGCG 8340
TATCTGGTGC GTGCaGGATT TAACCGGTAT GAGCTTTCGA ATTTTGCACG TACTGCgGCG 8400
GACGAGAGCG CGCACAACAG AGCATATTGG CGGATGGCAC CGCACGCAGG GGTGGGGCCT 8460
GGCGCAGTTG GCACGCGTTT TGTCAACCTT TCTTTATCAA AGGAGGGGGC GTGGGCGATC 8520
CGCAGCACGG TGCGGAAACA TCTTGGCCAA TACTTAGCAG AAGTGTGTCG GGAAAATGTG 8580
TATGAGCACG AATTCCTTAC AGAACATATG TGTGTGCAAG AAGCATTGTT AATGGGATTA 8640
CGTCTTGAAC AGGGACTGGA TGTGGTTACA TTTCGTGCGC GGTTCGGGAA GGGAATTGAA 8700
GCGTACATTG GCAAAACAAT CGCGCGGTGG CAGTGTCATG GCCGAATGCA GCGGACGGCG 8760
ACGTCATTGC GTTTGAGTGC GCAGgCACGG GTATTTCTGG ACAGTTTTTT GCGAGAGGCG 8820 TTTGCAGAAC TTGCGCGCAC GTGaGTGGTC GTGAGAATGG GTACGCCGTG TACTATTTGA 8880 ATTCGGGTAT ATTTCCTTTG GAAGAcTTGA cAGCctGCGT TTGTCTTATT ACCATCGGCC 8940 GCAAAATACC GGACTTGCGC GCCCGCGGTG GGACGCAAGA GGATGCCAAA CCAGTCATCA 9000 AATGTTGGTT TGATGGTGTT CAGGTGGGTC CCCTTTCAAG CGTGCGCAAT GCGCTTGGAT 9060 CCCGCGCTGG TTGTGCGCGG TGCCTGAGAG GCCCGTTTGG TTGTAGCGCT GTCGATCCTT 9120 CGTTTCGCTT TTTTGTGCCT CCTTGTTGCG TGCGGGCGTG CGGGGTCCGC CGTATCTGTG 9180 CCACAGATCT TTTGAGGAGG ATTTTCATGG CCAAGGAAAA GTTCGCGCGC ACTAAAGTTC 9240 ACATGAACGT GGGTACTATT GGGCACGTCG ATCACGGGAA GACAACGCTC TCTGCGGCGA 9300 TCACCTCGTA CTGTGCAAAG AAGTTCGGTG ATAAGCAACT AAAATACGAC GAGATTGACA 9360 ATGCGCCCGA AGAGAAAGCG CGCGGGATCA CCATTAACAC GCGTCATCTT GAGTATCAGT 9420 CCGATCGTCG TCATTACGCG CATATTGATT GTCCTGGGCA CGCGGACTAT GTGAAGAATA 9480 TGATCACGGG TGcTGCGCAG ATGGACGGTG GTATTCTCGT CGTGTCTGgC tGACGGCGTT 9540 ATGCCACAGA CGAAGgAGCA TCTTCTGCTC GCCCGTCAGg TTGGTGTTCC CTCCATCATT 9600 GTTTTTTTGA ACAAGGTTGA TTTGGTTGAT GATCCTGAGT TGCTAGAGCT GGTGGAAGAA 660 GAGGTGCGTG ATGCGCTTGC TGGATATGGG TTTTCGCGTG AGACGCCTAT CGTCAAGGGG 9720 TCTGCGTTTA AAGCTCTGCA GGATGGCGCT TCCCCGGAGG ATGCAGCTTG TATTGAGGAA 9780 CTGCTTGCGG CCATGGATTC CTACTTTGAA GACCCAGTGC GTGACGACGC AAGACCTTTC 9840 TTGCTCTCTA TCGAGGATGT GTACACTATT TCTGGGCGTG GTACCGTTGT CACGGGGCGC 9900 ATCGAATGTG GGGTAATTAG TCTGAATGAA GAGGTCGAGA TCGTCGGGAT TAAGCCCACT 9960 AAGAAAACAG TGGTTACTGG CATTGAGATG TTTAATAAGT TGCTTGATCA GGGAATTGCA 10020 GGTGATAACG TGGGGCTGCT TTTGCGCGGG GTGGATAAAA AAGAGGTTGA GCGCGGTCAG 10080 GTGCTTTCTA AGCCCGGTTC TATTAAGCCA CACACCAAGT TTGAGGCGCA GATCTACGTG 10140 CTCTCTAAGG AAGAGGGTGG CCGTCACAGT CCTTTTTTTC AAGGTTATCG TCCGCAGTTT 10200 TATTTTAGAA CTACTGACAT TACCGGTACG ATTTCTCTTC CTGAAGGGGT AGACATGGTG 10260 AAGCCGGGGG ATAACACCAA GATtATAGGT GAGCTCATCC ACCCGATAGC TATGGACAAG 10320 GGTCTGAAGC TTGCGATTCG TGAArGGGGG CGCACTATTG CTTCTGGTCA GtGACAGAGA 10380 TTTtGTTGTA GGCGTTTGCG GCGCGGAGTG TGTTTGGAGT TATTTTGCAA GGTGGGTGCG 10440 
GTTTTAGGCT GATGGAGGGG TTATGGCCAG GGAGAGAATT CGGGTAAAAC TGTGCGGATT 10500
TGACGTGGAG CTAGTGGATC AAAGTTCGCG CGCGATCGTG CACGCGGTGC AGAAGGCGGG 10560
CGCTGAGGTG CTCGGACCTA TTCCGCTTCC GACTAGGATG CACAAGTTTA CGGTCTTGCG 10620
CTCTCCTCAT GTGAACAAGA AGTCGAGGGA ACAGTTTGAG ATGCGTACGC ACAAGCGGCT 10680
GATTGATATC ATCGAACCTT CTCAGGAAGT GATGAATGCG CTTATGGGTT TAGAGCTTTC 10740
TGCAGGAGTG GATGTGCGGA TAAAGCAGTG AGGCGTGTGT GTTTTGTCTG TGCGTTGCGA 10800
TACGGAAGAG GTAGGTGATG GTTGGTTTAA TCGGCCAGAA AGTTGGTATG ACCCAGATTT 10860
TTGACGCaCG GGGTTGTGTT ACGCCGGTGA CGGTGATTCG GGTGGAGCAC AACGTGGTGG 10920
TAGGACTGAA GGATGTGGAG CGCTTCGGTT ACTCTGCaGT GATACTTGGC ACAGGGTGCA 10980
TGAAGAAAAG TCGTATCTCA AAGCCATATG CTGGACAGTT CGCTGAGCGG ATACCGCCGG 11040
TGAGGGTCAT GAGGGAGTTT CGGGGCTTTA CGTTGGACGT TTCGGTTGGG CAAGTGCTCG 11100
ATGTGCGTGT ATTGGAGTCC GTGCGTTATC TTGATGTGTG TGCTCTCTCA AAAGGAAAAG 11160
GATTTCAGGG AGTAGTGAAG CGGTGGGGTT TCAGCGGAGG TCGCTCTTCT CACGGATCGA 11220
AGTTTCATCG TGAAGCGGGT TCCACCGGGC AGTGTACGAG TCCTGGCCGT ACGTTTAAAA 11280
ACGTAAAAAT GCCGGGACGT ATGGGGGCTG AGCGGGTGAC GGTGCAGAAT CTGCGTATTG 11340
AACGGATTGA TGTGGGTTTG GGTGTCGTGA TGGTGCGCGG TGCGGTGCCA GGTAGAAACA 11400
AGGCCACGGT GTTTCTGCGG ACCGCGGTCA AGCGTGAAAG ATAGGGGTGT ATACGCAtGG 11460
AAAAGACAGT GTATTCGGTT GAAGGTGTTG CGCTGCGGTC AGTTGAGCTT GATGAGAGTG 11520
TCTTTGGGCT TTCGGTGAAC CGGGGTGTGA TTTATTACGC GATAAATAGT GAGTTGAGTA 11580
ACAAGCGCTT GGGGACTGCG TGTACTAAGG GACGTTCCGA AGTGCATGGT TCGAATACCA 11640
AGCCCTATAA GCAGAAGGGT ACGGGTCGTG CTCGCCGCGG AGATAAGAAG TCTCCACTTC 11700
TGGTGGGGGG TGGTACTATA TTTGGTCCTA AGCCGCGTGA TTTTCACTAT GCTCTCCCGA 11760
AGAAGGTGAA GCGTTTGGCC ATGAAGTCTC TCCTAAGTTT AAAGGCGCAG GGGGATGCGC 11820
TGACAGTGAT TGAGGACTTT ACGGTCGAAA GTGGAAAAAC TAGGGATCTG ATACAGGTGT 11880
TGCGTCATTT TGCACAAAGG GAGCGTACCG TTTTCATCTT GCAAAATGAT GATGCGTTGT 11940
TGAAGCGTGC GGGGAGAAAT ATTCCAACGC TCAGTTTTTT GTCGTACAAC CGTTTGCGCG 12000
CGCACGACCT TTTCTACGGG CGCAAGGTAT TGGTTTTGGA GACTGCGGTA CATAAGATCG 12060
CGGATTTCTA TCGGTCAAAG GATGCTGCAC AAGATGGAAC ATACTGATGT AGTGATTGCT 12120
CCGGTGCTTA CGGAGAAGTC GAATGCGCTG CGGCAACAGG GTAAGTACGT GTTCCGTGTT 12180
GCAGCTCGTG CGACAAAGAT TCAGATTAAG CAGGCGGTGA CGCAGCTTTT TGGAGTAACG 12240
GTTAGGCGGT GTACGGTAAT GAATGTCTTT GGGAAGAAGA GGCGTGTTCG TCATCGGACC 12300
GGTAGGACGT CTGGGTGGAA GAAGgCGATC GTGCACGTTG CAGCAGGACA GTCAATTGGT 12360
GTTCTTGAGC GTGCATAGCG gTAAGCTGCG GTAGCTGCGT AAGtGCCAGA GCGGTGACCG 12420
AAGgAGACGG GGATGGCGTT GAAGATGTAT AGGCCTATGA CGGCGGGCTT GCGGGGGCgT 12480
GTTGATCTGT GTCGTGCGGA GCTTACCGCG CGCACGCCCG AAAAGAGTCT TACACGCGGT 12540
AAGCCTGCCA AGGCGGGCAG GGGTGCTGGG GGTAGGATTT CGGTGCGTCA TCGTGGGGGT 12600
GGGCATAAGC GGAGGTACCG TGATATCGAT TTTAAACGTG ATTTGCACGA CATACCTGGC 12660
ACGGTAAAGA CTATCGAGTA TGACCCGAAT CGAAGTGTGA ACATCGCGCT TGTGTTTTAC 12720
GCGAATGGTC AGAAGCGCTA TATACTCGCA CCCAAGGGTT TGAAGGTGGG ACAGCAGGTC 12780
GTTAGCGGAG AGAAGGTCCC TTTAGAGCCC gCGAACGCGC TGCCACTCGG GGTAATTCCA 12840
GTTGGTTTTA CGGTGCATAA CGTTGAGCTT ACGATCGGTA AGGGTGGTCA GATCGCGCGT 12900
TCTGCAGGCA CCAGGGCGGT GATTGCGGCA AAGGACGGTG GCTATGTGAT GCTTCGTTTG 12960
CCCTCTGGGG AGGCGCGTCT GGTGCATCGC AGGTGCTATG CCACTATTGG TGAATTAGGT 13020
AATGAGGATC ATATGAACAC GGCTTTGGGG AAGGCAGGTC GTGCGCGTTG GCGTGGGGTG 13080
CGGCCGACAG TTCGTGGTAT GGCTATGAAT CCTGTGGATC ACCCGTTAGg TGGTGGTGAA 13140
GGGCGTGGTA AGGGACGTAA CCCAGTAACT CCCTGGGGGC AGCCGTGTCG AGGATACAAG 13200
ACGCGCAAGA AGCGCAGGGT ATCCGATCGC TTTATCGTGT CAAAGAGAAA GTAAGGGGGG 13260
GGTATGTCTA GGTCGGTGAA GAAAGGTCCC TTCGTTGATA AAAAGCTGTA TAAGCGAGTT 13320
GTCGAGATGA ACAAAGCGGC TAATCAGAGA AATAAAAAGG TGATCAAGTC GTATTCGCGT 13380
TGTTCCACCA TTATCCCTGA AATGGTGGGC TTCACTATCT CGGTGCAC A TGGCAAGTCG 13440
TGGATCCCAG TGTACATTAC GGAGGAGTTT GTGGGGCATA AGCTGGGTGA ATTTTCTCCG 13500
ACTCGTGTGT TCCGTGGGCA TAGCGGTTCT GACAAGAAAG TGGGAAGGTA GGTGAACTGA 13560
TGACTGAGCG TGTCACGTAT CGAGCGAAGA CAAAATTTTT GGTTGCtCTC CGACAAAGGT 13620
GCGTCCGGTT GCGAATGTGG TGAAGTGCAA GCCGTATGTg CGCGCGATGG CGCTTTTGGG 13680
ACACTTACCG CACAAGGGTG CACGTTTAAT CTCCaAGGTC ATGAAGTCAG CGGCTTCGAA 13740
TGCAATTGAT CGGGACAAGC GTCTTGATGA ArAGCGCTTG TTCGTGCGTG ACATTCAGAT 13800
AGATGAGGGG CCTCGTTTGA AGCGTCTGTG GTGCCgGGGA CGGGsGCGGG GAGATGTTCA 13860
GTTGAAGCGG ATGTGTCACA TCACTGTTGT GGTAGAGGAA AGTGTGAGGA CGAAAGATGG 13920
GTCAAAAGGT TAGTCCAATC GGTCTGAGAC TGGGGATCAA TAAAGTATGG TCTTCTAGGT 13980
GGTATGCAGG TCCTCGGGAG TACGCGGCGT TGCTGCATGA GGATTTAAGG ATTCGTAGCA 14040
TGATTCGCTC CTTTCCTGAG TGCAAAAATG CGGATATTGC CGAGGTGGAG ATTGTCCGTC 14100
ATCCCCAGCG AGTGACGGTA GTGATGCACA CCGCGCGCCC TGGAGTAGTT ATTGGAGCAA 14160
AGGGTGTAAA TATAGAAAAG ATTGGCGCTG AGGTTCAAAA GCGTTTGAAT AAGAAGGTTC 14220
AAATCAAGGT AAAAGAGATC AAGCGCATGG AGTTAAATGC TTACTTGGTT GCGCAGAATG 14280
TTGCTCGCCA ACTCACGGCG CGTGTTTCTT TTCGTAAGTG TTTGCGGCAG GCCTGTGCGG 14340
GGACGATGAA GTCTGGTGCT CAAGGGGTAA AAATTCGAGT TTCGGGGCGT TTGGGTGGTG 14400
CTGAGATGTC TCGCACTGAG GAGATAAAAG AGGGGCGTAC GCCTCTGnCA nCACgcTGCG 14460
CGCAGATATT GATTATGGTT TTGCCGAGGC ACATACGACT TATGGGAGTA TCGGGGTAAA 14520
GGTGTGGCTA TACTCAGGGA TGATGTACGG GAATGAGTGT CGCAAAGATG TAGGcTCTCT 14580
GTTGCGGCGA TCGCGCAGGG AGAGTGGCCA AAAGTCTGAC GAGTTGGTGC GCGACGAGCG 14640
TACGCATGCG GAGAGAGGTT GAGGTATGGC GCTTAGTCCC AAGCGGGTAA AtACCGAAAG 14700
GTACAGCGGG GGAGGgTGAA GGGGGATGCC ACTCGGTGCA ATGCGGTTGA TTTTGGTGCG 14760
TACGCGCTGG TGTGTCTTGA GCCGTTTTGG TTGACGAGCC GACAAATCGA AGCGGCTCGT 14820
GTAGCGTTAA ACCGAAGgAT TAAGCGCGGG GGTAAGTTGT GGATTCGTGT TTTTCCCGAT 14880
AAGCCATACA GCAAGAAGCG TGCAGAGACG CGTATGGGAA AAGGAAAGGG GTCGCCTGAG 14940
TATTGGGTTG CGGTAGTAAA GCCAGGTACT GTTCTGTTTG AACTAATGGG TGTAGAACGA 15000
GCGTTGGCAG AGCAGGCGAT GCTTCTGGCA GGAAGTAAAC TTCCAATCAA GACGCGGTTT 15060
GCCGAACGCG TACAGGAAAT TTAGAGGGGA GCTGTAAGAT GGGTCGGGGT GGGTGTGCGC 15120
AATTATCATA TTCTGAGCTT CTTTCGAGGC GTCGTGAGCT TGAGAGAAAA TACTTGGATC 15180
TGCGcTTTCA GCTTGTTGTT GAGCATGTTG ACAACAAGCT TATGAAAAGG ATTCTCCGTC 15240
GTCAAATTGC GGCGGTTAAT ACTTTTTTGC GACATAAAGA GTTGACTGAA CTAGAAAAGA 15300
GAGGGGTTCG GGAGTGATGG AGCAGTGTAC GGTGAAAAGG CCTGAGCGGC GCACCCTTGT 15360
CGGGCTGGTG ACCAGTGACA AGATGCACAA AACCGTTACG GTTCGGATTA CGACAAAGAA 15420
GTTGCACGCG TTGTATAAGA AGTACGTGTC GCGGAgcAAA AAGTATCAGG CTCATGATGA 15480
GGAAAATACC GCGCGGGCAG GGGATGTGGT GCGTATTGCC GAGAGTCGTC CTTTGAGTAG 15540
GCGTAAgCGC TGGCGGTTGG TAGAGATTGT TGAACGAGCG AAGTAAGGGA TTTGTGTCAT 15600
GATTCAGGTG CAGTCGCGGT TGAACGTCGC GGATAATTCT GGAGCCAGGT TGGTGCAGTG 15660
TATTAAGGTG GTGGGTGGAT CCCGTCGCCG GTACGCGAGT GTTGGGGATA TCATCGTGGT 15720
GGCAGTGAAG GATGCACTTC CCACTTCTGT GATTAAGAAA GGATCAGTAG AGAAGGCCGT 15780
CATTGTACGA GTTTCTAAGG AATATCGTCG CGTAGACGGT ACTTATATTC GATTTGACGA 15840
CAATGCCTGT GTTGTTATCG ATGCTAATGG AAATCCTAAG GGGAAGCGTA TTTTTGGTCC 15900
TGTTGCGCGG GAGCTGCGGG ATATGGATTT TACGAAAATC GTGTCTTTAG CTCCTGAGGT 15960
TTTGTGAAGG GGAAAGTGAT GGGGAAGACG GTAAAGATTC GCAAGGATGA CATGGTATTG 16020
GTGATTGCCG GCAAAGATCG GGGTAAGCGG GGTGCAGTGC TGCGTGTGCT CCGCGACGTA 16080
GATCGCGTTT TGGTGCAGGG TTTGAACATG CGCAAAAAGA CGATTCGTAG AAAGAGTGCT 16140
CAGGATGAGG GGGGTATCAT GGAGGTTGAA GCTCCTATTC ATATTTCCAA CGTTATGATT 16200
ATGGGCAAGA AGGGGCCTAC GCGCGTGGGG TATCGGATGG AAAACGGTAA GAAAGTGAGG 16260
GTATGTCGTA AAACAGGAGA GGTGCTATGA CCGATCATTC TTGCATACCT GAACTGAAAG 16320
TCCGGTATGT GCAGCAGATT GTTCCGGATA TGATGCGGGA TTTTGGTTAC TCGACGGTGA 16380
TGCAGGTTCC TAAGCTGTTG AAGATAGTGT TGAGTATGGG TCTCGGGGAA GCGCTCGCTA 16440
ATCGGAAGCT TTTGGACGCG TCAGTAGCAG ATTTGGGTGT TATTAGTGGC CAGCATGCAG 16500
TAAAGACTAG GGCGCGCAAG AGTATTGCGA ATTTTAAGCT GCGTGAAGGC AATGAGATTG 16560
GGGTGATGGT GACTCTGCGC CGTAGTAGGA TGTATGAGTT TCTCCACCGG CTCATCAATG 16620
TTGCTCTGCC TCGTGTAAAG GATTTTCGTG GGGTAAGTCC TCGTGGGTTT GATGGACATG 16680
GTAATTACTC GATGGGTATT ACGGAACAGA TTATTTTTCC TGAAATTGAC TTTGACAAAA 16740
TCGAGCGAAT TAGCGGTTTG AACGTCAATG TAGTGACATC TGCGCAGACA GATCAGGAGG 16800
CTCGTACTCT TCTTACGAAG CTCGGTATGC CTTTTAGAAA ATAAGAGAGG ATTTCATGGC 16860
GACAGTAGCA ATGATCAATA AGGCAAAAGC AACTCCGAAA nTACgctACG CGCAGGTACA 16920
ACCGCTGTGG GGTGTGTGGG CGACCCCGCG GGTACATGAG GAGATTTCAA TTGTGCCGCC 16980
TGTGTTTTAG AAAGCTGGCG AGCGAGGGTC AAATCCCTGG GGTAACGAAG TCGAGTTGGT 17040
AGGAAAGGAG GGAGAGTATG GGTGTTTCGG ATCCTGTTGC AGACATGCTC ACGAAGATAC 17100
GTAACGCGGC GCtGcGGGAC ATGAAAAAGT GGATGTAmCT TCTTCGAAgT TGAAAGTTGA 17160
GGTTGTGAAA ATACTGAAAA CGGAAGGATA TATCAGGAAC TTCAGGAAAG TAGAGGAGGA 17220
TGGTTCCGGT TGTATTCGTG TGTTTCTTAA GTATGACGAT AACGAAACGT CGGTTATTCA 17280
CGGTATCGAG CGGATTTCTA CTCCGGGCCG CCGTGTGTAC TCGGGGTACA AGACGCTTCG 17340
TCGTGTGTAT AACGGGTACG GCACTTTGAT TGTTTCTACC TCTCTAGGGG TGACCACTGG 17400
CaGGCATGCa AGGGAGCAGC GTGTGGGTGG TGAGCTGATT TGCAAAGTTT GGTaGGGGGC 17460
TGTAGTGTCA AGAATTGGTA AAGTTCCTGT GTCTGTTCCT GGCGGTGTGC ACGTGCGAGT 17520
CTCTTCTGGG GTGGTTGAGG TCGAGGGTCC AAAGGGGGTG CTTTCGTGTG CGTTTCTCCC 17580
AGTGGTTACG GTTCGTGTTG AGCAGGAATA CGTAATTGTT GCCCGGTGTG ATGATTCCAA 17640
GCGCGCGCGT GCATGTCATG GGCTGTATCG CAAGCTTTTG AGCAATATGG TAGTTGGGGT 17700
AAGCGAAGGg TTTTCTAAGA CATTGGTAAT TACGGGTATC GGGTACCGCG CTGAGGTTCA 17760
AGGGCGGGTG CTGGTGATGG CATTGGGTTA CTCCAATGAC TTTACAGTGC TCATTCCCTC 17820
TGGTATTGAG GTGCGGGTTG AGTCTTCCAC GAGGGTTATT GTTTCCGGTG TAAGTAAGGA 17880
AAGAGTGGGG GAGTTCGCAG CGCAACTTCG TAGGCTGCGG TTGCCTGAGG CGTATAAGGG 17940
TAAGGGTATT CGCTATGATT ACGAGACCAT TGTGCGTAAG GTAGGAAAGT CAGGGGTAAA 18000
GTAGAGG AC GCATGCTAAG GAAGTGCAGT GATAAACAGC GAAAGAGGAT GAAGCGTAAG 18060
GTTCATATTA GGAAGAGGGT GTATGGCACG gCGGTTCGCC CTCGGATGAC GGTGTTCCGA 18120
AGTAATCGGA ACATTTCGGT GCAGGTCATT GACGACGACG CGCGTAgCAC GCTTGCGTCA 18180
GTTTCTACTC TTGAGAAGGA TTTTGTTCTG CTTAGGGCAA ATGTTTCTTC TGGTTTGCAG 18240
ATAGGAGAAG AGATCGGCAG GCGCCTTTTA GAGAAACACA TTGACACGGT TATCTTTGAC 18300
CGAAATGGGT ACTTGTACCA CGGGGTAGTG GCGGCCGTCG CAGATGGTGC ACGTAAGGCA 18360
GGAGTTAAGT TCTAGGAGAG CGTATGGATC GTCACAGGGA TTTTGGCAAA GACAGACTTC 18420
GAGACAAAGA GTTTACCGAG AAATTAATCA AGCTGAACCG CACGGCAAAG GTAGTAAAGG 18480
GCGGACGTCG GTTTTCCTTT TCGGCACTCA CGGTAGTTGG TGATCAAAAG GGCCGCGTGG 18540
GGTTTGGTTT TGGTAAAGCC GGGGATGTGA GCGAGGCAAT TAGGAAGAGT GTTGAAAGGG 18600
CGAAGCGGAG TATGGTGCTC TTTCCGCTCA AGGATGGTAC CATCCCGCAT GAAGTACA G 18660
CTAAGTTTAA GGGCTCTCTG GTGTTACTGC GCCCTGCCTG TTCAGgTACG GGTATTATTG 18720
CTGGTGGAAC CGTGCGTGCT ATCATGGAGG TTGCAGGTGC AACCGATGTG CTGTCTAAgT 18780
CTTTGGGTTC GAATtCTGCT ATCAACgTGG TTCGTGCaAC gTTTGGGGCG GTTGCscAgT 18840
TGATGGATGC aAGAAAGTTG GCACgTGAGC GTGGGAAGGC ACTCGTGGAT ATGTGGGGGT 18900
AGGCATGACA AAGAGGGTGC GTATAACGCT GGTGAGGAGT ACGATCGGTC AGAGGGAGCC 18960
GGTGCGTCGG ACGGTTCGGT CTTTGGGTTT GAGGAAGTTG CATTCAATGG TGGAGAAAGA 19020
CGGGAGTCCT GCCGTCTTGG GGATGGTGCG AGCTGTTTCG CACCTGGTGC GGGTGGAGGA 19080
GTTAGGTTAG TGGCGGATTT CCATTTGATT GCTCCGAAGG GGgCTAATAG GGCGCGTCGT 19140
ATCGTGGGTC GTGGGTCCTC CTCTGGGCGG GGTACCACGT CTGGGCGGGG TACTAAGGGA 19200
CAGCAGGCCC GTGCGGGGCA TAAGGCTTAT GTAGGTTTTG AGGGTGGGCA GATGCCGCTA 19260
TATCGGCGTG TGCCGCGGCG GGGTTTTTCT AACTGTGCTT TCAAAAAGGA ATACGCGGTA 19320
GTTAATGTGG GCGCGCTTGA GTTTGTCTAT GCTCCAGGGG AGACGGTCAA CAGACAGACT 19380
CTCATTGAGA AGGGCTTGGT AAAGGGGCGG GTCCCCTTCA TCAAAATCTT GGCAGACGGA 19440
GAGCTGACAA AGTCTATTGT GGTGCGGGTG GACCGGGTTT CTGCTCGTGC ACAGGAGAAG 19500
ATTCAGCAGG CGGGCGGTTC AGTGGAGTGT ATTGAAGCGC AGGAACGATG AGCGGTATAT 19560
GAAACAGGGT GTTTTTGCAG CGGTGTTCCG GATAAGGGAg CTGCGTGCGC GTATCTTTTT 19620
CACGCTTAGC GTGTTGACGG TGTTTCGCTT TGGCTCGGTG CTGACAGTTC CGAGTGTGGA 19680
CCCGCGTGCG CTTTCTGCTT ATTTCCGATC TCAGGTTCGG GGAAATGCTT TTGCAGACTA 19740
CATGGATTTT TTTGTAGGCG GGGCGTTCTC GAATTTTTCA GTGTTTATGC TGGGAGTGAT 19800
GCCGTACATT TCGACGCAGA TTCTCATGCA GCTTTCGATG ATTGTTTTTC CAAGTCTTAA 19860
GAAGGTTGTA GAAGATGTAG GGGGGAGACG TCGCGTTCAG TTTTGGACAC GTGTTGCAAC 19920
GGTTTTTGTG TGTCTTATAC AGTCTTCTGC GGTAACCGTT TACGCAAATC AGATTCCCGG 19980
TGCCATTGTT ATTCAGAGCT ACGCCGTGCA TCTGTTTGTC ACCATGCTGA CGGTGACCTC 20040
AGGGAGTATG ATCACGCTTT GGCTTGGGGA ACAGATCACA GCGCGAGGCA TTGGTAACGG 20100
TGTGTCAATG ATTATTTTTT CGGGTATTGT CGCGCGTTTG CCTCATGCGC TTGCAGAGAT 20160
GTGGAGGCTG CAGCGTCTTG GCGAATTGAA TATGGTGTTT GTGATCGTTG CGTTTGTGAT 20220
GTTTGTAGGA ATTATTGTGC TGGTGGTGTA TGAGCAGCAG GGGCAACGAA AAATACCAAT 20280
tCATTATGCG CGGCGTGTGG TCGGGCGGAA AATGTACGGT GGTCAGAGCA CGTATATCCC 20340
TTTTAAAATA AAmCcTTCGG GCGTAATTCC GATTATTTTT GCCTCATCTT TTTTGACATT 20400
TCCCCTGCAG ATAGCCAGCA GTATTGGACC GAACGTGCGC TTTCTGCATC AGcTTGCGCA 20460
GTTCTTACGA CCGAACAGTT GGTGGTACAA CGCGTTCTAT GTAGTTTTGA TTGTGTTTTT 20520
TGCGTACTTC TACACGCAAG TCACCCTTAA CCCGACTGAG ATAGCAAAGC AGATTCGCGA 20580
GAACGGAGGT ACGATTCCGG GTATTCGTGC GGATAAGACG GAAGAATATC TACAAGGGAT 20640
CTTGAACCGC CTGGTACTTC CCGGTTCGTT GTATCTTGGG ATGATCGCAG TGCTGCCCAC 20700
CTTGATTCAA GCTGCGTTTG GGTTTCCGTC CTCTATTTCC TTACTGATGG GCGGTACTTC 20760
TCTGTTGATT CTGGTAGGGG TGGATCTAGA CACTATGAGT CAGATTGAGG CGCAGTTGAA 20820
AATGCGGCAG CGTGAGGGGT TGGGAGGGCG TGGCAAAGTG CTACCGCGCA TTTGTAGCGG 20880
GTACTtGCGA AAGGATGTAT ACGAGGAGTG AGTTCATGAA GATAAGGACG AGCGTaAAGG 20940
TTATTTGTGA TAAGTGTAAG CTTATTAAGC GTTTCGGTAT TATCCGGGTG ATTTGTGTGA 21000
ATCCAAAGCA CAAGCAACGT CAGGGCTAAG GGGGTAGACG AGGTATGGCG CGTATTGCGG 21060
GGGTTGATCT TCCTAATAAG CATGTCAGCG TTGCGTTAAC TTACATATAT GGTATTTCGC 21120
GTTCATCCGC CAGGACTATT TGTGAGAAGG CCCGCATCAG TTCTGCTTGT CTGATAAACG 21180
ATTTGAGTCA AGATGAGCTT GCAGTTGTCC GTGCAATTAT CGATAGAGAA TACAAAGTGG 21240
AAGGTCGTCT GAGAACTGAG GTTGCCTTAA ATATCAAGAG GTTGATGGAT ATTGGGTGTT 21300
ACCGAGGGCT AAGACATAGA AAGGGGCTGC CTGTTCGTGG GCAGCGCACG CGAACAAATG 21360
CGCGCACACG CAAGGGTAAG AGAAAAACCG TCGCTGGAAA GAAAAAGTAA GGGATCAGGA 21420
GGGCATTGTG GCGGTCACAA AGAAGCGTAA AGAAAAAAAG AATGTGTACG AGGGGAACGT 21480
GTATATCCAG GCGACTTTCA ATAACACCAT CATAACGGTT ACTGACCTGC AAGnAAATGC 21540
GCTCTCCTGG GCTTCGTCCG GGGGCCTTGG GTTTAATGGG GCAAAGAAAT CTACTCCTTT 21600
TGCAGCACAG ACGGTCGCGG AAGCTGCGGT ACAGAAAGCG CAcAGTGCgG acTGCgTGAA 21660
GTACATGTGT TTGTCAAAGG GCCGGGTATT GGGCGTGAGT CAGCAATTAG AATGCTTGGT 21720
ACCATGGGAC TGAGGGTGCG TTCGATTCGC GACATCACAC CCATTCCACA TAACGGCTGT 21780
CGTCCGCGTA AAACTCGCCG CATCTGATAA AAGGAGTGAG CATGCCTCGT AGAAATCTTT 21840
TGAAGGGTTT TAAAAGACCT AAGGTGCTGG AGTTTCTTTC GGAGAACTCA AGCGAGTGTT 21900
ATGGGAAGTT CACCGCCTCT CCTTTTGAGA CTGGTTTTGG CACCACTGTT GGTAACTGTT 21960
TGCGGCGCGT CTTACTCTCT TCTATCCAGG GGTATGCGGT CACCGGGGTT CGCATCACGT 22020
CCTTTGATGC GGACGGGGTT GCGCACTTCA TTTCAAGCGA GTTTGAACAG ATTCCCCACG 22080
TACGGGAAGA TACCCTCGAG ATTCTAAATA ATTTTAAGCG TCTGCGTTTT CTCCTGCCGC 22140
AGGGGcAGAG TCTAGTACGT TCACGTATGA GTTTCGCGGC GCGgTGTCTT TGACGGGGAA 22200
GGACTTTGCT AAGAAGTTTC AACTCGAGGT TCTGTCTCAA GACCTGCTCA TCATGGAAAT 22260
GATGGACGGT GCGCATGTTG AAGTAGAGCT ACACGTCGAA TTCGGGCGTG GGTATGTACC 22320
TGCTGAATCG CACGATCGGT ATGCCGATTT AGTTGGGGTT ATCCCTGTTG ACGCAATTTT 22380
TAGTCCCGTG TTGAGAGTCC GCTATGATAT TCAGTCTTGC CGTGTAGGTC AGCGGGGGGA 22440
TTACGATCAG TTATCCCTTG AAGTGTGGAC AGATGGTACG GTGCGTCCCG AAGACGCGAT 22500
AgcCGAGGCA GCGAAAATTA TCAAGGAGCA CTTTACAGTT TTTGTTAATT TTGACGAGAC 22560
CGCGCTCGAC CTGGAGGACG AGCCAGAAGA GGATGACCCT GCCGTTCTGG AGCTGTTGAA 22620
CACGAAAATC GCTGATGTAG ATTTTTCAGT GCGCGCGCGT AACTGCCTTT TAACTATGGG 22680
AATCAAGACG CTGGGGGAGT TGACAAGGAT TTCTGAGCAG ACACTTGCGA ATACGCGTAA 22740
TGTGGGTAAG AAAAGTTTAA GTGAGATACA GGgCAAGTTG CAGGAATATA ACTTGCGTCT 22800
GGGTATGGCT GACTACAACC ATGTGGGGGT TGTTAGTAGA CTGATGCGAC AGAAGGAAGA 22860
AATAGATGAG GCATAGGACC GGTTTCAACC CGcTTTCGTG tATGGCTGCG CATAGGCGTG 22920
CGCTCCGTCG CAATATGGTT ACTTCTCTTT TTAAGTTTGA GCGGATCACC ACGACGAAGC 22980
CGAAAGCTGC CGAGGTGCGG CGCGCGGCAG AGAGGTTAAT TACGCGTTCT AAGTCTGACT 23040
CTGTGCATAA CCGGCGCCAG GTGGCCCGTT TTATTTGGGA TAAGGCTGTG TTGCACAAGT 23100
TATTTGCGGA TATCGGACCT CGCATGCGGG AACGTGAGGG GGGGTATACG CGCATATTGA 23160
AGTTGGGCCT CAGGCAGGGG GATGCGGCAC ATGTGGTTGT GTTGGAATTG GTTGACTATA 23220
CCTTTGAAAA AAGCCTCAAA AAACGCGCGC GTACTGATAG TGTGCCTGCA AGAAAAGGAG 23280
CTGGGAAGAA GGaTGcTTCG CGCGTCAGTG GGACGGTTCC AGACGGTCAG TCTCAAAAAA 23340
TAGGAAAGAA GAAAGAATAG CAGTTGGGCA ATGGAGGGGT GGTATGTCGA AGGCTCATCG 23400
TGGAAAGGGG ATCCGGGGTA TGGTCGGTCG TGGCCGTGGC GTGTGTCCGG TGACTGGGCA 23460
GACGGGGGTA AAGCTCCTGT ATGAGTGCGA GATTGATGGT AAGAAGGTCA AGGTTTCCAA 23520
GGTTGGGCGC GCGACTCTCC AGAATAGGAA GAGACGTTTG GATGCGCAgC CTGGAGCTTG 23580
ATCGCGCATC CTCGTGATAT GAGGTTCCGT CCCAAGGACG TTAGGTGGTT GTCCGTTTCT 23640
GTGCTTGGCA GTTACCATTG GGATGCAGGT CGCATCGTGG TCGGTGTAGT CAGACGGTAA 23700
ATAGGTGTTT TCTTGACCGA GGGCGGCGTC TCTCGTTACT TTTACGGCAT TACCGCGAgG 23760
GTGTTATGGC AAAAAAGGAG AAGAAAGTGT GCGGCGGCGA CGTTCAGGGG CAGGGAGTTG 23820
CCTCAGGTTG TGACGAGGCC TTGGAGCGGG CAGATAGCCT TCGCGCGTCT GATCCTGTAC 23880
CGGTTGAATC GGGGGAGGGT TCTGTTCCTG GGGAGCATAG TCaGGAGTTG GAGACAGGTG 23940
CCTCTGAAGA GACCCTGCGC GArCGCGTGA ATGTTTTGCA GGAsCAGTAC CtGCGCAAGG 24000
CTGCCGACCT CGAAAACTAC CGGAAGCGTG CGTTGCGGGA AAGGCAGGAG gCGGTGGAAC 24060
AnCGTACGCG GCGCTGCTTG CCGACATCGT CGCTGTCTTG GATGACTTTG ACCGTGCTAT 24120
TGAAGCGGCG GATCACGCGT CGAGTACAGA GGTGGAGGCT TCATCTGCCT TCCGAGAGGG 24180
TGTTCTTATG ATCCGCAAGC AGCTCTCCTC AGTGCTTGAG ACAAAGTATG GTCTTGAGTA 24240
TTACCCGGTG CTCGGGGAGC GCTTCGATCC AAATCTCCAT GAGGCTTTGA GTATGAGTCC 24300
TTCCGCTTCT GTGCATGAGA AGATAGTAGG GGCAGAGCTA CAAAAAGGAT ATAGGGTTAG 24360
GAACCGTATC CTCCGGCATG CCAAGGTTAT GGTGCTCACT CCTGAAGAGC AGACAGAGCC 24420
CGATCGTGGG GATGGcCCTT CGGAGTGACA GGCAGGGTAT GCTGAGAGGT CAGGATGGAG 24480
TTCTGGAGCA CCGGTGCTAG GTAACGGCTA TACTGCGCgC CCTGCAGGCA GGGCGGGTAT 24540
CCTATACAGA GGAGTTGAGG GTTATGGGGA AGATTATTGG CATTGACTTG GGAACGACAA 24600
ATTCATGTGT TGCGATCATG GAGGGGGGGG AGCCCGTTGT CATTCAAAAT GCCGAAGGGG 24660
GAAGGACTAC GCCCTCCATT AyCGGTTTCA CCTCTGATGG TGGACGCGTC GTCGGTCAGC 24720
CAGCAAAAAA CCAAATGGTT ACTAATCCGG AACATACTAT CTATTCGATA AAGCGCTTTA 24780
TCGGCAGTCG TTTCAATGAA CTGACCGGTG AAGCAAAAAA GGTGCCCTAC AAAATTGTTC 24840
CACAGGGAGA CGACGTGCGC GTTGAGGTGG AGGGTAAGCT TTACTCTACG CAGGAGATCT 24900
CCGCGTTCAT TTTGCAAAAA ATGAAGAAGA CAGCTGAGGA TTATTTGGGC GAGGCAGTCA 24960
CAGAGGCAGT CATTACCGTT CCGGCTTACT TTAACGATGC ACAGCGTCAG GCAACCAAGG 25020
ATGCGGGGAA GATAGCAGGG CTCGATGTGA AGCGTATTAT TAATGAGCCG ACTGCTGCGT 25080
CGCTTGCCTT TGGTTTTAAC AAAGACTCTA AGAGAGAGAA GATTATTGCT GTGTATGATC 25140
TTGGGGGGGG TACCTTTGAC ATATCCATCT TGGAACTCGG TGACGGTGTT TTTGAAGTCA 25200
AGTCAACGAA TGGGGACACT CACCTGGGGG GCGATGACTT TGATGCACGT ATCGTGCAAT 25260
GGCTGGAGCA GGGCTTCAAG AGTGACACGG GTATCGACTT GGGCAACGAC CGCATGGCGT 25320
TGCAGCGGCT GAGAGAAGCG GCGGAGAAAG CAAAGATAGC GCTTTCTTCC TCTGCGAGTA 25380
CCGAGATTAA TTTGCCCTTC ATTACTGCAG ATGCCAATGG GcCAAAGCAT CTCCAGAGGA 25440
CTCTCTCTCG ATCTGAGTTT GAGAAGATGA CTGATGATCT TTTTGAGCGG ACCAAAGAGC 25500
CTTGCCGCAA GGnGCTCAAA GACGCCGGAA TTAGTGCGGA CAGGATCGAT GAGATTCTCT 25560
TAGTTGGTGG TTCCACGCGC ATGCCCAAAG TAGCGCACGT GATCAAAGAT GTCTTTGGGA 25620
AAGAAGGATC GAAGGGAGTC AATCCTGACG AGGCTGTCGC AATTGGCGCT GCAATTCAAG 25680
GAGGTATCCT CGGGGGGGAC GTGAAGGATG TACTTCTCTT AGACGTTACG CCTCTTTCTC 25740
TAGGAATTGA AACAATGGGC GGGGTGTTCA CTCCGCTTAT CAGTCGTAAT ACCACCATCC 25800
CCACGCGCAA GAGTCAGGTG TTTTCCACCG CAGCTGATGG GCAGACGGCA GTTTCCATTC 25860
ACGTGCTGCa GGGGGAGCGT GGCATGGCGA ACCAAAACCG GACGCTCGGT AATTTTGATC 25920
TAGTAGGAAT TCCCCCTGCT CCGCGGGGAG TGCCGCAAAT TGAAGTGACG TTTGACATTG 25980
ATGCGAATGG TATCGTGCAC GTTTCTGCCA AAGACCTAGG GACGGGAAAA GAGCAGCACA 26040
TCCGCATTGA AAGTTCGAGT GGTCTGAGCG AAAGTGAAAT CGACCGCATG GTAAAGGAAG 26100
CCGAAGCGAA TGCAGAAAGT GATAAGCGTG AGCgGGAGAA AATcGAAGCA CGTAACGTGG 26160
CTGACTCCCT AATCTATCaG ACGGAAAAGA CGCTCAAGGA GGCGGGAGAC GGGGTGAACG 26220
CTGCGGACCG CGCGCGCATA GACGAGGCGA TCGCAGAGTT GAAGACGGTG CTCTCcAGGc 26280
GACGACGTCG CATCGATCAA AGCGAAGACT GAGATCTTGC AGCAAGCTTC CTACAAAATT 26340
GCGGAGGAAA TGT TAAACG TCAAGCAGCA GCGGGTGCCG CTGCAGGTAA GAAGAGTGAT 26400
GCACCCTCTG GCAATGAGGC AGAAGGTGGT GACGTTGATT ACGAGGTAGT GAAGGACGAA 26460
GATTCAAAGT AGGCATCTGG TGTTGCGGGG AGGGAATAGC CTGCGTGTAG GAGCTGTGTG 26520
ATCTGACTTC CCCCAGGCCT TTTGTGATCC GGGTGTTCGC CTGATCGCCC GGGTCTTTCG 26580
GCTGTCTAGT GGGTGTTTGG ATGTAGCCTG CGTAGGCGGT GCTTCAGGCG TCCTGCTTTT 26640
GTGCCGGTTT CGCGTGCACA CCCTGTTTTT CTGTGTGTGC GCGCAAATGT AGACAAAGAT 26700
TCTCTAGACG GGGTGATCGT GGCAAAGAAG GATTATTACG AGGTTCTCGG TATCTCAAAG 26760
ACCGCGAGTG GAGAAGAAAT CAAAAAGGCG TACCGGCGGC TGGCTATTCA GTTTCATCCT 26820
GACCGTAATC AGGGAAATAA AGAGGCGGAG GAACGCTTCA AGGAGGCTAC CGAAGCCTAT 26880
GAGGTGCTCA TTGATGCACA GAAGCGTGCC GCGTACGATC GGTATGGCTT TGATGGCCTG 26940
AAGGATATGC ACGGTGCGCA TGGCTTTAAC TCTTCGGCcT TTCAGGGGTT CGAAGA ATT 27000
TTTGGGGGTG GCTTTTCTGA TATCTTTGAA AATATTTTTG GGAcTTCGTC TCGCCGCGGC 27060
GGTTCAGGGA ACGACGGCTC GGGTGGCTCC GGGCGTGGGG CAAACTTGCG TTATGATTTG 27120
CAAATCTCTT TTGAAGAAGC AGTGTACGGG AAAAAGAGTG AGCTGCACTA TGTGCGCGAC 27180
GAAACGTGTA TTACCTGCAA GGtGCCGGCT CGGCCAGCGG TGGGCGTAAG ATGTGTCCAG 27240
ATTGCAAGGG TACGGGGCAG ATTCGGCGTA GTACAGGTTT TTTCTCTATT GCGCAAAGTT 27300
GTGCGCGCTG TGGTGGTGAG GGGACGATTA TCGAAAGTCC CTGTGCACGG TGTGCGGGTA 27360
GTGGCATTGA GCGTAAAAAG CAAAAAATTA TCGTCAGTAT TCCGGCAGgT GTAGAAGAAG 27420
GGCGGCGCAT TACTATTCCC CGTCAGGnAA ACGCCGGTCG CGCAGGCGGT GCCTACGGGG 27480
ACCTGTACGT GTTTGTGTTT GTTCGTGCGC ATGAGTATTT CGAACGTGAA GGTGCTGACC 27540
TGTACTGTGC AACTTCGATA TCGGTAACCC AAGCGATTTT GGGCGCGCAG GTGACGGTGC 27600
GGGCATTAGA TGGATCTGCG CACAGnTGCG GGTTCCGGCC GGCACGCAGG GAGGTGCGCT 27660
TTTGCGTGTT AAGGGTATGG GGGTCCCACT GGCGCGCGGG GCGGGGGATT TGTACGTAAA 27720
GGTATTGGTG CGTATTCCAA CTACGCTTTC TGCACGGTCG CGTGCGCTCT TAGCGGAGAT 27780
TTCTCAAGAG GAAGGGGAAA ACGCCCATCC GCCGTTGCTT GAACTTTCAA GTCTCAAGTA 27840
GGCTACAGAA AGGGGCGCGT GGGGTAAAAG GATTATTCTC GCGTGCGTGG TGTTTCTTTC 27900
TCGTGTGTCG CAGGATGAGT TGGcTTCATC GTGAtGGGTG CGTGTGCTAT CTGAGTTTTC 27960
TTCCCACAGT TTAAAAGACA ACGTGTTTTT GAAGCAGCCA TACAAAGGGA ACGGTAGGTG 28020
ATTCGCAGAA GGCTCGCAAT TGTAAAGGCA GGTTCATTCG CACTCCTGGC GCTTTTTTTT 28080
TCAATATTTT TGCGCTTTCT CAGTCCGCGG TATTCGTTTC TCGGTCGTTT CGTTTCTGCG 28140
CGCGATATGG CGCTGTTGAT TTCTCGGTAT GAGyATTTGC CTGAGCTTTC TTCGCGTGAT 28200
CGAGCCTTGC TGGTAGGTTT CGTTTTCATG ATTTTTnGGT TGCGCTTACA GAAATCCAAC 28260
GCTATGCGCA CGGGCGCATC CCGTCTTGTT GTCTA 28295
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5199 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AACTTTGTGG TGATTAAGGG GTTGGAGCGA TATCAACGCT GGGATCTTGC GCGGGAGTGT 60
TCTATCCGTC ATCTCTATTA TGTGTTGGAT GcTTTGCAAT TGAACGATCA AACAAAGCGT 120
GGGGTTCTGT GGGAAGCGTA TCTGCCTACG CGTGAAGGTC CTGCACAATG GCCAGGGAAA 180
GAAGGATTTC CGCGCAGGCA ATATCTTGCG TACGCTGcGC TTTCTACTAT CACGCTTATG 240
ATAGAAAACG TTATCGGTCT TTCCATCAGT TTGCCGCGCA AAACAGTGCA CTGGATTATC 300
CCTAACCTGG AGGTGAtGGG CATTGAGAAT TTGAGCTtGA AACGGAATCT CATTACGATT 360
CTCTCTTCAA AAAGTGTGCG GGGGTGGGAA GTCTATATGG AAAGCGAGAA ACTTTACTAT 420
TTTACCCTCA ACATCCTTGG ACAGAAAAAG AAGACGCTCC CAATCCCCTC GGGGAAATGC 480
TCAATGCTCG TCGATAAGTT ATAGTGCGAT AAGAAATGTT TTACGGCGCG TGGGTGCTGC 540
GCGACGTAcT GCGTTTTCTC CAGTGGCGGA GAAAGTTCTG CTAGCCCTTA GTCCAGAGAA 600
GATGGGATGC GGCTGAGGAG CTTAAGAAAG AAATCAAGTT CTGCACGTAC ACGGCAGAGG 660
AAATTATATT CTAAATATTC TGCGCGCTCT CCATCGGGAA AATGCAAGTA GAGGTAGTCG 720
CTTCTCCGTA TGAGCGCGCG GATAACGTCT TGGTTAGAAG GTGTAGCCGT GGTGAGCATG 780
TGCTTGAGCT GGAGcAACgC TGCGCGCTCT GTGGTCAATA GTTCTTTTAC GCGTAGACAA 840
AAGTCCTCTG CAGGTTTTTG AGAAAACGCA AAGGATATCT GTGCGTCTTG TGTGTGCAAA 900
ATGGGTACAT GCGTTCTATG TCTGCGCGTA CAGCAGTGGG TGATGATCTC ACACGCCTGT 960
GCGTGCGAGA CTGTGTGTGA TGTGGTAAGA GCAAGACCTT CTTGGTCTAG ATCGTAAAGG 1020
TGCGGGTGCT CAGCGCAGAG GGTACGATAA GCGTGTGCAT AGTGCACAAG GTTGGGGATA 1080
CTTTGTACCG CGTATCTTTT CCCCTTAAGG GAGCGTATCC CTTCTGGGAA CAGCGCACAG 1140
TCCGGATATA GGCTATGGAG GCGCGTTGTG CTGTGGTAC GTCTTTGTAC TGGAGCACTT 1200
CCCCTCGCAT GGGTAAaTCC CTTGTTCCAT GAAAAATTTA AACCGGTGTG AAAAATAGGA 1260
ACTCCCTTTT TTCTCAATCG GAGTGCAAGT TGCAACGCGG cGAGTCCTAC TGATCCACTA 1320
GGATGAATGG AAAGCGGCAC GACGCCCgCA TTCTGCGCaC GTTTGATAAA TGCgGCATGA 1380
GTGTACGGGG TGAAAAAGAA GTGTGTAGGC ACGTTCGCTG CACGCACCGC ACGGGGAAAT 1440
GCGCTTAGAT CCGCAAAGAG CGCAACGGTG CGCGGgAGTG TACCTATGAA TGCTTGTTCA 1500
ATCCAGAACT GTGATTCTAA TAAGACAACT GCATCTGGAA CAATGTCAGG TAACAGTGCG 1560
TTGCAcGCCA CATCTACGGC AAGTAAAAAG ATGGTGTCCT TCATGCGGGC ACAAAAAGAA 1620
CGGCAGGCAT CCAGCGCAGG ACCGGCACCT ACGATAAGGA GTGGTTTGTC TATACTCTGG 1680
GGAACAAGGT GGTGTATGTG CGAGGGATTT TGTAATTGCG TAATATAATT TGAGAATATA 1740
TTGCGCGCAT AATTCCTTCC TGAATGGATG AGTGTAATCT TATTGATCCA AAAGGTGTCG 1800
ATGAGAGTGC GTATGTTTTG CTCGCTTTCG TCATAAAAAT TCCGGTACTG CGCATATGCG 1860
CCGGaTCCCG CAATTTTCAG TATCTGTTTG AAAGGGAAGC GGGTGAGACG CTCCACGGTA 1920
TGCAGCACTT GGGTGATGTG TGTGGTGTAT AACACGTACA CATTCTGTGC GGTGATAAGC 1980
TGGCGCGGAG cgTGcTGCAT AAAAAGGTGC ATAAGCTGCA GGTCACATTC AAGACAGAGG 2040
AGAAATGAGG AAGGAGGCAT ACGAGTAAGA AGCGCACATA GGCCGTGGCC AAGCACTGGC 2100
GCGCAACAAA GTACAAGGGT ATGCGGCTTG ACCgCTAAGm GcGCCACGGC ncgCTCaTGC 2160
GCATCCTGTG CGCGATACTT TGAGTAAAGG TAGGTGTTGC GGTAAAGAAC GGTGAAGCCg 2220
TTTTGTGTCT TGATAAGACG CGGCGGGAGC GAGGGAACGT CGCCACGCAC GCCAGCGACG 2280
TCAGATGTAC CCATAGAAGG GAAACGCACT CACAGCCGCG CCACGCACAG GgCTAGCGCC 2340
GAAAGATATT GTCAAATGTA TCCTTAAATG AGGGATGAGC CAGGATAGTA TCTACGACCG 2400
ATTCCTGTGC CCATGGCATC TTTGCCGTGT TTAGCAGCGC AAGCGGGTAG CCCGGGTACC 2460
AAGTGCGCAC ATGGTGTGCA TAGCTGTCTC CGGTAGTAGC GAGCACTGAT TCGTCCACTG 2520
ACTTTTCCTC GTGTAACTGG AGCAGTACTA CGGAACTATC AAGAATAAGT GGGTCAGACA 2580
CCTGCCCGGG GAGAAGGGCG AACGCGGTGG AAAAAAATTT CTCGTCATAT GCAATCCGCG 2640
CTAACGGGGG GTCGGACTGG CGCGGAAGGG CAGGGAGCAC GTCGACATTT CCGAAATTAA 2700
TGGGAAAGGA ACGACTCGTA TGCACGTCTA AGTTAAGGCT CTGTGCGGCG GCGGTGAATC 2760
CACCCTGTTT CGCCCGGGTG GAAAAGGTAT GTGCTTCCTC CTCAAGGAAA CGTTCGATGG 2820
TGCCCCGCTC CACACGAGCC ATGTGTGCGA ACACGCGCTC CCGCGTTGCT GCGTCACTAA 2880
AGTCCGGAGC GCTAGGTTCA GCGTCGGTGC GCACGATGGC AAAACCGCGC TCTATCTTCA 2940
CTACAGGACT CAAGGCTCCC ACCGCCgTGC GgAGCACCGT GTCCAAGTCC TGCGCGTCGG 3000
GAAAGAATTC GTTCACATCA CTCCGATAGG AgTGGGTCAT TTTTCCGGTA GCATCGGTAC 3060
CAACTTTGGT GGAGCCAGTA GCGACAGCGT CTTCAAAAGA CAATTCCCGT TTCTCTAGAG 3120
CGCGTGCCGT GCGGCGCGCG TcCTCTTCCG AGGAGTAGGT GAGCAAGGAA AGGTGATGCA 3180
GGGTAAAAAG GTGGGCATGC TCTTTCCCGT ACgCGCTGAC GCkTTCAGCG GGAAACCGCT 3240
CTTCGCCCAA AACGACGTAA CGGAAGCTAC GTTCCTTCTT CGCCATGTCC TGAACAAAGC 3300
GCAGTTCTCG GCTATTGAGC TTAAGGCCCC CGCGTCCTGT CTCTTTTCCA AAGAGGTGGT 3360
AAAGGTACTG ATCGGAAAGG AGCGAGTCGC GCATCTTTTT GCGCTGAGAA AGCCGGACAT 3420
GTTCAGGGGT GCCCTGATAG CGCTGCGGCG AGTAAGTACC GTCAGCGTCA gctAaAAAGG 3480
AGAGCACCTC CCGATCTAGC AGCTCCTCGC TGAGGGTAAA GCCGCTTTGC TTTGTTTGCT 3540
CGGTTCCCGC AAGCTGGACA ACCGCGGCGC GAAAGGCAGC ACGTAAGACA CGACGATCCA 3600
TCCCCTCGCG TTCCTGAGCG TCCTTGGGGT ACAAGTTATA GCGcTCCGCA GTCTGTGCAA 3660
GCGCAGAGTA CTGCTGGGAA AAAAGACTAT CAGGGGCATT GGTGAGCGCG ACCCCTCCCC 3720
AGGAACCGAG TCGTACGTGG CCGTGCCCTC CCCCGCTGAG GGCAGGGAGA AACACGAACA 3780
TGAGCGCCGC CACCGCCAAC ACCACGCAGC CGCCCGTCGA GGCAAGCGCC CCCCTGCCGA 3840
AGTAGAACGA TTTCATGGAc TGCGCAACTG TGGCACAGCG GAACTGCCCC TGTCAACACC 3900
CGCGCAGaGT GcCTCGGCTC AGATCCAGTA ATCCGTATCA CACGAGGATC AAGCAACTGC 3960
GTCGGTGTTC TGACGTACCG CGGTGATAAA GCCTGTGCGC CTGCAGAGAT CGAAAAGGGA 4020
TTGGTGAATG TGCACTTCGT GCAttTTTTG TACCCGTCTC GGTCACGGCA AATGCGGATG 4080
ATATACCCAT CCTCTGTTTG ACTTACGATG GAGCAATGTG AAGTCTCTCC CGGATCGCTT 4140
TTGATGCTGT ACTGCTGCAT AAATCCCCCG TCCCGATCTC GGCAGACTGT GGTATTTCGG 4200
TAAGAAATTT TATGCTTTTG TGTGTTCGCT TGCACTGTAT TCCGGGAAAA TGAGAACGCG 4260
CTGTCTTGCC TCCCTGATAC CGGATATAAT GTTTCCCTCG CGTGGCTTGG ACGCGTGGAG 4320
GAGGATGTAG GGTATGTCAA TCGTGCTGCA GGGAGTTGCC GCAGtTCTGT GCTTTTTTCC 4380
CTTCTGTGCC CGGTACAGCA ACTGAGCGTT CCTGCGCTGC TTGCTGCTCT TGCAGGAGCG 4440
GCATTCTCGT GTGTGCTGTG CGTCGCAACG TATGTGCTAA CGGTGCGCCA GCGTGCTGGC 4500
GCTTTTGGCG TCGTGCGTAA ATGCATTGAA TACACACCCT TTGTGCTGAT GGCGTGCTTT 4560
GTCCTCTCCC GCGCGTATGC GCcTACGGTG GCGATGCCGT GGTTAGATTC CCTTTTGGGG 4620
ATGAGTTGGA TGGTGCTGAC TCTGTGTGTC TGTGCGCTGT TGTTTTGCCT GAGGAGGAAG 4680
TACGTACATC TCTTTTTTCC TCGTGGGGTT TCGGTGCACA CGCCCCCTGC GTCTTCGGAC 4740
GTGCGGAGTG TGTTGCCGGA TATGCCAGTG AGAAGGAGGC GAGGAATCTT TGTCGTACTC 4800
GAATGGGTTG ACGCGCTCAC CCAGGCTGCG TGTTTCATGC TTTTGGTGAA TTTGTTCGCG 4860
TTCCAGTTGT ACGTTATCCC GAGCGAATCG ATGGTCCCCA GCTTtATGGT CGGCGATAGA 4920
CTCCTCGTGT TCAAGACCGC CTCAGGgCCT GTATTCCCGC TTTCTTCGTT TCGTTtGCCA 4980
CGCTGGCGTA CCTACAAGCG CGGAGACATC GTCGTTTTTT cCAATCCTCA TTAcCCTGAC 5040
ACTCCGCCCC CCTGACGAGC ATCACAAAAA TCGACGCTCA AGTCAGAGGT GGCGAAACCC 5100
GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC TCCCTCGTGC GCTCTCCTGT 5160
TCCGACCCTG CCGCTTACCG GTTACCTTGT CCTGCCTTT 5199
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12838 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
TCACCCTCTC AAATATCATT CCGCGCGCAC CACATACCCG CAGCACACAC AACTCAACCA 60
CTCTACCCAT AACCTATACC CCTTGTCAAC CCCCACCACC CGCATAAAAT TCTTTAGAAC 120
TCGCCTTTGT ACCCGCACCA CCCCTATTCA CATACAAACG CTGCCCCGGC AAAATACTCC 180
CAGGAGGAAT CGTGATACAT ACTCACACGC TCTCGCTGAG CTTCATGCTG TTTTCATTCT 240
TCTTCGGTGC AGGAAACCTC ATCCTTCCCC CCTTACTGGG AAAACACGCA GGTACGACAC 300
TCGCCACGGC GTTGCTCGGC TTTGCCACTT CCGCAGTCCT CATACCAATC GCAGGGCTCA 360
TTACTATCGC ACACGCAGgC GGTATTGTCC CTTTGTCAGA AAGGGTAGGA AAACGCTTCG 420
CTCACTTGTA TCCGGCTATT ACTCTCCTTG TCATCGGACC GGCGCTTTCT ATCCCACGGG 480
CAGGAATCGT CCCCTTTGCG CTCGCCATCG CTCCCCTCAT CCATCGGGCG' AATACCACAC 540
TACTTGCGCA StTATATATA CAACATGCTT CTTCATTGTT TCCTACTGGC TCTGCATGCG 600
CCCACACACC TTAAGCAACA CTCTCGGCAA AGTACTTACC CCCGCGCTCC TAGTACTCGT 660
TCTCCTCCTC TTCCTTGCCT CCTTCACTCC GACACTCGGT CCCTACCTCC CTGCACAGGG 720
CGCTTACGCT ACCCACATAC CCTTCAGCCA GGGATTCTTA GACGGTTACC TCACCACGGA 780
TGCACTCGCC TCCCTTATGT TCGGCAATAT GATCCTTACC TATTTGCATC GGACCCGCTA 840
CACAACCGCC CCTTTCCCTC CCACTCCAGC AAACACCCCC GCAGATATGC GCACCGTCGC 900
CTGGATAGCA GGGGTCATGC TCTTTTTTAC CTATGGAGTA CTGGCGCATC TCGGCGCACT 960
CAGCGCCCGC CAACTCCCCC ATACCGTTAA CGGCGCGCAC ATACTCGCGT CGGTGTCACG 1020
CCACCTTTTC GGAAAAGCAG GCATCGCACT ACTAGGACTG ATCTTTACAA TTGCCTGCCT 1080
AACTACCTGC GCCGGACTGC TTGTTTGCGT CAsGAATTAC TTCCACAAAC GCGCACCCCG 1140
TGTGTCTTAC CTGTGCTGGA TACGCCTGTT CACCATATCC AGCTTTGCGC TCGCAAATAC 1200
AGGACTAGAA CGTATACTGG gCATACGGAA CACCCCTACT CATGATCCTA TACCCAATCT 1260
CGCTGGTCCT CATTGGCATA TCACACCTCG AGCGACTCAT ACGGATACCA CGCGCCGCCT 1320
ACCGCCTGAC AGTATGGAGC GCAGGAACAC TCAGCACCTG TGCAGTCGGT ACGCCGCTTG 1380
TGGCGCACAC CCGGATAGGA CACGTGTTGA ATACACTcAT ACATACCCTT CCACTCGCAC 1440
AGGAACAGCT CTGCTGGcTT ATCCCCAGCG CGGCAGTTCT TATACTTAGT ACTGCGCATG 1500
CACGCTTACG TGAAAAAACA TGCACGCCTC GCGGTACGCT ACCCcTCACG GATAACTGAC 1560
CACTGGATCT CACCATCTTG TGGAGATGGG GGGAATCGAA CCCCCGTCCT AAAGAGCGAG 1620
TGcTGCGCGC CTACAGGTTT AGCGGTGCGT AcTGCGTTTG TCGGACTCTG CTAGGCCTGC 1680
ACCGCACGCG CCAGAGTCTT AGCACAGACA AAAGTCCCCC TACGTCCGCC GTGCACAACG 1740
TAAGAGCAAG CTCCGTTTGG CGTCGGGCCG ATATGTTCGC TCAGAGCAGC GCAAACACAG 1800
GCcCGCGATT ACGCGGCGAG CGCGTAGTCG AAACTGTCAG AATTGGCAGT TATAAAGGCG 1860
CCGAATCAGG AGATCGACAC TCCACCTGCA GCGCAACACC CCACATCCCT AGTCGAAACC 1920
TAGTCATCCC CCACAGACTG ACGTCTGCCC TTTTCTCTAC ATTCCCCCTC CCTCACCCTG 1980
TGCACCCAAC CTAGGAGGCA CGCTCTTCCA TAATCGCCAC CCCATTGCTG GTACCAATCC 2040
GTGCACAGCC AAGCTCGATA AAACGCTGCG CCTGCGCGCG CGTACGGATG CCGCCTGAGG 2100
CTTTTATCTT TGTCTCACCT TTCAGATATT TTTTAAAGCA CTGAATATCC CCTTCCGTTC 2160
CCCCGCGCGA CGcgTAnCCG GTGGATGTTT TGATAAAATC CGCGTGTCCT GCCTCCACAC 2220
AGGAACACGC AAACGCGATG TGCATCTCAT CTAAGAGGGC GGTTTCCACG ATCACTTTTA 2280
CAATGGCCCC GCGCGCGTGA CACCGCGCTG CAACCTGCGC AATTTCGCGT TCTACAACCT 2340
CTCTTTCTCC CGCGCACACC TTGTCTATAC GGACGACCAT ATCCAACTCT TGCGCCCCAT 2400
CGTCCAACGC ACGctGCGCC TCAGCGCACT TGACCTCCGT GACGTGCGTA CCAAAGGGAA 2460
AACCGATCAC GCTGCACACC CGCACCGCCG TCCCCCGCAC CGCACCTGCT GctAACGCCA 2520
CATGGCAAGG ATTTACACAT ACCGACGCGA AGCGATAGTG TGCGCCTCTT GGCACAGACG 2580
CAACACTTCG GCCTCAGACG CAGAGGGCCT TAAGAGCGTG TGGTCAATAT ATGCATTGAG 2640
TTCCATGACA CTATCCTCCC TGGGACGGAA TGTCAGCCGT ACCTAGATGG GGAAGCGGAC 2700
GCGCCCACGC CTcCGCGGAC AATTGGCCGA TCCGCTCCTC CCCTGCTCCG TCATATGCTG 2760
CAAGCGCAAA AAAATACAAA ACGCCGTTCT GCAGTCCCCG CACCGTATAT GACAAACGCT 2820
TACCCACCCG AATAGGAGAT CCCGCCACAA AATACATCCC TGACGTGTCG CCCACATACA 2880
CCACATACCC CTCTACGTCA AAGTCAACCG AAGgTGTgcC ACGTGAGTAT CACCGACCCG 2940
TCAGCCGCCT GCGCAAAAAG ACGCCCCGGG GGCAAAGGAC GCTCATCTTG CTCATAATCA 3000
ACGGTTACTG CATGCACCAC CGGTGTCTTA CGACCTGCAC CATCTGGATA CAATTGCACA 3060
GCCACCTGGA AGTATCGTCC TTCCAATCCG CTGAGCGGCT GGCCTGCCAC CACCGGCTGC 3120
CACAACGGGT ACTCCAACGT CCAGTCCTCT TTCGTTTGTC CTACTCGCAC GAAGAACGCC 3180
ACATCTGCCT GCTCTGGAAT ATCCACATCT GCATTCACAC GCCGGACCAC CGCCTGCAGT 3240
CCTCCTGCAT CCGTGATCTC CGACTCAAAG CGGCCACCTG CCTGGTCAAA ACGCGCAAGA 3300
CGGTTAAAAC GGCGCAGcAC CTCTCCTGCT TCAGACGGCG GAACAAATTC TTCAGTAATC 3360
ACCACCTCAT CGATCAGACC AGAGTAGCGC TCCCCGATAT GTACTGCGGC TGCAGCGCCT 3420
AAACGCGCAT GCCACACCTG GCCAGTTTCA TCCTGCGAAT CCGTGAGtAC TCAAGACATT 3480
CTGTACGCCC ATTCATGCGA TACTCAAGCA CACCGCGCGT TTCGTCGTAC GTGAGCATAT 3540
GGTGGCTCCA cCGCTCTGGC AGCACGTGCG TACGCGaAcG GaGaCGCAAA GAGACTGCCT 3600
GTCCGCGCAC GTCATTCCAC AAmCCCTCCG CGCGCCAyTC AAGCCGGTGC TGTAAAATGT 3660
GCGCCACAAT ATGCTGATAA AAAGAACGTC CGCGATCAGA AAGAGAAGAA CGCCACCGGa 3720
ACAACACCGC CCCGTTCTCA CTCACCGCAG gATaCAGCCA AAACTCAATG GAGAACGAAG 3780
ACAAGGCCTG cGACCCATAA AAAAGAGCGC CCGGATTTGG CTGTAGCACT ACCCCTTCTG 3840
CACCATCTCC CCCCGCGGCC CCCCCATATG CAGATGGCAC GGAGCGGTGC ATCGTGCGAA 3900
ACAACGcCGC CCCCsCTCCG CGATGCGCAC GCTCTTCGCC CACATGtGCG CAGAGGAGGA 3960
CTGCACACGA TAACGACCGT ACAAATCGCT GACCAACGGA TcATCAAAAC TAAGATACAA 4020
GTCACCCCGA ACACTGCGGG TACGcGCAgC AGaAGAAAGT TCAAGCGCCG GATGGCCCGC 4080
TCGCCCCACA CGCGTACGCA GGTTTTTTAC CCGTGTAAGC GACTGCCACC CcTGCGCACC 4140
CCCGAGCAgC AGTGCGCTTC CTTTGCATAC AGCACACCCG CaGCATTCCC CTCCCCsynC 4200
aCACGCCGCA CCATACCCCA ACAAGGGCGA ATATTGTCGC AAAAAGCGCA CCCATCGCCG 4260
CCGAGTATCG GCCTTTCCAC ACGAATAATC AATAATGAGC GCTCACACAG CTCCCACACC 4320
GTACAGAACC AGAGCTGCAC GCAAAGCCcG CGCAGCgctT TGTATGTACG CGCACGGTAC 4380
ACCCCAGGGA AAAAGAGCGT CTCCTCAGCA GACAGCGAAA ACGAATGGGC ACTCCGCTAC 4440
AACGGTTCAG CCTAAGGGCA AACCATCACC CTTACCCCGG CGCATGGTCA CGTAGCGGTT 4500
CAAATCCGTC ATATTTTTAA CTACTATCCG GCGCTCTTGC CACTCAACCT TCCGCTGATC 4560
AGAAAGCCTG CGCAGCGCGT CCTGTACCTC CCTGTCAGAA AGACCAGCCC ACCGGGCTAT 4620
CTCTTCGATG CTGATATCAA AAGAACGCGC GTCGCTGCTG CGATCCACGT GCGGTTGCGT 4680
TTCATCCAAC ATCAAGAAAA CATCCCCCAC ACGCGCAGTA CTATCTTGAA TCGTTAAAAT 4740
CATAAAACGC CGCTTCTGGG TGTAAATACG CCGCACAAAC GTTTTCAAAA GCCGCATTGC 4800
AATAGCAGGA TTACCCATCA TGAGCACTTC GAAATTCTCC CGATTGAACT CCAGCGCCAC 4860
AACATCGTCG TACGCAACCG CAGACGCCGA ACGCGGTGAG TTGTCGAGAA TAGACATCTC 4920
TCCAAAAATC TCCCCCGGTT GCAACACATC CAAGTAGCGC TCCTTTCCGT TGATAATTTT 4980
AATTAGACGC ACCCGCCCGC TCTGCACGAG GTAGAAACTC TCCCCCACAT CAAACTCTGC 5040
GAAGATAACA GAACCCCGCT GAAACTTTTT GGCAAAGCGC GTAAACGAGG CAAAGGCATC 5100
AGCCATGCGA CGCCTCCTCA CAGGAACGCT GGAGCTCTTT TATGCGCGGA ACTAAAGACT 5160
CAGGAGCGGT AGAAAGCGCC TTGTCGTAAA AAGAAATCGC CTTATCCGGC CTCCCCATAC 5220
CCTGATAGCA CTGTCCCAGA TACATCAGCA CCTCCGCAAG ACGCGTAGAC TTCGGATTGC 5280
GGGTAATGCA CTCAGTGAAC GTCTGAATGC TCCGCACAAA CTCCCTCTGC TCAAAAAGAC 5340
ACCGCCCCGC ACCCAGATAC GCAGCCTCCG CGCCCACTCC CCCGTTCGAC GCGCACCCTA 5400
ACCGGTAATG CTCATAGGCC TCACCCCACT TTCCCTGTTG CTCAAGAAGC TCTGCGGCAC 5460
GCAktCCCCC ACTTCAGAAT CGACAGCTGA CTCAGCAAAA GCAGAAGGCA CCTCAAAACC 5520
CGCATCCGAA CcTTGCGCAC CCGCCTCCTC AAAGCCGCGC CCGACTGTAC CGTCGGCGCT 5580
CTCAAGCATC GACGCAATAT CGTGCCGATG CTTTCCGTCC GGATACAGTT CCCGGTAACG 5640
cTGCGCCACC TGACTTGCAG CAAGGTAATG CTCAGAGGCG TGAAAGGCAC GGGCAACGGT 5700
GTACAAACCC TCCTCGTTAT TTGTCTCCTC CTGAGAGTCA AGGAGCGACT CAAGCTGCCG 5760
ATGCACGCTC CGCAGTTGGC GAGAGAACAC CTTGAGCATC TTCATAACAA TGCGAATGTT 5820
CGTTTGAGCA AAGGCTTCAA ACTCCTGACT CCCAAACGCA TACACAATCG AATCCACCAA 5880
GGTGATCGCA TTCTCTTCAC GAGGAAAATT ACCGAGCGCA GACTTAACCC CAAAGAACTC 5940
ACCAGTTTTA ATGTACTCAG TCACCTGCGA CCCTGTTTCG ACATCCGCAA AGGTAAGCGC 6000
CACGTGCCCC TTATTCAGGA TCATAACACG ATCGTCAAGA TCGCCTGAAA AATAGATAAC 6060
CGAATTGGCC TTATACTGAA TGGCTTTTGG CACGTCTGCC TCCGTCTTCC ACGCGACACC 6120
GAGAAAAAGC GTGGCTCCCA CGGTACATTT TCGATAGAAC GGTCATGCAC TTAAGTCTTT 6180
TTCCAGAATT CACGTCACGC GCTCCTGCGg CnACACCTGC AGAGATCGTT CGCACGCCTT 6240
TTCCAGGAAA GGTGGGTGTG CTACTATCGG TATGTGCAAG TTCACTCTAT CTGGAGGCGC 6300
TTCTGTGCGC TCGGCCTGCT GGTGCCCTTT CTGCTTCTGC TGTTTTCTTG CACCAACACG 6360
GTTGGCTACG GCGTCCTCCA GTGGTCCCTC CCAGATCTGG GACTGAGTAC AGGAGACATC 6420
CTGCCGGTGT ACGTGCGCTC AAACGTCTCC CAAGTGTACA TTGTGGAAAT CCAGAAGAAA 6480
AAGGTAGAGC TGCCTTTCTG GCAGCTAAAA TTATGCAGGA CAAAGAAAGA GGCGCTTCAG 6540
TACGCTGAGC GCCTCCGCGA GTACCGTTAC AGtACGCCAC CTCTGTGCTC GACGGTCTGC 6600
CCCTGCGAGA AGGGCCTGAG AACACTGCCC CCCAAGTTTA TCGCCTCCGC GAGGGACAGG 6660
CGGTCAAGCT ATTGTGGAAG GGTACAGGGA AGGCCGTCTA CCGCGGTGAA AATCGCCTCG 6720
AAGGGGATTG GTTCAAGGTC ATGACCGAAG ACGGTACCAC CGGATGGTGT TTTTCTCACG 6780
GTCTATCCCT CTTTGATGAG CGCGAGTCGC GTCCTACAGT ACGAGAAACG GACGATCTCG 6840
CACGTGATCG CGACCTTCAG CACGTACTCA ACTCTGCGTG GTATCCTGAA TACTACCGCA 6900
CCATGGTTGA ACAGCGCCGC ATCGACTTAG AAAAAATGGC AAGCGGCTGG GGTTTATTTG 6960
TCGGTGAGAA AAAAGGCCTC GCACGCATTG AATTGCCCGA TGCGCAcTAC GCCTTTCCCT 7020
ACTCCCGTCT GGTAAAAACC GGATCCAACG GGTACCTCTT TGACGGATCC TCTCTGAGCA 7080
TCTATGTTCG GGACGCGCAC ACCCTTGCCG CGCAGTTCAC TGACGAAGCT GGGCGCcTGC 7140
GCATAGAACG CTTCGTCACC CTGGAGAAAA CGCCTGAAGA GATTATCGCA GAAGAGCAGC 7200
TGCGGCGCAG TGCGCTTTTG GAACACGTCT GCACACCAGG ctGCGCCTTC ACTCTGAGAT 7260
ATATGGGACG CTGTCTTTTA CAGAACGCAA CGTCTTCACC TGGACAGGGG CGCGCGCGCT 7320
GTCCCCGGCG CTTATCCCCG CAGGGGCAGG GAGCACGGGG CGTGTAGCAC TGCGGTGCTT 7380
CATAGATCAA TCGCTGAAAA GCGAGTATGA AGGGGTGCTG TCCTTCGACT TTGaCAGCGC 7440
GCAGGAATGG GTGCACTTCC TGTACTTACG CACCCCCGGG GGGCTAAAGC TCGAACACAT 7500
AGACTCCACC CACcTGAAGG ATGCGACAGT GTCCGCAGGA GCGTAAGCCC AGTGGTACTC 7560
TAtTCGCGCC GGAAGGACAC GCCGAGCCCC AACCCTAAGA AGCAGTCAGC CGTCGGGAAG 7620
GGAGACGgCC GCAGGCCCCT TTTGGGAGGA CGAGCGAGCC CTGTGTGCCC GGCTGCCCGC 7680
CAGCTGCCAC CCCGGCGCGG ATTTGAGCAA ACTGGCGCGC AGTGACTCCT AGCGCGGCAT 7740
GTCCCGAGGA TCGGTCTTGC CGATTTTTTC CCCGAGTTTG TTCATAAACT CAGCTAGCTT 7800
CATCTTACCC TCAGAACGAT AGATACGCTC TACCTGGGCG AATGCCTCAA GCCCTTGCGG 7860
ACATGATCCG GATCAGCCAT AAACTCGTCA AAGTTCGGAA TAACCAACTT CATACCTGGC 7920
CTAATGCGGT CTGGATGCGT TACCACGCCC TCACTACAGG CCATAATAAT GGGGAAATAG 7980
TATCCGCGGA TGCGCGAGCC GTAAAACTTT TTAGCAATTT GAGAAAGGGT GTCTTCGTTT 8040
TTGACCACGT ACTCAGCACG CGACACCTCT TGCGGAGgAC TGCCATAGGC TCcTGCGGCT 8100
CCTCGGGGAC GGGCGTCATG GGCTCTTCTT CCTGCTCCTC CACAGGCGGC GCAACTTCCA 8160
CGAGCTCCTC CGCCGGAGcc AGACTTGCAG GACGTAACAC CCGAGATAAA CGCAATCCAC 8220
AGCAACAGGC CCGTACTAAG CTTCCTCATC ATCGTCTCCT CCTCCAGACT GCACGCACAA 8280
TCCGGACAAA ATAGCGGCCC ATTTTCACAC AGAACCCCAA TGAAGACAAG ATACATGCCA 8340
CCTCATGGGG AAAATCATAC GGCATACTCT TAGCAAGCGC ACACACCGTC CTTCCCACCT 8400
CTGCCGGGAG CAGCGTGCTC TCTTTAGGTC CATTTCATCG GAAAGGATGG CACTGCAATT 8460
CATTAAGGAT ACGCACACAA TCAACCGAAA CAGTGACAAA CGTGGGTGCG CGTAgyTTTC 8520
GTTCCGCCGT TTTTACTCTT CTCACCACTG TCTGTGCGCT GTGCGGCGAC CCAGCGCATG 8580
CCTATCTTGT GCAATTTAAA GAGCAATTTT ACCGTCTGTA CCATACACAC CTGCACCAGT 8640
ATCCGGACGA AGTTATTGAA AATATTCATT GGCTTGAGCG CGCTGTACAC GCCGACTTCG 8700
CCAATCCCCT TTATGCACTT GCGCCTATTC GCGATAAAAA AAGCTGGGAA AAATACCGCG 8760
CACTCTTCAT GATGCACTTA AACCTCAAGC TGACGGAGCA ACATTTGCGT CTTGGCGAAA 8820
AATTTGACAA AAGGGAAGCC CTCTTTTTCA ACGCACCATG GAGAGAAGAG AATATCGAAA 8880
GCCTCGCTAC nGCCGAGCGG TGCTACCACA CTGCACGGCG ATACTGGCAC GAAGCGGCGC 8940
TCTGGGCCGA GCGCGCAAAC GCCGATCAGT TTCGTTTTCT GTTCCTCACC GAGTTGCCTG 9000
CTTGGGAAGA CGAGCGAGAA cGCATCGCGC GCGGCACGyt CAACTATGCG CGTACCATAT 9060
CGCGCGAGtC GCACGTCTCG AGGCGGTGCG TGCTCgCTTC ATGGAGATGA ACGACACCTA 9120
CTGAGAACGA ACTCCACCGG TACCAGGCGA CACTGAACAC TTACATCGCC TGGACTACAC 9180
AGAAGCCGGT CAATAGCGGG CGGGTTTGCC CAGAAACTTt ACCCCAAGAT CAGGAAAATC 9240
CGTGAGCACC ACGTTTGCGC CCGTCTGTTT GAACAAAATG GAAAACATCT CGTCCaTGGT 9300
GCGCGCGTAG CTAGGCAGTG TTTCTTTCgT ACCGTGTGCA CATGACATTC CAATTTCGCA 9360
TCTTGGATTG CAGAAACCAT CGGACTCAGG CGAACAGCGC CCACCTTCGA CCATTCATTC 9420
TCTATGAGCA TCCTCCAGTC AGGACCCACG CCGTCTGCAT ATTTTGCTAT TTTCTGCATA 9480
CCACCGGGCT CAAACATCCA ATTGTAATTG TAGTT ATCC ATTTCCCACG CGAGTCCTTC 9540
TCCTGTGTTT CACGTTGATC TGTGTAAGCA ACACGCTGAA TCAGCTTCAC GTTCATTTCG 9600
TACTTTGGTA AAAGTTCTCG TTTGATACGC TTCAGCTCGT TAAAATCATA CGTTTGCACA 9660
TACACTAGAT CCGATCGACT TTGGTAACCG TATTTTTTCA ACAGAGCGAG GGTAAGCGCT 9720
GCGATGTCTT TTCCTTCCTG ATGATGAAAC CACGGCACCT TTATTTCAGA GTAAATTCCA 9780
ATCTTTTTCC CGGTTGTCTG TTCCAACCCA CGGATAAACT GCAACTCCTC TTCAAAAGTG 9840
TGCAGCCTAA AACCAGGCTT CCAAAGAGGA AAGCGCTGGC CA ACACCGG CGTATGTCGC 9900
TTACCGCGCG TAT GAAACT ATTGGTTGCA CGGAGGAGGG AAAGTTCTTC TACCGTAAAA 9960
TCTATGAC T AGAAATGCCC ATCCGCACGC TGCCGGCGTG GAAATTTTTC TGCCACGTCA 10020
GTCATATTAT CCAGAATATG GCTTTGCGCT ACGATAAGCT GATTATCCTT TGAAAGCACG 10080
ACATCCTGCT GCAGGTAATC TGCTCCTtGT GCAAAAGCAA GAACTTTCGA GGCAAAGGTG 10140
TGCTCGGGCA CATATCCTGC AGCGCCCCGA TACGCAACTA TCATACGTTC GGACGCACAG 10200
CCTGCAACCA AkGCCGCAAA CACCCCCCCC CAAAGCGTCA CACAATATGT TCCCCGCATA 10260
AACTTCTCTC CTCCCCTGTT ATAGAATGCA CAGATCGTCA CCGTGCATAG TAGCACGCCG 10320
CATATCTCCA CTGAGCGCTG ATCCTACCTT CAGTTTTAAT CCCATTCACG ACATGTCTTG 10380
CATACAGGGT AACGGGTACG CCCACACACG CGCTTGAGAA TGGGGAAAAA AGTCTTCAAG 10440
TAAGAAAAAA AAGAAGCAAG AGCCAATACC ACTGGCACCA CGTACACCAG GCGGCCAACC 10500
GCACGCATGC GCTCATACCA ATCCGCGCCA GCCAATTCAA AGGCGTACAA TGCTTTCAAA 10560
AGAAGCGAAA AAAGGACCGC ACCCATATAC GAGGCTGTTT TCAGCTTTCC CATTCTCTGG 10620
GCACCAACTA CGTGACCTTC GCCACACGCA AGCATTCTCA GGAACATCAT TCCAAATTCC 10680
CGATACAAAA TACACAGAAA AAGAAAAACC GGCATGAAGT TGTCTGCCAC GAGGCAAAGC 10740
ATGACAGTTA CATTCGCTAT CACATCAGCA AAGGGATCGA AAACCTTCCC AAAACTAGAA 10800
TACTTCCCTG ACTTGCGCGC GTAATAACCG TCGAGGAAAT CAGTGCACGC AATGAAAAGA 10860
AAAAGCAGCA CCGACGCGAT AGACACCACA CGCCCCACAT TGGcAGCAGG AAAGTACATA 10920
ACAACCCAAC GTGACATATG ATAGAGTGCA AAGAAAGGGA GAACCAACGC CAGTCTCAGC 10980
GCAGTATAAA AGTCAGAAAG CCTCATACTC ACCTAGGGGA AAGACGTAAT GGTACACCAC 11040
AGGTGTACTc CTAGCCTCGA ATTTTGTATC GCCTATTGAA CCTGTCAATG CGTCCTGCTG 11100
AATCTACCAG TTTCTGTTTA CCGGTAAAAA ACGGATGGCA CGCAGAACAA ATCTCTACCC 11160
GCAAGTCCTT CACGGTAGAA GCAGTCACGA TGACGTTACC ACACGCACAC ACCACCTTCG 11220
TCTCCTCGTA CCGAGGATGC AGTCCCTTTT TCATCTGAAT CGCTCCTTTC CCCATTCTAC 11280
AGcGCGCCGC GCCCTTAAAA AGGTTACATC CACCCTCTTC CAGAAGGCTC GTACACGCCG 11340
GGTGTAACCG ACACACGTTC CAGTCTCCTT ACTCAGGAGG AGAACGCACC CGTATTCATA 11400
GACCTCAAAA ATGCCTCGTT GTTCTTTGTT TTACGCATCT TATCAATTAA CAATTCCACA 11460
ATTTCTGCAT CGTCCATAGG ATTGATTACC TTACGCAACA CCCAAATACG CTGCATTTCT 11520
TCCTCCGTCA GGrGCAACTC TTCCTTACGC GTACCAGACT TTTTAATACT CACCGCGGGA 11580
AATAGGCGCC GATCCGAAAG GCGACGATCG AGATTTATCT CCATATTCCC CGTACCTTTA 11640
AACTCCTCAA AAATAACCTC ATCCATCCTA CTGCCTGTTT CAATAAGCGC AGTGGcAATG 11700
ATTGTCAGAC TTCCTCCTTC CTCCACATTG CGAGCTGCAC CAAAGAAGCG TTTCGGTTTG 11760
TGCAGAGCAT TTGAATCCAC TCCCCCCGAC AACACTTTAC CTGAAGTTGG CATCGTTTGG 11820
TTATAGGCAc GCGCCAGACG CGTAATCGAG TCAAGCAAAA TCACCACGTC CTTCCGGTGC 11880
TCTACCAATC GCTTTGCGCG CTCAAGCACT ATCTCTGCAA TCTGTACATG GCGAGTAGCC 11940
TGTTCATCGA ACGTAGAAGA AATAACTTCA GCATCAACCG TACGCTCCAT GTCGGTTACC 12000
TCTTCAGGAC GCTCATCGAT GAGCAGCACG ATAAGATAAA CTTCAGGATG ATTTTGCGTG 12060
ATGGCATTTG CAATTTTCTG CATGAGAATC GTCTTTCCCG TACGCGGCGG CGCTACAATC 12120
AGCGCACGCT GCCCCTTTCC GATCGGACAG AACAaGTTCA TGACACGCGT TGAGATATCT 12180
TCCGTTCTTG TTTCTAAATT CAGCTTTTCC CGCGGGTACA AAGGGGTAAG ACTGTCGAAA 12240
GGGACACGGT CCTGTACCTT TGCTACTTCT TCGAAGTTTA CCGTTTCCAC GCGGAgcATT 12300
GCAAAGAAAC GCTCTCCCTC CTTAGGGGAG CGAATCTGCC CATAGATGGT GTCGCCCGTT 12360
TTCAGATTAA ACAGGCGAAT CTGACTTGGA GAGACGTAGA TGTCATCGGA ACCGGGCAAA 12420
TAACTGTTCT GAGGTGAACG CAAAAAACCA TACCCGTCAG GCAATATCTC CAGCGAGCCA 12480
GAAGCAAAGA TAACGCCACC ATTCTCAGTG TGATTTTTAA GAACGTGAAA GATGATATCT 12540
GACTTTTTCA TGACAACCAC GTCTTCTTGA GAGATACCCC GCTGTACTGC AAAATCACGC 12600
AGGgCATGCA TCCCCATCTC AGTTAAATCA TCAATCAGCA AACGCGCCCT ACCCTTAACG 12660
TCGGCGGAAG ACTCACTTGT TTCCGCATCT TCTGGGCAGA AATTTTGCTT AAAGCGTAGG 12720
GCTCTTCGCG GACGTTTTTC CACCTCCAGC GCTTTTGGCG TCACCACTCG TCTCCGTGAG 12780
CGAGAGsGgA TGACGAGCTT CTTCCTCCCG CCGTAGATCG CATTCCCCCA CGTCAGCG 12838
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17378 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TGCGCGTGTG CCACGCACAC CAGTACGTAT GCTCGAGCAC AGTGGTCACG TGATCACTGA 60
TGACGTGGAG CGGGAGCAGG TTGCCTCTTG TGTCAGTGCT TTTTTACGCA CGTAnTtTAC 120
GTGATGTCTA CCCAAAAGGG AGTGCGGTGC ACGngTGCTC CACATTTCTA GTTGCTGTAG 180
GGGGAAGAGA CTGAGTCGCT GTTTCGGACC GCGTGCGTTT TATGTGCGCG CGCTGTGCAT 240
CGCGCTCCCC GTGATGCTGC ACTCCTTCAT CCAGACGGGT ATTTCTTTTT TAGACAACGT 300
TATGGTCTCC CGTTTGGGGG ATGTGAAGAT GGGTGCAGTG AATGTGGTCA ACTCGCTGCT 360
CTTTCTGTAT GTCACCGCGT TAATGACCGT GTCGAATGCA GGCAGCGTGT TTATGACGCA 420
tACTCAGGAG CCCGTCACGT AkGGGCATGC GGCAAAGCTA CCGATTTAAA CAGTACGCCA 480
TGGGGTCTCT GGCGCTGGGT GCTATGGCCG CTGCGCTGTG CTGTCCTCAG TATCTCCTTT 540
CGTGTTTGTT GGGAAAAAAT GCGCAGGCTG CTCAGATTAT AGCGGAAGgT GAGCGTTACC 600
TTTCGATAAT TGTGTACACT CTTGTGCCGC TGTCATTTTC TTTGGTCCTC ACCTCTACAT 660
TGCGAGAAAC AGGGAAGGTG CTTGTACCGC TTGCAGTGTA CGGGTGCAGT GCCGTATTGA 720
ACGCAtGrGT aATaTATGTT GATTTATGGA AACTGGGGGG CTCCGCGATT AGAAGTGCAA 780
GGTGCAGCAT GtGCAACGCT TATAGCGCGG GTGGTAGAAA GTCTTATGCT CCTGGTGTAT 840
GTGCGGGTTA AAAAACCGGA CTTTTATGTG CGGCTTTTTT GTCcTGTGCG ATACCCCTGT 900
CACTGTGTAC GGTGATGCTG AGAAAATCGC TGTGGATTTT CGTAGGAGAC ATGGCATGGT 960
CGGTAACGGA GATGGCCGTG GCTGCCTTGT ATCACAGCCG TGGTGGGGCT GAGGTTGTGG 1020
CAGGGATGTC GGCGGGGTGG ACACTCGCGC AATTATTTTT TCTATCATTC CCTGCAAGTA 1080
GCGTGGCAAT TACCATTTTG GTCGGGGATG TGTTAGGGAA AAGCGAGCTA AAGCAGGCGC 1140
AGGATTATGC cACGGTGGTT GATGAACGGA GCGTTCTTTT TAgGGTTAGG TTTGGGTGTG 1200
AtTGTGTGTG TAGCGCGTGC aGGGATTCCG TGGGCTTTTG GAGATTTGTC GCaTGcTTCG 1260
CAACGTATAG CACAGCAGTT GGTGCTCGTG ACGGCGCTGT ATATGCCGAT TTGGATGTAT 1320
TTAAATGCGC AGTATGCGGT GGCACGTGCA GGAGGTGAAG TGATGGTCAC CGCGTGGACA 1380
GAAACGTTGG TAGATACCCT GTTGTTTTTG CCGTTGATGT ATGTGTTGGC GCGCTTCACT 1440
CAGCTTGGTG CGCCGCTTAT GTATGGAATA GTAAAGAGTA CAAGTGTAGT AAAAATGGTG 1500
GTGCTTGCAC GTCACTTAAA AACACGTCGT TGGGTGCGTA ATCTCGTGGC GAATTTATCG 1560
TGATACGTGT TACGTCCTTC GGGTAAGGCA CGGAGCATGG AGTGACGGAA ATCGAAGCAT 1620
GTATAAAAAC TGTCCGCACG CCCTATAGGC GTCTTTTTGT TTTCCCTTCG CGTATTGTTG 1680
CCCAGGGGTG GCTGCGGCAA AGTCTTTCTC TGCTTGGTGT GCGTACGGTT CCGGGGCGGT 1740
TGTGTCTTTC TTGGGACGAG TTTAAAAAAC GATGTTTTCA ATGTGCGCCG TGTGCGCACC 1800
GTACACCTAT TTCCGAACCG CTTCGTCTTT TATTTGCACA CTCTGTGGTG CAGCGCAATG 1860
CGCGGCAGGC TGCAGAGGGG CGCGCACTTT TTTGCAACCT TATTCCTCCG GCATATGCGC 1920
AAGACGGTGC AGTGTTTGTG CAGTGGCTTG CCCGTATACT CCCTCAACTT GGATCGTGGC 1980
AACGGCGCGT TGAATCGCAC TGCATGCCTC CAAAAGATGC GGTATCGCGT AATACGTTTG 2040
ATGGyCsCGA CGCGCgCGCG TATGCcAGAg TGCGGAGGCG CAAGATTTAC AAACGCTGAA 2100
AGgCAcTATG AGCAATTTCT CCGCGCGCAT GCACTCTTTG AACCTTCCTG GGATACGCCG 2160
CAGTTTTGTG CGCAGGGGAA CACGTATGTC ATTGTATACC CGCAGCTGAT GCAAGACTTT 2220
GCAGAGTATG CACCGGTATT GCAAGAAGCG GCGCGCGCCA CTGCGGGAGT ACTCACCTTT 2280
CTTCCGGTTC CTCCCTTTCG GCAGGATACG CCGTTGTGTT GTTTTTCGAA TGTACGTGAG 2340
GAAATTACTG CCGTTGCGCT CCAGGTAGAG AGGTTGTTGC GCACGGGAAC GCCTGTGTCG 2400
CAGATAGCAG TTTCGGTGGC AAATTTAGAA GAACTGCAGC CATATGTGGA GCGTGAGTTT 2460
CGTCTGCGTG ACATTGAGCC TGAGGTGCGG GCAGGTTTTT GTCTTGGTGC TCATCCGGCA 2520
GGGAGGATGT TTTCCCAGCT TCGAGAGTTT GTGCGCAGTC ATGgCACGCT CAAAAGCGTG 2580
CGGGcGCTGC TTTTGAATCC GCATATTCGC TGGGCGGACC CCCAAGGGGC ACAGGCTGTG 2640
GTGCAGTACG GATTGCAGCA GGCGTGCATT CGTTCATGGA AGCAGAGCGG CACGTATTGC 2700
AACGTGTGGC TCCAGGCATT TGCGCTCCAC TGTGAGCGCA CAGAACAGGA GCGACAGCAC 2760
CAGCAGTGTG CGCAgcGATT TTTTCTGACG CTGTTACGTT TTGCGCGCGC GTTGGTTGAG 2820
GCGCGTAgTT TCGTACGGAT GCAAAAGGCc TACGGTGCGT TTCGTGCCGC TTGTTTGCTC 2880
CCCGCGTCAG CraCACGTCT GGTGCAGAAG AGGAGATAGC ATCGTCTGCG TTTGCGTCTT 2940
GCAGTGCTGG GGAAGACGAT GCGGTGATGG CGCGGTGCGT CTGTGTTTTG CAGGAGTTAG 3000
CGGCGCTTGA GCGGCGCTTT GCACACGTGG TGCCACCGGA TCCATATAGT TTTTTTGTAC 3060
AGCAATTGGC ACAGCAGATG TATGTACCGG TGCGTGCAGG GGTAGGACTG GCGATTTTTC 3120
CCTATCGGGT TGCGGcGctG CGCCCTTTTT GCATCACTTT GTGATAAACG TGTCGCACGA 3180
AGCGAGTAGC GTGCGGTATC AGCGAGGTAC CTTTTTGCGC GCGGATGTGC GTGCGGCATT 3240
TGGTTTTGAA GATGAAGACG TAACAGAGGC TTTCTTATCT GCGTACGCGA CGGCACAGAC 3300
GGTGTATTTT TCCTGTTCTG TGCAGGCGTT TTCAGGGGTG CAGCGCCCgA ATCGTTTTTT 3360
TTCAAATGTG CATCCGCCTG TGTCCTCCAC CCTGACGGGG AGGGCAAGTC AAACACCGTC 3420
CCCAGCGGGC ATTGCATCTG AAACTGGTGG CGCGGTGCCT CAGTATCCTG CGGTGCGCTA 3480
CGAGGAAGAC GCGCTGCAGG CAGAGCAGGA CCTGTACGCA CAGGGCGCAC CGGTGCCTTC 3540
GTCATTGTAC AGAACACAAC AGGAGCGTTT ACGGAAAGCG GCGTCTCTTA TCCCTGCAGC 3600
GGGGCGTTCG TACATACGTG ACTCCTTTGC GCAGGCGCTC CCACCGCTGA CTGCAGTACT 3660
CCATGCGCGT CATTTTCATC ATGCGGCGGT CAAGGTGAGT CAAACCGATC TTAATCTTTT 3720
TTTTCGGTGT CCGGCTGCTT GGTTTCTTGA GCGTGTGTTG GACGTGGCGC CGCTTTCTCG 3780
AAGGCCGCGT TTAGTGGATC CGCGCGTGTT GGGGGTTTTT AGTCATGTAG TACTCGAGCG 3840
GCTGTACAAT AGGATTGCGT GCGAGGACGA GTGTTTTTTT TCTGCGCACA TGGAACGCTA 3900
CCGTTTATGG ACGCAAGAAG CGATTGAGCA AGTCTTTTCT GAGCGTGCGG TGCGTGCCGG 3960
TCCGCTTGTG TGGGCGTTGC GCGCGGCCTG AGCGCGCGCA TCCGGCACAT GGTGGAGTTT 4020
GTATTGCAGT TTGATGCGCA GCGGCTTGAC GGcTGGCGCG TGGTACGTAC TGAGAAAGCG 4080
TTTGAGTTTA CCGATACGCA gTGTTTCTAC ACGGGGTTGG TGGATCGCAT TTCGTGCAGT 4140
CCGGACGCGC GGTCGCTTGc AGTGTTAGAT TACAAGACCG GGGCGCTCCC TGCGCTTTCT 4200
GATTACACAG ATTGTGAAAA GAAAGGTCGG TTGTCTGATT TTCAAATACC TATGTATGTG 4260
TATCTGTTGG AGCAGGCGGG GTATACCGTG ACGCACGCTT TTTTTTTAGA CGTGAGAAAG 4320
AGAGATTTTA AGGTTATTGT TTCGAACGGG CGGGTGGATA TGGGTGCAAA GCGTGGGGTA 4380
GATACCGTAC AATTTCAGGC GGTTATGCAG CGCTTTGAGC AGTCAGTGGC AGTTTTTTCG 4440
AAGGCGGTGC GCCAGGAATG CTTTGCCAAA GCGCCGTATG TCACCTGGCT TGAATGTGCC 4500
TCGTGTCGTT TTGCGCCGGT GTGTCGTACC TCGTATGTGG TGCGTGGGGC GTCGTGACTG 4560
ATTTTCTTTT TTCTTTTTTT CAAAGTTTGA ATGCAGAGCA GCGGCGTGCA GTTTTTTCTT 4620
CGCATAATGC AGTTGTTACC GCAGtGCAGG TTCGGGTAAG ACGAAGgTTA TTAGCGCGCG 4680
GTATATACAC CTGGTTGTGG AGCGGGCAAT TCCGGTTGAA CGGATTGTGG TGCTCACCTT 4740
TACCAGAAAG GCGGCCATGG AAATGGCGCG CAGAATTTAT GAGGACCTCC GTCTGTGTGT 4800
ACAGAGTGCG TCTGCGCAgc CGGAGCCGGG GCACGAAGCG TATCTGCTGC GTGCGCGTGA 4860
GGCGCTTGCg cGGTTTGGGG AAGCGCGCAT TATGACGTTA GATGCCTTTT CGCACGAAAT 4920
TGCGCGGGTA GGCGCGCGCT TTTTCGGTAT CGCGCCTGAT TTTTCTCTCA GTGAGGAAGA 4980
GAACCGCGCG CTGGCACACG AGTGTGCAGA AGATTTTTTT CTTGAGCATC GGGAACATCC 5040
AGTGGTACTG CATTTTTTGC AGCAGGAGCA CGCCGAAGAC TGCGTGCGAG AACTTTTTTT 5100
TATTCCCTTG CAGGATCACG GCATACTTAC ACATCCCTGT GACTTTCGTG CAGGGCTTGC 5160
GCATCAAATT GCTACAGCGC GTGGGTTATT AAAAACGGTG CTCTGCGATA TACACGCAGC 5220
ATTGCACGCC ATTCGGCACC ATATGCaAGA GGCAGATGCG CAGAATGCgc TnCATTGCGC 5280
GCTGCGTTGC GCTGTTTGCG GCACAGGATA CTGCCTTTTC CTACACGCCG GCTGCAGAAG 5340
CAGATGCGAT TGCCGACGCG TTTTTGGCAC GTGGGTACGA GGAATATGCA GCTAAACCTG 5400
ATGAGTTTTC TGTGTCTGAC CCTGATGAGG GAGCGcGGCG CcTGcACACG ATTGCcTGCG 5460
GTATTGTGCG GCGGTAAAAA CGCTTTTTTG TCTGAAGGGT AATTTAGGGG GGCGCGCGGG 5520
TGCAGCACAG GCGATAAAAG CACAGGTAAA GCAGCTGCGT CTTCAACTTG TACCGCAAAT 5580
GGAACGGCTG CACGCGTTTT TTGCGCAGGT ACCGTTCCTT GTGGCACTCA GCTCGTTGCT 5640
CGAGCTTCTG CAGGCGCGTT TTATCCGGCA AAAACGGGAA CGGAATTGTC TCAGTCACGC 5700
CGATGTGGCG CATCTTGCGG TGCAGGTGTT ACGTCAGTAT CCGGAAATAC GCGTTTCTTA 5760
CAAGCGGGGT ATCGATGCGT TCATGATTGA CGAGTTTCAA GATAACAATG CCCTCCAGAA 5820
GGAACTTCTT TTTTTTCTTG CCGAGCACGA AaGCgcGCAC CGCGCACTTC CTCCCTCCTG 5880
CACATGCGTT GTGCGCACAC AAGTTGTTTT TTGTGGGAGA TGAAAAGCAG TCGATTTATG 5940
CGTTCCGGGG TGCGGATGTG CGGGTATTTC GGTCTCTGGC AGGCGTACTC ACCCCGCAGg 6000
TCAGTGGCGC GTCCCAGCAG GAGCTTCCTC TTTcCGCTGC TGCGGAGCTG CAGCCCACAC 6060
TTCAGACGTT GCGTATCAAT TACCGAACAG AAGCGcGCTC CTTGAGCGCC TCAACATACT 6120
GTTTTCACAT ATTTTGCGTG GGCCGTCTGA GTCTGCCGAG AACGGGTACG AGGTTGGGTT 6180
TCAGTATATG CAGCCGGCCC GGTGTACTGC CGGTATTGAG CCGCAGTTTC GGGTGATTGG 6240
AGTGGATCGT CACCGTTTCT CCAGACCGGA GCACGAAGCG CAgcACTCAG CGGCGCGCCC 6300
AACTCCTCAA GCAGGGAGGA CAGGCGCGTC TGAGGACTCG GAGGATTCTC TATCGGCGCA 6360
GGAGACAGAA GCGTGGGnGC TTGCGCGTGC TATCCGTGCC ATGGTGGACG GCGGCACCCT 6420
GGTGCGCCAC AAGGGGGAGG CGCCGCGCGC GTGCACGTGG GCGGACGTAG TGATTCTGTT 6480
GCGTTCTGCA GACAAGCAGG CGCGGTACGA GCGCGCGCTG CGTCTGTGGG GTATTCCGTA 6540
CACGTCGCTT CAAACGCGGG GTATGTTTTG CGATGCGCCG CTGTCTGATC TCCTTGCCCC 6600
GCTGCGTTTA GTGCTCGAGC CTGCCGATCG GCATGTGTAC GCGCAAGTGC TCCGCGGTCC 6660
GTTTGTACGG GTCGATGACG ACACGCTTTC TCTGTTGCTG CTCCCACCCG CACCCCCCGA 6720
CGCCCCTTTT TCGTATATCC CCGCGGAGTT AaTCcGCtGC GGCGCGGTGT GTACGTGCAG 6780
GCGCGGACTT TTTTGCGCGC GTGCAGCAGC AGGTGCGGCG CCTGGCGACG AATACCGAGC 6840
TGCTGACCTA TCTGTGGTAC ACCGAGGCGT ATGGAACGCT GTTGCGCAAG ACCCCCTGGC 6900
GCGTCCGTAC CATGTGATAT ACGACTATGC GTTTGAsTTG CGCGGCGGGC AGATCGGCAG 6960
GGAAAGGGGA TAGGAGAATT TTTAGATTTT GTAGATGCGT GTCTGTCTGC CCAGGAGCGG 7020
GTAGAGGAGC TGGAGCTGCC TTGCACGGAC CGCGCGTGTG GGGCAGTGCA GATTATGAGC 7080
GTGCACAAAA GCAAGGGGCT TGAGTTTCCA ATTGTGTGTG TGCCGGACGC GGGGAGTTCT 7140
GGACCGCGAG TGATGGCgCG CGTAGgcGCG GTACACTCCC CGTACGGATA CATTCCCCGA 7200
TTTTTGCCTC ACCCTGAGGG GGTGCATCCG ATCTTTGTGC AGGAACAAGA CACGCGCGCC 7260
CGGGCGTACC GCGCGGAgcT GCGACGCGTG CTCTATGTGG CTTTCACGCG GGCTGAGTGC 7320
CaCGTGATTG TCAGCGGGGT ACTGCCTATT TCTGACGGAC ATCCTGCTCC TGCCGTTTCT 7380
CGGTCGTTGG CGGACATCTG CTCTCTGCTC CCCTCTGGTG ACGGGAGTGA GCCTCCTTCC 7440
TCCCTCTCTT TCTTTTCAGA GCTGCTCCCT GCGCTTATGC ATGCAGCCCC CCTTCCTCCC 7500
CATCCTTCTC CCTCTGTGGT GCCCGCACCG GTGTCGTTTG ATGAGTGTCt GCnTGGCGCG 7560
CCCTCAGGCG TATCAGCGCG CCCTCACCGG GTCAGCTCGC CAAAGGTGCC GCAAGCATCC 7620
GCCCCTCGGG ATTGGTACGC TGCAGTGCCT GTTCGCGCAC CCCATTATTA TCCCCgTCTT 7680
GTTCAGCCGG TGAcGTCCCT GGTTTCTCCG GCTCCGGGGC AAAACTCGGC TTCCGCTTCT 7740
CCTTCGCCCT TGACCCCGCA GTCCCCCCGT GGCGTGGAGT TTGGCACGcA CGTGCATGAG 7800
CTTTTGGCGC AGGTTTTCCA GTCCCCGGCG CCGaATCkTG CACTCCATAG CGTACAACGT 7860
GTTGATTcCC CTGCGGCACg CCTTGTGGCC TGCTTTCTTC ACTCCCCCTT GGGCTGCCGT 7920
GCATGTGCAG CCCCCGCTCA CCAGCGTTTT GCGGAGTTTT CCTTTCTCAC CCGCGCTCCA 7980
GGTAATACGA AACCCCCACA TGGAGCGGAG TACCAAGCCG GCACCATTGA TCTCCTCTTC 8040
CTTTCCAATG GGGTGTGGCA CCTTGTGGAC TACAAAACCG ATTACGAAGA GCACCCGGCG 8100
CGTTATCTCC CCCAGTTGCA GCACTATGCA CGGGCGGTGC AGGATCTCTT CTCGGACCAC 8160
CCGGTGACGG CCTTTCTGTA TTACCTCCGA ACCGGGCATG AATTTTCTTT GGAAGCGTTA 8220
GAATCTCATT TTCTGAAAAA AAACGCAGTT CCGGATTCTG AATGATTGAC CCATCTGCCA 8280
CTTCCCGGTA TGGTTCCCCA CGTTTAGTTA GTAATGGTTT TCGGCATCGG AGAAAAGTGG 8340
TGTATCAGCG GGTAGGGCAC AGGCGATTTT CTCTCATTTT CTTTTTCGTT GTGGTTCTGG 8400
GGCGGTCCCC GCGGCTGTGG GCTCAGGTTT CGTTCACCCC GGATATTGAA GGCTATGCGG 8460
AgcTGGCCTG GGGCATTGCA TCCGAAgATG GTrGCGCCgg AAaCCTCAAG CATGGATTTA 8520
AGACTACTAC TGATTTTAAG ATTGTGTTCC CCATTGTGGC AAAGAAGGAT TTCAAGTACC 8580
GCGGTGAGGG GAATGTCTAT GCGGAAATTA ATGTTAAAGC GTTGAAGTTG AGTTTAGAGT 8640
CAAATGGTGG AGCAAAGTTT GACACGAAGG GTTCTGCAAA GACGATAGAG GCAACCCTGC 8700
ACTGTTATGG GGCCTACCTG ACCATTGGGA AGAATCCTGA TTTTAAGTCA ACGTTTGCTG 8760
TTTTGTGGGA GCCGTGGACC GCGAATGGGG ATTATAAGTC TAAGGGAGAT AAGCCGGTGT 8820
ATGAGCCGGG GTTTGAGGGA GCCGGGGGAA AGTTAGGGTA TAAACAGACT GACATCGCCG 8880
GCACGgGGCT CACGTTTGAT ATTGCGTTTA AGTTTGCGTC TAACACCGAC TGGGAGGGCA 8940
AAGACAGCAA GGGCAACGTC CCAGCAGGAG TAACCCCCAG CAAGTATGGA TTGGGGGGAG 9000
ATATTTTGTT CGGCTGGGAG CGTACGCtGa AGATGGCGTG CAGGAATACA TTAAAGTGGA 9060
GCTCACCGGC AACTCCACAC TGTCTAGCGA CTATGCCCAA GCCCGAGCCC TGGCAGCCGG 9120
GGCTAAGGTG AGTATGAAGC TTTGGGGTCT GTGTGCTCTG GCTGCTACAG ACGTGGGGCA 9180
TAAGAAAAAC GGAGCGCAGG gCACCGTAgG CGCAGATGCG TTGTTGACGT TGGGGTATCg 9240
TTGGTTCTCG GCGGGAGGAT ATTTCGCATC GmAGGCCAGC AATGTATTCG GGGGAGTATT 9300
TCTCAACATG GCCATGCGAG AGCACGACTG TGCTGCCTAT ATTAAGCTCG AAACCAAGGG 9360
GTCTGATCCT GATACTTCTT TCCTTGAGGG TCTTGATTTG GGTGTTGATG TGCGTACGTA 9420
CATGCCTGTC CATTACAAAG TCCTAAAAGC CctACCCCCA GCCATTTACT TCCCGGTGTA 9480
TGGAAAAGTC TGGGGTTCGT ATCGTCATGA TATGGGTGAG TATGGTTGGG TTAAAGTGTA 9540
TGCAAACTTG TACGGCGGTA CGAACAAAAA GGCCACGCCC CCTGCTGCTC CTGCTACGAA 9600
gTGGAAGGCA GGATATTGTG GGTATTACGA GTGTGGGGTA GTGGTCAGTC CGTTAGAGAA 9660
GGTGGAGATT CGGCTGAGCT GGGAGCAAGG CAAGCTACAA GAGAACAGCA ATGTAGTGAT 9720
AGAGAAGAAC GTGACGGAGC GTTGGCAATT CGTAGGGGCA TGTCGCTTGA TTTGGTAGGG 9780
ATGTATGGTT CTTTTCTTTC CGAAgGGgCG AATTTACGCC CCTTCGgAAG GTATGCAAAA 9840
ATTCCACGTA TCGGGTCACA TATGACCCGA TACGTGGAAT TTTTGCsCCG GCGCATATCT 9900
GGCCGGGCAT GACACGCAGC GGAGGTAGGC GGGGTGTGTA GACGTTTGTG CCATCACAGG 9960
TGGCGCGTGT GGGGAAAGGT TGCTTCCCTG GGAGTGCTCC TTTTAGGAGG GCTTGTTGCC 10020
TGCACTTCAA GCGcAGcCGG GTCAACCTCC AACACGCGGC CGGGGGTGCG TATGACGATC 10080
ACCAGCCGCT ACCCCTTCGA TCGCACTATG CAGCTTTTGG aGArcGCTTT GCGCACGCAG 10140
GGCTTTAGCG TTTTTGGTAT TGTTGACTAC CGCGAGGCAG CCCACAAACA GAACTTGGAT 10200
ATACAACCTG CAAAGCTTAT GGTGGTGGGC TTCCCTAAAA TTGGCACGCC CCTCATGCTC 10260
GAGGATCCTT ACTTTCTTCT TCGTGTCCCT CTGTACCTTA TGGTTACCGA TGTGCGCGGG 10320
AAGACGCGCG TGTCGTTCCA CAATACGCGT GGACTGATGG ATAGCTATGT AGAGCTTTCT 10380
GATATGGATC AGGCCATCGA GTTAGTAGAA TCCATCGTCA AGAAAACCCT TGCAGAGTAG 10440
GACGTTTTGG AAAAGAAATT TGCGTTGCTC ATCGATGGAG ACAATATCTC CCCTAAATTC 10500
CTTGAGGGAA TCGTCGGTGA AGTGTCTAAA GAAGGTGATA TCCACGTTCG CCGTGTCTAC 10560
GGTGACTGGA CTACCCCTAA CATGAATGGG TGGAAGGGGC TGCTCACGAA AATTCCTATC 10620
CGACCAGTGC AGCAGTTTCG GTACGGGGAT AACGCCACTG ATAATACCAT CATCATGGAG 10680
GCTATAGAGC TCGCGAACAA TAACCGGGCT ATCAACGCCG TGTGCATCGC TTCTACCGAT 10740
TCTGATTATT ACAGTCTTGC GCTCAAGCTG CGGGAGTACG GTCTGTACGT GCTCGGTATT 10800
GGAAAACGAA ACGCGCGTGA GATTTGGGTT TCTGCGTGCA ACGAATTTAA GTACATCGAA 10860
AATATTGAAA CTGAGCACTT TGGCCTGAGC GCGGGGTTTG CGTTTCATAC TGAGTCAGAT 10920
GCTGCTGCAG TTCCTGGTGC AGGGGTCGAT GCCGTTGAAG AGGATACTGG GGGTTTTGAC 10980
TTAGGGAAGC TCATTGCGCA CGCTTACAGA AACTCGCGCA TGACCGAAGA AGGCTGGGTG 11040
AGCCTTTCAA ATTTAGGAAA GTCGCTGCGC ATCACAAAAC CTGAGTTCGA CCCTCGTTCT 11100
TACAATCATA GTACCCTGcG GGAAATGGTG GAGGCTCTTC CTGAGCTTTT TGAGGTGCAG 11160
TCTGACCGAC GTATCCCTCC CAATTATTGG GTGCGTGCAG TGCGTGGTGC CCACAAGCGC 11220
ACGGTGCTCT ACGGTGTTAT CAAGCGTTTT CGTGAGCGTG ATCGGTGGGG TGTTATCAGT 11280
CATGAAGAGC TTGGTGATTT TCGTTTTGTG TACAGCAATC TCAAGCGTGA GTGTCGTGCT 11340
ACTGCCTTGC CTGAAGGTAC AACGGTCAGC TTCTCTGTGT TTCGTATGCC CAACGATCAG 11400
GGTAAAAGTG ATGAAGAGCG TCACGGGCGC GCTGCCGACG TACTCGTGGT GAAACGGGTG 11460
GCCGGCTAGC AGACGGGaGT CTGTAAcGTG CACCGGCGTG GTGCGGAgGT GCCGCCCGAG 11520
TGCGTTAcAC nCTnCACAGT AACCGAGGCC CGCGTAgTGT GTAGCGGGCG TATCCTCATT 11580
TTTGTGGGCC CTTGTGTGAG AGTTTTAAAC ATGCTAGAGT GAGCCCCGCG AGGGTGcGTG 11640
CGTGAACTTC AGTCCTAATA CGCTGGGATC CTTTGAAAGC TGCGCTGAGG TTGTGAGGTC 11700
GCTCGGGTGT GCCCTTGTCG ATCTGCAGTG GAGCGTTTCC GCTGTTTCTC GGCGTGTGCA 11760
GCAGGCTCAG GGAAGGGCgC GTGCCGTTAT TTACAGCGCA GGGGGAGTGA CGCTCGACGT 11820
GTGCGCGCGC GTTCATCGAA TACTGGTGCC GCGCCTTCAG GCTCTCGGTG GTGTGCGCAC 11880
TGTTTTTCTT GAAGTCGGCT CCCCCGGGGA GCGGGTTATT CGCAACGCCG CGGAGTTTTC 11940
CATCTTTTTA GGGGAGACTG TGAAGGTCTG GTTTTGcACG GGGCAGTTTC AGGTTGGGAC 12000
TCTTGCGTTT GCGGATGAGA CTTGCCTTAC CCTGACCGCC GGCGGAGTGC CCGTTACTAT 12060
CCCGTATGTT CAGCTAACAA AAGCGCAGTT ACATCCTGCA GTCCGCGCTT GAAAGGGCTT 12120
TTGGTCTGCA CcTTCGCCCA AAATCGCCTA AGGAGCCGCC TATGTTCGGC GTCAGTAACG 12180
ATGACATTAG AAAGTATGCG CAGGAGAAGG GGCTTGATGA AGACTTTGCC TTTAAAATCG 12240
TCGAGCAAAC ACTGAAGGCC GCTTATAAGA CTACATTTAA GACAGATGAA AACGCCGTCG 12300
TTACCTTTGG TGAGGAGCGG GTGTGTATCT AtGCgCGCAA GCGyGtGGTT GAAGAGGTGT 12360
ACGACCGCGT CTCGGAAGTG GATTTGTCTA CGGCACTTGA GCTTGATCCC ACTACTTCTT 12420
TAGATAGCGA AGTGCTGGTG GAGCTTGAGT CCGAAGATTT TAAGCGTGGA TCTGTGCAGG 12480
CTGCCGTCCA GCGTATCACT GAGCTGAGCA GAGAAATTCA AAAGGACGCT CTGTATGCTG 12540
AGTACAAGAG CAAAGAAGGA GAGATTATCG TTGGCTACTA CCAACGCGCG CGAAACGAGC 12600
ATATCTACGT TGACCTAGGA AAAGTTGAGG GCCTGATGCC AAAGTCGCAC CAGCTGCCCC 12660
AGGATGATTA TCGTCAAAAC GACCGCATTA AGTCGCTTGT GCGTGAGGTG CGCAAACATC 12720
CAAAGTCGAG CGTTGTCCAG CTCATTCTTT CACGAACTGA CTCTGCTTTT GTAAAAGAGC 12780
TGCTCGCCGT GGAGGTGCCG GAGATCTACG ACGGTATTGT TGAGGTGGCA AAAATAGTGC 12840
GGGAGCCAGG GTACCGTACA AAGATCGCCG TCACCAGTAG GCGTGATGAT GTGGATCCTG 12900
TTGGTGCCTG CGTAGGTCCT CGGGGCATAC GCATCCGCAT GGTTATTAAA GAATTGAATG 12960
ACGAGAAGAT AGATGTGCTT GAGTATTCTC CGGATCCAGT TATTTTCATC AAAAATGCGC 13020
TTTCTCCTGC TGAGGTGCTG AACGTCGTGG TACTTGATGA GGAGAAGCGT TCTGCACTTG 13080
CCATTGTTGC TGAAAgCCAG CTGTCTATCG CGATAGGAAA GCAAGGTTTG AACGTGCGTT 13140
TAGcGAATCG GCTTGTGGAC TGGAATATCG ATGTGAAGAC AGAGAGTCAG TTTGAAGAGA 13200
TGGATGTGTA CACTGACACG CGTCGTGCGG CAGAAAATCT TTTTGATAAc GATTATCAAG 13260
AAGAGTCTGA GTTTTCyTCa TACGkGGGAT TTACgCCgGA GCTCATTAAG ATTCTGCAGG 13320
ACAACGGTAT CCAAGACGTA CAGACTTTGG TAGATTTGGG CGAGGAAGGC TTGCGTGCGC 13380
TTGAGGGCAT GGACGAGGCG CACGTACaAG AATTGCTCGC CgCCATTGAG GAGAATTTTG 13440
AAGTTGTCGA GGAnGGGGAG GAGGCTTCAG TTACATCTTC TCCCGGGACT GGTGGTGATG 13500
ArGATCAGGC gTTGCAGTGT CCTGAgTGTG GGGTGCgCaT TACTACTGAC ATGAGTGAGT 13560
GTCCTCACTG TGGTATTGGC CTCAGCTTTG AGTTTGAATA CGAAGAGAAC GwssmaTAGG 13620
AGAGCTATGA CCtACGAGAC AATACGCCTA AAGACACTTC CCGTGTTGCT AGTGAgCaGG 13680
CTGTGCgTTC aCCGGTGAaC GTCTGGTCcA AACTACCTCT CGTACACGGG CgGATGTTGA 13740
CGTAAAGGAG AAAAGACTCg TaATAAAGAA GACAATCAAA GTGCGCGCaA AGAAAGTGGT 13800
TGCCAAAGTT ACTgTGCGCG GCGTGTGTCG TGCGCGGATG AAAATCGCAC GCCGGGCGAC 13860
GCGAGTCAGG CGACTATTTC TGCCGCGCCC GAAGATAAAA AGCAAGGTTT CCCTGACATT 13920
CGGGAGGATG GCGTTGCGCG TGGTGTATCT GCCTCGTGTG GCGCTGTGCA GAACGCTGCG 13980
TCTGCACAGG TTCCCGGTGC CCGTACTCCG GGGGTTATAG GCGTTCCTGT TGCCAGCAAA 14040
ACGGTGGAGG AAGCAAGGGG TGGGGGAGCT AAGCGGGTAA TCACTAAGCG TGTGGGTGGG 14100
GTTTTCGTGC TTGATGACTC TGCGGCACCC CTAACCGAAA GGCAGGAAAC CTTGCATCTG 14160
GCGCGCGCCT TTCTCGGTTT AGCCGCAGTG ATCGTCAGCG CACAtyGGGT TTTCTGGTAC 14220
TCAAGCGCGT GCTAACgCAG GTGGTGTGCG GCGTGGAGAG GGCCGTCCGT TTGCTCGCGA 14280
TTTCAGTCGT GGGTCCACGG GTGGGTATCG GCCCGCAGTG AGAGGTCCGG CTCGGCCGGC 14340
TGGACGTGTT GGTTCGGGTC CAAGAGGGCC GGCGCCCCTG CAAGTAGGTG CTGGTAAGCC 14400
TGCCCAGAAC AAAAGGTCTT TCCGGGGCAG AAAGCAGCAG ACATATCAGT ATCAGCATAA 14460
GGATCGTCTT GAACTGGAAG AAAAGCTTCT CCAGCAGAAG AAGAAAAATA AGGAAAAGCT 14520
TGCGGCGGTC CCGCGCTCTG TTGAGATCAT GGAGTCCGTT TCGGTTGCAG ATCTCGCAAA 14580
GAAGATGAAT TTAAAAGCCT CAGAGCTTAT CGGTAAGCTT TTTGGCATGG GCATGATGGT 14640
TACCATGAAT CAGTCTATCG ATGCGGACAC CGCCACGATT CTTGCTTCTG AGTACGGGTG 14700
TGAGGTAAGG ATTGTCAGTC TTTACGATGA AACAATTATC GAAAGTGTAG GTGACGAGCA 14760
TGCGGTGCTC CGCGCACGTC CGCCAGTAGT GACTGTTATG GGACATGTTG ATCACGGAAA 14820
AACTAAAACG CTCGATGCCA TCAGAAGTAC GCGCGTTGCT GAGGGGGAGT TTGGCGGTAT 14880
CACGCAGCAT ATTGGTGCTT ATGCAGTCTC TACTCCGAAA GGCTCAATTA CCTTTTTGGA 14940
CACGCCAGGT CACGAAGCTT TTACCATGAT GCGCGCGCGT GGAGCAGAAA TTACCGATAT 15000
TGTGGTGCTC ATCGTAGCTG cAGACGATGG GGTAATGCCC CAGACGATCG AAGCGATCAA 15060
TCACGCAAAG GCTTCGAAGG TTCCCATTAT TGTTGCAATC AACAAGATTG ACCGTGCGGA 15120
TGCGAACCCG AATAAGGTCA TGACGCGCCT TGCTGAGCTT GGCTTAGCTC CAGAGGAGTG 15180
GGGTGGTGAT ACCATGTACG TGAGTATTTC TGCGCTGCAA GGTATTGGGT TAGATCTGTT 15240
GCTAGATGCC ATCATGCTGC AGGCGGAGGT GATGGAGCTT CGTGCAAATT ACGGGTGTTG 15300
TGCAGAAGGG CGCATTATAG AGTCTAGGAT TGATCACGGG CGGGGGATTG TCGCGAGCGT 15360
TATCGTGCGT CGTGGGGTGC TTCGTGTTGG TGACACGTAC GTTGCaGGTG TGTACTCAGG 15420
GCGTGTGCGG GCAATTTTTA ATGATCAAGG GGAGAAGATT CAGGAGGCGA CTCCTAGTAT 15480
GCCCGTTGAA ATTTTAGGGC TTGAGGGAAT GCCCAATGCG GGTGATCCTT TTCAGGTTAC 15540
GGATTCTGAG CGTATTGCAC GGCAAATTTC GCTTAAGCGT CAGGAGTTGA GGCGTTACGA 15600
AAATGCGCGC AACGTGAAAA GGATAACGCT TGACAAGCTG TACGAGTCTA TCGAGAAGGG 15660
TTCGGTTTCG GAGTTCAAGG TTATTATTAA GGGGGACGTG CAAGGATCGG TTGAAGCGCT 15720
CAAGCAATCG CTTGAAAAAC TTTCTACCGA TGAGGTGCAG TTGCGTGTCA TTCATTCGTC 15780
GGTTGGTGCG ATAAATGATT CTGATGTTAT GCTCGCAGCT GCTGATTCAA ATGTGACCAT 15840
TGTTGGTTTT AATGTACGTC CCACTCCCCA GGCTGCGGTT CTTGCAGAAA GGGAAAGAGT 15900
AGAAATCAAA AAGTATACTG TCATCTACCA GGCGGTGGAG GAGATGGAGC GAGCTATGGA 15960
GGGTATGCTC AAACCATCCC TCAAAGAGGT AGTGCTCGGT TCGGCGGAGG TGCGCAAGGT 16020
GTTCAAGATT CCCAAAGTGG GAAGCGTTGC AGGAGTATAT GTGCTTGAAG GGGTAATGAA 16080
GAGGAACGCC ATTGTTCACG TTGTGCGCGA TGGGATTGTC CTGCATTCGG GGAAGGTTTC 16140
CTCATTGCGG AGAGAAAAGG ATGATGTGAA AGAGGTACAC AGCGGCTTTG AGTGTGGGGT 16200
TGGAGTTGAA AATTATTTTG ATTTTAGGGA GCGTGATCGG CTTGAATGCG CGGAGATGAA 16260
GGAGGTGTCG AGGAAACTGA AGGATGCCGC TCTTTCCGAT GCGGCGCGCT TACAGGGATG 16320
AAAcAGGTAA GTCAGTTAAG GGTGCGCAAA TTGGGGGAGC ATATCCGCGC AGAAATAGCG 16380
CAGCTTATTA TGCTCGGCAA AATAAAGGAT CCACGTGTTT CTCCCTTTCT CTCTGTGAAT 16440
TGGGTGGATG TGTCTGGGGG GATGGTCTGT GCGCGGGTAT ATGTGTCGAG TTTTATGGGT 16500
AAGTACAAAA CGAAgCAGGG AGTGCAAGGC TTAGAAAGCG CGGCAGGTTT TATTCGCTCT 16560
GTCTTGGCTA AGAAACTCCG TCTGCGGCAG TGTCCGCGTC TTAGCTTTGT GTATGACGAG 16620
AGTGTGAGGG ATGGATTTTC TCTTTCGAGA AAAATAGATC GGTTAGAATC CGGCGGTGTG 16680
CAGACTGAGC ATGCCtGACG CTATTGTTCC TTTCGCAAAG GTTTCCGGTC TTACGAGTTT 16740
TGCGGCACTG GCACAGGTCA GGCGTCTTCT GGGAGTAAAA AAGGTAGGGC ATACGGGGAC 16800
GCTTGATCGC TTTGCTGATG GGCTGCTGTT GCTTTTGGTA GGGGGCTTTA CCAAACTCGC 16860
GCCGGTGATG ACTCGCTTGG AAAAGAGTTA CGAGGCTCGT ATCCAGTTTG GGGTACAAAC 16920
AGACACTCTA GATCCGGAGG GGGCTGTCGT GCGGTGCTCC TTGTTCCCAA CATTTGCGCG 16980
CGTGCGTGCG GCGCTGCCTC ACTTCACTGG GAGTATTGAT CAGGTGCCGC CTGAATATTC 17040
GGCGCTAAAA TTCGGAGGTG TGCGTGCGTC CGACCGGGTG CGGCGTGGGG AAGCAGTGTG 17100
CATGAAGGCT CGGCGTGTGT TCGTCTTTGA CTTGCAGGTA CTAGGTTGCG AGGCGGATCT 17160
GGGTGAATTC AAAAAGACGC AGGCGGGGAG GGGGGCTGCG ATTGCTGATC TTGATCTGAC 17220
GCGCGTGCGT GCTGTAACGC TGTACGTACG TTGTTCGGCA GGCTTCTACG TGCGTGCACT 17280
TGCGCGCGAC ATAGCAGCCG CTTGCGGCTC TTGCGCGTAT nTTCACATTT ACGGAGAACA 17340
CGCATTGGAC CCTTTGATCT TGCACAGGCG GCGGGTGT 17378
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5641 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GAGGAAGGCA AAACCTTTaA TTAAAGTACA CTGCGCGTAT GAGCAAAAAA TGCGCGCCTG 60
TTTGAATATT ATTCTGCACA CAGTACCGAA GTGTrtCTGT TGGGTATGCC AGAGACACGT 120
AACAAACAGT TGAATGAGAA GCTTGTGTAC ATCGAGCACG TACAAArGaA AGTAGTGGCG 180
CAATACGATC CGCAGCGGGT GCGCTATTAC TCCCTCAAAC CAATTGTACC CGGTGTACAC 240
GGAACATATG CAAGCGCGAT AAGGGACACG CACGGCCGTT GGGTACACGT GATGCACAAA 300
GACGGCATCC ACTACACCAT AGAGGGTGGT GCGTACGTTA TGGAAACTCT CTTACCCCTT 360
ATTCTTGCAG ATTTGGAACG GTCTCGTCAC GGATACATGC GTTCTTCTCT GGGGTCGCAT 420
GAACTCCCTG CGACGAAGGG aTGGAAAGAG CACGTCACGC GTCAACTCGA ACATAGGGAT 480
AAACCGCACC GTTGTATCCT GCATGACAGG GGGTGCGTCC TGCArGGGGT TATCGTACGT 540
CCACACCGTT GCCTGCACTT GCATAGTGCT GTTTTGGTCA CCTGAGAATG TTACCGTAAA 600
GGGGAGTGGT GGGCGCGCCT GCGATATGAA CCGTACCACA GATCCCTGCT CCAAACGCGT 660
AAAAGGCCTA TCTGCTTCGT ACGCAGCAGT TCCGTGTGGA GAGGTGATGC GCAGTTCGAT 720
TGCCTGCGCG CGCAACTGAG AGCGCACCTC CAGGACGTTG ATGCGTCCGC CAAAGTAACT 780
GcGAGAgTCC ATGGAGATCT GGAGGTGATC CTGTGCCTTT TGTATTTCCG CCGTGGACAG 840
CCGCGCGTGT GTGCGTGCAC GTATATGCCG AGGAATGGGC AGTGGAGAGT GAAGCGAAAG 900
GGTGAGCCCC CTCTCACTAA TGCGATAgCA TGCGGTAATT TCCTGTGGGA TACGCGTGCT 960
TGACAGTGAA amGCTCCAGA GCACACCGTT GATAACGAGC ACAGTCAGTA ATGCACCAGA 1020
AAGCAGAAGG TTTCTCCTAC GTATGAAGGG GATGCGAGGG TGGGATTTTt CCCGTTCAAA 1080
AAAGAGAGTG AGCATAAAAA GCATAAACGG AACACAACCG AGTGCAAAAA ATACGTTGCT 1140
TTGGTGGAAG GATAACGGCG CGGTAACCGC ACTGTTCTTT AAGTATGCGT AAAGGAAGGG 1200
AGCGAAGAGA AGAAACATCA TTGCAAGCAT TATCCGTCGT GTAGTCTTTC TCTGTGCGCT 1260
AAAGAAGACA AGCGAGAGCG CATACTCTAG AGCGAAGACT GCTGAGAGTG AAAAATCGAT 1320
AACAGAAAAA AGGACCGCGT ACAACAGACA GAGTGTGCTT GCAAGATATC CTCCGATAAA 1380
TCCGTTGTGC AACATGGAAT CTCTGATTGT CTTGCTGTTT GCCATGCACA CGCACGAGAG 1440
TGCAATTGCG ATGCTGTGCT TTACTATGAG TGCAAGAAGC GGTAATGCAC CTGTTGATGT 1500
GTGCGTGCCA AAGCGGATAA GAAAAAAAAG AGCGGTAAgT TGTGTTGCAA CGAACACGCT 1560
ACACACGCTC AGTACCCCTA AGATAGCAGG GAGCCACCAC ATGGCTAAAA TAGTGTTCCA 1620
GTAATGCCTT TTGCGCGAGT CAGAAAGAAA ACTAAAAATA GCCAAAGAGA GCAAAAGAGT 1680
GATGACAGCG CCTAAGATAA GAATCACGAA AAATTGCTCT CGAATGAGGT AGAGCGTGCC 1740
GCCGTACGAA ACACTCACGT AATGCGTGTC CCATTCTTCT GAGTACACCT GGGTAAGGAG 1800
TGCCgGGAGC GTGTGCAGTG CACCTGGAGG GTACAACGAA TTTTCCATCC TTACCGCaGg 1860
AATATGCTCC TTGATGTAGA GTGCATGGCG TGGGTCCTCG TGCAACCATC CAAGCCGGTG 1920
CAATATGGCG TCAAAGTCCC GATAACGGAT GGGGACGTGG TGTGATGTTA GGTGTTGGTA 1980
TACTGCTTGA AGAAGCCAGG GTGGGCACAC TGTACGGTGT GCCCCCGTGT GCAAGCGCGG 2040
TGGTCCTGTA GTTTCGTCGA GTATCAGGAC TATTGGCGCC TTATAGGAGG AGATGAGCGA 2100
GATGAGTTTT TTAGTTCCTG TAAGCCGATC GACGGGCACA AAATCAGGAA CGGGAGGATG 2160
GTCGTGTGCA GTTATTGCCA CGAGTACCGA CACGTCTGGG GTCTGGTGCT CAAAGTCCTG 2220
CACCAGCAAG CGGAGCTGCT GTACCGCACG CGTCTCGCGT TCTTGCGCTT CACGCGATGC 2280
AGTGAAGACG ACGAGCACAT CCGTTCTTGG TCCAAAGAAG TGAAGCTCGT TTTCCTGTGC 2340
GTGTACAAAA ATACCACAGA GCACGAAAAG CAGCGCGCGG CACACCCGAG TCATGACTTG 2400
TTATCGGTAG GGAGTGCGCC GGTTTCGAAC CATGTCTCTA TGCGCTTGTA GGAGCTATTA 2460
ATCcTGCGCA CTATATCGGC TGCCTGTTCG GACGTGTGCG TACACGTGAA GACGTCAGGA 2520
TGGTATTTTT TCacAGTCGT TTCCACGACT GCTTGcAGTA GGAGAGGGGG AGCCCTGCAG 2580
GTACTGAAAG GACTGCAAAA TCCTCAACGA GTTCAGGAGG AACAGGGACG CGCACTGGTC 2640
CTGGGGGAGG GTTCTTTTTT GGCGGCGGTC GGCGCTCCAT GCGTCCTGCA CAGGTGCGGT 2700
ACTTGCCGCC GCGGTTGTCC CATTGATCGA AGGGGTCTTC GTCGGAGTTG AGCCGGTCGC 2760
GTAGGATAGC ACCGATTCGC TCGTAAAAGC TGGTCATAGA CATGTTCCTT TTTGAATAAT 2820
GTCGCCGAAG CGTGCGCCTT CTCtGCGTGT ACGGAGCGTT GGAGTACTGC CCCCACAAAC 2880
AGGTTGGCGT GTTCGGGAGC TTCCCCGGCA GAAAGATTTG CCckCTGCGC ACTATTACCA 2940
GGTTCTCTTT TTGGAGCACA GTCCCACGGG GGTATGCGTG GAGATAGTGA AGAGAGCGGT 3000
TAGATTTCTG GTAGTGAGCG CGTTCTGAGG GAGCAAGCAC CTTCTCTCCT GAACCGATGA 3060
CGGCGCGAAC AACGTGCGGG GCATACCCAC GCTCGTGTAG AAAGGAGATA ATCTGTGAGG 3120
GAGAACGGCG AGCGCATGAG TTGAGCsCCG CTGTCATGGT GCGAAAGTCG GCAGGATCTA 3180
ATGCAATGGA GTCATCGAGT CCTGCATCTG TTCGCGAGAG GCAAATGTGT TTTTCGACGA 3240
TGCAGGCGCC GTGTGCACGG GCAAGGAgCG GGACAAGGAG CGGGTCTACG CTGTGGTCGC 3300
TGACGCCGAC GTTGATATTG AAGATGGTAG CAAGCGCAGG CAGCAGCGCA AGGTTGTACT 3360
CTGTCTCTGG AGCAGGGTAT GCGGTGATGC AGTGCAGTAA GGCGTGGGAG CTGCCCTGCT 3420
TGGTATACTG GCGGCATTGG GCAAGGGCCC CTTCGATTTC CTTCAGGAGG CAGACTCCAC 3480
TTGAAAGTAT AAGTGGAAGT TCTGCAGCAG CGAGTGTGGA GATAAGGGTG GGGTAGTTGA 3540
GCTCTGGGGA AGCTACCTTG AGGAAGTCTG GTTTcAAGGC GAGCGCCTCT GTTGCAGAGC 3600
GCGGGCCAAA GGGGCTGATG CCGACTAGCA TACCCCTGCT TCGTGCGTGG TTAAAGCACT 3660
GCGCATAAAA GGAAAGTGGA ACTTCTAACT CCTCAAAGCG CTGGTAGAGG GAAACTGCTC 3720
CGCTGGGAAG ACGGACAGCC CCCGTCAGCG GGTGCAGTAT TTCGTGCGCG TAGATGAGCT 3780
GGAATTTGAC CsCAGCTGCT GCTGcgTCTG CAGCTGCGTC TATGAGCGCC CGCGCGCGGt 3840
cAAACGAGCC CGCGTGTGCG aGCCGATTTC AGCGATGGTG AGTATATCCG CGTCTGGGCG 3900
AAAACAACGT CCCCCGCACG TGAACATGGG GCATTGTACG CCAAACGCGT GATTGGTGTA 3960
TAGCTTTCCT GATCGGTAGG CAATCCTTGC CGTGGTTTGT ATGGGTAAGA GGCAGGTGCT 4020
AAGATAGTGT GCGCTTGTCA GACATCTATT TTTGCAGTAC CGTCGTGTCG GCCCTGCGGG 4080
TGCCGAGGAT GAACGGCATG TTGCGCACGA GCGTGTTGGT ATGTATTGGG TGTCTCTCTG 4140
CTGCAATCCC TGCGCGCTTA nGTGCCCGTG CGGTGCCGCC TCTCTCTAGT GCGGTGGTAG 4200
ATGAGGCGGC ACTCCTTTcT GTGCArGAGG CGCGTGGTAT TCGCGCCCTT CTAgAgGGcT 4260
TGCGCGCCCT TCTGGArATG GCTCTTCCAG ATCGCATCCT TCTCCTGCGC CTGAAGCTCA 4320
TGCGCGTACG CTCCATGACG GTACGCCGTT GCAGATAGCG GTTTTGATTG TTGATTCGCT 4380
CCAGGGGGAT AGTCTTGAGG ATTTTTCATT GCGTGTGGCT CAGGAGTGGG GTATCGGCAG 4440
TCGTGCGCAG GATACAGGAA TTGTGTTAGT GATTGCGCGC GCGGAnTAnA AGCACGCATC 4500
GAAGTAGGAT ACGGTCTTGA AGACCGCGTC ACCGACGTGC ATGCACATCA GCTTATCCGT 4560
GGGACgCTCG CGCCGTGTTT TCAAGCTGGC GCCTATGCAC AGGGTGTGTA CGAAACGGTG 4620
TTGCGTTTGG CTACCCTGGT GCGGGGTCAA CACGAGGTAC AGCAGTTCAT GCAGCCGCGC 4680
TCTGTGCAAC cTGCGGTACC GCGCCGGGGT CCAGTGAGAA ATAGTGCCGG GAGCGTGTTT 4740
TTCTTCCTGC TGCTTTTTTA CTGTCTGGGG GGCCGGCTTT TGCCAGGGGG AGTGTTGTGG 4800
CCATTGCTGT TCTTCGGCAC TCGGCGGCGT TATGACCCGT TCGGGTCAGG GTTTAGCGGC 4860
GCATTCGGGG AGTGGGCAGG GGATGGAGGA GGGTTTTCTG GCGGTGGTGG TCGCTTCGGT 4920
GGAGGCGGGG CCTCTGGTTC TTGGTAGCTG CTCCTAGCAC AGCACGGTTT CTTTTTCTGT 4980
ACGGGCAGTC TCTCTTGGAA GAGGTGTATC TATAGTGTGC TCGGTGACGC ACGGGAAAAG 5040
CATAAGGAGT GAGAACAATG ACTGAAGAAG CTATGCGCGC GATGGCACTT TCCATCCGCA 5100
GTTTGACGAT AGACGCCATC GAACGGGCGA ATTCTGGTCA CCCTGGTTTG CCGCTGGGCG 5160
CAgCAGAGCT TGCTGCCTGT TTATATGGGA CGATCTTAAA GCATAATCCG GCGAATCCTA 5220
GCTGGTTTAA TCGGGATCGT TTCGTCCTGT CTGCAGGACA CGGGTCTATG CTCTTGTAaT 5280
GcTGCGCTCC ACCTTTCTGG GTACGACGTT TCGCTTGAGG ATATTAAGAA CTTTAGGCAG 5340
GTAGGCTCCC GGTGTCCTGG CCATCCTGAA TACGGTTGTA mCCCCGGTGT GGAAGCAACA 5400
ACCGGTCCAT TGGGTCAGGG TAcTCTATGG CGGTGGGTTT tGCGCTTGCA GAGGCAATGC 5460
TTGCGGCAmG TTTTAATACt GATGAgCAtG CCGTTGTAGA TCACCACACC TATGCGCTTG 5520
TGGGGGAAGG CTGCCTTATG GAGGGCGTTG CCTCAGAGGC TTCTAGCTTT GCCGGCACTA 5580
TGCGTCTGGG CAAGCTCATC GTTTTTTATG ATGAGAACCA CATCAGCATA GACGGATCTA 5640
C 5641
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8790 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGCAACAGAA AGCGGCGTAT GTCCGTCAGC GTCGCGTTCC TGGGTCGGGG GACCGAGCGT 60
GACAAGGTGT TTGATCAGAT CGCGGTCCAG CGCACGCACG GCCCAATGCC AGGGTGTTTT 120
GCCGCTTTTG TCCTTTTGTA GAAGTGTCTT TCTGGTTACC AATTCGTGCG CAGTTCCcTT 180
ACGGATGGCG CGGGTAAGCG GAGTGTCTCC CTGTGCATCC CGCGAAAAGA GGCTCGCGTC 240
TCGTGCAATC AGCATACGAA CAGACTCAAA ACAATTTTCT GTGACCGCCA CCATAAGCGG 300
CGTACTGCCT GAAGCATCCT GGGCCTCAGT ATCGGCACCC ATAGAGAGGA GGAAGTCAAC 360
CACGTGCGCG TCGTTACGCA ACACCCCCAC GTGCAAAAGT GTATCTCCAT TTGCATCACG 420
GACGTTGACA GAATCTTTAC CAAAACGGGT CTTCAGCGTA TCTAGATCAC CGCGCGCAAC 480
CATTTCAAAG AGATCGACCG AAGCCTGAGG AGAGGGAGAA GACGTAnTAG TGCAGGAAAG 540
CAAGACGAGG AAACACGCAA ATGTGCTCCC CACAAACCAC ACAATGCCAC GATTATGTAT 600
ATGCATGCAG CGGATCCTCC TGAGTATGGT GCCGCGTCTG TACAGTGTGT GTAAAAGCGT 660
ATCCTACCGG TTTCGGCGAT AAGGCACAGA ATCTTTAGAC GCCCACTCTC CCGTGAGGAC 720
GCAACcGCGC AGGCGCGTTC CCATTTTTAA AAACCCAGTA TCTGCTGACG GTGATGTAAT 780
CCGAGGTCTT TTAATGTGTC ACACACCTGC TGCTGAGGTG AcCGTGCACG TGCCTTTGCA 840
AAATCACGCG CAAAACAATT CATTGAAAAA TGTTGAAATA CAAAAAGCGG CGTACGGCGG 900
TCACACACCG TTTGTAGTAA ACGCGCATAG GGTTCAAACA GATGTCCGCT TTTCAATTCG 960
CTGTAACCCA CAATCCACAG ATCACATCCT AACGATCGCT GCAGCTGACA CGTACGCCGC 1020
GCCATCCAAC ACTGGCTACG ATGCAAAAAT GACTGCACCT GCCGTGCACG CgCAGTTCCT 1080
GCTGGATCTT GCGCAAGGaG GCGTACCAGT GCACGGAGCT GAACCGTTTT TGCCGTGTGA 1140
AGCGGCGTTT TGTTCAACAG CATCACGTCT GTACGAAAAT CAACCCCTAA CTGTGcATGC 1200
CGCGAAAAAA AGCCCTGaGC CACTCGACCT GACTGTCCTA CCAGATACCG TTGCTGCGTA 1260
TGCacTGGCT CCTCTTTCCC TGGATTGTCC GAAACCAAAA TAAGGGCCGG TACAGGATCA 1320
CTTTGGGTCA GTTCCTCAAG CGCGTGAGGG TACACAATCG GAGTGTGTAC CGGATACGTG 1380
GGTATTCCTT GTGCACGCAC GAGTGCCGCT TGCGCACGGT GCAAAAAAGC CTGCGGCGTG 1440
CAACACGCcA CTGCAGACTG CGCTGCATGC GGGTCCATAT ACGGATACCC AAACGCGCGC 1500
AAACTCTGTA CACAAAAAGC CTTGAGGTCA GTGCGAAAAG CGGcGAGCGC ATGCCACTGC 1560
GACCGAGTcA CGCGCTcACA TcCAAAACAG AAAGCATCCT CTACTATACC CTACATACCA 1620
CGTCCCTTCC TACAGACTGc AGTGACGGCG CAGGCGCACT GGCTCAGTGC TTCCTCCAAA 1680
ACGGCGCCCA TTGACAAACC ACCCATAAGG TCTCACGATT GGGCCTCTGT GTAGAAGAGA 1740
ATATCACCAT GCTGCAAAAA CGCTCAGATA CCCTCGACCG TCTGCGTCAC AGTCTGGCGC 1800
ACGTTATGGC AGAGGCCGTT CAAGCTCTCT TCCCCGGCAC CAAGCTCGCG GTGGGGCCGC 1860
CTATCGATTA CGGGTTTTAC TATGACTTCT CACCTCCCCG TCCCCTGTGC GATGCAGACC 1920
TAGCCCCCAT TGAAGAGAAA ATGCGCGCCA TCTTGCGTGC GGGGTGTCCC TTTGTCAAAG 1980
AGGTGGTTTC GCGTCCTGAC GCGCTTGCTC GTTTTAAAGA CGAGCCATTC AAGCAAGAGC 2040
TCATCGAACG CATCAGCGCA GACGACACGC TCAGTCTCTA CCACTCCGGC GCGTTCACTG 2100
ACCTGTGCCG GGGTCCTCAC GTGCAGTCTA TGCGAGACAT TAATCCGCAC GCCTTTAAAC 2160
TCACGAGCAT CGCTGGGGCC TATTGGCGCG GTAATGAGCG CGGCCCCCAG CTGACGCGCA 2220
TCTACGGCAC TGCCTGGGAA TCTGAAGAAG ATTTGCACAC ATACCTTCGC ATGCAGGATG 2280
AAGCAAAACG CCGAGATCAC CGTAAGCTCG GTCCTGCACT CGGTCTCTTT CACTTGGACG 2340
AAGAAAATCC TGGCCAGGTC TTTTGGCACC CTGAGGGGTG GACCCTCTAC GTGGCCATCC 2400
AGCAGTACTT GCGCCGCGTC ATGCACGAAG ACGGGTACGC AGAGGTGCAT ACTCCCTTTG 2460
TCATGCCCCA AAGCCTTTGG GAACGCTCGG GGCACTGGGA CAAATACCGC GCCAACATGT 2520
ACCTGACCGA AGcGAGAAGC GTTCTTTTGC GCTCAAGCCC ATGAATTGTC CCGGACATGT 2580
CGAAATCTTC AAGCAAAAAA CACGCAtTAC CGTGATCTCC CGCTCCGTCT TTCGGAGTTT 2640
GGCTCGTGCA CCCGCAATGA ACCGTCAGGC TCCCTGCATG GAGTTATGCG CGTACGTGGC 2700
TTTGTACAAG ACGATGCCCA TATCTTTTGT ACTGAGGCGC AAATCGCATC GGAGGTCACC 2760
CGTTTCTGTC GCCTCCTTGC GCGGGTATAT GCTGACTTTG GCTTTGCACA GGAGCAGATC 2820
CGCGTCAAGT TTTCTACGCG CCCAGAGCAG CGCATCGGAG ACGACGCCAC CTGGGACCGG 2880
GCCGAACGCG CATTGGCAGA AGCATGTGAA GCAGCAGGCC TTTCGTACGA GCACGCACCG 2940
GGAGAAGGAG CGTTCTATGG ACCAAAGTTG GAGTTTGCAC TTATAGATAC ACTCGAACGC 3000
GAGTGGCAGT GCGGCACCAT TCAGGTAGAC TATCAGTTGC CCTCGTGCGA GCGCTTGAAC 3060
GCAGAGTATG TGGGGGAGGA CAACCAACGG CACATGCCAG TGATACTCCA CCGCACGGTG 3120
ATTGGGTCTC TAGAACGGTT CATCGGTATT CTCATTGAAC ACTACGGGGG TGCATTCCCC 3180
CCATGGCTCG CACCGGTGCA GGCAGTGGTG ATTCCGGTTG CCCCTGCCTT CCTCGAATAT 3240
GCGCAgcACG TTGCACGGGA GCTGTGCGCC CGTTCGCTCC GCGTGCAGGC AGACGTGAGC 3300
GCAGAGCGCA TGAACGCAAA GATCCGCACT GCCCAAACGC AGAAAGTGCC CTATCTGCTC 3360
ATAGTTGGCG AGCGGGAgTG CGCGCGCAcA GGtAGCGGTG CGTCCGCGCA CAGGGCCCCA 3420
GCACTCAATG GGGCTCTCAG CCTTTTCCAC CTTTTTGCTC GCGAAcTAGA GACGCGCGCG 3480
CTGCACGCCT AGCCCATGAG TCCCCTGTGC CTTTTCCCCA AACCTTCAGG GGAAGGGACG 3540
CTATATCCGT AGCTGCTGTA CGCTACCGCC GTAGAGtGCG CGCGCGTGGC GTTGATATCC 3600
TCACTCTTTA CATAAGAaTC AAAGTCCATC ATACGATCGA TAATCCCGCG CGGCGTAATT 3660
TCCACAATGC GGTTTGCAAC AGAGCTGACA AACTCATGGT CATGCGAATT AAATAAAATC 3720
ACGCCGGGAA ACTGCACCAA CGCCTCATTC AGACTTGCAA TTGCTTCTAG GTCCAAATGA 3780
TTGGTCGGCT CGTCCAATAT CAAAACATTG CTCCCAGAAA GCATTAATTT ACTAAGCATG 3840
CAGCGTACTT TTTCCCCTCC AGAAAGTACA CGCACAGATT TGAGCGAATC CTCGCCTGTA 3900
AAAAGCATCC TGCCTAAAAA ACCGCGTACG TAGGTTTCAT CTTGATCATC AGAGAATTGG 3960
CGCAACCAAT CCGTGATAGA AAGATCACAA TCAAAATACC GCGCCGTATC CTTTCCCATA 4020
TACCCAACAG ATACCGTCTG TCCCCAACGG AAAGAGCCgG CATGTGCCTG CTTTTCTCCA 4080
GCAAGAATAT CAAACAATAT GGTCTTCGCG CGGTGTTCTT GACCGACGAA AGCGATTTTG 4140
TCTGTGCGCC CAACTGTAAA GCTCATGTCT GTAAAAAGCT CACATGAACC TCCCTGCATT 4200
CGGTCCTCAG CGGCATAGCG CAGTCCATCG CACGACAATA CGTGATTCCC AATTTCACGC 4260
CGTGGTTTAA AATGCACATA GGGAAACTTT CGACCAGTCA CCTCAATCTC TTCCAGCACC 4320
AATTTGTCAT ATATCTTTTT ACGACTCGtC gccTGCCGGC TTTTGGCTGC GTTAGAAGCG 4380
AAgCGCAAAA TAAACTCCCT CAGGTCCTTC ATCTTTTCTT CACGCTTCTT CTGCTGATCC 4440
TTAACCTGCC GCTGCATAAT CTGACTCATC TGATACCAAA AATCGTAATT GCCCGAGTAC 4500
AAACGAATCT TCCCATAATC GATATCGCAA ATATGCGTAC ACACGCTATT TAAAAAATGC 4560
CTATCATGCG AAACTACAAT CACAGTGTTG GGGAATTCAA TGAGAAATTC TTCCAACCAC 4620
GCAATAGAGT ACAAATCCAA ACCGTTTGTC GGCTCATCGA GCAAAAGCAC ATCGGGATTA 4680
CCAAACAACG CCTGCGCTAG GAGTACACGT ACCTTCTGGC TTTCGTCCAA TTCGCACATC 4740
ATCCGATCAT GGTGTGCCTC ATCTACACCC AACCCAGAAA GCATTTGTTC AATGCAATTT 4800
TCTGCCTCCC AGCCATTCAA ATCCGAAAAC TCACCTTCCA ATTCTGAAGC CTTCAACCCA 4860
TCTGCTTCAC TAAAATCACT CTTTGCGTAA AGAGCTTCCC GCTCCTTCAT CACTCGATAG 4920
AGCGCAGGAT GCCCCATGCA TACGGTATCT TTCACCGTGT GCTGATCGAA GGAAAAATGA 4980
TCTTGACGCA GAACTGCGAC GCGCGCGCCG GATGCGATAg cGATACTTCC CTGATGATGT 5040
TCGAGTTCAC CGGAAAGGAC TTTTAAAAAA GTTGACTTAC CTGCCCCGTT CGCTCCAATG 5100
ACTCCATAGC AATTCCCTGC AACAAACTTT AAATCAACAC CTTTAAAAAG AGGTTTGTCA 5160
GAAAACTGCA CACTCATACC CGTCACTGTT ATCATGCGGC GCATGcTAGC GCAAAATCCG 5220
TGcACAGGaC AAGCCGCTGT CCATAGAGCA TCACACATAC AGCGATGCTA TGAGCGCGTC 5280
ACTGTGGAAA ATATACGTGC AATACACCTC GTTCATTTCT TACACACAAC TGTGcAGAGC 5340
CCCCTGTAGA AAGACAGGTC CCCAGTGTTT TCCTCACACG CTGATCATTT ATGTACACCG 5400
CACCGTGGCC AGAAAATACT GAAAGTGCAT AGTACGACTG CCTTTCTGTA AAACGCGCAA 5460
CAACTGTGCC GGTGCGAGTA CCTATCTCAC TATTCCCTTG cAACGTACCA TCAAAAGATA 5520
GTGTGCCACG GTCCGTGTAT AAATGCGCGT CAGATAGGAC GCAGCGACTG CATTGCGTAT 5580
CACCGTCTGT CGTGTGCAGG AGAGTCCGAT CGGTACGTAC TCCATTGAGC TTTAGTTCGC 5640
TCGCGTGCGC ATACACATCT GcAAAACGCA CCTCAATACC TTCAAGCCGA AGACTGCTGT 5700
CTTTCACCCG CACTTTAAGA TTGTGCACGT TGTGctgCGC TGgCACACAA ATGATCGCTT 5760
CAATCGGTAC CACGCTATGT CCCCATCGGC TCCCCCATAC GTTTTTCCAA AATGTGTAAA 5820
ACCAATCGCG CAACGTATCG CGTACGTTCA ACGCACTCTG CATTACTCCT GGGGAATCCG 5880
CTGGAGGCGT GGACTCCCGT ACTGTTTTCT TTCCACGCCG TAGGACAaGC ATCGTAGGAT 5940
CCAGGTGAAT CGATAGCGGA TCGTACACGC TGTTCTTTAC AACCTTGTAT GCAAGAGAGC 6000
GCCgnTGCGC ACATACCTGT ACACTCACTC GTACGCGAtA GCATCTATAA CAATGGTATG 6060
GGGATGTGCT TGATCTAGGC AGGTA ATCC ATCTGCATCG GTAAAGGTAC GCACAACAGG 6120
TTGAGAACTG TGCGCGGGGT GACGCTGCAT ACGGCTACTG GTCCAAAACC TCGATACATC 6180
TCCACGCAGA TACTTCATCG ATCCGCCCAA AAGATACGCA CACGCTAGAG AGAGCGCACC 6240
AACCAAACAA ACGATAACAG TGTGTGCACG CGCTTTTTTG TCCATTTTCT CCCCCTCACC 6300
TATTTCTCCT CTGTAGAGCC TTTCCTCCGT CCTTAAACTG AACACCAGTT AGTGGACCAG 6360
ATTACGCCGC ATCAGTACAA TCGCGCGCAA TGAGTGGGGA ATATCAATCT TTCACGCTCA 6420
AGCGTGCGCG ACGcGTCTAT GACCAGTATA ATGTGATTAA CTCCCTTTCG TTCGCACTCG 6480
TAACTGGCAA TACCATTACG CTCTATGCAC TGCTGCTTGG TGCCCGCAGT ACCACGGTAG 6540
GCTTGCTAAG CGCGTGCATG CACTTTTCCT TCTTTGCACT CCCTTTAGGA AAACTTGTGT 6600
GCCGACGTTT TGGCGTCATT AAAACCTTTG CGTACACCTG GATCGCCCGC AATACTAGTT 6660
TGCTTCCAAT GCTCGCAATC CCTCACCTTT ATGCACAAGA CTATACGGCA CTTGCACTGT 6720
ATGTGCTTAT TTTTTCCGTC GCACTGTTTA ACTTTTTTCG TGGTATGGGA ATGATCGCGA 6780
ACAATCCGGT CATCACCATG CTCGCACCAG GCAAACATCG CAGCTCATAC ATCGTACGCA 6840
TCTCGCTTGC GAACAACAGT GCCATACTCA TTGCCACGCT TTTACTCTCC GGGGcACTGA 6900
GCGTTAACGC TTCACTCACA ACCTATCACT TTGCAACTGC ACTCGGCATC GCACTAGGTT 6960
TTTTTGCTTC GTTTCTCCTT TTCACATTAC CTACCGTCGA GTCATGCGAA CATGTGCAGC 7020
ACACTTCCCC GGAGACCCCA CGGACCTCAC CGCGCTCCGG GTACACCACG ATACTCCGTG 7080
CTCTGAAAGA GAAAAACTTT CGCACCTTTA CGTTCGCTTT TTTTGTCAGC AGCTTTGCCA 7140
CAGGTACAGT ACGCCCCTTC GTTGTCGTAT TCGCAAAGGA CGTATACCAC ACTCCAGATA 7200
GCTTTATCAC TATCCTCACC GTATGTGCAT CCGGCGGTGC ACTCATCGTC GGTTTTATAA 7260
TGAGTTTAGC TATCGATCGC ATTGGGGCAA AGCCAATGTA CATTATCTCC TCAGTTTTAA 7320
GTGTACTCAC CCTCATCCCT GCGCTTGGTA CGCCAGGACT CCATTCCTCT TTCCTTTCAA 7380
TTGCTTTTTT ATGCCTGTTC TGTGCAACTA CCAGCATGGG ATTTACCGGA CAAGATAATG 7440
CAGCGCAGTC CTATTTTTTT GTCCTCGTTC CTGAGGATGC TTTAATAGAT GTAAGTGTCC 7500
TGTACTATCT TATTTTGGGC ATCACTGGTG GAGCCGGATC GGTGATTGGC GGCGTGGTAT 7560
TAGACTTCTG CCATCTCTCA GGATACTCCA GTTTGCAGGC ATATCGTATC TTTTTTACAG 7620
GAGTCAGCGC GATTATGATA ATCGGCATCG CGCTTCAGAC ACAGcTGCGC AACCTGGGTG 7680
GATACCGTGT ATTGCGAACA CTCGCAACGC TTTGCTCTCC AAAAGATCTG CGTACTCTCA 7740
GCCTCCTACA TAAACTCGAC TTTAACGAAA ATTTAGAAAC CGAGCAGCAT ATCGTACAAG 7800
AACTTAGTAC CATCGCCTCT CCCATCTCTG CCGAACAACT GGGCACCTAC GTGCAATCGC 7860
CACGTTTCAG TATCCGCGCA AGCgcATTGC AAGCACTGGA AACGATTCCC TCGCTGAGTA 7920
CACACAACCG TAATCTTTTG CTGCGAGAAT TGCGCGAGGG AACATTCACT ACTGCCGCAC 7980
AGGCGGCACG CATCCTTGGC ATTCATATGG TCCAGCAAGC AATTCCAATC CtGcgCGAAG 8040
CGCTCCATAG CGAGGATTAC CTGCTCGTCG GAGAAGCGCT TGTaGcGTTA GCACGCACAC 8100
ACGATGACGA AAGTCATTTC CTTATTGGGC ATGTGcTGGC GCGCACGCAA AATCCCTTTG 8160
TCGTGCTGCG TGGCCTGCAA GCGCTTGAGA TGCTCAATTC AGTCCACGCG CTACCACCAC 8220
TGTTTGAGAT TTTGCGCACA ACGTGCAAAA ATACACAAAC GCACACAGAA GCATTACTGA 8280
CTCTATCGGT CTTGATGGGA ATACAAAATG AATTCTACTT TCTATTTGAG CGCTACgTAC 8340
CGGTCATACA ACCGTACAAG CGCTAGTACG AGAAAAACTA GAAGAAAGTT TTGCTATCAG 8400
CAGGGTCACT GACGCGACAC TTGAGAAAAC ACTGGAACGC TTTACGGCCG ACGCACGCGC 8460
GGGCACCCAC GTGGTCATGT GGGTACTGGC ACGCGCAGGA GAAGACCTAG GGACAAAAAC 8520
AGCACTCCTG CTGAGTCTTA CGTTGGAGAA TCCCCTGTGC GCGCGAGAGG CTTTTCGCCT 8580
TCTGATAGGT ACATGGACGG CCACCTTGTT TAGAAAACCC GCACTCATGT GCTCTTAGCG 8640
CTCAGACGGC CCGGTGCGCA CAACACGCCG CAGGACGTGA TCGACCGTGA CTATCCCCCC 8700
TAAAACCGAA ATCGCACGGT AGAAAGCGTT TGCCCATCGC GCAACACGTC AAACCACACC 8760
TCCCTCgTnT GACTGCAAGC ACCGCGTAAA 8790
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 651 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
nCCAnTCGCG GAAATTAACC cTCACTAAAG GgAACAAAAG CTGGAGCTCC ACCGCGGTGG 60
CGGCCGCTCT AGAACTAGTG GATCCCCCGG GCTGCAGGAA TTCGATATCA AGCTTATCGA 120
TACCGTCGAC CTCGAGGGGG GGCCCGGTAC CCAATTCGCC CTATAGTGAG TCGTATTACA 180
ATTCACTGGC CGTCGTTTTA CAACGTCGTG ACTGGGAAAA CCCTGGCGTT ACCCAACTTA 240
ATCGCCTTGC AGCACATCCC CCTTTCGCCA GCTGGCGTAA TAGCGAAGAG GCCCGCACCG 300
ATCGCCCTTC CCAACAGTTG CGCAnCTGAA TGGCGAATGG CAAATTGTAA GCGTTAATAT 360
TTTGTTAAAA TTCGCGTTAA ATTTTTGTTA AATCAGCTCA TTTTTTAACC AATAGGCCGA 420
AATCGGCAAA ATCCCTTATA AATCAAAAGA ATAGACCGAG ATAGGGTTGA GTGTTGTTCC 480
AGTTTGGAAC AAGAGTCCAC TATTAAAGAA CGTGGACTCC AACGTCAAAG GGCGAAAAAC 540
CGTCTATCAG GGCGATGGCC CACTACGTGA ACCATCACCC TAATCAAGTT TTTTGGGGTC 600
GAGGTGCCGT AAAGCACTAA ATCGGAACCC TAAAGGGAGC CCCCGATTTA G 651
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TACCCTTTCT CCTTCAGTGC GTAtCTACAG yTATCGCACC AGACGCCACT TACAGCGTTG 60
GCGCGCCTTT TTGTCACGCA CGAAAcTGCG TATGTGCCTG CTATCCCCCC CACGTCTGCC 120
GTGAGCCGCC CTTACACCGG TATCCTCATA GATGCGCGCG GTTCTCTTCC TGTGCACGGC 180
GAATACGTGT CAGAGCCGCT GAGCGCATGT TTGTTCCCCA AGATTTGGAG CACGGACATG 240
GATTTAATCT ACGAAAAGAA TATGGTTCAC CCTGACCGTG CCAAGGCATG GGGTGTGGTG 300
CGGTACGGCT CGGTTTGGGA CGAGAAAATG TACCGAGACA GGATAGGTAC CACGCCCTTA 360
AAAATCATTG CGCGCGGAGT GTTTGGCCAG CAGCGCACGG ATCCTATCAT TGCATCAAAG 420
GATGCAGCCC AGATCTTGGC GCGCCCTGAa GAACTTGCGT TTGCTTGCAG AAGGCAACGT 480
GATTATCCTG TGCGACGAAG CAGCGCTGCG TGTGCACGTG CCGTATCCGC TTGTAGACGA 540
GCACTTT AC TTTGCATACC ACGACGTAAA ACGCTTCCTA ACCGACGAGC GGTCCCCCGG 600
TGTCGGTGTT CGCTCTGGCA TCAATACCCT CAAGATCACC GTGTACGACG TGCGTTTTGT 660
GGCAAACTCC CCAGAGATTC TCGCCTCAGA AAAAGATCGG GTAGACGTGA TAGCAACCGC 720
ACTGAAAAAG ATGGGsCCGT ACACAAGkTT TTTAATTGAA GGCCACACCG CAGATTTACA 780
CCGCCCTCAG GAGGAAGCGG CGCTTTCTGT AGCACGTGCG CacGCATGGC GCAGGAACTG 840
TCCAGACGTG GCATTGAGAT GACGCGGATT ACTACGGCAG GACACGGTGC GACAAAGCCT 900
ATCGCGCCAA GCGATaCGCA CGCGAACAAA GCCAAAAATC GTCGAGTGGA GATCACCATC 960
TTGcGCGATT AGTGCACGTA CCACGGAGCA TTCTCCGTGC CGGCTATTTC TCCCAAGTAA 1020
AGAGAACCTG CGATGACGTA CCGATGGCTT TCTGCAGTCA GGCGCAGTTA AAAGGAAGGA 1080
GCACTATGAT AAAGCCACGC GCGTATGCAC TGTTAGGCGT GTTTTTCCTG TACGCCTGTG 1140
CAAGCACACC ACGGGAAGAA GATGTACCTG AAAAATTCAC CCCCGCTGAC CTCATGCTGC 1200
GTGCACAGGA ATCCTACGAC GCAGGTAATA TAACGTGGGC GCGTTTTTAC TACCAAACGG 1260
TTCTCGATCG TTTCCCGAAC AATGAGTCAG CGGTCATTAG TGCAGAGTTT GAACTTGCGC 1320
ACATCCTTGT TAAACAGAAA TCCTGGCAAG ATGCCTACAA TAGGCTCATG TATATACTCA 1380
AAAAATATGA GGCTGCAGGC AGCGCACGCC TGCCTCCTGC CTACTACAAG CTCACACTCA 1440
TTGATCTGTC GCGGGTAAAG CCGCACTTGA ATCTTGAGAC AGCGAATACA AAAGCAACAG 1500
AATATCAAAA GAACTACCAA GAAGAGCTCA AGCAACGCCA GGAACTACGG CAAAAACTCT 1560
TACAAGAACG CACACAAAAA ATGCTTGAGG CTCTCCATCA AGAAGAAACT CCCGAACAGG 1620
ACGCGCGCGA TACCGCAAAA AAGAAGACAG ACCAAGAAGA ACACACCATG CGCAAAGCAA 1680
ACGCGCCTAA AACCAAAGCG TCTGGAGAAG CACCCACCCC ATGAAGATCC TGCACACAGC 1740
GGACCTACAT CTAGGCAAAA CACTCCATGA AGTATCGCTT TTTGCGTCAC AGAAAAAAAT 1800
GCTCGGCGAT CTGTGCACCC TCCTTGCGCA GGACAACTAC GCCGCGCTCA TCATCGCAGG 1860
CGACATCTAT GACCGCTGTG TACCCTCTGC AGAGAGTGTC AGTCTTTTTA GTTCTTTTTT 1920
GCAAAATATC AAACGGTCCA TGCCACGGCT CCCGATATAT CTCATCCCCG GcAACCATGA 1980
TTCTGCGCAA CGTCTCTCCT TTGCCCAGGA GCTACTTAAG CAGCAGGGAG TATTCATTGC 2040
GCAGGATCCT GAAGAGAGCA CCCGTCCCCA TCTCCTCTGT CACGAGGGGG AAACAGTGCA 2100
GTTATTTTTA CTTCCCTTTC TCCACGCAGG TGCCTTTTCC TATCTTGaTG AGGAAAACAC 2160
CACTTGTCTC ATTCACACCC AATCCGAACT CCTTCAAGAA GCCTCGCGTc GCTTGCAGcG 2220
TGCAGTATCG TTGGACACCC CTTCTATCCT TGTCGCACAC CTATTTACCC AAAAAGGTAT 2280
TAGCTGCGAA AGTGAACGCC CGTTTGTTGG CAATGCCGTT TACGCTGACC CACACTGGTT 2340
TGACTTTTTC ACCTATGTTG CACTTGGTCA TTTACACAAA TGTCAAAAAA TCACCGAACG 2400
CATGTACTAT TCCGGATCTC CTTTGCCCTA TTCGTTTGAC GAAGCAAATA CCCAAAAGGT 2460
TGCGCTTTCT GTAGAGATTC ACTGCAACAC AAAGGGATTC CCCATCCATG TGACTCCCCT 2520
TCCACTTGAG CCACTTATCC CTCTTCGCAC CATACGCGAC TCATTCCACG CACTATATAC 2580
CGGTGATCGC TATCTCCTTT ATCAACGTGA TTTTTTAGAA ATCACCCTGA CCGACCCGGC 2640
GCTCGTGCAC AATCCTATTG GCCTTTTGAA GCCGCGCTAT CCAGGATTGC TCAGTATCAA 2700
GCAGGAAAAT GCGTTCGCCT TTGATATACC CCCCCCCTAC TCCTCTAACG AGGGGATAGC 2760
GCCCTGCACA CACCACTCAT TGCGCACACA CTTTGATGTA TTTATGCACG AAGTAAGCCC 2820
CACTCCTGAT GACAGAGAAA AGGGCGCTCT CTTTCAGGAA CTTTTTGACG AAATGCAACA 2880
GGAATTCTCA TCGTGAAGCC GATGCGTCTT ACGCTCCACA ACATCGGTCC TTTCGTTGGC 2940
ACCCATACAG TTGACTTCAC CGCGCTCGGT CCTATTTTTC TAGTGTGTGG GAAAACAGGT 3000
TCAGGAAAAA CCACTCTATT CGATGCGATC GCCTATGCCC TGTATGGGAA ACCCCTTGGA 3060
ACCCGTGCAG AAGTTATCCG CAGTCTGCGC AGTCATTACG CCGCACCATC AGAAGCTGCA 3120
TTTGcTACGC TGGAATTTTC ACTCGGCACT AAAATCTACC GGGTACACCG GACGCTGACT 3180
TGCACACTTT CCCACAGAAA AACAGAGCAA CCCGAGCAGC TGTATCTTGA GCAAAAAAAA 3240
GGTCATGGAT GGGAGCGTAT TGCTTGTGCG CATAAAAGTG AAACTGAATG TGTTATTCAC 3300
GATCTTCTCA AACTCAATAG CAAAGAATTT GAGCGCGTGG TTATGCTCCC ACAGGGAGAA 3360
TGTGCGCAAT TTTTAAAgCA AATTCAAAAG AAAAAAAAGA AACGCTGATG AATCTATTTC 3420
CTGTTGATCA ATATACTGCT CTTATGGAGC GAGCAAAAAA AAAATCGCTC CATGCCAAAG 3480
CAGTGCTTGA AACGCTGCGT TCGCAACTTG AAACTCTATG TGCGGAGTGC ATGCCCGACA 3540
CATACCACGA AAGGAAACAA ACGCTAGAAG CTGAGTTACA GCACGCACGT GACGCACTGC 3600
AGCAAACCCG CATCTCCCAT GCGTACTATA CACAAAAACG TGAAGCGCTC GAAGCACAGC 3660
TAAAAAAACA ACAACTTTGT AAAGAGCTGC GTGCGCGTAT AGAAACATAC CGCGCGCAAG 3720
AACCAGTCCA CGCGGAAACT CAAAaGCGTA TTGATCGCGC GCGAAAAGCG GCACCACTTn 3780
TGCGCACATA AAACACGTCA CCCAGTGCGA ACAAGATGCA CaGCGCATTC ATGCAGAAAT 3840
ACAGGAAAgA TGCGTTCACG CGAACAATTG CTCATGAAAC GAAGTGCGCA TGTCGCGCAG 3900
CAGTCATCCA TTGAAGAACA ACGCCGTCTA CTACAAACAC TTCATAGTGC GTGCATTCAC 3960
ATTGAAGACG CGCATGACGT TGCCACGTCG ATACGCGACA TATCTTGTCA GGCGCACACA 4020
CTCACGCAGC ATATCCACAC GCTTGCACAA CAAAAAACAA CACTTACCCA GCAAGAACAA 4080
TCGTTGTGTA AAGAACTGGA TATACTGCAA AGAGAAGCGG GTACTATCGA TACTCGTACA 4140
TCTGCCTTTA ATGATTTACA AATTCAACTC GCGCATGCAA AGAAGACACA AGAATTGTCT 4200
CAGCGATATG CCGAGCTCTG TGCGGtCACG CAACATGCAC TGCACAATGT GAAAAACTTG 4260
AGAAAATACA CGCACAAAAA AGCGCGTATA GCACACGGGC ACGTGAGCAG CTCCTTCAGA 4320
CAAAAGAACA AATTCATCTC CAAGAAACCC GGACACACGC GGTAGTACTC GCGCGTCTCT 4380
TAGAGCATCA AGAACCGTGT CCTGTCTGCG GCTCTTGCAT TCATCCGAAT CCCGCACGTC 4440
AAGACATAGA TAATCTTGAA CCGTTAACCC GGCGCATGCA ACGCATAGAA CAAACATACG 4500
CGCAGcTGGa AACCAGCGAG AAAGATGTGT ACCACATCCT CACCTCTGAG CGTGAGCGmC 4560
GTGCATCCTA CAGTGCACAA ATGCAGGAAA TACAGCATTC ATTTTCCATT CTTACATCGT 4620
GTGATACGCG ATCATCCTGC GATATTCCAA ACGTGCAAAA AATTACCGTA CGTGTTTTGG 4680
ATCTCACGGA AAAATTATCT CGTGCAAAAG ATATGCTCGC ATGCGCGCAA CACGCTTTAC 4740
TGAGAAAAAA ACAGCCTGAG CAGGATTTAC AGGATGTACG CGCACACCTG CAGCAATGCT 4800
CACAAGAGCT CGCAAAAAAA GAAACAGCAC TCCACGCATT GCAAGAAACG CTTACACAGC 4860
AGCGCGTACG CATTCACGCA CTGTCCATAC GTTTACCCAA GGAATTGCTT GCATCGAACC 4920
TACTTGCTCC GCAAAAGATG CAGCATGAGA AGGAGAGTGT CGCCTATTGG AAAGAGATGC 4980
TCGCACACTG TCAAACCCTT ATGCGAGAAT TGCACACCCA TATTGAAGAA TACGACCGAG 5040
AGTTCAATGA GATAGAAAAC GCTTCTAGTG CGCTTGGCGC CGACATTGCA GCGCGAGAAG 5100
ATGCACTGAA CCATGTTCAA AAAGAATACA TGCACCTTGC ACGTACCGTG TGTTGCGCAC 5160
GAACAGAAGC GCATTTtCAA TAACAACGAr GAAGTAACCG CCGCTCTTAT GACTGATGCT 5220
GAACTTTCTC ATGgCTGCAG CAGAAATTCA ATTTTTCAAT GAATTGCGTG CGGCTGACAC 5280
CCATCTACTG AAAACACTCG AGGGCAGAAA TAGGAACAGA AATTCCATCC GATCTTGA 5338
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32768 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCGCGCAAGA TCCCAGCGTT GATATCGCTC CAACCCCTTA ATCACCACAA AGTTCAAGGG 60
AGGGAATACG CTCCCGCGGT ACCCCATCCC CCGCTCATCA AACTCTGCTT CATCTGCAGA 120
CAAACTCGGA ATCGGATGGT CCAGCCCAAA CGTGTGAGGA TTCACCAGAT GCTCTGCCAA 180
GCGCTCTGCC TTGTCTTCAT TCGGTATCTC CCCAAGCATA GGCCAGAAGC CAGCAATAGT 240
CTTATGCGGA AGCTGCTGCC CGGAAGCGTC GAGGTCGTGG TAAAAGCCAG TACTCGCGTT 300
CCACATAAAA TTATTAATAC GTGTCTTTAG GGTAAAATAT ACCCGCTTAT ACTGAAAGCT 360
CAGCTCCTTA TCGTTGATAA TATCGCCGAG TGCAGAAAGA TAAAAAGCGC TCACGGCCAA 420
GGCAGAGTTA AAATCTACCA GATAGGCAGC TTTTTTACGT GGAGAGTTTC CCATCTCCGT 480
AGCAGCAAGA GGAACCCTAT AGAGTCCGTT ACTCCTTCTA AACTGTGTTT CAATCCACTT 540
CATATAGCGC ACCATCACGG GCATGATCTC TTTAATTCGT TTTTTATTTG CAGTTTTATG 600
AAAGAGATTA AACTCTGCCC AGGCAAAAAG AGGCATGCCA ATACCCTCAG GATTGGCGCG 660
AGGCAAAACT GGCTCTTTGC TTGCAAGATG ATACTTCCAA CGAATAGCGC CGGACTCCTC 720
CTGCATTGCA TAGAAAAAAT CAAGACACTG CGTGATGTCA TAGTTCCGGT TCGAATACAC 780
GAAGAAAAAG GACGCAAATA TGATTTCATG CTGACTGATA ATTAATCCGT CTTTTTCTGG 840
AAACACAAAA AACGATTCGC TTGTGTTTTT CTCCCCCGAA GCAGACAGCC AATACTCCTT 900
TATCCAAGCC CACGTGCGAT CATAGATGTC AACAAAATCC TGATCATAAA AATGAATCCT 960
GGGAAAGTCT CGCTTATTCA CCGcATCTCC TCACACATCA CAGGCGACGG AGTGTAGCAC 1020
ATGCaGGGGA AAGTGAGTAT CTACCTTTCC ACCCTGTAAG CTACGCGATG TGCACACCGG 1080
CATCCAGACG AACTAGGATA GTAAGGTGTC AGAGGATAAG CTGGCACGTA ATAATTCACG 1140
CGCCGTACGT TCTTCATCAT ACGGTTGCAC GGTGTGCGCA TAAATAATTG AATGCTCATG 1200
CCCCTTACCC AGCAGCAGCA CAAGGTCCTG CGCACGCGCA AGAGAGAATA TGTGCCGCAG 1260
AGCAGCGACA CGATCCGGAA TCAGAAACAG GGTTTTACCC AATTTCTTGT GCTCACAACC 1320
TGCCGCAATC ATGCACAGAA TACCCATgGA TCCTCTCCTC TCGGATCCtC ATCTGTGAGC 1380
ACAATTACGT GCGCATAACG AGAGGCAATT GCGCCTTGCA TtGCACGCTT TgCGTGTcCC 1440
GCTTCCCCGC CGAGCCGAAC AATACCAACA TGCGCCTCTT AACGCGnGCA CACGCGCTGC 1500
AAGCGGTGGC AAAaTCTCCT CGAAGGAAGA GGkTGTATGC GCATAGTCAA TGAGCACCTC 1560
AAAaTCCTGT CCCaTATCCA CACGCTGCAT TCTCCCCtGG ATTGGCTGGA CGTACTGCAC 1620
GTGCTGTGCA AAAGCCGCAA GCGACGTACC AAGCAATCCA TGCAGTACAA GAAAAGACGC 1680
TGCTATATTA CAGGCATTGA AAGCTCCTTC AAGCGGCACC GATACATCAT GTGCTCCGTC 1740
CTGTGCTGGT TGTGCAGGTT CCTGAACAkT TGACAACACA AACCTTAATC GCAAGGCCTG 1800
AGATATCTGA GGAAGTGTTT GCACCCATAG CAGGGTACAC GGCATCCTTT CCAGGCAGGC 1860
TGCTGTCCTT TGCTCAGCGC CGGTTCCTCT CTTAAAGAAA AAACAGGGTT TGTGCcGTCT 1920
TCGCGAAAAT ACACAGCCGA TGCGTCTTCC GCCCAGAGTA CTCCAAAAGA AGGAACGCGC 1980
CGTCCGTCTT TTATGTGATC ATGCGCATCT AGCGCACGAA ATACATTTGC TTTATCAAAG 2040
CGATATTGTT CAAACGAACC ATGAAATTCT AAATGTTCAT GGCGTACGTT CATACATACT 2100
GCCACATCAA ATGCAACATC CTGCAAACGT GCCGTACGTG TGGAAAGCCC GTGGGACGAC 2160
GCTTCAATTA CTGCAAATTC ACAGCTGTGC TCCCGCATCT CAGCGAGGAG CCGCTGTACT 2220
GTTAGCGACT CCGGTGTGGT TTGATGCTCT GCGTTCGGGA GAATATCATC TCCTAACGAA 2280
TACTCCACAG TAGAGATAAA ACCAACTCGT TTACCACATA AACGCAAAAG CtGCGCAATG 2340
AAACTAACCG TGCTGCTTTT ACCCTCCGTG CCAGTGACCC CGATAACTGT CAAAGCACGC 2400
GTAGGAAAAT CGTAGAAAGC TGCAGCAGCA CTAGAAAGCG CACACCGTGC ATCTGGTACA 2460
CGAGCATAGT ACACGCCGAC GACATACGTA TCTAATGGAC AATCATGCAC AATTGCGCAG 2520
GCGCCGGCAT CAATTGCTGC GTGGATGTAC TGCGCGCCGT GCGCATGCGT ACCACGCAAC 2580
GCAAAAAAAA CCGAACCCTC ACGCACTGCG CGTGAATCAT ACGCTATGGA AGAAACGTCC 2640
GccACACTAC CGTGCGTTTC TTGCACAGAA CAGGAGGCAA GACAGACAGT AATGGGTTTA 2700
CGGTACAGCA TCGCGGGTCT GATTGTATCT GATTGCACGC CCTCGGGGAA CAATCTATTG 2760
TCAATGCTTT TTCAAAGAAG ATCGCAAACG GTGGGGAAAG GCCATCTCGT TGACAGCCTT 2820
TTAGTGATTA ACTTACACTC CGCCGCATGA AAATTTGGCT CAAATTTTTT GTCGGCAGTT 2880
GCATTGGTGC ACTGGTAGCC TACACTATCC CAGAAACGCT CAGCGCGCCG CTCATGCAGA 2940
CCATTTCAGA ATTGGTTGTA TCCGCTGGGA GTTACATGCT TTATCCAGTT ATTTTTTTTG 3000
GATTCAGTGT CAGTATTTTT GAGATGCGTC GAGAACGCCT ACTCCTGCGT ACTACCCTTA 3060
TCAGCATAGG TGCATGTGTT GCCACCGCAT TTAGCCTTTC TTTGGTAGGA CTATTCTCGG 3120
TACTCGTGTA CCGACcTGcG CGTATTCCCA TTTTTGCCAC CGGCACGCCG CAGAATCCAG 3180
GGTTTCAAAT CCGCACCTTT TTTTTGCAAT TGTTCCCTGC AAGTAGTTTT GAAGTATTCA 3240
CAAATGGTGA TTATCTTCTC CCtCTCTGcG TATTTGcCAG TTTCGTCGGC GCCGGCTGCG 3300
CAGTCGATCA TGTCGCGGCA AAACCCGTAC TCGCGCTTTT TGAGTCACTA ACGCGCGTCG 3360
CACACACCGT GATGGTCTTC TTCGTAGACA TGCTGTCTAT TGGATTTATT GCACTTTCTG 3420
CGCACTGGCT GTTTAGGTTT CGACCACTCC TTTCTACTGG GGTGTTCACT GACCTTGTAA 3480
TCCTACTGAC ACTGACAGCA ATTTTTATCT GCAGCGTGCT CTATCCTGCC CTTATTAAAA 3540
TTATGTGCCC TGAAGTCAAT CCGTATCGAG TACTGTATGC AGCATTGGCA CCAATGAGCA 3600
CGGCGTTCTT TTCGCAAAAC GTGCACGCGA CGCTCCCTGT CTTGCTCTAT CACGCAGAGG 3660
AAAGTATAGG GGTGCAACGC ACAACTGCAA CGGTGCTGCT CTCTATCTTT TCGATCTTTG 3720
GCAGGGCCGG GTCAGCGTGC GCAATCACGA TGAGCTTTGC CCTAATATTA AAGTCATATT 3780
CCCATTTGGG AATCGGcTTC TTCGATGcGc TGTGGATTAT AACTGcTGcA TCATTtCTCT 3840
CCATTTTCTT AGGACGCTTT CCCACAGGAG GGGTCCTTAT TGCGCTTGCG TCAATATGCG 3900
CGTGGTACGG ACGAGGTTTT GGAAGCGGAT ACCTTGTCAT CCGCCCTGCT GCATTTTTTG 3960
TTGGAAGCAT CGCCACAACG CTGGATACCC TAAACGCCCT CATCTGCACC GCAATAAGCG 4020
CAGAACGAAT TGGAACTGTG CGCCACCGCG CGGTGCGTTT CTTTATCTGA GCTCTAGTGA 4080
TTGCACTGCA ATAGCAGGAG ACTCCAGCAT TCGATGCGCC CACGCCCCCA TACGCGCAAC 4140
CGCTATGTGC GCACACGGCG ACACACTTTC CATCGACTGA AACAGGTGCC ACATCCCCGG 4200
CCACACGTCT AGGGTCACTT GTACCCCCGC CCCTTCAAGT ATCTGCGCAA GCGCACATGC 4260
GTCTGTGTGG AACAACTCTT GCTCCCCACA TTGCACAAAC ACCGGAGGaA ATCTCCAAAA 4320
TTCCCAAAAA GGGGGGAAAC CAGTGAATTG CGAAAATTAT CCGCGTACGT GTACTGCAAC 4380
GCACAGTAGC GGAACATATC GCGCGTCAAC AGGAGTTCTT TCTTCTTAAC TCCCTCCCCT 4440
GCAAACCGAT CCTCAGTTAA ATCAACCCAA GGAGAAATAA GcgCCAAAGC GCGCGGnACA 4500
CACCAGCCCC TTCTGTTTTA AATAGTGCGT CAGTGCAAGC ATCAACCCTG CACCTGCTCC 4560
ATCCCCACTG AAGATAATAT CTTCAGGACG AAATTTCTTC TGATCAATAA GTGCTACATA 4620
CGcATCATAC ATATTTTCTA GTGCAGCAGG AAAAgGATGC TCGGGCGCAA GTGgATATGC 4680
AGGArTATAA AACTTCGCGC CGACTTCATC CGCTAAAGAT GCACAGAGCG CACGAGAAGC 4740
CATAGGAGAA CCACTTATAA AAAATCCACC ATGTGCGTAC AACACTGCAT GGCCAAcCAT 4800
TAGGAAmCGC GGGCTCAACA CATCTGTTTC GATATTAGCC AATACCTCAC AGGAAACATC 4860
CACCCCATTG GGCACATACG GCATATAAAA AAAGTCATCA TACCCCGCAC GCAAGGCAGA 4920
AACCGATGTA CGCGGGGTAA AGCGCATCTT CTTAAACAGT TTTTTCGCCA TTCGGTGCAC 4980
GTGAGCGCGC GAAGGTCCCA TGCGTTTATC GTAACACAAA GAAACTGGTT CTGTCAGGCA 5040
ACCGCTTCAC GCCCATAAGA GCGCTTGACA CGTCCGGCGC ATTCCCGTTA CGCTCGGCCC 5100
CGGTTTCGTT AGGGCAATTA GCTCAGCTGG TTAGAGTACA AGCATGACAC GCTTGGGGTC 5160
ACTGGTTCGA TCCCAGTATT GCCCAGGAGC TCCTCTATTT CAGACCTGGC CCATTTTTTC 5220
TGTTTTTGGC AAAGCGGGGG GACGATAGCG GGTCGCCCGT CTCTTTTCCC CCTCCCTTTG 5280
AGGGGACCTC CCCGCTCGTA GAgGGGACGG GGTGCTCTCG CTCAACCAGG AGCCGATCGA 5340
TGCGCCTACC GTCCATTTTC AAGATAGTAA ATCGGCATCC GTTCGCATTT AACTGTTCTT 5400
TTACACGCGG GATTCGATTG CGTATGCTCA GCACATAACC GGCGATCGTG TGCACACCCG 5460
TGTGTGGCCG CGTCGTCCGC GCCAGAACCC CGAGACGATA CATTTCGTTT AAATTCATCC 5520
ATCCGCTAAC GATCCAGCTA CCGTCAGGTT CTGAAAACAA TCCTCACTAT TGATCCCGTC 5580
GGCACGCGCA TACTCCTGAA CGCAACGCAT AATCAACTCA TCACGCGTTA CCATTCCCTC 5640
AATCCCTCCG TACTCGTCAA TTACAAACGC CATCTGAGCC TGCATCTGTT GAAAAAGATG 5700
CAACAGCTTA CGCACACTTA TTACCTCAGG TACAAAAATC GGCTGCTGCA CTATACGCGC 5760
TATCACCGGA TCTGCGATCC ACACCCCCTG TCCTTTTGCA TGAGCGTGCC GCCGACCCTC 5820
CATGTGAACA TGCAATGCGC TTGCAACAGA CTCTTGCTCA TCCTGCTCAC GAGAGGCTCC 5880
CGCGCTCCGT TCCATTTCAA GACATACGCG CACATACCGT TGCACAGAAA AATAGCCCAC 5940
GcTGCGTCAA TCGTACGAGC ACACACAGGA AAATAGTTGT AATCGTCATG CTGCGAGATC 6000
ATCGAGAAAA TATGAGAAGG GAGTGCGCGC GCCTCGACCC ATACAATATC GGTACGGTGC 6060
ACCATGTAGC TTCCAATTCC CTGATCGAAA CACACAGCAC AATCaGCGTA CGACATCTGC 6120
aGCATAGAGG CGCGTGCcGT TGcCGCTGCC AGCTTTTCCC CATAAGCaAA GAAAAAATAT 6180
TCGCTGCCCA ATCCCACTTC ATACGCTGGC CCACAAACCC AATTGTAATA AAAATTTCAT 6240
ACGGCTGCGA CTCACACTTC TCCACCTCGC ACCAAACGCT ATTTTAAAAG GCGCGCTGCC 6300
TTTCAAAAGG AGAGTGCAGC GAGCACAAAG CAGCCGACCA CGCGTTTTGG ATACCCGTGT 6360
GAACCTCGCA TGCAACGCAC TGCCGGGTAA GACATATGCA ACCAACTCTT CGTTATTTTC 6420
AAGAACTACC TACAGTTTAC ACGCGTGGCA CCGTTTCCTC AAAAAGATAT CTATGCCCCT 6480
GCCGCCGCGT CCGGTAGAAC ACCGTGCGTG TGTGTGCGCT ATTCATTACG TGCAACAGAG 6540
CAAATCCTGC GTGCGAGCTT CCTCGCGCAT GCTCGAGACT ACCCGGATTG ATTATAAGCA 6600
CGCGGCTACT TACACACGCC CTTTGCACGT GGGTATGTCC ATGCACCGCA ATACTACAAC 6660
ACGCCTGCAG TGCTTGGGAA ACGAGCACAC TATCGCTCAC ATTCACCGAA TGGGTGTGAC 6720
CGTGCGCTAG AAAGAGCGTA ACATCTGCCA CCTGGATACG TCCACATAAG GGTATGTGCG 6780
AAGCACGATC ACAATTTCCT GCAACCATAA AAATAACACC TGGAATACGC TCCCTCAGCA 6840
CACGATTACG TCGTCCCCTT TGGCAGAGAT ACAACACATC CCCAATCCCG TCCCCTGCAA 6900
AAAGAAGGGC ATCTGCACAA GAACCAAACT GATCTACCAC CGCGGTCAAG GCCTCCGCGC 6960
TGCCGTGCGT ATCGGAAACC AGCAACAAAC GCGCACAAGA CAGCATATGC AATGAGGCAA 7020
TCGACTCCCG ACCCCCTATC ACTCCCGGTG CAGTCATCTC AAGCGTATTC ACATTCTATC 7080
CCTTTCGTGT ACGCTCCTTC CCGGACTGCC ACCATACGAA CGCACAAATC TGAACGTTTC 7140
TACCCGTTTT GACAGCAACA CATGATTGTA GGCACGCACA CCCGTCTCCG GGTCTACGTA 7200
GCATTCCATA GCATAGTCGG AGGAAAAAAC ACCCTGTTCC CGAAGGGAAT CAGTGACATT 7260
CGCAAATGCC CCTCTGAACT CCTGCAGAGG ACTTGCATGT GCTCGACGCG CGACACCCCT 7320
TCTGCGCACG ATTTCACATG CTGCACGCAG GAGCGACTGG AGtCCGCTTC TGTATTTGAA 7380
ACCTCCGCCA CTCCTTCTAG AGAATCTTCT TTTACCTGGA AATCCACGAG AATATCGCTA 7440
CCTGCCTGCA TAGGAAGCAC CTcTTcCTcA GGCAAAAGAG AAACGATAAG CGTTTGCGCA 7500
CGCTCGCGTT TTACCGCGAG CAAGTACTTA CCACCTCCTA CAGGCAATCC CAAACTGACA 7560
ACCGCATCAG GCCGCATTTC ATGCACGGCC TCGAGCACAA CCTTCATGCG CGGCAAGGCC 7620
GAGAGGCGCG TCAAATCCCC GTACTCCACC ACTCGCGCGC GCATGcgCCA GGcCTCGCAC 7680
ACCCGCTCGA CGCAATCGAC AAGATAAACG TGACCTGAGT ACTTTTTTCC AAAAACGAAC 7740
AGGACCTTCT GCTGAGAGGA AGGAGAGATC GGTACCACTC CAGACCTCTC CGTACTCTCC 7800
CGAACGCACG AACACGCCCC GAGAGAACAC CAACACACGc ACAAGAGACG CGCGAACTGT 7860
CCTGCACGGG CGCCCCTCCA ACCCCTGCAG AACTTCATTC AGCACACGGG GAGACGCTGA 7920
GCGCTCTCCT CGCCCACGAA AGACACCGTG CGCGCCCGGT CCCTAGAACG GACGGTCCGC 7980
AAGtACTTGT ACTCTTGCCA CTGTCTCCGC GCCTTCTCTT CTGCGCGCTT GAGCAGCATT 8040
TCAGCTCGCT CGGGATTTGC ATTCTTAAGC GTCTTGAACC GAACTTCTTT GTACATGAAA 8100
TCCGCAAGCT TAAAATCAGG TTCCTTACTG TCAAGCTGAA ATGGATTTTT TCCTTCCGCA 8160
ATGCGACGGG GATCGTAGCG GTACAACGGC CACAAACCAC ACGCGACGGC CTCTTTCTGA 8220
TTAATCATGC CCTTGGACAT ATCAATCCCG TGGCTAATAC AGTGGCTGTA GGCGACAATA 8280
AGCGATGGAC CATCATAACT TTCAGCCTCT CTAAACGCCT TGACCACTTG ACTCATGTTC 8340
GCTCCCATCG CGACACGTGC CACGTACACA TACCCATAGC TCATGGCCAT CAAACCAATA 8400
TCCTTTTTAC TGATCTCCTT CCCCGCCGCG GcAAACTTTG CGACGGCCCC GATAGGCGTG 8460
GCCTTCGACA TCTGaCCACC GGTGTTGGAA TACACCTCCG TATCCATAAC AAGGACGGTA 8520
ATATTGCGCC CAGAGGCCAA CACGTGATCT AGACCACCGT AGCCAATaTC ATAGGCCCAG 8580
CCGTCTCCCC CAAAAATCCA CACCGAGCGT TTGATAAGGT GGTCAACGAG AGAAAGCATT 8640
TCCTTTGCAA GGGGkTCAGt ACTCTCACTG AGCACTTtCT TAAGCTGATT AACGTAGgCA 8700
CGCTGCTCTT CCACTGCAAC ATCGTCCGCC TGCTGGTTAG AAAAAATACT CGCAAACAGA 8760
TCAGCCGCCA CCCcTTTTTC CTGCAAcTTG CGTCCAACcT CGCGGGCATA CTCTGcAAGT 8820
TTGTCACTAG TCACGCGCAT TCCGAAGCCA AACTCTGcTG CGTCTTCGAA AAGAGAATTT 8880
GACCAAGCGG GGCCGCGACC ATCAGGACGC GTCGTATAGG GGGTTGTAGG CAAATTTCCC 8940
CCATATATCG AAGAACATCC GGTTGCATTC GCGATAATGG CGCGATCCCC AACAATCTGC 9000
GTCATCAAGC GGATGTAGGG GGTCTCCCCG CAGCCTGGGC AGGCACCAGA GAACTCAAAA 9060
AGAGGTCTTT TCATGGACGC CCCTTTTGGC AGACTCAAAT TGAGCTTCTT CGCCTCAGGA 9120
TCGGGCAGTT TAACAAAGAA GGCCCAGTTC TCAGACTCCA CCGCACGGTG CTTGGAAAAA 9180
CTTTCCATGT TGATAGCCTT ACGCGTAGGA TCAGCCTTAT TTTTTGCCGG ACATTGCTGC 9240
ACGCACAGGC CACAACcTGT GCAGTCCTCT GGGGAAACCT GAATCGTAAA tTCGCCTCCC 9300
CAAATTCCTT GCCTTTGTAG TCACAGGAAG CAAACTTAGA AGGCGCATGC TCGAGCTCCT 9360
TACCATCGTA CGCTTTCATG CGGATAAcTG CGTGAGGACA CACCATAGCG CACTGACCAC 9420
ACTGGATACA AACAGACGGA TCCCAAATGG GTATAGTCTC GGCTATACAG CGCTTCTCGT 9480
AtGCGTGGTA CCAGTAGGAT AGGTACCATC CTCTGGTAGT GCGCTCACCC CAAGACTATC 9540
CCCCTGATTG AGCGCAATAG TACCTAACAC GCTTTGCACA AACTCCGGAG CATCGGAACT 9600
CATCGCAGGA CGACGCGTCA CCAAACTACC GGCAACTCCC GGATACTCCA CCAATCCCAC 9660
CCCAGCGAGC GCCATATCGA TAGTGGTGAT GTTCCTCTGT ACAACCTCCC CACCCTTTTT 9720
GCCGTAGgCc TtCTGTATAA ATTTCTTAAT CAGGTCAATC GCCTCAGCTT CCGGCAAGAT 9780
ACCAAAAATT TTGAAAAAAG CCGTTTGCAT CACCACATTG ATACGTGTGC CCATCCCCGC 9840
CTTCTGAGCG ATAGAAATCG CATCGATGAC GTAAAACTTC ACCTCCTTTT CAATGATCTG 9900
ACGCTGGACT TCTATGGGTA TGTGATGCCA CACCTCATGC TCACTGTACG GCGCATTCAG 9960
CAAAAAGGTC CCTCCACGCT TGAGCGTTTT GAGCATGTCA AAGGTTTCAA GGTACGTAAA 10020
CTTATGACAC GCTACAAAAT CCGCCTGCGT AATGAGGTAG GGCTTACGGA TCTTCTGCTT 10080
TCCAAAACGC AAATGAGAAA TAGTAAAACC ACCAGACTTC TTGCTATCGT AGGCAAAGTA 10140
AGCCTGCGCG TTATTATCCG TCGCCTCACC AATAATCTTA ATTGAATTTT TATTCGCGCC 10200
TACTGTACCG TCCGAGCCCA GACCATAGAA CACCGCCTGA CACACATCTT GATCATCAAG 10260
CTGAAAGTTC GGATCAAAGT CTACGCTGCT GAACGTAACA TCATCCTCTA TACCGACCGA 10320
GAAGTTCGGG ATCTTCTTCC CACTGAGGTT ATCAAACACT CCTTTGGCCA TCGCGGGCGT 10380
AAACTCCTTA GAACCCAGGC CA AGCGACC ACCGAGCACG AGAGGGTAAT GCGTAAACGG 10440
ACACTTCTTC TGGCTCTGCA TCTGGCCGAT AGCGGTGCGC ACATCCTCAT AGAGAGGTTC 10500
GCCCAGAGAA CCTGGCTCTT TCGTTCGATC GAGCACTGCA ATCGCCTGCA CCGTTTTGGG 10560
CAATGCATTG ACAAAACACT CTGCGCTGAA CGGGCGATAC AAGCGCACCT TGACTAGACC 10620
ACACTTTCCT CCCTGAGCAT TGAGCACATC AACTGTCTCT TCAACGGCCT CAGAGCCGGA 10680
GCCAATCATG ACAATCACCT TCTCTGCATC GGGTGCACCG TAGTAATCGA AAAGACGGTA 10740
cTGGCGTCCG GTAAGCGCCG CGTAGykCCA TAGCTTTTTG GACAATGGAG GGCGCAACCG 10800
CATAGTACCT ATTCACTGAT TCGCGGACCT GAAAATACAC ATCAGGATTC TGTGCTGTGC 10860
CGCGGACCAC TGGCTTTTCG GGAGTCAGTC CACGCATGCG GTGCGCATGG ACAAGTTCGT 10920
CGTCGATCAT AGCACGCATG ACGTCATAAG AGACTTCTTC AATTTTCTGA ATCTCATGAG 10980
AAGTCCTAAA ACCGTCAAAA AAATGAACAA AAGGCACGCG CGCCTCGAGC GTCGCAGCAT 11040
GAGCAATAAC TGCGGTGTCC ATGGCCTCCT GAACACTGTT GGAAGCAAGG AGCGCCCAAC 11100
CTGTCTGGCG GCACgCcATC ACGTCTTGAT GATCACCAAA GATAGAAAGA GAACTTGTGG 11160
cGACAGsrCG TGCAGCAACG TGAAAAACAG CGCTCGTAAG CTCCCCTGCG ATCTTATACA 11220
TATTCGGGAT CATAAGCAGC AATCCCTGAG AAGCAGTAAA AGTAGAAGAG AGCGCCCCCG 11280
TCGTCAGTGC GCCATGAACA GCTCCCGAAG CGCCTGCCTC AGACTGAAGT TCTACAACGG 11340
TGGGAACGGT ACCCCAGATA TTTGTGCGCC CCCGTGCGGA ATATTCGTCT GCGATTTCTC 11400
CCATAGGACT GGAGGGAGTG ATAGGGAAGA TAGCAATGAC CTCACTAAGC GCGTGAGCAA 11460
CGTGCCCCat GCGGTGTTAC CATCCATCAT GaCGAGGTTC TTCTCAGACA TACGACCGTC 11520
CTCTCTCTAT AAAGTATCAG GGCAACCGGG TGCAGGGAAA CACGCTCCAT ATCCGCCTCG 11580
ATCTCCCCGT GTCCCGGCTA TAGTAGCACA CCCCGCCTGA ATGTGCATCG GCTGCACGCG 11640
GGACTCACGC TTTTTTTCAA AAAACAAGCA TCACTTCTCC CTGTTCAGAA AAAAAGAACA 11700
CGCGCTTACT CCCCTGACAG CACACGTTCA AAAAGCACAT CAAGTTCTCG CTCTGAATAG 11760
CGACGACACG CAAGAAAACT CTCCTTCACC GCCGCTGCCG TCTGTTCATC CAAGCTCAGA 11820
CTTTTTGTAT CCCGCAGATA GAGGACGAAA TCTTCAAAAA CCGAAAGCGG CAACCACGGA 11880
AATGCCACCT CAAAAATGCC ATACCAGTGA CCATACGCGC GCGCTTCTAC AGACGCAAAA 11940
GCCACCCCTT GCTCTTTTAG TGACCCACGC ATGTCCGGAT GCACGGTGAA TACCTTTTGG 12000
TTCAAGAGAT CACAGTACTC CTGATCCGAA AAAAGCTTAC CGGTTCTCCG GATAAGCCCC 12060
ATCTTTTCAA CCGCAGACAC ATCTGCAAGG ACAGAAGCAG TAACAAAGTT CAACCCCTGA 12120
TCCTGTACCT CATAGAGCAC ATACCGCGCG CGTTGCsACG CTCCACACTC TGCTTTAACT 12180
CACCTTTTTC CAAAAGGGGA CGCAAAATAA CAGCGGGTTC CACAGCGCAT TGTCACACAA 12240
CCGACTGGAA AAATCCAGCT TGCCGCACCT CCTCCCCTGA GATGGTCTGG CAAAAGCGCT 12300
TATAAACTCA GATCAATTAC CGTCCCTGAG TCCTCCCCGT CCTCATAGCT CACCAGCCCA 12360
CGCGCCGCAC GAAATTCATC GAGCTGCTGG CGCAAATCGC TGACCTGCGC ACGGAGGAGG 12420
GCTCGGTTTT TTTCATTCTG CGCATATACC TGCTCCTGTA CTCGAGACAG ATCACACCGT 12480
AACTGCGCAA sTCCTGCACA CCAGACGCCT GCTGCAGACA CAGCTGTTCC ATCGGAGCCA 12540
ACGCCTTTTG CACACGCTGA ACATCTCCAA GCAACGCTTC TTCAAGTTCC ACATGACGCA 12600
ACACCGTCTC TGCGTCCTGC GCCTCAATGG CTACCTGTTG CTGCTGCAAC ACCACCAAAT 12660
ACTGCGCAAA CTTTTCcCGC TGCTCCTGCA GCAGTGCCTT CAAGCGCTTG AGAGTCGCAA 12720
CCCGACGCGC TACCTCTTCG TCTGATACCC GCGCACCGTC CATGcACCCT ACCCCGCAAG 12780
GTTCACGTTA TATGAAGCAG GCGACGCAGA AGTAACGGAG GATGCAGGAC TCCTGACAAT 12840
TTGAAACCAC GCCTCGCGCA ACTGCCTCAT CATATcGCGC ACGGTcACCA ATTCATCAGC 12900
CCGCTTcTGG aTATTCGCGT GAAAcAGCTG cTGGTTGAAA TACGCGTAAA TAGAAAGCAA 12960
GTTCTGCGCT ATCTTCTCCC CTGCTTCCAT GTCCAACGAC ACGGAAAGCT CCGTAATTAT 13020
CTCTTGCGCT TTCAAAATAT GACGGTGCAC CCGCTCAATA TCAGAGGCGG GAATCTTTTG 13080
CACGTCCATA AGCTCAATCG CACACCCCAA CTGCTTAATC CCTTCGTCGT ACAACAGCAA 13140
AATAAGCTCA CCCTGACTCG CCGTCTTCAC ATCCACCTGT CGATACGCAC TCAGCGCGGG 13200
ATCCTCATAC GCCATAGCAG CCTCCATCGT AGCAAAACAA TAGCGCCAAT ATCGACCACA 13260
ACACTAACCC GCCTTAAGTA CCGCCGCAGC CCCCTGCCTA TCCCCTGATA CCGTGCCAGA 13320
CTGGGAGTGT ACACTCTGAA AAACACTCCG CGCCTTTAAA TACAGGATAT CCCCCCGCAT 13380
CTGCACCTTC CCCAGCACCC GAATTGGTTT TTGCGGGTCT ACGGAAACCA CAAAATTACA 13440
AAATACCGGC ACAATCCCCT CAAGCTTTGT CAGTCGGTCA TACCCAACAA GCAGATTAAA 13500
ACTCACACTC CGCTCCGTCC GTTTGACGTT CGCCGCCATC CCTTGCCACA CCACCCAGCA 13560
ATCCAGGTAT AGCCCCTTTT GctCACGCAC CTGCACATAT GGATAATTAT CTTCGTCGGG 13620
AAACGTATCA AAGGTCGGAA GCGCAAAATA GTCCATGAGC ATACGTGCCC GGCGTTTAAT 13680
CGAATGCGAC GCATTGGAAG CAAGCAGCTT ATTTATTTCC ACCTGTGCTG CATTGTCCTT 13740
GAACGTCTGA AAGTATTTTT GCGCGTTGGC GTACGACTGC AAAACATCCT GCTTACTCAG 13800
CGTATACCGG TATGAACCGC TGCCTTCAAG CGGCCGTGCG TACTCCTCCT TCGTGAGCGT 13860
TAAAAACGAT ACGTCTGCAC GCGAAgTCCG CGCAGTCCCC CTCCACAAAC CTGAGTGCCA 13920
CACACGCAAT CCTGCTATAC CCAACGCGCC GACTAACAGG GTACACAGGC ATCCCACCCC 13980
TACCGCGCAC ACTACAGATC GGCGCCGATA TTCGGCAACC CCAGGGCGTG GATACAAAGC 14040
CTTAACCTTT CCACTTTGAA TAAAACCAGC AAGCGACTCA GGCGTATgtG cGyTTtCAGC 14100
GCGTTCAGAC CCTTCTGAGC AAGCCGATGT TTGGGATCCG CATCGAGCAC CGCGATATAA 14160
CGTTCAACCG CACGGTTCGT ATCACGCCGC TTGAGTGCCA GCGCTGCCTC CGCACACAAC 14220
AGCGTCGGAT GCGACATCTT AATCTGCCGC GCGCGACCCA AATACGAAAC CGCACCGGTT 14280
ACTATTCCCT CATGCAAACA CGCAAGCCCC AGGTATAAAT GAAAGACAAA GGATTCGTGA 14340
TAATCCAACA CGCAAGGCTC AAGCAGTTTG ATCACCTCGC CATACTTTTC CTGCGCAAAC 14400
TTCCTCTTCG CTCGATCCAG TACTGACAAC GCCATAAGTG CTCTGCTTTA CAAACATTAC 14460
CATAACTTGC ACGAACAACC ACTGTCGCGT GCAAGCACCA CGCAGCAAAA ACTAACCCCT 14520
ATGCATAGCA AGAAGCAGTA TCCCCGCAGA AAAACAAGCG CTCTAGTACC CTATGAGCGA 14580
CAGGGTCTCC TTACACCTGA ACACCTTACG CCCCTGAGCT GTCCTTGCAG AGAAGAGACC 14640
TCAAGGTACC GGAAAAAAAA GCAGAGCCTG ACACTATCAC ATCCGCACCA GCGTCAagmG 14700
CCTGCGGCAA CGTGCGACAG TCGATGCCCC CATCAACCGA GATCATGTAC GAATACCCCC 14760
GTTCGGTGCG CATCTGCACA AGTGCTGACA CTTTAGAAAG GCAATGGGCA ATCATCTGCT 14820
GTCCGCTAAA ACCAGGATTA ACCGTCATTA CCAGCACTAG GTCCACGAAG GGCAGCACTT 14880
CACTGAGCGC AgmAACAGGA GTAGACGGCA CGAGGCTAAT ACCCACCTTC ACTCCCCGAC 14940
CACGAATGGC ATGGATAAGC CGGTGTGCAT GCACCTCCGC CTCTATGTGA AAAGTTAAGA 15000
AGTCCGCGCC CGCCTGCACA AAATCCTCAA TGAGGTCGGC AGGCCTACTG ACCATCAGGT 15060
GAACATCAAA CGGCAGGTGC GTTTTGCTAC GCAAACAACG CAGCACCGGA GCACCAAACG 15120
TCAGgTTTGG CACAAAGTGC CCATCCATAA CATCCAGGTG CACCCACTGT GCGCCGTGCs 15180
yTTCCAAATA CACCAGCGCC CTATCGAGCG CAGAGAAATC TGCACTTAAT AGTGAAGGTG 15240
CCAATGTAAA AGACCGCTCC ATAGGTCCAT GCTAGCAAAC AAATCGAGCA CCTGTAAATA 15300
TGGATACTGG CGTGAACAAC CCCAAACATT CAAGACAGTG TCCTGCTTAT CCTGTTTTAT 15360
TTGCGCCTAA ACCGATAGAA AAATATCCTT CCCGCGGGTA GGCTGACCCC GTCATGGTGA 15420
CTCCGAGCCA ACACGCGCTC CTTGCAGAAG GCGTAgcACA CCTAACGGAC GCGCGCACCG 15480
CACCTGCTCT TTGCGCTCTC CTGAAACAGT ATTTGGAAGA ACTTATTCTC TTTAACACGC 15540
GCGCACACCT GGTGCATGTA ACACACACAG AGGAACTTAT CACACACCAC CTATTAGACA 15600
GCCTCAGCGC CTGGCCACAT TTCACCAACG CGCGCGctAT CGCCGACATC GGATCTGGCG 15660
CGGGCTTGCC AGGGATCCCG CTTGCCTGCG CACTCGCGTT GTATGCACCA GAAACAGAAC 15720
TGACGCTCAT CGAGCGGCGT GAGAAACGCA TAGCATTCCT TGAAAATGCC TGCGCGCGTC 15780
TGGCGCTCCC CCACCTTCGT ATCGTGCATG CGGACGCGCA CGACCTCACT CCCTACACGT 15840
ATGACGCAAT CACCTTTCGC GCTCTGTGTC CCCTCAACCA CCCAACGGTA TATATGCTCC 15900
TGAACAAACT GCGCCCTGGC GGCGTAATAC TCGCGTACAA AGGGAAGAGA AAACTCATCG 15960
AACAGGAAAC GCGTGATTTC CTCCCACAGT CTTGCTCTGT CTTCCCCCTC CATGTTCCCT 16020
TCCTCCACGA AGCACGGCAC CTCGTTGCCA TACACACACC CTGCGCAGCg cCTCCCCAGT 16080
GACACAGGAG CACAGCAGAG AGGAGAACAT ACAAACGGCA TGCCCAT AT TTGGACGCTG 16140
CGGATACCCG TTTTCGCAGC TGGTCGAAcT GCGCCTGATC GCAGACAAAA CgsTTCGCTG 16200
CCGCACCACG GCGCACTATT TTGAGTGAGC GAATCGTATC TCCCGCTATA ATTGCATGCA 16260
CCACTTCCAT ACCCTCAACG ACTTTGCCAA ACACGGTATG CTTTCCATCC AACCACGGCG 16320
TAgcACGTGG GTAATAAAAA ACTGCGAACC GTTCGTTCCT GGTCCTGCAT TCGCCATTGA 16380
CAACACTCCT GGGCTGTCGT GTCGCAACGC AGGATCACAT TCATCGGGGA ATTGATAGCC 16440
AGGACCTCCC GTCCCATtTC CtGCGGGTCT CCCCCCTGGA TCATAAAATC TTTGATAACA 16500
CGGTGAAAcG TTAACCCCTG ATAAAAAGGA CGACCcTTGC ACACCGCCAA CGTTCCCTCT 16560
GCTAACCCCA CAAAATTACA CACCGTAAGC GGCGCCTTTT CAAAAAAAAG CGAGAGAACA 16620
ATCGTTCCCC GATTTGTTTC CATTACCGCA TATATACCGT CAGCGACCGC CAACCCTTCC 16680
TCGCGTACCA TTTTTTCCTC CGCACACCCG ATCCTGCCAA CGAAGCAGAA CAACATCACC 16740
CCCACACAGA CGCGCCACAC tGCGTATTCA TATCGCACCC CCGGTGTGAC ACACCCGCAC 16800
CTGCTTGTAA AGCCCTTCCC CCTCACTTCT GGTCTCTACC CCACTGATCC CATGCAGCGT 16860
TGTATGAATA AGCCTCCGCT CATAGGGATT CATTGGCTCC AGCAACACCG ATCGCTTGCT 16920
CGCACTTACC TGATCTGCAG TAGCATATGC CATGCGGATG AGCATTTCTT CCCGTCGGAT 16980
GCGGTAATTC TCACAGTCAA GCACAACCTT CACCCCTTTA GCACCAATTT TGGTAAAAAA 17040
TACATTCGCT AATAGTTGCA ACGCATCGAG GTTCTTCCCC TTCTTTCCAA TCAAAATTGC 17100
AGAATGGCTT GAGTGCAATC TAACCACCAC CCGATCTGAC TCCCGTACGA AGGCATCGAT 17160
TGTCACCGCG TAGCCCATTG CGTGGAATAC ATGAGATAAA AAAGCACGcA GctGCGCCCC 17220
AAGATCTGCA AACTGCTGCT CACTCAACAC AGAAACCGGA TCGTACTGCG CATGCGCCGC 17280
ATGACCCGGC GAGGCGGTGA CCGCCGCTAC ATGCACACGA ATACGCGCAA AACCCTTCTT 17340
GAAAAGAGAG GATTTTTGTA TCTCCAGTAT CTCCACATCA AACTGATCTA CTCCCAATCC 17400
CAGCTCTGCA GCGGCACGcg CAATGGCGTC CTGTTCTGTC TTGCCTTCAA ACTCATACAC 17460
CATACACCGA CTCCTGCATC AGGTTTTATT CTTGTTAGCC GTACGCTTCA TCACCAACTG 17520
CTGTACAAGC GTCACTCCGT TCATTGCCGT CCAATACACT AGAAGACCCG AGGGCGCATC 17580
ATAAAAGAAA AAGAAAAAGA ACAACGGCAT CACATAGGTC ATAATCGTCA TGGATGTTTT 17640
TTGCTGCTCT GTGTGCGGTA CtGCGTCAAC TTACTAAACA TAATTTGAGA GACTACATAC 17700
AAAACCGGCA GCATACGCAT TTGAGTCCAC TGTGTCACCG GCAATGCGAA CGGCAGTGTC 17760
CACACGCTGT CTGCCAACGA AAGATCAGGA ATCCAATACG GGATAAACAT CGCACCACGG 17820
AACTCGAAGT AGTTATTGAA TAACCGATAC ATCGCAAAAA TAATAGGCAT CTGTACAAGC 17880
GTTGGGAGAC AGCCTGAAAG CGGATTGTAC TGCGCTTCCC GGTAGAGTTT CGCCATTTCC 17940
TCATGTATCT TCTGCGTATT CCCTTTGTAC CGCTCTTGGA TACGCTGCAT GTGTGGCTGC 18000
AGTTCTTGCA TCTTTTGCAT AGCGATAAAG CTCCTTTTCG TCAGCGGGAA AAAGAGCACC 18060
TTTATTGCAA TCGTCACCAA AATAATTGCC ACGCCCCAAT TAGGAATGAG GGTGTAAAAC 18120
AAACGCAGGA GCCACTTAAG GAGCACCTCA AGCGGATAGA GAATACCACC GCTTACTGCC 18180
ACCGCATCGA TATACGTGCG CTCAAGCCCA TAGGGATTTC GAGAGGCAAC GTTGTACGCA 18240
CTCAAATACT GCTCTGCGCA CGGGCCGATG TACACACGAT AAACATCTGC AACCGCAGsT 18300
GCGCAACAGC GCGGCGCACA AACGCGATGT GATGCTGCAC AGCTGTTTCA GCTTGGGGAG 18360
CCGATAGCAC TAGTCTTTTC AGACTGTCCG CATCATTGGG CAAAACGATG AGCGCAAAGT 18420
ACTTACCCGA GACACTCGCC CAAGAGACAG GCGTATCTAC CTGTTCAcGT CCATCTCCTT 18480
TCAGAGCATA CGTTTTCGCC TTGCCACCTG CACCTACCAT GAAGGTGCGA AACTCATACT 18540
TGTCCGCCGC ATTCCGCTCA GGCCCGATCT CAGGCGGTGT GCGCAGGGTA TAACTTGCTG 18600
TCCCAAAGTC AAAGCCATTC GCCCGCGCAA GAACAGTAGT CGCTGGGCCA GCATTCGTCT 18660
CCGCCACCGC GCGCACCTTC GCCCCTTGCC GGCTGTCTTC CCGCTCCTCT AGAACGTCTG 18720
CACTCAGGGA AACGTGCAAC TCAAACATAT AATTATCAGG ATAGAATACG TAaTCsTTCG 18780
CCAGCACGAA AGGAGTCCGT GTACCGTCAG CATGCTGCGT GGCAACACTG CGGTAAAAAC 18840
CTATGGAATG CAcGCCACGC GCGCCTATCT CCTGCTTTAC TTGGAAAAGT GCATTCACAT 18900
TCGGAGCATA CTCGTCCCCC AGCGCGAGCG AGAAAGCTCG ATGGTCTGGA CGTGCCTGTT 18960
CCACCATTTC TACGTACTCA CGTCGTTGCG CCGCATAGTG tCACGCAACT GATACGAAAG 19020
AATATCTCCA CCACGATTGG TAAACGTCAC TTGCACTAGC GGGGTGCGTA CCACATACGT 19080
GCGCTCTACA CGCTCTTCAG TTTCTGATTC GGGAAACACC AGCACCTGGC CGGAAGGATG 19140
CgCaGGCTGC GTGGTCTCTT GCGTATCTGC TGCACCCCCG TGAGCACTTT GAGTGCGCGT 19200
CTCCTCAGGT ACGGCCGACA GGGTGTGTTC CGCAGCAGAG TGCGAGGAGG ATGCAGACGG 19260
ATACAAAAGT TCCTGAAGGA ACGAAGCTCC CACAAGCACT AACACCGACA A ACGACTGC 19320
AATAAACATA TTTTTTTTCA TCACGAACAC TCCTGCGCGT CTGTTTGCAA TGGGAGTACG 19380
TCCTTTGTTG GTACTGGGTC GTAACCACCT CGAGCAAAAG GATGACAACG TACAACTCTC 19440
TTTACAGTAA GAAAAAGACC CCATACACAT CCATGTACTC GCACGCTCTC ATAGGCATAC 19500
TGCGAGCAAC TCGGATAATA ACGGCACGAA GGCAAAAGAT GAGGGGAAAG AGCGCGCTGG 19560
TAAAAACGGA TCAGCGCGAG CAACAATCCA CTAAATACCC ATGcACACAA GCGACGCACA 19620
CTCACGGCCT ATTCCTGCCT TTCAGCGGTA AATTCAGAGA ACAGACGCGC GCGATCACAC 19680
AACACGCACA GCAGCCGCTG ATACGCCGCA AGGTGtCTTC CACAACGGAA ACCAAGAGGA 19740
CCAAATCAAA ACCTGGAACG AGAGAACTTT TGAGTGCACG ATACGCCTCT TTTGAArGCC 19800
GTCGTGCGCG ATTGCGCGCG ACGcTTTTCC ATAGCCGCGC CGAAAAGTAG CAAGGAACCG 19860
ACTATACGCA CACCCGTTTG GCAACACAAA AAGACACGCG CGCCCGTAGC AAAACCTACG 19920
TCCCTGTTTG AACACTGCAC GCACCCGGCA GGAACCGCGC AACCGTTCGC AGGCATCAAA 19980
CGTAACGCGG ACGGTGAAAC AACGCGAAGA AACGCGTGAG GCAGCCACAA GGCCTAGTAA 20040
GGTTTTTTCT CGTCAGAAAC GGTGAGCTTG CGTCGTCCcT TCGCACGACG ACGCCGGAGT 20100
ATCGCGCGCC CACCACGCGT TGCCATGCGC GCGCGAAAGC CAAATTTCCG cACGCGTTTC 20160
CGCCTGCTTG GTTGATAGGT CCGCTTCACC GATTTCTCCT CCCCCTGCTA AGAACACGAG 20220
GGTCCACTCT GTACCAAAGG TACCGGAGGC GTCAAGATCC CCTCTTGTAG GGGGAACCGG 20280
CTCCTCGGGA GGAGCCGCGT CAGCACCTTT AATTCGAAAT GTTTTGCTCA CAGAAAAAGG 20340
GATAGGAGTG TGCGCGCCCT GTTTCTTGCG ATAAGTGAGG CTTTAAATTC CGGAACGGTC 20400
TCCCATGTTC ATTAAATAAG GCGCGATATT CCTCTTCTTC CCGCAAAAAA AGGCGAAACC 20460
AGGCTGATAT CAGGCGACGG TACAACGCGA ACGATTCCAC ACTCAGCGTT TTGTTGTACG 20520
TAAATCCCCA CACTTCAAAA TATTTATCCG TCCCCGCCCC ATGTCCCATA CCTTTTAGCG 20580
ACAAAAAGCA CGCAGGCACA TCCTGTGGTA AACTCTGAAA GAACTCATAC GACCGTTCAG 20640
GATATGCCAA CATATCCTTT GTTCCCGTCA GAATCAGCAC CGCAGCACGT ACCGAGCGCA 20700
AATCAGTTCC CAGTTGCTCG TTTTTTCCCC CAACCATGTT CACCATATCA GCGCCCCCGT 20760
TAAAGGGATG CAACGCAACT ACCGTTCGAA TACGCTCAGG GTACCGATTT GCATAGTGCA 20820
GGGCAGCTGT ACCACCCATC GAGTGTCCCA ACACCCCTAC ACGCGTAAAA TCAACCTTTC 20880
GATACAGAGG AGAGCCCTCC TGCTCATTTA CGCGCTGCAT AAGTGCATAG ACACTATCAA 20940
ACGTACTCAA AAAATCGGGT GGCCGcCGCT GAGCTCGCGA CGTAAACACC ACCGTGACAA 21000
ATCCCTGTGC AGCCAGAAAG CGCGcTAACG CACGCTGATA ATCTTGCGTG CTATTCCATC 21060
CGCGCGAAAG CATGATAAGC GGATACGTAC CCTGATGCGC TGGATAATAC ACCGAGGCAG 21120
GATAACGCAG GTGCACTCGG TCAGTGAGTT GCTCATCAAA CTGATTATCA CGCACAGTAC 21180
GCTTGTGCAT TTTATTCAGA AGGGAGACAT CATCGTTACA CGCCGCCACT TCGAAGACTT 21240
CTTGGAAAGT AAGATCCTGC GCACGCAAGA AAAACGAGAA GAAAATACTA AATGCCAACC 21300
CACACCATTT TTTCATCACA CACCCGCACA GAAACAAGCA TAGCAACGCC GCCGCTTACA 21360
CGCCCGCTCC GTTTGCCCGT GACTGCTCCA TCACTTTACA CGTGCTGCAG CrTTCCTTAC 21420
CGTCACTTAG CGAAACAGAT TCATCAGCAG CATCGGCACT GAAATTAAAC TGCCAATCAT 21480
TGCCCCTATG CTTGAAAGCA CCACCACTAA CAGTACATGC GCAAACTTAT TACGATACCA 21540
GCCACGGAAC GTAACAATAT CGTCCGCCAG GCGCTCCATA TCCGCCACTT GCGGCCGGCA 21600
CACCCACGCC TGCGCAAAAC CCGTAAACAA ACCCACCCCG ATTATCGGCG TGAGTACTGC 21660
AATTGGCGCC CCCACAAAAC CCACCACTAT GCTGAGCGGA TGTCCGAGTG CACACAACAC 21720
CCCCAGTGCT GTCATACTCC CGCTCCACCA TAGCCACTGC ATCAGTGCAT CGCGCGATGC 21780
GCCTaCGCCG CCAGCAAAAA AGCATGTCAC CACTACCCCC ATCAGCAGCA ACGGAAACAG 21840
CCAACCTAAC AGCTGCCCCG CATGCGGAGA ACTACCCACC GTTTCTAGAT TCGTCACTTC 21900
AGCCGTGCGT GCTCCGCAnA AaAACTCGTA CAAACACCGT TGCACACCCG CCACGCTCCC 21960
TGcGcTGACT ACTGCCATTA CCACTTGGCT GTCCACCGCC CAAATTTTGG AAGCCAGATA 22020
TTGGTTCCGC TCGTCCACCA ATACCCCCTT TACCGCCGGT AGGAACGAGA TAATCTCTTG 22080
CATTAACCCA TCCATCGCGC CGTGCGAACA CAGTACCGCA ACGgCTGsTC GsTyAACTGT 22140
TcCCCAGTAA ACGCCACACT CAGCAACACT GCCAGCAGCT TTGCTCGGCC CCAGGGATTT 22200
AACACGCGCC AGGCACGTCT GAGCGTCACC TCAATAGACC GATCGATATA CGCTACCTGG 22260
GCAGAAAGtC CGCTGCCACC TCAATTGCCG CTTTCACTTC ATCCCCAAGT CTTACGCCAG 22320
TACCGGAACT CAGACGTTTC TGAAACGCAG ACAGGGCTAG AGTACTGAGG AGAAAGAAGC 22380
CTTTCCCTTC ACGCAAAACC CGCGCAAGGT CATATTCCTG CCACTGGCGC ATCCCCAAGA 22440
GGTCCcTGCG CACGCGCATC GTCCACTTCC ACACACACAC ACTGCGGACG CCTCGCACGA 22500
ATnGTGCGAC GCACGCACTC AATGGCTTCC TGAGAAGTAC AGGCTACCCC CATAAGCACC 22560
ACACGGCGAG ACCCGAACTC AAGACACGTC AGCTGGTGAT CCATAACCAC TTTCAAAGGG 22620
TAACCTTTAC CCGAGACTCA CGCCATCAAC CAGTCGCACC TTGCCGACTG TCCCTCCCAC 22680
ACACGGTGGC GCCCAAGACG GCCTAGGGAT TCGTACGCAG CGACGATGAC CTCCCTCGTA 22740
AACGCCCAAG AGCACTCACC GCAAGTCGGT GCTCAAAGGA AGAAGGAGCA CGCGCCGATT 22800
CTACTTCACC GTAATGTTTC TCGGCAAGCA GATGCCGCCC CCCGAGTTCG TAAAAAAAGG 22860
CAAGGTAAAA CGAGTATCTT ATACGCTGGG GCACTGACTT AATTTTAGAT ATACGGGAAG 22920
CCATATCCTG TTCCCCACTC AAATCGACAA ACAAGCGACA CAAAAAATAC TCCACTTCCC 22980
GTCGCGTTCT ATCTACGGTC CGAATAAACG TCCGCATAAA GTGCTGCGCC TTGTGTGCCT 23040
GGCCCATCTG ACACAAACAC AGCGCCGTCA TGAGCGCATA CGAGATATTC GTTGGGGCGT 23100
AGGTAAGCGC AGTTGCAAAA GCCTCACGCG CCTCTTCCCA CCGCTGCTGC TCCCAAAATA 23160
GCACTCCTAA GCTTTCGAAA GAAAAATGGT ATTTCGGATA CAGCTGCACC GCGCGCTGAT 23220
AGTGCTCAAT CGCTTTCTCC GCTCGCCCGA GCTCGTCGTA AATTCCCCCC AGATAAATGT 23280
GGGCAAAATA CGCATCAGCA GAAAGCGCCA CTGCGCGCTC GAACGCCGCT GCCGCACGCT 23340
CCTTTTTTCC CGCCTGGGAC AAATACGTGC CGAGATCTGT CCAATATGCA GGATCATGCG 23400
GATCAAGCTG CACCACACGT TCCAGATCCC TAATAGCTTC GAGCACTCGG TTCGTTTCCG 23460
CTTTCACCCG CGCACATTCT GCAAGCGCGC GCTCATGTTC AGGCG ATCC TGCAGTACCT 23520
GTCGGTACTG TGCCTCTGCC TCTTGCATCT TTCCCTGCAG ATAGTACACC TTCCCCAAAC 23580
CTACCCGCGC GTCCTGCGCA CGCGGcTCCA CGCGCAGCGC GCGnAGAAAG CCTGcACCGC 23640
CTGGGcATAA TCATTAACAC TTAAAAAATC GTAACCGCGT TCAGTCAGTG CCCACAGATC 23700
GTGCGGaTCC TGCGCAAGAA TCTTTTCTAC ATACTGTTTC TTCTTGCGCA CATCTCGTTT 23760
CGCTTGCGCT ATCATTGCGT GTGCGTACCA CAGTTGCACC GTTTCAGAAG CCGTAGgACC 23820
GTCCTGCCCA AGCTTTTCTG CAAGCTCCTG CGCGTGCGTC AATTTCCcTG CGGAAATGAG 23880
CGTAGAAAGA TACAGGTATT GGATGCGCTT TTCGGCACGA TGCTCAGGGC TCAACGTATC 23940
GAACAGCTGC AACGCCTCTT CCCACCGCTG CTTTTCCAAC AACCCACTGA GCTGAGCTGC 24000
AAAACGGATA TTAGGATTTT CGCTTTTTAA AAGCTGTGCC CGCTCGCTCT CCCGAGCGCT 24060
GTTCGATCCA GTACTAACAC AAGATACAAA GAGCGCACCT AAGAGCACAC TCACCTTAAA 24120
AAGAGTGCCA TACCGCACAC CGACCCCCTG TACCGCACGC TGCCGACACC CGCGCGTCGC 24180
TGAATTCTCG GCGTCTTTCT TTTACCTTTT TATTTAAATC GCAGAGGAAC TTTCCTGGAG 24240
CTCCCCCACA AGAGCCATGC CCTTGACCTG AGAGGGAAAA TCAGCTTATG CTTCCsCCGA 24300
TGGCGTGCGG GGAGTCCTGG GAAGAGCTCA AATATCGTGA AGTTTTTGAG GAAGAATTGA 24360
GCGCGCTCGA GCACCGTCGC CAGCGCGATC CAACGTGCAG CGTCTCGGAT ATCGAGGCGG 24420
TTCTGGAAAC TCTTTATCTC ATGGATGGGA ATAATCAGGA TGGGCGTGGG AACCCCCGTC 24480
AAATCGGTCT GGACGCTACC ATAGCCGCGT ACGAGCAGTT TCTGTGCGAG TGGAGACGCC 24540
AGCTGAGCAC TGCCTCGCCC CTGAGCATGG AAAAGAAATG AAACACCCCT CGGTGCGCGT 24600
ATGCTGCTTT GCGTTCGCAT CCTGTCTTCT TTGTGCAGGC TGTTCACTGA AAAGGCTCGC 24660
CTTTTCCTCT CTCTCCCACA CGCTCGCTCC CTTTCCTGAG GGGGAACTGG ACGCGCACCT 24720
TTCGGACGCC GATTTTACGC GCGTTTTCAC CGAGGAAGAT GATCTTGATT TAGTCGCCCA 24780
GTCCCTCCCA cTGGTGCTCA AGGTGTACGA AGCGCTGCAT CTGCAGAATC CCGCGCACAG 24840
AGGACTATCC CTCGCTGTCG GCAGGCTCTA TATCATGTAC GCTAATGCTT TTGTCCAGAC 24900
CCCTGCTCAG TATTTGCCAG AAGACGAGTT TGAGGCGCAG AACGAAGCCT ATTCGCGCGC 24960
GAGGAAACTG TATTTGCGTG GCGCGCGCTA TGCGCTCTCC TCGCTAGAAA CCGCATATCC 25020
GGGCTTCACC CGTGAGGTAT TCTCCGGGGA TGAGCAACGG TTGCACAAGG TACTTTCTCG 25080
CTGTACGCGT GTGGATGTGG GCACCCTTTA CTGGGTAGGT ACGGGGTACG TGGCGGCGTT 25140
CGCCCTTACC CCTCTGGGAA GCGCGCTCCC AGACACCGTG CATGCGGCGG TGATGATGCT 25200
TGAGAGAGCC TGCGATCTGT GGCCTTCGTA TCAGGAAGGA GCAGTCTGGA ACGTACTGAC 25260
CAAGTTTTAC GCCGCAGCAC CAGAGTCTTT CGGTGGGGGG ATGGAGAAGG CACATACCGC 25320
GTTCGAACAC CTTACGCGGT ACTGCAGCGC GCACGACCCT GATCACCACA TCACATACGC 25380
TGATGCGCTG TGCATACCCC TTAACAATCG TGCAGGTTTT GACGAGGCAC TCGATCGCGC 25440
TCTTGCCATT GACCCTGAGT CGGTGCCGCA TAATAAACTA CTGGTGATCC TTTCTCAAAA 25500
GCGTGCACGT TGGTTAAAGG CGCACGTGCA GGATTTTTTC TTGGATTGAG AATAAGCAGA 25560
ATTCGTGGTG CAGGTAGTCT CCCTGCACAG GACGCGCGTT CTTGTGTAAA AAATTACTTT 25620
TTGCAAAAGG AATATCTGTA TGCGAACGTA CTTTTTCATG AGTGTCTGCT CGGTACTCAC 25680
CTGTTTTGGC CTCTATGCAA AAGAAAAAGT GGTGTTGAAG ATCGCTTCCA TTGCCCCTGC 25740
ACGCTCCATC TGGGAAACAG AGCTGAAAAA GCTTTCAGCA GAATGGAGTG AAATTACTGG 25800
CGGTCTGGTG TCCATGAAGT TTTATGACAT GAGTTCGCTC GGAGGAGAAC GAGAGGGAAT 25860
TAGAAAATTA AAATCCAGTC GTCCTGGTCA GGCAGCTCCT CTTGATGGAG CTGTTTTCAG 25920
TTGTTTAGGT CTGAGCGAAC TCGCGCCAGA TTCCGGTATC TATACGCTCT CGGTCCCCTT 25980
TCTCATTCAA AATGAGAAAG ATTTAGAACG AGTTCTGCAT GAGCTGCGCG AAGATTTAGA 26040
CAGACCCTTC CGCGCAcAGG TTTTCGCGTC ATCACGTGGA CGAACGCCGG TTGGCTTTCT 26100
TTTTACACAC GCGCGCCGTA CGCATCGTTA GGACAATTAA AAAAACAGAC TATCGCCCTT 26160
TCCAGCCTAG ACAGCTCGGT CCTCGGTACC TGTTTTAGAA TATGCGGTTT TGACATCAAA 26220
GATGCACCGA ACGCGCGCCT TGCACCGTTA CTGAAAGCAG GTAGCATCGA CGGTTTTCTT 26280
TCAGTGCATT TGTTCACCTG GGCAACCGGT TTTTACCGGT ACATTTCGTA CGCGCTCGAC 26340
ACTAAGATTT GTCcTGCGGT AATCGGTATG CTCATCTCAG ACGGGTCATG GGCGCGAATC 26400
CCATCGCGCT ACCACGACGC TATGCTCCAG GCAGCTACAC GCGTAAGACA GCGCCTAGCT 26460
AATAACCTTG AGACACTTGA TCGCGAATGC AGCAACAATA TACAGAAAGC CGGGGTCTCC 26520
ATCGTCCATC TTACCCCGCA GnAAATACAG GAATGGCGTA CCGAGTTCGC TGCAGACGTC 26580
AAGCGCATCC AGGCGCGCTT ACCTGGCATG TTGAACATGA CTTTGTACGA GAAGATCAAA 26640
CACCTCTTGT ACAGCGCACA GCgcwgAgcT TAGCCGGTAT AAGAGGGAAG GCGATGTCAT 26700
GAAGGGTACA CGGGGACAAC TGGTTTTGCG CAgcaTAGcG CTTCTGCTCA TTGGGACGCT 26760
CATGCTGCTG CCGTTAGTGC TTTTTTTAAT TGAACGGATA TTCGGTTTTC TTACGCGGGG 26820
CGTAGGTTCC GAGGTGTTCT CCGCGCACGA GGACTTCATT TTCCTTTTTT TCTCCTCCTC 26880
TGACGCCGCG GTTGCACAGT TAGCCTTCGT GTTTTCCTGT GTTGCAGGCA TTTtACgctG 26940
CGCGTGAACG TAAACACTTG AGTGTCACCC TGTTCTCGTG CGACGTGGAC AGACCGATGC 27000
ACCGCGTTCT TTCCTTCCTC TCTGCGATCT GTACGGTGGC AGTGCTCAGC GCTTGCTTTT 27060
TTGCGTCTGG ACCGAATATC GTCGCAGTTT TTCGCAAAGA AGAAGCTGTG TGGGGAGTGC 27120
CGTTACGCTG GATTTTTACC GCGCTGCCAT GCATGTACGG CGCGCTTCTT TTTCACTACG 27180
CACGAGAAGT CAAGTGTCGT ACGTGCGTCA TCGTTGGACT TTTAGTTGGC GTGCTGATAA 27240
GCACAGGATC CATCGCCTCT GTGCTTTTCC ATCTCTTTGA CCTGACCGTA CCCCTGCTGG 27300
ATAGTGTCTT TCACGGCTGG GTAGCAGTGG GTACACGACT CTTTTGGCCG TTCGTGCTTC 27360
TCCtTCTTCT GCTCGCTGCA CAGGGTCTCC CGCTTTTTAT TACGCTGCTT GCCATCGCGT 27420
ATCTGGCGCT GAGCGTCGAT GGAGGATACG TGGATACCCT TCCTCTCGAG GGGTACAAGA 27480
TCcTCACGGA TACGGGAGGA ATCGTAGCGG TTCCGCTTTT TGCCACTGCA AGTCTGCTGC 27540
TTGCACGCGG CAGTACTGGA ACGCGTnCTg cTTCGCTTGG TAAAAGAAGC GGTGGGCTGG 27600
CTTCGTGGAG GAGCAGCAGT TGCCTGCGTG GCAGTAGCGG CGCTGTTTAC GTCATTAACC 27660
GGTGTATCGG GGGTGACAAT CTTGGCCCTA GGAAGCTTAT TCAAGCTGAT TCTCACGGGT 27720
AACAAATACC CCGAGCACGA TGCAGAAGCG CTCATTACCT CCTCTGGCGC CATCGGACTC 27780
CTATTTCCAC CCAGTGCAGC GATTATCATT TTTGGCGCAA CTAACATTCT TACCGTACAT 27840
ATTGTGGATT TGTTCAAAGG TGCATTGCTT CCCGGGACAT TaCTTGTGCT TTCTGCCATG 27900
TGCTTAGGGG TGGCAAAAGA TCGCACACAG GTCCGTCCAT CCTTCTCCTG GCAGTTGCTT 27960
GTCCATGCCG TAAGAGGAAG CGTATTTGAC CTTGCCCTGC CAGTGTGTAT TAGCCTGGGC 28020
TATTTTTCCG GTACGCTCAA CCTGCTGCAG TGCGCGTCGC TGACAACTCT CCTGGCTTTT 28080
GTATTAGGTA CGTGGGTGCG CAGGGATTTC ACCGTGAAGG AAgTTGCGCA ACCGCCCTTG 28140
AGAGTCTGCC TATCGTCGGT GGCATTTTAA TCATTGTCGC AGCAGCGAAG GGGCTGTCCT 28200
TCTACCTGGT GGATGCAAAC GTACCGGACA CCCTCATCGC GTTTCTGCAG CATGCAATTT 28260
CATCAAAGTA TGCGTTTTTG CTCCTTTTGA ATGTACTGTT GCTGGGTGTC GGGTGTATCA 28320
TGGATCTGTA TTCGGCGATC CTGGTAATTT CTCCCCTAGT GTTACCCCTT GCAGTGCATT 28380
TTGGGGTACA TCCGGTGCAC GCGAGCGTCG TTTTCCTGAT GAACCTTGAG CTAGGTGCGC 28440
TGACCCCGCC GATTGGAATG AACTTGTTCA TCGCGAGTTT TGCATTCGAA AAACCGATTG 28500
TGTATCTCAC GCGCGCTATT GCACCCTTCT TGCTAGCACA ACTGGGAGTG CTTCTTCTTA 28560
CAACTTACAT ACCATGGCTC AGCACTGCAT TCCTGTAGCA CCGCGTTCCG GCCACAAGTC 28620
TGAAAAAGTT GAAAAGAAAC GCCGCAGgca TGCTGCGATC CCCGTTTTAT GCGCCGGGTG 28680
CAGCCtCCCT GCGGGGATTC AATTGTCTGT ATACCTTTTC CGCCAGGCCG AATCCACCCT 28740
GCGCGGCTAG CTGCGCACTA AAATGCTCAT AGAGGGCGTC TTCGTATAAC CTTCCTGAAA 28800
AACTCCGTTC ACCTGCAAGC GTCTGCCCGC TCAACGTCTC GCGCATAGAC TGCACCATCA 28860
CTCTCACAAA CAGCGTTTCA AGCTCCCGAG CTTGAGTGTA CAGCGCATCA TTCTTTTCTG 28920
CAGAACAGGC AGCACCTTGC TGCGCAsGGA aCAAGCGTGC CGCGAAAGAA CCACTACCTT 28980
CCATTTTCCC TGTCTTAGAC AGGGTAACGG AAGGAACAGA CTGCATCCCC AATGACAATA 29040
CACGGTGCAC GTTCACCTTT CAGTCTCCTA ACGCTTGAGC GCCACTGCTG TGCCGAGCAT 29100
GTTGTCACTC GTTTGAATTG CTTTTGAATT AAACTCATAC GCACGCTGGG CGACAATCAT 29160
GTTCACCATT TCACTTACTG TAGACACGTT TGACATTTCC AAAAACTTAT GCTCAACCTT 29220
TCCGAATCCT TCAAAACCCG GCCTTCCGGG AATTGGCTGG CCGGACGCAG gTGTTTGGGT 29280
AAACACATTC CCCCCCTCTG CTGCAAgcCC CGCATTGTTC GCGAAGnaTa CAGCTCAAGC 29340
TGTCCTACCT CAACCGGATC TCCCTGTTCC CCGACTCGCA CCGTAACGCG CCCATCCTTG 29400
CTAATAGCGA TACTGTGTTC TACGTAGTTT TCGGGAAAAA TAATCTCTGG AACGAGACGC 29460
AACCCGTTTG AGGTCACCAA TTGcCGctCC GCATCCACCT TGAACGAACC GTCGCGGGTA 29520
TAAGCATAGG TTCCGTCATA TTGCAGTACG CGAAAAAACC CCTCACCCGC AATAGCCACA 29580
TCTCCGCTCA CACCCGTGTG CTGGAGCGAA CCTTGTTCGA AGAAGmGCTG CGTTGcAGCG 29640
AGTTTCACCC CGTGCCCCAT CCGTACCCCA ACAGGGGTAA GTGTGTCCTC AGTTGycAGG 29700
CGTACCGCGG TGCGTATGGT CTGATACAAC AGGTCCTCGA ACTCCGCACG CTGTCTTTTA 29760
AAACCAGACG TATTCACATT CGCTAGATTA TTCGCTACCG TATCGATGTT TGCCTGCTGG 29820
CCGTTCATCC CCGTAGCAGC GGTCCACAAA tTCGTACCAT TCACACCTCC CTCTCACTcC 29880
GCTACACGCT ATGCTCGTCT ATATCTTCTT TACATTCTAT AAAACATGCG GACAACTACT 29940
TTGCACGCaC CACTTCGTTC CACAATCTGC CCATCATTCC ATCTTCTGCT TGAATAGTTT 30000
TTTGGTTCGC CTCATACGCG CGATTCACCT CAATCATACG AACCATTTCA TTGACCACGT 30060
TTACATTCGA CGCCTCAACA AAACCCTGCA CTGCAGCCGG ACGTTCAGGA CCTTCCGCAG 30120
CAATAGGGGC CCCTGAAACA GGAGTTTGCA TATACGTATC AGCACCCTTC TTTTGCAGGT 30180
AACGAACATT TTCAAACGTG ACAATTTTCA GCCGATCTAA AAAAAAACCG TCAACGTCTG 30240
GCCTATCTAT GGGACGTACA TAAATCTCCC CGTTTTGATT GATCGTATAG TATCGCTCCT 30300
GCAGAAAAAG TGGACCATTT TCTCCCAGTA CTGGATACCC ATTTTTAGTC ATAAGGTAAC 30360
CTTCTACACC GACTAGGAAA TTCCCATTCC GGGTGTACTC TTCTCCCTGT GGAGTCCTAA 30420
TCACAAAAAA ACCCATCCCC TCAAGCGCAA TATCCGAAGG ACTTTGCGTT TGTTTAAGCG 30480
AACCCTGCTC AAATTCAGTG AACAGTTCAT TCACCTCAAC ACCGAGGCCT AACTTTCCAA 30540
CTATAGGAGA AACGTCCGAA GAACCGAAAG GGTTCTTCAC CACACCATcG TCGTTTACAC 30600
GACGCAATAG GAGCTCTGGA AAACTCTTGT GAACTGCTAC ATCTCGCTTG TAGCTTGTTG 30660
TGTCTACATT CGCTAGGTTT TGCGCAATAG CATCCAGCCT GCGCTGcTGC GCGCTCATGC 30720
CACTGGCTGC GGTATACCAC CCTCGGATCA TACGCCCCTC CGCTCCCCCT GGTATCGGGA 30780
GATAGAAAAG GGGAATCAAG AAAATTCTTT TGTCAGTGTG ACTACTTTTT TGATATTCAC 30840
TGAGCAAGTG CAATAATAGG ATCGAGTCTT GAGGCCTGCA GCGCTGGTTT TAATCCAAAG 30900
AAAATCCCCG CTCCCAATGA CATAAAGAAG GCTGTACGCA TACCCGCAGT GCTCAGACTG 30960
AAAACAACTG TTATCCCCTC TGGAGAAAAC ACGGAAAAGA GCCCATAACT GAGCACCATC 31020
CCAAGAATAA GGCCACACAC GCACCCCGCC AGGnTTAAAA GCACCGCCTC GAGCAAAAAC 31080
TGCTGAACTA TTGTTGCGCA CGTCGCACCG ACGGCCTTGC GGAGACCGAT TTCTCTGCGA 31140
CGCTCGGTTA CGGTTACTAC CATAATGTTC ATGATATTTA TGCCACCGAC AATCAGCGAG 31200
ACTGCAGCCA CAACCGACAG CACTACACTC ACCATACTCA GAACGCTGCG AAAACTTTTT 31260
ATTTCCCCCG CACCGGACCA AAGGCTCACA GAACCCGATT TGTTAGAGAA AAAGTCCGAA 31320
AACTCCCGGA TACGTTTTTC CGCTGCGGCA ATAACCTGCA CATCGCGTAC GCACACCTCC 31380
ACCGCGTCTG CCACACGACC TGCACCCATT TCTAGAGAAA TAAACTCACG GGGGACAAAA 31440
ACCCGATACG AAGGAATGCC ACTAATCAAA CTCCCCTTTT CCTGCAAGAC GCCTACGATT 31500
TCAAATGGGA AAGACAGTGC ACGTTCTGCA CCCGACGCCC GGGAAAGTAT GGTCACAGTC 31560
ATACGTTTAC CCAATGCATT CCCTTCAGGA AATAATTCTT GCGCAAGCAA ACCGCCAATC 31620
ACCGCACAGT GACGATGGGT CTTAAAGTCC GCTGGAGAAA AGAACGTCCC ATACTCAAGC 31680
TTAAAATCTT TTAACTCCAG CCACCGCGGC TCTACTCCCG TAATGTTCCG TTCCTTTCCC 31740
CCTGTGTGAG GAGAAGAAAT AAGTGCCTTG AGGGAAGAAT TGTAAAACAC TCCCTCTATA 31800
TCCTCGCTAC TTTGTACAAG TCGCGTCCGA TACGACTCAG TCGGCTGAAA CATGATTTCG 31860
TTCTTCACAT AATCCCACTC TGGCCTGACT CGAATGAGTC GGCGCTCGCC CTCGCCAACA 31920
CTCTGGGCAA GACTCGCGTA GAGAGACTCG CCGATCGAGG TAATTACCAC TACCGACGCA 31980
ACCCCTACCG CAATACCGAG GAACGAAAGG GTCGTCCgCA GCACACGCTG CCTGAAAtAC 32040
AACAGGGTGT TcACGATATC TTCAAGCATA TCCCTCTTTG CAGAGCcCGT yGCTCTACGC 32100
GCGCGCCTCT CCCCTCCTAC AGAAACCTGA CTTCACGCAC ATGGCAACGC CACAGGACAG 32160
GCACGCGCAC ACACATCGCC TAGCGGTCTT CCAAAGACTG GATTGGGTCA AGACCCGCGG 32220
CTTGGAATGC TGGATACGAC CCAAAACACA CCCCAATAAC TACTGACCCT GCAAAGGCAA 32280
TAAACATGCC GAcTACGCTA GGAGAAAACG TCATTTGAAA ATCAAACGCA TTCAATCCGG 32340
CGATAACAAT CACGCTCAGT AAAAGGCCAA GCACAACGCC GCACAAACCA CCTACGAACG 32400
TTAACGTGGC CGATTCTACC AAAAACTGAT GAAGCACGTG CATGCGCGAA GCACCCAGTG 32460
CCTTCCGCAA CCCAATCTCC TGGCGTCGTT CGGCGACCGT CACCAGCATG ATGTTCATAA 32520
TACCGATGCC ACCGACAATC AGTGAAATAG CTGCGATGCC CGTCAAGACC ATATTCATTG 32580
CCCGGAGAAA ACTCCGCATC TGTTCAACGA TAAGATGGAG GGAAGAAACT TCAAAGGCCT 32640
TCTCGTTACC GGTCAGATTG GTTAGCACTG ACTTAACTTT GTCCTCAACA TGCGCGATCG 32700
ATCCAGAATC GTATACTTTC AAATCCATTG CATCGGCAAT ACGCCGAGAG AACTCTCTTG 32760
TCAGCmmA 32768
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8642 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CGGATACCCC GCAGTGGGCG CCGTTGCCGC CCCCGCTCGC TTGGACCGTA CGACAgTGGA 60
CGTCCGTGCC GTCTAACACG TAGAAGCACC CCTTTCCATC GCTCGTGCAG CCGAGGATGG 120
GGGACGTATA GCGGTAAACG TGCCGTCCTT TGTGTAGGTG CAGCTCGTCC CCGAACTATT 180
GCTTTTGGTG TACACGGCCT TGGTGGTTAC CACGTACCCG GTTCCGGAGC CCAGGACGTT 240
TTCATCACTT CCGTTTTTAA TACCACCATC GGAAGAAGAG CCGCTCCCTC CGCCTCCACC 300
GCCTGAGGAC CCAGAGCTAA GTTCACAGGG CATTTGCAAA AAGGGATTGT CTGACCCGCC 360
AATGCGGATT GCCCCATTTG TTTTGCCTAA AACGGTTGAA GGCGTCGTGG TCCCTCCCGT 420
TCTTCCGGCG CCGTTGCTGG TGTACGTGTA CACACCTTCC CCCGAAACAC AGGCGTACAC 480
GCACGCGCCt TCGAAACAAT GCTGGTAATC TTTTTGCCTG GCAAAAAATT TACTGCCGTC 540
CAcTTCCCCT CAGACTTGCT CGCGTCCTTT TCCCACAGCT GACCGGCGCA GGCGTACAGC 600
TTGTTATTGC ACTTTACCAA ACCCGTCACT ACCCcGCGGA TGCTCGGTAT TTTTAACGGT 660
AcTTCTGACT GGATGGAGGC AAAAATGCCa GAAAAATCGC AGCCGGTTAA GAGACTTGCG 720
CTTAAAAATA ACACCGGCGG ACAGACTATG CGGCGCACTA CGCGGCTTGC GCCGTCTGTG 780
TGCGCGCGGC ACGCCTTcGC CGCGgCACAG CCTcCACCTG CTTGGACTAC AGCTCTTTCC 840
AATCCTTTCC ATTGTTCCCC CCTATTACTG CCCTCATGCA GACGTGGCAC GTCCCCGCTC 900
CTATCACAGG CACAGAGACC CCCCTACCAA AAACAAAGCA CACTGCAGCC CCCCGAGGCC 960
GCTATCCGCG CCACCGGGGG GGGGGTATCG GAGGTTCGCG CGCTTCCtCC TCCGGTTGTA 1020
CCGTTTCCTG AGGAGAGACG GCAGATGCTG AAAAGGGTTC CCCTTCCCCA CTTTCTAACC 1080
AGAGACGGTT AACCGCAAAC ACCACACACA ACAGCAACGT CAAATcACGA yTGAGCGGCA 1140
TCGTGCCGCG CTCAAtCGCG TsGAGyTCTT CAAcACCAwT CCCAAGTTCC CGCGCAAAAG 1200
CCGTCTTACT TAAGTtTTGC TCACAGCGAA CGCGCTcAAC ACGCACACCC ATGCcTTCCA 1260
TCTTCCCAGT ATACTCTAAA ACACCGCCGC GCATAGCCCC GCAgCCGCAG CTTGCAACAC 1320
tACGGGTGCA GGGCAAAAgG AGTwATGGGA CCGGGCTTTT TCTCTCCCGC CGGCATCTGG 1380
GCAAGGATAC GAGGCACACC GTCCGCCTTG CGCGCATCAC CGAGTGCATG ATACGTCCTG 1440
CTTAAGTTGT ACAAGGCACT CCCCGACCGC GGCAGCAAGC TCAGCGCCGT CTCAAACGCA 1500
CGTGCGCCTG ATCGTACTTT TGcTCGGTGT GCAATAGCAG TCCGTAATTG TTCCAGGTAG 1560
CACCGTTCTT CGTTTCCAAG CGGATGGCGT GCTCGTACGC ATTGCGCGAA cCTCAGTGTC 1620
CCCCATATCG TGGAGCACCA CCCCAAGCAT GTTCCACACC GCCGCGTGAT GAGGGGTGCA 1680
CCGCAACGCA TGATACAGCG CACCTGCTGA ATCAACCGTA CGCTTGAGCG CATAATAGCC 1740
AAGCGCCAGG TTCAACCACA GAAGCCCGTT GAAAGGACTG AGCCCCAATC CCTTGCGTAA 1800
ACAGGCAACG GTTTCCTCAT ACCACCCCTT GCGCGCGCAG GAAAGCGCAA GACAGTTCAA 1860
AGACTCGGCC GTCTCCTGAA CCCCACGGGg cTGCCGCCCG TGCTCAGGGC ATTCAAACTC 1920
TTCAAAAAAC TGATTCACGA GTCGTCCTCC TCCCCTTCTC GAGCGGGTAT GCGCGCGCAg 1980
TATACCAGCA GCACCCGCAA AATACGAGCA CCGCGCACAC CCCACCTTCA GTCACCCTCC 2040
CCCCAGCCAG TCAAAACCAC GTCCCTGcTA TCGCGCACAC CCTGCGCACT GCAAGAGcTG 2100
CGCAACTTCC TGCAGCCTCA TGGTGCGAAA TGCTCCGGGC CTCAGCGGTC CTAGACGCAC 2160
CCTGCCAATG CGCACCCGCA CTAAGCGCAC GACATCCTGT CCCCACGCCT CGAATACCAC 2220
ACGGATCTCG CGCTTTTTCC CCTCAACCAG TACAAGCTGT ACACACTGCG CTGCAAGATG 2280
CCGCGCGCGC ACGCACCGAT ACCGGcACCC TTCCACCCAC ACCCCACGCA CAAAAGAGCT 2340
CAACAGCGCT GCAGGGACTG GCTCACGCGT TTCTACAATG TACTCTTTCT CTATTCCsGA 2400
ACGCGGATGG CCAAGAGCmT GCGCAAACGA ACCATCATTT GTGAACAGCA GCGCGCCTTC 2460
AGACCGCACG TCCAGCCGGC CGATGTGATA TAGGCGCTCC TGATACGCAG cTGTACTAAA 2520
TCGATTGCAC GCGCGTATTC CTGTTTGGAC GGGCCTGCCC GCACCTGCGT GTGTGCATAC 2580
CCGGCAGGAA ACTGCGGCGC GAGGGAACAG ATATATCCAA CCGGCTTATA CAGGAGCACG 2640
TAGCGCTGAA CTCGTTCAAG CTGCACGACG GTGCCGTCCA CACACACCAC ATTCTGCGCA 2700
CAAACGGTCC GTCCCTGTGT CGTAACCGTC TGACCATCAA CGGTCACACG CCCTGAAGCA 2760
ATCAAGGCCT CACAGGCACG CCGGGAGGCA CAGCCACTCC TGGCTAAATA GACCTGTAAC 2820
CGGAGGCGAA AAAACGGCTG CAGGCGGCAC ACCCTCCCCT GCACCCTCTG TTCACCCGGC 2880
TTAACGGGTA AGCTCAAAGC GCCGCTGCTC TTCTTCATCA AGTTTGGGCA AGTCTGCAAT 2940
GCTGCGCAAC CGGAACGCAG TCAAAAACTC CTCAGTCGTG CCATACTGCG CCGGCTTGCC 3000
GGGTATGTCC TTTTTCCCCA CCTCGCAAAT CAGACGGCGC TCACTCAAAA GGCGGATCAT 3060
TGTATCTGCA CCTAmCCCTC GGATTGCCTC TATTTCAGCA CGCGTCACCG GCTGCGCATA 3120
GGCCACAATA GACAGCGTTT CCATTGCCGC GCGCGAAAGG CGCCCTTCGC TCCGCTTCCC 3180
ATAGAGGGTT GCAAGACGCT CCCGTACGGT CGCCGCAGAG ACwACGCCAC ACCCTGCTCG 3240
TTGCAGTGAA GCTCCAGTCC ACCACCACCA CGCGCGCCAG AAGCGAGAGC TTCACCCAAA 3300
CGCGCAACAC ACTCACCCAC TGCCTGTTCG CTCAAACCGA GCTTTCGTGC AAGACACGCA 3360
TAACTGAGCC GCACGCCTTC GACAAACAAA ATAGCCTCCA GwAGCGCAAG GTCCGGTGCG 3420
GGTGCTCCGT GTAGCGTGCA AGGCTCTGCT TGGTCCATCC TGCCgATGtA CGCTCTTTCC 3480
tCCCTCCTaC AAGCACCCAA TGCAATCTAA CGACAGGGAA GACGGCACCC GCcTGTCTGA 3540
CTTCCGTTAC GGATTAAAAT GACCGATCTC CGGCGCACGC ACCCGCACCA ACGCAAGCAT 3600
CTGyTGGGGa TAcTGCGCTa CAAACTCTTC AAGAGTGCTC AAACGCGCCG CTTCAAAAAC 3660
CTCTACCTGA AATCTCCCTG ACCGATCGAA ATACGGCAAA AACACTGCCG CACGCTGGTA 3720
TTCACGCAAA CGCGGTtACG CCACGCTCCT CATTGAGCAC GCCAAGATAG AAgTGGCCTG 3780
GTTCCATAAC TGCAAGATAG TAAAAGAGCG CACGCACATA CCCGGTCTGG TATCCTGCAT 3840
GCGGCAAGTA CCCTAGGTAC CGCGACGGCC CAGCCGCAGT TTCACCCTGC GTCACACGGG 3900
GAGGAACAGC ACCCGCAGcG TCGACGGGGG CGCAATACCC GGATGCCGCG CCTCCTGCTT 3960
TATCCTCCCT TTGGGGACAC CCTCCACTGC AAAGGGTTCT ACCGTTACGT CCACACCTCC 4020
CTGTGCGGGG TACACCGTGC GCCGCACATT AAGACTCAAC GCTGCCGCTG CCAGGTTGCG 4080
TATCCAGTCA ATGCCAAAAA AGGGTkCGCG TGCGCGCTCA TGCGCCGCGT TAAAATGAGT 4140
ACGCGGcATA GTCGTCTGcT TTTTTAGCGC AGACAGGTAC ACTCCCTGaC CGGCAACCGG 4200
CGCAATGATC CCGTCTACCA CCCACTTCGC AAACCCAAGA GATCCCACTC CCCCGATAAT 4260
CTCCCGCGGC ACTTGGTCCA CACGCAGGGC ACGGATAATC TCTTGGTCAC TTTGACTGCT 4320
CCCGTCAGAA ATGTGGACCG GGCkTCCCCA TTCGTTAAAA CATCCATCCT CAAGTGGAGC 4380
GAGCCGGTAA CTCCTGCGCT TAATGATACT CGCCATTGAC TCTACTGCGC TGTAcGTGCG 4440
CGGCGGATCA AAAATACGCC ACGGCACCGC ATGcTCcGTA AGCGCCCGCA GCTGCAAAAG 4500
CGAGTaGGTA TACAACTGTT CAAAGGACAA CCCAAGCgCc sTGyTTGCGC acGTGCACGT 4560
TAAAAAGACA CACATCaAGA TGCGTCTTTC CcTGCATTTC TACTGCAGTG GGCGCAGAAA 4620
GTTGTACATA CAGCGACGCA CTTTCGCGCG GGTATACCCG TnATGaCACC GGCGCACCGG 4680
TGTGCATGTG CCGATACAGC ACCCATGTGC CTTGCGCCAC AGCTTGACTA GGCGCAgCGT 4740
cTGTCACAGG TGATTGCGGG GTATCTGCGC CTGTCCTTCC CGCCCGACTT GTTCCTGTGG 4800
GCGCCTCGAG CGCGAGGTGC AGCCGAGGCG CCGCTGCCAC TGGCGCAGAA TGCACACCAT 4860
CCCTTGGGGC ACGATCAGCT AAAAGAGGCG TAACGCGCAC TGCAAGTGGT CGGTTGCGGC 4920
AATCACGTAC GCCTCAACGC GAAAGcTATT GCCCGTTTCG TCCGTGTAAt ACGAGgTGCA 4980
CGCArCGCAA CACGAGACGG CTCATCCAAA AACCAATCtG CGCGATAACA CGnCnCACAT 5040
GGTGCGAGTC GGGGATGCGC TCAAAGGGCA ACAGCGGCAA ACGCTCCCCT GTCTCTGAAA 5100
CGCCGGAGTC AACCATGATT TCTATGGaGT ATTTCTTGCT CAGCTCCTCA TCCACCTCAC 5160
TCGCGCGCAg CAGCCCCTGC AGCATTATCA CCAAACTTAT GCGTAACACA CCgCGCGCAC 5220
GCgCAsCCAG ACAGCTCATA CGGCCCCATA TCGGCGCACG CGTGGGCGGG AAAAAGCAAG 5280
ATCTAACCGT TCCCATTGCC CACAAACCCc TTGGCAGGTG CATCCtGCTG TGCTAgGGTG 5340
CGCCCGCCtT GcACCCTGcA GCTGTCCTGT GTTAAGGAAG TCAAGATACC ATGAACACGC 5400
GCCTCGCTcT TGTCCTGtGT GCGGTGGGAT CTGGcGTGCT GTCTTTCTCC TGTGCACGCA 5460
CTGcCGAACC GACCCCCGCA GCTTCCACAC ACGTCCCTGT CACCACCGCC GGCGCACTCA 5520
GTGTCACACC GCCTTCGAGT ACTGACCGCT GGTACCAGTT CTCACGCACG GACGGACGAG 5580
TGCACCTGCG CGCGTGCCCC GCGCCGTCTC AGCCTTCTGC ACCTGAACAC TTTGTACCCT 5640
GGACTGAGGC TGTACgcCTG TCGGCAGTGG ATGCACAGCA AGAACTCTTG CTCATCAATC 5700
GCGCCGGAGT ACTCCCAGCC ACGCAtTAGC CCGCATGCAG ACCGCACCGG TTCCACGCAA 5760
AGCACCCTCC ACACCCGCTG CGGAGACGAC ATCGCTCACT CTGACGCCCC CCGCACTCTT 5820
AGCCACACAG AGCGCTGAGG GCTTTTACTC AGAGCCAATC CCCAACAGTT CCCCCCACCC 5880
TTGCCAGGGT ACCGGTGCAG TGTTTGTTCG TCTCTACACC GATCCCCTTT TTACCACTTC 5940
ACCACAAGAC TCTGCAGCTC CTTTTCTGGT GCGTTACGAT GTGCGCACCG CTCGCTGGAC 6000
TTCTGTCGcA TACACGCGmG CTCTGGGaTT GCCCCGGAAC GCCCAATGCA cCGCCCTcAC 6060
CCATACTCGc GGCACCTGGT ACGCTTCCTT TAAGTCCTCA GAAGCAGAAC GCGTTTCCTT 6120
TGCATACTTC TCCTTTCCCT CCCTTTCTTC CCTTGAGAAT TTGGGACCTA CCCAGCGCAG 6180
GGAGCATCCA ATAGGGGGAA AAGTGCAGGT GCCCCGTCCT ATCTCCGCGG CCGCCTTTCG 6240
TGCCGCGTGC ACTCCCCAGC GCCTGcACCT TCCGACGGcG TCCAcTTCCT CTGaTCATCA 6300
CAGCGACCTG cACGAATTGC TTGTGCATCG CTTGCTTGCG CGCGTACCTC TCTCTCCCCT 6360
GTACCTTTCT GCGCGAAcCC GTGTTGGGCA AGCGATCGCT CCTTTTTAAA GACTGCCCAC 6420
CGCACTGCTG ATGAGCGTGC ACACCACGCG AACGnGCTCA TTTTTCATCC ACCGCGCGCA 6480
CGGCTTTCTG CCGCACTCTT AACAGATTCA GGCCACCTCT ACTTTGTACG AGAAGACGGC 6540
TCTGAAGGCC ACGCACGACT GTCCGCCTTA CCCCCGCAAT TCGTGTACAC TTCCTTCACT 6600
CTCTCCGGCC CCTCCCTCAT TGCAGGGTGG GAAGAACAAG ACTTCTTTCA GGTAGGTAGT 6660
ACAGGTCTTT TATGCACCGA GGTAGAATCC CTTACAGGAA CATAAACGCT CCCGGGAGCC 6720
TGCTCTATGT CCTACAAATT CTGTAAGGAC CCCCACGTCA GGTGCCGACA CTACGTTGGC 6780
AGGGGTGGCG GATGAAGCTA AAGCGCTCAT TAATAGTCGG GGGAGGCCTG TTGCTTTGCT 6840
GTGCGCACGG ATATGCGCAG GCGAAGGGAG CACGGGCGTC TGTGCATATT GCGTACCATA 6900
ATCGCACGAT TTACTTCCCC GGCACCCACG AATCTGAACC CATTTGGGTG AAAGTTTCAC 6960
TTACAAATAC GGGAAAGGAC ACGTTGCGCT TCAAACTGGC GGACGACCGT ACCTTTAGTG 7020
TTGATTTTTC TATACGGACG ATGAAGAACC GCGCGCTTGC gCACACGGAC GAATGGATAC 7080
GCAAGCGGAG CACTCATCGT CCTGTGTATT TTAGGGAGAT CAGCCTTGAG CCGGGGGAAA 7140
GCTACTCTTT TGTGGAAAAT GTGAAGCATT ACCTTGATGT GCAGTCGGCA GGGTTGTACT 7200
TTCTAACCCT TCTCTTCTAC CCCGAACTGA AAAGGGAGCG CACCGGTGAC GAGGACCATC 7260
TGGCATCTAA TACGCTAACT CTTGAGGTAC AGCCTGCCCC TGCTGCGGCG GCGCTCGGCG 7320
CGTTGCCGGT TTCTCCCCCC GTGGGTGAAG TTCTGCAACC GCAACGTCTT TCCCCGGATA 7380
GGGTTATCGA GTACGTGCTG AATGCACGGC AAAAATCTCA CTGGGAACGC TTTTTTCTGT 7440
ATCTTGACTT GGCAAAAATG CTTTCTCGGG ATGCGGGGCG CAgTCGCCGC TTTAACGcAG 7500
AGTCTGAGGC AGGACGCTAC AACATGATTG ATACCTATAA GCACGAgTAC GCCAGGAGCG 7560
TGTGGATAAG GATATTGCTG CCATACCCGT TGAATTCCGT ATTGAAAAAA CCGTGTATAC 7620
TGCTACGGAC GCGGAGGTTC GCGTGCTTGA GTGGTTTGAG TACCGGGATT TCCGGGAAAA 7680
GAAGCGCTTT ACCTATCACC TGTCCTCCCG CGACGGCATC TGGTATGTAC ACGATTACGT 7740
AGTTGAGAAT TTGGGAACAG AATGATGAAG GCACTTTTAG TCGCAGATGA TCCCGTTTCG 7800
GTGAATCTGG TATTTGAAAA CCACACGCAG TGCGGTTATG AGGTGATCCA tACCGTTCTG 7860
CGCTGAAAGC CTTGGACAAT ATGGAAGAGA TTCAGCCACA GCTGCTCTTC ATCAACGCCA 7920
GCGACTTTCC GCGACACTGG AGGGTCCTCA CTCAGTTCTT TAAACATCAG TCGGTGTGCG 7980
GAgCGCGCGT AATCCTGCTA GTGAACACTC CGTCCTCCTC TCTCAGCGCG CGGCAGGTGG 8040
CGCAGGCAGG GGTACACGCG CTTATCGATT ACACTCTATC TCCGGAGGAG GGACGAAAGG 8100
CTTTATGCGG CGTGCTCACG CCCTCCGCGT GCGCAGGCTC TGTCGACGTG GGTCATGCGC 8160
ACACCTGCCA GGCAGATTTC GTGTTCACAA ACCCCTGTAG CGGCTCTATT GTCACCGGGA 8220
CCGTACGGGA AGTGAGCGAA GAGGGCGTAG ACTTCATCCC CGACTTTCCC GCGAGCGTCA 8280
ACAATCTGCA AGAGCAGGAT GTACTCGAGC ACTGTGCGCT AAAGGTAGCA CACGACATTC 8340
TCGGTGTTCG CTGCTCGTTC CATTCATCGG ACGGGCGCAT CCTGCTACGT TTTATAGATC 8400
CCGATGCGTC ACTGGTACAT GCAGTACGCA GCGTCACAGG TACCACATAG CAACGGTACC 8460
CACACACACC CCAAGCAAGC AAATGGCTGC GTAGACCCAG GTGGGCAAGG CCTCTTCGGC 8520
ACGGCGGGGG GCTCCtTCGG TGCCGGGGGG CTGCCCTTTC TCCGGTTCTC GTTCCCGAAC 8580
GCATACCCAC AGGAAGGnCA GCCCTTTTCG AAGTCTCTAG CGTTTCCTGC GTGTTTGCAC 8640
AA 8642
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6761 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TTCTCCATGT TATGAGATTG GACTCCTCGT TGGAAACGTT CCTTTCAGAA nAGATAGCGT 60
TAACCGTGGA TTCGTtATCA ATATTGACGC TCTCTCTCGA TGAACAAAGA CAAGTTCTGC 120
AAGCGGTCCG CACCGTTTTT GTTCCCACAC GACAGGAGGG GTATATTCCC GTGTTGCTAA 180
CGACGGATAC GATTCGTAGC GCAATGTGGA ATTTGTTTTT TTCAGATCGT ATTGAAATCG 240
CAGTTATGTC CTATAAAGAA GTTTCTACCG ATATGCGTAT TGAAACAGTG GGAGTAGTAA 300
GGATAGAAGA GAGTGATGTG GATGCTTTTG TGAGAAAGCA GTAGTCTTCG GGCACAAGAT 360
GGGTGGAGGG TTTGATAGGT GGAGTTATTA GTAGAAGTTG CCCCAACGAA GGAAAAAGCG 420
ATAGAGAAAA TTCGGAAAAA GTATGGAGAT cGAGTTAATA TCCTGCGCAC GCAGAGGAAT 480
AATAGGAGTT TCTTTTTTGG TCTCATAGAA CGAGTCTCGG TAGAGATTTT TTTTTCTGTC 540
AATAGTGGAT CGCAATCATC AGTACACGAG ATACCCTCAG TGCAATCgCG TACGCtGTGT 600
CCGCTGCTCG GGTAGAGGAT ACTGAAGCAG AAAAAATAAA GATACTTGAA TCTGCGCAcG 660
TATTAATGCG AaGATAGCAC AGCAGGTAGA GCCCTTAATT TCAGCGGCAA AAGAGAAGAA 720
AACTGAAAAA GTGCCAACTT CCCCTGAAGC GGTGCATGCG CTCACTCAAA CGCTAGAGGG 780
TATGATCCAG AAGATcACGA ATAGTGCGCC GGTGGTGATA GCACAGGAGT TGCAGTCGAT 840
TCAAAGAATC GAACTTCTTT TAGAGGAAAA TGATTTTAGT TTTTCATTTA TAAGAAAAAG 900
TATTGCTCGT CTAAAGGACG AACTCAGTTA TCATGATTTA GAGTCTTTCG AAAAAGTTGA 960
ATCAACAGTC CTGCGATGGA TTATAGAATC AGTCCACATT CAAGTTCCCC CTATTTGTAC 1020
CGGAACAAGA AACATTGTAT TAGTAGGACC GACTGGTGTG GGAAAAACCA CTACCCTCGC 1080
AAAGCTTGCC GCGTTCTATT TTGTTACAGA ACCGAAGCGA ACTGGTATTC AGCCACGAGT 1140
AAAAATCATT ACAACGGACA ATTTTCGTAT TGGTGCAGCG TTTCAAATGG AACGTTATTG 1200
CGAGCTTATG GGACTCGATC TGTGTGTAGT GCAAGCACCG GTTGAGTTTT TGACGTACAT 1260
GACACTGTAT CAGCAGGAGA CCGATGTGGT CTTTGTGGAC ACGGAAGGgA GGAGTCCGGT 1320
TGATGGACAG AATATAGAGC GGATGGTGGA ATACTTtCGT GCGGTAAAAA ATTTtGAACt 1380
GGAAGTGTAC CTTACCATtG ACGCtGGATC GAAGGCGAAC GACTTGCGCG AGGTGTTTAA 1440
GCAATATGCG CTTTTgAGTA TCGTGCGCTG ATAGTAACCA AACTtGATGA AACAACAAGT 1500
ATTGGaAACC TCATTAGTGC GTTGAGTGAG GCAAGGACTC CTATCACCTA TATTACGACA 1560
GGACAAACGG TTCCAAGCAA TTTAGAAAAG GCGTCAGTAA ATTTACTACT TTCTAAATTA 1620
AAAGGTTTTA AACTTCTTGC TGAGGAGATG GGCAACGACT ATGGTGATTA CGGTAGCAAA 1680
GAGAGATAAG CGCATAGCAG ACCAGGCAGA AGAGCTGAGG GATTTGATGC AGGAAAAAAA 1740
TGCGCGGGAG CtGTTGAACG TCATCAGCAT AGAACGCGTG TTGTCGTGGT AACCAGTGGA 1800
AAAGGCGGGG TGGGAAAGAC GAATATTGCA ACGAATATGG CAATTGCTTA CGGGTACATG 1860
GGGAAAAAGG TGGTACTCAT AGATGCAGAT CTTGGACTTG CAAATGTGAA CGTGATAATG 1920
AACGTTGTTC CCCAGTATAA TTTGTACCAT GTGATCAAAA AGCAGAAGAA AATGTCTGAT 1980
ATCATCATCG ATACTAATTT TGGTATCAAG CTCATCGCTG GTGCATCAGG GTTTTCCAAG 2040
ATTGCaAATT TAAACGAAGA AGAGCGTGCA GCTTTTATCC AAGAGTTATA TTCTTTATCG 2100
GAGACGGATA TCATTATTAT CGATACAAGC GCTGGTGTTT CGAAGAATGT CGTAAGCTTT 2160
GTTGCATCTG CCGATGATGT CATTGTTGTG ACCACTGCCG AACCTACGGC AATCACCGAT 2220
GCGTATGGAA TGATAAAGAT CATTGCAACT GAGGTTGATA ATCgGGATAT GAACTTGAAG 2280
ATGATAGTAA ATAGAGTGAA TTCTGCCgCA GAAGGAAGAA GGATCTCTGA ACGCATGATA 2340
CAAATTGCAG CTCAGTTTTT AAATCTGAAG TTAGATTATC TGGGCTTCAT TTATGACGAC 2400
ACcTCGGTAG GTGCGAGCGT TCTCAGACAG GTCCCTTTTT TAATCCACGA GCCTCGGGGG 2460
AAGGCCTCCG TGTGCTTGCG CCATATCGTG GCAAAGCTGG AAAAAACAGA GATCGCCGAG 2520
ACAGGCGGGC TTTCAGGTTT TATTCGCAGG ATATTTGGAA GGGAATGGGA ATAAGGCTCC 2580
CCCTTTCCCT ACCGACTAAG ATTGATGAGA AGTTGGACCT CCCCCAGTGG CTTGCCGGTC 2640
TTTTCCGCAA TGAACTCAGG GGCAAGTCCC TTCTCAGATA GGGCGATGAT GGCGTCTTTA 2700
AGCAAAGGGG ACTCAGCTCT CAGCCCGTTG TCCCGAATGA TTTTCTCGTT ATAAACCTCA 2760
ATGGTGCCAC GCTTGGTAGG GGTGGGTTTC GCTGCACTCG ATCCTGCGCG TGAAGGAGCG 2820
CTCCGGTGCT CAGGTCGCGC GCAAACCTCC TCCCGAGTCT CCTGCCCCGC ACCCGATGCC 2880
CTGAACAGGG TGCGAGCCGT GTGCTCCTTA AGTATTTCCT CGTTAAGAAG AGTGAGTTTT 2940
CTGTCGATCG TGCGGACGAC CTGGTTACAC TCTCCTATCT TCTTCTCGAG CATTTGCACG 3000
GCAATGTCCG CTTCATACCG TATATCTCGA ATCATCTTTA TCACTTCCTG TTGCATCGTT 3060
TTCGCATAGG CGTCAGGAGA AAAACTCGTA CGCACCTTCA CGTAGAAGTA CACAAGCAGA 3120
GCAACGGCCA CGAAAGAGAA CGTAATCGAC ATCACCAGCA TAGCGTATTC CTCATTTTCG 3180
CTGCGTATAA GGAAAACTTA GTACGCATAA CCTGAAGGGG CTCTTTTATA ATCTTCATCC 3240
TGTTCGCCCG TGTATTGGCC ATAGCGAATG AGGCCATTGC GTAAATAAAA ACTTATGGCC 3300
GTGCTGTCCG CGCGCACCTC TTCGCTCACG TCTCGAATGG TGACTTTTAC ACCAGAATAC 3360
ACCTGTCCTG AGGCAGAGAT TTTCCCCTCG ATTGGAGAAG CATCCAAGGA TGCCTGAATC 3420
TCTTCGAgCT CCGCGCGCGA CTGCTGCACT AGCTGCTCGA GTGAGATCTT TTCCTCATGC 3480
AGACTAGTCT CAAGCGCCTC CTTATCTGGG GGAAGTTCCT TACGCGCTCT CTTTAAATTC 3540
TCGAGGGATT GGAGGTTCAA AGACAGATCG GAGAGTTTTC GTTCATGTGC GTGCAACTCT 3600
TCCTGCAACA TGCTGAGGCG ACGTACACGG TGCGGATCAA AGCCGACGCT GATTTGCGTG 3660
TCGTTGCCGC CTGATTGGCT GCCTAGGTTG CGCGCGTAGA CAGCCTCTGC CGCTGCAACG 3720
TTACTTCCGA TGATGTCGGC ACGCCGCCCA CGACAAATGA TTTTCCGGTT AGCAATGACG 3780
TGCGAGTTCA TAATTCCGTC AGAAACAATG ACAAGATCTC CTGCTTCAAC TGAGGCGCAA 3840
TTCTGGATGA ATTTAGCCCA CAGAGATTTG CCTGCACGAA CGCATCCTTC CTCCTTTCCC 3900
ACAATAcCTT GTCCGACTAG AATGTCCCCT TCTGCATCAA GCAAGGCCTT TCCCACCGTT 3960
CCGCGCACTT CGATGTTGCC TGAGGCCTTA ATCTCGTAGT TATCCTCAAC GTTTCCGTGT 4020
ACCAACACGG TACCAAGGAA CATAATGTTC CCTGTTTTTA CAGAGACGTT TCCTTCTACC 4080
ACATAGATGG GTTCTACGTT GATGCCCCTT cGGGAAAGCA GGGCTTGTCC GTCAGTTTCT 4140
GCAATGACCG TAAGGCCGTC aCGCGCAAGc GCTGTGTTTC TTCCCAGAGG AATGGACACA 4200
TCCTTTCCCG ACTGTGcCGG AAGATACGTG CCCGTGACGG TTTTGCCAGG AGTACCCCGC 4260
TGTGcAGGCa GCTTCTGCGC AAGCGGCTGT CCTTTGACCA CGTTATGAAT GAGGTTTAAC 4320
TCCTTAAAGT TAATCTTCCC CGTCTTGAGC TCTTGCAAGT GCACACGGgT GCGGTCAGTT 4380
TCGAAGTGAT AAGAAATCCT CGCATTTTCA CCGTCCTTTG GAGGGGTGCC CCGTGCAACG 4440
AGGTAGGGTT CATGGTAAAC CGGACAGTCT TGGAACGAAT TGACGCGTTC CATGTCGATG 4500
CCGTACACAA CCCGATTGGA GCGCAAGAAA GACAAGATGG TGTCCGCGCA TATGTCAGCG 4560
CCGTTCCGTC CAGGGGGGGT GGCAGTTACA AAGGCCTTCA TGTCGTTTTC TCGGATCTCC 4620
ACAGAAAGCA TTGCATCATG TGCAGGGATA CGTTCGAATG AAGAAACGTG CACGTAGCTG 4680
TTCGTAGCGT TTTTTATCAG CACCTTGAGA GTGTCAGCGG GAGGCAGGGC AAGGCCGCGC 4740
GCGCGGAACT TTTCCTGGAC GCTGGCGAGT GAAACCTTGC GCCCTTTACC GAGGGGAGCG 4800
GTGATTTTTA AAAAAACGCC CTCTTTTTTG CAAAGTACAA AnGCTGCGCC GTCGTGTCCG 4860
GTGTTGGGGG AAGAAGACAC GTCCCGGTGT GAATCGTGCG GTGCAGACAC GTTCCCGGCA 4920
GCGAACTCCA GTGAAGAGCT CTCGTAGGCG CGGATTTTCC ACTTTTTTTG GCGGAAGGAA 4980
AAGAAACTGC CAGCGCCTCT TTCGAGCACC TCGTATTCAA CGCGGTATTT CGGTATTCCT 5040
AATTGAACAG CAGCAGCCTC AAGTGCCTTA TCAAGTGTTT TTGCGCACGC ACTGACGCAG 5100
ACACGCTTAG AATCCTCCTC GTAGCGTTTC TGCATATCGC GGCGAATTTG ATCAAAGCGA 5160
GTATTCATAG GGAGTATTAC CGGATACCCT TTTTGATGTT GGTGAGCTTT GCCTTTAACT 5220
TTAAATTCGC GCTGGTGTGG ATCTGAGAGA TACGCGACTC GGTCACTTTG AGCACCTTGC 5280
CAATCTCCTT TAAGGTCATT TCTTCGTAGT AGTATAGTAT GAGCACCTGC TGCTCGCGTT 5340
GAGAAAGTTC CCTAAtTgcC TCTGcgAtGa tAcGctTgat TTcCtCgcgt TcgaCaATGA 5400
CGTCGGGATT GAGAGAAGCG GGCGCTTCGA TGCTGTCTCC CACAGAGACG TGGTCTCGCT 5460
CATCTCCACC AAACTTCGAA TCGGCAAGGG AAATCACGCT CGTGCCGGAC ACCTTCAAGA 5520
GGAGCTGGTG GTACTCTTCA AGCTCAATAT TCAGCGCGCA CGCGATCTCA GTATCTGTGG 5580
nCAwsACcCC AAGGCGTGCC TCTAGATCTG CAATCGCTTC TTCTATCTGG CGTGTTTsTG 5640
ACGCACCGAC CGGGGAACCC AGTCGATGGA GCGCAGTTCA TCAAAGATAG CACCGCGGTA 5700
TGGCGTAACC GCGTACGTAT TAAATCGAAT GTTTTTTTCT GGGTCATATT TATCGATAGC 5760
GTCAAAAAGA CCAAAGATAC CGTAGCTTAC GAGGTCATCG AACTCAACGT TCCCCGGTTT 5820
CCCAACGGCA ATTTTGCTTG CAACGTATTT GACCAGAGGA GCGTACTGCA CAACAAAGTA 5880
CTCGCGTATT TTCGCGCTAC GnkTCCTCCG ATACTCGAGC CAAAGCTCCT CTTCCGACTG 5940
CTGTTCGAAG GCTGTGTTCC CCATTCCCGT GCCCTCTACT GTTATAGCTG ATTTCAGAAT 6000
GAAAATACAA GCCGGCTCGC TAGGTGTCTT GGGAGAGCAC GGTTCGGATG GCGCGTGCCA 6060
TGCGTTTAGT CTCGGGTAGG TCAGTGAGAA CGTCACCGAC GAGATCAGAG GTCTTGTCCT 6120
CGAAGGAAGG AGTAGACGGG GAGAAAGAGT ACCCCAGCTC GCCGAGCTCT CCTGTAGGGA 6180
TGAGCGAGTC AAAGCTGTCA TCAGTTTCCT GTACGACATC GCCACCCGGC GACGCAAACG 6240
AAGGCTCGAC CACGTCATCT AGCGTCAGAT CCACATGAGG GATGGGCATC TTCCCATCTT 6300
CTTGAACAAG GAGGTCGGGA ACCACATATG CAAGAAGGGC CCGCAGCgcG TACGCGGTAA 6360
CTCCTGCGCC TAAAGCAAGG ACAGTCGCAC GGGCGACCGA CACATACACG CGygCGcGct 6420
acCGCCGCTG TTGCAATGGA TAAAAAGAAC GCAGCGCCAG CTGCAATcGC AGGTACCTTC 6480
AAGGAGGCAC CAACTGCAGC ACAGGAAAAA CGCCGCTCCT GCACGTCCAC GGAGCTAGTC 6540
TGCAGCTGGT TTTTTGTTTC GTCAAGGACC TGTTTCCAAG CGTGTAGTGT GAGCTATGCG 6600
CGGCACACCC GCATACCACG CGGTGAGTGG TGTGCCGTGC AGCGCTTGCA CGTGTACACA 6660
GGTGGCAGTA CAATTGGCCC TCTCTTGGAG GGGGAGTATG GGTCGTTTGA AnCGGTGTGA 6720
GGTTCGTCGC CGCCCGTGCG CGCTTTGGGC GATCGTACGC n 6761
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19217 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
AGTGTCCGTT TTGTTTGGCG TGGTGTGCGA TCATAGTTGT AAAAAGCTCC ATGCATACTC 60
GAGCGTCATC CTCTGCTCTG TGTGCTGcGT GTACCGTAAG TCCAAACTGA AGTGCAAGAT 120
TCTGCAGACG GTACTGATGG CGTCCTAACC CGGGGAACAC CGCTTGGGCC ATCGCGTACG 180
TATCAACTAC TTTGTGAGAC AGGGGTTGCT TTTTGCACAG GCTGAGTTCT GCATTGAGAA 240
ACTCGACATC GAAGTTTGCG TTATGTGCGA CGAGTACTGT CCCTTTGATG AATCGAGAAA 300
AGTCTGAAAC TATCTCACAA AAGCGCGGCT TATTGACGAG CATATCGTCG GTAATATGGT 360
TGATTTTGCT CACGTCAGGG GGTATAGCCC GATCAGGGAA GATGAGCGTG CTAAAGCGCG 420
CAATAATACC CTTTCGATCA AACGTTACTG CACCAATTTC TATAATGCGA TCTTCTTCTG 480
CTTTTAAACC AGTTGTTTCG GTGTCGAAGG CGGTGAATGC AACGTGTTCA TGCACCGCAA 540
AAACCCAATC ATATATCATT GCAGATATGT ACCCATCTCT TGTTCAACTG CGGTGATAAA 600
CATGTTGCCC GCCTGTTCTG CTTCTTCTAT CGAGGTACTG ACGGTCAGAG GGTGGATGAT 660
ATAACACTTT ATTTTTTGTT CTGTTCCACT CGGTCGTATA CTTACAATAC TGCCATCCTG 720
TACAAAATAC TGCAGCACGT TACTCTGAGG AAAAGAAAGG GCAGATGTCT GCGCAGGATT 780
TTCTGGACTA AACTCAACAC CAAGATATAT ATCCCTCACC TTCATTACCC GCTTGCGCGC 840
AATTTGGGTT AGCGGCTGTC TCCTGAGCGT ATTCATTATT GCATTCATGG TGCTTACACC 900
CGCGACGCCC GCATAGGTTT TGTTCAGCGT CTTTTCACAA AACAGGCCAT GCGTCCTGAA 960
TAACTGGTGA AGGCGATCGA TCAGGCTCAT TCCGCGCAAC TTCCAGTACA CACCCATTTC 1020
TGCACAGAGC GCTGCGGCGT TGATACCGTC TTTGTCTCTC ACCTGAATAC CAAAATTGTG 1080
TCCGTAACTT TCTTCAAATC CATATACGTA GGAGTACGCT CCTGACTGTG AAATCTTTTC 1140
TGCAGTACCA CATATCCATT TGAATCCGGT AAGGcACTCT ACACACGTTG cGCCATATGT 1200
GCGTGCTATA CGGTCGCTAA GTGGGGACGT AACAACGGAG CGTACAATTG CAGGACGCGC 1260
GGGCATATTG TTTTGTTCCT GCAGGGTTAG CAGAATGTAG TCAGTGAAGA GCGCTCCCAT 1320
TTGATTGCCC GTGAGCAACT GCAACACACC GCGGGTGTTT CTTACTGCAC ATGCAAAGCG 1380
GTCTGCGTCA GGATCAGTCG CCATAAGAAC CTCAGCATGT ACGCGATCAG CATATGCACA 1440
CGCATGCACC AACGCGGCCG GATCTTCTGG ATTAGGAGAC GACACCGTAG GrAAGTTCCC 1500
ATCTGGCAAC CGTTGCTCAG GCACGGTCAT AATGGAGAAC CCCATATCCC CCAGTATGCG 1560
CTCGACGTGG AGTGCACCCG TTCCGTGTAA TGGGGTGTAT GCAATACGCA TCGACTGGAC 1620
GGTCTCTTTC GTAAGACCGG GGCGAAAAAG CTTTTCCTTT ATAGAGGTGC AGTACGGTTC 1680
ATCAATTTCT GCATCAATGA TCGTGGGTgC ACTGCGTTTG ACAGGTACCT TTTCCTCAAG 1740
GTTCACGACA CTCGTGATAG CGTTCATTTC TTCGGTGATA TTTTTTTCGT GAGGATGCGC 1800
TATCTGTGCC CCGTCGTTCC AGTACACTTT GTATCCGTTA TACTGCGGTG GGTTGTGCGA 1860
TGCGGTGACC ACGATGCCCA CGTCACAGGT AAGATACTGT ACtGCGTAGG AAAGTTCTGG 1920
AGTCGGGCGT GGATCCGAAA AGAGGTAGGC GGTAATGTCA TGTGCAAGAA ACACGTGCGC 1980
AgcAGTGTGT GCGAACAGAC GAGAATGTAC ACGCGAGTCG TAGGCTATAA CGGCACGGaG 2040
CGCGCCGCGC GCTGcCTTTT CAGGAAAAGT TTTTAGTAAA TAGAGCGCAA TCGCGTGCGT 2100
GATCTTTTTG ATCATGAAGG GGTTCATTCT GTTTGTTCCT CCGCCGACAA CACCCCGCAG 2160
CCCGGCGGTG CCAAACGAAA GAGTTTGCAA AAAgcGCtCT TCGAGCTCTG CTATATTATT 2220
CTGTGCAACA AGATCCCGTA CCTGCTGTGC AAAGAAAGGA TCTGTTTCTT CTTCAAGATA 2280
AAGACGAGCA CGTTCGAACA ATTGACTGGA GTGCATGAGC GCTTCCTCAC CTTTAAAAGT 2340
ACTGGACTAT TTACGGCACC ACAGGATAGA GGGGCATTGT AATGGGAAGG TGCTGCTCTG 2400
TGCAATGCTC ACAAAAAGTG CATGTCTTGA AAAAGTGTAC CAGAGCCACT ACACTGGTGC 2460
GCGTGGGTTC TGCTGTTTCT CCGAAAGTTT TAAAAGGCTT TCGCGATCTT TTACCGGATG 2520
AAGAGATTGA GCGTGCATTG CTCGTAGAAA AACTGACGGT GGCTTTAAGA CAAATGGGTT 2580
TTGTACCTAT CGATACCCCC GCGTTGGAGT ACACCGAGGT TTTgcTGCGC AAAAGTGAGG 2640
GTGACACAGA GAAGCAGATG TTTCGCTTTG TTGATAAGGG TGGAAGAGAT GTGGCCCTCC 2700
GCTTTGATCT TACGGtGCcG CTTGCGCGGT TCGTTGCAAC GCACTATGCG CGTTTGTATT 2760
TTCCTTTTAA GCGCTATCAT TTTGCAAAAG TGTGGAGGGG CGAGAAGCCT CAGATGGGTC 2820
GTTATAGAGA ATTCACGCAG gTGATTTTGA TATCGTCGGT TCGGATTCGG TGTGTGCTGA 2880
CTTTGAAATT CTAAAGTCGA TACGGCACAT GTTGTATATG GCTGGTGCAG AACACATACG 2940
TATTCACGTT GCGCATCGTG GCCTGTTTGA TCGTTTTTTG CGTGCTCTTT CTTTGTCTGA 3000
CCAGGCTGAG CATATCCTGC GGATAATTGA CAAACGTGCA AAGATGGCGC CGCATGTGTT 3060
GACAGCTCAA CTTGAGTCGC TTTGCGATCC AGTTCGTGTG CAAAAGATTA TGACGTATGT 3120
AAGTGCGGGG GAGGTGGACG GTGTTGCGCC GTCGTTTGAA CATACATTGT CTGCCATTGA 3180
GACATTGACA GGGGGTGTCT CGGAAGAGAG TACACGGcTT AGAAAAATAT ATGAGCTACT 3240
CTGTGCAGTG AACATTCAGT CCTCTTATGT GTTCGATCCA TCTATCACGC GTGGTTTTGA 3300
TTACTACACC GGTATGGTGT GTGAAACGTT TTTAACACAG TTGCCTCATA TCGGTTCGGT 3360
GTGCTCAGGT GGGCGCTATG ACCATCTGAC GGCTTTGTAC ATGAAGGATG CAGTGAGTGG 3420
GGTGGGTGCA TCCATTGGGT TGGATCGCTT GTATGCAGCG TTTCAGCAGT TGGGAATGTC 3480
CCGAGAGCAC GTTTGTTTTG TGCAGGCGCT TATCTTCTGT CAGGATAGTG CGCTCATGGA 3540
TGTGTACCAA AAGCTGTGTT CATACTTTGC AGTGCAGGTG GCGACGGAAG TCTTCCCTGA 3600
TCCGCGGAAG TTGAGCCAAC AGTACGCCTT TGCAGAGAAG AAGGGGATTA GGTGGGGGAT 3660
CTTTGTTGAA CAGCGCAACG CCGTGGTGGA GGACTGCCTG CTCGTACTGC GCGACCTTTC 3720
TACGCGAAAG GACACACGCC TACCTGcGCA CGAAcGgACC GnCATGGgCA GCTGAAGGGT 3780
AACAGGCGCC CCCGCGACTC TAGAGTCGCA TGTTACTCAA TTCAGTGACT AGGTCCGTTA 3840
TGGAATCCTT GTTCTTCTGT CCGATTTCAG TGATTGAGGA AGCATACTTT TTAACCTGTT 3900
CTGAGTTTTT GTGCATGGAC GACACGCTGC TGTCGATTTC AGACGTGATG CGCGAAAGGG 3960
CTAGCATCGC CTCCTCTACG TGCTTGCTGT TGTCCAAGAT GGCGCCCGAA TTTTCCTGCA 4020
ACGTGCGCGT GATCTCGGTG ATGTGCTGTA TAGCGTGCAA CACGTGCACA CTGTCTTTCG 4080
TCTGTTCTAC CATCGACTGG CTGATCACTT CTTCTTGGGC CTTTATGTCT GTGGTGATAG 4140
AGAAAATCAG CGCAAACTGA GCCTGAACTG CAAGCGCGCT CTCAGAAACC TTTTCAATTT 4200
CCGTTTTTAC GTCCCGCAGC ACTGCGGAGA TATGCTTTCC CTGTTTCGAA GCGTCCTCTG 4260
CGAGCCTACG TATCTCACTC GCCACTACTT CAAAACCCTG ACCTGACTCT TTACAGTACG 4320
TTGCTTCGAT TGAAGCGTTC ATTGCAAGCA AGTTAGTTTG ACTTGCGATG TGCTGAATTA 4380
CCGCACCGGC TTCAGCCAAC GCTTCTGAAG CGTGAAGCAC TTCTCTTCCC ACATCTGCAG 4440
ACTGAATAGT TGCGTCCTGC GCTAATTTTG CCTCAAGCAG CAGGGTTTCA ATGACATTAC 4500
TATTGTCCGA AAGTACCTGA GTAACTGACT CGATGTTTCT CACCATCTGC TCTACGGAGG 4560
AGGCGGCATC TGTTACCGTA TCTACCTGCT CGCCGATGTG ACCGTTAAAC CGATCGATGT 4620
TTCCGACAAT GTGCTGCACG TGCTTTTGTG TTTCAAAAAT GGTACGGGAT TGGTGTTTAA 4680
TTTTGTCTTC TGCCTTGTGT GCTTGGGAAA TGATCTCGTT GATAGCGGCG CAGGAAGTGC 4740
GCGCATTTCT GACAAAACTA TTCGCGATTT TCCGAGCGCG TGCCATCTTT CCACGGGATG 4800
CAATGAGAAA GAAACGCGTG CTCTTGTGCA TCTTGTTCAA ACTTTGGGTG AGGAACCCAA 4860
ACTCATTTCT ACGCACAACG CGCATACTCG CTGCGGACGC ATCTCCGCTC AACCTGTTTT 4920
GGATAAAGGT GAGGATGCCT TGGTTTGCTA AATCGAGGTG TCGGCCAATC CTGATCATGA 4980
CGATTACGCT GAGTACGCAG ATTAAAGACG CGCACGGGAG GTTATGGGTT ATAACCACAT 5040
GTGCGAGCTC CTGCTCACCG TGCAGAGGGA GTAGGAGCGA TGAAAAGCCC ACGCAGAACA 5100
TACTAAACAA ACCTGAAACG AGCACGACGG ATATTTCTCT TGAGAGGGAG TAGTACCCGA 5160
GGCGTGTCTC GCGCAGGGGG ATAAAGTGGG CCCACTCCTG CAGTGTGTAA AAGAAAGGAT 5220
GGTGAAAGAA GGGAATGAGG CAGAGCGCTG CGCCCACGCT GCTGAGGTAG TAGGACAGCA 5280
AGAGCACATG GCTTGAGGAA AAACCGACTT CTAGGGCAGC ACACACCGGA TACACCACCG 5340
CAGCTACTGC AGGAGGCAGG GGAGAAATCC ACGTGTACCG CAGAAACGCA GCCTCTGCAG 5400
CTGCGCTATC GTTTTGATGT TTCTTAACGG ACATGAGGAG AACGTAgTAG AGCGTCAgcG 5460
AAGCACCCAG CATGAGCGCC AGcgCACAGA AGAAGGAGAC GCTGGTGAGC AGAGCGgAGA 5520
GCATACTCCC GTCTATGACG CCTGCGATAT AGGCTACGAG AGAAGTCGCC GGCACCCACG 5580
CAACGCTCAT CACGACGGCG CGCCGCAGCA CGGCGAAGTC GGAGACCGCA TGTTCAGTGT 5640
TCATAGGTTA CTCCATGGAA AGGTGGACTA CAGGCGCAgc TGGGTGAGCT GGTTCGTAAG 5700
CATGTCCATT GCATTCTTAT TCTCTGAAGC GATAATCTTG ACGGAGGTGA CCGCATCCAA 5760
GATACGAGAC GTGCTGGCGG CCATGTCGTG CATAGCCACG GTGATACCGC CTGTGTTGTG 5820
TGCAAGTTCG TTCATATCCT TGAGGACCAG CTCTCCATCC TTGAGCATGA GCGAGATACC 5880
GCCGCGGATA CGTCTGCGGT TGCCTGTAAT AGCCTCCATG GACCGCCTGA TGTACTCACC 5940
CCCTGAAGAC TGCTCCTCCA TGGCATGCAT GACGgTaGAT TCCCGCTGTT TAACTTCTTC 6000
TGCAAGGACA AAGAGATTGG AAAACTGCTT GCTCAGGCTC TCCGCCtGcT GCGCGATATG 6060
CTCTATCTTT TCAGTGAGTT GTACGAGGAC TGAAGAAATG ATCTTTCCCT GTTCGTTTGA 6120
TTcCTCTGCT AATTTCCTAA TTTCGTCGGs GaCAACTGCA AACCCGCGTC CGtCCTGACC 6180
CGCGTGAGCA GCCTCTATCG CGGCATTCAT AGCAAGCAAC GTAGTCTGAC TTGCGATCTT 6240
CCGTATTACC AAGCTGGCCT GTAAGAGTGC CTTGGAAGAC TTGGAAATAT TCTGGGTAAG 6300
CTCCAGCGAT CGGTTGGTGA GGTTCCCTGC GCGGTTTGTC TCCTCTCGCA ACCGTCCGAC 6360
ACGCGCGTCG TTGCTTTCGA TAATCTTTTC GATAGAATGG ATTTTCTCTA CCATTTGCTC 6420
GACAAGGGTG ATGGACCGGG TGACAGTTTC CGACTCTGCC TCGACGCTTG CATTCAGACG 6480
CCCGACTCCC TCAACCATAG TCCCCACTAT TTCTTTAGAG GAGCTGATGG TGGCAGTCTG 6540
CGCCGTCACA TTTTCGTTGA CGTGATCAAT CTCCTCTTTG GTGGTGCGAG CTGCCTCCTC 6600
GAGGTGAGCG ATATCCGTCT CTGCAGTGTG TGCAATTGCG CTTGAACTCT GTGTCAGCTG 6660
CAAGAAGCGG GAGAAAAACG CGCGATCTTT GTCGCGCAGG GAGCTCAGTG ACCTCGCCAA 6720
AAGACCGATC GAGGTGCGGG TTTCTCTAGG TACGTCCGCA CAGGTGTAGT CGCCAGTTCC 6780
GATGCACTCG GCGAGCACGC GCAAACGGGT GATTTCTCGC GAAGTAAGGT ACGCGCACAG 6840
GAGAAACTCC ATGCAGCTTA CGGCGCTGGC ACAGAGTCCA AACGTACCCG CCATGGTGAG 6900
CAGTTCTGCC GGAGTAATGT TCTGGCTATC TGTGCGTACA AGGAGTGGTG CAGAGACGAG 6960
GAGTAACGTG GCAACGAACA GCGCAATCGC TATGGTAAGA AGCATGCGGG TCCAGTTTGC 7020
GCGCACAGAA CTTTCGCGCA AAGGCAGGAA GGCAGCCCAC CGCTCAAGCG GACGCGTAAG 7080
GAGGAAATGC AGCGCAGGAG CAAACAGCAG ACAGCCACCG GCAGAAAGCA TAACATACAC 7140
TGCCATCTGT GGACCTGTCT CAAACGGGGA ACGCGCGAGG ATAGAGACGA ACGGGACGAA 7200
GCAGCTGAGG AGAACCGACG CACGGAGCGC CGTCCTCCCA TAGCTGTTGA CGCGCTTGCT 7260
GAAGCGcGCC TCATCTTCTC CACGAATAAA AGCGGCGCAC AGCgCGCGAt AATGGAGAAG 7320
CGGAAAGAGG CACAAGAGGT ACAGCACGCT AGCGGGAGAA AGGAGGAGTT CAACGAACTG 7380
GCCATCGCCG AAGrgAsCCG sCGCAGAAGC GGAGAGAAAC AAGAGTGGCA CCCAGCAACA 7440
GGAGAGCACG AGCTCGCGTA CAATtACACT GATCGGAGGA GTACTATAAG TACCGGCGTA 7500
CATGGTTGCC CCCTCAGGCA TAAGTTCACA CGAGCGCGCA CGGTAGCATC GGAGAGAGGG 7560
TCTGTCAAGC GCGCACAGGC CCTCGCCCGA GGGCGGCAGC ATTGCCGTcG GACAGAATCG 7620
AACTGTCGAC ACAAGGATTT TCAGTCCTTT GCTCTACCGA CTGAGCTACA ACGGCGCACA 7680
CCGCGCGCAC CTGnTCATAC AGAAAAAAAA AGGTCAAGTT CCTCACGATG CTCGAGGAAA 7740
AGCAGCACAG GAGAAGGAGA ATGCAGGTCT TGACAGGTGC GCATACTTTG CACTAGGCTC 7800
CCGCCGGAGC GGTGGGTGTA GTTCAGTGGT AGAGCGCCAG ATTGTGGATC TGGTTGTCGT 7860
GGGTTCGAAT CCCATCACTC ACCCTGTGCG TGCGCAGGCG CTCGTAGCTC AGGTGGATAG 7920
AGCAACAGAC TTCGAATCTG TAGGTCGCAC GTTcAAGTCG TGTCGGGCGC ACTTGTGGTG 7980
TCGCTTTCCT CGTATAAATG GTTCTGTTCT CGTCTTTCCT TTAACACGCT GAGGACGTGG 8040
GGGTGCTTTG AGTGGGTACG TTGAGGCGTT TTCCTGCGTC TGTTCTCCAG ATAGCGCTCG 8100
CGCTCTTCTT GCTTGCAAGT GGTGCACGAG ACCTCGTGCA TGTTGATGCG GGTGTGTTCA 8160
ACGCGGCGGT GTATTTCCTC GGCGGATTGT TTCGCGGTCA TGTGGCAATC GGTGTGTTAA 8220
CGCTCGCCGT GTCGCTGTGT TGTCTGACGG CGGGTTTCTT TTTGCTCGTC GATTTTCTGC 8280
GCCCAGAACT TTCCTGCGTT TCTGCGGTTT TAGCGCTGTT CGTTGTGCTG TGGGCTCTGA 8340
ATATGGTTCT GGTGGATGTG GTGGGTGCGT TCGGCCGCGG CAAGGTATTG CAGAATGTGT 8400
CCTCGGCGCT TGAGCATTTG CATCATACTG CTGTCGATCT CCTTGTGCTT GGAGCGCTCA 8460
TCTTTGTGAG GCAGCACACG CGTTAGGCGT TTACACAGCC GAGGATGACA CACGCCGCGT 8520
CTTTTTGCTT CTTCTTGTGT GCTCTTTAAG GAGGGGTCAG GCGGTGTCTC GTGCCCCGCG 8580
TGTGTTTTTT AAGCAAGAAA AAGGAGTGGG GTGAATGGCT GTGGCGGGTT CCTTTGAGCG 8640
GAGTAGATTG CCGTTACGAG CACCATGAAC GCGCCGTCGG AGTGGCGAAG ACAAGACTGC 8700
CCGGTCCCAA TGCGGTTGGA AACGCAGGCA CTTGTACCGT ACCCTGTTCG CTTTGACCGC 8760
AGCCACCATG ATGCGCTGGT GGTCCTGGGC GCTACCGCAA CAGGTAAGAC AGCGTTAgcA 8820
GTTGCGCTTG CCCAAAAATA TCAGGGGGAA ATTATTTCCG CCGATTCGCG GCAGGTGTAC 8880
CGTGGTCTGG ATGTGGGAAC GGGAAAGGAC TTAGCTCTGT ACGGGTCGGT CCCCTATCAC 8940
CTGATAGACG TGTGTGATCC GTATGAGGAA TACAATGTTT TCCGTTTCCA ACAGGCAGTA 9000
TATGGCATAG TGCCGAGTAT ACTCCGGGCG CACAAGGTGC CAATTATTGT CGGTGGTACG 9060
GGTTTGTATC TTGATGCAGT GCTGCGTCAG TACGCGTTGG TACCTGTTGA AAGAAATcAG 9120
GyGCtGCGCC ATTCgCTCCG CGgAGCTTCT CTGTCGCATA TGCGCGCGGT GTACTTTTCG 9180
TTAAAAGACT CCCATGCTGT TCACAACAAG ACAGATTTAG AAGATCCTGC GCGTTTGATG 9240
CGCGCTATTG AGATTGCTGT ATTCCATGCA ACGCACCCTG AGCTGCTCCA GCAGGCACGG 9300
GAAACGCGCC CGATGATGCG CGCGAAAGTG TATGGCATAC AGTATCCACG CTCTATGTTG 9360
CGTGCTCGGA TTCGAGCACG CCTCGAGCAG AGAATACGTG GGGGACTGAT AGAGGAAGTG 9420
GCAGCGCTCC ACAAAGGCGG GGTTTCCTGG CAGCGTCTGG AATACTTTGG GTTGGAATAT 9480
CGCTTCACTG CGCAGTATCT ACAAGGGATC ATTGCTACCC GTGATGAATA TGTCGACCTA 9540
CTTTTTAGAG CTATTAGCAG ATTTGCAAAA CGCCAGGAGA CGTGGTTCCG ACGTATGCAA 9600
AGACTCGGGG TAAAAATTCA CTGGCTCGTG CATAsGGAAA ACGGTTTTGT TCTCCGGTGA 9660
AAAAACGATG ATCGCTCATC GCACCGCTCC ATAGGGTATG TGTGTGGCGA GTCGCTCGTG 9720
GTACGCACAG GCGTTAGTTC TTTCTCTTTC AGCATTTGGA GAAGGGCACA AAAGCGTAAC 9780
GCTTTTTGTT CGCCTGCAGG ATGAAGCGTT TATTTTACGT GCCGCGCTTT TTGGAGGTGC 9840
CCAAAGCAAA cTGCGTGGAT TGGTTATTCC CTATACGACA GGTCGGGTAT GGGTATATTC 9900
GAATCCGCGT ACGAGTATGC ACAAAATTGT TGACTTCTCA GTTAcACACT CTCGTGTGGC 9960
CCTCCGTGAC AGTATCGTAA GAGTGTGGTG CGCTGCGATT TGCGTAGACA TTATAGAGGC 10020
GAGCAAAGGG ACCATCAGCT GGACGCTGGT GACTGCATTT TTGGATGGAA TCAGTCTCTC 10080
TTCGGACGGG GCATGTAAAC ACGCGCTTTT ACGCTTTTTG TGGCGAGTGC TTATAGGAGA 10140
AGGCGTTGCA CCAAATATCA CATCGTGTAG CCGGnTGTtA CGCAGTACGC CGTGTCCGTA 10200
AGTGGTGTGT cGCGCGTGGC GTATyTTAcG CAGGGTGAGT CTTTTGTGTG TGCTGCGTGT 10260
GCTGCTCCTG CAGAACACCG GTTTGAGCTA AACGCGGAGG CGTGGCACTT TCTCAACACG 10320
GTCAAAGAAT GCACTCCAAG GCACGCGCGC GCGTTGGTGT TGTCCCAAGA GAGCTACTGT 10380
GAACTTAAGC AGCTGCTATT TTGCCtGATC ACGAAAATGA GCGGTAAAAA GTTAAAAACA 10440
CTCGAACATG CgCACGCAGT GCTATGAGAC TGTCCGAAAC ATACCGCAGG GTGCAGTCCA 10500
GTTAGGTCCC CGGGGTGCAC AGAGCGAGAA AGACGTGCTC ACTGATAATA CGCACGCCAC 10560
ACCGAACAGC GTTCTGGCAT TTGCGTGACT GCCCATGTGG ATCGTTTGAC ACCAGATACG 10620
TCAGCTGTGT ACTGACAGCC GTTTTCACTG TTCCGCCTAG GCGTTGTACG AGCGCGATTG 10680
CCTGGGCGCG TGTCATACCG TCGAGATCGC CAGAGAAGCA GAAGCTCATG CCGTGCAATG 10740
GCGCGTGCAC GTGTGCATCC GTATGCAAAG GCGATTGGGC AGTGGCAATA AGCTTGTGGA 10800
GCGTCTCTTC CGAAATAATT TGGATTCCCA ACGCGCATGs GtGCGATAAC GTGTCGACTG 10860
ACTGTAAAAC ACCAGATAGT CAAGCTGGGC TGTTACGGAC GATGCGACAT CCCCTCCTAA 10920
CGCGCGCACT TTTTCTTGAG CATGTTTGTG ATTCATACTG CGCGACGCAC CTGAGAAATA 10980
AAAACTCTTT CCTTGCAAGG GGTGTACGGT TTCCCCCGTA TGGGTACAGG ATGCGGAAGC 11040
TTGATCTAGA AGCCTGCAAA ACTCGTCTTC CGAAATAATT TCTAGCGCTA CCCCCTGTTC 11100
TTTTTTTAGT TTTTGCGCGG TGCGATAGGG CTGTGATAGG CTTTCGAATA TTAGGTACGA 11160
AAGGTCTCGA GTAACTGACG TTCTCACAAC GCCTCCGAGC GCGCGTATAC GGTGTATAGT 11220
CGCGCGATCT CCATTTCTGA GGGAGCCTGA GAAGCAAAAG CTTTTTCCCC GGAGTGGCGA 11280
CTCTGCCTGC TGCTTGGCAA GAATGCGCAC GAAGCCTCTA TCAAGCAGTT CACACATATC 11340
ATCCTTAACC CGTGCAATAC CGGTTACCAC GCTTTTTGCA AGTTCTGTTC CAAACTGATA 11400
GATGGACTCG AGTGTCTCCG TTGTTgCGTG GAGCACTTTT TCAAGTGTGT CAAAACCGGC 11460
ACAGATAAGT TTTTCTCCCA TGGTTTCTCC GATCCCCTCG ATGCCAAACC CGGCAATAAA 11520
CGTTTGCAAC GCTATTTCTT TTTTGTGGTG TATAGCTTCG AGTATTTTTT TTGCAGTTGC 11580
ATTGCCGACG TGCTCAATCT CAATGAGATC CTCGCAGGTG AGCGTGTAGA GGTCCGGGAT 11640
GCGGCGTACC TTTTTTTCTT CAAAAAGACG CTGAATGAGT TCGGTCCCCA CATGCTTGAT 11700
CTCCAAACAC TCAATCCACC GTGTAATGCG GTGATGCGAG AGCAGGGGGC AGTTTACATT 11760
AGGACAAAAA AGCCTGCTCC CACTGTTTTC CAGCACTGTG TTGCAACTGG TACACTGTGT 11820
CGGTATGTGG ATTTCCTGCG CATGTGCTGG GGTTGAAACG AGTGCTTCAA TTTTTGGGAT 11880
AATCTCCCCC CGTTTGGAAA TTAAAACGTG GCTGCCAATT TTGAGACACA GCTTTGTGAG 11940
CATATTCGGG TTACATAAAT TTGCGCGCTT GACCGTAGTT CCTGCAAGGC GCACTGGATC 12000
GGTAATACCA ATGGGCGTGT ACGTCACTCC TGATGTTTGC CATTGAACGT CACGCAAGGT 12060
GGTTATCGCC TCCTGTGTAC TGAACTTAAA GGCTATCTGC TTTTTAGGTC TGGGAAGTTG 12120
TGCGTCCTGG AAGTCAAGAT CGGTACTCTT TACTACCAGG CCATCGATGC TGTAAGGCAA 12180
CAGCTCGCGC GTGCGCATAA TCTCAGATCG GAGTGCAACA ACTTCCTGTG CGTTAGCGCA 12240
GCGATGCGAA TGTACCGTCA CGAAACCTTG GCGCGCGAgC CAGGCAAGCT TTTCTGTTTC 12300
ATCAGCAAAG GGGAGGGAGC CGGTGAACGG TTTACCGGGG GTGCCGGGTA cTGCGTCGTA 12360
ACAAACAATA TGGAGGTGGG TGCGgCCGCG GCCGTCCTTT CGCTTTAGGA TGCCGTTTAC 12420
GGTGTTGCGG CAATTTGCGT GAGTAGGATA GTGGGAACGA TGTATATCCT TGTGCATAAT 12480
GACTTCGCCA CGAACACCCC CCGTGAAGGG GAGATTGCCG CAAGGTCCCC ACTCTGCAGT 12540
GAGGGTCGGC ACAAAGCCAC GCATGCCGCG TACGTTAGCA GTGACGTCGT CTCCGACAAT 12600
GCCGTTACCA CGGGTGAGCG CACAGCAAAA ATGACCGCGC TCGTATTGCA ACTCTAAgcT 12660
AACGCCATCG AGTTTGTGTT GGACGAGAAA TGCCTGCAAT GCATTTTTTT TTGCCCATGC 12720
GCTGAAGGAC TCCTCGTCTG CAGCTTTGTG TTGACTACCC ATAGGAACAA TGTGGCGCTT 12780
TTTCACTGCG TCACGTTGAC TGTCAGAACC GATTGCTTTA AGCAGCGGAT TTCCAGGATC 12840
AAGCCTTGCA AGTTCTTCCC AAAGCGCGTC AAAGGCGTCA TCTGAAATAT CAGACTCCGC 12900
GTTGTAGTAG CGGTCTTGAT GGTGAAGAAT GAGCTTTTCA AGTTCTTGAA CACGTCTCTG 12960
CGCAGTACTC ATAGCACAAG GsCGCAATGT GTAGCGTCCG GGGCGCACCT CCCACGCGTG 13020
CAACGCTCCT TCGTCTTGTC GGAAGATGCC AAAACGGCGG CGCAGTCCCC TGGGGGCGTG 13080
AGTGCAGCGG TCACTCTAAT GGCTCTGACG CCAGGTGGAG CGACACGTGG CAGTAACCAC 13140
CAGGATTCTT GGTCAAATAG TCTTGGTGAT ATTCTTCTGC AGGGTAGAAG TCACGCGCCT 13200
CTTCCAAAGT GGTAACCAAA GGACGGGTGA ACTTTCCCGC GCCTGCATAG CGACCCATGA 13260
GCGTCTCCGC TTGCTGCTTC TGCGTACCAC TGAGGTAGAA AATTGCAGAA CGGTATTGCG 13320
TGCCAACGTC TCCTCCCTGT TTGTTAAGAC TAGTTGGGTC ATGCATGCGG AAAAAGTGCT 13380
TGAGCAGATC CTCATAGGAA ATAACCTGGG GATCAAAGAG GATCTCTACC GCTTCTGCGT 13440
GTCCGGTAGT ACCGGTGCAG ACGTTTCGAT AGGTAGGACT TTTGGTGGTA CCGCCCGTGT 13500
ACCCGACTCG AACGCGCAAC ACTCCCTTGA CGCGCCGAAA GTAGGCTTCC GTACTCCAGA 13560
AGcActGCGG CGAAAAGAGC AGTCCCCTCT TCCCGAGCAA CAAAGCGCAA CGCTGCAGAA 13620
TTGATGCAGT AACGGAGCCC CCCGCGTTCG GGCGGACCGT CTTTGAATAC GTGCCCGAGA 13680
TGAGAATTAG CGTTCCGACT CCGCACCTCA GTACGGAGCA TGTTGTGCGT TCGATCCTCC 13740
TTTTCAATCA CCGCGCGCGC GTCTGCAGGT GCAGAAAAGC TAGGCCAGCC ACAGCCTGAG 13800
TCGAATTTAT CTTCGGAGAG GAAGAGGAGT TCTCCTGAAA CAACATCAAG ATAGAAACCG 13860
GGTTGGTAAG AGTTCCAGAA CTCATTCTTA AAGGGAGTCT CAGTGGCAGA CTGCTGGGTA 13920
ACACGGTACT GAATATCACT CAGGTGCTGA AGGTTTGCAA GCATGGGGCC ATTCTAGGCG 13980
ATGAGAGCCA GACCTGCAAG GcACCTCCTC GGCGGCAGGA CTCGGCGCAC TGGTAAGAGG 14040
GAAACCGGcG GAAGCGTGAT TGCGTTGCAG GCAGCCGTGT TGATGCGCGG CTACAAGCGT 14100
TCCCTACGGC AGATGAACGA GAGGACGTGC AGAAAGACCC ATCGCCTCTT TTACTCTCTG 14160
CATGGTTTTG cGTGCTTCAG CACAnTGcGC GCGCAgcCTT cCTCTAATAC GCGTGCAAGG 14220
TACGCAGGCT GCTGTTCTAA CTGTGCACGG CGTGCGCGGA TGGGTTCTAA AAACGTATTC 14280
AAGGCGCGCG CAAGAGCGTC TTTCACTTCT GTGTCTCCCA CGCGCCCAGc GrATTACCGC 14340
TCTTTAAAAT GTGCAACCTC GTCCGTGTTT GGGTTGAACG CATCGTGGTA CGCAAACACC 14400
GGATTCCCTT CCACGCGTCC TGGTATATCT GCCCGGGTGC GCGCAGGATC TGTGTACATC 14460
GCACGGACCT TGCGCCGCAC CGTCTCTTCA TCGTCCGAGA GAAAAATCGC GTTCCCCAAA 14520
CTTTTAGACA TTTTCGCTTG TCCGTCAGTT CCCACGAGAG TGsCGCAGTC ACTGAGGAGT 14580
GCACGCGGCA ATGGAAAGAC CTCCCCATAG AGGCGATTGA AACGCTTTGC AACTTCGCGC 14640
GTCAGTTCTA CGTGACTTTC ATTATCTTTA CCTACCGGCA CAAGATGCGC CCGCGCCAgC 14700
AGAATGTCCG CCGCTTGAAG TACGGGATAT CCCAAAAGAC CAAAGGGAAG TTCAGAAAGG 14760
TTTGCCGCTT GAGCCATCTC CTTTAAGGAA GGGATGCGCT GCAAACGCGG CACCGTTACC 14820
AGATTCGCAA AAATGAACGC CAGCTCTGTA ACTTCTGGCA CCGCcGATTG CAGATAAATG 14880
ACCGCACGCT GCGcGTCGAT GCCACAAGCT AGGTAGTCCA GCACGAGCTC ACGAACGAGA 14940
GCAGGCAGCT CGGCAAGCTG TGCACGCCCA TGGGTGTTAg TGAGCGTGTG CAGATCCGCG 15000
ATGATGAAGA AACACTcGTG CTCAGACTGT ATGCGCAGAC GCGTcCCGAG CGAGCCTGCA 15060
TAGTGACCGA GGTGGAGACG CCCCGTGGGC CGGTCTCCGG TAAGGACGCG CAcATGGGGA 15120
GACTGTAAAT CAAGGTCCGT CCAGTATTCA AGAGTGCaGG CGCaGGAACA CGTGCGCCGA 15180
CCGCGCAGCC CTCCTCTATA TCACGGGTGC GTGTCCGTCT CAGGAAGAGG CAAACACCTT 15240
GCCGTAGCGA TTTATTGTCG CTGCCATATC ATCAAGACTC ACCTGTTCGC CAACTGCTCC 15300
AAGTTCGCAG GCTACACGCG GCATTCCATA TACAATGCAC GAATCTGCGT CCTGCGCAAT 15360
TGTACGCGAG CCTTCTGTGT ATAAACGCGc AAsTGCGCGG CACCGTCCCT TCCCATTCCT 15420
GTCATTAAAA TTCCCAATGC GCGGTTTTCA AAATGGCGAG CTACTGACTC AAATAACACA 15480
TCAACGCTGG GTCGGTGACC GTTTTCCGGT TCATCCGAGT TGATGTGCGC AACAGTTGCC 15540
AGGGAGCGCC GCTCTACCGT CAGATGGCGA TCCCCGGGGG CAATCAAAAC ACGCCCACGC 15600
CGCACCAAGT CTCCTTCTTG CGCTTCTTTT ACTTCCAGCG CACATACCTG GTTAAGACTG 15660
TACGCAAATT CACGCGTGAA ACCTGCAGGC ATGTGCTGCA CTACTACCAC CGGCTGCGGC 15720
AAATCTGCAT CGAGCTGTGC AAAGATATGA CGCAGCGCGC TAGGTCCTCC CGTTGATACG 15780
CCGATTGCAA TGATCTGCAG TGCCCCACTT TCACGCAGtG GCACGATGCG CGTTTGGCGC 15840
GCTCCCTCTG TGGGGGTGAT TGTGTAGGAT GCACGGTcAC GCGCGGGCGC ACGTGCACAT 15900
ACAGACGGCA GAGAGGCTCG CTCTTGGGTA TCTAAACAAT TCAAGTCCTC CTCGCCGGCA 15960
GGTCTCTCAA CAGGTGTGTC CATGGCAAGA CAGCGGCGCG TGCGGCGCAG CAACTTGTAG 16020
CGTCTCCCAT ACGCGGTAAC GTAATCGACA ATCTTGCGCG AAACCGTGCG CAAATGCGCA 16080
GACTCAGATC CAAAAGGCTT GGTGACAAAG TCACTTGCCC CCAGCTCCAA ACATTGCATC 16140
GTGACCCGTG CACCTTCTTT TGCAATGCTA GAGAGAATGA TTACTGGAAT ATCAATGCGT 16200
AGACGTTTCC GCTGTTCAAG AAACTGAAGC CCGTTCATGT GCGGCATTTC CAGGTCGAGC 16260
AAGATGACAT CTGGCTGTAC ACGCTCTAGC ATGTCAAGmG CAAAGCGCCC ATTCATGGCC 16320
TTGCcCGCTA TGCTAAGACC CGGCGCTCCT TCAATGACTT TTCCAATAAC CTTTCTCATG 16380
AGCGCAGAAT CGTCCACGAT TAAAACAGCA ATATCATTAG TATTTTTCAT ACGGTAGTCT 16440
GTGGCCCCCT GCCTGTATAC GCTCTGGTGT GCAGCCTTAC CGCGCGTGGC ATTCATGCGC 16500
TGACCCTTCA TCGTTTTTCT GGCACAGnCA GCCCCACGGG GTTTTTAGAA AAGAAAACTT 16560
TGTATTCATC CCAAAGAGAG ACTCAGAGTG CCCAATGAAA AGAAAAGAGT GAGCAGACAT 16620
CGCATCCCAA AATCGCTCAA TAACCGCCTT TTGGGCTGTT TCATCAAAAT AAATAAGTAC 16680
GTTTCTGCAG AACAGCACGT CGACGTTTCG ATGCATGGAA CGGTGTTTTA AGTTATGATA 16740
ATCGAAGCGA ACCATTTTCC TAATATCGGC GTTAATTTGA TATCCTTCCT GTGTTTCGCG 16800
AAAATATGCA CGGAGGTATT CGTCCGGTAC TCCAGAGAcC GTGCGCGCGG GTAGTAGCCC 16860
TGGCGTGCAA CGAGAAGCGA TTTGAGTGAG AGATCCGACG CGATCACCTG GCATGAGAAC 16920
GCAGCGGGCG CATACCGTTT CAGCAGCATT GCAATAGTGT ACGGCTCTTC CCCGGTGGAG 16980
CACCcTGCGC TCCAAACGGT TATGGAATGC TCACCtnCTG CGCTTGGCTT TTACCAATTC 17040
TGGAATGACA TAGTGCGAGA AGGAGTCAAA ATGGGCTTTG TTACGGAAAA AACGCGTTAG 17100
ATTTGTGGTT ACCGAATCGA GAAGTGCAGA AAGCTCTGCG CTACTTGCAA GGACCTGCTG 17160
GTAGTATGCG CACGCAGAAG GGAGGGCAAG TTTCGCGCAG GCGCGATCTA ATTCTACTTT 17220
CCAGTACCGA GCGATTAAGT GCAGAAAAGG TGATGCCgCT GTGCTCGTAA ATAAGAGTTT 17280
TAAACGCGGC AAATTCCGCA TCGCTGAGTG TGCTCATCGG TCTTACCCCC TTGCCGAGTG 17340
TCAGGTAGCT GCGTGTTATA TCGCTTTTTT CGGTCGCAGC CGTCATCCTG CACACGGGGT 17400
AATTGCTACC GGGTGTACAA CGACAGTGAG AACCGACACT GTGTCTTACG GAAGCGTGCC 17460
TGGCACCTCG CGCGCCGACA CGGTCTGCTC CCTTGAAGGT GCACGCGCCG CACGGTGTGT 17520
CAGGAGCGTG ACGCTTGCAG AGATCATGCG CCGTCTCACA TGGCACGGTA CGGTGAGTGA 17580
CCCGGGCCTG TCTGTGAGGC TACCTGCTCC CAATCACCTT GCTGAGGCTC ATACACCAAA 17640
AAATCTTGTG GCCGCTGTGC GTGCTGTTCG AGCGCCCAGG GATAGAGCAC CACcTTTCCT 17700
TCAGAATCAA CAAACCCAAT GGGGTCAAAC TGGTAAAAyT tCGCCTTAGG ATAACGAGCG 17760
CGCACATGGC TGTGCATTTC AAGGGGAATC CATGCGCGCG GGATGCACGT CCCCCACACC 17820
ATTTCCACCT GGGAGGGAAG CGCATACGCC CATATGAGGC GGGTGCTCAC GTCCAGTGCA 17880
AACAAGCACA GCGAAACGGA GGTAATCGgC GTTCGCCCTT CGAGACAGTA AAAGTGAAAG 17940
CGCTTCAACA GTGGTTCGTA GTAGCGCTCA TGAGCGTGAg CATCGAGCGG ATGGACAGCA 18000
CGGTGCACCA CGTATCGTGC GTGCTCAATG GTATTGAGTT GTCCCCAAAA AACAGTTTCA 18060
TGCTCGTTCC CGAAAACAAA AACATCCGGT TTGACGCGCG GCAGGGCGCG CGTGGACACG 18120
GCGTGGGACA CCCGGTGTGC TTTTGTTTTT TGCAGGGAGC CGGGAGGACG CAGACGGAGC 18180
CTCGCTCCAC ACTCCTTCTT CAAACCAGTA CGACGGGCAC GTTACGCGCA CTGATAGACA 18240
TTCTTAGCAC GCGCGCTGCA AAATCAGTTC CCTCAAGAAG AAAGTGTGCG GGACCTCTTG 18300
CAGCGTCTGC GCACACACCG ATGTGCCCAC CACGCACCAA AGCCCGCGCA AAAAACCGGC 18360
CATTGTCATG CCTAGATCCA CCGCTCCTCG TTCTGACCTC CTGcGTGTAC AAGACCACAA 18420
ACGAGGAGCG CCGCTGCCAC TCTAGAAGGG AGCGGGTACT TCTGTCAAGC CATTGCTCTT 18480
TTTTCTCACT CTCCTTTGCG CGCCTGCACC GTGCTGCTGT GGAGCGCGCA AAATAACCAG 18540
CATCTGTTGA CTTAGCGGCG GACGCTGTGA TGAGCGTGTT CTTTAACTGC GCAAGTCGCT 18600
CCGTGAGyGA AACCTTATAA ACGTGTGGAT TTAACAGTCG CACATAGCTT TTATCCAATC 18660
GTTGCAGTTG TGCGCGCATC CGTGcATGGA TTTCTTGTAC TGaTTCGCCG GCACGTAACC 18720
GATTGCCCGC CTGcATGACC GTGCGCAAAA GCGGTTCCAC CTGGTGCGCT GCGAACGTGA 18780
ATGATTGAAG ATTATAAAAA GGATGTATGT ATCGCTGTGC CACcTGCGGC TCTATCACCT 18840
CATCTGCAAG TCCTATAACA TCCGCCTTGT ATTGCCCCGC TGCATCATAC AGACGCCATA 18900
CCTGcTTTAT TCCAGGTGTG GTAGTCTTTG CCGGGTTATC CGAGACTTTC ATGACTGGCA 18960
GCCAGTGTGC ATCGTCCCAG TGGGGGAGGC GTTGGGATGC TGCATGCCCG TGTTGAGTGT 19020
GCGCCGGCTG TGTCGTAGCG CGTGCACTCA TCTTGTACAC TCCGGTAAAG GCAGAGTCTG 19080
CTCCTCCGGT TACCAGGTGT GTGCCTACAC CCCAAGCATC GATGGGAGCA CCGCTTAAAA 19140
CTAAAGATTC GATGATCGTC TCATCCAGCT CATTTGAAAC TGCAATGCGT GCTTCGGGCA 19200
ATCCCGCTGC GTCTAGT 19217
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3496 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
AAAGCATTTG CAAAGACACG TACTGAGTTG GGACAGACAA TAGCAGCTGC TTACCTCGCA 60
TCAAAGGATG TGTTAACCGT GGGTATTGGA CTTGATATGT ATCAAACAAA TCAGTATTCA 120
GCTCTTTCTG AGCACATAGA AAAAATAGCG GGGGATAATA AGTTTGGAGC TCTACAAGCA 180
AAGGCAAGGC AAATTTTAGC ACGTCAAAAA AAAGAATCGT GAGGATGTGT TTTCTATAAA 240
AATCTGTGTA TTAACTGCAG TGGTGTGTCT GCTGTCTTGA ATTCTTTTTT GACGGTGAAT 300
ATGGAAGTAC TTCGTGTAAC CAGTTTAACG AAACATTATG GCTCCAGGCG CCATCCGGTA 360
CGTGGGTGTG AAGACGTAAC CTTTTGTGTT GAAAGGGGAC AGGTGTGCGG GATATTGGGG 420
TTGAACGGTG CAGGgAAAAG CACTGTACTc GCGTGATTGG TGGGTTGATT CATCCGTCTT 480
CGGGGGAAGT GTATGCGTGT CATTGTTCTT TATCACGCTA CCCGGTAGGt ATCGGCGTCA 540
TATTGGTGTT CTGCATGAGC AGAATCCGCT ATACGCAGAT ATGACGGTTG AGCAACATAT 600
TCTTTTTGTT GCCCGCATAT TTCAACTTGC CGATGGGGAG GCACGCACCG CTGAAATGAT 660
AGAATTATTC CAGTTGCAGT CTGTTGCACA CAGACGTGTG CGCAATCTTT CTAAAGGATA 720
TAAACAGAGG GTTGGGcTTG CGCAGGCATT GGTACACCGT CCCAAACTCC TTGTCTTAGA 780
TGAGCCTCTT TCTGGTTTGG ATATTGTATA TCTGAAGGAA TTCCATAAAG AGATTGTTGC 840
GCAAAACAAT AATCTTGCTG TGGTGTTTTC TACGCACGCG GTGCAGGAGA TCGAAGCGTT 900
GTGCGACGTG TTTGTCTTAT TGCATGCAGG ACATGTTCTT TTCTCAGGAA ATAGAGCGCA 960
AATAGCAGCG CGCATCGTGC GAGAATTTCC TGAGAAAAAG CAAACAGTAG CATTGCACCT 1020
TGAAACAGGA ACCTTTATCG CTTTTGTATT TGAGCAGTAT ATGCAATGGC AGAGTGCACA 1080
GGATGCTGCG TGCTATGCAG TGTAAACAAT TTTTTACTTT GTATAAAAAG GAGCTGCGTT 1140
CTCTACTCAC TTCACCGGTA ACTTACGTGT GTCACGTACT ACTGCACCTT GGTCTGACCA 1200
TACCGTTCAT TGGAGTAAAT TTTTGGTTAA ATGCGGGGAT ATCTGAGCTT CAAAGTTTTT 1260
TTCTTAATGC ACCACTTCTT TTCTGCATTA TCATACCGCT GCTGACAATG CATGTATGGT 1320
CTCATGAGCG AAAGTCAGGA ACCGATACAC TGCTTTTTTC TTTTCCGATT GCAGAACGAA 1380
CGATTGTTTT GACAAAGTAT CTATCgCTGC TTTCAGTGTA CGGTGGGATG ATTGTTGTCA 1440
GTACTGCTAT CCCTCTTTCT ATTTTTTCTC TGGGATATTT TGATTATGCA CCCTGTGCTC 1500
TTGCATACGT GACGCTTGTT CTTTTTGGTG CAGCTCTTCT TTCGCTGTCT TGTGCGGTAG 1560
CCAGCTACGT TTCTTACGCT GCAGTGGGTT TTGTTTTGAA CTTTACGCTT GCGGTGATGG 1620
CATTGCTGGT GCATATTCCC GCACGAGTGT TCATATCACA CAGATATATA AGGGCATGTG 1680
TTTCGTGGGT TTCTTTCGTA TATCATTTTG AATCTGCCGC TCGTGGCATA TTCGATTTAA 1740
GCGATTTCGC GTTCTATATT TTTGTAGCGA TAGCGGGTAT CGAGTTGCAG TGTTTGATTG 1800
TAAGGGTTCG TTTTAGGTGA GCAGAAAACA TCATATACCC TGTACCGTGA TGATTCTGAA 1860
TATAATGATG AGCGTGTTTG TGACGTTCTG TACACCTGTC CGGTGTGATT TAACAGCACA 1920
GAGAGCATAT TCCCTTTCGG CACACACCAT TAAGCTTTTT GAGAGTGTCG AAAGTACTGT 1980
GGAAATAACG TGGTTTTATT CCACCGATGT AGATAGGTAC ATTCCTACCG TCATATATGT 2040
GAGAGATTTG CTTAAAGAGT ACGCTCATCA GCTGAGTAAG CAGTGTGCAG TAGCGATGAA 2100
GGATATTAAT CTCCTTTCTC AGTCTTTGAG GAAAGAACTT GGATTTGTTG CTCGGCGCGT 2160
TACGTATACG CGTAACACTG CCAGCATAGC GTACGATGCG TATTCTGCAA TACTTGTTGA 2220
ATATCGTGGT ATGGCTCGTG CCGTACCCTT TGTGTCTGAC ACCAAAAGGC TGGAGTATGA 2280
CATCGCGCGT TTGATCATCC AGATGCAGCA GGAAATGAGT GCAGATATGA TGTCCCGTGG 2340
GATATATGTT CTTGCTCCAC CAGAAAGTTT AAGTACCACA TATGCCCATG TATTACCGCG 2400
TTTGCAATCT GAAGGATtGC TCCCAGAGAT TCTCTCtATT TCTTTGCCTC AGCTAGATAC 2460
CCGTATTCCA CTTTTGaTTT TAGGTtCyGG CTACGTGGaT GAACACGcCG TAACCTTACT 2520
TGATGCTTTT TTGCAGAAGG GAGGAAACGC ATTGTGCTTT GTATCcAGGA AATAGCGTGC 2580
AACTCAATGA TCAATGGACT GTTGAGGAAA AGCGCCATGA TTTTCTTATT AATCTCCTGA 2640
GCACGTACGG AATTACTATT AACTCAGATC TCATTCTCGA CGAGCAAAGT TTTGCTGTAT 2700
CGTTACCTTC AGTTTACGAA ACTCAATACG ATAGAGTGTC TTATCCGTTC TGGCCAGTTG 2760
TTACTTTGAA ACCGTATACG CACGGAGTAC CTGTAATGGT ACAAGCGGGA ATTCAGTTCC 2820
TTCGATTATT TTGGCCCTCG TCAATACGAG TTTCTTTTCC TGCCCGTGTA TTTGAGTCTA 2880
CGAGTAATCA TTCTCTGTGT ATGACTGCGC CTTTTAATAT TGATCCTTCT GTTGATCACC 2940
TGAAAGATCT TGCAAAAGGT AAAATGCCCG CTCCCCAGGC ATTTGTTGCA TTTCGTGATT 3000
ACCCTGGAAA GCTCATGGTA GTGTCCGATG AGTACATGGT CAGTGCAATT GTGGAACATA 3060
CGCACAACGG AGAAAATCTT GATTTCATGA TAAACTGTAT TCAGTGGCTG TGTGGTAACG 3120
ATGGTTTACT TATGCTGAAA AGCAAGAATC CCGCGTGGCT TCCATTGAAA TCTTTCCGTG 3180
ATGAACAAAA GTTCGCACGC ATTGTGCACC GTGCGCGCTA TCTGAATATC GTAGCTATCC 3240
CTGTGCTTAT AGGAATGCTG TTTGTGGTGA TGCAGATTCT TTATCGGAGA AAACGGTGAG 3300
GGTTATGCGA TCTGTGGATT CGCGTAGCAG CGTAACACGG TGGGTATGTT TAACCTCAGT 3360
GATTTTGTTT TGCTTTTGTA TTGCGGTGAT GAGGTATGGG GGAGTAAAAA AGAGGCGTTA 3420
CTTTTATGGA TTTTGTCTCC ACCCTAGAGA ACGGGCGGAT ATAACGGAAG TCATTCTCCG 3480
TTTTCCAAGG GAGGAA 3496
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11628 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GTTAATGTGG AAATGAATTC ATTTCCAAAA TTCTCCGCAG TGACGTATAT GACGTTCAGG 60
TCTGTTGTCT TGTAGATCTC GTGTCCAATA GCCTGCATAA GGTGGGTTTT TCCTAGTCCC 120
ACTCCACCGT AGATAAGTAA CGGATTGTAG GAAGTGCCTG GGTTTTTTGA TACGGAGATA 180
GCAGCGCTAT GGCTGAATTT GGTTTCTTCT CCGGATACAA AGTTCTCGAA GGTATAGTCT 240
CTGTTCAGGT CGGGGTGAAA GCTCTTTTTG GAAGGAACCT CTGCAGGAGA GTTTTTCTCC 300
AGGTAGGTAT GCaCGTGTTT GGGGGGAGCA GTATTTCCAT GAGGGGTGCC TTTTTTAACG 360
GCAAACAAAA GTTTAATGGG GTGTCCAGAA AGTTCGAGGA ACTTGCGCTC AAGCTTTTCT 420
TGATATTTTT GGCTAAACTG TATTCGGAAA AAGTCTGAAG GTACTGCTAT TTCGATAGCG 480
TTTTCAAAAG ATGCGATAAA GAACAAATGA GCAAACCACA TGTTAAATTC TGCTTCGGTC 540
GATTCACTCC GTATCTGGCT GAGTGTCTCG TTCCAGAATA CTTCATACCC TACTGCGTCC 600
ATCTACCTAT GATACAACCT ATTGTATTTT GCCTGCAATA AACGAAGAGG TTATACGCGC 660
GTTGCTTTGT GGGTGTAGAT TATCTTGTTA TTCAAGAGAA GTTTTTATGC TACACTAAGC 720
GGCTCTTGTT TAGTGTGGGG CTGTTGCGCG ACAGTATACC GTGAGCATGC CCGCGAGAAA 780
TGGGGAGTCG GAGTGGTTAT GAGGTGTGAT GCTACGCAGG AAAAACGTGC GCACTCAGAA 840
TCAGGGGAGA GTGTTTTTTT CCAGAAGTTT TTGGAAACGC GGCAAATTCT CCTTTCAGGG 900
GAAATAAGTA AAGACCTCGC AGAGGGAATA GTACGGCAAC TCTTTGTATT GGAGTCTCTT 960
TCCGTTTCGA AGCCCATCTA TATGTACGTG GATTCTCCTG GGGGGGATGT GGATGCAGGG 1020
TACGCTATTT TTGACGTTAT TCGCTTCGTC AAGACGCCAG TGTACACAAT TGGAATGGGG 1080
TTGGTTGCGA GTGCTGGTGT ACTCGTTTTG CTCGCGGCAA AAAAGGATTG TAGGTTTGGA 1140
TTGCGCAATA GCCGGTACTT GATACACCAA CCCCTTTCTG GTATGCGTGG CGTTGCGACA 1200
GACATAGAAA TCCACGCACG GGAGCTTGAG AAAACGCGAT CGAAACTGAA CGCTTTGATC 1260
GCAAGTGAAA srrGTGTGAG CTTAGA AAA GTTGCACAGG ATACAAATCG AGACTACTGG 1320
CTCGACGCTT CTCAAGCACT AGAATATGGT CTCATTTCGA ACCTGATTGA AAAAAGGGCG 1380
GACCTTCCTA AGAAATAATG GATACCGAAT CTGTCCTCTT TCGCGCGCAG TGCTTGCGTG 1440
CAGTGCGTGA TTTTTTCCTT GAACACCACT ACATAGAGCT CGATACGCCT GCACTCGCCC 1500
GTGCGCTCGT TCCAGAACGG TGTCTTGAGG TGTTTCAAAC CGAGTACTTT ACGTCAgTGC 1560
ATGCTAAAGA TACACAGAAG TTATATCTCG TTCCCTCTCC TGAGGTTTTT CTGAAACCGC 1620
TCATCGCGCA ACTGCAACGT TCGGCTTTTC AGATCTCAAA GTGCTATCGC AATGGAGAGT 1680
CCATGGGCGC CTTGCATAGG CCGGAATTTA CTATGGTCGA ATACTACACG GTGTACGCTG 1740
ACTACAAGAC GTCGCTCGAT GTAAGCAGCA AACTCTTTCG CTTTGTGGTT GAACAAGTAC 1800
AGAGTCATCC GCTCGCGGAC CCATATTCGT GTGCTTGTTT TTGTGCTCCC TTCGAGTACG 1860
TGACGGTCGA GGAAGCTTTT CTCCGCTATG CAGGCTTTTC CCTTTCGCAC GCGAGTAGTG 1920
TACAGACGCT TGCGCAGGAA GTATTGCGCT CCGGAATAGA CCTGGGAGCA CGTGCGGGGG 1980
TCGATTATAC CCAGTGGTCA TGGGACGATT TGTACGAACT GTTGCTCGTG CATATTGTTG 2040
AACCAAAGTT GAGGTCAATA AAGGATCGTT GCGTCGTGCT GTATGACTAT CCTATACAGA 2100
TATCCTGCCT GGCGCAnGAA CACACTGGAC GCTCAGGGAT ACAATCTACG TCACCTAACA 2160
AGGGTGACGC ACCTCACTGG GTGGTAAAGG AACGGTGGGA ACTGTACGTC CGCGGTGTGG 2220
AACTCATAAA CTGTTACACA GAGCAGCGGG ATGCGAAgcA TGTTACCCGG TACTGCAGGG 2280
AAGAACAAAC CGCAAAACAG GGATCTGCGC GAGTTGTGCA TCCTGTTCCA GAGGGCTTTG 2340
CGCACGCGTG CgcACGCATG CCCCCTTGCT CTGGAGCAGC ACTCGGATTT GATCGCCTGG 2400
TTGCGCTGCT AGCCGGTCGG CACTCATTAG ATGCGTTTGT GTATGATCAG TGACACTCCT 2460
CCTGCCTTGG AGAAGTTAAT TGGAAGTTTC CTGGTTGTAT TCGATGAGCG TTCTCACGGG 2520
AAGATCCCCA ATCAGCTCAT GGTACCGTAG aATGGTAAAC CCACAACGGC GAAGAAGCCC 2580
ACCACTTCTG CCCCGCCAGC CCGGAGCATC GTGCGCGCTG CATTCAGCGT TCCACCGGTG 2640
GCAATCAGGT CGTCTGTTAA CAGCACGCGG GCCCCCGCGA CTACATCGCT CTTGTGAACC 2700
TCAACGGTCG CCTTTCCATA CTCTAAGGAA TAGGAGCACG AGTACGTATC CCCCGGTAGT 2760
TTCCCCGCCT TCCGAACTAA AATAAGAGGT ATTCCCATGC GATCTGCAAA AGGCGCGGCA 2820
AAAATAAAGC CACGTGATTC GATTGCTGCG ACCGCGGTAA CGTGCTCATC GCGGTAGAAT 2880
TCCACCATTT GATCAAGACA GTAACGAAAT ACAGCCGCGT TCATCAGCAC GCCAGTAATG 2940
TCGTAGTAGA GAATTCCTTT TTTAGGGAAA TCAATCCGCT TACGAATTGC GCGGTCCAGC 3000
GCCGCGTGTC CGTCCACAGG GGCATGGTAA CGTCCAATAC CACGCACGTC AATGATCTTA 3060
CCGGTTTGTT GGGAGGCTTG GTGGATTGAG AATTACGTCT CCTGGAAAAA AGATTTCGCT 3120
GAAACTTCAC GAAATCTCGG TGAAAATAAA TGATTATTTT ACCAATCGGT GAAAAAAAGC 3180
CGGGAAAAGT CCAAAAAGAC AGTGGTTATG CTCCATTTCT TTCGATTTTT TGTTGGCATG 3240
GTTTTTGCTT TAAAGTTTGG AGGAGAAAGA ACGATGAACA TGTGTACAGA TGGAAAAAAA 3300
TACCACAGCA CCGCCACGAG CGCTGCAGTT GGAGCCAGCG CCCCCGGTGT ACCGGACGCT 3360
CGTGCCATTG CTGCTATCTG CGAGCAATTG CGCCACATGn TAGCGGATCT GGGAGTACTG 3420
TATATCAAGC TACATAACTA TCACTGGCAC ATCTACGGCA TTGAGTTTAA ACAGGTGCAT 3480
GAGCTCCTTG AAGAGTATTA TGTATCAGTT ACTGAAGCCT TTGATACGAT TGCCGAGCGG 3540
TTGTTACAGC TGGGCGCGCA GGCTCCTGCG TCTATGGCTG AATACCTTGC GTTGAGTGGA 3600
ATTGCAGAAG AGACGGAGAA AGAGATCACT ATCGTCTCTG CkCTTGCGCG CGTAAAGCGG 3660
GATTTTGAAT ACCTAAGTAC GCGATTCAGC CAAACGCAAG TACTTGCAGC TGAAAGTGGG 3720
GATGCAGTGA CTGACGGCAT TATCACAGAC ATACTGAGGA CGTTGGGAAA GGCCATTTGG 3780
ATGCTTGGTG CTACCCTGAA AGCCTAGGTA GAGCAGGCTG TACGTACAAC ACACGTACGG 3840
CCATGCGCTG GAAGTCCTGT ATTTTGCACA TAAGGCCTCT CTCCCGTTAC AGCATGAGGG 3900
GAGGGAGGTG TTGGTTGAAG TGCTtGGGGA AGTGTGCATA ATCGTCcTAC GGAAGGGGGC 3960
GTTTTGTGGA AAAAATTGTT AACGCAGACG GATCGGATGC TATCTGTCCT GCGTCTGCGG 4020
CCTGTGCTAA GTCCATACGA TCTTACCAGG AGAGCTATTC TCTTGGTGAG GAAATCGCAA 4080
ATGCAGTCAC CCACGGTATC GGTGTCGGAC TATCCAwCtT GCACTGGTGC TCCTGGTGGT 4140
GCgTGCAGTG CACTAtACGC CGGCTGACTT GACGGCTCGC TATGTTGTTG GTTTTAGTGT 4200
CTTTGGCTCC TCACTCATTG TGCTGTACCT GTGCTCTACG CTGTACCATG CTCTGCCTCG 4260
TGGAGCGAAg TATGTGTTCG GTGTTATTGA TCACTGTTGT ATTTACGTGC TCATTGCAGG 4320
TACGTATACT GCGAGTTGCC TGACTACACT GTACGGCGCG ATCGGATGGA CTGTTTTTGG 4380
GGTTATTTGG GGATTAGCGT GTAGTGGGAG CGTAATATAC TCCGTGTTTG GGCATCGGGT 4440
ACGGTGGCTG TCTCTCGTGA TGTATATAGC GATGGGGTGG CTGGTAGTGT TTGTAGCAAA 4500
GCCGTTGCGG GAACGGCTCC CTGAGATTAG CTTTCTGTTT TTGGTATtAG GAGGCGTGCT 4560
CTACACGGTT GGTTGTGTAT TCTACGCACT CAAGAGAATA AAGTGGACGC ATACTATCTG 4620
GCATATGTTC GTCATCGGCG GTAGCGTCAT GCATTTTTTT TCGCTGTATT TAAGCTTTTA 4680
AATCCATAAG CCTCCTATGA TAGATAGGAG GTTCGTTTCT TTGCGCAGAC CGCATCCTGT 4740
CTGACGGAGC GaGCGAGTTC GCGCAGTCCT TTATGGTGAT GAAGACTGAA ACTGGTTCAA 4800
CCTCAACGCA TTGCATAACA CCGAGACTGA GCTTAAACTC ATCGCTGCTG CTGCAAGCAT 4860
aGGTGTGAGA CGTAATCCGA AGAAGGGATA TCCGAGTCCT GCTGCTAGAG GAACGCCGAG 4920
CGTGTTGTAA AAAAATGCCC AAAATAAGTT CTGCTTCATG TTCCGCACCG TTGCAATGCT 4980
GAGATCTACC AACGTTACCA CGTCCCGTAT GCAGTTTCTC ATCAGGACTA CGTCTGCACT 5040
TTCTACTGCA ATATCAGAAC CTGCACCGAT GGCGATCCCA ACATCGGCGG ATGCCAGTGC 5100
AGGCGCGTCG TTTACGCCGT CTCCTACCAT CGCTACCATC ATTCCGGACG CTTTTAAAGC 5160
GGAAATTTCT CGTTCCTTAT CATGAGGGAG TAACTCCGCT TTACTTTTCT TGACACCACA 5220
GCGTGCAGCG ATGGTGTGTG CAACGTGTTT GACGTCTCCC GTTAGCATCA GCGTTTGGAT 5280
CCCACGCTTG TGCAAGGCAC CAATCGCTGC AGAAGAATGT ACCTTTACGG GATCTGAAAC 5340
AAAAAGAACT CCTACGAGAT TTTTATCCGC TGCTACAAAT AAGGGCGTTT CCTCTAGATT 5400
GTGTGATGGA GAGAGATATG TGTCCATGCC ATCAATACTG TGTGCGACCA TCATACGTGC 5460
ATTGCCTACC ATGACGGTCT TTGCATACGA GGTATGCACT AAGCGCGCCC GTAGACCGAG 5520
TCCTTGTTCT GAGTTGAAAT CGGTTATAGC AAGCGGTGTC ATTCCTTTAC GCTGTGCAGc 5580
TACGCTAATT GCAGCTGCAA GCGGATGGCC AGAGCATACT TCTAAGCTGT ACGcAAGGTG 5640
GAGTATGTCT TCTTCGTTAT AGGTTGGATG GAGCGTGTGT ATGTGTGAAA GTGTAGGACG 5700
TCCTAAGGTG AGGGTGCCAG TTTTATCGAA CGCTATTACT TTCGTGCGTG CCATTTGCTG 5760
GAATACCTGC GCTGATTTTA TGAGAATACC CATCTGTGCA CCCTTACCCG TTGCAACCAT 5820
GAGCGCGGTA GGGACGGCAA GTCCTAACAC GCACGGGCAT GATATGACCA GGACAGTGAC 5880
TGCGATAGAA AAGGCAAATT CTGCAGACGC TCCTGCGCAT AACCACGCGC ACCAGaGAGC 5940
AAGGAGAGTG CTACGATTGA TGGTACGAAT ATGcGCTGAC AGCGTCGACT AGTTTGGTGA 6000
CCGGAACTTT AGACGCAGCA GTTTTTTCTA CCAATGAGAT AATTTGCGCA AGGGTGGTAT 6060
GCTCCCCTAC CCGTTCAGCA CGAAATTTGA GGAACCCCGT GCTGACTAAG GACGCAGAAA 6120
TGACGGAATC TCCGCGTCCT TTTTCTACCG GaATACTTTC CCCTGTGaCG TTTGACTCAT 6180
CGAGCGTGGC CTGCCCGGAT GTGATGATCC CATCTACCGG AACTAGCTCA CCTGCTTTTA 6240
CAAGTACGGT GTCTCCGACA AGTACGTCCT GTGcAGGAAT TTCTATCTCA ATTTCATGGG 6300
TCTCATGGGC TGATGcAGCG CTTGCAGTTG TTGGGGAAGA AGGGGATGCT CCGCGCGGAA 6360
CAGATACcTG ACGGATAACG CGAGCCGTTT TAGGTTTTAT GTCTAGCAGT TGTGTGAGTG 6420
CGCGAGAAGT GCGCCCTTTA GACAAGGCGG ACAGGTATTT ACCCACCGTG ACGAGCGTTA 6480
CGATCATTGC AGCTGATTCG AAATACAAAT CCGCCACATA GTGCGATACA AGTGCCGTGT 6540
CGTTGGCATG CACGCCCATT GCTATACGCG CCGTGGCAAA GAGACCGTAT GTAAAAGAAC 6600
TCAGGGAACC GAGAGAGATG AGCGAATCCA TAGTTGCAGT GTTGCGTCTC AGAATTGCAC 6660
CATACAACGC AATAAGTCCT GCACGAAAAA GAGAGCGATT GGCGTACAGG ACAGGTAATG 6720
TCAGAAACGC CTGTACAAGG GCAAAGGAAA GCGCATATTT CAGGGGGTGC AAGAACCCAG 6780
GGATCGGTAG GTGCACCATG TGCCCCATGG ACAGATACAT AAGGGGCACG AGTAAGCAGA 6840
GAGAAGTACG GACACGCCTT TTGAGCGTCA CAAAATCTGG ATGTACCGGC TGGGTTGCAG 6900
CAAGCGGTGC GGtTGTCGAA TGCGTATCTA AAAGCGTGGC TTTGAATCCT GCATGTGAAA 6960
CTGCATCGAT GATGGTCTGA GCAAACAGGG TGTGCTCAGT AGGGTGAAGA TCAGTGTGTA 7020
CGTATAAATG GCTGGTGGTG GGATTTACGT AAACGTCGTA TGCGCCTGTC ACGTGGCGCA 7080
CTGCTTCCTC TATGCGCCGC ACGCACGCAG nanAgnA T ACCGTGAACA ACAAATGACA 7140
CTTGCATGAA AGACTACCTC CTATTCAGGA CGGGTTTTTT ATGTATCCAA AAGCTCTGGG 7200
GAGGAGCGGC TGGCAGTGAC GGCAAGAAAC TTGCATGTAC CGGATAAAAA ACCGTACACT 7260
TTTCATCCTA TCTGCTGTGA AATGGGAGCT CAACGAATTA TGACCCAAAA ACTGCAAAAA 7320
ATAGTGCTGC CTCCTGTCTA TGGGCCTGCA GATTTTGAAG CGCGTGTCTA CGCATGCTGG 7380
GAGCAGCGGC AGGCATTTAG CCCGCGTGCG CGCGGgAGTG GAACGTCGGA TAGCGAGGGG 7440
TGCGATGGGC ATAGCAGACA GATAGAAGGG GGTGCGCGTA CCTTTGTCAT TGCTATCCCA 7500
CCGCCAAATA TAACGGGCGT ACTCCATATG GGGCACTGTC TCAATACGGT GTTGCAGGAT 7560
ATCGTTATCC GCTACCAGCG CATGGCCGGT GCGTGTACGC TCTGGATTCC GGGAACTGAC 7620
CATGCAGGTA TTGCCACGCA GCATGTGGTT GAACGCGCCT TGAGGAAGGA AGGCATCCAT 7680
AAGCGTGAGG TGACGCGCGA ACAATTCGTT GCACGAACGC AGCAGAT AA GGATTCCCAT 7740
CAAGACACTA TTCGCATGCA GTTACGGAAG ATGGGGGCAT CTTGTGATTG GACCTGTGAG 7800
CGCTTTACGC TTGATGCAGG TATGTCAGCC TCCGTACGCG AAGntTCGTT ACGCTTTATG 7860
AACGTGGCTT GCTCTATCGT AGCATGTACT TGGTTAACTG GTGTCCTCGC TGTGGCACCG 7920
CGCTGTCTGA CGATGAGGTT TTTCATCAAG AAAAGGATGG CGCGCTCTAT TATGTTCGGT 7980
ACCCTCTTTT ACCCCGTACT GAAGAAGAAG GAAACGGCGT TCCCCCTCCA TTAGGGACTG 8040
CTCAGGTGGG GGAAACTATC ATCATTGCTA CTACGCGCCC TGAAACCATT TTGGCAGATG 8100
TGGCAGTTGC GGTGCATCCA GATGATGCGC GCTACCAATC TTTGATTGGA CGTAAGGTAT 8160
GCGTGCCAAT GGTGAACCGC ATTGTTCCTA TTATTGCTGA TTCATATGTT GCGCAGGATT 8220
TTGGAACCGG TATGGTAAAG ATTACTCCTG CGCACGATCC GAACGACTGG GATATTGGGA 8280
CGCGCCATTC GCTTGAAgCG ATTAATATGC TCAATCCAGA TGGCTCGCTC AATGATCAGG 8340
TGCCTGCTGC GTATCGGGGG CTTTCGTGTG CTCAGGCACG GATACAAATC GTTGCCGATT 8400
TGCAGGCGCA TGGGCTCCTG TCCCGTGAGG AGCGCATAGT GCATTCGGTG GGAGTGTGTT 8460
ATCGCTGCGA AGCAGTTATT GAGCCGTATC TTTCTCTGCA GTGGTTTGTC AAAATGAAAC 8520
CACTTGCTTC TCAGGCCCTG GCTGCGTGGA AGCGTGCGGA CGTGCAGTTC CATCCTAAGA 8580
AATGGGAAAA TACCTATGTG CGGTGGCTTG AGCACATTCG CGACTGGTGT ATTTCGCGCC 8640
AGCTGTGGTG GGGACATCGC ATCCCGGTGT GGTATTGCGC ACAGTGTGCA CAGCAAACGG 8700
TGAGTCGGGT GGATGTGCAG CGCTGTGCTC ATTGCGGCAG TGCGGATATA ACGCAGGATC 8760
CTGACGTGTT AGATACGTGG TTTTCCAGTT GGCTGTGGCC TTTTTCTACT CTTGGGTGGC 8820
CTCAGGAAAC GCAGAArctG CGCGCGTTTT ACCCCACGTC TGCGGTCATT ACCGCGTATG 8880
ACATTATTTT CTTTTGGGTG GCGCGCATGA TAATGGCGGG GCTGGAGTTT ACGCAAACGG 8940
TTCCTTTTCG AGATGTGTAC CTGcACGGTT TAGTGCGTGA CAAGCAGGGA AGAAAGATGA 9000
GCAAATCACT CAACAACGGG GTGGACCCGC TGCACATTAT TCGCACGTAC GGTGCCGAtG 9060
cAtGCGTTTT ACGCTTGCCt TTATGTGTGC gCAGGGGCAG GACGTGTTGA TAGAAATGGA 9120
TTCGTTCAAG ATGGGTTCGC GGTTTGCGAA TAAGGTGTGG AATGCTTCTC GTTATATTTT 9180
GGGCAATCTC GAAGGCAGGC GGGTGTACGC TATTGCGCAC GTGTCTCTAA CTGAACTGGA 9240
TCGCTGGATC TTTCACACAT TTAATGAAAC TGTGCAGCAG GTGCGTACAG CACTTGAAGC 9300
GTACCGTTTT AATGATGCGG CACAGGCAGT GTATGAGTTC TTTTGGAACA GCTTTTGTGA 9360
TTGGTATGTA GAGGCAAGTA AATGCTCGTT TCAGAAACCT GATGAACAGG AGAAGGATCG 9420
CGCAgCTTcA GTGCTCTGTA CCCTTCTGGA AGAGACGCTG CGACTGCTCC ATCCTTTTTT 9480
GCCGTTTGTA ACAGAAGAGA TTTACCGGTC cTGTCGCCTT CTGTGCACGA TACCACCCAA 9540
GCAATTCCGT CTGGGGCGCA CGCGTTGCTC ATGTGCGCGC CATATCCGGT GTATGTGCCG 9600
TCGCGGGTAG ATGCGCGCGC GTGTGCGCAT ATAGGTGCGG TGCAGGAAAT AGTGCGTGCG 9660
GTGCGnTACT GCGCGCTGCG TGTGGTATTG ATCCGCAAAA AGCTGTTTCA GTCAGAcTGC 9720
GTCCGAGTTC TCCGGCGCAG GATGCGAACG CCGCAGCGCA GGTGTCCTGT GTGCACGATC 9780
CGGGAGCGGT GGCGCGCACA TATGAGGAAT TGATTTGTGT GTTAGCGGGT ATTTCCTCGC 9840
TTGTGTATCT TGAAAGCGAT GCGCCTAAAC CGCAGtTGCC GTTGCAACAG CGGGGACAGG 9900
GTTTGAGCTG TTCTTAGTAA CGACGGAAGG AATTGACCGG ACGATGCTGT GCGCGCGTCT 9960
TCAAAAAGCG TGGCAGAAGG CGCGGCAAAA AGTGCAGCAG GTGGAGCGTA AgcTTGCAGA 10020
CGCGCAgTTT TGCACGCACG CTCCTGAAGA AGTGGTGaCC GCAGAGCGCA AGAAACTGGC 10080
AGAGGCGCGC GCAACGTGCC ACACCCTTGC AGGATATCTT GCGGACATGA ATGGAAAGCC 10140
TGGACCGCTC TCTGACTCCG ATTAGGGTCC TGTGCCCCTG AGCAATCCGT TTAGCAGCAC 10200
GAACAGCCCA TA ACCGCGC ACAGGAGCAC ACCGGCAGGg CGGGTGAGGG TCCTGCGGCC 10260
GAGTGCGCAC GCGTGAAAGA TTCCCACGAC TAACAGCATG GCAGGCAGGT GTAGCAGAGA 10320
AAAGATTTTT GGCACCGGCA GGCCGTGTGG CGTAAGAGAC GCGGCAGCTC CGACTACAAA 10380
CAGCACATTG AGGATATCCG CACCTACTAT GTTTCCCACT GCCAgTGCGC CGTGTCCGCG 10440
GCGTACTGCG GTGATGGCAG AGACGAGTTC TGGCACGCTG GTGCCAAAGG CGATGATGGT 10500
TGCGGCTATG ATGCCTGCAG GTACTCCTGC GCGGAGCcmA TGATTTCTAC CGTGGGGATG 10560
AGGACGCGCG AACCGAGGAC GAGGAACCCG ATTCCCCCTC CTAATTGCAG GAGCAGGCGG 10620
CATACAcTGC GCGTATCAGT CTTGTCTGCG GGGAGCGCTG CTGTGCGGGT GTCTGGAGCG 10680
TGTGGGAGGG CCGACCAGCG CAGAGAAACC CACAGGTACA GCGCGAGCAG ACTGAGAAAC 10740
AGCCAGCCGA CGTACTGATG CACCCGCGCG CCAAAGCGTG GCAGGGTTAC CCATCCGAGC 10800
GCGCATACGA CGAACAATTG CACCCGCGCG TGCCGGCGCA TCAAGTGTGT GTCGAGCGCG 10860
AGCCCGGGGC GTGCAAGGAG TGCCCCGAGT CCGAGAATGA AACCGGTATC CACCACGATG 10920
GATCCTATGG CGTTTCCGAG TGCTAAGTCG GCGTTGCCsC AGAGCGCAGC GwATACAGAC 10980
ACGGCTGCCt CGGGGGTGGT GGTGCCCAGG CTCACGAGCG TGGCGCCCAG GAGCGCTTCG 11040
CTGATCCCCC AACGCCGGGA AAGCGCgCTG GCGCTCTCTA CCAAGCAGTC TGCGCTGCGG 11100
GCCAGAAAGT AGAGCGCACA GAGCAAGACG CCGAGTAAGG TGGGGAGTGT GCGCGcCGCA 11160
AGTGCGCTGC GTACAAACGA TTCCATAtGC GTACTGGAAC GGTATCACAC TGGGGGGAGA 11220
ATGGACAGCa GGGAGGAAAA CGCTCATAAT GACCGCACGT GAAGTGGTCG CTCGTTCTTT 11280
CAGGTGGTGG TGCGCGGGGA ATTGCCCACA TTGGGGTGCT CAAGGCGCTT GAAGCGCTAC 11340
AGGTTCCGCC GCCGCAATGT GTCGTAGGAT GTTCTATGGG TGmGsTGGTG GGGGCGCTCT 11400
ATGCGCTGGG GATGTCGGTG CGGGAGATGG AGGCGTTTTT TCAGCGTGAT TTTGTTATTT 11460
CAGACTATGT GAATGCACGG GATCCCTCTG CGTGCGTTGA GGCGGGGAGT CnATnnGCCA 11520
GCAAAAGGCC AGGAACCGTA AAAAGGTCGC GTTGCTGGCG TTTTTCCATA GTCnGGCCCC 11580
CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACC 11628
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15518 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
ATCGTGGAGG CAGTGGATAA AATAATTCAG CCAGTTTTTT TGTCTGCACT TACCACCTTC 60
GTTGGTTTTG TATCTTTTTG TTTTACCTCT GTTGTGCCTA TTTTTGAGTT CGGCGTGTTC 120
GCAAGCGTGG GCGTGGCGTC TGCGTTTGCA TGGCGCTCAT GCTTATCCCC TCGCTCCTCA 180
TTATCCGTGG GCCTGAATCG CGTGTGTGTG CGCATGCTCC CGATGCCGGT CATGAACACA 240
TGGATACGGC GATCACCGGT ACGCTGATGG TAATCGCCCA TCACTATCGG ACGGTGTTGT 300
TTGTTGCATT CCTTGCTGTT GTATTTTCCC TGGTGGGGAT GTCACGTTTG GTAATTGACA 360
ACGTGCTTAC GGAATACTTT GAGCCGGAGn TAACAGTGGT GCaGTCTGAT CGCTTTATGC 420
AGCAGCAcTT CGGTGGTTCT CGATCGCTCA CCGTATTAGT GAGTACCCcT GCGCGGGATG 480
GCAGTGTAGC ACGTCCGGAT GTACTGAAGG CTATGGATGA TCTGACTGAG TTTTTACAAA 540
CGCGGGTGGA GCATGTGGGA AAGGTTATTT CTCTCGTCCC GCTTATCAAG CGCATTAACC 600
AAGTGTACAA CGCAGACgCG TCGGCGCGAG GCCTGGAGGC GCAGTCTGCA GATGTGGTGC 660
GCGGTGGTAC GGATGACTTT GGTGTTTTTA AAACATTCAC GGGCGGACAT GAGGAACCTG 720
CGCGGGCGGA GACGTCACGT ACTTCCTTGG CGGCGCCGGG GTCATCGTAT GATTTTCGTC 780
AAGCAGTCGG TATGCTGGTA AGTGCCGTGC GGGATTCTGA TTTTGATCGT TCAGATGCGC 840
AgCAGCTcGT GCAGGCTCTT GAGAAGGCGG TGAACTACGA TGGGCGCGCG TATTATGAGA 900
TACCGTGTGA TCCTAAGAAA TATGGGGTGA AAACGAGCGA GGAATTGCAG GAAATTATCA 960
GTGGGTATTT GTTACTGCTT tCAGGAAAAG GGTTGGGTCT GGTGGATCGT GCCGTAGACC 1020
CCCGTGCGTT AAAGATGAAC ATCCAGCTCG GAACTAAGGG TCAGCAAGAC TCATACGGTG 1080
TCATTGAGGC AGTAAAAAAG TTTATCCGGG AAAATTTTCC TCAAGACGTG CACGCTGAGT 1140
TTGGCGGCTC AGTATTGGTT GAGCAATCCT TGAATGATCT GGTGGTACAA TCTCAGCTGA 1200
TTTCACTGGT TTTtTCTTTG TGTGTAGTTT TTATCATCAT CGCAGTACAT TACCGCTCGC 1260
TGTTTGCTGG TA AATCGGT ACCCTTCCTT TAGGAGTATC TGTGTTGGTG AACTTTGGGG 1320
TTATGGGATT TTTtGGCATT AAGCTGAACA TTTGCACCAC GATGGTGgCA GGCTTTTCAA 1380
GCGGTATTGG GGTCGACTAT ACGATACACT ATCTGGCGGC GTATCGGCGC GCGTGGAAGG 1440
AGTGTGGTGG AAAAGATTTT CTGACACAAA CATTCTATGG TTCAGGGCGG GCAATTCTTT 1500
TTAATGTTCT GTCTGTAGGA TCGGGATTTG CAGTGCTGAT GCTTTCAAAG TTCAATGTTC 1560
TTGCTGATTT TGGTTTGCTT ATGGTGTTGG CTATGCTTAC AAGTTCAGTG GCGAGTCTCA 1620
CGCTCCTTCC TACCTTACTG AATGTGGTCA AACCAAGGTT CATCACACGA TAGAACCAAA 1680
GGGAGGTATG CATGAAACGG ATAGCATATG TGCCGTTGTG CGCGGTAGTT GGTGGCATGT 1740
GTTCGATGTG GGCACAGAGT GCAACAGATG TGATGGGTAG CTTTAAGAAA ACGGCGGAAA 1800
CAGGCACAAT GGGTACGCAA GCCCGCATGG TTGTCCGGAA GGCGGGTAAG ACGGTGAGTA 1860
CCTTAGTACT TAAACAGTAT ACCCGGTATG AAAAGAGTGG AGAGCAAAAG ACTCTTATAG 1920
AGTTTTTGTC TCCGTTGAGC GTGAGGGGAA CACGCTTCTT ATCCCTGCAG AAAAAGGACG 1980
GGGCGTGGGA GCAGTACCTC TATTTGCCCA AACTCGCACG CGTCAGGAGC ATTACAGGGG 2040
GGGATGCCCA CGCTTCGTTT ATGGGGACGG ATTTTTCGTA TCACGATCTT TCGCTTGTTG 2100
GTGGGGTTGC TGATCTTGAT GAATGTACGC TCGACGGTAC GGAGTCGTAC GGGGGAAAGA 2160
TGTGCGTGCG CATTCAGACA CTGTCACACA AGCCCCAGGC GCGGTACGTC AGGGCGTTGC 2220
TGTGGATAGA GCAGGAAACA GGTCGTTTTG TGAAAGGGGA ATTTTTCGAT AAAAAAGACA 2280
AGCGCGTGAA GATCATGACG CTTTCTGATT ACGAGACTAT CCAGGGTGTA GATACACCAA 2340
AGACGGTTGT GCTCGAGACG ATCGCCCAAC GCATACTACA ACCATTCACC TCACGAAGgT 2400
TGAGTATCAC ATGGACATCC CTGAGAAGGT GTTTACCCCT GAGTATCTAA CCCAAACCGA 2460
TCGGTGAGTG TTGTGGCTTT TAGCGTGTTG TTTCTTCGTG CGGTATGGCT CGGTGGCTGG 2520
CGGTCGAGCC CTTCTTGAGC GTCTCGAGCG TCGAGGCGCC ATCCTGCCGT ATTGCGTTTA 2580
GGAAGTCGAT GACGCTTGGG TAGCGCTGAA AAAACTCACC CCACTTTGTG AGGCGTTGCA 2640
TTTCAAAGTT CTCAAGTgCg nTGCGCTTTG CCGCTTGtGC GCGAGTGCAC TGAGGTGTTC 2700
GGAACCcTTC GCTTCGAACG CGCGGTAGGC cTGCTCTGCA GTGTGGTACA GGACCATGTC 2760
TGGTATGTGT ACCTCGGAAA GCACCACCTC TGAAATTGCG CAGTGGGGAA TTTCTTTTTC 2820
GATTGCGCGT TTCAGTTCTC GTGTGGCAAG CCCGTACTGC GTGTGAACGC GCTCGTAGAG 2880
GGCTGGGTCG GCCATGCAGC GCGCGATGAA ATCGTGCGAC ACCTGGGTGA GTGCCGCTTG 2940
TGCTGTAGTG TCCACGTATC GCTCGAGCGA GTTTTGGTCA GTGATCCTTT CTCGCTGCAC 3000
GGTATCCAAC AGGAAAGCTT CTTTGAGAGC AACGCGCGCC GAGAGCGCTA AAGTCCAGTC 3060
AAAGGGTGCG TGTTGATCAA GCcAnTGCGC GTACTCCTTT GCCGCGGGCA GGACGTCCTG 3120
AACGGTGACG GAGACCGTCT GCTTCTTCAA CTCAAAGGCA AAGAGTCCGC ATTGGACGGG 3180
AGCAGCGGCT CCCAGCGCCA GAGAAATCGC CCCGGAGCGA TGAGCGCGTG GTGGTAACCA 3240
CCTGAGCGTG ATCGCATAAC CCCATAACTT CCCGCCGGCA GGGAGAGCTG AAGCCATCCC 3300
CTCCACAGAA GGTACGTACT CGCGCCCAGG GCACAGACAA GCGAAAAACA AAACGTCCGC 3360
ACACGCATGA GTAGGGGAGC CTAGCAGTTT GTGTATCCCA ACGCAAGAGT ATTTGGGCGC 3420
GCAgTATATG GTGGTATCGT GCAGTCCTTT TGCACTGTCC ATTGCTGAGA CATACCGTGG 3480
TAATTGAATG GGCAGCGCCc TCTATGGTAG GGTCCGCCCC TATGGGTCGA GACTCGCCGG 3540
CGGGTATGCG CGAGGCCGTT TACTTTCTGC ACCGGATGGT GGTGTGTCTG GGCGTGCTGC 3600
TGTGTGCAGC GTCGCTACTT TATGTGTTTG GGAACTTTTC TCACTTTCTT GATAAAAGCC 3660
AGTTTATTAT TTTACGTTCA TGTGTCGGCT GTTCAGTACT GTTAGTGGTT GCCTGTTTGT 3720
GTGCGGGCAG TTTTGAGCTC TACTTTTTTT TGACGCGTAg TGACGCCCCG TATGGGCGGC 3780
TGCTGTGTAT CACCGTCGTG GCACTGCTTT TTGGTATGGG TGCACTTGTT TTCAATACGG 3840
TAGTGCTCAT CGTGGCTAAA GGCACATGAG AGATTTACGA GGCACATCTG CATCTTTTAC 3900
TTCACACGTT GTAGTTTACA GCTGGAACTG AGAGCCGAGG TACACTTGGC GTACGTGAGA 3960
GGATCGTACT ACCTCTTGTG GACTGCCCTG GGCGATAATA TGGCCGCAGT GTATGATATA 4020
GGCTCGGTCA GTGATTTGTA GTGTTTCACG TACGTTGTGG TCCGTGATGA GTATGCCAAC 4080
GCCTGAGTGC GCAAGACGCA CGATAATGCG CTTGATATCC TGCACGGCGC AAGGGTCGAT 4140
CCCGGAGAAG GGTTCATCAA AAATTAAGAA GCGTGGATTT ACCGTTAATG CACGCGCAAT 4200
TTCCACGCGT TTGCGCTCGC CACCTGAAAG CGTGTCTGCC CTTTGATTTC GCACATGGGT 4260
CAGCTGAAAT GCTTTGAGCA GCGCTTCGCA TCGCTCGGTT TGTTCTGTGT AACTCAGATC 4320
GCGGCGCATT TGCATGATTG CGCGCACGTT TGCTTCTACC GTTATTTTTC TAAAAATAGA 4380
CGGTTCTTGC GGTACGTAGG ACACGCCCAT GCGCGCGCGC ACATGTATGG GTAGCGGCGT 4440
TATGTCTGTG CAGTCTAGCA GGACGCGCCC GCTATCTGGA CGGCACAGAC CCATTACCAT 4500
ACTGAACGAT ACTGATTTGC CTGCTCCATT GGGACCGAAC AGCCCAACTA TCTCTGCTTG 4560
GTGTACAGAA AAGGAGACGT CATGCwmCAC GTGCCGTGTT CTAAATGTTT TATTGAGTGC 4620
AGCGGCCACG AGGCGCTTTT CTCCCGGTGG AGATTCATTC CAGGAAATAT TTTTTTCTTT 4680
CTCTGTCTCT GCCTTTAAAG TATATGTGAC AwygTGCTTT GCACGGAGTC TGCGCGCGGC 4740
AGACAGGAAG CGTTGAAATG TCACCGGTGT TGTTCCCATT CGTAGGGTTT GAGAATTGAG 4800
CCTTGCACTT TTCCGTGTAG CTGAATTTCT CTACGGGCCA TATTCAACGT GATACGTTCT 4860
GCCTGAAAAA GATTACCGCG GTCGCTAACC CGTGGCGCAC CGCTTAATTC AAGGAATATA 4920
CTTTTGCGGT AATACATACC GAACATGGCC TGACATTGCA GGTCTTGGTA GGTAAGGGAC 4980
ACGTTCATTT GTAAGAGCAA TGTTTCGTTT TTTTCGTGGT AGACTATGCG CTCTGCACGT 5040
GCCATTACGT TTTCTCTTTT GTCAGAAATC TGCGCAGCAC CGAGTAGTTC GGTAGTATGC 5100
GCATTTCTGC TAAAATGCAG AGATTGTGCA GAGAACGTAA GGTGCTGTGT TTTTTTTTCT 5160
CCGCGCACGT TGCCGGTGGC GGTAACAAGA TGATAGTCGT CGCCGGAAAT TTCAATCTTA 5220
TCTGCGTGGA TTTCTAAATC GGCAAAGTAC ATGcGCgCGT TTCCGCGGAG TACAGTGCGT 5280
GGGGCGTGGG GGctCGGCAG AGCCTTCTAG ACTGTCTGCA AAAAAGCGCA CCTTACCTCG 5340
CGCGTGTGCG yGTTGCCACA CGCACAGGGA CATGAGAAAA AGTGCAAGGA GTACAGTGCG 5400
CAAGAAGGGT ACGGGTGAAA AAGCAGGGTA TGAGGACGCG CGAGTCATGG ACTGAGATTC 5460
GGTGCGGCGT ACCGTGACGC GCGGGTTTTG GCGAAAGAGT CAAGATCGAC GTCTATGTGA 5520
ACGCCATTTT GAAACACAAA GCGTTTGGTG CGTGCATTGA CGCATAAACC AACCCCGGTG 5580
ACAACAGCCC CTGATGCATC GCTGATGCGA ACAGGGGAGA AGTGATCGCT TGTGAACAAT 5640
GCTCGCGTAT TTTCCCATCG AAATGCGCGT CCTTCCAAAC GCAGACCCTC ATGGGGAAAG 5700
AAGCAGTCCA CCTTTTTCCC GAGGTAGAAA ACAGTACCTT CGCAGTCAGT GAGCAAAACA 5760
CCTGCATGCC CGCGCACGCT CGCACGCCCA TTGCCGTCGT AGCGTGTAAA GTGGATGTCC 5820
CGTGCAGTCC AGGTTTGGTC GTGGTGATAG AACTCAAGCG TTTGAGCATG CAGGCGCGTT 5880
TCCAACAGGT GAGTGTTGTA GCGATCGAAC GTCACTTGGA AAAACGCGAT GGTAGGTGTG 5940
TTGCGCGTGG CGGAGGGTTG CTGTATCCAG CGTAAGGCAC TGCTGTCGGC GCAgCAAACC 6000
AGAAGGAGGC AGGAGACAGG AGCAAAGAGG AGTCGAATGC GCACTACCGC TCCTTTACGC 6060
TACAACGGGC GTAgCcGGAC AGATGCCCGG GCAGAAAAAG AAGGACAGAC AACGCGTCCC 6120
AAGTAGGAAA GAAGGACGCA CCTTAAAAGC AACAACCCCG CCCCCACAGC CGCGAAGCAC 6180
AACAAACCTC GCAAGAAGGG CGAACGAACT TGAAAGAGAT CGCGCGCCAT TGTACGTTCT 6240
CCCCTATGAA GATTAAAGAG AAAAAAGGCT ATTTCATCTC TTTTTCCGCT CTATTTTTGA 6300
TTGCCTATAT GTTCGTAGCA GCCGTCCCCC TCGGGGCTGA CCCTTACTTT TTGCCTATTT 6360
GGGCACGTGA CCTTGCATCT GAATTGCACG AAkAGCGTCC TGAGCGCGCG GTGCsTGAwA 6420
CGCTGaCACA GTGCAGACCC TCCAACCTTT CATGGTGGGG GAGTACTTTG GCTATTTTAC 6480
CGATGAGGGG TCGGTTGTGT TTGCCACGCG GGTTACCCAG CGCCTTTCTG CTTCTACACA 6540
CGCATGGGCG GTGTATCCTG AGCATGCAGT GCGCACGCCT GTTTTTAACC CTGCTGGGGA 6600
ACACCTTGCA GAAATTGCTG AGCCAGGCTT TGTGCATATT GAAgCGGATC GCTTTTTTCT 6660
CTTTTCCCCA GGGGGAAATG CTGTTTCCtC CTATGACGCG CGCGGTGTAC AACGGTGGnC 6720
GTGTGTTGCA CACGGCGCCT ATAACCGCnT TTCACTCTTC TGCTGCAGGC GCGGTTATCG 6780
GGTTTTCTGA TGGGAAGGTG ATGGTTGTAm CtGCCGACGG CACCGTCAGA TGTGCATTCT 6840
ATCCGGGCGG GAGCACATAT GAAATTGTGT TTGGGGTGAC TCTCTCTGCA GATGGCACAC 6900
TTGCTGCGTG CGTGTGTGGT TTGGACAGGC AGCGCGTTAT CCTGGTGTCT CTTGCGGATG 6960
TGCAGTGCAA GATTGTTCAC CATCAATATT TGGAGGGCGC GTTACGTCAC CAGCTTTTGA 7020
TGAATTTTGA TACCGAAGGG CGCTATGTGG TATTTGAACA TGCACAAGGG GTAGGGGTGA 7080
TTGAtTGCCa AAGGTTAGAG ACAAACATTA TCCCCCTGGT TGGGGATGTT GTTGGTATGG 7140
GCGTGCAGCC TGAGTGCGAT GTTGTGACGG TGTTAAGCCA GAAGGAGCAG CGGTGTCGGT 7200
TTGCTGTTTT TGAGCGCGCG GTGCATAGGG TGGGGGATGT GCGGTTTGAC GCACAGGATG 7260
TGTCATTGAC TCAGGGTGAA AAAAAATTCT TTCTAAGTAT CGATATGCTT CTTGCACGCA 7320
TTGACATTGC AGGGATCCCG TAAGGGGTAG GGACAGAGGG GGTGTGCCGG CGTGCGTACG 7380
ATTTTTGTGG GTGTACTGTT GCTCGCGATT ATGGGAGAAG GGCGCTTGTG TGCGTTGGAA 7440
TGGCCTGTTG ATAAACCTAA GTTTTTGTCT CTTTTTGGAC AGAGTGTGGG CGCAGGTCTG 7500
TTACAGCAGG GATTGATTTT TGATGGAGCA GACTCTGCCA GGAGAGCGTG GATACGCGGT 7560
ACGTACTGCG GGGTaCGGAC GyTGCGTGAT GCGACTTCAA AAACATCGCC GTGCGCGTGT 7620
CTTTCCCGGC GCGTTAGGAA ATGCGTTGAT TTTTGCGCAT GAAGACGGGT TACAGACGGT 7680
ATATGCAAAT TTAnCGAAGC GAAAAACGCG CAGGATTTTG GTTCTACCGC GGAAGCAGAA 7740
TCCGGGGTAA CGGTCGGATA CGCAGGATCA AGTGCGTGGG CACCTCCAAA CAGTTTTGTG 7800
TTCCAGGTGA TTGATACAAA AAACAAAGTG TATCTCAATC CCTTGCTCCT GACTGnCTTC 7860
GGTGTCGGAC ACCATAAAGC CCACCATTCA GGATGTGGTA TTGGCGGGAA AGACAGGGGT 7920
GTTGGCTCTT TCGGGGACAG CAGCGCCGCG CGATGCCGAC GGGTATGTCT ATACACGCAA 7980
GCGCACCCGT GTGCACAGGC GCGTTACGCA GGGaACCTAT CGTCTGTATG CGGCAGTCGC 8040
AGATGTGTTA GAGCATGGTA CCCAGACGTT CACTCCGTTC CAAGTGCATG TTGTGGTGAA 8100
CGGATCGGAA GTGAGCGCGG TGTCCTTCGA GTTGATTGTG GCGAAAGATT CGCAGGCGTG 8160
TCTGTCAGGG TCGCTTTTAA ATGAACGCCT GTTATATGAG ATGAAGGGTC GCGTGTTTTT 8220
GGGGAGCGTA GTGCTCACGC GTGGTACTGC AGAGCTTGCG ATTAGCGCGC GTGATATTTC 8280
AGGCAATGAA CGAACGGAAG TGTTCTTTTT ACAGGTGGAG TAAGGCGTTC GTAGTTTTTT 8340
CATTTGTACA CCGGGGTGTG TGAGGGGGTG TGGTGTGGAT AGGACGGGTG GATACGTGCG 8400
GCTTGCGCTT GCAGCCCcTG CGGTGCGTGT TGCGGACTGT GCATACAATA CCCAGCGTAT 8460
GATTCAGACG GTGCGTCGTG CAGCTTCATG CGGTGTGGAC ATACTATTGT TTCCCCGTCT 8520
TTCGCTTACA GGGTGTAGCT GTGCGTCTCT TTTTGCTCAG GATACGCTGC TTTCGGCAGT 8580
CTGCACGCAC GTATCTGCAC TGTGTgcTGG CACTGCTGAT TGTCAGCTGT TAGCGCTTGT 8640
GAGTGTGCCC TGTTTTTTGC GCACTCAGGT GCcGTGTGTA CTGCGCTTGT CGCACGAGGT 8700
CGTGTTCTAG CACTGGTTGT GCAGGATACC CTGGCGGCGT GTGGCGCGCA AAAAATGCAA 8760
GTGCCCTGTG AGGTCCTGTA CGGTGGTGCA CCGGTGCCGG TGTACGATGT GCAGACGTGT 8820
TTTGAAAGTG CAGAGGGTCT TTTCTCTTTT TGTGTTGGTG CTATGGATGG ATCGGTACCT 8880
GCCACGCTGG TGTTGCAGGC CTACGGTACG CCAAGTACGG CGCAGACACC GGATATTTTT 8940
GCTGCGCACG CTGCGGCATA CAgTGCACAG CACCAATGTG CGTATGCGTA CGTAAATGCG 9000
GGGTGGGGGG AGTCTAGTGC TGATGCGGTG TATGGCGCGG AAAGTGGTAT TTTTGAGTGT 9060
GGGCAGTGTG TGGTCCAAGA CTCATTGCAG GAGATGCGAG AACGGGGGGA GCGTCCGGCG 9120
CACGCGGTGC tGGaCTGCAT GTTAGTGCGG ACGTAGATGT GTCTTTGGTA CACTTTCGTC 9180
GTCGTGCGCG TAGcgGACcA TACCACTCTG GGTGCATCGG CTCCCTGCGT CACGCTTCCT 9240
GCaGGCATAT TTGCAGCGTC AAAGGCGCAC GCCACGCTGC GGCGTCCTCG CGTACCCTGT 9300
CCTTTTTTTC CGCCTGCTTT TCAAAAATCG CAGGATGCGg TGCCCCCGCT CACGGGTGCC 9360
GTGTGCCTCG CTGTTTCTGC ACCGTCAGAC ACGCAGGACG GTTTTTTGCA AAGAACGATA 9420
GACTTAGCCG CGCAGGgCGT GGCACTCCGT CTTGAACACA TGGGCTGTAG GCGCCTGGTG 9480
GTGGGTGTTT CAGGAGGTGT TGATTCGGCG TGTGCATTGC TAATATGCGC GCGCGCGTTA 9540
GATTTTCTCT CGATTGCGCG TACACAACTT TATGCGCTAA CGCTTCCTGG CTTTGGTACT 9600
ACGTCAGGAA CGAAAGGTGC GGCGCAGGAG TTTGCGCGTG CGCTCGGTTG CACTGTGCAA 9660
GAAATTTCTA TTAGCGCGGC AGTGACGCAT CATCTCCATG ATATTGGGCA TACGATGCAG 9720
CAGTGTGACG GTAC ATGAG AATGCACAGG CGCGCGAACG GACGCAGATT TTGTTAGATC 9780
GTGCTAACCA GCTTGATGCG CTCATGATTG GTACGGGAGA TGCGTCAGAA GGTGCGCTTG 9840
GTTGGGAAAC CTTTGGGGGC GATCACCTTT CGCTGTACGC AgTGAACGCA TCTTTGCCCA 9900
AAACCGTGGT GCGAGCCTTG ATTTCCTATG CTGGGCGTGT ACCTGAGCGT TTTGTGTGTG 9960
AAACTGATTC TCCCTATGCA CCGCGCGGTG CTGCCTTTTC TCGCGTTTGT GCAGCTATAG 10020
TTGCACAGCC GGTGAGTCCT GAGCTCATAC CTCCTTGTGA TGATCGTATT GTGCAGTGTA 10080
CCGAGGAGAT GCTCGGTCCT TATGAATTGC ATGATTTTTT TCTGTATCAC A AACGGTGA 10140
ACGGTTTTGG TCCTCGAAAA CTTTTTCGTG TGGCCGCGCA TGCgTTTGGA ACTGCGTATT 10200
CTTGCGCGCA gcTATGTGCa GCgcTGCGCG TTTTTTTTAC CCGCTTGTTT TCACAGCAGT 10260
TCAAGCGTTC TTGTGTGCCT GATGGGCCCG GTCTTACGGA AGTGAACCTT TCCCCTCGTG 10320
TGGGTTTTTA TTTTCCCAGC GACACTTCCG GTGCGCTATG GCGCGCAGAG CTTGAGCAGC 10380
TGGcTTGTGG GGAATAGACT GGCACGCAGG ATTTTTAACA ACTGaTATGG AGGTGCGTAG 10440
GgCGTGGTGC ATACGCTTTT TTCTTGGGTT TCAGCGCATA TTCACTCGTT ACCTATGGTT 10500
GTGTTTGTCA GCCTGCTCTT GGCAGGAGTG CATGTGCCGG TTTCTGAAGA TGCGCTGATT 10560
GTCATGAGTG CATTAGTATG TCGACAGGAT GGAGCATCTG TGCCGAGCTT TCTAGGAGCG 10620
TTGTATGCAG GTGCATTAAT AAGTGATTAT GCGGTGTATT TTTGGGGATA CCTGTTGCAA 10680
CAGGGTGCGT TGCGTGTGGC TGCTCTTGAG CGGACGCTCG CGTCCTGCCG CGCACAAAAG 10740
ATAGTCACAC TTCTTTCGCG TTATGGCCTT TGGGTATATG TGCTTGCGCG TTTTGTCCCA 10800
TTTGGGGTTC GTAATGTGGT TTCGCTGACG TCGGGGTTTG TGCGTGTGCC GTTTGTGCGT 10860
TTTGCGTGCT ACGACGCACT CGCAGCGGCC TGTAGTATTT CTGTGCTCTT TTGGATGACC 10920
TATTTCCTTG GCTCTGTACA GCGTATTTCA CTCAAGGTTT TTGCGGTGGT GATTTTGCCT 10980
TTGTCGGTGC TGGGTATACG GGTGTTGATT GGCCGCCGGC AGAAAACCAC AGGGGATGGA 11040
GTGAGAATTA CACACGATGA CGTACAAACT AATGTAGGAG TGAGGTGATG AGCACGTGTG 11100
CGCAGGCTTT TTATCGCTTG TATGAAATAA TTGTGCGGTT GCGTGCGCCG GACGGGTGTG 11160
CGTGGGATTt GGCACAAACG CCGGTAAGTA TGTGTTCGTC CTTTTTGGAG GAGACGTATG 11220
AAGCGCTTGA GGCTATCCTC GAAgAgGrCG AnGGCACAGC ATTCGTCGTA TGCTCACGTT 11280
CAGGAGGAGT TGGGGGACGT GCTGATGAAT GTGTGTATGA TTGCATACAT GTATGAACAG 11340
CGAGGGGTGT TCTCGCTTGC AGATGTTGTA ACTGCATTAA CGGAAAAGTT AATTCGACGT 11400
CACCCCCACG TATTTGGGCA AACAGAAGGA TTTCCTGGAC CGGAAAATCC GAAGCGAGCA 11460
CAAACAGCAC AGGAGGTGTT TGATCAGTGG GAACGGATTA AAACACAGGT GGAGCGTCGC 11520
CGTGCAGCTT CTCCGTTAGA GGGcATTCCT CGAACGGTTC CTCCCCTCAT GcGCGCGTCC 11580
AAAATGCAAA AAAACGCGTC GCTGnCGCGT CTTTTTTGTC CAACACGCAC GGAGGTGGTA 11640
CGAGAATGTG CGCGTACCTT TCGTGCACTC CGTGCGATGT CAGAGAATTC TGCCGAACAA 11700
TCCGCCACTc AAGCAGCGCA TGTTGCAGTA GGTGCGCTGT TGACTGCAGT GATATCGTTT 11760
GCACATCTTG TGGGGGTAGA TCCGGTGCTC GCCCTTATCC GCGCAAATGC GGACTTCGTG 11820
CGCCGCTTTT CGTGTGCCTG TTCTAtACCT GcCATTTCTG GAGGTACTTC TGTATTTTTG 11880
TCTCGCGCGT GCCATAAACC ACGTCGCGCA CGCACGCGGG CGTCTGCGGT GCGCAGGCGC 11940
GCACGGTcAC GGcGACTGTT TTTTACTCGA CACAAGCTGG GGAATATGCT ACGGTAGGAC 12000
GCGTCCCTGT CTCCGTGTGT AAATTGTTAG CACGGGCAGG GTGCGTGTTG AAGAAGAGGG 12060
GGCTTATGAA GACGTTGCAG TGTGATATTT GTCGGAAGGA AGTGGACAAT TCGCTGCCCG 12120
AGAGGTTGTA TTGGACATTC CGGGAGTATG ATGTGTGTGA GGACTGTAAG GAGTCTATTG 12180
AGGACAAGTT GCGCCCTATC ATACGTACTC ACCAGCCTTA TTCTCAGGGT TGGTACGAGA 12240
ATCAGTTCAT GGGTATGGTG CAGCGCGGGG TGTCTAACCG TCGTCCGTAA GTTTTTGATG 12300
TCAGTGTTTC GTGCTTGATG TGTGArGTAG GGACGTAnGG GTGTGATCCT TTTTTCTCGC 12360
GCGAGGTTGT GGGCGAGGGA TGGTGTCGCT CGCGCTTATG TTTCTTTCCT TGGGCCkCGG 12420
CGCTGTGTTT TTTGTGCgTC CCgGTGTAcT GGGAcGGTTC CTCTGTGCTG TTCGTGTGTG 12480
cAGGATCGGT TGTACGCGCG CGCACATGAC TTTTTGGAAC ACCCTGAGGA TTTCTGTAGT 12540
CGCTGTGCCA AGCCGCTTGT TTCGGCGCGA GCGTTGTGCG TCTCTTGCCG TGCGCTTCGA 12600
GAATCGGGTG AAACGCCTGC GCTTTGGCGT GTCTTTTCAC TTTTGCCgTA CCTGGGTGTG 12660
GGGCGTCCTC TTATGTCGTT GTGGAAGACA CAGCAGGAGC GGAATTTTGA TGCTCTTTTT 12720
TCCCGCATTG CCGGGTGTTT TTTGCGTACA GCGCGTGAkC GCTccTTCGT CACTGCAAGT 12780
ACCGAGTTGG TGCCAGTGCC GCCgCGGCCA TGCAAGATGG CTGAGAGAGG ATGGGACCAG 12840
GTTGAGGACG TGTCGCGTCG ACTAGAATTG GCTGGTTTTA CCGTTAATCG TGCGTTGGTG 12900
CGAGTAGAGG GTCGTTTCGC GCAGAAAACA TTGTCGCGCG CTGCGCTGnT TGAGAATCTT 12960
GCAGGGAGTA TAGAGCTCGG GGCGCACGCT CGTGTGCCGC GTGATGCCTT GATTATCGAT 13020
GACGTATTAC CACGTATGCC ACGATGGACG CGTGTGCGCT GTGCTGCGCT CCTCGGGCAG 13080
CGAGCGTGTG CAGGGTTTCT CGTTCTTTTT TGCGTGAGGC GTCAATTAGT TAGCGAACTT 13140
CTTTTAGAAA TTCCTGAAAA TGCCGCAGgA CGTACGGCCC AGCAATTTTT CTCTTATATG 13200
TTGTTTTTGC AGCATAGGTC TCTTAGGCAG CACTCCGTgC ACCGTGGTCT TCGTGCGGTG 13260
CAGGTGTAGC GTCCGTGGAG TAAAATCCAG TGATGTGCAT GCATGCGAAA CTCACGCGGG 13320
GTGACCACAA GCAAGTCGCG TTCAACCGCG CGTGGGGTTC TTCCAGAGGA AAGCCCGATG 13380
CGTGGCGCTG TCCGCAGATA TGCGTATCAA CTGCGATTGT GGGTATGCCA AAACCCATGT 13440
TCAGGACTAC GTTTGCCGTC TTGTGACCGA CCCCGGGTAG ACTCTCTAGG GCATGGGCGT 13500
CGCACGGTAC TTGGGCAGCG AAGCnTCGAT GAGTTCAGCA CTGAGTGCAA TGATTCGGCG 13560
TGCTTTCGTG GGGTATAAAT TAATCGTCCG TATGTAGGAG CATAGCCGTT CTTCCCCCAG 13620
CGCGAGCATT GCTTGGGGGG TGTCTGCCAC ATCAAACAGA GCAGCGGTCG CCTTGTTGAC 13680
GCTTTTGTCT GTTGCCTGCG CAGAAAGCAG TACTGCCACC AGGAGCGTAA AAGTATTGCG 13740
CCAGTGAAGT TCTCCTTGTG GTTGCGGGTT TGCTGCGTGC AgcTGCTCAA AAACGGCGTG 13800
TACCCCCTTG CTGTCTAATA GACGCATAAG GGTGCCAGTA AAGAGAGGTA GTTTAAAAAG 13860
TGCAAGAGGT CATGGGGTGG AAAGGAGGGA AATGAACGCA CAGGATTCAG AGAGTTTCCT 13920
GAAGTACGAA CTGCTGGACG CACTCAAGCA TATGCACCTC GTGGTTCAGT TTTCGGATAT 13980
TAAGCTTTTG CGGTACACTG ATAAGCAAGA CGAGCTTAGG AAAGCTTGTC TCCGACTTGG 14040
AATGTTGAAA ATTGGTTGAA ATGACGATGA TGGAATGCTT GCGAAGAAGT TCCATAACCT 14100
CGTTGACTTC AGGTTCATGA TGGGAGAACT GTATTTCTAG GCGCTTATTC TGGCGAGTGG 14160
GAAATACACA CTTTTCAATT TTATCCAAGC CTGCAAGGGT GATCAGTACG AGGAGCACGA 14220
GCATCGCCCC TGCGCTGTAC AGCCtGCGCC GATAACGAGA CCGATCCCTG CTGTCACCCA 14280
AATAGTAGTA GCAGTGGTTA AGCCTTTCAC GTTTGCACCC ATTTTTAAGA TGGCACCGCC 14340
GCCGAGAAAT CCCATGCCGG AAACCACCTG TGCAGCGATG CGCCCGGGGT CGCCGATGTG 14400
GTCTCCGGTA ATCTCACTTA CGCAGAGGGA CAGGAGCATA ACGCCCGTAG CACCGACACA 14460
GATGAGTGTG TGAGTGCGCA ATCCCGCTGC CTGTAACTTT GAGGAGCGCT CCAACCCGAT 14520
AGCAAGTCCT GaGACAAAGC TGaGCAAGAG CCGGaCAACA ATAACGGaAT CTGTAATCAT 14580
GACTTTTCTC TTAGGGcGTA gcAGGaTGCA AGTGCCTCGA GGGAGACTTG AACTCCCACG 14640
CCGGTGAAGG CACTAGCACC TGAAGCTAGC GTGTCTGCCA ATTCCACCAT CGAGGCAAGA 14700
AAACCCTTCC ATGGTGGGAA ATATAGTTTT TCTAGTCAAG GGATTAGAGC AGCTTTCAGG 14760
GCACGGGATG CAAAGGCGGC GTACTTGACA AAATGCCAAT TCCAATACAC GCTGcCCGcG 14820
GCgCTGCGCg TGGCGCCGTG GGCTATTAGC TCAGCTGGTA GAGCAACGCC CTTTTAAGGC 14880
GTGGGTCGAT GGTTCGAATC CATCATGGCT CAGAGGTGGG ATTGGTGCGC AACAAGGTGC 14940
GAGTTCTTGC GGTGGTCGCA GCGCTTGCGG CTGCGTGCGC GGTGGGCTTC TTTCTAGGAA 15000
GGTGGTTCGA CTTCTCTGCT AGGTCCTCGG TGCTCGAAGC AGCTGATTCC CTCTCCGTTT 15060
CTTCTTCGGA AGCGGCCAGC TTTTCCACGG TTGTTGCAGA GGGGGACCCG TACACCGTCG 15120
ACGAGCGGCA GAACATCGCC GTTTACCGCA GTGCCAACGA GGCCGTTGTC AACATTACCA 15180
CTGAGATGGT AGGGGTTAAT TGGTTCTTAG AGCCCGTGCC TCTCGAAGGT GGCTCTGGGT 15240
CTGGCGCTAT CATTGACGCC CGCGGGTACG TGCTCACCAA TACGCACGTC ATCGAGGGTG 15300
CGTCTAAAAT TTATCTCTCG CTACACGACG GCAGCCAGTA CAAGGCAACT GTCGTGGGTG 15360
TAGACAGGGA GAATGATCTT GCGGTGCTTA AGTTTGTTTC TCCTCCTGGA GCACGCTTGA 15420
CAGTTATCCG CTTCGGTTCT TCGCGCAACT TGGATGTCGG ACAAAAGGTG CTTGCCATCG 15480
GGAATCCCTT TGGACTAGCG CGTACTCTTG ACCGTCGG 15518
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6234 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
TTTGGAATTT TGTGTTGTCG TTCACGGTAA ATAATTTGTA GCGTTCCGTG CCCGTTTTGA 60
AACGGTCCGG GGCTGCGTCC ACCAGGCAAG GAGTGATATG AAAGCGACGC TTACCTTTGT 120
CTTTATGCTC CTTACGTCGC TGCTGCAGGG TCAGTCGCAA CACATCACGC GCTTTGCCGT 180
CATAGATGCG GCCCGCATTT ACTCAACCTT TTGGCGCGAT TCGCCGTTCC TGCGCrATtA 240
TGAATCTAAA AAAGCACGGC ACCAGGGTGA AATTCAGAAA ATGTCTGATG AGCTCGTAGA 300
nTCCGGGCAA AAAAAAGTTG ACGCGCAGAT GCAGCAAAAC ATCGCGTCAG TCCAAAAGTA 360
CGAGGTGCTC ATTGCGTCAA AAACCGCGCT CCTGTTGGAG TATTCTAAAA CGTCCAACGA 420
CGAGCTCACC GCGCTGCGCA AAACGCTCAT CGCAGATGAC GCATTCTATG CAAAACTCTA 480
CGCCGCTATT AGGCGAATTG CAGAAAGTGA AGGCTACAGC ATCGTCTTAG ATCTGCAAAA 540
AAACGCCGGA ATACTCTGGT ACAGCCACTC GGTCGATATT ACCGAAGACG TCCTGCGGGA 600
GCTGAGCAGC TCGTGATGCA CCGTGAGCAC CGCGTCTcCT GCCTCCTACG TGTTGGcCCA 660
GGAGCGTCCA CGTGAGGTCC CTCGCGTCAG ATACCCCTCT CATGCGTCAG TACCACGCCA 720
TTCGGGCACA GCATCCGGAT GCGGTCCTGT TCTTTCGCTT GGGCGATTTC TACGAAATGT 780
TCGATTCCGA CGCGCTCCAC GTGAGTACCC TCTTGGGGCT CACCCTTACA AAACGAAATG 840
GAACACCCAT GTGCGGGGTG CCCGTCCATA CCGCGCGCAC GCACATAGCA CGCCTGCTTA 900
AGCACGGTAA AAAAGTTGCC TTGTGCGAGC AGGTTTCTCA TCCTGTCCCC GGAGAACTCA 960
CACAGCGCAA GGTAATTGAG ATTATCTCCC CCGGGACCGC AGTGGAAGAT GACTTTCTCA 1020
GTCAGGGATT TTCCCAATAC TTAGCCACCG TCTGTGCCTC AGACGCCACC GTCGCCTTTT 1080
CTTACCTAGA AGTCAGCACC GGCGCCTTCT TCATCACCAG CTTTCCCCGC GCCGAAGCAG 1140
CGGACGCATT GCAAAAAGAG TTCGGACGTG TCCAGCCGTC TGAGGTTCTC CTGTCTGCTT 1200
CAGTGCTCCG TTCACTGCCT GAACTTGCCG CTATCCTCAG TCTCTACCCC CGGCTCGTTC 1260
GTACCACCGG CGCAGATGCG CTTTTTAATC CCGAGCACAC TAAAAACCGC CTGCACCATT 1320
GCTTTCGCAC ACGCAACTTG GATTGCCTCA CCCTCCTGCC CCATTCGCCA GACCTCGCTG 1380
CCGCCGGGGC GCTGATTGCG TATTTGGAAG AAACCACGCG ACACCCGCTC TCCCACGTCA 1440
GTGCCATCAC CCGCTACCAT ATCCATGACT TTGTAGAAAT CGAnTGaCgc TACGCGCAAA 1500
AATCTAGAGA TACTTCAAAA TCTCCACGAC AGCACCCATG CGCATTCTCT TTTTGAAACA 1560
CTCAACTATA CACACACCGC CATGGGTACC AGGCTCCTGC GCTATTGGCT GCACCACCCC 1620
TTGCGCTCCC AGGAGGAAAT TCAAAAACGC CTCAGTGCAG TGGTCTTTTT TCATCACCGT 1680
CCCCACATCC TCAAGAacTG CGTGCAACAC TCTCGTGTGT TCGGGATGTG GAGCGCCTAG 1740
TCGcCCGCGT GGCGTTAGAA AAGGCGCACG GACGTGACTT GCTCGCCTTA AAAGAAAGTC 1800
TCAGGGCAAT CCTTACCTTC CGCAGCCTCG AGCGCGAAAG TCCCTTTCCC CCAGACCTTC 1860
TTCCCTCAGA AGGGGATACC CCGGTGCTGC AGGAACTGTA TGGTCTTTTA GAACAGTCTA 1920
TCAAAGAAGA TTGCCCCGTA ACGCTAAGCG ATGGGAACCT TATCAAGCGT GGTTTTTCTG 1980
CGTCCTTAGA TGAACTGCAC CGCGTGCGTG ACAATGCAAA TGAAATTCTA AAACAATATT 2040
TGGCAGAGGA GCGTGAGCGC ACGGGTATCG GTACATTAAA AATGAAGTAC AATCGCATGC 2100
TCGGTCACTT TCTGGAGGTA TCCAAAGGGC ATCTTTCTGC TGTCCCTGCG CACTTTATTC 2160
GTCGCCGTTC ACTGAGCAAT GCCGATCGCT TTACCACCGA ACAGTTGTCA GAATTGGAAG 2220
CAAAACTTGC CCGCGCCCGT GAGgGCcTCG TTTCCTTTGA ACAAGAACTC TTTGCAGATA 2280
TCCGCCGTAC CGTATGTTCT CATACCCAGC TGCTGCGCAC GAACGCTGCA CGGGTGGCAC 2340
AGCTGGATGT GCTCCAATCT TTTGCGCACG CTGCGyTCCA GCATGGCTGG AGTCAACCGG 2400
TCTTTATCAA AGACGGTGCA CTTCGTATTA CGGGGGGCAG ACATCCGGTG GTGGAACTTC 2460
ATCTCCCCTC CGGGGAGTTT GTACCCAATG ATCTGACACT TTCTTCAAGT GAACATGCGG 2520
TGTTGCCGCG CTTTGgsTCA TCACCGGACC GAATATGGCA GGAAAAAGTA CTTTTTTGCG 2580
TCAGAcTGCG CTCATTTGCC TGATTGCGCA GGTTGGCTCC TTTGTCCCTG CAGAAAAGGC 2640
AGAGCTCACC CCCGTCGATC GTATTTTTTG TCGGGTAGGA GCGGCCGATA ACCTTGCGCG 2700
CGGGGAaTCT ACCTTCTTGG TAGAAATGAG TGAAACAGCA CACATCCTGC GTGCAGCAAC 2760
CCGCGACAGC CTTGTTATCA TGGACGAAGT AGGACGGGGA ACGGCAACTG AAGACGGTTT 2820
ATCCATAGCG CAGGCAGTCA GTGAATATTT GTTGCATCAT GTGCGTGCAA AAACGCTGTT 2880
TGCAACACAT TACCATGAAC TGTCCCGTCT TGCCCACCCG CAGTTAGAAC ACCTCAAGCT 2940
TGATGTTCTA GAAACTGACA ATACCATTGT ATTTCTGAAA AAAGTGACGC CCGGTTCTTG 3000
CGGCAGTTCG TACGGCATTT ACGTTGCGCG TCTGGCGGGG CTCCCTGAAT CGGTACTGGC 3060
ACGCGCGTGT GAGCTTTTGA AACAACTGCA GCAGCGGGCA GGATCTGCTC CACGTGCGTn 3120
CTnTGCGCAC GAAGCAGATG CAGTGGCTCA AACAGAAGCA GTACACGCGC ACAAGGCAGC 3180
GTCTAAACCG TGCGCGCagc GTGTGTCGGC AGATCTATTT ACTCAAGAAG AGTTAATAGG 3240
CGCAGAGATT GCaTCGTTGA ATCCaGACGC CATTACACCG CTTGAAGCGC TGACACTCAT 3300
CGCGCGGTGG AAACGCAGCC TCCGCGGTTC TGCAACGCAG CAGAGCAGCG CCATGACAAA 3360
ACGGAAGGGG TAATGGTATG TTCCCCTGTT ACGCACGACG GGTATCGGGC ATGCGGCGCG 3420
CGGCGTTTTG TCCATTCTTT GCGCTAGAAA CAGAGCGAAC AATATTCTGC CTACCTGAGG 3480
AGAGAAAAAC GTGAATAAtT gCACTCCGTG cGTaCCTGAG TACGCGTGCT CCTGACCAGA 3540
TACATAGTGC TTTTGTTGCG TATTTGGCCA ATCTTGATTT AGTTGCGCAC CAGTTTCCGC 3600
AGATTGCTTC TGATATTGTG CAGGAGCTGA TAGATCAGCG GTCGTATGTA AAGTTAATCG 3660
CAAGTGAGAA TTACAGCTCT CTTGCGGTGC AAGCGGCGAT GGCTAACTTG TTGACTGATA 3720
AATACGCAGA AGGGTTCCCC CATCATCGCT ACTATGGCGG GTGTCAGAAT GTTGATTCTA 3780
TTGAGTCTGC CGCCGCTGCA GAAGCATGCG CGCTCTTTGG TGCTGAGCAC GCATATGTCC 3840
AGCCGCACTC CGGTGCAGAT GCGAATCTTG TTGCATTCTG GGCTATCCTT TCGCGGCAAA 3900
TTGAAATGCC AACCCTTTCT TCTCTTGGTG TCACCGCCgC TACGCATCTG AGTGAGGAAC 3960
AGTGGGAAGT ACTGCGCCAG AAAATGGGTA ATCAAAAACT TATGGGGTTA GATTATTTTT 4020
CAGGCGGTCA CCTGACCCAC GGGTACCGCC AAAATGTTTC AGGACGAATG TTTCGTGTGG 4080
TGTCCTACGC GGTGGACCGA GACACAGGAC TGCTCGATTA CGCTGCAATC GAGGCACAGG 4140
CAAAGCGGGA AAGACCACTT ATTTTACTTG CCGGATACAG CGCGTATCCT CGTTCCATTA 4200
ATTTCCGCAT CTTTCGGGAA ATTGCAGACA AAGTGGGCGC AGTACTCATG GCTGATATGG 4260
CTCACTTTGC TGGACTGGTT GCAGGCGGTG TTTTTACGGG AGACGAGGAT CCAGTGCGCT 4320
GGTCTCA AT CGTGACCAGT ACCACACACA AAACGTTGCG CGGGCCACGC GGTGCCTTTA 4380
TTTTGTGTAA AAAAGAATTT GCAGAGGCGG TGGATAAGGG CTGTCCGCTT GTGCTCGGCG 4440
GCCCGCTGCC ACATGTGATG GCAGCAAAGG CGGTTGCGTT TCGTGAAGCT CGAAATGCTG 4500
CTTTTAAAAC CTATGCGCAC GCAgTCCGTG ATAATGCGCG TGCGCTGGCA GATGCCTGCA 4560
TACAACAGGG GATGCAGCTG CAGACAGGGG GGACGGATAA CCATCTGCTA TTGCTtGACG 4620
TGCGTCCGTT TGGACTGACA GgTCGTCAGG CAGAgCGCGC GCTGATAGAC TGCGGAGTGA 4680
CGCTCAACCG TAACTCGCTC CCCTTTGACC CAAACGGCGC ATGGCTCACC AGCGGACTGC 4740
GCATCGGAAC CCCCGCGGTA ACGAGCCTTG GAATGGGGCC TGAGGAAATG AAAAGAATAG 4800
CGCGCCTGAT CGCGCGCGTG CTCGGCGCTG CAACGCCTGT GCGGACAAAG ACAGGTGCGC 4860
TAAGCAAATC GGCGGCCGAG GTGCCCGGCG AGGTTAGAAG CTCAGTCTGC TCGGAAGTGC 4920
GGGAGCTGCT CGCACGCTTC ACGTTGTACC CTGAACTCGA CGAACCCTTC TTGCGCGCAC 4980
ACTTTACGCG TCGCCCTGCn GGACAAAACA CCTGCCGACG AAGGgACTTG AACCCTTaCG 5040
GGGTTACCCC AACAGATTTT GAGTCTGTCG TGTCTGCCAG TTTCACCACG TCGGCCCGCG 5100
CGCAgCCTAT CACACGAGGA ACAAAAGgTA CAGCTGTTCA TGTAGTCTTC TTGCGTGAGG 5160
CCCCGTGTCT CCCATTGAGG GAGCCGTTAT TTTTCTCCCA TGAGGAGTTT TAGTTCCCGA 5220
ATATCTGCCA CCAGTTTAGA GCGATCTAAA TGCTGATAAC GCGCAGGGAG CATTTCCTTT 5280
CCTGTGCATT CGAGTACTGC CACTAAATTT TGCAGTTCAA TTTCGTGCGG ATAAGCAGGA 5340
GGGATAAAGT CTTCAATGGT GCGGGTGATG TCCTGTGTGG TTACCATCGT GCGATTTTCC 5400
ATCGCAGCAG TTAGCTGGGC GCGTACTAAA ATGGCTTCTA AGTCCGAACC GGAAACAGCG 5460
AATTTTATTC TGCGAATGAT TGCCGGTACG TGTACATCTT TGAGCTTGAT ACGTAATTTT 5520
TTTTTGAGTG CTTCAAAAAT TTCCGTTTTT TCTTTTGTGG TTTCAGGGTA GAAGAGCGCA 5580
AGATGCTCTT CTGCGCGTCC CTGTCGTTTC AGATCTATTG GTAGCAAGTC TGGGCGCGAA 5640
GTAATCAGGA ACCAAATAAT ATTGCCCCGG TGTTGGGTGT TACCCATAAA CCCTGCAATT 5700
TGTGCAAAAA TACGCGATTC ACCTGCCGGC GCGTTACGCC TACCAAACAC CGCATCAGCT 5760
TCGTCCACCA TCACCGCTAC CGGGGTAAGC GCTTTGAGGA TGTTGAGCGT TTTTTCTAGG 5820
TTCGACTGTG TAATGCCAGG CTGCGTTGCC TGGAAATTAC ACAAACGCAC CATGGGAATC 5880
CCAATTTCCC CCGCAAATGC GGAAACCATA AATGATTTGC CTGTCCCAAT CGGCCCTGAG 5940
ATAAGGTATC CCATTGGCAA CACATCTGCT CTTCCTTGCT TAATGGCGCG CACTGcGTTA 6000
TACAATCTCT TTTTTACAAA GACATTTCCT GcAACGTATG AAAGGTCGCA GGATGTGTCG 6060
ACAAATTCCA ACAAACCGCC TGCTTCGTGC TCAATAATTT CCTGTTTCTT CCTTTTAGAA 6120
ATGTAAGGTT GCAGAGTCCG TATGGGAAAG TCTACTGATA wTGwcCTCtG GCTcCCATTG 6180
CATCGATCGT TGGnACGTCT CTGCCGCAAG CTGGTGGAGG GTCACTAAAT TCAA 6234
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1548 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CGGATAAAGG ACGACTGGGC TGGCAGCGGG TGTGGGTTCC CACCTCCTGT TCGTGTCTTT 60
TCAGGGTGTG TGCGCGTTCC GAGAAGAGGG CGTTTTGTGT GTGGGGAGGA GTACGATGGA 120
TACGCATATA TGAGGCGCCG GGTGTGCACG GTGGTGCGCG CGGTGGTGTG TCTACTCAGC 180
ACGAGTTTGC TGACCACGTG CGATTTCACT GGCATCTTTG CGGCAATTCA GTCGGAAGTG 240
CCCATTAAAA CGCCGTCCAT CCCGGGGGCG ATTTATGGCC TGGTCAAGGC CGGGAGCAAG 300
CTCTACGCCA CCAACGGCCG GCTTTGGGAA AAGGAGCTGA ACGGCACTGG GTCgTGGCAG 360
AAAnTGTCTT CCTCGTCCGT TCCCACTGAC TCGGATAAAA AgGTTATGAr CaTTGCCACC 420
GACGGGaACA CGTTCGTCCT CGCCTGCGTG CCTGGCACGG GCGTTTACAA ACACTGCGTA 480
AATGGCGCGG GCAGCTCAAG CACCGGCACA ACGGCAAGCC CCTCGACTGA AACCTGCTCG 540
CAGCATGCGA CGCTCGTGGG GGGAACGTCC AAGCCCTTCT GGCTCGTGCC GGGAGGCgnG 600
nGAAATAATG GGAACTGCGG TTGCGGGGGA GGGGGGGGTG GCTCCTCCTC GAGTAGCAGC 660
TCGTGCATTC ACATCTGGCT CGTGCCGGGA GGCnGnGnGa AATAATGGGA ACTGCGGTTG 720
CgGGGGAGGG GGGGGTGGCT CCTCCTCGAG TAGCAGCTCG TGCATTCACA TTAAGGTAGA 780
AAACACGGAC GAACAGTTTC TCGATATGGG TGAGGGGTAC GTGGTGACCA CCAAGCACCT 840
CTACACCAAA AACGGCTCGT CCAGCGCGGG ACCGGCGCAG TGTCCCGGTG GCGGTGGCGG 900
CGGAGGCAGC AGCGGGGGTG GGGGTTCCTC GGAGTACACC AAAGCTTCCT GTTCCTTTTC 960
CACGCCCATT CTGGCAAGCG TCACAACGGG TGCTATCACT ACATTCTCAC CAAAGAAAAA 1020
GTGTACTGCA GAAAGCAGGA CACCGCTTCC TCCGCTGCGT CGTCACCAGC CCAGTGTCCC 1080
TCTTCCCCTT CTTCTTCTTC CTCCTCCTCG ACGAATGCGG GATGCGAGGT GGCGCACGGG 1140
GTGGACGACC CGCTGTGTCT TGCGATTTTT AAACACAACG GCTGCGAATA CTTGCTCATC 1200
GGCGGCAGTC GGGGCTACGG GGAAATAAAG CTGGAAGCGA ACTCCAGCGG TACGAACGGC 1260
ACCTGCATGC GATTGAAAGA GAGCAATGTG CACAAGAGTC CGGGCCAGTG GGGCGAGTCG 1320
AGCCCCACGC CCAAAGCGAG CGCCGAGCAG TATCGGGGCA CGGTCGGTCG GTTTGCCGTG 1380
CAGAAAATCT ACGTAgTTGA AAAAAATGGC GGTGGGAACG GTGTCGCCGC GGGTGGGGCG 1440
GGCTGTCCTG CAAACGCCAG CAGTTCCAGC GGAGGGACCA GCAGCACGCA GCGTCCAGAC 1500
CTCTACGCCG CAGTGGGGGA GTCGAGCGAC ACCTAnCACG GGGGGTTT 1548
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
TACGAAGAAT CGTACTGCCC ATCCCCATCC CgATCAAAAT TCCCGTCAAC ATTGTTGATA 60
AAGTACTGTT CCTTAAAAAA ATCGAGTCCT GTCTTTCCAT TCAGCCCCAT TGCGTTACGG 120
TGTACGTTAT TTACCAGATC TACAAAGTTC ATAGCCATAG TATCGAGCTT GCGCAsTTCA 180
TCGCGCACGT CTGTGTCACG CAtTCTATCA ACGCTGCAAG CTTACCCCCA GAAAAGTGCG 240
CACGATCCCC TGAGTCCTTC CACACCACGG ACACATATCC TTCAGGCGTT GTTCCACTGA 300
CAAGCCCAAG TGTCTTATAA CTCTTCCCCT GcACAACTTC TAATCCCGCA CTGTGAATAA 360
CGTAGGACTC ATCCTCATCA CGCACGTCCA CCCTGACCTC AATGCGGTGC GCGAGACTTT 420
CCACCAACGT ATCACGTCGA TCTAAAAGAT CATTAGGATT ATCACCCATC GCTTTTGACT 480
TCACAATCTG TTCATTTAAT TGAGCAATTT TGGCAAGGAG ATCATTCACC TGCTCAACCG 540
TTGCCTCAAT ATCCGCGTTG AGCATATCGC GAATACCGAC AAGACCTCTA TACTGATGAT 600
GAATGGCATC CGTTAGCGTC TGTGCACGCG TGAGCACAAC CTGACGCGCT GCACGGGCTT 660
CAGGATACAC AGACAGTTCC TGCCAGCCAT CCCAAAACTG GTCCAGCCTG GTACGAACTG 720
CAATATCCTC CGGCTCATTA TACACCTGCT CTAAAAGACG CACATACGCA TCACGCGTGC 780
TCCAATAmCC CTGTTCGTCT GTCTGAGACA CAATGCGACT ATCGAGGAGC TGGTCAcGCA 840
AACGCGCGAT AGAACCGATG GTGAcCCCTT GTCCTATCTG ACCAGGCAGC TGAGCGCGAG 900
AAAGATCAGG ACGGTACAGC GGCTCGAACG AATCGAGGTT TACTCGCTGG CGGCTATACC 960
CCGGCGTGGA AGAaTTCGAC ACGTTGTGTC CTGCAGTCTG TACAGATTGC TTATGCGCGT 1020
AAAGAGCACG CTTTCCAAGT TCTATAGATG CAAATGTCGA CATGTGTTCT CCCTATAAGA 1080
ATGGAGGGTA CGCGCAgCAG CCCCATCAGG AACCTCCCTC TCGCCCTGTG GATACTCGCG 1140
CTCTGCTCCT CCCCCGAGGT ACCGCACCCT ACAGCACACG ATCAAAGACA AGACTTCCAG 1200
GCGCACAACG GACTGGACAT CCGTCCTTCG TATAGGAGCT GCCCTGCTGT TCACACGTAA 1260
GGGCGCTGAC AAGCGcGTGT GCCAGACCAC GCGCGTGAGT CAGATAGTGT TGGATTGCAT 1320
CGTGCTCATT TTTTGAAGAG GCAACTTTgc tACGCAGCGT CCTATAGAGA GCGACGACCG 1380
CATCGTGCAC ATTGACATCC GCGCGCCTCA AATAGGCAAA GAAGGAATCA AAGTCAACCG 1440
GCGCGTCACC GTACGGCCGA ACCTCTTGGA GAAgTAAGAA ACACCGTTTA TCGAGATGCA 1500
AGAACTCACG ACTCaGCGCC TGTGCACGGC TGAcAAAGGA TTCTACATGC TCCCaCGCAC 1560
GTGTGCGcAG CGACTCGTAC ACACTACGCT GGACCTGTAT CACCTGGCCA ACAAGCTCAA 1620
TCTGCGCAAC AAGAATTGCC TCCACCTGCC CTGCCCGGTG CAAAGCCCGC TCTCGGTCCA 1680
TCGCCCACTC CTCTAGCCAG GAGTCTCGGC ACCTTTCCCG AGCGCTTTAC TTTTTAGTGA 1740
AATAAAAAAC CGTCCAGTGG TCTGCaGcTC CGCAGGCTAC TGGACGGCTC GCACCTGCTG 1800
GCTCTAtGCG CGGCCGGCGC TTCCGAGCAC TTCCTTGTTT TTGTGCATGT ACAACTTCTT 1860
AAGCTCATCG CGCGCAGGAC CCAAATACTT TCTCGGATCA AACTCATCTA CCTTGGTGGT 1920
CAGCACCTGC GTATAGCTGC AGTCATAGCG AGGCGACCGT CCGAGTCAAT GTTCACCTTG 1980
CACACCGCGC TTTTGGCAGC TTTCCGCAAC TGCTCTTCTG GAtACCCACA GAATCCGGCA 2040
GATTTCCACC GTACCTTTCA ACCTCCCGTA CGTACTCAAC GGGCACAGAC GAAGCACCGT 2100
GCAGCACGAT GGGAAAGCCA GGAATACGCT TTTCTATCTC TGCGAGGATG TCAAAACGTA 2160
GCGGAGGGGG GATCAACACT CCATCAGCAT TGCGCGTACA CTGCTCTGGC GTAAACTTTG 2220
CTCTCCCGTG ACTTGTTCCG ATGGAGATGG CaAGGGAATC CACCCCCGTT TTTTTCACAA 2280
AGTCCTCAAT TCGTCAGGCA TAGTGTAGTG GCTCTTCTCT GCCACTACAT CGTCTTCCAC 2340
ACCAGcGAGT ACCCCAAGCT CCCCTTCCAC GGTGACATAG TCCGCACGCG CATGGGCATA 2400
CTCGCACACC TTCCTGCTTA GCGCTACATT CTCGTCGTAC GGCAACGCCG AACCGTCAAT 2460
CATCACAGAC GAAAAGCCAC TCTCTATGCA GTCAATGCAC AGCTCTAGGC TGTCACCATG 2520
GTCCAGATGC AAAACAATGG GAATATCAAC GCCGAGCTCA TGGGCATACT CAACTGCGcC 2580
GCGTGCCATA TTGCGCAGGA GCGTCGCATT TGCGTACTTG CGCGCACCGG AAGAAACCTG 2640
CAGAATGACG GGAGAACGCG TTTCAACACA CGCCTGTATG ATTGCCTGGA GCTGTTCCAG 2700
GTTGTTAAAA TTATACGCAG GGATCGCGTA TCCGCCCTTT ACTGCCtTTG CGAACAGGTC 2760
CTTGGTATTC ACCAAACCGA GTGCCTTGTA ACTAGTCATG AGAACCCCCT TTGTTAGGAT 2820
TGCTTCGAGA AGAGTCACGA AATAGAGAAG CGTGCCACCC TCGGCAAGAG GGGCATGGTA 2880
GGGCGATCGG GACGCTCTAG TCAACCGAAG CGCGAAGGCT TGAGTCCACA CGTCAGGCGT 2940
TGGAACGGCA GCAAGACGAT TTGGACAGGT ACCACGCGGG AGGTTTGACA AGCTATTTCT 3000
CCATGCGCTA GAATGCGGCG AGCTGGCGCC TGCGAGGCGT TAGGGGTGGT GAAAAGGAGT 3060
TTGCGAATGA AACAGGGCTG TTTTATGGTG GCGGGCTTTG CGCTGACGTG CGCGTTTTTG 3120
GTGTCCCCCC TTGCGGCGCA AAGGTCGAAG GTCAATTACC AGGCATACTT CA 3172
(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24699 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
CGTTTTTTTA ATGGGGTAGC CAAGCCAGCA TCCACCATTT TCTGCATAAT GGTGTCGTAC 60
TCTTTTTTCG AACCGTCGCA GACGTAGACG GTATCTGGGG CACAGAGTGC GACCATCTCT 120
TCTATCCACG CCTTTGCTCG AGCGTGGGCA ATCTCGTGAA GTTCCATAAC GCCGCTCCTT 180
GGTGCGTAGC GTGCTGCACG GGTATTCCAG GCACGTATCG CCCCAGAAGT AkAGCGCGsw 240
AaAGGTAAAT AAAAAAGACC TCTTCAAACC GAGTGTCTCT GTCACCGCAG GaGCCGACAT 300
GAGCGGTGTT CTTTACCTAG GnCGTTTCAC TCTCTGTCTA TTACCTGTCA GTTGTTTTTC 360
TCAAAAAGTG ATGACGTGTG CCGATACCGT CAGGGGTGCG CAAGAGGTTT TTATGCTATG 420
TATCTACGTT GAGCTTCCCT ATTACTATCA AcTGACGCGC ATCTTCCCTG CTGACATCGA 480
ATCGCTATGT GCGCGTATGA GAAGGTTCGC TGTCCACAAC GGTGCTGCCC TCCACGAGGC 540
ATCGTCCGTT CGTATCTTTG CATTTGAAGC ACACAGTCTC GGTTCTGTAT ACGcCGCGGT 600
ACGCTGCGTG CGTGCGCTGT ATCAAACACT GGACACATAC GAAAAGCAGG TGAAGGAATT 660
TCGTATCCTC ATGGACGTTG TTGCTGACGA TGCTTCTCCC TGTCTGATAG AAGATCGCTT 720
CCATGCATAC CGCAGTACGC TGATTCCTGA CCGTGGTTTT TTTGCATCCT TTCGTGCAAA 780
ACAGCTTCTC AAGCATTACC TTGAATTTTT GCCACTGCCA GCGCTGAATA TGTACCAGGT 840
TAATGGTTTC CTTTCACTTT GTGCGGAAAA ACCTTTTCCA CAAGGGGTAA CCACGCACTG 900
CATAGTTGTG CGTACCACTT CTTCATACAT GAGTGCTCTG TGTAATTTCA TGGCGCTCCA 960
TCCGTTGTCC GAAGCGGTCT ACTCAACGCT ATCTGAGGAA ACGCGTGCGT TTTTTTTTCA 1020
TCTGCGCGCT GCGGTGTCTT TTTTTAAAAG ACGGCGGTAT GATTCGTCTT TTCCCCAATA 1080
TTTAACCGAT GCATTTCTTC AGTATGTGGG TCTGTACTTT AAGCTTTATT ACGAAgCGgC 1140
GCCAAATGcG GCGCCGCCGC CCATTTATGT AGACCCTTGT GCTGGACATG AGAGCCAAAA 1200
GCAGGCAGAG AAAGTACTGA TCGTCAGTCC ACATTCTCCC CTTATGCGGT TGCCTGCATC 1260
CTGCGCAGAT ATTGAAGCTA TTCCGCAAGA TCTAGCAGAA GTCATGTATa CGCTTTCGCT 1320
TGCCTCCCGT TATATTTTCG CGGACGAAAT AGAGGAATTT TTTCTGTTTT TGAAAAAACA 1380
TGCtGACTTC GTCGGTGATT TATTTGACAA AATGTTTTGT ACCCAGGTGA CGATGGTGCC 1440
GCACAACGCG TATGCCATTC CAGAGGATGT ACACGACAGT CTAGAAAAGC GTGTGCGCGT 1500
GAAAATGCCT GTAATACGCG AATGTATTTm wwcCCTTCCT TTGGAAGAAA TATCAGGAAG 1560
GATCGCTTTG TGCTAGTACG GATCTGCTCA GAACTTTCCA GGAGCTTCAG TACAAATACA 1620
CATCCGATTG TGTGTTACAC AGCTTGTTTC ATACGTATTC TGACGTGCAG ATTGCGCACC 1680
TACAGGTAGA AGAGTACACC GGCACGGATG TCGGCGCAGT GTTAAAGGTA TACCAACACA 1740
CGCTGCTGGT GGGCATGCGT GAAGACGCAG AGGCCGCGTT CAGAGAAGCA AAGGCTTGTC 1800
TGACAACACT GCAGGCGCGG CGTTTTGTGT CCGCTGAATA CCGGACCTTT TCCCTCTTAG 1860
GATTTCTAAC CA AGGTCAG AGCAAATTTG AAGACGCGTT GGTGTATTTT GGCTATGCAC 1920
TCGATGATGC AGAACAGctG CGCGACGGTG ATTTTCTCTG TTCCGCGCTT TTTCATTTGA 1980
GTATTACCTA CTTTTTGCAG CATAACTTTA CCCAGGCGCG GCTTTTTCTG AGTAAGCTAT 2040
CCGATGCGAT ATCCACGTAT TTTGAGCAGC GATGGAAAAC TGTCAGTCTG TTTATGCAGG 2100
GCAGAATTTC TCTCAGCCTC GGGGAGTATG CACAGGCGCG TCGGTGTTTT GATGAGGCTG 2160
CCGATTTTGC ACTGCAGTAC TTTGAACACC AAGAACCCTT GTGCAGAGTG TGGGCTGCAC 2220
ATGCACGGCT ACTTGCGGAT AAGTCGTATG CAGCGCACGC GCTGTTTCAG GACATGTGTG 2280
ATCAATACCC TGATGCATAT CTCTTTCTTG TAGAAAGCTA TGTCCGCGCA GAATGTTTTG 2340
ACGATCCCAC GTTGTTTCAA TCGTTTCCTG AGGAAACGAC CTCTCGCGAG CCATGTGTGC 2400
CGTCCTTCTC TCTTGATACG CCGATTTACT CAGGGTTCTC CTGCGCAGAA GATCTGGTAT 2460
GGGGCAGGCA GTGTGCGTTT GCAGTGAGTG CGCAgCACAG tACGGTATTT GCTCATTACT 2520
ACCATTGCAG GGTGCATCTG CACCGTGCCG AGGATATGCA AACATTCCAC CACCATAAGC 2580
AAAAACTTGA GGCCATTGCA CGTCGCGCGT TTCAAATAGG TGATCCGAGT GCTGCGTTGT 2640
TTCTGTACCT CTGCTATGAT GTGTCCTACC GCGTGCACGG CGCAGAGGCT GCTGTCACGA 2700
CAGCGCACCT GAGTAGGGCG TTTAAAGTGA TGCAGCGCAg CGTTGCGTAT ATGTCAGAAA 2760
ATACCGTTCG CGCACAGTTC ATGCAGGATA ACTTTTGGAA TGCAAAACTG TTTGCCGCCG 2820
CGCAGGCAAA CAAACTCATT TAAAGCAGGG GGCACTATGG CGGTCAATTG TGGCATTATC 2880
GGTCTGCCGA ATGTGGGGAA GTCGACAATT TTCTCCGCGC TCACTGCAAA CGTCGTGGAG 2940
GCGGCGAATT ATCCCTTTTG TACTATCGAA CCTAACGTGG GTATGGTGAC AGTACCTGAT 3000
GTGCGTCTTG AAGCACTGGC TGGTCATTTT CGGCCAAAGA AAACGGTGTA TGCCTCCATT 3060
GAATGTGTGG ATATTGCTGG TTTGGTAAAA GGTGCCTCGC AGGGGGAGGG ATTGGGCAAT 3120
CGTTTTCTTG CGCATGTGCG AGAGGTTGGA GTACTTGCAC ATGTGGTGCG CTGTTTTGAG 3180
CATACGGATA TCGTTCATGT ACATAATAAG GTCGATCCTC TTTCAGATAT TGAAACGGTG 3240
CaTATAGAGC TGGCATTGGC AGACCTGGCC TCGGTAGAAA AACGGGCTGT GCGTGCTCAA 3300
AAGGAGTCGC GTATGGGAAA GTCCcTTCAA AAGGAAAGCA CGCTGGTATT ACGGGCACTC 3360
GAATACTGCG CGAATATTTA GAAATGGGAA AGGCGGCATG TATGGCGCCG CTGTCGGATG 3420
AGGAgCGCAA ccGGTGCGCG ATATGCGCTT GTTGACAATG AAGCCGCACC TGTACGTGTG 3480
CAATACAGAC GAAAGCGGCA TGCAGTACGG AAATGATTTC GTGCGCGCGG TGCAAGAGCA 3540
CGCACGTGTG CATAACACGC AGGCAATTGT TATGTGTGGA AAATTTGAAG CAGAGctTGC 3600
GCAGCTTTCT GATGTGGCAG AGCAAAACGC CTTTTTGCAA GAATTAGGGT TGCGCGAATC 3660
aGGACGTgCG GcTTGCGCGC GCAGTGTATT CCCTGATGGG GTTGCGTACC TTTTTTACCG 3720
CGGGGCCTGA GGAGTGTCGC GCGTGGACCA TTCGGGCAGG GCTGCGTGCA CCGCACGGGC 3780
AGGAGTGATC CACAGCGACC TTGAGCGTGG TTTTATTCGT GCAGAAACGT ATTCTTTCGA 3840
TGAtCTTkCs TCCtGTGGGA GTGTGGCAAA GtGAGGGAGG CAAACCGCGT TCGGCAGGAG 3900
GGGAAGGAAT ACGAGGTGCA AGACGGGGAC GTTATCtTTT TTAAATTCAA TGTGTGAAAC 3960
ACAGGCGCTC CgTTCCGTCT GTGCGCCgTG TGCGATACAt GAGCCTTGAT TCTGCGTTTG 4020
AAAGCAGGCA CAATGCtCCC GTGCAGCGTA TCATATCTTG GAATGTGAAT GGAATTCGTG 4080
CCATAGAGCG GAAAGATTTT CTCAGCTGGC TCGCGCGTGA GGCGCCTGAT GTTCTCTGTT 4140
TGCAGGAGAT TAAAGCGCAT GAGTCGCAGC TGAgTGTGCG CTTCGTGCTC CGGTCTGGAG 4200
TGCTGGGGCG GGGGGTACGT ACTATACCTA TTTTCACAGT GCGCAGCGTC CTGGATACAG 4260
TGGCACGGCG CTGTTCAGTA AGCGCGCGCC AGATGCGGTG CGTTTCTTCG GGGTTCCGGC 4320
TTTTGACTGC GAGGGGCGGA TGCTTGCGGC ACgCTTTGGC GAGCTGACGG TGGTAAGCGC 4380
GTATTTTCCG AATGCGCAGG AAGGGGGCAA GCGGCTCGCG TATAAGCTTG ATTTTTGCGC 4440
AcGTTTCGTG CGTTCTGTGA TGAAGAGCGT ACGGCCGGGC AGCACGTGAT CTTGTGTGGT 4500
GACTACAACA TAGCGCATAA GGAAATCGAC CTGGCACATC CTCAGGAAAA TGAGGGGAAT 4560
CCTGGATTCC TGCCTCAGGA GCGTGCATGG ATGGATACAT TTACGGAGGC AGGCTATGCG 4620
GATAGCTTCC GAGCCTTCTG CACAGAAGGG CAGCAGTACA CGTGGTGGAG CTACCGTGCC 4680
CGTGCAcGCG CGCGTAACAT TGGATGGCGC ATCGATTACC AGTGTGTGGA CCAAGCCTTT 4740
TTAGCGCGCG TGACCTCTTC GCAGATACTG TCCGAGGTGA CAGGATCGGA TCACTGCCCA 4800
GTGTGTTTGA CGTACGCGGA CTAATCCGTT TCCGGGGTGA GCGGCACGTC CGCGCAAACT 4860
AAGACGTACC CGCGCGCACA GGCAGCGTCA GAGGTGGTAG CGAACGTCCA CACCCGCGGC 4920
TATGAACTGT GCGGTGCGCG TGTTGGTCTG CTGTCTATCT TCTTCAATAA TCTTTTCGCA 4980
TGACCGGGGT ACGCCGCTGT ACGTGGCGCT TACCCCCAAG GACCAGTGCT CTGTCAGTTG 5040
AAAATAGCAC CCCGctGCCG CCTTGAGCAC AAGACCGTAG TAGGTAGACG TGTAGTAATG 5100
CTGATAATTG AAGCCAGCCC CTACCGTCAG TGGCAAGCGG ATGCGCCAGA AGGCAACCGT 5160
GTACCCGGCA GTGAGGGCAA CGGGAATTGC AAGGTAATAG TACGGAGTAG TGGGACTGTA 5220
CGTATTGTTT GGATAGCTGC AATGGTACTG CACACTTGCG TCAATCCCGA GCGACAGGCC 5280
GCGGCACACA AAGTGTTCAA ACCCTAACGC CGCACTGAAC GCGGGGTAGA TGTACTTGTG 5340
CCCGTTGGTT TGCGCGTTGG CATTACGGTC GTCCCCGCGA CCGCTGTTAC ACCAATCCAC 5400
TTGAAAGAGG GGCACCGCGC CCATGGCCGA AAGGGTATAG TACTCCGGCC CGCCGCAGTA 5460
GTGTCCCACG GGTCCGCGTG CACCGGGTGT GCAGCTCCCC ACACTCCCGC GCATATTCCC 5520
AGCACCGGGC CGACCGCCCA CCACTTCAAT TGTTTCATAC CCCGCTCCAA CGCCGATCCT 5580
CTTACGCGTC TCGTCGAGGA CCTACTCCAT TCTACCCCCC CCCCACGGCT GTTTGTCGAA 5640
CCCTTTTTAA AGGGTTCGTT CTCGCGCGCT GGGCAGCACG CGCGTGAGGC GCCTATGCCA 5700
TCGGGAGCTG CGTTTTTCTT ATGCCCCACG AGGGGACTGC GGGGTATGTC GTGCGTCCGC 5760
ATGGGTGTGG TATCGGTGAG AAAGACACCC TGAAATACAT TGCTCTAcTT CGTACCAGGA 5820
ACTGCAGCAG CAGGGGGAAC AGGGACACCC TGGGTGAAAA GACTGCACCA TGCTAGGATG 5880
GGGAATGGAT ATGTCCAAAA GTGTGATGCT GTGTTGCCTG TTGAGTGT C AACCCTGTTA 5940
TGCCGGGTAC GTGTTTGTTT CCCCAAAGCT TGGCGTGTAT GGAGAAGCAT TGGGCGGTCC 6000
TGACACGGTG GGTAAAGCGG TCAAGCAGGC CGACGGTACT AAGATTGCTC CGAAGATATG 6060
GTACTACGCG CCGCTACCCC GCTTTTTGGC GTGGATATAG GCTATCAGGC GGATAACGGC 6120
CTGTTGTTCC GGGTGAATTT GGATGCGGCA CTCACGCGCC TTATGTTTCG CAGCCAGTGT 6180
GTGGTGGGCT ATTCCTTGCG GTTCGGCTGG GGGGGGGGGT ACGTCTCTAT CGCTTCGGGA 6240
ATCGAGTGTA GTGCAACGGT CGATGACGCG CAGTACGAGC CCTACACGAA AAATGAGCAG 6300
GGGACTACTG TTGCCTCCAA CACCGTGTTC CCGTGCACGG TCTTGGAGGC ATTGGTGCGT 6360
GATCCGGCCC TTACCGCAGA TTACCTGCTT TACGGTATGC AAAGCTGTTA CGCAATTCCG 6420
CTCCATGTGG GGGTTTCGTA TTACCTTGCC AAGCGCTGGG GTATTGAGTG TGCGCTTACG 6480
GCCTCACTTG GCATTTCAAT GCGGACGGAT GTGCGCGTCC CCTACGCGGT ACGCATAGGG 6540
CCGGTATTCC GCGTGTAGGG CCTCCGGTGA GCCGCTCTCC TTCCCATAAG ATGGCGTTGT 6600
TGGCTGGGGC TGGGGCTGGG GCTGGGGCTT TCCAATGGAC GGGCATGTAC GTACGGTCCT 6660
ATGGAACTTC GTTGTGGGCT GCGGCTCCGG GTAGGGCTGG GACTCCGGCT GCGGCTCCGG 6720
CTGCTTGGGC ATAGCCGCTA GACAGTGTGG AGTTCCTCCG GGCGACTGCG AGCCGAGAAG 6780
TAGATGTCAA CTCGGGCTGG GGGTACGGCT CGGGCGTGGA ACGAGGTTTT TTGTGTGAGG 6840
GGGATGTGCG CGCGTGTCTT GTTCCTGCCT CCCACTTCGT AAGAGCAGGA ACCGCACCAG 6900
GGACGACGCC GGGGACCGCA GCAGCTTGCT TGTTCGTACC GCCGTACgTG CTGGGGGTTA 6960
CTCCTGCTGG GACgTTGCCG TTTGGTTTGC CCTCCCAGTC GGTGTTAGGG GGgcTGCGCC 7020
TAGCTCCAGT GCACGCGTCC TCCCTGCTTG AGGGGTTGGG CGCCACCATT TTTTTCAGAT 7080
GCAATGCCCC GTGAGgTTTG CGCAgTCGTG CTCTCGCATG GCTATGTTTC TAGCGAGAAA 7140
TACTCCCTCG AATACATTGC TGGCCTTCGA TGCGGGCGGA CGTCCCACAC CGGGAAGTGG 7200
ATGTCGGCTG GGGCTCGGGC GTGGGCTTTT AGGACTTTGT AATGGACAGG CATGTACGCC 7260
TGCGTCCTAT AAGATGTCGT CGGCTGCGGC TGGCACTGCG GCTCCGGCTC GGGCTCGGGC 7320
GTGGAACGAG GTTTTTTGTG TGAAGGGGAT GTGCGCGCGT GTCCTGTTCT TGTGCTCCAt 7380
TCGTACCAGG AGCAGGGGcT GCAGCAGCAg CTTTTTTGTT CGTACCGCCG TACGTGCTGT 7440
GGTTTGCTCC TGCTTGGACG ACGCCCTTGC TGTCTTTGCC CTCCCAGTGC GCACCTTCCC 7500
TGCTTGATGG CGCCACCATC GGATGCAATG CCCCCGACAC GTCTGAGAAG CTAAACGAGC 7560
CTCCACATTC ACGCAgTCCG CGCcTACGGT GCCCTGCGCT CCGTCTTTCT TACGCCCCAC 7620
GAGGCTGGCG CAGTCGTGTG CGGTCATGGC CATGTTGAGA AATACTCCCT GAAATACATT 7680
GCTGGCCTGC GATGTGCGTG AGGCATCCCC ATACCGGGAA GTTGATGTTG GCTGCGGCTG 7740
GGGCTTGGGG TCCGGTCAGG ACCGTGTAGT GGACGGGCAT GTACGCCTGC ATCCTATAAG 7800
ATGTCGTTGT TGACTGCGGC TGCGGCTGCG GCTGCGGGTA GGGCTCGGGT TTTCTGAGTA 7860
GACGAGGGTT CGTACGTTCG TCTTATTTCC GCGCGGGCAT ACTCAGCAAT ATTCTGCCTT 7920
CCAtTCGTAG GAGCAGCAGG AGCAGCAGGG GGCGTGGCCT TTTTGTTCGT ACCGCCGTAC 7980
GTGCTGGGAG TTCCTGCTGG GGCCTTGCCC TGaCTGTCTT TGCCCTCCCA GTCGGTATGA 8040
GGCAGGTCGC GTCTGTCCCT TAATGTGTGC CTCCTTGCCT GAGGGCTCCG GCGCCACCAG 8100
TTTCAGATGC AATGCCCACG GCATCGCCTG AGAGGCTGAA CGGGTCTCCA CACTCACACA 8160
GTCTGCGCCT ATGTCGCCAT TCGCTCCGTT TTTCTTATGC CCCATGGAGC GTGCGCAGTC 8220
GTGCGTCTGC ATGTCCATGG CATTGGTGAG AAAGACGTCT TTAAACACAT TGCTGGCCTT 8280
CGATGCGGGC GGACGTCCCA CACCGGGAAG TTGATGTCGG CTCCGGCTCG GGCTGTGGCA 8340
TAGCCGctAA CGCACGGGGA GTGCACGCGT TTTTACCATG TCACTTTCAT TCCGCAGACA 8400
AGGGTGCCGA AGTGGCGTTC GGACCAGATG CTCTCGGCAA TGCCCATGTA AGGAGCGTCA 8460
gcAAGCACGC CCTGTTCCCA CTGGGCGCTG AGCTCCACCT TCTCGAAGGG ACTGAACGTC 8520
AgTCCCACCT GGTACTGGAG CGCTCGTTCA TTCAACAGGT TGCCCGCGGG GTTAATAATG 8580
TTAAAGCGAT TGGTTGTGCC GAGCACGGAT GTGTGTGGTG CAAGCCAGGC GTGGGAACCG 8640
AGGGGGATGC GATAgcTGcA CCACGCCTTC CCCAAAATTG GCATATTGAT AGTCCCAGGG 8700
GGCACAGCTC CATTCAGTTC GTACCCTCCG TTATTTCTGT AACGGATGTA GGTGAGGGGG 8760
ATGTACACGC GTGCTTCGAC GCCGGCGTTC AGGCCGGTGA GCAGGTGGGT GTAGGGGTCA 8820
CCGCTTTTGG TTTCGAGCTT AAGGAATCCG GCAAAATCAA AGTAGTGCGC ACGAGTGGTA 8880
GCAAAGACGC GTTTGCCAAA GATATTAGTG CCTGCGGTGG CAAAGTATAT GCCAGAAGAG 8940
AGCCACTTCC AnTGCATACG CAGGAGCGCG TCTATGTTGA GCGCGTTCAT AGGTGCGCGC 9000
TCAAGGAAAG CGAGAAGTTT AGCAGTGACA ACTCTTGGAT CGGAAGAGCG GAAGACATCA 9060
CGTACTCCTT GCTCTATGTT CGGTACAAGT TGCGATACaA GCGCCGCGAG CGCGCCAGCG 9120
GCTAGCACGG TTTGAATGGC GCTGCCGAGC GTTCCTTCTG CAATCAAAGC AGCAAGTCCT 9180
ACCATCTCTA TGAGAGTGGT TTGTTCGGTG ATTCCTGGTG GCATCATGAT ATTGGGAAGG 9240
TTCTGCACGA GTTTCCCCTC CACCCGTCTA AACACTTCCC TTGCTTTGAG GATAGCTCTC 9300
TCTTGGGTCT GAGCATGTGC GTTACTCTGG TGTTGGTTAC CGGCGTCGAG GGCGAAGGAG 9360
AAGCGGAAGC CGGCGCCTGG TTCGAGGGTG AGTCGGCCTC CTACTCCCCA CAGGAGTGCT 9420
GTTTTGTTTT CGTTCTTGGA GTCTTCGGTA CCCTTAACGt AGTTCTGGTC CAGTGTGGCA 9480
TTCCCTGCCA GCTCCAACGT AAGCAGCCGC TGACGGTCGA CGCCATAGGA AAGCGTTGCA 9540
TCGGCCCCGA AGCCATACTT GCTGTGCGTG GTGTCAGTAC TATCCCAGGC ACCATTGGAA 9600
AGGAAGGAGA GGAAACCGAT GTCCACATCT ACTCCGCTGT TTCCCACATT GTGGGCCTGG 9660
TAGCCGAGTT TTGCCCCGGA GCCGGAGAAA CCAGGGGCAT AGCGAGTGTC CTTTTCTGAA 9720
TAGGCACGGG TGACAAAGGG TTTCCACAGC TGGGCAAAGT TAACCACACA GGAAGGACTG 9780
GTACCCACTG TCAGGTAGGC CCCATAACAG TGCAGGGTTG CCTGGAAGGA AGCGGTAGGT 9840
TTGGTAAAGG ACAGGGCCGT TGAGCTTTTA GAAGACGCAA GCTCTACTGC CAGGTCCTTC 9900
AGCTGCAGCT GTGCCCACAC CCCTGAGCGT GCCTCCCCTC GGCGGGTGTG GGTGTGCTTT 9960
GACACCAACG GCAGGGAAAT AGTCAGACTA TTGGTAGTGC GAAACCCATG GGTGTGCTTG 10020
CCCGGGCCAG TGCGTGGATT CTTCTGGAAC GCAATGCCCC ACTGGAGCTG GGCTGTGCCA 10080
CTGAcTGCGG AGTGAGTACG CCTGCATAAC CAGAAGCAGC ACATACCATG CCCGCAAGTA 10140
CCCCCGCTTG CATCACCTGC CTGCCCACTC ACTCCCCCTC CTCTCACTTC TACCTCACCC 10200
CCCCCACCCG TCTAGCCGCG TGTGACTACC AGGAGAGGGT GACGCCGCAC ACGATGCGGC 10260
CGATTCCCTG GGTGAGGCAC TCGGACACCA GCAGGTACGG GACATCAGAG AGCATACCCT 10320
GTTCCCAATC AAGGGAGAAT ACCGTCTTCT CTATGAGACT GGCTGAAATA CCAGCACGcA 10380
GCTGTGCACA GTACTCCTTG GTTAGATAGG TAGCTCCTAC TGCTCCACCT GCAGCAGGGG 10440
CATTCAGGTG TGcACGGTTG GTAGAGGCAT GGACCGTAAC GCTTGGCTTC ACCCAGCCGT 10500
AATCCTGCAC CGGGATGCGA TAGCTACACC ACGCCTTCCC CACCACCGGC AGGCCAATGT 10560
GcCCTGAGGA ACCGcCGGAA GGGAGAGGGT TCCCGTTATT ATTTTTGTAC AGGTCATGGG 10620
TGAGGGGGAT GTACACGCGT GTTTCAACGC CGGCGTCCAG GCCGGTGAGC AGGTGGGTGT 10680
AGGGGTCACC GCTCTTAGTT TCGAGCTTAA GGAATCCGGC AAAGTCGCCA CAGCTTGCGA 10740
TGGTGTTATC TAACACCCTG GTGCCAAAAA CGTTTGCCGG TGCTGTGGCA AAGTATATGC 10800
CAGAAGACAG CCACTTCCAC TGCGCCGTAA ACAGCGCATC GAAGGCGACA TTGTAGGTGT 10860
CAAGATACAG ACACACGGCG CTGACTCCCA TTAGAAAGGC ACGCCaTGCA GACGCACGCA 10920
GGTTCTGTAT AGCCtGACGT ATCTGCTGCC CCGCATTCAA CGCATCCGTC TTCTTCTTCA 10980
CTTCTTCGGT TATGTGTTGC TGGACACTCG CGAAAAACGC CGTTATTTCC GTTCGCATCA 11040
TCGGCACTAA ATCTGCCAGA TCCTGTTTCA CCTCATCTTG CACGCTCACC ATGGTCAGAA 11100
TGCTGTTCTG CTTACGTGAT AGTTGCTGCT TAATGAgCGt TCGCCTACCA TTCTCGCAGT 11160
ATCGCCTAGA CTACCCCCGA CCGCTTGCAT GTTTGCATTA TTTTTTGCCA CAkCCGCTTG 11220
CACTTTCTGA TTAATTTCAG TGACGATTTG CGTCTGTACC TGCTCAAACC CCTTCACCAn 11280
CCTGTCCGCA TCGTACTGCA GCAAAACCTG CCCCATCAGG GAAAATGCAG GAAGTGCAGG 11340
AAgCGGCGGC AGGTTCGGCG GACTTCCTGC GGGGTGTGAA GATTTTGCAC AACyTTACCG 11400
GTAGGTTTAG CAGGATTAGG CTGAACTGCC TCTAGCGCGT TTATGTACGT AGTCCCCCGA 11460
GATTCCAGCG CGCTTCGAAC TCCAGCCGTT ACTGTCTGCG TCGCCTGTTG CACTACCTGG 11520
GTTACCCAGG CTTCCTGTTT TTGACTTTCT CCCTGGAAGA GGTTATTTGA GAGGGCGGTG 11580
AGTTCACTCT GCGCCCtCTG TGTGCGATTT TGAAAgTCCT GTGCACTCTG GTGTTGGTTA 11640
CCGGCGTCGA GGGCGAAGGA GAAGCGGAAG CCGGCGCCTG GTTCGAGGGT GAGTCGGCCC 11700
CCTACATTCC ACAGCAGTTT ATCCTTGTTC TGATTGTTTG CGTCCTTCTG TGCACCGATG 11760
AGGTATCCGT CTTCTAGCGT AACATTGCTG GCAAGCTCTA CCGTGCACAG AGGGTGTCCT 11820
GCACGCGCAT ACATTAGCTT CAAGTCTGCC CCAAAGCCAT ACTTACTGTG CGTGGGGTCA 11880
GTACTATCCC AGGCACCGTT AGAGGCAAAG GAGAGAAACC CCACATCAAG GCTGACCCCA 11940
CTGCCCCCAA TGTCCTGTGC CCGATACCCA ACCTTGCCGC CTAAACCCCC AAACCCCGGC 12000
GCATACTGTA CCGCATCCTC CTGGTACTGC GCTGTCACCC ACGGCTTCCA CAGCCGGGCA 12060
AAGTTCGTCA GAAACGTGGG GTTCTTCCCA ATCGTCAGGT AGGCCCCATA ACAGTGTAGT 12120
GTCGCCTCTA CCTTCCCCTT GCGCTTAACG GCAAAACCTG CCTTCCCCTG ACTCAGGTCC 12180
GCCTGCAGGT CCGCCACCTT CAGctCCGCA TACAGTGCCG GGTGCTGCCC ACGGCGCGTG 12240
TGGGTGGTGC GCATAACCAG GGGAAAGGAT ACTCCCACCG TGTTGGTAGT ACGAAACCCG 12300
TGCTTCAGAT TGTAGGGACC GGTGCCCATA ACTGCACCAG GGGCCTGGCC ATGACTGCCT 12360
ACCCCCTTGC CATAGCTGAT GCCCCACTCA AGTGTGGCAG AGCCAGTTAG CTTCGGGGAA 12420
AACTCCTGTC CGAGCAcTCC CCgCTCGCTC CTACCCCCAC CACCACACAC AGCACATCCC 12480
CCACCGCATG CACCCCATGm TACCTCACCC CCCCCCCCGg cCyTGTCTAg TAGCCCCcTC 12540
ACCCTGCCAC cTGCACACAC GCAAAAACTC ACCACTCCTT GCACCTCCTT GTCCCCTTGG 12600
GTTACACTGT GACCCCTTAT TTTGCGCATG TTTGTCGATA AAGGGGGGAG GATTCTTGTG 12660
AGAGAGAAGT GGGTACGCGC GTTTGCGGrC GTTTTTTGCG CCATGCTGCT CATCGGCTGC 12720
TCTAAGAGCG ACAGGCCGCA GATGGGAAAC GCAGGGGGCG CAGAAGGTGG TGAtTCGTCG 12780
TTGGAATGGT AACCGATTCA GGGGACATCG ATGACAAGTC CTTTAACCAG CAGGTGTGGG 12840
AAGGTATTTC GCgtTCGCAC AGGAGAACAA CGCGAAGTGC AAGTATGTGA CTGCTAGCAC 12900
TGACGCTGAG TACGTGCCTA GTTTGTCTGC GTTTGCAGAT GAGAATATGG GGCTCGTGGT 12960
AGCATGCGGC TCTTTCCTTG TGGAGGCGGT CATCGAGACT TCTGCTCGTT TTCCTAAGCA 13020
GAAGTTCCTG GTCATCGATG CGGTTGTCCA AGACCGGGAT AACGTTGTTT CTGCAGTGTT 13080
TGGTCAGAAT GAGGGGTCGT TCCTTGTCGG CGTTGCAGCG GCGCTGAAGG CGAAAGAGGC 13140
GGGAAAAAGC GCCGTCGGTT TCATCGTTGG CATGGAGCTG GGTATGATGC CTCTCTTTGA 13200
AGCGGGTTTT GAAGCGGGGG TTAAGGCCGT CGATCCCGAC ATACAGGTAG TGGTTGAGGT 13260
TGCCAATACC TTTTCAGATC CCCAAAAGGG GCAGGCGCTC GCGGCAAAGC TGTACGACTC 13320
GGGCGTGAAT GTCATTTTTC AAGTAGCGGG GGGCACAGGA AACGGCGTTA TCAAAGAGGC 13380
GCGCGATCGT CGTCTCAATG GTCAGGACGT GTGGGTTATT GGCGTAGATC GTGaCCAGTA 13440
CATGGATGGG GTGTACGATG GGTCGAAGTC TGTGGTGCTT ACCTCCATGG TCAAGCGTGC 13500
GGATGTCGCT GCGGACGGAT CTCAAAGATG GCGTACGATG GCTCTTTTCC CGGGGGGCAG 13560
TCCATTATGT TCGGGCTTGA AGACAAGGCA GTGGGGATTC CTGAGGAAAA TCCCAATTTG 13620
AGCAGTGCGG TTATGGAGAA AATTCGGAGT TTTGAGGAGA AGATTGTCTC GAAGGAGATA 13680
GTGGTTCCGG TGCGATCTGC ACGCATGATG AACTAAGGGG GAGAGGTGCC TCCCGTGCCT 13740
GCGCGCGGGA GGCCTTCTTC TTCATCTGAT TTTTGTTTGT ACGGCATGGC CGTCGTGAAT 13800
GGCTTCTGTG TGCAGGACAT TCCCTACGGG TCACGGGTTG TTTTGCCGGG GCGTATGCGT 13860
TCTTCTTCTG CGGGTGCGTA GAGTGGGGCG TGTGTCTCGA CCCGCCCGTG GTCAGTGGGT 13920
ATGGGGACGT CCAGTAATGA ACTTGAGGGA GGGGCTATGC CATACGCGGT GGAAATGCGC 13980
GATGTAACTG TCCGGTTCCC AGGCGTTGTT GCCAATGACT GTGTTTCTTT CGGTGTGCAG 14040
ACCGCGGAGG TGCATGCCTT GCTGGGAGAG AATGGTGCAG GCAAGTCTAC GCTCATGGGA 14100
GTCCTTTTTG GTACGTGTCC GAAGCAATCT GGAGAGCTGT TTGTAGATGG CAGGAGTGTG 14160
TGCATCCGTA gTCCGCGCGA TGCGcGCGCC ATGGCATTGG CATGGTGCAC CAGCACTTTA 14220
ATCTGGTTCA CAATCTAACC GTTAGTGAGA ATATCGTTCT TGGCGTCGAG CCTCGTGCGC 14280
GCTTTGCTCG CACGGATGTT CGTGCTGCGC ATCGCCAATG GGGAGAGCTG TGCGAGCGCT 14340
ACGGACTTGC GGTgGACCCA TACGCGAAAA TTCAGGACAT CACTGTTGGC ATGCAGCAGC 14400
GTGTTGAGAT TCTCAAAATG CTTTACCGCG ATGCTCGGGT GCTCATTTTT GATGAACCTA 14460
CCGCAGTTCT CGCTCCACAA GAAGTGCAGC AGCTGATGCA GGTGATCAGA CGTCTTGCTC 14520
GTGAGGGTAA GGCGGTGGTG CTTATCACAC ACAAACTGAG TGAAATTAAG GCAATCGCCG 14580
ATCGCTGTAC GGTAtGCGCA GGGGGGCGTG TATCGGTACG GTTTCTGTGG CTGAGGTGGG 14640
AGAAGAACGG TTGGTAGAAA TGATGGTGGG CCATGCGGTG GACTACGCGC TGCCTCGCGC 14700
TTCAAGGAAG GATGGGGCGT GTGTATTAGA GGTGCGTTCC TTGAGCGTGG GGGCGCGCCG 14760
CGTTACGTCT GGGCAGATGT GGgctGACGC GTCTCCTTCT GCAGCGCCCC GCGCGTACGG 14820
CGTCCGAGCC GTAAGCTTTC AGGTGCGGTG CGGGGAGATC CTGTGTATCA CCGGTGTAGA 14880
CGGCAACGgT CAGTCACAGC TGCTCGAAGC AATTGCAGGT CTTGTGCCGG TGTCGGAAGG 14940
TCAGATTCTG CTTGACGGGT GCGAGATAnC amACACTTCC GTGCGCGAGC GTGTTCTGCG 15000
TGGTGTCAGT TACATTCCTG AGGATCGGCG GAAGCACGGC CTTGTGCTCG ATTTTTCTGT 15060
GGAAGAGAAT ATGGTCTTGC GCTCGTATTT TCGCGCGCCG TTTGCGCGGC GTGGCATTCT 15120
CGATCGGCGT GTGAtTGCGC AGCATGCGCA CGCGTTGGCG AAAAAATTTG AGGTGCAATC 15180
CGGTGCGCTC GGGTGTGCGG TTCGTGCGCG TACCcTTTCA GGAGGTAATC AGCAAAAGGT 15240
TATCATTGCG CGCGAGTTGC ACCGTGCACC GCGTCTTTTG ATTGCCGCGC AGCCGACGCG 15300
CGGACTTGAT TTGGGTGCGG TTCAGTATGT TCATCGCGCT ATTGTCGCCG AACGTAATCG 15360
GGGGGGTGCA gTGCTCCTCT TTTCCCTTGA TATGGATGAA GTGCTTGCAC TGGcAGATTC 15420
TATTGCAGTT ATGTACGAGG GAGAGATAgT GGGGACCGTG CACGCGTGCG ACGCAACAGA 15480
GCAAGAGCTc GGGCGTCTCA TGAGTGGGAT GCGGAAAAAA GAGACTGCGG GCAAAAAAAC 15540
CGGGGTACAG GGGTGATTGC GCGTCTCCGG GGGTGCCTGG TTCACCCCAA ATACCACGCG 15600
CTGCTTATTC CCTGCTTGGC GGTGATCTTG GGGTTTGCCG TAGtGCGGTG GTAATGGCGG 15660
TGTCAGGTTT GCACCCTAAA TACATTCTCA TAGCTTTGGT ACGTTCGATG TTTGGCGTGA 15720
ATGTACAGGC TTTTGGCACC GGCAGGTCCG TGTGGAACTT CAGGTATATG GGCGAAGgAG 15780
TGGTGACGTG TCTGCCGCTG ATACTCACAG GACTTGCGGT GGCATTTACG TCCCATATGG 15840
GATTGTTCAA TATCGGGGCA GAAGGGCAGC TCGTAGTCGG TAGCGTGTGC GCAGTGTGTG 15900
TCGGTGTCCT TTGGCATGAG CACCTTTCTT TCTTTACCAT TCcTGCGGCG GTTCTTGCAG 15960
GAATGGTAGG GGGAGGACTG TGGGGTTTGA TACCAGGGGT GTTGCGCGCA GTGTGCGGGA 16020
TCAGTGAGGT GGTGGTTACC ATTATGCTCA AtACGTAGGA CTGTATGGGG CGAATTTTGT 16080
AGTCACCGCT TTGCCTGGGA GCGACTTGAT GCGTACGGTG TCTTTACCCC CAGCTGCGAC 16140
GTTGCACAGT GATTTTCTCT CGCGTGTAAG CAATGGGTCG CGTCTGCATT GGGGCTTTTT 16200
GCTTGTGATA GCTGCACTGG TTAGCTTTAA GTTTCTCATT GAGAAAACAA CGTTCGGCTA 16260
TGAGCTCCGT GTCGTTGGTG CTAGTGCCGA AGCGGCCCGC TATGCGGGgA TTCACATCAG 16320
GCGGCGTGTG ATGCTTGCAA CGAGCATTTC GGGTATGTAT GCAGGGCTCG CCGGTGTGTT 16380
GCTAGCGATC GGTACTTTTT CGTACGGGCG GGTGTTACCC GGATTTGAAG GGTATGGATT 16440
TGaGGGGATT GTGGTGTCCT TGGTGGGCCG TAATACGGCG TGGGGGTGTG TGTTCGGCGG 16500
TTCGCTGcTC GGTTCGTTGC GCGCGGCAGG CCCACTCATG CAGTTGAACG GAtGCcGAAG 16560
aGGTGTCGGT AATTATCGTT TCGGCGATTA TTGTCTTTCT TTCCATGCAC AATGGAATCC 16620
GGGCGATGCT CGTCAGGTGG GGGAGGCAGG GTGCGCACGT ATGAACACGT TTTATTCGAT 16680
GGTGGCGCTG ACGCTTGTGT TTTCAACCCC TATTTTGATT ACTGCGTTGG GGGGGTTGTT 16740
TTCCGAGCGG AGCGrGGTGA TAAATATTGC CCTTGAAGGG TTGATGATGT TTGGTGCTTT 16800
TTCCACTGCT ACGGTGACGG TCCTGTGCGA GCCGTATACG ATAGCTGCTC CGTGGATTGC 16860
ACTGGGAGTT GGCAtGGcAG TTGCCGCGTC GGTGGcGTTG TTTTACGCAT ATTTGAGTGT 16920
GCGCTTGTGC AGTGATCAGA TCATCGCAGG CACTGaTnCA ATTTGTGTGC AACAGGAATG 16980
ACGGTTTTTT TCGCACAGGT TATTTTTGGG CAGCAGGGAA CGCAGGCGTA CTCTCGTGGG 17040
TTAGTTAAGA CTAGTTACGG TTTTTTCAGT CGCATTCCGG TGCTTGGCCC GATGGTTTTC 17100
ACCCATACGT ACCCGACAGT ATATCTAGGT TTTGTGCTAG TAGCATTGGC GTGGTACGTA 17160
CTGTATCGCA CGCCTTTCGG TGTGCACGTG CGTGCCACAG GGGATCAGCC GTATGCAGTA 17220
GACGGTGCGG GCTTGAGTGT GTTTCGTCTG CGGTCTGCGG CGGTGGTAAT TTCAGGACTC 17280
TGTGCGGGAC TTGGCGGCGG GGTGCTGATA CTGACGCAGG ATATCCAATA CACCGTCTAC 17340
AGCACGCATG GGACGGGGTT TATCGCACTT GCAGCCTTGA TTTCAGGACG GTGGCATCCT 17400
TTCGGGGTAC TGGTGACAAG CGTTCTTTTT GGCTTTTCAC AGATTTTGAA CGTGTATGCC 17460
ACGAGTGTTG AGTTGTTGAA ACATTTGCCT ATCGAGCTGT TTAGCGCGCT TCCGTACGCG 17520
CTGACGGTTG TGGTGCTGcT ATTGTTTGGC GGCCGTGGGG AGGCACCGCG TGCCATCGGT 17580
CAGCCCTATG ACCGTGCGCG GAGGTATTAA TTTTTGGAAG AAGGgTGAGT TGGGTGGCAC 17640
CTATTCTTTG CAATGAGCAC CCGGGTAGGG GGTTCTGTAT TCCGGGCGCG TTTTTTGCTA 17700
CACTTAGCGG GCTGTGGCTG TGCTTCCTTC GGTGCGTAGG TTTGAGTGCT CACTGTTTGT 17760
GGTGCTCGTG CTCTGCGCTC TGGCCGTCTT CGATCCGCTT TCTGGCTTTG TGCAGCAAAA 17820
GTTGGCCGGT GTGCAGCGCG TCTGGCTTGG CTTAGTTGAG GAGTATTCAG GTTTACGTTT 17880
TCAGTATGAT TCTCTCTCCC CTTCTGTTCT CCGCGCAGTT ACGCTGAGAA ATGTTCGTGT 17940
TCGGGAAGCA GTTCGCGGTG AGCAGGTTGC CGTCTTTTCA AAAATAGTCG TTGCGTACAA 18000
TATTTTCTCG CTTTTTGGTT CCAACCCTGT GCGGGGTATT CGGGCTCTTC ATGTTCATGA 18060
CGGAGCAGTG GACGTCGACC TGTACCGTCA CCGTCATGTG AAAGAAAAGT TACAAAAACT 18120
GTTCTCGAAA GACGGGGAAA TGGCTTCGTT CTTTGCCGAT TTGCGCGAAA TAGACGTGCG 18180
CGTCCATAAC ACTGCAGTTA CGGTGCGCAg CGATTCCAGA CGCGCGCACC TTTCTGTGCC 18240
GCAGGGTAGG TTTTCTTTTG CGGAAACTGG CGCCTCGTTC GCTCTTTCTT GCGAAGCTGA 18300
GTATGTCGAC ACCCGTTCCT CTTCCTGGGG ACCGCTGTAC ACACACCTGG ACGCCTCAGG 18360
CGTGTTTGAA ACGTCGTTTA CGTCAGGTTC CGCCACCCTC GAGCTTGCAC CCCCGAGCGG 18420
CTCTTTTTTC AGTGTGCCGA CGCTTACTCT CGTGGCAATT TACGCAGATG ACCTGTTTAA 18480
GTTTCACACG GCGCGGGGCA TCTACCCTAT GGAAGTTTCT GGGCAATGGA ATACTGCAAC 18540
CGGCGCTTGT GAAGCTTCCG TGCGCTGTGA AAATTTTCGT CCCCTTAAGT GGGCGCGGct 18600
CCGCGACACC CACGTGCCAG CACAGGGTAT GCAGGAATTG TCTGTGAGCG GGAACGTTCA 18660
GGTTGGGTAT ACCCCCATAG AACAGTGGCG GTGGAGTGCG GATGTGCACG CGCACACCCC 18720
GTATGTAGTG ctTGCGCCGG GGTATCAGCT GGAAGACGTT GTCGCAACGT TACAGGCGCA 18780
CGGTGATCCT GCACGGATTC AGGTAGAAAA GATATGCGCA CGAGGTAGTA ATCTTGATGT 18840
GGACGGTGCG TTCGAGcTCA CGCTGGACCG CTGGATCCCT TCAGGGGTGC TTACGGTGCA 18900
CAGGCTGCCG CTTCTTTCGG GGGCATACCT TTCAGCGCAG tGCGTTTTCG CCCACAGGGG 18960
GTTGGTTTTG TGTGCACCGT CCCGCGGATA CAGGTGGGGG AAGCGTTTCT GGAGGACGTG 19020
GCGCTCTCAG TACGTGTGGA TCCGGCAAAA ACGGATTTCC GCCTGGTGGC TGCAGACAGC 19080
ACGGGGCGCT ACGAGTGTGA CGGATCATAC CTTGCCGCGA ATGnGGGGCA GTCTCGCTTT 19140
CTTGAGGCAC ACGTGGCGTT TGAATCGGTG AATGTCGGTG CGCTGTACCA AATGGTTGCT 19200
GCCTGTACGT CACCGCAGGC GCTTCCACGC TCGTGACGCG CGCACTGGTG CCGTTACAGT 19260
CAACAGCAGA TTTTTACGTT TCAAGTGATT TTCGTGATAT TTCGTACAAT TGTGTTCGTT 19320
TGGTGCTCGC ATCGGATGAA ATCGCTGACC TGTACGCCCT GCTGTCAgTG CAGGGGACGG 19380
CAGCTTCCTT TTCGGTCACG GA ATTTCGC TGCTGTGTAA GGGACTTGAG GTACAGGGGA 19440
ACGTGATGGC GAATTTTGAA CACGGGGGAG ACGCCCTCTT TGAAAGTGTC CTATCCATCA 19500
ATTCGGTTCC GTATCGTACC AGGGGAGTAT ATGCCGACCG TACGCTGACG GTGTATGGCG 19560
ACTATGATTT TTCGGTGGTG GCATCGTTTG ACGAGCGCGC AGGGgTTACC GGCACGTTTC 19620
AGGTGCAGAA TCTGCCGGTT CCTCTCTCTC AGAGTCTTTT TGATTGTGAC AGTTCTTTTG 19680
CAATGCGTAG TGCCCACTCG TGGGAGGTGC GCTTTCATCA CCTGCACCTC CGTTCTGGGG 19740
CGGTCGCCGC AGtGGATCGG AGCAAATAGA AACGGTCTTG CGCCTTGCTG GCGTGGCGAA 19800
CCAGGCCGGT GCTCTGTTTG ATCAGGTGTT TTTTGGTTCT CGCGATCGGT ACTTGGCTGG 19860
AACGGCGAGC TTTGCCGTTG TGCCGAGAAC AGGGCAGCAC GAGCAGGCGC GGTATGAAAC 19920
GGCCGTGCGC CTTGCATCTG AAGATGCGCA GGAGCAGGTG CAGCTTAACG CGCAGGTAAC 19980
CGTGGGGGAA CACGTCTATG TGGATAGCTC AGGGCGAATA GATAACGTAG ACGTGGGGCG 20040
TTTTGTTGCA GGGCAGGGGG AGCGCAGTCG CGTCACCGGG TCGTGGACTG TGCTGGGTAC 20100
GATGCAGGAT ATGTCTGGAC AGGTGCAGGT AGATTCACTC GAGCTGATCG CCAAGGGAGT 20160
GCCCTTTCAC CTGCGGGGAG GATGTGCACT TGATGACGGT wcgCTTGCGC TTTTGCCCAC 20220
CCAGGTGACG TGGGGGTCAC ATCAGTTTGC TGAcCTTGCA GGAGAATGGG TGCCGGGTCA 20280
GGCGCGTGCG TGGGTGCGCA CCACGTACTC AGGCGCGTTT GAAGGGCAGC CGACACATGC 20340
CACCTGTACG CTCACCCTTG CCGGATCCCC TGTGGATTCG GGTAAGGCGA CATCTGCAcT 20400
GCGCACGTCG TTTCTCACGC CATTTTTGCA GACGCACAGT CAATACACGA TTTCTGCGGA 20460
GTTTGAGCAC TGGCGCATCG CCACATACGA GGGTGAAAAG AACCGCATAC TGGTAGTGCG 20520
CGATCCGGGC GTATGGGCGC TGTACGCCGG TGAGCACGAC GAAATTACCG GATTTATGCT 20580
GGATGATGGT TCAGTGTCGT TGCAGGTGGC GCAGAGTTTG CCTGTTCATT TTTTCTTGAA 20640
CGGGTCGTTG AGTGCACAGC AGGTAGACGT GCAGATTCAG GATATCTTTG TTGATTTGGC 20700
GCGCGTATGG GCGTTTACGG GCATACGGCA TGTGCGCGTG CACGAAGGAG TTGCGGTAGG 20760
AAACGTGACG GTATCTGGAA mnCGTGCGCG CCCGGTGTTT GAAGGAAAGT TACGGGGAAA 20820
GGAGGTAGTT GCCAGCGCGC CTGGGTATGC ACCTGAGCGC TTTGGGCCAG GTTCTATCGA 20880
TATAGTAGCA CACGGCAGCA CGCTCATAGT GCCGTATACA GAGTTTCCCG GTCCGACGGC 20940
CCGTCTTTGG GGTGAGTGTG TTGCACAGCT GAATGGATTT TACCCGGATG AGGTGGTTAT 21000
CAAATGCGGG ACGGTAGGAG ACGCGCTGGG TGCGATTCAG ACGGATAACC TGCTTTTTGC 21060
GATGGACGGG TCAGCCGGCT GCGATCTGGA GCTTCGTATA ACGCCGCAGT TGTTATCGAT 21120
AAACGGAAAG GCGCGCTTTG ACCGCGGGTA TTTTTTACTG AATTTCTCAG GGATTGAAGA 21180
GTTTTACACA AAATACGCGG ACAGTGCGCA GAATTTTCAG ATGAATCTTT TGCTGTCTGC 21240
GGGAAATAAA GTGGAGTTTC GTTGGCCGCG CTCTGATTTT CCTATTTTGC GGACGCTGCT 21300
GCACGCGCAG GAACCGTTTG AGTTCATAGC CGATCCGGTT TCCGGGTCAT TTTATGTTCG 21360
CGGGTTTGCT CATCTTAAGG GGGGAGAGTT TTTTTGGATA AAACGAAACT TTTACCTCCG 21420
GGAGGGGACG ATTCACTTTG CACGCGATAC CCAAACGGCC GATCCGCGTA TTTCGTTTCG 21480
TGCGGAGCTG AAGGACAGGG ACACGCAGGG GAGGCCGGTG AGCTTGATTT TGTCCGCTGA 21540
GGACCAGGTG TTTTCTAAGC TTGCGCCAAA GCTCAGGTGT GATCCGCCGG TTTCTGAGCA 21600
AGAACTGGCG AAAATTTTAG GACAGGTGGT GCTGGGGGAT TTGACAGAGG AGAATATTGA 21660
GCAGAACGTG GCGAGTATCG CTTCAGATAT TCTTACGCAg TGGGGGATTA TGAAGCGGGT 21720
GGAGGATAAA ATCCGCTCAT TTTTGGATTT GGACGCGTTT TCGTTCCGCA CCTATGTTCT 21780
GCAGAACGCG ATTTTTGGGA ATTTGTTCAA TAAGGACCGC AGCAAGCCGC TGACAGTGGG 21840
TAACTATTTT GACAATACCT CCCTCTACGT AGGGCGTCGT CTTGGCCGGG CGGTGTACGC 21900
GGATGCGCTG CTTCACTTGT CTCAGTATGA TCCGcTTGCG CCAAATAATT TGGGGATTAA 21960
AnArCtGCGG CAGGGAGTTT GCTGTTCCGG CCGGAGCTGG GGCTAGAGTT TGCAACGCCC 22020
TTTTTTTCGT TGCGGTGGGC GTCGACGCCG ACACGTCTTG ATTCACTGTT TGTCTCTGAT 22080
ACTTCAATGC GGGTGTCGTG GAGTTTTGCG TATTGAGGCT CAAGGCAGCT CGTAAGGAGA 22140
AAGGAATGCT CAAAAAAGCC aGTGCCTTCC TAATTGCAAG TTGTTGTGTG ATGTCGCTGG 22200
CGTGGGCACA GGCAAACGAC AATTGGTACG AGGGAAAGCC TATCTCTGCG ATTAGTTTTG 22260
AGGGGCTCGA ATATATTGCT CGCGGCCAGT TGGACACGAT TTTTTCTCAA TACAAGGGAC 22320
AAAAGTGGAC CTATGAGCTG TACCTGGAGA TACTGCAAAA GGTCTATGAC CTTGAGTACT 22380
TTTCTGAAGT TTCGCCTAAG GCGGTGCCCA CCGATCCGGA GTATCAGTAT GTGATGCTAC 22440
AGTTCACGGT AAAGGAGCGT CCTTCGGTGA AGGGCATCAA GATGGTAGGG AACAGCCAAA 22500
TCCGCAGTGG GGACCTTTTG TCTAAAATCC TCCTGAAAAA GGGAGACATT TACAATGAAG 22560
TAAAGATGAA GGTGGACCAA GAGTCGCTCA GGCGTCATTA CCTGGACCAG GGCTATGCGG 22620
CGGTTAAGAT ATCCTGCGAG GCAAAAACTG AGGCgGGGGG CGTGGTGGTA CAGTTTACCA 22680
TCCAGGAAGG TAAGCAGACT GTTGTCTCGC GGATACAGTT TAAGGGAAAT AAGGCGTTTA 22740
CCGAGTCGGT GCTCAAGAAG GTGCTTTCCA CGCAGGAGGC GCGTTTTTTG ACCAGTGGGG 22800
TGTTCAAGGA GAATGCGCTG GAAGCGGATA AGGCGGCAGT CCACTCA AC TATGCAGAGA 22860
GGGGATACAT TGACGCGCGG GTAGAAGGCG TGGCAAAGAC GGTTGATAAA AAAACTGACG 22920
CCAGTCGCAA TCTGGTTACG CTTACGTACA CTGTGGTGGA AGGTGAGCAG TACCGCTACG 22980
GCGGGGTTAC CATTGTGGGT AACCAGATTT TTAGCACCGA GGAGCTGCAG GCAAAAATTA 23040
GGCTCAAGCG CGGGGCCATC ATGAATATGG TGGCCTTTGA GCAGGGCTTT CAGGCGCTGG 23100
CGGATGCGTA TTTTGAAAAC GGATACACGT CAAATTACCT GAACAAAGAA GAACACCGGG 23160
ACACGGCGGA GAAAACGCTT TCGTTTAAGA TCACGGTGGT GGAGCGCGAG CGCAGCCACG 23220
TCGAGCACAT TATCATTaAG GGAACGAaGA ATACAAAAGA CGAGGTTATC CTGCGTGAAA 23280
TGCTGCTGAA ACCGGGGGAT GTGTTCTCTA AGTCAAAGTT TACGGATAcT TGCGCAATCT 23340
GTTCAACCTG cGCTATTTCT CGTCGCTGGT GCCGGATGTG CGGCCCGGCT CTGAGCAGGA 23400
CCTGGTGGAC ATTATCCTGA ATGTGGAGGA GCAGTCGACG GCAAACGTGC AGTTTGGGGT 23460
GACGTTTTCT GGGGTGGGGG AGGCAGGCAC GTTCCCCCTT TCGCTCTTTT GTCAGTGGGA 23520
AGAAAAGAAT TTTTTGGGAA AAGGGAATGA AATTTCAGTA AATGCAACCT TGGGGTCTGA 23580
GGCGCAGAGC CTGAAGCTCG GGTATGTGGA GCGCTGGTTT CTGGGCTCTC CGCTGACGGT 23640
GGGCTTTGAC TTTGAAcTTA CGCACAAAAA TCTCTTTGTG TACCGCGCGG GTTCATACGG 23700
CAACGGGcTG CCGCACCCgT ACACGAGCAG GGAGCAGTGG GCTAGTTCCC CTGGGCTGGC 23760
AGAATCGTTT CGCCTCAAGT ATTCGCGCTT TGAGTCCccC ATCGGCGCGC ACACCGGGTA 23820
CCAGTGGTAT CCGCGCTATG CGGTCATTAG GGTGAACGGG GGGGTGGACT TTCGGGTTGT 23880
AAAGAATTTT TACGATAAGG ATAACAATCA GCCCTTCGAC CTGACCGTAA AAGAGCAGCT 23940
GAACTGGACC AGTATCAATT CGTTTTGGAC GAGCGTTTCG TTTGACGGGC GTGACTTTGC 24000
GTACGACCCG TCCAGCGGCT GGTTTTTAGG ACAGCGCTGT ACGTTCAACG GGCTCGTTCC 24060
CTTTCTCGAA AAAGAGCATT CGTTTCGCTC CGACACCAAG GCCGAGTTCT ACGTTACCCT 24120
GCTCAATTAT CCGGTCTCTG CCGTGTGGAA CTTAAAGTTT GTCTTGGCTT TCTACACCGG 24180
TGTGTCCGTT CAAACGTATT ATGGACGGAG GAAAAGCGAA AACGGAAAGG GCAACGGGGT 24240
GCGGTCCGGC GCGCTGGTAA TAGACGGCGT GCTGGTAGGG CGCGGGTGGA GCGAAGACGC 24300
AAAGAAAAAC ACCGGAGACC TGCTGCTCCA CCACTGGATT GAGTTCCGCT GGCCGCTGGC 24360
GCACGGCATT GTGTCCTTTG ACTTTTTCTT TGATGCGGCA ATGGTGTACA ACATCGAAAG 24420
TCAGTCCCCA AACGGGTCAT CGTCCGCCAG CAGCTCCAGC AGCAGCAGTA GTAGTAGCAG 24480
TAGAACCACC AGCTCTGAAG GACTGTACAA AATGAGCTAC GGTCCGGGGC TGCGCTTTAC 24540
ATTGCCGCAA TTTCCGTTAA AATTGGCGTT CGCAAACACC TTCACGTCAn CCGGCGGCAT 24600
CCCAaAAACa AAGAAAAATT GGaATTTTGT GTTGTCGTTC ACGGTAAATA ATTTGTAGCG 24660
TTCCCGTGnC CGTTTTGAAA nGGTCCGGGG GCTGCGTCC 24699
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4637 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TGCCATGGAA ATGTGACCTA CGCCTCCCTT TATGGGCCAC AGGCTGGGAA TGTCAGAAAA 60
AAGTGCTGTT TGCGAATTGA GCAACTTTCC TATCTCCCGT ACTGGCTGCA CAGAGTTCTG 120
AAAATAGGCG GCTATGCGCC GCAGTTCCTC TGGCTCCGAC CCAACGCCCT GCGCGTTTTT 180
GCCATGCTGT TCTGTCACTC CAACGTTTCC CTTTTGTGCT CCATCCTTTG CTGACGCATC 240
ATCCGGCTCG TTAGACTGGT TTGTGTCGGA TACACTTCGT ACTGGTTTTT CTGCCCCAGA 300
TTGGAGCCCA CTGAGGCCAA CCTGAGAGAA AGTTTGGGAA AGGGAAGTCT GGAACGCCTT 360
AGAAGTGCGA ATAAGTTCTG CTGCTTCTTG GCGCAGGAGT GCAAGGTGCA CCTGCGTTCT 420
GTCCACCTCT GCGCGGACCG CCGCCAGGAA CgTGCGGAGG TGACAGACTG ATGGGCAAAC 480
CAAAAAAAAG AAGCCAACAC GCCAACACCG CACAGAAGAA CGCACAAGAG CGTGCGAGCG 540
GTAGTACAGA AAGTCCGGGC CGCACACTGA GAGTGTGGTA CTAACATGAC GGTGAGTTCT 600
GCACGGCCTG CGGCGCCCAG GCGCGCTGCG GCTTCGCGTA CAAGCCGCGC ACAACGCGTA 660
CACCGGTCAA TACAAAAACG AACGAACGCG TACTCAATAC GCTTGTATCT TCGGATGCGT 720
GCCATGCGCA AGACCTCTCC GGGGAAAACG GTATCACCGC GGTTAGATAC TGTCAAACCG 780
TGGAACTACG GGACGGTCTG AGCACGAGGA CGCGGGCACC CAACCCAAGC TTCGGTTCTA 840
CTTGCTCTTT TCTTTAAAGA GGACCAAGAG GGCACACGAG CCCCAACCCT GGCCAGGAGC 900
GAGCACTGGC TTGGCCCCCG CGCTGAGGGA AAAGCGGGAG GACTTTTCCG TGCTGCTATA 960
GCTGATTGTT GATATCAGCC AACGTTTTGC GCAGGGCAGG GTTCTGCGCG TAgTAGTACT 1020
GCTCGAGctC CGCTGCAGTA AGCCCAGAGA GCTCGTGGCT CATGTAATGG AACCACTCGT 1080
TATCTGCCTC GCTACGTACC TCGTGCTTGA TGCACCAGTA GTCCGGCAAC ACCGGGGGCT 1140
GTTTTTCTCC CACGAATACT TTGTAATTGA AGACAGCCAC GCGGATGTCG CCACGCGCAA 1200
GGCCCTTTTG CATAAGCAGG GAAGCGATGT AGTTGATGGT TGCGCCTGAA TCGAAGATAt 1260
CATCTACGAT GAGCACCTTA TCCCCGACGC GTAGGTACTC AGGAGGGTAG GTCCAGCCAT 1320
CTACGCTGAT GAcGCGCCss TTACGCAAAT CACAGTGCGA GTGAGCAACT ACCGCTGCGT 1380
ACAGGATAGG AGGCTCTGCC TTGTACGCGA TGGTTAAATA CTCATTGAGC ACGTTACCCA 1440
GATATACTCC ACCCCGTATG GGGACGTACA TAACCGTTGG CACGAACCTG TCTGCCACGA 1500
TGCGCCGGGC CATACCGAAA CCCTCATCAC GGATCACATT GTACGGAATA AATCGCTTTT 1560
TCACGTTAGC CTCTCCTGCA CGAGCACGAA AACACCCTAC ATCTAATGCT TTTTTAGCAT 1620
CATGGCAAGC TCTTTTTCTA TTCGTGTCGT GGCCTGGAAC TGTCTTTGTT GAaAGTTCGC 1680
CTGAATATTT TATGCTCCTG CGCGAGGGCC CCCGTGATAG AAAAGTTGGA AGAACTGCGC 1740
GCTCAGTGGA GAAAACTACA GCAGGAAGTG GAGAATCCTT CGCTTTTCTC TTCCACTCAG 1800
AGTTATCGTG AACGTATGCG CGATCACGCC TATCTTTCCA GACTGATGGA AGAGTATGAT 1860
CGCTATTTGC TTACTGAGAA GCAGTTGGaA GACGCGCACG TTCTCATCCA AGATGAGTCG 1920
GATGCTGATT TTAAGGACGT TATTCGGCAA GAGATCCGTA CACTTGAAGC TGCACTGCAC 1980
ACGAGTCAAA AGCGACTAAA GACGCTGCTT ATTCCCCCCG ACyCTTTGCA AGAGAAGAAT 2040
ATTATCATGG AAATTCGCGG CGGTACCGGC GGTGATGAAG CAGCGCTCTT TGCTGCAGAT 2100
CTATTTAGAA TGTACACGCA CTACGCTGAG TCAAAACAAT GGCGCTATGA AGTCCTTGCA 2160
GTGAGCGAAA CAGAGTTGGG AGGATTTAAG GAAATTACGT TCTCTATCTC GGGGCGCGAT 2220
GTGTATGGCA GTTTACGTTA TGAATCGGGT GTGCATCGCG TTCAACGTGT CCCTAGCACT 2280
GAAGCGTCGG GGCGCATCCA TACCAGTGCG GTTACCGTTG CAGTGCTGCC TGAGATGGAA 2340
GAGACTGAAG TGGACATTCG TGCTGAGGAC GTGCGTGTTG ATGTCATGCG TGCAAGTGGT 2400
CCTGGTGGGC AGTGTGTCAA CACCACTGAT TCTGCGGTGC GTCTTACACA TCTAcTACGG 2460
GCATTGTCGT TGTCTGTCAG GACGAGAAGA GTCAAATCAA AAACAAAGCC AAGGCCATGC 2520
GTGTATTGCG CAgCAGAGTG TATGATTTAG AGGAATCGAA GCGCCAGGTT GCCCGTGCAA 2580
GGGAACGCAA AAGTCAAGTT GGTTCAGGGG ATCGTTCCGA GCGCATTCGC ACGTATAATT 2640
TTCCTCAGAA CCGTGTTACG GATCATCGCG TGCGTGTTAC GCTCTACAAG CTAGATGCAG 2700
TGATGCaGGG TGCGTTGGAT GACATTATCG AGCCaTTGTG TATTGCGTCT CGAGAGAGTG 2760
TAATCTAGTG CAAGAACTCT GTACGATTCG ACAGGCGCGT ATGTACGCGC GAGCGTTGTT 2820
TCAAGACGCC CCCTGTTTGC GCGGACAGAA CACACCGCTT TTAGATGCAG ACCTTATTCT 2880
GTCgAAGTTG cTTGCGAAGC CGCGTGCGTG GATTCTCGCC CACCAGCAGG ATGAGATTGC 2940
CTCCGTTGCA CACGAGTTTA AGCGTCTCGT GCATCTTCGT TGTAgGGGAC GTGCGTTGGC 3000
GTATCTGACT CGAGAAAAAG AGTTTTTTGG TCTGAGATTC CGTGTCACCC GTGTACGCTT 3060
ATCCCTAAAC CGGATACCGA ATTGCTTGTA GAAAGTGTCC TGGCGCACGT TGCGTCCCAA 3120
ATGATGAAGC CGCGTTCAGT ATCTGTGCAT AAAGACACAA GTGCACTGCC TGTCTTGAAG 3180
ATATTCGAGG CGTGTACGGG ATGCGGGTGT ATTGCCATTG CACTTATGCA TATGTTGCGT 3240
GCGCtGGCAC GCCACCTCTC TATGTCATTG CATCCgACAT TTGCATGCGG GCCcTTGCCG 3300
TArsGCGGTA TAACGCGCGC CGACTCTTGG ATGTATCTGC AAATTCGCGC GTAcGTTTCG 3360
TGCACGCAGA TGTGCGTGCT CCTATTCCGT TCTTTTCTCC TTCTGAAGGC ACGGACnTGG 3420
TACAGGAGCG CGGGGTGTGC GTTCCGTATG ATGTGATATG TGCAAATCCG CCTTACtACC 3480
GAGTGCGCAA GCGCGCGCGC TGTTGCAGGA CGGGAGAGGG GAGCCTCTCG GTGCCTTAGA 3540
TGGGGGTGCA GATGGGCTAG ACTTGGTTCG CGCATTCGCA CACCACAGTG CCGCAGCGCT 3600
AAAGGAAGGC GGGTGCGTGT TTTGCGAGGT CGGCTCAAAC CACGCACAAC GTGCAGCGCG 3660
CATCTTCCAG GCAGCAGGGT TTGCCACGGT GAAAATTTCA AAAGATCTCT CCGGGAAAGA 3720
GCGCCTGATT AGCGGGATAc TGCGCTCGCA GTCTAGAGCT GTAACAGCGC CGAGTGGCTA 3780
GGGTGAAACA CGGCGACTGA GTGGTTATCC TGGCGTTTGC AGGTGGATGT nCGCGCCGCG 3840
TTGGCCGATA GGCTGAGTAC ATGAAGGAGT TAGAGATCAT CCACCATTGC GGATGACTTg 3900
CGTACGsGrT TGATTTTGCT TCAAAAAAAT CGGTTTTAAT CAAGTTTGCG TTGCTGTACT 3960
GACTTACCCA GCTCATCGAT TCCGGTTCTA CACGGTGCCC CTCGTACAAG GGCTCAAAGC 4020
CTAAATTTTC GCAACGAAGA TTACCCAAAT ACCGGATATA GTCTGCCACC ATGTGGCGAT 4080
TTAGTCCAGG GATCTGATCC CCAATGACAT AGTCCCCCCA CTTAATTTCT TGTTCGCATC 4140
CTTCGCGAAT CATATCGCGA AATAAGCGTA CATTGCGTGC AGTGAACACC TGwGGCTCTT 4200
CCTTTTGCAG TTCTTGAATA ATGGATCGAA AAAGCCACAG GTGTGTGTTT TCATCGCGGT 4260
TGATATAACG AATTTCCTGC ACCGAGCCGG GCATCTTGTT ATTACGCCCC AAGTTATAGA 4320
AGAACATAAA ACCCGAATAG AAATAAATTC CTTCCAAAAC ATAATTCGCA ATTGCTACCT 4380
TCAGCAGTGC GAGTACGCTT TTGTCATCTT GAAACTCGTT GTACAAGTTG CCAATGAATT 4440
TATTGCGCGC AAGCAGGATG CTCGTCGTCC TTCCACTGGT ATAGAATGTC ATGCGTTCTT 4500
CGGGGGAGCA AATGGTGTCC AGCATGTAAC TGTAACTCTG CGAATGCACA GCCTCTGGAA 4560
AGCCTGAAGG TTAGGCACAG TTAATCTCAT TGCGGTAAGT ACTGACCAAT ATTGGCAGAT 4620
CGCAGTCTGG GATGCTA 4637
(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10820 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
TGTAGACGGG GCACTGAGTG CTGAATGCGC AACGTCTCCA CGAGAGATTC AGAAGGACGC 60
ACGGGTCATG CCCCCACCAA GAATCGTGAA ACTGTTTTTC CTGTCACGCG CAGAAGCATG 120
CCGgCTTATT TTGGTTCTAA CGAACATTTA TACGCACGGC AAAGAAGTGG GTGAGCAGTG 180
CCACACCTCC CGCTCCTGCC AGCATGGAAG TGTATACCAC TGTACGCGCA TTGAAAAAAC 240
AAAGTGTCAT GACTGTTGCA GAAAAAGCAC CCATGAGGAG GAATATACAC AACGCAGCGA 300
TTAACCCGAG GAGTGGCAAG CGCGTGCGCA TAGAAAAAGA AACATGCGCG TGTAACTGCA 360
GCAGGTTAGT CATGTACAGC ACGAGTGTCA TGACACGCGC CAGAAGGTCA GGATCATTAA 420
GTTGTGTATA CACGTGCACA AAGAAAAACG ACGTATACGC GCCGAAGAGC GCAGACACGC 480
CCGATACGAG GGATACACGC TCAAGTGAGC GAGAAGTAAA GAGGAAAAAC GGCAGGCAAA 540
AGAGGGGAAA AACATAATCA AAAAGAAAAA AGCGCATCCA CTGCTCTTCC ACAAGGGCGG 600
AATCCGGCGG ATAGTACCCG AGAAAAAATG AACGAAGTAG GAGGAGAGGC ACAGCAAGCA 660
CGGCGCCGTG CACAAACGAA ATAAGTTCCT GAAGTGGGTC CCCAGCGTCC GCGAAAGAGG 720
AAAAGAACAA AAGAGGCAAC GAAATAATGA GAAATATTTC CACTATCGAT ACCGTAACCA 780
ACGGGGCACC GCATACGGCG AAAGATCACC GCGGGTATGC GTTCGCCCAT CAAAAGTGCG 840
CCGCTTTACA GCACAAGCTC GCTTGGAATA TCAGCGTTAC TGTAGACGGC CTGAACATCC 900
TCTTCTTCTT CGAGCCGGTC AATCATCTTC AATACCTTAC GCGCAGTCTC CTCATCAAGC 960
GCCAGGTACG TGTCGGGAAC CATAGATATA CCGGCAGATA GTGATTCCCA CCCCTTGGCC 1020
TGAAGGGATT CTAGGACCGT CTCAAACGTA CCGGGAACCG TGGTGACGGT GAGGACACCA 1080
CCGGCGTTCT GTATGTCCTC AGCACCCGCT TCGAGGGCAA GCTCCATGAG AGCCTCTTCG 1140
TCAACCTGTT CGGAATCGTA CTCTATAACT CCTTTGCGAT TGAACATATA GGAAACGGAT 1200
CCTGCCGAAC CTAAATTACC CCCATTACGG GAAAACAAAT TGCGCACGTT CGCGGCCGCG 1260
CGGTTTTTGT TATCGGTGAG CACCTCGACC AGAACGGCAA CACCGCCCGG CGCATAACCT 1320
TCATAAACGA GCTCCTCATA GCTACTGCCA GATAACTCCC CCGTACCCTT CTTAATAGCC 1380
CGCTCAATGT TATCTTTAGG CATATTAGCG GCACGTGCCT TAAGGATTGC AGTCCTCAGA 1440
CGTGGATTAG CCTGTGGGTC ACCGCCTGCC ATGCGGGCAG CAACAGATAT TTCCTTGATA 1500
AACTTAGTGA ACAACTGCCC ACGCTTTGcg TCCGCAGCTC CCTTAGCATG CTTGATAGTG 1560
GCCCATTTAC TATGTCCAGA CATGAGATCT TTCCCCTAAT GCCCGAAAAT GTACGTACCG 1620
GAACGCGGGC GCCTGATGCT AGCACGGTGT GCGCTTTTCT CCAAGTCCCG cTGGCGCATA 1680
TTGCACGGCC CGCGTAATTA CCGCGGGCTT AAGAAGCACA GACCTAGCAC GTCGGCGCGT 1740
GTGCTAAATC AAACAGATCT GCGTAGGCGC GCAACACGCC TTCCACCTCT CCTGCTAATA 1800
TGTGTTGTGC AAGCGTCTTG CCCAGCTGCA CTCCTTCTTG ATCAAAGCTG TTCAAGTTCC 1860
ACGCAAATCC TTGGAACATA ATCTTGTTTT CAAAGTGTGC AAGAAGTGCG CCGAGCGTTT 1920
GTGGGGTAAG CGCTTTAGgT ATAGCAGACT GGATGGACGC TCCCCGGAAA ACGTTTTATT 1980
TGCATCCGCG TGCTCTTTTC CCCTGGCGAA CGCGACAATT TGTGCGACGA CATTTGCAAG 2040
GAGCTTCTGC TGACCGGTAG ATCCACGGAT TATCGGATCC TGCCCGAGCT GACTATGTTG 2100
AAAGGCAATG AACTGAAGCG GCACCACCGA TGTTCCTTGA TGCAAATGTT GGTAGAACGA 2160
GTGCTGACCG TTTGTCCCAG GCTCTCCAAA GATCACCGGG CCGGTCTTAT ACGTTATCGG 2220
AATGCCGAAG CGGTTAACAC TCTTGCCGTT AGATTCCATA TCTAGTTGTT GCAAATGTGC 2280
AGGAAAGCGA GCCAACGCCT GGCTATAGGG CAACACCGCG GTGTGCTCGT ATCCCAGAAT 2340
AGTGCGCTCG TACACACCGA TGAGCGCGTC AAGAAGTGCT GCATTACGCC GTATGTCTTG 2400
TTCCTGTGCT GCTCGGTCCG CCTCTGCCGC ACCGGAGAGG AAGTGCCCAA ACACcTGCGG 2460
TCCAAACGCA AGCGTGAGTA CCACAGCGCC ACAGACAGAG GAACTAGAGT AGCGTCCACC 2520
GATAAAATCA TCCATGTAGA AGGAAGCAAG GTACTGGGGA TTATTTGCAA GTGGACTGGT 2580
CTCGCTGGTA ACTGCCACGA ACTGTGTGTG CGGTTCTAGA CCTGCTTGAC GAAGgACGTG 2640
TGCGACGAAA AGCTCATTAC TGAGTGTTTC AAGCGTCGTA CCACTCTTTG ATACCAAAAT 2700
AAAAAGCGTG GTCTCAAGCG GTAGTTTTGA GAGTACAAGC GCTGCGTCGT CTGGGTCCAC 2760
GTTGGAGATA AAATGTGTGC GCATCTTAAC CGCCTGGTGC CTCTGTGCCC AACCTTCCAG 2820
CGCGAGATAC AACGCCCGTG GACCGAGATC TGATCCACCA ATTCCAATTT GTACAACGTC 2880
GGTAAACGGT GCGCCGCGAG ACGTGCGCAG CCCCCTTCGT GTACTTGCCs TGCGAACGCA 2940
CATaCTCTTT CGTATTCTTT TGTATAAAAG GCGTGCATAT CGCGCACTTC GCACGGCAAC 3000
GAGGCAAGCG ATGACCCCTG CACGCCGAGG CGCGTTAGGT GATGCAGCAC CTTACGTTTT 3060
TCCCCCGTGT TTATCTGTGC TCCTGCGCGC AGGcGTCGTA CTTTGCGACT AATTCCTGCT 3120
CGTCTGCAAG AGCAGCAAGC GCCGTGAGAA TTTCTTCATT CACTGTTTTC GCTGCGTAGT 3180
GATAGCGCAG CCCAGCCCCC GCGTCGGTAC AATAGCGCCG CACACGTTCT ATCCCCTCTG 3240
GTCCACAGAG TACTGTCTTC AGCGACGGCG CACGAATCGC CTGCAGGcGG GCGTATGCGG 3300
CACACTCGTC AAGATTTCTC CAATTCACTG CGCGTTCTCC TTTTATCGTT CTACCCCGTA 3360
GggTTTACCT ACAGACATAT CGCCGGCTGT TCTATGTATC AAGACGCGGC ACGACAATCG 3420
TCGCGAGTGC CGGGTTCTTT TCTAAATCTC TTTTTAATCC TGCTGCCCGC GCCTTATTGA 3480
CATAGGTTGG ATCCTGGAAC GAGTAGGTAA AACGCGTGTG CACGTCGTAT GTTCCCTGCA 3540
TGAGTGCATA CGTGTCCTCA GTCATGACCA ATTCTTCATC TGTTGGGATT ACCAGAATGC 3600
GGACGGGTGA ATCGTCTGTA CTAATTTCAG TTTCTGCATT GCGCGTGCGG GCCAGTTCAT 3660
TTTTTCGCGC ATCAAGTCGG ATGCCTAGGT GTTCGAGTCC TGCGCACGCT GctGCGCGTA 3720
CGTCGCAACA CATCTCTCCA ACACtGCGGT AAAGACAAGC GCGTCCGGCT GTTTACCCAA 3780
AGCTGCAACG TATGCGCCGA AGTATTTCCG GATGCGGTGT ACCTCCATGT CAAAGGCAAG 3840
GCGTGCAAGC GCGTCTCCAT TTTTCATGGC AGCACACACA TCGCGTCGGT CCACGTATTT 3900
TCCGGTGATG CCTAGCAAAC CGGACTGTTT ATTGAGAGTG GTGTCGATGT CTGAGACAGA 3960
CATGCCTGTT TTTCTCATAA TGTAAAAGGC AAGCGCAGGG TCGCAGTCCC CGCAGCGTGT 4020
TCCCATAATC AGGCCTTCTA GCGGGGTGAT GCCCATGGAA GTGTCAAAGC TGACACCATT 4080
TTTGACACAA CACATGGAAG CGCCGTTTCC AATATGCGCA ATGATTATGT TTGTGTCCTC 4140
AGCCCTTTTT TTGAGAATGA CAGAGGCGCG CTTTGCAGTA TAAAGAAAAC TCGTGCCGTG 4200
AAAGCCGTAG CGACGTACCG CGTATTCTTC GTACCACTGC CGGGGCACTG CGTACATGAA 4260
GCTAGCTTCT GGCATGGTTT GATGCCACGC AgTATCCATA ATGGCACAGT GGGGAACTGA 4320
GGGGATGACC GCCTGGGCAG CCTCAATACC ACGGATGTTT GCGGGGTTGT GGAGAGGGCC 4380
AAGGTCTTGA ACAGAGCGAA ATGTTTCTAG CACGTCAGGA GTCACAACGA CAGACTTTAC 4440
AAAGCGATCT GCTGCGTGTA GGACGCGGTG TCCAACTGCC TTGATAAGAC TCATGTCGCT 4500
GATAACACCG ACGTGCGCAT CGGTGAGGGT GCTGATGATA AGCTGCACCG CTTCGGTATG 4560
GGTAGGGCAG GGACTTTCCC GAACGTGGTT CTCTCGGCCG TGCACCTCAT GCGTGATAAC 4620
AGATCCTGCC TGAGTAACAC GCTCTACCAC GCCGACGGCA ATCACCGCAC GCTCTGTCCA 4680
GTTATACACC TGGTATTTTA CAGATGAACT GCCGCAGTTT AGCGTGAGGA TAATCATAAT 4740
ACACCTCCAC CGTTTTGGTA ATTTCTCGGA CACCGTAGCA TACACGCAAA ATGCGCCACT 4800
TTCCTACACC GTTGGcTTAC ACTGcTTACG CGGATATAGC CCCCGCAGCA GCGtaTCCAG 4860
CAGCCACTGC GCTTTGGCAG TGTCCACGCG TGCAAGGGcA CGGaCGCGTG GGGATCTGAC 4920
GCGACTGACG CGAGCATGTC CTGCTTTTCA CGCCCGGTAA CTACGACGTA aTTTCGTGTG 4980
CATTATTGAT AAGGTGCCCA GTGAAaCTCA CACGTTTCTG ACCGGTGTCT GGGTGAGTTG 5040
CCACGACGCA ACAGCCGCTG TGGTCCCACA GCTCTATTTC ATGAGGGAAA ATAGACGCGG 5100
TGTGTCCATC CGCCCCCATA CCCAGGAGTA TAATATCAAA GCACGGCACG CCACGCTGTC 5160
TTGGGAGCCG TGCTTCAATT TCCTGTGAGT ATGCGGCGCA GGCGCTCTCC GGGGCGTCTT 5220
CTCCCCTGAC GCGAAACACC GCGTCAGgAT TTATTTCCAG AGGCTCAAGG AGCGCACTAT 5280
GGGTCATGTT GAAGTTACTC TGCGCATCCG TGGGGGGTAC GCAACGCTCA TCGCTCCAGA 5340
AGAAGCGGAG GCGCTTCCAA TCAAGGTGGT GTCGAAACTC GTGCGCCCAA GTTCTGAAAA 5400
TCTCCCTTGG AGTGGAACCC CCCGACAGGG CCAACCAGAG AATCTCTTGT GTTTTGAGCC 5460
GAGAATCAAA CACCGAAACG AGGAACGCCG CGATGGCACG CGCATCCTCA AAAATATGCT 5520
TCTTCATGGG CGAACATCCT CCTCTCTCCC GCTACGTTCT AGTGTCGTTC AAGGCTCGGC 5580
ATTACCGCTA GAGTCGGCAG GCAAAATCAT CGCTGAGCAG CGTAGAAGAG GGGTGATGCC 5640
ACCGTGGAGC ACTCCCTTTG ATCAGGTCGT CTGCaGCtTC GGACCCCAGC TTCCTGCAGG 5700
GTACGTAAGT AGAGGACTCT TGTTTGATTT CCATGCGGCA AGAATAGGAT CTATGAAGCG 5760
CCATGCAGAC TCCACCGCGT CATCTCGATG GTAGAGCGTG TTGTCTCCAT TCATGCAGTC 5820
AAGCAATAGC CGCTCATACG CGCTGGGTAA GTGCGAATAG GTAAGAGCCG AATACTGAAA 5880
ATCAACACTG ACGGGAATAG TCTTGAACCC CGCGCCGGGC TCTTTGAGGT CGATTTTAAG 5940
CTGAATTCCT TCGTCGGGTT GAATGCGAAT GACAAGCGCG TTGCCCTCGC GTGCGCACGG 6000
GCGTTCGATG TGCTCGAAAA GCGCGATGGG GAGCGTTCGG TAATGGACGA TCACCTCAGT 6060
GACGCCCGTG GGCAAACGCT TACCCGTCCG CAGtAGAAGG GAACGTCCAT CCACCGCCAA 6120
TTGTCGATGT AGCACTTGAG TGCGGcAAAG GTTTCAGTGC ACGAGCGAGG GTCAACGCCT 6180
GACTCCTCAA GGTAGCCGGG GACGGCTACA CCGCGTATCT TGCCGGCGAC GTATTGGGCA 6240
CGCACCGTAT GCTGCATGAC GTCGCGTTCT CCCATAGGGC GCAGGCAGTC AAAGACCTTT 6300
ACGATTTCAT CCCGTAGACG ACTTGAACTC ACGaCGGCGG GCGCCTCCAT CGCGATAATA 6360
CCCAAGAGGA GTAACAAGTG GTTTTGGATC ATATCGCGCA ATGCACCGGA CTGGTCGTAG 6420
TAACCGCCGC GGTTTTCGAC ACCTAGTGAT TCGCTTGCAG TAATTTCAAC GTAATCGATA 6480
TGGGTCCGGT TCCATGTGGG CTCGAAAAGG GGATTGGCAA AGCGAGTGAC CAGGATGTTT 6540
TGGACCGTTT CCTTACCCAG ATAGTGATCG ATGCGATAGG TTTGGTTTTC CTGAAAGTGG 6600
GCACGCAAGC TCGCATTAAG GTGcTGCGCG GTTTCTAGGT TGTAGCCAAA GGGTTTTTCA 6660
ATAACTACCC TGCGAAAATT ACCCTGTTCC CGGTTCAAGT GGTGCATAGC AAGCTGCGTG 6720
GGGATAGTTT CGTACAGGCT AGGGGGAGTG GCAAGATAGA AGATAAAGTT GCCCTCGGTG 6780
TGCAGCGACT GGTCGAGGGT GCGCACGTAC GTGGCAAAGT CGGCAAAGGC GACAGAGTCG 6840
GTGGGATCGA ACGAGAAGTA GTGGATCTTC TGCAGGAATT CGGTGAGGCG CGCCGGGTCG 6900
TGCGGTGTGC GCACTGCATG CTTTGTGACC GCCTCTGCAA GCCGTGCGCG AAAAGACTCT 6960
GTAGACAGAG CCGTACGCCC TGCGCCGAGT ATACCGAATG TAcGGGGCAG GAGCTCTTGC 7020
TCAAAGAGAT CCCAAAGCGA GGGGATAAGC TTCCGCGCGG CAAGGTCGCC TGAAGCGCCA 7080
AAGATAACCA GGATGTGCGG CGCGACCGTG CCGCTGCCAC TGATTTTCCC CATAAACCGC 7140
CCCTTCTTTC AACGGTGCGA CCTACACCGG ATGTGCCGCA GGsAaCTCTC CGCTCCCTAA 7200
GGCACTAAAT GCGGAACACC GGCCCTATTT TTACCATGAC CAGCGAGGTG CAGCAATACT 7260
TGGCCCATAT GTTCGACCAC GTCAGGTCCT GTCCATTCCC ATTTGCGCCC TTTTTCAAAT 7320
TCTTGGTGAG CGGCACGTTC ACTCCGCTGC CGAAGtGAAA TCCAACCCAA CGTGCTTGGT 7380
AAAGAAAATG AGGAAATCCA AATTTATCGG TATCCCCACC AGTTTGTCGT CCTGCGTAGA 7440
AAGGATATCC ACGCCCAGAG ACGGAATAAA TGCAAAATTC TTGCCGCTGC GTATAGCCCA 7500
GCCCACCAGG AACTGCGCAC GGAACATAAG AAAGGCAAGC CCCGCGTCTA ATTCTGTCGT 7560
GAACGCGAAg cCGTTGTGCG CTATTACCCC CACTGCCAAA CCGAGCGTCG GGGTGTACAA 7620
AAGAACGTCG GTCCTGGGGG CCGGTTCCTT TCCCCAGGGA TGCGCCCCTA CCTGTCCTAC 7680
tTCGGAGAAA CAAATACCTC CGCCGCAAAA ACGCCTGCCC CCATCCCGAG CGCAGCGAGC 7740
AAAGAACcTA CGCGCACCAc GCACCGCGCC CGTACCCCCC CCCTCGcCGT GTGCCACTGT 7800
ATACCCATAC GATCTAACCC CAGCTGTAGG ACACGCCTAC TGGGCCGATC TTCCCGCGTG 7860
GGGTGTGAGA GTGTCAAGCC CTCCCCCCTT CCTTGCGAAG AGGAGTATGC CAAACGGTGA 7920
GAAAAACTTG ACGGCGCGCG CTAAACGCCT AATAATTGCC TCGCAGCCTT TAGAAAAAGG 7980
AGGAGCTCGT GATTCGCGCC CTCTTTTCCC TCTTTCGGTC CCTCCATGCA AACACGCACC 8040
CGGCAGATCT CGCGCATGCG GCAGCGTTGG CACTGGCCCT CGCGTTGCTT CCTCGGAGTT 8100
CTCTCCTGTG GTACCTACTG TTTGCCGTCT GCTTTTTTAT ACGGCTGAAC CGTGGTCTGC 8160
TCTTGCTATC GCTCGTGCTG TTTGGTTTTG TCGTTCCTTC GTTCGATCCC TGGCTCGACA 8220
GCCTCGGCAA TTGGGCGCTG TGTTTACCAC GGCTGCAACC CGTCTACCGC GCCCTGATTG 8280
AGATTCCCTT CGTAGGGCTT GCGCGCTTTT ACAACACTAT GATTGCCGGC GGTCTGGTGG 8340
CAGGTGCGCT GTGCTATTTG CCGTGCTATG CTCTTGCACr CTGCGCGGTG ACGGCGTACC 8400
GTACATACCT GTACCCTAAA ATTCACCATG CGACGATTTT CTTTCTTGTC CGGAACGCCC 8460
CGTTGTGCAA AAGGTAAAGA AGATACTCAG CGTCAGGGAG AGGTTTTCAT GAGCGATGAT 8520
TCTACACCCA AGACGCCTTC GCGCCGGATT CGGCATACAG GAAGGAGGCG CACGCTGCAT 8580
CGGTTCTTCT GCAAACGGTA CACTCCCCGT TCTCTCAAGC GGTTCCTGCG CCGAATCCAT 8640
ATCCCTGCTG ACCGCGCGTA CTGCATGCGT TACCTTGCAG ACCCCGTATC CACCCCTGTC 8700
CGTGTGTTTG GCCGCACGCT CCTTTCTCGC ACGTATGTTC GCTTCGATCA GCaGGCTATC 8760
GCGCACTCAG CGGACCTGAA GCGGCTCAAT GCCATTGCAG CGTCAATAGC AAAGCAAAGG 8820
GGGCGGGTTA ATTTTTGGTC CCTCTCCATG GCTTGTGCGA GCGTCCTCGC GCTTCTCGGG 8880
CTCGTGTACT TGATCCGAAA TGTCATTGCT CGGCGTGTCG TTATCGGTGG TTCTGAGGCC 8940
GTCTTTGGTG CGCGGTGCGA AgCGGCAGTG GTAGATCTTG ATCTATTCAA CGCGCGCTTC 9000
CGCCTGAAGA ACTATGCGGT GGCAAACAAG CATCATCCCA TGTGGAATCT GTTTGAAATC 9060
GAAAGTATCG ATATCCACTT TGACCTCCTG GAGCTTTCGC GGGGTAAGTT CGTCTCACAC 9120
ACGATGGTTG TAGAGGGCGT GACGTGGAAC ACGCCGCGCA AAACGTCTGG TGCTTTGCCC 9180
CCGCGCCGCG CAAAACGTCA ACGTGTGCGC AGTAGTAACC CGCTTATTGC AAAAATACAG 9240
GAAAAAGCGG CGGAGCTGGC CGCCCCCGTG TCTTTTGGCG CAGGGTTTTC TGCGCTCAAA 9300
GCGCAAGTGG ACCCGCGCAT TCTCCTTGAA CGCGAGGTGA AGGCGTTAAA AACTCCCACC 9360
CTCGTACAGC ACGTGGGTGC GCAGGCGCCC AAACTTGCAG AGCGCTGGAC GcAGCGTGTG 9420
TTTGACGCAC ACGCCCGTGC GGAAAAAACG GTGGCGGCGA TCCGTGCGGT GACTGAGCTT 9480
GACTTTCACG CTTTAAAAGA CGTGTCGGCA ATAAAACAAG GTATCGAGAC GCTCGATAGA 9540
GCGCGCCGAT CCACTGAGGA AGCCCTCGCT ACTGCGCGCA CTATCTCCCA CGAATTGCAG 9600
CAGGATGTGC ATTCGACATT GGGTCTTGCG CGCGAGTTCG CCGCGGCGGT aAAGGCAGAC 9660
GGTGCGCGCA TCGyCCGTGC cGCGGCGGCT ATCCGTGATA TCCAGGCAGA TGGAGGAAAG 9720
AAATTTATCT CTGGTCTTTG CACCGTCTTT TTGGCACGGA GCTTTAGCCA TTATTACCCC 9780
TATGTGGCGC AGATGCTTGA TTATGTCCGG GGGTCGCAGC GAACACCGTC TGATGGATCG 9840
CCGTCTGCGG AGGCAGAAAA GACAGCTCAG AGCCTTACGA CGCGCAAGCn CTTGCAGGGA 9900
GTAATTTTTT GTTTGAGCGC AACGTCCCTT CCGTGCTGCT GAGAAACATT GGGGTGTCTG 9960
CCGCAGATCC GCAGgCAAGA TTTTCTGTTG CAGCCCGTGT GCGCAATGCG TCAAACGACG 10020
CGCACGGGTT TGGCGAACCG ATTTCGTTCC TCCTGGACGT GGCTGCAGGC GCACAGGACG 10080
CCtCkCTGCG CGcGTGGTGG ATCTGCGCCG GGCGCATCCG GACTTAGTAG ACGTCTCGTG 10140
CACTGCGCGG GGTATTCCGC TCGCTGTCCC GGCACCTGCA GAAGGATTCC CTGAGCTTTC 10200
TGGCGTGCTT GGAmTGCATA CGCAGgTGTT TGTGCGCAAA GATCACTCGG TGGAACTCAA 10260
GATGGGGGCA CGTATTTCAG ACAGCGTATT GCGCgCTGCG CCTTTTGAGC CGCGCGTGCT 10320
GTTTGACGTG TACGCGGATG TGTTGCGCCA GATACGGCAG ATTGCATTTG AAGCTACGGT 10380
GCGCGTCTCT GCAGAGGGTG CGTTGAGTAT TTCGGTAGAG AGTGACGCAG ATGGCGCGTT 10440
TGTGCGCGCT CTTTCCCGTG CGTTTGCGCA GCAGGTGGAC GCATTGCGCC GCGCGGTCAT 10500
TGCAGAAGGG GAGCGATTTC TTGCTCAGCA ACGCCGCGTG TACGCACAGG AAATTGCGCA 10560
GGTAACGCAG CTCGTTTCCC GTGCGGAGGA CGCAATTGCC CAGCTGGGGG TGTCTTCTCG 10620
CGTGATACAG CAGAAACGGG CTGAGGCGGA GCGCCTTCTG GAAGCTGCAG CGCGCAAGGC 10680
ACTGGGGGAG GTGACTAAGg TGCCGCAGAC GAGCTGCAGA ACAAGGCGCG AGATGCATTC 10740
CGCTCCTTTT TCTAGGGGAG TGGCGCCGCC CCTTTTCGGT GCGGCCTCAG GGTTCGGCTG 10800
AnGCCGGTGC GGGCGCTTTG 10820
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13257 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CAGGACGGTA nTCTCGTCTC TACGCTGACG AAGTTGCCAC TGATAGTGGA GATCGGTTTA 60
TCCAAATGGC GTTGGTAAAA CTCTTGCCCC AGAGGGCGnC AGGCGGACAG AGACTACAGG 120
AGATTGTGGC GCCGAGTCAG TCGGACATCG TGCTTATCAT GCTGCTAACC TGGCTTGAGC 180
GTGCACGGCT GGACCGGTTC AATGCTGATG CGCTGCTTAC GGCGCAGTGG ACCTATGTGT 240
CGGCTGGACT GTATGGGGCG ACGGCGGGTA CCAATGTATT TGGTAAGCGC GTGCTGCCTG 300
CGCTGCGGTC CTGGCATTTT GATTTTGCCG GATTCCTCAA ACTCGAAACC AAAAGCGGTG 360
ACCCCTACAC CCACCTGCTC ACCGGCCTGA ACGCCGGCGT CGAAGCACGC GTGTACATCC 420
CCCTCACCTA CATCCGTTAC AGAAATAACG GAGGGTACGA ACTGAATGGA GCTGTGCCCC 480
CTGGGACtAT CAATATGCCA ATTTTGGGGA AGGCGTGGTG CAGCTATCGC ATCCCCCTCG 540
GTTCCCACGC CTGGCTTACA CCGCATACAT CCGTGCTCGG CACAACCAAT CGCTTTAACG 600
TTATTAACCC CGCGTACACC CTGTTGAATG AACGAGCGCT CCAGTACCAG GTGGGACTGA 660
CGTTCAGTCC CTTCGAGAAG GTGGAGCTCA GCGCCCAGTG GGAACAGGGG GTGCTTGCTG 720
ACGCTCCTTA CATGGGTATT GCCGAGAGTA TGTGGTCTGA GCGTTACTTT GGcACGTTTA 780
TCTGTGGGGT GAAGGTGGTT TGGTGAGGGG TTGTCGTGTG GGCCAGAGAA CGGGTACGGT 840
GGGGGTGCGC GTTTTCCCCG TGGGGgCTGT GCGCGCTCAG TTTACAGGCG AGGGATTGCA 900
GGGGTATGTG CGGGAAGCGT CTGGGTAAAG TGATGGTGCT CGGGTGTATG TTGCCGGGTG 960
TGGCGGCGCG TGTTTCTCTC TCCCCCAAGC TCGGGGTGTA CGGGGACGCA CGCGGCGGTT 1020
CTGACCTGTG GGGCATCTGC ATACAAGCTC CCACAATGCC AGATACAGAG AACCAGGCGC 1080
CTCCGCGCTA TGCgcCgGAG ACACCGTTGG TGGGGCTGGA CGTGGCGTTC CGTGCGGAAA 1140
ATGGCTTCCT GCTCCAACTG ACGGTGGACG CGGCACTCAC GCGTTTAATG TTCTGCGGCC 1200
GGTGTTTGGC CGGTTATTCG TTCAGACCGG GGGAAGGTAG TACGCATCTG TCGGTAGCGG 1260
CGGGTTTTGA GTGCACCGCG CTCATCTACG ATAGCCAGCA CTTTCTTTCG GTTCTTGGGC 1320
AGGGCTTACT GCAGCCGAGC AGCTCGTCTT ATTCAGCCGG TAACTGGCAC CGCCCACGTT 1380
CATTGCTTGG CGTGCTAACG TGCACTGCCA AGGAGGTAGG CGCCATACAC GAAGAGTCGC 1440
GTATTAAAGG GGTCTGTCAG AACTATGCGG TGCCGGTGCA GCTGGGGGTG CAGCACTACT 1500
TTGGCGCGCA TTGGGGGATA GACGCGACGG CTACCGTTTC GTTTGGCATT GACACCAAGC 1560
TGGCTAAGTT CCGCATCCCG TATACGTTGC GCGTTGGCCC GGTCTTCCGC ACCTAGGGGA 1620
GGCGCCGGGA GGAACGGGTC CTGTCGAAGA ATTGCGGGGA GGAGTGAAGG TATGTGGAGA 1680
AAATGTCTGG GTAAAGTGGT GCTACTCGGG TGTGCGTTGC CGTGCGTGGC CGCGCGTATT 1740
TCTGTCTCTC CCAAGCTGGG GGCGTATGGG GACGCACGTG GCGGTCCTGA CCTGTGGGGC 1800
TTGTGTATTA AGGCGACCGA TGCAGAGGAG GTAAGTGGGG ATCCCGATGA CACGGAGATG 1860
GAGTATTTAC CTCCCCGTTA TGCGCCGGAG ACGCCGCTGG TGGGACTCGA TGTGGCGTTC 1920
CGTGCGGAGA ATGGTTTTCT GCTCCAGCTG ACGGTGGACG CGGCGCTCAC CCGCCTGATG 1980
TTCCGTGGTC AGTGTTTGGC CGGTTATTCG TTCAGGCCGG GGGGGGGtAA ATACGTATCT 2040
GTCGGTAGCG GCGGGTTTTG AGTGCACTGC GCTCATCTAC GACAGCTACC ATTACATCAC 2100
CATCCAGGCC CCCAATGAGG GTTCGGTGTG TTCGTTCGAA CATGGAGGGT GGTACGTTCC 2160
AAAGACAGTG CTGAgCCTGC TGAGGCGCCG GAAGTGTCaG GATGCTAGGG CTGAGTCTGA 2220
GGAATTGGGC ATCACGGGGA TTTGCCaGAA CTACGCGGTG CCGGTGCAGC TGGGGGTGCA 2280
GCACTACTTT GGCGCGCATT GGGGGATAGA TGCGACGGCT ACCGTTTCGT TTGGCGTTGA 2340
CACCAAGCTG GCTAAGTTCC GCATCCCGTA TACGTTGCGC GTTGGCCCGG TCTTCCGCAC 2400
STGAGCGGGT GCGCGCTCAG CGTGCCCCGT TTAGAAGGAg GCCGAGCGCT CCTCTACCGA 2460
ACCGTCTGCG TGCGCAACAA AGACGCGTAC GGTTCGCTCG GTGAACAAGC CATTTtCAAT 2520
GACCCCCgGC AGTGCATTGA GCGCGCGTTC CATGTCTTGC GGGGTGCGCG TCGGGAGCGA 2580
TTgCCACCGC GcGTCTAAAA TAAAAtTTCC GTGGTCAGTC ACTACCGGTC CTTTTTTTCT 2640
TyACCtCGCG TATGTGCACG GACAACCCCC AATCCTGAAG CGTGCGCATC ACGCTCATGC 2700
GGGCCTCAGG CACCACTTCG ATAGGnGAnT GCGCGCGTAC CTAAGGTTTC TACCACCTTT 2760
GTTTCGTCTA CGATGATAAC AAAGTGCGCG CTGTTGTATG CAGCGATCTT TTCTTGCAAA 2820
AGCGCAGCTC CACCGCCTTT GATGACAAAA TTTTGGGTGT CAATTTCATC CGCGCCGTCG 2880
ATAGTCACAT CCAGTTTGCC CCCAATCCGT TTTGAACTGA GAGAAAAAAG GGGGATGTTG 2940
TACCGCTCAC ATATGAGCGC TGTTTGAAAA CTAGTGGGCA CTGCCGCTAT GTCAGAGAGA 3000
GTGCCGCGTG CAAGGTGATC TGCGATGCGT TTTACCGCAG GCATTGCCGT AGAGCCCGTC 3060
CCAAGGCCAA TACTCATGTG CGCGTGCAGC ACCCCCTCTT GAACGAGGGT GTCnCaCTGC 3120
GCTGGGCAAC CAGCAATTTC TGCGCGGTAA CGTCTAATGG GGTGTTCGTC GTCGTGTTCC 3180
TCTCGTGCAT AGCTTTTTCC ACAAGTGCAC TCACGCGTCT GTATCCTTTT TGTGGTGCAA 3240
AAGAATATCT GCATTGTGCC aGCTGAGGGT TGCGCACTTA ATGCGCGCAG GCATGGATGC 3300
AAAACAGGCG AGGATGCACG CGTCCTGTAG GTGTGCCCGC TCCTGGTCTG TGAGGCACTG 3360
CTGTGCCATC ATGTGAAAGA ACAGCGCAAC CGTTTTTTGC GCCTGCGCCA CTGACGCACC 3420
CTTGATCAGT TCGATGAGTA TATTTGTAGA AGCGGTGGAC ACCGCACAAC CGGTACCTAA 3480
AAAGGCTACA TCAGCGATGC GATCACCTTC TCTCTTTATC AAGAGCGTGA GGTCATCGCC 3540
ACAACTGGGA TTATGACCCC GCTCGATGCt ACCGGCCCTT CTAACACCtG CGGTGTTCCT 3600
GCTTGCGTGC GTACTCGAGC AGCACtGTCG GTATATCGCT TCTGCGTTCA TAGGGAGATC 3660
TCCTTAGAGA AAGGCAGCGA ATGCCAGGAA GGCGCACTGG CCTAGCTGCA TTGAAAAATC 3720
CTGCCCACGG cgTGCAAgCA CssGTcAGcg CCTcTACATC CTCCATGGTA TTGTATATGC 3780
AGAAACTTGC ACGGCAACAG GACTGAATGC TCAAGTGCGT CATGAAAGGC TTACTACAGT 3840
GATCGCCGCT GCGAACCATC ACGCCTTCTT CGCCCAAGAT ATGCGCAGTA TCGTGCGAGT 3900
GCACGTTCTT CACGTTGAAT GCAATGATGC CTAGGCGCTC GCGTGCGCGC GCATGGTACG 3960
TTTCAAGGAA GGGAAGCTCC TCCAGCCGCG CAAGGAGTGC AGCATCCAGC GCATGTACGG 4020
ACGCGCGGAC TGCGCTGcTC TCTAGGGACT CGCAATACTC AATCGCTGCA CACAGTGACA 4080
CGACAGCTGC AGTATTCGCG CTACCTCCCT CGTACTTATG CGGCGnACCC TTAAAGACAC 4140
TTTCCTGTTC AGTCACAAAA TCCACCATGC CTCCCCCATA CAAAAAAGGA GGCATGGATT 4200
CCAGGAGCGT GTGCGGTGCG CACAATACGC CGACGCCAAA AAGAGAGAAC ATCTTATGGC 4260
CGGAGAAAAC AAAGAAGTCG CAGCCTAAAT CTGCAACATT TGGCACGCCG TGCACCATAG 4320
CCTGTGCTCC GTCAATGACC ACCACTGCAC CGACTTGGTG TGCAAGTGCG GTCAATTCCT 4380
GTGCAGGATT TACCGCGCCG GTGGCATTGA CAACGGCAGA GAAGGACACA ATCTTAGTGC 4440
ACGCTCGTAT CTTTTTCTGC GCTTCTTGTA TATCCAAATT TCCTTCGGCG TCTGGATACA 4500
GCCACTGTAT CGTTGCACCT GTGCAGCGGC ACACGTGCTG CCACGGTACG ATATTTGCGT 4560
GATGATTGGA GATAGCAAGA ACGATCTCGT CTCCTGCGCG CAGCGTGGCA GCGCGTAACA 4620
GTGAGCGATG ATGTTGAGCG ATTCGGTGCA ACTCTTTGTA AAAACGATAT CGTGCGTTGG 4680
CGCTGCGTTG ATAAACTGCG CTGTTTTCTT CCGGGTGTTT TCTATAAGGA GCGCTGATTC 4740
AACTGCAAGT TCATGGGAGC CTCTGCCTGC GTTCCCATTC AGATGGGTGT GGTAGTGCAT 4800
AACGCGCTCT AGCACCGGCG CAGGGCGTTG GGTTGTGGCC GCGCTGTCTA GGTAGTGGAC 4860
GCGGGGACTG CGCAACAGCA GGGGAAAGTC TGCTTTATAA TTGGGGCCGC TCATGCCTTG 4920
CGCTTCCTAT GTGCGCGATC GAGGCTCTCG TCAAAATTAC GTACGAGTGT CTCGCGGATG 4980
TGAGCGTCAT CGATGAGGGC GAATACGGGT TTAAACGCAG CTTCTATGAT GAGGCGCTTG 5040
GCACCGTACT CATCAAGACC GCGCGACATA AGGTAGTAGA GCACATCGCT GCCGATAGTT 5100
TCAAAACTGG CTGCGTGTTC CCCGACAACG TCGTCTTCGT CACAAAAGAT AGTGGGGATG 5160
CTAACCCCCA CGGCAGTTCT GTCAAGCAAA ATGGTACGGT CTGAGAACCG TGCTACAGAA 5220
TGGCTACACC CGCGGTGCAA AAAAATATTT CCACGGAACG TTTTGCGTGC ACCGTCTTTT 5280
ACCACCCCAC AGGCGCAGAT GTGCGCGTGT GAATTTTTTC CTTCCACGAT GAGGTTATGT 5340
TCAAGATCCA TACGCCGCGC TTTATCAATG AAATACAGTG GGTGAATTTC CACACGTGCC 5400
CACTCGTCCC GAAGGAAGGC GGAGTTGGAA ACACCTGAGA TCTGTGCACC TATTTGTACG 5460
TCGTAGCAGC GCACCTGCGC GCTTTCCTGT GCGTGTAGGT GTACCGTTTC AAAGTTCACA 5520
GCCGTAGGtG CGTGTTCTGT ACTTTGATTA ACTCTACCGA TGCGCCACGC CCCACCTGCA 5580
CGCTTACCAA ACCATTCCTG AACGGAGCAC GCTCAAGCGC TTGCGGACCG ACGAgTGCGG 5640
GAGCATCCTG TGGGGTATAC CCTGCGCGTC CACACAGACG AGGACTTTTA CGCGCGCCCC 5700
TTCCTGTATA TCAAGAAAGG TCTGATCGTA TAGCACGCGG TTATGCGTGT CCATGGTAAA 5760
ACGGATGAGT ACATGCACCG TCTCAGATGT GCGCGGCACA CTCAGATACA CCCCTGCATT 5820
TCGTTTTTCC TTTACTTCTT GTACATACGC TTCCCCCAGA CCGCAGTGGC GCTGACGATC 5880
GCGTGTGCGA AAGTCAAACG CGCGCGcCGC CCTTTCTGGC TCGGTGCAAA GAAAATGCTC 5940
TATGGATGAT GAGcGCACAA GCGCTGAGCT AGAAGTAACC TGTGTTCTGG AAAAAGGCTG 6000
AGAAGCAACG TCCTGCGCGT GATACCCGAG CCGTTTAAAA AATTCCCTTT TTTTCATACT 6060
GCTGTGCGCC CTCAGGAAGG GAGTTAGCCG ATCGCTCCCT CAAGTTCAAT GGAAATAAGG 6120
TTATTCAGTT CAACGGCGTA CTCAAGAGGC AATTCCTTtG AGACGGGTTC TACGAACCCT 6180
CTGACGATAA GAGAGATGGC AGTCTGCTCA TCAAGCCCGC GCTGCATGAG ATAAAAAACG 6240
ACGCGGTCAC TGATTCTGCC GATTTTTGCC TCATGTCCGA TATCAACGTT ATCCGTACGT 6300
ACATCAATGA TGGGGATGGT ATCCGTATGC GACTGGTTAT CGAGCATGAG GGACTCGCAC 6360
TCAGCGACCG CTTTTGCCCC GTCAGCCTTT GGACCGATGG AAAGCAACCC GCGGTAGTTT 6420
GCCGTTCCGC CATTCTTTGA TATGGATCGA GCATGTACCT CCGATACCgT GTTCCTGCCC 6480
AGGTGCACTG TTTTTGTTCC AGTATCGAGG TACTGTCCTG CAGAAGCAAA AGTGATGCCG 6540
GTGnAnAnCT GCGCGAGCGA TCTCCTCTGA GGATACTCAT CGGATATAAC ATCGTGACGC 6600
GGGAACCAAA GGAGCCTGAG ATCCACTCGA TGACGCCGTC TTCGTCCACA ATGGCGCGCT 6660
TGGTATTGAG GTTGTACAGG TTTCGTGACC AGTTTTCTAT GGTGGAATAG CGTAGGCGCG 6720
CGTTCTTTTT TACGTACAGC TCCACGGCGC CTGCGTGCAA CgcATTTTTG TAGTACTTCG 6780
GCGCGCTACA CCCTTCGATG AAGTGGAGGG ATGCGCCTTC ATCCACAATG ATGAGCGTGT 6840
GCTCAAATTG CCCGGATTGA TTTGCATTCA AGCGGAAGTA GGACTGCAGG GGTAAGTCCA 6900
CCTGCACCCC TTTGGGCACA TACACGAACG ACCCGCCTGA CCACACCGCT CCGTGCAGTG 6960
CAGCAAACTT GTGCTCGTTC GGTTTAATCa GATGCATAAA GTGCGCGCGG ACAATGTCTT 7020
CGTGCTTGTG CACGGCAGAC TCCATGTCGA GGTACACCAC TCCCTGTTGT TCTAAGTCTG 7080
CCCGGAGGTT GTGGTACACc ACCTCTGAGT CGTACTGCGC TCCTACTCCT GCAAGGgATC 7140
TTCGCTCCGC CTCAGGAATA CCGAGGCGAT CAAAAGTCTT CTTTATCTCC TCTGGGACGT 7200
CATCCCAACT TTCTGCGATT GGCTTAAAAT CGGAGaCAAT GTAGTGGACA ATCTCTTGGA 7260
TATCAAGGTC AGAGATATCC GCGCCCCACT CTGGCATGGG TCGCTTCATA AAATAGCGCA 7320
AGGATCTGAG ACGCAAGTCG AGCATCCACT GTGGCTCCCG CTTGCGACGC GAAATTTTCT 7380
CTACAACCTG AGCGTTCAAA CCCTTACCGG TTGAGTAGGT GTAGGTAACG GCGTCTTTTA 7440
CATCGTAAAT ACCTCGCTTG ATGTCCGATA CGTACGTTCG CCTGCGCGGC TGTAAAAGCT 7500
GTCTCTGTTG CTGTGTATTC ATACGCGCTT CCTCTAAGCG GTGGATATGC GGTCTGCTGG 7560
CGTGGGGGCA GCGCGCCTCC ACAAAAGGAT AACTTACTTT TTCTATTCCG GAGAAACGGG 7620
CGTGCCTCCT TGTTCGCCTG CACGGGGCGT GGCAAAGTCA GCGTAACCGT GTTCAACCAC 7680
GTAGTGCACC AAACTCACGT CACCGGTCTT CACGATGGTA CCGTCGACGA GGATGTGCAC 7740
CACGTCAGGC TTAATGTACT CGAGAACTTC TCGGTGATGG GTGATGATCA GGAATCCCAT 7800
ATCGGGCGTA CGGATATCGT CAATGCCCTC GAAGACAATG CGCGTAGctC nAACATCAAG 7860
TCCTGAATCC GTCTCGTCAA GTATGGcCAG TTTGGGCTCA AGAACAGCGA GCTGAAGTAT 7920
TTCGTTCTTT TTTTTCTCTC CCCCAGAGAA TCCTACATTC AGGCCGCGCG AGGCGTACGC 7980
CTCACTGATG CGCAAgcGAG CAAGCTTCGC ACGCAACTGC gTGTGAAAGT CGAGCACGGA 8040
AACTTTAGTA CCAAGAACCG CCTCTTTTGC CGCGCGGAGA AACTCCTCGA CCGAAAGACC 8100
GGGGACTTCC TCAGGAGTTT GGAACGAGAG AAAAATACCC CGCCGAGCGC GCTCGTACAC 8160
AGGCACGTCG TTGATACACT GCCCTTGAAA ATAAATTTCC CCACGTTCGA TAGTGCAGTG 8220
GGGATTTCCC ACGATGGTGC CTGCAAGAGT GGACTTGCCT GCACCGTTCG GTCCCATGAC 8280
GGCGTGCACC TCGCCGGTAT TCAGGGTTAG GTTGAGACTT TTGAGGATGG GCCTATCCGC 8340
AATGGACATA CACAGGTCGC GGATATCGAG GAGTGTGGGC ATGAGCGGCT CCTGCAAGGA 8400
GTAAACTGAG CAGGAGTATA CGTACATTCG AATGTATGTT GCAAGGGAAA GAGACCACGC 8460
ATCCTGCACA GGAAACACAT ACAGGTTTAA TACCGTGCGC AGTGTATGTC CTACCTTGGC 8520
GTTCTACCAG TTAATAGTCA TTCCGCACAC GAAGgTGCCA AAATACCTAC GGGAAGTAAC 8580
GGTTTCCGTA A AACCATGT AGGGCGTTGG TTCCAACTGT CCCTGCTCCC ACTGTGCGTG 8640
AAGCGTCACT TTTTCAATGG GAGAGAGCGT CAGGCCCACC TGGTACTGCA CGCAACGCTC 8700
ATGAACTAGA TTTTCTGTCT TAAGATTATA GTTGAAGCGA TTCGTGGTGC CGTATACGGC 8760
AAGCGAAGGT TTAAGCCATG CAGTTTCGCC AAGCGGAATA AGGTAGCGCG CCCACACCTT 8820
CCCCATAACG GGCAAGTTGA TATGGGTGTC AGGGAGTTTC CACACACCCA TTGGAGAGAC 8880
GTAtATCCTT TTCCATTGTC TATGTACAGG CCGTGGGTAA GCGGGATATA GCACCGCACG 8940
TCCATCCCTG CTTCCAGTCC GTCTATAAgG TGTGTATAcG CGTCTCCCgC TTTGGTTTCT 9000
ACCCGCAGAA AGCCgCCgCC GTCCGTGTGT GTGCTCCCAT ACGTAGGAAA GACCATGGCC 9060
CCAAACACGC TTGCAGGAGC AGTGGCCACG TACACGCCGC AGGCAAACCA GCGCCATTGC 9120
AGAGTCAGGA GTGCATCGAG CGCGTAGGTG TCCAGCGCTT GTTGCCAGAG CACTGTCAAC 9180
AGCGCTGCAA GCAAGGGGTT TGAACTAACT GTCGGGTTCG TCTGTTCTGC TTGTACGATT 9240
TTGGTTGCAA TATCCCGCAC AGAGGATCCG TTAGCGGACA TTCCGTTCTT GACACTATCG 9300
AGCTTCTGCG CGTACTCAGG CAGCTGCAAC ACGCGCTGTG CAATGcATAC GCCACCGGkC 9360
yTCCGGGGGT GACCTTGGTT TTTTCCATAT CGATAAGGAT CGGGGCAAGA TAGTCGAGCG 9420
TTTTCCGACC TGCATAATGG TTACCCATGT CCAGAGCCAA CACCACCTTG AAGTCCGAAA 9480
GAGGAGTGAG CGTTATGCGC GCGCCTGCAT TCCACAGGAC GTCATTCTGC CTATAGTGCC 9540
TTTTCTGTTT ATGGCTGACC CTTTTATAGC CTTTTCCCAG TGTCGCGCTG GCTGCCACTT 9600
CTGCCCGTAC TAACTCGCGT TTCCACGGCG CATAGCTGAG CGTCGCGTCT GCCCCGAAAC 9660
CATACTTGCT ATGCAGAGCC TCTTCCGCCG CACCTCCAGG GGGAGCAGGA GCAGGGGCAG 9720
GCGCGTCCCA GGAACCATTT GAGGCAAAGG AAAGAAAGCA AATATCCAAG CTCACTCCAC 9780
TTGAGCCTAC GTTCTGTGCA CGGTAGCCGA GCCTGCCGCC GATGCCATCG AACCCTGGCG 9840
CAAACCGCAC CTCCTCTTCC TTGTAGAGGT CTGCAAGAAA AGGTTTCCAC AGTTGGGCAA 9900
AATTGGCGCG AAAGAGGGGA GCGGTGCCGA TCGTCATATA CGCACCAAAG CAGTGTAGCG 9960
TCGCTTCAAT AGCGGTTTCT TCTGTGACTA GGGTAAAAGG CTCACCAGGC TTTTTGGTCT 10020
GAAAATTTAC CTCCAAATCC TTGATAGAAA TTTCAGTCCA CAAGCCACCG TCAGATAATG 10080
TACCTGCGCG CCTTATGCGG TCACTTTTTA AAAACAACGG AACAGtTAaC ACAAGTGTGT 10140
GGTACTGCGG AACCCGTGCG TTATCTGACG AATGCGCGCT TTTTCCTTCT CACTCTCTTC 10200
TACTTCCTGT CCCTGCGCTC CATTTCCCTG TATCTGTACG TTCACCACAT CTCTGTCGTC 10260
GCCATCTCCA TTCTCGCGGG GCGGCACGGG AGGCGGCGGA CCTACTGCGG GGTCGTAGGG 10320
GAGCGTGATA CCCCACTGCA ACCrGGCAAA GCCTGAAAtC CGCGGCGTAC TTGCTAAGGT 10380
GTGCAGTCTG GTTTCTGCAC ACAGTCCTGC TGCGGACGAC ATAAAGAGGA GCGCCCACGT 10440
GCCAACTGAC CTGACACGCC TCCCTGCTGA GTGTCCAGGG GGATCATGGC AGGAGGAGCG 10500
CACAGCCGCC GACGGAGACA GCAGCGTGCG TCGCCGCCGG ATTTTGCTTT GAGAGTACAA 10560
CACCyTGCAG GCATAAAGAC AGGGACAGCG TACTCCTTTC ATGGCTCCAT CCTAAAAGTC 10620
CGCAGTGCGC GGCGTACGAG GAAACGGAAT AACATCCCGA ATATTCCCAA GCCCGGTGAC 10680
GTACTGCAGC AAGCGCTCGA AGCCGAGTCC AAAACCTGCA TGGGGCGCGG TACCAAAGCG 10740
ACGGAGATCG GTGTACCAGC GATAGTCGTG AGGGTCAAAA CCGCTGGCAC GGATGCGAGC 10800
ACAGAGTACT TCAAACTGTT CCTCGCGCTC CGAGCCTCCC ATAATCTCCC CTAATCCCGG 10860
AACTAGCAGG TCCATGGAAC GCACCGTTGT GCCGTcGGCA TTGAGCTTCA TGTAGAAGGC 10920
CTTGATTTCC TTTGGGTAGT CATAGACAAT CACCGGGCCG TGGAACACCT CTTCTGTTAA 10980
AAAACACTCG TGCTCGCTTT GTAAATCGCA TCCCCAGCGT ACGGGGAACT CAAAGGAGCG 11040
CCCACTGTTC TCCAGTAGTT TAATTGCCTC TGTGTATGTC AGGCGCGTGG CAGGCGCGCG 11100
GGCGACGTCT TCGAGCATGC GCGTCAgCTG CCCTGGCGTC CGCACTGGCG GTGTGCGCGC 11160
TGTGcTGCGC GCGGcAAGGG GTGTGTCGCC ccsCGCGCTt CcGCATGgCT GyGCgCGCtc 11220
GTCAAGGAAG GCTATATCcT GCGCGCAGTC CTTGAGTGct GCGCGTAGCA GGTACGCCAA 11280
AAACTCCTCT GCCACGTCCA TGCAGTCAGT GATGCGTGCA AAGGCGATTT CCGGCTCCAC 11340
CATCCAGAAC TCAGAAAGAT GGCGGCTAGT ATTTGAGTTC TCTGCGCGAA AAGTAGGGCC 11400
GAAGgTGTAG ATGCGCGTGA GGGCAAGCGC ATATGCTTCC CCCTGCAGTT GGCCCGAAAC 11460
GGTTAGGCGC GCTGCCTTAC CAAAAAAGTC GTCCGCGTAC GTGAGTGCGT AGGGGTTGCC 11520
TGCCGCGCCC GCTGCGTGTG cTTCGCGCGC AATACGCACG GGATCAAAAG TAGTGACGCG 11580
AAAGAGCTCG CCTGCACCCT CGCAGTCCGA AGCGGTAATG ATCGGTGTGT GCACGTACTG 11640
AAAgTGTCGC TCGGAGAAAA AGCGGTGGAC AGCGCCTGCA AgTGCACTGC GCACCCGTGC 11700
ACACGCGGCA AAGGTACTAG TGCGCGCGCG CAGATGGGCG TGCGCACGCA AAAACTCAAA 11760
ACTATGCGAT TTCTTCTGCA AAGGATAGGT TTCAGCAGGC GCCTCGCCAA GAACAGTCAG 11820
GTTGCAAGCG CGCAACTCAA GCGCTTGCCC GGCGCCTGGG GAGGGGACGA GTGCACCCTC 11880
GGCGCGAATG CAGGCGCCGG TAGTAACGCG TTTGAGCGTT TGAGCGAGCG TTTCCCCCTG 11940
GAGGACAGCG TCGCGGACAT TAGGGATTTG CTCAGTTGCG CCCCAGAGGA AAGGGAGGCG 12000
GAACACTCGG GcAGGGGAAC GGTAACCTGA AGGGTATCAG GGCAAGAACC GTCGCTCAGA 12060
CTGATAAAGA CAGCGCGTTT TGTCTCCCGT TTGGAGCGCA CCCAACCGTG AACGCATTCG 12120
TGCTGGCCTG AGGGGGGATG AGTCAGAATC TCCTTGAGCA AAGGGTGCAT AGCACGCACT 12180
CTAACGCTTT TACCCTCTTT GTGGAAGGGC GTGGACCGGC AAGCAGCTGT GACGCCACGG 12240
CGCACGCCCT GCGCGGCATC TGCATGGAAC GCCGCGCAGG CTGGGGAGA GCGAACCGTG 12300
CGAAAAAGCG TCGTTTCATT TCCAGGGAAC TTACTCTCTA GATGAGGAGC GGCCGAGGCG 12360
CGGTCTTCTG TGACCGGGAC CACGCCG AC GACATAGAAA ACCAGATGCA AGTAGAGTAT 12420
CAGAAACACT CCCGCAGAAA GGACGCGCGG GTGAACGATT ACCGCGCCTA TAAGACTCCA 12480
CAGGTGCACC CTTTCCATGG CGGATCCCTC GGCATGTGTG TTTCGTTCCT TAAGGATACC 12540
TGGGCACAAA CCCTTGACGT GGTGCGCAAA ATGCTCGACC ATGTGCCCGC GCGTGCCGTG 12600
CGGCAGTGTC TCGGTCCCGT GTGCATTTTG TACCATAAGT TTCAGGAGGA AATGTCCTAT 12660
GCAGCAGCGC TTCTTCTTAC TCGGTGTCTG CGCTTTTGCT TTTGGCGTCC CGGTTTTTCC 12720
CCAGCAGGGC ACAGATCCAA GTGTGGGTGC TCAGGCCAGT GCGGGCGACG GAGGCATGAT 12780
GACCGTCGAG CAAGCCTATC TGAACTCTGC AGAGGGTGTG GTGATCAAAG AGATGGTTGA 12840
gAGCaGGGGG CATGATTCAA AGGTGCTCGC GCTCCAGTAT ATCCAGGAGG CACTTGAAgG 12900
CGGACGTGGT TCTGATGACC TCCAGGAGGC GCTAAGTCGG TTGGCCACTG CTGGATTGTT 12960
CCGCGTGATC CGTGAGCAAG GGCGTGTGAT TAATGATTTC CCCGACATCC GCCTGCGTGC 13020
TTGCGAGCTA CTCGCCCGGT TTCtTCGGCT CGTACCAAGG ACGCTCTCAT CCAAGTCATG 13080
TGTGCTGACC GTGAGCTTCG GTGGTGAGGG CGGCGGTTAA GTCGTTAGGA GAGGTGGGTA 13140
TCAACGAGCA GGACGAGACA ACCGCCACTA TTGGCTGGAT TAGTCGGAAG TTTTCCGCTA 13200
TTAACCCGAc AGGTTCTCTC GCGCTTGAGA TTTTGAACAC GTACGAGCGC CTTGCTC 13257
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14512 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
AGTTTCCCGA GTGGTCAAAG GGAGCAGACT GTAAATCTGT TGGCGTTGTC TTCCAAGGTT 60
CGAATCCTTG ACTCCCCACT TTCGTCTTCC GTTTGCTTTT GGGTAGTGTC TGACTTGTCT 120
TTCCCTGGCG TTTCGTTCCA GGCGTTTTTG CTAGCTGCTG TGCCTCTTGT CACTTTCTTG 180
AGTGCAGGAT GTTCTTTTCG TGCGCGTGCG CGCGGTTGCG GAAGGATTTC AGTGGCGGAG 240
GAGGGGACGT GCGTGTGCAC TTCTGGGGGG TGCGGGGGTC TGTGCCTACT CCTGTGACAC 300
CTCGACAGGT CCaGTCAAAG ATAGCgGCTG TCGTTCaGCG CATAAGTGCa AAGGATGTCA 360
GGAATCAGAG ATCCAAGGAG CGTTTTATTT CTGATCTGCC TGCCTGGCTC TTTGGGACTA 420
CGGGTGGGAA TACTACGTGC GTGGAGATGG AGACTGATTG CGGGGAAACC CTCATCTTTG 480
ACGCAGGGAC AGGCATTCGT GATCTGGGTA TCGATCTTAT GAGCCGTCCA GGCTACAGGG 540
CGCAGGGGCA TGTATACCAC CTCCTGTTTA CGCATTTTCA TTGGGATCAC ATCCAGGGGC 600
TACCCTTTTT CAATCCTGCC TTTGATCCTC GTAATACCAT TATCGTCTAT AGCACTCGCA 660
AGAAAATGAA GGAATTCCTT GAAGATCAGA TGAGGTATCC TTACTTTCCA ATATCTATGT 720
TTGGACGCGA CGGTTTTAAC GCAAAGTTTG AATTTCGCCT GATAGGTAAC CATGAGGAGT 780
GCTTTGCTAT TGGGAAGACG AAGATAACTT GGAACCGGGT GCGTCATCCA GGCGGATGTG 840
TATCGTATGC GGTGAGCGAG GCTGGTGGGA AGAAGGTGAT TTTTTCTACC GACACCGAGT 900
TACGGCAGAA GGATTTTGAT AGAAGTGAGC GTAATGTCTG CTTTTACGAT GCCGCAAGTC 960
TGCTCATAAT TGATTCGCAG TACACCATGA CTGAATCCAT CAAAAAAGAA GGGTGGGGCC 1020
ACTCCACGTT CTCTATAGTG GTTGATTTTG CAGTAAGTTG GGGGGTGAGA AGACTGGCGC 1080
TGTTCCACCA TGAACCTACG TATGATGATA AAAAGTTGTT TAGCATTTTG CAGAATGCCT 1140
GCTGGTATCG CAAGTACGTT GGTGCGCACG ATCTTGAAAT ACTGCTCGCA CAGGAAGGAA 1200
AGGATATCTT TGTATGAGTG AGGAGGAGCG CATGTATAGC TTTAGCGGtG AAGAAATCAA 1260
GGAACTCGCG CTTCTGTTTC GTCGGTGTGG GCAGACATTG GCGCCGGCGC TGCGCCGTCT 1320
TGCGCTGTTT GTCGATCGCA CTGTTTGTCG CCATATGACG GTTGAGGAGG CTGAGGATTT 1380
TTTCGGTAGT GCAGAGCGCT AGGCAGTCGT GAGTATCCGG ACCTTTTCTT TTACTCCCAG 1440
CGTTCGGCTG AGGCGCATTG TTCTCTGGGG CAGTTTGTTC TGTGCGGGTG TTCTTTGTCT 1500
GCTGTGTCTG TGTCTTTtAG TTGGCCTTGC CCCGGTGCGT CCTTTTGTGA AAAAGGAGCA 1560
TATGTTCACT GTGCAGTCCG GTGTGGGCGC GCGGAAGGTC ATTCACGAAC TGAGGAACGC 1620
ACGGCTCATT CGATCCGAGT GGGCTGCGCG gTTGTACGTG TTCGCGCGCG CGCTTAATTT 1680
TAAGGCGGGT AcTACGCAGT TTCTCCTGCA ATGAGTGCGG TGCGCATTTT AACTATGCTc 1740
GACGATGTCG AACAACAACG CTTTATCAAG GTCACCGTCC CCGAGGGACT GACGGTAAAG 1800
AAAATTGCTG CACTGTTGCA AGACGCTACA GTGGTAAGTG CAGCGGCGTT TGTGGAAGCT 1860
TGCACGAGCG CTGCATTGCG AACGCGCTAT AAGATCCCTG CTCCTTCAGT GGAGGGTTTT 1920
CTCTATCCTG A ACGTATTT TTTTAGTTAC CAGGAACGCG CGGCCAATGT GGTGGGAACC 1980
ATGATCGAAA ACTTTCTGGC CAAGACTAGC CAGTTGCCGT CGTTTCCTGG TGATCCGGTT 2040
GCGCGATTTA AAACCGTCAT ACTCGCTTCA ATCGTGGAAC GCGAGTAcCG CGTGGCTTCT 2100
GAGGCAGCAC GCATCGCAGG TGTTTTTTAT AACCGGATGA AGGTAAACAT GGGACTGCAA 2160
TCTTGCGCGA CAGTCGAATA TGTCATTACT GAAATTGAGG GGAAAGCGCA CCCCGAGCGC 2220
TTGTTCTTTA AAGACCTTGA AATAGACAGT CCATTTAATA CGTACAAATG TGCTGGGCTG 2280
CCCCCAGCTC CTATCTCAAA TCCTGGGCTC ACCGCGTTGA ATGCTGCGCT GCATCCTGAA 2340
GTGCATGACT TTTTCTATTT TAGGCTCACC GATCCGCAGc GGGcACGCAC ACGTTCACCA 2400
AGACGTTGGA CGAGCATGAT CAAGCTGGGC TCATGCTGCT AAAGAAAAAT ACGGGAATGT 2460
AGGCAGTGGC TCGGATTTCT GCGCACGTTA TTGATGCGAT TGCTGATCGT GTGGATTTGG 2520
TTTCGCTGGT GGGAAATTAC ACGCATCTGG AGCGGCGTGG GGATGACTGG TGGGGTCGCT 2580
GTCCATTTCA TCATGAGCGT ACGCCTTCGT TTCATGTGGT GCCGGATAAA AAGATGTACT 2640
ATTGCTTTGG GTGTGGGGTT GGTGGATCCA CTATTAAGTT TTTTATGGAA ATCGAGAAAA 2700
TTGATTTCCA CGAAGCGGCA GTGCGTCTTG CAAAGCGTGC AGGAATCGAG ATGTCCTTTG 2760
AGGACGGGGT GCACGCTCCT TCTGcTCATG CTTCCTTTAC AATGcAGCTG TGTGAAGTGT 2820
ATCAGCGCAT TGCAGAGACG TTCCATCACG TACTTATGCA CACCGCGCAA GGamGCGTGC 2880
GCGCGCGTAC CTAGCCTCGC GCAaGGTAAC GGATGATTCA tACGCACtTT AAGCTCGGGT 2940
aCGTCCGCCG GATCCGGTAT GGTTGTTTCA ATTTTTAAGG CACAAGGGAT ACTCCCCCGA 3000
GTTTCTGGCC CGTTCTGGGT TGTTTGCAAA AAAAAGCGAG CGTATCGCCG TTTTTTCAGA 3060
TCGGATCATG TATCCGATTG CCGACCGCTA CGGTCAGGTT ATCGCATTCG GAGCGCGCGC 3120
CTTGGGGACT GCACCTGCAA AGTATTTGAA CACGGCAGAT ATGCCACAGT ATAAAAAGGG 3180
TGAGCACTTG TTTgCtTTCA CTGTGCTCTT TCTCAGATGA GAAAGACGCG CGCGGCGATT 3240
ATATGTGAAG GATACATGGA TGTTATCGCG TTTCATCaGG CGCAGTTGAC GTATGCTGTT 3300
GcGCcTTTAG GCGCATTGCT GACGAAAAGC CAGGCACGTT TGATGCGTTC GTTTGTCGAT 3360
CGAATATATA TGTGTT-TΦGA TGCCGACGGA GCAGGCAGAG CGGCAACGTA CAAGGCGATT 3420
TTGTTGTGTC GTTCCTTGGG TTTTGAGGTA CGGATAGTAG AATTGAATGG AGGTACTGAT 3480
CCTGCAGAAT GTGCGTGTAT AGAAGGAGAG GACGCTTTGA GAAAAAGCGT AGAACGGAGC 3540
ACTAcTGACG CGCAgTATTT GATACGGTGT GCACGCCATG AGCACAGTCA CCTTGGTGCA 3600
GATGACACAT CACGTGCGGT GTCCTTTTTA TTCCCTTATC TGAGTGTCTT GGACTCTGCC 3660
ATTCAGCGTG AGCAAGTCAT GCAGGATATT GCGATGGCGT TTGGCATTCG CATACAGGCG 3720
GTGCACGCAG ATTACCTGCG TTATGTGTCC CGTACCACGC AGAAAGGGAC AACAGGGAAT 3780
TGTGTTCTGT CTGTACAGGG AACAGCGATA CAGGTGAAGG AGCCTGCTAC GGGAGTACGC 3840
ACTGCGCAgC TGCGTTTGGT ACTAGCGGTG GTAGCAAATC CTGAGTTATT TGAGCTCcTG 3900
CGGGAGAGTG TGTGTGCAGA TGACTTTGAA GATCCTATGG CAAAAGAGTT ATTCATAATC 3960
CTAGAGGAGT GTTATCGTGC AGACACGCGT GCAAGTCCGC ATGTTCTTTC GTGTTGTACA 4020
ACCGACGAGT TAAGGAAACT CGTGAGCGAG GCAATTGTCT GTGGTGAGTT CTCTTGCAAT 4080
GCGCCGCAGA TTGTGCGTGA CGGTGTTGCG CTCGTGCGTC GTAATAGACT GCTGAAGGAG 4140
CGAGAATCGC TCGTAGGgCG GCTGCGCCGA TTTGGGGATG CATCTTCGGG TGAGGAGTGC 4200
GGGTCTATGC AGGAGCTTAT GATGGAAAAG CAGCGGGTTG ATGAGGAGTT AGAAAGGTTG 4260
AAAGGGGTGA GGAAATGATG GAGCTGTCAC GTACTCCTGC GGTGATGCGC CTGTTAGAAT 4320
ATGCGAGGGA GAAGAAGGCT ATAACGCATG ATGAGGTCGA GAACATACTC GCGCACTATG 4380
GCGTTGAGAC AGAAGAGCTG CTACATGATG TGCTTGATAT GCTTGAGCAG GAGAATATAA 4440
AGGTCTTCTC CTCTGAAGAG GAGGAGCTAG AAGACGAAGc TTTGCAGGGC TGAAAGGACC 4500
TGCCGCGGAC GATGGCGATG GGTCGTTCCC CCTTTCAACT GAGCGCGTGC GTGATAAGCT 4560
GTGCGACAgT AGCCGTGGGG CACGGCAGAA CTTGCTGTCa AACGCGCGGA ATATTGCACT 4620
TGACGATCCG GTgAAaCTCT ATCTGCGTGA TATCGGCCAA GAAAAGTTGC TCACTGCGGA 4680
CAAGAGGTCA TGCTTTCAAA GCGGATGGAA GAGGGCGAAG cATCATAAAG GACATTATTA 4740
CCCAGTCTGG GCTCCTTCTT CCTGAGTTTT ATCACATTGG GCGCAGTCTT TCTAAAAAAG 4800
CTCTTGCGGT TTTGGATCCT GCAGAAAGCG GACGTACGAG AAAGGAAATC AGCGAGGAGA 4860
TGGCCGATCG CCGGCGTCTG AAACaGGCAT ACGGAGAGGT GCTtCGCTCC TTGTATCCTG 4920
AAATGCGTCA TTACATGGCA ATGAAAAAGC GGCTGGATGA GCGTGGGGAG CCGGTGACGg 4980
TTTTGAGTAG TGATGAAGAA gTGTGTAAGC AGCGCGACAA GTTGCTTTCC TGTTTACAAA 5040
AGGTGGACTT GCAATTAGAG GAGATAGATC GCTTTTCTCG AAAATTTTTG GACACCGCGC 5100
GAAAAATACG GGAATACAAG CGGCGTAAAG ATCGCCACGA AAAGCAACTT ATGATTGCTG 5160
ACCTGTGTGA CATGCGCAAG ATTGGGCGTG GTCTGGCCGT GCCCCGTCAG CGTGCAAAGT 5220
TGGAAGAGAC GCTTGGTATG TCTGCAGATT GTATTCAAGA GATCTATACA CAGATTCAAA 5280
AAGTGACACG CAGGCTGCGA CGCATCGAGT ATGACTTTGA AAATACCATC GACGGTATTT 5340
TATCCATGGC GCGGGCAATT CACCGGGGTC ATGTCATGCT CAAGAAGGCA AAGGATAAGC 5400
TCATTAATGC TAATCTGCGT TTAGTTGTGT CGATTGCAAA GAAGTACACA AACCGTGGAT 5460
TGCTTTTTTT TGATCTCGTG CAAGAGGGCA ATATTGGGCT GATTAAGGCG GTAGAAAAGT 5520
TTGAATATCG CAAGGGATAT AAATTTTCCA CGTATGCGAC GTGGTGGATT CGCCAGGCAA 5580
TTACCCGTTC TATTTCCGAT CAGGCGCGCA CCATTCGGGT TCCGGTACAC ATGATAGAGC 5640
AGATAAATAA AGTGACGCGT GAGTCTCGGC AGTTGTTGCA AAAGTTTGGG CGTGAgCtTc 5700
TGATGAAGAA ATTGCGCAnA GCTCTGTTGG ACAGTTGAAA AAGTTAAGCA GGTAAAAAGT 5760
GTTGCGCGCG AGCCTATCTC TCTTGAAACT CCAATTGGAG AGGAGGAGGA CTCTTCCTTG 5820
GGTGACTTTG TCCCTGACGC TGACGTGGAA AATCCCTCTC GAGTTACAGA AAGAGTCTTG 5880
CTTAAAGAGG AAGTGCGATC TATCCTCTCC GCTCTTCCTG CGAGGGAGCA CGAAgTTTTG 5940
AGAATGCGTT TTGGTCTCGA TGGAGACTAC TCTCAAACGT TGGAAGAGGT CGGTTTGTAC 6000
TTTGATGTGA CGCGTGAGCG TATTCGGCAG ATAGAGGCGA AGGCCCTTAA GCGTTTGCGT 6060
CATCCACGAC ACAGCAGAAG ATTGAAGGAT TTCCTTGACA GTTAGGGGTA TGTTATGGTT 6120
CCTGCAAATG TTTTCGAGAA CTTACGGGCA CTGCAGGTGG TGCTTGCGCA GAAGAATCGC 6180
TTGGAAACCG AGATTGCAGA GGCGCCGAAG TTCTTAGTCG CTCAGGAAGA GTTGCTAACG 6240
CGTTGTAAAG AAAGTTTTAT TGAAAAGAAT GTCGAATACG AATCTGTGCG CGAAGAAGTT 6300
GCCCGTCTGA CCACCGAGTT GTGCAAGGCA GAGAAGCGGC GTGAGGATGC GGAAGTTGCG 6360
ATGGACAACA TTAGCACGCA GCGGGAGTAC GATGCGCTCG ATAGGGAGAT TCAGGAGGCG 6420
AAGCGGCAGG AGGTTGCATT GCGCTCCGAG GTAGCGCGCT CGGATGTAGC TTATAAGCGT 6480
TTGGCAGAAG AAATTAAGCT TGATCAAGAA GACATTGTGC AGCAGGAGAG GGAGCTTACG 6540
GAGAACAAGG CTCGCGTCGA CGCAGAGGTG CGTGGTAAAA GGGAGCAGGT GTTGCGTTTA 6600
CAGGAGGAAG AGCGGCGTCT TTCTCCAGAT CTTGACCGGG ATGTACTCTT TAAGTTTGAG 6660
CGTATTATCA AAAGTAAGCA GGGCGTGGGT ATCGTACCCG TGCGGGGGAA CGTGTGTGCA 6720
GGGTGCCACA TGATTTTGCC CGCGCAGTTT TCAACCGGCG TACGTGAAGG GAACAGTATC 6780
GTGTACTGCC CCTATTGCAG TCGGATTCTT TACTATGAGG AGACAGATGA GCCTGAGATG 6840
ACCTTCTTTG ATGAAGAGGA CCTGGGCAGT CTGTCGGACC TTGTCTATCC AGAAGAATCT 6900
GGAGGATTTG GGGGAGGTGA CCGGGAAGAG ATATAGAGAG GTTGGTAAAT GGGGTGACAG 6960
AGAACTGCAG ATAGTCGCTG CGGGTTTCTC GCAGAGGAAA GTCCGGACTC cTTCGGAAAT 7020
GATGCTAGTT AATTACTAGG CAGCGGCTCT CTGCAGTGCC GCTGACAGCA AGCGCCACAG 7080
AAAATATACC GCCTTTGGGT AAGGGTGAAA GGGCGAGGTA AGAGCTCACC GCGTTTTGGC 7140
GACAAAACGG CACGGCAAGC CTCATCAGGA GCAAGATCGA GCAGCAAAGG ATATTCCGAT 7200
CCTGTTTTGC GGGTTGATTG CATAAATTTA TATAGCGATA TAT AAGTGA GACAGATGAT 7260
TATCCTTGAC AGAATCCGGC TTACCAGTTC TCTGTTTTTT TAGAGTATCG ATGGAATTTC 7320
TACTAAGGCG GACGGgCACC AGAGTTTCTT CCCGGGGCGG AGGAAACTGC CTAATTCCGT 7380
GTTCCTCTTT TCGGTCTTTT TCGCCCTGGT AGTGGGCGTG GGGGTTGGTG CGTGGCGTTA 7440
CCGTCGGTAC TACCGTGGGT TGCCGAGCGC GCGCAGTGTw ATGAGGACTG GAAGAATGGT 7500
AATTACAAAG CGGTGTACGA TAAGGCGGCT GAAATTCTCC AGAGGCGGGT GTTCGACGCT 7560
GAGATGCTCG CGCTGCATGG GTTTGCTGCC TACTATATCT TTTCAGAGCA GACTGACCTT 7620
TCTGTCAGTT ACGACTACCT CAATAGTGCT ATTGTGTCCT TGCGCCGCGC GTTGCATGTG 7680
GTGCGCCCTG CAGAAGTTCC CAACGTTTCT TATGTCCTTG GCAAAGCCTA CTACCAGCGT 7740
GGGTATTACT ACGCTGACTT GGCGGTGAAG TACCTGGATC TTGCCTATAA CGCAGGGTTC 7800
AGGGCTGCGG ATTTGGCGGA GTTTCGTGGC ATGTCTGCCT CTTTGCTCGG AGATATGCAA 7860
AAGGCGGTTG AGTCGTTCAC GCAGGCTCTC GCTGCACAGC CCTCTGATCT TGTGCTCTAC 7920
GCGCTGGCAG AGTGTTATGA AAAACTTTCT GATTTTTCGA AGGCGAAgCT GtATCTGTAT 7980
GATACCATCG GGAAAACAAA GGATGTTTTG CTTGAGCTAA AGTGcAGGAA TAGGCTTGcT 8040
GCGCTGTATT TGTCTGAGCG CAACCTTGCA GAGGCTGAGC GAGAGCTGGA TGTGGTTTTG 8100
CAAAAGGATG AgCGCTCTGC GGAGGCCCAC TATCATCGCG GGGTTCTGTA TGAGATGGGT 8160
TCGGATTTGG TAAGGGCGCG GGCGGAGTGG CGGCGTGCCC TGAGGCTGAA TCCACTGCAC 8220
GAGCCAACAC GCGTGAAGCT GAACCTGAAA TAGCTTGGAG GTGCCATGTT TTTTCTCAGA 8280
CGATTTTCTG CTGACGTGGG TATCGATCTA GGCACGTGTA ACACCATTAT CTATGTGGAA 8340
GGAAGAGGGA TTGTCGTCAA TGAGCCGTCT GTGGTGGCAG TTGAGCGGGG AACGAAGTCA 8400
GTAGTTGCGG TAGGCTCGGA CGCGAagCGC AtGTTGTGGA AAACTCCGGG AAATATCGTT 8460
GCGATACGGC CGTTGAAAGA CGGTGTGATC GCGGACATGG ATaCTACCGA GAAGATGAtT 8520
CGTTACTTTA TTTCTAAAAT TTTGCCGCGC CACAGGCTCA TTAAACCGCG GATGGTCATC 8580
GGGATTCCCA GTTGTATCAC GGATGTGGAG TGCAGAGCAG TGCACGAGAG TGCTAGTAAG 8640
GCCGGGGCTG GGGAGGTGGA GGTACTTGAG GAGTCACTTG CTGCAGCCAT TGGCGCTAAT 8700
ATTCCCATAG AAGAACCGGC AGGGAACATG GTGTGTGATA TCGGGGGGGG TACCACGGAG 8760
GTGTCGGTTA TCTCGCTCTT GGGTATGGTG GTCACGAATG CAATTCGTGT TGGGGGCGAT 8820
GAGTTTGATC AGGCCATTAT CAAGCACGTG CGATCCGTTC ACAATTTGAT TATTGGGGAG 8880
CAGACTGCAG AGCGTTTGAA AATTGAAATA GGGAATGCTT CTCCGGAAAA GAATATTGAA 8940
AAGGTGGAGG TCAAGGGAAC CGACGCCATC ACCGGTCTTC CTCGCAGGCT TGAGATAGAT 9000
TCTGTTGAAG TACGTGAGGC GCTCAAAGAG CCTATCACGC AGATAGTGGA AGAAATTAAG 9060
CGGACGCTTG CTCGAACGCC TCCTGAGTTG GCTGCGGATA TCGTCGAACG GGGCATCGTC 9120
ATGACAGGCG GAGGCTCTCT CCTCAAAGGT CTCCCTAAAC TTATTTCTAA GGAAACGCAT 9180
GTGCCGGTTA TCCTTGCAGA GAATCCCATG AACTGTGTTG CTATCGGCkC AGGAAGGTAC 9240
CACGAAGTCT ACAAGGATAT TTCAGGGGAT CGTAGTCTGT ATGCGGGACT GAATTCATGA 9300
tTAGGTGGAA AAGGCTTTTT TTTTTaGAAT AGACTCTGAT CTATTCACCT TTATCGTGTT 9360
TTTGCTTGTT TCCTCAGgTC TCTTGGTCtT CTCAGGAGGG GAGCTGATTG TAAGCTTTAG 9420
GGATGTGGGG TTCTCCGTTA CCTCCCGCGT GGAGAAGGCT GCAGCTTCGG TTTCTTTTTT 9480
TGTTACTCAT ACGGTCAAGA CGTTGAAAAC CCTCTCAGAG GTGCAAAGGC GGTACGAGGT 9540
CTTGCGCGAA CAACTGAAAG ACTACGAATT CTTGCAAGGA TCACGCGAAA GTTTGAGAAA 9600
GGAAAATCaA AGGcTACGCG CCATGCTTGG GTTTTCCCGC GAGCTTTCAA CGCGCAACAT 9660
TCCTGCAGAG ATTATAGGTT TTGACCCCGA CAATTTGTAC TCCGGTATTG TTGTTAGCAG 9720
GGGTGCGCGG CACGGGGTGC GCAAGAATAT GCCTGTTGTT GCATTTCAAA GTGACACATT 9780
GGGGTTGGTT GGAAAAGTGG TGCAGGTTTC GCGTACCACG AGTATGATAG TGCCGCTTTA 9840
TCACTACCAA TTCTATGTTG CCGGAAAACT TGAGCGTGCT CAGTATCGGG GATTGATTAG 9900
TGGACAGGGG GGTAGTGACT TTCCCCTTCT AATGCGTTAT GTGAAGAAGC ACGGACAGGG 9960
AAGTATTCGT GTCGGCGACC TCGTGGTAAC TTCGGGGGAA AATTATCCTT TCCCGAAAGA 10020
TGTACCCGTC GGGAAGGTGC GGGACATTAA ACTCCACGAC CATGAAACTT CTCTTGAACT 10080
TTCTCTTGAC CCCGTTTTAG ACCTTTTCCG TTTGGAATAC GTTTTTATCC TCGACCTGTC 10140
CTTGTCCCAA GAAGGACCGC ACGGATGATA CGGCTCATCG CCTGGTCTGT AGGTACCTCT 10200
TTTCTTTTTA GCATTGTAGA GATGGCAGTG TTCGTACACG TTTCGTACTT ATCCATTATG 10260
CCAGATCTCG TCTTGCTCGT AGTACTGTTC ACGAGCATTC ACAATGGCGT GGTGGCAGGG 10320
ATATGGACTG GATTTAtTGC AGGAATTATT TTTGACTTCC TTTCTATCTC TCCCTTTGGT 10380
TTGCATTCGT TCGTTTTCAC CACTATAGGC TTTATGGTAG GAAAGGTGCA GGGaAGATAT 10440
CATATCGaTA GAGTATTCGC CCCCGCGGTA CTGGCAGGCT TTGCAATGAT TTTCAAGGTG 10500
GGATTGGTGT TGGTATTGCG AGGAGTGTTT GGTCCAAATA TCCAAGTGTA TAGCGTGTTT 10560
TCACGcAGCT TTGGATAGAA ATGACGTTGA ATATTGTGTT TGTCCCCTTT GTATTCGGGC 10620
TTTTGAATAT GTTTCCGACC ACTTTTCTTT ATAAGAGGTT TTCTTCGTAG ATGCGTTATT 10680
TTTCTCTCCT TCCTGATCGT CATATGCTTT TTAGGATAAA GGTTCTCACC TGGCTCGTCG 10740
TGcTGGTTAT GCTGTTGTAC ATGCGGCAGC TGTTTGTCAT TCAAATCGTG CGGGGGGATT 10800
CGTTCAAAAA AAAATCGCTG AACATATCTC AGCGTAgTAA AGTAATTCCT GCACAACGGG 10860
GGGAGATTTT TGATCGCCAC GCGGaTCTGC CCATGGTGCT GAATGTCAAT TCGTTTGCAG 10920
TTGATATGAT CCCCGGAGAG GTTCCGCCTG AGCAGTTCGA TACGGTGCTC AACAAATTGT 10980
CGCATATTCT GCGCGTACCT ATTTCGGATA TTCGAAAGAA AATTCCTGAT GCGGTCCGCC 11040
GTTCATTTCA AACGGTGGAG TTGCGCAGTA ACGTGAGTTA CGAGGACATC ACTGcTATCG 11100
CCCAAATAAT TGATGAACTG CCGGGCGTTT CTTGGTATTC AAAACCAGTA CGAAATTACG 11160
TTGAAACAGG ATCATTCGCT CACGTTATCG GATATGTGGG GGAGATTACA AAAGAAGAGC 11220
TCAAACGATT TTACAGTAAA GGG ACAGGC CCAACAGTCT CATTGGAAAG GCTGGAATTG 11280
AAAAAGAATA CGACGAGGTC CTGAGAGGGA AAGAGGGACA CGAGTACCGG ACCGTCGATG 11340
CCCGTGGGCG ATACATAGAA AACACTTCGG TTACTAACCC TCCTCGCATG GGTAATAACC 11400
TCGTGCTCAC CATCGATCGG CGTATACAAA AACTTGCAGA AGACGCGCTC GGTCCTCGTA 11460
TCGGAGCGGC AGTGGTACTG AAACCGACAA CGGGAGAAGT ACTTGCTATG GTATCTTATC 11520
CGTACTTTGA CCAAAACATT TTCACTCAGC ATAACGCCCA CGAACTGTAT GCGCAGyTTT 11580
CACATGATAC ACGGTTCCCT CTGCTTAACC GTGTTGTGAA TGCAAGTTAC CCGCCTGCGT 11640
CGACGTTCAA GATkGTCaTG TCAACCGCTA TTTTGGCAGA GAAGGCATTC CCCCATGAAA 11700
AGACGGTGGA CTGTCCAGGA GAGATCGAGT ATGGCAATCG CTTATTTCGC TGTCATATCA 11760
GAAAGCCTGG GCACGGCAAG GTAGATCTCC GTCGTGCGCT TGAGCAGTCG TGTGATATTT 11820
ATTACTGGAC AGTCTGTCGA GACTATCTTG GCATCGACCG CATGATTTCG TACATCAACG 11880
ATTTTGGATT TGGCAAATCG GCGCGCATCG ATTTACCCAG TCAAACAGAG GgTATGGTTC 11940
CAACACCGAA ATGGAAAGAA CGTCGGTTTC ATGAAAAATG GTTGGATGGA GACACTATGA 12000
ATCTCGCTAT CGGGCAGGGT TACATGCTTG TCTCGCCTCT GcAGGTGGCA AACATGGTCG 12060
CGATGACCGT TAACAATGGC GTCATTTATC GGCCCCATTT ACTCAAGGAA ATTCGGGACT 12120
CTCGTACTAA CGAATGCTAT TTAGGCATAA ACCTGAGGTA TTAAAGACAG CAAAAATTCC 12180
TGCAGAGATA TTCGAGCACG TGCGCGCAGA TATGCATTCG GTTGTCACGC GTGGCTCCTC 12240
CCAGTATGCA ATGAAAAATA AGACCGTGTC CCTGGCAGGG AAAACTGGTA CTGCAGAAGT 12300
AGGTTTTCAC AATCGGTGGC ATTCGTGGAT GGCAGCGTAT GGGCCTTATC ATCGCCCCCC 12360
GGATGAAGCG GTGGTCGTTG TGGTACTGGT AGAGGCAAGA AACGAATGGG AATGGTGGGC 12420
GCCGTTTGCA ACCAATATCA TTTTtCAGGG TATTTTTGCG AATGAGGATT ATGAGCAAGC 12480
AGTTGAGTCG CTCAAGTCGT ACGGCATTTC CCTTGGGGTG CCGGCAAGGA GTCGGCAGGA 12540
ATGAGGATTC GCGGTGTCAG TGATTTtGAC TACCTATTGC TTCTGACCAT GCtGGCGTTG 12600
ACCArCATTG GTATCTTGTT CATCTATTCT TCCGGGGTAA ATTCAGAGGG ACACGTTATT 12660
TCCAGAGAAT ACCTAAAACA AATAGTGTGG GCCGTCATGG GTGTGGTGCT CATGCTTTCT 12720
GTGAGCATGT ACGACTACCA CAGGTTCAAG GATAGAACAA CGCTTATTTT TGCAGGTTTT 12780
ATATTGCTGC TGATATACAC GCGGTTGTTT GGGCGGTATG TAAATGGTGC AAAAAGCTGG 12840
ATCGGTGTGG GAGAATTCGG CATTCAGATT TCTGAGTTTG CAAAGATCGC GTACATATTA 12900
TACTTAGCGC ACTATCTTGT TTATTCTCAG AGTGAGCCTA TGCTTAAGCG CTTTGCGAAA 12960
GCGGGGGTGA TTACCTTGCT GCCCATGGCG CTCATATTGT CTCAGCCGGA TCTCGGCACT 13020
GCATCCGTGT ACCTGCCGAT TTTTCTCGTT ATGTGTTTTA TTGCAGGATT TCCTCTCCGT 13080
TTGATTTTCG CGGTGGTTTG TGTGGTCCTC CTGACTTTGC TCTTTACACT GTTGCCCCTT 13140
TGGGAGCAAA CCTTTTTGCA ATACCAGGGG GTGGCTACGC GCATTGCAGA TTCGCGTATG 13200
CTGTCGCTGT TTGTGTTTTT TTCTCTCAGC GCTACGTCTG CGGTAgcGGT GGTAGGGTAC 13260
CTGCTCTCTG GAAGAAAATA CTACTACTGG ATTACTTACG CTTTGGGAAT GGTGAGTATT 13320
TCTTATGGCG CATCGCTGCT GGGAGTTCGG GTTTTAAAAC CGTATCAGAT GATGCGCCTG 13380
ATCATTTTTC TCAATCCCGA GGTAGATCCA CTCAAAGCGG GATGGCACAT TATCCAGTCA 13440
ATGATCGCTA TTGGCAGTGG CGGTGCGTTT GGAATGGGGT ACTTGAGAGG ACCGCAGAGC 13500
CATTATCGAT TTTTACCGCA GCAGAGTACT GATTTTATCT TCAGCATTCT TTCTGAAGAG 13560
TGGGGTTTTG TTGGCGGGGT GATAGTGTTT GGTTTGTATC TGTTGTTCTT TCTGCATACG 13620
CTTTCCATCA TGAGTCACGT TGATGATTTG TACGGTAAGC TCATCGCAAG CGGTGTGTTG 13680
GGTATGTTCC TTTTTCACTT TGTAGTTAAC GTGGGCATGA CCATGGGAAT CATGCCCATT 13740
ACGGGTATTC CTCTGTTGCT CCTTTCGTAT GGTGGATCGT CTCTGTGGAC CGCGATGATT 13800
GCAACGGGAC TCTTGATGAG TATCAATGCA AGGCAGTTGT AAATAGAGTA AGGAAAGGAC 13860
ATTTGGTATG AAGGTGGTTC TCTTTTATGA TCAAGGAAGA GCGCATTCAG TTGCTGCGAT 13920
ATGCGAGGTG CTTTGTGCAC AAGGATGCGC GGTAACACCG CATGCGATTG AGCAGGTGTG 13980
GAACGACACA TCACCGTGCA GTaCgcCTTT GGCnTTGGTA CAGGATGCAA CGCATGTGTT 14040
TTTTTTGTaC gcGCATGAGC CCATGCGCGA TcCGGCTTTT ATTTTCTTTT CTGGAGTTGC 14100
TTGTGGGCGT GGTATGCACG TGCTGCTCTT GGCTACAACA ACGGAGGTCA GGGATATCCA 14160
TGTATTTCGC GACTTGGTCT TTTTACTTGA GGAGGAGACG TTTGAGGATT TCTTTCGTGT 14220
CGAGCACGAG AGATTTGTAA GGCAGAAAAA GAAGCGTGTC GCACGCACTG CGCTGTTAGA 14280
GCGCGGTTAT CCATGTTTTG AAGAAAATTT CATCGCGACA GTCATGGATG GGAATATTGA 14340
TATTGTCAAT CTCTTTTTGG ATGCAGGATT TAGCGCTGCG TTGAAAGACG CACGCGGTAC 14400
gCCTGTGTTG TCTTTGGCAG TGCGGGAGGG TCAGGATGAG ATGGCAGCGC AACTTnATTG 14460
nCGGCGGTGC GCCAGTAGAT CCAGTTAATG GGATCCTCTA AGTAGTTAAT TA 14512
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3569 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CCGCCGCCCG CGTATTTCTC GCATTTCTCG TGGGTTATGC CCGAGAGAGT AGTACAGGAA 60
CATATATCGT GGCGTTATAA GGCTCTCAAG CAGGATCATA TGGAAGTAGA CTGTGATGTC 120
TATGCTTACC GTGGAGGGCG GGTGTTTCCC CGTGTGTCCG GTATGGGGGT TATCGATCAT 180
ACAAACATCC CACATGCTCT GCGTATTTTT TGTGAAAAGA TGACTGATTC TTTTATGAAA 240
AAAAAAATAG ATCCACACCT GTGCCAGAGA GAGAGAAAGT TTTTACCCCA TTACTTTTCA 300
TTTCGAATGG GTAAGCTGCC gCGTATTTGC GCAGTTGTGT TTGCTCAACC GAATGTTATG 360
CAGGGTAACT TTCTTACCGT TCATTTTAAA TTAAATGTAG AAAACGAGGA TTCTCGTATC 420
ATAGAAGTCA CGGTGGCGAA AGAGCAGGAG AATTGGAAAC TATTCCAATT TTTATTTAAA 480
GAGGATCGCG CGCATCTTGC TGTCTTGTAA GTGATGTCTG AGTCTGTAAA GGAGATGGCC 540
GGAGGATGAA AGGTCAAGAT GTCATCCTGT GCGACGGGGG ACGTCATTTT TCATATAAGG 600
TACTTCCTCG TGTGGTCATT GTGGGAAGAC CGAATGTAGG TAAGTCGACA TTATTCAACC 660
GCCTGCTCGG TAGACGGCGC TCTATCACCA GCAATACGTC AGGGGTTACA AGAGATTCGA 720
TTGAAGAAAC CGTGATTCTG CGAGGGTTTC CTCTTAGACT TGTTGACACG AGCGGTTTTA 780
CCGTTTTTTC TGAAAAAAAG GCATCGAGAC AACATATCGA TACTCTCGTG TTAGAACAAA 840
CGTATAAATC AATACAGTGT GCGGACAAAA TCCTTCTTGT GCTTGATGGA ACGTGTGAAA 900
GTGCAGAAGA CGAGGAGGTT ATCCAGTATC TGAGGCCCTA CTGGGGCAAA CTCATCGCTG 960
CGGTTAATAA GACGGAGGGA GGAGAGGAGG TGCATTATAA TTATGCACGG TACGgTTTTT 1020
CTACCCTTAT CTGTGTCAGC GCCGAGCACG GTAGGAACAT AGACGCGTTG GAAAGGGCGA 1080
TTATCCAAAA TCTGTTTTCT GTCGATGAGC GCCGGGAACT GCCGAAAGAT GATGTTGTTC 1140
GTCTTGCAAT AGTGGGTAAG CCGAACACAG GAAAATCCAC TTTGATGAAT TATCTCATGC 1200
GCCtACCGTT TCTCTGGTGT GTGATAGAGC AGGTACTACC AGAGACGTGG TAACCGGTCA 1260
TGTTGAGTTC AAACAGTACA AATTCATTAT CGCAGATACG GCGGGTATCA GAAAAAGACA 1320
GAAGGTATAT GAGAGTATAG AGTACTACTC GGTAATACGA GCAATTAGCA TCCTGAATGC 1380
CGTTGACATT GTATTGTACA TCGTCGATGC CCGAGATGGA TTTTCTGAAC AAGACAAGAA 1440
GATTGTTTCG CAAATCTCAA AGAGAAATTT AGGTGTGATC TTCCTTTTGA ACAAGTGGGA 1500
TTTGTTGGAA GGAAGTACCT CTCTAATAGC TAAGAAAAAG CGTGATGTAC GGACTGCTTT 1560
TGGGAAAATG AATTTTGTTC CCGTGGTACC TGTATCAGCT AAAACGGGGC ACGGTATTTC 1620
TGATGCATTA CATTGTGTAT GTAAGATCTT TGCACAACTA AATACAAAAG TGGAGACTTC 1680
CGCTCTCAAT ACTGGCATTG AAAGATTGGG TAACGTCGTA TCCTCCTCCA AGAAAGTATG 1740
GACACGTTTC GTTAAAGTAC CTGGtGCAGG TATCGGTTAG ACCTATTGAA TTTTTGCTTT 1800
TTGCAAATAG GCCAGATCGT ATACCGGAAA ACTACGTTCG ATTTTTACAG AATCGTATTC 1860
GTGAAGACCT AGGATTAGAC TCTATCCCTG TGAAGCTAAC CATACGGAAA AACTGTCGGA 1920
AGCGATAGAT GCAAGATGAA GGAGTGGATA TGAAAAAACT TCTTTTACGT TCTTCTGATG 1980
AAGTTCGAGT AATCGCGCCC TCGTGcTCAA TGCGTAAGAT TGATTCATCG GTAATTGAGC 2040
GTGCACAGGA GCGCTTTCGA TGTTTGGGTC TCAATGTTGC TTTgGAGATC ACGTGTACGA 2100
CGAGGaTTTT TTAGtTCTGC ATCTGTTGAT AAAAGAGTTG CGGATCTCCA TGCTGCCTTT 2160
GCAGATAAAA AAGTAAAGTT AATcTCACTG CAATTGGAGG ATTTAATTCT AATCAACTAT 2220
TGCAGCACAT AGACTATGCT CTTTTGAAAA AGAATCCtAA GTTGTTGTGT GGTTTTTCTG 2280
ATGTCACTGC GCTATTAAAT GCAATTCATG CGAAGACAGG AATGCCAGTT TTTTATGGTC 2340
CACATTTTTC GACATTCGGT ATGGAAAAAG GTATTGAGTT TACTATTGAA TGCTTTAAGA 2400
ACACTTTTTT TTATGGTCGG TGCGATATCT TAGCATCCGA AACATGGAGT GATGATATGT 2460
GGTTTAAGGA TCAGGAACAT CGCCAGTTTA TTACTAATCC TGGGTATGAA ATTATCCATA 2520
GAGGAGATAT GGTCGGGATG GGGGTCGGAG GAAATATTAG TACATTTAAT CTTTTAGCAG 2580
GTACGGAATA TGAACCGTCT CTGAAAAAGA GTATTTTGTT TATAGAGGAT ACGTCTCGTA 2640
TGTCAATTAC AGATTTTGAT CGCCACTTAG AAGCACTTAC ACAACGGGAT GATTTTTGTA 2700
CGGTGCGTGG CATTCTCATT GGCAGATTTC AAAAGGATTC AGGTATTGAT ATGGACATGT 2760
TGCGAAAAAT CATTTCGAGA AAAAAGGCTC TTGATGCTAT TCCTCTATTT GCAAATGTAG 2820
ATTTCGGGCA TACGACCCCC CATTGCATAT TACCTATTGG GGGAATGATT CGAGTTAATG 2880
TTGATAGAAA ATGTATTACT GTTCAGTTGC ATTCCTCAGT TGAGCAACTC CCAGAGTAAT 2940
TTCGGTGAAT GATGTTCTTG CGTTACCATT ACGTATGCTC GCACACTGCC TGAAATGCTC 3000
ATTGGAGAAA TAAAAGAGCC AGTTTCTGTA CTGAAGGGAA CAGGGAAAGT TGTTCTTGCG 3060
CAGTTGGAAA GGCTAAACAT TAGCACTATT GGAGATATCC TTTCGTACTG GCCTCGTTtg 3120
TGGGwwgrkA GAACGCAAGA ACAGATGTTT TCCCAATGGA cgCTGGCGCA TAGATTGCAA 3180
GTACGAGTTA GTGTCACTGC ACATTGCTGG TTTGGATTTG GCAAGAGCAA GACTCTCAAG 3240
CTTGTGGTAC AGGATGGCCA AGGATGCGTC GCTGAATTGT TATGTTTTCG CCGTAATTTT 3300
TTGCATTTTA TGTTTCCTGT TGGAAGTGAA GCAGTCGTGT ATGGAAGTTT TTATGAAAAG 3360
GATGGGTTGC TGGAAAGTAG TTCATTTGAT ATCGAAAAAA TCGATTGTAT TGAAAAAAAG 3420
ATTTTGCCTG TCTATCCCTT AACCAAAGGG TTAAAACAAA TGAAATTAAG AATGCTCATT 3480
TGTGCAGCAA TGGATCAATG GATTGGCACG GTTGATTCTG AATTGCCCAA ACCTATTCTT 3540
GAGAAATATC ATCTACTCAC AAAACGAGA 3569
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3858 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
TGCTGAATTC TTCCGCGCGT AATCCTGTCG CCCATGCTGC CTCTCGCGTT ATTGAGGCTC 60
CGGTAAGTGA GGGAGCGAAG AGTTTTGCTG GTGAGCGTGT CCTTGGTGTG CGCGTGTTGT 120
TCCCCACGTG GGACAGTAAC GCAAACGCAA TGATAAAGCC GGCGTTCGTA ATTCCTGCGT 180
ACGAGGTGAT GGCTCAGGTG GACGATCAGG GTAATGTACA GGCCCCCACA GAGGAGGAGA 240
AGGCTTCTGG AAAGGGGCGT TTTGAAGATG GGTACGGAGT GGTAAAGAAT GTGGGTGTTC 300
TTAAGTCCAT CGCGGTGAAC ACTTACGGGA TGAATTATCC TCATGGTTTG TACGTGATGA 360
TGCGGGATCA GGATGGTGAG GTGCATCGCT ACTTCATGGG GTATCTCCTG TTCGACTCCT 420
GGAAGatTGG TGTGGAACAA TCCTTCGTAT ATCTCTGATG TTCGGTCGCG GGAGGTGCGC 480
TTGTATCCCG TGTATCCCGC GTCGACGCCC CACGTCGTGT TTGAAGGCTT TATGGTTACT 540
AGGGACGCGG CTCATGCCGG AGGGGaCTAT GTTGGTTATT TCAAGGACGT CAAGATTATC 600
TATGATAAGG CGGTGCTGAG TACGGTGCGC GATTTTGCGG ACGAGGACCT GTGGGGTATC 660
CAGGCGCGGC GTGAGGCTGA GCGTAAGAGA GTTGAGGTTG CGCGTTTCGG GCAGCAGCAG 720
GTGCTGCGTT ATATAGAGCA AGAGAAGCTT GCTACAGAGG TTGGTTTTAC ACCCTCTGGG 780
GGTGCTCAGC GGCAGGAAGA GCAGCAGTAG TGCAGTAGTC TTCCTAGGGA gAGGGGGCGG 840
TGGGGTTCTA GGCGCGGGGC GTGTCTTTTC CCTCTCTTCT TTTCTTGGGT TTTAGCGGTG 900
TTTTGGCGTT CGGGGAGGTC GGATGGGTAG GAGTGTATCC GCCAGGAAGA GGCATGATCA 960
GAGTGAGGTG CGTAGGATGC GTGGTAGGAT GGCTAGGTCT GCGGCGCGTA CTTGTGCGCG 1020
GAGGTATTTG GCTGCTGTTA CATCCGGGGA TAGGGAGAGT TCTCTGCCTC TACTTAGGAG 1080
CTTGGTGAAG CGACTTGACA CCGCTGCCCG GaAAGGTGTT TTCGCTAGAA AGGCTGTGGC 1140
TCGCCAGAAG TCCCGAATGT GTAGACTGTA CAACGGTGTG TTCTCTTCAc CCGAGGTGGT 1200
GCGCGTTTGA GGCGGCTGTT TGCCCGCGTG TGTTTCTTGT CGTGAAGAGA GTTAGGAGAA 1260
CGCGGTCTTT CGTTGTCGAT GCACTTTGTG ACGAGgTGGA TTTGAGCCGT CGCCATGTCG 1320
CGAGGGTTGT TGATAGCTTT GTCTCTGTGG TAACCGCTGC ATTGGAACGG GGGGAGACAG 1380
TCGAGCTGAg GGATTTtGGG GTGTTTGAaG TCTCGCGTGC GTAAGGCTTC CGTCGGGAAG 1440
AGCATAAAGA CAGGGGAgGT GGTCTCTATT CCAAGTCATT GTGTGGTAGT GTTCCGCCCC 1500
AGCAAGCGTT TAAAGAGTGC GGTGCGGGGA TATCGTTCGG GGGAGGTTGG TGCGGATTGA 1560
GGAATGGTGT CGTTCCCGTC TGGGCGAGTT TTTGTTGTTT GTTCTGGCGG TTTCCCTGTT 1620
CGCGCTCTCT CACCCTAACC CTCTGCTTCC CAGAGGGTGT GCTCTCCTAG CGTATGGGGC 1680
GCTTGCTCCT CTCTTCCTTT TGGTAAGGTG GGCCTCGGGT TTTGCGGTTG TGTTCTGGGG 1740
GGGTGCGTAC GGCGCGTTCA GCTACGGTGC GTTTTCTTAT TGGCTTTTTG TATTTCATCC 1800
GGTGGCGTTG TGCGTAGTTG CCGGCTTCTC TGCGCTTTTT CTTGCGGCGC TGTGTCTTGC 1860
GCTGAAGGCT GGTGGTGCAT TTTGGCAGCG GCGGGCGCTT CTCGTGCAGT GTCTTGTGTG 1920
GCTTGGGTAT GAGTACGCGA AGACGCTTGG TTTTCTTGGT TTCCCTTACG GGGTTATGGG 1980
TTATTCGCAA TGGCGTGTAC TGCCGCTTAT CCAAGTTGCA TCGGTCTTCG GTGTGTGGGT 2040
TGTTTCTGCA TTGGTGGTTT TTCCTTCAGC GTGGCTCGCA TCTGTCCTGG GGCAGTGGGT 2100
TGAGGAAAGT GAAAGGAATG CTCGGGCGTT TTTGTCTGCC GCGTATAGCC ACTGGGTTTC 2160
GGCGCTGGTG TGGGTTGGTC TGTGTGGGTT TTGTGTATGC GCGGCCAAGG CGGGATGGTG 2220
GCCGGATTGC ACAGCTCACA CGCGGGCAAA GGTTGCGCTC GTTCAGCCTA ATGGTGATCc 2280
GCGACGCGGC GGTATCGAGT CATATCGGGC GGATTTTAGC ACACTGACGT ATCTTTCTGA 2340
TTGGGCGCTT GAGCGGTATC CAGATGTTGA TTTGGTGGTG TGGCCGGAGA CGGCTTTTGT 2400
TCCTCGCATC GACTGGCACT ATCGCTACCG GCACGAACAG CAGTCATTTC AGTTAGTATG 2460
CGATTTGCTG GACTACGTGA ACGCCAAGAA CTGCCCGTTT ATTATCGGTA GTGACGACGC 2520
ATATAAGAAG CGCACGAAGG AGGGGAATtG GGAACGTGTT GATTACAATG CGGCGCTTCT 2580
TTTCATTCCT GGGGTGAACG TGCTTCCGCC GAGTCCGCAG CGGTACCATA AGATAAAGCT 2640
TGTTCCCTTT ACGGAGTACT TTCCGTACAA GCGGGTATTT CCCTGGTTTT ACAACTTCTT 2700
GGAAAAGCAG GATGCGCGCT TTTGGGCCCA GGGGAGTGAA TTCGTTGTGT TTGAGGCACG 2760
AGGGTTAAAG TTTTCTGTCC CGATTTGTTT CGAGGATGCG TTTGGGTACA TCACGCGTGA 2820
GTTCTGTGCG CGTGGTGCCT CTTTGCTCGT CAATATTTCT AACGACAGTT GGGCAAAGAG 2880
TCTTTCCTGT CAGTATCAGC ACCTGAGTAT GGCGGTGTTT CGCGCAATCG AAAACAGGAG 2940
GGCACTGGTG CGTGCAAGTA CGTCTGGCCA GACGGTTGCA ATTGCGCCTG ACGGGCGTAT 3000
ACTCGATGAA CTACAGCCCT TTGCCCCGGG AGTTTTGGTG GCGGACGTTC CGATTGTCAC 3060
ATGCGCATGC GGAGGCTACC GGTATTGGGG GGACGCGTTG GGAGTCTTTT TTTGTGTGGC 3120
GTCCCTTTTT ATATTGATTG CTGGTGGTGT GCGCCATATG CTGAGATGCA GGAGGGGCGG 3180
GTGGCGTTGA AACGGGTTAG CGAAGGGCAT GGCAAGACTG TTCTGGGTGC GAAGACGGTG 3240
TTCGACGGGG TATTGCGATT CAAAGGTAAC CTGCACATCA GGGGAAAGTT CTCCGGTGCT 3300
ATCGATGCGC AGGGCTGTTT GACCATTGCG CCGGGTGCGG TGTGTGCAGT TCAGTACGCG 3360
CGTGCTGTTT CTATTTTTGT TGAGGGGGAA GTGAGAGGGA ATCTGACGGT GGTTGATCGT 3420
GTGGAGATGA GGGATGGAAG CCGAGTGTTT GGGGaTGTCA CTGCTTCTAG AATTAAAATC 3480
TGTGATGGAg TTACGTTTGA GGGGTCTGTT TGCAtGACTC GGGAAGGGAa TGTTTCGAAG 3540
CGGGATCTAT TTTCTGTCCA GTCTGAGCAA TTGAAGGAGC ATCTGCGTCG TTAGCGTAGA 3600
TATGGTTGGG TCTTGACTGA ATGCCtAAAA GAGGCGCCAC AGTTCCTGTA TACACCACGT 3660
GAAGTTAAGG GTGTCGTCTT CTGTTTTCCT GGTGTTCTAG TCTTTAGCCA ATTTAGGTGA 3720
GAGTGTTCTT GGGCGTGTAC TCGTTGGACG TCGGTTTTTC TTTCCAGGGT TGTAGCGTGC 3780
ACGGTGCTGC GTGCTGTTCA AACCGGTGTC GGTAATCTCG GTGTGTAAGT TATGAAAGTT 3840
TCTGTTGGTA CCGTCGTC 3858
(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 878 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
TCACCATATG GAAATCGCGG TTTAGGAATC ATCAATATTA CACAGCTgtA CGGGGTCGGT 60
TCTATCAGGG AGCGGAAAGA AATACAAATG GTGGTTCAAC TTGAAGAGTG GAATTCTTCA 120
AAGGCCTATG ATCGTCTCGG TACGCAGGAG CTGAACACTA CTATTTTGGA CGTCAGTGTT 180
CCCCTTATAG AAATACCGGT AAGGCCCGGA AGGAACATCC CCATCATCCT GGAGACAGCT 240
GCTATGAACG AGCGTTTAAA GCGTATGGGC TATTTTTCTG CAAAGGAATT CAATCAGAGC 300
GTACTCAAAT TGATGGAGCA GAATGCAGCA CATGCACCGT ATTATCGGCC AGATGATACG 360
TACTAGGGGG CTAAAAAACG TGCGGTGTAT GGCGGTGGAA GGAAAGCATA ATGGTCGTAA 420
AAACGGTGCG CGTGCTTAAT CGTGCGGGCG TACATGCGCG TCCTGCGGCG CTTATTGTGC 480
AAGCGGCAAG TCGCTTTGAT TCGAAGATAA TGCTTGTGCG GGATACGATC AGAGTGAATG 540
CAAAGTCTAT TATGGGTGTT ATGGCTATGG CTGCAGGGTG TGGAAGTGAG CTCGAGTTGG 600
TTGTAGAAGG TCCAGACGAA gTTGCTGCAT TGTCCGCCAT TGAGCGGCTA TTTCAGAATA 660
AATTCGAGGA AGAGTAAATA CGCTCTTACG TGTTAGAACG CCTGTGTTTG TGCTCTTTGC 720
GTGATAGGGG TACTGTACAC TGAGATAGGG AAGGGGCAGA AGGGATGTCC GTCTGGCTTT 780
TTACCGGACC TGAAATAGGG GAGCGAGATA GTGCAGTTCA GGAGGTGTGC GCGCGTGCAC 840
AAGCGCAAGG GACGGTGGAC GTACATCGGC TCTATGnG 878
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5819 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TCCAGTCTAT TAATnGTGGC CGGGAAnCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC 60
GCAACGTTGT TGCCATTGCT ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT 120
CATTCAGCTC CGGTTCCCAA CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA 180
AAGCGGTTAG CTCCTTCGGT CCTCCGATCG TTGTCAGAAG TACAGGATCA CTCGCAGGCA 240
ACATTTTGTG GnAAGCTCTG TAGGGAGATG GGATTGGCGG ACTGGAGTAA TCCTGCAGTT 300
GTGTTGGAGC GCAAGATTCG GGCCTTTACT CCCTGGCCGG GTCTATTCAC CTATAAAGAT 360
GGGGAAAGGA TAGCGATTTT GCAGGCGAGG TCGTGTGAGT CTTCGTTTGT TCCCCTCGCT 420
CCTGTGGGGA CAGTGCTTGC TGCAGATAAA AATGGGGTGT TTGTCCAGAC AGGCGATGGA 480
GTTCTGTCCC TTTTACAGTT GCAGCGCTCC GGGAAAAAAC CTCTGTTTTG GAGAGATTTC 540
CTCAATGGTT CCCCTCTATT GCTGACAGGT AGGTTAGGGG TGTGAGTGAT ACACGCCAGG 600
CGTGAGATTT CTACGCAACG CATGATGCGT ACCCCAAGTG TGTCTTGTTA CAGAGAAAGG 660
GGAGGTTGGT TTGTCCGAAG AAATTCTCAC GATAGAAGAG GTTGCGCGGT ACCTGCGAAT 720
TTCTGAACGT ACCGTGTATG AGTGGGCGCA AAAGGGGAAG ATTCCGTCAG gAAAAGTGGG 780
CACCGTGTGG CGGTTTCGCA GGTCAGAAGT TGAGCGATGG GTTGACACTT GTCTTTCCTG 840
TTCTCACAGA CAGAGCCATT CGGATGTTTT GCCCATTGAG CGGATCCTGT CCACCGATCG 900
TATCCTGCAT CTTGAACAGT CTGAGCGTCG TCCGGCGCTC TATGAGCTTT CTGATTGCTT 960
GAGCACTGCA CCTCAGATTA AAAATCGTAG CGAGCTTGCG GCAGAAATAG TGCGGCGCGA 1020
GGAGCTCATG TCGACTGCAA TTGGGTGTGG TATTGCAGTT CCTCATGTGC GCTTGTCTTC 1080
TGTAACTGAT TTGGTTATGG CGGTAGGAAT TTCAAAAAAA GGTATTGCTG ATTTCGGTCC 1140
TCTTGACGGA CAAGACGTAC ATCTTGTTTT TATGATTGCC GCTGCTACCA ATCAGCACCG 1200
GTACTATTTG CAAACGCTTT CTTTTTTTAG TTCAAAATTG AAAAGGCCCG ATTTGCGGAC 1260
GCGCCTCTTG CAGACTAACA CCGCGCTAGA AGCGTACACC GTGTTGACAG AGCAGTCTAG 1320
TTTGTAAGAT TTAGAAGAGA GCAGGATTGT TCAGGCAGAG GGAAAGCATT GACCTATTTT 1380
TTTGAAACGT ACGGGTGCCA GATGAATGTT GCAGAGTCTG CTTCTGTAGA GCAGCTCCTG 1440
TTGGCGCGGG GGTGGACAAA GGCGGTAGAC GCGCAGACGT GCGACGTGCT GATTATCAAT 1500
ACGTGTTCTG TGCGAATTAC AGCAGAAACG CGGGTCTTTG GGAGACTTGG CTTATTTTCT 1560
TCTCTTAAAA AAAAGCGTGC GTTTTTCATT ATCCTTATGG GGTGTATGGC ACAGCGTTTA 1620
CACGACAAAA TTCAGCAGCA GTTTCCTCGT ATTGATTATG TAGTGGGTAC GTTTGCGCAC 1680
GCGCGATTTG AATCCATTTT CCAAGAAATT GAACAGAAGC TTACCCAGAA AGATTACCGC 1740
TTTGAGTTTA TCTCCGAGCG TTACCGGGAG CATCCTGTCT CTGGGTATCG TTTTTTCGCT 1800
TCTTCATATA GCGAAGGTTC ATTCCAAAGT TTTATCCCCA TCATGAATGG CTGCAATAAT 1860
TTTTGTTCGT TTTGCATTGT GCCATACGTG CGTGGACGGG AGATCTCGCG TGATCTTGAT 1920
GCTATTTTGC AGGAAGTGGA TGTGCTCTCT GAGAAAGGAG TGCGGGAAAT TACGTTGCTC 1980
GGACAAAATG TTAATtCGTA TCGGGGAAGA GACCGTGAAG GgAACATAGT TACCTTTCCC 2040
CAGCTGTTGC GTCATTTGGT TCGTCGTTGC GAAgTCAAAG ATCAGATAAA GTGGATCCGC 2100
TTTGTTTCCA GTCACCCTAA AGACCTTTCT GATGATCTGA TTGCTACTAT TGCTCAGGAA 2160
TCTCGTCTGT GTCGTCTGGT GCATTTGCCA GTGCAGCATG GGGCGAATGG AGTGCTCAAG 2220
CGGATGCGAA CGGAGTTACA CGAGAGAGCA GTATCTGTCG CTGGTGGGTA AACTGAAAGC 2280
GAGTGTCCCC AATGTGGCGC TGAGCACAGA TATTCTTATT GGGTTCCCGG GGGAGACGGA 2340
GGAGGATTTT GAGCAAACGC TGGATCTCAT GCGGGAGGTG GAGTTTGATT CCGCTTTTAT 2400
GTATCACTAT AACCCGCGCG AGGGAACGCC TGCCTATGAC TTTCCCGATC GTATCCCTGA 2460
TGCAACGCGG ATTGCGCGTC TACAACGCGT CATTGCTCTG CAGATGAGTA CTACTTTGAA 2520
AAAGATGCGC GCACGGGTAG GAAAGACATT GCCAGTGTTG GTAGAGTCGC GCTCGCGAAA 2580
TAATCCTGAA GAATTGTTTG GACATACAGA GCTTGGGGAA ATGACCGTGC TTGAAGGAAA 2640
GGTGGATCCT ACGTACATCG GACGCTTTGT GGACGTGCAA GTGAAGGAAG TGCGCGGCAG 2700
GACCTTGCGT GCCCATCTGG TGCAGGAGCG TGCAAAATGA CATATGGAAA GCTGATTTTT 2760
TTTATTATCG TACTTGTGGG TTTCGCGCTC TTCATGTCCT TCAACGTGGA ACACCGCTGC 2820
GATGTATCGC TTGTCTTTTA TACtTTCAGG CAGTGCCGAT CACTTTGAGC TTGCTTTTTG 2880
CCTTTGCGTG CGGTGCGCTT ACGGCGTTGC TTTTTCTTAT TGATCCGGAC GCGAAAACAA 2940
GAAAACAGAA ACGTGAAGAC AGTCCTACCT CTGCTCCTAC AGGCGGCGTT TCTTCTCCGG 3000
AGCATGTGGA CGTTCCCTAG CCAGACTGCA ATGACACAAA GTCGCGTCTA GGGCTCGCAG 3060
GACGGCGCGC GTGTGCGTGT TTGGGTTCTC TGCTTAATGC GTGCAGTTTT TGTCCGATAC 3120
ACAGCGCATG GTGCTGTCGC GCGCGGTGTG CGCGTCCTTT TTCTTCTTCC ACGTAGCAGT 3180
TGCCGCGTAT ACGGCGCGTG TCCAGGAAAT GGCGATGCGT GGTTTTGCAT TGCGCAATTT 3240
TCAGCAGGTG CATGCGTATT TTGAGCAGCA TATTCCGTTG CTTTCTTCGT TTACGGAGAA 3300
AAAGGAAGCG CtCTCGCTCT TTGCTCAGTA TTTAGAATTG CACGATGCTC ATGAGCGTGC 3360
GGCACATCGT TACCGAGATG CcGGCGTTGT ATGCcGCTGG GTACTGAGCG CGTGCAGTTC 3420
TTACTTGAAr CTACGCGTAA tGCAATGGCC cgcGGATGCG CGCGAGTATG CACGGGAAAC 3480
GTTGGCAGAA GTCGAGCACA TAGGTGTGCA GGTGCTAAAC AAGAAACAGC ATGCTACGTT 3540
CTTGGTTTAT CACGTGTGGC TTGCGCTCCA TGCGGCGTCT ACGGCCGCGC ATCTCCATGA 3600
GCAGTTGGAA AGATTGGAAG AGTATGGCAC GCAGGGTGTG TTCAATGTGT TTGAGACGGT 3660
GTTGCTGTTT ACTCGTTGGT GGATTACTCA GGATGAGAAG GTGGCACAGC GTCTGACAGA 3720
GAGGTATcCG CAAAGCTTTG AAGCACTTTC GGTTATAGGG GCGGTGGAAA TAGCGCCGTC 3780
GGTTTTTTGG CATTTGATGC CGCGTGCGTA CGGAGAAGCA GTTGAATCAA TGGGAAAATC 3840
TGAGACAGTT GTCTTGCAGG ACGCGAAgCT ACGTCCTGTA CCCGAGGTGG TGGCAGCGCA 3900
CAGGACCCGT CGCGCGCACG TGGCCGCAGA CGGCACGGcT GCGCGGTCTG CTATGTCGTC 3960
GTCCCATAAT TTGGGCGTGT CGATTCTCGA GGGAGGGGTA TCTGTGCCCG ATGAGGTGGG 4020
CGCGGGAGAT GAGAAGCCAC GGGGGTACCA GCTCGGGTTT TTTCGAGCAA AGGAAAATGC 4080
GCAACGGCTG ATGGACGATC TGGAGAGGCG TGGTTTTGGG TTCCAGCTGC ATACGGTCCG 4140
ACGTGCAGAC GCGGTGTACT ACCAAGTTTT TGTGCCGGAG GATGATTCCG GCTTTGTTGG 4200
TCACCGACTA AAAGATGCAG GATACGAGAC GTTTCCCCTA TTCTAGGGGG CCGGCACACA 4260
TCGGTGTTTT AGAATGAGTT CCTGTATAAG GTGGTGCATA AACGCGTGGG GAAGCTGTGG 4320
ATATGGGGAT AGCGTGGGGA AAACCAGGAA TAAACCCGTG GAATGCAATT GCTCAGCAAC 4380
GCATCAGGGC GAAGGAGCAC TAGCCGGACC GGCGTGATAT CTGGTGTATT GACCGTCCCA 4440
CATCGACGTG CTGCCAATTT TGAGCCGTGC TACGTCTTCA ACCGCCCTTA CGTGTATCAC 4500
CTCGTGGGAT TCAAATGTTA CCTCAAGGCG GCGGCGGATG CTGGAAATTG CGGTGTTGCG 4560
CGTGAGAAAT GACACCGTCT TGCGGTGTGT GTGCGCGCTA TCTACCAGGA AGATTTCTTG 4620
GACTGATGCG CGTGAGAATG TGATCTCCGC GTGTTCTCGG TCAAAAAACA GCAGAGGATG 4680
GATATCGCCT GAGTTCGGGG AGATTTCAGT TTGTATCCAC AGTCCTGCCA AGAGGTGCTG 4740
GAGTGCTCCC TGTTCACCAT GTgTTGCACg kTTCCAAAGG GCTATGCGCA TTGCTGTTTG 4800
TACAGGCGGT TGTGTGTGTG TCTGTGCTnC TCCGCCTACA GGGGCCGGCG GGGCGTTTCC 4860
ATGTGACCGT GGGTCTGTCG TGGTGGACGA GCCGGACGTA TTGTCTGCCG TGTCAGACGG 4920
GTGTGCGGCG GTGTCCCGCG CGTTGCTGGG AAGTGCAGCA GTGGGTAGAG CGCCGAAGGT 4980
ATCATGCGGG GGAATTCTGC GTGGAGGAAC GTGACCGTGC TGAAAGCAGG AGAAAATATA 5040
TAGAGCCCAC GCAGCGACGA ACAACGATCC GGCCACACTC AGTAAAGGTC TATTCACGGG 5100
ACGCTTCCTT GCACGCAGTA CGGAGGCACC AGCCTAGTCA AGCGAAGGGG TATAGCGCGG 5160
ACTACTCTCT TTTGCAGGAG GAGTAGGGGT CGGGCGTTTC GAGTGCGCAG CTGCGATGCT 5220
GCGATACAGC TCCCGCGCCG TGTGGGCAAC GCGGTCGTGC ACGGCCATAT CCAAGGTGAT 5280
TTCGTACCAG TTGTCGTGTT TTACGCGTCG GGCTCTAACA AGGTAGGGAC TTTGAAAGGA 5340
ACGCTGTTTG CCACGGTACT TTGAGGTGAG CGGTGCGTCG TGCTCAAGGT CGGTAACTAA 5400
CAAGAGAACC TTGTGTCGGT TGTTGTCTTT TCTTTTTTCT TGTATCTCCC AGACGGTATC 5460
TAAAGCGCGG CCGATGTCTG TGTAGCGACC GTTTGGGACA ATGGAATCGA CAACGGAAAT 5520
AATTTTATCT CGGTCCTGCT CAcTGCGTAA GGTGAGGGTG ATAAGTTCCT CAGGCTTTTC 5580
GTAAAACTGG TAAACGGTTA TCCAGTCGCC TTGGATGGTC ATGGAGGAGA CGAACTCATC 5640
GCGCACCCAG CGGTGTAAAC TGCTGAACTT TCCTGGTTCT TGCATGGAGC GTGATTTATC 5700
TATCATCAGG AAGATGTCGA CGGGGACAGT GCGTTCACtG CATGCAGGCA CAGGTGGATG 5760
AGAAAGGTGC AGAGTGCAGG ACAAAGCGCT TTTTTCAGGT GCATTAGATA CTCCTTTAT 5819
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25187 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
TGTGGCCTGG CGCGCTGCCT CATCTGCCTG CGCAAGCTTG AGCgCAAGCG TCTCGTTTGT 60
CTGTGCACTT CCTACAGCAG GAGCTTTTGC ATACGCGTGA TCAwTACACG CCAGTAATTG 120
TTGATGCGCA AGCAAATCCA CATAGCGACG CAAAGGACTG GTCACCTGAC TGTACTGAGA 180
cAGCCCGAgT GCTGCGTGCA CCGCGGCGGT TGTGtCACGC GACGAGCTTT CATCGCGCGC 240
CGCTTTTTGT ACTCCCCCGC CAATCCCGCT GGTAtTGCAC GGGCAGCTGA GGACGTTCCT 300
GACTCACATA AGGAAAGGCA AGGTTATGTA GAAAGGCAAA CCGTGCTGCC GCTTCTCCCG 360
CTAAGAGCAT GAATTCACGC ACCATGCTCA TAGACTCGTA CGACTGCTGT GCTTCAATGT 420
GAACACGCGG CACCTTTCCC TGTGTATCGG GCACGTCCCC GGCGTTCATT TCCTTTCCCG 480
CTTGCCCTGT TTCTTGCACA GGAAAATCTA CCCTCATGTG GACATCAGGA AAGCAAATGT 540
CCACTGCGCC GCGCCCTTTT CTCCGTGCAA TGTTGTTGCG CGCAAAGTCA AAAAGAGGCT 600
GCAACGCGGG GGTATCGCGC TGGGAATCCG CCTCCGCATA GGAAAGGCGC GTAACACGCA 660
CCATGCTCCG GAGCACGTGC ACACAGCTGA TGTCACCGTG CTCATCAAGT AAAATTTTAA 720
AAGACAGTGC AGGAGAAACT GCGTCGCGCG CGAGTGCACA CGTATCAACC ACCACGTCGC 780
TGAGCATGCG CACTGCGCCT TCAGGCAAAT AGAGCGAAnA CCCCGTGTAC GTGCGCATGC 840
ATCTGCGTGC GAATCAGGAA GAACGAGCTC TGCAGGGCTC GCTACATGGA TCCAAAAATA 900
CGTACCATCG AAACTGATCG CATCGTCAGG GTCGCGCGTA CCCTCCCCAT CGATGGCATA 960
CGCGGCAAGA TGCGTACAAT CTGTGCGTGC TTGAGTGACA CACCTATGTG TGTGCACATC 1020
TTGCTGCGCC GCACGTTCTC CATCAGAACC AGACACATGA GCAGGACTTA AAAATAGAGG 1080
ACACAGACGA TCAGGATACG GATTGCGATA CACCGGCCAG AACCCGAAGT GCAAGAGTAT 1140
TTCATGCGCT TGCTCCCTGT GCTCGCAGCC TAAGGCACAC TGCAAAATCT TACAGCGATT 1200
CGCGTGCCCC AACGCGAACG CTTCGATTTC CTGCAGAAAG GGCGTAAACT GCTCGTTCAC 1260
CTGCGACACG TGCACACGAT GCATTCCTTC CTGCGCAGTT GGTGCCATAT TGCGTGCACT 1320
TCGTGCGACG CGCCGTAGCT CCTGGATGAA CGCCTGCTTT AACGCCTGAC GCGCTTCCTT 1380
TTTTTCTTCC TGCACTGCGC AGGATGCGCA CTCTTGCTGA GAGCGGATGC GCACTGCGGc 1440
GGCAGGCTCA TTGCAAACAA AATACGCACT TTGCACACTT TGCTCCCAAT AGGCCCACGA 1500
CTGCGCGGCA GAAGCCCCCC AGAGGAGCTC TGCAAGTTCA AAAAAGGAAG GAGCTTCGGT 1560
ACCGAAAAAC TCGCGCGCGT CCTGTACGGA TTCCTCGCTT ATACGCGCAA TGTGAGACGC 1620
AGACAATAGT TCTACCAGGG AAGAAACAGT GCCCGGGTGC AAGAGCAAAA CGTCTTTCAC 1680
GCGCACGcGT CTTAACCCGT GTTCGGTTTC AATGGTGATC TTCGCATCCT TCCCTCGCTc 1740
GATtAgGTGT ACACACGCAG GGCGCTTTCG ATAGAGCACC GGACTCCCCA CATGTAGTTC 1800
CATCACGCAC tGCGCCTGCA GTATGCGCTG AATAGTTTTG AGTACGCTGT CTACCGAGTG 1860
ACTGCAGACA GGCGCACACC ATCCgTCGAA CACGCTTGaA CCACCCCATA CGCACACGCT 1920
CAGCACATGC ATATGCCGCA GCATGCAATT GCCGAACCGG CATATCTACG CTTCCTACAC 1980
GATGGAATAC GAATCCGCCA AAACCTGTTT TAAAACGGTA CAACCCGTGC ATTGGGTGAC 2040
GCACATCGTC CGTTGGCGGA ATACCGTAAA AATCATACCA AAGACAGnCc GCGCGCACGC 2100
GCTTCTTGAA TTGCATACCA TTGCAGCGCA TACGGTGCCA TAAGATGGCG TGCTGAATAG 2160
TCAGAAGCTC CATACACATA AGTTGCGCAC GTGTCAAAAC ACAACAATAC CAAAGCTGCA 2220
ATTGCCTGCT CATCAGCGCA CCCTAATTCT CTGTCTTCCG ATGCCGGGGG ATGGGGTGTC 2280
TCTATGTTCT TCGGTGTGTC TTTTCCTGCA ATACGCACCC GCAGTGCTGC ACGCGGAGCA 2340
TAGGCAAGAC AGAGCACCAG CATCCCCTGT GCTGCAAATG CGGTGCAAAA ATCGCGATAA 2400
TATTGACGGG TGTGGATGGC AATGCGATCA CGCGCCGCAG TTTTTTGGTA CAGCGCGTAA 2460
AACACATCCA CCGCCGCGCG CAGACTACCC GGAGAACCCT CCTGCGCGAG cGTATCAAAA 2520
CGCGCCaCAC GCACACCGTG CTTTTGCGCA CGTCGAACGT TGTAGCGCCA TTTTGGTTTG 2580
AAAGCAGCAA AAATATCTTC CCgcGCGGGG CGCATATCCA ACAGCAATGT ATCCTGAGGC 2640
TGCACGTTAC AAGCAGCGCG CCGTAGTCCA CACGCGTGGA GCTCTCGCGT AAAGAGCTCC 2700
ATCTCTGTTC CCACCGCGCA aTGCGTAGAG GAGGATGCAA GAGAAGGGAG CGAGCACACC 2760
GCAGCAGCCC ACCCCCACGG GGGATCAAAC CGCACGAGGA ACGGtTACGC ACGAAAAAGG 2820
GAAGTAGCGC GCTCGTTAAC TCACGTAGCA GACTGGCACG TGCGCGCGCC ATCTGCCGTG 2880
ACGGAATCTG ATCGTCCTGA AGATACGGGG GAGCACCCGG CGCATACGCA AACACGCCAA 2940
AGGGCTTAAT ATTCTTGCAC AGAATGAGCA GGGGAAAGTG TTTTTCTCCC CCAGTGTTTG 3000
CATCCGGGCG CACATGCACG CTGAACACGT ACGTCTGCCA GCCGTACGCT CGCTTGAAGT 3060
GCGCCCACGC AGGACTTTGT AAAAACGTTT CTGCAGTCCA CGTCTCCTGC GTCCACTTTT 3120
GCACGGTAAC TACGAACATG GGGCACCCAT TGTACTGCTC CCCGTGCACC GGATCCAGAT 3180
ATCTCCCAAA AAGCTCCATT ACCTGCCGTG CGCTCCCGGT ACGCTCTGTA TGCAGAGGGA 3240
TACGCTCTCT CCCTCTTGCA ATACATCCGT CCCTTACCCC CACACACGCA GGGGCATgCA 3300
CAaTGCTAAG AAGCACACAT GAGCACCCTg ACCGTTCACC GAAGAACATG CACAATGGgC 3360
GAGCCTGTGT GTTGCGGTCG AggTCCGAAG CGCACAGTTC TTGCGCAGAA AGGAGCGCAC 3420
CCTATGGCAG TGCCCCGAGC AAATACyTCA AAAGCAmGCA CCCGTAGAAG GCGTGCGGTT 3480
AATATGCGGC TTGAgGCCCC GCATCTTGTT GAGTGTGGGA ACTGTGGTAA TTTTGTGCAG 3540
TCTCACCGTG TGTGTGGTAG GTGTGGCTTC TACCGGGGGC GCCAGGTGAT TAACCCTGAT 3600
GACCTTTGCT AGTGCCCGTG CGAGTGTGCA CCTGAGCGAC TGCCTTTTgC TCGCGCACAA 3660
GGAGGCTGCC CCGTGGATGA GTTGTTCTTA AGAATGAGGG CATTAGTGGC AGAGAAATTA 3720
GAGGTGGAGG AGGCGTCCAT CACGCTTGAT TCCTCCTTCC GAGGAGATCT CGGTGCTGAT 3780
AGCCTAGATA CCTACGAGTT GGTCTATGCG ATCGAAGAGG AGATGGGGAT TACTATCCCC 3840
GACGAAAAAG CAAACGAGTT CGAAACAGTC AGAGATGCGT ACGAGTTCAT CAAGTCCAAA 3900
GTGACATGAG CCTGTGTCTC GGTCATATTT TTTCCCGCTC TCGTTCTCCC CTCACCCCCG 3960
AGCGTAGGGA GTCTCTCCGG CGCCTGCAAG AGACGCTCGG CGTTAAATTC CGCGATCCTA 4020
CCGCACTCGA CCAGGCACTT TCTCACCGGT CTTTGTTTTC CTCAAAAGAG GACCATTGCG 4080
GTGTGCGCCA CAATGAGCGC ATGGAGTTTC TCGGGGATGC CGTGCTTGGC GCGGTAGCCG 4140
CCGcTTGgCC TGTATCGCGC ACTTCCCGAC AGTCACGAGG GGGATTTAGC AAAGACTAAG 4200
GcGGTGCTCG TGTCTACTGA CACCCTCTCG GACATTGCCT TGAGCCTGCG TATAGACCAC 4260
TACCTTCTGC TAGGAAAAGG GGAGGAGCTT TCAGGAGGTC GGCACAAAAA AGCCATCCTT 4320
GCCGaCGCTA mCGaAGCTGT CATCGGTGCG CTTTTTTTGG ATTcAGGkTT CAAGGCGGCA 4380
GAGCGTTTTG TTCTCCGtCT CCTgCTCCCC CgTgTCCGCC CCaTaCGAGA GAAAAAtTTG 4440
CACCATGACT ACAAATCTAC CCTCCAGGTG CTTGCACATC AGCGCTaTCG TAGTAAGCCG 4500
GAGTACACGG TCGTCAAGCG CACCGGACyT GATCACAgCG TACGCTTCTG GGTGGATGTT 4560
ACCGTTGGCG ATGCACGCTT CGGACCCGGT TATGGCACCA GCAAAAAAAG CGCAGAACAG 4620
TGCGCCGCTC GCCTTGCATG GGAACAATTA TCCGGCACCC TCCGGGAGTA GCGCGTATGC 4680
TGCCCTGTAA GaTACTCTCC TTGTCCCGCT CTGACACCGC CCGCCCCTTC GTAAAATGGG 4740
CAGGAGGAAA GCGCGCCCTC GCCCCAACCC TTTTTGCGCA TATGCCACAG ACATTCGGCT 4800
CCTACTTTGA GCCTTTCGTG GGAGGGGGAG CGCTCTTTTG GCACTTGTGC GCGTGTACTC 4860
GGGTGCGCCT ACACGACATC TATCTATCTG ACATAAATTG GCCACTGCTG TGTGCGTATG 4920
CAGCCGTTCG TGACCGTGTA GAAGAACTTA TCGTCCGGGT TGGACAGCAC ATCGCCTGCC 4980
ACACCCCTAC CTATTACCGT CTTGCGCGGC GTAAATTCGC CGTATGCGAG CATCCGCTCG 5040
AGGTTGCCGC GCTTTTCCTG TACCTGAATC GGAGCTGCTA TAACGGACTG TACCGTGTCA 5100
ATAAAGCAGG TCAATTCAAT GTGCCTCTCG GACGCGCTGC ACCTGCGTCT CCTTTTCTAA 5160
ATACCACCGC GCCTACCCCT CGCAGTACAC AGCCTGCGGC GCAGGTCGGa CACCTTGcAA 5220
TACGCATTGA TGAGGAGAAT TTACGCAGCT GCGCGCGTGC GCTAGCAAAC ACCACTCTTA 5280
ACTGCCAACA CTTTTCTTGC ATTCAACCTG CACGAGGAGA TTTTGTGTAT CTCGATCCAC 5340
CGTACCTTGc ACCTTCAGTG CCTATGATAA AACCGGTTTT GATAGAGCAG CGCACGAATC 5400
GCTTGCTGCG TTTTGCATGC ACCTAGACGC GCGGGGAGTT CTTTTTATGC TCTCAAACAG 5460
CGATTGCCCT GAGGTACGCG CATGGTATCG TCCATTCCGT GTGCAACAAC TCAACGCCCC 5520
TCGGTGTATC GCACGATCCG CTCACGcAAG GGGAAAAAGG TGCGAAGTGC TTATCACCAA 5580
TTACCCCTGC GCTGACACGG CTACACCGTA GCTTTCTGCA CTCTCCTGGC CGTATCGCAT 5640
CGCGTATTGC GGCGTTTAAT GCCACTACAG AAGTTTTACG GTCATAAAAA CCATCCGTGG 5700
GGACCCGCGT GCTCTGCGAT AATGCTTCGT ACACACTGCA CGTATGACGT AGTAAAAGAT 5760
ATAAAACGGT AGAAAAACGT AACAAATGCA GATACTATGC CCGCCATGTA CAGTCGAAGG 5820
GGAACCGTGC CACATCTTAC CTTTGAAGCG GCACTCAGAC ACTGTGCCCA GCACTTTGGA 5880
TCTCAAAATG CAGTCTGCTT CCTAGGCCAT GCTACGGACG CGCATTCGCG GTGCTGCTTG 5940
AACTACCGTC TCTTTGCACA GCGTGTGCGC CGTGCACGCC AGTTGCTGAT GCGCTGTGGT 6000
GTGCGCGCAG GAAGCTGCGT TGCGCTCTTT GGCCCCAACT GTCCACAGTG GGGAGTTAGC 6060
TACTTTGCAA TAGTAAGCCT TGGTGCCCGC GCAGTCCCTC TCGTACCAGA GCTCAGTCCg 6120
CAGaGCTGCG CCGCTGCCTC CAGCATGCTC ACGTTTGCTG TGTCATTGCG GGCGCTGCAG 6180
AAAGAGAAAC ACTCGCCCAA GCGGATACAC TCACCGATCC GGACGCTGCT TCTTGcTCCG 6240
CAAAAGACGG GCAGGACCTT TCTACCGTAT CGCACACCGC GCAAAGAACA CTGATCGCTC 6300
TGGAAGATTT CTCCCTTGTC TGCACAACGG ACGGTGTACA AAACACTCCA GTACCTGTGA 6360
CGCACTGGAA GAATGCTGGA TCAGACCCGG ATGCCATTGC CAGCGTGGTG TACACCAGCA 6420
CCGGAGGCGC TGGCACTCCT CcCCGTGCCG TAACATTTAC CCAACGGAAT TTACTGTGCA 6480
CCGCGCGATA TGCACAGCGT GTACTGCGTG TACGCACGCA CGATGTGGTT TTTTCGCTCC 6540
TCCCCCTTGC ACACTTATTC GAGTTCGTGT GTGCGTTTCT TGCAGTTTTT TTTACAGGGT 6600
GCCTGCGTGT GGTATGCACC TCCACTCCCA CAGATGCGGT TGCCACTGCA GCAATTGCAA 6660
TGCGTAAAGC CGACGTTGCT TTTCTGTCTC CCACCTTTTC TGGAGGCTTC CGAGCAGCTG 6720
CGTTCGCGCC GCCCCTGTGC TCGGCAGCTC CATACCCAAC TTGGAGGACA GcTGCGCCTC 6780
TTGGTTCTCT GGAGCGAAGA GTGCAGTGAA CACACGCAGA TTTTGCACCG GATCTCGCTG 6840
GAAGCGGTGC TCTTTCACGG GTATCTACAT GCGAGCGTGC TCATTTTTGT GACCGCAAAG 6900
AAAAATACAA AAGACAAAGA CATGCCGGCG CCACGCACTC AGAcGCGTGC GGTGCCCGGT 6960
TTAAAAACGC GCCTTTTAAA CTACAACGCG TTCACCAGCA CCGGGGAACT TGCCTTACGG 7020
GGAGAGGGTG TAACACCCGG CTATTGGCGC GATGAGGTAC GCACACGTGC AGCGTTTACT 7080
CCCGACGGCT GGCTCCGTAC AGGGACGCTT TGGACAAAGA CAGAAACCGG TAATCTCCTC 7140
CCCTGCAGCA GCTCGTGCCA TATGCAACTC GGTGCGCGCG GAGAAGCGGT GTACGCAGAG 7200
GATCTTGTTT GTGTGCTTAT GCAACATCCg TGCGTGGTGC ACgCACACGT GCGCGTAGAC 7260
ACGCAAGGGC AAGCGCACTG CGCCGTATGG GTAAAACAAG GAGCCGAACG AATACGGCAC 7320
CCTCAGATAG TGTGCTTTTT GCGCACCCAG CTTTCCACGC AGGTGCGCGT GGGGACGCTT 7380
CGGGTGATGG AGTATGAAAA AACAGTAGGT GCTGACCGCT TGTCTCGACG CGACACACCT 7440
TTCCTAAAGG AAAACTCTTT GTAGAAGGCG CAGCTCGCAC GTTAATGTAC AAAACAGGCT 7500
TCcCCTGACG GAAAAAAAGG TACTCCACGT AGTGTTCTTC GATGGTTCGT ACAGGTCCAG 7560
GCATATTTAG GCGTAGTCGC GCCTGGGCAT CTCTTTTGCG TATTATAAAC GACCTATAAA 7620
AGGAGCACAG ATCAGCTACA CGCTGGTATG CGTGCGCCTG CGTGTTTTCT CCCCCGCCCT 7680
TTTCCCCGAT CAGGGCGTAT GCTTCCTGAC GGCTCAGTGC CGCAACCCCt GCGCACGCAA 7740
TTGCAAGCGA ATGGCTTGGA TCTGCTCATG TGAAATACCA AACAGGAACT CCTCTAGTGC 7800
GGTACAAACT CCtGtTCGCT CAAGCGCTGC ACTAAAAAGT CGCGCCACAT CTCGTCTAGC 7860
TCTGTCGAGA GCAAAAACAA CTCACTAGAA CCGAGCATTG AGCCAAAGCT AGGGGCCAAG 7920
TGCTCCCGTG CACACCACAC CTGATGCAGT GCCGACGTaG GCGCCGCGCA CGTTCATCCG 7980
GACGCTGTTC CCACAGCGCT ACGAGCGCAT GCGTAACAGC AGTCTTCAAA CGCGCGGGTA 8040
CGTGCTCCTG CTCCATTAGA GAGAGGTATA GATCCTCTGT CATGAGCATA CACACGAGTG 8100
CAAACTGCAC CTGGCGCAAC TCCTGCACCA GGTACTCAGC TAGAGCTACC CGCGTTACCT 8160
CCTGCGTAAG AAACGCAAAG GTTTGGATTT TAGCAATCAG GTACCCACGC CTAAGCGCCA 8220
CACGTGCAGG TAAAAAGAGG AACCTTCCTG CATCGGCCAG GTCTGTCACC GTTCCCTTTC 8280
CAAACCTTTC TACGTCCGCA CCATCTGCAC CTGAGTACGA ACGCATAGCA CGGACCAGAC 8340
GTTTTAAATT AGACACGCGC CGGCGTACAA CCTGCGCACG CCGAGGCTGC GTGCgCAAAG 8400
CACACAGCGC GCGCGAGACG ATATACTGTT CATGTCTGCT AAGAGCGGTG AAGTGTATGA 8460
CTGCAGCCTT TATGGGAAAC CCCGTCGTAG CCGCGCCCAT GGGTGAGTCT AATGTATTCT 8520
GACACGAATT ACTAGAATGA CCTCCCTACA AACCAAGGGA AAGGAAGGGC CTAACCGCTG 8580
AGAATGACAG GCGCACGCAC GGCAGGAATA ATCAATTCCA CACGTATCTT CTCCTGAATT 8640
CAATCATATA TTTTTTACCG GCATACATAG GGCACCTAAA AAAGCGTTGA CTAATCAGTT 8700
ACGCGTGCAT ACACTCTCGG CATGGAGACA GATTACGACG TTATCATCGT AGGCGCTGGG 8760
GCCGCGGGAC TGTCCGCAGC GCAGTACGCA TGTCGCGCCA ATCTCAGGAC CCTTGTGATT 8820
GAGAGCAAGG CACACGGTGG TCAAGCATTG CTTATTGATT CGTTGGAAAA CTATCCGGGT 8880
TATGCAACTC CTATCAGTGG CTTCGAGTAC GCGGAAAACA TGAAAAAGCA GGCAGTTGCC 8940
TTTGGGGCTC AGATTGCTTA CGAAGAAGTT ACCACTATCG GTAAGCGCGA TAGTtTTCCA 9000
CATTACCACG GGTACGGGAG CATATACGGC GATGTCTGTT ATTCTTGCCA CCGGTGCAGA 9060
GCATCGCAAG ATGGGCATCC CGGGGGAGAG TGAGTTTTTA GGCCGTGGCG TTTCCTATTG 9120
TGCCACCTGC GATGGACCCT TCTTTAGAAA CAAGCACGTG GTGGTCATTG GTGGGGGTGA 9180
CGCTGCGTGT GATGAATCGC TAGTACTGTC TCGCCTCACC GATCGGGTGA CGATGATTCA 9240
CCGCAGGaCA CTCTGCGTGC ACAGAAGGCC ATTGCAGAGC GCACACTTAA AAATCCACAT 9300
ATTGCCGTTC AATGGAACAC TACGCTTGAA GCGGTACGTG GTGAAACGAA AGTTTCCTCC 9360
GTTCTGCTTA AGGATGTTAA GACGGGAGAA ACGCGAGAGC TCGCGTGTGA TGCTGTTTTC 9420
TTCTTCATCG GTATGGTTCC CATCACCGGT CTTTTGCCCG ACGCAGAAAA GGATTCCACC 9480
GGTTATATCG TCACCGACGA CGAGATGCGT ACCTCTGTAG AGGGGATTTT CGCTGCGGGG 9540
GATGTGCGCG CTAAGTCTTT CCGGCAGGTT ATTACTGCTA CTTCGGATGG TGCCCTTGCC 9600
GCGCACGCCG CCGCGAGTTA CATCGACACA CTCCAAAACT AAAACTGCGC GTCTTTGCAC 9660
TTCGGGTGTG CGTTTTTTAT CCTTCGAGGG GAGGGTACTG TTCTCTCTCC CCATCCCCAA 9720
CTCTTTCTGA GGAAGCTTTG GAGCTCGCGC TGGCTGCGAC GGTGTGTCTT TTGCAAAAGG 9780
GTCTGGCGGG GTCGTGCGCA GGTGTGCACT CGATCTTCCA GGAGCCTCTC GGGCCATGCA 9840
CGCTTTCTGT CTCCTCTGTA CCTCCGTCGT GGGGGACATA GTGTCTCGTC TCGTTCGCAA 9900
CCTCGACGCG CATACCATTG ATGAAGCCCT TGCCTTCGTT AAGTCGCGTG AAGCATTCAG 9960
TGTCAGCTTG GCGGAATATC TAAAAGCAGC TAAATGTTCT TTTTCCACGC GTGGTACCGC 10020
TCCCTTTATA CGGGGCGGTA GCGTACTGTA CCGAGATGCA GAACCGTGTG CAGTACTCCT 10080
GcTCACGCGC GCAGGGTTGC TTTTGCACAA CAGTGAGCCA AACACAAACA GTGCGGCGAT 10140
CTACCGTGCC TGCAAACGGC TGGTAACCGC GCAGGTGCGT TCGATTGTAG GTACAGAAAT 10200
GCACACGTGC GTTATCGCAC GCTCGATTTC AGGTATTACA ACTCACGCGC AGCAGGAGCG 10260
ATATTACCTA CTGGTGCTGC CACTACACAC ACCACGTGTT GAATATGAGC AAAACAGTGC 10320
TCCATTGCAT ATCCGGCGCG CACAGCTAAA AGATATGCGA GAGCTATTTC CTCTCCATAT 10380
GCATTACAAA CGTGAAGAAG TACTGCCCGT AGGAAACAGT CCAAAACATA AGGCGACCGT 10440
GCGTACCCTG CACGCACATC TCCGTACACG CGTGATATTC CACGCCTCAA TAGGAGGACA 10500
CATCGTTGCA AAGGCGCAAA CAAACGCACA TGGCTTTCAT TGTCACCAAA TTGGCGGGGT 10560
ATATACGGTG CCTGCATATC GCAACCGGGG CATTGCCACT GCATTGGTTG CAACGCTTGC 10620
GTATAACCGA CTCGATATAG GAAAAACACC GGTGCTTTTC GTAAAGGTAC GTAACATGGC 10680
AGCGCGGCGC GTATACGAAA AGATCGGCTT TACGCTACAC GGATTATACC GCGTCATTAA 10740
TCTATAGCGA AgCAAACAaT AAAGACGTTA AAAGAGAAGA ACACGGCAAC GAAAGGGAAA 10800
GACAACGTCG CGTGACCTCT ATTTTCCAAA AAAGCTACGT TGACCAGCCC TTACCCATCG 10860
CCCAGGGCCA CGTGACAGAA CCGGAGATGC AAGCTGCGTT GCAATCCAtG CGTATAGGAA 10920
TCGTCGTCAA CTCCGTCAAG CCGTATTAAC AATGTCTGCA TACGCGGTGC ATCCTTTCGC 10980
GCAAGACCGA GCACGCACTC CTCCGCCATG CGCACAAAAG CGGCAGACGC AACAGATACC 11040
ATAGCAGACG CACAGTGCGC GCTCGTCTCC CGGTGCACAA ACAGAAGTTT CCCACGCGCC 11100
TGACTGACAa AATGCGCAGT CAGCATGTGC ACCAATGCCA TATTTGCCAC GATCAAATCC 11160
ACCGAGATCC GATCGATACT GGAAAAATCA TCTCCCGGAT AAAGATCTAC ATAACTTTGC 11220
GCGTCAAAAA CGAACACCGC TGTCTCAAGC ACAtGCCTAG tTTTCAaTCT GAAGCAGAnT 11280
ACGCGCAATG AAAAGGGAGA CGCGCGATTC CAATAAaCCA aCCCATCACG CACAGGCACA 11340
TCCCCACGAG CTGCCTCGTG TGCGCTCAAG CACACACGAC ACCCCTGATT ACGTAAAACt 11400
CTGCGAGACT ACGAGAAAAC TGAAGGCGAT TATCGCTGAC GAAAACTCCC GCGCCCATAC 11460
GCGGGAGTAT AAAACACATC CTGGCGAAGA TAAAAGCGTC CTCACAGCAG AAAGACCAGC 11520
AACCATTCCG CCAGGAAGTA AAGAAGAAAC GACACGGAGT CCTGCGTTCT CTGCCCCGTA 11580
CCCCAACTTC TTATAGAGTT CCTCGCACCT TACAAGCTAA CAACCAAACC GCTCAAAGGT 11640
CTTTGCCCCG TAGTAGCGCG CCTGTGCACC CAACTCTTCC TCAATGCGCA TCAACTGGTT 11700
ATAtTCGCCA CCCGGTCACT GCGACTCATC GAGCCGGTTT TGATTTGACC TGTCTCAAGT 11760
GCCACTGCTA AGTCTGCGAT AAACGCATCC TCTGTCTCAC CCGAGCGATG TGAAATCACC 11820
GCCGCGTAGC cTGCGTTCTG AGCCATACGC ACCGCGTCGA CAGTTTCTGT GACCGTGCCA 11880
ATCTGATTAA GTTTTATCAG AATCGAATTG CACGATCCTT CTTTGATACC TCGGGCCAGA 11940
CGCCCAGTGT TGGTTACAAA AAAATCATCT CCCACAATTT GGACTTTGTC TCCCAACTCT 12000
TTCGTGAGCT GCACGTAACC TGCCCAGTCG TTTTGGTCAA GCGGATCCTC GATAGACACA 12060
ATCGGATACG TAGCAATCCA CTTCTTGTAC AGATCAATCA TTTCCTGTGC TGTGAACAGC 12120
TTCCCCGGAT TCGACTTCCA AAACTTGTAC CCTCTCCGAT CTCCTTCATC GAATAACTCA 12180
GAAGACGCAC AGTCAAGcGC AATACACACA TCCTTCCGCG GCGCAAGGCC GGcTTTTGCG 12240
ATCGCTTTCA TAATGTACTC AAGGGCTTGC TCGTTATCCA AATCAGGCGC AAAACCACCT 12300
TCATCACCAA CTGwACGTAr cTTTGCCGTC GGCGGCAAGC AGGCCCTTTA ACGCGTGGAA 12360
CACCTCTGCG GTCATGCGCA CCGCTTCGCG CATGGACGCA GCGCCGATGG GCATAACCAT 12420
AAACTCTTGA AAGTCAATTT TATTATCAGA ATGCTTCCCC CCATTGATAA TGTTGGCCaT 12480
AGGGACCGGC ATGCGAAAAG TGTGCACACC ACCGAGGTAA CGGTAGAGAG GAACACCCAG 12540
AAAGTCTGCA GCAGCACGCG CACAAGCCAT GGAGACGCCA AGcATAGCAT TCGCACCAAG 12600
CTTTGACTTA TTGTCAGTGC CGTCCAGATT CCGCATCnsG TGATCTATCT cACCCTGGTT 12660
GAGCGCATCC ATACcTTCGA GCGTATCAGC AATGAGCGTG TTGAmAGTTC CAAcGGCCTT 12720
GAGAAcACCC TTACCGTTAT AgCGcTCCtT GTmtCCATCA CGCATTTCGA GCGCCTCGAA 12780
CTCTCCgGTA GACGCCCCTG AAGGaACACA CGCACGGCCA AAGCTACCGT CAnTGAGCGA 12840
GACATCCACT TCGACAGTGG GGTTTCCCCG AGAATCGATG ATCTCGCGCG CTTCAATGCA 12900
TGCAATGTCA CTCACTTAAG ACCTCCTGAT GTGGGCGCAT GGTAACACGC GAAAAAAAAT 12960
GCGTAAAGGT TTTCCACTCT CTCTATCGCC CCCGCACGCC GCGCTCCCTT CCCACTGAGA 13020
ACACCACAGA GAAAGTAACT GTACCAACCG CCTCGGTATT CCAACGGAGT ATTGCAaCGG 13080
CGCGTTATAT CTATTCTCGA TACGCAATGG GAACCATCAA TTGATCACGA GACAGAATAG 13140
TGTGCGGGAA AACACGCTGC GCTTCCTTTA A AAACGTTT CAACTCGTGA TCGGTATATC 13200
GAGGGCTATA GTGGATGAGT GCCATAAGTC GCACACGCGC ATCGcGcgCT aTCGTGGCTG 13260
CCTGCACGCA CGTCATATGC TTTTTCTCTG CTGCATCCTT TTCCATCCCT TTCTCAAACA 13320
TTCCCTCACA CACAAAGAAA TCCGAATTCC GCACCTCGGc TGCAATGGAC TGCAAATATT 13380
TTGTATCAGT GACGAAGCTC ACCTTACGCC CCGGACGCGC CGGTCCCATT ACCTGTTCAG 13440
GATATACTGT CACCCCCTGC GCGGACTGCA CTGCAACCCC TGACTGTAAC TGAGACCACA 13500
GCGCCCCACA GGGAACGTGC AAATCCTGAG CCGCGCGCGG GTCAAATGAT CCGGGACGAT 13560
CCTGCTCTTC TAGCGTGTAG CCCATACACG GCTTGGTATG ATCCAGACAA AAACAGCGCA 13620
CCTGAAAATC CTTACCACGG TATACCACTT GTGGTTCTAT CACCTCTTTG ACAATAATCT 13680
CGTAATTAAT GTACATGTCC AAAATCCTGC GGCTCGTTTC CACATACTCT GCAGTTCTTG 13740
GAGGACCGAT GATGTACAGC GGTTCGCTGC GAGCAACTTG AGAAGAGAGC ATCAAAAGCC 13800
CCGGCAGCCC AGTGATGTGG TCTGCATGGG TGTGACTGAT GAAAATGGCA CTGATTTTCT 13860
TCCAGCGTAA CTCAGACGCC GCAACGACAC TTGGGTACCT TCCCCAGCGT CGAACAGAAA 13920
CAACTCTCCC TCACGACGCA ACAACACAGA AGTCAGATGC CGATGGGGTA ATGGCACCAT 13980
GCCGCCACAC CCTAAAATAA ACGCTTCAAG ATTCATATGC ACACCATTCC GACGTCAACC 14040
TTTTGGAGGT TAACTCAGGC GCCTGATGGA TTCTTGTGCC TCTTTGTACT CAGGACGCAA 14100
ATGCAACGCA CGTCGGTATG CATCAAGTGC AAAGCCTTTG TCTCCTGCGT ACTCGCGCGC 14160
AAGTCCCAGG CGATACCACC ATAGCGCATT CGACGGTTCA AAATTCACCG CTGTTGAGTA 14220
CGCGATGTCT GCCTTCCGGA AGCGTTTGGT GAGGCGAAAG ATCTCTCCCA AATAAAAGTA 14280
TGCCACACTG ATACGCTCAC CGCGCGGCGC CATGTCAATA TAGCGCTCCA TAGCGTGCAT 14340
GGAACCCTTG TAATCGCCGA CAAAAAAGAG CGcCTCTGCG AGAGTTTCCA CCACGCGGTG 14400
ATCGACCGAA ATTTTCAGCG CCTCCTGGCA TAGGGCAACC GTATCTGCAT AACGACCAAG 14460
ACGGAAAAGA GACCAGGTAC ACACTGCATA CGCGTCGGCG TGTCGCGGAT CGCGCTCAAG 14520
CACACTGCGA CAAAGCTCAA CCGCCTGCGT ATACATCTTT TGTGCATCTT CACGCCCACC 14580
TGAAGTGTCC ATGtACGCCC ATTCCGGTAA AGAGAGAGTG CCTCTCGCAC CTCAGCTGCT 14640
CCCGCCTGTG CAGCAGCAGG AGGTTGCTCC TGCGCACTGC CACGTGCAAG AAACCAGACA 14700
AGAmCGmaTG CCCCCGCTAC TGTCCGTGTT CGTGTAAACA CAGCGCCTCC TTCAGACACA 14760
TCGAAGGCTC CCGCACAGAG CGCACGCCCT TTATGAAACG CGACGCGCAC ACGGGACACC 14820
TTCTTTCAAA AGACACACCC ACACCATCCC CATCCTTGAG TATGCAGAAG AACCGTCAGG 14880
ACTGGGTAGG TTTTAAACGG AAAGAACTTG CACCCTACAA AGCAGGCGCC ACCTCCCCAC 14940
CCTTGTAGCG TTCCTTCAGA TACGTACGCA CGCGCCCAcC ACACAACGCA CGGAGCACCG 15000
CCTGCACGCG CGCATCAGCC TCGTTTCCTC GTTTTACCAC CAGCACATTC GCGTAGGCTG 15060
AGGCATCAGG TTCCACTGCA AGCCCGTCAC GCCGTGCAGA AAGACCAGCC ATTATTGCGT 15120
AATTTCCATT AATCACCGCA CCATCTACCT GATCAAAGAC GCGCGGCAGA AGGGCACTTT 15180
CCACCTCCTG AAGTACCACA TTGCGCACAT TTTGCTGCAC ATCCTCTACT GTGGCAAACA 15240
GTCCTGAACC CGCACGCATC CGAATGAACC CTGCTGCTTC CAAAAGTCTG AGTGCACGTG 15300
CCTCGTTGGA CGAATCATTT GGAATGGCAA TGACCGCGCC GGCGGGGAAA TCACTCACAT 15360
GCCGATACGT TCTAGAGTAT AACGCCAGTG GCTCTACGTG CACGTTTCCA ACACTTACCA 15420
GGTCCCCGTT GTGCTCCTGG TTAAATTGCT GCATATGGGG CACATGCTGA AAGAAATTCA 15480
TCAGAATATC CCCCCGCATT ACCGCCTCGT TCAGCGCCAC GTAGTTTGTA AACTCTACAA 15540
TACGTAGTTC GATGTGCTGC TTCTTCACTT CTTCTTTTGC GATCTCAAGT AAGCGCGCGT 15600
GCGGTTCAGA CAGCACCCCT ACCCCCACCG TTTCATCCTT CACCTGAGTA CACGCAACCA 15660
CCCCTACGCT TAGGGCAATG AGTTTCCCTA CGAGCGCAGC GCTcACCGTT TTTCCTTTCA 15720
TCTCATTCGG CCTCCCCCTG TCCCTCTATC CATCCTGTCA ATGTCCAAAG AAAAAGCGCA 15780
CATCAAGACT CACCCTGCAT TTCACACCGT TTCCGCACGT GGCGCATCGC ACGCCCCCTG 15840
CACCCGTACG TCCACATGCG GAAAGGGAAC CTCAATGCCC GCCTGTTTGA AGCATTCGTC 15900
GATATCCACG AAGATAGCAT TGCGCAAATC ATTGAAATGC TCAATGTGAG TCCAGGTCAG 15960
GAGCGTTACG TCAATACCCG AGTCAGCGAA GGCATTCCAC AAAACCGCCG GCGCAGGATC 16020
CGAAAGCACA AACCGGTTAC GTGTCGCAAC ATCTAGCAAG AGCTGTTGCA CCCGGCGCAG 16080
GTCACTTCCA TACGCCACGG AAACTTCCGT TTTCACTCGG CGATGAGGAC AGTGCGAATA 16140
GTTAACAAGG TTCGCTTTGA GGATCGTTTC GTTGGGCACG CGCACATACT GCCCATCGAG 16200
CGTTTTGAGC GCCACCGAAA GCAAATTAAT TGACTGCACT GCACCGACGA TACCGTCAAT 16260
TTCTATCACG TCTTGAATTC GAAAAGCACG TTCGGTCATG ACAAACAGCC CTGATATGAC 16320
GTTTGAAACC GACGTTTGCG CCGCAAATCC AAGCGCTACT CCCGCTATCC CCGCGGCCCC 16380
TAGCAGCGCG CTCACGTTGA TCCCCAACCA GTGAAAGGCG GTAAACGTCA TCACCGTGAA 16440
CGAGAGATAG TTTAGTGTTT TGAACACAAA ATGCTGCGTC TGCGCGGATA ACCGCCTTGC 16500
AACAACACGG CGCACACCGC GCCTCAGCAT TCGAAAGAAG GCAGAGGTGA TGCACAGCAC 16560
GGcAACGAAG CGCAGGAGAT ACCACACGCG CTCTGACGTC GCAACCTGCG TGAGGCCCGC 16620
GCCCAGcGCG CAAGCAGCCG TAGCAAGAGA CTGCACGAAA TGCTCCAATT CTCGCATATT 16680
CCCCGCTTGT GCACGCCTCG ACACCACGCC TGCATCCTAC GCAAGGGCGT ACTGCGCGCA 16740
AATATGCCCC CAATCTCCTG CTCCCCTTCC TAAAAACAGG AGAGCACTAC ACCCCTACTT 16800
ACCTGACCAC ACCCCGTGCA GGTTACAAAA CTCATAGGCT TCGAGCACCT GATCGTCTGC 16860
TGTCAGTGCA AAGGTCACCT CAGGCGCTCC ATCTACCGGA AGTTCCTTGA GCTGAATACC 16920
CTTCCGGGTC TTGAGGCACA CCCACGCAAT GTAATGCTCC GGCGTCATTG GGTGAGCCAC 16980
ACTCCCCACC TTCACCTTTA CCTCGTGTCC GTGCACTTCT ACCACGGGGA TATGCTTTTC 17040
CTTCGCTGCA TCCACTGTAC CTACAGGCAC TGCACGCAGC ACCTCACTGC CGCACGCGAC 17100
ACTACTGCCT GCAGGCGCAT CCATACCGAG AAAAAATCCC GCACTTTCCT TCTGCAAGAA 17160
AAACGACAAC TCCCGTCCCA TGGCAAGCAT CTCCTATGTT GGTCTGATTT TGTTCTCGTG 17220
CGGACGCCTG TCGTGCCTCC GCGTGCGGAA ACCGCCCGCG CCAGAGCACA GCCCCGCAGA 17280
GGCGCTGATT CTACCTAAAT CAGGTCCGTG GGTAAAGCGT CACCCACCTG TACGCATTCA 17340
GTAAACTCGC TCCTTCCGAG CCTGTGCTAC GCACGAGTAT ACCTTGACCA TTTGGGTAGT 17400
TTCCGCTACG CTCCTGCCAG CTTGGATTCC TGAGGAGTTT CTACGTGCCC TGCGGAAAGA 17460
AAAGAAAAAA GCAAAAGATA GCAACCCACA AGCGCAAGAA GAAGCTTCGC AAGAATCGGC 17520
ACAAGAACAG GTAGTCTCGG CCCGTGCGCC TTCTGTTCGA CATGAGGCGT TTTCGTCATG 17580
GATTTGTCGC TGcTTCGCTC CCTCACTGGG CCCCACGATC TTAAGAGTCT CTCCCCCGAG 17640
CAGGTGCGCG CGCTCGCgcA GGaGGTACGC CAGGAGATCT TGCGCGTTGT CAGTGCCAAT 17700
GGTGGTCATC TTGCCAGTAA CCTCGGTGTC GTCGAGCTCA CCATCGCACT CCACCGCGTC 17760
TTTTCCTGTC CCCAcGACGT TGTCGTTTGG GATGTCGGTC ATCAGTGCTA CGCGCACAAG 17820
CTCCTCACTG GACgCGCAGG GCGCTTCCAT ACCCTCCGCC AGAAGGATGG TATTTCGGGG 17880
TTCCCGCGGC GCGATGAAAG CCCGTACGAC GCTTTTGGTA CCGGTCACTC TTCCACGGCA 17940
CTTTCTGCCG CAAGTGGTAT CCTCAGCGCC CTACGATACC GGGGTAAATC AGGTAAGGTA 18000
GTCGCTGTCG TAGGAGACGG CGCACTCACC GCGGGCCTCG CcTTCGAGGC CcTCCTGAAT 18060
GTGGGCCGTT CCTGCAGTGA TCTCATCGTC ATCCTCAACG ACAACAAAAT GTCCATTAGC 18120
CCCAATACGG GGTCCTTTTC CCGCTACCTG AGTACCTCAC GGTAAAAGGT CCATACCAGA 18180
AGCTCCACAA ACTTCGCCGC GCGCTCCAGA CTGTCCCACT CGTCGGTCGC CCCGCCTGCC 18240
GCGCCCTCAG CCGCCTGAAA CGAAGTGCAA GAACGCTTTT GTACCAGTCA AATATTTTCG 18300
CAGACTTTGG ATTCGAGTAC GTCGGTCCCT TAAATGGACA CCATATCGAA GATCTTGAGC 18360
GCGTACTCAA CGACGCTAAA AAACTCACCC GTCCCACTCT CCTCCACGTG CAGACTGTAA 18420
AGGGAAAAGG CTACCCCTTT GCGGAGCAGa ATCCTACCGA TTTCCACGGC GTAGGACCGT 18480
TTAACCTTGC AGAAGGAATA GTAGAAAAAA AGGATGCGCT CACCTTTACC GAAGCCTTCT 18540
CCCATACCCT CCTAAATGCA GCGCGTACTG ATGACCGTGT TGTCGCTATC ACCGCTGCTA 18600
TGACTGGCGG CACCGGGCTT GGATTGTTTT CCCATATATA CCCTGAACGC TTCTTCGATG 18660
TTGGCATTGC TGAGCAACAT GCGGTCACGT TCGCCGCAGG cTTGCATGCG CCGGCGTAAA 18720
ACCTGTCGTT GCCGTCTACA GTACGTTTTT GCAGCGCGCC GTTGATCAGG TTATTCACGA 18780
TGTTGCTGTG CAGAATCTGC CGGTCATTTT TGCGCTTGAC CGCGCAGGTG CCGTACCCCA 18840
CGATGGGGAA ACACACCAGG GCCTGTTTGA TCTCAGCATT CTTCGCGCTG TTCCGAACAT 18900
AAACATCCTG TGCCCtGCGT CGGCGCACGA GCTTTCGTTG CTCTTTGGCT GGGCGCTTGC 18960
ACAGGACACC CCCGTAGCTA TCCGCTATCC TAAGGCGTTA TGTCCACCTG AAGAAGACGG 19020
ATTCAGTACA CCTGTACATA CCgGACGCGG CGTCCTTATC ACCCGAGAGA ATGAGTGCAA 19080
TGTACTGTTA GTGTGCACAG GGGGCGTTTT TCCCGAGGTA ACCGCTGCGG CCAACACTCT 19140
TGCGCGAAAG GGCATATTTG CAGATATCTA CAACGTGCGC TTCGTAAAGC CGGTAGACGA 19200
AGATTACTTT TTAGATCTTG TAGGTCGCTA CCGTTCCGTT CTTTTTGTCG AAGACGGCGT 19260
AAAAATCGGA GGAATTGCAG AAGCGCTCCA GGCACTCTTG AACACCAGGC ACCCGGCTCC 19320
GTGCAGCGAC GTGCTTGCTT TTCAGGACAT GTTCTACCCG CATGGTTCGC GCGCGCAGGT 19380
ACTCGCCGCA GCAGGCCTTT CTGCACCGCA TATTGCCGCA CGCGCAGAAT GGCTGTTAGC 19440
CCATTCAGTT GGGCAGATTC GGTGAACAGT ATGCATCTGC ACGCCGTTCG TTACAtCCGy 19500
sTGCGctTGC AGCGCATGgG CAGATGGCCG CCATACAGCG GAAGGAACGG GGAGGCCCCG 19560
CCTGCTCACG CCAGGCGCCG GGGGACCGCA TCCGTTTCAA TCGGCGCACA CGCTGCCTGG 19620
AGTCAGAACA TCGTGCTATT TCTTAGAAGT ATGGTCCTGT GGTACGcAGC GTACGTTCGT 19680
CCGCTTTTGG ATGTCGCGCT CCTTTCCTTC CTCCTGTACA AGACATACGA GATACTTGTT 19740
AAAACACAGG CAGTCCAGTT GGTGAAAGGC GCCTTCTCCA TTCTCGTACT CTACGCTTTG 19800
GTTTTCGTAT TAAAATTAGA AACGCTCCTT TGGATTCTCA ATGCAACTGC CCCGGGCGTG 19860
GCTATCAGCA TTACTATTGT GTTTCAGCCG GAATTAAGAA AAATTTTTTT GAAAATTGGA 19920
GAGAAGAACT GGCTCCGACA GCGCGAATGC GCmACCAT C GCACATCGAC GCGGTATTAA 19980
CTGCCGCAGA TGTTCTTTCT AAAAGGAAGC GCGGCATGTT GGTAGTATTT GCCCGTCACC 20040
ATACCGTGCG CGAGGTCAGT GAAAcGGGTA CCGCGCTGTA CGCGCGCCTT TCATCCAGCC 20100
TGCTTGTGAC TATTTTTGGC CACGATACCC CCATGcACGA TGGAGCAGTC ATtGTGCGCG 20160
ATGGGCTCGT TGTCTCTGCA GGCTCCTTTT tGCCGCTTTC TGAACAGCAC GATATTAGGA 20220
AAACGTTCGG CACACGTCAT CGTGCCGCGC TTGGTATGGC TGAAAAAACA GATGCCATTA 20280
CCCTGGTCGT GTCAGAAGAA ACGGGCGCGC TCAGCCTTGC CTACGATTCA AAGCTGTACT 20340
ACGATCTTCC GCACGCGGAC GTATTGGCGC AnTCAAACAG TTACTCGAAA CTACCACTCG 20400
GGCTGGACAC GCTCAAGGGA CACTGGATCA TGGTCGCAGC ACGTTGTCTT GATAGGATTG 20460
CGCACAATTG GGCTGCCAAG GCATCGAGCA TACTGCTTGC GTTTTTGCTC GTGCAATTTT 20520
ACAGCGGCAG TCTGCTGGAA CGGCGCGCCA TTTCTGTTCC GTTAGTTGTG AGAAATGAAG 20580
GCGCACTAAC TCCTGCGCTT CGCTTTCCTC AAAAGGTGAC GGTGCTGATG CGCGCTTCAC 20640
GTGATACGCT CGGCGCACTG CGCGGATCTG ACATTGTCCC CTATGTGGAT TTGTCCTCCT 20700
ACACAGAGGA GGGAGAGTAT GCAGTTCCTG TGCGGGTGAC TGTAGCTGAC CATGTTGCGC 20760
CACCAGATGC GCTTGAACTT GTCGCAGATC CTGCCATCAT CCCGTTCAAG CTGGAGCGTA 20820
GTGTcACCAA AAATATCCCC ATAACCCTAT CGCTTGAGGG TGTCCCTGCG TACGGCTATG 20880
AGCTGCGGGA AGTCGACACA AATCCTTCCA TGGTGGAAAT TCGCGGTCCG GCCTCTTTGC 20940
TCGTTTCCTA CACACAAGCa GTTACCGAAA CGCTCGACAT AACCAATAGA CGCGCGTCCT 21000
TCTCAGGTGT CATTGGACTT ATCAATCCGA GTACGCTCGT TTCTTTTCCA AAAACTAAAA 21060
CGCAGTTCGT TGTCAGGGTT CGGGAAGTTT CTGAGCTCAA AGAGCTTGAG ACAACACACG 21120
TCTCGTTCAC CGACTGTGCC CCTCACCTTA CGTTCAGCAT CGAACCGGTC ACCATACGTG 21180
CACAGGTGCA GGTGCCAAAG CATGTAATTG AAGAGATGCA CCCAGAGGAG TTCTTCTCTG 21240
TTTCTGCAAG AGAAATTACT GAACCCGGAC GCGTGACCGC TCCCCTTATC CTCTCGCTGC 21300
CCGAACACGT GCGTATGGTA CAGTACAGTC CCAAAGAGGT TCACGTTCAT GTGcGCGAAG 21360
cGcakTCAGT CCCGGCGGAC GGACATGAAT GATCATTGGC GTGGGAATAG ACATAGTAGA 21420
AATAGAACGA TTCGTATCTT GGACACACAA CGTGCGCCTG CTCCGTCGCT TCTTTCATCA 21480
AGAGGAGATT GTAGACTTTT TTAAAAACCA CATGCGAGCG CAGTTTCTTG CCACGCGCTT 21540
TGCCGCAAAG GAAGCATTTG GAAAGGCACT CGGTACGGGA CTCAGAAACA TGGAGCTAAG 21600
GAATATTCGG GTGTGTCAAA ATGGATGGGG TAAGCCGAGA CTAGAAGTCT ACGGTGCTGC 21660
ACAGGCTATG TTGGCTGCAA CAGGAGGCAC GCATATACAG GTGTCGCTAA CGCATGAGAG 21720
AGAAGTCGCC TCAGCCATCG TGATTATCGA GGGAGAACCG CTATGACCCG GTCATCTACA 21780
AAGAAAACAG ACAAAAAAGA AAGCACTGTG TCTTTCTATT CAAAAGAGCG CATCGAGTGT 21840
CCGGTGTGCA CAACCGTCTT CCAAAGAGAA GAAATGCATT CTGGAGGAGG TCGTACCATT 21900
GCTGGTGATT TAACCGATGA ACTAAGAAGG ACATACGAGA CGTCCGCAAA GTATGGAGAG 21960
GTATTTCCTC CCATTTACCA CGTGGTAGTT TGTCCCACCT GTCTTTACGC AACCTTTCTG 22020
CAAGACTTTA GAAATATCGA GCGTGGGATT GTCACTAAAC TTTCTTCCAC CACATCACAG 22080
CGCCGCACAT CAGTTGAGCG GCTCATTCCT CAGGTGGATT TTAGCGCACT GCGCACACTC 22140
TCCTCTGGGG CGGCGGCTTA CTACTTGGCA ATACTGTGCT ATGACTTTTT TGATAAAAAG 22200
TATTCTCCTA CCATTAAACA GGGGATCTGC GCGCTCAGGG CAGCATGGCT TTTTTCTGAT 22260
CTTGAAAAAA AAGATCCGAA CGAGCATTAC GATTACATCC GCAATCTTCT ATACCAAAAG 22320
GCACTTTTTT TCTATCGCAA GGCAATTGAG TGCGAAAGCc AGgCGAAGAA ATTATCGCAG 22380
GATTAAAATC CTTTGGACCG GACACGGATA AAAATTATGG GTACGACGGG GTACTCTATC 22440
TTTCGTATCT CCTTGAGTAT AAATACGGGA CCAAGCGCGA CAGAGCAGTC AGAAGGGAGC 22500
GCATGCAGCG GaACAAACAA GGACTTGCAA AGATATTTGG CCTAGGAAAG TCTTCAAAAG 22560
AGAAGCCAGG TCCATTGCTG GAACTCGCCC GACAATTGTA CGAAAACCTG CTCGCAGAAT 22620
TACACGAAGA CAGTGAAACT ACATGAATGA TGTGCGCAAA ATTCTCTTGC GTATTTCGTA 22680
CGATGGAACA CGATTTTGCG GATGGCAAAA ACAGGTCTCA GGCTCACGGG AACGTGCTCC 22740
CTCTGTCCAA GGTGAGTTGG AAAAAGTTGC TGAGAAAATT CACCACCAAA AGATAGCAGT 22800
CATCGGTTCA GGGAGAACAG ACTCTGGCGT ACACGCAGTA GGACAGGCAG CACATTTTTG 22860
TACCCCCATG AGAAATATAC TCGCGTATCG CTTTATCCCT GCATTTAATT CGCTGCTCCC 22920
GCACTCCATT CGCATTACAG ACGCACGCGA AGTCTCCTCT CAACTCCACG CACGCTTCTC 22980
TGCCGTCATG CGCACGTACC GTTACCACCT CCACTGCGCA CCCGTCGCAT ACGCGCACGA 23040
ACTGCCTTAC TGCTGGCACA TTGCGAGAAT GCCCGATATA CACTTGCTCA ATCAATATGC 23100
TGCAACACTC AAGGGAGAAC TAGACTGCAC AAGCTTTGCT GCTGCAGGAG ATAAAAGTGC 23160
GAGTAAATCG CGTTATTTTT ACGACACACA CTTTTCTTTC AACCATCGCG TACTGACCTT 23220
CGAAATCTCT GCTAATGCCT TTCTCTGGAA AATGGTGCGC TCTCTTACAG GAACCCTACT 23280
ACACTGCGAA AAGAAGCGGT GCTCCGTGCG CGAATTCGTC CGCATTTTGC ACGCGAAAGA 23340
CAGGCGCtTG CAGGGCCCAC CGCACCGCCG CATGGGCTAT TCCTATGGAA CATCCGTTAC 23400
CCCGAACACT TACTCCGTGC AGAATAGGAA CACCCTCGCA CGTGAACTGG CATCCACAGG 23460
CAATGCAAGG TGGAAGACGT ATTAAGCATG CACGTTACAT CTCTTCAAGA AAAGGAATCA 23520
GCACCAGaCG CATAGCTGTT CTCAGCACTA TGCGCACCGC ACGCACAAGT TCAAGCCTTG 23580
CACACGCGTA GTCCgGTCGT GCTTCACACA GAATGGGACA ATCATGATAG AAGCGACTGA 23640
AGCTTTTTGA GAGTGTATAG AGATACCCGG TAATAACGCT CGGATCATGT CCCTGTGCAG 23700
CGCGCGTGAC ACACGCAGGG AAACGTGCAA GCGCCTTCAC CAACTCCCAC TCAGCTTCGT 23760
GCGTGAGCAA TGCAGGGTCA CACCGGACTT CACGAGGTCC CTTTTGCTCC ACATCTTCCT 23820
GAACCTTCTT TAAAAGAGAA GAGATGCGAG CACCCATATA CTGTAAATAG GGACCAGTGT 23880
TTCCGTTAAA AGACAAAGAC TCTTCGGGGT GAAACACCAT ATCCTTTTGA GGACTGACTT 23940
GCAATAAAAA ATAATGAAGC GCGGCGATGG CAACATTCTC TGCAATACAC TGTGCGTGTT 24000
TCAGTGCATT TTCCCGTCCC TTTTTTGCAA TTTCCTCTTC TGCCGCACTG TGCAGACGAT 24060
CCAAGATATC GTCTGCATCT ACTACCGTCC CCTCTCGACT CTTCATACGC CCATGGGGCA 24120
AGTTGACCAT GCCATAAGAG ACGTGATGCA ACTGcTGCGC CCACGGATAA CCGAGCAACC 24180
TAAGCACAAA GAACAATACC TTAAAGTGGT AGTTCTGCTC GTTTCCCACA ACATACAGCA 24240
ATTGATCAAA GGGCCAGTCC TGTGCGCGAA AAATCGCCGT GCCAATATCC TGCGTAATGT 24300
ACATAGTGGT GCCGTCAGAG CGAAsnAACG CCTTTTTGTC TAAGCCTAGA GAAGACAAAT 24360
CCACCCAAAT AGAGTTGTCC TCCATCTGAT AAAAAACGCC GCAGcAnAAC CACGTCTAAC 24420
CTCTTCACGT CCCTTGGTAT AAGTTTCGCT TTCAAAATAA AGTTTATCAA AAGATATGCC 24480
CGTTCGCTCA TATGTTTGTT TGATACCGCG CAACGCCCAT TCGTTCATTG TTCTCCACAG 24540
CGCACGCACG TGGGATTCTG CACTTTCCCA GCGCTGTAAC AGGTCACGCA CATCGTGcTC 24600
TGCTTCTTCC GGGTACTGCT GTGCGTAACG GTTAAACTGC ACGTACCAAT CTCCCACAAA 24660
GCGATCGGAC TTGATGCCGG TATGCGCAGG TGTTTTTCCA TGGGCGAATT TTTGATACGC 24720
GCACATAGAT TTACAGATAT GTACTCCGCG ATCATTGATG ATATTTACCT tGAACACATC 24780
CGCACCACAG AACGCAATAA TACGCGAAAG GCTTTCCCCA ATCGCGTTAT TGCGCAGATG 24840
ACCTACATGC AACGGCTTGT TAGTATTGGG ACTAGAGAAC TCAACCATGA TACGTTTGCC 24900
CTGTAAGTAc TGCGTGTGGC CATAGCGCTC CCCcTGCGCA AAGATAGCAT CAAGCGTATG 24960
CGCAGtACAC ACTCCTTATT TAAAAAGACA TTAAGATAGG GTCCTCGCGC CTGCGGGTGc 25020
CATACGCACA CATGGACGTG TCTTCTTCAA GCAGTGTGCA CAACTGCTGT GCAAGCTGTG 25080
CAGGACTCCT GCGCACACGC TTTGCAAAAA GGAATAGAGG aAAAGCTATG TCCCCCATAC 25140
CCGGCTCCGG CGGCTCTTCC ATAACTAACT GCGCACCTTC GACCGGn 25187
(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21170 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
TGCATGAAAA TACAACCAGC TTGCTGCATT AAGAACGCAT CAGTACCGTA GCGCAGATAC 60
GCACATGGCT TACCACGCCG CGTAGACGAA TGGTCCTCAA TATCATCGTG CAAAAGGCTG 120
GCTGTGTGAA TGAGCTCTAC TACCGCGCTC AGTGTGTACC ATTCGCGTTC GGAAATTTTC 180
TTCCTCTTGT GTCCTTGGAG TTTCCGATGC GTGCACTGCA CACACGCGTG TGCAAGTTCT 240
GCAGAAAGTA TCAgTAACAA CGGTCTCCAG CGCTTTCCGC CGCAGCTGAc ACACGCGTTG 300
CACACGGTGC GCAGGACGCG CGTATCTTCG GACGcAAGGA CACGGAGCCG CGTTCCCATC 360
CACGAGCGCG TAGtTGCGCa GGAAGTGCAT CGGCGAGTGC TCGTTCAATA TTTCTGAGTC 420
GTTGAGCAAG TGCaCGTTTC ATAGTACCGT TATACGGCGT GTTTGCCTTT TGTGCGAGCA 480
CGATGTGTGG GAGCTGTCAG TGTAACACAT AGCATGGGGG GCGTGTAGGA AGGTGTGAAC 540
GTCTATAAAC GTGCTGATTT TTGGGCAAAA AAGGCAGCCG CCgCGGGGTA CCGCGCGCGT 600
TCGGTGTATA AATTAGCGGC GCTTGATAAA AAATACTCCC TCCTGTCGCG CGCCTCGCGG 660
GTGCTCGATT TGGGCGCCGC GCCAGGAAGC TGGACCCAGT ACGTGCTGGG CACCGCTGCT 720
GCGTGCACCG CCGTGTGTGC GGTGGATGTG CAGCCGATTG CGTCGGACAT TCAGGACGCG 780
CGTTTGCAAC GAGTGCAGGG GGATTTGTGC GCAGCAGATA cACGTGCGCG TGTTGCGTGC 840
AACGCTCCTT TCGATCTTAT TCTTTCTGAT GCCGCACCTC GTACGACCGG AAACCGCACA 900
GTAGATACGA GCGCTTCTGC GTGTCTTGCA GCAGGGGTGT GTGCGTACGT CAACTTCTTG 960
TCCTCTGATG GAGGATTGGT GTTCAAGGTG TTCCAAGGGT CAGAGCACCT TGCTATCCTT 1020
ACACACCTGC GTGCGCATTT CGGTGCGGTG TGTAGTTTTA AACCGCCTGC TTCTCGTCCC 1080
CGTAGTTGTG AGTTATATGT GGTGGCGCGT TTCTTTCGCG GTACGTGCGG CAAGTAATGG 1140
GTAAGAATAG GGAGCGTAGG CGCCTGTGGC ATGCAATCAG GCATTGATCC ATTTAGCAAA 1200
CTTGCGTCAT AACCTCGGTG AAATTATGAG CCGTAcACGC GCGCGTGTGT GTCTACCTGT 1260
TAAGGCGGAC GCGTATGGAC ACGGTGCGTG TGACGTCGCA CAGGCGGCGC TTTCGTGCGG 1320
GGTGCACTCG TTCGCCGTTG CATGTGTGCA AGAAGCGTCG CAACTGCGTG CGGCAGGTGT 1380
TCGCGCGCCG ATTTTGTGTT TAAGTACTCC AACTGCTGAA GAGATTTCTA GTCTTATTGA 1440
GCATCGTGTG CACACCGTGA TTTCTGAGCG CGCGCATATT GCCCTTATCG CACGCGCGCT 1500
CCGTCAGTCT GCTGATACGG GTGCCACGTG TGGGGTACAC GTAAAGATTG ATACCGGAAT 1560
GGGGAGAATC GGCTGTGCGC CGGATGAGGC CTGTGCGCTC GTGCAGATGG TGTGCGCAAC 1620
ACCGGGTCTC CATCTTGAGG GGGTATGTAC GCATTTTTCT GTCGCGGATT CTGTGCGTGC 1680
TGAGGACCTG CAGTACACTG AGATGCAACG TGCACATTTT ATGCATTGCG TACAGTACAT 1740
ACGGAAAAGT GGCATATCGA TTCCATTGGT GCATGCGGCA AACTCTGCAG CGCTGTTGTG 1800
CCATCCGCGG GCACACTTCG ACATGGTGCG TCCGGGATTG TTGGCATACG GCTATGCCCC 1860
TGAGTCTGTG CATCCTGCTG TGCGCAgTGT GTTCCTTCCC GTCATGGAGC TTGTTACCCA 1920
AGTCCGTGCA ATCAAAAAAA TACCTGCAGG CGCGTACGTT TCTTACCAGC GCTTGTGGCG 1980
TGCGCATACA GAAACACATG TAGGTATTCT GCCTATCGGA TATGCAGACG GAGTTATGCG 2040
CGCGCTGTCG CCGGGTTTGC AGGTGTGCAT TGGGGGGAAG TGGTATCCGG TGGTGGGGGC 2100
AATTTGCATG GACCAGTGTG TAGTGGACCT AGGTACCCCG CTGCGTGTGA CAGTTGGAGA 2160
TAGGGTGACA CTTTTCGGTC CTCAGGACGC AGGTGGCCCA GGACAGGGGG CAGATGTGCT 2220
CGCCTCGCAT GCAGGCACCA TTCCCTATGA GCTTTTGTGC GCGATTGGTA AGCGTGTCGA 2280
ACGGGTGTAC ATCCGGTGAA TATGTTTTTG CAACGTTTAT AAAAAGAGAC AAAGGGAGGA 2340
AGGCGCGCAt GAATGTCCTC GGAATTGAGA CCTCTtGTGA TGAGACTGCA GTTGCAATTG 2400
TAAAAGATGG CACGCACGTG TGCAGCAATG TTGTGGcTAC GCAAATTCCT TTTCATGCGC 2460
CGTATCGTGG CATTGTCCCA GAACTTGCAA GTCGCAAGCA CATTGAGTGG ATTTTGCCAA 2520
CGGTGAAAGA GGCGCTTGCA CGCGCTCAGc tGACGCTTGC TGATATCGAT GGCATCGCCG 2580
TAACACATGC ACCTGGGCTG ACCGGATCTC TCCTGGTAGG CCTGACGTTT GCGAAAACAC 2640
TCGCATGGTC AATGCACCTT CCTTTCATTG CGGTTAATCA CCTTCATGCA CACTTTTGTG 2700
CCGCGCACGT GGAGCACGAT CTGGCATATC CCTACGTGGG CTTGCTGGCG TCTGGAGGAC 2760
ATGCGCTCGT ATGTGTTGTG CACGATTTTG ATCAGGTAGA AGCGCTTGGC GCAACGATCG 2820
ACGACGCTCC CGGGGAAGCC TTTGATAAGG TTGCAGCCTT TTATGGCTTT GGATATCCGG 2880
GAGGCAAGGT AATTGAAACG TTAGCAGAAC AGGGnTGnGC gCGTGCCGCG CGTTTTCCGC 2940
TTCCTCATTT TCACGGAAAA GGGCATCGGT ATGATGTATC ATATTCAGGA TTGAAGACAG 3000
CAGTTATTCA TCAGCTCGAT CACTTTTGGA ACAAGGAATA CGAgCGCAcT GCGCAGAACA 3060
TTGCTGCGGC GTTTCAAGCG TGTGCAATCA ACATCTTGCT CCGTtCCcTT GCGCGCGCAT 3120
TACAGGATAC AGGGCTGCCA ACGGCAGTAG TGTGCGGAGG TGTTGCAGCA AACAGTTTGC 3180
TCAGAAAATC TGTAGCGGAC TGGAAGCATG CGCGGTGTGT GTTCCCTTCG CGTGAGTACT 3240
GTACAGACAA CGCGGTGATG GTTGCTGCGC TCGGGTACCG CTATTTGATC CGTGGTGATA 3300
GGAGTTTCTA TGGGGTAACA GAGCGTTCGC GCATTGCGCA CTTCAGTAAG CGCGGGGGAG 3360
ATCGTCTCGC TGCACAGAGA AGCGCTGCTT CTCAGCCTCT TTTTTGAGCA TGTGCGGCTC 3420
AGTCCTTGCT AGGCAGTGTC CCGTTACCTA GATGCTGTGC CGTTTGATGG TAAAAATGAG 3480
CGACGCGATG AAGCACGCCA ATGGCAGCAG TTCCAACGTG AAGCCCACTA GTGACACGCC 3540
TGGTACAGtG wACcGCGTGA TAGAAAAGCC TGCCGCACGC AGAAATGCAA ACAGTTGTGC 3600
AAGGTACATA CACAGTACGC CCGCGATAAA GCACACCGTC TCCTGGGTAC GGGTCTTTGT 3660
CCACGCGACA ATGGCGAGGA ACACAGCAAC CGCCGTTACC GCTAGGCGTA ACATCAGCAA 3720
AATAGTCTGT GCTCGAGAGA GCAGAGAGAG AAATTCATTC ATTCGTGGTG TTCCTTTTCC 3780
TGTTCTTGAA GAAAAAAAGT GCATAGCTGG GTATAGTGCT CGAGCGTAGG GAGTACTGAG 3840
GGTTCAGTGT ACAGGGGGGA GAGCAGGATA TGCGCGTCAA GCgcACGTGC GAACAaTGCC 3900
TCATACCGTT CTTCAGAAAC AGTGGGAAAA AGATAAGGCC cTTCCGTGCG CCACCATGTA 3960
CTCATAAGCG TTATAAGCTG TGCGCGGTCG CgCTCTTGCC GCGCGCATTT TTGTGCGCGg 4020
CGCACGCGGT TCAGACCTCT GCGCCGTTTT TTCGTGTGTG GCAGCGCAGG GACCGCGTCT 4080
TCCTGCGTGG TGTGTGAGGA GTCTAACAGA GCGTGAGGCA TGGCGCGTGC GGGCTGAGAA 4140
AGGGAACCCA CGCGCGCTAA GTCCCAAAAG GCACGAGCAA TCGCCTGCGC AATCGGAGCA 4200
AAGAGCACGT cCCCCGCAGG GGCGCGCGCG GCACAGTGGG CACGGAGCGC AACAAACGGT 4260
GCAGGAGAAG GAAACGGCGG CACAACCAAA AAGACATCGG TTTGCGGGAC GCACGGTGCG 4320
CAGGGGCGCC ACACTGGCAC CTCGCAGGGT GCGGCGCCGC AAGCAAGGTT GAGAAAACGC 4380
GCACACACTT CCTGCGCGCG TTCGGCATGT GCGTAGTTAG CGCACCTCGT GCGCAGGAAA 4440
ACACGTGCGT AGTGCCTGCT GCaGTGCGCA GAAGCACAGG TGGGTAGGGG ACCCCACAGC 4500
CCGCGGCTAA GATAACGCTT AAAGTCCAGG CGCGCGCGGC TGGCGTTCCG TCCAAGAATA 4560
GAAGCGCCGC CGTCAAGGAA TAAATCCGTG ATGCGAACCC CACGCTCAGT GTATAGGTAA 4620
AAGCCGCGCG CACGCCGCAC GCGTCCAAAC CTGCGAATAA TCTCCGCCTG ATAAGCTTCC 4680
ATGCTGTTCT CTTAGGGGAA CACTCCCCCG CTGTGCACCC TGCGCATACc CcCCCTGCGG 4740
GCAACGCCTC GTCAAACCTT TCTATCCCCG AAATGGACGA TACCGGACTT GAACCGATGA 4800
CTTCTACCGT GTGAAGGTAA CACTCTACCA CTGAGTTAAT CGTCCTGCGC GCAGCATAAC 4860
AGCGAGGGTG TTTTTGTGCA ATTGCTTATC TCACTCTGCA CTGTTGTCGT AGGATCCTGT 4920
AAGAAGCGTG CTCGATGCAG TATAGTGGCC CTATGCAAGA AGAGGTTAGT GAGCGTACCT 4980
GTCTGGTGAG TGATCCGTCT TTGTCCTCTG TCGCTGGTGC AGGAAGCGGG GTGGTGCAGG 5040
CGTACGTCGC GCGGCAsTTG CGCGCCAGGT GCACGCCTGC GTTGACTGTG GGATGCGTGG 5100
GATACGCCGG GCGGTACGCA GGTTTGCCGC TGATGTTATT GCGCGCGAAG CTCCTGAGCT 5160
TACTGCCGCC AAACGAGAGG CGCTGTTGGA TGCGTGGGTT CCCTCGCCTT CCTCTGAGCA 5220
CGTCGCTGTG GGTTCTCAGC CTGCGCAAGC GTGTGGTGGC GCGCCCCTTC CTGCAGACGT 5280
GCAGTACAGC ATGGTACTTC ACTTTGTGTG GTACGGTCTT GGTGTGCTTT CTGAACAGGA 5340
GAGGGTGCAG CTTGAACAGG CGGTGCCTCA TTGGCCCCAG GTGTATTGGA GTTGCTTTTC 5400
ACCACAGCTT AAGCGTCTGA TTAAGGCTTG TTTGACCGGG CTGTTAGATG TGGAAGCGTT 5460
CTGTGCTGCA GTGAGAACGC TGCTGGGGAT AGCGGGCATG GACACTGCGG CGGACGCGTC 5520
GATTCACCCT CACGAAAAGT AGAGGCACGA AGGGGAGGTC GAGATGAGGT GTTCGCACAA 5580
TTGGGACGAC CCACCGCCGC TTTTTGGCGC GGTGTCTTAC GGGATGCAGG AAGGGGCCGG 5640
GAGGGGTGTG CGGCGAGAGG CTCGCGACAC TCCCTGCAGA GGGACTGCAG AGGGACTGGC 5700
CACTTCACAG CCTGAAGATG GAGAAACGCG CGCTGCgcTG CAgsGGATTG ATCACTTGGA 5760
CACGCAGCTC CTGCAGCTGG AGCGGGACCT TGCCCATTAC CTAGAGATGG CCGAATTGCC 5820
TGATCCCTTC TCAGAAAACT AACGCCCCAC CTCCTACTGG AGGAGGCGTC TCTTTCTCAT 5880
GATATCAAAG ACGCTCTCTG GACCGCAAAA GTGCCCGGCG CTCGCAAACA CAAACGTTAT 5940
GCCGTGCTGT AAACTCAGCC GGTAGAATTC CTCTGCCCaC GCAGCGCGCC GCGATGCAGC 6000
AATCTCTGCC aCGTACCGGC GGTGGAGTCC ACcTGCaGCG TCCTTCGTTA CCAGTGCGTC 6060
CAGTTCCGTG CTTACTCTTC CCAAAGCGGT TTTGTCGTTC GAGAGGTAAC TGCGCACGAG 6120
GGCACCGAGC CTGCCCTTAA AGTCCGCGGG ACTTTTCCCG AGAGCGATCA GTGCACGAAG 6180
CAGCGTTATC TGCTCCTCAC GATTGCCAAA GGAAAGCATG TTCAGGTGTT TTTGGATGCT 6240
GTCTAGTCCA AGAATTTTGC GGTTCCCCGC GCGCTGGTAC AGGAACGCTT CGATGTTCTT 6300
CCCGGAGTCC AGCTTTGTAT GCGCGATGAG TGCCTGGTAG AGTGCTACGC GCATGACCCA 6360
CGGTTCAAAT CTTGAAAGCG TGTGCATGTC ATCTCCCAGG GTGCTCCGGA GCATTTCTAA 6420
CTCCTCCCGA GAAAGGGAAG AGAGCGTTGG AGCAGCGTTT TCCTGCTCAA GCATCCCGTG 6480
aAGCATACGC CTCTGGAGAA CGCTGGCAAA ATTTTTAATG TCCTCTGAAC CGAGCTCAGC 6540
GTAGAGACGA CTTGCAGAGT CAAATACATC AAGGATTTTG TCCTGAAAGT GCAGCAGCTT 6600
TTCGCTGCCA ACCGAAATAG TGCCCAAAAT ATATACAGAC CCTTGAGGAC CACGTATTTC 6660
CCAGAACATA CGTTCCTTGT GGGAGATAAG CGACGGCAAG GCACCGCGAG AAAGGCTCGT 6720
GCAGCACGAA AGGAAAGGGA GAATAAGGAG CACACACAAG AACACGATCG CACAGCGTGT 6780
GGCACACAGG GAGCAGCGCT TCAAAACGGT CCTCCTGAGC AGTGGAAATA CAGGACGCCC 6840
GGTGGTATTC ATCGGGCCTA ATGCAGAGGA ACGCTCCTTT TCAGAAGGAC CCACGTGGTG 6900
CCCTTACCCC CGCGCCGTTC TGCAGGgTGA AAGAGTTCAC CCGCGTGAGG ATGGGCCTGC 6960
ACGTAGCGCT TGACCGAGGG AGCAAGGACA CTCCCACCCT TGGAATGGTG GCCCTTTCCG 7020
TGGACGATTT CAACCTTCTG GAGGAGCCgC TCACGCGCCT GCGCAAAAAA CGAATCAAGT 7080
GCACTGCGCG CCTCACTGCA CGTCATGCCA TGGAGGTCTA AACGCGCCTC AGGGACTGCG 7140
GTGCGCAgCT TCCTCsTTCC CCGTcGGGAA TGGATGGAGA AGGTACGCCT CTGcCGCGCA 7200
TACTCCGCAC AAGskCCTGC AGCGTCCTTG TCGAAAAGTC CGTaGCGcGC AAGCGCmACT 7260
TCCATCAGGG AAACACGCGG CGCAgCGGCA GCCTGCGAGG ACGGAAGcgC GCCACGGCGC 7320
CGTGCGCGCA CAGTCGCAyT tCGCACGkcc GCCCTTCGGG cTCCTTCCCA CGTACGCAAC 7380
GTCCGGGCAA ACGcGCTCTG gCCcTcAGAG CcTCCTCAAG AGGcAGAaTA TCcTTACGtC 7440
TTTtcCCCGC naTGAGCGGg CGcATTCTTT CATAAAAAAC CTGTCTGTGC AATCAACCGC 7500
GTGAAGcgGC ATCCTGCGTG GTAGGAAGGG GAAGACAGGG AGGCGGTCAC GGTGCATGAG 7560
GAGTGTAATT TTCAGGGCCT CACAGGGATG CTTGCGCCCC gGAGGGgaTG TGCGATAATC 7620
GGCCCCCGGG GAGGGCGAGC GGTGGAGGTG AGAGTTCGGT ACGCaCCgTC TCCGACGGGg 7680
CTCCAGCACA TCGGGGGTAT TAGAACTGCT CTCTTCAACT TCTTGTTCGC GCGAGcgCAt 7740
GCAGGCGTAT TTGTCCTCCG TGTCGAGGAT ACTGAcCGCA GTCGCTGCAC TGCAgyGTTt 7800
GAGCAGAACC TTTACGATAC GCTCCGTTGG CTTGGGGTCT CCTGGGATGA GGGGGGAGGG 7860
TGCCCAGAAA CAGCGGTGAA GCAGGGCGCG CGGGGGGATG GCCGCTCTGT TGCTCACGCT 7920
GGTGGGgCCT ATGGCCcTTA CACGCAgTCT GCACGGACAG ATCTCTACCG CGCGCAGGTG 7980
GCGCGGCTCG TTGAGACAGG GCAGGCGTAT TATTGTTTTT GCGATGCGTC GCGGCTCGAG 8040
CGCGTTCGTA AGATCCGTAC GCTCAACAGG ATGCCCCCCG GTTATGACCG GCATTGCCgC 8100
GAGCTCCTGC CTGAAGAAGT TCGGGAATGT CTCGCATCCG GGGTTCCACA TGTGATCCGC 8160
TTTAAGGTCC CCTTGGAAGG GAGTACTCAT TTcCGCGATG CGCTGCTCGG TGATATCGAG 8220
TGGCAAAATG AGGAGATCAA TCCAGACCCG ATTTTACTGA AAAGCGACGG GTTCCCCACT 8280
TACCATTTGG CTAATGTGGT AGATGACCAT GCTATGCGTA TTACGCATGT TTTGCGCGCT 8340
CAGGAGTGGG TTCCCTCCAC CCCGTTACAC CTTCTGTTGT ACCGTGCTTT TGGCTGGCAG 8400
CCCCCGCTCT TCTGTCATCT TCCGATGGTT ATGGGGGCAG ATGGGCACAA GTTGTCAAAG 8460
CGGCATGGAG CTACTAGCTG TGATGAGTTC CGCAACGCGG GgTATTTGCC TGAAGCGTTG 8520
CTCAACTATG TTGCAATGCT CGGTTGCTCG TACGGAGAAG GTCAGGATCT GTTCACGCGA 8580
GAGCAGCTGT GTGCGCACTT TTCTCTGTCG CGTTTAAATA AGTCACCGGC TGTTTTTGAC 8640
TATAAAAAGC TTGCGTGGTT TAACGGTCAA TATATCCGTG CAAAAAGTGA CGAGCAGCTG 8700
TGTGCGCTCG TGTGGCCTTT CATTGCAAAC GCCGGTGTGT GTGGCCACAT TCCGGCAGAT 8760
GTGGAAGCAG GAGCTGTGCG CACACGACGT TTTGCAGACG AGGCGCCGTG TGCGCCTACA 8820
GAAGCGCAGs GTtCCATGCT CATGCGAGTT ATCCCGCTGA TTAAGGAGCG GTTGCGGTTT 8880
CTAACCGATG CGCCGGAGTT GGTGCGTTGT TTTTTTCAAG AACCGTCTCT CCCTGAACAA 8940
GGGGTGTTTG TGCCGAAGCG CTTGGATGTT GCGCAGGTGC GCGCGGTACT GGTGCGCGCC 9000
AGGGGCCTGG TGCACGAAAT AGTGAGTGCC AGTGAACCGG ATGTTGAGGT GCTCTTGCGT 9060
GCTGAGGCAG AAAAGTTTGG AATAAAACTT GGTGATTTTC TCATGCCCAT TCGCGTTGCG 9120
CTCACCGGTG CTACCGTGAG TGCCCCTCTG GTAGGAACTA TCCGCATCCT GGGGGCGTCA 9180
CGATCCTGTG CGCGTATTGA ACACGTCATT CGTGAACGCT TTTCGGATGA CAGTCAAGGA 9240
GTGGGAGGAG GCTGATATTC TCAGTTAACG CGGGCTATAG GGAAAAGAGG TATGCAGGGG 9300
ATGTGTTCAA AGGCGGGCGC GATCTCTGTG AGGTCGCCgC GGGTGCTGAC AGTATGCTAG 9360
ACAGGGGGGA AGGTGAGCGG CGCkTCACCA TGGAGAAGAT tGTCGGTCTC TGCAAACGGC 9420
GTGGCTTTGT GTTTCCATCT TCAGAAATTT ATGGTGGCCA AGGAGGTGTT TGGGACTACG 9480
GCCCTATGGG CATTGCGCTA AAAAACAATA TTGCCCATGC CTGGTGGCAA GATATGACAC 9540
GCCTACATGA TCATATCGTC GGGCTGGATG CAGCAATCTT GATGCATCCA AACGTATGGC 9600
GGACGTCTGG CCACGTCGAT CACTTCAGTG ATCCTTTGGT TGATTGCACG GTGTGTAAAA 9660
GTCGCTTTCG CGCGGATCAG GTTGCCGTGC CGTCTGCCGG GGGACCCTGT CCTCAGTGTG 9720
GTGGGGCCCT CACGGGCGTG CGTAATTTTA ACCTCATGTT CAGTACCCAC ATGGGTCCTA 9780
CGGATGAGCG TGCCAGTTTG CTCTACCTGC GTCCTGAAAC TGCGCAGGGG ATTTATGTAA 9840
ATTATAAAAA CGTCCTGCAA ACTACACGCC TGAAGGTGCC TTTTGGTATT GCCCAGATCG 9900
GTAAGGCGTT TCGCAATGAG ATTGTCACAA AAAACTTTAT TTTCCGTACG TGTGAATTTG 9960
AACAAATGGA AATGCAGTTT TTTGTGCGCC CCGCAGAGGA TACTCACTGG TTTGAGTACT 10020
GGTGTGCACA GCGCTGGGCT TTTTACCAAA AGTACGGGGT GCGTATGAAC CACATGCGTT 10080
GGCGTACCCA TGCTGCACAT GAGTTGGCTC ATTATGCACG GGCTGCCTGT GACATTGAGT 10140
ATGCATTCCC TATGGGCTTT AGGGAATTAG AAGGGGTGCA TAACCGTGGT GACTTTGACC 10200
TGACgcGCCA CGCGCAGCAC TCGGGTAAAG ACTTGTGCTA TGTGGATCCT GATCCAAACC 10260
TGGATGCGGC AGCGCGTCGG TATGTGCCTT GTGTCGTTGA AACGTCTGCA GGATTGAmGC 10320
GCTGCGTACT CATGTTTCTG TGCGATGCAT ACACAGAAGA ATATGTGCAG GCGCCGAATG 10380
TCGCGTTTTC GGAAACGACA CAGACAGCTG ATCAAGAAGG TGCTGCACGT ACGGGCGAGA 10440
TGCGAATAGT GCTGAgGTTG CACCtGCGCT TTCTCCCACC ACTGTTGCTT TTTTGCCTTT 10500
GGTAAAAAAA GACGGATTGG TTGACCTTGC GCGTGCGGTG CGCGACGAGC TGCGTGAGGA 10560
TTTTGCCTGT GATTTTGATG CaGcTGGCGC GATTGGAAAG CGCTACCGCC GTCAAGACGA 10620
GGTGGGTACT CCCTTTTGTG TCACAGTTGA TTATCAGTCA AAGGAAGATG ATACGGTTAC 10680
GGTACtCTGC GCGACAgCAT GGCACAGCGC CGGGTCTCTC GTGCCTTTCT TGCAGAGTTT 10740
TTGCGCACAG AGATAAAACA CTACCGGCGT CCCTAGGTTG TTGTCCGCTC TCTGCGCGCG 10800
GGGAAAATGT CACATATTAC ATCGCGAAGG AGCTCTCGTA TGAAAGCGTA TTCTTATGCA 10860
GTAGAGGATC GCTCGCTTCT CACTCCTTTT CTGTATCGCT TCTGTGTAGA TCCGCTGTTA 10920
CGCGTGGTGC CGTATCGAGT TCCGGCGAAT CTCATTACGc TGTGCGCAAA CGCCTGTATG 10980
CTGCTTGCAT TTACCCATGC GTACTGCGGC TCGGTGGGGG GTACcTACGC GTATTGGTTT 11040
CTAGTTCCTG TGCTGTGTAT TGTGTACCTG GTCGGAGATT GTCTTGATGG GCGCCAAGCT 11100
CGGAGAACGG GAACTGGTAG CCCCTTGGGA GAATATTTTG ACCATTGTTT GGACACCTCT 11160
GTTGTAGGAC TGCTGGCAGG AATTTTCGTG CTCGCGTTTC GTATACGCGA GCCATTTCTT 11220
TTGACGTGTA TCTTTTTTGT TCCCGCGTTT GTGCAGATTT CAACCCTGTG GGAAAAGCTG 11280
CACCGCGGGG TGATGGTGTT TGCGCGCATT GGGTCAAACG AGATGGTArT GCTGACCACA 11340
CTCGGCGCAT ACGCTGGGTC GTTCGAAACA CTGCGTGCGC TGTTCCTCAC GCCGTTGTTT 11400
TTTTCCTGTA CTCCTGCACA GGTATGTGTA TCAGTGCTCT CAACGGGAGT GTGtATTTTT 11460
tCGTGTGCGG TGTTTTGGCG TATGCGAGTG TTTTCATGCG CACTTTTTTT GCATTTATCC 11520
CTTTTCTTCT TTCTCTGTGT ATTTTCAAGT ACGTATTTCC CCACGCAGAT TGGATATATA 11580
ACGGCACTGT GCACGTTATA TCACATGCGA TATGCAGAGC GCCTTCTGCG CGTCATTGTA 11640
CAGGGGGAGG GAACTGCCCG TGTTGAgGTG TTGGTGCCAC TTTTGTGCGG TGTGTTGTTT 11700
CTTTTTCCTC AGACAAGCTT TTGGGTGCAG CGGGCGCAGT GTAGTATTTT GGCACTTGAG 11760
GTGGGGGTGC ACTTTGTACG ATTTGTGTAT GCTCATCGCT GTTATTGGCA TTGGCTGAAT 11820
CCTCTTCCAA CACAGGAGTA GCGTGGTGCA TGTGACGCTT TTGTACGGAG GCCGTTCTGC 11880
AGAGCACGAT GTTTCTGTAC GTTCTGCACG TTTTGTGGCG CgCACGTTGT GCTTACAACA 11940
CACCGTAATG CTCATCGGTA TTACCCGTCG TGGCGTGTGG TATGCGCAgC CTGCGTGTGC 12000
ATTAGAGCAG TTGTGTACCG GCACTGTCGC GCTCAGTATT CAGGAAGATG AAAAGAGGCG 12060
CGTGTGTCTT GTCCCGGGAG GTGGTACTGC AGGCGCTTTT GTCATAGCGG GGATGCCGTG 12120
TGTCACGGAT GTGGTATTCC CCGTATTGCA TGGCAGTTAT GGGGAAGATG GTACGGTGCA 12180
GGGTTTGCTT GAGATGCTGC AGGTGCCGTA CGTGGGGTGT GGAGTGTGTG CAAGTGCTCT 12240
TGCGATGGAT AAGGTAAAGG CAAAGATGCT ATGGCAGGCG GCGGGACTTC CCGTTTTACC 12300
GTTTGTCTTT TTCCGTAAAG ATGCATGGCG TATGCAT TG CAAGAATTTG TTGCGCAgCT 12360
TGAAACACGC CTTGGCTATC CTCTTTTTGT AAAGCCAGCT CAAGCAGGCA GTTCCGTAGG 12420
AGCCAGTGCA GTGCAGACGC GTGCACCGCT TATCCCTGCG ATTGAAGCGG CTTTTCAGTG 12480
GGATGAAGTG GTGTTGGTGG AGCGATATGT GCGCGCGCGA GAAATTGAAT GTGCGCTCAG 12540
TGGGAACGGA CCCTATACTG TACATGGGGC AGGAGAGGTG ATTGCGCAGG GAGCCTTTTA 12600
TGACTACGAG GAAAAATATG CTGATGCAAG TGTCGCGCGT GTACTCGTTA CGGCTCCTCT 12660
TGAGACTGCC CAGTACGAAC AGATTACCAC ACTTGCCCTG CGCGCATACG AAGCATTAGG 12720
ACTCACGGGT CTGGCGCGGG TTGATTTTTT TCTGTTAGAA ACGGGAGAAG TA ATGTGAA 12780
CGAAgTAAAC ACGATGCCGG GTTTTACGTC GATATCACTC TTTCCCCAAA TATGTCAGGC 12840
TGCAGGTGTT GCACCGCAGG ACTTAATGGC ACAACTCCTT TCTTGCGCAC GAGasCgctT 12900
TGCAGCGCGC GCCGCACTGA GCACCGACTT GCACGCCCAC GTGTGTGCGC CCTCGGTGAC 12960
TGCTGCACAT GACCCCGATG CGCAAGGGGA CGACTGGGAC CAGAGGswCT CGAACCCCCT 13020
CCCTACTGCT TAGAAGGCAG TCGCTCTATC CGGGTGAGCT ATGATCCCGT GGTACGCTGC 13080
GAGCAAAAAC CCTGCAAGGg TGGaTAAAAA TATATAACGT GTCAACAATC CTAGAATGCT 13140
GTGCTGTAGC TCCGACTGCT TATCGGGTGC ACCGTTTTTT GTTATAATGG CGCGCATGTC 13200
TTTTGTTCAT TTGCATGTTC ACTCAAATTA TTCACTGTTG GATGGAGCTT CTTCATTGCA 13260
GCGGCTAGTG CGTACTGCAA AGTCGCTGGG ACAAGAAmGc sTTGCgCTTA CCGACCATGG 13320
GAATATGTTT GGTGCGTTGC ATTTTCAAAA AGTTTGTTCT GCTGAGGGTA TCAAAGCGAT 13380
TATCGGATGT GAGCTCTACG TGGCACCCGA AAGTCGCTTT GATCGCAGTG AGCATACTAT 13440
CGGTCGCAGA TACTATCACC TCATCGTGCT TGCTAAGAAT GAGACGGGAT ATCGAAATCT 13500
AATGGTTCTA TCCTCCAAAG CCTATATCGA GGGTATGTAC TACAAACCAC GTGTGGATGA 13560
CGAGCTTCTG GCCCAGCATG CAGAAGGGCT CATTTGTCTT TCTTCTTGTC TTGCCGGACA 13620
GCTTCCTTAT CTGTTATTGC AGGGCAGAAA AAGGGAGGCA GAAGAACACG CGCGCAAATA 13680
CCGAGCGCTC TTCGGTGTAG ATAATTACTT TATTGAGGTG CAAGATCATG GACTTGATGA 13740
AGAGAAGAAA GTAGCACCGC TTTTGATTGA GCTTGCATGT AGGCTCGGCA TTCCGTTGGT 13800
GGTTACAAAC GACGTGCATT ATGCGGAgcA GGnAAGACTC TGTTGCACAA GACATTCTGC 13860
TGTGCATTGG AACGAAGAAG AATCGCTCCG ATCCCAATCG GCTTAAATTT AAAACAGACG 13920
AGTTCTATTT AAAGTCTTCT GAAAAAATGG CTCAGCTGTT TCCCCACTAT CCTGAAATGG 13980
TGCTGAATAC GGTGCGCATT GCACAGAGAT GTAATGTGCG GATTCCTCAG CCTGGCCCGC 14040
TGCTTCCGCT CTACCAGATT CCTCATGAGT TTTCCAGCAA GGAACACTAT ATTCGCCATC 14100
TGGTCCATCG AGGTTTGTAT GATCGCTATG CAGTAGTGAG CGAAGAAATT AAGGCGCGTG 14160
CTGATTATGA ACTAGATGTT ATCGTGAGGA TGGATTTTGT TGGCTACTTT TTGATCGTGT 14220
GGGATTTTAT TACGTGGGCA AAGGAGCATG ATATTCCTGT TGGTCCGGGG CGGGGGTCTG 14280
GAGCAAGTTC TATTGTTGCA TATGCGTTAA AAATTACCGA CATCGATCCC CTTAGATATA 14340
AGTTGCTTTT TGAAAGATTT ATGAATCCTG AGCGTATTTC TATGCCCGAT TTTGACATCG 14400
ACTTTTGTTT TGAGCGCAGA CAAGAAGTGA TTGAGTATGT GCGTGCGAGA TATGGAAATG 14460
ACAATGTTGG GCAAATTATT ACGTTCGGAA CACTTAAGCC AAAGGCGGcG ATTCGTGATG 14520
TAGGGCGCGT GTTGGATATT CCGCTTTCGG AAGTTTTGAT GATTACAAAA CTGATGCCTG 14580
ATGATCCAAA ACTGACTTTT AAAAAAGCGT ATGAATCTGA ACAATTAGCG CAAATGAAGC 14640
AGGAGCCGCG CTATGCTGAA TTGTTTCAAA TAGCAGAAAA GCTTGAAGAC ACCAATCGAA 14700
ACACTAGTTT GCATGCAGCA GGTATCGTTA TTGGTAAAAC GGCGCTCACT GATTATGTAC 14760
CGCTCTACAa GGATTCTAAG ACGGGAAAAA TTAGTACCCA GTTTGGTATG GATTTAATTG 14820
AAGACTGTGG ATTAGTGAAG ATGGACTTTC TTGGGCTAAA AACACTTACG CTCATCCAAC 14880
GGACGCAGAA TCTCGTACGA CGTAAAGGGG GTAAGTACAC AACGTTTTCG ATATCGGATA 14940
TCAGTGATCA GGATCCTACG ACTTTTTCTA TGTTGGCGGA AGGAAAATCT GCTGCaGTGT 15000
TTCAGTTTGA AAGTCGCGGT ATGCAAGGCA TCCTCAAGCG TGCAAAGCCC AGTAAGATGG 15060
AGGATCTAAT AGCGTTGAAT GCATTGTACC GACCTGGGCC GATGGCATTC ATTGATCAAT 15120
ATATTGAATC GAAACGTGAT CCTGGGAAAA TAAAATACCC TGATCCGTGT TTGGAAGACA 15180
TCCTTTCAGA AACATATGGG GTAATAGTAT ACCAAGAGCA GGTTATGCAG GTGGCACAGC 15240
GCATTGCAGG TTTCTCGCTG GGAGAAGCAG ATATTCTGCG CCGTGCGATG GGAAAGAAAA 15300
AGCTTGCAGT GATGCAGGAA AAGAAAAAGG AGTTTGCTGA GCGTGCAGAG AAACAGGGTT 15360
TTGATAAAAA GCATGCTGAG AATATTTTTG AAATTCTTAT TCCTTTTGCA GGGTATGGGT 15420
TTAATAAAAG TCACGCCACT GCATATTCAG TGGTTGCCTA TCAAACTGCA TTTCTAAAAG 15480
CAAATTTTCC CGCCGAGTTT ATGGcTGCGA ACCTTTCAAA CGAAATTAAT TCTGcAGAAA 15540
AATTACCACT CTACATGGcT GAAGCAGAAA AGATGGGTCT GTCCATTCAG AAACCGGATG 15600
TCAATGCTTC TGAACCTTAT TTTAGTGTTT GTGAAGGGTG CATTGTGTAT GGGTTGTTGG 15660
GTATTAAAGG TTTGGGTGAG CAGGTTGCGT TTGACGTTTT TGATGAGCGT ATTCGCAACG 15720
GTCCTTACAC CTCCTTTGTA GAGGTGCTGG ATCGAGTTCC TGCAACCTCG TTAAATAAAA 15780
AAAATGCCGA AATAATGATT AAGGCTGGAT GTTTTGACCG GTTCGGGGTA ACTCGCGCAA 15840
GTCTTACAGC GCACCTCGAC GATGCAATGA AATATGTTGC GCGAAAAAAG GCGGTTACAA 15900
GTTCTAGACA AGCAAGCCTT TTTGACGAAA CGGATTTAGG AGAATGTTCT GAATACACCT 15960
TTCCGGTTAT GGAAGAATGG TCCCAGAGGG AGAGACTCCG TATAGAGAAG GAACTGATGG 16020
GGTATTATAT TTCTGGTCAT CCTCTTGATG AATATCGAAG TGTGATAGGA GAAAAGGCGA 16080
CATTGGATTT AGGACATATT GAAAATGCTC GTTCTGAAAA TAAATACCTG ATTGTGGGAG 16140
TGCTGAATGC TATTCACCCG TATACAACTA AGTCAGGAAA GAATATGGCT TTTGGCTCTT 16200
TTGAGGATCT CCATGGCTCT GTAGACATAG TTGTGTTTCC TGTGCTGTGG GAGGAGCATC 16260
GCGCGCAGTT CTTGCCAGAA ACTATTATGG GGTTGGTGGG AACTGTAGAC TTTTCTAAAG 16320
AAACCCCGGC GTTCTTGGTA GATTCTGTCA TTGACTTGGA ACAATTACGG TTTGCTCAGG 16380
TTAAAACTAT TCTGGCTGGA TCGGAGCATA GGCGTGTATC GTCAGGAGAG AAAACTCCAC 16440
TGCAGAAACG TGGCGTTTCG CAGGAAGTGC ACATCGAGGT GAGTTCTCAC GTTCGTGCGC 16500
ATGCACAGTT TAAATCGTTG TATGAGATTT TGAGTGCACA TACAGGAGGC TCGGGTGAAG 16560
TGTTTCTTCA CATGCATGTG GATGACCGTA CGTACGTGGT GTACGTTCCT TCGTGTAAGG 16620
TATCTGCCAC TGAGGTATTT GCGCAncAAT nTAAAAGGTA ATGAGAGTTT TGTCCAAATT 16680
CTAAAGGAGT GCGTGCAATG AGTTCTGTTC TATCTACACT CTCGGCATTA TTGTCAGTGT 16740
ATGCGCTCCT GTGTACGGCA CGCGTATTTC TCTCGTGGGT GCCCCATCTT tCACATTCAC 16800
CCCTGGGGGA ATTCTtATCT GCGATATGTG AGCCGTACCT GTCCTGGTTT AGGAGATTTT 16860
CGTTTATGCG TGTTGGTACG GTGGACTTTT CTCCCATGAT TGCGATTSGG GTGCTCACCA 16920
TACTCTCAAA CACTGTCGGA ACTATTTTCC TTGTCGGTTC GGTTTCTGTG TTAAAGTTAC 16980
TGCTGCAAAT GCTGATGCTG TTGCTGTTGC TGTGGTCGTT GTGCAAGTTT GTGTTGGAGT 17040
TTTTATTGAT TCTTTTTGCT GTTCGATTTG TTTCCGATCG TATGAATGTA AATGTTCATA 17100
CGTTATTTTT TGTGATGATG GATAGGATAT TAAATCCGGT ACGTGTTGCG TTGACCGCTC 17160
CGTTTAAGTT CCTTGATTTG AGTTACCGTG CGTCCTTGCT CTTGTGTGTT CTTGTGATAT 17220
TGTGCGCGCG GGTTCTTGGA GGTTTTTTTG TGAATGTAGT GGTGCGGTAC TTTTTGACTG 17280
GAACACTGCA CGTGGCAGTG ATGTAATCCG TCGCTTTGAG ACAAAGGACT GA ATCCCTA 17340
TTCACTGTAG GCAGTGTTAT TCGTCGAAAT ATGTATTGCC TGAAGAAATT ATTCGGGACG 17400
GAGGGATTTG AACCCTCGAT CTTCCGGTCC CAAACCGGGT GCCCTAGCCC CTAGGCCACG 17460
TCCCGTACGC TTGACACTGT GTGTTAAGAA TGGATAGGCT GTCAACGGTT ACCTGCGAAA 17520
AAGTCTCGAT TCTTGTGTGG GAGATTGGAT GGGCACGTTT GTGGTGTCAC TGCCTGGTGG 17580
GCGCCGAGAA AAGTTTTCCG AGTGCGTTCC AGCGCGCGTC CTCTTTGAGC GATTTTTTGG 17640
CACAGAATCG TCTGTGTATG GTTTGATGTG TAACGGTACA CCGGTACTGC CATGCCAGGT 17700
GATAGGCGCC GACGCGGTAG TTGAGCCGGT TCGTGAGGAT ACGGTGTTAG GGGCCGCTCT 17760
GTACCGTAGg ACTGCGCGTT TGCTGTTTGC CACAGCGTTT CACTCGGTGT ATCCGCATGT 17820
GCGATTGTTT GCaGGGTATC GAGTGCmAGG GGGaTATTGC TACCGTACCG AGGGTGCGTG 17880
CGCAGATGAC CTGGaTGTTT CGTTGGTAGT GCGTAGGATG AAGGCGCTTG TGGCGCAGGA 17940
TGCGCCCATT CACATGCAGT ATATGACGCG TCGGGAAGCC TTGAATCTGT TTACGCAGTG 18000
GAATTTTCCA TATTCACATC ATTATATTCT GGGTTCGTAC CGGACTGTGT TTTTAACGCA 18060
GGTACTGGAC GGTTTTTCTG CGTTGTTTTT TCAGCCGCTC ATGGCTTCTG TAGGGAGGCT 18120
CACCGTCTTT GAGGTGCgGA TGTGTGCTGA GGGTTGTCTG TTGCGTTTCC CTGAAGGTGG 18180
ACAACGcCAC ATCATTTCTC AGCACAACGC GTCGCCACAG TTTGTGGTAA TGTATCGGAG 18240
GCATCGGCAG CAAGAAGAAC AGACAAAAAT ATGCTCAGTA GGACAGTTGA ATGCGTGCAT 18300
TCAGTCTGGT GATGTTGCAA CTTGTGTTGA CATGGCTGAG GCGGCGCACA ATCGGCAGAT 18360
TGAGTGTTGT GCCACAGAAA TTGCACGAAG GGACAGCGTG CGCGTGGTGT CGATAGCAGG 18420
ACCGTCAGGT TCTGGAAAGA CAACGATTGC AAAAAAACTT TCAGTGCAGC TGCAAGTACT 18480
TGGTTACGAT CCGCATGTGA TTAGTCTTGA TGATTACTAT GTGGGGATTG AGCGCACGCC 18540
GTGTGACGCG GAAGGTAATC CTGATTTTGA GTGCGTCGAA GCCTTAGATC TTCCCCTGAT 18600
TAATAAGTTG TTTTTGGATC TCTTGCAGGG GAAGCGTGTT GCACTTCCTT CGTATAATTT 18660
TAAAACAGGG AAACGAGAGT ACCGGGGGCG GGAAGTACAG TTTGGTGAGC GTTCGCTGCT 18720
TATTATTGAG GGCATACATG GCTTGAACGA TCGGCTCATC TCGTTGATGA ACCGGCGAGT 18780
TGTATTTCGG TTGTACGTCT CTGTGTTCAT GCATTtGTGC TTGGATGAAC AGCACAGGGT 18840
TTCGGCGTCt GAtGGAAGgT TGTTGCGGAG GgTTgTsCGa CGcGCAGTTT CGCGgTATTT 18900
CTGTCGAAAA AACACTTGAA ATGTGGCAAC GGGTGCGTGc AGGTGAAGAG CGCTATATTT 18960
TCCCTTTTCA GCACCGTGCA GACATGATGT TTAACAGTGC ATTGGTTTAT GAGTTTGCaG 19020
TGTTAAAGCG CCGTGCaCAG GAAGTTTTAA GCaCGGTTTC TTCTGCTTGT ACCACGTATA 19080
GGGAAGTCCG CAATTTGCGT GCCTTGTTGG AGCAGTTTTG TTCGTTGTCT GATGTGCATG 19140
TTCCGGGTCA GTCGATATTA AGAGAATTTA TTGGGCAAAG CGATTTTTGC TATTGTCTGT 19200
AGCGAGTGCT TTTATAATGC AGGGTATGGC GACACAAAGT GACGCGTGCA GAAGGGAAGT 19260
GGTGCTTCGA GTGTTATGAC GCTGTATGAA TATTATTTGA TATTTCCTGA TGGAGAATGT 19320
CGGGAGATAT CAGGACCTCC CTGTGAGAGG AGTCTTCTTG ACATGAATGG ACATCCGTTG 19380
AGAGTTCCCC TGTCTTCGAA TAGAGTGATC GCGTACCGCG TCGCGGrAAA GCGCACTGTT 19440
GCAGGTGGTC GGGGGGTAGT CGGCATATGG TACACGGmCG AGCAGCTTGA CGCACTCGAG 19500
CTGCTCGAGT ACGTCTCGGG GCCTCTTGGC CAGCGATGAC GATTAACACG TGTCGGGAAA 19560
CGGGGCTCCA TAGAGCGCTG AAGGACTACT TTAGTCCTCG TGGTTCTCGG CAGGAAGTAG 19620
AGCTTCGCGG gTCGATCTGC GATGTCGTCC ATCCGGACGG aCGATTGTCG AAgTTCAAAC 19680
GTCGGGGCTA GGACGCCTGG AGGCAAAGCT GAAGAAGCTC CTCCCTTACC ACCAGGTGAT 19740
GGTGGTGTAT CCgGTCTCCA GACGTCTGTA TATTAGAATG CTGAACGAGG ATGGCAGCGA 19800
GCGGCATTAC CGCAAGAGCC CCAAGGAGGG TTCGTTCTTC CAAATATACC GGGAGATCGG 19860
CAGACTGCAC GACCTGCTCG ACCACGAGCA CCTTTCTCTC CATATCGTGT ACATACACAG 19920
CGAGGTCATC AAGGTCGACG ACCGGAAGGG GAGAAGTAGG TACAAGAAGC CGCGCATAGT 19980
CGACAGAAAA CTCCTCGAAG TGCAGAGCTC AGAAGAATTC CGCAACAAGG GGTCCCTCGC 20040
GCAACCTCTC CTGTCAAAGC TACCTGAAAT CTTCTGCTGC GATGACCTGG CGCAAACGGG 20100
CACAGGCGTG CACTGcCGct ACGCCcTGCG GTTTCTGAGG AGGAACGGGA TGGCCACCCC 20160
GCACTCGAAG CGCGGCAGGA CAAAACTCTA CCGGAAGGAA CCGCCGGGGG ACAATCGATC 20220
ACCTCCTCCC TGGCAAGAGC CACATGGGGA AGGCTTAGCA GAAAAGCTAA GCCCGGGCCC 20280
GGCCAGGTAG ACGCACTCGC TCATCCTTTA CCAGGCATCA GACATGTCAT CAGGCTCGCG 20340
CGATTCCACG TCGTAAAGGT CAGTGACCAC CCTCAGGTCA TCGAAATACA CGTAATAATT 20400
GCCATACGCC TCGAGAGGAT CGCAGTCTAC GCGGAAGCCT aCGATATTCA GCCCAGACTG 20460
GTTAGGGAAA CGGCGGCTCT TCTGTACGAT ACCCGTCTTC CCATCAACAT GCTGAGGGGG 20520
GATTGCGACA CTCATGAGCT TCCAGCCGGA AAAATCGAGC TGTCCCATGT GCAACTCAAA 20580
GCGCTGTCCC CAGAAATCCT CCAGCAAGAG ACTGAGCGAG TGCGGGTATC CGCGCCCAGC 20640
CACCCACACG CTCACTGTCT TTGCGACACC CTCAACGGGC AACGGCTTAA CGGAAGAAAC 20700
CTCAAAACTG TTGTACCCTC GCCGGTAAAA CGAAACtTCG CGCCAAACAC CTTAGAGTCG 20760
GGAATCTTCA TATCCCCTTC CTCGGGGATA GGGCGCTTTC TGGCAGGTCC ACCCTCAAAC 20820
AGGCGCCCCT TTATAGTCCc TTCGTCAGAA GACATGGAGA CAACCCAGGT CCCTTCATTC 20880
TCAAACTTAT CTACGGACAC TTCCTTGAGG CGTTGCGCAG CAGCATACAT CCCTATGCGA 20940
GATGGATCGG CAACATCCCT GCTCCCAGCC GCCTCCTGCG CATGGGCAGA AAAAACCAAC 21000
ACCCCTACAC TCACCACCGC TATCTTCTTC ATTTGCCGCT CTCTCCTTCC TCCTTGAACT 21060
GAGCATTTCT CAACTCGTGT CCATCGTAAA AGTCGATATG CATGTTAGCA AGCGCCTTGA 21120
ACTGGTCAAA GAACACGTAG AAATCATCCA CCCGCTCTGA TGGGCTAGTA 21170
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11516 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
ACGATGATAA TCCTCCTCCC TTCAATTCTG ACTTCGGCCT TTnCCACAAC GCACCCGCGT 60
CATTGACTTG ATCTACTTTC TTCTGAGTAT CCGTTACCGC TCCATCCTTG TCGTACTCCT 120
TCTGCAGGTA ATGCAGATAG ATGGGAAACG CGATATGATC AGAATACTTC TTAATTACCT 180
CTTCAAGACG CCAGCGCGTT GCAAACTCGG AATTTTCCTG GCTCAGGTGC AACACAACGC 240
AGGTACCGGC ACTACCCTCA GCAACCCCTT CAAGTACTGG GAAGGCAGCC GCATCAACCT 300
CATCCAAGGT ATAGGCATTT TGCCCTTCAG ACGTCCACTT CCACACGGTG TTCTCTGCAG 360
CTTTCTTGGT GATTACTTCT ACTTTGGAAG CGACCATGAA GGCAGAGTAA AACCCTACGC 420
CAAACTGGCC TATCAGATTG GAATCCTGTT TTTGATCACG CGTCAGCGTA CTGAGAAACG 480
CCTTTGTACC GGATCGCGCA ATGGTACCTA GATTGGCCCT CAGATCTTCT GCGTTCATGC 540
CAATACCCGT ATCACGCACA ACAAGCCGTT GAGCATCTTC TTCAAACGCG ATGTCTATAC 600
GCGcTTCGCA ATGCAACTGC TTGTACGTAC CATCAACAAG TGCCTCATAC TTCAACTTAT 660
CTAACGCATC CGACGCATTA GAGATAAGTT CCCGGAGAAA AATCTCTTTA TGGGAATAGA 720
GAGAATGGAT AATCAACGTT AGCAGCTGAC TCACTTCAGT TTGAAACTCG TACTGAGCCA 780
TGTATCCTCC CAGAGGTTAA AAAAGATTCC ATTACGCCGC GCACAGACCG CGCGCGAAGT 840
GTAGCACAGA CTATGCAGCA CAGTAAACCA ACCGGAACAG GTGGTACACG CTGCCCGATG 900
AACACCAGAC AAAAGAACCC GTGATTGTAT AGCGCTCACA CCCCATGGTA TGATGGGCAG 960
GTCATGGATT ATCCGAGAAG GACTATAGCT TGTGGCGAGC TGCGCAGGTG CCACGTCGGA 1020
ACGGTAGTTG TGCTCAATGG ATGGGTCCAC CGAAAGCGGT CGCACGGAAC CGTTAGTTTC 1080
TTTAACATGC GCGATAGGTC CGGAATAGTG CAGGTTATAG TGAGCCAGGA GGAAAACGCT 1140
AGCCTGTGGT CCACGGTAAA CCGCATACGG TTGGAATGCT GTCTTGCAGT CGAAGGCGTG 1200
GTGCGAGAGC GACCTCCTTC AATGATAAAT CGCGCCCTGC ATACCGGGGA GGTGGAGGTG 1260
CACGCTCGCA CGCTGTACGT TCTCTCGGAG AATGCTGTGC TTCCGTTCCG CGTTGATGAT 1320
GTTGTGCATG CGCACGAAGA TATACGCTTA AAATATCGCT ACCTCGACCT GCGCTCTCAG 1380
CGCATGCAGG AGCGCATTGC ACTGCGCTCA CGCGTTGCCC TGGCCATACG GCAGTTTTTA 1440
AGTATGAAAG GTTTCATCGA GATCGAAACT CCCACCTTCA TCTGCTCTAC CCCCGAGGGG 1500
GcACGTGACT TTGTTGTCCC TTCCCGAGTG TGCCCCGGGC GTTTCTATGC CCTGCCACAG 1560
TCCCCCCAGC TGTACAAGCA GCTTCTGATG GTGGCAGGGT TTGACCGCTA TTTCCAACTT 1620
GCCCGTTGCT ACCGAGACGA GGATGCACGA GGCGATCGTC AGCCAGAATT TACCCAGATA 1680
GACCTTGAGA TGAGCTTCGT TTCTCGAGAC GATGTTATGC GGGTGAACGA GGATATGCTT 1740
CGGTACGTGT TTAGAACCAG CATCGGTGTC GAACTGCCTA CCTTTTTTCC TCGGCTTACC 1800
TACGCGCAGG CGCTAGACCA ATATGGAACA GATAAGCCAG ACATGCGCTT CAAACCGGTC 1860
CTGCAGAATG CAGACTTTAT GGGAATGCTT GGCACGTTCA CCCCGTTTGA AGAAGTCGTC 1920
GCACAGGGTG GCAGCATCAG AGCACTCGTT CTTCCGGGCA AGGCACGTTG CTACAGCCGT 1980
AGnAAAtCGA AGCGTTGGAG TCTATCGCTC GAGCACATGA GGCGCACCAC CTTTTTTGGC 2040
TTAAGGCAAC CGGTGGAGGC CTCGAGGGGG GTATCGCAAG GTTTTTTGCa GGGGTAGAGT 2100
CCGAAGTACG CCGGCGACTT TCTGCTCAGG ATGAAGACTT GTTGCTCTTT GTCGCCGATT 2160
GCCGGCACCG CGTGTGCTGC GTCGCACTCG GCGCAGTGCG CAGCGCTCTT ATCAGGGACG 2220
AGTCGTTCCC AGAGAAGGAG TTGTTTTCTT TCGTGTGGAT CGTTGATTTT CCCCTCTTTG 2280
AATGGAACCC AGCGGAAAAC AAGTGGGACC CTGCTCATCA CATGTTCTCT GCTCCTCAGG 2340
AACAGTATCT TGAGACGCTC GAGCAAGATC CCGGTTCGGT AAAAGGTGAC CTCTATGATT 2400
TGGTGCTCAA CGGGTATGAG CTGGCTTCAG GCTCAATTCG TATCCACGAC ACACAGCTGC 2460
AAAAACGCAT CTTTAAGATA GTGGGATTAG ATCCTGAAGA AGCGGGGGAA AAGTTCGGGT 2520
TTCTCACAGA AGCGTTTAAA TACGGCGCGC CgcGCACGGc GGCATcGCAC ACGGGTTGGA 2580
CCGCCTCGTG ATGCTCATGA CAGGAAGCGA GTCAATTAGA GACGTCATTG CTTTTCCTAA 2640
AAATACACTC GCCGCCAGCC CCCTGGACAA TTGTCCTAGC GTGCTCGATA AGCGTCAgCT 2700
TGaCGAGTTA CACCTCACTG TACACGTCTA GGGGCATCGC TACTCGCTCG TCGGCGTAAA 2760
ATACCTACCA GGGGGGGGAG GGGTACATGG CTTTTACTGA GAAGCAAAAG GGTACTTTGT 2820
GCCTAATGTG CTCGAGTTTT TGCTTTAGCG TGATGAGCGT CTTTGTGCGT CTTGCAGGGG 2880
ATCTCCCCTC TATTCAGAAG GCATTTACGC GTAACCTGGT CTCAACGCTC ATCTCGGGAT 2940
CTATGCTCTT TCGTGCGCGT ACCCGCGTCC ACGTGCAGGA TCTCCCCATG CTCTCCTTGC 3000
GTACCGTGTG CGGGACGCTA GCAATCGTCG CAAACTTCTA CGCAGTAGAA CGCTTAACAT 3060
TGGCAGACGC GTCGTTGCTT TCGAAGCTCT CTCCGTTCTT TACCATACTG TTTTCTTGCC 3120
TTTTCTTGGG AGAACGCATT GCGCCGTATC AAGTCGTCGC CCTCTGTGGT GCCTTTGCTG 3180
CAGGCACGCT CGTGGTCAAG CCGAGTCACA CCCTTTCTCA CCGTGTATTT CCCGCGTGTA 3240
TTGGCGCAGT AGGAGGCATG ATGACGGGAG CTGCGCACAC GTGCGTACGC TACCTCTCCA 3300
CCCGTGGCGT AGAGAAGTTC TTGGTTATCT TTTTCTTTTC tTCGGATCGC TGCTATTGCT 3360
GCTCCCTGCA TTTATATGGC AGTACCAACC GATGAGCTCA CCGCAAGTGc TTACGCTGTG 3420
GGCCGCAGgA GTGGCAGTAG CAGGTGCACA GTTTTTTCTC ACTGTTGCGT ATCGATACGC 3480
GCCAAAAAAG TCGATTCCAA TTGACTATAC CCACATCTTA TTTTCGACGG GCATCGGTTT 3540
CTTGTACTTT AAAGAGGTGC CCGACCACTG GACCGTAGCG GGCATCGGTA TCATTCTCGC 3600
CATTGCCCTG TACGTGTTTG CGCGCGAGcg TGaACGGAAA GAACCCACCG TGCCGTCGCA 3660
CACACGCTAG AGCCGATGGC ACGCACGTAC GCGAAgCACA TGGTCTACCC CATGCTTAGA 3720
TTTTTCTCGG TAAAAGAATG AGGCAGGTGC GCGTGACGTG CACAGGAACT TCCACCGCGT 3780
ATTTGCCGTC GTGCGGCGCG TCACGTACAG CGACAACGTG GAGAAAATCC TTTCTCGCAA 3840
GCAGCCgCCC TGAGGGGCGC TGGTACTGAG TAAAATCGTA GCTCACACCT ATAACACGAC 3900
GACCACCAAG CGGTACCGTA ATGCGTCCGT TAATGTCGGT AcGGTAcGTT TCATGCTCTG 3960
CATGCTCGAA ATGGACAGTG CAAGGCAGCG CTTCGTACGA GATGTGGAAc ATATCCGCCA 4020
CGAAGATATG CACGAAATGG GTATCTGCGC GCGGCGgGGT TTTACATGCA GCGACACACA 4080
CCCCCACCGT AAGCGCCACA GCCAGAAAAA ATGCTGCACG CGCGCTGTAC CCACACAAGG 4140
TAACGGAGAT TGCCGCACGC GAGGTTCTTC TCGTATACTC ACCCCTCGTA TGAGTACTTG 4200
GACACACATC TGGTCTACTG CGTTTACCTT GCTGTTTATT ATCGATCCGA TTGGGAACAT 4260
ACCGGTGGTA CTGTCytGCT GCGCACCGTG CCAGCTGAGC GTCATACCCG GATCATTTTT 4320
AGAGAACTGC TTCTAGGACT GGTGCTCATG CTCTCCTTCC TTTTTTGCGG AAAAGTTTTC 4380
CTATCTTTGT TCCAGCTAGA AACGGGAGTA ATGAAAATGG CCGGAAGCGT CATTCTCTTT 4440
CTCGTTGGCA TCAAGATGGT ATTTCCTGAT CAACACGCGC TCCCCTCCAC CACAGAAGAG 4500
GAACCGTTTA TTGTTCCCAT CGCCACTCCC ATGATCGCAG GTCCTTCGGC GTTCACCACG 4560
CTGGTAATTA TGGGAGAGAC GAAGGGGACA TCCCGTCTCG CCACCTGTGc tGCGCTGCTT 4620
GTTGCGTGGA CGCTCGCGTG TCTTATTATG ATAAGCGCAC CGTGTCTATA CCGTCTTCTT 4680
AAAGAAAAGG GAATTACCGC GCTTGAGCGA ATCACAGGTA TCTTGCTGCT CATTCTTTCC 4740
ATCCAGATGT GTGTTGAGGG AGCCCGGGGC ATTATTGCCA CTTCCTAGCA AGAAGGAAAA 4800
CTACCCGCTG CGTACGTGCG GGCTTAGGGG ACGACGACAA CGTTCGCGAC TCTGCCATCT 4860
GCCAGGTATG CGCGGGCGTT GCTCTGGGTG TCAAAGGAAG AAGTGCCATC TTTGACGAAG 4920
GCATAGAGCC ACCTTCCAGG CGGGAGGGGA AGCTCTAGCT CGTAGTGGCC GGGACGCACC 4980
TCTTCCAGAG AGTACATGAA TGGATCCCAG TTGTTAAACG TACCTGCAAG gTGGATAGTC 5040
TGTCCCGCTG CACCCTGGTA CACAAACCGA GTGCCCGCGG CCGTATGCTG GGTTTGATAC 5100
GATTCGTGAG ACGGCACATC GAGGTAAGAA ATGGaCATGC CATCGCGGTG ATCGTAGCTT 5160
TCGAAGCTAT TTTCAGGATC GGTAGTCCAC AACCCATCAA TCACAAGCCG GTAACTTAAA 5220
CGCGAACACC cTTCAGGAAT AGGCGCGATA TGGAAAAGAA CGGAGCGTTC AGTGAGATTC 5280
TGGGCGCTCT CTTGACTGAG GCGCACGAAC GAGTATATCG GGCGGTACCn TTCGTGCTCA 5340
AACGCGATAC CCACGTGGCG CGCTGCCCCT GACGCAgTAA ACACGACGCA GCGCCCCTGA 5400
ATCCGAGGCG CTTCCACGCG GGAAATAGAC TCGATAAGCG CGCGGCGcTG CGTCGGATCA 5460
AGTCCAGCCG CGCAGAGTCC GACAGCACCA GACAAAACGA GCATGACACC AAGCGCACAT 5520
CCTCTCATCG AGTTTCTCGA TCCTCCCCGG CAAAGCGCAC CACCACGAAC ACACCCCCAT 5580
ACCACCGGTC CTGGCGAACT CGCAAAAGCG CGGCACACCC GAAACCCATA CCGCGCACAG 5640
CTTCGGGATA TGCATGCGCA TGCTAAAGGG AAACCTGTCC TCCTGGCAGA CTTCACTCCT 5700
CCACAAAAAA AACCGATACG AGGGCGGGGA GTATAACGCG CAATGCCGAG TGCACAACAC 5760
CTGTCAGAGT TTGCTCGCGA GCTCAAGACT CTTGGGAATG AGCCAGACAC CCTCAAATCT 5820
TGGGGTACTC TGTACGATGA CCTACCACCT CCTGAATCTA CCCCCGACGG GGCACAGCCT 5880
GCGCCCACGC CTGAGCGGCA GTCCGCGCCT GCATCCGCGT CAgcTTCTGG CCCTGTGTCC 5940
GCACATGGGC AGCGCcCCTT TGAGCCTGAC ACAGAAGCAT CGAGCGTTGC CTCGGGAGAG 6000
GAGGTCGTGC AGGAAGATGC GCACGCACCA CAGACTCGAA TGCATGACTC CGCACAGGAG 6060
CCAGCGGCGG AGATTTCTCT CTTTTCTGAA GAGCGGACAC CGGAAACTAT GCCGACTGCT 6120
GCCTGGAGTG CACCACCGGA TCCTCTTTTT GAAACCGAGC ATGCTGTCCC CCCCCTACCT 6180
CTTGACCCGG AAGAAACACC AGTGCCCGGA GAAAAAGGTC TCCAGGAGTC CGCCGTGCAG 6240
GAGGAAGACG CCGGATTTAA CCAGATGCCT GCGACAGGAG GGCAAACCAG CGAGAATCAA 6300
CAACACTTTG ACGCATTGCT CGCCTCTCTT GATCTTGATT CGGCAAATGG CGAACGCGTG 6360
GTCCCCGAGA ATGCAGATGA GTTCGCCGCT CAGGTACCTG AATCCCTTCT AGAAGGGTTG 6420
CATCCAGAAG ACCAAGAGAC GAAACGCTCG CAAGAGGAAC CTGTATCCTA TGACTTCCCT 6480
GCGTTTGATC TGGACCAGGT AGCGCCTCCT ACACCAGACG CCCCTGATTC TTCTAACTCT 6540
GCTCTCACTG AGATTGAAAT CACCCCAGCG CTCTCTGAGC ACCCCACGCA GACGCAGGAA 6600
ACGGGTACCA CCTCGCCACA ATCGCAGACT GTGCACGCTG ATGCGTCTGC CCTAGGGCCT 6660
AGTGCCTCTG ATCCTAATTT TTCCCCTGGG TCTGCGGATA ACTTGGTCGC CCAATTCCCC 6720
ATTGAAGAAA GCGTGCAGAT ACCTCCTTTC CCCGCTGATG GCTTTGAACT TCCCGGTAAA 6780
TTCCAAGAAT TTGCGAGAGA ATCTGAGAGC CCCTATTTCA GTCcTGATAC AACCGCCGAC 6840
GCAGACCAAG CACAGACCAT AAGCGAAACG GAATATCAAC GCTTTCTCCA GCGGCTCGAC 6900
GCCCTCCCCC TTCCTGTACG TATTGCGGTT CAAGAATACC TGTCCTCAGA GGAGACCTCG 6960
GACAAAGAGG GTTATGCGCT CATTAGCAGC ATTGCAAACA ACGCCTCGCC AAAAGCGGTT 7020
GCTACTCAAC TCGAGCACAT TCTAAAAAAG CCGCTGCATA TTCCCAGAAA GTTTGAACGC 7080
AAGTCAGCTG CCGCACACGA ACGCGAGAAG TCTTCCCTTC CCTACATCGC GAAACACACG 7140
GTGCTTCCCC TGACGGCCAG CTCAGCGGCC ATACTCATTT TCATCCTTTC GCTTGCAGTC 7200
CTCTCCTGGC ACTTTCTGTA CAAACCCCTT CATGCGCACC TGAGCTACCG CGCAGGGTAC 7260
CATGCATTAG AACTGGACCG CTACGAAGAT GCACACACTA ACTTTGAACA CGCCAAACAG 7320
TACTGGAAGA TAAAACACTG GTACTTTCGT TATGCGCGTG CCTTACGTGA CAAAAAACAA 7380
TATACACGTG CTGAACAAAT TTACACCGAG TTACTCTTTG ATTTCCGGCA TCCCAAACAG 7440
GGGAGCATTG AATATGCGCA CATGCTCTGC AATGAGcTGC GCAAATACGA ACAGGCAGAA 7500
ACGACAtGCG TCGGCAGGGA CTCGACCATC ATCCAAATGA TCCTGATATC CTCAGCGCAC 7560
TCGGAGACGT ATATCTAGAG TGGGCAGAAG AGGACCCTGC TCAATACGAG CAGGCTCGAA 7620
AAACATACCA ATCACTCATC GCTTCCCACG GCACGCGCGA TGCGTATCTT GCACGCATGA 7680
TGCGCTATTT TATCAGAACA GATCAGCTCG CGCAGGTACT TCCTCTTAAG GCACACTTTA 7740
CCAATACGCG CGCTAGGATC GCTCCTGAAG ATTTGACAGA ACTCAGTGGA TACCTTTTAG 7800
AGAAACGCTA TGAATCTCAA CCCAGTGACT CCCTTACATT GCAGTCAAAG ATTGAGGATC 7860
TGCGCGCATT ACTTGAGCGG GCCTTTAAGG CGGATCCTAT GTCTGCGGAT GCGGCTTATT 7920
ACCTTGGAAA ATTCTTTGTC TACAATCACC GCAAGGACAG CGCGCGGGAA CTCCTTCAGC 7980
AAGCTGTCAA CCGTTACCCG CACATGCCAC ATTCCACAGT CAGGCGTACa CTGCGTGAAA 8040
TTGACGCGAT GCGCCTGCTC GGTACGTTAC TCCTGGAGGA AAAGGGACAC GCTGCTGCCC 8100
GCGAAATATT CACCCAGGCA CTTACGCGcT ATCGCAGCTA TATCGTAATG cGTGaCCTAC 8160
CGCCGcATCG GaCTATTGGA AAACTGTACC GTGaCTATGC AGATATGGAC TACTTTATCT 8220
ACAAAAACTA TGACTCTGCG TTGGAGCACT ACCAGCATGC GCGGGCGCAG TTACTTGATA 8280
CTCCTGAGGT TCAATACAAA ATAGGGTATA TTCAGCACAA AAAAAACAAC TACCCCGAAG 8340
CGATTCGGGC AATGAATGCA GCGTACGAGC ACAATCCTCA GGATAAGCAC CTTTTATATG 8400
GATTCGGCAC CCTGTTGTGT AAACGTGGTG ACTACTTTGC TTCCCAGGGG TACTACGAGC 8460
AGTTACTTGA ACTGTTAGAT GCGCAGcGTA CAAGACGCGG TGTCATGCTC CCCCACATAG 8520
AAAAGGCGGA CGCCGCGTTT GTTGATTTGT ACATGCGCAC GTGTAATAAC CTGGGCGTAG 8580
TATTGCACCG TTTGGCAACG ACTCATGGAG ATTCGCGGAA AAATGCACGG GCGTTAACTC 8640
TGTTTGCAGA ATCCTCTCGT GCATGGGACG CACTCACCCG TCACCCTGAA ACCAGGGTGC 8700
GCTCACAAGC TACCGGTCTT TCATACCTAA ACGTCCATCA CATGACACGC CCCTACACAG 8760
AGTTTCAGCC AGAACTGTAC GACGACATTC CTCTCCTACT TGAGCACGAA GAACCGCCCA 8820
TCCAAAAGGA ACAAGAGAAC TAGCCaACGG TGCCCGCTTG CCTGCATGAC CGAAACAGGG 8880
TAGTCTCCCC TGAGAGGAGG CGACTGATGG GAACGTACAT GTGTGATTTG TGTGGCTGGG 8940
GATACAATCC AGAGGTAGGG GATGCAGACG GGGGCATTCC CGCGGGTAtG CGTTTGAGAA 9000
CCTACCGGAC CACTGGGArT gTCCACTCTG TGGGGTGGAC AAGACAAGTT TTGTGAAAGT 9060
GTAGCTCTTC TGCCTAGAGG AAAGGGGAAC GATCCAGTGA AAAAAAAGGA CGCTTTCGTC 9120
GGTACGATCG GCTACGACGG TCAACGGGCA GTAGTGGACA GGGCCCGCGT GCTGAAGCAC 9180
AGCAGGAGTT CCCTGCAGGA ACTTCTCAGT GCGGGGGCCT TcCGAaGAAG gCGGCTGCCT 9240
GmGCCGTTTG GGAACGCTCG AmArAAGCAC TGGAGGCCGT CGCCTCCGCC TACAACGCCC 9300
GCTCAGGGAG CAGGTACAGC GCGCAGGACA TCGCAAAAGT TTTCGGCATT GCCTCCGAAC 9360
CAGGGGAAAA GGCGGTTGTC CTCTAGCCGC CTCCTCCTTT GCTGAAGATC CTGCACCCCC 9420
TCAGGCTTAG CCTGAGGGGG TGCAGGnTTT CCCACTACCA ACTTTCCTGG CGGATAACGT 9480
AATCGTGAAA CCTTCCCCTT TTCAGCGCGT GCTTTACCCT CTCTAGGnTC CGCGGGGCGT 9540
ATCCCCGGcA GCACCACGCG CACCGcGTgC gTCTGTTCGC TCATATGAAG GATCAAAGCC 9600
CGCCTCGCGC ACTTTAACTA CCAAGCGGAC CnATTCTCCT CACGAACAAA GGCTCCCAAC 9660
TGTATACGCC AAAGCACCCC CCGCGTTTCC CCCGAATGGG TTGGCGTATA CACCGACTTC 9720
ATCCCTGACA CCTTCCCGCC ACCAGCTGCG TAGGGAAGAG GCGCAgCAGC CGCATAGGAA 9780
GAAGACGAAG GAACAGGCTG CGCGTGCGAG TTAGGTGCAG CGGAGCCAGG CACGGCCGTC 9840
CCATAGGGAA CCCGACTGGG AGTCGAACCC GGTGCTGCGT ACGCGACAGG CGGCGCACCA 9900
TACTCCGAAG CGGGGACATC CGTTGTATTC GCCACTCCCG GCACACCCGG AGTTCCAGCC 9960
CTCCTTCCGA CAGGAGCTGG CGGTGGATTA TGAGGATCCG CA ACATGAC AGGCGCACTG 10020
GACGTGGGAG CAGTAGGCGG AACACCAAAG GAATCTTGAG GTAAGACACC AGGAGAAGTC 10080
TGCCTATCGT TGCGTTGCTG TGAAGCGTGC GCATTCGGAT CTGCCTTGTG TATGGAGACG 10140
CGCGCCACCC CCGCGTTCAG CATGTCTAAC GCAACAGCTG CAGCCTTTGA CACGTCAATC 10200
TCTCTATTTG CAGCGTAAGG TCCCCGATCA TTGATGCGCA CGATTACCTT TTTGCCGTTG 10260
TCCAAGTTCG TCAACTCCAC AACCGTACCA AAGGGAAGCG TGCGGTGCGC GGCAGTATAC 10320
GCGTTCATGT CAAAAATCTC CCCACTTGCG GTAGGTCTTC CGTTAAAAGA CTCCGCATAG 10380
TAGGAAGCAT ACCCTTCCGG AACGATTACC TCGCCGGCTG CAAAAAGCAT CTGCACGTTC 10440
CACAATACTG CAGCCACTGC AACGACACGC TTGTCCATCA TTCAACACTC CTTCCAAAGG 10500
CTTCACCCGA GCAAAACGAT GCTTCACAAG ACACCCCCGA CGCTTATCGG AATATGGACA 10560
AAAAGGTTGA AATCTTTTAA GGTAGGGGCG CAGTGGGTTG CTGGAGACGA GACTTGAACT 10620
CGTACGACCG CTGCCGGTCA AGGGATTTTA AGTCCCTGAT GTCTACCAAT TCCATCACTC 10680
CAGCGTTGTG CGGCCGTGCT GCCACTGTAG CGGGTAAGTA GCCGGGAGGT CAACATATAC 10740
AGTGATTCGC ACTGCCACCT TGCGCTGCTT GTAAAGCAAA GTGAGGATCC TCAATCCCTC 10800
TTCAGACAGC TGGACCTTGC ACGCTTCCCG TTTCTCATGG AGGTCGGTAC GCGCGCAGGG 10860
GATTATCAGG AAAGAAAGGC TCTGCTCCTG CAGGCCTGCG CGGGGCGCTC GCTTCCTTCT 10920
TGTCTGCACT TCTCTGTAGG GGTCCGGCCT GCGCCGGAGC CCATTGCGCA TCCTGAGACA 10980
GCCCTTTCAC TGCTTCGGTC AGACGTGCGT GCCCTGTGCG CAGAACAGGC GCCCTACCGA 11040
GCCTTGGGTG AATGCGGTCT TGACCGACAC TGGAATGGCC CTCAGGTAGC GTGCAAAGCA 11100
cGGAAAGGAT CTGGTGTGCG CGGTACACCA GATCTTGATG CAGAGGAGTA TCTTTTTAAG 11160
GCACAGCTCT CTATAGCGAA AGcTcAGAAC CTGCCGCTCA TCATTCATTC ACGGGACGCT 11220
TTTGAACCGA CACTCCGTTG CCTGGACTCA GTGGGGTGGA GAAAGGGTGT GATGCATTGT 11280
TTCTCGTACG GATCGTTGAG GCACACGCTT TTTTAGAACG TGGTTTGTAC ATCTCTTGTG 11340
CAGGCACACT TACGTACGCA AAGACGACAT CCGAACTTCT CGCGCGCGAT GCGCTTTATT 11400
CGGAGTATCC CTCTGGATCG TCTATTGTTA GAAACGGACA CTCCCTACCT CGCTCCAGTA 11460
CCGCATCGAG GAACACACAA CAGACCCGAG TATGTCCGAC A ACCTACGC GTTGGT 11516
(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2450 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
CACGCATGGG CGCAGACATt GGgTtCATTG GAyTTGCTGT CATGGgAGAG AATCTGgTTC 60
TCAACATgAG CGCAACGkTT TTTCCkTCGC AGTTTTCAAT CGCACCAcCA mGGTGGTCGA 120
CCGATTTCTT GCAGGGCGCG CTCATGGCAA GCGAATCACC GGCGCCCaCT CCATTGCAGA 180
ACTTGTTTCA CTTTTGGCAC GTCCACGCAA AATCATGCTC ATGGTCAAAG CAGGCAGCGC 240
AGTCGATGCG GTCATTGACC AGATACTGCC CCTTCTAGAA AAGGGGGACC TCGTTATCGA 300
CGGTGGCAAC TCTCATTACC AGGATACCAT CCGGCGCATG CATGCGCTAG AGGCCGCAGg 360
TATTCATTTC ATTGGCACAG GAGTTTCGGG GGGAGAAGAG GGGGCCCTCC GTGGACCGTC 420
CCTCATGCCT GGAGGCTCTG CTCAGGCTTG GCCGTTGGTT TCTCCCATTT TCTGTGCCAT 480
TGCCGCCAAA GCCGACGATG GCACCCCGTG CTGCGACTGG GTCGGCAGTG ATGGCGCCGG 540
GCTACGTGAA AATGATTCAC AACGGCATTG AGTACGGCGA CATGCAGATA ATCGCCGAGG 600
GCTACTGGTT TATGAAGCAT GCGCTGGGCA TGAGCTATGA GCACATGCAC CATACGTTTA 660
CCCGCTGGAA CACGGGCCcG CTTACACTCG TACCTGATTG AGATTACCGC GGCTATTCTG 720
GCACATCAGG ACACAGACGG CACACCACTT TTAGAGAAAA TTCTAGATGC CGCTGGACAG 780
AAGGGGACGG GCAGGTGGAC GTGTGTTGCA GCGCTCGAAG AAGGCAGCCC GCTTACACTG 840
ATCACAGAGT CAGTGATGGC GCGTAGTCTT TCTGCGCAAA AGCAAGCGCG CTGCAAGGCA 900
CATCGCGTTT TTGGTTCTCC CGTGAAAGTC TCCAAAGCAG AAACGCTAAG TGCACAGCAG 960
CGCGAAGAAC TGGTGTCTGC ACTGGAAGAC GCGCTGTATT GCGCGAAAAT AGTCTCGTAT 1020
GCGCAGGGTT TTGAGCTGTT ATCGCATACG GCAAAGCGCC GAGGATGGAC ACTGGaTTTT 1080
TCCCGGaTTG CATCGCTGTG GCGTGGCGGG TG ATTATTC GTTCAGGATT CCTGTCCAAG 1140
ATCAGTGCGG CGTTTGCTCA GCAGCACGAT CTAGAGAATT TGGTACTTGC TCCCTTTTTC 1200
GCAGAGGrAT TAAAGCGTGC GTGTCCAGGC TGGCGCACCA TAGTGGCAGA ATCGGTACGG 1260
CAGGCGTTGC CAGTTCCGGC CCTCTCTGCT GCGTTACCTG GTTTGATGGG TTCACCGGTG 1320
CTGCTTTGCC GGCCAACCTC CTTCAGGCAC AGCGAGATTA TTTTGGTGCG CACACCTACG 1380
AGCGCACAGA TGCGCCGAGA GGAGAGTTTT TTCACACAAA CTGGACAGGC ACCGGCGGTG 1440
ATACCATTGC AGGAACCTAC TCAATATAGG GGATCCTCCC GTCGCTTGCC TTTCGTTCTA 1500
TATTTATATT CCCAGGTGAT CTTGACACCA CCTCGGGTGC TGCGCTAgcA TGCGCCCGTC 1560
CGGCGGATGT TGTATAACGG CTATTACCCC AGCCTTCCAA GCTGGAGACG TGGGTTCGAC 1620
TCCCATCATC CGCTTTCCTC CCTACCTCGT TGATTTTTCT GTTCTATACG CGCTACACTC 1680
GCCCCTCGGA GGGGTAGGGT GCATTCGGGG CAGCAATTaC TtGAAAAGAA CAGTATTATT 1740
ATCAGCGGTC TTCCCCCCTG GGCGCAGGAG TTGTCCAAGA AGTATTGCTC TAAAACGGTC 1800
AATcTGTAtT CGTACATGGC AATATCCGTG ACTTCCTCCC CCATCGCGAT ATTCAGGGCA 1860
GCTTCTCcTT CGTTCGCATT AATGACTACA TATCTGAGGT TATCTTCGGT AACCGGGrTA 1920
TCATCGTCTT CTATGACAAA TCTGCAGGGC TCACATTTTG TCTACAAGAA ATGCTGAGCG 1980
CTTACTTAGA GCGTATGcAT GCCCAGTATC CTACTGAGGC ACTTGCTGAC TTTCTTTCGC 2040
GTGATCCGGT GAAAGCTTTT GCGTACCTTG AGCGCTACTT TATTATGAAC ATGAAACAGA 2100
ATAAGCGTAT GGTCCTCATC ATCGACTATT CTGAATCTCT CGTTCCCTCA GAAGATATTG 2160
CAAACTTAAG CGAAACAGAT CGCTATTGCT TCGTCACCCT CAATCGCTGG GCAAATGATC 2220
CGGTGTTCAC AAACGAAGAC ATATCCGTTG TGATGCTCAC GGAGAATATC ACTGACATCA 2280
ACAGTCGGTT CACCGCTTCT CCTTCCACCG TTAAGATTCA CATACCCCTG CCAAATGAAG 2340
AAACACGGAT ACGCTTTCTT GAATATCTCA AAACCCAGGA GGAGATTTTA GTACTTGAAC 2400
GTGGGTTGAA TACGGAGAAA ATTGGCAAAC TCACTTCCgG TTTGAATTTA 2450
(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6426 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
AGCCTTTTGC TGCCGTCAGA GGACGTGCGA ACTTCATAGG TGCCCGTTTG GAAATTACGT 60
ACCACACGTG AACGACCGGT GGTGCAGCTG TAAACTGCAG CACCACGCTG TACGTACTGA 120
GGGATTTCGT ACGTCACACC GTCATCAGCG GTAAAATACG CnCGTGCGCT GCAAATGCCA 180
GCGCGTCATC ACGCCAGTAC CGAAAGCGGT AAAAATTCGC CTCGGTGAAG AAAAAGGCAT 240
TCAGTTCTCC AGGCGCCTGG CTAGAATACG CACGCAACGC CTCAGCAGAC CCTTCCTCAA 300
TCCAAAAACC AAAAAGAGGA TCGTCTGTTC TCACGGGAAT CTGCTCGGAC GAGACGGACC 360
CGGAACGGGT ACCATACGAT ACCCGCTGAA AAAAAGAGCA GAACAGCGCG TCCCCCCAGC 420
GACACAGAAC GAGCGGyTGC ACACGCACAC GCCCCTCAAA CGACAAGACA AGATCCACCC 480
GTTCTGCCTT TTGCGCCGCA GGGGGCGGCA GCACATCAGC ACGAGCATCA GACAAAGATG 540
AAAGGGAAAA GACAGCCACG CGTTCATACA CGTACGCATA ATACGGCTTG TGCACAAGCA 600
TGAGCGAAAA ACGCATCGCA TCAGCAGTAG GAGCGATAgC AACGATAcGC GTGGCGTTCT 660
CCCACACCCC TGCCAAACGC GCCAGAGCAG GCTCAGCTGC CAACCCCACC TGAGTACTGA 720
GCACGCACAG ACGCACCCAC GCGCTCCACC CTTTTCCTGT CATACCGCGT AACCGGTCAC 780
TCAAGCGGCG CAAAATACCC TGTCTCGTCC CGCCCCGAAC GGCTCAAGAC GCGCACTTCT 840
GCCTCCACCG GCATATATGC ACGACCAAAA TCCGTCGGCA ACCGCGATCG ATACGACAAT 900
GCATAGGTGc CCCGctGCGC CTGAGCGCAC GTGCGCCACA AGAGCCCCTA ACCCTTCTTG 960
CGCATACACT GAATACACCG CACCCCTTGT CTTTTGAACC AGATATCCAA GCTCCACGGG 1020
CAATACCGTA CGCTGAAGCT GTATTACGTA AAAGCTCGTT TCATTATTCG TCAAATATGC 1080
AGCCAAGTCC GTCACCCCAT ATTGTGCAAA ATCCTGTGTC TGCAATTCAC CAAGAGACAA 1140
GAATACCACT GCGCGCTTGT AATCTGCGTT CACCAGCGTG CCTGCCGCAA GACGCAGACC 1200
CAGATCAAAA CGCCACTGCG AAGAACTCTT TGCTCTTAGG CGAAGcGGTT GCGcCTGCAG 1260
tGCGCCGCAG TGAAGGTGCC TTCCAGCACT GGTGACTGTG CCGCAGAAAC TACCGACAGC 1320
GTCCCCTGTT CCCCCACCGC TTGTGCAAGC TCACCAATCA CCCGTGTAAC CAAGCGCATC 1380
TCCTGCTCCG TCGCAGGAGA GCGGTCCACC AGTATACTGA GCGAACAcGT ATCGTTCAAa 1440
TACGCTGCCC CcTGCAGACG CATCTCACTT ACCGGCCGGT GTTCTTCGGT AAGAAAAAAG 1500
TTGGAAACGT CCAATCCCAC TACCGGCGTC CCCTCACGCG TGTGTACCGA CACATTCACA 1560
GTAACGGAGG GAAACCGGTC GGCGTGCACC CGCTCAAAAT GCACAAACAG ACCGCCGGCA 1620
AGCTCAGAGA TACGCGAAAC AATCTCAATC CTTTCGTTTT TATAATCGGC AAGGAGCACA 1680
TTGCCATTTG CATCCGGAAC TGCCGCCGTT AAGCGAATAG GCGCATTTCC CAAACGGGCA 1740
ATCGTGTGCA AGGACGCGAG TCCTACATCC ACCACCATGA CCTCATTCGG GAGAGACACC 1800
AACAAGCGTC CATTCCACGC CCGCACAGAT TCAACGTGCT TGAGCGTCCC TTCCGCAACG 1860
AGGGTACGCA CATAATTTCC TGCCGTATCA AACACGTAAA TGGCGCCCTT CAGAGCATCT 1920
GCCACGTACA CTAGCTCATC GAGAATGGCA ATGCCGCCCG GAGCAGAAAA CCCAAAGAAG 1980
CGCGCAGACT TCTGCCCAAA ATGGAAGAGA GGAGCACCAT CAGGTGCAAA CACCGCCACA 2040
CGCGCATTCC CAAAATCTGT CACGTAAATG TTATCGTAGC GATCAGTGGC CAAAAACTGG 2100
GGGCCGATGA GTTGTCCGAC GCCTCTCCCC TTTCCCCCAA ATGACTTGAG GAACCTGCCT 2160
TCCTTCGTAA GACGACAAAT GCGATCAGAG GCAAATTCAG AAACGAGCAG ATCGCCTGAG 2220
CGCGTCTGAA TAACATCAAA AGGACGGTCA AAACCCTCAA CGGGCCCACG CGTACGcgCA 2280
ATAACACGTC CGTTCACGTC AAAGCGAAgc AGCTCGTTAG AACCGTATGC GCTCATCCAA 2340
AACGTACCGT cAGcTAACGC ACACAAAGAT AGTGGTCTGC GGAAAAGAAC CGTTCCCCGG 2400
CGTACAGCAT GAAACGATTC ACTTTCGCTA AAGTGCAGCG CGTCTGCTGA ATCAGGCGCA 2460
AAGTCACGCC GCTGCTGAAC CACTTCTATC TTGTTCCGAA GCAACGCGCC GCCGTAGCCT 2520
AGATCCCGCG CCGCGCCCCA CTGGTGcAGC GctGCGCCTT CAATCCCACT GCGGTAGTAC 2580
GCATTCCCCA ACCACTCAAG AATGAGCGGA TTACGGGGAG CAGCAGAAAG CGCACGCTCA 2640
AACAGCTGGA TAGCATCATT GAACGCACCC CGGTAATAGG CCAAAACTCC GCGGCGAAAC 2700
TCCCCTGCTG CAAGTGCTGT ATCACGCACA ACCGGTGGCG CATGCTCCTG CGCCCCCACT 2760
GCAAAAAGCA ACAGCAGCGC GCCTGCGCAC CCCACAGATC GTCTACTCAA ACACCAACCC 2820
CCTCTCACTG CCTTTCAGCG CAGTCTCTTC TTTCTCCAGA AAGCTCACAA AAGGTGCACA 2880
AACAAAAAGC AGAGAGAAAA AAGGAGCACG CAGGCCAAGA CAAAGAGACT ACCTCGAACA 2940
GACGCACACC ACGCCCTATC CTCAGTACGA GCAACAAGCC TGGAACGCAA AATCCGGCAA 3000
CGGCAACACA GGAGGCATTG AAACCGGCTG CGCATACACA AGCGTAATCG CAATGTCACC 3060
ACGCATTACA ACCGCCCAAT CGCTGTCCTG CACCGCGCCT GTATGCGCCG CTCCACCCTG 3120
GTACTCAAAC ACGTGAGCAA GGCCCGCCTG GCGCACCGCC CCACGTTCCT CATATATCCG 3180
ACGcGCAACG CGCGCtAACA AGCACGCGCA TACGCCTGTG CCGATCGCGC GTTTATGTTC 3240
ACCAACACGT CCTCGGAGGA AAGTGCAGCC AACGCCCGCA CGTGACGCGC CACAAAGCGG 3300
GCAAGCGCCT GGAAACGGCA GCCTGTTCCA AAATCAGAAA CGGGCAACAG GCTCGCTGCA 3360
ATCCCCGCAA TCCCATACAT CACCGCCTGC CGTGCTGCGG CAACCGTTCC CGAGAACACA 3420
ATATCAGTCC CCAGATTCTC CCCTTCGTTA ATTCCTGACA CCACCACATC CGGCGGTGTA 3480
CCCACGCACA CCTGGCGTAA CGCGCGATTC ACACAATCCA CCGGCGTCCC TGAGCACGAC 3540
CAAATACCTG GCTCCACTTC CTTTACGGTC ACCGGCTCGA GCGTAGTAAT CCCATGCGAA 3600
ACTGCAGAAC GATCTCTGTC CGGCGCAACT ACCGTCACCT CATACCCCTC AGGCGCTGtT 3660
TCAGCGCCGC aTGCAGCGCG CGAATGCCTG CTGCCTGATA CCCATCATCG TTTGTCAGTA 3720
GTATCCTCAT AACACCCGGG CCCCTTCAGA GCACTGTACC TCATACGCCG CTGCTTTGAA 3780
ACCGAAGATG CGCTCGTACT CGTCGAGCCT TTCTAGGTAC GGCTCAAAGT CTTGATCGCG 3840
CAAAATCGCA TAGGTGCACC CACCAAAGCC CCGACCCGTG AGGCGCGAGC AGACCACATC 3900
CGGCGCATCA GGATCTACAA ACTCAAGCGC ACGCTTCACC AACCAATCGA GTTCTGGACA 3960
AGAAATTTCA AAGCGGTCCC GCAGGCGCTc ATGAGAGCGG TTCACTACTC TTGAGAACGC 4020
AGCAAAATCC CGCTTACGCA GGGCTTCAAT CGCCTCATCA ACGCCCAGCG ACTCGCGCAC 4080
CAAACTGATC ACTCGCCTCC GTATTCCCTC AGGCACATCT ATTTCCTCCA ACGCTGCTGC 4140
CaTGAGCTTA GACATAGCGC GAGGCATATC GGGATTGCGC TTCACCAATT CATAAGCATC 4200
CACGCAACGC TTCAAACGCG CGGTGAACTC CTCACGCGCG ATGAAACGGG GAACACGCGA 4260
GTCAGTAAGC ACAATACGCT TCCCCTCCGA GGGAAATTGA CACAGTTCCG CCTGCTTCTT 4320
GCGGTGATCA GTGCGCACGC AGCTACCCTG CTTTGCAAAC AACACGCACA GAATATCCGC 4380
GCGATGTGCG TGGGTCTTGA GATAGCGCTC ATTTGCGTGT TCCACGATCG AAACAACACT 4440
TTCCTTTGGC AGCGTAGcGG CAAACAACCT TCCAAGCACA AGGGCCATGG CAACCTTCAG 4500
CGCATTGGGA GTACCCAGCC CCGCATCAGG AGGAATCTGA GAAAGGATAG TGCAGTTCAA 4560
CCCCGTCAGG TGATACCCAC CATCCATGAA GGAGAGAATG ACCGCCTTTA CCGAATTAGC 4620
CCAGCGATCC TCCTTACGAT AGCGTAAATT AGCGGTGGAA ATCTTCCTCC GCTCCCCAAG 4680
CGTTAAGGAG AAAAGGCGAA AGGTGCTATC CTTTCGGCGC GAGACACACA GCGTAAGGGT 4740
TTGATCGATA GCCATCGACA GGGTGTTGCC CtGAGCAAAC CACAGATACT CCCCCAACAG 4800
GTGAAAACGA CCCGGAACGA CTGCAATCGC CTCAGGCTCG TCGCCGTACT CCTCTGTGTG 4860
GCAGGACTCT AyCcCGTGCA TGCGCAGCAT CATAGcCAGT GTATTGAAAT AATACAACAA 4920
AAATGCTTTT CTGGCAGGGG AAAGTTATGC TTTGCACAGC GCCTCTTGTT TCAAGCGCCG 4980
CCTCGGCGGT GCTCTTGGCA TTTGCGATTC CCAACGAGTT TTGGCTCGCC GGTTCCTCCG 5040
TGCTAGGGTT GGGGGCGCTT GTTCCCTTGT ACGTTGGATT CCTCCTCTCC CCTGCAAAAA 5100
AACACGTTGC CTGTTCTTAT GGGCTGTTCG TCGCACTCGT GCACGCGTGT TCTAGCTTTT 5160
GGCTCAAAAA CTTTCAGGGC TTCGCGCTCT TCACCCTCGG CGCATCAACT GTCGGTTACT 5220
TCTTCTATGC GCTTCCTTTC GGCGTAgcGT tCGCATGCAT CCTGCGCAAg CaGGCgCCCG 5280
CGCGTGCCTG CGCTTTTGCG CTCGTGTGGA CCCTCTGGGA ATGGGTAAAG TCAACCGGTA 5340
TACTCGCCTA CCCGTGGGGT ACGGTCCCTA TGACCGCGCA CAGCCTCTCG CACCTCATAC 5400
AGATAGCTGA TATCACCGGC GTCTGGGGGC TTTCCTTCCT CATCCCGCTC GCAAACGCGT 5460
GCGTTGCAGA AAGTCTCCAC TTCTTCATAA AAAAGAGAGA CAGCGTCCCT GTGTTCCGTC 5520
TCTGGCTCCT CACCGGCTGC TTGTACTGCC TGTGCAGTCT CTACGGTGCC TACCGCATCG 5580
CCACCCTTGG GGCTCCACGT ACCACGCTCG CGTTGGCAAT CGTACAGCAA AATGCAGATC 5640
CGTGGGATAC AACTTCCTTC GAAAAAAACC TCACCACCGC TATACATCTG ACTGAGACAG 5700
CCCTTCGTAC GCAAACAGCT CCCCCCCTGC CGACTACTCC CTACAGAAAA GAAAAAACAC 5760
TCACACACGC TTCTGCGCgC GCACCTGTCG ACATGGTGGT TTGGAGCGAG TCTAGTCTGC 5820
GCTATCCGTA CGAACAGTAC CGTCACGTGT ATAACGCATT GCCAGCGGcA CGACCTTTCT 5880
CGGCGTTCTT GCGCAcGCTC GGCGCGCCCC TTCTGGTGGG AACCCCCTTG AGACTGTCTG 5940
GTAACTCCAC TAAAGGTGGA TACGCCAATG CAGTGGCCTT GcTCCGCCCA GACGGGCACG 6000
TGGCGCAGGT ATATGGCAAA ATGCAGATGG TGCCATTTGC AGAATTCATT CCCTGGGGAC 6060
ACATGACATC TGTACAAAGA CTGGCGCAGA TGCTCGCCGG CTTTTCCGAA AGCTGGACGC 6120
CAGGGCCAGG GCCGCGCTTG TTTCATGTGC CGTGCGCCGC AGAGGCAGCG TGCGCTTCGC 6180
AACTCCCATC TGTTACGAAG ATGCCTTTCC TTCCCTCTGC GCCGCTTTGC ACACACAGGG 6240
GAGTGAGCTC CTTATTAATC TTACGAACGA CTCTTGGTCA AAAACTGCCA GCGCAGAGTG 6300
GCAGCACTAT GTTGTCTCTC TTTTTCGGGG CATAGAGCTG CGTACCAACC TCGTGCGCTC 6360
TACAAAnTCT GGCTATACCG TCGTCATCGG nCCAGAGGGA AAAAnGCGCG CCGGTTTTCC 6420
GTTGTT 6426
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2190 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
TGTGCGCAAC AGACAAACAC GTCCGGCAGG ACGTACTTCC ACAAGnAAGC GTTCCGTCAC 60
GCCACAGGGG TAGGCACCAG GACGCGCCAC GTAGATTGCA CTCACTCCTT GCTTTTCAGA 120
GGAAGGAGGT GATCCTGCAT CCTGTTTCTT TGTCTCACGT GCTGTGTCCG ACGCATGTAT 180
GCAGTGGAAC GAAAACTCAC GTTAAGGGAT TTTGGTCATG AGATTATCAA AAAGGATCTT 240
CACCTAGATC CTTTTAAATT AAAAATGAAG TTTTAAATCA ATCTAAAG A TrTaTGrGTa 300
AACTTGGTCT GACAGTTACC AATGCTTAAT CAGTGAGGCA CCTATCTCAG CGATCTGTCT 360
ATTTCGTTCA TCCATAGTTG CCTGACTCCC CGTCGTGTAG ATAACTACGA TACGGGAGGG 420
CTTACCATCT GGCCCCAGTG CTGCAATGAT ACCGCGAGAC CCACGCTCAC CGGCTCCAGA 480
TTTATCAGCA ATAAACCAGC CAGCCGGAAG GcCGAGCGCA GAAGTGGTCC TGCAACTTTA 540
TCCGCCTCCA TCCAGTCTAT TAATTGTTGC CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT 600
AATAGTTTGC GCAACGTTGT TGCCATTGCT ACAGGCATCG TGGTGTCACG CTCGTCGTTT 660
GGTATGGCTT CATTCAGCTC CGGTTCCCAA CGATCAAGGC GAGTTACATG ATCCCCCATG 720
TTGTGCAAAA AAGCGGTTAG CTCcTTCGGT CCTCCGATCG TTGTCAGAAG TAAGTTGGCC 780
GCAGTGTTAT CACTCATGGT TATGGCAGCA CTGCATAATT CTCTTACTGT CATGCCATCC 840
GTAAGATTCG CACTTCTAAG GCGTTCCAGA CTTCCCTTTC CCAAACTTTC TCTCAGGTTG 900
GCCTCAGTGG GCTCCAATCT GGGGCAGAAA AACCAGTACG AATGnATCCG ACACAAACCA 960
GTCTAACGAG CCGGATGATG CGTCACAAAG GATGGAGCAC AAAAGGGAAA CGTTGGAGTG 1020
ACAGAACAGC ATGGCAAAAA CGCGCAGGCG TTGGGTCGGA GCCAGAGAAC TGCGGTCGCA 1080
TTAGCnCCTA ATTTTGCAGA ACTCTGTGGC AGCCAGTACG GGAGATAGGA AAGTTGCTCA 1140
ATTCGCAAAC AGCACTTTTT TCTGACATTC CCAGCCTGTG GCCCATAAAG GGAGGCGTAG 1200
TCACATTTCC ATGGCATTTG GCAAGAACCG ACATCCATTT ACAGGGCAGT GGTATGTACA 1260
CAAGGGTATT GATCTATCCA CTCACCGTTC AGGGGATCCT ATCGTTGCCA CTGCAGACGG 1320
ACATGTGGTG ACGGTAGAAT ACGATTCGGG TTGGGGAAAC TACGTTATTA TCAAGCACAA 1380
ACATGGGTTT TATACCCgcT ACGCGCACAT GCAATCCTAC ACCGTCACCC GTGGGCAGCA 1440
CATCCGACAA GGACAAATCA TCGGTTATAT CGGCGCCACG GGTGTAGCGA CTGGTCCACA 1500
TCTGCACTAT GAAATACATA TCGGCTCTGA CGTTGTCGAT CCTGGTAAAT ACCTCAACGT 1560
CAAAACTGCA GGGGCAGGAT AGTGTCTCAA CAGGATGGAA TACATGGCAA AGATTGAGCG 1620
TCGCTCCATG AACACGCTTA TTGGTGCAGG CTCCCGTATC AGCGGGAACG TTGTTGTCCC 1680
CGGTTCAGTT CGCATTGAAG GGGATGTCGA TGGGGACGTT ATCACTACAG GGCACGTGGT 1740
AATCGGGAAG CGngcGcGTG TCCGCGGCGT CATACGGGTA GGGAGCATCA TCGTAGGAGG 1800
AATGGTTGAA GGAGATATCG TTGCGTCAGA GGCGGTGCAG GTGCTCCCTT CTGGAGTTAT 1860
TCTGGGCGCA TGCTTACCCG AAAAATTGTG GTGGACGAGC AAGCTTTTTT GGATGGTTTT 1920
TGCTATGCAG TGGCAGATCA AGAGGGATTC AACAAAGTGC TCAAGGCCTA TCTCGGTCGT 1980
AAAAGTATTC ATACGTCTGC GTTTgGATAC AACAAGTACA GCAAGTCAGG ATAAAGCGGa 2040
TGGGATATCG CGTAGGAAAT TCTGACTCTA CGTCTTTACT GTCCGCATTC GCTCCTCCTG 2100
AGAGAGCCAA AAAAAAGTCA AAAGAAAAAC GGCCCCTGCA GGCTGCGCGC TTTCTCTCCC 2160
TCCTATATCC TAAGACGGAn CCGCACTCTG 2190
(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6570 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
CTCCGTATAG AGGGCCTGAG TATAGGCACG CCCCACAGGG ATTGTCAACG TCTTATGCAG 60
AGCACACACA GGCGCGTCCT TCCTCTTTTT GGGAGGATCC GCTATGCTCA GAGCCATGCG 120
CACCATTCCC CCTATCACCG GCATTATTGC CTCAGCAGGA TGCGCCATCG GTCCAGTCTT 180
TTGCTTCGAT ACCCTGCTAC CTACCCGCTC CCGCCCTGGC GGATCCCGCC TGCGCCCCCC 240
ATCGCAGGAA ATTGCCCGCT TGCGCAACGC ACTTTCATAT GCGCGCGCCT CGCTGCAGAA 300
CCTGCTCGAT TCAGTGCGCG CAAAAGCATC CGGAGACGAG CCTGCACCCG AGTATGCCGT 360
GCTCAGTGGC CAAGCAGAAA TGCTGGCCGA CGCGGCCTTC ATAGCTACCG TAGAGGAAAC 420
GCTGCGCTCT TGTTCCTGCG ATGCAGAAAC TGCTTTGCGC AAGGCAATTA CCCACGTGAC 480
AGATGCCCTC TCTGCTACCT CAGACGAGTA CCTGCGTGCC CGGGCAGCCG ATATCCGAGA 540
CGCGTTTAGG GTGTCTTCGA CGCACTTGCG CATGACACCA CACCCACCGC AGnAAGCTCT 600
TTGCCAACAC AAGGGATTGG AAATAGCACC CCACACTCCC CCTGGGAGCC TGACTTTAGC 660
GCCGTTCCCC CAGGATCCAT CGTGGTTGCC GCTCACGTAC AACCTGCGCA CGCACTGCGC 720
CTGCACGAGG CAAATATCGC TGGTTTGGTA ACCGAAGTGG sCAGCGTAAC AAGCCATGTC 780
GCCATCATGG CGCGCGCGTG GAGTCTTCCC CTGCTCGTCA GTGCACAGGG ATGTAAAGAC 840
GTTGCACAGT ACGTGCTCCG TGTGCGGCAA ACTGCTCGTG CCACCGATGA GGCGCTGCGC 900
GCACTCCTCG ATGCTGAAAs AGTGGGGGAA AAACTGACGC TCTAGGAACC CTCACCGTAA 960
ATCCCGACGT GCGCGCGCTG CgCACrCGCA TGCCTcACCC TTTCCTCACC GTCAAACACA 1020
CCAGTACAGC TGAACAGAGT CCCCCGGCCG CCTGTGTGCT AAACGCACCG CTGCGCACTT 1080
ACTCAAGTGA CGGTATCCGT TTTGAAGTCG GGGCAAATAT CGTTATGCCC CAGGAAGCGT 1140
GTGCAGCTGC TGCGCTCGGA GCAGCAGGCA TCGGACTGTT CCGTTCGGAG TTCTTGCTAT 1200
TCGGATCCGA CCGCTTCCCA GATGAAGAGA CGCAgTGCTC TGCCTACACG CGCGCGCTGC 1260
AGGCAATGAG AGGACTCCCC GTCGTGCTTC GAACGTTTGA CCTTGGTGCA GACAAACTGG 1320
TGCCAGACCC TGCGCGAATG TGCGCACTCT CGGACGCTGC TGAACCGTGT GCACACACCG 1380
CTTCGGAGCG CAATCCTCTT TTAGGGTTAC GAGGCATCCG CTACTGCCTC GCACATCCTG 1440
AGCTCCTGAA AGTGCAGCTT CGTGCAATGt CCGCGCCGGA rCkTGCGCAA CATGTGCAGA 1500
AGGGnACTGC GCATTCTCAT CCCCATGGTT TCACGGGTGG AAGAAATTCA CGCCGTCGCC 1560
GACCTCATCT CTGAGGTAGC CGAcGAgTGT GCCCGCGCGC ACGTGAGTAC ACCCGATCGG 1620
GTAGCACTCG GCATTATGAT CGAAACGCCC GCTTCGGCAC TGATGGCAGC AGAtTCGCTC 1680
CCCACGTGGA TTTTTTTTCC ATAGGGACGA ACGACTTAAC CCAGTACGTG TTCGCCGCCG 1740
ATCGAGAAAA CGAACAGGTC AGCAGCTATG CCGATTACTT CCACCCGGCA CTCCTCCGTC 1800
TTATCCAGCA CGTAATACAT GCGCACAGAC ATCTGCGGCA ACGTCCCGGT ATTTCTTTTG 1860
GAGAACAGGG AATCGGACGC GTGGTCATGT GCGGCGCCAT GGCTGAAGAT GAAAtGCGCT 1920
CTTTCTTCTG GCGGGGCTCG GCCTGCGAGC GTTGAGTGTG CCTTCTTCAC GCATCGAGAC 1980
GCTGCACACG TTCTTATCAC GCATTTCAGT CTCTGATGCA GAGCACTGTG CACGTGCAGC 2040
CGTGCAGCTT TCAGATGCGC AGTCAGTCCG CACACTCATC GAAGAACATC TGCGCACCGC 2100
AGGTATTACG CTTGAGAAAG ACGAGGAAGA ACCCTCACCC CCTCGATCCC CATAGCGGAG 2160
GAGGCCTCAG GCGTTTTCTC CATACACAGA AAGGAAAAGG CAATGGAAAT CGAAGAATTT 2220
GGTCCACAAA TCACCGCCCT CGAGGCGCGC GTGCAGGAAG TATGGGGGAG TCTTTGACGT 2280
TGCCGCATAC GAGGCGCGCA TAGCAACGCT TGAgGCTGCT GCAGCAGCGC CTGACTTTTG 2340
GAGCGAACGC GCGCGTGCCG AAGCGCTGTT AGCGGAACTG AAAAAACTAC GCGCAACGCT 2400
TGAGCCGTGG CGTTGCGcTG CGCCGTGAGA GCGCAGATCT GCGCGCGTTG TACGAGCTTG 2460
CCCGCGAGGC GCAAGACGCA TCGCTGGAGC CAGAACTTTC CTCCCTTTTT TCAGACATTT 2520
CTGCTCGTTT CGAAGAGGCA TCGCTTACCC GTCTCCTGCA CGAAGAGGTA GACCGCCTCG 2580
ACGCGTTTGT TACCATCCAC TCCGGCGCAG GAGGAGTGGA GGCCTGCGAC TGGGCACAGA 2640
TGCTCATGCG CATGTACACG CGCTGGGCAG AGCGGCGCAG CTTTTGCGTA CACATAGTTG 2700
ACTTACTTGA GTCAGAAGGG GGAGTAAAAT CGGTGACGTT AAAAATTTGC GGGTCACACG 2760
CCTTTGGTTT TCTCAAGGGA GAAACGGGGG TACACCGGCT CGTGCGCATC AGTCCGTTTG 2820
ACTCTGCCGC GCGCAGACAT ACCTCTTTTA CCTCCACCTA CGTCTTCCCC GTATTAGACG 2880
ATCACGTTGA GGTGCACATA CGGAGCGAAG ACATGCGGGT AGATACCTAC CGCTCAGGGG 2940
gAGCAGGCGG TCAACATGTC AATAAAACGG ACTCTGCCGT GCGCATCACG CATCTGCCTA 3000
CAGGgATAGT AGTCACCTGC CAGAACGAGC GCACCAAATC AGCAACCGTG CAAgGCGCTG 3060
AGCTTGTTAC GCGCCCGCCT GTACGCCTAT GAACGGCAAA AAAAACAGCA GGAACATCAA 3120
CGGTTTGCTT CTGAAAAGAA GGATATTTCG TGGGGAAATC AGATTCGCTC GTACGTCTTT 3180
CATCCCTACA CCATGGTTAA AGATCACCGC AGCAAGTGCG AAACGGGGAA TATTCACGCA 3240
TCATGGACGG AGCGTTAGAA CCGTTCATCC GTTCCTACTT GGAGTTTCTG TGTACCAGTA 3300
CCCAGTGTGT AGAACCACAG TGAACGGGAG TTACGCGCAA TCATTTGCAG CACTGCTTTT 3360
CTTTCCCCAA ATCGCGGTCG gTTTAGTGCA AnGGCACCGG cGCCGTCCTT TGACTCTTTC 3420
CTGTCCGCGC GTCAGTCcAC CCTCCTGCCT CTTCCTTTCT AGCATCACCT GCAGCGCCGA 3480
CACACCCTCT TTTCGCAACC GCCCGTACGA CTGCTGCGCA cTGCTGtCTA CGCCTCCTGC 3540
nCCCGCCCGT CCCCGCCCGC GCCCCGATCA CGTAATCCCC AAGGAAAAGT GGCGCCTTGC 3600
CGTTGCAGAC TTTACCTTTC ACGGTATTCC AAAGATTTTT CAGCGCTACG TGCGTCCTGC 3660
GCGGGAGctA CTCTTTATTG AACTAAAAAA ATTACCCCTC CGTCATTTTC TTTCTGAAGC 3720
TGAACAGCGC GAGcGCGCCG CCTTGCCCCA CGAAGAAGCC TACCACGCCC GGCTCAAAGA 3780
ACGTGCACAT TTACAGCGsG CGCGTGATTT TGTTTCCTTG CACCCTGTCA GCGATCACGC 3840
GCGCCGTCTG CGTACGGCAG CATTTGAAAA GCAAATCAAA GAGAAGGAGC AAGAAATCGA 3900
GCGTGCCCGT GTGGAAGTGC GCACgCACGC GCGCGGTTTT TCCGTCCCTG GCTCCAGGCA 3960
GAGGTGCTCG TCTTAGGTGC GCAAAACGAA CCGCATGCAC TGCCTGAGCG CTTTCACCTT 4020
GCCACCCATT TACGGCAAAA AAAACTTTCT GCACTGGTTA CGGGAAAACT CGTAGACGTC 4080
GCCGGTTACG TGCGCATATC TCTCTATCTT TCTACAGGGC TAGAAGCAGA ACCCACGCGG 4140
GAATTCACGC TCGCAGGTCC CTACCGAGAA CTGCCGCGTC TTATGCACAC GCTGTCTGCA 4200
CAATTGCGCA GTGCCATTGA AAACGCACAA CCGGTGCGCA TTGTGTTTGA CGTACATCCT 4260
CCGCATGCAC GTCTTTCGTT TCAGGGCGTG CCGGTAGAAG ACCTTTCCAA ACCTCTTATC 4320
TCATACCCGG GCCGCTACGT GGTGGACGTG TCTGCTGCAG GATACTTTTC TGCCACAAAG 4380
GAAATATACA TTGAAAACCG ACCTGCCTTT TCACTACGGG TGCGTTTAGT TGCCCGTCCA 4440
CAACATCGTG TGCGCGTGCA GCTTACTGAC AACAGCGCAG cACCTATCTT TTCTGGCGCA 4500
CGCTCAGTGG GAGTCACTCC CTTCAGCACC GTGGTTACTG ACTTGCGCGA AATTTTCACC 4560
GTCGGACCGG CAGGCGCGCG TTCGTTTGCC TTCATTGAAC GCGGCACATT TCCTAACTCT 4620
CAGCCGAGCA CGCTCGTGTT GCCTGCGCCT AACCCAAACG CAACACAGGA TCTTGCGTAC 4680
AAAAGGGACG TAGCATACTG GTCTTTTGGA GCCCTCTGCA TTGCCGTTCC CATCGCGCTC 4740
ATTCTCGGCT CCACGCTTGC AGACACGCAT CAGGCGCTAG AACGCGCAAA AGCTGCAAgC 4800
GCGgCAACCT CCTCCCCCTC CTGCACCGGC CGGCACGGGC GCATTAGAAC GTAAAAGCCA 4860
GCACCTGCTC ATCGGCACGG GGGTAGCAGT AGGAGTGGCG GTTATCCTGA GCATTAATTT 4920
CATCGTGCcA CTGCGCGCTA TTTGAACGCG GTGATGCACA ACGCGCCACA GGCAGTACGT 4980
CCCCGCGCGG ACAAAGACAT ACAAACATTA ACGCACCGCG ACGAGGCAGA AGAAGATCAG 5040
GAAGAAGATT CCTAAAGGAG CGTGAAGGTG GGTTTAGGAA ACCTAGCACA GAAAATACGA 5100
CGCCTGCTCG GTGGACAGGC GCCTCTGGAC GAAACGTTTT TTAGCGCGCT TGAAGAGCTG 5160
CTCATCGAAG GCGACCTGAG TCTTTCGACG GCAGAGAGCT TTTGCACACA GCTTCGAAAC 5220
GCCGCGCGCA CACGTTCTGT ACATACGGAA GACGCAtGCG CACGCTCTTT GCGGAAATTA 5280
TGGAATCGTG CGTACGCGTT ACCCATCTTG CACCAAATCC GAACCAGTGC TCACTGTATC 5340
TCCTACTTGG GGTTAACGGG AGCGGGAAGA CCACTTCTGC TGCAAAGTTG CAGCGTACTA 5400
TCAGACCCAG AAGGTGCATC CGATACTGTT TGCCGCCGCA GATACGTTCC GCGCAgCAGC 5460
GGCAGAACAA CTCGCACACC ACGGTGCACA GCTAGGCGTG CGCGTCATTG CGCACCCGGG 5520
GGGAAAAGAT CCTGCTGCaG TGGTATTTGA CGCAGGAGAA GCCTTGCgcG cGCAAAAGCG 5580
SGGTCTTTTA CTCGTTGACA CCGCAGGGCG ACTGCACAAT AAGACGCACC TCATAGCGGA 5640
GCTGCAAAAG ATCGACCGTA TTGCGCAGAC AAAGGTGAGC GCAGATGCAT ACCGCAAGAT 5700
ATTGGTATTA GATGCCACCA CCGGTCAAAA TGCATTTCGT CAAGCGCAAA CtTTCACGAA 5760
GCTATTGGCG TGGATGCACT GCTCCTTGCA AAATGCGACA CACGCGCACG AGGGGGAGCA 5820
GTTTTTTCCA TCATGCAAGA GTTAGGTATT CCATTAGCCT TTTTAGGGTG GGGGGAGCGC 5880
TATACAGACT TGGTTGAAGC GAACGCGCGC GAGTTTGTTT CCTCGTTCCT GCACGGAGAA 5940
CGATGATTCG ACCCCGGTAT GGCTGGATGT ACAGCAGCGG GATTGCAGTG CACCTGTGTG 6000
CgGCCgTGTG CGCGCACAGT GCTGTTCCTG CCGCGTGGAC CTTTGCAGAA CAGACACAGG 6060
CGCAAAAAAC AGACACTCCG CTTGATTCCT CCaGtACGCA tGaCCTCCCC TGAGGAAGCA 6120
CCCAATGAAG CAGATCCGTT TGAGAAGGAA CTGGaACACG CGTTCGAAAG AGCGCACGTC 6180
AGCACAGGCG GTGCAGATTC CTCATCACAC GCCGATTTTG TACACATGGA AGAGGCAGGA 6240
CGTGCCCACG CGTCCGCCAA TCGCTGGTAT CACGAAACGT TTGACTCGCG TCAGCGTCCA 6300
TCCTCTGCAG TTCTGTACGA AGGGGCACAG CTACTGCATA CCGTTCACTG GCACTATGTC 6360
gGGGACCGGC TGTTTCCCTG TGAAAAAATA ATTACCACAC CACACACACG TATnCCGCGC 6420
GCGCTATAAT TTTTCCGGAA AGATCGTCGC GTACGAAATG CACACGCGCG GGGTACTGGT 6480
ATACGCACGC ACCTATCGGT ATGnAnCGCA CGCGCGTATA TGTGAAAAGG AAGAAACAAC 6540
TGCTCGAGGG AATGAACGCA TTACGTATGA 6570
(2) INFORMATION FOR SEQ ID NO: 42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
TTTTTGCGCG CGTTCTAGCA CCCGAGTnAA TAGTGTTTTT TGAAAAATGG AGGnTCGCGT 60
CTACCCAGTT GTAAAAAGAG tGTTCGCGCG CGTCCgTGCT CTCCACACGA AcGGAcTCCG 120
TCCACTCACG AAAGATATCG TGTCCGAAGT ACAATACGAG CATCATAACG TAAACTGCAC 180
AGGGTGTTGT GTACgtεTGt GCGCACATGT AGCACCTTTC CCATACGGAG ACACAGGGAG 240
AAAAGTTCCA AACGGGATGT GCACGCGTAA AAAAGTGAAG ATGCGCACAG CAATAAAAAT 300
AGGCGGATGC CTATGTGGTA TCCTGCATGT ATGTGCTGGT TGATTTGTGT AGCGCTAACA 360
ACACCGGTGC TGAAAAATGT GTGCAGGATC TTCAGAGAAA AGAAAAAAGC GAGGTAGATA 420
CTAACTACAC GTATGTGTGC AGCCACGCAT TTCAGAGAAC GTGTGACAAT GAGCGTTAAT 480
GCGAGCAGCA CAAGTGTAAG CGACGCgTGC ATACACCAAC TCTGTGCATT ACAGCACAGG 540
AGAAGGAGAG CGAAgcTTGC GACTTTCGCT ACGGGAGGAG CACGGTGAAG CAGCGATTGC 600
CGGCGTTCAT AAAGAGAGAA AAACATACAT TTCCTTGACT CCTCCTAGCA GGTGGAAGTA 660
CTGCATGCAT GTCGCTACCT AGGTCCAAAG GAGATCTGAA AGAGTGAGTG GACAGTGCAA 720
AGGATCTCTT AGTCCGTAGC AAGAGAAAAT TTTACTGTCG AGTGCGTCCT GTGGCGTGCC 780
ATCGTAGGAA ATGACCCCCT TAGAAAGGAT GCATAGACGC GTCGCAGCAG CGAGTATTTT 840
TTCAACCTCA TGGGTGATGA TAACAAGCGT TTTACCTGCG TGTTTGAGGC TTATGATGAG 900
CTGCACAACC TGACGAACGC TGGGGTAATC TAAGTTTGCA AACGGCTCGT CAAGAATGAC 960
TACCTTTGCA TCCAAGGCGA GTACGCCGGn CAACGGTTAG GCGTCTTTTT TCTCCACCTG 1020
AAAGCGCTCG GGCGTAATGG TCACGCCGGT CAAGCAGTGA CACGGcTGCA AGTGCGCTGT 1080
TGGTACGTGC GTCAATTTCT GCGCGGGAAT ATCCCCACTG CAGAGGACCG AAGgCGCAGT 1140
CCTCAAATAC CGTTTCGCCT AGGATCTGGG TGTCTGCATT TTGAAACGCC AGACCGACAG 1200
TAGTCCCGCG TGCCATATAC ACACGGCCGG AGGAcGGCGG TTCAAGTCCT GCAAGaTGTT 1260
CATGAGCACA GTTTTACCCG AGCCATTTGC ACCTGCGAGG ACGACACAGT CCCCAGGAAA 1320
CACCCTAAAC GAAACGGAGT GTnAATACTT CACAGTCGCG CTCAAAAGAC TTACTTACAT 1380
TGACCAGTTC AAGCAGCGGT CCTGCGCACG ACTCCACAGC CGTGTCTGCC GCGACATCTG 1440
CGCTCATGCG TCCACAGAAG ATTCACCGTG CCCTGTGGCG CGTACACACC CGCGACACGC 1500
GAATACTATG GCATCAACCA TGACGGTACA AATGGCGGCG AACTGTAGGG GCAAGATGAT 1560
GGGCGAGTAG GACAACAAGG GCGATTTTCA GGGTGTCGGC AAGAAAAAAG GGAAGGAAGA 1620
ATCCTAGCAT GAGCTCCCCG GTCtTGAGGC CAAGCACGTA AcCGAGAACC GgCAGACCGA 1680
TGGAGTAAAT CGAAAGAAAA CCAACGAGCG TTGCGACGnT AAGTCTGATC CAAAGAAGGA 1740
GTGCGCGTTC CACCACCGCG TGGTGGAGTG CAAGACAGGC GGTGGTGCTG CGCGATTGCc 1800
CAcGAGCGTA GCAGCGAGTA TGTATCCAAG GAGGAATCCT CCCGTAGGGc AAAAAGCGCG 1860
GTGTATCCGC CCCGACCTCC TGAAAAAACC GGCAGACCAA GGAGTCCTGC CCCGAGGAAG 1920
CTGAGAACGG CGAGTGCACC GTCTCGCGGT CCCAACAATA AACCGGTGAG AACGGCCGCT 1980
GCATTCTGCA GTACAAGCGG AACAGGCTTG AGAGGAATGC TAACGAGCGC ACTCGAGCTA 2040
ATGAGTGCGG CAAAAAGCGC AACAAAAGCC AAAGACTTAC TACGGTGCAT GGTACAGTAC 2100
TCCTCCGAAG GTTCGATGCG CAGTGTGACA CGGAAGGAGG AATCTTTCAA TATCTTGGGT 2160
GGTGCCACAG GTATAGTTTT TAACAGACTT ACCCGAACGG CTGCCAGGTG CGTACGCGTC 2220
GGTTCGTTCC CACTGTGCGC GGGCAGAGCT CGTGTAGTGT CTATTGACAG ATGCAAGGAT 2280
CGGGTACCGT CATGTACGCA gTTTATGGTA GGCTCAGTCC TATGCACGCT GGGGACAGAG 2340
AGAGTATCGC ACGCTTCGTG cGTGTGGTGC GCGATTGTCT GGATTTGTTT CGCACCGAGG 2400
GTATTGGGCC CCGTCCTAGG AATGATTCGG TAATTTTACC GAATGCTGCG TGTTCACCGC 2460
GTAATCATGC AGGAAAGCGT GCGCAGAGCA CTGCCGATGC GTGTGTGAGA AGCAGTGACG 2520
GGTCTGTATA CACGGACGAA ACCTTGCGCG AGGAAATTTT TGCATGCCGT GCGTGTGAAT 2580
TGTATCAACG GCGTACACAT GCGGTGGTGG GAGAGGGTGT TGCAGACGCA GACGTGCTCG 2640
TCGTTGGGGA GGCCCCTGGA GCGGAAGAAG ATCGAAGCGG TCGTCCGTTC GTAGGACGGT 2700
CAGGTAAATT GCTGGACGCA ATGCTTGCGG CGATTGGACT TTCGCGTCAg cAAAATTGTT 2760
ATATCACCAA TGTGGTTAAG TGCCGGCCGC CAAGGAACCG CACACCAACA CCCCACGAGA 2820
CTGCCTGTTG TGCACGGTTC CTCCATGCGC ATCTTACGCT GCATCGCCCG TGTGCTATTT 2880
TGGTGCTCGG CCGcTGCGCC GCACAGCACA TGCTCCAAAC AACCGATGGT ATTGGCAAGT 2940
TGCGCGGgCG CTTTTTTACC TAtCAGGGgA TtCCCCTTCT GGcTAcGTAC CATCCGAGTG 3000
CGTTGTTACG GGATGAAGCG CTGAAACGTC CGGCGTGGGA GGATCTCAAA ACGTTTCGTG 3060
CACGGTTGCT GCAGTTGAAG CAGGACGCAC ACATGCCAAT ATAAAATCAT GGCGCCGTGG 3120
CTTGAGCTTG TTTTTGAcgT TCCACTGGAT AAAAGCTTTA CGTACCGTGC GTGTGCTGCC 3180
CACGCGGGTG AgGCACTCGT GGGTAGACGG GTTCTTGCTC CCTTTGGGGC GCGTACACTC 3240
ATTGGATTTG TGATAAGTGA ATCACATTCT TCGCCTGCTG ATTGCGGTGG TGCAGTTGGC 3300
ACGTTCAAGG AGATCATCCG CGTCATTGAC AGGGAAGCGC TTTTTGACCA AACGCATCTT 3360
GCGTGTGCGC GTTGGATGGC GCATTTCTAC CTGTGTGCCT TAGGTCAGGC GCTGTGTGCG 3420
GTGGTTCCGT CTCGGAAACG AGAACGGACA TTGTCTTCTT TTGCTTCTTG TGCGGGTGTT 3480
CGGCGCACTG ACACCTATGC GCTTTCGGGC GAACAGCGCA AGGCGATTGA TGCGATTACC 3540
GCGAGCACCG GTGCGCGCAg TTTTTATGTG CACGGGGTGA CAGGGTCGGG GAAGACGGAA 3600
GTGTTCTTGC GCGCACCGAG GCAGTCCTTG CGCGTGGCAA GTCGGTTATC TATCTTGTTC 3660
CTGAGATAGC GCTCACTCAC CAGGTGCTCC AGGAGGTATA TGTGCGCTTT GGCAGTCAGG 3720
CGGCGGTGTT GCACTCAGCG CTCAGTGGCA GTCAGCGCCT AGGTGAGTGG CGGCGCATAC 3780
AGTGCATGCG TCACTGTGTA GTGATTGGAG CTCGGAGTGC AATTTTTGCT CCGTTGAAGC 3840
GGCTGGGCCT TGTGATAATG GATGAAGAAC ATGACAGTTC GTATAAGTCT GCGCATGTGC 3900
CGCGCTATCA TGCGCGgCAG GTAGCGATGT ATCGCTGTGC GGACGCGAAC TGTCCGTTTG 3960
TCATGGGGTC TGCAACACCG TCTGTGGAGG CCTGGTACGC GATGCTGCGG GGGGCGGTGC 4020
GTCGTTTACC ATTGACTGCG CGTGTTGCGG GGGGGcTCCG CCGCGTGTTG AGGTGGTGGA 4080
CGTGTCAAAA GAGGCCCTGT TGCTCTCTAC CCGTCTGGTG GATGAAATAC GCAAGACGAA 4140
GGAGGCAGGA TATCAATCGA TGCTCTTTTT GAATCGTCGA GGATTTTCCT ATTCGTTTCA 4200
GTGTCGCAGC TGTGGATACA CGCTGTGTTG CACGCAgTGg CAGTTCCCTT GACGTGGCAC 4260
AAACGTGTGG GGGCAATGCA ATGTCATTAC TGTGGCAGGC AAGAGGCGCC GCCTGAAAGT 4320
TGTCCGTGCT GTCATTCATT TGATACCCGA TACGGCGGGG TGGGCACAGA GTATATTGAG 4380
GAAGCAGTAC AAGCGCTATT TCCTGAA AC CGTATTGCAC GGGTGGACAC CGAtGCGCTG 4440
CGCTCAGGGC ACGTGCAGCA GACGATGGAG CAGTTTCGCG CGGGGAAAAT CGATGTACTG 4500
TTGGGTACGC AAATGATAGC AAAGGGATTT AATTTCCCTA CGCTGCGTTT AGTGGGTATT 4560
GCCTGCGCAG ATACTGGACT GCACACGCCA GACTTTCGCG CCGCCGAGCG GAGTTTTGCC 4620
TTGATGATGC AAGTGGCCGG ACGTGCAGGT CGCTATGTAG ATAACGGCCT GGTCATCATC 4680
CAAACACGCA ATCCTGCGCA TCtGCGGTGG TGTGTGCGCa GCACGGGGAT TGTGAGTCCT 4740
TTTATGCGCA AGAACTTGCG CAgCGGGAGG CGCTGTGTTT TCCGCCCTTT GTGCGCCTTA 4800
TTCGGTTTGT TTTTCGCAGC AAGACGCGGC GCAAGGCTAA AGACGCCGCG TATGCGGCAC 4860
ATGCGCTTTT GACGGCGCAG ATGCCTCTGG GTGCGGATGT ACTGGGACCT GCAGCGTGTG 4920
TGGTGGCGCA GGTGGCAGGC AGCTATCGGA TGCAAATACT GCTGCGTGCC CCATCATTCC 4980
CAGTGGTGCA GCAGGTGGCG CGCAGCTTTT TAGATGAATT TCGAGCTCCG GCGGGGGTGT 5040
ACGTAGAATC TGACGTAGAT CCTGTAAATG TACTGTAGGG CGAGTAGATG TACTCCGTGT 5100
TATCCTGCTG TTTGCGTGTT TGGTTGACCG GTAGTATGCG GTGCCTGGTA TAGGTGCGGG 5160
ACGGAAAGGA GAGAGGATGT GGCACTGCCG ATTATTTTTC AGGACGCAGC gGTGGTGGCC 5220
GTTGATAAGC CGGCAGGACT TGCAGTACAG CCGGGTGCGC GGGTGCGGGT GTGCGTAgTT 5280
GACGTATTAC AGAAACAGCT TGGGGTGCGT CTGTTTCCTC TGCATCGTTT GGACAAGGAC 5340
ACCGCGGgCG TGCTGCTGTT TGCAAAAmAT GCACGGGCAG CTGCTCTGTA CCAGGGGATT 5400
TTAGGCAGCA TGCGTGTGAT TAAmGtATCG CGCACTTTGT TTTGGGCGAC CTCCCCGAGA 5460
gTGTGGTGAT ATtCGCGTTC CTATCcGTAC CGGTACGGCa GCAAGGCGGC GTCAgGTTGT 5520
gCGTGCCGCG CATACTGCAT ACCGTGTGTT GCGTGCGACT GATACGCATA CATATCTTGA 5580
ACTCACtTGC ACAGTGGTCG GACCCATCAG ATTCGTATTC ATCTGcTGCG CTAGGATGTC 5640
CTATAATTGG GGATGACAAA TACGGTGATT TCGCGCGTAA CAAGGCGTGT GCTCGTGCGT 5700
GGGGAGTAAA AAGGCTCCAG TTATTCGCAC ACAGTCTTGT GTTGCCATGT GCATGTAAAC 5760
CGCTGGTGTT GCGTGCACGT ATGCCTGTAC ACTTCCTGCG TGCTCTTGAT GCCgTTGcGC 5820
TATGATTGCC tGTAGCAGGG CATTCTGGTA rGCGGTGTGT GGTTTTGAGT TCTGCCGGTA 5880
ACAGAAAGAG TGTCGTGTGA ATTTCAATAG TTTTTCTCTA GGGTGTGTAC TGCACTCGTT 5940
GTGTTTTTGC AGGCGCGAGG GGAGGGGAGC GGTCCCCTGC TGCTGTACTG TCTGTAGGGA 6000
AGATACCGGC GCCTATTGTT ATATCGGGCT ATTTGTGCTA GAGTGTGCGA AACCGCTAGT 6060
GGGGATGGCC TATGGGTACT GTTGTTCCGG GATTCGATGA CGAGAAAGAC GAAAGTCTTA 6120
AGATGAATCT GCAAAAGATC GATGACCTTG AAGGTGGCGT CGTTGTTTTC CTCAACGGGT 6180
ACATCGATAC TTACAATTCT TCCTTTTTTC AAAAGAGGAT TGCGAAGGTT ATCGATGCAG 6240
GCTACACGCG TATTGTATTT AACTGCGCCT CTTTGAATTA TGTCTCCTCC ACTGGAATtG 6300
GTTCTTTTAC GGCGTTTCTA AAAACGGTCA AGCCtAAAGG TGGCGATATT GTTCTCCTCG 6360
ATATTCAGCC GAGGGTGTAT GAGGTTTTCC AGTTACTTGG TTTTTCTCAG TTTTTTAACA 6420
TTCGCGATTC TATTGCGGAT GCAGTTAGCC TTTTTAGGAA CAAGgTCTCa CCGyTGAAGg 6480
TGGAcACCTT TCCgAAGGTG TTTTcTTGCC CgATCTGCTC TAAGAAgTTA AAGGCGACTA 6540
AGCAGGGGCG TTTTCGTTGT TCCGAATGTA AGACGATTyT CGCCCTTGAC GCGAGCGCAC 6600
ACGTGTCTCT CGGTTAGGTG ACGCGCCTTT CTTTCTGGGG CAGGCTGGTT GGCTGCCTGT 6660
TTAGGGAGGT GTTTCGTGCT TGATGTGTGA GGTGAGGCGT TATACAATGC GGGCCGGCCT 6720
CCGGGCGGCT CGGGGAaGTC CTGTCTGTGT TGCTTCTGTA GCTCAGTTGG CAGAGCGCAA 6780
CCATGGTAAG GTTGAGGTCA GCGGTTCAAT CCCGCTCGGA AGCTTCCGTC tGTGGATGTG 6840
AGGAGGGGTG GTATGGCAAA GAGGACGGCG GTGGAGCTTA TTGCGCTTCA GTGCACTGGA 6900
TGCAAGCGGC GTAATTACAC CACTTCAAGA AACCGACGTA ACGTTCAGGA AAAGCTCGAG 6960
CTCAGGAAGT ATTGTCCTTT TGAGCGTAGA CGTGTGCTGC ATAGAGAGGC GAAGATAAAG 7020
TAGGCTGTCG TCATATCTGT TACGCACGGG GTTTTCTGGT GTTTTCCGGG GATTTGTGGG 7080
TCAGTAGCTC TAATGGCAGA GCGTCGGTCT CCAAAACCGA ATGTTGAAGG TTCGAGTCCT 7140
TCCTGGCCTG AGTGCTTTCG AAAAGGTGTT TCATGTTGAA GTTCGCAAAG TTTCGTAGGG 7200
AGTGCGTTGC CGAGTTCAGG AGGGTGGTGT GGCCTGCGCG CACTCAGGTA CATACCGCGG 7260
TTAAGGTAGT GCTCGTCTCT ACCGTTGTCA TGGCGCTTTT CCTCGGGCTT ATCGATGCTC 7320
TGTTCGTGGC GTTGCTGAGT TTCTTCTTCT GAGGGGATAG AATGGCGAAA GAGTGGTATA 7380
TTCTGCACAC ATTCTCGGGT CGCGAGGCAA GGGTGGAGCG GGCTGTCCGT ATGCTCGTGG 7440
AGCATGCGAG GATTCCAACG AACGTTATCT TTGATATAAA AATCCCTGAG GAACTGCTTA 7500
CCGAGGTGAA AGATGGTAAG AAGAGGGTGG TTAGGCGTAA GTTTTTCCCT GGTTACTTGT 7560
TGGTGGAAAT GGATTTGCCC GAGGTTGACT GGAGGATAGT GTGTAACGAG GTGCGCAGGA 7620
TTCCTGGTGT TTCCGGTTTT TTGGGTTCTT CGGGCAATGC GAACCTCAGG CGGTTTCTGC 7680
GGATGAAGCT CGGCGTATTT TGCAGAAGGC GGGGGAAATT AAGGGGGATA GGACTCCTCG 7740
TATCGCTCAG ACTTTTTTGG TTGGACAACA GGTGAGGATC GTTGAGGGGC CGTTTGCTAC 7800
TTTCTCGGGT GAGGTGGAGG AGGTGATGAG TGAACGCAAC AAGGTGCGTG TGGCAGTCAC 7860
CATCTTTGGC CGCGCTACTC CTGTGGAGTT GGAGCTAGTC CAGGTGGAGG CGCTCTGATT 7920
TTCTTCTTCC AGGGTGGAGA GTGTTGCAAT GCGCATGATT GCCTGCCGCT TACGCGTTGG 7980
TTTCGGGTGT TTTGTTGTTT TTTACGTCAT AAGGAGAGGC CAGTATGGCA GCGAAGAAGA 8040
AAGTGGTTAC TCAGATAAAG CTGCAGTGTC CTGCAGGCAA GGCGACGCCC GCGCCGCCGG 8100
TTGGGCCTGC GCTTGGGCCG CACGGGGTTA GTGCCCCGCA GTTTGTGCAG CAGTTTAATG 8160
ACCGTACTAA ATCCATGGAG CCTGGGTTGG TGGTGCCAGT GGTTGTCACC GTCTATTCTG 8220
ACAAGAGTTT TTCGTTTGTG CTGAAAACGC CGCCTGCGGC TGTTCTTATT AGGAAGGCGT 8280
GTGGGATCGA AAAAGGATCG ACGAATTCTG TTAAGCAGAA GGTTGCGCGC TTGTCGCTGG 8340
CGCAGTTAAC GGAGATTGCT CAAGTGAAAT TACCTGATAT GAGCGCTTTA ACTCTCGATG 8400
CTGCGAAgcG TAnTCATCGC GGGTACGGCA CGCAGCATGG GGGTGGAGGT AGAGCGTTCA 8460
TTATGAAGAG GGGGAAGAAG TATCGCGCTG CCGTTGCGCG TTATGATCGC GCCGAGCGGT 8520
TCAGTCTTGA CCGTGCGGTA GGTTTGCTTA AGGAAGTGAG GTATGCTTCC TTTGACGAGA 8580
CGGTGGAGGT GCACGTTAGT CTGAGGCTTA AGAAGAATCA GACGGTGAGG GATACGGTTG 8640
TGCTCCCCCA CCGTTTTCGG GCCGAGGTTC GTGTGCTCGT TTTTTGTAAA GAGGATCGTG 8700
TTTCGGAAGC GCTTGCTGCA GGTGCTGCCT ATGCAGGCGG TGCTGAATAT CTTGAGAAGG 8760
TAAAAGGAGG CTGGTTTGAC TTCGACGTGG TCGTTGCTAG TCCTGACATG ATGAAGGACG 8820
TCGGTCGTCT TGGTATGGTG TTAGGTCGCA GAGGGCTGAT GCCTAACCCG AGGACTGGCA 8880
CGGTCAGTGC GGACTTGGGG GCTGCTGTCT GTGAGTTGAA AAAGGGGCGT GTCGAGTTTC 8940
GCGCGGATAA GACAGGTGTG GTCCATCTAG CAGTAGGGAA AACGACGATG GACTCTGCGC 9000
AGATTGTAGA GAATGTTGAC GTGTTTCTGT CGGAGATGGA TCGCAAGAAG CCCGTTGACG 9060
TAAAAGCTGG TTTTGTCCGT TCGATTTCGC TCAGCTCCAG TATGGGGCCT GGGATTTGGG 9120
TTGTCCATAA GTCAGAGGAG TAGTATGGCA GTACGCGCAC GAAGGCTGCA GCCGGCAAAG 9180
GTGGCTGCTG TCGAGAGCCT TACGCGTGAT TTGGGTGAGG CTTCTTCTTA TATCTTTACG 9240
GAGTATCGAG GGCTTACGGT TGAGCAGCTG AnCCgcGTTG CGsCsCGCct GCGCGAATTC 9300
TCGTGCGTGT ATCGGGTGGT GCGTAACAAT TTTGCGAATA TCGCCTTTAC GTCCCTAAAC 9360
ATGACGGTGG GAGAGTATCT GGTGGGGCCC ACGGCCATCG CCCTAGTGGA CACGGAGCAT 9420
GCGAATGGCG TCGCGCGTGT GCTGTTCGAT TTTGCAAAGG AAGTGCCTGC CTTAGTGGTG 9480
AAGGGTGCAA TTCTTGATGG GGAGGTGTTT GACGCTTCGA AGGTAGAAGC GTATTCGAAG 9540
CTTCCTGGAA AGAAAGAGCT CGTTTCCATG TTCTTGTCCG CGCTGAATGC aACGACGGTG 9600
AAGTTCGTAC GCgTATTACA GGCTGTGATG GACAAAAGGG ATGAgGGTGT AGAAgTTTCC 9660
GTGGTGTCGG GAgGTGATTC GTCCtAgGCg GTTGTTGTAA CTTAGTTACG GGGTATGTGT 9720
TaGGCcGGTc AGGCTTCTGG GGTGCTGTCT TCCTGTCCGT TTATAGGGGT TATTTCGCAT 9780
ACAAGGAGAA GATAATATGG CGGCGTTGAG TAATGAACAG ATTATTGAgG CGATTCgGGG 9840
CAAGACCATC CTGGAGCTTT CTGAGCTTAT CAAGGCGGTG GAGGAGGAGT TTGGAGTTAC 9900
CGCGGCTGTG CCgGTAGCGC CGGTAGCGGA AGGTGGCGGG GCaGGTTCTG TAGCCGCTGA 9960
GGAGCaGACA GAGTTTACTG TTGTGCTTAA AGGACTTGCA GAACCAGGCa AAAAAATCGC 10020
GGTTATTAAA GAGGTGCGCA ACGTTATCTC AGGGCTTGGC TTAAAAGAGG CGAAGGATCT 10080
GGTGGAGGGT GCGCCAAAGA CTTTGAAAGA AAATGTATCC AAGGAAGAGG CGGCAAAGAT 10140
AAAAGAGTCA ATGACCGCAG CGGGTGCGCT CATTGAGATT TCCTAGTGTC TGGTTTTTTT 10200
TGCATGCGTC CGGCGCGTCG TTGTGTGCCT CTGACACCCT TTCGTGTGGG AGGGCGTCGC 10260
GCTTTTGAGT AGAGCGTGGG CTTCTATTTC TTTTCATACT TGTTCTCGGC ATTTTGGCAT 10320
GCGGGTTGGG TCGCGTTCTC CTCACTTGAG TGGAGGGGAC GGCGTCTCCC CTGTGTGGGG 10380
AGTATTACGG TAGAGCGTGT GGTATAGGGA GCACCGTGTC GGTTCGGTGC AGCTTGAGGG 10440
GGGAGTGCAT GTCAGCACGA GTTTGCAAAA CACACAGAGT GTACGTGGGA AGGGATGTCA 10500
GGAATTTTAT GGACATCCCG GATCTCATCG AAATCCAGCT TCGATCTTAC GACACcTTTC 10560
TGCATGGGGC CCGGAATACA CCGTCCGGCG CCGACACCCT TATCTCCGGT ACTAGAGAGG 10620
AGCTCGGCCT CGAAGACGTG TTCAAGACTA CCTTTCCTAT CGAGAGCTCT ACGGGGGACA 10680
TGACGCTCGA GTACCAATCA TACTCCCTTG ATGAGAAAAA CATCAAGTTC TCCGAGGCGG 10740
AGTGTAAACA AAAGGGTTTG ACGTACGCCA TTCCGCTGAA GGCGCTTGTT GATTTACGTT 10800
TCAATAATAC GGGGGAGATT AGGCGCAAAG ACATTTATAT GGGAGATATC CCCAAGATGA 10860
CTGAACGCGG CACCTTTATC ATCAACGGTG CGGAgcGTGT GGTGGTATCC CAGATCCATC 10920
GTTCCCCTGG TGTTGTCTTT TCTCATGAGA AGGACAAGGA AGGACGGGAG GTATTCTCCA 10980
GCCGCATTAT TCCGTACCGG GGAAGCTGGC TTGAATTTGA AATTGATCAG AAAAAAGATC 11040
TCATCTATGC AAAGCTTGAT AAAAAGAGAC GTATCCTAGG CACCGTGTTT TTGCGTGCGT 11100
TGCACTACGA AACGCGTGAG CAGATCATCG AGGCCTTTTA CGCCA AGAA AAGACGCCTG 11160
TTTGTCAGGA TCGTGCGGAG TACGAGCTGC TCACAGgTAA GATCCTAGCA CGATCGGTGA 11220
CGGTGGAAAA TGAGCAGGgT GAAACCGGGT GTTGTACAAA GCAGGAGAGA AAATCCATCC 11280
CCATGTCATC GATGATCTGC TGCAAAACGG CATATGTGAG GTCTACATTA TTAACCTTGA 11340
AGCGGAAGGT TCGTTGCGTT CTGCGGTCGT TATCAATTGT CTTGAACGAG AGGAAATGAA 11400
GTTCTCTAAG TCGGGTGCAC AGGACGAgCT TTCGCGTGAA GAGGCACTGT GTATTGTATA 11460
CTCAGCGCTA AGACCAAGCG ATCCTATGAC CATGGACGCG GCGGAAAAAG ATTTGCAGAC 11520
AATGTTTTTC TCCCCACGTC GCTATGATTT AGGGCGGGTG GGGCGCTACA AGCTGAACAA 11580
GAAATTTCGC TCTGACTCGC CGACTACTGA GTGCACGCTC ACCCTCGATG ATATCGTAAA 11640
TACCATGAAA TTTCTCATCA GAATGTATAG CGGTGATGCA CAGGAAGATG ATATCGATCA 11700
CCTGGGCAAC CGTCGTATTC GTTCGGTGGG GGAATTAATG ACCAATACGT TAAAAACGGC 11760
CTTTTTGCGC ATGGAACGTA TTGCGAAGGA GCGTATGAGT TCTAAGGAAA CGGAAACGAT 11820
CAAGCCGCAG GATCTCATTT CCATAAAACC TATCATGGCT GCGATTAAGG AGTTCTTTGG 11880
TGCAAGTCAG CTTTCTCAGT TCATGGATCA GGTCAATCCG CTGGCGGAGT TGACACACAA 11940
GCGGCGTTTG AACGCACTTG GTCCTGGTGG ACTTTCAAGG GAGCGTGCTG GGTTTGAGGT 12000
ACGCGATGTG CACTACACGC ACTACGGTCG GATGTGTCCC ATTGAGACCC CCGAAGGACC 12060
AAATATCGGT TTAATTGTTT CTATGGCCAA TTACGCACGC GTTAACGGGT ATGGGTTCTT 12120
GGAGGTGCCG TATGTACGGG TGCGTGACGG AGTTGTTACG AAAGAGATTG AGTACCTGGA 12180
TGCTATGGAC GAGGATCGCT ACTACATTGG GCAGGATTCT ACGGCGGTAG GACCGGACGG 12240
GGTCATCCGT GTAGATCATG TCTCTTGTCG GCACCGGGGG GATTACAGTA CGCGTAGTCC 12300
TAaGGATATC CAGTATATGG ATGTTTCCCC CAAGCAGATA ATTTCTGTTT CTGCTTCTCT 12360
CATACCGTTT CTTGAGCATG ATGATGCTAA CCGTGCGTTA ATGGGGTCGA ACATGCAACG 12420
GCAGGGAGTG CCGCTTATTT TTCCTGAACC CCCGCGCGTG GGTACAGGCA TGGAAGAGAA 12480
GTGTGCATAT GACTCTGGAG TGCTGGTGAA GGCAAAGCAA GACGGAACGG TTGCCTACGT 12540
TTCCTCAGAG AAGATAGTGG TTTGTTCCGC CGCGGCGTCT GGGGAAGAGC AGGAGGTCGT 12600
GTATCCGTTA CTTAAGTATC AGCGGACAAA TCAGGATACC TGTTACCACC AGCGGCCAAT 12660
AGTGCACGTG GGAGATCGGG TACAGGTAGG AGATGCGCTT GCAGACGGTC CTGCAACGTA 12720
TCGAGGGGAG CTTGCGCTTG GCAGAAACAT TCTAGTTGGT TTTGTGCCGT GGAACGnTTA 12780
CAACTACGAG GATGCCATTT TGATTTCTCA CCGGGTGGTA AAGGAGGATA TGTTCACCTC 12840
GGTTCACATC AAAGAATTTT CTACTGAGGT GCGTGAAACC AAGCTGGGTT CTGAACGAAT 12900
GACGAATGAT ATCCCGAATA AGTCTGAGAA GAATCTGGAT AATTTGGATG CAGAGGGGAT 12960
CATTCGTATT GGGTCAAAGG TGCGTGCGGG AGACGTGCTT ATCGGAAAGA TTACGCCAAA 13020
AAGCGAGTCT GAGACGACGC CAGAGTTTAG GCTGCTGAAT TCTATTTTTG GGGAGAAGGC 13080
GAAGGAAGTG CGTGATTCTT CTCTACGTGT GCCGCATGGA GTTGAGGGTA CAGTCATTGA 13140
CGTGCAGCGA CTCAGGCGTT CGGAGGGAGA TGATTTAAAC CCCGGGGTGT CAGAGGTGGT 13200
GAAGGTTCTT ATCGCTACCA AGCGTAACTG CGTGAAGGGG ATAAAATGGC CGGTCGCCAC 13260
GGTAACAAGG GTATCGTTGC GCGCATCCTT CCTGAAGAAG ACATGCCGTA TCTGGATGAT 13320
GGTACCCCGC TTGATGTCTG TTTGAACCCG CTCGGTGTAC CTTCTCGTAT GAACATAGGA 13380
CAGATTCTTG AATCTGAATT GGGACTTGCG GGGTTGCGGC TTGACGAATG GTATGAGTCT 13440
CCTGTCTTTC AATCTCCAAG CAACGAGCAG ATTGGGGAAA AGTTGATGCA GGCAGGTTTT 13500
CCGACTAATT CAAAAGTGAT GCTGCGTGAC GGACGCACGG GGGATTATTT TCAAAACCCT 13560
GTATTTGTGG GGGTTATTTA CTTTATGAAG CTTGCGCATC TAGTGGATGA CAAAATGCAC 13620
GCCCGCTCTA CAGGTCCATA TTCGCTTGTG ACGCAGCAAC CCTTAGGGGG TAAAGCGCAG 13680
TTTGGAGGGC agCGTCTCGG GGAAATGGAG GTGTGGGCGC TTGAArCcTA CGGCGCGGCG 13740
AATACCCTGC AGGAGTTGCT AACGATTAAA TCGGATGATA TGCACGGGCG TTCTAAAATT 13800
TATGAGGCAA TTGTAAAAGG GGAGGCTTCG TCTCCTACCG GTATTCCTGA ATCTTTTAAC 13860
GTGTTGGTGC AGGAGCTGCG GGGACTTGCG CTCGACTTTA CGATTTACGA TGCGAAGGGC 13920
AAGCAGATTC CGCTCACTGA GCGCGATGAA GAAATGACGA ATAAGATTGG CTCTAAATTT 13980
TAAGGGGTGC aGGGAATGAA GGATATCCGG GATTTTGACA GTTTACAGAT AAAGCTTGCC 14040
TCCCCTGATA CCATTCGGGC ATGGTCCTAT GGAGAGGTGA AAAAGCCTGA GACAATTAAT 14100
TACCGCACGT TGCGTCCTGA ACGTGAAGGG CTTTTTTGTG AACGCATTTT TGGTACTACA 14160
AAGGAATGGG AATGCTTTTG TGGAAAGTTT AAGTCAATTC GGTACCGGGG TGTTATCTGC 14220
GATCGGTGCG GGGTGGAGGT AACGCATTTC AAGGTTCGCA GGGAGCGCAT GGGGCATATT 14280
GAGCTTGCAA CGCCTGTTTC TCATATTTGG TACTACCGTT GTGTACCAAG TAGAATGGGT 14340
TTGTTACTCG ATCTACAGGT GAnTCgCAtG CGTTCTGTTT TGTACTATGA GAAGTACATA 14400
GTTATAGAGC CgGGCGACAC CGATTTAAAA AAGAATCAGT TGCTCACTGA AACTGAGTAC 14460
AATGaCGCGC AGGAGCGCTA CGGTGGCGGC TTTACGGCGG GAATGGGAGC GGAGGCTATC 14520
CGTACCCTTT TGCAAAACCT TGACCTTGAC GCGCTTGTTG CACAGTTGCG TGAGAAGATG 14580
ATGGAGAAGG GTGCGAAAAG CGACAAACGC TTGCTGCGTC GCATAGAGAT CGTAGAAAAC 14640
TTTCGGGTGT CGGGAAATAA GCCGGAATGG ATGATTTTGA GCGTTATCCC GGTGATCCCG 14700
CCTGATTTGC GTCCTATGGT GCAGCTCGAC GGAGGGCGTT TTGCTACCTC AGATCTCAAT 14760
GACCTGTATC GGCGTGTGAT CCACCGCAAT AGCCGTTTGA TTCGGCTCAT GGAACTGAAG 14820
GCGCCGGATA TCATCATTCG GAACGAAAAG CGCATGTTGC AAGAGGCAGT GGACGCGCTT 14880
TTTGATAATT CTAAGCGCAA gCCCGCGATT AAAGGTGCGT CAAACCGGCC GCTTAAGTCT 14940
ATTTCTGACA TGCTCAAGGG GAAGCAAGGG CGTTTTCGCC AGAATCTTTT GGGCAAGCGG 15000
GTCGACTATT CCGGGCGTTC GGTTATCGTA GTGGGGCCTG AACTTAAGTT GTGGCAGTGC 15060
GGGTTGCCTA CAAAAATGGC GCTTGAGCTG TTTAAGCCCT TTATTATGAA AAAGCTGGTT 15120
GAGAAAGAAA TTGTCTCGAA CATCAAAAAG GCAAAGATGC TCGTGGAACA AGAGTCGCCG 15180
AAGtATTTTC GGTGTTGGAT GAAGTGGTAA AAGAGCATCC AGTTATGCTT AATCGGGCGC 15240
CGACATTGCA TCGATTGGGC ATTCAGGCTT TTGAGCCGGT GTTGGTGGAG GGGAAGGCGA 15300
TTCGTCTTCA TCCGCTTGTG TGTAAACCTT TTAATGCTGA TTTTGATGGG GATCAAATGG 15360
CGGTGCATGT GCCGCTGACG CAGGCGGCAC AGATGGAGTG TTGGACGCTC ATGTTGTCGA 15420
ATCGCAATTT GCTTGACCCT GCAAATGGGC GCACGATTGT GTATCCATCT CAGGACATGG 15480
TTCTGGGTTT GTATTATCTG ACAAAGGAAC GCTCTCTGCC GGAGGTGCTC GTCCTCGCCG 15540
TTTTTCCTCG GTGGAGGAGG TAATGATGGC TGCGGAAAAG GGGGTAATCG GCTGGCAGGA 15600
TCAGATTCAA GTGCGATATC ACAAATGTGA TGGTCAGCTT GTGGTCACTA CCGCAGGAAG 15660
ACTTGTGTTG AATGAGGAAG TTCCCGCAGA GATTCCTTTT GTCAACGAAA CGCTTGATGA 15720
CAAACGCATC AGGAAATTAA TTGAGCGGGT GTTCAAGCGT CAGGATTCTT GGCTTGCGGT 15780
GCAGATGCTC GATGCACTGA AAACTATCGG TTATACCTAC GCGACCTTCT TTGGTGCAAC 15840
GCTCAGTATG GACGACATCA TCGTGCCTGA GCAGAAGGTG CAGATGCTCG AAAAGGCGAA 15900
CAAGGAAGTG CTAGCGATTG CGAGTCAATA CCGCGGGGGG CACATCACGC AAGAGGAGCG 15960
TTATAATCGC GTCGTTGAGG TGTGGTCTAA AACAAGTGAG GAGCTCACTT CGCTCATGAT 16020
GGAAACACTT GAGCGCGACA AGGATGGATT TAATACCATT TACATGATGG CTACCTCAGG 16080
TGCGCGCGGG AGTCGCAATC AAATCgCCAA CTGGCGGGAA TGCGTGGCTT AATGGCAAAG 16140
CCGAGTGGGG ATATCATCGA ATTGCCTATT CGTTCTAATT TTAAAGAGGG ACTCAATGTC 16200
ATTGAGTTTT TTATTTCTAC CAACGGTGCA CGCAAAGGGC TCGCAGACAc TGCGCTAAAG 16260
AcCGCTGATG CGGGGTATTT GACACGTCGT CTGGTTGATA TCGCGCAAGA TGTGGTGGTG 16320
AACGAGGAGG ACTGTGGTAC CATCAATGGC ATTGAATATC GCGCGGTGAA GTCCGGCGAT 16380
GAGATTATTG AATCGCTTGC TGAGCGCATC GTAGGAAAGT ATACACTTGA ACGTGTAGAA 16440
CACCCCATCA CCCATGAACT GCTGCTCGAT GTGAACGAAT ACATCGACGA TGAGCGTGCA 16500
GAAAAGGTGG AAGAAGCGGG CGTGGAGTCA GTGAAGTTGC GCACCGTGCT CACGTGCGAA 16560
TCTAAGCGAG GAGTGTGTGT GTGCTGCTAC GGGCGGAATC TTGCACGCAA CAAAATTGTA 16620
GAAATTGGGG AGGCGGTTGG GATTGTAGCC GCTCAGTCCA TTGGTCAGCC GGGTACGCAG 16680
CTGACAATGC GCACGTTCCA TGTTGGGGGT ACGGCAAGCA GTACTACGGA AGAGAACCGC 16740
ATCACGTTTA AGTATCCCAT ACTGGTAAAG AGTATTGAGG GGGTGCATGT GAAAATGGAG 16800
GATGGCTCTC AGCTGTTCAC GCGTCGGGGG ACGCTCTTTT TTCACAAAAC TCTGGCAGAG 16860
TATCAGCTTC AAGAGGGTGA CAGCGTGCAG GTGCGTGACC GCGCGCGGGT GCTAAAGGAT 16920
GAGGTTCTCT ACCACACCAC CGATGGGCAG ACGGTGTACG CTTCGGTGAG TGGTTTTGCG 16980
CGTATAATCG ATCGAACCGT GTACCTGGTA GGGCCTGAGC AAAAGACGGA AATTCGCAAT 17040
GGTTCTAATG TAGTAATCAA GGCAGACGAG TATGTGCCGC CCGGAAAGAC CGTGGCTACG 17100
TTTGATCCGT TCACTGAACC TATTTTGGCA GAGCAGGATG GCTTTGTGCG GTACGAAGAT 17160
ATTATTTTGG GCTCTACGCT CATCGAAGAG GTAAATACTG AAACGGGGAT GGTGGAGCGC 17220
AGGATTACGA CGTTGAAAAC AGGAATACAG CTTCAACCGC GGGTATTCAT CTCTGATGAG 17280
TCGGGGAATG CGCTGGGTTC GTACTACTTG CCAGAGGAAG CGCGCTTGAT GGTTGAAGAA 17340
GGCGCGCaGG TGAAGGCGGG TACGGTCATT GTAAAACTGG CAAAAGCAAT TCAAAAGACA 17400
TCGGATATTA CGGGGGGGCT GCCGCGTGTT TCTGAATTAT TTGAAGCGCG GCGCCCTAAG 17460
AATGCGGCTG TCTTGGCACA GATTTCTGGG GTTGTGTCGT TCAAAGGACT GTTTAAGGGT 17520
AAGCGTATTG TCGTGGTGCG TGACCATTAC GGGAAGGAAT ATAAGCACCT CGTGTCCATG 17580
TCGCGTCAgC TTTTAGTACG TGATGGAGAT ACGGTTGAGG CAGGCGAACG CTTGTGTGAT 17640
GGTTGCTTTG ATCCCCATGA TATCCTGGCA ATTCTGGGTG AAAATGCTTT GCAAAACTAT 17700
TTGATGAATG AGATCCGTGA CGTGTATCGT GTGCaGGGTG TTTCAATCAA TGACCAGCAC 17760
ATTGGTTTAG TGGTGCGGCA AATGCTACGA AAGACAGAGG TTGTCTCGGT TGGGGACACG 17820
CGTTTTATCT ACGGGCAACA GGTGGATAAG TACCGTTTTC ACGAAGAGAA CCGTCGGGTT 17880
GAAGCGGAAG GGGGGCAGCt GCGGTTGCGC GCCCAATGTT CCAGGGTATA ACGAAGGCGG 17940
CGTTGAACAT AGACTCTTTC ATATCTGCGG CATCTTTCCA AGAAACGAAC AAGGTGCTCA 18000
CCAATGCGGC GATTGcAGGC TCTGTTGATG ACTTGTGTGG GTTGAAGGAG AACGTCATTA 18060
TAGGGCACTT AATTCCCGCA GGTAcGGGGA TGCGGCGTTA TCGTCAGGTG AAGCTGTTTG 18120
ACAAGAACAA GCGGGATCTT GATGTGCaGA TGGAGGAAGT TATCAGGCGT AGAAAACTTG 18180
AAGAGGAGGC GCTTGCCCAG GCAGTTGCGG GTATGGAAGG GGAACCTGAA GGCGAAGCGT 18240
GATGGATTGA CCTGGTTTGG CTATTCTGAG TATCCTAGTC CGCGTGTGCT GTGTGCGGCA 18300
AGGTTTACGG TGTTGAGGAT TTTTTTGGGG AAGTGAGCGA AAAGAATGCC GACAATTAAT 18360
CAATTGACGA GGATAGGGCG TAAGGCGGTT TTTTCTCGTA CGAAGAGCCC TGCGTTGcAG 18420
GCTTGTCCgC AGAAGCGCGG AGTGTGTACG CGTGTGATGA CAGTTACGCC AAAAAAGCCG 18480
AATTCTGCTC TGCGTAAGGT GGCGCGTGTG CGTCTAAGTA GCGGGGTTGA AGTGACGGCG 18540
TACATTCCCG GGATTGGGCA TAATTTGCAG GAGCACTCGA TTGTGCTGAT TCGCGGTGGA 18600
CGTGTGAAAG ATTTACCTGG AGTACGTTAT CATATTATCC GGGGGGCCAA GGACACTCTT 18660
GGCGTGGTGG ATCGTAAGCG CGGTCGTTCA AAGTACGGGG CTAAGCGCCC TCGCGCGTAG 18720
GGGCTGGGGA GAGGAGTTGG TATGGGGCGG AAGCGACGGG TGTCGCGTCG GGTACCGCCG 18780
CCTGACGCGC GGTATAACAG TGTGGTGTTG GCGAAtTTAT TTGTCGAATG ATGCTGGCGG 18840
GTAAGAAGGC AACTGCGGTG GGTATTATGT ACGATTGTCT TGAACGTATT CAGCAAAGGA 18900
CTGGTGAGGA GCCTCTTCCG GTGTTCACAA AAGCGTTAGA GAACGTAAAG CCTGCAGTGG 18960
AGGTTAAATC GCGGCGGGTT GGTGGTTCTA CCTATCAGGT GCCGATGGAA ATTCGGGAAA 19020
CGAGGCGTGA GGCTTTAGGT ATGCGCTGGA TTATCGGTGC AGCACGCAGG CGCACGGGAC 19080
GTGGCATGTC GGAGCGACTT GCAGCAGAGA TCCTTGATGC GTACCACAGC ACGGGAACTG 19140
CCTTTAAACG TAAAGAGGAT ACGCACCGCA TGGCAGAGGC CAATAAGGCT TTTTCGCACT 19200
ATCGCTGGTA GATACGCGTC TCTTCCTGGG GCGTTTGTTG CAGGGGCGGT GTCTGCCCTT 19260
GGCAGGGGTG TTTTTGCCCT CGTCCTTTCT CTTGATTCAT CTGGACGTCG GTTTTGGGTG 19320
GCGTGCTCTT GTGCGCCTTA TCAGCATAAA CGGAGGGTCC ATACGGTGGG GGGGCTACTC 19380
TCGGATCCAC ATAATTTTGC GCGCGCGTGT GCCCTCTTTC GTGGAATTTT CCGCAAGGGA 19440
AGAGCGCTCG GGGGTGGTTC GCGCAAAGCT TCAAGTGCCC TGT 19483
(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4724 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
CCTTTTTTCG ATCTGTCCAA TATGAGTGGT TGGACGAGCG GACATTTTGT GGAAATGGAA 60
TCCGCTCTGT CTGAGTATAA AAAGTCAAAA AAnCCGCTCT ACGTTTTTTC TACCTCTTAC 120
AGTTTGGCTG ACTATTACAT CGCCTCTTTT GCTGATGAAA TTATCCTTGA TCCGATGGGG 180
TCTGTGGATC TGTCGGGCTT TTACACGGAA ACTCTCTTTT ACGGAGGTAT GGAGGAAAAG 240
ATTGGGGTGC GTTGGAACGT CGTGCATGCT GGGGTGTAnA AGGGCATGGC TGAGATCTTT 300
TCTAGGAAGG ATTTTTCTCC TGAGGTTCGC AGAAATTATC AGTCTGTATT TGCGCGTCTG 360
TGGCAGCAGT ATCTCAGTGA TGTTTCGCGT AATCGAGCAC TAGAGGTGCA GCATCTTGCC 420
CGTTACGCGG ATCGTCGCCT TGAGCTCCTG CAGAAGTATA ACGGAGACGG TGCGCGCACC 480
GCATTGGCGG AAAAGTTAGT AACGCGCGTA TGTTCCTACG ATGAAGCTGG CGTTGCGCTC 540
AAATTTTTAA AAGAAGACGA CTACGAATCT GCAAAAAATT TCGTTGGTCT AGACGATTAT 600
AATCGTGACC GTGCACAGCG GCAGGTGCAG GATCAGGTGG GGATTATTCA TCTTGCAGGA 660
CCGATTGCTG CACACAGGGA TACGGAACTC GGCGGAACGA TCAGCGACGA GGTTAGTGCT 720
TTGTTGGATG TCGCGATGAG TGATCCGGAT ATTAAGGCAG TAGTGTTGCG TATTGATTCC 780
GGTGGGGGAG AGGTGTTTGC TTCTGAACGT ATCCGCCGCG CGCTTGCGnG GGCAAAGCGT 840
CGAGGCAAGA AGCCAGTGAT AGTATCGATG GGTGCGATTG CTGCGTCTGG TGCGTACTGG 900
GTTGCTTCTG CAGCCGATTA CATCTTCGCA TCCCCCTATA CCATCACTGG TTCCATAGGG 960
GTGCTTTCGG TACTACCGAC ATTCGAAACG TTTTTAGAGC GATATGCGGG GATCACTGTC 1020
GATAGCGTAC AGGTGCACGG CGTTCGCCAA CCTTCTTTGC TCAGGAGTGG AACGGCTGAA 1080
GACACCGCGC GCATGCAGCT TGATGTGATG GCGACGTATC GTACTTTTCT TTCGGTTGTT 1140
TCTGCCGGGC GTAACCTTAC CCTTGATCGG GTGGCGGCGG TTGCAGAGGG TAGGATTTAC 1200
GCGGGGGAGG ACGCAtTTCC CTAGGCTTGG TTGATGCGCT AGGCGGACTA GATGAAGCGG 1260
TAGCACATGC AGCGAAAGAA TCACATTGCA GGCAGTATTC GGTGAGAGTT TTGAAGCGGA 1320
CCsCACGTAC GGTGAAGAAT TTCTGCAGTC CCTGTGGGAT GTCCTGCAGA AACGAAtCTT 1380
GCTTTTGGAG AGCGTGTGAT CATTGGAGAG TTACTCCAGC TTGACCTAAG CAAGGGCACC 1440
TACGTATATG AGCCGCTGCG CTTGCATTGG CGTTGACGGG CACTGCTACG CTTGATCGAG 1500
CGCACGTnGT TTGCTACGGT TGGCGCCGGT TTTTGGGGAT GTAGCTCAGT TGGTTAGAGC 1560
GCTTGCATGG CATGCaAGAG GTCAGGGGTT CGATTCCCCT CATCTCCATC GCCGTGTGTG 1620
AGGGAGGGGG TGTGTCTGAT TTAGGTTTAG ATCCGGATCT GTTAGCTCTG CTGCAAGATA 1680
CGCCGCAGGt GTGCCGTCTG AGCATTCTTC TGCAGGGAAG GGTACAGCGA TGTCGCCTAc 1740
CGGGACGCGA GATCCGAGTG ACGTTGATCT TTCTGAGCGT AgTTTTCCCT TGGTTACTGA 1800
GTTTCAAAGC AAGACCCCGC ACCAGTTTTT TGAGTCAGCA GAGTTTTATA AACGTGTCGT 1860
TTCGGATGAG TTGGAAGTTG GGCAGCGTGC GCATGCGGCT TTGGCGCGCT ATTTGTCCAC 1920
CACTGACTTA AAGGATCGCT CTGTGTGCCG GCAGCAGCTT ATTAGCAGTT ACTGGCAATT 1980
AATGGCACAG ATATCGGGGA AAATCGGCGG TGGGTCGGCG TGCATGGAAA AGCGTTACGC 2040
ATTGCGCTAT GGACTGTTGC TTCCTACCTT GTTGACCGCA TCCCAGAAAG ATATCTTCGC 2100
GCGGATTATT GAGACGAATA GTTTGCAGCA GCCTCTTTAT TATCTGGATG AATGGCTGAT 2160
TGCGATTGGT TCTGGAAAGG TTCGCCCTTC AAGCACCGAC GAAgTGCAAG TAAAAAGGAA 2220
AGACGATGTC GCACGCGTAC GGCAGGCGTA TGATAAAGCG TGCGGGCAGT TGCAGAGTTC 2280
TGAGCGTCTG TTGCAGGTGA GGTCGGCGGA gcGTGCCCGT GTGGAAGAGG AGGTGAAGAA 2340
CAGAATTTCG CGTCTTTTCG TGCACGAATC CATTGAAGGT CTCCCTGGGG TGACAGCAGG 2400
TTTCAACGAG GCGCAGAAGC AAGGAATCTC GGAGATCCAT GAATTGTTAA AAAAGTTGTT 2460
GGGTATAGAT CGGGAGTTTA ATGGGTTATA TGCGGGCTAC CGCGCTTCAC AAGACGCAgT 2520
GCATTCCCTG CGAGAGAAAC TAGATGCGCC CAATGCGGAG AACAGTTCAG CAGTGAGTAC 2580
GGAGTACGAT aCCGTGCGCC AAATGATAAA GATGAGCTGC GGGCGCCAGG GCAACCATTT 2640
CCCCCTCTTG TCCAGAGAGT ATTTCCGTTC TGCGGAGCAT GAGATTGGCA CGCGGGAAAA 2700
TGTATTGAAA ATTATGGCTT GGATTGAAGG TCTGGATCCG GAAGCGTATT GCCGTCAGTA 2760
TAAGCAGCAG GTAAACAGGA TTCCGCCATT CGTGGTGCTG TTGCCTTCTT ATGGGGACAT 2820
AGGATTTTGT TGGGAGCCGT TTGATCGTTA CAATCGCGTG ACAAGCCGTG GACGCGTTGC 2880
GtGCCTATGT ATGGAAGGAG CTTGAAGCTT GCAGTTATTA CCGCGACGGC GGATTTACGT 2940
TGGCAGGTTG CAAAGGAAAA GGCTTCGTAT TACTGGATGG AAGAGGGCTT GACGGGGAAT 3000
TATTATCAGT GGTTTCAACC CCAAAAATTA AGGGGTGATG TAAAGGAGTA TTTTATTGCC 3060
GATTACACGA CCTGGCTCCT GAAGGAAAGC GAGGGCATCC AGAAACTGGA CAAAGAGGTC 3120
CGCAATGTCT TTTGGCGCTA CATCCCCTTT CCCCAAAAAA TCAAAGACGA ACTCAAGACA 3180
AAGTCCTTTG TGTACCAAGA GCTTTGTCAG AAGGACGCCA ATCGCCAGGT ATCTGACGGC 3240
TATTGATAGT TTCTCCTGAA TCGGTTGGTG TCCTGTCATG AGGGGATAGC TTGTGCGCCG 3300
GTGTCGGGTG TTCGTTGACC GAGAAGGGTC AGGGTGTTTT TnAAGCTtys CTCTCGCGCG 3360
ATTGATGGGC AAGTCTACTG CAAGCAGGCG TGCGAGGTAG ATCCCATAGT GAGGATGATC 3420
CTCAATCAGT GAGATGAACT TCATCTTTGA TATCTTTACC AGTGTGCCTT CGCCTACTGA 3480
TACAATCGTT GCAGAGCGCC GGTTGTTGAG CAAGAACGAC ATTTCCCCGA TGAATATGTC 3540
TGATGGGGTC AGCATGGACA TGAACTTGTT ATCCACGTAC ACTGCGAATT TCCCCGACGA 3600
AATGTAGAAA AGGGAACTGG ATTCTTCGTT CTGGTAACAT ACCACCTGGG CATCCCGGAA 3660
CGTTAGTACC TGTTGGTTTT TCAAAATTGA AGGCACAAGG TTCGCGACGT TTTCCTGGTT 3720
GTCAATTTCA AACGTCACCT CGTTGCCCAC GTCGTTATAG CTGAGCCTTT TGACAAAAAT 3780
TTCTGTCATT TTTATGCCCA TACCGTGTAG ACCAGGTTTG CACGCGCTTG CCATGCGGCT 3840
TTTCCAATCA AAGCCTGTGC CTTCGTCACG AATGGTAATG CGTGTACGCT GCAGTGTAAT 3900
GTCATAGGAA ATATAGATTT TTTTCgCGCT AATGCGCGGG TCCtGcTTGC GCAGAGCAAT 3960
CAAATCAAAG ATATCCTTGC GCTGTTCGAG CCaCTCTGTT TTTTCGTCGT AGCTAATGCC 4020
GCAGTTTCCG TGCTCCAGTG CATTGAGTAA CAGTTCCATC ATTGCGCCTT CAAACGAAGT 4080
GCGTTCAAGC TCATTGATAC GGTTGGTATT GTACAGGTAC GAACTAATCA AGCTGGCGTA 4140
AAAGGTAATC TCAAAGGAAT CGGTGTCGCA GATAAAATTT CCCTGCTCGT GTCCATGTGC 4200
TTGGTGCACG AGGCTGCGAC TAGAAAGAAA GTGTCGGTTT CTGTCCACGA TTCGCACAAC 4260
CTGGGAGGCA TGCGCTTCAA ATTCCTGCCG CGTGGAAACT GAAAGGAAAT TCGGGTCTTT 4320
GCGATTTACG ATTTTTATTT TTTCTTCCAT CGAGTTAGTG ATAGCAATCA CCCCACCAAA 4380
TAGAAGCCAA GGATCATCCT TTATAATTTT TAAACACGCT TCGCTGTCGA CGTTTGGGTC 4440
ACCAAAATCA ATAATCTTAA TCTCAGGCAT CTCGAAGCGA AAAACGGATG CTATCTCATT 4500
CAGACGAGAG AGCGTCTGAA TGTGTATATC CACGCGTTCT CCAGTACACG CACCGTTAAg 4560
GCAGATATGG TAGACGTAAC CGTACTGATA AGAGGTATTT gCTCATACTC ATAATCCTTg 4620
TTATAGAAAT CGAGCCACGG TAATCATCGG TTGACTTATC ATcGAGAATG AGATCTGGcT 4680
ACGCATTAgG TA ATCGTGT GGGGGCATGC GCnTGGGAAC AGGC 4724
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14822 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
TAGCCTGCCG TGGCGCACCC CTGCTTTGCT CCACGCCGCG CTCTTTACGT TCCCCGCACA 60
CATGCGCTAC ACTCCCCGCC ACCGCCGCAG GcAGGGCCCC GTGTTACAGG ACCTATCCGC 120
AAATGCCCGT AAGTACTGCT CGAGCTCGGT CAGACCGTTG CGCGTGAGGC TCGCGCCTCG 180
CTCCTTCAGC CAGAGCACAT TCTGCTCGCC CTCATTCAGC ACAAAGTAGG CCGCGGCTAC 240
AAGCTCATCG AAAAACTCAT TGAAGATGTC GCTACCGTCC GCCTCATCCT CGAGCAACAC 300
GTCCTTACCA ATGAGGGAGA CGTCGCCAGT CCCCAGGACC TGCCCGTCTC AGGACGCGTC 360
AAACACTTGC TCGACATCGC AGCAATGGAA GCACGCTCCc TGCGGTGCGC TTACATCGGT 420
ACCGAACACC TCGTTATCGC CTTTGCCCGA GAGGAGCAAA ATCCTCTCTT CCAAAGCCTC 480
ATCCGAGAAG GACTCTCGCT CGATGACCTG CGAAACGCGA GCATTATATC CTCACCTCAT 540
TCTGATACCA CCCGCACCCG GCTCGAGCGG AAAGTTGCAA GTGTCCTTGA CGAATACGGC 600
ACCGACCTTA CCGAACGCGC GCGCGCCGGC GCCCTCAATC CGGTCATCGG ACGAAACAAA 660
GAAATTACCC GCGTCATTCA AATCCTGTGC CGGAGAGGAA AAAATAACCC GGTGCTCATC 720
GGAGAGCCAG GTGTCGGGAA AACTTCCATC GTTGAGGGGC TCGCGTACGC CATCGTTCGG 780
GAGGAGGTCC CGCACATCCT GCTGCACACC CGCGTCGTTT CCCTAGACCT TGCCGCCGTC 840
ATAGCAGGAA CAAAGTACCG CGGCCAGTTT GAGGAGCGGC TCAAACGCAT TATTAAGGAG 900
GTGGAAGAAA CTGAAAAAGT CATCCTTTTC ATCGATGAGC TGCACACACT CATCGGAGCA 960
GGAGGCACGC AGGGGTCTTT GGACGCCGCC AACATGCTCA AGCCGGCCCT TGCACGCGGA 1020
CAAATCCAGT GCATTGGGGC AACAACCCTG GCAGAGTATC GCCGTTACTT TGAAAAAGAC 1080
GCAGCTCTCA CCCGCCGATT CCGATCGGTG CTCGTGCGTG AACCGAGCTT TGAAGAAACC 1140
TGCACTATTT TACGCAAAAT AAAATCACAC TACGAACGAC ATCACCAGGT GATATACCAA 1200
AGCGATGCGC TTGAAAAAAT TGTTGAGCTT TCACGGCGCT ACATCCCTGA GCGGTTCTTT 1260
CCAGATAAGG CAATTGaTCT TATGGATGAA GTAGGAGCCA TGAAACGGGT ACAACAGCGC 1320
GCGGATACGC AGGTATTGCG TTCCTTTTCC ATAAAAGTTG CTAATCTTAC CACAGAGACT 1380
GAGCGCGCCA TTGCGCTTGA AGATTGGGCG CGCGCGCGTT CCTTACACAC CGATGTGGTG 1440
CAGCTGcGCA GACGGCTCCA CGCGCTGAAG GTAGAGTGGA GCGCGCGCGA AgyGcgTCTA 1500
TCTTTGcAGA AGATGTTGCA CAGGCTGTCT CTCTCATGAC CGATATCCCG GTACATTCGC 1560
TCGAAGGGGA TGAGCTGTGC CGCTTTACCA ATATCGAACG GGATCTTTGT GCCACCGTGC 1620
GTGGGCAGCG CGAGGCCATT GCAACGCTCG CGCGCGCTAT CGTACGCGCG CGTGTCGGCA 1680
TCTCTTCAGA CACGCGCCCC ATTGGCTCCT TCCTGTTTCT TGGACCGACC GGTGTAGGCA 1740
AAACGCTCTT GGCAAAGACA CTCGCGGAAT TTCTTTTCGG TTCAGCAGAC GCGCTCATCC 1800
GCATTGACAT GAGCGACTAC ATGGAACGCT ACAACACCTC ACGCCTCATG GGAGCACCGC 1860
CTGGATACGT GGGATTTGAA AATGGCGGTC TACTTACCGA GCGCGTACGG CACCGCCCTT 1920
TTTCTGTCAT CCTTCTGGAT GAAATTGAAA AGGCGCATCC AGATGTCTTC AATGTTCTCC 1980
TCCAGGTGTT AGAAGAAGGA GAGCTGCAAG ACAACCTGGG GCACACGGTG AACTTCCGCA 2040
ACACTATCAT CATCATGACC AGCAATGCAG GCACACGCGG CCTGGGGGAA AACGTTCCTG 2100
GCTTTCAAAC CGCACGCGCG CGAAACATCG AGTACCGTCA GcTGCGCGTA CAGGCCcTCC 2160
GGGAAATAAA ACGCATCTTC TCTCCGGAGT TTCTCAATCG CGTTGACGAG TGCGTAGTGT 2220
TTGCTCCGCT TGAGCGAGAG ACCCTGCAGG AAATTTTAGA ATGCGAACTG AAGAAGCTCG 2280
CAGAACGCCT ACGCGGTAAA GATATTGTGC TGCGCTACAG CGCGGCTGCA AAGGCCTACT 2340
GTCTTGAACA CGGCTTTGAC CCATTCTTGG GCGCACGCCC CtGCGCCGCG TATTGCAGCA 2400
AGAAATTGAA AATGAGCTTG CGcTGCGCAT GATTCACGGA ACGTTGCGCG CAGGATCGTG 2460
CGTGCACATA GACTCAGACG GCGCGCGCCT CCACCTTTCT ACCGAAAAAA GTTACCTGAC 2520
GCTGCATCCC CAAGAAATAT AACTAATCAG TCACACGCGC CCGTATCTCC CGTACCTGCA 2580
GGTCACTTTC CCACACAGAG CTTCTCAAAC AGCGCATCTA GGATATCTTC GCTGTGCACT 2640
TCTCCAGTAA GCGCCCCACA ATGATAGAGC GCCTCTTCCA GATCGTGCAC CACTGCATCC 2700
AACCCGAACC CACGTGCATA CGCCTCCTGT GCATGCTCCA ACGCCTGCAC TGCGGCGTCT 2760
ACCAATACGT ACTGGCGTTC TGAGCCAAGA GAAAGCTCCT CGTACGGCAC CTGACCGCCG 2820
TGCAGCAGGT GGAGTGTCTG TGCACGGAGC GCGTCCAACC CCGCGTGAGT CTTTGCGCTT 2880
ACACACACGA ATGcgCGCGG CGCACGATCC CTCACCTCCC CGTTCTTTCC CCCTGCTAAA 2940
CACTGCTCCC CCGCCCCGCG CGCGTCCTGA CTGCGCGCAC ACGACAACAC CGGTGCCGAT 3000
ATAAACGGCT GCACTGCCTG ACACACCTGT ATGCGCTCAG ACATAGACAT CAAATCGTTG 3060
TGGGTAACTA CCACTACCAA GGGTACTGCA CAGTCCGAAA GAAAAGCGCA ATCTGCAGCC 3120
TGCACACCTG CACGTCCATT AATAATGTAA AAAACGCAAT CTGCTCCCTG CAAGAGTTGC 3180
TCGCTGCGTA CCACTCCCTG TGCCTCAATA GGATTGTCAG TTACTCGTAA GCCTGCCGTA 3240
TCACACAGAC GCACTGGAAT GCCCGmTAGA TCAAGGTCTG CTTCAAGCCA ATCGCGCGTT 3300
GTACCCGGAA CGGACGAAAC GATGGCACGA TCCTGTCCTA AAAGAGCGTT GAAAAGAGAT 3360
GATTTACCCG CATTTGGACA ACCGCCGAGC ACGATGCGCA CTCCCGTTCG CTGCAGCGCA 3420
CGCTCCTGCC AGCAGGCACG GAGCCTGCGC AGACGTTCTA CCAACGGTTC AAGTTCACGC 3480
ATATCGATAT CGTGCACACG CGTTTCTTCA TCTTCCGGAT ACTCAATTTC CCCCTGAAGC 3540
GTGGCTGAAA ACGCGAGTAA CGCACGGGTA AGCGCTGCTA TCTCCTGCTG CAGCGCACCT 3600
GAAAGtGmAA CACCGCTTGC TGcTGCGCCG CACACGTGCG TGCATCAACT AGTGACTGAA 3660
TCGCCTCAAT ACGCGTCAAA TCCCTTTTAC CATGAAAGAA TGAACGAAAA CTAAATTCAC 3720
CTCGCTGGGC GGCACGGAAC CCnTGCGCAA GACAGAGCCG ATACACAGCC TGTACGGTAC 3780
GCACGCCCCC ATGACAAATA ATTTCTACCG CATGTTCTCC CGTAAAACTG TGCGGTGCGC 3840
GGTACACCAG CAGTACTACC TCATCCACCC GTGTCTTTCC GTCCAAAATC CATCCGTGGA 3900
GAAACGTATG CGCACGTGCG CGCGTCAGAG CCTGCGCACG AGAAAAAAAG GACGCAACAC 3960
GCTCAATGGA GCTGCTCCCA CTCGTGCGGA CAATACCTAA CGCGGCAGGA CTGAGCGCCG 4020
TGGCAATGGC GACGATGTCA TCGTCGAGCG CATACTCATG TGCGCGCATC AGCTACCGTT 4080
CCCTCACGGC CGTGGGCGCA GGTGCGTGAA AAGGCAAGCC GCCTGCAAGT CCCACACGGA 4140
GAAAAAGCAG CGGCACCACT CCCGTACTCA GGGAAGCTCC TATACCATAA CGCACAAAAC 4200
GCAACAGGCG TGCATACTCA GCGGGCGCAC CCGAGAACAG TGCATGAaGC GCAGAaGACA 4260
CCACACTACA GAACACACAC CCTACGCAAA AGCGCACCAC CTTCTGGGTG AAAGACCCCC 4320
CAGCAGCATG AAACCCCACG CCGcCTGCAG GTGCCCTGCG CGTAACCCAA CGATCAAAGA 4380
TAAAGAGGTG CCCTGCTCCT AACCCGCAGA ATGCACCACA CATCGACACA TCCGCCCCAC 4440
CGACAGCTAG CAAGAGTGCT ATGCACACCA mGGCAGAAAA GCGCAACCCC GCAAAAtGCG 4500
TCCTGAGACC TCCTGTATGC GCGTAAAGAA CACAACCCCA CGCCTGAGCG CGCGCCGCAT 4560
aCCGAGAATC AGCGCGCCAA AAAGTACTGC GTTTAACCAA CCAAGAAGCA CATCAATAGG 4620
ATAGTGCACG CCCAAATaCA CACGAGAAAG CCCAATGACT CCTACAAATA GCACGCCCGC 4680
CACGCCCGTC CATGCAGTGC GCGCCAGCGG GtaCGCTCCT GTCCATAAGA CGACGGGTTC 4740
TCCCGATCCC CTACCTGCCT ATCGGTTCCA CAACCTGGCG CACAAAAGCA GGGAACAGGA 4800
GTCTTCGCAG ACGGATAgcT ACGCGCGAGC AACACAAACA AAGCACTCGC CTGTGCATGC 4860
CCGGAGGtGT AGAAAAACCA TCGTGGAACA CAAGTTTCAC CGACGGGTCA CGCACAAAGG 4920
GCCGCGGGAC ACGCAACAGC CCCTTCAGGG CGTAATTGAG CCCCTCGCTA CATGCCAATG 4980
CGTAGGCAAT GGCTAAACCC TTTCGGTACT CTACGCACCA CAGCACCCAG AGCGAACACA 5040
GGGCGATACC CTTCCCTCCA AAGAAGGTAA AAAGAACAAC CGCGTGTGTT ATCACAGGGT 5100
GCGCAgCCTG cTGCACCGCG TGTATGACGG ACAAGTTCCA GAATATAAAT TCTTCCATGG 5160
TGTCCTATCC TCACTTTGAC ACGCGCGTCT GCATCGATCC GCGCTCCGTC CTGTGCGGCA 5220
GATCCTCAAG CCAATAAACT CTATCCGGAG CGCGCTCCTG CACAAAAATG gCGCTGTAGC 5280
GCCCTGaATG CTGCACACGT ACGCTGCCAG GAAAATACTG CGTGTGCAAA AACCACACAA 5340
GATCGCACAT ATCGTCGCGC GAATGgAAAc CGAGTAACGG TAGTAGAGCG TTGCATCGGT 5400
AAGCAGCGCT CTGCgCTGGa TCTGCTGTCT CATGCGGGTC AACTGCTGCG CGTACATTTC 5460
CCCGCAATCC CCCATACCAC GCGTGCGCCG CAGTCAGGAG ATTAAGCGCG CGCGCTTCAT 5520
CCATGTAGTA CAACTCGTAC ACACCTGACA CTGCAGGTAC CGCAGTAGAA ATGCGATACT 5580
TGTCCACCTG GGTGAGTGCC GACCACGTTA GCACGTACAG AACCTGCTGC GGGkTTCCCT 5640
CAGGAACCCC AACGGGCGcA GGTCTCAGTT GCTTAGTGAT CAACGGCTCC AAGGACTGCA 5700
TGTAACGCAA ACTCCGTGAC AATACAAAAC TGGGAAAAgA GAGAAGCACC CGCCAAGACA 5760
CGCGCAGGCC GAAcGAAGGC GCCGATCAGA AtCGAACTGA TGCATAAAgG TTTTGCAGAC 5820
CTCTCCCTTA CCAaTTGGGc ACGGCGCCGA GGACCCCTCA GGCTAACAAA AAAAGACGCA 5880
ATCGTTCAAG GGTAAACCAA CCGATACTCC AGGCACGCTG TGACCTTGCG CAAAGGGGAT 5940
TACCATGGAA AAACAGTCAC CCGCACAAAC TATCTCGCTC TTCGTGCTCC TCGCGCTCAT 6000
GTTTGTACTC GTGTGCATGC TGTTCGTACC CTACtAACGG TGCTTCTCTG GTCGAGCATC 6060
CTTGCTATCC TGCTTTCACC GTGTTATCGC gCACTGTGTG CaAGAATAGA TATGCaTGCT 6120
TTTACGCGTA CTCGACATCT CGTTTCTCAC ATGAATGGAG AGGATGGATG TACCGCGGCG 6180
ATTACCCGAG CGACGCGCTT TCAAAAAAAG ATGCTCGCAG CGGTATTTTC ACTTGTGATT 6240
ACCCTTCTGG TGACCACTGT ATTTTTTTTC ATTGCAATTA GTTTGTTTGG ACAGGGAAAG 6300
CTCTTGTTTG ACAAACTTTC GCTCTTCTTC AGGGAATACG ATCTATTTGA AGGTGCAAAG 6360
CAACGGAGCT TTACCGCGCT TATTTTTAAA CTTTCCCGAG GAACGGTTGA TATCTCTACC 6420
CTCAATGTGG AGGAGCATCT GCTACGGTTC TTCGGCAAGC ATGTAGAATC GGTGTTTGTG 6480
TATACACAAA TTTTTGTCAA AAACATCGCT CGCGCAgCCC TTTCCACGTT GTTCTTTAGT 6540
TTTACCCTAT ACTTTTTCTT TCTCGATGGG GAACATTTGT CCTGTCTGCT CATCGCTGCA 6600
CTACCCTTGA GGAAgCGCGC AAgCGcACaG TTGTTAGAAA AATGCAAAGA GGCAACGCGT 6660
CATTTGTTCA AAGGTCTATT CTCCATtGCT TTTTATCAGA CCTGCGTTGC ATTTGTGTTC 6720
TACGGAATCT TCCGCGTGGA AGGACCGATG GCTTTAGCAA TGCTCACCTT CTTCGCCTCA 6780
TTCTTACCAC TGGTcGgcTG CGCCTGsGTG TGGCTCCCAG TGGGAATTAG CATTGGATTT 6840
ACGAGCGGGT GGATGCGCGG CACCCTTTTC TTGTTTGTCG CTGGAAGTTC AATCACTATC 6900
ATCGACAGTT TCTTGCGCCC GTTGTTGCTG CAAAATAAAA TGCGCATCCA TCCATTGCTT 6960
ATTTTTTTCT CTATGCTCGG TGGGGTGCAG ACGTTCGGCT TTAACGGTAT GGTGCTCGGT 7020
CCTATTTTGG TTATCCTGCT GTTCACGGTT ATCGACTTGA CGCACGACGG GGAGTCTCAC 7080
TACACGTCTA TTTTCCACGA CCCCCCTGCT GCAGGTGTGC ACGCGCAGTC GATACACAGA 7140
CAAGGAAAAA AATAGGGATA TCTTGCTGCT CGGCGCCCTT TTTATTACCA TGCGGCCCAT 7200
GACGCGCGCG TGTATATTCG ATCTTGATGG AACGCTAACG AATACGCTGG GGACCATTGC 7260
CTACTTCGTC AATATGCAGG CTGCCCaTTA CCATTTACCC CCAATTCCCT CTGAAAAGTT 7320
TGCGCTGTTT TTAGGAGATG GTTCGCGCGC ACTGATTCAG CGCGTGCTtG CTCATTACGG 7380
CGCTGCAGCT CAGACTATTT CTGAGGATGA ATTTTTACAG CGCTACTGCC TCGCGTATGA 7440
GGCAGACTTT CTCCAACGCT GTACTGTATA TCCGGGGGTT CCTGAGATGC TTGTGGAGTT 7500
GAAACGACGC CGCATAGAAC TCGCCATTCT CTCCAACAAG CCACATTCTA TCGCGCAGAA 7560
GGTAGCGTCT GCTTTTTTTG GGGACAATGT TTTCTCAGTG GTGCTTGGCC AACGCGAAGG 7620
CGTACCCGTA AAACCAGATC CTGCTGGGCT TTTTGAGATC CTGCGTACCC TAAACGTGGA 7680
GACGGCGGAG GCGCTTTTCG TCGGAGACAC CGCCGTGGAT ATACGCACCG cGTcCGCAGC 7740
GCAAGTGCGC AgCGTGGGaG TGCTCTGGGG CTTTCGAGAC GAGACGGAGC TATCCCAGGC 7800
GCAAGCCCAC GTGCTTATCA GGACGCCCGC CGAGTTACTC CAGCACCTTT CTTTCTAGAC 7860
TCGCGGGTAC AAACTCAGAC GGAGCGCACG ACGCTCCCGG ATCCCTGCAg GGCACGAGCC 7920
GCTACTTCTC TTCACGCCCA ACGCAgTTCG CCCGCAGGGT ATAGCGAAGT CCACGCAGCA 7980
TCAGTGCCAG GGCGCCATCC CCAGTGATGT TACACGCAgT CCCAAAACTG TCTTGCAAAG 8040
CAAATATCGC AATGAGCAAA CCGGTTCCTG TGGTATCAAA GTGCAACACA TCAAGCACCA 8100
GCCCGAGCGA CGCAAGCACC GTACCCCCTG GAACCCCCGG CGCACCTACG GCAAAAATGC 8160
CGAACAAACA GGAGAACAGC ACCATATCTG CAAGAGAGGG CATGGACCCG TACAACATCT 8220
GCGCTATCGT TAGACAAAAA AAGGTCTCCG TCAGAACAGA CCCGCACAGA TGTGTGGTTG 8280
CACCCAGCGG GATCGCAAAA TCCACAATTT CTGCAGGCAG TGCCCGTGAC TTGTGCGCAC 8340
ATTGTAACGA AACCGGCAGT GTTGCTGCAC TCGACATCGT GCCCAGCGCA GTCGCATACG 8400
CCGCTCCATA ATGACGAAtA CCTCGAACGG ATTTTTGCGT GACAGTATCC ACCCCACCAG 8460
GTACAACACG CACAGCCACA GGAGATGACC CACAATGACG ACCGCTACCA CTTTGGCAAA 8520
AAGCGGCAGc TGACGAGTTA AACTCCCGCT GTACGCAAGT TCTGCGAAGG TAGCCGCCAC 8580
AAAAAAGGGA AGCAGCGGCA CCAACACTCG GCTAATAGCT TCACCCATCA TGCGACGAAA 8640
TTCATACAGC ACCTGCTCCA CCGCACGTGC TTTTACCCAG AGGGCAGACA GCCCCACCAA 8700
GAGAGCAAAA GCAAGTGCAG TGACCACGGG CA AAGAGAC GGAATCTCAA GGGTAAAGAT 8760
AACCTTAGGG ATTGTACGCA AACCCTCCAC CGTGCGCGGG ATCCGAAGAT ACGGGATAAC 8820
AACACGCCCC ATCGCGGTGG CAAAAAGGGA GGCACCCACC GAAGAGAGAT AGGAAAGTAC 8880
CAGAAACGAG CCTAGCATCC TACCGGCACT CGCTTTCAGA CTCAGGACAG TAGGGGCAAT 8940
AAAACCAAAA ATAACTAGGG GAATAACAAA AAAAACAACC CCGCCGATAA GCGTTTTCCC 9000
CGTGTGGATA ATGGCCATGA CCGACTCATT AACGCACAGC CCGAGCGCAA CGCCACAGAC 9060
CATCCCCCCA CTGAGCTTTG CGAGCAGCCA AAACCCCGCA CTCCCGGCCA TAGCGTCCTC 9120
CTCCGCACAC GCGCCGGcAG TATACCAAAA AGACTATCCT CTGATAACAG GTCAGCGGTC 9180
TTTTTATGTC ATAGAACCAA CCTCGAAGGC GAGGCAAAAC AGATCGAACC CGCACCTCCC 9240
AAGAACTATG CAGGAAAGAC GCACCGACGG GTTGCATGCC GGGGCGGAGc GCACCCCTAT 9300
GCAACAACCC AGCTTACTCC CACTACGGTT CACTCAAAGA ATGTTCTCGA ACTCCTCCCT 9360
CACCGACCGC GGCCATACAC TGGCCTGAAC TTCACCAATA TGCTTCCTTT GTAGGAGTAG 9420
CATCGCCAAA CGGGATTGAC CGATACCACC TCCGATGGAT TGAGGAAGAC GACCATTGAT 9480
CAGATCCTGG TGCCAGyTGC ATGCCAGACT ATCCTCATCG CCaGTAAGAG cCAGCTGCGT 9540
GCGAAGCGCA CCCTCGTCCA CCCGTATCCC CATCGAAGAC ACTTCAAACG CACGCCCCAA 9600
CACTGGATTC CACACCAAAA TATCGCCGTT CAGGCCCTTG TATTCCCTTC GGAAGGCGTC 9660
GTCCAGTCAT CGTAATCTGG AGCGCGCACA TCGTGCGGCT TGCCGTCAGA AAGCACACCA 9720
CCGATCCCAA TCAGGAACAC CGCACCATGC TCTTTGCAAA TAGCATCCTC ACGCCCCTTG 9780
CTGTCTAAAT GCGGATAACG CCGCACCAGC TCCTCGCTCT GTACAAATAC AATATCCGCA 9840
GGCAAAAACG CCCGTAGgCC GAACCTCTCA CTTACCAACA CCTCCGACTC CCGAAGAGCA 9900
CCGTAGACCT TACGCACCGT GTCCTTCAGA TACGCAAGAT TTCTCGACCC TACCGGTACT 9960
ACCTTCTCCC AATCCCACTG ATCCACACAC ACAGAGCGCA CCTGATCCAA GAAATCTTCA 10020
TCCGGGCGGA GcGCGATCAT GTGTACAAAC AAACCCTCAT TATCCTGAAA GCCGTAGCGG 10080
GCAAGCGTGT GGCGnTTCCA CTTTGCTAAC GAGTGcACAA CCTCAAAGGC AGTACCCGGG 10140
ATCTGCTTcA CGGAGACGGA AACCGCCTTC TCCCGACCTG AAAGACCATC TTGGATCCCG 10200
TCACCCACCT GGCTCAGAAG AGGTCCCTGA ACTTCTATGA GTCCCAGGTG CTCCATCAGC 10260
TTTTGGGTAA ATGTGTGCTT GGCAAAGCTG ATCCCCTGCT GTTGCAAAAT AAATGATTTT 10320
TCCATAGTTA ACGCCAACCT TTTACCTTGT TGAGTAAACT GTGCACGCAT TATTTAATAG 10380
GGTGGCGGTG TAGTGCAATA CTCAAGTAAT CTGACAGCAG GGAGGTGGTG TGAAAAAACG 10440
AATGTGGCGC GCGGTGCGGA CCCTGCTTAT CATCTGTGCG GGGGGAACCG GAGCGCTGTG 10500
GGCGCATCCG CACGTTTTTA TCCGCACGAA AGTAACCTTT CAGTGGCAGA AGGGGGTGCT 10560
TCAACGCGCG CATATTACCT GGGAGTTTGA TCCGTTTTTC AGCGCCGATA TCATTAGCGG 10620
ATACGATACC AATAAAGACG GGCTGTTTGA CAAAAAAGAA ACACAGCAGG TGTTTGAAAA 10680
TGCCTTCATC CATACCAAAC ACTATTCTTT CTTTACCTTC ATCCGTTCCG GGGAGTCgCA 10740
TGCGCGACGT gCTCGCTCTC AAGCAGCACG TACAAGTCCC CAGTCAGTGC AGCATTTCTC 10800
GGTCAGTCAG AAAGACGGTA CGCTGTCTTA TCACTTCTCC ATTGACCTTT CTAGCTACCA 10860
GCACGCTAAG TCCGCACCCC CAGGAACCCG GCGAACACTG TATCTTGCAC TCTATGACCA 10920
CTCATTTTTC TGCGACTTTC GTTATGCAGA ACACGACACC GTACGCTTTG TGTGCGATAA 10980
GGCGCGCGTG CAGCCTTCCT ACGAAATTGT TGAAAACCGA ACCGCTCCTG TGTACTACGA 11040
CCCCTTCGAT AGCA AGAAA GCACTCCCCA ATACGAACAC TGGCGTCCCG GTCTGCATAC 11100
CTACTACCCA AAAGAGATTC TCCTGCGCTA CACTGCCCCC TAAGGTCCTT TTCCAAGGGG 11160
AGTTGAGAGC GTATGAAGAA AGTAGGGGTk cgCGTTCGCG CGTGTATCCT GTGCGCGCTT 11220
GCCGCGTGcG CCACAGGCGT CCTTGCTAAT CCTTTTTTTG GCGgcGCTCC CGCGCGCCCG 11280
CGgAGGCAGC GCACCCCGGA GCTTTTkcTG CGCAGATACG CGCTCGTCCA TCAACGCCTC 11340
GGTGCCGCCA TAGTACAGTG GAGCAAAACC CATTCAACAC GCGCGTGGTG GATTACTGTA 11400
ATGCTCTCCT TTGCGTATGG CGTTCTGCAC GCCTTAGGAC CAGGACACAG AAAGGCAGCG 11460
CTTTTTTCTT TCTACCTGGG GAGGAACGCA CCTGTGTGGG AACcTGCGCT CACTGCAGCG 11520
TTACTTGCGG CGTTGCATGG CGCAgcTTtC CCTGCTCTTG CTTTCTGCAT TTAGAGGTGT 11580
TTCCGGCGCA ATCGGTGCAC ACAGTGCACG CACAATGTGG TACATGGAGG TGGGTTCCTA 11640
CGGATTGCTC ACCTTCTTAG CGCTTTTCTC TCTCGTGCAT GAGCTGATGC ACCTTTTCCc 11700
TTCGGGCGGG CGCTATTTCT CCTGCGGTTG CAGCGCGCAC ACTGCCGTGT GTATGCGGAC 11760
AGGAACAGTC GCCCACATGC AGTGGGGTAC TATGCTCTTG AGCGGTTTAT TTATTTGCCC 11820
TGCTGCGTTG TTTGTGATGA TTCTGGTGCT CAGCTTAGAT GCAGTTGGAC TTGGCGTCGC 11880
AGCGGTGCTC AGTATTTCAG CGGGGTTAGC ACTCCCCCTG ATGGCTGTCG GTTATTTGGC 11940
CTGGGCGAGC CGGGCAGGTA TTTTTTATCG CATGCAGAAG AACACTCGTC ATGCACAAGC 12000
GGTGCTCTCT GTCGTGAGCA TTACCTCATA CGGAATTATG CTCATCGTCT GTACTTCAGC 12060
GCTCGTAGCT TCACTCGGTT GAAAGGAGAA TGTACCTCCG CTATCTAGGT GACACTGCCT 12120
GGATAAAACC ATATACCTAA CACGTGGTGA ACGGAAGTAC GCAGTATCTT GCACACGTCG 12180
GTGAGCTCAG CTTAAAGAAG GGGAACCGTA GACAGTTTGA AGTGCAGCTT GAGCGCAACC 12240
TCACGCTCAT GCTACGAAGC ATAAACCCTC ACGTTACTGT CCGCGCAGGC AGGCTGTATC 12300
TGTCAGTCCC GGCCTCCTTT GAAGCACAGA CCACCGCTGA GCAAGCCCTC TCGTACCTGC 12360
TGGGAATTAC CGGTTGGGCT GCTGCTACGG CGTGCCCCAA AACTATGGAA GCGATCACAC 12420
GGTGTGCACA TGCTGAGGCG ACGCTCgcTG CGCGCGAAGG AAAGCGAACA TTCAGAATAG 12480
AGGCgCGGCG CgcGGaACAA ACGCTTCTGC CGTACCTCGA GTGAGATTGC ACGGGAAGTC 12540
GGCGCGGTTA TCCACCAATC AGGCGCTTTG TCCGTGGATC TCCATCATCC TGACGTGGTC 12600
ATTTTCATAG AAGTGCGCGA GCGCGAAgCC TTTCTGTATG GTGCCCGACG TCGCGGCCTG 12660
CGTGGTTTAC CCTGTGGCGT CTCAGGACGC GGGCTACTCC TGTTATCCGG CGGCATTGAC 12720
TCCCCGGTAG CCGGGTACCG AATGCTTTCT CGTGGCATGC ACATTGACTG TCTGTATTTC 12780
CACTCTTATC CCTACACCCC TCCTGAAGCA CAGAAAAAGG TTGAAGACCT GGCAAAGGTA 12840
TTGGCGCGCT ATGGACTTAG TACCACGCTG ACAGTCGTAT CGTTGACAGA CATTCAAAAA 12900
CAGCTCCAAA CACACGCCCC TGCCCCTTCC CTCACACTGT TGCTTCGTAT GTGCATGATG 12960
CGCATTGCAG AGCACGTAGC GCGGGAACAG CGCGCACGTT GCCTTATCAC TGGAGAAAGC 13020
CTTGCACAGG TAGCAAGTCA GACGCTTGAG AACtAACGGT GACCAGCGCG TGCACGCATC 13080
TGCCGATATT CCGCCCGCTC ATTGGTGCAG ATAAAGAAGA TATTATCCGC ACCGCCACAG 13140
AAATCGGTAC GTACGCCATT TCTATCCGTC CGTACGAGGA CTGCTGCACA CTCTTCGCAC 13200
CAAAACACCC AGTGcTTCGC CCAGAGGTAG AAGAAATGCA AAAACAATAC CAATCTCTGA 13260
TGCTCGGTCC ACTGTTAGAA GACGCGTTCC GGACGCGCAA ACGCACGCGC ATATACGGAA 13320
ACTATGGGGT ACAGGAGTCA GGCGAATGAG TACCGCTTAT CTTACGCGGC AGCACCGTCC 13380
GCCCCTTCTT TAtGACGCGT TTATGCTAAC CAtCAAGgAT TATTcCACCA GCGCTTCGGC 13440
GGTAAATTCC AGCGCCTGTG CCACCTGcGT ACCAGGGCCT GACTCTCCGA AACgGTCGAG 13500
CACAAGGCAT TTTTCCCGCT TTGCCCACGC TCCCCAGCCT TGATACACCC CTGCCTCAGC 13560
CACTACAACG CGTGCTCCTC CCTGTATGCG CCGCTGCACC TCGTCCCcTG CTGCCTCAAA 13620
ACGCTCCTTG CACAGTACAG ACACCACAcG CACACGTCTT TTACTcAGTG CGCGGCACGC 13680
AACGCCAAAT CCACCTCAGA GCCACTTGCC AAGACAGTCA GCTCAGGCGT AGCACCCCCT 13740
TCGCGCACTA CATAGGCCCC CGACTCCTCC ACCGTAGAGC GCCACGAACT GTCACTTTTC 13800
TCAAAAACCG GCACGTTCTG CCGACTCAAA ACGATACACA CAGGACCAcT GCGGTGCAGC 13860
AACGCTATTT TCCAAGCTTC AAACGTTTCT TCTGCGTCAG CAGGGCGCAG AACAAGCACG 13920
TTGGGAATCG CACGCAGCGC AGCGAGCGTC TCCACCGGTT GGTGCGTCGG CCCATCTTCT 13980
CCTACAAAAA TAGAGTCATG TGTTAAAACG AAAACAGAAG GGATGCGCAT GAGCGCCGCA 14040
AGACGGAGCG CAGGGCGAAA GTAGTCTGAA AAAACCATAA ACGTAGCGCC AAACGCACGC 14100
AAACCGCCGT GCAACTGCAT TCCGTTCACA ATGGCTGCCA TGGCAAACTC GCGCACACCA 14160
AAATAACAGT AGCCGCCTGC ACGATGCTCT GCAGAAAATG GTCTTAACGA AGAGACCGCT 14220
ACCGCATTCG GCCCGCGTAA ATCTGCAGAG CCACCTACCA GATTCGGTAG CACAGAGCAG 14280
AGCGCGTCGA GCACCTTTCC AGAAGCAGTC CGAGTAGCAA GTGACGAACC CTTCTCAAAA 14340
TGGGGACAGA CAACACGAGC TAGCTGCGAA GTACTTAnCC CTCCGGGAAC AAAAGCAGCG 14400
TCCCAGTCAG CACGTTTTTC AGGATAtGCG TGCTCCATGC TTCAAAGAGC TCATTCCACG 14460
AGTCCTCGAC ATGCGCACAT TCACACTTTC GTTTCTGGAG AACAGCGGTA AGCTCAGGCG 14520
CTACAAAAAA AGAGCACGCA GGATCAAGTC CCAATGCCTT TTTTGCCTCT CTCACCCCCG 14580
CTTCCCCAAG CGGGGCGCCG TGGGCACGCG CGCTCCCTTC AACGGTAGGC GCACCCTTTC 14640
CAATAATCGA ACGCaGGATA ATGAGAGAAG GCCGATCGTC ACGCTTTGCA CACGCAgTGA 14700
GATCCATAAT ATCCGTATAC GAATACATAG AACCGCGCAg CACCTGCCAG CCATACGCTT 14760
CGTAGCGCTT AGCCACATCC TCGnTAAAGT CAGATCGGTA GATnCGTCTA TGCTGATGTG 14820
GT 14822
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16710 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
TGCATCnGAG ATACACAAAC nGTTTCTGCC CCTTAAAGCT TCAACTGCAA gTACTTTTAG 60
CTGAATGACG TTGGCGCATT CTCGCGCGAG TGTTTCTTCC GTGCGCGGCG TGATAGCTTC 120
CGGTCTGAGG CCAAAGACTA CCTGTGTGCC AATATAGTTT TTTAACAAAA AGGACGCGGG 180
GATATCCGGT CGAAGAACAA AAAGACCTGC ATCAATTTTT ATCCCATGCT CATCTTTAAC 240
AATCGTAACA GGGAAACAAT TCATAGGAGG AGAACCAATG AATTGTGCAA CGAACGTGTT 300
CGCAGGATGC TGGTAGATAT GGAGAGGAGA ACCAATCTGT TGTACGCACC CGTCTTTCAT 360
GATGACAATC TTATCTGCCA TTGTCATCGC TTCCATTTGA TCATGGGTGA CGTAAATCAT 420
CGTGGCCTTT AGGCGCTTGT GAAGAAGAGA GATTTCGGAT CGCATTTGCA CGCGCAACTT 480
TGCATCTAAA TTTGACAATG GCTCATCAAA GAGAAAAACC TTAGGATTTC TGACAATGGC 540
ACGTCCAACT GCAACGCGTT GTCGCTGTCC CCCCGAAAGT GCTTTGGGTT TTCGGGCAAG 600
CAGTGGTTCG ATATCAAGAA CACGCGCTGC TTCGTGGACA CGGCGGATGA TTTCTTGCTG 660
AGGGATTTTA CGGATTCTAA GGCCGAACGC CATGTTGTCA AAAACGTTCA TGTGTGGGTA 720
GAGCGCGTAg TTTTGAAAGA CCATCGCGAT ATTGCGATCT TTTGGTGTAA CGTGATTCAT 780
GTGCTCACCG TCAATGTAGA GGTCACCTGA GCAGATATCT TCAAGCCCTG CAATGATACG 840
TAtGCAGTTG ACTTGCCGCA TCCAGATGGT CCGATGAACA CCACGAACTC TCCACTTTCT 900
GCGGTAATAG TTACGTCTTT TACTGCATGG ACGCATCCGT GATACGTCTT ACAGATATGC 960
TTGAGTTCAA CCTTTGCCAT AGCGTTTACA TTCCTTTTGA AACACGGGTG CGCAACACAC 1020
TACTTTCCTT ACGCAAACGG GAGGTGGTGT TGTCATGTTA CGCGCTCTGC ATGTGGTGCA 1080
AGCGGTCGTC TAGATATGCG ACAAGCGCTT CGGTAAACAA ATCTTGATTT ATCAATTCGT 1140
GTGCAAAGAA ACGGAAGCCG ATGGCACCGG CGTGTCCACC GCCGTTTTGA ATAGCAAAAT 1200
GAGATAGCAG TAAACGCAAA TCCAACTGTT TCACCCGCGC GGCAGTGCGC ACGCGCCACT 1260
GCGTGACATA AAATTGCTGT GTTGGATCTG GATACGCGGA GATACCGATA CCTGAGATAC 1320
TCTCTGCAAT GTTGCTGGTC GCCATTTTTA TCAACGAAAC GAAATGTTCT GCATTGCAAC 1380
ACTCGTGCAA CGCCTGTGTT TGTGTCTGAT TGAGAAGGAG CAGGGAAATA CTGCCGCGGT 1440
GTTGCACGTT TTTCACCATT GTTTGGTAAA CCGCATGTTC TTCAGCGGAC AGAGACTCCA 1500
GGGTATGGAG GATTTGTTTG GTGGATGCTA TGTTCCCGGT AATCGGTTTT GTTTTTTCCC 1560
TGAGCATAGT GTTCAGCCTC TGGGTGAAGT AGGTGTAGAG CGCGCGGTCT TTCCGACTTA 1620
TCAAATAAGC ACCTGTCTTT GCATCCCCTA TCATACCGGT GAGTATTGAA AGCACAACAT 1680
TGCGTGAATA CAAATTGCGT ATCCCAAGCG CAGTgcGCTG CGCGTgGTTA CGCGCGAGTT 1740
TGTAACAAAG ATAAGCGATG ATTTCGCACG TGCTAGAGGC GCGTGCGATA AGGCTGTAGC 1800
CTGGATCACC GCAGCAGGCA GCGTTTGCAA AAAGGTGATG GTCAAGTTCA ATTTTACGGA 1860
TAGTCGAATC TGAAAGGAGT AGTCGGCACG ACGGAGGCGC GTAGATCATA CCGGGATTTG 1920
GGGTGTCTAG AATGACCAGG GCATCCGGCA TACGGGGGAC AGTTTGTGTA TCGAGATGCA 1980
CGGCAATTCC GTTATAGAGA CAAATATCAA TGAGGAATGA AATCTGCACA CGAATGGGAC 2040
CTTGACAACA AATTTCCACG CGTTTATTGC ATCGCGTGAG CAGGAGTGCG AAGGCTACTA 2100
ACGATGCGAT ACAGTCTTCA TCAGGATGTT CATGTCCGAG CAAAAGAAAG GAGCCGTGTA 2160
TCGCAATCTC CTGCAGAATA TTTCGGACGA CGGCATTTTT CGCTGCAATG GAGAGATCTC 2220
GTTTTGGCGA CGGAGAATAG GTGTAGTCAC cACTAGCGCG CCCATGACGG CGAGTATACT 2280
ATAGAGTGTC CGCTGGTGTG CGGCAAATGA GACCGTTCTC CCAATTATTT TCTGGGTAAT 2340
GTTGCTCCCT TGTTACTCGT GTGAATTTGC AGTACCGTTG GGCCAGAGGC AGCTTGGCGT 2400
GCGCAGGAGT GATTGTGTGT GCGCCAGACG TGACGGAGCC GAGGTTGTAT TGAGTTTTCC 2460
TTTTGCATGA TGACGTTTCC CGAGGCTGCA TCGTTTTTGC CGTTCGTGCG GCGCGCAAAG 2520
GCGTGCCTGC GCGTCTTGCG TATATGAACG GACTGCAGCC CTGCCTTGTG CAGGGTGTGT 2580
CTTGCGATTG CGTTGGGGCA TGCTGCAAGA AGCGTACACA GGGGGGCATC CCCCACGAGG 2640
GTATGGAGAG GAAATGAAAA TTATCATCAG TGCTTCTGTG CAAATTATTC TTGATCAGGC 2700
CTTTGATTTA GCGCGTAAGc GGCGTCACGA GTACATCACC GCAGAGCATG TTCTTTTTTC 2760
TGCCCTTGCG CATCCTGCTG CGTTGGAAAT TATCAATCTC TGTAGCGCGG ATATCGCGCT 2820
CATCCATAGT AATCTGTCAG AGTTTTTAAA AACACAAGTT CCCGTTGACT TAAGTCATAC 2880
TCCTTCTCAA TCACTGGGTT TTCAGCATCT GCTTAAGCGT GCAGTCTTGT ATTGCGAGGC 2940
GAATAAAAAA AGTGCTCTTG AAGTCGGAGA TCTGTTGGTA AGTCTCCTCC AAGCGGAGAC 3000
AAACTATGCT TCGTACTACA TGCGTATGTC GGGTATGAGT ACGGCGCGCT TGATTGAAGT 3060
AATAGCTCGT GTCAATGGCA TCCGACACGG GGATAAGAAT GTGTCGATGG GGTCGAACGC 3120
GCAAGAAAGG TATTCAGAGT CAGACGACGT TGGCGAATCT GCAGGGCACG GTCCTCCGCT 3180
CGATGGAACA GAGGGGGATG GTAACACAGC GGACGTACAT GTGCACTATG AACACTGCGC 3240
GCATAAACGC ACGGATGCAG ATACGCATCG GTATACGGTG CTGGAAAAGT ATACGGTGAA 3300
TCTCACCGAA CGTGCTCGTC GGGGAGAGCT TGCTCCGCTC ATTGGGCGTA CGCAGGAAAT 3360
TGAGCGGACG ATTCAGATTT TGTGCCGGAG ACAGAAGAAT AATCCGGTAC ATGTGGGTGA 3420
AGCTGGTGTG GGAAAGACGG CAATTACTGA GGGGCTTGCG CAACGTATCG TGCGGTGCGA 3480
TGTGCCAGAG GCGTTAGAGG GAGTAGAGAT TTTTAGCCTT GATATGACAA GCCTGTTAGC 3540
AGGTACAAAG TTTCGAGGGG ATTTTGAAGA GCGGCTCAAG CGTCTTGCAG AAGAGTTGGA 3600
AAAGAAAACA CAAGCAATTC TTTTTATTGA TGAAATTCAT ACGGTAGTCG GTACTGGCTC 3660
AGGCGGTTCG GGTGGTTTGG ATGCGTCTAA CTTACTCAAA CCGCTGCTTT CTTCAGGAAA 3720
GATTCGCTGT ATTGGTTCTA CCACGTATGA GGAATACACC AAACATTTTC GCAAAGATCA 3780
GGCGTTAgcA CGGcGTTTTC AAAAAATTGA TATTGAAGAG CCTTCTGAGG AGGAAACCCT 3840
CCGAATTTTG GAAGGGATTC GCACGCTTTA CGAAGACTTT CATGCAGTGC ATTACAGTGA 3900
TGAAGCATTA GCTGCTGCGG TGAGACTTTC GGTGCAATAC ATCCAAGGGA GACATCTGCC 3960
GGATAAGGCG ATTGATATTA TCGACGAAGC AGGCGCGTGT GCAAAGCTAT CCCGGGGAAA 4020
GCACGGAACA GAGGGAGTGT GTTCAGTAAT TGGGGAGTCG GATATAGACG AAATTGTGGC 4080
AAAAATTGCG AAAATCCCTA AGCAGCGGGT ATCTGCAAGT GAAATAGAAA AGTTGCGTAA 4140
CTTTGAGCGC AGTATTTCAG AAAAAATTTT TGGACAAGGC GAGGCAATTG ACTTAGTCAC 4200
TCGTACgCTG AAGCGCGCGC GGGTGGGATT GCGCGTAAAG CATAAACCTA TAGCAAACTT 4260
GCTTTTTGTG GGGGCTACCG GTGTGGGAAA AACAGAGCTT GCGCGGACGC TTGCCCAGGA 4320
ACTAGGGATT GTGCTGCATC GTTTTGACAT GAGTGAGTAT CAGGAAAAGC ACACGGTGAG 4380
TCGGTTGATC GGCTCACCGC CCGGTTATGT TGGGTTTGAA GAGGGGGGAT TGCTCACCGA 4440
CGCGGTAAGG AAACAACCGC ATGCGGTGCT CCTTTTGGAC GAAATAGAAA AAGCTCACCC 4500
GGACATTTTT AATGTCCTGC TCCAGGTTAT GGATTACGCA ACGCTCACTG ACAACCAAGG 4560
CAGAAAAGCG GATTTTCGCA ATGTTATTTT GATAATGACA AGTAATGCGG GTGCCCGGAA 4620
CATGGGTGTT TCTCTCATCG GTTTTCACAA GGGGCAGGTG GGTACTGCAG TTATCGACGA 4680
AGCAGTAGAA CGTATTTTCT CTCCAGAATT TCGGAATCGG CTGGACGCAg TTATTCGTTT 4740
TGATGCGTTG TCCTTGGAAA CGATGGAACG CATCGCCCGC AAGGAGCTTG CCCTGGTGTG 4800
TGAGCAACTG CAGAAAAAAC ACATTCGTTT TGATATTACC GATGATGCAC TCGCGTTGCT 4860
CGCTGAGCGT AGTCACTCAG GGGGAAGTGG TGCGCGTAAT GTTGCACGCT TGGTAGAGCA 4920
AGAAGTTGCA AATGTGCTTG CAGATCTTAT GCTTTTTGGA GGAGTCGCTG AGGGGGATGC 4980
GTTGCGGTGC ACGGTAAAAG ATCGGCATGC TCAATGCAAT TTTCTCCGCA TCGAGTGCGT 5040
GCAGTCTTCG TATTCGGGGA GTATCCAAGA CGCGCTGGGG TGATGATGCG TGGCACGGTA 5100
ACGGTGTATC CGTGTGGTAG GCAGTGCACG TGCACTGAGT TATTCAAACG CGTTTCTCTC 5160
TTATCTTCCG CCAAAGCTCT ACACTCCATA CCGCCACCTC GTCTATGATC CTGCGTGCGT 5220
TCCCACCGGC AGTGAGAGTG GTGCCTGAAA CGAGCGGAAC gCaGcGAGCA CCATTCCATT 5280
TGATTCGTCA AGAACATGCG CAAAACCGAT AACGTCCGTT TCTTGTAGCG GTGCGCGCAA 5340
AAGGGGAGGG AGTGCGAACG TTATGCTGAT GCGCGTACCG GGCGTATTCA GAACTGGGCA 5400
CGATGTACAC GAGGGGTGGA GGATTGGTCG CAGTGCCCCA GGGCGCTTAC TGCCCAGGAC 5460
GGGAATAGCA CAGGGAAGCG CATCACTGAT AGCCGTGCGT ACATCGGTGC ACTGGAAATG 5520
GGTGAACGCC CAGTATAGAA GGGTAGCGCC GTCACGCGCG CGGATGCgCT TTCCTTCGGG 5580
AATGGAATTC CCTGCGCCGC CTAAGATAAC TGCAATGATG CGGGTGTCTC CGCGCAGACT 5640
GCTGAGCGCG ACGTTAAAAC CGGATTCGCG GATATAGCCA GTTTTTAATC CGTCGCAGCC 5700
GGGCACTGCT GTGGAAACGT TTCCGCGCGC TGCGGTTGCT GCAAGCAGGG TGTTGGTTGC 5760
AGGGAAACGT GTGTGTGGGG GTATTCTCGA AGTAGCGGGA GAGTTACTAC GCTGAAAGTA 5820
gCTGCGCGCA TGAAAGCGTG CAAGGTTTTC AGGCCATCGG CGCACATACT GGCAACAAAA 5880
GAGCACAAAG TCACGCGCAg TAGTTACATT GTGTTCGCTC AGACCGCTTG GTTCCACAAA 5940
GCGTGTGCGC GTAAGACCCC ATTTCTGTAC AAGCGTGTTC ATGCGCGTGC AGAACGCCGG 6000
TATACTGCCT GCAACTGCAT AAGCGAGGGT GTAGGCTGCA TCGTTCCCCG AAGCGATGTT 6060
CATACCTGCG AGTAAGTCGT GTACGCTGAT GTACTCACCT GTACGTAAAA ATATAAGCGA 6120
GCTGCCCGGT GCAAAGGCCT GCGCGCTACC TGCAAGCGGT ACGCGTATAC GCTGTTGCCA 6180
GTGGAGTTCT CCACGCTCGA GTGCTTCCAT GACCACTGCA CAGGTAACCA GTTTTGCCAA 6240
GGACGCCGGG GGGAGGGGTA GGTCTGCGCA GAAGGAGGCA AGGAGTGTGC CGCTTCCTCC 6300
TTCGGCGATG GCGTACGCGC GGGCACTGAT GGGGGGTGGG gTGGATCCTG CAGATAGATT 6360
GGTAAATTGA AGCGTACGGA TAGGGTGAGC AGAAAAGGGA TTACGGGACG ACGGAGAAAA 6420
ACGCAkTgTT GGGGAGAAAA GAAAGGAAGG GTGGAGTACt CkCTGCGCGT GCCACTGCAG 6480
CGCCCCTGCC CCGACTACGC ACGCACCTAA CGCATACAGC GCGCGCTGCC CACGCCGCCG 6540
TGCCACGGCG CGCAGGGAGC GCGATGAGTG GTCTTTCAAA CGGGCAGTAC ACGCGTGTAC 6600
CGGGTGTCCC GGCACACCGT ATTAGGGACA GAGCGTACGG CACGGGCCGA CCGCACATGC 6660
GTCACCCCTT GAAAAGCACG CCGCGACACG TCCTGTGTAT CACAGAAAAC ACCACACGGG 6720
TCATCATACA GCCTCCCGGC AGAGCATGTT GTTTGCATTA CTTTAGTATA GCAGAATGCG 6780
AAGTGTGCAG CGAAGGATTC ATCAATCCTG TTGCGTTCTC TTCTTTTTTG TGAGGCATAT 6840
ATTGCACCGG ATGCTCCTTG CGTCTGCCTG CCCTTTTGGG CAACACTTGT TGCCAGAAAC 6900
CTGTCTTCAC GAAgTCTTGT TTAAATCAGA GTAGGAAACG CTATGACGCG AAAATTAATC 6960
ACCGCCGCAC TCCCCTATGT GAACAACGTT CCACATTTGG GAAATCTTAT CCAGGGTCTT 7020
TCTGCAGACG TTTTCGCACG TTTCTGTCGG ATGCGCGGCT ATCACACGTG TTTTGTATGT 7080
GGTACCGACG AATACGGCAC GGCAAGCGAA ACCCGTGCGG CAGAACAAGG TCTCAGTCCT 7140
GCACAATTGT GTGCGCACTA CCATGCACTG CATCGCGACA TCTATCAGTG GTTTGATCTG 7200
TCCTTCGATT ATTTTGGGCG CACTACAAGC GATGCGCATA CTGAGcTTAC GCAAGCGTTG 7260
TTTCGTCATT TGGATGCGCG GGGTTTTATC AGTGAACATG AAAGTGCGCA GgCGTACTGT 7320
CTGCACTGTG CACGGTTTCT TGCTGATCGC TATTTGCGCG GTACCTGTCC CCATTGCCGT 7380
AATGCTGAGG CGCGTGCTGA CCAGTGCGAG CACTGTGGAG TGCTCCTTGA GCCGGAAACG 7440
CTCCTGAATG CGCGCTGTGT GAGCTGTGGC ACGGCGCCGG AGTTTCGCCC TACGCGTCAT 7500
TTGTATTTAA ATTTGCCTGC ACTGGAAAAA GCCTACCGCT CGTGGTTTTG CACCACGAAT 7560
CATCTGTGGA CTAAAAACgC GGTGCgTATG ACTGAAGGTT GGCTACGTAC GGGATTGCAG 7620
GAGCGTGCGA TCACGCGCGA TCTGCGCTGG GGGGTGCCAG TTCCCAAAGC AGGATTTGAG 7680
CAGAAGGTAT TTTATGTGTG GTTCGATGCG CCAGTCGGTT ACATTTCCAT TACTAAGTGC 7740
GGCACAGAGG CAGCTTCCTC GCAAGAAGGG GGGGGGACCG ACGATGGCGT GAAAGAAAAA 7800
TGGCAGTCTT GGTGGCTTGA TCAGCAGGAT GTGGAGTTGG TCCAGTTTGT GGGGAAGGAC 7860
AATATTCCCT TTCATACGCT GTTTTTCCCC TGCATGCTCA TCGGTTCCGG GCAGCGGTGG 7920
ACGaTGcTTA CGCGTCTTTC TGCGACGGAg TATTTGAATT ACGAAGGGGG aAGTTTTCTA 7980
AGTCTTTAGG GGTGGGCGTT TTTGGTTCGG ATGCAAAAGA ATCGGGCATT CCCTCAGATC 8040
TGTGGCGTTT TTATCTCCTG TACCATAGAC CGGAAAAAAG CGATGCGCAC TTTACCTGGC 8100
ATGAGTTTCA GGAGCGTGTA AACAGTGAGT TGATTGGTAA TCTGTGTAAT CTGGTCAATC 8160
GTACGCTCAC CTTTGTGGCG CGTACGTACG GGGGCGTGGT CCCTGCGCAA GATGGAGCGC 8220
GCAgCACCCG TGCGCAGGTG ATGGAAGAAA CGCTTGCGCT CCGCGAAGrt GCGGgAATAC 8280
TGCAAAGCGC ATGACAGATT TAATGGAGCA GGTACAGTTG CGAGAAGCGT TTAGAGAAGT 8340
GTTTGCGCTC TCAGCGCGTG CGAATAAGGC GTTGCAGGAT GGTGCACCGT GGAAAACGCG 8400
GGCGCAGGAC CGTGAACGTG CAGACGCCTT GATGCGTGAG TTATGCTATG TGATTCGGGA 8460
TGTGCTGATT TTAGCGCATC CTTTTTTGCC GTGGTACACG CAGCAAGCGG CCCGATTTTT 8520
GGGTGTTCAG TTGTCTTCCT GTGCACCAGA GGGGGGAGGA GCTGTGTGTG CTGCGAAGAA 8580
AGACGCGGAT ACGGCGCAng AACnACAGTG CAACCGACCC TCCGATGGTC AGACGTGGGA 8640
GAACGCAAGG GTTTAACGCA gGTGCATCCG CCGGTGATTT TATTCCGTCC GTTGGAGACG 8700
GAAACTATTG CTGCGTATCG TGCCCGCTAT GCTGGAACAc CAGGGATGGG GCAGGAGTGA 8760
GCGTACCGCG CACTGCACAG ATGCCCACGG GAATGAATAA GAAAGAGACA GACGCTCAAC 8820
AAAAGAAGGA GGAGCGTGAA ATGCCCCCTC CCTCAGATAC TGCACGGTTA TCTGCATTTT 8880
TTTCTGAGCG CGTTGTACTG AAAGTAGCAC GAGTGTTGCA GGTGGAGCGT CATCCGAATG 8940
CGGATATGCT TTTTGTTGAA ACATTAGATG ATGGCTCTGG CGTTGAGCGC GTTATTGTTT 9000
CTGGTCTTGT GCCTTATATG GCTGCAGATG CGTTGCGTGG TGCGCACGTG CTTATTGTGG 9060
ATAATCTGCA GCCGCGCTaC TGCGTGGGGT ACGGTCTTGC GGCATGCTGT TGGCCGCAGA 9120
GTATGTAGAT GCGCAGGgCA CAAAGGCAAT TGAATTGGTG CAGGCGCCAT GGGCTCTGCC 9180
CGGTGAACGC GCAACACTTG CGAGTGCGCC GCCGGTCATT ACACCGCACG GGTCTGCCGT 9240
TATCGATGCG GACGCTTTTT TTTCTGTGCC TATTCGTGTG GTAAATTATG CAGTAGAAGT 9300
TGCAGGTGAG CCGCTCATGG TTGGAGGAAG GCCACTGGTA ATGCAGCGAG TGAAAGAGGG 9360
AACTGTCGGC TAGGAATATT CACAGAGCAT TTGGTTTTCC GTGTCGGATA GGGGGAGCGC 9420
AgcATGAACG TGGGATTTTT GGGTTTTGGA GCAATGGGAC GGGCGCTGGC AGAAGGGTTG 9480
GTGCACGCAG GAGCGCTGCA AGCGGCTCAA GTGTACGCCT GTGCGTTAAA TCAGGAAAAG 9540
TTGCGTGCGC AGTGTACATC TTTGGGCATA GGTGCCTGCG CGTCAGTTCA GGAACTGGTA 9600
CAGAAAAGTG AATGGATTTT TCTTGCAGTC AAACCATCTC AAATCAGCAC GGTACTGCGC 9660
GATCGCCAAT CCTTTCAGGG AAAAGTGCTT ATTTCCCTTG CGGCGGGTAT GTCTTGCGCT 9720
GCATACGAGG CATTGTTTGC CGCGGACCCT CATCAGGGTA TCCGTCACCT GTCACTTTTG 9780
CCGAACTTAC CTTGTCAGGT GGCGCGGGGG GTGATCATTG CAGAAGCGCG CCACACCCTG 9840
CACCACGATG AgCACGCTGC GCTTTTAGCA GTGCTGCGCA CAGTTGCACA GGTAGAGGTG 9900
GTGGACACCG CGTACTTTGC GATCGCAGGG GTGATTGCAG GCTGTGCTCC GGCGTTTGCC 9960
GCGCAGTTTA TAGAAGCGCT CGCTGACGCA GGGtGCGCTA TGGCCTGGCG CGCGATCAAG 10020
CGTACCGGCT TGCGGCACAC ATGCTTGAAG GGACTGCAGC GCTCATACAG CACAGTGGTG 10080
TACATCCTGC ACAACTTAAA GATCGCGTGT GCTCTCCTGC AGGGAGTACT ATTCGCGGGG 10140
TGCTTGCGTT AGAGGAGCAG GGATTGCGCC GTGCAGTTAT ACACGCGGTG CgCGCTGCGC 10200
TCAGTTCTTC CTAAGGGGTG GGCAGGGTGC ATTGCTTGTT TTTTTTGACT GCTGACAGTA 10260
CAGTTGCACC CTTGTGAAAA GTTCGTGCGT ATATTGGCGG ATCGGGGTTC TCGTTTGTAT 10320
TCTGTGTGGA GTGGGGAGCT GTGGCGgTCG TGCGCGCGTG CgcGAGTATT CGCGTGCGGA 10380
gcTTGTTATC GGTACGCTCT GTCGCGTGCG CGTGTACTCT AAGCGACCTG CTGCTGAAGT 10440
GCACGCGGCG CTTGAGGAGG TGTTCACGCT GCTACAACAA CAGGAGATGG TGCTGAGTGC 10500
TAACCGTGAT GACTCTGCGC TTGCTGCCCT AAACGCTCAG GCAGGTTCGG CACCGGTTGT 10560
TGTTGACAGG TCGCTGTATG CGTTGCTTGA GCGTGCGCTT TTTTTTGCAG AAAAGAGTGG 10620
GGGTGCGTTT AACCCCGCAC TAGGTGCGgT AGTCAAGCTT TGGAATATTG GCTTTGACCG 10680
TGCTGCTGTC CCTGACCCCG ACGCGCTCAA GGAGGCGCTG ACACGTTGTG ATTTTCGTCA 10740
GGTGCACCTG CGCGCTGGGG TATCGGTGGG CGCGCCACAC ACGGTACAGT TGGCACAAGC 10800
GGGCATGCAG TTGGATTTGG GCGCCATTGC TAAAGGATTC CTTGCGGACA AGATTGTACA 10860
ACTGCTCACT GCGCATGCTT TGGATTCAGC GCTCGTTGAT CTGGGAGGAA ATATTTTTGC 10920
CCTTGGTCTT AAGTATGGAG ATGTGCGCTC AGCAGCcGCG CAGCGGTTGG AATGGAACGT 10980
GGGTATTCGC GATCCGCACG GCACGGGGCA GAAGCCTGCA CTGGTGGTGT CGGTGCGCGA 11040
TTGCTCGGTG GTGACTTCTG GTGCGTACGA GCGTTTCTTT GAGCGTGACG GGGTACGCTA 11100
CCATCATATC ATCGATCCGG TTACCGGGTT TCCGGCACAC ACTGATGTGG ATTCTGTGTC 11160
TATCTTTGCA CCCCGTTCCA CAGATGCAGA TGCGCTTGCT ACCGCCTGTT TTGTATTGGG 11220
GTATGAGAAA AGCTGTGCGC TCTTGCGTGA ATTTCCCGGT GTTGACGCGC TGTTTATTTT 11280
TCCTGAcaaG cgcGTGCGCG CAAGTGCaGG GATTGTCGAT CGCGTGCGTG TGCTCGATGC 11340
ACGTTTCGTG TTAGAGCGTT AGGACAGCAC GTGTGCTGTT CGTGTGTAAA AAAGTGTGGC 11400
GGACTGTCCT CATCATGGTG TGTGTGCAGG ATGCGTGCGC GGGGGTTCGG TCAGATGTCA 11460
GGGTGTAGGC AAAGATGAGC GCAGCGCTGA CAAGAGGTGT TGAGTGCACC CTTTACTCCT 11520
AGGTTCAGTG AGCTGCGTAA TTTTGAATCG AGGAGTACAG TGATGGAGAC GTTTTTTACC 11580
TCAGAGTCTG TGAGTGAGGG TCATCCTGAT AAGCTGTGCG ACCAGATTTC TGACGCTGTT 11640
CTTGATGCCT GTCTTTCGCA AGATCCTCAC AGTTGTGTTG CGTGCGAAAC TTTTGCCTCC 11700
ACGTCCCTTA TCCTGATTGG AGGTGAAATT AGCACGCGGG CGCATATTAA TCTTACCCAA 11760
ATTGCGCGTG ATGTTGCCGC TGACATTGGA TATGTAAGCG CTGATGTCGG TCTTGATGCA 11820
GCGTCCATGG CTGTTCTTGA TATGACTCAT CATCAGTCGC CTGATATTGC GCAGGGGGTG 11880
CACGGTGCAG GACTGAAGGA GTTTGCAGGA TCGCAGGGGG CAGGGGATCA GGGGATTATG 11940
TTTGGTTTTG CGTGCCGCGA GACGCCGGAG TTTATGCCCG CCCCCCTCAT GTGCGCGCAC 12000
GCGgTTGTGC GCTATGCTGC CACGCTTCGT CATGAACGCC GTGTGCCGTG GCTGCGTCCT 12060
GATGCAAAAA GTCAGGTTAC CGTACAATAC GAGGGACATC GACCGGTACG TATCAGTGCG 12120
GTTGTGTTTT CTCAGCAGCA TGATCCGTCA CCTTCATACG AAACCATTAG AGAAACGCTC 12180
ATAGAGGAGA TAGTGCGTCC GGCGCTTGCA CCTACAGGTC TGTTAGATGA AAACACGCGT 12240
TTTTTTATCA ATCCAACCGG TCGTTTTGTC ATTGGCGGTC CCTTTGGGGA CACTGGTTTG 12300
ACCGGGAGAA AGATCATCGT AGACACGTAT GGGGGAATGG GCCGCCATGG AGGAGGCTCC 12360
TTTTCAGGTA AGGATGCATC TAAGGTAGAT CGTTCTGCAG CGTATATGGC GCGTTATATT 12420
GCAAAAAATA TTGTGGCAGC CGACCTTGCT GAGCGCTGTG AGGTGCAGCT TGCATACGCA 12480
ATCGGCGTAC CATATCCGGT TTCGCTGCGG ATAGAAACAT TTGGAACGGC GCGCGCATCT 12540
GAGTCACACA TCACACACGC GGTGAAAGAG ATTTTTGATT TAACCCCAGC GGGTATCGTG 12600
CGCACGTTGG ACCTGTGTGC GCCTCGGTAC CGCTCGACTG CAGTGTATGG TCACTTTGGG 12660
CGCGAACAGT TTCCTTGGGA ACGCACAGAc TGCGTGTGCG ACTTACAGCG TGCGGTGCGC 12720
CCGTTCGCGC TCTCTGGCCA GATAAAAGAG TAGCTTCGTT TCTTTTTTGT CTGCGCGGGG 12780
CCTGTATCGT TACAGCCCTT CACTTTCTGC CCATGTTACG ATGATTGGCT CTAGGGAATG 12840
TATGGAAAAC CCAAGGGTAT GGACCTGCTG GTATTCATGA CTGTTGGGCC ACCGTTGGTA 12900
GGGGTCATCG TAGTGCGTGT GCAAAAAGTG ATAGATGGTG TCTTCTGCAT TGTTTTTGcG 12960
CGCGCGTAGG CgCAGACCAC GGCGTACTGT TGcACGGTTG AGCACCGTAC GAATGGCGGT 13020
CCGGACCTCT TTGAAGAACA ATTCGCGATA CACGCCGTAG TCCTTCCCGG TGATTCTTTT 13080
AACAATGTGT GCGATGGATT CGACCGTCTC TTCCGGTGGG GAAGATCGCT CTACGGTTGC 13140
AGCTCCCTCC TGGTGCACGG GAGAGGAAGT GACGATATCG GAATGTCTTG CCTCCGCGTG 13200
TGTTCCTGTG CGTGATGCAC GGGAACTGCA CGTGCGGACG TTCTTTGTGT TGTGACGGAA 13260
AACCCTTTCC ATTCCCGCTG TTTCATCTTT CTGTTTTGAT TCCCCACTAT CTTTTTTACT 13320
AATTTTCTCC GAAGGAAGGG CAGGTAACTC TTTCTTACGC GCACGAGTCC gTGCACGCGT 13380
GCCCGCCGTC TTGCGCGCAT GCGGAGTAGA AGATCGTCTG TGAGTCGCAG GCACTTTTTT 13440
ACCCTTTTGA TGATGaTgCT CCGCACGTTC TACCAGGCGC GCTTTTAACC AGTGCCAGTT 13500
TCGTTGCAAA TACAGATCAC CACTGATGCT GTAGTAATAC CCTGAGGCGA CGTGCGCGAG 13560
CACCAAGTGC AAAATATCTT CGTCTTTCCA TTGGTGACTC CGTCCAGTTT TAATTGAATG 13620
AAAGATATTG TAGTGATGTG TTTTATACTT TCTGCTGTAC AGTAGCACGG TGTTCTCAAA 13680
GGCTCGATCA TCGTATCTGT GCCAGTATTT GATTGCATAC GCAAGTACCC GATCTTCAAT 13740
TTTTGTAATA GTTTTCGCGA TTTCAATGGT GCGTACCTGT TCTGCGGTGA ACACCAGTTG 13800
TGAAAAATCT ATCGTTGAAG AGGTATCTGT ACCGTGAGCC TCTGTGCAAA AGCCGTAGtT 13860
GCGCGCGTGT GTGACTCCTG CCTAAATGTT CGCACAGAAG AAAGCGTGTC GGTAACGTAC 13920
ATGCGAATGA TGTCTGCAGC CTTTTCCATC TCTTTCTCTC GGTAGAGCAG CCATTGTGCA 13980
TAATGCGTGT GCTTTCTGGA GAAATGAACG GACTCAAAGC GGTTGAGAGA AATAGAACGC 14040
ACAATCTTTT CGCAAACTCG GATCAGCGTG CGCACGTCGG TCAGGTCGAG GGGCGCAACC 14100
GCGGCGGCGA AAAAATCAAT AATTTTCTTT ATTTCTTGGA TTTTCTTCTG AGCGAGCTCT 14160
GCTAGGGCAT CGGCAAGCAC TTCCTTAAAC CGCACGTCCC TTTTGCGATA ATAGGACATG 14220
AGGAGTACCC TACGCTCCTT CTGAGGGTAT TTCACCAAAA GCGTGCGGTT GTAGATATCC 14280
AGTGCCTGTC TTTGATCTCC AAGGGATCTT AGTTCAAAAT ACCGGTCGAT GTCGGCATCT 14340
TCGCTAAAAC TCAGCTGGGG AAACTCCGCC CTCAGTACTT TTTTATCAAG CTCTGTGAGT 14400
GTGCGCATGA CATCCCAGGT GCGTAGACAA GCTCTTTAGA AGAAAGGCTG GCGCGCAcGG 14460
GCCTTTtCGA AGCGGTGAGA AGAACTAACG CAAGAGGCTT AGAACGCTCT GcGTAgCCTG 14520
ATTCGCCTGT GCAAGCATAG CAGTCCCAGA CTGCACCAGA ATCTGGTTCT TGGTGTAGTC 14580
TACCATCTCT TTTGCCATAT CCACGTCGCG GATGCGAGAC TCAGCTGCCT GCAAGTTCTC 14640
TGCCGCGACA TTGATACCGG CAACCGTGTG GTCAAGTCTA TTCTGGTAGG CACCGAGATC 14700
AGCGCGCTGC TTATTAATTC TCTTTATTGC CTGATCAAGC GTACCGATTG CGCGGTTGGC 14760
CTTTTCAGGA GAATCGATAT TCATGACCGA CTCGTCACCT GCATCCCGAA TTCCCATGGC 14820
AACTGCAGTC ATGGTCCCGA TATACGCACG CGTGCGCTGG TCCATGTTTG CACCGATGTG 14880
GAACCACATG GATGCAGTTA CAGTGTTCTC CCCGCCTTGA CGCGCGAAGC GACCAGTGAG 14940
CATGTTCATG CCATTGAACT GAGCGTGGCT GGCAATGCGA TCCACCTCTG CTACCAACTG 15000
AGAGACCTCT ACCTGAATGT AGAGACGGTC TTCTGCGGAG TAGATACCGT TCGCCGCCTG 15060
CACACTCAGT TCGCGAATGC GCTGGATAAC GTCGGTGGTC TCCTGTAAAA ACGCCTCCGC 15120
AACCTGAATG AAGGAGATGC CGTTCTGCGC GTTTGTAGAC GCCTGGTTCA AACCACGGAT 15180
CTGGTCCGCA TCTTTTCAGA AACTGCAAGA CCCGAAGCGT CATCCCCTGA CCGGTTGATG 15240
CGCAGTCCTG AAGACAACTT CTCAATGTTC TTCTGGACGG ACAAGTTAGT GTGTCCGAGC 15300
GTTCTTTGAG AGAACATAGC GCTCATGTTG TGATTGATGA TCATGAAGCA TTACTCCTTT 15360
TGGTGCTTTC AAGCGGACCA GCCGCCCTGG CATCCCTGCC gTTGCACCCC GTGCTTGGTA 15420
AGGGGTATCG GAATACGCCG GGTGCACTTG AGGAAAAAGC GGTGCGTATA TCTTGCGTAC 15480
GTGAGTGCTT GAACGTTGTG CAAATCGGAG GTAGAATCCC CGTCCTGTTG ACCTCTGCAG 15540
CAGAGTTACC CCGGTTAGGT TCGTGCGTGA GATAGGTTGC CGGTTGCGTC CGGCTGTGTG 15600
TGCACTGTGG ATTGAGTGGC TCTGTCCTTG TTTGAGCTTG TGCGCGGCGT AGCTGTACTT 15660
GGCGTGCACT GCCGTCTTAG CTTTCCACGG AGGGATGTGG GAGAAGATAA TTAGGGAATG 15720
TGGGGAAGGC GTATGAGGTG TATGAAGATA CCCAGGCAAC TGACGAGGCG TCGGCTACTT 15780
GAGAGGTTTT ACGCGCACCC GTGGGTGCTT GTTGCGGTGC TTAGCgcGCt GACGCTCTTT 15840
TTTGCAGTCC aGcTACGCAC GCTACGCTTG GACAATAATA ATTTTCGCTT TATCCCCAAG 15900
GAAAACTCGG TGCGTATCGC CGATCAGCGC ATCGATAGCA CATTCGGCTC CCAAGTTCCT 15960
GTGCTCATTG GTATTAAGCG TGAGTATACT TCCGTCGTTG ATCCTGTCTT TCTTGCGGAC 16020
GTGCGGTCGC TTATTGAACG CATCAGTGCG GTCCCCTTGG TGAGGGCGGA GAGTACTCTC 16080
TCACTCCTGT CTGCCGAATA CCTTGGTCTG CGTGCAGGAA ATATTATCAG TGAGCGTGTT 16140
GTTCCTGATG AGTTCTCCGG AAGTGCAGAA GAGGTACAGG GCGTTTATCG AAAACTTCGA 16200
GATTGGGATT TCTATGAATG TAGTCTAGTC TCGCGCGATC TACGCTCTAT GCAGATAGTC 16260
GTGTTTCTAG ACACCTCCAA CGAAGAAAGT AGTTCACCTG AAGCGATGGC AGCTTGTCGC 16320
GCGATCATAC GCATTCTCGG TGCGTGGAAA AGTCGTGACG CTCAGACTTT TGTCACAGGG 16380
GTGACTGTTT TTAACGAAAT GGGGAATGAG GCGTCGACGC ACGATTTAAC GCTCCTGGTG 16440
CCGCTTGTGG TGCTCATAAT AATCGTGGCG TTGTTTGTAT CGTTTCGCCG CCTGGCGGkT 16500
ATCTTCTTGC CCCTTTTGAC AGTGGTCATA TCTACCGTGT GGGCCTTAGG AGCTATGGCT 16560
TTGTGTGCCA TACCACTTTC TATCCTTTCT GCCATCTTGC CTGTAATTCT TATTGCCGTC 16620
GGGAGCGCAT ACGGCATTCA TATAGTTAGT GCGTATTTTT ACGGCGCCTC CTCGCGTATC 16680
TGCTCCACCC GGCAGGAGCA TCGCGCTCGC 16710
(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1235 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
TCAGCCCGCG CACAGGAAAG TATAnGATCG GCACGTTTCC TTGCCTCCGC ATATATATCT 60
CTTTCTAACA CcTCTGTTGA ACGCAGCTCT TCCATGTAGT CTATACCTTT GCCCCGATAA 120
CCTGTCGAAT TGATTCACGC AGGGACGTGC GTACGTTCCC CTGTTCGTGC AAAGCGGGGA 180
CTTCTACGAC AAAGGGTAAC TCCCCCTGCA TCTGCCACGC ACGCACCTCC TCTCTCAGGT 240
AAGACATAGC GCGCTGGGTG AGTATGAGGA CCTGCGCGGG ACGGTCTCCA GAGGGGGGAC 300
GCGTGACCTG GGAGAATGCG CGCGCCGCCT CCTCTGGAGA ATGTACCACG AAACCGTGCA 360
CCCCTACTAA GGAAAAACCC AACACCAGTT CTTGCTCTCC AATTATGCAA TACGTCACTT 420
GATGACACCC CAGACGCACG TCAAAGCAAG ATGATAAGCA ATGCGACCAA AAAACCCCAC 480
AGACAAATGC CTTCTGCAAG ACCGATAAAG GGAAgTGCCT TTCCTGAAAT TTCAGGATCC 540
TcAyTCATTG CCCCCATCGC TGCAGCCCCG ATTTTACCTA CTGCAAGGCC TCCCCCAACG 600
CAAGCGAGCC CCACCGCGAG TcTGCGGCAA TGTATTTTAA GCCGCCATCT ACATGAGAGG 660
GCGGCTGmmT CTCCGCGtTa AGaAGACACG CACAAGCCAG CAAACnCACC CGTAAACCAT 720
GCTCTTTTCC AACCCATACT AATCCTCTTG ATATCCAAAC CTAAnCGGTG CGAAGACGCT 780
CCCACTTTTG GTAAAAAACT TTGAAAAAAA CTCGTAGTAT TGCAGCCGAA CCGCTTGAAt 840
GGCAACGATC AACCCTTCTA GAAAGATAAT GACTCCATTC CCAAACACGT ACACGAGTAT 900
GCCCCATAGT GAAGCGTAGC aCCAACGAAT TGCGTCATAG TAAACACCAC AAAACTTAGT 960
ACCGCATGGG ACAAGGCAAA GGCTCCCACG CGCAAAAAAC TCATGGAGTT GGAGAAAAAT 1020
CCCGACACCA CATCGACCAT TTCGATAACA CCGTGCATTA GATACATGCC AACACCTTCA 1080
GGAAACCACG GACGCACACG CTTGCACACA CGCTCCAAAA ACTCTTGACA AAAAtACCCA 1140
CGaGAGGCAC GCCCTTGCAA CCGCATCAAA GACCCCGAAT GGATTCCAAA AGTGGTATGC 1200
GCACTGCAAG GGCAACATGT ACCAAAAAAA GAGGG 1235
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16636 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
ATTCTCnGCA CATGTnCCCT GACACTTCCG TAGCGGCTGC CGAGACCTGC AGCCAGAATA 60
ACTAGCGTGA CGTATTCGCT CATAGAAGTC TTCTAGCAAA ACGGAGCACG CCCCCGGgCC 120
ATCCCCGGGA AAGCGTAAGA GACGCACGGT GCACTTATGA GCGCGCACGC AAGGCCGCCA 180
TCGTAACGTA TTGTACGGCT TGTTAAAATG CGGCAGAAAG AACATGTCCA GCAGTTTTAA 240
CTTATCAATG GTCACCCGCT CCTGGATTGC AAGAGAGAAC AGGTGAATAC CCATCGACAT 300
GTCGTGCCGT GACGCCATTT GAGCGCCCAC AATCACCCGC GTTTTTTTAT CAAACACAAT 360
CCGGATTTTC ACCGGGTGAT TGTCCACTTC CATAAAGGCG GGTAACTGTG AGTCTTCAAA 420
ATCCGTTACC TCCACCTCAA GTCCCATGCG CGCGGCGGCT nCCTGCGTCA CTCCTGTGGA 480
CACCATTTTT AAGTCGTATA TGTTGATACC GTTGGAGCCC TGCACCCCAA TGCCTTCAAG 540
TGGGAATCCT GCAGCGTTGT GCGCCGCAAC GATACCGCTG CGCATCGCAT TGGTTGCAAG 600
CGCAATGTAA GAAGTTTGTC CGAGCGAATT GTCAAACACC GTTGCACAAT CGCCAATTGC 660
GTACACGTCT TTCACACTTG TTTCTTGTTT TAAATCTACC gCATACGCGC CATTGGCAAA 720
GGTTCGCACC TCGTTCTTTC CCAGTGCAGT ATTGGGGCTA AAGCCAATGC ACACCATAAC 780
CATGTCTGCG GGGTACTCAC CCTTATCTGT CACCACTGsC ACTACCTTTC CATTGCTGCC 840
ACGGAGTTTT TGCACTTTCT GGCCAAACGC AAGGGTGATG TGATGGTGCG CAAGTTCTCA 900
TCCATGAGTG CACGGAAGGA TGCGTCGTAA TAATTGGAAA GACTAGAGTC CATCGCATCG 960
ATGAGCGTTA CCTTTTTTTG GTGGCGCTCA AACGCCTCTG CAAGCTCCAC GCCAATGTAC 1020
CCGGCACCGA TAACGGCAAT ATTCTTAATG GAAGGCTGCT CGAGTTTTTT AATCACCGCT 1080
TCAGCATCCT GAAATAGCTT AATGCGCTGA ACATTCTCCA AATCCATGCC GTCGATTTTA 1140
GGAATGATAG GCAGAGAACC GGTTGCAATA ATAAGCTTGT CGTAGGACTC TGCGATTGCA 1200
GAACCGTCTC GTGCAGTCCC GTACACCTTT TTTGAGGCAA AATCGATACG GGTGATATCG 1260
CTTTCCATGG AAACGCGTGC ACCCTTTTTT TCCAATTGTT CTTTATTTGC GTAGAATAGA 1320
CCCTCCGATC CACGGATTTG TCCGCCAATC CAAAGAGCCA TGCCGCAACC AAGGAAGCTA 1380
ATATTATTGT TACGGTCAAA GACTACCACT TCATTCGTGG TGGTAAGGTC TGTGAGGCAA 1440
TTGACACAGG CGGTTCCTGC GTGGTTCGCT CCGATAATGA CGATCTTCAC GGCGTCCTCC 1500
TTACGTTTGG CATTGTAGTT CAGGGAAAAG ATTTTTGTAC AGGCGCCTGA AAACAGCCGC 1560
GGTTGTTTTG TTCCCAAACG CGATAACTGG GAGAATGTTA TTcTGCGGTG CAGGTGGTTT 1620
TTCTTCAAAA ATGAGCGCGT GGAGCGCGCG AACCCGGTCT GAGTCTTCTT TTTGCACGAG 1680
GATATTCCAC TGCGTTGCCG CTTGCTGTCT ATCCTGCATG GTTAGATACA GATGAATCAA 1740
CTCAATGCGC ACCTCAGTAC ATTGGGGCCA CAGGGTGCAC GCCTCTTGGA GCGTAAGCTC 1800
TGcACGGGGA AAATCACCAA GTTGAGCAGC AACGAGTGCA ATCAGCGTAT ACACATGCGC 1860
AGTAAGTTGT ATATCTAACG CCTTCACCAG TAAGGCGTCC GCTTCATGGA GTAAATGCAG 1920
TTTGATGCTG TTTTCTGCTT TGTGCACGAG CATCTGCACG TTCTGCGGcT CCTCCCGGAG 1980
CGCTTTGTCG TACCACGGCG CTGCGTGTTC ATAGCATCCG TCGGCGTAAA AGGCGTTGGC 2040
AAGAAGATGA AAAGCCTGAC CACGTTGCGC AATCACCGCC gCGTCTGCGC GCTGCGTTTC 2100
CACTTCAAAC AGGGGAACAC AGCaCGCTAG CGCACCGCgc TGGnCGCTGT AGTTCTATAT 2160
AATTTTCCAA CACTACGAGT TCGTGCGGTA ACAGCGCACG CGCGCGTTGT AAAGACTGCG 2220
CTGCAGCGTc GAAGTTTTGC TCGCGCAGAG CCTTTTGCGC ACACAGGTTA TGCAGCCAGC 2280
CGTTATCCGG ATCGAGTGCC AGGGCGCGCG CAAGAGGCTC GTCACAATCT TTTTCGCTTA 2340
AAAAAAGAGA TTCAGCGTAT TTAAAGTGGT AGAGCGCACA GTCCGGCGCA AGTGCACAGG 2400
CTCTTTGAAA TGCATCACGT GCGCGCTGTT CGCACGCTGC TGCAGCAGGA GCGTCGTGTT 2460
CGTGCGTGCC CTGCGCTTCT CTCAAAAGCA GGCCATATAA GTACCAGACG GTAGCATCAG 2520
CGCTCCtGCG CGGCAGAGTG CGTCAAAcTG CGTGTGTGCC TGACGGTGCC GTCCTGTTGC 2580
ATAATACAAT TTGCCTGCGA TACTGCGAAC TTCTGTACGC TCAGGATCAA GACGGCGAAG 2640
TGCGAGTACC ACGCGTTCTA AATCAGCGTA GGATTCCTGT GCTAAAAAAA GCCGCGCCGC 2700
ATGTAAATAC GCCTGAATTG CTTGCTCCTT TTCCCCCAGT AGGGAAAGCT CCTGCGCCGC 2760
ATGCAGCGCA AAGAGTCCCT GGTGAGGGTC CAAACGAAAT GCACGCTGAT ACGCAGCAGC 2820
AGCGTCCTCG TGTCTATTTT GTGCAAGTGC AAGGTGACCG CATAGATTGG AAAGAAATGC 2880
ATCGCGCTCG GCTACGACAC GGTGTGTGGC AAGGAGGTGC GCGAGTTTCT CGTATGCGTT 2940
TTGTTGGTAC AGCGCACCTC CTAAGGCGAG AAGTGCCCGC ATACAGTGAG GGTCTTGCAC 3000
AAGCGCTGCG TTAAACGCTT CTTCTGCCTC ATCATAGAGG TGCTGGGCAC ACGCGATGTC 3060
TCCTGCAAGA AGATACTCGC GTAAACCAAA GGTACATTTG GCTTTTAGTT TTTTCATCTC 3120
TGcACGAGCG CaCrCTGCGc GTcCCATAGC GTACAGGGAT TCTGCGAGAC GCAGgTGCGC 3180
GCTTGCAGGG ATACGCGCAG gTTCGTTGGC GCGCAAGCAT GCAGTTTCAA CCATGAAAGG 3240
TTGCGATTCC AGACTGAATG CTGAAAGTTC AAAAGAACGA TGCGCATGGA CACCCCAGTT 3300
CTGGGCAGCA AAACAAACTT TCCCTGCCGC TTGTATCGTG TCATCTTCAA TTTGCGCAAT 3360
CCACTGATCA TCTACACAAA GAGTAAAGCT TGTGCCAACT GCAATGAGGC TCAACACGCG 3420
CACTTCATCG CtGGCGCCAG TGTCAGTCCA TCCGAGCAGG GGGAGGGGGG TGTTGTTGAC 3480
TACCGCGTCC AGACGCAGCC ATCCACCGTC TGAAACAAGA AGCGCGTAAA AGGTGCTCTC 3540
ATTCAGATAA CGAAACAAGA GCCcTGCGGC GCAGGTACCT GCTCGCTCGG GCACCGCCTC 3600
GCCCATTGCG CTCGGGTCGA CGGCAACTGC CTCCGGAAGA CAGGCAGGCC GAGAGGTCGA 3660
ATCAAGAGAA GCAGGTACTT CTGGCTGTTC TGAATTAGGC AGCGCACGCG CCTCGCCTGG 3720
AGGGTGAGTA CCCGGGAGAA AGCGGATGCG CGCAGTAAGC ACGCAGTCCT TATAACGAAA 3780
GACGGGGTTA GCACTCCACG CATAGAGGAA CTTACGTCTG AGGTGGAGCG TTAACCCATG 3840
CGCGCCACGT GCAGTCTCAT AGCCGTCCCC TGCCTCCGCG TGCCAGCGTG CATGTTCCGT 3900
AGAAGAAAAG TCAGCGCGCC AAAACTCAGA CACAATTTCT TCATAATCTA CAGGAGGTGT 3960
TACACGCCTT TTCAAGAACT TTGCTTTGAA ATATTGAAAA TACAGGCGCG CACGTCCCAC 4020
GCCCAAACTC TACGCTAATT TTCCCCAAAG TAAAAGGGGA AGGGTGAGTC ACAAGACAGC 4080
ACACAGGTGA TCACAGGGAG TGCGCGTCTC CGGTTAGGGA AATAAGAAAT GTGGTATGCT 4140
CCGCCTGTAC GTTTGGACTA TGGTGCAGAG GATAGGAAAA GATGCAACTG TACACTCCTG 4200
CGCCTATCTG GTACCCTCTG GcATAGTTTT TACCTAAGGA GCATTTCAAT GGCATTACTT 4260
GACATAAGTA GCGGGAACGT CCGCAAGACT ATCGAGACCA ACCCTCTGGT CATTGTGGAC 4320
TTCTGGGCTC CCTGGTGCGG TTCGTGCAAA ATGCTCGGTC CTGTTCTGGA GGAGGTAGAA 4380
AGCGAAGTCG GCAGCGGTGT TGTTATTGGA AAACTGAATG TCGATGACGA CCAAGATCTC 4440
GCCGTTGAGT TCAATGTGGC GAGCATCCCC ACGCTTATTG TTTTTAAAGA CGGGAAAGAA 4500
GTCGATCGTT CCATAGGCTT CGTTGATAAG TCAAAAATTC TCACGCTCAT CCAGAAGAAC 4560
GCCTAAGGAT ATTTCTTTCG TACGGAGTGT GCTACCAGCT CATCAGCAAA GCGATGCAGC 4620
CGGTGCCTAG GGGAACAGTT ATATTGTCGT AGTCTTTGCA GGGGAGCAGT TCGATGAGCG 4680
CCGCGCAGAC TCCGAGCCCC CAGCtGCGCG CGCCGGAGnT CCCAGCGCGT GCACACTCAG 4740
TGCCGTTGCT ATGCAGCACA CGGCACTGCC TACGGGTGTT TTCCCCTGCA CGGAAAAGAG 4800
CGTCGTGGTC CTTCCCCTTC TGCGTCCGGG GGAATCAGCG GAATTTTTGA AGCGCAAGAA 4860
AGTGTATACA GTACCGGCGA GGCTTGCGCA TCCGTCTCCG AAAGCGAGGG CGCAAATGGC 4920
GGCTGCTGCG ATAGGTGGGG GAAAGAGAAG GATCGCGCTT GAAACACCGA TTGCAAGGGT 4980
GAGTGGCCCT CGCACAAGAG AACGTTCGTG TTTGTCCCGG TCTCGGGCCG CCGCTCGGGT 5040
GAGGAGTGAG ACAAGTGGCA ATGTTTTTCC ACGCAgTCTC CACCGCTCGG CAACATAATA 5100
GCCAACGCCG AGCGTGCCAA TGGCGCCGAG CGTAAGGGGT TTGCTCCATG CCGCAAGCAC 5160
AATGCTCAGT GCTGACGACA GGTGTATGCC CTTTCTGAGG CATTCGCTTG CAAGGCGTGC 5220
GTGCATGCTT TCTACGTGAG CGGCGCGGCG AGTACGTGCG GCTGGGTTAG CGCCGGCCTG 5280
TACGGCCCCG ATCCTGGCTG GATGCTCGCC CCGCCGGCAG GGGGCGGCTG TCTGCAAGnT 5340
GTTGTTCTGG TCCTCAGCGT CCACCGGCTT CCGTCTTTTC ACAGGGGCCA GCGCAgTACT 5400
ACGCGCGCTA CATTGAAAAA TACGAAGAGA CTGGTAGTAT CTACGATTGT GGTGATCAGC 5460
GGTCCTGCCA TGATGGCGGG GTCTAAGCAG AGCTTTTTAG CAAGAATAGG AAGGAGCCCT 5520
CCGGTCAGCT TTGCCGCTAC TACGGTAATC ATCAGTGTTA TTCCCACGCT CCCACATAGG 5580
GCAAGCGGCT TGCCATTTAT GAAGTACGTT TTCGCCAGGC TTGTTGTGCC TAGAATGCCG 5640
CCAACTGCCA GCGCAACTCG CAGCTCTTTG TACAGCACCC GGAGCCAATC GTGTAGCTGA 5700
ATCTCACCCG TTGCAAGGCC TCGGATAATA AGCGTAGAAG ACTGACTGCC TGAATTCCCC 5760
CCGGTGTCCA TGAGCATAGG GATAAACGTG GTAAGCGCGG TGGCAGTCAC CAACAAATCC 5820
TCGTATCCGG CGATGAGATT TCCCGTGAAC GTTTCAGACA CCATAAGAAG TAGTAGCCAG 5880
CCGATACGGT GTTTCACTAG GGTGAAGACC TCAGTTTCCA GGTATGCTTC ATCTGAAGGC 5940
TGCATGGCCG CCATGATCTG GAAATCCTCG GTAACTTCCT GCTGCATCAC GTCCATGATG 6000
TCATCGACGG TGATAATGCC AATGAGCCGT CTTTCAGTAT CCACCACAGG GAGTGCAAGG 6060
AAGCCATATT TTTTAAACAC CGCCGCGACT GCTTCTTGAT CATCATGGGT GTGTACAAAG 6120
ATGCAGTCAC GTTCACACAG ATTCTCAATC AACAGATCTC CCTGACTAAG CACGAGCTTT 6180
TTTAAAGAAA TGACACCGTG CAAAAACCGG TTTTGGTCAA TGACATAGCA CGTGTACACG 6240
GTTTCTTTTT TTAATCCGGT TTCCCTAATG CAGCAGAGGG CATCGTGCAC GGTCATCTGC 6300
TTCTCTAAGT CTACATATTC AGTTGTCATG AGGCTTCCTG CAGAATCCTC TGGATATTTT 6360
AAAAGTTGAT TGATAACTTG ACGTTCTGTT TCACCCGTTT GTGCGAGAAT GCGTTTCACA 6420
GCATTTGCAG GCATTTCTTC TACCAAATCG ACTATATCAT CCATGGCGAG TTCTGCAAGA 6480
ATAGGTGCAA GCTCTCTGTC TGTAATGGTG GCGAGAAAAG CAGATTGTTT GCTGCTTGAA 6540
AGCTGCGCAA ACACATTTGC AGCGAGGTTC TTGGGCAGCA TTCTAAATAG CAACAACGCC 6600
TGTGCAGGTG ATTGCATGTC CAAGACATGC GCGACGTCCA CCTCGTTCAT CTCGTTTAGA 6660
TTTGCAATAA GTGGTACGTA ACGCTTGTGC TCGAGAAGCG TCTGGATTTT TTCAAAGTTC 6720
TCGTTCATAG CCAATTCCCA CGCGGAGTTc CGGAGTATAC GTGAGTTCAC CTCTGTTCTT 6780
CCATAGGTGC ACGTCCCGCA CGAAAGTGAC TGCTCCTGCT CCCAAAGGCC TGGCGCCACC 6840
TAGGCAAAAC GCACAAAAGC AGAGAGGGAG CGAGAGACGC GTTCGTTCAG GCAGATCAAT 6900
CGATGGAGTT CAAAATTCGA GTGAGCCTGT TTTTATCGTA TAGATCCATA GGTCTGCCCT 6960
CTGCAAAACG GCGCGCCTCA TCAGTAAAAG CACCGGCGGT CATGCAAATG CCCTTGCCAG 7020
CTTTTAGCTC TCTGATACGT CCATGCAGAT CGCGCAAAAC GAGTTCCCCT ACTGACCCCT 7080
GAGAGCGAAA AAAACGGAAC AATACGAGGT CGGCCCACTT TGGCGTGTCA ATTTCAGAAA 7140
CTATATCTGT GTGAGTATTC AGAACCGATA TGTCCAAAAT CTTTACCCGC GCGTGCGGAA 7200
AGTACTTTGA TACCACCTTT CTGCATAATC CAATGAACTC ACTCTGAGCC GCTATCATGT 7260
ACAGGTGTAA GTTTCTATTC TGATTTAATT CCTGATAGTA CGTAATGAGC GCCGTGACAT 7320
CCCGGTACTC AGGGTTAAGC TGGTGAATCT CTCTAAGCAT AACCAGGGCT CTTCCCAGGT 7380
TCTGGGTTTT TATCAGTGTC TGCGCGAGAC GATAACGCAG CTCGTTTGCA ACGTCTGAGG 7440
GTATATTGTC ATGCTTCAAC CCAATTTCAA AATCCTCTGC AGCGTCCTCC AGCTGGTTCA 7500
GCTTTGTCTT AATGGTCCCT GCATACAACG CAGCCCGCGG CCCAACGAGA GGATCAACCC 7560
TCAGATGATT GAAGATTTTC AATGCACGGT CGTGTGCATT CGCCTCATAG AAGCACTCTC 7620
CCATTACGAA CAATGTTTCT TTGTCTTCAG GTTGGAGGTC CAGCGCCTTC TTCAGATAGG 7680
GAAGGGCTTC CTGATACTTT TGCAGTTTTT GAAACGTGTA ACCCGAgcaG sTGCGCTGCA 7740
GGTGCACTAT CCGGTTGCGA TGTTAGCGCT CTTCTCAAAA GAGGAACGAC CTGCTCATAC 7800
TGCCGGTCGA GGTAATAGAC CCTTCCCAGG TTGTAGTTTA CTTCAAAATG GTGGGAGTCG 7860
ATGCCTTGCG CAAGGAGCAG TCCCTTCTTT GCCTCCGCGT ACTTTTCTGT TTTTACGCCG 7920
CAAATCCCGT ACCGGAGTAG GATCTCAAAC TGCTCCATCT GTGAGCTCGA GCTGcTGCGC 7980
TCCACGAGTG TCGCGTACAT CGCGAAGGCT TTTTCCCACT CCTGATCCTG GTAGTGTATG 8040
TCCCCCAGTA CCGAAAGACC CCGAATGTCA TACGGGTCCT GCGCAAGCCG TCTAGATGCT 8100
TCCCTCATGA GAGACTCTTT GTCTTTTCTT CTGCTTGAAC GCGTCCCCAC CTTGCTCGCT 8160
ACCGTGGCAA CAAGCATGAT ACTCGAGAGC GCAAGCATAA CGACAAAGAA GACGATAACG 8220
GCAGAACTCA CTCGGCGCAG TGTGCTAACA AATCATTCAG CTGTCAATAG ACTCTGCAAT 8280
GTGCCGGTAG gCTGGaAACA GAACTTAAGA AACCTGACTG CATATCTTTG CAACACCTGC 8340
GCGCCACTCC AGCAGTGTAT TTTTTGTTTC CCACTCAATG AGCCGAACAA TACTAAAGCA 8400
GGTGGTACTC GCTGGGTTGG GAAAAACAAT AACTGCAAAT TTCCCTTGGC CAACATACAC 8460
GACATGTTCG TGCGCAGCCA AACACTCAGA CTCCCTTAAA AAACTGAGCA CACAGTCAAA 8520
ATGAACTGGT CGACCGGAAT CCTTGTCACC AAGAGCTACG GGAAGATCTT TGATAACCTC 8580
TTTGCAACGC GGACACACAT ACTCGCAAAA CTCCAGCTTT GGCGAAAAAA CATCACGCCT 8640
CGGCGTGCGC TTCCTTTTTC TTCTACGGCC GCTTGACTGA ACGTCAGTCA CACTCTATTC 8700
CTCTCATTGA CAATAACCGC TTACTGAGAA TGAGCGCGAG CCTTTGAATA GCCCCCTCAA 8760
TATAGGACTG ATCTTGAATC TTCTTTTCCA GTTCGTGAAT TGTTTCAGAG CAAAACACcT 8820
GCGGGAATTC GTCTTCCTCG AGAAGGAAAA ACGGAACCTT TCATACCAAA TCCTCAAAGG 8880
CATGTGGCAA GTGCACGATG TCCACATCCT GCCTTCGAAA AGGATCGGAC CGCAGCACGA 8940
TGACGTCAAA GCGCGCACAC ATGTGGTTGT ATTCTCGAGC GCTGGCAAGA AAATGTTTAG 9000
CGGTTTCACA GATCCTTTTC TGCTTGCGCT TTCCAACAAT AATTGCTAAG TCGGCGTAAC 9060
TGGTACACCG TAGCGTCTTC ACTTCTACGA ATACTATTGT GTCATCCtGc TGCGCAATAA 9120
TATCAATTTC ACCTGTTGCT CTGCGCCAGT TTCGTGTGAT GATAATATAT CCGCGCGTCG 9180
CTAGCCAGCG CGCCGCATAC GCCTCTCCAA ATGCACCGAG TAACTTATTG TGCTTTGGCA 9240
TAACTCCACG GACTCACTTC TCCTACAATG TAATTCGTTT TATTGACCGC TAAATCAATG 9300
AACTCACCGC TGTCGTCAGG ATGGGTTACG TCGTGCTCGT CGATAACTAT TCTTAAACAC 9360
GGACGCAACG GCGCCGTGGC CGTGGTCTTT ACCACGCGCG CAACTGCTGA GTTATTCAGG 9420
AGCACCAACG ACCCAATGGG GTAAATACCG ACGCTTTGAA TCATTGCCTT GATTACGTCT 9480
GGATCAAAGC GACGTGCGTT GTCAGCAAGT AAGCTTTTCA TTGCTTGATA GCCACTGAAC 9540
GGTTTGCGAT ACGACTTTGG CGCAAGCATT GCAGCAAAGT TATCAGTAAC TGCAAGAATC 9600
CTCGCACCGA TGGTAATTTT GTTTCCAGAA AGAGACTGGG GATACCCTTT TCCATTCCAG 9660
TGTTCATGGT GCTGGAGTAC ACTCAGTCCA ACCGAGTTCG GGTATTTGAG CGTGTTTACG 9720
ATGTAGGAAT GTGCGTAAAT GGTGTGCGCG TCAACTGCCT GCTGTTCTTG AAAATGCAAT 9780
CTCCCCGACT TCTTCAAGAT GTCGGCAGGA ACATGCTGCA TACCGATATC GTGCAGGAGT 9840
GAGGCAACGA CCAAATCAAA TATATCTTTT TCAGAAAAAC CCAAATGCTG CGCAACGATA 9900
ATGGAAAAGA TAGCCGTATC TACTGCAGGT TTTGCAAATC GAAATCCTTC GATTTTGTAT 9960
GACAGCACTA AACTGACAAA CCCGAGTGTA TTTGCTCGAA CTAATTCTGA AAGGCGCTTT 10020
GCAAGCATGT CCGCAGGCGC GCAGGAAGTG TCGTGTGCGC ATTCATCTTG GCAAAAAGCA 10080
TGTTCAATTC CTGAATAAAG CCAACGTATT CTTCATGATA GTGGGGATTG ATACACACCT 10140
TAGGAAGGAG TTCACAAATA TCCTTCAGGA TATCTTGAGT ACGGAgTTCT CCGGTGGAAA 10200
CGATATCGTC TTCAGGATCA AGCAGTTCTT CTGCTGCAAg cTCTTCAAAT TCTGCTAATT 10260
CCTCCACGGT AGAAGAGGGA ACTGACTCCC CTTCAGCCAG CACCCTACCT GCGGTCACAA 10320
CGTAGGGAAT ATTCCAATCC TGGAGCACCG TCAGCTCTCG AGTGCTTACC GGCTCCCCTT 10380
CCCTGAGGAG GAGGTTTTCT CCGTCGTCGA AGAACACGGG CTCGGAAAAA CACATTCCTT 10440
CTTGCAGTTC AGATACATCG ATTTTTTGTG ACATGGACCT GTACCTCTTT ACCCGCCTTA 10500
TCTTCGGCAT CGGTGCGCAC GGGTTAATAC CACGTCTTAT GCACACACAG CGGTAACGTT 10560
ACTGTTGTGT GTACAAAAAG GCAAAGATTG CAGAGACAAC CTGATAGGTT TCCGGTGGGA 10620
TACACGCaCC TATCCTGTGC TCAGACAGAA CACGCGCCaG AAGTTCGTCT TGCACCAAGG 10680
CGATATCAAA CTTTTTTGCA ATTTCAACAA TTTTTTCTGC AATGGCGCCC GTGCCCGAAG 10740
CGACAATAAT GGGCGCCTTA TCTCCCGTTG CATAGGAGAG CGCAACCGAG CACGCACGCT 10800
TTCTTTTCAT GGTGGCGGTA GTGTGCGTCT GCTATACAGA AACGTCAATC CCTTTGAAGG 10860
GGACCGCATC GTTTGCCGAG TCTGCATGCG CACCATACCG TACGGATAAA AAATCAATGC 10920
CsCGTTCGCG AAGGAGGGCA CACAGGCGTG CAACTGTTTT TTTCtGscTG TGCGCGACAG 10980
TTCCCGTGCC GCGTGTCTTT GCACCGTGAG TACGTTATTC TGCACGCAGA AGACCCATTC 11040
TCCTGCAGTG TTACACGCAC GTACGCGCAg cTGGgTGCAT GTTTTTTTGT GGAGGTGTAA 11100
TAACAGCGAG AGAGTGCCGC GCCACACGGT GTGCGCGCCG CgCCGCTCAA AGGGTACAAG 11160
AAGCCAGTGG AGTGCTGAGT GATATGTGTG GTTAACGAGC GCAAAGAGAT CTCGCTCCGT 11220
GTCAGAGTCT GTGTATGCGC CTGATTCCCC AAGAAAGGTG CTCAGCATAC GGCGGAGCAT 11280
CGCTGCATCA ACGGGAATGT TTCGGTCGCC TAAGATACTC GCAAGGAACG CGGCGTGTGC 11340
TCTTTTTTGC TCAGGAAACT TTTTTAGCAA AAGCGCAAAG CGTTCAATGA GCTGCGGCTG 11400
CAGCGGCACA CACAGGGATG TATGCGCGTG GATAAGCGCT GCAGCTTCAG GAGAGAAAGA 11460
AACACCCCAG CGCTGCAAAA AATGCGCCGA CATATCCTCA GCAGATGGAG GAGTGCTCGT 11520
GCACTGCGGG TGCAGGAACA CCGTACCCGC GCGGATAGAC GCGCGCAGAA ACAGGACTGC 11580
GCCTTTGTGC ATAGGCTGCG GTACACGCGC GCGCACCCGT TCACCGTTAA TGGCGACAAG 11640
CGCACGGCCG GCGTGCGTGC TGCTGAGAAT GCGGACGCAC ACGAGCGCCC CTTCAGTAAG 11700
GGAAACCGTG CGCGGCACTT CGGTGAGTAC TACCCGAACA GCTCCGTTCA CGCGCTGGGC 11760
GACGGCTTTT TAACAATACG TGCTTTGATA CGGGCGGCCT TTCCTATTTT TTCTCGGATG 11820
TAATAGAGcT TCGCGCGTCG CACCTTTCCT GCACGTAcTA CGTCGACCCG CTCGATACGG 11880
GGGGAGTGGA GCGGGAATAT ACGCTCCACT CCAACGCCAT AGGAATTTTT GCGCACCGTA 11940
AAGGTGCGCC TGACGCCGCT ATTTTTAAAA CACAGAACGA GCCCTTCGTA AGCTTGGATG 12000
CGCTCTGTTT TTCCCTCCAC TATTTTGAAA TGCACACGTA CGGTGTCCCC GACGCGGAAC 12060
GTTTCAGCTG GTTCCTTTCG CTGCTGGTTT TCAATTTGTT GGATGAGGTG GCAACTCATA 12120
GTCTAACTCC TTAAGAAGGG ACTCAGCCTC TTGAGTCCAG GCTGCAGACG CACGCGCAGC 12180
gctGaGGAGG TCAGGTCTAT TCCTTCGTGT TTTTTCGATC TGGCGCGCAA gcCGCCACGT 12240
GCGGATATGC GCGTGGTGAC CGGAGAGAAG TACAGGGGGA ACGTCCCGGT TGTGAAAACA 12300
GCGCGGCCTG GTGTACTGCG GGTACTCCAC GAGACTGTTT ACAAAACTTT CCTCCTCGAG 12360
AGATTCATGG CGGATGACAC CGCTAACACA GCGGCTCACC GCATCGATGA GCACGAGCGC 12420
GGCGATCTCT CCTGAGGAGA GAACGTAGTT CCCGATGCAA AtTCGTCGTC GACATACTCG 12480
TCTATGATAC GTTGGTCAAT TCCCTCGTAT CTGCCGCAGA TGAGCACGAG GGCACGTTCC 12540
TGTGCGAGTG AGCGCGCATA GCCTTGCTCA AAGAGCTTTC CAGAGGGAGT GACGTACACG 12600
ACGCGCTTTT TGGGAGCGTC TACTGAATCC AGGGCCTTCC cTAACGGTTC TGAGCGCATG 12660
AGCATGCCAG GTCCGCCCCC GTAGGcGGGG cGTCACAGTG TTTGTGTTTG TCGTGCGCAA 12720
AGTCACGGAT GTTGACAATA TTGTAGTGAA TGATCCCGTC GCTCACGGCG CGCGCCATTA 12780
TTGAGGTGGA GAAATAAACC CGCGGGATGG CGGGAAAGAG AGTCAGTACG TCAATGTTCA 12840
TTCGAGAATC CACCGCTGCA ACAACTCGAT TTTTTTTCGA CCAACGTCCA CGTCCCCAAT 12900
GAAGGTCCGG TGAAAAGGCA CATAGCAAAC GCCACCATGT GTCCTTTGAA CCTCTAAAAG 12960
GGAGCTACCG CCCCCTTCGA CAACGCTCAA GACAACACCC ACCGCCGAGC CCTCGAAAAC 13020
GAGTTCACAA CGACACAGAT CGGCCAGGTA AAACTCCCCA GCGCTAAGCG GACAAGCCTC 13080
GGCaCgCGGT ACCCGCAGCT CTGCTCCTAC AAACGTCCGA GCGCACTCTA CCGTATCTAC 13140
GCGGTGGAGC TTGAGCAGCG CGTcCTGCGC ACGTAGGAGA ACGTGCTCTA CCATGTGGAC 13200
GGCCTCACGC GGGAGAGCAC AAGCGAGGGT GCCTGAGGAT CTGCTCCGTG GAGGAGCAAG 13260
ACAAACCTGC TTTAGTGTGG CAAGATGTGC ATACTCACCC GAGAAGCTCT TGAGCCTGAG 13320
TAAACCCGCA ACCCCAAAGG TGCTCACGAT GCGTGCAGTC ACAATTCTAT CCATAACCCA 13380
CCACACACCG CCTGCAACAA GGCCGCAAGA CGGAAAAGGA GCCGCTAGTC GATGATCTCT 13440
AAAGCGTAAC GCGTCTGAGA AGCGTGCGCA GACGCAGAAA GGAGCGTkcG CAGAGCGCGC 13500
GCAATTCTGC CGTGCTTGCC AATGACCTTC CCTACATCTT CAGAGGCAAC ACGTAACTGA 13560
AGGATCTCCA ATCCCTCCCC TGGAGACTTG GTGACGGTAA CCTCCCCAGG ACGATCCACA 13620
AGCGCCCGCG CAATATAGGC GATTAGCTCT TCTTCCATCG TGACCATCCC CTTGCCTAGA 13680
CGCCCTGCCC CCCTGGGGAA GAAGGGATTG GAGCGGCGCA GGAAACGGAC TCTACGTGcG 13740
CCAGATCGGC AGCCTGCTGT GAAGAAGCAA CACGGCGCTC ATCTGAGGCA AtGCGTTTAG 13800
GACAGAACCA CGCCTGGACT GCAAGAGCtG CGGACCGTAT CCGAGGGCTG gCGCCGCGyT 13860
CAAGCCAGAA GCGCGCACGG TCAAGGCGGA AAGACACCTC GGTACCCTTT GGGGCTATGG 13920
GCTGGTAAAT ACCCAGTTCT TCGATTGCCC TGCCATCTCT CGGCTCGCGC GCGTCCTGAA 13980
CTACGATTCG GTAGTACGGA CGCTTCTTAC TCCCCAATTT TTTCAGTCGG ATCCGTAAAC 14040
TCACTCTGTC TCCTCCGCCT GAGACACACG CaGGTGCGCT CATCCTTTCT AAAATTATCT 14100
GTGCTGTCAA GTGTCTGAGC ACGTAACGGG ACATGGAGAA TAGATTACAG AGGAGCGGCA 14160
CGTGACGCGT CATTTCTGCG CTGTAACGGT TGTATTGGGG GAGAAGGAAA ACtGCAGTGC 14220
AGCGGGTGTG CCTTTGTCTT GGGTGTTCTT ATTCAAAAAT GATACGCACC TGGGATTTAA 14280
AGGTAGTACC TCGCTCAAAC TCGTGGGCGA AAAGTTCAAA ACTCGCTGCC CCAGGAGAGA 14340
ACACGATCAC GCCTGGGCTT TTCTGCCGCG CCCGGAGATC CTGCAGGAGT ACCTCAAGGG 14400
AAGTAAACGG TCCGTAAAAA GGTACCTGTG CTGCATGAAG AAGTGGTTGC AACCGTGCCG 14460
TAGCACTTyC TGCAAGCAAG TACAATGCGT GCGCCTTGGC TGCTGCCTGT GCCAAGGGTT 14520
GGTAGTCTGC ATTCTTATCA GTGCCGCCAA CGATAAGAAC CACGCTTTCA TCAAAGGCTT 14580
CCAGCGCTGC AATTGTTGCC TCAGGTACAG TGGATGCAGA ATCGTTATAA AAACGCAGTC 14640
CCCCCTTTTC GTAAAAAAAC TCTAGTCGGT GTTCGATGCC CGTGTAGGAC TCCAGCGCTT 14700
GTGCAAGACG CCGGGTGTGC TCTTGGAAGG GACTGTGAGC GGACGGACAA GCGTAGTCTG 14760
GGGGGGACGC GTGATTCnCG TACGCGGGGG AATGGGAGTG TGCGCAAAAA CACGGGGGGC 14820
AGGAGGAAGG AGGTAGAGAA TGCCCTGCGC GAACAGGAcG CTGCAAGCGC TGCACTCGCC 14880
ACTTGCGTTT GGAGGACACG GCCCGGTACG TGCAGCTGCG GTGGGATGAG CATGCAGGCG 14940
CGGTCACCTT CTGCAAAACG TGCCCAGTAG GTTCCGTCCG TCGCTCTCCA TAGAGCTCTC 15000
TCCATGAGGC GCGGGGTGCA CGCGCGGCAA GCGGTCTCAG GCGACTGGGC CGTATACCAA 15060
AAGACGCGsA TCCGTTTTTC TGCkTcGCAG GCAAAGCGGG GTCCCCACCC GTCATCTGCt 15120
TACACAGCAG TGTATCGTGC GTTCCCTGGT GTGCGTATAG CACCTGTTTG TCTGCCACGT 15180
AGCTTTCCAT ATCCGCATAC CAGTTTTGAT GGTCAGCCAT AATGGGAGTC ATGATGGCAA 15240
TCTCCGGGCG CAGCAGACCG GCGTGGTGTA CAGTGTGGTC CTGTGCATCG ACTGCGCGTA 15300
GGTCTGCAAG CTGCCAGCTC GACAgTTCCA GAACCACTGG TGTTGCAGGC GTTGTGTGAC 15360
GCACAAATTC CAGCGGGCTG ACTGTGCTAT TCCCCCCTAG AAAGGCGGGG AAACCCAGCG 15420
CACGCAAGCT GTAGCACAGG GCGCTGGCAG TGGAGGATTT TCCCTTGtGC CGCTTACTGC 15480
TAGCAGCGGG GCGGGAGAAA GGCGTAGGAA AAGGGAGATA TCCGTTTCgA tGgCGCGCCG 15540
GCGCkTTGAG CAGCGGAAAG GTAGATGTTG TGTGCACCCT TCACGATGGG ATTTTTGATG 15600
ACAACATGCG CGTTTTCAAA ATCTTCCAGC CGGTGTTCAC CGAGCGTAAA GCGGATGGAC 15660
GGGTACGCAC GAAgTCTTTT CAGGGAAGGG GTAAGCGCAT CAGCATTTCG CAGGTCGGTA 15720
ACCGTAAGgC GCgCTCCCGC TTCTGCACAA AAGnCAGsTG CCGCGCAgcC CCCGCCGTGC 15780
ACGCCGAGGC CCATGATGGT TACCGTTTTG CCTTGAAGAA GTGCGCGCGC CTGCTCCACG 15840
ATGCGGCCGA TTGTAGCCcG CGCAACGCGT GACAATAcAA GAGACGCGTG CGGTGCTCGC 15900
GGCACGGTTT CTCTTTACTT TTTGTTGCTT TTTTACTACC CTCGCGCGCT ATCTGCTTAT 15960
GGCTGAACAT ACTTCCTGTA CGAGCATTCA TCCTCTTGTG CGCAGCGCGT TTTACGCCGG 16020
GGGTGCGCAT GCAGTACTGC TTATTCATGG GTACATGGGC ACCCCGCGCG AGATGCAGTT 16080
TTTAGGTCGT GCGCTCCACC GGGACGGCTT TACGGTCTCT ATTCCCCGTT TACCTGGTCA 16140
CGGTACGAAT AGAGAGGATT TTCTTGAGAC CGGGTGGAGG GATTGGCTGC GGCGCGTGTG 16200
TGATGAGTAC CGTGACCTTT CCGCTGCGTa CCtTCGGTAT CTGTGGGGGG GCTGTCCATG 16260
GGAGGTGTGC TGACTGCACT CGTGGCGGCG CGTTTTTGTC CCCAGAAAGC TTTCTTTTGT 16320
GCACCGGGTT TTGCAGTTTC TGATTGGAGG ATAAAGCTGT CTCCTCTAGT CAGGTGGTTT 16380
GTGCGTGAGT TTGCTGCGGA CGCGGCTCCC TTCTACCCCG AGCAAGACTT TAATGACGCC 16440
ACAAAGGATT ACCGGAGTGC GCACTACATT GCCCAGGTGG CGCAGTTTTA CGCACTGCAA 16500
AGACGTGCGA TCCGTTCGCT GGCGTGCATT CGGAGTACGT TGTTAACGAT CCTGTCTCGG 16560
CAGGACCCAT TGGTGCCGTG TGCAGCGGTG CAAAAATTAC TCGATGCGCG TGTGCGCACG 16620
CACACCAGTA CGTATG 16636
(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13330 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
TGATAAGCCC AGCGAATTAA TAACAAATCC TAAAAAAAGC GTGwArCCGA AAAAGGCCAT 60
AAGCGTGTGC AGAGAGCCAT GAGAAGGCAT TAGGTGCAAG ATGTGATCGC GTGGTACTCC 120
ACCACATAGC GCAGTAACGC TGCGAATCAA CGGAGCGAGG AGCGTACCAT GAGCGAAAAA 180
CTCCCCCGTC AGAAAACCCA TCACCATGCT AGAGAAAcCT ACGGACAAGA ACACGTAGTC 240
AAGGTGTGCC CATCTATTTA GCGCACGTAC ACGCCGCGTC CGCAGTAGCA AACCAAGTAC 300
GAAAAAGAGG AGACCCTGCC CGAGGTCCCC AAACATAATC CCAAAAAGCA GCGCATAGGA 360
GAAAGCAACG AAAGGAGTCG GATCGACGAG CCCGTAGGGG GGACAACCAT AACTAGACAC 420
CATACGCTCG TAACTACGCA CAAAACGGCC ATGCTGGTAA CACACCGGCA CATGCTCGCT 480
GCCATCCCTG ATAAAAGACA GCTCCTGTGG TTCAAACAGG CGGACTGCCA TCCTCCCTGT 540
GGTCACGTTG TCCAATCCTG CAACGAGGTC CTTCGCCTCA TGTGCTGGCA ACCAGCCAGC 600
TATACGATAG GTATGCCGGG TAGACTCAAG CGCATCACGC GTGcGGTGcA CACGTTCCTG 660
CAGCGCAAAA CGCCTAAGCA GCGCACACAG TGCCGGGcCT CGCATCGAGC CGAGCGCACA 720
TTTCTCACCC TGaAACTGCT CTTCGTTGGG CAGAATCATG GGCTACAGAA CTCTTCCCAC 780
GGGGAACTGA GGGAGCATCA CCCTCTCGTT CGCGTGCGTG CAAAGCGTcA ACTTCTGCAA 840
GTATCTGCTG TGCCAACGTG TAGTCTTCTT CGGTAGGCAA AGAATCCCCT GGAGAAAACG 900
CGCACTCGCC AGAGACACCC AGGTACTTAC ACGCTTCTTC AAGACGGCCG ACATACTCTT 960
TGCTCTGCGC ACAGTGGGAA GAACTGCCAC GCGCCGCGGC GGAAAgACGC AAATGGACAA 1020
GCGCCGTCTT TCCCAGGTAC TCCAGTACTC GGTCCACATC CCGTTCAAGC ACTACAAGTT 1080
CGAGGAACTT CATCCTCTGC GATCTAAACA TAAGCTGCTT CCCCCAAGAA CTCACGTGGC 1140
AGCGTACGCG CACTCCCTGc ATGTATACAT TCTGCAAGCG CACAGATGCC aCGCGCTTCA 1200
AAGCGCTTCA TCATAAAAAA TGgACGATCG TCGcAGGAGT AAAGwrtGCG CATGAAAAAC 1260
CGTGCGCGCA AGACAATACA GAAAACGGGA GCCCGCACGT TCAAACGCGT CTAGATCCGG 1320
CTCCCACTGT ACCCCATCTA CAGGATTATT TAAGAAAGCG CTCATAGCGC CATCCACTCC 1380
ACGCCCCGCG CTCAGTAAGA GAACAATCCC ACGCAAACAG GATGCATTTC CACCACGTCC 1440
CCAGACACCT GCGTATCGCA CCGTTAGAAG AAAGcTCCGA AGGwrCnTyT nTCTTCTGTA 1500
CACCATAGAA CATGCGCAGA CGAAtACCCA CCCAACAGCG TACAACGCCA CTTCTTCACG 1560
CACTAACCGG GAACATGCAG TGCGATCATC TTCGGGAATA GAACACAaTG TCTGCCACAA 1620
CGCATGGTAG TAACTTTGAT CTAAGCGCAA ATCCCACAAG ACACGCTCAT CACTCCGAGG 1680
TACGCGGTTG TACCAGGAAA CTGGGCTCCC CCGAGTGACA GCCTCGATAC ACGGCCACTT 1740
TTCCCAACGA AAAAGGGAAA AACGACCCAA ATCTGGCGGT TGGCAGACGC GAAACCCCGC 1800
CTGACTCTCA GAAGAAGAGA GCGTCTTCAG GTAGAGATAG TCGTAACGAG CAAGCAGCGC 1860
GGAAAAAAGC GAAGGGGGTT CCTGATAGCA CGATGCCGCA CGCACACAAT CGCGCAGCAG 1920
ACACGCCTCC ACCCGCCGCT GGATCAGACC TACCAGATGT GCACCAGCAG ACCTAGGGGC 1980
AGGCTCAGAA AAAAGGCGCA CCCAGAGATC ACACAGCGTC CCCACACCCT GCAGGGTACC 2040
CAGGCGCTCA CCAACCCAAA GCCGAGCACA CAAACCGGAT AGTTTCGCAT AGACGAACAC 2100
ATCCGCGACA TCTCGCTCCA CAGCCACACA CTACCCAAAA AAGAAACCAT CAAGGAAGGC 2160
TTCGAAGCGC TCCACGTCCT CCTCCAGCAC AGAAAGCTCG CGTTCTAACT GTCCACACGC 2220
CTCCCTTCTC TGAGCGTCTA TCCGCGCACA GGCCTCATGC CACGCCTGCT CCTGCGCCTC 2280
AACCACGCTC CCCTCCGCTT CCTTCACCAA GGTCTCCCCT GCCGCCTCTG CAGATTCGCG 2340
CAtCCtGCGC CTGCTCtTGC GCCTTCTCCA CGAGGACAGC GGCCGCATCC TCCACCTCCG 2400
CTAGGCGGAC CATCACCCCT TTATCCATAT TACCAACCCA CGCGGCCGAC ACTCAGCGAG 2460
CACATCGACC TACTCCCGTA CCTCAGGGGA CGGAGGAACC GCTACCCCAG AGCCCGAGGA 2520
ATTCCTGCGG CAGGACCAGT AGACTCAGGA GACGTTGGAG CACCAGCACC TTCTTcGGAG 2580
GkTGCTTCCA CCACTCCACa GCcGTCTCGG CTTGCTTcTg TGCGCCGCTT TCTGCAAGCC 2640
CGTGTCGTCC GGAGCCCGGT TCAACAACGC GAGAAAGAAA GTCAAACCAA AAAAGAGCCC 2700
TACCATAACG TAGGAAGTCT TCGTGAGGAC CGAAGCGGAG CGCGAACCAA AGGCAGAACG 2760
CGAACCGCCC GAAAACATGC CACCAgCCCA TCTCCCTCTT CAGTCTGCAA GAGsATAGCG 2820
TGACCACCAG AgGCAAACCA CCACCAGGAG TGAGAGTATC ATAACGCTCA GCACAGCCAT 2880
GCCTCATGCT ACACCAAAGC GTGCGCAAAA ACAAGGACTT CAACCTTTCA CTCCCTGTCC 2940
CGAGAACAGA AAAACGACCA TGTGAACGGC ACAAAAAAAG AAACCTGtTA CGCAATACCG 3000
CACACCACCC AGAATCCTTT ACTACTCTCT ACACtGCGCG CGATAGGAAC AAAAGACGCA 3060
GCCTCCAGCG AAcACCGCCA ATGAGTCCCC CGTCAATGTG CTCTTCAGCC AACAGTGCCC 3120
GCGCGTTCTC CGCTTTCATG GATCCGCCGT ATTGAATACA CAGTGCCTCT GCGATAGCCG 3180
CGCCGTACAT CTCGCGGACT ACTGACCGAA TATGAGCATG AACCGCATTC GCCTGTGCCG 3240
GAGTGGCAGT CTTACCCGTA CCAATTGCCC ACACAGGCTC ATACGCAACA GTTACATTAT 3300
GCATGAGTGA CCCACACACG TCTGCCATCC CTGCGCGCAC TTGAGTTCCC ACTACCTCGT 3360
TGGTACACCC CGCTTCATAC TCTTGGAGTC GTTCGCCGAC GCATAAGATG ACGCGCAAAC 3420
CGCTTTCTAA CACGCGTCTG ACCTTTTGAT TGATAAGCTT ATCATTCTCC CCACGCCCAT 3480
GACGCCGTTC GGAATGCCCC ACGATGACTA CCTGTACCCC CAGGTCTTCG AGTTGAAGGA 3540
CGGATACCTC TCCAGTATGC GCCCCCCACT CTTCACTACT CACGTCCTGC GCGCCAAGAA 3600
GTACGTTACT TCCCCGTAGC ACCTTCCCCA CCGCGTCTAA AGCGGTAAAA CTCGGCGCAA 3660
TCATGTATGT GTGCGGACCA CCCCGTAATT CCCGCACGAG TTCCTGCGcA AGGcCACCGC 3720
CTCCGCACAC GTTTTATGgC ATCTTCCAAT TCCCCGCGAT AAAATAGCCG CGCATATCCC 3780
CTCCTTAGCA CATcCTTTCG TTCACACCAC AAACACCCCG CCGATAGCTC CACGAGAAAC 3840
TGTTCTATAG AGCCACCGCA GAGACATACC CTCGCCAGGA GATCTCGCCC CAACCCAAGG 3900
GAAAATCTCA CGGCACCACT ATATCCTATG TTTCAAGACA CGAAATGCCC GGTAAAACTT 3960
TACCCTCAAA GAGCTTCAGC GATGCGCCAC CCCCCGtAGA TACATGGCTC ATGCGACTTG 4020
CAAGCCCAAA CTTGCTGACT GCTGCAATAG AGTCTCCTCC ACCAACTACC GACGTAGCAC 4080
CCGCATCCGT CGCCTCTGCT ATCAACTGCG CAAsACCCGT GTACCGTGTG CAAAGGCATC 4140
GAACTCAAAA ACCCCTACCG GACCATTCCA CAACACGGAG CTCACTCCCT TTAAATGCGC 4200
ACGATACTGC TCAAGCGTAC GCGGACCAAC ATCCATACCC ATCAAGTGCA TAGGAATATG 4260
CACATCGTCC ACCGcAACCG GcTGCGCATC CGCACAGAAC GTGGaCGcAC ATaCGTGaTC 4320
GACCGGCAAT aCCACCGaCA CACCACCGcT TTGaGCCTTT TGCAACAGCA TaCGTGcaGT 4380
GTCGATAAAG TCATCCTCCA CTAGGGAGGT ACCTACACCC ACACCTTGCG CTTTCAAAAA 4440
GGTGTATGCC ATCCCTCCCC CGATGATAAG CGCCGTCGAT GTTCGAAGCA GACTCTCCAA 4500
GACTGCTATC TTAGAAGATA CCTTGGCACC ACCGACAACC GCCACCATTG GCACCTTCGG 4560
GTTGCATACC ATAGGTTCCA GGTACCTCAC TTCCCGCTCT ATCAAAAGAC CGGCCACTCT 4620
GCGACGCATA AGTCTCGGGA GTACCACCGT AGATGCATGT TCACGGTGCG CAgTGCCGAA 4680
CGCATCATTA ACAAAAATGT CCCCATACTG GGCAAGCTCC CGCGCAAATT GCTCCTGCAC 4740
CTTCGCATCA CCAGATGTTT CCTCGGGGTG AAAGCGCACA TTCTCCAAAA GCACTACCGA 4800
ACsGTCGGGC AGCCCTTCAA TAAATTCACG CTGCCCGACG CAGGAAGGCG CAAAATGCAC 4860
CGGCACCCCC AACTTCTTTG CAAGGCAGTC CGCAACCGGC TTAAGCCGgT GTTTGCCGTT 4920
AATGAATGCG TGGCGGTCAA AGGGACAACC ATCTTTCTTA GCGTTCCCCT CTGCTTTATC 4980
CGCATCACGG GTAGGGTCTC CAAGATGGCT AATGaGCACT ACGTGTCGCG GCCCCTGCTC 5040
GATGATGTAC CGCAGAGTAG GAACTGCTGC AGTGACGCGC GTGTCGTCTT GCACCATACC 5100
ATCACGCATC GGTACGTTAA AATCAACACG CACGACAACA CGCTCACCTC GCATTGTGAC 5160
ATCTTTACAA GTTCTCAGCA TCATCTCCTC CTTTTTGACG CAGGGGTTAC CCCATCCGCC 5220
ACACGTTGCG GCTGATTCTC ACTATATTTC AAAAAAAGAT TCAATATCCG CATCGGTCGG 5280
ACGCCTCACA TCACTGCTCC CCTGCCTGTG CGCCCTACAC GCGTACGGGG TGGGGCACAG 5340
ACCCTTCGTC ATTGTTGACA TTTTCTATGC GGAATGA AT ACCCCGGCGG GTGCTGATTC 5400
CCCTGTAGCT CAGTTGGTAG AGCAAATGGC TGTTAACCAT TGGGTCCGTG GTTCGAGCCC 5460
GCGCGGGGGA GTGATGTTTT TGGTTCTTTC AGTTAAGAAT TCTCATGGAA GGTGGTGTGT 5520
CTTTCACGGG TTGTGGCCCC TTGGGGGCAG TGAGCAGTAC TTCCAGCTTT TTTAGAATGG 5580
GCCGTGCAGC GTCCCGTGGG TCTGTGTGCG CCTGGTTTCT ATCGGAAAAA TGCGGGGCTT 5640
GGTGCCCATG AGAGTAGGTG CCTATGGAGT TGAAGGTCCG TCAGAGTGGC GGAATATGTG 5700
TCGTAGAaTC AGTGGGGACA TGGATCTGTA TCATTCCTAC AAGCTTAAAG ACCTTGTGCT 5760
GAAGTTGTTC GATAGGGGCC CGCGCTG AT CGTCATTGAC CTTGAGGCGG TAGAGTATAT 5820
CGATTCCTCA GGGATTGGCG TTCTCATCTA TCTGTGTTCG ACAGTGAAAA AGTTAAAAAT 5880
CCACTTCTTT ATCTCAGGTG TGCACGGCTC TGTAAAGAAA GTGATTGAGC TCACCCGGCT 5940
GCTGAATTAT TTTCCCATCG CTGAAAGygT AGACGAGGCT CTTGCAAGGG CCCGATCCTC 6000
TGCACCGCCG CAGACCGGCT CCCTGTAGGT TTTTCCTCGT CATGGGTTGA ACCCTCCCAC 6060
GCGCGGGAGG GTTAGATATC CCACAGCTTT TTGCTGCCGc GTGCCGCTGC ACGGACTGCA 6120
GTGGTAGCGT CTTTCCTTTA TACTCAGCAC TGTGCATATG GTAACGGACG GCGCTTCCCC 6180
AAGAAGTGGG GTGTCGCTCA TTATCGGCAG ACCTTCCTCA GGTAAGTCAA CCTTTCTCAA 6240
TGCCGTGTGC GGGTACAAGG TGTCCATAGT TTCCCCTATA CCTCAGACAA CCCGTAACAC 6300
GGTGCGCGGC ATCGTAAATA TAGAATCCGA CCAAATTGTC TTTATGGACA CCCCGGGGTA 6360
TCACCGGTCT GACAGAAAAT TTAATCTGCG CCTGCAGTCC CTTGTGCACA GTAATGTAAA 6420
GGATGCTGAT GTGCTGTTGT ACCTAGTAGA CGCTACCCGT CAATTTGGAG AAGAAGAAGC 6480
AGCCATCTGT GCATTGCTTG CCCCGTATCA AAAAACGCGC GTATTGCTTG CCTTCAATAA 6540
AGTGGATGTC CTTCACAATT CGACCTCGTG CGACGAGCAT GCCTTTTTAC ACAGGCAAGG 6600
CAGCGTGCTG CGGGCCGGCA GCCTGGGACG AgCGCTACAC GCCGCACTCC CCCACCTCCC 6660
TGCTGATCGG GTATTTACAA TATCTGCCCT GCACCAGGTT GGGCTCGATG CCCTCATGCG 6720
CACGCTGAGA GATCTCTTGC CAGAAGCGGC GCCTCTGTAC CCTCAGGATT GCTATACGGA 6780
TCAGACCATC GCCTTTCGCG TCACTGAGCT CATCCGAGAA CAGGCAATCG CACGCTGCCG 6840
GGACGAACTG CCGCACGCAC TATACGCCGG AGTGGAAGAC ATGGAGctGC GCCGCGGCAA 6900
GCGGGAACTG TGGTGCCGTG CGTTTCTTGC AGTAGAACGG GAAAGTCAAA AGGCAGTGCT 6960
CGTGGGGAAG AAAGGTGCAG TTATTCGCgc CATACGGCTA GATGCCATCC GCGCGcTACG 7020
CACACTCCTC CCCTACCATA TTTCCCTTGA TATACGAGTG AAGGTAGACC GCAGCTGGAG 7080
ACAACGCGAC CACACACTCA GCTCCCTTCT GTACTAGGAT GACCGGTGCC CAAATGAGGA 7140
ATTCGCCGCA GGGGCGGGCC GCTCAAGGCG TATAGTTACT GAAGGTTCGT CACACACAGC 7200
CGGAGgTCCA TAATACTGTA CCGCCCCCGG ATACACGTAG CTTGTTTTTA AGGCCCAACT 7260
CGCACGCCGA CGAGAGAACA CCCGAAACGG CATGCCCTCC AGGTCAACCA GCGCCTTTTT 7320
TATCACCGGC TTTTGACTTC CATGTCGGCG CTCCATGTTC ATGAGCATAG TAAGCGGCAC 7380
GCCACCTACT GCCCACTCAG CTACAGAAGA CGTCAAGTTA CGCACCGACG CTACGTACCC 7440
TGTAAACCTG TGCACGGCCA GAAGACACGC GGTCAAACCG AGCGTATAGC AATAATCTGC 7500
ATCAAAGTTG GACGGAAAAG CGCATCGCCC TTCGTAACCA AAAAAATGAG CAATGCTGGA 7560
AAAAACACCG GTGTACGTAC CTTCCTGCTT CATCTGCGCT AAGCGCTCCG TTACCTGGAG 7620
AATGAGCAAA CGCTCTGTGT CAATGCGCGA CACCTGCACA TTCCCATGTG GATCCCGATC 7680
TGCCAAAAGC TGTGTGGAAA TTTCAGCAGG TAATGCGTTA AACACCGCAC GAGCAGAAGC 7740
AGACAACGCC TGCTCTATCC AAACGCGCTG CGGTtCAGGA GTGTCCAGCG CCTCAAATTC 7800
CTGCGCGCGG CGTGCCATCA CCTCGTTGAG CTCCGTAATT AGAGCCTTCA TTTCAGGTAT 7860
AAATTCGATA AGTCCTTCTG GAACTAACAC TATACCAAAG TGCTCACCGT GTTGTGCGCG 7920
CGTGGCGATG GTGTCACACA ACGACTGCAC GATCTGTGCG AGCGTTAACG ATTGCGCCGC 7980
TACTTCTTCC GAAATGAGAC AGACATTTGG CTGTGTTTTC AGCGCGCACT CAAGCGCAAT 8040
ATGACTGGCT GAACGCCCCA TGAGCTTAAT AAAATGCCAG TACTTGCGGG CACTGCACGC 8100
ATCGCGCGCA ATGTTCCCGA TAAGTTCACT GTATGTTTTT GTGGCAGTGT CAAAACCAAA 8160
CGAGGTTTCT ATCGCCTCAT TTTTCAAGTC TCCGTCAATA GTTTTGGGAA CACCGATAAC 8220
CTTGGTAGAA ATACCACTGT TTACGAATGT TCTGCCAAAA GGGCAGCGTT CGTGTTGGAG 8280
TCATCACCTC CTACAACTAC GAGTGCATCA AGCGCCATAC GCGTGACTGT CTGCGCCGCG 8340
GmGGcAAACT GGGACTCACT TTCGATTTTG GTGCGTCCTG AACCAATGAG GTCAAAGCCA 8400
CCTGTGTTGC GGTAgcATtC TACACGGTCT GCGCATATCT CGATATGATC GCCAGAAAGC 8460
ACGCCCGCAG GACCGCCTAG AAAACCGATA AGGACAGAGT CAGCGTGCCA TCGTTTTAAT 8520
CCGTCGAAAA GCCCTGCTAT AACGTTGTGA CCACCTGGTG CCTGACCCCC TGAGAGTACT 8580
ATGGCAACAC GTAATCCTCG CGGCTCAGGT GCAGTCTCCA TGGGGGAATC TTCGTTTTTC 8640
TCACTAGCAT TAACGAAnTT CACCAGCGGC TGACCGTAGy CGGCGcAAAA AGAGAGCGCA 8700
ACGgTcATAG TCTGCCACCG CAgTGGTGGA TAAGCCGCGA CGCGCACAAA CGCGCCGAAA 8760
GTCCCCCCGA AGAAGATCGG GGACCTTTGG CAGGTAGCGA TGCCGTTCCT GTTGCAAGAG 8820
AGAAATACTC ATCGATGATT ACTCCTTCAT ATACGAAAAA TAGCACGACC GCACCGCCGC 8880
ACCCCCACAA CTCACTCTGC AGCAGGCGCG ACCGCGTGTG GATGCGCAaT ACTCAACGCA 8940
aGAaTAGCAC GTTTAAGAAC CGTCGCTTCT TCTTCATACA GTGAACGCCC CACACTGCAC 9000
AACGCAGCGC TCTCCTGTGC GATCACCTCC GCCGCCATTT TCCTGTCGTA TCCCATCTGT 9060
ACAAGAGCAG TTACCAGATC CTCAATTTCC CTCGCATGGG GAGCACACCC AAGATTGCTC 9120
GGATGTGCAG CACGATCATC TGTCTGACTC TGGGCACAAG AGGCCgCGTC GGTTAGCGCG 9180
AGCGTACCTT TCAGCGCTAA GAGCATGCGC TGTGCAGTCT TTTTTCCAAT GCCTGGTATG 9240
CGCTGGAGTG CACATAAATC TCCTGTATCA AGCGCTGCAC ACAAAGCCTG ACTGCTAATA 9300
CTCGAAAGAA CTTTGAGCGC CTGCTTTGGA CCAATACCTT CTACCTTTGT AAGACTGAGA 9360
AAGAGCGTGC GCTCTTGTAC ATTCGAAAAA CCAAAGAGGC GAAGCGCATC TTCACGGTGA 9420
TACAACCAGG TAAAGACCTT AACGTGTGAT CCAACCTCAC CGAACGCAGC ACTACTGTAT 9480
GCGGACACTG CAATTTCCCA TTCAATACCA TGCACCTCAA CACAGAGGCG CTCGCGCTCA 9540
TGGAGCGTCA AGATACCGCT GATGCTTTCG AACATTATCT CCTCTTTATG TGTGGCGCAC 9600
TGAAGCGAAC ACTACGCTAC CATGACAATT GCACAAAGAC CGGATCGTGA TCAGAAACGC 9660
GTTCCCGAGC AGGCTGCTCT GCGTTGATAT GCAGTATATC AGCAGTCTCC GTGCGCGCTC 9720
CTACAGACAG GATATTATCC AGTGTTTGCG AGTAACCGCG GTACACATAG GTATATCGCT 9780
CCGTCTCCGG CAAGAGATCC AACGCACTGT GCATCCCCAC TGCGGTGAAT TTTTGAATAA 9840
CATCAGAAAA CCAAAAATCA TTGAAATCTC CCGCCACCAC CACCGGAAGA TCTGCACGCT 9900
CGCGACGTAt GCAGCAACAA AAGCGGCAAC CTGCGCCGCC TGCTGTATAC GCTTGCGTTT 9960
GGAGTGTTCC TGTGCAGGTT GCGTGCTACC CCAAACGGGG TCATCCCCTC GCTTTGAAGA 10020
AAAGTGATTC GTTACCACAA CAAAATCTTT CCCCTTATTC ACCCCTGATA CAAACTGAAA 10080
ATGTGCCACC AAAGATTTAC GTGTGTTTTG AAAACTTTCT TGCCCTACTC CGATGCGCGC 10140
AGGATTTTTT ACCATCTGTC TTCCCCCGCG CACCATTTGG GCAACCGAAT GAAATGTTCC 10200
TGCACTTCCT GTCTGGTCCT GCACCAGCTG CACACGATCG GTAGGTACAA ATAACAACAG 10260
CGAATATTTC CTCCCGGTTG TCCGCCATCG GCATCCAACG ATTGCACGCC CGCaGGAGCa 10320
TTaTCCTGCG GATCGATATT CACCGCTTTA TACCGAACGG CGCTGAACTC TGCCATtGCa 10380
CGTACCAGTA AATCCAACGT GTGCTGTGCG CTCGTACAGT GATGATGTTT TTTTGCGCCA 10440
TCGTCATCCT GTATCTCAAC AAGACAAATA ACGTCCGGCG CCTTAAGATC ATTCACAAAG 10500
TnTTCGCAAA GACGCGCGCA CGCGCTGAGT CTGCTTTATT CCCTGCAGAA AAATTCTCCA 10560
CATTATAACT CGCTATATTC AAAAATCGTG CGTTGAACTG TATGGTCGAA ACTTCAGGAC 10620
TAAAACCTGA GCGTCTCAAA GGGGGAAGCG GCTCAGCAAG TTCTAATTGG TAACTAGAAG 10680
ACGAATACCC CATGATCCCC ACCACCGTCC CTTCAAAGGA ATCACCAGGC AGAGGAGGGG 10740
CGCTTTTGAA TACTTCAGGA AGGCTGTCAA ACATACGCCG GGGACAAAAG GCAAGGACAG 10800
GACGTATATG GGTTTGCTCA TACACGTATC CTCCGTGCAT ATTCAAACGT GTAGAAGGGG 10860
TATCCCCCGG TAGGAGGTAA TACGTAGAGC GATACGCAAC AGCAGGAACG GTGGGATTCA 10920
CCATCTGAAC CCGCATCCCT TCCACACTTT CATAAAAATC AATAGTCTCT GCATCCGGTG 10980
CGAGGTCTGC AAGGTTGCTG ACAAACACCG GCTGAGACAC CCGCGCATAC GAAATCAACA 11040
CCGGTTCAGG CAATTCCCTG CCATGTGCTA GCACTCGCAC ATCCTGCGCG CGCTTGATAA 11100
CAAGCTGGGT GACGCTCAGA TCCCGAGCAT TGCCTTTTGA GATATACTCG CTGACAGTAC 11160
CGAGCACCGC CACGTAGTCA CCCACGCGCA AACTATcAGG GAAAGCCTTA CCACAATACA 11220
CAAAAATGCC GTCAGACGTT TTAGGATTGC CATCCCCATG CGGATCTTGA AAATAAAAAC 11280
CAATAGGTCG TTTACCCGAA CGCGCAATAG CAGTTACCAC GCCACGCACA TCACGCACGT 11340
GTTTACCCTC ATAGGCAGAA CGGTGTCCTT CCCCTTGGAT CGCACCGATT GAGTGGGGAA 11400
CAGACGCCGC ACTGCACCGT GATCCTGTTC CCACTATCCA AAAGATGACC CCCACGCACG 11460
CTCCCGCTAC TTTACTGCTC ATAGGACACT CCTCACGCGC AGTGTATCAG CGCAGGTAAT 11520
TTTGCACAGT ACGGTAGTTT TCTTTGGTGA CAACTTTATA GGGGATCCAC ACACACTGCT 11580
TTTCTGCCGT ACCGAATACT GCCGCGCGTC CTGCCACTCC GTTTCCTGCG TGCGTAAACG 11640
CAGTGCTGTC TTCAGCGGGG ATCGATGGGA CAGAATCACC CATAAGTGCA AACAAAAGAT 11700
TTAAAATAGC CTTCCCCTGA CTGGAGACAT CGTTAAGGAC GGTGCCGAGC ATCAGATCCT 11760
CTTCAATAGC TTTCAAAGCA GACGCAtAGC ATCGATACCC ACAACCGGCA CACGCTTATT 11820
TTCTTTAAAA AAACCTGcAC TCTGCAACGC TTCAATGGCG CCGAGCGCTG CGTCATCGTT 11880
ATTCGCAAAT ACTGCCCTCA ATGCGATCTC CGTGTGTGTG AATAAGCGTG TGCATCGcAG 11940
CyTGTCCTTT CACCcGACTG TCAaGCGCAA AaGCCTCCCc GATTATCTCG CCcTTTAATC 12000
CGATTTCTCT CAGCGCCTGA CACACATACC GCGCACAGCG AGCACCGCTT TTATGATCAG 12060
GATCCCCTTT GAGCAcTACG CATTGGATAA TACCGTCGGC GTTCTTATCT GCACTTGGTG 12120
TACGTTCCAG ATATTGCGCA ACCAGTCTGC TTTGCAGCAA ACCAAGCTCG TCGTCCTTGA 12180
CGCCTACGTA ATAGGCGCGT GCATACCGGT TCAAATCAGA AAGGTCAGGC ATACGATTGA 12240
AGAATACTAG CGGAATGCGC GCCTGCTGTG CCTTTTCAAT AACCGTGCGC GCAgcaCGAt 12300
GGTCTACAAG ATTTACCGCA AGACCGTGCA CGCCGCGCGC AATAAATTGA TCGATGTGCT 12360
TGTTCTGAAT ACTCTGCGAT GCCTGACTAT CCACGATGAG GATTCGAGCA TGTTTTTTGC 12420
CAACCGTAGA GAGTATGTGA CGCAAGCGCG CCACGAGCGT GTTGTCATAC TGATACACGA 12480
CTACTCCGAT AGTCGGCTTT TCGCTGCGCT TGCACGCGCC CGCACCAAGT GCACACAAAA 12540
GGAGCGCTAC ACACATCCCT GTACCTTTCA TATTTCCTCC TCATGTTCAC CAGCGCATTC 12600
TGATTTGACA CTTCTTTCCC CTCACACCCT GATACCCGCG CGAGGAATAT AGAAATTAGA 12660
AAAAGGATGG ATTATCCAGT GCTGCCACCA ATCGCATGAA CGTGTCTATG TACCCGGCCT 12720
TGCGCCGTTT AGCGTACACg TCTGCGACAA TCGCCTCACT TGCCTCAATC ATGTATATGT 12780
TTTGGATTTC TATTACGTCG TACGTGCTAA AGCCATGCTC ATCTGCAAAA TTGCCAGGAA 12840
CCAGTGACTC CACACGGTAG GTGTTGCGCG TGCCGGTGCc gCCAAGCGCA TCCCAAAAAG 12900
GGGCAAGAAG GCCCGGTACG ATAAGTCGTA CTTATACAGG ATCCGGGCAG GATGTTCCGG 12960
CCGTACTCCT AACTGGCAGT ACCATTCATC CTGCTCATAG GCGGCCAGAT CATGCAGTTT 13020
GCTCTTGTCC CGCAGTCGGT ACCCTGCAAG ACGGACGATA GTTTCGGGCA CGTATTTTTC 13080
CAACTCTGCC TGTAAGTCAG CTACACAAGA AACGGATGCA TCATTGACCG TGGTAATAAT 13140
TGCCCCTTCT GGTATTCCTG CGTACGAGAG AGGACTGCCA GGAATAACGT ACGAAGAAAG 13200
CACACCGCCG ACGCCAGCAT TTTTCCACAC ACGGTGTGTT TCACCAAACG CACCAAGCCA 13260
CGGATGCGTC ACCAATCCCC CGCGGTACAA GTTGGCAGCA CCTGCTTGAG CAATTCTACA 13320
GGAAATGGCA 13330
(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10214 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
ACACGGTGGC GCGCAGATTA CGAAGAGGTT AAGCAGCTCG GTGGTTTGTA CGTCATTGGC 60
ACAGAGCGGC ATGAAAGCAG GCGCATTGAT AACCAACTTC GGGGGCGTTC GGGGCGTCAA 120
GGGGATCCAG GCCGCTCAAA ATTTTTTCTC TCTCTGGATG ATGATCTTAT GCGCATTTTT 180
GGGGGGGAGC GGCTGAAGCG TTTTATGAGC CGTGTGGGTA TGGAACCAGG AGAACCTATC 240
ACGCATTCCT GGTTGAATAA GAGTATTGAG CGCGCGCAGA CGAAGGTCGA AGCACGCAAC 300
TTTGATGTCC GTAAGCACTT GCTGAATACG ATGATGTGCT CAACGAACAG CGCTCCTTCA 360
TATACgCGCA GaGcACAAAT TTTGATAGAC GAGCATGTGG TAGAGCGCGT GTATACCACA 420
ATCGAGGAGT ATCTTAACCG AGAAATAACC GCACTTCGGC AAGAATTGAA GCGGCGTGGG 480
cGGCTTTCCC TCGGGGCGTT TCAACAAAAC CTGAGCACCC TGTTCGATTA CGCACTGGGA 540
GGTGAGGACG CATCTGGCTG GAACGAAACG CGTCTTGGAA CGCTGAAGCA AGAAATCCTG 600
GCGCATTTAA AAAAGAATAT TGAATCAAAG TATCTGCTTG CAGGGGCGCA GAACATGGAT 660
ACGTTCATCC GCTACCAGTA TGTGCAGGCG ATCGATAAAA AATGGCTGGA CCATTTGGAA 720
CTTCTTGAAA TCCTCCGGGA ATCGGTGTAC TTGCGTTCAT ATGGGCAAAA GAACCCGCTT 780
ACCGAATACA AGCTTGAAGG GTTCGACCTA TTTTACACCA TGTTAGACGA CATTCGCCTT 840
TCGATCGCCT CGCAGGTTGT GCGCGTAACG GTTCACATGG AAGAGCAGCG CGTCCCGAGG 900
CCACCACACt TGCACAGGCG GCACACGAAT TTCAAGCACT GGGGCAGCCT GGCAGAGGGC 960
ACGGATCGCT ATCTGCTCTC CCGATTCAAG CCGGCGCAAA AGTGGGGCGC AACACCCCcT 1020
GCCCcTGTGG AAGTGGCAAA AAGTACAAAC ACTGTTGTGG CCGCTGAAGA GCAATCTCAT 1080
TATTTTGCTT GATGGGCAGG ACCATCCAGA TGTCTATCCT GTTCCAGGTA AAGACCGCCG 1140
CTCAGAACAG AATGATAAAT TCTTCAAGAA AGACATGGGT AACTCTCCCG AGACTCAGCT 1200
GTGTGTTGaG CGCATCGCcA AGGCAATTTT TAGCTGCTCG GGGCCAAGCA TATGCAATTC 1260
TTGCGCGGTG TGCTGTGCAC aaCGCAAGAG ACGCACGTCT CAGTTGTTCT TTTTTCTGAA 1320
CGAGCTCCTC AT CAGGGCG TGATCTGCGG CAGGATATTG CAAAAGTGGG GTAATGATGA 1380
CACTCACAGG CTGCTTATCC GCACTCAGCG CGCGCAAGTn CCAAGTTCTG CATAGACTGC 1440
ATGGGCAGAT TCCCGCTGGG CAATCCCCCG GGTTGTACGC CTAGTCTGCG TTTCCCTTGG 1500
AGCACTGTGT GTCCTCACAC GGAACGCCCC GCAGTGCCGA GAAGTAACAC ACAGACGATG 1560
AGCGCTGCGA CAGTTCTCAA TGCACGGATA ACACGTTGTG CAGTCTCCTC AGTCATGGGG 1620
CATTGTAGCA CGCACAACAC TCACTGCACA GCGATAAAGA CTTgCTTGAC AGCACCCTTG 1680
TACCCTCGTA CACTGGGGGC GGGCATGGGT GTTCTTTCGT GAAGACAAGT CTGTTGCTTT 1740
CCGTTTGCGC mgsGCTGCGC TGTCCGGTTG TGCCACGGGT CAGAGTGATG CGGTCACAGA 1800
CCCGCTCTCG GTTCTGGAGG TTTCTCAGAC AGAGACGAGA GAGGCGCTGA TGCTATTTGT 1860
CTCTTACAAC GAGACGGGTG CATCTGTCAC CATCTTTACC CCTGAATTGG TTGCGCGTCT 1920
TTCCAAATCG TATCGCTTTC TTCGCGTCGA GGCTCCTCAC AGCGCATACA CCCTTTCCCC 1980
TGAGGCGCGC GAACGTAATC GCTTGTTGTT TTCGGAGTAT GAGGTTGATG GCCTTCCGTT 2040
CCTTGTTCTC CaAAGCGCAC AAGGGGACGC TTACTTTGCG CAGCGCATAC ATTCGACGCT 2100
GTCGAGCGAG CAGGAGCTGT GGGCGCTAAT ACGGTCTGCG GACGCTTCGA GAAAAAAAGT 2160
GCTGGCGGCG CGTGACCGTA TCGCTCAGAC CGAAGCTGCT GAAAAAGCAA TTGCCATCGA 2220
TGCATTTCTT AAGACGGTGC GTTACCCACG CTCTGCGCGG TACGACGCCC TCCGAAAAGA 2280
AGCACTCCAG GCTGATCACG AAAATGTCTC AGGTCTCCAC GGGGATTACA TGTTTCACCT 2340
GGCACGGCGG CGCGCAGAGA AATTTATCAA GCAAGAAAAC CTTGTAGCAG CGGGGAATGC 2400
TTACAAGGAT TTAGCGCAGT CACCGTTTCT GAGTGCATCT CAAAAACAGG AAGCGTGGTA 2460
CCTGACCGCA TACACCTATG CTCTTTCAGA AAAGGTATCT ACAGAGGACG TATCGCGTGC 2520
TTGCGAAAAG CTGTTGCAGC CCATCCGCAT GCTGCGCGGG TTGCACAGAT CAAGCAAACC 2580
ATAAAGAAAC TACTTACCGA GAGAGGcATA TGAACGAGGT CGATGCAGTA AAAAAGGGAT 2640
AGACAGTGcA AATCCTGCAG CAAGTTGCTC AGGGGATCGC GTCTCATTTC GGGCATGACT 2700
GTGAGGTGGC TGTGTACGGC GTCAGTAGCG ATGGTAAAAA CTGCGCGGTT GATTTTATCA 2760
CAAATGGACG CGTTACCAGT AGCAGGGTTG GAGACAGACC CCGCCTGTCG CTCTTCAAGA 2820
ATTACGGAAT AGAAACAGGc AAGGGCGGCT CAACTACCTC ATtCGCACGG AGAACTGCCG 2880
CTCCCTTAAG TCGAGCATGT TGTATATTCG TGACGAACAT AGCACGGCTC AGGCGATTCT 2940
AGCGATAAAC TTTGATATTA CTGCTTTGTA GGTTACGCAt TTGCGCTTGG CCGGCTCACC 3000
GGCACTGCTG CGGAGACCGC CTCGCATATC CACCTTAAGA GCGTCAGTGC GTTCCTCGAC 3060
GACCTGATAG AAGAGTCTGT AGAAAGAGTA GGAAAACCTG CAGCGCTCAT GAGTAAAAAG 3120
GAAAAAACGG ATGCCATCCA CTTTCTCAGC CAGATAGGGG CGTTTCTCAT TACACGCGCG 3180
GAAGACAGGG TCTCCCACTA CTTCGGCATT TCAAAGTACA CCCCTACAGT TATATCGAAA 3240
CTGGCAAATC GTGATCGCAC CGGACTGAGT CCCCAGCAGA GGGATCGCCG GGCCCTACTC 3300
CTTCCCTGGT TCAAGCTCCT CGGcGAAGAC AACTCCTCCG GAGCGGACCG CTCGCACCAC 3360
GCTCCCACCC GTCCTGAAAT ACTCGGACAC CACCGGTGAG GTCGCCGCGC GGAAAGATTC 3420
AACTATTCTT TGTCCCACTT CGTCCCCGTT CCGCGGGATC CGCCGCACCC AATATAACGC 3480
AGCGTTACGC GGGTCTTTGG TAAGCGCGTG CACCTCCCCA TCCACTAGCA GCCTTTTAGG 3540
CACCCCCTGC ACAAAATCTA CTGTCACGTG CCTACCTGTC ACCGGGTGCA ACCACTCAGT 3600
ACGCGCATGA CCGGTAGGTA GTTCTGTGTG CGAAATCGTC ACACGTCCAC GGCCATACCA 3660
TTTTTGCACC TTCTGTCCCT TCGCTCCATA CTCTTCTTGA TAGGCAAAAC TTCGATCCTT 3720
GTGCACGTCA ACTGACAACG CACGTACCGC CCCTTGAGCA TTGTAGTATT CCCGCGTTTC 3780
AAAAAATCCA TCATCATCCC GATCGCTATC CCGTtCGCGT TGCGCGTCCA TCCACATAGT 3840
GCGTACGCGC ACGAAAACGC GATCCAACAT GAGTTTCAGA AAATAAAGGC AGCCCTTCAT 3900
CAAGGTACGT CCtTACGCGG GCACGCTCAA ACAGGGAATC AGGCTTTTCG TAATAAAGGG 3960
AAGAAACTGT GATCTGCTGC TCAGTAGGGA GCGGCTCATT TGTCAATACC ATTGTGAAAA 4020
AATCGTGCGA CCGCACACCT TCTAAATCTC GCGCAAGATC TAAGGACTGC ATGCGTACCG 4080
GCTGCCACCG AAGTGCACGA GGACGCAATA CGTACGTTTT ATCCTCCCAC CCCACCTGGT 4140
GTACTTCAGG GTAGCGATCG TAACACACTC TGTAGCCTTG CTGCGTCAAA GGAACACGCG 4200
CCTGAGTTTC TCCCTCCACC CCATTATCTG CAGGAAGCGA CACGCGCGGG GCAATCCAGC 4260
GCTCAGCAGC ACGCTCATGG TCGTGCGCGG CAATATTCTG GGGTATGGGG AACACAGAAG 4320
CGCGCGACAA TTCCGTCTTT TGCGTACTGT GCATCGTGTG CACGCACGTT GGCGCACCGT 4380
CATTCGCATA GACTTCATAC TCGAGGATGC CATCCTGGTT TGTATCGAAC TGTGCCCGGC 4440
TCGGCCTTCC TGCTTTAAAA AACACACGCG CAGAAACAAT GCCGTCCCGA TTTTCATCGG 4500
TATACAACAC ACCTTCGAAC TCCGCAAGAA ACCGCGCAAA ACGTGCGCGA ATCGGCCGAC 4560
TTGCAAGCAA ACGACTGAAC TCATGCAGCA GATCCGCATA CAAAACCACC ACGCGCTGCA 4620
AGGGAGTGGA TGAAGACACA GACGACGCAT GCACACCGAA GCTTGAAACA TCATCGGTCG 4680
GGTCAACCGC AGGAACTGCC AGAGGGGAAT TCAGCGTACA GAACATCTCC ATCGCGCGTT 4740
GTTCATCCAG CACTCCATAC TGCAATCCCA ACACAATTGA CTGGGCACGC GCGTACAGCT 4800
CCTGCTCAGG AGCaGAATCC CCTGCGCGCG TGCTTTGCTC ATATGAGGAT TGAGCAGAAC 4860
TCTCTCCAGG AGCATCTAAC GGACGCAAGG TAAAATACGT TTGCAGATAT CGAAATGCCA 4920
TGTTCGTTCG AGGTTCAAAT AGAGCCGCCT CTACAAGCAG CGATGGGTCC TGCTCCTGCC 4980
ATACAGAAAG ACGTGACAGG ATGGAATCTG CTATCTTTTT TGACCGCGAC GAGGGACGTC 5040
GTGAACGCTC TTGCGCAAAA AACAACTTTG CAAAGCGCGC GTCAAGcgCC CAACGTTCCA 5100
ACGCTTTTTC GATCAGCTCC TGCGCGTGCT CAACTTGTCC GAGCCCGTAA CGTGCCCGGG 5160
CAgCAACCAA TCTGCATCTG CAGACACCTG CTCAGCCGTT GCAAGAAGTT CCAGTGCACG 5220
GGCGTGCTGC AACGTATCCA CACACAGACG AGCATAAAAA AGCCGCACCT CTTCTATATC 5280
GTACACACAC CATTGCATAT CCTTTGCCAC GGCACGGGCC ATCCACTGTA ACGCACGTGC 5340
GCGCGGCTGC TGCAGCGCGT AAGAAGCCCG TGCAGCAATA AATAAAAAGT CTGCTATTTG 5400
CGGAgcaGAA GCGACTCCCT GCTCTGCCTG GGACAAAGCT TCCTGCCATC GCCCTTCCTG 5460
CAGATACCGA GCCGCAACAC CTGGATGATT ACGTTCTAAA TCCTGTGGAG GTGCAGGTTC 5520
AGACACGCAC GAGGCATCTT GGATGCCAGA AAATGCACAA AAGATACTCA CTGCACAGAG 5580
TCTTCCCATA CGTGGGCACA TTCCCTTACT CATGAGGAGT CCTCTCCGCG TAACGATTTT 5640
GGTAACCAAG TGctGCGCCA TAGAGCGCAC GCAACACACC ATCCTGCTCT ATCATCGGTA 5700
ACACCGTGCG GTCCGATAAA GGTACGTGCC ACTCTGAGAA CATCTTTCGG ATCCCCTTAT 5760
GACCACCGCG GATGGAGATG GTGTCTCCCG TGCGATGGGT TCTGATATAA AAGGGAAAAG 5820
AAAACGGACC TACACCCACG TGGTCCTGTG CGCAACAGAC AAACACGCCG GCAGGACGTA 5880
CTTCCACAAG AArgTTCCGC ACGCACAGGG GTAGGCACCA GGACGCGCCA CGTAGATTGC 5940
ACTCACTCCT TGCTTTTCAG AGGAAGGAGG TGATCCTGCA TCCTGTTTCT TTGTCTCACG 6000
TGCTGTGTCC GACGCATGTA TGCAGGAAAA AAGCACATAT GCACCGGCAC GCTCTAACTG 6060
CAGCCCTGAA ACGTGTATCC GACGCACACC ATCAAAACGC GCACACCGTT CAAGCGCCCC 6120
GCGTGGCACC CGGTGCGAAA CTCCCAAACG AACGCAAGCC TCCTGCAAAA GAAAGAAGCG 6180
CAATATAAAT TCAGCGGCGA GAAAGTCCGA CCGAGGCATC CGCAGACGCG TGCCCAACGC 6240
ACGTGGTACT GGTTCCCACG CATGCGAACA ACCTTCACGC CACCGCGTCA AGGCAGCAAC 6300
ACAAAAGCTG TGTTCTGCAC TAATCCCCGC AAACGTTTTG TCTAAGCCAG AGCGCCACCC 6360
TGCAAGCACT GCATCAAGTG CAGGGATAAG TTCATGACGG ATACGGTTAC GCACATATTT 6420
CCTGCACGTA TTTGATGCGT CTTCGCGCCA ACGCACACCA CGCGTCTGCA AGAAATCTTC 6480
AACACACGTG CGGCTCACCT TTAGCAGCGG ACGCACGTAC CGTCCACGCG CAcTCGTATA 6540
CCTTGCAACG CGGAcGcGgC CGCTCCCTGG AATAAGCGCA TGAGCAGTGT TTCGTACTGA 6600
TCATCACGGG TGTGCGCGGT TAGAACCACC TGTGCTCCGC AGCGAGCAGC CACGTGGTCA 6660
AAGACCTTAT AGCGCAGTGC ACGCGCCGcg TCCTGCACAC CGCGGCCACG AATTTTAGCA 6720
CACGCGTGCA CCGCACCGGc AGAAATCTGC TGCACGAAAC ACGGAAGGGG AGGAGAAAAA 6780
CGAGCACACA GCGCACGCAC AAAACGCGCA TCGAGCGCAC CTTCCTGAGC GCGCAGACTG 6840
TGATCAACCG TGACCGCGCA CGCACACACC CCAAAGTCAG GAGCGAGCTC GTGCGCCGCA 6900
TAAAGAAGCG CAAsmGAnTC GGCACCTCCT GAAACCGCCA CGAGCAAGCA AGAAGGCTTT 6960
CTCGGCACAA GGAAATGCCC AAAGctACGC GCCACGTGGA CGAGCAGCGG GTGAAGCTTC 7020
TGCCTAGACT CACTCACCTA TAAAGACGGG CACGCTGCAC AGTGTGCCGC ACCGCGcgCG 7080
TTACACCGCG CACCATCTAG CCGGTCCTCG CGCCAGCGGG TGAACCCGCT TCGGAAGCAG 7140
AAGAACTGAG TGCCACAATC ACCGCATCAG GATCGCTGAG AACCACCACC GACGCGGGCA 7200
GAGGAACATC ACGCACACGG CGCACGTCGC CGGCCCCGAG CCCACTGATA TCAAGCACAA 7260
CACGGTCGGG CAAGTTGCGC GGCAAAGACT CTACCTCGAT ATATGAGAGC CCCTTTTCCA 7320
AGCGAGCCCC ATAGCGCACT CCTTCAGGAG AACCACACAA CTGCAGCCGG ATTCGCATTC 7380
GCAACGGAAC ACTCTCTTCA AcTGCGTAGA AATCCACATG CTCCACACGG TCACTGACCA 7440
TGTTATGCTG ATAGTCCTTA ACAAAAACGC AAAAGACCTC GCCACCATCC AGTTCCAAAG 7500
ACAGAACAGT ACTCCTGGTT AAGGCACGAA ACAATCTATc GAAgnTTTGt GCGCAAGTTC 7560
aAGGGGAACG GACACGCCCC GATGGTCATA CATAACCGCA GrCAAACGCC CTTCCTTTCT 7620
GCCAgCACAG CGGCATACTT CCCCAACTGG ACGCGCCTTT TCCCCTTCAA ACGCCTTTCA 7680
TCCACAATCC AATCCTCCAT GCACAGAAAG CGAACACGCC GCAAACTGGG ACGGTAGGAT 7740
TCGAACtACG GAATGACGGT ACCAAAAACC GTTGGCTTAC CACTTGCCGA CGTCCCAAAG 7800
ATATCCTACC CACATAACCG GACACGCACA CACCAACACC ACCGCTTTGC AAGCAGCTTA 7860
TGCGCGCGCC GAAGCTCCTC CTCATCCCGA TACAGACCAA AAACCGCGCT CCCGCTTCCA 7920
CTCATCGCTG TAAAGCACGC ACCCGCACGG GCCAGATCCC AACGCGCAAG GGCGACTACA 7980
GGGTACCGAc GCTGTACAGG GGCATCTAAG CTATTAAAAA ACCGCCACCG CGCACAATCC 8040
TGTGCATAGT GCGCAGAAAG CGCGGTAGCC CCACGCAGAG AGTACTGCTC GCCGTCGGCA 8100
GCATGTACGC CGCACGCACG CAACCTGTcC AAATCCTcAT AGGcCTGTGC AGAACCGCTG 8160
TGCAATCCCG GcCAGACCAA AAGCCCCAGA TAGCCAGTCT TTGGAACAAG GGGAACGAGC 8220
TGCTcACCAC CACCTAGcAC gCACGCAGcC TGGGAAGCCA GGAAAAAAGG GACATCAcTG 8280
CCGACACTAT ACGCCAcTTC TCGTAGAAmC CGAGCAGAAa GGGTCGTCCC AAACAAgTAT 8340
CAaGGCCACA CAAAaGCGCG GCAGCATtCA GCAGACCCCC CACCAAGTCC AGACCtGCaG 8400
GGATACGCTT CACTACGCGC ACGCGCACAC CATCGTGAAC GCCAGTTACC TGACAAAACC 8460
GCGCATACGC ACGGGTCAGC GTGTTTTCTC GAGGCAGAGC CATATAAGGC GAACACACCT 8520
CACACCGGCC AGGGATATCC AGGCGCGAAA GAGACAAAGA ATCCGCAAGC GTAATGCGCT 8580
GCATTACACT CTCAATCGAG TGAAGACCAT CGGCCCGAgT GCACCAACCC ACAGATGCAT 8640
GTTCACCTTT GCGTGAGgCG CAAACTCAGC GACTGCACCC GCCATTCTAT GACAAGCGGA 8700
CACAGCGTGT CAATTCCCCC TTCTCTCTAC cTGCACCCAA AACACAAGAG AAAAAATACC 8760
TGTGCCTATT AGGCACAGTT GACAGCGTGT GCGCTCCCCT cTACGATCCA CCCCTAGCTT 8820
TCACCATACC ACAAGCAGAG GTCAGCCATA TGAACGAGAG AAACAAGTTA CTCGCACGCG 8880
CCCTGTATTC CTGCGTTCCA CACGTCCAAG GCTCGGACGA CTACGAGGAC GACTTTGAAG 8940
ACAGCGACTT CCAGGACGGG GATTTCGATG ATTTTGAAGA CGAGGATGGC TTTGACGATG 9000
ACGATGACTT TGAAGACGAC GATTTTGAAT ATGAAGATGA GGACAATGAC CTAGACTTTG 9060
ACGAATAGGA CGCACGCGCG GGTGTGGTTG TCGAGGCGAC ATGATCGCAT TCCTGTTGCC 9120
TGTGATGCGA GACTGCTAAG AAATCTTAAT AAAAAAGTTT TTGATAAAGC GTGCGCGTTC 9180
GTCTGCCTTT TTCCAGTATG GGCTGTGGGG GAAGCGTTCC AGTATTGTCT TGTATGCTTC 9240
GAGCGCGAGG CGTACGTTTC TCTGTGCGCC GTTGATCTCA TAGGCTTGTC CACGCAGGAA 9300
CCACGCTTCG TCCATTCGTT CGTGAGAAGG GAACTGCGCA AAGAAATCGC CGAGCGAAgn 9360
GgGGCATCTC GCGCGTTTCC CTGTGCACAA AACTGGCGCG CTTCTGCTAG GTGATCGCGT 9420
TTTTCTTGAC CCTCTTTGTG AGCAGATGCC GGGACATGCG CCTCAATAGG CGCAgCTGAG 9480
GGAGCAGAAG GTTGAGACGC TGGTGAAATC TTCCGCGGAG AGTACCGCTC CGACACACCC 9540
TGTGCTACTG CATCCGACGG CACAGGGTCA CGTCCTCCTA CGGTGGGTTC CCGATCTTTT 9600
TTTTCAtCAG GACGGGGCGT ACCGTGCTGA GCTTTCTCTG CAACAGCAGT ATCCGTCTGA 9660
TCCTGCCGGA CAGGCGCTCC AGTATGGGCA GCGGCGCGCT GAGAACCAGA AGTTCCACTC 9720
TCTTCTGCTC TGCGCTCGGT ACCCGTTCCC GCAGAGATAA CTCAGAAACG ACAGTATCAG 9780
GAGGAGACGA CACCGTACGC CGGTACTCAG GCGCACGCAC CACACGCGCG AGCCCTTCCC 9840
GCTTCGGTAC CACCTTGACC GCAAGTGCGT CGGAGACAAA ATCACCCCGA AACACATCAA 9900
AATAGGAGAA CGCTAAGACA AAATCACCCT CTCGCTCAGC ACTAAAGGTA AAAAGCGAAT 9960
GCGAnCTCCT CCAACTTGCG CTGGTGATAG CGCAAACCAG GCTGCGCAGT ATGCTCGCCC 10020
ACGTACACCC AACCTTCGcC CGGaTACAAA ACCTcAAGTT TTTGcCCCAC TGcAAGcTGT 10080
aCCGcGCGCG AAACGgGGCT ACCTtCATCC TtCAGGgCGG TTCTTcAGGc ACCATCGCGc 10140
GTGGAGAATC CTcTGcCGGC TCAGGCTCAG CCTGcAcCTC CGcCTCACGA GGAGGsTCTG 10200
ATGCAGGGGG CGGA 10214
(2) INFORMATION FOR SEQ ID NO: 50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 660 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CTAATGAAGG CGATGTTTTC TTCTGAAAAG GACCCGTGGT ACCTACTCGG CGCGGGGGTT 60
GCGTGCGGTT TGGGAATTGC CGCTTCGGcG CTTTCTCAAG GGCGGGCTGC CGCAGCCGGC 120
GCCGATGCGC TTGCAGAAAC AGGTAAAGGA TTTAGCCAGT ATTTGACTAT CGTTGGTTTG 180
TGTGAGACGG TGGCGCTTCT GGTGATGGTT TTTGGTATTA TCAACTGCTA GATGTGGTGA 240
ACGTTGTGGT ATAGCGCTTC GACCATGCTT TTGATAGACG TAGGGAACTC GCACGTATTT 300
TCGGAATCCA AGGCGAGAAT GGTGGCCGTG TGTGCGTGCG TGAGTTGTTT CGCCTTGCGC 360
CTGACGCGCG TAAAACCCAA GATGAGTACT CGCTTCTCAT CCATGCGCTT TGCGAACGTG 420
CGGGGGTCGG CCGTGCTTCT CTCCGTGATG CGTTTATTtC CTCCGTCGTG CCTGTGTTGA 480
CAAAGACCAT TGCAGATGCG GTCGCTCAGA TTAGCGGcGT CCAGCCGtTG TCTTTGGCCC 540
GTGGGCGTAm GArCACTTGC CGGTGCGCAT ACCAGAGCCA gTGCGCGCGG AAATTGGCAC 600
TGACTTGGTA gCCAAmGCGg TGGCGGCCTA TGTGCAnTTy CGTTCTGCTT GCGTGGGTAT 660
(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8648 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
ATTTCCACAT TACTCAATAA AAAGACCCAG GATTTTAAAA AAAAATACCG CTACACCGCG 60
GATGTACTTC TTATAGATGA CATTCATTTT TTTGAAAACA AAGACGGATT ACAAGAAGAG 120
CTTTTCTATA CGTTCAACGA ACTTTTCGAG AAAAAAAAAC AAATTATCTT TACCTGCGAC 180
AGGCCTGTAC AAGAATTGAA AAATCTCTCT TCTCGCTTAc GCTCGAGGTG CTCCCGAGGG 240
CTTAGCACTG ATCTGAATAT GCCATGTTTT GAAACGCGCT GTGCTATCTT GATTAAAAAA 300
ATACAAAACT ATAACAGCAC CTATCCTCAC AAAGCCATCC ACATTTCAGA CGATGTTGTC 360
CGACTTGTTT CTGAAAACAT TTCTTCAAAT ATCAGGGATC TTGAGGGGGC ATTAACAAAA 420
ATTATCGCTT TCATTGAAGT GTCGGGATCC ATCACGATAG ATATCGTTCC CTCTCTCCTA 480
AAAGAGTTCT TCCTCTCTGC AAGGCCAAAA CACATCACAG TAGAAACTAT TCTTCATGTA 540
GTTGCAGATC ACTTTAACAT TTCGTATTCA GATCTAAAGG GTAAGAAACG CAATAAAAGC 600
GTTGTTTATC CTCGGCAAAT CGCTATGTTT CTCTCAAAGG AACTGACAGA GCTCTCCACT 660
ACTGAACTTG GTATCGAATT TGGTGGCAGA GATCATTCAA CCGTCATTTA CGGATGTCAA 720
AAAATAGAAG GAGAAATTCT CACTAATCCT TCGTTACAGG CAAATCTTGA TTTGCTGAAA 780
AGTAAAGTTC AAGATTCAAT CCGCTAGGGC GTAGACACTG AATTCGATGG GGATAAGTGG 840
TGGATAAAaG AATATAAATT AGTCATTACA CTTTACTCAC GAATATCCCC CTTTTTTTAG 900
AGAAAAAATA TACTTTCTTC ACAaGCTTGT GTGCGGTTTT TGTTTGGTAA TTCTCGAGAC 960
ATAaGCACTT ATCCAGATAT TCACAGTTAC TATTATGTGA TACGACTACA TTCTTTATAC 1020
TTATAAGATT AATAAGGAGG AAACTAACTG TGAAAATCCT ATGCGAGAAA GAAGCCTTTC 1080
TGAAGGAAAT AAGCACAGCA CAAGAGGTTA TTTCAAATAA AAAAAACACG TCTATTTTTT 1140
CGAACGTCCT ATTAGCTGCT CAAGGAGCCC TGCTTACCAT CAGAGCAACC GACACAAAAG 1200
TTACCTTTGA AACTAGCATT CCCGTCAATG TTCTCGCCGA AGaCaACGAC AGTTTTTTGC 1260
GACAAACTTG TGAATGTTGT TTCTGCCCTT CCAACAAAAG AAATCGAATT AACGTTATGT 1320
GAAGAACAAC TTGTCATTAC cCCTCCAAAC AAAAAGATAA GCTTTcAGCT CAGAACCCTC 1380
TCGCATGAGa GTTTTCCATG TTTCCCTCAA AATGAAGGAG GCGTCTCTCT TGCTGTGCCT 1440
ACCTCCGATC TTAGAAACAT GATTAACCAT ACCGTTTTTG CAGTTTCAGA AGACAGTACG 1500
CGCCATTTTA TCAATGGCGT ACACGTTGAT TTTCAGTATG GAAATATTAT TTGTGTTTCA 1560
ACAGATGGAA AGCGGCTCGC CTATATAGAA AAAAAGGGAG AATCCTCTCC CCAATCCTTT 1620
TCGGGTGTTA TTGTGCCAAC TAAGATCTTA GGCATAGTAA ACCGTAAGCT TACCCCTGAA 1680
GGATCAGTGA CGCTATGCAT TACGTCGCAG CACGTTTACT TTTTTTTCGG TGGATATAAG 1740
TTTTCTTCTG TGCTTATTGA GGGGCAATTT CCTAATTACA AAAGAGTAAT CCCTGATCAT 1800
CAGGAGCGTT CTTTTTGTGT TGGACGTGTG GAGCTAATGG AGGCACTTAA ACGAGTCTCG 1860
TTGTTGGTAG AACAAAAATC TCACAGGATA TTTATTACCA TACAGCAGGG TTTGTTGACT 1920
TTAAGCTCAA AAGCTCACAC TCAAGAAAAT GAAATAGGTG ATGCTCAGGA AGAAATAGCC 1980
TGTGCTTATA CAGGAGAAAG TGAGGTCATA GCTCTTAACT ATCTATACCT TGAAGAACCG 2040
CTTAAGGTTT TTACTTCGAA GGAGGTTCAA GTGGAATTTA CCGATCCTGC AAAAGCACTC 2100
ACGCTTCGTG CTGTACCAAA CACGGACTGC TTTCACATCA TTATGCCTAT GCAAACGGAG 2160
TGATTCTTTG CCTTTTCTCA CAGTGACTGC AATAAATTTC AGAAATCTTG CACATCACAC 2220
GATTGATATA TCCTCTCCTG AGGTTTTTTT TGTGGGAAAT AACGGACAGG GAAAAACCAA 2280
TATACTTGAG GTTCTATATC TTGCTGCGTA CGGAAATTCG TTTCGAACAC GCACCGAAAG 2340
CGAACTGTAT GCAACTCACG CGCGTTCGAA TGAGTATCGG GTAAAAGTTA TGTACCGCGG 2400
GGAGTATACC CACACAGTGC AGATTTTCTC CAAAAATGGA AAAAAGCGCA TTGAGAAAAA 2460
CTTGAAAAAA ATAAGGACAA AAAAAGAACT TATCAGCAGT ATTCCCTGTA TTTTGTTTTT 2520
TCATAACGAT TTGGACTTCG TAGTTGGTAC GCCAGAACGC AGACGCTTCT TTTTGGATCA 2580
ATCCCTTTCG ATGTGTAATC CTCTGTATTT GGAATACTTG CAAAAATATC ACGCACTAAC 2640
AAAAACAAAG AACAGAGAGA TAAAAGAGAA ACGCGTTCAG TTACTCGATG CACTGGATAC 2700
GCAAATTGCA ACCGTGGGTT TTGATCTCGT GCAGTGGAGA ACTCAGCTTG TCCGTGACTT 2760
TAACGTGATT TTTACTAAGT ATTATGAGCG CCTTGGAGAC CTTGCGCAGG TGCGCATTGA 2820
GTATAAGCCT TCATGGTCTG ACTCCTCAGT TGAGGAGATC GTACATTCTC TTTACAAGAG 2880
ACGTAAGCAC GATCTTGCGA TGGGGATGAG TATGTCAGGT CCTCATAGAG ATAAGATTCA 2940
CTTTACTCGG TCGCAGGCGC TTTTCATTCC TCAGGCTTCT ACCGGACAGA GGCGGTTGGT 3000
TTCGTTGGTA CTGAGGATGT CGCAGGCTGT GTTCTACACA GGaGTAACGG GAAAACTGCC 3060
CGTACTCTTA ATGGATGATG TCTTGTTAGA GCTTGATCCT GAGAAGCGGG AAAGGTTCAT 3120
GATGAGTTTG CCTCCGTATG ATCAGCTGTT TTGTACATTT TTGCCAGGGG AAGCGTACAG 3180
GCGATACGGG CGTGAAAAAA CGCGGGTATA TTTTGTTTCT GAAGGGGCGT GTCATGAATA 3240
ATGGTGTGAA TAAGCTATCG GACTTACTCG TGTTGACCAC TGAATATATC CAAGCTTCCT 3300
ATGAAACGGA GGCGTTTGAT GCGCATCGAG AATGGGTGTG TATTGTGGGT AACCCCGTTG 3360
CGTTACACAG CACGCTGGTA GATATCAGAA ATGGGAAAGT TGTGGTCAAG GTGACTCATC 3420
CTGGTTGGGC ACAATACCTT TTGTTAAAGA AAGACGAAAT TGTACATGCC CTTCGTAGGC 3480
GATATCCGTC GTTGGGAGTG ACGGGTATGA GTACGTACGT AGATTCTACC TCACGTACCC 3540
CTTCTGCGAA GAAGGACATG CAGGGACTTT CGGTATCAGA AAAGCAGACT CGTCCTGTGC 3600
CTGAACTTGC CGAGGTATTT GAACAGCTCC GAACGCTTTT TCAGGTGAAA ACGGAAGAAC 3660
CGTCACATTA GTTTTGCGGA TGGGATTCGA CGGATCTGTT CAAAGTCCAT AGGACTGCGG 3720
TTTTTCTTGC GTGCAGCCTA TGCACGACTG TGTCTCTCCT TGAACGCAgT ATGGCTTTGC 3780
GTTAGAATGC CCGCCCTATG GAAGAAATTA GCACCCCAGA GGGTGGCGTT CTTGTGCCCA 3840
TTTCTATAGA GACAGAAGTC AAGCGTGCTT ACATAGACTA TTCTATGTCC GTCATAGTTT 3900
CTCGTGCGCT TCCGGATGTC CGCGACGGTT TAAAGCCTGT TCACAGACGT ATTCTCTACG 3960
CGATGGAGGA AAAAGGGcTA CGCTTTTCAG GACCTACACG GAAGTGTGCC AAGATAGTGG 4020
GGGACGTTTT GGGAAGCTTT CATCCTCATG GGGATGCGTC CGTCTATGAC GCGCTAGTGC 4080
GTCTTGGGCA AGATTTTTCC CTTCGTTATC CAGTCATTCA TCCTCAAGGA AATTTCGGGA 4140
CTATCGGGGG CGACCtCCGG CAGCGTATCG GTACACCGAA GCGAAGATGG CGCGTATTGC 4200
AGAATCTATG GTAGAGGACA TAAAAAAGGA AACGGTTTCC TTTGTTCCCA ATTTTGACGA 4260
TTCTGACGTA GAGCCCACGG TTCTTCCTGG AAGGTTTCCT TTTCTTCTTG CGAATGGGTC 4320
CAGTGGTATT GCAGTTGGTA TGACTACAAA CATGCCACCG CATAATTTGC GTGAGATAGC 4380
CGCAGCTATC TCTGCGTACA TCGAGAACCC AAATCTTTCG ATTCAGGAGT TATGCGATTG 4440
TATCAATGGT CCTGACTTTC CCACGGGAGG CATTATCTTT GGAAAGAACG GGATTAGGCA 4500
GTCTTACGAA ACAGGTCGAG GGAAAATTGT TGTCCGTGCT CGCTTTACCA TCGAGACGGA 4560
TTCAAAGGGT AGGGATACCA TTATTTTTAC AGAAGTTCCG TATCAAGTTA ATACTACCAT 4620
GCTTGTTATG CGTATTGGGG AACTTGCACG TGCGAAAGTG ATCGAAGGTA TTGCGAATGT 4680
AAACGACGAG ACTTCCGATC GTACAGGsTA CGCATAGTGG TAGAGCTCAA AAAGGgTACC 4740
CCCGCACAGG TAGTACTCAA TCACCTGTTT GCAAAGACTC CCCTGCAGTC CTCTTTTAAT 4800
GTGATTAATC TTGCTTTGGT AGAGGGAAGA CCTCGAATGC TCACGCTCAA GGACCTAGTG 4860
CGCTACTTTG TAGAACACCG GGTCGATGTA GTGACTCGGC GTGCGCATTT TGAATTACGT 4920
AAGGCTCAGG AGCGCATACA CTTGGTGCGT GCGCTGATAC GTGCCTTGGA TGCCATTGAT 4980
AAAATCATCA CGCTTATCCG TCATTCGCAG AACACAGAGC TTGCAAAACA GCGTTTGCGT 5040
GAACAATTTG ACTTTGACAA CGTGCAGGCG CAGGCGATCG TAGATATGCA GATGAAGCGC 5100
TTGACAGGTT TGGAAGTCGA GAGTTTGCGT ACGGAATTGA AAGATTTGAC GGAGCTGATT 5160
TCTTCTCTGG aGGAGTTACT TACTTCTCCC CAAAAGGTCT TGGGAGTTGT TAAGAAAGAG 5220
ACGCGTGATA TCGCAGATAT GTTTGGGGAT GATCGGCGTA CAGATATTGT GAGCAATGAA 5280
ATAGAATATC TGGATGTAGA AGATTTTATC CAGAAAGAGG AAATGGTTAT TCTTATTTCC 5340
CATCTTGGTT ACATTAAGCG CGTTCCAGTG TCTGCGTATA GAAATCAGAA TCGGGGAGGA 5400
AAgGGCTCAA GTTCAGCGAA TCTGGCGGCT CACGATTTTA TTAGCCAGAT ATTTACTGCA 5460
TCAACACATG ACTACGTGAT GTTTGTCACG AGCCGTGGGC GrGCCTATTG GCTAAAAGTA 5520
TACGGGATTC CTGAATCTGG TCGGGCGAAT CGTGGTTCGC ATATTAAGTC GCTTCTCATG 5580
GTAGCGACGG ACGAGGAGAT CACGGCCATC GTATCTTTGA GAGAGTTTAG TAATAAAAGT 5640
TATGTTTTTA TGGCTACTGC GCGAGGTGTA GTTAAAAAGG TAACTACTGA TAATTTTGTG 5700
AATGCGAAGA CGCGCGGTAT TATAGCGCTT AAGCTGAGCG GAGGTGACAC GCTGGTGAGC 5760
GCAtGTTGGT GCAGGACGAA GATGAAGTAA TGCTTATTAC GCGTCAGGGA AAAGCATTGC 5820
GCATGTCGGG GAGGGAGGTG CGCGAGATGG GTCGCAATTC CAGTGGGGTG ATTGGGATAA 5880
AATTGACGTC CGAGGACCTA GTGGCGGGGG TTTTGCGAGT AAGCGAACAA CGGAAAGTAC 5940
TGATAATGAC GGAGAATGGA TATGGTAAGC GGGTCAGTTT TTCAGAATTT TCTGTACATG 6000
GGCGAGGGAC TGCAGGACAG AAGATTTACA CACAAACGGA TAGAAAAGGT GCTATAATAG 6060
GTGCTCTTGC TGTTCTCGAT ACAGATGAGT GTATGTGTAT TACTGGTCAG GGAAAAACGA 6120
TTCGCGTGGA CGTGTGTGCA ATCAGCGTGC TGGGGCGTGG TGCGCAGGGC GTGCGTGTGT 6180
TGGATATCGA GCCATCGGAT TTAGTAGTAG GACTTAGTTG TGTAATGCAG GGGTAATGGG 6240
CTCTGGGGTA TATTTCTCCG TGAGTGGCTG TGTATATGTT GTGAGTATTG TGGATAATGT 6300
GCGTGCAGAA GTTGATGTTT CACGTGAAAC TgTsGGGATG AGGAGTGGGA TCAAATCTAC 6360
CCTAATTCTG GAGGATTATT TGGGTTCACG TTCATGTAAA CTTTATGGGG GTTGTGTATG 6420
GGGACTCGTG TCAGATTTTC CTTCTGCGGT ATTGCAGGTG TATGTTTACT CGCACTAGGT 6480
TTTTTAGTTA GTTGTTCTTT GCAATCTTCA CGAAGCGCTA CAAAGAAATC TGAGGCGCGG 6540
AGGACTTCTT ATCGGATCGG TCTCATGACA AGTACGGGAT CTyAGTCTGT AGATGATGTC 6600
CTTGCGAAGA CACGCCTCGT CAGTATCTAC GGAGAGGCTC GTGGGGAAAC GGGTGGAAGG 6660
ATTGTCCATG TTACTTACTC CGATAACTTC TCCCACGACC ATGAAGCAAC CGTTTCTAAG 6720
TTGCTTGCAC TCGCTGAGGA TTCGACTATA AAGGCCATTG TGGTTAGTCA GGCAGTTCCC 6780
GGCGTTTCAA AGGCGTTTGG GATCATTAAG TCTAAACGTC CTGATGTTTT GCTTTTTGCG 6840
GGAGAACCAC TTGAGCCGGT AGAGATGCTG CAGGAGTCTG CAGACATCGT GGTCAGTCAG 6900
GACTACTTGT TCGGTGGATA TGCCGTTCCG TGGGTTGCGG AAAGGATGGG GGCGCGCACA 6960
TtGGTGCATG TCTCTTTTCC CCGGCATATG TCCTACCCCG GTTTGAGGGT TAGGCGTACG 7020
GTGATGAGGG CAGCATGTAC CGATTTGGGA CTTTCCTTCG CACACGAGGA AgCGCCTGAt 7080
CCTGTAGAcG GTGTCAGTGA CGGAGAACTT GAGGATTTTT TCCACAAGAC GATTGTGAAG 7140
TGGATCAAAA AATATGGCAA GGAAACCCTG TTCTAcTGCA CCAATGACGC TCACAACAGG 7200
CCGCTCATCA GTGCCTTGTT GAAATATGGC GGTATGCTAA TTGGTGCAAC CATCTTCGAT 7260
TACGCTGATG CGCTCGGGGT GCATTATGCT GAGCTTGAAG ACGTGTATAA AATACGAGAG 7320
AAGGTTGAGA AGTCATTGGk TTCTTCGGCG CAGAGGGGCG CTTTGGATTA AATTTAAATG 7380
CACAGGCATT TACGGTGACC ATGGGTTTTG TGGAGTATGC GCGCAAAATC ATAGATGGCG 7440
aACCGCGTAA AGATGATATG CGTGAAGCTC TTGCCGAATC CTTCGACTTG TTTACGCGTG 7500
ACGCACATTG GCGTATTGCT CCTTACCTAA GACTGAAAAC GCACGAAATT GTTCCGAATC 7560
ACGTGCTGGT GTATACGGAC ACATACGTCC TGGGTAAATT TACCTTGCCC GTCACAGACC 7620
AAGTACTCCC AGAAGGGTAT TGGGCATTGA CCGCTAAGGA ATAAGAACTC CGTTCGGGTT 7680
TTCTGTTTGT AGCCGGGGAG ATGGATCGCT TTCTCTGTTT GGCAATGTCG CCGTCTCCCT 7740
GGGTCACCAA GTGATCTTGC ACCCTAGAAA GAGTGAACCG GTGTATCCAG GCCAGCTCCA 7800
GTTCTCTTCT ATCAACATGT AGGGATCCTG TGAAAGCAAC CCTTGCTCCC ACCGCACGGA 7860
AAACTCCACA GGTTTGATAG GACTTGCACG CAGCTCAACA GCGTATTGGA AACAAAGTTC 7920
TCCCTTTAAA TTGCGCGTTC CTTTGAACCC ATTGAAATTG AATCGGTTGG TTGCCATATA 7980
TATGTGCGCA CGTGGTTCTA TCCACATACT ATCGTAGCAC GGTATGCGGT AGCCTACCCA 8040
TGCATTCCCC ATTATCGGAA GGGCTATTGA AgCTGCGCCT GTTGCTATCG CGTCTGCCGG 8100
TAACCCTGCC GCGCGTGCTA CAAAGTTTGT GGCACTATTC AAGAAATTAA GAAGACCTAA 8160
AATACCTACT CTGGCAGCAT TGGCGACCTG CTGTGCTACC CCTTGGGCAG GTACGACGTC 8220
TGGGGGGAGA CCTCCGTTGT CAAGATAACT TTTGTAGCCC AGGGGAAGGT ATACACGTGC 8280
TTCTATGCCT GCGTTTAGGC CGTGCAAGGC ATGGGTGTAA TCATCTCCCG AACGAGTTTC 8340
TAGTCTGAGA AACGCAGCAA AGTCCGTGTA TTGAAAAGTT GACTTTACAA AGGGACCACT 8400
CCCAAAAACA GACGCCGCCC CTGTTGCGCC GTATACACCT CCTGAAAGCC AACGcCACTg 8460
cGCTGTAACC AGCGCGTCTA TGCCTAATGA GTCCACGTGC TGTACCATCC AGCTGAATAT 8520
ACGACGCACC GTCCGTAATG ACGGATACGT CTTCTCCGCG AGTTCTAGTG CCTGTTCGAC 8580
GACACGTGCT CTCGcACTGT TCGTATCCCG GTAGGTATTT CCGGCATCCG nAGCTAAAAT 8640
GAAGCGGA 8648
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
CACCAnCGTC CCGnATCCAG TTCCACGCAC ATTGGCAACG GCGCACAAgC GCTCTATCTG 60
ATCTTTGTGT ATATCAGAGA AAAcGCGAGC ACTGCAGAAA TATCTCCCGA ATTAATTTGC 120
AATATATTGC ATACATGCCG GAATGGTACC TGGTACGAAA TGCACGGCGG CATACCACGC 180
ACTTGTGACA ATTCGTAGAT CCTTTTATGC CGCATAAATT CGTGTTCGCT TTTTGCCGCA 240
TGTATTCCCC ACGCAACACG CTCACTTTTA TCGTAATCCT CGTATATCTT AAGAACATCA 300
AGATCAAAAC TAATGGAAAA CTCGGTATTT GGACGTGTGG AAACAAAAAG GTATCGCAAT 360
ACTTCAGGCT GATATACTTC AAGCACATCA CGCAGCCCAA CCACTTTTCC CGCGGACGAA 420
GACATCTTCC CAGGCAAACC TTTTAATCCA ATAAAATCAT AACGAAAAGA AACAGGCGCA 480
GGCCAGTGAT AAATGTGATC AGAAATTAAA CGCGCAGTGT CAAAAGAACC TCCCTGAGAA 540
TGATGATCCT TCCCTGCAGG CTCAAATACC ACATGCTCCT TACTCCACCG CATAGCCCAA 600
TCAACGCGCC AGcTAAGTTT TACCGCAGAC GTCTGGCGTA AATCCACCTG CTCCCCATGC 660
CCACACTCGC AATGATACTG AAGACACCAG TGGCTATCCC ACGCATCAAC CGTGGTGCAG 720
TCTTTATGGC ACGCTGTACA AAACACCGAT ACGGGCCAAT ACGTTCCACT GATTTTATGC 780
TGCTCATCTC GATATTCGTT TAAAATCGCT TGAATACGGT GCCGATTGTC GAGCGCAATC 840
TTTATTTCCT GTGCGTATAC CCCCGCCTGG TATTGCTTTG ACTGATAAAC GTATTCAGGA 900
TAAATACCTA CCTCCGGGAG CGCCGATTCA ATTTCCCGCT cATGGTGCCg CGCGTAcTAT 960
CTTCCTGCTG AAAGGGATCA GGAACTGAAG TGATAGGCAT GCGAATATAC TGCTTCAATT 1020
CATCTTGAGC AGGTACATTG TCGGGAATCC TACGAAAAAC GTCATAATCG TCCCACGAAT 1080
GTACAAAGCG CACTGATTTC CCCTGGTCAC GCAGGcCCGC ACTACAAGGT CAACGGAAAT 1140
AATCTCTCTG AAATTACCAA TATGTACCGT TCCTGAGGGG GTAATCCCCG ATGCACAGGT 1200
GTATTGATCA CAGTCAGCAC GTTCCTgATA ATCTgTGCGC AACCTgTCAG CCCAATGAAG 1260
TGACTTTTCA CAGATACTCA TGATCCTTCC TTACAGGTAC GCAAAATATT TTTAAAGCCA 1320
AGCCAGTACG CACTTCCCAC GCGCACATCC TACACTTCCA CATAGGTGGC AGTGCCGAAT 1380
ACACACAACA GCCATACGGT GAGTGCTCGC ATCTGCCTGG CAGACCGTAT GGTCACTCCC 1440
GCGATACAGG GAAAATAAGA ACTGCGTGCG TGTCAATACC GCGCGCACAG CATGAAACCT 1500
ATTGCTAGAG GGTGCATCTC CTTCTCAGAT TCCTATTCAT GCCGTATCAC TTGATGCGGC 1560
TCAGTATCCG GTCGAACCGC CACCACATAA CGACCGCCTC TGTGTCCGTC GCCAGCACTG 1620
AAGAAAGAAT CCCTGTGGGA GAAAAATGAA ACACATTGTA CACCAACGTA TCACTGCGCA 1680
TATGTATGCT CCTTTGACGT ATCGTGAATG AACGCAAGTC TAACAATAGC GCGGCATATC 1740
CGAAGCGGTC CGGACTCACA CACAAAATAC CTTGtGCGCA CTCACCCCGA GCAATGAAAA 1800
aGCTTTTTTA TAaGGTCCcC TTTCTTGGTT TTCTGTACCA GACACCTCGT GCGCGGGAAG 1860
ATCCACTTTC CACTCGTACA CCCCCGTTTC AAGAGAGATA AAATACACAC TGCTTTTCGC 1920
ATAGCGAACA CTACTATTTG CTCCCGTAGC TGGATCATGC ACTGCGGTAT AGTAATCTAC 1980
CTTTGCAATC AGCCTCCGTT CACTCACATC AGGCAGTACC CGCTCTACGA ATGCATACCC 2040
TTTCTCTTGC GCAGcAAATG ACGTCGGAAG CGGAGAAAAC GGCAACGCAC GTTGGTGTAT 2100
GGGTACTCGT TGTGCATTGA ACCAATACAC TCTCATTGCA TCCACACTCC TACACACCAC 2160
CACGAGCTCA TCTGCACTAT TTACATACAC ACCTTCAATA GCAGGAAAGG GTGTGCCCCC 2220
TATCCCCTGC TGTCCAATAG CATGCATGAA ACGTCCTTCC TCATCGAACA GCAATATGGT 2280
ATTCACCAGC GCAAGGTTTT CTTCCGGATC ATGCTGCACG TGTTCTGGTA ACACGGCATC 2340
TAtACGTACA GTGTATTCTG CGAATCTACG GCAAGAAACG TTGGCGCATG CAGCGGATAG 2400
GGCACGGCAC GACGCGTAgT AATTGCTGCC GATTGCAACC CTTCAGAGAA CTGAGGAGTC 2460
ATTGCGTTTT TCTCGGGATT AAAAATAACT GCAAGCACAT CCCCAAACGA AtCATTCGCA 2520
TGACcTTGCC AGTTGCAGCA TGTGCCACGT AGAAAATACC GTCTTTCATA CACAGCTGTA 2580
TATTAGACTG CGATTGCGCA TACCCCGCGT CGGGGAAATG CAGTTGATTT TCAGCATCCC 2640
CGTACGTTAG GGCGAACAGA CGTTGCCCAT GCAATTCACG CCCCATCCAC CGTGTGCAGG 2700
AAGACACCAA CACGAGAAAA CTACCCAGCA AGAAAAAAGT AAAAAATCCC AACCGCAACG 2760
GGTGACGCGT CACAACGCTA CAGGACACGA GGATAGAGAT ACTCAAAACC ACAAAACGGC 2820
ACCAAAGCTT GCGGnAATGC GCACGCGACC TTCCGCATCC TGCCCATTTT CTAGCAGCGC 2880
AATTAACACG CGAGAAATTG CAAGCGCCGT CCCATTCAAC ATGTGTACAT AATGCTTCTT 2940
CCCTTCTGCA TCCTTATAGC GGACATTTAA GCGCCGCGCC TGATAGTCTG TGCAATTCGA 3000
CGCAGAAGTC ACCTCTCCCC ACGAACCACC CTGGCGTCCA GGCATCCACG CCTCCAAATC 3060
CCACTTGCGA TACGCAGGCG CACCCAAATC TCCCGCACAC ACTTCCACCA CACGAAAAGG 3120
AATTTCCAAT GCAGTAAAAA TCTCTTCCTC AAGCGACCGC AGgCGTTCGT GCAGGCACTC 3180
AGAATCACTC GGTGTACAGT ACGCAAACAT TTCAAGTTTG GTAAATTGGT GCACGCGATA 3240
AAGACCGCGA GAAAACTGGC CTGcAGCACC AGCCTCTTTa CGAAAACAAT GCGAGAGCCC 3300
TGCGTATAAA CGCGGTAAAC TCCGCTCTTC AAGAACCTCG CCTGcATGGT ATGCCCCCAG 3360
CGTAATTTCT GCAGTTGCTA CTAAACAGCG GTGTTCTCCC TCAATACGAT AGATATTCGA 3420
TCCACTCCCC CGCGGATTAA AACCCAAACC ACACACCATA CCCTCACGAG CAATGTcAGG 3480
AGTGAGAAAT GGCACAAAAC CGCGCTCTTG TAAAAACTGC AAACCAAACA TAATCAATGC 3540
CTGTTCAAGC AGCACCCCTT CACGCTTCAG ATAATAAAAC TTTATCCCCG AGACCTTTTT 3600
CCCCGCTTCA AAATCAACTA TATCCAGCAA GCGCGCTAAT TCCACGTGAT CACGTGGcGA 3660
AAAACTAAAG CATGGAGGCA CCCCACAGCG CTTGATTTCG AGATTATCAC TGTCTGATCG 3720
ACCATGGGGA GTGCACATAT GCGTCATGTT TGGCAACGCT TGCGTTGCAG ACAAAAGCTG 3780
ATCGGAAATC TGTACCAATA GACGCTCGCT GTGAGCAATG CGATCTTTTA GTGCTCTGCC 3840
CGTTTCAACA CACGCCGAAC GCGCAAGcGC ATCCAAAGAG CTTTTCATCG TCTGTGCGTT 3900
CTCATTACGC GCACGTTGTA ATTCTTGCAA CTCTGCTAAA AGCTTTACGC GCTGATCATA 3960
TAAGTGCACA ATCGCGTCCA CATCTGCATG CACGTTCCTG ACCTTCACAT TTTCTTTTAC 4020
TGCATCCACG TTCTCTCTAA TAAACCGATA ATCAAGCACG CGCCTTTCTC CCCTTACTTA 4080
TTCTGAATGT ACAAGAAAAA CGACACTCTC ATCGAaTGCT GCGCAGAAGC GCTAACAACA 4140
TACCCATCGC CCCATCGTGT ACGAATGTGT CAGACGTGGT AGCCGAGCTG TCCGGAAGGC 4200
GCGGTTCACA TATTACCGCA TCTCCCACGC TGATGTGATC GTAGTATCCT ACGCGCCTGA 4260
GCACGCCTTC ACTCATATCC TCTGAAACCT TTGTCACGGT AAAATGCCCC AACACGTCCT 4320
CACGTCGATA TGCAAGCCCA ACACCTTCTT TCAGCACTGA AAcTGCGCGT CTTTTTACCA 4380
CTTCGAGAGA CTTACCTTGT AACTCTGCAT CTTGTGTTCC AAGATCAATC ACCGCcTCCG 4440
ACTGGTGACG GCGCACGACA GTACCCATAA TTGGCAAACG ATCGTTGAGC ATCTGCATGA 4500
TCCTACGCAA CACACTCTGA TACCGATCAT TTCCCGAGCG ATACGCATCA AAGGTGTGTG 4560
CCCGAGATCC CGTCGATGCA ACATACAATT CCAAACGCAC GCGTAAATCC TGACCGTGCT 4620
CCTGCATTGT GATGAGAGCA AAATAATCAT CACCAGCCTC ACGCGCCGTG CGAAAAGCTT 4680
CTCTATACGA GTGAGAACGC GCGCTGTACC CAGTTACTTT TTAaCTGCGG TTATAGGCAA 4740
ACGAATCTTG CACTGCGTCA GAAAGAAACG CTCAGCCTCA GGATGCAATG CATTCGCAGG 4800
ATCAGGATGG TAAAAAAGAG ATATTGAAAG ATGCGCCTTA TCCAGATACA AGGCATCAAC 4860
ACGCCACCTA TTTTTAATTG AACGTGCATG CGTCTTCTCG TATGCCTCTA CTGCATCATT 4920
GATGCGTGCA CTGCTCTTTC CAATAGACTG TAAAAACTTT AATTGCTCAA GGGAGCGCTC 4980
AGGATATCCA AAACGCAGGA GCAAACGTGC GTAAgcTTcA CGAGCAGCAC CGTCGTACGG 5040
GTACACCTTT AGTGCGCGCC GATACTCATC CAGAGCCTGC CGACTCATAT TCCGACGCGC 5100
GAAACCGTCT GCCTTTTGCG TGTGAAAACG CGCAAGTTGC ATGCGATACT CATCTTCGTA 5160
TTCAAGGTGA ACAATCGCGA TCTCTTCTAG CAAGATGCGC ATTAGATCGT CACGTGGATC 5220
TACTGTCAAC CCAACTTTTG CAGTTGCAAG CGCCTCAGTA TGTTTACCCA ACTTCAGAAG 5280
GGACAGTGTC TTTACATACC AGGCATCCAC TTGCGTTCGA TCCGCCCTTA TGCGTTGATC 5340
ACACTGAGCC ACCGCGCGCT CATACGCGCC GCGCGCATAT AAAACTGCTG AAAGAAGCGC 5400
ACGGGCACGT GGATAAGCCG ACTTAATGTG GAGCGCCCGC TCCAAATAAC GCTCTGCATC 5460
TTCATAGTGT GCCCGAAGCG TTGCAAGATA CGCGGCAAAA AAATGCACCT GTGCATTATC 5520
ACCGTGATAT TGCAACGCAC GTTCAACGTA CGTGAGCGCA CGCGGATAAT GACCAGCCTC 5580
GTACGAGATA AGCGCAAGcg ACAACAACGC CTTGCGATTC TCTGCCTGAC GCTCCAGCGC 5640
TGCTTGGTAT AACAGACGCG CAGAGCTCAG CCGTCCCTTT GACACCTCAA TCTCTGCCAA 5700
ACCAAAGCGA GCATCTACGT CATTCGGATA GCGCGCAAGA ATTTCCTCAA AAAGACTACG 5760
CGCCTGATCC AACTCACCTT GACCAACTAA ACTGAACGCG CACAGCTTTT CAAGGGAAAG 5820
ATCCTGCGCC ATGAGTTTTT GCGCTTTGCr CACATGGTGC AACGCCTGAT CATATTCACC 5880
AAGTGCGTAG AAACACTCGG CAAGACCACG ATATGCAAGG TTGTAAGAAG CATTTTTTTT 5940
TAATGCTTCT TGGTAGAATT CGATAGCAGC ATGCCAATCC TCCTGcACAT GGGCCTTTCT 6000
TCCTGCTTCG TAAAGCTGCA CGCCCGTCTG AGCAAACACT ATGCTGCAAA GCGCACCGTA 6060
ATACACGCAC AGCAGGCCTT TCATGCTTTT TCCCTTTCCT CTATGGCCGC TCAGATACAC 6120
ATGCAGGCTC GGAATCACCC CGCAGaCAGA TACAATCTTT ATAGTATTTA TCTTATGTGC 6180
TTCCATGTCT TGGATAATAA AATCGAAGGC ACCCCACGAA ACCTTCTCGT ACTTTACGGG 6240
AATTTTTCCA AAAAGATTAA ATACAAATCC ACCTAACGTT CCAAACTCTT GAGAAGGAAA 6300
AACAGTATGC AAACACTCAG ACAAATCTTC CAAATCCACA CGCGCATCGC ACAACCACAC 6360
GCCCTGTCCG AGCGGTTCGA TATCCTCCCG CTCGTGGTCA AACTCATCCT GGATATCCCC 6420
AACAATCTCT TCAATAATGT CTTCCATGCA CGCAATACCC GAAACGCCGC CGTACTCGTC 6480
CACCGCGATC GCAATGTGCA CGTGCCTGCG CTTAAACTCT CGCAGAAGAC TGTCAATTCG 6540
TTTGGACTCG GGGACAAAGA AgGsTTACGC AGCAGTCTTT CTAACCGCAC CTCCTGTGGC 6600
CTTCCAAACA GCTTTATTAA ATCTTTGACG TACAGCACAC CCACCACATT ATCAATAGTT 6660
TGTTCGTAGA CAGGAAAGCG TGAGTGTCCA CTCTCGGTTA CCTTTTCAAC GAGTGTTtCA 6720
CCGCTCATAG AAAGCTCAAG AAAATCCACG TCAATACGCG GTATCATCAC CTCGCGCACC 6780
GAAGTGTCAG AAAGATCCAC TATAmCGCGG rTCATAtCCT GcTTTTCTTC ATTCAGCGGT 6840
TGCTGAAAAA TATGGGTAAC AGCGTGCCTG CGCCTCAACC AGTCTATGAC TCCCATGGTA 6900
TACCCGATGA TAGCACCCGA CACGTGTGCG CCAGTATGCG CTCCTGCAAA CGCAACATCT 6960
CTTGTCCAGG GnTCCTnCGA TCAGACTCTA TAA 6993
(2) INFORMATION FOR SEQ ID NO: 53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5460 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
TCGCGnnAGT CAAAAACGGC AACACTGAGT TTTTGCTCAT TGGGGGCAGC CAGGGGTACA 60
AGGAAATAAA ACTGGAAACG GGGAGCGGCA GCGGTACCGG CTGCCTGAAG GCAGAGAACG 120
TGCGCGGTCC GGAACAGTGG GGTGAAGACA GTGTCACTCC CAAGGATAGG GTAAGCCAAT 180
ATGAAGGCAC CATCGGCCGT TTCGCAATCA GCGACATTTA CACCGTTGAG TCCACGAGTG 240
GAGCTGGTGG CACCAACGGC GGCACTAATA AGCCGGACGT GTAtGTGGTG GTGGGGGATT 300
CACAAGACGG GTATACGGGC CTGTGGAGAT TTGACGCCCA GAAAAAGGAG TGGAATCGGG 360
AGTAGCCCGG GCGGATGCGT GCTGCAGGGA GGCGCGGGGC GGGAGGCCGC GCGCCGGTCA 420
TCTTTACGCT TTGATAAAAA ACAGTTCGTG AATGGCGCGC CCCTGCGTCT GCGCCTTGCG 480
TTCAAATTCC GTGGCGGGGC GCCAGGGGCg cGCACCCTGC GGTGCCCACG TGAGCGAGGG 540
CGTGCGCGCA AgcTCTTCCT GCGCGCGCCG TGCGTACTCG GCCCAGTCGG TGACCGCGTA 600
TAGGTAGCCA CCCGGTGCAA GGGCGCGCGC GAGTAGGTCT GTGCGCGGGC GATACAGCAG 660
GCGCCGCTTG TGGTGCCGCG TTTTTGGCCA CGGGTCTGGA AAGAAAATGT GCAGGCCTGC 720
AAGTGTCTGC GGTGCGATCA TGGTGCGCAg caCGTcGAGT GCATCGTGCT CGATGATGCG 780
CAGGTTGTGT AAACGTTCGG CTTCAATTTT TCTCAGCAGT CGTCCGATTC CTGCGCGGTA 840
CACCTCGATG CCGAGGTAGG AAAGGTGCGG GTTGCGTGCC GCGATTGCCG CAGTTGCGCT 900
CCCCATACCA AAGCCAATTT CTACTACCAG CGGTGCAGGC GCGCACGCCG GAGCGACTGC 960
GTCCGTTTTC CCCTGCGGAc GGGaAAAGCA CCGGCAGGCG CAGAAGGTGC CGCCGGTGAA 1020
CAGAATACGG CAGCGTAGTC GAACACCGTG TTCTGATACG GGATnATCCA GCGGGACGCA 1080
AGGTGCTGGT AGTCGCGTTT TTGGCATGCG GTCATGCGGT TTGATCTGCG CGTAAAGGTG 1140
AGAACTTTCC GCATGCGTGC ACTGTCGTTT GTCATGGTGG CGCTTGCTCA GACAGGGCGT 1200
CTTCAGGATA TAAACGGTGA GGTTGTGAAA TAAAGCGCCA GGAGCGCTGA AAGTCCTCAA 1260
CCACGCACTG CAGGTAATAT GCGTCCTGGT GTGTCGTTTT CAGTGCGTCC ATGGACGCCT 1320
CGAAGGTGTG TACTCGGGGA GAGGAAAGGC GCGCATTGGG CAAAGAAGTT AACGAAAGGG 1380
GATGCTGTGC CACAGTGCGG TAGTCGCGCG TCACAGCACC AGGGAAGTGT GCGTGTGCGC 1440
AGTAAAGGGA CCGACCAGTG CCGGCTGCGC AGGCGCCGAT AGCGTCCAGC GTTGCGCGTC 1500
TACAGAAAAG GAAAACGGAA AGGGTTCGTG TGCTGGGTAG CCTGCGCCTA GGGTAACGCG 1560
CTGCTGTGCA TATGTGCCTG TGCTGTCGCG CACGTACACC ACGTACGTTC CGGCAGGAAA 1620
ATCTCCCCAC GGATACTGCA TCCCCCCTAC GCGCAnATtA CGGCGTCCAG GATGGAGTTC 1680
GTCAGTGGCA ATGTACACGC GCGTGTCTTT TTCCCCAAAG ACCCATCGCA TTCCGGTGCG 1740
CAnTCCTcTA CTTCAAGGTA AAAGTACTCC TCAGGAGGAG CAGGGTACTC AAGCGTCACG 1800
TGCAGTGCCa GCGTTGCGTC TGTCTTTCCT GCTTTTTCGA AATACACTCG TTTTAGGGTT 1860
AGCCCCTCTG TGCGTGCATA AAAGGGCATA CAGGAAGAAA GTACTGCGCC CGCTACACAC 1920
GGCACCACAC AAGAAACCAG TACAAACGCG CACTGGCGAG CGAGAACCAT GCGATTAAAA 1980
CTCAAAAGAC AAATCCACGG TGGTGAACTC AGAGCGTGTA AAACGGTTAA GACTTGACTC 2040
AAAGCGCACG TTTGCCTCTT TGwCnCTTCG CTCAAATACG AAATGTAGTA CACGCCCAAG 2100
TTGGCGTGCT TGGCGGTATC TATAGTTTGC TGGCAGTAGT CAACAAGCGT GTCAGTGTCC 2160
GCATTTACCA CTTTTGCATC CTGTAAGCAA TTGCGCATTT CTTGTGGGTA ATCATTCCTC 2220
CCATACACGG TAATTTTTAG CTTACGGCGG GTACsGGCAG TAGACACGCT CCTCCCCTGA 2280
AGTTTCCACG AAATAAAGAG AGACTCAGCG TTGTGTTTGA GTTCTGCAAC AATACGCCTG 2340
CGGATACCTT CGTCGCAGAG CGCCTCTTGC GCGTCTTCAA GGTCTGGAAA GAGCTTTTGC 2400
ACCTCTTTTT TTAAATTCGT ATTCTCCTCA TTGATGAGGA GCTCTACAAG GACTAATGCA 2460
ATA ACTCAG CAATGCGCGC CTTATTCAAA TATTGCGCAA GGCTGGTGTG AATGCTACTG 2520
AGCACAGACC CCATGTCCGG TGCGTGCTGA AACTTACTGA TGATAAACCA CGTAAAGGGG 2580
CGCAGGGTAT CCAAAAACTT TTCGCCGAGG AAAAGAAGCG TATTTTTCTC CTTTGGAGTA 2640
CGTTCTTCGC ACGCAGCGAG CGAAGCGTGA AGCAAACGAA GAATATGATG CTTAATCTGC 2700
GTGACGTGTG CTTCATGCTG TTTTAAATGA TTATAGATGA AAGGAGCATT GAGATTTGCG 2760
TGTTCGTCAA TGCTGCTCCC CGGGTTACTG CGATTCCACT TTTTAATTAC TTCAGAGTTC 2820
AGAATCTGTC TGAACACGTA GCAATCGTAT TGACGGTAGA GGATGGAATG CACGATGAGC 2880
TTTGAAAGAT CAATAATTTC TTGACGTGAG GAAGCAAACT CAGGCGGTGA AACCTCAATG 2940
AGGGAAACGT ATCCAGAGAG CAAAAGACCT TGCACCGTTT TAGGTGC AA GGCATCGAGT 3000
GCGATACCAT AATCCTCGAC ATGCTCAGCG AGTTTTAATT TCATGAGCTT CCTGTTTTGT 3060
TTTATAAAAA ACTCGCTGCC CTCTTGCGTG AGTACGAGCT TGAGTGGGAG GTTCAGGATG 3120
CGTCGTTTAT GTTTTGAACT CATACTTTTC TGCGCTCCCG TTGAAGTGTG TCCGCCGTCC 3180
TTTCATTCTA TCAAGGAATG AGGGGTGGGG GATAAAGAGA TTCTACGTAG CAGACGAACC 3240
GCACCGATCC TTCCTGTGCA TGCAGGAAGG AGTCAGGAGG GGCGGTGGGG AATCAGAAAT 3300
AACCCAGAGA AAGGCTGATG TTCTGCGCAA AAGCATCCGG AGTGCTGACC GCTACAAAGA 3360
GGATAGGAGC TGCCGCACGT GAGGCGCCTG TGCATAATCG GGATTGCGAT CCAAAAGCTG 3420
CGTTAGTACC TCTTGTGCAT ACTCTTTTTC GTTCACTTTG ATTGCTAGTT TCCCAGCCTC 3480
TAGGTACACG TCCCACGCGC GTGCGTCCTG AGCGAGCACG GCACGGTACG CGCGGAGTGC 3540
TTCATGCGGC TGTCCAGCCG CTnACATACA CGCGTGCCAC TTGGCACTGT GCCTCGCGGT 3600
CTTGAGGCTG TGCGGTGGCG GCGCGGCGGT AGTGCGCTAT TGCCGTCTCC CACTGCGCGC 3660
GCAGGACATA CAACTTCCCT AGATTGCTGT TTACCTCAAA GTTTTGCGCA TCGTGGGCAA 3720
GAGCCGCCTG CAGGTGTGTT TCCGCCTCTT GCAATGCTCC CTTGTCCAAA TACAGTTTGC 3780
CCAAATTGTT GTGAGCCTTC ACGTGTGCAG GGTCCCGTGC TGCAGCCAGC TGATACTGTG 3840
TTAAGGCAAG ATCCACACGC CCTGTTTTTT kCGCTGCAAc aTnAcGCGTA GAGGAAGCGT 3900
GCCTGCGATC CATTCGCGTA GACGGCCtGC TGCGCGTGCT TTAACGCTTC TTCGTTCCTA 3960
TCGAGATCAA GCAGCACGGT TGCGAGGTTG TACAAAGTAA GCGCGTCCTC AGAATCCCGC 4020
TCTAGGATTT CTTGAAACAG ACGCACAGCC TCTTCCTTCG CACCACTTTT TGCCAGGACA 4080
CGTGyGCGTT CACGCGnGGC TGTCAGGTAT AGATCATGCG cTTCCGCAGC AGCCGCGCAT 4140
TCTTGCGCCT TTTGATAGTC GGCAAGCGCA gCAGACACTT CTCCTTGTGC ATCGTGCACA 4200
CGTCCAAGCT CTATcCACGC ACGCGTGTGC GTTGGGTTCA ATCGGATGAC CGCCCTAAAC 4260
GCCTGCAGGG CTAGATCGTG TTTTGCACGT ACCCTACAGG TGAGCCCCAG ATTAAAAAAT 4320
GCAGGCTCGA ACTTTGGATT CGCAACCGTC GCCGCATTGA ACGCTTCCTG CGCCTCAGCA 4380
AACCTGCGCA CTGCGAAAAG ACGCTTACCC AACTCATAGC TGTAGCGATA TTCGCGCCGG 4440
TCAAGCGCCG CGGCGcGTTC AAGCAACGTG AGAGCCGTTG TTTGGTCGTC ATCAGGctGC 4500
GCATCTGCAA TGCACGCGGC AAGGTAGTGT GCAGCTGCGG AgCGTGGGTT GAGCCTCAAG 4560
GCTTCCTTCA CATACACGGT TGCCGTTTCA AGTGCCCGTG TCCGCTCAAA ACCGTCACGG 4620
TTATCGTGCT GTGAAAGCGC GTACATAGCT TCTCCCATAC GGGTGTATGC GTCTGCTGCG 4680
AACACCGCGT CCCCGGCAGG GAGTGCACGT ATTGCTTTGT TAAACACACG CACCGCTCCT 4740
GGATAATCAC GTCGTTCCGT CAGTTCTTTT CCTTCGGAAA GCAACGCGTg CACGTGGTGC 4800
TGTGGCGTGG CAGTTTCTGC AGGTGCGCGC ACCTCCGGGC GCGTTGCAAT CTGCACGGCC 4860
CGCTTTATCG ATTTTTCAGG AGAAAAAGAC GCCGTGCGAG ACACTCCCCG GGGCGGGGTG 4920
AGAACACGCG CGCCCTTTTG CCGCGTGTCT TGTTCTTGCA TACGCTCTCT CGGCGTAGGC 4980
GCGTAAGGAA CTGGGCGCTG GGGAGCTAAC TCCTCTGAAA GGGTCTGTAG GAACTGTTCT 5040
TCATCACTAT TCCCTGAGAG CTGGACGTGA TGGTCCACAA GCACGCCGGG TTCCTCCCCT 5100
TGCTCTTGCA GCAGCTGGTT CGTCTCCAGC CAGGCGAGTT CTTGCTCAGA AACGCCCTCC 5160
TCCTCAAGGA GAGTCTCGCC CGCTGCACGA GGTAGGCGTG CGCGCACCAC CCCCCGGGAA 5220
AAGAGACTGA ACCCGGCAAC TACCGCAAGA AGCAGCACCA GCCCCGCGGC GAGTGCAATG 5280
AACGTCTTGT GCACATTATT CAAGGTTGTG TTCCTCCTGA TAGGGGACGG TGTCCTCCGA 5340
TCCAGTGGAG AGGGTAnGCG CGTCCTCCGC TTGTTTCAGT CTAAGCGCGC GCTTGAGAGC 5400
TTCAAACTCC GCCTCCTTGC GCCGGGnTTC CTCCGGCTTC CTTGCGGCGG GnTTTCTCCG 5460
(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10461 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
AAATCGTGCT GATACTGCAA CTTAGTAGCG ACAGGAGTAA TGAACCAATT TTGCACCAGC 60
GAACCATAGA AAGTAGTGAT AAGCGCATTG CCATGTTAGA TCCCAGCGAG GACTTGTCTT 120
CAAGCGTTGC AAGCATACCG ATAAGCCCCA TAACGGTGCC CAGCATACCA TATCCGGGCG 180
CGAGcGCAgC CCAGGAGTTC AAAAGGGAAA TCCACGTATT GTGCCGATCC TCCATGTGCG 240
TCAACTCGCT TTCCATCAGT GCCTTGATCG CATCTCCGTC CACACCGTCT ACCACGTTCC 300
GCAAACCAGT GCGCACGAAn TCATCGTCAA AGTCCTGAAT TTCTTCTTCG AGCGCAAGTA 360
AACCGGTGCG CCGACTTTTC TCAGCAAGCG CGTAGAGCCG CTGGACAATC TCCCGTTCGT 420
GAAAATCCGC CGCATGAAAA ACGCGCGCAA TTACCCGAAA AACACCCACG GCATACGAAA 480
GCGGATAGGT GAGAAAAAGC GTTAAGTACG AGCCCCCCAC GGTGATCAAC AATGACGGTA 540
CGTGAAAGAG CCCCCTCGCA GAACCACCGA GCACCGCACC AAAGATAATG ATGGCAAAAC 600
CGCCGAAAAG CCCGATAAAC GATGCGATGT CCATCGCTTC CCCCGTGTCT TAGGTCTCGT 660
CGTTGAGGCA GCCGATGcTG CGCCGATAGG AGACAATTTT ATCGATAACT TCTTGCACAC 720
TTTCCCTCAC CACATAGCAC TTACCCGACA GCATTTGAAG CGTTACATCA GGTGTACAAC 780
GCATCGTTTC AATGTGGTGG GGATTTACCC AATTTTCATT TCCATTCAGT CGCGTCACTT 840
TAATCATCCC TCATCCCCAT CACGCCACCT GCCGCTTAAG ATACATTTTC ACACAGTCGA 900
CACATCAGCG CTTCAAACTC AACACCGTAT CCAACATGGT GTCTGATGTC TGAATCGTCT 960
TTGCGCCCGC CTGAAACCCT TTTTGGGTAA TGATCATATC CGTAAATTGA TCGGTTAAAT 1020
CTACGTTGCT CATCTCAAGT GTCCCTGCAA TCAACTTTCC CTTCCCCATC ACCCCCGACG 1080
TGCTAATGTT CGCTATCCCT GaGTTGTTCG ATTGTACGTA GGTGTTCTCT CCTGCCTTCT 1140
CAAGACCACC TTGATTTGCA AATCCTGCAA GTGCGAGCTG GCCAATGTCT TGGCTCACCC 1200
CATTTGAATA CACACCAGTG ATGACACCGC TTTGATCTAT TTTAAAATTT TCCAAATATC 1260
CCATCGCGTA ACCGTCCTGC CGGTAGCTTT GGTAGTACTG CGTTCAGCAA AcTGCGTAAT 1320
CGTATTGCGC GCGGTGCCAA TTTCACCCAA GTTGAGCGTG AAAGCGTGGC GCGTAACCTG 1380
cCCTGcATCG TCCGGaTTCG CACCGACAAC ATCGTACGAC GCTTCAAGGA GCACCTGTCC 1440
GGTAGGACCG GTCACGTTCC CTGCAGTGTC AGTCACTGAA GCGAGGTGTC CAAAATTATC 1500
AAAATTTACA ATAAAGGTGT TTGCCGCACC GTCAGATGTC CCCACCCCTA CACGCGTTTG 1560
CGTATCTACC TCTGTCCCCG GATCCACTGC GACAGTGGCC TGCCACTGAT TGTTCGTCCC 1620
CGGCACACGC GAAAAGTTAA TCTGCAACGT ATGCTGCTGC CCGAAGCTAT CATACACTTG 1680
AAAGTCAGTT GTCCACGTGG ACTTACGCAC GTCCGCTTCG TTCGCATCTG CAGCAAGCTC 1740
AGGCAGACGC TTGTCTAAAT TACAGGCATA GTGAACAGTG CTGGTCTGTG CGCATCTATC 1800
TTTTGCCCAA TGGGGATAAC GAGATCCTGC GTCTGTGCAG AGGAATTAAT TAAACGCTCC 1860
CCCGCCACGT CCTGCGCCAT CCAACCTTGA ACGCGCATAC CATTCGCAGG GTTCACGAGA 1920
GTGCCCGCAT TATCAACCCC AAAGGCACtG CGCGGGTGAA AAACGTCTTT TCCCCACTTT 1980
TCAGCACAAA AAAACCACTC CCCTGAATAG ACACATCCGT ATTGATACCC GTCGTTTGCA 2040
GTGCACCTTG CGTGTGAACA GTATCGATGC TTGCAATCAG CACGCCCAAT CCCACTTCCT 2100
TGGGATTCAC TCCTCCAACT TCTTCATTCG GACGCGCAGC cGcACTCAGT TGCTGAGAAA 2160
TAAGATCTTG AAAATTAACA CGCCCACGCT TAAAACCGGT AGTGTTAACG TTCGCGACGT 2220
TGTTCCCAAT GACATCCATG CGCGTTTGAT GATTCTGCAT ACCAGACACA CCTGAAAAAA 2280
GTGACCGCAT CATATGCTCT GTGTCCTCCT CATGTGACTC ATTTTCCTAA TCTCTCTTTC 2340
TCTGTTCACA AACCATACAA CTGCTACGAC GCACTCGGAT CTGCAATCAC CTTGACGTGC 2400
TCCCATTCGT ACCAGTGCGA CCCCACCCGC ACCTGGGGCT TGTCAGCACG GGTGACTGCA 2460
CTGATAAGCC CACGAACAGT GTTATCCGCC TCAGTGACTT CAACCATTTT TCCCACCGCC 2520
TGCAGCGCTT CAGTATTGCC AAACAGCGTT CCGAGCTTCT CTACCTGCGC ACTCATGTTG 2580
GCCATCTGCT CGAGCGAGGA AAATTGCGCC ATTTGCGCAA TAAACTGCGT GTCCTGCATA 2640
GGCGCAtAGG ATCCTGATGG GTAAGCTGCG CAATAAGGAG ATGCAAAAAA TCGTCCTTTC 2700
CTAACTCCCG CTTCGCACTG CGCGCGCCTG CCTCAAGCTG CTTGTTTATA ACGCGCACAT 2760
CCATTTCTAA ACGCGTACGC TCAGCGGCGG TCATTTCAAA CCGCATATTA GTGTTCTGTA 2820
CCATGCCCCG GCCCTCCTCT TTTTTCACCG GTTGTGTACG CTGACCCCCC TACAGGGTAG 2880
GCAAAACCGG CGCGCCTATT TAGGCAAACA CGTCAATCGT GAGCGCAnct CCcTGcGCAT 2940
GCCAATGGAC TTCTTGCACA ACAGGCTCAA CGCCCGCTCC CCCAAGACGC TGTGCAGCAG 3000
cgTAGGCTGC CGTCTGCGAT GCCAAATGCC CGTCCTCTGC GTGCGCACCA GCGCCAAACC 3060
ACTGCACATC AAACTGCGCA GskTCAAAAC CATTTGCCTC GAATGCACGC GCCAAATCCC 3120
CCAGATTTTC CTGAAAAGCT TCAAACGCCT CCTGAGAAGC AACGTGAATA GTACCCACCA 3180
CCCGCTTATT CTCCGACAGG GCAAGACGTA TGCTCACCGC ACCAAGGTGC TCTGGCTTCA 3240
GCGCAATGTC GATGTATCCG CGTCCGTGAT CGCGCAGCAC AACCCGTCCA GATTGCGCAA 3300
GCTCTGCACT ATGgCACGAA TGTGCGccGA AAGAGCCGCC TGAGTGGTAG CAAATCCTCG 3360
GATCCCCTGC GAnTGCGCTG TCTCCTCACG CGCGTGCGCA ACTCCTTCAA ACGCACGCTC 3420
CACGCCTGCA CGCTCCCCCT CACGCAGCGA CTCgTACCGT GCTCTTCCGC CCCCGCATCC 3480
ACGTCCGCAG CCGaCAGtGC GCCTGCGCCG CAGGAAGTAC GGCGCCCGCG TGCGAAGACA 3540
CACCCGACCC TGCACCGCGC GAATGCtACG CGCGTCCAGC ACAGTAAAGC GCGCGTCCGA 3600
AAAAGATCCC AACTCCTGAG GAGAATCCCC ATGCCGAATC TGCCCGTACG CAGTCGCCCC 3660
CTGGAAmGCC GCACCGGCGC TCACCCCTGC AGCAGCAGAG GCGTGAGACG CACCTGCCTT 3720
TCCTCCAAAG GAGCTGCCTG CTCCCCCAGA CTGCAAAACA GGCTCACCAC GGATAGCAGC 3780
CGCCACCAAC CGCTGCCGCA CCTCTGCATC GAAAATAACG TCAAAGACTT CAGACTCCGA 3840
CGATGCCGCG TACGTGGcTT CGCTCCCCTG cTCACGCAGC AGGGCAGGGG CAGCGCCAGA 3900
AGGAAGAGAC CCTCCGCGCA ACCCCCGCGC ACCAGCGCTT TCCCGCAGAT GCTCCCCAGA 3960
CTCCTGCCCA CACAGGTCCT GCACGTGCGG TGCAGTCTCC GGCACACGCT GCGsTaCGCG 4020
cgAGaCAGTC CTGCAGGACG CGCCGCACGC CCCCGTTCCC CGGATGAGGG CTGCTCTGGC 4080
ACAACAGACC GCGGCGTTAC GGAAGTTGCC TGCCGCTCTA CTACAAGAAA CTCCGGCGCG 4140
GACGGCCACG ACGCACCACC CGAAGCTTCC TGCGCCGCGC gnACGTAATC AAAGGGCCGC 4200
ACTCCTGACG CCGCCTCCTG CGCAGcaCGc AAACCAGTCT CGTACGCCAC GAGAAATACA 4260
TCCGAAAGAG ACTCTTCTGT GAAAACGTCC TGTTGGGTAC CCGAACCGGT TTGCTGCTCA 4320
TCAGCTTCCT CGCGCACCCT TTCGTGCACT GCAGCCGACG CAGGGAAAGA GAGACTCTCT 4380
GTGGCGCAcG CCACGCATGT GTTCATGCAA CGACTGAGCA AAAGAACACG GTGcTGCCGC 4440
ACACGACGTA CCGACTGAAA TAGTCTCCTG CGCAGCAGGC GcAGACTcCG CCACACCGAT 4500
ACCAATGGCC CGTGCCAGCA GTCCTCTCAG TTCCATGCAC TCCCCCCACT CCCTGCCATT 4560
CTCkGcGCAC TGCgCAGaCG TCTTAGAAAA AAGACCTGCC AGTCCGGTTC GGACGCATCT 4620
TTTCAATACA AAGCGsACGA GAAATTCAgC ACCyTCCcAA AGGsTcCAGC ACGCCGCTTC 4680
TTTCCCGTTC AGTATTTCCC CGTTTCCCCT GACAAAAGAT GGATTCTCGG ATACTTTTCC 4740
CCTCCGTCTA TGGAAAAGGA AACAGCTGGC ATTCTCTGCT CCTATACGCT TTTTCACAGT 4800
GCGCTCGTGC TGGCGCTGTC CCTCGCGCAC GGGCGTACCC AGGTGCCCCC CAGCTCCACG 4860
CTCAGCTTTT TAACGGTCAT TGTACTCTGG CACTGTCTGC TCTTCTTTTT TCTTGTCGCG 4920
TATAGCAGAG AACCTGCAGA TACCACCGTG CCGTTTAAAC CGCTGCCTGA ACAGACAGCG 4980
CCTATTTGTG CCGCCGCATC TTCTGACTGT AAGGAGAACC GCACCGCGCT GAAAACGCTG 5040
AACACTGCAA CGCACATCAC GCTTATCCGT GCCAGTGCTA TTCCTATCGT TGGCTTTCTG 5100
CTTAAATTCC ACGCACTGGC GGGGCTTTCT TACTTCCTCG TTGCAGGACT GAGCGTTTTG 5160
TTCCTCACCG ATTTTATCGA TGGCAAAATT GCCCGCGCAA GACGAGAAAC GTCCCGCGTG 5220
GGAGAAACGC TCGACGCAGC AAGCGACTAC GCGCTTATCG GGCTCATCTC AGCGCTTTAC 5280
TACCAAAGCG GTGTGGTGCC CCTGTGGTTC TTTGTGCTTA TCATCACCCG GCTTTCGTTA 5340
CAAACGGTTA TTGCCTGTGT GTACGCGCTT TTTGGCCACC CGATGAcCGG TTCCACCGCG 5400
GGGGGCAAAG CGACGGTGGC CGTGACTATG CTCCTGTACA CGCTCGAACT TGCCCGTCTC 5460
CTGCTGCCGA ACCTTGCGCG ATCAAACAGC GGCGCGCGCT TTTTTACCGG GGCAGAAATC 5520
TTGCAGGATT CGTCATTTTC ACCGGGATAG TGGAAAAACT GTATCTTGGC GTTCAGCATC 5580
GCCCAGGACG CTCCCCGTAG GAGAGACGAT ACTTGCGCCG TGCCTTGCAA CACACAAAAC 5640
CTGTACCAAC CGGGGCAAAA GGAGTGCACG CCCATGGATG AAGGAAGAGA AACTGTCCAG 5700
CCtgcGCATC GCGCAAAGGA GGAAAAAAAA CAGGACGCCC ATCTTGCATG GGAGGTACGG 5760
AAACnGCACG ArGCGTGCgC CTGCGCGTTT TTCACGTGCA AGAACTCGAA AGCGTTTCAC 5820
CGCGCAAAAC GGTACtCGCT TTGTAACGCT CACTGCACCT GAGTGGGTAA TCGTCGTGCC 5880
GCACGTGATG GAACGCGCAC AACGCTTCTT CGTTATGGTk CGCCAGTGGC gCTGCGGTTC 5940
ACAGACGGTG TGTACTGAAT TTCCCGGCGG GGTTATCGAC GCAGGGaGCA CCCTGAGGCT 6000
GCAGCGCGCA GGaGCTGTTT GAAGAAACAG GCAGACGCGC TTCCTCTCTT GCACACCTTG 6060
GCACCATACA CCCGAATCCC GCCGTGTTGG AGAACCGCGT GCACATCTTC AGCGCCGAGT 6120
GTACGCCTGA GnTACGTGAA CCGCAGTTGG ATACCGACGA GTTTTTAGAG CGGTGCGTGC 6180
TCCCCGTGCA CGACGTGTAC GAACGCATGG GCCGCGCACC CTTTGACCAC GCGCTCATGG 6240
cGCAGCCCTC TTTCTTTTTT TGCGGGCGCA TCCGCTTTCC TCCCTGTAAC TCAGTGCGGT 6300
ACGTCcCTGC AGCGCGTCCA TCTAGGTCGG CATAGAGCGC CGCTCTAAAG GGGGGTATCA 6360
TCCCGGCTGC ATAcTCTGCA GCGCAGAGCG TGTTGTGCAG CAGCATCGCG ATAGTCATCG 6420
GTCCTACTCC CCCCGGAACA GGCGTGATCG CCTGCACCTT GTGCGCCaTG CGTCAAAATC 6480
CACATCACCA CACAGTCTTC TCCCGCGCGG TGCAGTTGCA TCTGGCACGT GATGAATACC 6540
CACATCGATA ACCACGGCGC CGGTGCGCAC AAACGGCGCG CCAATGAAGC GCGCCTTTCC 6600
CAGTGCTGCA ACGAGGATAT CTGCCTGCAC ACAGATATCC GCCAAACCGC GCGTGTGACT 6660
GTGACAGAGC GTCACGGTTG CATCACAGCC GGGAGAGGCA AGGAGCACTG CAAGCGGACG 6720
GCCAACGATG GCAGAACGGC CGACAATTAC CACGCGTGCC CCCGCAAGCG GCACCTGCGC 6780
ACGCCGGAGC AAGTGCACAA TCCCCGCAGG tGnCAGGGAA CAAACCCAGG CTGCGCAAGG 6840
AAGAGCGCAC CACAGTTAAG CGGATGAAAG CCGTCGACAT CTTTTTCTGG CGCCACTGCG 6900
CGGCACACCC TCGCTGCGTC AAGATGCGCA GtAACGGCAA TTGGATCAAA ATGCCGTGCA 6960
CCCGCGCGTC CTCATTGAGA CGAGCAATAA GTTCTAACAC CTGTGCGTGA GAGGCATGAG 7020
CAGGCAGCCG GTGCGTTTCC CCCCGCAGTG GGCGCGAgCA GGGGCACGCT GCTTTGCTGC 7080
AACGTAGTAC AAGAAGCCGG GTCATCCCCC ACCAGCACTG CGGGCAAGAA AnGGCGCCGT 7140
GCCTAcCGCC GCACGCAGCG CCTGCACACG CGTTGCAAGA CGgCCGTaCA CTCGTGTGCG 7200
GCTTGTTTTC CATCGATGAg GCGTGCGTCC ACGcGCCCAg TATAGACACG CGCACGCAAC 7260
AGCGCAAAGA CCGAACACGC ACGGACACTA GACGGAAGCC CAAGAAACAC CGTATGCTCG 7320
GCGTCGTATG AGCAGAACGT TCCGCGCGTG GCAGTGCGTT GGTGCGCTGT GTGCGCTCTC 7380
TCCCCTGCTG CCTGCCTACA rcTCCGAGGG CGTGCGAGAg GTACCCCCCT CCCAGTCTCC 7440
GCAGTGGTGG TGGCGTACGA GCCCATTCGC CCCGGGGATC AGCTGCTCAA AATTGGCATT 7500
GTTGCAGGCT GCCAGTTGTA CATAGCAGGG GGAAATGGAA CCAACGGCTC TTCGAGTTCC 7560
GGCACCAACG GTAACGGCAA CGGCAAACTG CTCGGGGGCG GGGGGTTTCA CCTCGGGTAC 7620
GAGTATTTTT TTACCAAAAA CTTTTCCCTC GGCGGGCAAG TTTCCTTTGA GTGTTACCGC 7680
ACGACCGGGT CAAACTATTA CTTTTCTGTT CCCATCACGG TAAACCCCAC GTACACGTTT 7740
GCCGTAGGcG ctGGCGCATA CCGCTCTCCC TGGGCGTTGG GCTCAACATT CAGTCCTATC 7800
TCAGCAAGAA GGCGCCGGGG CTTATTGCGG AAGCCAGCGC GGGGCTCTAC TACCAGTACA 7860
CCCCGGACTG GTCCATCGGC GGCATTGTTG CCTACACGCA GCTTGGGGAC ATTGCAAGCT 7920
CCCCCGACAA GTGCAGAGCC GTGGGCCTTG CCACCATTGA CTTTGGGGTG CGCTATCACT 7980
TTTAGCCCCG CCGCCGGGGC AGGTGGCGCG CGCGTCCCTA CTGGATAATG GCTTCAAGCG 8040
CAATTTCTAT CATTTGGGTA AAGGAGCGCT CCCGCTCCTG CGCGCTAGTT ACCGCGCCGG 8100
TTACCAGGTG GTCAGAGATA GTCAGAATGC TCAGCGCCTC GCGTCTGAAc TTTGCAGCAA 8160
GCGTGTACAG cTCCGCCGTT TCCATTTCCA CCGCTAACAC CCCATACCGG GCCCACAGGC 8220
GCCAGCTTCC TGATTCATCG TAAAAGACGT CAGAGGAAAT TACATTCCCC ACCTGCACCC 8280
CCGTGCCCAT TTCATCAGCA ACCGACACTG CCGTGCGCAG GAGCGACCAG CTTGCCGTGG 8340
GCGCAAAGTG CATGCCGCTA AACtGCGCGC GTTTATTGCA GAATCCGTTG CCGCACCCAG 8400
CGCACACACC ACCGATTTGA GCGCCACTTC CTCCTGCAAT CCACCGGCAG TCCCCACGCG 8460
GATTGCCTTT TGCACCCCAT AATCTTGAAA CAGCTCCGTT ACGTAAATTG AGTGCGACGG 8520
CAGCCCCATA CCTGTCCCCT GCACCGACAC GCGCACCCCC TTGTAGGTTC CCGTAAACCC 8580
GAGCATGCCA CGCACCTCAT TGTAGCAATA CGCATTGTGA AAAAAACGTC CGCCACAAAA 8640
CGCGCACGCA GCGGGTCACC GGGCAACAGC ACGCGCGGCG CAATATCCTC TCCCTTTGCT 8700
CCAAGgTGAA TACTCATCGT CACTCCCTCC CTTCGTGGGG CCTAGACCCA CAcGTTTCGG 8760
TAACCTCGCG CGTGTACGCT CAGCGCATGC AGCACCCGTT ACGCTTTTTG CGCAAGGTAC 8820
GCGTTTATAG ACGCCGCTGC ACGCCGCCCC TGCCCCATCG CACGAATAAC CGTTGCCGCT 8880
CCTAAGACAA TGTCTCCCCC AGCCCACACT CCCGGAATGC TCGTCCGTTG ATCCTCGTCC 8940
ACCACGATAG TACCCCGCTC GCTCACTGCA AGACTGCGCG TTGTCTTTGC CATGAGCGGA 9000
TTTGAACCAT TCCCAACGGC AACGATCACC GCGTCTGCAG CAAGTTTACA CTCAGCATCG 9060
CCGCAGGGCA GAAACACACG TTCTCCTGCA TCAATCTGTT CCTGACAATC GCGGAACACT 9120
ACCGCGCGCA CGTTCCCCTC TTCATCCCCC AAAATGCGGG TGGTCTGACA CAAAAAGTGA 9180
AACGTCACCC CCTCATCTTC TGCCTGTGCA ATTTCCTCCA CACAGGCGGT CATATCCGCA 9240
CGCGTTTTTC TGTACAGACA GTGCACCTGC TCAGCCCCTA AACGGAGCGC CGTACGCGAG 9300
GAATCTACCG CCACATTCCC TCCACCGACT ACCACCACTG ACTTTGCCGC ATACACCGGC 9360
GTGTCCGCAT GCGCAGTGTC ATACGCCTTC ATCAGCGTCG CACGCGTTAG GTAGTCGTTT 9420
GCTGCAAACA CCCCGCACAA TTCCTCACCC TCAATATTCA TAAAGCGCGG CAATCCCGCA 9480
CCGGTCCCGA TAAAAACTGC ATCAAAACCG TACTGCGAGA ACAGcTGTTC CAGCGTTGCT 9540
GTTCTGCCCA CCAAAAAGTT CATCCGGAAC GTcACCCCCA TTTTCTTGAG TGTTTCAATT 9600
TCCGTCACTA CCACTTCTTT CGGCAGGCGA AACTCAGGAA TACCATAGGT CACCACTCCA 9660
CCCGGTTTGT GGAGCGCTTC GAACACCGTT ACCGAATGGC CTGcACGCGC CGTATCTGAG 9720
GCAACTGcAA GACCTGCAGG CCCTGACCCG ATGACGGCCA CTTTCTTGTG CGTAGACGGC 9780
GCACAGTACG GAACTGTAAT TTGACCATGC TGCCGCTCCC AGTCAGCGAC AAAACGCTCA 9840
AGCGCACCAA TCGACACCGC CTTGGACACA TCCTTAAACA TCTTTCCCAC GGTACACTGC 9900
AATTGACACT GACGCTCATG CGGGCACACA CGACCGCAAA TTGCAGGGAG TAAACTCGTC 9960
GTCTTAATGA TATCAACTGC TTCCTTAAAG GCTCCCCTTT GGACACACGC AATAAACTCA 10020
GGAATCGGCA CTCCTACCGG ACAACCCTTT ACGCACGGCT TGGTTTTACA ATTCAAACAA 10080
CGCTGAGACT CAACCAGTGC CTGCTGCTCT GTAAAACCCA GCGCCGCCTC CTGCATGAGG 10140
AGCGACCGCT TTTTTGGCGG CAGCATACGC ATACGCTGCA AAGGGATCTG CGTGCGATCC 10200
TTCATCTTCA GCTCTTTACC CTGGAGCTGC GCCAGGCGCT GGCACGCTTC TTCCTGGAGC 10260
AGTGCGTGCG GCCGATACGT ACGCGCTTCT GGTTCTGACT CAACCGGTAC GTCACACGTC 10320
TTGGCATCGC TTACGACATT TTGTACAGAT GTCATACCTA CCTCCCCGCG TGGTGCTTCA 10380
TCTTACAGCA GTGGACATCA TGCGCTTCCC TTGCCTGAAA TGCCCTCATT CTCCGCATCA 10440
TGCTCTCAAA ATCAACTTGA T 10461
(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13367 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
CTTCGCGCGC ATCGACATCC TCAATACCTT TATGGACAAG GCAGATACAG ATTCTGACGC 60
TTTCAGAGAA ATGTTCGACT ACTTTAACAC ATTTTTGCGT GCGTTTAGTG TCGTGGACGG 120
CAATGTAATT GCGGCTTACT TGGTGGTAAC GCGTGTTTCC ACGGTGCTGC CTCACCTAAA 180
TGCGTGTAGA CCCCATGGTT TTGCGGATTT GTACGCGCAT ATTGCGGATC CTCGATTGGT 240
GTACACAGAG ATAAAGGATA AGGGCCTCAA GTGGGAATTC GTGAATAGTG TGAAAAACTT 300
TGTGAGCAAT TGGAGCGATG AGTATGTCAA GCTGTTCCCC GAGGTGCTCT CTCTAGAGAT 360
TCTTCGCGCG CTTATGGAAG AGGGATATAA GGAAAAGGCA CTGAGGGTGG TCGAGGCTTG 420
CTTTGAATAC TATGCGGATA ATCGTGCGGC GGTTATTGGT TATTCAAGAC GGTAAnGGAT 480
GAGCCTTGGT TCCAGGGAGC TGCGCATTAC CGCAGAACAG CGGATTATCG TCCTCATCCA 540
CATTGTGGAC ATTACTTATC GGGAAATCGC TAACCGGCGG AACACCACTG AGAACCGAAA 600
ACTTAACnAG CAGGCTCTTT CGGTACTCTT TGGGAtGATC ATTTGCyAgA ACACyTCCAt 660
GCyTTCGCaC GATGTGGGAA CTACTACCCG TCTTTACACG TTATAAGTGA TATCCGGGGC 720
TTGATCCAAA GTTAAAGGTC CTTTGCGCCA TAAATTATTG AGAAGTACAG GATTTTAAGT 780
TTTTTGATAC TGAGGAACGT GTGGTTTCCG GACGTGGACT AGTGGTAACT GCAAAGATGC 840
TCAATGCAAA AAAGAAAGAA TTGCaGGATT TGCTTGATGT TCGTATTCCG GAAAATTCTC 900
GAGAGATTGG TAGGGCCTTA GAACTCGGTG ATTTGCGTGA GAACGCAGAG TATAAgGnTG 960
CGCGAGAAGA ACAAACAAGG TTGAACAATA TGGTGACTCG GCTACAAGAG GAGATTGAGC 1020
GGGCACAGGT ATTCGATCCT ACCACTGTTG TAGCTGGCAG AGTTTCGTTT GGTACGGTAA 1080
TTAGCTTAAA AAATCACACA AGTGGAGAAG ATGAGACATA CACTATTCTT GGTCCGTGGG 1140
AGTCGGCTCC AGAACGTGGT ATTATTTCGT ACATGTCTCC GTTAGGTAGC AATCTGCTCA 1200
ATCGTAAGAC AGGGGAACAA CTTGCCTTTA CGGTGGGAGA ACATGAAAAG GTGTATGAGA 1260
TCTTAAGCAT CTCTGCTGCA GAGATCTAGT GAGGAAGTGT GCGATGCGAA TTATGCGGaG 1320
ATTAATGTTA TTTCTTATGT GTCTATGTgc TGCGCTGTTT GCGCAAGAGC TGGTTCGCGA 1380
ACAGAGTGTT ACAAAGTCTG CAGATATTAC GGTGCTACTT GA ACGTCTG GCACTATTTT 1440
ACCGTACCGT TCCGTGGTAA GCGGTAGTGT GCTAAAAGAT ATCGCTACTC GTTTTGTGCG 1500
TTTGGGTGAT TCGTTCCATA TTATTTCGTT TAGTGCCACG CCACGTCACG AGATTTCTCA 1560
GGTTATCCGT AGTGAGTTTG ATCTTTCTCA GGTAGTGTCT CGTTTCATGA TATTGCATCA 1620
GTTGGGGTTA TATTCTGACT TTTTAACAGC GCTAGATTTC GCGCGTACAC ACTTaCGCGC 1680
TTTGCCTGCA GCACATGAAA AAATTTTGAT TGTTGTGTCT GmCGGTATTT TTAACCCGCC 1740
TGCGCGTAGT TAgTgAAAAA CTACAACAAG GATCAGGTAA AAATTAACCT TGCACGGGCT 1800
GCCGCGGATC TGAGACGAGA GCAGGTGCGT GTGTTTTACA TAAAACTTCC CTTTCCCCAG 1860
GACATCCAGA TCCGCGATTT GGATGACAAT CTGCTGACTG ACCTACAAAA GACAGATGAT 1920
GTTCAAATCT CTGCAGTCGG TAGCTTTGCA GAAGGACAAA CAAGAAGGCC TAAGTTGGAC 1980
ACTGTGGGTG TGGTTTCCGA TCAAACGGGC GGCGTTGCAG ATAACCATGC AGTTGCTACG 2040
CACGGAAGGG AGGACGGGAC AGTCCAAGGG GTTGTTGGCA GCCATGTGGA GGTGGCACGC 2100
ACACAGGACA GACGCATAAT GCAGATCCTG CTAAAAGGGA AGGGGTTCGG CCTTCCTCAG 2160
AAGCAACTGA TGTTTCCCGC GAGTTCACGG AGGATTTGGG AATCAGGGTG AGTCCGGTTG 2220
ATTCAGATGG TTCTGTGCGT TTTTCCGAGA AGGAGCGCAC GCTTcCCGTG TTACACTTTC 2280
CAAGGGTCCT TGAGGTACAG GGTAAGTATG CAGAATGTAT GTTCGAGGTT GAAAATAGCA 2340
CGGATGCTCC CGTTTTGTTG CATTgGAGCG GGTGATTTTT GACAATGGCG TTGAGACTGA 2400
CATAGTTTCG GTGCAAACAG AGTCTTGTGC AGTAGCGTCC GGTGCACGCG CGATGTTGCG 2460
AACAACTTTT TTATTACCTA AGCGCTACCA CGAAGAGGGA ACGTACCAGG TGACCATGCG 2520
TGTACAGTTT GCAGATAACG TCCGCGTGTT CCCTCAGGTG GCAACAGCAG AGCTGCGCGT 2580
TTCTCCTTTG CCTTTTCTTG GATTGGTGCG GAGAGGTATA CATGGGGTTC TGTCTTCTGT 2640
AGGGCTTACG CATGCGTTTG GATATGTGTT GGACATGGTA GGGTTGAGTC GCACGGGTTT 2700
CGGTGCGGTG CTTTTGCCTC TGTTTGCTTT GGCTATCTTC TTAGTACTTG TATCAGCCGT 2760
GGTGTGTAGG TCAAAGCGCG TGTTGTCTCG TAAGTCATGG CGCGGAAGTC CCCGTACAGA 2820
GAATGGGTGT CAGGGTCCTG GTTCGATGTC TGATTTTCGG GCGCATTCTG TTAAGGAACA 2880
AAGGCAGGAT CAGGAGCGCG TGTATGCAGG CATGGAGAGA ATTGTATCTC AGCGTAAAAG 2940
CGATGTGCAG GATCGCCTCA GTGTATTGAA TGCGGCAACT GCATTTGGGC GTGATCGAGT 3000
TTCATTTTCC CCCAGGGTAA CGCGTGCGGA gCATGGATGT AGTCGGTCAG GAATGACTGA 3060
AATTTTTGTG TTTGATCAAA CACGTGCGAT TGGCAAGCGC AATATTCACG TAATGAAAGC 3120
AGGAACCCGT TTAGGGGTTG GCGGGCACAA GGGGGATGAC TTCCTAATTT TTTTGGTGCC 3180
GTTTCCAAGG CGGCTAGCAC AAGTGTATTT TGACGGTGAA GTATATCATC TTGCTATCTT 3240
GAAGCCGAGG TACTTCCCGT ACGAGGAGTC GAGTGTGGTG CcrAcTGCGT CGGCAGAGTG 3300
GTTACCCTTG TCTCTGACAG GGGGTATCAT GTGCCCTTCA CATTCCGCCA GTATGAGGAT 3360
CCCGCTGTGA GATTGAACAA TCTGCTCACC TCTATCGAAT ACGCTTGATC AAAGCGATAA 3420
AGGAACGAGT CGAGAGGTGG ATTCGGGACT CGATTGAGCA GATGGAAAGG GGGAAAGATG 3480
AAAACCGGGG CGGATCGCGC ACCGAAGCGT CCCCCTACCG TTGCTTCCTT CCGGACCTAG 3540
CGGGGTTTGG GGGATATTTC GTGGGCGGCC CCGGAACGGA GAGAGAGGGA TTCGAACCCT 3600
CGGTACCCTT TTGGGGCACA CACGACTTCC AATCGTGTAC TTCGGCCACT CGGACATCTC 3660
TCCTACGGCC GCACCCAGCC GGTTCTGAGA AGGGGGTGCG ACGTTTCCTC AGCCAACAAC 3720
GGAGAGAGAG GGATTCGAAC CCTCGGCGCC CTTGCAAGAG CGCTACGGTT TTCGAGACCG 3780
TCCGATTCGA CCGCTCTCGC ATCTCTCCTC AACAACAACG GCAGAGCCCC ACAGGACACC 3840
ACCCTCAGcG GGACAAGTCC CGTAATGAGA CTAGGCGGAT TCGAACCGTC GACCTTCAGA 3900
TCCGCAATCT GACACTCTAT CCAGCTGAGC TATAGTCTCA AGGGAGTGGG ATGCCAACCG 3960
GCCCCCAAAC CGGAGCaGGG GGGATTCGAA CCCCCGGCAC TCGGATGAAT GCAACTCtTA 4020
GCAGGGAGCC CGATTCGACC ACTCTCGCAC CGCTCCAAAA AACAGCAAAC AGACGCACcG 4080
TACCGAAT C TCCCCGCGGA GCAGGGGGGA TTCGAACCCC CGGTGCCTTG CGACACAGCG 4140
GTTTTCAAGA cCGTCGCCTT CAACCACTCG GCCACCACTC CGGACGCCCT TCCATCCTGC 4200
GTGTAAACGT TGCTCCTGTC AAGTCTTTGT ACGAGCAGCA TAAAAAAGTG GTACGTGTAG 4260
AAAACTTCCC TTCTGGGGAG AAGCTCTTAG AGAAGTAGCG TTTTTATGTT ACGCTCCCCC 4320
TTGTAGCTTG AGTAGGGGAG TATATGGACG ATGCAAGATA TGCAGAATGG AGTGCATCTT 4380
TGGTGCAGTT GCCCGATACG CATTTTTTTG ATCTTATGCG CCTCTATTTG GGTGTGCTTA 4440
AGACTCCATT TCATAAACAG AGGCTTGTTC AACAACTTAG TGCCTTCCTG CAAAGAAAGT 4500
CTATTCAGAA CGCTGTGGTG CAGATGCTTG ATGAACTCGA CTTGTTATTT ATTTCTGTTG 4560
TTATGTGCGT TCCCCGTGCA ACGCTCGAGC TGCTGACAAT TTTTTTTTAG AACGTGTTGC 4620
CCAGGCGGAG ATAAGAACAC GTCTACTGAA TTTAGAAGAA CGTCTTATTC TTTACCGCAT 4680
TCCTCAGATG CCTGGTGAGG TTACACAGGC AGAAGTCGCG AGCGTTGCGC AGAATGGTAG 4740
GGTGCGGCAA ACGCCGTGTT ATGGTATCAA TCCTCTGTTG CAAAAAGCAT gAGTACGGTA 4800
GCTGGACTCA ATCTTTTTCT CATTCCGCAA AAGCGGATGC GTCCATCCGC ACAGTTATTG 4860
ACAACAGATT TGATGCTGTG CGCATTGTAT tCGTTTTTTA CGCACGGGgA AAATTTATTA 4920
AAAGTCGAtG GGACGTTTAG GAAAAAGGCA TTtGTTATGT TCCAGGCATT GTTTCCtGTT 4980
GATCCGGATG TGGTGAGTGT GGCACTCCCT GCATATCTGC AGAGAGCAGG GGAGGAAAGG 5040
GGTACATCAC GTCTTTTACA GGAAGGTCGG CGCGTCTTGG AACATCTGGG ATTGATTGTC 5100
TGCGAATCAG CACAGGTGCA TGTGCAAGAT AAACGGTGGG CTTCTTTTTT CTCCTTAACT 5160
GCTCTGGAAC GTGCGGTGTA TTTGACAGTT GCCAGTACGG CTATTCTGCG CAAAGAGGTG 5220
CTCGTACAGC GAgCGCAGGC TTTGCGTACA CTTCTCTGTG TGTTGCACCC AGATGCGCAA 5280
TACGCACCTG AAGATCTAAC ACGCGTGTAT CGTATCTTGG TGGAAGAGGC AGCACCATCT 5340
GTTGCTGCTG ATTTTTTCTC TTCTTTGTCT TTGTCCAAAG ATACAATGCT GCAAAAGCGT 5400
AAAGGAGCTT TACATGATTC ATCGGTTTTT TCTATGCAGT CGGCGATCAC GGCTATACGC 5460
ACGGCCCAGC TTTTTGGGTT GTTGTGTGTG AAAGATGGAC TGTGCGCGTT GAATGAGGCT 5520
CTATTTAAAG GACAGTACAC GCGTGGGCCA GGAATGGTCT TGTCAGCGAC GGCAGAGTTA 5580
ACCATTTTCC CCGATGGAGA TATGCAaGGG GTTTTGCCAA TTTTATCCTG TGCGCATGTC 5640
TGCTCACTAC AAACAGTTGC CACGTTTGAG CTCAATAAAA AAAGCTGTAC CACTGGCTTT 5700
GCGCGCGGAT TAACAGTGCA GGCACTTGCA CAGGCTTTAG AATGTAAAAC AGGTGAGCAG 5760
GTGCCACAGA ATATACTATC TTCTTTCCGG CAGTGGTATG CaCAGATAAC CGCGTTGAcC 5820
TTAAGACGCg GCTTTGTCAT GCAGGTTGAT TCATCTCAGC AAGCTTTTTT TGAATCTGGC 5880
GGGCCACTGC ACCCGCTAGT GCGCACGCGT CTTGCAGAAG GAGTGTACTT TTTTGATGAA 5940
TGCCAAGAGT GTATGTTGTA TCaGGCcTCG CGCGAGCGCG TCTGTCCTAC CTGTGCGAGC 6000
CAATTGATAC AGCCACCCCG TTATTCCGCC CTGGTGAGCA GGGTGCACGT GCGCTCCATG 6060
TGCCTTCCTT TTCTTTTCCA GTGCGGTCTG CTCGGGGAGT CTCCGAGGAA TCAACGCGAG 6120
ATTTTGCACA TTTAGGTGCC TTTGTGTTGG AAACTCCGAA CGTTTCGTGC ACGCACAGTG 6180
CTGCAGATAC TCCGTCTATT TCAGAACAGA CCGGTGGGGT GGCTCACGTG CAGAGCGAAG 6240
AGGATGTAGA TCCGTCCACG TCTGGTGCAA CGGGTAAGTA TTGGGACAAG GCACAATGGC 6300
GCaAGGTGCa ACGGATGCGA CGTGCTGTGC GGCTGCAGCG GCTCAAAGAG TTTGAGGCGC 6360
ACCTGCAACA ACTAAAATTG GACGCAACAG AGCAGACGGA GCTACGTGCC CGCTTGCAAC 6420
GGGGGTTGAT TCTGGATAGA ATGCAACTTT CGTCCGAAAC GATCCGCaGG GAGAGAACGG 6480
AAGCGAGCGG GGTTGATTTT TTAGGCAAGT ATCGTCTTGC aGAGTGTGCG TTACGTTCTG 6540
GTGCTTTACT TGAGATTGAG ACTAGTTCAG GGCAGTCAGT GCATAAGATA GTGGGTACGG 6600
TGTGCGCAAT TGAAAAATGC GAAGAGGATG CGTTGCTTCA CGTGTGTGTA CACGCAGAAC 6660
TTCCCCCTGA GCGAGTATCG ATTGCGCGCG CGTCCAGGAT AGTGCTACTG AAAAATTCTA 6720
TTTTTTCTTG AGTCTGTTCT GAAGGGGATC CTTTTGTCTC TTGTAAAAAG GAATAGACGA 6780
GCGGGTAGGA TATGAGTCGT AGGAAACAGG GACGAGAGTT ATTCAACAGT CATGTGGGCG 6840
TGGTGTTGTC TTGTGTCGGT GCGGCAATGG GGCTTGCAAA CGTGTGGTTG TTCCCTGGAC 6900
GCCTGGTGGA ATTTGGTGGT GTGACGTTTT TAATTCCGTA TTTTATTTTT CTATTTGGTC 6960
TTTCCCGTTT TGGACTGATG GGGGAGTATG CTTTTGGAAA GACACTGCGC TGCGGTCCTG 7020
TGCGTGCGTT TACCCGTGTG TGTGAAACAC ATtCCATCGT GTTTTTTACG AGCACTACGA 7080
GGTAGCGGGT GGTTTCCGGT AGGAGTATTG CTCGCTACCT GCTCTTTTTA TGTAGTGATT 7140
ATAGGGTGGA TCTTGCGTTA TGTAGTATTT TCGTGCACGA ATGCACTTGC AGGTACTCAG 7200
GCGCACGACC TGTTTTACCA GGTTGCAGGG ACAAGTGCGA ATGTGCCGTG GACGCTTGCA 7260
GCTATCGCGC TCACAGCGTG TGTAGTGAGT GCGGGCGTGC AAAAGGGGGT GGAGCGAGGA 7320
AACATTATAA TGATGGTACT TTTTTACGGT GTCCTTGCGT TTATTACAGG ATATATATTT 7380
ACTCTTCCTA ACGCGTGGAT AGGTATGCGT AGAATGTTGG CATTTCAATC TTCATCATTG 7440
TGCAATCCGA GACTCTGGTT GTATGCATTA GGCATGTCGT TTTTTAGTCT CAGTTTGGGG 7500
GGCGCGGCTA TGGTTTTATA TGGCAGTTAC ATGCCAGATA CGGTGGACAT ACCGCGTACT 7560
GCATTTCAGA CAGCGACCTT AGATTTTTTG GCATCAGGTA TGTCCGCATT ATGTTTAATT 7620
CCGAGTGCGT GGGTTTTAGG TATGGACGTC AGCAGTGGAC CGGAGTTTTT GTTTGTAACA 7680
ATAACCCGTG TCGCCTCGCA GATACCGATG GGGGTGATGA TAAGTGTGnT AwTCtTTTTG 7740
TGTGTACTAT GTGCAGCGTT AAgTTCTGCA ATTGCTATGT TAGAAGTAAT ACTCGAGTCT 7800
TTTGTGCACA CGTGTACAGT GGGGCGCCGA ACGCTGACGT GGTCACTAGC ACTCGTGGTT 7860
GCGTTTGTAT CTCTTCCTCT GAATGCCTCG ATGAGAGTGT TCGAAACGTT TACAGATATA 7920
GTGGTGGTTA TACTATCTCC GTTATCTGCC CTTATGGGGA GCGTGATGAT ATTTTGGGTA 7980
TATGGTGCAG AGCGTTGCCG TGTAGCTATC AACCGGTGTG CACGCGGTCC GTTGGGTAAA 8040
TGGTTCACGC CGTATATGCG GTACGTGTAT TTGGGGCTTT GTGTAATGAT TATGGTGCTT 8100
GGGGTAATGT TCGGTGGTTT TTAGTGTGAT GACGCGCAAA AGCGGCCAAA CCCACAGTTG 8160
GGTAAATATA TTCTTTGCAA ATTGTCGACA CAACCGTTGA CGAGAGGGAT CGCAGGTGGA 8220
GAGGTGTCGT GCCGGGGATA TGTTGACACG TCCGTACTCC TCAGTTTGTG AGGCTCCAGT 8280
TATAGGAGGG GGGATAgCTA CGCGTGAAAA GATTTGCTCT TATTGGACTT GGAGACTTCG 8340
GTCTTAGCAT GCTAAAGGAG CTGCTCAAGC TCACTAACAA TATAGTCCTC CTGGACAGGG 8400
ATCGAACGCT CGTTGAAACC TACCGTAGCa GGGTGAGAAT CGTGCGCGCA ATTGAtGTGT 8460
TGGACGAATT CACTCTGTGC AAGATGATTC CACAGGaTAT CAACGCAGCG GTTATTGATC 8520
TGGGGGTTAA AATTGAATCA TCAATCATGA TAACAACGTT TTTAAAAAAA TTAGAAATTG 8580
CAGATATCGT AGTTAAGGCA TACAGCGCTG aACAAGGGCa TATCCtCTCG aGCGTTGGTG 8640
yTACGCACGT AGTkCTCCcG GACCGGGAGg CAGCTAAAAA AGTCACTCCT ATGATTGCTT 8700
TCGATCTTCT TTTCAACTTT ATGCCACTTT CTGCGCAgCT GGnCAATTGC GGAAATGGCT 8760
GTGCACGAGG ACTATGTGGG AAGAACTTTG CGTGAAGTGG ATGTGCGCAA AAACTTCTCT 8820
CTTAATATCA TTGCTATCCG TAAGCGCGAT GCAGAGGATT TTTGTTTTAT CAATGATCCT 8880
GAATACTGCT TTGAAGCGAA CGATGTGTTG CTCGTTGCCG GTTCTCACAA AGACATCTAT 8940
GCACAGTCGC AGGACAAGCT GGCACATACC CATAGCTTCA GCGACTTTTT CAAACAATGG 9000
TTCCTTACCA GCTGACTTCC CAATGTTCCG CGCACGGGAG TAGGCGCGTG TAATCTTCCC 9060
TTTTCCCGCA CATGCCTACG TAAAGGGGAA TATTTAGAGA GGGGGCTCAG CTTCAAGTTT 9120
TGAAAAATAA GGCTCAAGCG TTGCCGCTTC CCGAATTGAG GTTGCAGTGC TTACCACCGC 9180
AGCTTCAGCA CACGTGCGcT GCGCCCAGAG TAGTTGGGTA CACAAGTGCA TGTCTGTTAC 9240
CCGTGCGTCT AGAATATGCT CAATAGCCGC AATACGCTCT GTTTCTGTCG TTCCGTTGAG 9300
AAAGCGCAGG AAACTCGCAA AACTTTTATG CTCCGGTGTT TCAGGAGTTA CCGCCGTACT 9360
GTACGCTCCT ATGATTAGAC GTTCCATCGT CTTCTGGTTA AAGAGACTGG AGGTGCCCTT 9420
GTGATCGGCG TGCAGGTGCT CAATCGTCTT GAATATCACG TCAAGGGAGT GGAGCGGaTT 9480
TGGATCGCGG TAAGTTAGTG AGGAAAATAT GCCATGTACT GGATCTGGCA GGGTGAATGC 9540
ACCGTAAGCA CCACCTATCG TTCGAATTTT TTCCCAAAAC GGCTCAGTAC TTAGATATCG 9600
GGCAAACACC TGCTCTACCC CGCGTCTCTC CAAAGGAAGC CGTGGATGTG CAAGGGACAG 9660
CGCTGCAAAA CCCACTTGCA CAGGGGCTGG AAGCAGCGTC AnCATATTGC GGGTGCGCAT 9720
GTGCTGCAGG GCCTCTTGAA AGAGCACACC GTGAGCGGAG GGTATCTGTT GTTCGTGCGC 9780
TGTGGGCGCA GAGGTGTGGT GGATAAAATA AGTGGATAGC GGTGCACGAA AACACGCCAG 9840
AGGTTTTGCC AATGCATCTA GCGCTGTGTG TAACGATGTT TCTGTACCAC ATACACATCC 9900
GATCACTCCT GCAGTGAGCA GTTTTTCATG CAGCGCTTTG AGTTTGGCTG CCAGCGAAGG 9960
GGAGGCAACG GTTTCTGTAC ACTCTGTCCA CAAAGCACGT ACCAAGCGAA TCTGCGTGAC 10020
TCCAGTCCAG AGTTCTTCTA CGGCCTTTGC TGCATTCACG CGAGCGTTTG CCTTTGCAAG 10080
TGCAATGGAA TGTCCTGAAT GCATGGCAGC GCTATCCAAG TCATTTTTAT ATTGTGCAAG 10140
GATGTCTTTT AACCGTCGTG TATCTGTAAA AGAAAGACTG CGCACGTGTG CACACACGTA 10200
CGAAATCGCC TGTACGATAA AGCGCGACAG CATTTTTACG CTGACAACTA GCCACGCTCG 10260
CCCGACTATA TCGCTGCGTT GCAGTGTGTT CTGTCCCCTC AGTAAGGGGA GTATCTCGCT 10320
TCCCTGATCG CCTGCTACTA TACAGCGGGC GGCAAAGCCC CCGGTGAGAC GGGCAATCTC 10380
GGCAGATACC ACACTCCAAT GATGTGTCTC GGTTCCCATA CCTGTCAGTG CGTAGCCATA 10440
CAGTGGCAGA AGTTGTGCTT CTTTTACACT GAGCATATCT GCTGGGATTG CAAGGTGTAA 10500
GTACGTAATA TCGTTCGTGG CAAGCTCATG CACGAGAACA GGAACAGAAC CAAAAAACTG 10560
CATGGTTTCG CTCAGTTCTG GGGTGGGGAC TGGCAGTTGT TCCCGCTTTA TATGGGGGAG 10620
CAACGCAAGG AGTTCCTCCG GATCGGGTGT TGTCTGTCGT ACACGCAGTG ATTCTTGGTC 10680
AGCTCGGAGG CGCGCCGCTG CTGGCTGCGT GAGTGTACGG GAGAAATCCT GTACGTATTT 10740
TTCTAATTGC TCATCAAGTT TTTTTGAGAA GTCTGGGTCT GGGTGTACCG AAAGTACCGT 10800
GTACTGCGGG TTGCGCAgcA AGtGCGTGAG GATGAGATTT TCCACGTAGT GtGgATGGTG 10860
GTGTACCTTT TCACGCAGgc CTGCAGTGCG GGGATATAAC GCAAAGAACT TTCTGGACCT 10920
GCACCGTGCA ACCATCCACG CAGCGAACGC TGCATGAGCA CGAGAGAAAA AGGACCGTCA 10980
GAGCGGCGTA CTTCAGTATT TGAAAATTCG AGTGCATTCA GCGCTGTTTC CACTTCCTGT 11040
GGAGGGATGC CGTGCGCAAC AAGCGACTCT AGTGTTTCAA ACACGCATGC CTTTAGTGCA 11100
TCGACCTGTG TATGCTGCAC CCCAGTCATA CCTACAAAAA AAAGCATACG CTTTAGATCG 11160
ATGTGACTGC CGTTATATGC GTATAAATCC TCACCGAGTT CTGATTCTAA CAGTGCCTGT 11220
GCAAGGGGAG CAGCATCGTG ACCGAGCAAA ACGTGTTCGA GCAAAAACAC GTCCATTAAC 11280
TGTTCAGCCT TGTCTGATTC TGGGAGTAAC CAGCTGAGCA ATACGGCGCA CCGTGTTAAA 11340
TCCATCCCCT CGCTCGCCGG TGCGTACCCG GTGTACG AC GGGGACTTTG GTATGCAGGG 11400
ATAGGGGGGA TGGGGGGCAA CGCTTTGCGG GCAGAAAATT TTGAAAGGCA TTTATCCTCA 11460
ATAAATGCCA TCTGTTTTTC GGTGGGTATA TTTCCGTACA GAAAAAGCTT GCAGTTTGAC 11520
GGGTGATAGT GTTTTTTGTG AAAAGCTTTA AACGATTCGT ACGTGAGACG AGGAATAACT 11580
GTTGGATGAC CTCCTGAATC GTGTGCATAC ACTGAGCCAC GTGTGGTCGC GTGTGTTGCG 11640
TGCTTATACA CAAGCGTATG AAAGTCTGCA TACACACCGC GCATTTCATT CAGTACAACG 11700
CCCTGGAGGG TAAGTTGGTT GTGCTCATTA AACTCAAAGC GGTGTCCTTC TTGCTTAAAG 11760
GTCCACTCTT CGATCAGGGG GAAAAAGACT GCGTCTGCAT ATACACTCAT AACATTGAAG 11820
TAGTCAGTCT CTACCAAGGA GGAGGCCGGA TATACTGTTT TGTCCGGAAA GGTTAGAGCG 11880
TTAAGAAACG TTTTCACGCT TTGTTTCGCG AGTATGAGGA ACGGATCCTT GAGGGGATAA 11940
TGCTGTGATC CACAGAGCAC CGAATGCTCA AGGATATGAG CAACCCCGGT ACTTGCTTCT 12000
TCTGCCGTCA TAAAACAGAA GGCAAACAAA TTCTCCGGGT CTTCGTTGAG AATGTGGTAC 12060
AACTCAAGCC CTGTTTTTTT GTGTCGAGCA TAGACACCCA CTGCCGAAAG CTCAGCGAGT 12120
GAATGGCGCC AGATAATTTC AAAACCGTGA AGAAGCGTAC TCATCGGTGA TTCTCACTCC 12180
TCTTCTTGCA AGCTATTTGG AAGCAATGTG CTGTTGCGCG CCGGCACTGC GCAATGTAGC 12240
TAAAAAGTGC TCAGTGATGG TGCGCGTATC ATGCGCTGAA GGCGGGAACG TGTCGTACTC 12300
ATCnCGCACA AGCTGTGCGC GGTTTTTAGA GAGATTAGAG AGAATTTTCT GAACAAAAGC 12360
AGGATGATTG GTGTTGATGA GCGCAGCAAG AGTTTTTTCA GAACAAGACG CCAGGTGTTT 12420
TTGCAAAAAT GTATCTGGGA GCGCGGGGAT ATCATCCAGC GTGAAAAGAT GCGTGCGGAC 12480
ACGCGCTGCA AGTGTTGGAT TTTTTTCTGC AAGGGCATGA AGAATTGAAT GCTCAGTCGC 12540
GCGCTCCATC TTTTTGAGAA TTGCCGCAAG CACTGCATGC CCGTCAAGAT CACGGCGCTG 12600
GGACAAATGG AGCGCTGCAA ACTTTTTGTG CAAGGAGTCA CTCATGACTT GCAGCACCTG 12660
AGGGTTAACG TGCTTTAACT TTGCAAGGCG AACGATCAAG TCCTTCTTCT CCTCTGTGCT 12720
GATATTACTC AAATAGTGCG CAGCGCTTTC TGGAGGCAGC TGCGAGAGGA TGAGTGTTTT 12780
GGTGGCAGGT AGTTCTCcTT CCAGGAGGGG GAGAAGTTGG GAGGCTTCAA GCGCAGCCAA 12840
AAACTCAAAA GGTTTCgGCT tGCCGCTGGC ACCGCCCGCT TCAAGATAAG ATCGGCCTTT 12900
TCTTCCCCAA ACGCTTTGGA AAGCATCGAc TGCGCAGCAC GCAGTCCACC GGTAACAGGC 12960
GACACACGAG CGCAGAGGGC AGAAAACTCC CGTAGGATCT CACGCGCTTC TTCTGGACTG 13020
AGGGGTTTGA GTGTCAGGAG CTCGGCAACC ACCGCCTCAA TCTGTGCAGG CTCAAGTTGC 13080
TTGAGCACCA GCGCCGCCTG CTCTTCTCCA ATGAGGGAGA GGAACTGGGC AATCTTTTTA 13140
TAAACGGTTC GGCCTCGGTC TTGTTCACGT ACGGTGGCTT TGATTAAGCC ACGnAGGAGA 13200
TTCGGTTCTA TTnCATAGCA AAGAGGACTC CGCGCGGTCT CCCGGCACAC GCAGTTGCAT 13260
TGTAGTGGAG GGTGTGCTCT TGACACAAGG GCGTGCAnAC CTTAAAAGGT GTCCCCCCCC 13320
CAGACGGGGT AGGGGTCCAA GGATGTGATG GCGTTGTCTT TCGGTTn 13367
(2) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6856 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
GCATTGcTGC GTCTCGATAG GCTGTTCGGT ATCAGCTGCG ATGATGAGGT GACCGGTCAG 60
TATCACTATG TGGTTATAGT TGGTGCGGCA GAGAAAAAGG TGGGGCTCAT GGTGGATGCG 120
CTGATTGGTG AGGAGGACGT ATCATCAAGC CACTGCGGGA TCAATTCACT AGTTCCCCTG 180
GTATTGCAGG GGCATCTATC CTGGGTGACG GTTCGGTGTC GTTGATTATC GATGTGGGGC 240
AGCTGCTTGA GCTTGGGTTG AAGCGGGAAA TATTGGCGCG TGAgcgTcGA GAAGCCACGG 300
TGTGGTAGGC GATCTGGGGC ACGGATTGGG GACTATGATA GAGCATATGG AAGCAGAGAT 360
CGGCATTCGG GAAAGTTTCG ACGGGGGCGT ACGTGAGCCG CTTGCGgTCA TAGACTTCAA 420
GATGGTTACC TTTTCCCTCG CGGGGAAGGA CTACGCGGTA GATATCATGC AGGTGAAGGA 480
AATTGCAAAG GCTGGGAGCT TTACCTATGT GCCCAATACG TCTCCGTTTG TTCTGGGGGT 540
GTATAACTTA CGGGGGGATA TTATTCCCAT AATTGATTTA AGGAGATTTT TTAATATTCC 600
CGCTCCGCGC AAGTCCCGGC AGGCGATCGA GAATATGGTG ATCGTCACAG TGGAAGATCA 660
GACATTCGGG GTTGTAGTAG ATGGCATCGA TAAGGTAATT GGGGTGTCAA AAACAACTAT 720
TCAGCCGCCA CACCCTATCT TTGGGGACAT CAACATAAAG TATATCCGGG GGGTGGTTGA 780
GGAGGCGGGA AAGCTGTACA TCCTACTTGA TGTGCACCGG ATTTTTTCCT TCCGTCTTGG 840
GGAGGAGGAA CGGACGGCAG TTGTCGATCG TGGTGTTGTG CCGTCTCCTT CACCTCCTGC 900
CGTATCTGTG CCGCCGGGGG ATGAAGAAAA TTTAAATGTT GGTTTCATTA GCGATACGTT 960
GGCCGCGTTT GGCCGTTTCT TTACCAGTGC AGTGAATGAG GGTTGGTTGC GCAgCCGGTA 1020
TCTTGTGTGG CGTGACGTGC GCTCTGGAGC TGAGGTACAG CTTCAGCATG AGGAGGATGT 1080
CGCCGAGTTC TTGAGTACAT TTCCTTCCCC GGACACAGGT GTGTTTTGGT CGGGGGAGTA 1140
TGCGGCGAGT GTGGGATCTG TTCTTTCTCG GATGCAGGTG GGAAAGGTGG TGACGGTGTG 1200
GAATATCGGT TGCGGTGCGG GTCACGAAAG TTACAGTCTT GCGGTGCTTC TCAGAAAAAC 1260
CTTCCCCGAC GCGGTGGTTC GGGTGCACGC AAGCGATTCG GATCTCTTCT CCATTTCCAA 1320
TGCTCCCATG CTCACTGTTC CTGAgCATGT GATCGGTGAT TGGTATAAGC CCTATGTGGT 1380
GAAGGGGGTG AGTGGTTCAT ACACCTTCTC CCAGGAAATT AAGGAGATGG TCCTGTTTGA 1440
GTACCACGAT TGTACGCATC CGAGTGCGCT TCCAGACGTC GATCTTATCG TGGCGCGGGA 1500
CGTACTGTCA TCTCTTGCGG TTCCAGTGCA GCACACCCTG TTGAAGGAGT TTTCTGAGAA 1560
GTTGAAGGCA ACAGGAGTTG TTCTGCTCGG TCAGAACGAG GTGATGCCTA AGGATACAGG 1620
ATGGTTGCGG CAGATTGAAG GCACCGTTGC GGTGTTCAGC AAGGAATAAT TAGCGCATGA 1680
GGAGTGGTGT ATGCGTGTAG AGTATATCAA CCCGTTCAGT GAGGCGGCGT ACGTGGTTCT 1740
GTCTGAGGTT TTAGCAGGGG AAACCAAGCG GGGGGACTTG TATTTGAAGT CTACGTGCAT 1800
GCCGGTGATG GGTGTTGCGG CTATCGTTGG CCTTGCAGGG GATGTAGAGG GGCGTGTGGT 1860
ATTTGACATG ACGCTCGATA CGGCGCTGAA GATTGCCTCT TCGATGAACG AGGAGAAGTT 1920
AGCGGCGTTT GATGAGcTTG CGCGTGCGAC GATCACCGAG CTCGCCAATC TGATCACCGC 1980
AAAGGCGGTT ACTACGTTGC ACGAGCTCGG ATTTAAGTTC GATCTTACCC CTCCGGCGCT 2040
GTTTACTGGG GACAACATGG AAATATCTAG TAGTGATATT GAAgCGCTTA TCGTGCCCAT 2100
GGAGACGCCT CAGGGTAAGG TGGAAATTAA TGTTGCCATC CGCGACAAAG TATAAGAGGG 2160
AGGAAGTATG ATTTCCAAGC AGGATTTTCC CACGATCAAC GATCGGGTTC CCGCAGaCaA 2220
AAACCGAATG GGGCGCCCTA TCGTGTGTTG GTGGTGGACG ACTCCATGTT CGTTTCAAAG 2280
CAGATTGGTC AAATCTTGAC AAGTGAAGGC TACGAGGTTG CAGATACTGC GGTGGACGGC 2340
GTTGATGGGG TTGAAAAGTA TAAGGCGATG AGTCCGGGCG TTGATTTGGT GACGATGGAT 2400
ATCACGATGC CCAAGATGGA CGGGATTACt GCGCTTGAGA AGATTCTTGA GTTTGATAAG 2460
AATGCAAAGG TAGTTATCAT TTCGGCGTTG GGGAAAGAGG AATTGGTGAA GAAGGCACTG 2520
TTACTGGGCG CGAAGAACTA TATTGTCAAG CCGCTCGATA GGAAAAAGGT GTTGGAGCGA 2580
ATTGCAAGCG TACTAAAGTG AGGGCGGATG TGTCCTGCGG GCTGTCTCGT ACGGTTTGCC 2640
CgCTTGCGTG TGTGGATGGT TTCTTGAGGT TTTTGCCTTC GCGCGCGGAG TGCCCGTCTC 2700
TCTGCGTGCG TGTTTTCTGT GTGTGTGCCG CAAGAGGAGA AgTGGTGTCT CCCTCAGCCC 2760
TTTGCTCGGG GCCCTGTGGT GCCTTTCCTG CGGTGTAGTT TCTATACTCC TTCGTAGTTC 2820
CTAGTTGGTT TGGTTGGAAA GGGTTCGGTT CGATTTTGAA GAGGTGCACA CGTTGTATTG 2880
TGCGCATGAA AGAGGGAGCG GTGTGTGCTC TTCCTGAAAA CGCTTGAGGT ATTTGGCTTT 2940
AAGTCGTTTG CAGATCGCGT TCGCGTTGAG TTTGCAGATG GCGTCACTGC GCTGTTGGGC 3000
CCAAACGGCT GTGGCAAAAG CAATGTCGTT GACGCCATAA AGTGGGTCCT CGGAGAGCAG 3060
TCCTCTAGGG CCTTGCGTGC CGACAGAATG GAAGACGTTA TATTCAACGG GACCGAGTCG 3120
CGTCGTTCGT TGAACGTTGC AGAAGCCTCT CTTACCGTTT GCGATGAAGC TGGTATCCTT 3180
TCGCTCGATG TGCCAGAGAT TTTAATTAAA CGCAGACTCT ATCGTTCCGG GGAAAGTGAG 3240
TACTTTCTTA ACGGGAATGC CGTCCGTCTA AAGGAGATCC GCGAGCTCTT TTGGGATACG 3300
GGAATAGGGA AGGTTGCGTA CTCCGTTATG GAGCaGGGGA AAATAGACCA GATTCTCTCA 3360
AATAAACCGG AGGAACGTCG CTACCTTTTT GAAGAAGCAG CaGGGGTGAC GCGCTTTAAA 3420
GTTCGTGGCG CGGAAGCAGC aCGGAAATTG GAGAAAACGG CGGAGAATTT GCGTCATCTT 3480
GAGGTTATTC TGCAAGAAGT AGAGAAGAGC TACGAGAGTT CAAAGCTCCA AGCTGCCCAG 3540
ACGCAACGTT ACCGCATGCT CAAAGAGGAG ATTTTTGCGC GAGATCGCGA TCTTGGTCTG 3600
TTGCGTCTGC GTGGGTTTTT AGAAAACCAA GCCCGAGCGG ATGGAGCACT CCAGCGCAaT 3660
CGCGCGGCGC GACGCGTTGC AAACACAGGT GGAGGAAGCA CAGCAGACGC TTTCTGCTCG 3720
CATAGGCGAG ATCAATGATA TGGAAAAGcg CGTTGACGCG CTCCAAAAGG AAATCTATGG 3780
CCTTGCAATT GAACAGAAAG CGAnCAAAAC GAGGCATCGC TACATCGTAA GCATCTTTCT 3840
GAACTGAAAG AGTCGATTGG TCAGATAGAA ATGCGCAAGA TTGGTGTAGA AAGTCGCGTG 3900
CAGAATTTGG AAGAAGAAGT AGCAGAGCAA GACGCACACG TGTATCAGTT AGGCAGTGCT 3960
CTATCCTCTG TTGAAGAGCA TATTGAATCG TTTGCGCGGA CTTGCACGTT GCAAGTGAGC 4020
ACGTCTCAGA GAATGATCAA ACGCTTCGCG ACATACAGGG ACAGATGCAA GAGATAAGTG 4080
CCGCGTGTGT TGAACTTGAA GCGTCCCTAC GTGACGTGGC AGAAGATATT GCCGCAGAGC 4140
TTGACACGCG CCTGAGTGCA GCCGGGTACT CTGCGCGCAA TCGGGCAGAG GCTGAGCGTA 4200
CGTTGGTAGC GGGGGTACAG CGCCTGCGAA CCTTCGTGGA GGGGAGAGCA CGTATTGTTT 4260
CAGACTTTCT GGTGGTAGAT ACCCACACTG AAGGGGAGCT GTGCCGGATG CTGACTACAG 4320
TTGTGGACGC GTTCAATGAG GCGGTAAAGA TAGTGCACTG CGTTGAGTCA GACATAGCAG 4380
AATATGCGCG TGTTTCTGCC CGGTTTATCG ATGAGTTTGT TGCTCCTCAG GGGATTATGA 4440
CCAAGAAACG TGAATTTGAG CGACAGCTTG AACAGCACCG TGCACAGCTT GAGCGGCaTG 4500
CTGCGCGTCA GCrCAaCTGn CAGGAAGAGA ACAAGCTCCT TGTTGGGAAG ATAGAAGCCT 4560
GTCGCAAAAC GCTTGAATCC CTGCGTGTGG ATCAGGCGCG TCTGCGTGCT GAAGCTGAGG 4620
CAGGACAAAA ACAGGCTGCA GGAACCAGAG GGGAGGTGGC ACGTCAGCGC GCAGTGATTA 4680
AAGAGCTCGA AGGGGAGTTG TTTACCGAGG GGGAGCGGGT GGCGgCGCTC GAAGAGCGCT 4740
TACTAGAGGT TGAAGGGGAA ATAGGACAGC TAGAACAGCG CGGTGTTTTG CTCACCAAAA 4800
GTCTTGAGAA CTGCGAAGGA GAGATCCGTG TGCGGAATGC CGCAGTAACA TCTGAAGAAC 4860
ATGCGCTCCA GGAAGCGCGC GTGGAACTTG CACAGGTGGG GCGGCAGCTT GAGCAGGCAC 4920
ATCGGGAGTT GATGCAGTGC GAAACTGAGA TTCGCAATTT ACGTGAACAT TTTCGAGAAC 4980
AGCACACCCG CGATCTGAGT GAGTTTGAGG ATTTAATACC GGGGaTTGAA AAAACGGCAA 5040
GTGATCTGCG CCaAGAGCGT GGGGAGCTTC aGGCTCGAGT GAAGGAAATC gGGGCgGTGA 5100
ACTTTATGGC GGTGGAGGAG TTTCAGGAGG TAAAGGAGCG CTACGAGTTT CTCGTTGCGC 5160
AGGTTGCGGA CCTTGAAAAG GCGCGCGCAG ATCTGCAGCG GGTAACCGAT AAAATTAAGG 5220
CTGAATCTGC AGAACTTTTC TTGGCAACAT ACCGACGGAT TCGTAAGAAT TTTcACGAGG 5280
TATTCCGTCG TCTGTTTGGG GGAGGTCGCG CAGAGATACG TCTTTCAGAT CCTGCAGCGG 5340
TGCTCTCGTG TGGAATTGAA ATCCTCGCGC AGCCACCGGG GAAGAAGCTC GAGCATATTG 5400
GCCTCCTTTC TGGTGGAGAA AAGGCAATGA CTGCAGTAGC GTTGCTCTTT GCAACGTATA 5460
TGGTGAAGCC TGcGCCGTTT TGTCTTTTGG ATGAAATCGA CGCAGCGTTG GATGAGCATA 5520
ATGTAGCTCG TTTTGTTGGG ATGCTTGATG AGTTTTCTGA CGTCAGTCAA TATATCGTAA 5580
TCACGCACAA TCGGCGGACG GTTTTGGGTG CACGCACCAT GCTTGGGGTA ACAATGGAAG 5640
AGCCGGGGGT ATCGAAAGTG GTTTCGATTG CACTTGAATC TGCTTCTGAG CGACCGGCTA 5700
ACGGCGAGGC AGGAGGAGCC ATTTGATGCG TCTGCGTGGG GTGGCAGGTG CCCTGTTGGG 5760
TGCGGTAGTG CTTGTGGCGT TGGGGCTGAT GGGCGTCTGG TGGGTGTTCT ATCCAAAAAA 5820
AGGGGACCGT GGGGCGGCTG TGGCTCGCGA GCCAGTGTTG TTGCACATAG ATCCTGCACA 5880
GATGGAGGCA GCTGATGAAC CGTTGACGCT TCCCCCTATC GAGCGTTCCC GTGAGCGGAT 5940
GTCGGCGTGG AGTGAGCAGG AGTGCCTCCG ACAGCTTGAG TATCCGACGG AAAAGGCGGT 6000
GCAGGCATTA GAGCACGCAA ACGAGAAACG TATACAGCAG ATGCTAGAGG CAGTACCGTG 6060
AGTGTGTGGG TGGCGCTCGC CTTGCTGGGA ATGTGTGTTT CGTGTACGCA CGTGCCTCCG 6120
CCTCGTGCCC TCATCGTTTC AAAGGAGCCG CCTCCAGCGT TGGATTCTGC GCCGCGCCCT 6180
GCGATTCCAG AAGCAGTTCC TCTTCCGTCC CCTGTGGAGG AAGAAATCGC CGGTCGCCTC 6240
CCTCCTGCAC CTGCCGCTGC ACCTGAGCGC GTTCCTGAGT CCTCACAGGA GCGGGAACAG 6300
AAACCTGAGT CTTCGAAGCC TCAGGTGGTA GAGCCGGTGT CGCTTGCCTC TCCGGTGAAG 6360
CCTCGCGAGG CTGGGAGTGT AcCTGATGTT CTTCCAGTAC CTGAAGTGTC GTCGCCGCAC 6420
GTTGCGCCGC CGGCACCCCC TGCGCCGAmA GCTCCCCGGC CGCATCGTCC CTCCCCTCCG 6480
CCTGTATCGC CTTCTGCATC CAAACCAAAG CAGCGCGCTG TACCTCCTTC TCCGCCCCCT 6540
GCATCAGAGC CTCCTCGTGA GGCGGAGGTG CAGGCTGAGC CTGAGCCGGC AGAGGATTCT 6600
CCACGCGCGA TGGTGCCTGA AGAAnCGACT GGAGGCATGA nGnnCCGCGC GTTTCGCnCG 6660
GATGACAGCT TGCATGGGGC AAAAACTTGA GGTTTTGTAT CCGGGGCGAA GTTGGGTGTA 6720
AGTGGGCGAG CATACTGCGC ACCTGGTTTG CGCTATCACC AGnGCAATTG GAGGAGTCGC 6780
ATTCGCTTTT TAACTTTATG CTGAGCGAGA GGGTGATTTT GTCTTAGnTT CTCCTAATTT 6840
GATGnGTTTC GGGGTG 6856
(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10928 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CGCGTATGAA CGCAATGCCC AGGCGGTTAT TCCGTTGGAG CGTATCAGGC AGACAATCCG 60
TGCCGTTGAC GCGCGCGTGC AgtGCACTGG CTAGTTATTT TGAAAAGATA gGGGAAGAGA 120
AGCGGcTACG GGTCCTTGCT CGTCTACTCG AACGCTATGC ACCGCTTATC GGCGAGCAAA 180
AAATAACGGT ACgTTTCTTC GGTTATTGCG AGTCGCGGGT GCGTGATCTT CTCAATCAGG 240
CGCTTCCACG TGCTGTCCTG CGTTCTCTCA CCCCCTTTGA TAAGGCTGAG GCCTGGCGCG 300
CACAGTGCAG TGATGGGTTG ACTATTGAGA CGGAGGACGG GACGCTCCAG TGTCGGAGTA 360
CAATCGAGGA GATCTGCGCG CAACTTTTGT cTGAAAAGAG ACAGGAGTTG GCGTGTGCCC 420
TGTGCGGTAA TGGAGTGGTA GCGTGATCAA AGACGATGTG GTTACAGGCC GTGTAGTGAG 480
GGTGTCTGGT CCCATTGTGT ATGCCGAGGG CCTCTCTGCG TGCAGCgTAT ACGATGTTGT 540
CGACGTAGGg GAAGCATCGC TCATCGGAGA AATTATCCGG TTGGATGAGA GCAAGGCGgT 600
CGTGCAAGTA TACGAGGATG ACACAGGTAT GCGAGTCGGG GAGAAGGTGA CAAGCTTGCG 660
TCGACCACTC TCAGTCCGCT TAGGGCCTGG ATTAATCGGC ACCATTTATG ACGGTATTCA 720
GCGCCCACTT GAGCGCCTCT TCCAAGAAGA CGGCGCCTTC TTGCGTCCTG GTGCGCGTTC 780
ACAACCGCTT GATGGCTCCG TACGCTGGGA TTTTCGTCCT CATTGTAACG AGCGCGGTGA 840
GGCCCTGTGC GCGGGGATTC CGATTGCACC TGGGTCaGTG TTAGGGACCG TGCAGGAGAC 900
TCCTTCTGTT GTGCACACTA TCATGGTTCC TCCTGACATC CGGGGGAGCG TGCTATCTTC 960
GTTCAAGGGC GCAGGTGCTT ACACAATAGA TGAAGAAATT GGACGCACTG ATCTTGGTGA 1020
GCCGCTTTTT CTATCCCAGT ACTGGCCAGT GCGTCGTGCG CGTCCTTTCA GCAAAAAACT 1080
TGCAGTGTGT GAGCCACTAG TTACTGGACA GCGGGCGATT GATGTTTTCT TCCCCCTATC 1140
AAAGGGAGGA ACGGCGGCTA TTCCAGGGGG ATTTGGAACT GGGAAGACAA TGACGCAGCA 1200
TGCCGTTGCC AAGTGGTGTG ATGCAGATAT TATCGTGTAC ATCGGCTGCG GAGAGCGGGG 1260
CAACGAGATG ACAGACGTGC TCTCTGAATT TCCCAAACTC ATCGATCCGC GCACAGGACG 1320
CTCTCTTATG GAGCGGACGA TTTTGATCGC AAATACGTCC AATATGCCTG TGTCCGCACG 1380
CGAGGTGTCG CTGTATTCAG GGATTACCCT TGCGGAATAC TACCGTGATA TGGGTATGCa 1440
TGTGGCCATC ATGGCTGATT CTACCAGCCg CTGGGCGGAG GCGCTGCGTG AATTGTCTGG 1500
GCGCATGGAA GAAATGCcTG CGGAGGAGGG ATTCCCTGCG TACCTTCCGA CGCGTCTTGC 1560
AGAATTTTAT GAGCGCGCAG GACGCGTGGA AACCTGTGTG GCGCGCGAGG GCTCTGTGAG 1620
CATCATTGGT GCTGTTTCTC CCCTGGGTGG AGATTTCTCT GAGCCGGTGA CGCAGCACAC 1680
AAAGCGCTTC ATCCGTTGCT TTTGGGCCTT GGATCGTGAA CTTGCACACG CGCGTCATTA 1740
CCCTGCCATT GGGTGGATAG ATTCATACTC TGAATATGCG CAGGAAGTAA GTGCATGGTG 1800
GAGTAAGTAT GAcCCgCGCG CAGGCGTtGC GCGCCGCAGC CTTGGATTTG CTGAGAAAGG 1860
AACAGCgGTT ACAGCAAATT GTCaGGCTTG TCGGTCCtGA tGCGCTGCCt GGAGAAGATC 1920
GTCTGGTGCT AATGGTGTGT GAAATGATCA AAGGTGGCTT TCTGCAGCAG AACGCTTTTG 1980
ATCCGACGGA TGTGTTCTCC TGTCCCGAAA AGCAGGTGCA GATCTTGCGT ACCATAGTGG 2040
ATTTTCACGA ACGTGCCGTG GTGCTGCTGC GTGCAGGTAT TTCGCTTTCT GCGCTGTCCC 2100
AGCTTTCGTG CCGGGAGCTC ATCGTACGTA TGAAAAnTAC GTACGGGAAT GAGGATGTAC 2160
ACAAGATGCA GAAAGTGTAC GACACGATGT GCACTGAGTT TGACCAACTG AGTGTGTGTG 2220
CTGCCGCGCG CACACAAGGG GGGGAGAAAG TCGAATGAAG GGAGTGTGGT ATCGGGGTCT 2280
GTCCTCCATC GACGGTCCGA TCGTGGTGGC AAAGCGCCGG GAAGGTGCAT TCTATGGGGA 2340
GATTACGGCC ATCCGTGATC GCTTCGGTGC TCTGCGTACC GGCAGGATAA TTGATCTTTC 2400
TCAAGAGTGT TGTCTGATTC AGGTGTTTGG CTCCACGCTT GGGCTCAGCC TCGACGGTGC 2460
CTGCCtTGAG TTTTTGGACg TGCCGATGCA GCTGCGTGTC TGTGAGGGTT TGATGGGGCG 2520
GGTATTCGAT GGATTAGGGA GACCAATCGA TGGTTTCCCA GAGGTGCTCT CTTCTCAATT 2580
GCGTAATGTG AACGGCTATC CTATCAATCC GTACGCGCGC GTATATCCAC GTGACTTCAT 2640
TCAAACCGGT ATTTCTGCTA TCGATGGTAT GAATACGCTC ATTCGTGGGC AGAAACTGCC 2700
AATCTTCTCT GGGAACGGCC TTGCGCACAA CCGTTTAGCA GCGCAGATTA TCAGACAGGC 2760
AAAAATTCTT GGCACGGATG AGGCCTTTGT GATGGTATTC GCGGGTATGG GTATTAAGCA 2820
CGATGTGGCC CGCTTTTTTG TTTCTTCTTT TGAAGAAACA GGGGTACTGT CAAAGGTGGT 2880
GATGTTCCtG TCGCTTGCAG ATGCGCCATC TATCGAGCGT ATTATCACAC CACGCTGTGC 2940
ATTAACCGCA GCTGAGTATC TCGCCTTTGA AAAGAACAAG CATGTATTAG TCATTTTTAC 3000
AGACATGACA AACTACTGTG AGGCGCTGCG GGAAGTTTCC ACCACACGAG GGGAGGTACC 3060
CGGGCGTAAG GGTTATCCGG GTTACCTGTA TTCTGATTTG GCAGAACTGT ACGAGCGCGC 3120
AGGCAGAGTG AAAGGATCCT CCGGTTCGGT GACGCAGATT CCgAtCTTAA CTATGCCGAA 3180
CGACGATATT AGCCaTCCGA TCCCtGACCT GACCGGGTAC ATcACCGAAG GACAGATTGT 3240
GTTGCAACGC GACCTATCTC AGCGGGGCTT GTATCCGCCC ATTGGGTGTC TACCCAGCCT 3300
ATCTCGCTTA ATGAAAGATG GTATCGGGGA GGGTATGACA CGCGCAGATC ACCATGCGGT 3360
TTCAAGTCAG CTATTTGCTT CATACGCAAG AGTACAAAGC GTACGGAGCC TTGCCTCGAT 3420
TGTCGGAGAA GAGGAATTAC CTGCACTCGA TAAGTGTTAT CTGCGCTTTG GTGACTTGTT 3480
TGAGCAGTAC TTTCTCACGC AGgATGAGCA TGAAGATCGG AGTATCAGTC AGACGCTCGA 3540
TATCGGGTGG AGTTTGCTCT CACTTTTGCC GCGCACCGAG CTATATCGTA TCGACCCAAA 3600
GCTTATCGAT CAGTACCTGA CCGCTTCGTG CAGCGCGGTG AGTGATCAGT TGCGAAAGGC 3660
GATAGAGGAG GCCCGCACCC CGGTTGCGGA CGCGTAAAGA CCATGTGTCC TATAAGGCTC 3720
TTGGAGAAGG GTGATTTCTT TGCGGCGCTC CCTTGCTGTG TGTCTTGGCC ACGCAGGGAG 3780
AGGATACAGA GGTGAAAACA CCTTTAGCTC CCACCAAGTC GAATTTGGCG TATGTAAGAG 3840
ATCAGTTGGG TTTGGCTCGT GATGGTTATC GCTTGCTTGA GCAAAAACGA GAAATCCTCT 3900
TTATGGAGCT CACTTCTCTC TTGGAAGAGG TGCATCTTCT AGAGACTGAG CTTGATAAGC 3960
GTCGGAAGCA GGCGTATGCG TCGCTGTGGC AGCTGCTTCT TGCACAGGGC CGCGATGATA 4020
TTGCTGCCTG TGCGCTCGTA ACACCgGTGC CCTGCCGTGT GCAGCAGGAG GTGCTTTTAA 4080
TTGCTGGATT GCGATTTCTC CGTCTGGATG CAGTGATGC GCCACCGAAG CTGCAGTATG 4140
CTGCGCTCGG CTCCAGCGCG TGCATGGATA GAGCGCGGGA GGACTTCGGG TTACTGTTGC 4200
AAACACTCAC GAGAATGGCA TCCGTACAGA CTATCGTATG GAGACTCGCG TCAGAAATGA 4260
GAAAAACACA GCGACGTGTG AATGCGCTGA GCAAGCAGAT AATCCCACAG ATGTGCGAGA 4320
CGTGCATGTA CATCGAAAGC GTGCTCGAGG AGCGCGATCG GGAAAGTACT TTTGTGCTCA 4380
AATCGCTAAA GGCGCGCAAG GATCCCACAA CCACCCTTTA GCACTCATCC GGCTGTACGT 4440
CCTGCGCTGC TGTTGTTCCG GGCCGACGCT ACCTCAGGGA GGCGCGTCCG ACACGCACTC 4500
TTCTTTTCCG CGGCCCTTTG CGTAGGtGCT CTTCTTCAGG AAAGcTGCGC GCGTGGGGGA 4560
CGTGCCGTGC TTCTCAGCGC GGCCCCTACG TTCTAGCAAA GCGGGaGCAA TGAGCTCAGC 4620
AATTTTTTTC GAAAGGGGAG AAACGGACAT TGCATACATA CGAGACGTGC GCACGATCTC 4680
CCCGGCTACA ATAAACTTCG GCCGGTCTTT ATACACACAG GACCCAGGAT GAATGTGAAT 4740
ACACTCTGCA GTGAGCGAGC GATACGAATC GCGGTGTTCT TGCACGCACA CGAACTGTAT 4800
CATGCCCGTG CCCACGCAGA TTAAGTATTC TTCCACGGTC CCGCCCGTTA GCAGAGGGAA 4860
TCCCATGCCA CTTACGATAA GCTCGAGCTG CTCCTTTATG TTAGAAATCT CAGCCATAAT 4920
CCGCTCGTCC AAATAGAAAT GTTGGCAAAA GGAATGTTTG TTATTTTCtT GCgcATACGC 4980
GCGGTAGATC CGCAAAAAAG ACACGAAATC CCCCATGGGA TCGGAGAACG TGCCGTGTGC 5040
CTTTTTTGTT TTTTCTTCCT GATCCTCTGA AAAGATTAGC GGGCTGCGAG CAGACAGAAA 5100
CGCCGCAGCG ATAAGTACAT CGTCAATAGA ATGGGGATAG CGCCGCAGCG CCTCTACAAT 5160
CATCCGGGAC TGCCgAGGAC CGAGCGGAAA CAGGCACATC ATTTTTCCAA TTTCACTCAG 5220
ACTCCGGTCA TCTTCTAACG CGCCGAGCAA GCGCAACGTG TCTTCTGCGC CGATAATACC 5280
ATGGGTGCCA GGAGGAGAAA TAAAATCAAA GTGTTCGAAA TCGTGGATAC CGAGTTCTGC 5340
CATGCGCATG ACTACCTCAG ATAGGTCAGT GCGGTAGATT TCTTCAAGGG TGTACGGTTC 5400
ACGCTGCTCA AAATCATCGC GCGAATATAG GCGATAGCAC GTGCCTGCGC GTACTCTGCC 5460
TGCGCGTCCA CGCCGCTGGT TACACGAAGC CTGAGAAATA GGAGTTTCGT CCAAACTTGC 5520
AGTATAGGAA AGCGGGTTAT ACGAATTTAA CTTCACCAAA CCAGAGTCAA TGACGGTAGT 5580
TACATCGTCA ATGGTGATGG ATGTTTCTGC AATATTCGTT GCGATGACGA CTTTTCTTTT 5640
TCCAAATGGC GCGCGGTTAA AAACTTGCTC TTGTTCTTCT TTACTCAATC TTCCATAGAG 5700
GGGCAAAAGA AAGAGCTTGC GGAACCAACG TTCATGGGAA AGACGGGTAA TACAATTTTT 5760
AATAGAACGC TCCCCTGGCA GAAAAATGAG TATGGCACCT TTGTCCCTTG AAGCGATAAC 5820
ACGCTCAACG ATACAAACGA TCTTTTCTAG CAAGGCGGCC TCCGCTTCCT TTGTATGAGT 5880
AGATGCaGGC GTATcAGGAG GATCGAAAAT AACAGTGACC GGGTATGCAA CCGCATCTAT 5940
TTTGATGAcA GGGCACTCAT TGAAATAGCG GGAAAACATG GCCGTGTTGA TTGTGGCAGA 6000
GGAGATGACG ATGCGGAAAT CATGCCGCTG TTGCAAGACG CGCTTAAGCA ATCCTAAAAT 6060
AAAATCAATG TTGAGACTCC GCTCATGCGC TTCATCTACC ATGATGATGG AGTATTTACT 6120
GAGGAGTGGG TCGAGCTTCA TTTCTTGCAG AAGGATTCCA TCAGTCATTA CTTTTATTTT 6180
TGTTTCGACA TCTGTGTGAT CCTCAAAGCG CATTTTGTAT CCGACAATGC CGGGCTGCAC 6240
GTGGAGCACC TGCTTGGCAA TGAACTCGCT TACAGAGAGG ACAGCAATTC TACGCGGCTG 6300
GGTGACGCCG ATAGCACCAC CTTCATTGTA TCCTGCTTCA TGAAGAATGA GTGGCAGCTG 6360
GGTAGTTTTC CCAGATCCGG TGGGGCTTTC GACAACAATG ACGTGATGGy kCGCGAcGCG 6420
CTAAGAATTT TGTCTTTCTG AGAGTAGACG GGCAACTGCT TGTAACTGAA CATGATTGCA 6480
AGCTCCTCTT ACTGCGTGTG GA AGGmCAG GATAGAAAAA AGAACCAGAA GTGGGAGTGG 6540
TGCGAACGGG CGGTAGGGAG CGTCCGCACc GCACTGCGgG AcgGTGcTGA GAGTACAGAA 6600
AGACGGAGCG ACCAAGCGCT AGTCATTGAC ACGTTCTTGA TATtCATtCG TCTCTGTATT 6660
TATCAGGATC TTCTCTCCTT GCTTGATAAA TAGGGGAACG CGCACGACAA GACCCGTTTC 6720
GGTAGTCACA GGCTTTGTGG CGCCAGAGAC GGTATCCCCC TTAAGATACG GCTCGCTGTG 6780
TGCAACACGG AAAACCATTT TGGTGGGAAT TTTTATGTCA ATGGACTCCC CGTTCCAAAT 6840
TAGGATGTCG TATTCGTCCC CTTCGCGCAA GTAGCGCTCT CTTCCTGGGA CATTCCCTTT 6900
GGAAACGAAA ATCTGTTCAA AACTGCGGGT ATCCATAAAG ACGAAGCATT CCCCGTCATC 6960
GTACTGATAC TGAGCGCGGT GGCTGTCTAC AACCGCATCT TCGACTGTAT CTGAGGTCTT 7020
AACTGTCTGA GTGAGCACAG AGCCGTCACG AAGATGTTTC ATTTTAACGC GCGCAAACGC 7080
AGCACCCTTA CCCGGGTTTA CGAACTCGCG CTCGACAACC AGGTACGGAG CACCTTTATG 7140
GAGCAGGACC GTCCCCTTTG CGATATCTCC CCCTCTAATC ATGTAATTCC TCTCTTATCT 7200
CCTAGTAAAC GTCTTGCACG ACCTGCGGGG GCGCAGTATA CCGCGCAGtA TATTTTTTAA 7260
AAGGCCTCGA ATGGAGGCAT TGACTTTTCG TCCCTTGCCT GGATACTAGG CGCCCTATGG 7320
CGAAGAACAC TGATATTGAG CACGACGCGC ATGAGCCGGC CGGGCACGGG GATGTGCGTG 7380
AGTCTGCCGT GGAGAATCCG TCTGCTTCGG CAGTGTCTGA CGGGGAGGAG CGCGCCACGT 7440
TTGCGCCGGA GtTGCTCCGC AAACCGATAC CGAATCAGCG CAAGGTGCAG CACAGGAGTC 7500
AGAGCCAGAG GTACAGCGCG CAGGAGAAGC TGAAAAGGGT GTACCAGAGA AGGCTAAGGC 7560
AGTAGTGCCG CTTGATGAGT TGTTGCCGCA GAAGGTCCAC TTAATTCCGC TCACCGGACG 7620
GCCTATCTAC CCGGGTATTT TTACTCCGCT TCTGATAAGC GATGAGGACG ATGTGCGTTC 7680
GGTGGAAAGT GCGTACAGCG AT GTGGTTT TATTGGGTTG TGTTTGGTGA AAACCGACAC 7740
GCAAAACCCA ACTATCAGTG ATTTGTACGA GGTAGGATCG GTCGCTCGTA TTGTGAAGAA 7800
GATTAATCTG CCAGACGGTG GGTTAAATGT TTTTATTTCT ACACAAAAAC GTTTTCGCAT 7860
CCGCAAGCAC GTGCACCACA GCAAGCCTAT CGTAGCGGCA GTGCAGTACC TGTCCGATCT 7920
TATTGAGGGG GATCCACTCG AGATAAAGGC ACTTGTGCGT GGCCTTATTG GGGAAATGAA 7980
GGAGCTTTCT GAGAACAATC CACTTTTCTC AGAAGAAATG CGGCTGAATA TGATCAACAT 8040
TGATCACCCC GGCAAAATCG CCGATTTCAT CGCGAGTATC CTGAATATTT CAAAAGAAGA 8100
GCAGCAACGC ACGCTAGAGA TTCTGGATGt GCGCAAGCGC ATGGAGGAAG TCTTTGTATA 8160
TATCAAAAAA GAAAAAGACT TATTAGAAAT CCAGAGAAAA ATTCAAAATG ATTTGAACAG 8220
TCGGGTGGAG AAAAACCAAC GCGAGTATTT TCTGCGTGAA GAGCTGCGTT CCATCAAGGA 8280
AGAGCTGGGT CTTACCACCG ATCCAAAGGA GCGTGATCAG CGGAAGTTCC GTGCGCTAAT 8340
AGATTCGTTT CACTTTGAAG GGGAAGTGAA AGAGGCTGTG GAGAGCGAAT TGGAAAAGCT 8400
CTCCCTTACA GACCCGAATT CCCCTGAATA TTCaGTGGGT CGAACGTACC TCGAGACGGT 8460
GCTCTCTTTa CCTTGGcACG CTCCTGAGAA GGAGGAATaT GACTTAAAGA AAGCTCAGAA 8520
ACTGCTTGAT GAAGACCATT ATGGACTCGA GAATGTCAAA GAACGGATCG TGGAGTATTT 8580
GGCGGTGCGA AAGTTACGCG CCGATACCAA AGGCTCTATC ATCCTGCTGG TAGGTCCGCC 8640
GGGTGTGGGA AAAACCTCGG TGGGCAAGTC GATAGCGCGC GCCATCCACA AGCCCTTCTT 8700
CCGTTTCTCG GTTGGAGGGA TAAGCGATGA GGCCGAAATC AAGGGGCACA GACGTACTTA 8760
TATCGGCGCC CTGCCGGGTA AGGTGCTACA GGGGCTGAAA ATAGTAAAAA CTAAGGCTCC 8820
CGTGTTTATG ATCGACGAGG TGGACAAGAT TGGTTCTGGC GCGCGCGGCG ATCCTGCGGG 8880
GGCTCTGCTG GAGGTGCTTG ATCCGGAGCA GAACaCTACG TTCCGCGATC ATTACTTAGA 8940
TTTGCCCTTT GATCTCTCTC ATATCGTGTT CGTGCTCACT GCCAATAGCA CCGATCCTAT 9000
TCCCCGTCCA CTGCTGGATC GCGCTGAGAT TATCCGTCTT TCCGGTTATA TCGATACGGA 9060
AAAGGTTGAG ATCGCAAAGC GCCATCTGGT GCCAAAAACG CTGGAGAAGA ATGGTTTAAA 9120
GCGTGCGTGC GTCTCTTATC GGAAGGAGGT GTTGCTACAC CTGGTCCATT CTTATGCGCG 9180
GGAGTCTGGG GTACGGGGGC TAGAAAAAAG CCTTGACAAG CTGCATCGCA AGCTTGCCAC 9240
CGAGATCGTG TTAGGGAAGC GATCGTTTGA TGACAAGTGT TTGATGGATG AAGCTCTCAT 9300
AGGGACCTTT TTAGGGAAGC CCGTGTTCCG CGATGATATG CTCAAAGACG CGAACAAAGT 9360
TGGTACTGCG GTGGGTTTAG CCTGGACTGG CATGGGGGGA GACACGCTCC TTGTTGAGGC 9420
AATTACTATA CCAGGAAAAG CAAGTTTTAA GCTCACTGGG CAGATGGGAG CGGTTATGAA 9480
GGAATCCGCT TCTATTGCCT TGTCCcTGtG CGCCGTTACA GCGCGCAgCA GCGTATCnTT 9540
CGCCGAATTG GTTTGAAAAG CGCGCAATAC ATCTGCATAT CCCCGAGGGC GCAACCCCAA 9600
AGGACGGTCC GTCCGCGGGG ATTACCATGA CCACCACGCT CTTcTCGTTG CTCACCCAGC 9660
AGAAAGTAAA GCCTCGCCTA GCGATGACTG GAGAACTCTC ACTGACCGGA CAGGTGCTCC 9720
CCATCGGGGG ATTGAAGGAA AAGACTATCG CsCACGGCGC GGTGGTATCA AGGAGATCAT 9780
CATGCCAAAA GCGAATGTGC GGGATCTGGA CGAAATCCCC GAGCACGTCA AGAAGGGCAT 9840
gTGTTCCACC TAGTTGAATC GATGGAAGAG GTCCTTTCTC TCGCCTTCCC CAAGGGGAAG 9900
CGTGTCCGTG CTGGCACTGC CGCCCAATCT GCTTCTCCTG AAAcCCTTAC AGGCTGACGT 9960
ATGCGCTTTC GTGCACGCGT ATCTCAGTCA ACTGCGAAgT GcGTCGTGTT CACAGGAGGC 10020
GGCACGGgAG GACACATTTT CCCGGGAATT GCAGTTTTTC AAGCgCTTGC gCAcrGGCGG 10080
cGGtGCGTGT CGTGTGGATT GGTGCAGCGC GTGGTGCTGA TCGCTCCATA GTGGAATCTG 10140
CCGGATTAGA GTTTTGTGGT ATCACCGCTG GCAAGTGGCG TCGGTACGCG AGTGTGCGCA 10200
ATTTTTTTGA TGTATTTCGA GTGCTCGTCG GTACGGTGCA ATCCTATTGT ATCTTGCGCG 10260
CTTTGCnCCC GCAGGCACTA TTTTCTAAGG GAGGGTTTGT GTCCGTGCCG CCGTGCATCG 10320
CAGCGTGGCT TTTGCGCATA CCCGTTGTCA CGCATGAATC GGATATCAGT CCAGGACTTG 10380
CCACACGCAT CAATGCGCGT TTCGCCGATC GTATTTTAGT CTCTTATCCG CACACGTCCT 10440
GTTATTTTCC CCgTGcGCGA CGCgcAGCAG TTCACTGCAC GGGGAATCCT GTGCGACAAG 10500
ATTTTTTTTC TGCACAGGCA GAGCGTGCAT ACCAGTTTTT ACGCATTGAC CAAAAAAAGC 10560
CATTGCTCAC AGTCCTCGGA GGAAGTAGCG GTGCGCGTGA CCTAAACGCG CGTGTTCTTT 10620
CATGTAGCAC CTTCCTTACC GAACGCTTCT ATCTTGTCCA TCAATTTGGC GCAGgCAACG 10680
AGGACCAAAT GCATACTATC ACCAATTCGC TTAGcGTCAA TGCTCGGCAT GCCTACATGT 10740
CGTTTCCTTT CATTCAGGgC ACATCTGCCC GATATACTCG CCGCGAGCGC ACTGGTACTC 10800
TCTCGTGgCT GGTGCGAACG CGGTGTGGGA GTGCGCATGC TCGGTAAACC AATGGATTCT 10860
TTTTCCTCTC GAACGAGGGA GTTCCCGTGG GGATCAGATT GAAAATGGCA GAATATTTTA 10920
GCGCACAC 10928
(2) INFORMATION FOR SEQ ID NO: 58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3237 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
TACAACGCCG TGAGCAACCC TGCACTCAAA AAGATCAAAG CAATAACATG GGGAGGAGCG 60
CCCATCGCTT TTAACATCGC AATTTCTTTA CGCCGTTCTG TCATTAGCAC CACCAGTACC 120
GAAGAGATGT GCACTGACGC TACTAGCACT ATCAAATACA TAATGAATAA CAATAACTTT 180
CGTGATGTTC GGAAAGAATG AAATTGCGAT CGATTCATGT CTTGCCACGT GTACGCACTA 240
AAATGGCTCG GTAATTGTTC GTGTACTTGC TCAATAAAAC GAGTCATCGC CTTAGCGTCA 300
AACGCGTCGG CAGTTTTTAC CACAAAAGAA AGTAGCGCAG ATGCGGGGGA GAGAATTTTC 360
ATTCCCAGCG TGAGGGGGAT AAATACCCAC AACGCATCAA GCTCCTGATA TCCGCAAGAA 420
ACAATACCTC CTACCACCGC GCGCACCATT TTGGGGACCG CACGACCTGT CCCTCCTTGC 480
ACGAGGGTGA GTATCTGGCA CGTGTCCCCA CAGCGCACCC CAATGCGCTC AGCGATGCGT 540
TTTCCCAATA TTAACGTGTG TACTCCtGCG GCCTATCTAC CAGTTCAAGT GAACCTTCGA 600
CGGTTAAAAA TGGACGAAGT CcACGcTCAC TAGAAAAAAA ATCAGGGGGA ACTGCGCGGA 660
TATTCCCCCC TGCACGCCCT GTTTTTCCGA TTACAATACC ATCTCCCTGA AGGTGCATCC 720
ATCGTGAGTG ACAGTATGGG CCAAAGTCcT GCGCCATAAA TGCATTAAAT ATGCgctGCG 780
CGTCTTCATA TCGTTGCGTT GCCGTTTCAT TGGGAGCAAG CGGCAGTATA TCGATAAACT 840
GGAGGTGACC CGATCCGAGT TCAATCATCC GTGTGGTGAT CCCTTCAATC ATTCCATCAG 900
ACACCACAAG GACAACAATG AGTGGGATGA TGCTAATCCC GATGCCGAGC GCGGCACAGA 960
AAAAACTTTT GCGCAAAAAG GAACGTCGTT TTCCTGCTAC CGGAGTACCT GATAGAAAAG 1020
GTACTGGcGT AGGCAGTACG TGGTGCGCAT CACCGTGTAG AGATGGGGTG TGCCCATATC 1080
CTGcGCACAC TCCTGCGCAA CGTAATGCAC ACATGAAAAT AACTCGAATC AGATTCACCG 1140
CGGCGCACCT TTGTTTCATA TGCGTATCAA ACTTCCCTGC TGTAGCTGGT AACGGTAtCG 1200
GTCATCGATG CaATACGTGG GTCGTGCGTT ACAATGAGTA ACGTCTTTTG ATATTCCTCT 1260
GTCAGAGAGA ACAGCAGATC CTGCACTATC AAAGCGTTCT TGGGATCCAA ATTGCCAGTC 1320
GGTTCGTCCG CAAGAATTAG GGTGGGATCA TTGATCAGTG CACGCGCAAC TGCTGTCCGC 1380
TGTCTTTCTC CTCCTGACAT TTGTGCAGGA AAATGATGGG CGCGCTGCAC TACGCGTACT 1440
TTTTCTAGCA ATTCGTATGC GCGTGCACGC ACcTCACGGT AACTTTTTCC TGCGATAAGT 1500
CCAGGCAACA TGACATTTTC AAGCGCAGTA AAATCCCTCA GTAGATGATG AAATTGAAAA 1560
ACTAATCCTA AAAACTGTCT GCGGTATTCT GTCAGTGCGT GCTCATGCAA AGTGAGTACG 1620
TCGCATGAAA GCACTCTGAC GATCCCCGAA TCAGCGTGTT CCATTCCTCC AATAATATTC 1680
AGTAAGGTAC TTTTACCGCA GCCGGATTCT CCGGTGATTG CAACCTTCAC TGCACGCGGC 1740
ACGcTAAATG ATACGTCAGA CAAAATCTGT ATACGTTCTG TTGCGCAgCa GAAnGCTTTT 1800
ACTTACTTGT TCGACAGAAA GAATTGGGTC AtTCATCGCG TAGCACCTCA GCCgGCTTGA 1860
GCaGGAGTAT TTTACGCGTG GCAAGGTACG TTGCAACAGA CGCAGAACCT GTGCCAAACA 1920
GAAATACAAA CAGTACCTCC TGAAAGAAAA TCTGCACGGG AATACGCTCC ACGTTGTAAA 1980
AATATTGCGT ACCAAACACA CTGAAAGAAG GGGTTTTCGT TCCCGAGAAG AGGGAGAACA 2040
GGAAAAACGC AGAATTTACA GCAGTCTCAA TGCACGCAAT TATTTCGTTA ACGTGGATAG 2100
TAATGAGCAA TCCCAGGAGT ACCCCCAAGA GAGAGCCCAA AAAGCCAATC ATAATGCCAT 2160
TGCCGATGAA CAGAATCTGC ACGTGACTGA CAGGGGCGCC AAGTGAAACG AGCATAGCAA 2220
TTTCTTCCTT TCGAGTGCGA ATAGAGCGGC GCATGCTGTG ATAAATGTTT ACGGTTACCA 2280
CCwTAAAAtC AAAtGACAAG AAGTATCATG ACGTTCTTCT CTATGCGGAG CGCACTAAAA 2340
AAaGCaCGGT TGTACTCCCG cCAGGATTCT GCCtTGAGAT scAGGAATGT GTTGTGCaAG 2400
AAAGAAAAGG TAGCGATCGT CTCGCTCATG GTTATTTAGT TTGACTGCCG CGGTAATATC 2460
aGGCGTCGTA CCAAATAAAG TGGTGCCCAT GTCCAGAGGA ATGTACGCAA ACGTGGAATC 2520
TACTTCGTGG TATCCCGATT TGAAAATGCC CGTTACCGTA AGTTTATTCC AGCCTGGCAT 2580
TATCTTTTGT GTATCACTTC CTGACAGGGC AAGCGTGTCA ACCTGATCTC CGGTACGTAC 2640
CGAAAGGTGG CGCGCCAGTT CATATCCGAG CACAATGGAG TGCTTTTTAC TCAAATTAAA 2700
ACTTCCGGAT GTTATCGGGA GTGCACGCGC CAGCAACCTA TCCCGATGGA AGATATCTGC 2760
AGGAACTGCA CGCACAAGCG CACCGTGTTG CCGATAATAG TTGCCTTGCA ATAAGGCATG 2820
CGCTTCTATA AATGGATAAA AGGATTGATA GcCGCCTAAC GTCTCTGCAC GTTTTAtGCG 2880
TCAACACTGC CATATACACG AACGTGTGCA GAACTCACCT GTAAAATGGT GCCAATAAAA 2940
CCCTGCTGGA AGCCGTTCAT AACCGAAAGG ATGACAATTA AGGTAAGTGC CCCAAAGGCA 3000
ATGCCTAATA TAAAAAAAAG ACTGGTAATC GCGTTyGCAC TCCGCGCGCG CACTGAATTT 3060
AATCTGCGCA CCATAAAACA CATCCACCGc AGCGTTTGCA CGTGGGTGTT ACTCATCGTG 3120
TACTTCCTTt GTAGAGTAAA TTTCCTTCCC ACGCTCAAAT ACCTGCACGT CCTGCGCACC 3180
TTCCTTTTGG TGAATTATTG CCTGTAATAC nCCGTTGCGA TAACGTTTTT CTAACAC 3237
(2) INFORMATION FOR SEQ ID NO: 59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
GTCGTATCCG nGnTAGTCCA CCGGTTCCTG AAAACACCTG CTGCGCTGCA CGGACACCAC 60
CTTTCCCCCA AGTTCATCCA AAACAGGGTC GCAAACCGCT GCGAGTTGAA AAGCTCGACG 120
CGTTGCTGCA CACGTAATCT CATAAATATT GCCCTGAAAC TCAATCAACT GCTGCATTGG 180
GAAAATCATG GGTCATCGTA ACACCTATCG CGCCGCGCAA CAAGCACTGC TTGACGTACA 240
CGATCCCCCC TGTGTAGAAT GCGCACCGCG GGCAATGGCG CAGTAGGTTA GCGTACAAGT 300
CTGGGGGACT TGGGGTCCCC GGTTCGAGTC CGGGTTGCCC GAGGAACAGC TGGCCAGCTC 360
ACCCCGGTAG GCCGTTTGTG TCTTTGGGAA GCGGCGTGCC CGGCTGCGTC TCGCCTGTAG 420
GTTGTAGCTT CCGTAGTGCG TTCTTAGGCA GGTGTGTAAG GAATGGCGGG GCTAGTAGGC 480
ACTCAGGGTC TATCGGCGTG CTATTTATCC GTGCTTCCCA GTGTACGTGA GGCCCGGTGG 540
AGAACCCGGT GGTTCCGCTG CGTGCAATGA GCGTGCCCGT TGACACGTAC GTATCTTTTT 600
TCACCAGCAG CTCGTTCAGA TGGTAGTACG CTGTGTACAG CCCCGGGGCG TGCTCCAGTA 660
CCAcGCTCCA AmCCGTGGTT GTCCGTTGCT CTGCGAGTAA CmACCtCCCT GTACnTGcTG 720
cATACAmCGC CGTTCCCACc GGAACTCCAA AGTCCTTCCC CCAGTGGTAC CTGnCAGAGC 780
GCGTCCCGTC GGTGTACACA AAGACGCGCG CnTGCCCAAA CACGACGTAC ACCGTCGAGA 840
TTCCACCGGT TGTCGAAACG GCCCTAAAAA GGCCCGAGTC TGAGGGGAGn TCACGGTCTC 900
nGGAnGCCTT ACAGGsGTCA CGCTGCACCT TTTTGCGCTC ACTCTTGTCC TGCGCAATGG 960
CGGTATTCTT gCGATCTAAG CGTAATTCCT CACGGGGAAA TTCCTTTTTC TCAATGCGCA 1020
GCGGCGCACG CCGCACATAT GGCTTCCCTC CCGGCACACG CACCTGCGCT TCAAGCATCC 1080
AATCCCCCGG TTCCCAGAAA ATCGATATCC CCAGCAAGGC AACGTGCGTC ACATCCTGAG 1140
AACCTGCACG CGAGACGCCA GCGGTTGCCG CGTCCGTCCC TAACTGAGCA ATACCCTTTG 1200
GGGGAAGCGC AAAAGCGCGC ACCGTCTTTG CTTCTTTACC CGCAGGGGTA CGCAGCACCA 1260
GATGTACCTC AGTATGCGCC TTGTCCTTTT CTTGCAATGC CACTAAAGAA AAAGTGGCCA 1320
TCGCACACGC ACCTTGGGAT ACCTGACGCG GGAACTGCAT AGCGATACGC TCGAAATGTG 1380
CAGGAACCAC CTGACGCTCC GGCGGcGGCA CCGCAGcCGA ATGAAGCAGG AAGGACACCG 1440
CGCTGACAAA GACACCAGAG AACAAGAGTA cTTCGCACAG ACCACGCACC CAACGAACAC 1500
TTCTTTTCAC CGACGGTGAC TGCACGCCCT GCGTCTGCAC TGCCCTTTTA GCGTTCACCC 1560
CCGGTGCGCG TCCTACTCTC TCTGCACCCA TCACTCACTC CTCCACAAAT CTTGAATGAC 1620
CAGCTGCGGT GTGCACGTTC CCTGAAACGT GTTACGCGTC ACTTGAAAAA CCGCGTCAAC 1680
TACATCCCCC ACTGCAAACT CCTGTGCCAA CTTTTCCCCT GCTCCCCAGT AAATTGCGGG 1740
CCATTTATGC ACCTGTGCAT CCAAGGTCAA TTTTACGTGC ACACGTTCTG TACGCCCAAA 1800
AAGCGATGCA GAAAAAATTT TCAATCTCTT CGCCAAAAAG CACAACGGGG GATTGCCTTC 1860
TCCGTACGGC TCAAAGCGAT CGACAAGGGT CAAAAGCCCC CGCGTCATCT GCGTAGCATg 1920
cCAGTTCTGC ATCAAATTCT CCACACTCTT GCGCGCTTTC ATCAGCAAAC TCAATGGTTG 1980
CCGCATACAG TTCCATACGG TGCAATAGCT GGGGAATTCG CTCAGAGGGA ATTGAAAAAC 2040
CCGCCGCAAA TGCATGCCCC CCATAGTCAG AGAACAAGTC TGCAAGGGGA TCTAAGAGCG 2100
AAAATAGGTG ATATCCCCGC GCCGAACGCA ACGATCCTAC CGCGTGCCCG TCTGCCATTA 2160
TACAAATGAT CACACAAGGC ACGCGCAGCC TnCGcTCAAA CAGTTTGCAA GAATCCCCGT 2220
AACGCCCCGA TGAATCTTAT CGCTACAAAC CACTGCCAGG CGGTTGCTGT ACGTCTCAAG 2280
ACTTGcACGC GCAAGAGGCT CAACAAGTGC ACGAGCACTC CTTCCTAACT TTTTTCGCTG 2340
TTCGTTCAAT TGCACCATTT TTCGTGCCTG cAGCGCGCGC TGCGAAgTTT CGCGCATTAA 2400
AAACAGTTCC ACTGCACGGT GCGGACACCC TAACCGCCCC GTTGCATTGA TAAGCGGCAC 2460
AATACTCCAC CCTAnCTCTA CGGTTCCTAA CTTCTTCCCC ATGAGACGCT GTATCGcAAA 2520
CAAcTcACGC AAACCCACAC GTGGACGGCC TCATTCATCG CCTGCAGACC GTAGCGGACC 2580
AT 2582
(2) INFORMATION FOR SEQ ID NO: 60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5504 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
CAAACAGATC AGCGGGAGAA ACGTCTGGCA ATTnTAGCGC GCGCGCATCG GCAGGAAGAG 60
ATGGCGGCAG CCCCGGTTTA TCTGCCACAA GCACACAGCG CACCAAGCCC CCGCGACGAG 120
CAAGACAAAG CGTCTGCAAC TCATCGGGCA AGCAGCGGTA TGAACGCTCA GAAAGCTCCA 180
CGGGGATAAG ATGTATCCCC CGCTGTGCGA GTGCCCTCAC ATCTGCCGGA TCGAAATCCC 240
CCCACGGCTC ACAGCGTTCC AGATGCGCAA TGCACTGCGC AATGCGCTGA GCAAGTTCCA 300
CGCGGTCTGA ATGAGTACGT ACGATCTGTT CCACGGCCTC TGCTGCTTCT ACCACCTGAC 360
CTGCAACGCG ACACTCCTCT CCTCGGGTAA CATTCTTTGT CTGAGCGTCA GTGACGAGCG 420
CAATGGCCTG CACGCACCGA GCGTCCAACG CGTgCAACTC TGCAAGTTGC TCACTTGCAC 480
ACTCCCGCAA CTGCACATGC ACAGCACCAA AGGAACGCAA CGCCTGcAGC GAACGCTCTT 540
GCTCAGAACC GAGCACCAGA AGCGTTACCT TTTTCATAGG AACTATCACC GTGTATCCTC 600
TTGTTCCATC CGACTCACGT CTACCAGATT CTTCTTAGAC ATCTTCCCCC GCACTACTGC 660
AGCAACCTGC TGATCACCGA GGTACACCGT TATTTtCCGT ATGCATGCCC GCGTCTCAGG 720
AATCTTAACT TTTTCAAAGA GGTTAACACG CTGCGTTGTA GTCCGCAACT CTGcACCCAG 780
GAGAAGCGCC TGTTCGTCGA GAACATGCGC CTCCAAGTCT AAGCTTAGCA CTTCCTGCAT 840
TTTGCGCACT GCAGTATCCA CCCACAGAGG AACACGATAT AAGTCATAGG GAGGACAAGC 900
AAAGTGCACT TCTAAAAAGC AGGGAATACG CACACCTGCA ATGCTAGCAT ACGTTTTCTT 960
TACCTCTTGC ACGCGGAGCA AACGCGCGTC GAACACACCG CTTTCAGAAA AAACTGCAAC 1020
CCACTGCTGA ACATCCTGAC GCAGGGCATC TGCACGGGAA CGTACTTCAG AAGCGCGCGC 1080
CTCAACGGCA CGGATCTCAG CATACAACTG CTGCTTTTTA AGCTGAAGCG TAGGGAGAAA 1140
ACGGCGAAAC GTCTTGAGCG TCTCTTTTTG ACGTTTCAGT TCATTTTTGG TTAAGCGCAC 1200
CGCCAtCGGT CACCACCCTA CGCAGGCCAA TACGTGTTAA TCAAATCAGA GCGAATCCCC 1260
GTCTCCTCTG GGGTGAAACA CCGGCCCAGA ATTTTCCACC CCGTATCGAA CGCCTCTTCA 1320
AGCGGAATAT TCACCGAAAG ATCCATGAGc TGCGCTTCAA ACAGCCCACC GTATGTGAGC 1380
AGTTTCTCAT CCCACTCGCT CATGGCAAAA CCCATAGATC TTTTCTCAAG CGCATCACGA 1440
TAGGCGGCAT ACAACTTAAT CATATTATCC ATAAGCGCGC GATGATCTGC ACGCGTACGC 1500
CCGTTTACGT TCTGCTTAAG ACGGGATAGA CTCCCGAAAG gTTCAATGCG CCCGTTCTtC 1560
AGATAAAACT GaCCCTCAGT AATGTACCCC GTGTTATCAG GAACCGGATG CGTAACATCA 1620
TCCCCTGGCA TGGTGGTAAC GGCAAGGATA GTCACTGACC CTGcATCATC AAAATCGACC 1680
GCCTTTTCAT AGCGCGACGC AAGCTGGCTG TACAAGTCAc CCGGATACCC ACGATTCGAG 1740
GGAACTTGTT CCTGAaTAaT CGCAaTTTCC TTCATAGCAT CAGCAAAaTT AGTCATGTCG 1800
GTTAAGAGCA CCAACACATC CCTACCCTTC AAGGCAAACT GCTCGGCAAC TGCAAGaCAC 1860
ATATCAGGGA CCATCAAACA TTCTACGGTA GGATCTGAGG CAGTGTGCAC GAACAGGACT 1920
GCCCTACTCA ACGCTCCTGC CTCTTCCAAT GCACTTTTAA AATACAGGTA ATCGTCATGC 1980
TTCAGCCCCA TACCCCCGAG GACGATGACA TCAACCTCCG CTTGCATTGC AATACGGGCC 2040
AGCAGTTCGT TGTACGGTTC CCCTGAGCTA GAAAAAATAG GCAACTTCTG AGAAACAACC 2100
AGCGTATTAA ACACATCAAT CATGGGAATA CCCGTGCGAA TCATACGCCG CGCGATAACC 2160
CTCTTTGCCG GATTAACCGA AGGACCGCCA ATTTCCACCC TCCCTTCCTT TAAGGCCGGA 2220
CCACCGTCTC GGGGAACGCC AGAGCCATTA AAAATTCTCC CCAATAAATA ATCTGAGAAA 2280
CTCACGAGCA TACCCCTCCC CAGAAAGCGC ACCTCGCTCC CGGTGGAAAT ACCCCGGCCT 2340
CCCGCAAACA CCTGcAGGGA AACTACATCC CCTTCAAGCT TATTCACCTC AGCAAGCGAA 2400
TCGCCAAACG CCGTTTTTAC CCGGGCCAAT TCCCCGTAAT GCACCCCCTT TGCCCGCACC 2460
GTGATGACAG AACCGTTGAT CGACTCAATC TTCTCGTACA CCTTGTACAT CGTCTACTCC 2520
ATCCCCCGTA TAATTCCCTC TGcTTCGCTG TCGATTTTCG TCGATTCTCC CTGGAGAAAG 2580
GCACGTATCT CCTTTTCTTT CTCCACAAAC GCCTCAGAAT TCCAGGCGCA ACAGTTGTAA 2640
TCGATAAACA TATGCCCAAG CTTGCTGAAG TATGCCCGCG CGTCATCTTT TGATTCAAAC 2700
GCTAAAACAC TGCCAAGAAC CCGCATGACG ATGGCATAGC AGTGCTTTTG ACGTGCAACA 2760
GGTACTGCAC TATCGACTGT GTCAAAAGAA TTCTGCTGCA GATACACCGA ATCAAGAAAC 2820
GAGCCTTTCA GATATACGAG AAAGTCCTCC ATACTTGTGC CCTCTTCGCC GACGACCCTC 2880
ATCATCTGCT CCACCTCTGC CCCACGGCGC AGAAAAGAGC GACCGTACGC AACAGCCCGC 2940
GCGTCAAGCA CACTTGGATA CTTAGACCAT GAATCAAGCG GATGCACCGC AGgATACCTG 3000
CGCGCGTCAG AGCgyTnCTC GAGAAAGTCC GTGAAAgCCC CAACCACTTT CAATGTAGCC 3060
TGCGTTACCG GTTCTTCGAA ATTACCACCT GcCGGAGAAA CCGTCCCTCC AATAGTTACC 3120
GATCCTTTCT CTCCACTCCT CAGCCGGACC ACACCAGCCC GCTCATAAAA GGCTGCGATA 3180
CAAGACTCCA GGTACGCAGG AAAGGCCTCC TCCCCCGGAA TCTCTTCCAA ACGCCCAGAC 3240
ATTTCACGCA GGGCCTGTGC CCAACGGCTC GTAGAtCCGC CAGCAAAAGA ACATCCAACC 3300
CCATCTGACG GTAATATTCT GCAAGCGTCA CTCCCGTGTA CACTGAAGCC TCACGAGAGG 3360
CAACGGGCAT AGAAGAAGTG TTGCACACTA TAACCGTCCG CTCCATAAGC GACCGACCAG 3420
TGCGAGGATC CGTAAGATCA GGAAACTCCC GCAGgTtCTC AACCACCTCC CCTGCACGCT 3480
CCCCACACGC AGCAATCACT ACCACGTCCA CATCCGCATT GCGACTGGTA GAATGCTGCA 3540
GCACCGTCTT TCCCGCACCA AAGGGACCGG GAATACAGTA CGTCCCCCCC TTGGCCACCG 3600
GGAAAAAGGT ATCTATCGTC CTAATGCTCG TTACCAATGG CTCAGTCGGT TTCAAAcGCT 3660
CTGcGTAACA ATGGACGGGT CGCTTCACTG GCCAACGAAA TGCCATGGTC AGTTCGTGCT 3720
CATGTCCCTG CGCGTCACGG ACCCGCGCAA TCACATCGTG CACGCGGTAC GTCCCTGCAG 3780
TCTGAATGAA GACAACCTCA TAGGAATCCC CCATATGAAA GGGAACCATA ATGCGGTGTT 3840
TGAGCGCACC CTCTGGGGTA TACCCGAGCA CGTCCCCACG CACCACGCGC TCACCCACTG 3900
AAACATGCGG GGTAAACATC CATTCACTTG TCCGAGAGAG GGCGGGCAAA TACACCCCGC 3960
GCTCCAAGAA ATACCCAACC TTTTCTGCAA GCAGCGGCAA CGGATTCTGT AAACCGTCGT 4020
ACACCTGACC GAGCAAACCA GGACCTAGCT CAACAGACAG CAAATCGCCT GTAAACTCAA 4080
CACGGTCCCC AACGGAAACC CCTCTTGTGA TCTCAAACAC TTGCAACTGT GCCTCACGAC 4140
CACGAACACG AATAATCTCC GCTTTCAAAC GCGCATTTCC AACATGCACG TATCCGACCT 4200
CGTTGAGCGA AACGACACCC TCGAACGTAA CGCTCACCAT ATTGCCGTTG ACCGCAGACA 4260
CGATACCcTT CGTTTGCGTC ATGTATATGC TCCAAGAATT GTATAATAAA GAGCTGCGTA 4320
TGCTTTGGAC CCCGCCTGCA TTGTGAAACG CGACCGATGC GTAAGCAACA TCAGCTGCAG 4380
CCCGTACAAA AACACTGCCT CTGAGGAAAA CGGGTCAAGC GGCCTACACC CCTCGACAAA 4440
GGAAAAGCGC GCGTCGTTCA AAAAATACTC CGCTTCGAGC GGATCGTCCA AAGAGACCGC 4500
AACACGAGCC GCACGAGCCA CCGACTCCTG CGCCACAGGA CACTGCCTTA CTTCAACAGG 4560
AGTATCCCAC CGCAGACGGT cCgcGCGCTC ACgcgCAAGc GCACAGCGTA ACGCGTACTC 4620
AAACTCGCCC CATCTATCTA AAACACGCGA TCCCGTGGAG TCACGCGCCG GCACGGGACA 4680
CAACGAGATA TTTCCAAGCA CCGCAGCATC CTGCCGACCC AAGAAACGTA GCGCACAATC 4740
CAAAAAATCC TGATAACGCA AAGGAGGCAC CGCGCCGCAT AGAAGAGATG GTAGCTGCGT 4800
TATAAGGTAG CAATAGGAAG ACATCACAGC TCcTGCGCAG CAGmCTGAGC ACCTCGGCAA 4860
CACGCGCAGA AACATACGAA GAGAACAACT GAGCAACgCG GCGGCGGAAA AAtcATAGTa 4920
CGACCCGCCC TCGGCAGGGA CTATCCTAAA CCCTGCCGTA AGGCAATCGT CAGACCTCAA 4980
CTCTACCCCT GCCGAAAGCT GCTCCTGCAA CGCAsCACAA AACACCCCCT CAAGCGTCCG 5040
AAGATCAGCA GGAGAGAGGA TGAGCTCTAG CTTATCACCC TCCGCTTGAA CCCAGGCAGA 5100
AACGACACGA GGAATAAGtC ACGCAAAACA CCCGCATCGT aGtTGCGCCG TCTCCATCGA 5160
AATAATAGCC CGAAGAGAGC GAGTCACCGA ATCTTGAAAG GATAATAAAA CGTTGCGACT 5220
CGCCTGCGAC AGGCCGCAAG AGACGACGAC TCGATCCGCT CTGCCTCCTC ACGCGCAGCG 5280
GAACAATCCG CTCTGCCTCC TCACGCGCAG CGGAACAATC CGCTCTGCCT CCTCACGCGA 5340
CTCACCAAGC AAACGAGACG CCTGCTCCTC GGAAGAGGnC AAnCGCTTCG CGCTTAATTC 5400
GGTCATCAGA TCTTGCAGTT GAATCTCCAC TTATAGTTCT CCTCGCAGCC CTCCGAGTAT 5460
ACTAAAAAGT CCCCACCGGG AnAAGGCATA ACACAnTTCG ACCA 5504
(2) INFORMATION FOR SEQ ID NO: 61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8467 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
TTGTATTAAC CCATTGCCTT ATCCTTTTTC ACCCAGCGCC AGTTCACGAG ATGCATTACG 60
TTCCTCCCTT GGAAAACGGA GAATGACTTC CGTTATATCC GCCCGTTCTC TAGGGTGGAG 120
ACAAATCCAT AAAAGTAACG CCTCTTTTTT ACTCCCCCAT ACCTCATCAC CGCATACAAA 180
GCAAAACAAA ATCACTGAGG TTAAACATAC CCACCGTGTT ACGCTGTACG CGAATCCACA 240
GATCGCATAA CCCTCACCGT TTTCTCCGAT AAAGAATCTG CATCACCACA AACAGCATTC 300
CTATAGCATA CACTATTCTC AGAATGACTT GGTTGAGTAC TCACCAGTCA CAGAAAAGCA 360
TCTTACGGAT GGCATGACAG TAAGAGAATT ATGCAGTGCT GCCATAACCA TGAGTGATAA 420
CACTGCGGCC AACTTACTTC TGACAACGAT CGGAGGACCG AAGaGCTAAC CGCTTTTTTG 480
CACAACATGG GGGATCATGT AACTCGCCTT GATCGTTGGG AACCGGAGCT GAATGAAGCC 540
ATACCAAACG ACGAGCGTGA CACCACGATG CCTGTAGCAA TGGCAACAAC GTTGCGCAAA 600
CTATTAACTG GCGAACTACT TACTCTAGCT TCCCGGCAAC AATTAATAGA CTGGATGGAG 660
GCGGATAAAG TTGCAGGACC ACTTCTGCGC TCGGCCCTTC CGTATTTGTT CCTTACCAGG 720
ATGCGTACTC CCCTTCGTAC AGCGCCGCTT CTCTTGCTGC TCCTATGCGC ACTTCCCCGG 780
GCGTTGTGTT GCTCTCTAGT GCAcTGCGCG GGGTACCATT CGATGTACCG ACCCCATACG 840
TTTCCCGTCG GGCGAACACT ATCGACGCTG CCACCTTTGA AGACGCTCAT GTACCTGCAT 900
TATTTCCCGC GCTCTTTGCG CTTTGCAGGC ACGCGCCCaC ATTCGTGTAC GCAGAAAGTG 960
CCCATGAGGT GATGCTCAGC CGTTTTCTGC AACAACAGCC ACATGCATGC GCCGGTGTCT 1020
TTTTTGTCCT TCCTGACTCT GCAGCGCGCG GACCACACCA TGCTCCTGCC GTGCAGGGCG 1080
CACCTCCCCC CGTCGACACA GCGGGCGTTG CGTCTGCTGT CCGTGGCGCC AGCCGGACAC 1140
TACCAGCTGT GTATCGACAG TATGTTCACG CAGCAGAGGC AGCGTGGGCA GAGCTCGCAT 1200
CCACCGATAT ACTGGCCGCT TACtTGCAGG GCTCCCTCGG GACCGCCACA GAACGCGCCT 1260
TCAGGCACGT GctACGCAGG TAGACCAGTG GATACGCGCC CAGCTGCATC TATCAGAACC 1320
TGTCCTTCCG CACGCGCAAG CGCTATCTCA TCACACTGTC CATGCGGGAG GGACCTATGA 1380
CCGCACGTGA GTTAGATGCG TACTTTCGTA GTTTTTTGAA CTTTGGACCG TTCGTCTCCT 1440
GTGATGTCGC TCTCAACGGC CTGCAGGTAG CAAATAGCGG TGCCCCCGTG CACAAGGTTG 1500
CCTTTGCAGT GGATGCGTGT GCACAGTCTA TCGACGCAGC CGCCCGCGCC GGTGCACGCA 1560
TGCTCTTTGT CCATCACGGT CTTTTTTGGG GACGCATAGA GCCGCTTACC GGTATGCAAT 1620
ACCGACGCGT ACAGGCGCTC CTGACGCACG ACATAGCGCT GTACGCAtGC ACCTACCACT 1680
CGATGCACAC CCGCAGTACG GTAACAATGC GGGCCTTGcT GCGCGAGTCG GTCTTAGGCA 1740
AGGTGGTCCT TTCGGTTTTA TCCGTGGAAC TGCCGTAGaC TCTGGGGGAC GGTGGCAGAA 1800
AACACCACCC CCTCTCAGGA GGCAATGCAG CAGCATGCAG CGTGCACAGC ACCCGATACC 1860
CACCGCGTGA CGCATGCGAA TGCAATATCG CCGAGTGCCG GGCTATCTCT CCAACAAGTA 1920
GTACATCGCC TCTTCCCCGC AGAAGAGCAA CCCGTGCGCC TGTTACCGTT TGGGAAACAG 1980
CGTATCGAGC GCGTGGGTAT AcTGTCGGGC AAAGCAGGCA CGTACCTTGC GGAGGCTATC 2040
GCGTTAGATC TGGACCTGTT TATTACCGGG GAGATTGAAC ATTCTTGCTA TCACACCGCG 2100
CGCGAGCACT CTATCTCGGT AATCGCAGGG GGACACTACC AAACAGAAAC CGTAGGtTGC 2160
AGCTGGTGGC GCGCAAcTGC AACGGGATAC AGGCATAGAA ACGCTTTTTC TAGACATTCC 2220
CACGGGGATG TGATACGCTC GCGCCCGTTA AGGGTGGATA CAATGAAACT CACACGGATA 2280
CAGAAAGAAA AGTGGATCCC GCTTTTTGCC GCTGGATTAG TTGTTGTTCT GGATCAGTGC 2340
GCTAAATTGT TGGTGGGTGC TTATGTGCCT ACAAACACCT CGGGCGTTCG CGTGCTCGGT 2400
GATTTCGTGA GAATTGTTCA CGTGTACAAT GTTGGCGCCg CTTTCAGCAT TGGCCATCAG 2460
CTAAATCAGG TTCTGCGTAC GCTCGTGCTC GGTATCGTGC CGCTAATCAT TATGTTCCTT 2520
ATTGTTTTCT CCTATTTTCG CACTGACGCC TTCTGTCCTG TTCAGCGCTG GGCCGTGTCA 2580
GGGATTATCG GGGGAGGGAT AGGGAACTTA ATCGATCGCT TCCTGAGGCC AAACGGGGTG 2640
CTCGACTTTA TCGACGTAAA GTTCTTTGGC ATCTTTGGCT TTGAGCGCTG GCCCGCTTTT 2700
AACATTGCAG ATGCGGTCAT CATGACCTGT GGTTTGCTCT TGATCATTTC GTTCATAAAA 2760
CAAGAAAAAG AGATCAGCTC CCAACCCTCC TGCAATGAGA CGGGGGGCGT TTTTCGCACG 2820
TAGAGCTGGG CCGTGCGCGC ATGTCCGCGT CGGCCGTTCT AGTTCGCGTG CCCCTGTGCC 2880
CGCAATGGTT GCTTTGTTCT CCGCAAATAC CGCGCGTGTG TGCCGCGCGT TGCgcTtCCG 2940
GCGTACCAGG GCGGTACgcG CGAGsgcCTC ACAGCACTCA GGATATTAGC CCATGCAGAT 3000
CTTCGATACT CACGCCCACA TCGGTCTTAT TCACCCAGAT CCCGTAGAGC GGCTGCGgGT 3060
AGTACAAGAG GCACGACGAG CTTCTGTCAC CCGCATCATG AGTATTTGCA ACAGCCTTCA 3120
TGACTTTGCC GCCGTATACG AGACGCTCCA GTTCTCACCC TCTGTCTATC ACGCCGTAGT 3180
GTCTCCCCTT CTGAGGTCAT GGCCCCGGGG AAGGATTGGA TAGATACTAT TCAAAAAAGC 3240
CTACAACTCC CTCAGGTAGT TGCCTTAGGC GAGACCGGAT TGGACTACTG TAAAAAGTAC 3300
GGTGATAAAC GCTCCCAGAT TGGGCTTTTT ATCACTCAAT TGGATATTGC TTCAAAGGCA 3360
AAAAAACCAG TTATCATCCA CAACCGTGGT GCGGGCCAGG ATATCCTGGA CATCCTCAGC 3420
GAGCGCATTC CCGACCAAGG CGGTGTGTTC CACTGTTATT CTGAGGACGC AGAGTACGCA 3480
CGTATGGCGC TGGATTTACC TGTGTACTTT TCTTTCGCGG GGAATTTAAC TTACCGGAAT 3540
GCACGAAATC TCCATGAGAC TGTATTGGCC CTCCCGCTTG ACCGAATTCT AGTGGAATCC 3600
GAAAGCCCGT TTATGTCCCC CGCCACGTAC CGCAACAAGC GCAACCGACC GGCGCACACA 3660
GTTGAAACCG TGGAGTTCAT GGCTGAGCTC CTTGATATGG ACATGCTTGA GCTTGCCGAC 3720
CAGCTGTGGA AAAACAGCTG TGCGTGTTTT CACCTTCCTG AGTGAGCAGC AGATGCAACA 3780
ACACGCCTTA TATCATCCGG TTTCTATTGG CCCGTTGTCT CTCAAGGGGA ATGTGTTTTT 3840
TGCTCCCGTT GCAGGCTATT CTGACAGTGC GTTTCGTTCA ATTGCCATTG AATGGGAAGC 3900
AAGCTTCACC TACACCGAAA TGGTTTCGTC TGAGGCGATG GTGCGCGATT CACTCAATAC 3960
CAAACGTTTG ATTCGGCGCG CGTCAAATGA GACGCATTAC GCTATCCAGA TTTTTGGTTC 4020
TAATCCTGCA GTAATGGCAG AGACGGCAAA ACTAATCGTC GATAGCGCGC AGCCGTCCTG 4080
TATCGACATC AACGCGGGAT GTCCTATGCC TAAAATCACT AAAACAGGAG CCGGAGCCGC 4140
ACTCACCCGA GAACCGACGC GCCTCTATGA AGTGGTAAAG GCGGTCGCCG ATGCTGTGTa 4200
CgcGCAAGAC GCGCGTATCC CAGTGACAGT AAAAATTCGT GCTGGGTGGG AAGAGGCACA 4260
CCTGACATGG AAGGAAnsTG CGCGTGCGGC AGTAGACGCA GGAGCACAAG CGCTTGCGTT 4320
GCACCCgCGC ACCTGCGCGC AGTGTTACGC GGGAGAGGCA AACTGGGACA TAATCGCAGA 4380
CCTCGTGCAG TGCGCGCGTG GGTGGGGAGA GGTTCCCGTG TTCGGCTCAG GGGATCTGCA 4440
TGCGCCTGAA GACGCACGGG CAATGTTAGA ACACACCGCA TGCGCGGGGG TTATGTTTGC 4500
CCGCGGTGCT ATGGGCAACC CGTTTATTTT CAGACAAACC CGTCAGCTTT TAACTGAAGG 4560
ATACTACACG CCCGTGACGT TTGAGCAAAA GcTACGCGCA GCCTGGCGCG AGCTTCACCT 4620
TCTGGCACAA GACGTGGGAG AAAGCTCAGC CTGCAAGCAG ATGCGCAAGC GTTTTGTTTC 4680
GTATGCAAAG GGTGAGCGGG GTAAAACGCA ATGGTGTCAG CGCGCGGTGC ATGCGTCTTC 4740
CTTCGCAGAC TTTGCAGCAG TCATTCGTGA CGCGTGTCCA TGTATTGGTT TATAAGTTGC 4800
ACGGCTTTTC AAACCGCGTG AAAAACGTAC GCTTCCGGCG TACCCCAACT TACTTTGTCC 4860
TACAGGACGC GCAGnTCCCT CGATAGAAAG CGTGACTATA TCTGTCCTGC GTGCAACTTG 4920
TACAAAGCCG GTTCCTGTGC CAGAATCGTG CATCCGCGGT GGCAGGGAAT CCTGGTGAAA 4980
GAATGTGTTT CTAAAGAAAA GCGAATCTAT GACCTCAAGC AGCTCCTAGA GATTTCTAAG 5040
AGTTTGAATT CTCTCCTTGA GTTTACTCAC CTGGTAGAAG CCATCCTCTA CGTCGCGATG 5100
GCCCAGACCA AGACGCTGGG GGCAGCGCTT TTCACCAAGA AAAACGCCGG TATGAAAAAA 5160
TTGTCTTTGA GCCGCAaTGT GTGCGGCTTT GACGTTTCCC ACCATGCACA GCTGATAATC 5220
TCGGAAGAGG ACCCTATTCT CAGACTTCTG GACGAAAAGG CCTGTTGTCT TTCTCCCgAA 5280
gAgGTACAGA GCGCGCTCGC CCCCTCAAAG AGCGTACGTT CGCTCCyTGA CTTGCAACCT 5340
TCGCTCTTTG TTCCACTAAG AGCAAAGGAC CACCTTGTTG GTCTTATCCT TTTAGGCAAG 5400
AAAAtCAACC TACACGAAGC CTACACTCCC TACGATCAGA GCATCATCAT GGATATTGCA 5460
CAGCTTGCTG CTATTGCCAT CAACAATGCG TTACTGCTTG AGCAAGCTAC CACTGACATG 5520
ATGACCCAGA TGAAGCTCAA ACACTACTTC TTTGCCATGC TCACCGCGAr CTCGATACAC 5580
TCAGTACACA AGAGACCGTA TCTGTTCTCA TGCTTGATAT CGACTTTTTC AAACAGATCA 5640
ACGACACGCA CGGTCATCTG TGTGGCGATC TAGTTCTCCA ACATGTGGCA GAAATTATTC 5700
GATCCTGCAC CCGTCCATGC GACATCGCCT CTCGCTATGG GGGAGAAGAA TTTATGCTCA 5760
TGCTATCCAA CAACTCGTCT CGGGaAGctG CGCACGTTGC AGAAmgCATT CGCGTGGCAA 5820
CCGAGCAATT GACCATCCCC TACCATGAGG TATCAATTCG AGTCACTGTT TCTGCAGGCG 5880
TCGCAGAATA CCTTCCTAAC CAAGAATCCG CCGAAACACT GATAAAGCGT GCAGACAGTG 5940
CGCTGTATCA AGCCAAACAA AATGGCAGAA ACAAAGTCGT CATCTCAGAG AAAAACATGT 6000
GCTCATCTCA GGAATAAACC GATACTGGCG GCATGAGTGT GATCAGGAAG CCCTTCAGGT 6060
ACTCGTACAC CAATGTGACC CTTTCCCTTG TGCTCGCGAA TGGGGCGGTG TTTGTGATCA 6120
CGTCGTTGGT TGAATCACTG GGTATATATC TGGCGCTCGT GCCAGGACTC GTACGTTACC 6180
ACCGTATGTA TTGGCAAATA TTCACCTATC AGTTCGTACA CAGCGGCGTG TGGCACTTGC 6240
TTTTTAACAT GCTAGGACTA GTGTTTTTCG GGCAGACGAT AGAAAAGAAG ATGGGATCTT 6300
CTGAAATGCT GTTGTTTTAT TTGCTTGTCG GTACACTCTG TGGTGCGGGT GCGTGCGCGG 6360
CATATCTGTG TGTCGGTCGG TTGAACGTAC TGCTGTTGGG GGCGTCGGGC TCCATCTTCG 6420
CAATACTTTT TTTATTTTCG GTTATGTTCC CCCACTGCGC TCATTTATCT ATGGGGTGTT 6480
ATTCCTATCC CCGCTCCTCT GCTCATTGTA GGATACATTT TGTTTGAAAT TTTTGATCTA 6540
TTTTTCTCTC GTGATAATGT TTCTCATCTT ACCCACTTGC TCGGTGTCCT TTTTGCGTGG 6600
GGATATATCC GTATCCGGTT TGGCATCAAA CCATTGAAAG TGTGGAGCAT TGTCCCGTAA 6660
CAGTCGAGGC AGTGGGAGAT ATGTCTTCGT CGTGCTAGCC TGCGTATTTG GTTATACGCG 6720
CGCCGTGCAC GCTGAGGTTT ATACGGACCC CAGCACATCG GGACATGTCA CGATTTCTAT 6780
TCCCATATGG GCTTyTGTCG AGCCCCAGCC GGGTGTCATG ACCCAGcAGs GGAGTCCCCG 6840
AGGACTCCGC CTCnCCAGAC CTTGCGAGAA TTAGGGGCGT TCGTATTAGG CGGTGCTGTG 6900
TATGGGTGGC GGTTCTCTTA TACGCCaAAA GAAAAGAAGC GCGCCGTCAT GGAGCACTTT 6960
ACCCTCACTC CCATTTTCCC CCTACCGCCC GATAGTCCTC AGATAAGTCT GCGTCACGTA 7020
CGGACGCCGT ACCCCTACAT CCAtGCCGTG CAGAGTACTC ATTAGACGCC AGGCACGCGA 7080
CACACATGAG ACAGAGCAGA AACCTAACGT ACCAACGTGC GCAGGGCAGA GGAAGAGGAG 7140
AACGGAAAGA GGAACTAAAG GGAGTATATC ATGCATATCA CCGCGCGATT GTAGACGCAC 7200
TACGGAAAAC GGTTAGAAAG ACACAGAAAA ACAAGCCAAA AGAAGTAGAA GGAATGCTAT 7260
ACGTTAAAGA CAATCCCCGC CTCTTTGTAG AGGCGGGGGA ATTTGTCGCA GAGCTCTCAC 7320
TCAGTGTCCA CTTCACAAAG ATAACGCCCT ATAGCGTATA CTAGTAGCAC GCACCGAGTC 7380
CTGACCGCTA CCCGCGTGCG AGCAGACGGT TCACCCGCTT CACAAAATCA ACCGACGAAC 7440
CTACGTCCAT GCCTTCAATG AGCAAGGCTT GATCCAGAAG AACAAACGCA AGATCTTCCA 7500
CAAACGCCTC ATCCGTACTT TCTTTTAGTT TTTGTACCAG CGTATGACTT GCGTTAATTT 7560
CTAAAATTGG CTTTATCTTT GATTTATGCG TTTGTCCCGT GGCGCGCATC AAGCGCTCCA 7620
TCTGCACCGT GGGATCATTC TCATCGATAA CAATGcAAGA CACCGAGTCA GAAAGCCGTT 7680
TTGAAAGACG AACTTCCTTC ACCGAATCAG ACAGTATGTG CGTCAACCTT TCTAGTAGCG 7740
GCTTAAAACC CTGTTCCCTC TGCGCGGCGG CGTCTGTTTC TTCGTTGGGA CGCAACTCCT 7800
CCTCTGAACC TAAACGATTA ATTGCCCTTA ACTCCCACTC CTTGTATTTC GAAACAGAGG 7860
GCATCACGAT ACCATCTATG TCGTCTGACA TAACGAGCAC TTCAAAACCC TGcAAACGAT 7920
AAGACTCTGC ATGGGGAGAC TGACGCAGCA CACGATCGTC GTTTCCCGCA ATGTAGTATA 7980
TCGCCTTTTG ATCCGGTTTC ATGCGAGAAA CGTATTCGGC GAAcTCGTCC ATCCGTCTTC 8040
TGGAACAGAC TCACTTAGAG TCCTGAAACG AACAAGTTCC AGCAGCTGCT CACGGTGCTC 8100
GTAGTCGCTG TATAAACCCT CCTTCAAGGG ACGATTATAC TGCGTGATAA ACTCATCGTA 8160
CTTTTTCCCG TCACACTCCG CGAGTCTCTT AAATTCCCCG AGCAACTTTT TCACCGAAGC 8220
CGACTTGATT GCTGCAAGGA CTCTATTTTG TTGCAGAATC TCACGGCTTA CATTCAGGGG 8280
CAGATCTTCG CTGTCTATTA CACCGCGGAC AAAACGCAGA TACACTGGCA ACAGTTCCTT 8340
CTCGTCATCA GTGGATGAAA ACGCGCTTAA CGAATAGCTT TACCCCCGGC TTATAATCTG 8400
ACGTGAAAAA GGTCAAAAGn GCGCTTTTTG CCGGGCAAAT AAAAGAGCGT nGACGTACTC 8460
CTGTGTA 8467
(2) INFORMATION FOR SEQ ID NO: 62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4354 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
CTCTTCAATA ATGTCTTCCA TGCACGCAAT ACCCGAAACG CCGCCGTACT CGTCCACCGC 60
GATCGCAATG TGCACGTGCC TGCGCTTAAA CTCTCGCAGA AGACTGTCAA TTCGTTTGGA 120
CTCGGGGACA AAGAAGGGTT ACGCAGCAGT CTTTCTAACC GCACCTCCTG TGGCCTTCCA 180
AACAGCTTTA TTAAATCTTT GACGTACAGC ACACCCACCA CATTATCAAT AGTTTGTTCG 240
TAGACAGGAA AGCGTGAGTG TCCACTCTCG GTTACCTTTT CAACGAGTGT TTCACCGCTC 300
ATAGAAAGCT CAAGAAAATC CACGTCAATA CGCGGTATCA TCACCTCGCG CACCGAAGTG 360
tCAGAAAGAT CCACTATACC GCGGATCATA TCCTGCTTTT CTTCATTCAG CGGTTGCTGA 420
AAAATATGGG TAACAGCGTG CCTGCGCCTC AACCAGTCTA TGACTCCCAT GGTATACCCG 480
ATGATAGCAC CCGACAGTGt GCGCCAGTAT GCGCTCCTGC AAACGCAACA TCTCTTGACC 540
GGGAGAATTG TCCTGGTGAT CCATACCGCT CAGATGCAAA ATGCCGTGGA TGAGCACCCG 600
TTTAAATTCC TCGTGCGCGG CAACGTGAAA ACGTTCACTG TTTTCACGCA CACTTTCAAG 660
ACTGATGATA ATATCACCAG CAAGAAAAAA ACGCGTCCCT GCGTCATCGC AATACTCACC 720
ATCGTTCTCA AAAGACAGCA CGTCAGtGGG AGAATCAATA CCACGGTAAT CGTAATTTAG 780
CCGGCGAATA AACGCATCAG TGCAGCAGAC AATGGAAAGA TCCCAGTGGG AAATAGCCTG 840
GGAATCGAGC ACCGCACACA CAAACGGCGC AACTTGACCA ATCCAAGGAG GCGGACAAAA 900
GCcTTCGCAG GAAACAGAAA CTTTATTCAC CTCGGACATA AAGATTACTC CTTATACGAT 960
CCTTGGGCTA CGGACACGAG CTGCTGCTGA TCAGAATCTC TTTGGTGAGG ATACTCTATG 1020
CGGGAATGGT AGTATCCTGC CAGTATTCTC ACAAAACACT CCTTGACGAC CTGCACATCC 1080
CGAAACGTTA AATCAGAATT GTCAAGCTGa TGTGTTTCTA TCTTCTGTTG CACAACCTTA 1140
TCGATAAATT TCCCTAGGCG GGGGATCGTC GGTTTATTCA ATGTCCTACA TGACGCTTCA 1200
ACCACATCAG CAAGCATCAC CACCGCAGAC TCCTTTGTGC GAGGAGGAAC CCCCGGATAG 1260
GTAAAATCTT CCCGATCAAC ATTCGGATCG AGTTCCCGCG CCTTCTCGTA AAAGTATGTA 1320
ATAAGACTAT TACCGTGATG CTCTGCAATT ATATCGATAA CCTCCTGAGG TAAGCGGAGT 1380
TGATGTGCCT TTTCTACCCC CAGCTTTACA TGACTCCGAA TTACCGTTGC AGAAAGCCGT 1440
GGATTTAAAT CTAAGTGTTT GCTATCGCCC GTTTGGTTTT CTACAAAGTA CTCACCGTTT 1500
TCCATTTTTC CAATGTCATG ATAATACGCG CCAACTCGCG CAAGGAGCGA ATGAGCCCCA 1560
ATGCTACGAC ACGCATTTTC TGCAAGAGTG GCAACCATCA TGGTGTGATT GTACGTACCT 1620
GAAACTGTAA GCAGCATTTT TTTCATGATA GGAACGTTGA GGTCCGAAAG CTCCATAAGC 1680
CGGAACACGG TAGGAGCATT GGTGAGCGCT TCAAGGATGG GCAGAAGGCC TAACACCAAA 1740
ATGCCGTTGA GAAAGCCACT GATCGCCACG CCTGTAAGGA GGAATATTGC GTCAGTGTAC 1800
GCATGCGGAA ACGCAAACAT GAGCGTAGCA GnCAAGGAAA GGCTGAGCAA CGGCAAGGAC 1860
ACAGGAACTT TTAACAATGT CGAGCCGAGA GCTCATAACA CGCATACAAG CAGACGCCGA 1920
CACCCCAGAG AGGAGCGCAA AAAGCGTAGG CTCaGTATGG AACTGTGAAG CGATGAGCAC 1980
TGCGAACGCA ATGAGAAAGG AACTAGTGAC GGCACTACGA TGGGAAACGA GCGCGGTAAC 2040
GAGCATGATA CACAACGCaG TTGGCTGAAA AGGAATGCTA TCcAAGCGGT GCAGGGAGCG 2100
CAGCTATCTT TGAAAGAAAA AGTGTGCACA GGTATCCGGC AACGCTGGTA TAGAGAATGA 2160
GTAACTCTAC ACGCAgTTTA AGAGGAGGAT GGGCCATCCG TTTACTGAAC AAAAAGAAGG 2220
CAAGCAGATA CAAAAAGGCC AGTAACAGGA GACTGCTTAC GAGCAGGGAG CGATCGACAG 2280
ACAGTTTAGA GTGTGCAAGT GCCTGCAATC TTGCGTAGTC AGTGGCGGAT ACGATAAAGC 2340
CGCGACGGAC TATAATTTCG TTTGGATGAA TACTGAGGGT GACCGGTCGT AACCGCGCCA 2400
ATGCGTTGCG GACATGTCGT TCACTTTGAA TAGGGTCAAA GACAATATTT GGACGCAGAA 2460
AGGGTCCGAG GGATGAAAAG AGGAGCGCCG CCTGCGACGT AAGACCGAAA TCGGAAGCCA 2520
GCGCATGGAC GCGCGCGGCG AGTTGATCGG ATCTGATGAG CGTTTCAATT GCCTGAGTCC 2580
TCGAAAAGAC ATCCCCTGCG GCATTTTCCG CCTCATCCGC ACTCGCATCG GGTGACAGGG 2640
GTACAGACGC AGAAGGGAGA TCTCCTGCTG AGGCTTGCGA CGCCACGGGG GCCGACATAT 2700
CCGACGGGGC GTGGAAGGGA CGTTCCTCAC TGATAGTAAT TGTGTGGGGG TTAAAATCCT 2760
TAAGCGCATG GTCGGACAGC TGCACCACAC CTTGCGCGAA GATACGCGCG AGCACTTGAG 2820
TTCCCACACG CAGGAGGGAC TCAAACGTGT CATCGTCAAG CTGAAGCAAG GATCGCAGCG 2880
TCTGGCGCGA AAAGTGAACA AATTTCTGCT GCAGCAGGTG CACGTGTGCA GACGCCCGAT 2940
CGTGCGCAgC GTGGaGTGAC CGCTCCTCCT CGTAGTCCGC TGCTCCTCCG CCTCCGTCCG 3000
ACGAGGTATC CAGAGCCATA CCAACGCGCG CTTTCTGCAA CGCATGACAA AACGCCTGGT 3060
ATGCGCGTAC TTCAGCCTGT TCCAGATCGA GCCGACGCTC AAAAACAGCA GGAATTTCCT 3120
TCTTCCTGCG AGCATACTGC CcGCTGGGTA rCCAGCTCAT CAGTAAGGGA AAGAAAGmCA 3180
GGAGAGACAA CgTTCCgcTC AGtACACGCC CTACCGCAAA TCAGCAAGTT CAGTCTTGCA 3240
GGGTCTTGCT GTTCGCTGAT GCTCACCGCC TTGGCAATGC TGAGGACAAC GAAGGAGAGC 3300
GCCAGATTGA GCGCGCGCGC ACCCgGCGGC GAAGTACGTG ACACAGCGTA TGCCACAATG 3360
CGTAAGACGG GkTGGTCCTT TCTTCCTCAT GCGCACTTCT CGCGCCAGCG AGCrTaACAC 3420
GCCAGCGGTA ATCTGTCCAG CAAGGACACA CGGACGCCTT GCGTACCCCA ACCGCGAGCC 3480
TTGACAGAAC ATACCCAAAT ACCGCACCAT CGGCCTCCGC AATGAGAAGG AGTGCGACAG 3540
ACCGTGAAAG GATGCGCCGT CACCATCGAC CAGGTCTCAA AAGCATACGG TCACTGCCTC 3600
GCCGTTGACC GTGCCACCGT TCACATTCGG CAGGGAGAGT TTTTCTCCAT CCTCGGTCCT 3660
TCAGGCTGCG GAAAGACCAC GCTTTTGCGT ATCATTGCAG GGTTTGAACA GCCGGACTCA 3720
GGAGACTTGA CCTTCGACCA CGTGAGTGTG CTCGGTGTTG GTGCAAATAA GCGGAGGTCT 3780
AACACCGTTT TCCAGTCGTA TGCCCTCTTT CCTCACCTTT CCGTGTACGA GAACATCGCC 3840
TTCCCCCTCA GGCTCAAACG CCTCTCAAAG AACCTCATCg CGAGCGCGTG CACGAGTACC 3900
TTCACCTGGT ACAGCTGGAC GAGCACCTGC ACAAGAAACC CCATCAGCTG TCAGGTGGCC 3960
AACAACAGCG CGTCGCCATT GCCCGTGCAC TCGTGTGCGA GCCAGGGGTG CTCCTGCTTG 4020
ACGAGCCGCT TTCTGCCCTG GATGCAAAAC TTCGCTCCAA TTTGCTCATA GAGCTCGATA 4080
CACTCCACGA TCAGACGGGC ATTACyTCGT TTTTATCACC CATGACCAGA GCGAGGCTCT 4140
GTCCGTCTCC GACCGCATCG CCGTCATGAA CAAAGGAAAG ATCCTGCAGA TCGGTACTCC 4200
CTACGAGATT TATGAGCAAC CTGCGACTGA CTTTGTCGCT AAGTTTATTG GGGAAACTAA 4260
TAGCTTCCTG TCAACTGTCG TCTCCTGCAC CnCCATTGAA AACGAAGAGT TTATGCTCAG 4320
TCTCCAGGTT CCGGAACTTG ACCnTACGCT CACC 4354
(2) INFORMATION FOR SEQ ID NO: 63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21948 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
GATACTTCCC AATGGCACTT TcsGGTCGCt GcTTTTtCyT CACgTTaACA GCGAACGTAT 60
TGATTTTAaT ATCCACCTGC CAAAAgGAGG TTCAtTACAG GACTATGCTC ACATCCGCmA 120
CACACTCAGC CGCAGCGTTG CGCACTTCTA CCGTCAGTGC ACTATTGCTC ATACGTACGT 180
GCAGAACTGC CCACGCACtG CCACTCAGGG CAACGCGCCA ACACATTCCT CACCCCCCTG 240
CACCGGCGTA CGAGAAGAAC CCGCCGCTCC cTGCGCGCAC ACACCCCGGT ACGAATCCCT 300
GTTCCCTCTA CCCGTGCAGC ATGCGCACCT GCTTCCTCCG TCACCTCCTC ACATCTCGTG 360
CGAACACGCG CGCGATTGCA CTCACCCAGC CCCCGCTGCC GAAGGAGATG CGCCTGTGCA 420
CAACCATACC CATACAGGTG CATTCAAAGT ACTCGGACAG GTAGCAGGAA CATTCATCGC 480
CGTAGAACGC AACAACGCTC TCTACCTTAT CGATCAGCAC GCAGCACATG AACGCATTAT 540
TTTTGATACG CTACAGCGGA ACCTTGGCAC TGCACAAATA CTTCTTATTC CCTACCACAT 600
TCACCCACGC TCGGATGAAG AGGCGCGCAT CATGCACCGC GCCTGCACAG AACTTTCTCC 660
TGCAGGATTT CGATTTCACG AAGAACCAGA CGGTTCGTGG CACGTAACTG CGGTGCCGCT 720
CCACTGGCGG GGGAGCGAAG AGCAACTTGC ACACGATATC CTCTACTCAG GAAAAAACGC 780
GCACGACATC CTGCGCCACG TCCTCGCTAC CTGTGCCTGC CGGTCTGCGT GTAAAGACGG 840
CACCATCCTG GATGACGCAA CGCTCCACTC GTTAGTGGAG CAGGCTTTTG CATTACCACA 900
ATCGAGGTGT CCCCACGGAC GGCCCATTTG GATTGTCATT GGCCGAGACG AATTGTTCAA 960
ACGGATCAAG CGCACGTAAC GCGCTGCAGA TACGCAAAAA GAAGCCTGCT ACGTCTGCGC 1020
TCTCCGCGTC GGCACGGGGA GGTGCGCCGT GTGCACACAA ACACCACCAG TGAGAGGATG 1080
TACGGCAGCG CAAACAGTAC ACCGGTGGGC ACCACGTGAG TGCCCTGCAA TAmGTcaCAC 1140
ATGTGTTCAA TACCGGAGAA AAAAATCGCC GCCGGCACAC ACCACATCAT CCGCTTGCGT 1200
GCAAGAAAAA CAATTGCAAG TGCCGTCCAT CCTCTGCCTG CAGCCATCTG CGGGGTGTAg 1260
TACCGACACG CAATACTAAC AGTCCCCCCG CACACACCGC ACACACGCCT GTaCCGCCCA 1320
CGACACCATC CGATtACGCG CCGCGTCAGT TCCCCGCACC TGCAAGGTAA CGCACCTTCC 1380
CCrkAGTGCA TAAAATTGAT ACCCACGTTT GTAGAGTACA GATACAGGTG AAAAACCCAC 1440
ACCAGTGCAA AGGCCACCGC AGTCCCCCAC AGGGGGTGAG GTAAAACGCG GGTATGTGCA 1500
AGAGAAACAT GAGTGAAAGA GACACCATGC GCTGCCGTGT CCATCTGCAT CGCAGAAGCT 1560
GCAGCGCGTG CAAACATGCT GGACGCACCA AATGCGCTCA TCCCCATTGC AGAAAAGTGC 1620
ACTGCTATGC CCGTTAAAAA CGGATTTGCC CGCATACGCT CCGTACCCAC GGCCACAAAA 1680
AATAAACACA GCGGCACCAC ACACACGGTA ATACCCAGTC CACCCCAATA ACTTCCCCAT 1740
ACCAGTGCGA AAAACGCTAT GCAAAAGGAC GAGAAGGTAA TCACCCCTTC CATAAAAATT 1800
CCCAACACTC CCGCGTATTC TGTTGCGAGC GCTCCTGCTG CAGCGCATGC AAGCGGTGct 1860
GCGCGATGTA ATATTGCTAT CACTGTGGTG CCTATCACTC CCATCGAGAC CGCCTATGAT 1920
GTGTATCATG TACAGAAAGC GCATGACGTC TGCGAGTTCT GTGCTTTTCC CCACGAAAAC 1980
AAAAAACGGT GACAAGGAAT CGATATACCC GGCGTGCGCC TCGCCGCACT GCATTCCACG 2040
GTGCGGACCA TTGTGCAGAA ATAAGCAAAA AGATCGCCGC CTGTAAAAAT AGCACCACAT 2100
TTACCGTCAG gTGCGCACCA AGTACTGCAG CTTCAGAGGC TGTCTCCATC CACGCGAAAA 2160
AGAACGCAAG CGGTACGAGT ACCGTAATGT GTGCATGGGC AATTAaCGCG TGCGCTAAGG 2220
CTGCGTAACC CATCCCCACA GAAAAACCCA CATAGCAGGT GCCAAACAGC CCAACTACAG 2280
AAAAAAATCC GGTAAGCCCA AACAGCGCCC CTGAAAGCAC CATTCCCCAC ACATAGGTGG 2340
CCCATACGGG AAACCCTACA AAACGCCCAA ATTCGGGGGC CTTTCCGCAT ATGCGAAACT 2400
GATATCCTAC GCGGGTGTAC GAAAAAAAAC ACCCAACTGC GAGTGCTACT AAGGACGCAT 2460
AGGTCAATAC GGCCGGCACA CCGAACAAAG ACGTCTGTTG CTGCAATATA AAATGCGAAT 2520
GAACCGGCGC AGTTGCCAGC AAGTTCCCCG CAGaTCACGC GTAACCGTTA TAATCAACGC 2580
ATCGATGAGA GGCACGCATG CGGTGGATAA CAAAAAGGAA GTAATCATTT CGCTAGTTGC 2640
CAGCCATGCT TTTAGTATCC CAGAAACACA GGCTAATATC CCCGCGACCG AcAGCGCACA 2700
GAGGAGCGCA ACACTCCATT GCAACAAAAA GCCCACACCC CAGTACTCAC GGAGCAACAA 2760
TGCGGTGACA AAACCTGCAG CATAGATCTG GCCATCACCA CCTAAATTGA TCATTCCTGT 2820
TTTTAGCGCG CAnTCGCCCC CAGTGCCATA CAGACAAACA GTCCTGCTTT GTGAAACAGG 2880
GCACGTATGT AGCCACGGGT AGAAAAAGGT TTGAGAAAAA ACGCTGCCAA AGATACGGAT 2940
GGATTTTCCG AGCACAGAAC AATCACAGCA CTCATAACTG CAACACCGAG CAACACTGCG 3000
ATACACGAAT TGATCACCCG TTTCACGTAT GAGAATCCTG AGACGGAGAC GGAGTGCCTG 3060
ACACTTCAGC ACACAACGTA CCTGCACGTA GCAAGAAACG TTCTGTGCAC AACGCACGCC 3120
ACTGTGCCTG ATGCTGTTCT CGCGCAAGGA GCACAAGAgc AGTTCCTGCC TGTGCTACCT 3180
GGCGCAGACG TGCAAGCAAG CGCTGTTCAC TGGCGCTATC CAATCCTTCT GCAGGTTCTG 3240
CCAAAATGAG AAGACGTGGA CGCGTTGCAA GctCACGCGC TAAAATAACG CGCTGCAACT 3300
GTCCGCCTGA AAGCGTACAG GCAGGCTGCA ACGGATCGCA GTAAATTTCT TCTTCTGCAA 3360
GAAGACGAGC AACAAAGCGC ATCTGGcgCg GCACACGCGT GCGCCACGTA CGCAACGTGT 3420
AGGGAACGAG CAAATCAAAA AGAGTTAACT GCATTGAGGC ACCGCGCTGT ATGCAATTAG 3480
ACGGCACACA CGCAACCCCG TGTGCCCGCA GCAGCGAGGG CGTATTGCGC TGGAGGGGGA 3540
GACACCACAC CTGATCGTGC TCCTGCAAAA GAATATTCCC GGTGCAGTGC GTACGCGACG 3600
CCCCAGCGTG CATATCACAC AGTATATCTT CCAATACGTG CACACCATCT TCTGGCGTAC 3660
CGACTATCCC TATGATAGCA GATGCGGCCA CAGAAAACGA AATATCTGTG AGCGGAACGT 3720
CTGCGTGTTT ACTCACCTGC AGCGACTCAA CGCGCAACAC CCAAGGACGA GCAGAAGATG 3780
TGCGCGGCAC AGTTGCGCAC GACTGGGTAT CTGACAGAGA GGAAAAAGAA CTTACGGCAG 3840
AAGAAGTCAC CGTTGATGCG GACATGAGCG CACAGGACAC TTTCTGAATA CATTCATTCA 3900
CCTGATGCGC AGAACAGTAT TCGTCTAAAA GATCCGTACG CAGAAAACTG CACGCTTTTC 3960
CCCCTTCTAT CAAAGAAATA CGCTGTGCCC ATCGCAATGC ATCAGCAAAT CGGTGCGTCA 4020
CTACTATCAC TCCGCCACCA CACCGGGGCG CGTGCGAAGA ACGCACAAAA AACTCTTCAA 4080
GATGAGAAAA GAAAACCGCA CGCGATTGCG CCGGAGCACA CCGCGGCTCA TCCAGGATGA 4140
TGAAACGCGG ATTGCGAAAC AATACACAGA GCAACGATAC AAAAAACCGC TTGTCTGCAC 4200
TCAAACATGC AACGTATTCT TCCTTCTTCA AGGGCATACG CCACTGGGCG ATAATGCGAT 4260
CTATGCGTTC TCTCACcTGT GCACGGCGCA CCCACCGCAC GCCGGTgAgT GCAGCACTAC 4320
CCATCACTAC ATTTTCAAAT ACTGTTGCGC GTTCTGCAAA TACCGGTTGC TGGTGCACTA 4380
TGCCAATTCC TGCACGGAGC GCATCGAAGG GTACGGAGAA GCGCTGCTCC TTTCCATCCA 4440
GACGGAGCTG CCCATGCGTC GGCACGCAAA AGCCCGAAAG AATATGCGCA AcGTGGATTT 4500
TCCTGCACCA TTTTTTCCCA ACAACGCGTG AATTTCACCG GTAAAAAAGG AAAGATTCAC 4560
ATnGCTGAGC ACGCTGTGCT CAGGrCsGTC CGTCTCACGC GCGCCCGAAC ACGGGCCATG 4620
CGCTGTGTGC GCATCGTCGA CTGCGCGCCT GCCAGGGTGA CCGAACATAC CCCAGACCCC 4680
GCGCTTTGAA CGCGGCATCA CGCGCGGGTA GGTTTTCCCA A TGGTGAA GCGAGAGCAC 4740
ACCGCGCGCA GATGCCCTAA CGCCGCGCTC AGCTATCATC AACGCACCGG CAACGTAAGC 4800
TCACCGCTTT GAATACGCCT GAGCAACGCA GACTGCCGCA CACGAATCGG TTCGGGTACc 4860
GTTTGCAGGT ACAAGGGATC CTCTTCAATG AAACGTACGT ACCCGTCTTT CACCCCCAAT 4920
GTCCAGGCTC CTGCAGATGG CAGTTCACCG CGAATGCAGC GCAGtcTGCT cATACGCAAG 4980
ACGCTCCTGT TCCATAACGG AACTGCCAAC TACGTAGCCC GGTGCCCTCG CATAGCCGTT 5040
ATCGTCAAAC CACGAAACAT AAAAACCGAG CTCCCGCGCG GCCGCAAGTA CTCCCTGATT 5100
CGCACCGCCG CAAATTGGCA TCATAACATC CACCCCTTCG TGAAAGAGAA TCCGTGCGAG 5160
GTCTGCACTT TTTGCAGCGT CATACCAGTT CCCCACCACG CGCACATCGA CTTCAAAGGC 5220
AGGATCTACT GCACGGGCAC CTGCGAGAAA GGCAGGAATA ATAGTCTGGG TCATCACCGG 5280
ATACGACTGC CCCGCAATAA GACCGATTTT TTTATCTGCA TTTGCAAAGC GCATAGCACT 5340
CGCACTCACT AACGCGGAAA GGTGTCCTGC AAGGTAGGCT TGCTCCCACT GGTTATAGCG 5400
AAAGGTAATC AGCGAgTGCT CtGCGGCGCG TAGGCATCTA GAACCAAAAA CCGCTGCAGG 5460
GGAAATTGAC GCAAAATAGG CTCAAGGACG TGCGGGAGTG CAGGGTTGGA AGACACAATC 5520
AAACGATAGC GCTGTTCTGC AGCAAGATGC GCCAACTTTT CGCGCCAGAG CGCCTGGTTC 5580
GGCCCCGCTT CGATGATATC AAGCCCaATG TGCGCCCTGT CGCGCGTTCc TGCGTAACTG 5640
CACGCTCAAC ACCGTCACAC AACATTGCAT ACACAGGACT GTCGTGACGA AAACCTGGGA 5700
CAAAAACGGC AATAgCACCG CGCGCTCATC TTGCACCGCA GGCCTACACG AAAAGCAAGT 5760
AAACACTGCA ATGAGCGCAC TGAGAACACA CACCGCACCG TTCATAACAC CTCCCCCCAA 5820
AAATCCCTCT CTCGCGTAGG GTGCACCcTA CCGGCACCCA CAGCCTGAAA AGACCAGAGC 5880
ACTACTCCTC ACCCCTGCCC CCAAACGCAT TGCACCACCC AGAGAGAAGG AGAAAGACTA 5940
GTTTTTCACG CGTTCTTGAA AAACGTAGCA TCCGCTAAAC TCTGCGATCT CACGGATGCT 6000
CTTCATCCAA TATACCGCTT CTGTACGATT AGTAAAAGGA CCGACACGCA CACGGTGACG 6060
CAAGCCTGCA CCCGTGCGTT TCGTGAAGAT TTCTGCCTTC ATGTGTCTAG CTGCAAgCAC 6120
ACCCCGAGCA CGCTCGGCGT TGAGCTTACT TGAGAGCGAA GcGGCTTGCA CCCAGAAAAG 6180
GACAGAAGGA GCAGCAGGCT GCGTGGACGC ACCGCGCACG CGCGCATcCC GCTGTGCAAC 6240
AAGCGGAGCA CGGGTACGAG TAGCGGGAGG CGACTTGGCA GACGCTCGGT CACTCCGTGC 6300
ATGCTTTGAC GCACCCTTTG ACTCTGCAGA TTCCCGAGTG TCAGAGGCAG GAGAAGTTTT 6360
CCTGTCCTGC GCCGCGTCTG TGCGCTCAGC AcGCGCCGGA GGAGAAACAT CAAGACTTTT 6420
CGCACGAGCG GTAGGGACAT CCTTTACCAC CGTGAGATCA GGAATTGCCC TCTGTGTTGG 6480
CGCAGGAGTC TTCCCCAGCT CAGGAATTTT TTCAGGATTT TTTAGCCATA AGCTCGGATC 6540
CACTGACGGG TGCTCAGGTA CGTCACCACG CTCAACATAA GAGGACACGT CGGTCACTCC 6600
TGCAGAATTA GAAGAATAGG TAGGAGAGTA CAACAAAAGC GCGACGCCAA AAATAATGAG 6660
CATAAACACA CTCAGGGAGA TGACAATCCA CAAAATTCTC TTCTGTTCCA TATTTACTCG 6720
CCTAAAACAC CCCTGCGCGT CAGAATGCAA TCGACTTTTC GCGCCAATGA CGCAACACGC 6780
CCACAGTTGG CCACCGTGTA GGTATCGGCG TTTTGCGCGA TGCTTTGAGC ATAGAAACCC 6840
TTCTGCGCAG AAAACCTGCG TAAAAGGTGA GTAAAAGATA CTCGTTCCCG TTTCTTACAT 6900
CGCCACATGC GCAGTATGCT TGGCGCCCAT ATGTACAAAA CAAAACTACA GGCTTGTAAC 6960
AGTTCTGTTT TATGCAATGT CGGCGCATTC AGTACTATCG CTTTTGGACG CGCAsCTGCG 7020
CGCGCGCTAT GTCTTCACAC AAAAGACGGG TAACTTTCGG TAACAAAAAC TCTTCGTGCT 7080
GCTGCAGTAG TTTAGGCTCA GAGAACAGTA ACACACCCAA ATGCGCAGAG TGCAGGCCGC 7140
CATCTTTACG CCGTAACGAA AGGCCGCGCG CAGCgCAGCC ACTTGAAACC GCTCTACTAT 7200
AGAGTCGCCA TAAGATTCCA AAAGTTCAcG CGTGCGCGCA TCCGCGTCAA TACAGTAACA 7260
TCCCCGTTCT GCAAGCAGGC GCGAGACCAC ATTCTTCCCC GCACCACTTC GACCGATGAC 7320
ACCAATTAGT GGACAAAACT CGCGCACAGC GAGAGCGTAA CGTCAAACAC GCACCGTCCT 7380
CAAGTGTTCA GAATGTGCTG AACACCCGAT ATTCCTGCGT ATCTTGCGCT GACTACCCCT 7440
GTGCCTCACG CGCAGCACGC AAAAGACGCT TCTGTGCATA CTGTTCGTGT CTGGCACGGT 7500
TGGGAATAAA CTCACGCTTG AGACGACACA GCCGTTGCGC TGCTGTTTGC ATATCCGCAT 7560
CAAAATCCAA GGCAACACAC GCGAGCGCCG CGTTTCCTGT CAGTTCTGCG TCATGAATTT 7620
CTGGTAACAC AAAACAACGC CCACTGACAT CAGCCTTAAG CTGCAACCAA CGACTATCCT 7680
TTGCCTGTCC ACCTGAAACG GTGTATACCG GTTGCAACTG GGTCACAGAC TCCAGATACT 7740
CCAGACCTGC ACACACCTCA AAGGCAAGAT CCTCTACCAG CTGTCTCCCT TCTCCCACCA 7800
CGGTGGGGGG ATACGCATCA TGCAGCTGAA AAGGTAACGC CATAATTTGA CCCATCCGTT 7860
CCCCAAACCC TTGAGCATGG GTGCGCCGCA TATAAGAGGC AAAACGACTC CCTGAATCCG 7920
CAATTAAAAA CGAGACATTC CAAAAATCTG CTCTCAAAGA CGGTAACACG CGGACCCTTG 7980
CATCCGCAGA CCTGATCGCA GGAGGCAGAC GAACACATAC ATTTAACCCC TCGCTGGATC 8040
CTGCCCGATC GCATCCGCTC CCTGCATGTA ACGTATTTGT TCCAATGAGT GCTGCCGCAA 8100
AGTCCGGCGC GCCGCAGACT ACCGGTATTC CCCGATAGGA GGCAACAATT GAGCCAGGAG 8160
CCACGAACGG CGCAAAGAGC GTTTCAGGTA ACGCACACGC GCGTAAACTC TCAGACGTCC 8220
AGTACGTCGG CATGTAACGT CGCTCTGGTA ATACCGTAAC CGCACACCCC GTGAGTCGAT 8280
ATATCAAGTA TTCGTGAGAA GAAAGAAAAA ACTGTACATC ACGCGCACAA AAATGCAATC 8340
GCTGCAAAAG GAGCAAAACC TTTGGCAAAA AAAGAGAAAC ACCGCACCGG GGATCAGACG 8400
CCCCCGCTTG ATTCCACAAA ATCAGTTGAT CCTCGGCATG ACTCTTCTTG TGCACGGCAA 8460
CGACGCTCGG TCCATTGCCC GAAATAGTAA TGGCAATAAC GTGATGCACG GCACGCAACC 8520
GCTCAAACAC CGTAAAAAAT GAACGCACCC AATCCTGGGC CTTCACCGGC TGAGGAAAGA 8580
ACACGCGCTG GTACTGTAAC ACCTTTCCAT CTTGGGAAAT AATCGCCGCT TTTAGGGAAG 8640
ACGTACCGAT GTCCGCGACA AAGATCCCCG AACCCATGTC CGTTGCGACG CTACCCTATG 8700
CGCGCGGTGT CCCCGCAACG GACAAGCGAG CGTGTAGGAG TCGCTCGATG ATACTGCGAT 8760
CATCCAAGAG ACTGCTCAAC GATTGATTAT GATGAanTGC GCGcaTGACG CGcGCGAACA 8820
ACCCTGACAC ATCGGTTTCA GTATACCACT GCTTGTGCGA AAGCTGTGTG TGGAACACCG 8880
CATTTGTGCC AATAATCCGA GAGAAATACC GTTTCTCGTA CGCTTCATCA AATAACTCAA 8940
GTGCATTTCC CGTAAAGAAA GGTAAACTCA CTGCCGCGAT CACCTGCTTT GCTCCGCGAC 9000
TTTTCAAAAA TTCCATCGCC TTTAGCATCG TACCTCCGCT GCCAAGCATA TCGTCAGCAA 9060
TAAACGCCGT CTTCCCCTCC ACATCGCCGA GCAAGTTAAT TTCTACAATA TTGCTCTGCT 9120
TTGCATTTTG CGCGACCACC GAATAATCAC GCACCTTATA AATCATCGCG AGTGGCTTTT 9180
TTAAACCAGA AGAATAAAAT TTATTCCGTT CAACCGCCCC GCTGTCCGGC GCCACTACTA 9240
CAAAAGGGAT ATCGGGGTCA GAAAGATTTT CAATCTTTGC CAACTCCCGG ATAATCTGAT 9300
AACTGGCGTG TAAGTTTTCA AGCCGCGTGC GATGAAAGGC ATTTTCAATC TCACGTGAAT 9360
GCAAATCAAG AGTGACAATG TGACTCACGC CAAGATACTC ATATACACTC CCGAGCAAAC 9420
CCGCCGTCAG TCCCTCACGT CCACACTTTT TGTGCTGACG GCTATACGGA TAAGTGGGTA 9480
AAACCAAGGT AACGCGCCCA GCTCCCGCGT GCCGAACTGC ATCTATGGTC ACAATGAGCA 9540
TCATCACGTG ATCATTCACG GAGAATATTT TTTTACTTTT TCCGTTATTC ACCAGGACTG 9600
GTTGATGATT TTCTACATCT TGGAAAATAA AAACGTCCTT GCCACGAATA CATTCATTAA 9660
TTTGCGTTTT TAACTCACCA TTTAGAAAAC AGATAAACTG TGCATCCACC TTAAAATGCG 9720
GTGGATTAAA ACGACGTACG TCATCGTGCG CACACAACTC TGTGGAAAAA AGGTCCCGGT 9780
AAAAGTTCGC ATCTCGTATC ACCGACCCCG AGTCAAGACC GTAGCGGTCC GTCAAACGGT 9840
CCATTCTCTG GTGAAACTTG CGTTCACACA CACGCGTCAA ATGTTTGATA GTTTCGTCCG 9900
CGAAGTGCTC GCCACCAGGA CAGGCGACGA TCGCCAAATC AGTAAACCCT GAACATCTCA 9960
TGCAATCTCT CCACACTTGT CAACGCGGAC TGGACAAGCA CCGTCTCAAC CGCACGATAG 10020
GGAGCCCATC CGCGCAGGCT AATACAACGA AACACACCCA AGCGAACACA CCTTGCATAC 10080
GCAAGGCACA GCGAACGCAC ACGCGTCATG CGCACAGGGC AACACTTTAC TTACTTATGA 10140
TAGTGATTTC TACCCGTCGG TTTTTTCTAC GACCATCCTC TGAATCATTT GGCGCAATAG 10200
AcTGctGCGC ACCACAACCG CGCGTATATA CATGCGCTGC ATCCACAACA CCTAATTCCT 10260
GCAGGTAACG TGCAACCACA TCAGCACGCT CTTCAGAAAT CCTCTGTTGA TCCTGCACAG 10320
ACCCCCGTCG TGCCGCATGT CCAGACACCA ACAACTCTCG ATCGGGAAAC GCGCGCAAAA 10380
GTTCTGCTAT TTTGCGCAGC TTCTCGTACT CAGAAGGTGC AAGGGATGCA GAGTCTGCGT 10440
CAAATTGAAC ATTTTCTATA CTGATGGTTA CTCCCTCTTC TGTTTCACGC ACcTTCGCAT 10500
CAGGCATATG CAAGTCCTTG AGCGTTTCTT GCAGTTCCAC CACCGTGCGT GCAGGATCAA 10560
AGCGCTCAGG TGCAAAATTC TTTGCCGTCG CAGTACCCTG ATATCTCAAA ACCGTTCCTC 10620
CACTCAGGTA TAAGAGAATG CGAAACTCAT CGTCGTACTC AGCTATGTTC CCAAGTTCGT 10680
TATCCCAATA CAAATTTTGC TTAGAAACGC CCGTAGTACG CACCGGATAC ATTCCTTCTT 10740
TTGCATTGCG CTGCACGCCA TGTGTCCGTT TGGGCGACTC ATAACTCATA GAATACGCCG 10800
CAGTAATGTG ATGATAGCGC CGACTACCAC GCTGAACTTC ACCACGGTAG GTATACGACA 10860
CGGTAAACGG CACAATGAAG GGCGTTTGGA TACCAAAGCC GTCACGCAGA TCATGCGCTT 10920
CTTCTGCTTC ATGCTCCCAG GTATCTCCAA CTTCGATATC ATAATCAGGA AACACCGGCA 10980
CATTCCGTAC TACCGGCATG AAGAATGAAC GATCAATATC ATACACACCA AATGCGTCGC 11040
GCCAGAAAAT ACTTTCGTAG TGCCTCCCCC AACGAAACGT ATTATTGGGA CTTTTCTCGG 11100
ATGTCATGAA GTGACACACA TACCGCGCCG CATCAGGCGC CGACCCGTGT GCAACAckTA 11160
CTTCGGAAAC ATGCACCGTG ATTCGATTCG TAATCTCCGC CGTGTGAGCA AGCGTATCGT 11220
TCACAAACAC ATCCTCGCGT ATCAGCGAGT TGATACGGTG CGTATCCCCC TTACGAAACT 11280
TGTaGCGCAA GcgCAGAGGG TACGCCGCAC TTGCCGCGCA CCAGACTGCA AGAACACACA 11340
GCATTCCCTT CCTAGAAAAC ACACTTCCCA TCATACACAC CGAACGGACC TATCCGTCAT 11400
ATTCCGCACG AGAAGTTGCA CCCCACCCTT TCCTCCTTTG AGAAGAGCTG TGATAAAAGG 11460
ACGCCCTATC AGCACCTGct GCGCACCCAA CTcCCstGct GCGCACAAAT GCGCATAACT 11520
GCGAATACCA CCGTCTACCC ACACTTCACC GGCACAGCGC GCTAACTCTC CACCGTATTC 11580
AAAAAGAAAG TCCgCTGTGC TTCCGCGCGC AGTTTCAACC cTGCCTCCAT GATTTGATAC 11640
GATAGCGACA TCCGGTTTCA ACTCTCTGAC TAATTCGACA TCGCGCGGCG CAAAAATTCC 11700
TTTCACCACG ATAGGAAGTT TTGCAAAACG TCTGACTGCA CGAAGGTGsG TAGGAGTCTT 11760
TTTCTCTAAC TGCACTTTAT CTCTCATGGT CACAATATGG TACGCATCGA TATCCACACC 11820
CACAAACTCC GCAACGTCCC GACCCCACTC GATACGTTCA AAAATTTTTT TGTTCACATA 11880
CGGTTTGATA AACACTGCAG CCTTCTTTTT GAAGGAACGC AACGCAGCAA TACCCGACTG 11940
CAGTTTGATG TCCGGACAAC CATCTCCCAC ACTCAACAGG ACACCcGTTC CGGACACCGC 12000
TTCGATGAGC CGATAATAAA ACGAGACCTC GTCAGGATAT CCGACGTTTT CAACTGCACC 12060
CGTCATAGGC GCAAGACGCA CGACCGGTAG TTCATGCCTT TGCGGAACAT ATTTTTTCCA 12120
TGCAAGGCAG TTCGCGATAA AATTCGCACT GTTAAAAACA CCCCCCATCC CTGGTAACTG 12180
ACCACGACAC CCGTATCCAT CACAACGGGC ACACAGGCGA CACTTGTACT CTGAACGCGC 12240
AGAACGCACC ATGCGATCTT CCTTTTCCCT AAAAAAGCAC TGCCCTCTTT ACTCTACCAC 12300
CCCATAGCGC TTCAAAGCCT GTGCAAAACC TGCATCATCA TTAGATTCGG CAATATACGT 12360
TGCGTGTTTT TTTGCTTCCT CATGACCATT ACGCATACAA AAGGAAACGC CCGCTGCTTT 12420
GAACATGACG ATATCATTCC TCTGATCACC AAACACGCAT ACTTGTTCCA AAGAAATGCC 12480
GTAGTACTGG CACAATA AT GCAGGGCATT CCCTTTATCC ACTCCCTCGG GCATTACATC 12540
AATCAGGTGC GGCATGGAGT GTTCACAGTG CACTCCATGC AAACTGCCAA TAAACTCAGC 12600
TGCTGTTTCA AGATCTGAGG GAGAAAAATT TTGCACAAGG ATTTTCATAA CCTGTCGCTG 12660
CCGATCGCCT GATAGTGCTT CTATGGGAAA GATAGGGATA AGCGCGCTCC CCCCACGACG 12720
GGCCAAACGA TTGTATTCAT GGAACGCCCG TATACGGACA CTGTAGTCCG GTGCATATAC 12780
ACTGTCAGAC GTATAAACTA GGTAATCTAG GCGATGCGCC ATGCCGTATT CCAAAACACG 12840
AGCAAATGCT TCAGCAGCAA AACATTTTTG AAAAATAGTT TTGCCCGTCT TAATTTCACG 12900
GATCATTCCC CCGTTATACC CAACAACCGG TCCAACCAAC TGCAACTGCC GCACATACTC 12960
CTGCACCATA GGCAGCATCC TACCAGTTGC AATAACAAAA GGGATATTCT TCTCGTGCAG 13020
ACACTTTACC GCCTGCTTAT TCTCAGGTGG AATGTTATTT TCGCTATCCA AAAAAGTACC 13080
GTCCATATCG GTAACCACCA ACCTAACCTG TGCGCCTGAC ATCAACATCA CCTCTGAACT 13140
TGAACAAGTG CACACCCAAA AAACAACACG GCAACTTCAC TTACTCGCAG ATCCATGCGC 13200
GTCTCCTTGC TTTATCGCGT ACACCTCCAT AGTGTCTCCT AATTGGAACC ATTTGCTCAA 13260
TGCCATCAGC ATCTTCCATT TCAACCCACC AGATTTTAAA TGCTTGTGAG GATAAAATCG 13320
CTCGGGATGA TGACCAGTCA CCACTATTTT CTTTACCGTA AAACCGTAGC GCAAGAGCTG 13380
CTTTTTTACC GTGCGCGCAT CCCATATTGT AAAATGATCG CAAGGACTCT GCGCAAAAAA 13440
AAGATGCGGA AAACGCATAC CAGTGACACC TGCAAAATTA GGGGTAGAAA AGGCAAGAAT 13500
ACCTCCAGGC ACTAAAAGAT CTGCCACCTT CCTGAGTACC GCCTCCAGGT CCTGAAAATG 13560
CTCTATCACA AACCACAGGG TAACGGCGGA GAACGTACAC TCTCGAATGT AAACAGATAC 13620
TTGTTGGGGA CGATTCGTGG CAAAGCTGTG CCGAATCACA AAGTCAAAAC ACTCAGGAAG 13680
TAATGGAAAC GTTGCAACAC AAGCAGGAAT GCAGAGTGTG TCTCGCACAT GACGCACCGC 13740
AAATTCACAG ACATCAACCC CAACAGCATT CCACCCCGCA GCCTTTGCCG CAGACAAGAA 13800
AGCGCCATAT GCACAGCCAA CATCTAAAAC CTTTTTATCA ACTGCAAACG AATTACCTTC 13860
ATCTGCACAA AAAACTTCTG CATATAAGCG TTCGATCTCC TCCATCCTGC GCGCACCATG 13920
CATTCGAATT TGGTCAAAGT CCTCAAGGTA AGTCTTGCCA TACTGTGCCT TGTATTCTTC 13980
AAAAAAATAT GACTCGGAGT ATCGAACTGG ATCTGAAATT ACAAAGGAGA GAAAAATCAT 14040
ATCGCATTCA TTACAGCGCT GAAAGGTTTT ATGTAATGCA CGACCGACAA CATCCACCGC 14100
TACCATTTCT CCACAAAAAG GACAGCAGTG TTTTTTCCCC CGCGCAACGc gCATGaTTTC 14160
ATCTGCCAGA TCACGACACT GCGAATGTCG CGTCACAACA TCGGGTATAA TAATTCCCTG 14220
CTGCATCTGA GACCACAACT CATGCGCAgc AGGnATCAGC AGATCGAATA ACAGAAAAAC 14280
CCACTGcACT CGAGAGTAAA AAATGATAGG GCGTTGGTGA AACAAGCAAC ACCGCCGCCC 14340
CTGCAGCTGC AGCCTCAAAT GCGGTAAAAC CAAAATGGGT AACTACCACG TCCCAGCGGT 14400
GGAGATGCTC TTTCAAATGA GGCAAAGAAG GGTATATGTG CACCTTACCG TCCGTCCGTT 14460
CAGACGCTTC ACCAGGAACT ACCACTGAGG TGTCAAAACC TAATGCAGCT ATCCGCTCAG 14520
CACACGAACG CGCGCGGCCG TGTGTATCTT CTGCCCCATA CACCACCAAT ACGGTGGTGA 14580
CGCCAGGTAT AGGAAAAAAT CTGCCGTTTG CCACAGGCAG ATCCTGCTCC TTTCTCGTCT 14640
CTGGAAGTGG GATAAACGCA CTATCACGTA CATTCGTAAG AGCGGCCAGC GACCTTCGAC 14700
CCGGACTTTG CAAGACGGGA AATACATCTA TCAAGTAATC CGCATTCAGA CGTCCAGATC 14760
CCCCCTCATC CAGAGCAAGC ACTGGTGCAG TACGCTGAAG TAATTCAATC TCGCACGTAG 14820
AAGTACGGAA GTTGTCGACC ACTATCAGTG ATGCATCCGC AGAACCGTGT CCACTATCAG 14880
TTCCTCAGGA AAGGGAGTAG ACAGTTTGCA CAGCAAACTA CGATCGGGCA CATACAAGCA 14940
ACAGCGCACA CGTCCTTGCA GTCGCAAGAC TAAATACGCC GCCCGATATA AATGCCCCGC 15000
TCCCTGTCCT ATTTTCACTG AGGGTACAAA TACCACCACC TGCCGCGCGT ACGCATACGC 15060
ATCTAAAATA ACATCGGAAG AGAGCGGAAA CGGACACTGC AACGTGCACA CGTACTCCAT 15120
CATATGCTGC GCTCGCTGAA AGTCTTCCTG CGTGTCCACC GTTACTCGCA CATCAGGGTG 15180
ATACCATGCG GCAGCGGcAG GTTCACGCAC ACACACGAAA ATACCCGGGC GGCGATGTAA 15240
GGCAGGTCCT ACATGCTCAC GGTCGTACGC CTCCAAAGGA AGACGATCTG CTAAAAGCAA 15300
CGAGCGCGCC TTTAATATTT CCACTCCGCT GCCGTAGGGA AGACCAGTGA AGGTAAAATA 15360
ATCTGGCTCG TCCAGTTCCG CATAACGCAG GAGCGCTGCA GCAGCTGCTT CGTGAAACAA 15420
AAAAGGGTTA TCTCCGGTAA CCCGCACGAC GGTTCGAATT GGGAACGAAT GCTCAAAAGC 15480
CTTAACTGCG ATACAAAAGC GGTGGAGCAC ATCTTCTGCC GATCCTGAAA TACAGTAGAA 15540
CCCATGCGCG CGCGCAACGG GTTCAAAATC TTTTTTAGAA TGTTCATCAC ACGCAAGAAT 15600
ATACGTTTCT GCAGGGATGA CCCGCGTTGC CTGCAACACG TAATGTATCA GCGGCTCCCC 15660
CATAAGGGGC AACAGCGCCT TTCCTGGTAA CCGCGTAGAA TCAACGCGTG CTTGCACAAT 15720
AACTGCAACA CCGGAACGAT CCTGCTCCAT TAAACgTGGT TCCCACTCAC TTACTAAACG 15780
CGCGGCAGTT TTTTTAAAAC GGTATCTATT CACCGCCACC CATTTCTGAG CGTCCCGAAA 15840
GAGACGGTAC GCCTCTATTA GATGCAGACC AGAATTTCTA AAAAATGAAA AAAACTGCGT 15900
GCCCGAAATA AAGGCCCTAT CCTTCTCTAG GAGTGGCGCA AGGTTTTTCA GATAAAAGAT 15960
GCGGTACGAA CCATCTGCAC TTGCATCCTC TTGCGGGGGA AGAGCCTCAT AAAACAGTCT 16020
TATTTGCGTA TTAACCTCTA TGCACTCCCC CCACAAATAA GAACGCAATC CAAAATCAAG 16080
ATTTTGCCAA TATGCATTCG CGATGGTAAA ATCAAAACCT CCGCATTGGA TAAACCGGTC 16140
CCGATGGTAT ATTCCAACAA AATCATACGG GTATATCGTA GGCGTATGGT TAGTCGTACA 16200
TTCCGTAGGC TGTGTAAAGA AGTCACTTTT TCGCAAGGCA GGGACAATTT GCGTTGGGAG 16260
TACAGTACTG TGGGAAGAAC ACAATTGCGG CGCAATACAC ATGTGCGTAT TCGTGCGCAA 16320
GATATCCTGT ATGTGCTGGC TCATACCTGC GGATCTATGC GCATGTCGCT CCACAGAACT 16380
AAAACATACA TGACATCAAG TTCTGCAACA CCTAAATTGA TCATTTCTCC AACCGAAATA 16440
CACTCAAGCg GTGTGATAAA CTTAACGAAG GGGAAACgCT CAGCAAGACC GGTAACATCT 16500
GGAGCTGCTG CACTCCGCTC CACCGATACA ATGGGCGCGA TGCCAAGGGT AGTCAGTTGC 16560
GCAAAAAACT CTGCCCGATT CCACCGCACT CCCCGATTGA GTACTACTGC ACCTAAAAAA 16620
GCAGACGCCA CCGGACAGAG TTGACCACCC ACCACCGTGT GTGCAAmATT TTTCTCGTTA 16680
AAAATTATAG GTATAGTACT CATCACACTG CCCACATAAT CCTTCATGTA CTGCACGAAC 16740
GTGCTGAAGA TACACATCCT GTCCCCGCCG CCAAATCTGT GCAAGATCCT GCTGGAAAGC 16800
ATTGCCCAGC GCGTGGCGAC AGTGCACATC TTCTTTGCAC AGCGGAACTC GACCATCAGT 16860
GAAAATAATC ATATCTCGTT TTAGATGCCA ACACGGA AC CTCTCCAAAG GAGATAGATC 16920
TGCAACCCGC CGATCGGGAA GCAATCCGCA CACATGATCA TACTTTTGAA CAATCACCTG 16980
CCCCACACGC TCCTTCCACG TGCGGTAAAA GGGCTCGAGC TCTTTTTCGT TTTCATTCAT 17040
ACGGAAAATC TGTGGCCACA GCACGCCCGG ACATTGCGCA TGTACCTGCA TTGCAAATTC 17100
CGTCGCTTCT TTTAGAAAAA ATTCCGCTTC TGACAGCGAC ACACGGTGTA CCTGACTGTA 17160
CATGCCTGAA CTCACCGCAT CTAAAAACAC AATCCAGCCA ATGGCAAAAG GAGTGCGCGC 17220
ACTGTTGCGC GCACAtTTCG CACAAATCAC GCACTACAGA CTCCTGcCAC CCCAACCCAC 17280
TTGTCTCAAT GAGCACCGAC AAACCCGGAT ACTTGAGAAT CTCACGTACG ACGTCACACA 17340
GTGCAGGATA CAACACCGGA TCCCCAAATA CCGAAAGCGA AATGACCGCA CGTTCTGAAA 17400
AGTCTGCAAT GCGCCGGATC AACGCACACG CTTCCTCTTT TGGCATCAGT GAAGCATTCT 17460
CCACCTGCGC AGGAAAAGAT ACTGGCCGAT ACAACGAAGA AAGCGGrTAC GCACGCGTCA 17520
ACTCAAGCGC ATAGTACGCA GGAACTGTGC GCAACGCATG CTCACGTGCG CTGATAAGCT 17580
GTGCGTgAAT TTTCTGCAGT GATATCGGTA AATGCAGCAC ACTGCAAGAA CTGCGCCTTA 17640
GAACTCGTAT AAAATTCCAG ACGCAGATGC CGCACATCAA CCGGCGCAAT CATAGTTTCT 17700
AGATCAAAAG AATTAATGTC TGTTTTAATG CTCTCAAAAA TGAACGAATG wCaAAACAAA 17760
TATGTGCATC CTGAGTCAAA GTAGCAAGGA TGGGGAAAAG TCCTGGCGCA ACAACTGCAG 17820
CAAATAACCC CTCCGGGTAG CCATCTGCAA AACTATACTC TGCAAGATAT TCGCGATGCT 17880
GCGCGTATAA CTGGGCGCTC GCCACACTAT CAATAAAGGG CGCGTCTGCG gCAGaACAAA 17940
GACAGCCTCA GGTTCTTCAC CCAATTGCGC GTATTCCGCA CATACGCGTG CAACATGCGC 18000
GAAGAAGGCG CTCACTCGCA TGTCATCCAG AACGTTCACG CGCAGaCGGg AAAAATAGGc 18060
GTACGCAcTG CATAAAACGC GCAACCTTCG cGGCGCTCGC gCATcCGCGT ACACACACAC 18120
CTGATGGCAA CCAGGCAACG CGTAAGCAGc TGTAACAGCA CGCTCGAAAG CGCAACGCCG 18180
CCCCACCCCG GCTGTACTCT CACAGCCCCT CACCTCCACG CACGGCACGA ACCCCGGCAC 18240
TGCTTTCATA CACGACTGct CACGCGTGCC ACACAAGGGC ACAAAGGCAT AATCGCTCAG 18300
ATCAAAGGCA CAGACCACAG CAACCGTTCC CACCGCCCCT TTATCGGCAC GCATGcAGGA 18360
GAACCTTGCT CAAAACTTTT CTGAACTTTA AAAGCACAGC CCCGAAGATC ACTCAACCGA 18420
GTGAGAAAAA GAATCATCAC ATACTCAGCT CATGCAATTC ACGCCTGCGT AAACTTCTCT 18480
TGCAAAAGGC GGGCAACTAC ACGGGGGACA AAAGTAGACA CATCACCACC GAAAGAAGCA 18540
ACCTCGcGTA CCATGcTGGa ACGAAGCGcA GCATaGCaGG GcTTTGCCGc CAAAAAAACT 18600
GTTTCTAAAC CAGCGTCGAG CGCACGATGA ACCCATGCAA GATCAAACTC CTGACAGAAA 18660
TCAGTAGCAT TTCTCACACC GCGAACCAGC ACACGCGCAC CAACATCTCG AGCGTACGTA 18720
ACCACAAGCG AACGCCAAGG AAAGACGTAC ACACCCGGAC GATCCCCAAG GACTTGCCGC 18780
ATCAAATCAA CGCGCTCACA TTCTGAAAGC AAATACCTTT TCTGAACATT GACCGCAACC 18840
AACACGTGGA CCTCTGcAAA AAGACTACGC GCGCGCAGAA CAAGATCTAA ATGCCCAAAG 18900
GTAGGCGGAT CAAAAGAACC GGCGAAAATC GCCTTCACGC ACGGCAACCC CTCACGTTGT 18960
TCAGGAACAT GCGGCGAACT ACACAAAAAA CAGGCAGCAC TATTTATATT CGCACCCACT 19020
CCGTCAACTC CTCGCGGAAA ATCGGTGTTC ACGCGAGCTC TCTACGCTTC CTAGCTACAC 19080
AACGTGCGCA CCGCAsGnCG CGAACGCTTC TCCTCCTACT GTGCTCTTCC CCCACCCACT 19140
GAGATCGCAA TAGCACGCGA CTTCCCGAGC GCCGCCCATC CATTGAGCGG AAAGCGCTCA 19200
TACAGCTGCA TGTAGGCAGC TGTAGCAGCA GCCGCGCGCC CTAACGCCTC TTCCATGCGA 19260
CCAACGTTAA AGAGAGCACG GGGAACGAGT GGAAAATCCT GCACGCGTGC ACTCCTCTGA 19320
TACAGCTCAC GCGCCTCCTC GAAGCGACCA CGCTCATCCG CGCAAGACGC TGCATTAAAA 19380
TAATACACGC CGGCCACGTA GCTCCTACGA GCACCATACG CTGCACGGAC ATAGGCCTGC 19440
TGTGCCTTCT CCCACTCCTT ACGCGCAAAG AAAATGTCCG CGACACAGGC CTGTGCGTAC 19500
GCATATGCAA AGCCGTCCCG CCACGGCGAA CTGGCGCACG ACTCAAGGCG CGCCAAAAGC 19560
GCATCCTCCT TCGCACGTAT GGAAGACCCT CCCCGCCCCT CCACTTCATG ACCAGAAGCG 19620
CTCACTGTAT TATCCGGTTT ACGCAATACG TCCCACTcAC GCGCTATGsk CgTnACCTCT 19680
GCTGAAGCAC GCGCGCGTAA ACGCGTCATA ACCAGCAAAC ACCCTGCACT TAGCCCTAGC 19740
CCCCCGAGGA TAGCGACGAG CACACCCACA AGCAACCGCC GGTGCAGTTC CAAGAACCGA 19800
TCAACCCGCA CTATCCCACG CCGCTGCTCA TGCATGGATC CCTCCTCCCC TCACAGAATC 19860
ACTCAACTCG AGAACCCTCA CCGACAGgcG cGAGCaGCGa AnmCCCACAC CTGcCCAAGG 19920
GAAAGAACAG CACACCGGaA CTaCCGcAAG ATACACACGG aGGCTCGGGc ACCTaCCTcA 19980
CACGcAAAAG ACCTTGCGGA GAAGCACCCA CAACAACCGA GGAGGAGCCA CCAAAACCGC 20040
AGTCAACCCA CTGCGCCCGA AAGAACCGCG TATCACACGC GCAACACACC GACCCACTCA 20100
CTTTCCAGAC GTTTGCCTCA TCAAATCACC GAGCGTAAAC GAGCCTTCGT CCTCCCCCCG 20160
CGGGGCGGaC ATATACCGAG AAAGCTCGTC ACGCTGTACC TTTCTTTGAT AGTCTCTAAC 20220
AGAAAAAGCA ACCTTCCTGT CCTTCACGTT CA ATCTACG ATCACTGCCT TGACCCGGTC 20280
CCcCACTGCG TATTTCCTTA GCGCTTCACC CGGATCCCCA TCCCGATTCT CAACCAGATG 20340
CTGCTTGCGA ACAAGCCCCT CAACGCCACC GGGAACACGC ACGAAAATCC CAAAATCCGT 20400
CACGGAAGAT ACTTCCCCCT CCACGGTAGA CCCTACCCCA TAGGCGTTCG CAAACACCTG 20460
CCACGGATTG TCGCTCAACT GCTTAACACC AAGCCGAATA CGGCGCGCTT GCGGATCACA 20520
CTCGATAACC ATACACTCGA TTTCTTTACC TACCTCAAGC TCATGGTCTG CAGGACGCGT 20580
CCGCTTAACC CAGGACAGAT CATCGACGTG CAAAAAGCCG TCTATTCCCT CTTCCATTTC 20640
AATGAAAGCA CCTGCGTTCG TAACCTTTAC GA ACsGsGC GTAAAGCGcG CACCCACAGG 20700
ATAACGAGCC TCTATTTCCT CCCAAGGATT CGCCGTTACC TGCTTAAGCC CCAGAGACAC 20760
CCGTCCCGCC TGGATATCAT ACCCGAGGAT CA ACACTCC ACTTCATCCC CAATTTTAAC 20820
CATGTCACTG GGTTTACTCG TTTTCTTTAC CCAGCTGAAC TCACTAATAT GCGCAAgCCC 20880
CTCGATACCC TCAGCAAGTT CAATGAACGC ACCGAAATCA GCGATTTTCG TTACACGCCC 20940
CTTGACCACA TCATTCACGC CGAACTTGTT TTCAAACTCA AGCCACGGAT CCGGCTGAAA 21000
ATGCTTCAGG GACAAATTGA TACGCTTcTC CGCCTGATCC AGGCGGATAA CCTTCAACTC 21060
AATGGTTTGT CCTTTCTTCA CAAACTcGcG CGGCCGCGCC ACGTGCCCCC AGCTCATGTC 21120
ATTCACATGC AGGAGGCCAT CGAAACCGCC CAAGTCAATG AAAGCACCAA AACTCGTAAA 21180
GCTCTTAACC ACTCCGGATA CGGAATCTTC AATATGAACC GAATTGAAGA ACTCCTcGCG 21240
CGCCTGCCgC GCACGCTCCT CCAAATACCG GCGTCGATTA ATGACAATGT TGTCGTTGcC 21300
GCGATGCTGT TTGCTTTGGG ATATACGCTC GATATAGAAC TTAGACGTAA GCCCAATGAG 21360
ACTCTCAGGC GCGTCGACTT TCTGACAGTC CGACTGGCTG ATAGGTAAAA AgGCCATCAT 21420
CCCCGCACCC AAGTCCACTT CAAAACCACT CTTCTTTTCC GTTAGACGGA CGATCCTCCC 21480
CTCAACCGGA GTCCCGTCTC GCTCCGCATC ACGTAACTTA ACTTTCAAAC CCAAGCGATC 21540
GGCCTTCGTC TTGGAAAGCT CAGGGCCATA AGGCGTCACG CGCTCCACAT ACACCCGAAC 21600
GCCATCCCCT GcCTTCGGCG GcGCCTCAAA CTCTTCCACT GGAACGCGCC CTTCAGATTT 21660
TCCCCCGATG TCTACAAACA CCGTCCCCGC ATTAACCTGa ACCACCGTCC CCATCCTAAC 21720
AGAACCAGGT TCCGGAGCCT CAAACGAATA CCGCTCCTGc AGCTGcCGCG GcACCAATGG 21780
TGTAcCCTTC CCCTcCTGaT TTTCCACTGa ACGCTCTCCT CCCCACAAAG CTyTGCGGTG 21840
GCCTCGCGCG CGATTCTTTC ACAAACCTcC TcAATGGTCA AGCAAGAAGT ATCCAGTACA 21900
GCGGCATCAG GGGCACAACT GAGCCCCCCC AAGGTGnGnG CCCTGTnG 21948
(2) INFORMATION FOR SEQ ID NO: 64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13518 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
AGTGTTCGCC CGACGGGAAA CGTATGGGGT CGGTACATCG AATGTTACCC CGCGCATGCA 60
CTAGAGAGCA ACACAACGCC CCGGGGAATG CGCATAGGAG CAGCAAGAGA AGCGGCGCTG 120
TACGAAGGGG AGTACGCATC CTGGTAAGGA ACAAATACGG TATAGCCACG TTCGGCAAGC 180
ATCAGTTCCA GCGTTTCAGG GTGTGCCCCG GTGCGCAGAC TGCGGGGTAT TTTTTAGAAC 240
AACCACGTTC ACGCCGACTG GGTTTTGTTT CGGGTGCGCG CGCGACTGCG GkTTTTCGGA 300
AGGkTkTGGC GCAGACGGkT TATCCAGATG AGATGTTTCG TAGGGATTGG CCAGGGAGAG 360
CAGGAsCCcT GCGTTCACCT TCGTGTTACy TAsCCGAAGG GAAAAGGAAC TGCGCAyTAC 420
GCTACGGCTG GGCCGATAAC CGGGCTCCGG GGCACAGAGG CAGACAATCA CAAGCGTGAA 480
AACACCTGCA ACGAGGAGCA GGACCCGCGC AATACGTGCT CCTGCACTGT ACAGGTCATT 540
GGGAATTCCC TGGCTGAAGG CGACAAGGCG CGGCAACTCC GTGAGACACA CCACTGAACT 600
TGAGACTAGA CTCAAGGTAA GGTCCAACGT CAGACCCTGT CCGAGGACTG TCAGTGCACC 660
GACACTCAGT GCAAGCACTG GCAGCAGGGG GATAGCGCTT ATCTTTCTGC TTTGTCGGCT 720
GAAGGGGCGA AGAAGCGCCG GCAGACTCAA ACCGAGGATG AGTAGCTCGC TGACAAGCAA 780
ATCTGACACG GCCTCCGTCT CGGTTGACTC GGTTGTACGC TTTACCAAGT TTTTCTGAGT 840
GAG aCCACC TGTTTCCTTG TTGCGCAAGG GAACAGGTGG TGGTAGGTTT GCGCGGTGGG 900
TACCTTGTAC GTGGTAGCAA CGCCGATTGG AAACTTGGCA GACATCACCC TCCGTGCCTT 960
AGATGTATTG CGAACGGTGG ATGTAGTTGC CTGTGAAGAC ACGCGTAGGA CGCGTGCGCT 1020
CCTGTCTCAT TTTGGGATCC ATAAGCGTCT TGTTTCCTGT CGTGCACACA ATGAGGCGCA 1080
GGCGGCGCGT CGACTCATCC ATTTTTTGAG CACCCCTATT TCTGCTTTTC TCTCTCCAGA 1140
GAAGGGGAGG GGCAGGCAGA GCGCGCGGCG CACGCGTGCA CGTCCGGGTG AGACGGTAGG 1200
GACAGCTGCG CTGCAgCTcG CTGCAGAAGC AACGGGGGAA CAGGAAGTGT GTGGATCGCC 1260
GCACGCACAG GTAGCCTATG TTAGCGATGC AGGTACGCCG GGGGTCAGTG ATCCGGGAGC 1320
GGTTTTAGTG CGCGCGGTGC GGGATGCTGG GCACACGGTG GTACCGATTC CCGGTGCTTC 1380
TGCACTGACT ACTTTGCTGA GTGTTGCAGG CGTGCGAGAC AAGACCGTGC TATTCGAGGG 1440
GTTCCTTTCA CCTCACCCGG GTCGTAGGCG TGCGCGCCTG GTGCAATTGT GCGCGCAgcg 1500
TGTaGCTTTT GTTCTGTACG AGAGTCCCTA CCGGGTTCAA AAGCTTCTAG AGGATCTGGT 1560
GGCGGTGGCG CCGGAGTCGC AGGTGGTGCT GGGTCGGGAA TTGACCAAGG TGCATGAGGA 1620
GCTCTGTGTG GGGACTGCCT TGCGCGTCAT GGAGAGCTTC TGtGCcGGAC GCGytGCGGG 1680
GGGAATGCGT GTTGCTGGTT TCTGCAGAAA AATTTTAGAT CTTTATTTTT CTTACAAATT 1740
TCCGATAATG GGGCGGGGGT GGGGCTCTTn TGATGATCGA TAAGCTAATn GACTTGATCC 1800
GGTTCAGAAC CTTCGCGCCT CTTGTGCGTC TGAGCATGTG GCGCGTGCTC CAGCCGGCGA 1860
TGAGATTACT GTCTCTGCGG AAgCCCAGAA AAAGGCTGAG TTGTACTTGG CCCTGGAGGC 1920
GGTACGTTCT GCGCCTGATG TGCGTGAATA CAAAATAGCA GCTGCGGAgC agAaGCTTGC 1980
AGACCnTGCG TATCTGGAGC GGGCGCTGTc CCACGTGGTG GAGCGCTTcC TGGAGGAGCA 2040
GAATTTATAA GcCTGTAGGC AGGCTTTTTA GGTCCGGGTG AGGGCGTACG GGCTGTTGTG 2100
TTTATACCCT CAGGCGGACG CTCTCGATGT CTGGGCTGAA CAGTTCTCGC ACGTCTGAGA 2160
TACCGAGCGC CAGGAGCGCC ATGCGGTCTA CTCCTAGTCC CCAGGCCATG ACGGGGACGT 2220
GCACGCCGAG CGGGTCGGTC ACTTCTGGGC GCAGGAGACC TGCTCCTCCC AGTTCGAACC 2280
AGCCGAGTGC GGGGTGGAGT GCGTGTAGCT CGATAGAGGG CTCCGTGAAC GGAAAGTACC 2340
CGCCCACGTA GCGTACCTCC TGTGCACCGG CGATTTCTGT TGCGAGGATT TTGAGCATGC 2400
CTAGAAGGGT ACAGACGTTC ACATCCGTAC CCAGTACGAT TCCTTCTGTT TGGTAGAAAT 2460
CTGCAAGGTG CGTCGCATCC ACTTGGTCGT GACGGAAGCA GCGTGCGATC CCAAAATACT 2520
TACCCGGTAT GTGGGCGGTT GGCAGGTGGC GCGCTGAAAG TGCTGTTCCT TGGCTGCGTA 2580
ATAGCAGTCG GCGGGTAAAA TCCCGATCGA ACGAGTAGCG CCAGCCGAGG CTCCCGCTAT 2640
CTGctCCGCG TTCATGCGTC GCCGCAACGC GGGAGAGGAA CGGCTCTGGG ATTGTAGGTG 2700
CGTGCGTTGG GTGTTTGAGG TAATACACAT CATGAATGTC CCGTGCAGGA TGGAACTGTG 2760
GCATGAACAG CGCGTCCGCG TTCCAGAAGT CTGTTTCCAC CAGTGGCCCG TCAAATTCCT 2820
GAAAGCCAAG TGCCACCAGA CGATCTTTGA TGTGTTCGAG GAAATCTGCG TAGGCATTAG 2880
ATCGGCCGGG GATGATGCGG GCAGGGGGAA TGTGAACGTT GTAGCGGCGT AGATGTTGTG 2940
TCTTCCACGC GCCACTTTTT AGGCACTCGA CGGTGAGTGC ACCTATCTCG TTTCCGGTAA 3000
GCCcTGCGGT ATGCAGCGCC TCCTGCACGG CACGGGCGGT GGGGGTAAAG GTGAACGTTA 3060
CCCTCTCGCG TACACTGACT TTGAAGAGGC TGTCGctTGC GCCCCGCTTT TTTGCTATGC 3120
GTTCCATTAC ACGTCGCTCA TCATCAGAGA GCTCAGATTC AAAGAGGGTG CCTGGAGGAG 3180
TGTCCGAGGC CTCTGACGGG GAAGCGACGC GTGCAGCGGC GCGCTGAAGC AAGGTGCGCG 3240
TGAGGGACAT GCGAaTCACT TACGTGCGGT GAAACGATGT GTATGCGTTT TTCACCGTCC 3300
ATACGGAGGA TACCCTCCTG CGCTAGGATA CCGAACGCTG AACCTACATC CTTTGGTGCG 3360
AGCGTGAεCG CATGGGCAAG CTCAGGGAGA CTGAGCCCGT TGCAAAGTGG GGGGCGAGGG 3420
TGTAGGTGCT CGGCTGCATC TGCAATAGCG GTAAGGGAAG GGGGGGAAGA CAAGAAGGTG 3480
AGCATACGCT CCTCTGCAGT ACCGTCGCTA GCGGCAGCAT AGCCGCAGGG GGTGAGTTCA 3540
AAGGACCGCA TCTGTTCCCg CTGGTGCTCT TCGATGATTC GCTTTGCCCG AAGCCAGGAA 3600
AACGCTTGGT TTGCGTGTCC TTCCTTAAAG CCTAGCCGGG AGATGAGCAA CGAAGTCGAA 3660
AGGATCTCAT CCATTGCGCA GTTCTTGAGG ACTTTGATCT CAAGGGGATG CAGCTTGTGC 3720
ACAAGCGTGT TCAGATCGGC TTTACCTGTC ATGCGCGGCA TCATAACGTA TTTTCGCGCC 3780
GTTTAGATGT AGGCTGTTTC TCAATTTTTC GTTCTGCATC AGGGTACATC TGCCGTAGTG 3840
TGGAGAAATA GATGGAACGA TCCTGGGAGG GTATTGACAG AGGTGTGCCC GTCTATGGTA 3900
TGGGTTGCAC CATGGTTGCT CCTGGGATGC CCGCAGCTTT TGCTCGCTCG GTGGTTCGCG 3960
CGTCTGCGTG GGTGGGTGTG GCGCTTATGT GCGTTGCGTG GTCGGTCTCC GCCGCGGAGG 4020
GCACGCGGTC GGGTGGGCAG GCTCAGGAAC GCTTAAGTTC CTGGCGCCAG GTTGTGCAGC 4080
GCATGGAGGT ACATCTACGT GCGGCGTACA CCTTTTTTGA GAGTGGGGAT AGTGATCGCG 4140
CCTATGAGCA GATAGATAAG GCGTACTTTC GCTACTATGA GGCGAAGGGC ATGGAGAAGA 4200
TCACCATGGG GTATCTGTCC GGTGCGCGTA AGGCGGCGGT GGAGAACGCG TTTTTCGCGT 4260
ATCGGCGTTC GGTGCGGGGT GCGCGTGATT TGGCGGGCGT TGCCTTCTGC AGGGACAAGC 4320
TGGTTACCAT GTTGTATGAG GACGCGCGTG CGCTGGATGG GGTTGCGCGT GGTCGGGCGG 4380
GCTTTGCGGC GCATATCGCC ACGTTTGTTG CCTCGTGCGT GTTGGTGCTG CGCGAGGGAA 4440
TTGAGGCAAT TTTGGTTATC GCAGCGATTG TTGCGTATCT GGTGAAGACT GGTAAGGAGC 4500
GGTGCTGCGC TGCGGTGTAT GCGGGAGCGG GCGCGGGTGT TCTGTTCAGT GTCGTGCTTG 4560
CGGTGATGAT AGTCCGGGTG TTGGGTTCGG AAGGTGGTGC GGCGCAGGAG ATTATCGAGG 4620
GTGTTGGTAT GTTCTTCGCA GCGGCGATGC TCTTTTACGT GAGTAACTGG ATGTTGTCCA 4680
AGGCGAGGGC ATGTGCTTGG GATCGCTATA TCCGTCAGAA AGTTGAGCGG TCGGTGTCTC 4740
GGGGTAATCA GTGGGCGCTC GTGGCCACTG CCTTCCTCGC AGTGGCGCGG GAAGGGGCGG 4800
AGCTTATTCT TTTCTTTCGA GGCATCCCAG TTGCGGGGCC ATATGGGCGG CTGGCTGTGT 4860
GGGCAGCGGT TACTGTTTCT GCCTTGGTTC TGGTGGGTGT GTTCGTGGCG ATCCGTTTTC 4920
TGTCAGTGCG ACTTCCGTTG AGGCCTTTTT TTGTTGCCAC GGGCGCGGTG ATGTACTTGC 4980
TATGTTTCTC TTTCGTGGGT AAGGGTGTCA GCGAGCTGCA GGAGGCAGGT GTGGTCAGTC 5040
GAAGTACGGC ACCGTGGATG CATGGGTGGA GTTTTGATTT TCTGGGCATC TACCCGACCT 5100
ATGAGGGTCT GGCCCCTCAA GCGTTTGTGG TGGCGTTGGT GGTGCTTTCG GCGGTATGGT 5160
GGTGTGGTGG TCTCTGCCGT GGCGCATCCA GCACGTAGGC TTGGGACGGC TGTGTCGCGT 5220
CCTACTGGGG CCGGGTGTGT GCTGCGCCGT GGAGATTTCC ATTTGTTTTT CTATAATGGT 5280
GAGGAAAAGA AGCGCTGGAC GGGAGAAGGC GTTTTGAAAA GGAGGGGCGC GTGACGCCCC 5340
AGGGGAGTGA AGAATGAAGA GGGTGAGTTT GCTCGGGAGC GCACCATTTT TGCGTTGGTT 5400
TTTTCCGCGT GCGGGGcGGT GGAGAGCATC AGCACGGTGA GGAGATGATG GCCGCCGTTC 5460
CTGCTCCAGA TGCAGAGGGG GCGGCCGGTT TTGATGAGTT TCCTATAGGC GAGGATCGGG 5520
ATGTGGGGCC CTTGCATGTG GGArGGGTGT ATTTTCAGCC GGTTGAGATG CATCCGGCTC 5580
CAGGAGCACA GCCGTCGAAG GAAGAGGCGG ACTGTCACAT AGAAGCGGAT ATCCACGCAA 5640
ATGAGGCGGG TAAAGATTTA GGGTATGGAG TCGGGGATTT TGTGCCGTAT CTCCGAGTTG 5700
TTGCTTTCCT CCAGAAGCAT GGCTCTGAGA AGGTGCAAAA GGTGATGTTT GCGCCCATGA 5760
ACGCAGGGaC GGTCCGCATT ATGGGGCGAA CGTGAAGTTT GAAGAGGGGC TTGGTACGTA 5820
CAAGGTACGT TTCGAGATCG CTGCACCCTC GCATGATGAG TACTCGCTAC ATATTGATGA 5880
GCAAACTGGG GTTTCCGGAA GGTTCTGGAG CGAGCCATTA GTTGCAGAGT GGGATGATTT 5940
TGAATGGAAG GGGCCTCAGT GGTAGGGACG TTCAGAAGGT CCGAGGGTGC GCGCGCATAA 6000
GGGCGTTCTT TGTTCAGTAA GACAGGCGGG TAGTGCAGTG CGTGGCGCTG CTCGCCGGGT 6060
CCGTTTTGAG GGTGTGGGTT TTGACACGCA gTTATTTTTT TGAAAGTTCT CCTGCGCGTT 6120
CTTCTGTCTC CGTGGGGTTG TGCGGTGTAC AGAACGGGGG GGGGTGTCGT GAGTGCGGGT 6180
ATGAAAGTCT TGGTGTACGC GGTGGCGCTG GGGTTCGGGT GCGGGGGTGT GGTGCACATG 6240
CGGGAGGGGG ACACCTACCA ACAACTCCTC GAGCACCGCA TTGCAAATGG TCGGGAGTTT 6300
TCGCGGGTGT TTGCgCAGGC ACAGGTTGAC GAAGCTGAGC ACAATGAAGT TCGGACAAAG 6360
ACGGCGGGAA GTGTGCAAAT TGGCACGGGA GACGTGCTCT TCAACAAGAA GAATGGCAAT 6420
GGTGCTAACG GCTACAAGGT GGAGATGGCG CCGCATTTGA GTATTGCGTC CCCCTTTATA 6480
GGAAATTCTC GGCTGAATCT TGTTGCCCCC CGCAAGCTTG ACGGTGTCAC AAGTACCTCC 6540
ACCGTGTCGG TGGATTACAC TACCGATTTT TACTCCTCCG TTCGTCCAAC ATACCTGAAC 6600
TCCCTCAAGG AAAAGACATA TCAGAAGGAG AAGAGCGGTT CGGCGCTGCG TGATGGGCGC 6660
AGGCTAGTGG AACGGGAGTT TTTGCAGGAA GTACAGCGTC TGTACGGTAG TTACGCGGAC 6720
CAGGTGCGCG CAAGTTTGGA GTTGGTGCGC GCGCGGTTGC GTTTTGAGTC AGTAAAGAGA 6780
CAGGGATATC AAGAGGATTC GGCGTATTTT CAGAGCGCAC AGCTTGCACA GGTGCGGGCG 6840
GAACGCGCCC GGGCACAGGC CAGGCAGCGC TTTGACCTTG AGTACACGCG GTTTGCAGCG 6900
CGCAACGGGG TGGCCTACGA GGACGATGAG CGCGACGGTT TTTTACACGA TTTGGCGGTT 6960
GCAgTGCCGC TTGAGCCGGC GATGGCGGTG ACTCAGTGcG nCAGGGGAGC GGGGGCGcGA 7020
GTATTGTGAT GCGCAGGAcC GCTGCGAGCG CGTCATTGCC CAGCGAGGTA CAGATTACTC 7080
CCCCTTTCGC ACAAGCGCGC GCGTGTACTT TACCGATGGG GAAGAAAACA AGCAATTAAC 7140
TAACGGCATG GCGCCAGCTG CTCCTAGCAC TACGAGCACG TATGGGGGCA CGTTCAACAT 7200
GGCGTTTCCC GGCGGGGATT CCAGTTTTAC CGTGCAGAAT AGTAAAGGGC TGGCGGGGAT 7260
CCTAGcGAAT TTTGAGTGGA GCCCGATACG CACGCcTATC GCTCGCTGGA CTACACGGCA 7320
GAGCGCGCAG AGCGTGTcTT TGACGAAGTT GAGCTTCAGG CAAAGGGTGA TCGGTCGAAT 7380
AAGTTGTTCG GTGCTATAGA CGCGCAGGGG GACTCAGTGC TGGTgcTGCG CGGGGTGGAT 7440
TTGCAGACAC TGGATAACGC GCGCAAGAAG GCACGGTTGC AGAAAGAACG CTTGGAACGT 7500
GGTATCATCG GAAGGCTGGA GTACGAGGCA GCGCGTTCGG AGTATCTTCT GGCGCTTGCT 7560
TCTGTGGCAG AGGCAAAGGC GCGGGCGATT ATTTTTAACA CCGACCTTGC GTGTGCCTAC 7620
GGGGTGGGTG CGGACGCCGC CGCCGCTCAG TTGACCCAAG AGGAAATGGT GGTCTCTGAG 7680
AAAAAGGATG CTGAAGAAAA GAAAGAGAGG TCTTCGTGAG CGTAAgTTAT CGTGGCCCGA 7740
GGTGGTCTTC GTTCGTCCAC GTGTCGCAGC ATTCGTGTAG GTTCGTAGCT CCTACGTGCg 7800
CTGAGGGTGC TCAGGGGTGC TCTGAGTTTG GGGCGTTCCC TGTTTTTGAG GAAAGGGGAA 7860
TGTGCGCGGC GCGGCGTATG CGCAGGGCGG CAATTGCCGC GTGCTGTGTG TTCGCGCGCG 7920
GTGCGGCGGC CAATCCGTAC CAGCAGCTAT TGCGCCACCG CCTGGAAGCG TTGCGGCCGG 7980
GTGCCCGCGC GCAAATAGAG TTTGATGTGG CGCACTGTGG GTATGAGAAG GCgCGtTGCG 8040
CTcAGCAGGT ACGTACGTGT TGGGCAGTGA GCTTGAAATC AGAGGACACT CgGCGGGGGA 8100
TTTTGGGCTC CCTCGCTTTG GAATAAAGCC CATTATCGGC GTGAGAAGTC CGCGCTACAA 8160
TAACCTGGTC GTGTCCATCG ACACCGCAAG GtAACTAGCA TAGGGAATAT ATCCCGGATA 8220
AACGCGGATA TAGGGGTGGA TTTGTATTCT AACGTGCGGG GGCGCGAgcT CATTCGTATG 8280
CGTCGTGCAG AGCAaAAGaA AAGGCGGCGC AGAACGGTGA ACGAATTAAA TCGCCGTcGG 8340
TGGAGCTAGC GCTCATCGAT GAGCTGGAAG TGCTTTTTAC CCGCGCGCAG TCGCTCGTGC 8400
GGCGAGAGTT TCATATGGGG GATGCGCGTT TGGTGCACCT GCGCACGCGT GCGsCAGGTT 8460
TTTCTGAGCA CTCTGAAAAG GCCCGGCGCG TCCGTTTGGC GTACGACCGC ACACAGCGTG 8520
AGTTTGAACA AGAAGAGCGC CTGTTTGCGC AGGTGTGTGA TCCCTTCGCT GCCGTCTGCG 8580
CAGTGGGCGG AGGGGATGAA GCGCGGAGAG ACTTTTTGCT GCAGCTTGCA GAGGCGGTGC 8640
CGCGCGAGGT ACCGCTCTCG CTCGTTTCCT TGCATGCTAC AGATGCGCAC AGCCTTGCGG 8700
CGGcGCAgGA GATGGCACTG CTTGAACGCG CCGCGCAGAT tCGGAGCGTG ATTTGTACGC 8760
TGTGCGGTGG GCGCTGTTGT GAGCATGGGT ACACGGAAGA CTTTCATTCT GTTTAAGGGA 8820
GACGGAACCG AGTCGCTTGA AGGTTCCGGG ACGGTGGCGC TGCATATGCC CAGCGTGAAC 8880
GCGCAGGTAG AGGTAAAGGT GCCCTACGCG GAGAGGGGTA AGCATTCCCG TGACAAGGTG 8940
GGAGTGTACG GGAAGTCGCA GTGGAATCCG CTTGAAATTG CCTATAAGGT GTTCGAAAGA 9000
CGGGAGGAGC GGGCGCAAGA GCAAGAACAG GAGCAGTATT GTGAAGATTC CCTGGCGCGT 9060
GAAAcGcgGA aGATGGAGGG GTTAGAGGTG CAGGGCAAAC AGCTTTTTGC AGCACAAGAA 9120
ACCGCCTTGc GCACGCGCGA GGCgCTGCGT TTAGATCTTG CCaAGGTGGa rCgCGCCcgG 9180
CGCGCGGGkT AGTGGGAGGA AATCGCCTCG CGCGTGCGCG kTGTGAGTAT GCCGTGGCGC 9240
AgcTGCGTGC GGCGTGCGCG AAGTTGCATA TGTTGCGTTT TAATCTGGGA GTGGTACGCG 9300
CATTTGGCCT GGTGCCACAG GTGGCGCCGT GAGCGGTCCG CGGGGTGGTT CGTATGGCAA 9360
GCGCCGGGCG GCCGTGCGTG TGTTCGCCGG AAGCGTGCTG TGGCATCATG CAGTGTTGGG 9420
TGGGATGGGC gGsTGCGcTC ACCGCCAGCG AGTTAACGCC CGGCGCACCG CCGGCGGCAA 9480
GCGCCCGGGC GGCCGCGCAA GAAACGGGAA CCGACnTCTA CCAGCGCGTG GTGCGCTATC 9540
GGCTGCAGCG CAgTACGGCG GCGGCGCAGg CTGTCCGACG GCAGACGATA ACACAGAGCC 9600
AGTACGATAA GCAGCGGCTT GATTCCTTGG TGCGCCTTTC TATCGCAGCC GGGGACATTG 9660
CGTGGAACgc CGATGGGGTA AAGTTTCGCA TTACGCCCAA GGCnTCGGTG GCATTCCCTT 9720
CTTTTTATAA CCTGACCACC CATTTTGGTA TGACGGTAAC GCAGCCGAAC GGTGCCGCCG 9780
GGGGAGGAGs skGwnGAGGG GGAGGAGGCG ACTGGCAAAA GACgCTCGAC GCGGGGgCAG 9840
GCATTGATTT GTACTCGTCG GTGCGTCGCA GCCATGTGTT TGCGGTGAAC ACCAAGTACG 9900
ArGsmnTGCG TGATGCGCAA GAAGCGCTCG CCTGTGAGCC GCACGTAAGT GAGAAGCAGG 9960
TGCTCGAGGA CATGCGCCGG ATGTTGGATT CCTACGTGCA GCTGTTGCAC GCGCAGGAGT 10020
CGTTTGCGCA AAAGCAGAAC gcAGAGCGAT CAGTGCAGGT GGCTGGATAC ACGGACCGCT 10080
CCATTGTGTA mCGCGCAgCA GCGCTCGAGC GGGAGCGrGC ACAGGACGCG CTCAAGGTGG 10140
CGCAAGACGC GTTTGACGGA GAGTACCGGG ATTTTATCAT CTCTGCTGGT CAGGAATTTT 10200
TAGAAAAACG TGCGGATCAG GAGCGCTTTC TGCTCGCGCT GGCTGAAAGC GTTCCTGAAA 10260
TGCCGCTGGT GTCCACCGAG CAtGCGAGGC AGATACGTCC CGCCCTCTGc GCAACGCGCG 10320
TGAGGCAGCA GATAACGAGC GCGAGGAACG GGCGGTACAG AACTTTCCCG TGGCGCTTCG 10380
TCTTGACACC CGCTTTACCC TAGATGAAGG AACCGGGGAG CTTTCCGTTG CGTTTCCAAG 10440
CGTCAAAATA ACCAGCGCCC TGGCCATAGG TTACACCGGT ACGCTCAAAA GCATTGGCGG 10500
GTCTCTGGAC TGGCATCCGT TTGAAATCCG GTACGCGCAT TTGCGAGGAA AAAATCAGCG 10560
CCTGCACGAT GCGTTAGGGG CACGGGAGTA TGCACAGAAA AAGGAGCAGC AGGAGAAAGT 10620
AATCGCAGAC CTCCCACAGC GTGCAGAGGA TATCCTCTGG GAGCGTGAAA CTGCACGCGC 10680
AGAGCGGGAC ACGTACGCAG AAAGCGCCCG CGCGCACAGG AAAGGACTTG ATCGGGGAGT 10740
TATCGGCGCG CGTGGcTACG CGGCAGTACA TTTGGACTAC GTACGGGCGG TTATCAATTT 10800
GGCGAAGgCG AATGTAGACG CGCTCATTTT TAACATCGAC GCGCGCGTAG ATTTTCTTTC 10860
TTCTGGAACC CAAACATGAA CATGGGGAGT CTGGTATCTT ATGCTGCAAT CCAACGGGGG 10920
GTAAGGTGAT CGCACGCAGG ATGCTTTGCG CGCGCCCGTG GGGGCCGTCG TGCGTGGTGT 10980
GCGCTCTGTG TGGGGCGCTT GCCGCCTTGG TGCCAGCAGT CGGTGCGCAG GAACAGGCAG 11040
TGCCTGCGCC GGGGACGCCG GCTCCTCCCG CACACACGGC TTCAGAAGCG GTGCCTCCTG 11100
CGCCAGAGCC CCGTGCGGAA GGGGAGCAGC CGTCTCCTCT TGTCCCCACG cTCTGCCGGT 11160
CCCTGGAGGG GCAGTGGCTG CACGCGCAGC GCCsGGCACA GTCGGTCCGC GGCTGTGGGA 11220
GCAGCTGCTG CAGTGGCGCG TGCAGCACGG TGACGAACAC CAGGCGCCGC AAATGGCCTA 11280
CGAAATTGCC GCGAACAATT ACGACATTGC GTTGGTAAAG TCCATCGTGG ATCTGAGGAT 11340
GGGGACTGGA CACATACACC ACAACCTGAA TGGGAACGGG GCCGGGGGTA TGGCAAACGG 11400
TACGCCGACG CTTTCTCCCT ACGTGCATCT TTTTTTTCCG ACCTATCAGA ATTTGAGTTT 11460
AAAAGCGGAT ATTGCGATCA AGACCAACAC CCcTTCGGCA GACGTGACCG CGCTCTTTGG 11520
TATGGATCTG TACTCCAAGG TGCGGCGGCA GCATCAGCTG CAGGTGCGGC GTGCGCGCAA 11580
TAGCATGCTT GACGCGTTTG CGGCGCACtG CGGGGGCAGc ACgctGCGCG GGAAGCGTTC 11640
CTGGCTGAGC TCGATGAGCT GCTAAGCGCA TACAGCACGC TGCTTGAAGC ACAGGTAACC 11700
GAGCAGGAGT GCACGCGCCT AGTGCGCACG ATGCGCATAC AGCGCTACCA AGCGCATTCG 11760
GTAAAGTTGC GctCCgCAAC GCTCAAGCAC GCACGCGCAG AGAGAGTTGC CCGTCGTGCG 11820
CGCAAGACGT TCACCGCCCT GTATCAGGAT TTTGTGCGCA AGTGCGGGGC CTTTGAAGGA 11880
AATGATCCGG AAACATTCAT GCTCCATCTT GCGCAGGTAG TTCCGCAGGA GCCCGTATCT 11940
TCTAnCCGCA CTGCTTTCAG TGGAAAATGA CTGGGAGTTT CTTAAGAACA GGGAAGATTT 12000
GGAAACTCAG GCTGAAGCGC GTGCAGTGGA TGCTATCTCG TACGGGTTTA ATGTGGAGTC 12060
TGGGGTGGGG TCTGAGGGTA AGTCATTGAA gAGAATATTG GCAAATGTCA GAAtGGACTT 12120
TCCCGGCGGT GGCTTTTGGC TTGGATTGAA CTTACCGTAC CCGcAGTGGT CCCGTGTGGA 12180
GGTAAAATTT CGGCTCACGT GGGACCCGCT TTCCATTAAG TATcAGGAGC TTTCACGGCA 12240
GACACTGCAG CTTCATGAGC GGCTCAGTGC GCTTAAGCTT CAAGACGCGT ACGAAGCTTC 12300
TGAGCGTAAG GTGCTTGGCC TGCGCCACAC CGCCGAGTCG CTCGGCTGGG AACAAGAGGC 12360
GGCACTCACC GAACTGAATA TTCTCAGGCG GAGTGCGCAA ACGCACCAGA AGTGGCTGGA 12420
AAGAGGAGCT ATCGGCGCGC ATCAGCACGC CCGGGCCCAG CACGCGTACC TACAGGCGCT 12480
CATCACGTTG GCCAAGATCA ACATTAAAAT ACTAAAGTTT AACCTTGAAA CTGCGTCTTC 12540
GTTCAGACCA GTACTCTAAA GAATACCCCA AGAAGGAAGT TGTATGACCA CAGCACAGAA 12600
ACTCCTACAC AGAAAATCGA CCATCGCCAT GGTGGTCGGA ATTCTCGCCT TCTTATTTGT 12660
TCTTCCCCGC TTGGTGCGGG CGCTGCGTCG GGTTCCGCCG CCTACCCTCA GTGTGAGTAA 12720
GGAGGTGGTG CTCAATAGGA TTGAGATTTC GGGGTACATC GAAGCGGCTC AGCACCAAAA 12780
GCTTGAGTCC CCTGGTGAGG GAATCGTGCG CACCGTACGG GTGCAAGAGG GAGATACGGT 12840
GAAGAAGGGG CAACTCCTCT TTTCGCTTGA AAACTCTCAC CAGCAGCTTG ACCTTGCCGA 12900
GCATGAGTTT GCAATCGAAC AAGAAGAAAT TAACGGTGTT TCTAAAAAAA TGGAGATCAT 12960
GAAGCTAAAG AGAAATATGC TCCAAAAAAG ACTGAGGGAA CGCTACGTCA CTGCCCAGTT 13020
TGATGGCGTT GTTGCCGCTT TTAAGCTCTC TCCCGGACAG TACGCGAAAC CTCAAGATTA 13080
CTTTGGCACT CTCATCGATC GCTCTTACTT CAAGGCAAAT GTCGAGATTC CTGAGGTGGA 13140
CGCTTCGCGC CTCAAGGTAG GGCAGCGCGT TGAAATTTCT TTTCCCGCAG AACCAAGCGT 13200
GAAAGCGGTG GGGAGTGTCA CTTCCTATCC GTCCATCGCG CGCGTTACCA GTGTCGGGCG 13260
CACCGTGGTT GACGCCTCCA TCAGGATCGA TGAATTGCCA GAAATACTGC CGGGTTATTC 13320
CTTCAGCGGG GCAATTGTTG CCGGGGAGCA gGAGGAAATT TTAGTCCTGA AAGCCAAGAC 13380
GGnCTCCGGT ACGAAGAAGG GTGCTCCGTT CGTGGAnCGA GTGCTCCCCA GCGGTAAGAT 13440
AAAGTCTGTG GCCGGTTACG GTGGAGCCGT ATGTTnCCTG GCTTTGGTCA AAAATAAATT 13500
TCTGGGGCTG GGGGGCGG 13518
(2) INFORMATION FOR SEQ ID NO: 65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4448 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
AAAATGACAn AAGCACAACG GGnAGCGGTT GGAGGTTGCC GGTGACATGC AGCGCATGAT 60
GAATGCAGGG CGCGCGAAAC AACGCACGGC GCACAGGAAG CGCGTGAAAG TGTCACGCCC 120
TGCTACGGCC TTAAGGTGGT GGATGCCCAG CACTTGTTAT CGGAAATCGT GCTCGTTGAT 180
CCAAAGACTG AAGCGCCTTT GCGTTCTTCT TCCCTACGGA CCGTCCGCAA CCGGCTCCTG 240
TACAGCGAGC CTCACGCGCT CGTCGCCATT GCTGACACGA CAGGGAACGG CACCGTCCGC 300
CTCGTGCACA TAGACCCAAA GACGCTGGAG GTAACCAAAG AGAGTACCCA GCGTATAGTG 360
CGCAAAGTTT TCTCTTGAGG GAAGAGGAGC AtACTATGCG GTGATCGACG AAAATGGCAG 420
CCACTTCCTG GGACGCTTTA CCAAAAATCT TGAGCTGACT ACTCGTTCTG CAGCGCnGnT 480
GACGCCTTAT ACCGCCGTCA CCGTCACTCC GCGCGGAATT ATGGTGCAAA CAAAAGAGAA 540
AGGTATGGCC CTATTGCACA CACGGACGCT CGCCGACGCG CTACCCAGAA CATGAGCAGA 600
AAGAACGCGA ATCAGAAAAT TGTGTGGCCG CTTACGACGA AACTCATCGG TATTATCAGC 660
ACGGTGGTGG TGCTCGCCAC CATTGCGGTT ACGGCCATGG CTTCCTGGTT CTTCGCTTcG 720
AGTCTGCACA GTAATGCTGA GCTTAATAAT CTTGCCGCTG CGGAGAACTT TGCTGCGCAA 780
ATCAAAGGGG AATTCGAAGC CATTGCCACA AGCGCCAAGT CCTTCGTTTC CCTTGGCCTC 840
AGAAgTGGCG CGCGCATGCA CTCGCGGTCC GCACTTTCTA AAGATTTTTT TTCCTTTTAC 900
CCGCGCATCG GCTACATCGG AGTCGGCGGT GTAGCCGAGC TGTGGAACGG TGACTTCTTC 960
AAGAAAAATC AGCTGCGGGT GGCAGACGCT CGTCGCTTCC TAGCTGACAA CGCACAGGTT 1020
ATTTCCACAC TTCAAACTGC CCCAGCCACG CTCAACGCCG CCCCCTGGTT TAAAGCGCAG 1080
ATCATTGCTA TCGTCGCGCC CTTTGAAGTT GACGGcGCTA CGCGTAACGT TGTGGTTATC 1140
TTCTCAGCGG ATGTCGTTCA GCACCTGCTA GAATCTGGAG CCTCCTCCGG AACCATGTAT 1200
GCCGTCACCT GGGCGGGGAA CTCCCTGTAC CACCCGGAAT ACTCTCTCAA TtACAGCAAC 1260
ATtAACTTGC AGGACTCGCC CGTTGTGCGC GATTTACGCG AATCTACACA GCTGACCAAA 1320
CAAATCAGCT TCATCGGCAC GGACAACAaG CGCTACTTCG GCGCGTTCGC CAaGCAAACC 1380
TTTGGAAAGT TCGCCmTAGT CCTAGAAACG CCTATGAGTG TGGTGTACCA GGCAGTATAT 1440
TACGCGATTA TCCTCGACGG TATCCTCACC GGCATGGTGC TCCTCGCCTC TATCTTGCTT 1500
GTCTGGTTCA TTGCGCAGTC TATCACCCGC CCTATCCTTA CCCTCGTCGG CGCAACGCAC 1560
GCTATCAGCT CAGGACAGTT CCTCCTGGAT ATCAAGCCTT CAAGCAAAGA CGAAATTGGC 1620
CTCCTCACCG AAACATTCGT GAGTATGGGG CGTGGTCTGG CAGAACGGGA ACGCATGAAA 1680
GAAGCGTTTG GCAAATTTGT AAATAGAGAC ATCGCAGAGA AGGCCATGAA GGGAGAGCTC 1740
GCACTGGGAG GGGAACGGAA AACCGCTACC ATTTTTTTCT CAGACGTGCG CTCCTTTACT 1800
GAGATGTCGG AGAAGCTTCC CCCTGAGGAC GTAtAGAGTT TCTCAACGAG TACATGAGCT 1860
GTATGGTAGA CTGCATCGAG CAGACAGGCG GCGTGGTGGA CAAGTTTATT GGAGATGCGA 1920
TTATGGCGAT ATGGGGAGCG CCAGTTTCCC TCGGCTCTGC ACGCTTAGAC GCATTGCAGA 1980
GCATGAAAGC GGTCTTCCTC ATGCGCGAAA GCCTTATTCA ACTGAACGAA AAGCGCGTCG 2040
CATGCTCAAA GCCTCGCATT GGCATCGGAT GCGGCGTAAA CACAGGCTCC TGCGTCGCAG 2100
GTCAAATCGG CTCTTCCAAA CGTATGGAAT ACACCGTCAT CGGAGACGCG GTGAACACCG 2160
CAAGCAGGAT CGAAGCACTG AATAACCcGT TCGGCACTGA CTTTCTTATC TCCGAAAACA 2220
CATATGAGCT TGTTAAAGAT ATGCTTATAG TGGAGAAAAT GCCCCCCATA ACGGTAAAAG 2280
GAAAACGAGA ACCACTGAAT GTGTACGCTG CTATCAATCT AAAGGGGCAT GACGGACCGC 2340
AGACGCTCGA TGAGCTGCGT GCACTTCTTT CCATTGAAAA GCCGGGGCTT TCTGCCGACC 2400
CTGACTTCGA AGAAAAGAAG TGTGAAGTTA TCTAAGCAGG ATGCCACGGT TACGGTCGTT 2460
ATTCTCCTCC TTATCCTGCT TCTCGGCTGG GGCTACTCCC GCGCGCTCCG TCTGTCCCAG 2520
GGGAAGGGAA ATCCAATCGG ACGGGTTTTT TTTTATAAAA AAACCGCAAC CCGCAAAAAA 2580
AACAACCAAG CCTTATGGCT CAAACTCAAA GACGGGGTGC CCGTCTACCA TCGGnGyAss 2640
TGCGCACCAC CACCGGTTCT GAAGCTGTCA TTGTGTTCAC TGATAACAGC AGGCTCGACA 2700
TTGCAGAAAA TACCATGGTG CGCATCAGTc ACACAGGAAT GAAAAAGAAG GATGTACGTT 2760
TGGTCACAGG AGCGATTACG tACGCaCGCG CCGCTGGGAA TCCAGCAGCG CATACCGTAC 2820
ATGTAGGAAA GACAACCATC TCGCTTTCTG GAGACGGTCA GGTGAATGTG CGCGGAGGCG 2880
AACGCGATTC AAcTGTCGAG ATAGCACGCG GTGAGGCACT CCTTCACGAT GCGCAGGGAC 2940
AGACAcTTCC CCTTCAGACG TTCACCCAAC TTGCTACTTC CCGGGAGGAT GGCACTGTGC 3000
GCATTCTGCA CCCCACCTTT GTCCCTCTCC TACCCGACCA AGATGCACTT CTCCTGACTG 3060
CCGAGCACAC CAGATCTGTG GGCTTTGTCT GGCTCGGCGA TGCCACGACG GTACAGCCGA 3120
GCGTCCGTCT CCAAATTAGC CGATACGCGG ACTTCTCGGT TATTGAAACG GAAAGAAAAC 3180
TTACCCTTCC GCATGAGGCA AACGCCTCGA GGACAACATT CAAAACCAGC GAACGACTCG 3240
GGGAAGGACG CTGGTTTTGG CGCCTGGTCC CGCAGAACGG CACGcGTCAg CGCCCCGTTC 3300
CTTTTCTGTG CGTCGCGCGC GtAAGGTGAT GCTGcACACG CCGCGTGCTC AGGCAGTACT 3360
CTCCTATCGG GATGCGATTC CTCCTACCCT TTTTTCCTGG ACGTCTGTAG AAGACGTGGA 3420
ACAGTACCGG CTACTGCTTT CTTCCCGGGC CGACTTTAGC GCGGATGTGA AGACATTCTC 3480
TTTGCGTACG CCGGAGATCT CGGTACCCGG GCTCGGCGAG GGAACGTATT TCTGGAAGgT 3540
AGTACCTCGC TTTGATGAGG GAATAGAAGA CCCAGTCTTT GCTTCTGAGG TAGGAACCTT 3600
CTCCATCAAA CAGGGAAAGG AGCTGCATGC GCCCGTTGCG CTCTTTCCCG CCGAGGACGA 3660
GGTGCTCGAA CACGCGGATC GGGAAAATCG CATGGTAATC TTTACCTGCG AGCCAATACC 3720
AGAAGCACGG CGCTATGTCT GGACGGTTAA AAACATGGAT GCAAACGCGT CCCCGCTTGT 3780
GACTACCACG TCGGTACCCT TTCTTACCGT TCCCATGCGG AGCCTGCGTG CACGATTGCA 3840
GGAAGGAACA TATCAGTGGC AGGTAGCGTG GGAAACGCGT CGGAGCGATC GCTCCCCCTA 3900
CTCGGCACTG CGCGCGTTCA CGGTCATTGA AGGAATGCAC GCGTGGGAAG AGGAGCCAGA 3960
GACGCGTGAC TTGATTkCGC TCcGCTcCTT CCTTTGgyTG CGCGACATGC CAGCACTCAT 4020
TACTGAAAAA TACCTTTTGC AGCATCGcGC GTTGCGTTGT AAGTGGACGG CGGTGCACAA 4080
CGCACAGCGG TATACGGTGA CGTTAAAAAA CAAGAAGACA GATGCGGTAC TGCAAACGGC 4140
AACTACCACA GGGGTGGAGT TCTCATTTAC CAACTTAGCG CACCTTGAGG AAGGGTCATT 4200
TCATTGGGTC ATACAGGCAC ACACAGAGCA GGAAGGCTAT GAGCCTGCAA GTGCACAGGT 4260
GGTGCGCGCG TTCACCATAC GGGTGTCTGA ACTTGAAAGG CCGCGCGCAA AAGAAATTGT 4320
CCATTATGAG TATCATTAGC CGCGTGTGTA TACCGTGTGC GGTGCTGCTG TTTGCGCAAC 4380
TGCACGCGAA GGAACTCGTC CACGTATCTC AGTTAAAAGA ACAGGAAGCG CGTATCAGCT 4440
GGCAGGAA 4448
(2) INFORMATION FOR SEQ ID NO: 66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3219 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
CGCGCAGCGC GGTTTTCAGA TTTTGCGCCT CTTCTGCACT GAATCCACTG ATACTCCCTG 60
AGCCCGCTGT TATCGGCTCC CGGATAGCmG GGGCAGACCT aATTTTACCG TCAGAGACAA 120
TTGCAAGGCG ACGCCCTATC TCCTTAGTGG TAAGTTCCGA AAAAATACGC GCTCCTTCAT 180
GGTCCAGGTC AAACAGCACC AACGGCTCGT TCGCGCGACC TGAGCTCACC GTTGCATCAC 240
GAATATGTCT TCCTTCAAGC GCAGGCTCCT TCTTAACAAC CAGAAACCCG TCGCGCACAT 300
CAAGTCCGTA GgAATCCTTG CGATATACTC CGAGCACACT GGTGTGCTCA GGAACCAGAG 360
ACAGATCGTG CAACTGATGC GCAgskTCGA AGGTaCccTG CGGGTTATTG CGATAGTGAT 420
CGAGAAGCTT TTGAGTCGCA TCATCATCCA CGAGATGAAA CGCCAGGACA CCACGACCCA 480
TGACGATAGA ATGAACACGG TCACGGTCAG TAAGACCAGG AATCTCCACA TACACGCGAT 540
CTTCCCCTTG cCTCCGAATA ACGGGCTCAG AAAGACCAAA GCGATTAATA CGaTTCTCAA 600
GGGTACTAAG CACCAGCGCC ATCGCTTCGC TGCGTATTGC GGCGCGCTCT GCATCCGGAA 660
CTCCCTTGGT AACTTCGCTC AAATCAGCTT TAATCACCAC GCTAGnGCCG CCGGAAAGAT 720
CAAGCCCGAG CTTGACAGCT TGGGCCTGTC TCCTTTTCAT CTTCAGGACG GCTTCCCGGT 780
ACGTCTGcTC CATCAACGGT CGCGCGTAAA GAACAAAACC CTGCTCGCTT TTTACGGGGA 840
ATGCAGAAAC GAGTGCCGCA GCGGTCCAGC GAGAAGGGGC CGGTCTGCCC GAATAAGAAA 900
GATTCTGGCG CGCAGGcAaC AAGCGGTGCA TAACGCGCTG AGATATCCTC ATCCGACCCC 960
GCACGCGCAA GACGGGTTAA ATCCGCAAGA TCACGCTCAG CACTCTGCAC AGCGTACTCT 1020
TTTATCTGCT CGCGCGAGCT GAGCGCACGC TGCCGCGTTT GTGCGTCGGT CAGAAAATAC 1080
CACTGGAGTG TAGGGAACAA AAACCCAGAG CACGCAGCAA GAACAACAAG CACGACCCCA 1140
AACCGAGCCT TCTTACTCAC CTGGCGATCT CCTTGTCCAC ACCTGTCAGG GGCACGCCGG 1200
GCTTCGAATC GCAATCTGTC TTAGGATTCG AAACACCTCT CCTGTCGTTT ATGCGCGCAA 1260
TCGCACTGCG GCTGACTTCG AGCGTGCCAT GCTCATTCAC CTTTATGACA AGGCTGTGCT 1320
CCCGCACCAC GCTTACCACC CCGTGGATAC CGCCGATAGT AACGACAGGA TCACCCTTTT 1380
TTATGTTCTT AATAAGAGCC TGCGTCCTTT TCTGTTCCCG CAGATTAGGC GCAAAAACAA 1440
AAAGGTAAAA GATCAGACAT ACGACGCCGA TAGCGAGCGG TGGGATCCAG CCACCGTTCG 1500
CCGTAGTGAT TTGCAAAAGA GTTCGATGGG GCATTGTTTT CATCCTTGAG CGCGCAGGAC 1560
ACACGAGCGC GCCCCCAGGC TAGCGCAAAA AAGACAATCC AGTCAATCAC ATCTCTTCTT 1620
TACCAaCGCG CGyGyGCGCT gGCATTAATC TCAAAACGAA TCCATATCGG GCAACTCTAA 1680
AATCAGCTGG ACTTCACCCT TAGAAATCTT CAGTGCGTGC GCAATAGCGT CGTCAGACCA 1740
GCCACTTTTG TGCAGCTTCA CCACATTCTG ACGTGTAGCC AGGGGCGGCG CTCCCGCACC 1800
GGGTATTTTG TTTGCTGGAT CCTGACGCAT CAAATCACCC AGCAAGCGCA ACTGCCCCTC 1860
AGACACCTTA GAAATTTCTT GCAGACGAGT TTCAGTACCC GCAAGCCACT CACGCGCATG 1920
CTGTATCTTT TCAATACGAC TTTCCATTTC TCCCAGCAGC GCATCTGCGC ACTCTATGCG 1980
CGAGCGCACA CGCTCTGCCT TTTCCTGGTT ATCCAACAAT ACGGCAATTT CTGCACGCAC 2040
ACGCTGCAAT TGAGGATCTA CTGTCTCCAA TTCTCCCCTA AAATTTTTAA GCGTCTTTTC 2100
AAGTTCCTTC AAGTTTTCAA ACGCTCTATC CACGTCCTGC ACCGTCTGAT CCAACACCGC 2160
ACCTTTCTTA TCCAAACGCT CATAACGCGT ACTGATATCG CCAAGCCCTT CTTTGACCTT 2220
TCGAATTTGC ACCTGATAGC GCTGCAGATC GTCATTCGCT AACGTGAGCT CAACAATCTT 2280
CTTGTCCATA GCATCAGAAA GAGCAGCAAG CTTTGAAAAC TCTCCCTCCA AAAGATCAAT 2340
ATTCTTTCGC TCCTGCATAA ACTTCTCAAC ACGCTGctCC GCCTCCTCTC CAAAGTGTTT 2400
CACTTTTTCA TACTGGAGAC TAAGCTTATC CATCGCCTCT CGATACACTT CGAAACGGGT 2460
CACCGTCTCA GTCAGCCGCT CGATATCCTT TTCAAGATTC TCCCGCAACT CGTCCGCCCG 2520
ATCAAAAATA CGAGTCTGGC CGATAAACTC ATGCTGTTTA CGTTCAATCT CCTGCAGCAC 2580
TTGCGAAAAG CGGTCACTTT CTCCCTGTAA TTTAGTGAGA AGACCTACCT GCGCCTCCCC 2640
AAACTCCGCG CGCAAATCCT GCACAAGGTC TCGTGTCTCC TGCAAGGTGC GGTCCACCTG 2700
AGCACGTGCT TCTTTTACTG TCTTGTGCAG TGCCTCACTT TCGCGCCGAC CATCTGCATG 2760
GAGCTGTCCT GTCACAGTAT TTACATAGCC ACCCAGTTCC TCTTTGACCG TCCGCATCGT 2820
GTCACACATC TTGCCTATTT CATCACGCAG ACTTTGCAAA GACTCACCAT TCTTCTGAGA 2880
AAAATCTTCA TACTGCATAT CATAGCGCGC ACTCAGGTTT TCAATCGCCC GCTCAGAAAG 2940
ATTCACCAAA TGCGCAATTT TTCCTTCAAA CAACTGCTTT GCATCCgCAA ACTGCTTGTC 3000
GGTGTGTGCC TTCCATGCCT CGATATCCCG TTTGACCGAC CCACAGCCCC CCTGCGCCTC 3060
CTGCTTTATC TCCTGCACGA GCACATTCAT ATCCCGaACT TCGGTTTCAA TCATACTTGA 3120
GTGAACATGC AAAGAGTCAC GCAGgCGTTG CCgCACTGCT TCCAATTCTC TCTCAATGAG 3180
ATTATGCGCC TTATGTGCAA GGCCGCGACA TCTAAGGTA 3219
(2) INFORMATION FOR SEQ ID NO: 67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2725 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
CAGGATnCCC CATTCCTGAG AAGAAGGCgC GCATCrCGmA GtCgACTGAC TACCCTTCCG 60
GCAGCCTCCG GTGCATCGTG CCTCACCTTT TTTACCCGTG GACACATACC CCAATTGCGC 120
AtTTCAAAAA GTCCGTTGAA CAATCGTTCG TCGTTTTCTT ACACGCAGAT GTGCAACAAC 180
TACGAACGCA AAACATCACG TGGCTTGGAT CCATTCGGCG GACCGACCAC CCCCCTTGCT 240
TCCATTTCTT CGATTAGGCG CGCGGCGcgA TTGTAGCCTA TCTTCAATTT ACGTTGCACA 300
TACGATGTGG ACGCTTTACC CGCGTATTGC ACTACCTGCA CTGCCTGCTC GTATAAAGGA 360
TCGCTTTCAT CCACAAAATT TCCAGATATA CTCGCGTCGT CATCGTCAAA GAAAATTTCT 420
TCATCAAGAT ACTCAGGCGT TCCCCACGCG CGTACATGGG CGATCACGCG CGCTAATTCT 480
CGCTCGGAAA CATACGCACC TTGAATCCGC GTAGGAAAAG ACTGACTCGG GTTCATGTAC 540
AGCATATCCC CTCGTCCCAG CAATTTTTCT GCGCCCATCT CATCCAAAAT AATACGGCTA 600
TCCATTTTAG ATGAAACCAT AAAGGCAATT CTGCTTGGAA TATTTGCCTT AATAAGGCCG 660
GTGATGACAT CGATTGACGG TCGCTGGGTG GCAAGTACCA AATGGATGCC TACTGCACGG 720
CTCATCGCGC ACAAACGCGC AACACTCGTT TCTAATTCTT TGCCAGAGGC AACCATTAAG 780
TCTGCAAATT CATCAATGAT AATAACGATG AATGGGAGAG GCTGCGTGGC GATGCTTTTT 840
TCCTGTATTT TTTTGTTGTA GGTCTTAATG TCGCGGCATT CTAATTGCTC AAGAAGCGCA 900
TAGCGTcGCT CCATTTCGCA CAGGATGTAC TGTAGTGCTT GGAGTGCTCT TTTGGGCTCA 960
GTGATGACAG GAGTGAGAAG GTGGGCGATA TCGTTGTAGA GCTTTAACTC TACGATTTTT 1020
GGATCAATGA GCAGAAGTTT GGTTTCGTCA GGACACTTGT GGTACAGGAT AGAGAGAATG 1080
AGCGCGTTTA CGCATACTGA TTTACCCGAC CCAGTTGCGC CTGCAATGAG CAGGTGAGGT 1140
GTTTGGGCAA GGTCGATAAC CTGTGGTTCG CCGGTAACGT CTTTGCCAAG GATGACAGGG 1200
ATGGCCATAC GGTTGCTGCC AGCTGTGCGC GTATGGAGCA GTTCTTTGAA TGTAACGAGG 1260
GATCGTTTTT TGTTAGGGAC TtTCCmCCCT ATGGCGTGTT TTCCAGGAAT GGGAGCGACG 1320
ATGCGCACGC TTGAAGCAGC AAGCTTGAGC GCAACGTTGT CCTGCAGATT TGTAATTTTT 1380
GACAGTTTGA TGCCGGGTGG AGGGAGAAGC TCGAACATTG TGACTACAGG ACCCTTCTTG 1440
ATACCGGTGA TTTCTACTCG AATGTTGAAT TCAGAGAATG TTTCCTCAAG CAGGAGTGCA 1500
AGATTCTTGG TGAGCTCGTC AATTCCTTCA TATGTGTCCT CTGAGTACTG GTCAAGCAAG 1560
TCGTACGGTA CTTGGTAGCC GCGGCAAGGG TgCCGAAgCG GAGCTGCTGA GGCAGGAATA 1620
GGACGCGgTG GTCCCTGTTC ATCGTCTTGC GCAGgAATAA GGGTTTCAGC GGGGGCGACT 1680
GAGGGAGCAG AGATAGGAGA GAGGGCCATG ACACACGGTG CCTGTGCAGG GATGACAGAC 1740
GGCAAAGACC CGGGTGACGC GGGGGCGTGA ACGTCTGAGG GAAGGTTACT CTGAATGAGC 1800
CCGGGCGCTG GGAGCAGGGG GAATGGAGCC TGAGAAGGCA CCGATGGCGC CAAAGCAGTT 1860
GGCGTGGACA CACCGCCACA CGCTGCCACt GCGTGGCAGG CTGCACCTCT GCCTCAGAAA 1920
TCAAAAATTC TCCCCCCTGG AGGGGAACTT CCGTGGAGAA TTGCCCCTCT GGGGGCGCGC 1980
TTGCTTCGGG CGTCTGCACA TCTGCGGTGG CGCAGGAGGG AGCGGGGGGA GGGGAAACGG 2040
TGTCAGGATG ATCGGCGGTG GAGGGAGGGA AGGAGGGGTC TTGGAATCCA TCAGCGATGA 2100
AATCcGAGGG ATACGTGCAT GAAACCATAC GTAACACcTT TCCCCGTAAA TGAGTGCAGC 2160
ATAGAGCTCT GCTCCCAGCA ATGCGAGGAG GCAAAGGACG CATACGATGT CTATCCCTCC 2220
CCGCGTTGAC GGTGAAATTG ACCGTGCcAG CGAGTGCGCG TCGTAGCGCG TAGAGACCGT 2280
GTTCTCCACA CACTGCAGTA ATGAACAAGA GTGGGAAGGC AACAAGTGCG CTTTCTGCCC 2340
GTAACGAACG TCCGCCGACA AACAAGAGGA GCGCTGTGTG CAAGAGTAAC AGCGGCACGA 2400
GCAAGGAGGA GAAAGCGTAC GTTTCGTAAA GGAGAGTGCC AGGTACGAAG AACCAGTGTG 2460
ATGCTCGGTG CAAGGTAAAA AGGGGAAGAA ACGTGGACAG GGTCaGGAGC ACTGCACTGA 2520
CGAATAGCAG TGTGCCGAAG gTAAGAGCGA TAATTCTAGG TAAAGGGGAT CGTTCCATGC 2580
ATTGTCCTGA ACAGTTAATC TGTTAGCTTG CACGCCCTGC AGGCTACCGA CCCCGACAGA 2640
AGGAGCCGAG TGAGGGGAGg AAACAGGCGC GACCCAAtAT CTTTGTAACG GTAAGATGCT 2700
TTGCGTTACA CTGnGACGGG CGTnG 2725
(2) INFORMATION FOR SEQ ID NO: 68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3406 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
CGGCGCATAC TGTACCGCAT CCTCCTGGTA TGCGCTnTCA CCCACGGnTT CCACAGCCGG 60
GCAAAGTTCG TCAGAAACGT GGGGTTCTTC CCAATCGTCA GGTAGGCCCC ATAACAGTGT 120
AGTGTCGCCT CTACnTTCCC CTTGCGCTTA ACGGCAAAAn CTGCnTTCCC CTGACTCAGG 180
TCCGCCTGCA GGTCCGCCAC CTTnCAGTCC GCATACAGTG CCGGGTGCTG CCCACGGCGC 240
GTGTGGGTGG TGCGCATAAn CAGGGGAAAG GATACTCCCA CCGTGTTGGT AGTACGAAAC 300
CCGTGCTTCA GATTGTAGGG ACCGGTGCCC ATAACTGCAC CAGGGGCCTG GCCATGACTG 360
CCTACCCCCT TGCCATAGCT GATGCCCCAC TCAAGTGTGG CAGAGCCAGT TAGCTTCGGG 420
GAAAACTCCT GTCCGAGCAC TCCCCCGCTC GCTCCTACCC CCACCACCAC ACACAGCACA 480
CTCCCCCACC GCATGCACCC CATGCTACnT CACCCCCCCC CnGGnCCTGT CTAGTAGCCC 540
CyTCAcCTTC TTTTCTAACA CTACTGCCCA ATCAAGGTAT CCAGCTGCTT AGACAGCGCA 600
CGGTgTGCGC ATTGTTCGGA TCCAGTGCAa TCACCTGATG CAAATAGTAC TGCGCTTTGC 660
GGAAATCCTT TTTCTTTCGG TACCATTCAT ATAACGCAAA AAGAGTGCGT CCATTGCGCG 720
GATCTGAAAG CAAACTTGCA CGCAATAAAC TTAATCGCTC CTCCTCGTGC ACTGACAACA 780
ACGCTTCATA GTACGAAAAT ATCGACCGCA GCGTACCACG CGCGGACGCA CTCCGAGCTG 840
CAATCACCGC ACGGATCTCC CGGTAATGAC GCGCCTCATA CAACGTATCT AAGTACAAGA 900
CGATGACCGT CTCAGACGGA GGCTGTGCCG AATGATATAA ACGCCGCGCA AGAGAAATCG 960
CCTCcTGCGC ACGACCTGAA CCACTGTACG CGCGAATAAG TAATTCTTGA TGAGCCTCAC 1020
TCGGATATGC GGTATTCAAA CgCTCCGCGC GCGACACTGC CTGTTCCCAG TTACCCTGCG 1080
CCAGTTCATA CTGCGTCAAT AAACGGAGCG CTTGAGCGTT ATGCGCATCG GCGCGGAgCA 1140
CCAGCGCGAT AAAGCTTcGC ATGTTTTTTT GTGCAAGCGA GTGACCGGTC TCAAAGCAGT 1200
GCTGCGCACA CGCAAGGAGC ACCTGCACAT CACGCGGATA CAAGCGATAT GCCTGCTGCA 1260
AAAAGTCGTG CGCAGACGCG TCATTTTTGC TCCATTCCTT TGCCACCTGC GCACGCAAGA 1320
GCAAGTACGT TTTATCCGAA CTGTCACGCT CTGCAAAAGA GTCAAGGGAC GCGTGaGCCT 1380
TCTGaTACTC CCGcTGaGCC ACCAAGAT C GAATGTGTAG CAAAAGCGCC GCATTGTCCT 1440
CAGGCTGTTG ACGCAACAGC GCTGCAACGT ACGGTTGCGc GGCGCTCCAT TCTTGACGCG 1500
CGCTGTGAAT TTGCGCCTGC AAAAGCTGCA CTGCTCGGTC TTTCGGGAAA CGGTGAAGCA 1560
ATAGGCGCGC AATGGGCAGT GCCAGGGCAG TCTCACCGCG CTGTACCAGC ATGCGCGCAT 1620
AGCAGATGCC TGCAGGATAG CAGGATGCGT CTTTTTCCCA TGCGCTGCGG TACCTGnACT 1680
CCGCTTCAGA GAGCTCTCCG TGCTGCTCGT GCAAATACCC AAAGAGCAGG AACGGAAGGA 1740
GAGAGTGCGG GGCCTGCGTG CGTGCGCGCG TtCAGGCGTT CTCGGCAATC TGCGTGTACC 1800
TCATCAGGAA GGCGTTTTTT CAACAGAGTC AGCGGGGGAA TGATATCCGA AAAAAAATCC 1860
GGTTCTCCAG GAATGAGCGG ATACGTACCT TTTTCCACGT CATCGAGCGC AGTCAGGTAC 1920
GGATGCGCGG TGTTGTACAG CGGCACATGC CAAGAAACAG CTTCCTGCGG ATATACAAGT 1980
TGCATAAGCG CAnACACACG CGGAGGTACA GCCGATTGTG CGGTGTCAAT CCGGCCGGAT 2040
CGCGCTGTAT GCACGCAGCT GCCTCGCGCA ATGAGGCAGG AGAAGCAGTT TCAATCAAGA 2100
AGAGTATCTT TGGATCCATC AGCTTTGTGC GAATTGCGCG TCGGTCCGGT ACGTCAAGCG 2160
CAtGCCaCGC ACAGGaTGCG CCGCCACCTC CGGCGCCTGG GATGCGGAAG AAACCGGATC 2220
GGTACtGCGC GCGTCGGCAC CACCTTCGAA GAACTGCGAC AACAGAGAAA AACGGGTATC 2280
AGCACAAAAC CGACACTCAC CCCGATGGCG CGGTCAATGC TTTTAGTAAG CGCCCCCATA 2340
GAAACCCGTT TTTACCTCCc TGcATGGCAG TCGTGCCATT TACACAAAAC GCCTGTTGTG 2400
ACCGTACCGA CAACGTACGC ATACCGGCGC ACCGCCGCTT CTTTTCCCTT TACGCGGTGT 2460
TCAACGcgCA CGGCGCATCC CTTGTGCTCC CCACGCAAGA CATGCTAGGC TGGCTGCCAC 2520
CGAGGGCGAA GAGAGCGTAC AGGAGGTTAA CGGTTTTTTT GCGAGAAATC ATTACCGCCC 2580
GTGCGTGTTC ACTCTTCCTG TTTCTCCTTC GTTGTTTCCC TGCTGGTCCC TGTGCGCGGG 2640
CGCGCCGGTT GTCCTTCCTT CTGTTTCTCT GTGGTGCGGC AGCCTGCCCT CCGCTTTGGG 2700
GGGCGTACGC AGCGCACcAg CGTTGCGCGC TCAGTCGGTA CCTGACACCC TCATTCAGCG 2760
CGCGCTCGTG CTCGGTCCGC TCGTGCACCC CCTGTACCCG CCGATCCAGT CCTTCAAAGA 2820
ACAGTACCGG AGCGCGCGTT ACCGGGAATA CCTCTCTGTC GTTATGcAGC GGAGCGCGCC 2880
CTACCGCCCC TTTATCGAAA AACTGTGCGC GACGCTCACC TTCCTGTCGA .GCTGCTCTTT 2940
CTCCCCGTTG TCGAATCGGG CTTTCTCGAA CGGGCTGTCT CCAAATCCGG CGCAGTCGGC 3000
ATTTGGCAGT TCATGCGCAA TAGCATCGCA GGATCTGCCA TGCGCGTGAG TGACTGGGTA 3060
GACGAACGGC GTGACCCCTG GAAGGCTTCC GTCGCCGCAG TCAAAAAACT GCAGTGGAAT 3120
TACACGCAGC TGCGTGACTG GCCCTTGGCC CTCGCTGCGT ACAACTGCGG TCTTGGCGCG 3180
ATCAAGCGAG CCATTGCCCA GGCAGGAACC GCCGATTTTT GGCATCTGAG TGAGCGCGGc 3240
TTTCTGCGCG ACGAGACAGT CCGCTATGTC CCAAAGTTCC TTGCGGTTGC AGAAGTACTC 3300
AGCCGGAGCC ACGAGCACGG CATCGCCTGG GGAGCGGCAC ACACCCCCGA GGAGACCACC 3360
ACGGTTACCG TTTCGCGCGC GGTAGACTTA AACCTCTTGG CACAGG 3406
(2) INFORMATION FOR SEQ ID NO: 69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7874 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
TGATAGCAAA TTATCTGCTg AAAGGCTCAC AGTACAGCAT GTTGGTCCCA GTGGTTGTTC 60
GCCTGGGGGC GtATACAAGG TGTGAGGGAG TTGGCATTCG GGGGGCGTGT GCGGAAATGA 120
AATGGaGTGG GCTCTGTCTC TTTCTCCGGA CGGGGGGGGG GGGGTGCGAA CAGATGAACG 180
GAAAGCGTGT GTTTCTGGGC ATTGTCgTGG TGGTCtGTGC CGCGCGCTGT TTTTGCGCcG 240
GACGTGTTCT TCTCCTCGCa TCTGGGGTaT GGGCGTTTCT ACGCCcGTGG GGAAAGACAT 300
TGAaGGGGCa GCACATGCAC GTTCCAAGCA TTGGCGGCGG CGTATGTGTG GTGGCArACA 360
GCGGGTTTGC CTTCGCCTGC ACGGTGGACG CAGCCCTGAC CCGTATAATG CTGAAAACTC 420
AGGCGCTCTT TGGCTATGCC TTTCGGTGGG GAGCGTTCAG CCTCATCCCC TTGCTTGGGA 480
TGGATGTGAT TGTGTCGAGC GACCACGCGT TTGGTGTTGC CGCGCAAGTG TCGTTCCAGC 540
ATTTGATTTC TGAGTGGTGG GGCTTTGCCT TGAGTGTGAG CGGCGGGGTG GACTTTCCGC 600
TCAACCCTAA CACCCGCTTT TTAGCAGGTA AGCTGCCTGC AGAAACGGTG CAGCGCGTGG 660
CsTCGTTGCG CTGCGGCAAA AGCTTATTAG CGAAnGGATT ATCAAGGCAT TGGATTTGGG 720
CTGGTTTATT ACCTTCGCTC TGACCGTTGT TGCCGAGGGA TTCAGTTGGA TTGTGTCGCA 780
GAGCGCTTGG ATTGCGCAGA AGGCGGTGAA TTACTTTTTG AGCGACACCA CGCGTTGTCT 840
CATTCTCCCG GTCACGCTGC GGGCCGGTCC TACCTTTCGA ATATAGCGTG CGGGGGGGGG 900
CGGATTTAGC GGCGCGTTGG CGTCCGCTGT CGAGTTGCCC AGAGCGCGAG AAGAATGCGG 960
TCGATTTCTT GTGTAGCGTT CCGTCGTGCG CCGCGCACTT GAAcGAGCTC AGCCGGTGCA 1020
TGAGCGAAAs CgCGTGGACG ACTGTCTGCC CGAACTTTCC CGGCAGtGCG GCGGCACcGC 1080
AGTATCCACT TTGAAAGGTA TACCATCCCC GAGGCCATGC AGGCCTGGCG CGCTCCTGCC 1140
GAAAGCGCAC CGCTCGTCAG GGGGACCCCA AGCAmCGTTG CCCACACGCG TCCAACCGGC 1200
GGTCTAAGGT AGCAGGCGAG CGTTCCCACC CACAGCCCTG CCAAGGTCAT CCAGCGGATG 1260
CGTCCCCCCk TTGCCGCATA GAGCAGCAtG CGATCCCGCA CACGGACGCC TGCACGCTCA 1320
GCCGCGGCAC ACACAGCAGT ATCAACAGGA GCAGGAGTCC TGTCCCCATC CGTACATAAC 1380
GCAGCACACG CGGCCTAAGC CgTGTTCCGA ACACGGCGCC GTGCATTCAC ACCCCGCCTG 1440
TACCGGAAAA AAAAGAGACG CAGGCGGTGC GGTGTCTTGT TGTCCGCATG CCGCGTTCTG 1500
CAGGTGCGCG TACCACTGGG AGTGGGAGAG AAAATGGCCG GTTAGCGCTC CGAGTAAGGT 1560
ACTGCTCACT GCCCCCAGCG CGCAAAATAG CGGGAGTACG TACCACACCC CCGCGCCAAA 1620
AACGAGCACC CGAGCAAGCG TAAGCTGTAT CACGTTTGAA CAGAACGCAC CCATCACGCT 1680
TATCCCCACG CACGAAAGGT ACCGGCACGG GACGAACCGC AGCGCATACA TGAGCGCACC 1740
CGAAGCGGTG CTCCCTGCAA GCGAGAGGAC AAATACATAA GAAAAGAGCG TCCCACTCAC 1800
CAGAGCCTGC CCTATTACCT TCAGGAATAC TAAACGCGCG TACGCACAGA AAGGGAGCAG 1860
ATCCGGCGAG ATCAACAGCG GCAAATTCGC AAGCCCCACG CGAAAGAAAG GCAGCGGCTT 1920
TGGAATGACG TGTTCAACCG TAGAGAGAAA GAAACACATG CCGCCTAAAA GCGACACTAA 1980
CTCATCGCGT ACGTCTAGTG GCAGCCTGCT CCGCACGAGC CGCACCCTCC CGCGCCGCTC 2040
CCACAGCCAC CGCCGCTGGT GCTTTCCCGA AACAGAAGTG CAGCTAAGTC ATCGTCCGTC 2100
GCCTCTCGCA CAGAGCGCAC AGCCACCTCA AAGTGGAGCG TCTTCCCCGC GAGGGgATGA 2160
TTTCCATCTA CAATAATCGT TTCACCTTGC ACGTCAGTGA CGGTCACCGG TCGACTGTCA 2220
CCCCCGCTTC CTGCATCAAA CCGCATGCCC ACCTCTATTG GCACGTTTGG AGGAAACTGA 2280
TCTCGCCCCA CTGTCATGCG CAAGTCCTCC TGCACCTCTC CATACGCTCC TACCGGAGGA 2340
ATGGTTACTG AAAACTCCTC CCCCTCTTCT CGGTTAATTA AGGCGGTCTC GAGGCCAGGA 2400
ATGATCATGC CGTGCCCCTG AACATACTCG AGCGCACCCA TCACGTCGGA AGAATCGATG 2460
ATCTCCCCCt GTCATCTCgC AGGGTGTACT CGATgTTCAC CACACACTCA TTTGCGATTT 2520
TCATGCGCGG CATGCTAGCA CAGGCAAGAT aCTCACGGCA AGGGCAGTTT CTGTGCCGTG 2580
TGCCyTTGAc AGAATCGCCG TTATAGGGGA TAAGCCGGGC GAGGTGTTGG GAGCGTGTGG 2640
TCCACTTCTT GCCCTCTTGC GCgGTGCTGT GCgGTAAAAG AgGGGGCGTC gCGTTCGAGT 2700
AAAATTTTCT CTTAAGCCTT AAGTGAGATA CCCCATTATG GTAGAGGTCt AACCGCGGTT 2760
GCGCGCTGCT GTTGCTCGGT TGGCGGTCTG TAGCGCTGCG GAGAAGGACG GTGCCCTGcC 2820
GCTGTGGCGA TGCGCTACAT GCGCAGCGGG AGGATATCCT GCGTGCAAAT GCGCAGGATC 2880
TTGCGCGGGC GCGTGAGGCG GGTCTTGCCg CACCGCTTGT CGCCCGGCTC GCGCTGAGTG 2940
AACACCTTCT TGAgGACATG TTGCGGTCTT TGaCTGTTCT TTCGCTTCAG CGGGATCCTA 3000
TCGGGGAAAT TATAGAAGGG TACACTCTTG CGAATGGACT GGAAATCCGG AAGGTACGTG 3060
TTCCTCTGGG GGTGGTGGCT GTCATCTACG AGTCTCGGCC CAACGTGACC GTAGATGCGT 3120
TTGCACTTGC GTACAAAAGC GGCAATGCGG TGCTCCTGCG CGCAGGTTCT GCAGCGAGTT 3180
ATTCAAATGC CCCGCTTTTG CGCGCAATTC ACGTGGGTTT GAAGAAAGCG CATGGTGTCG 3240
TGGACGCGGT GGCTGTTCCT CCCGTTTTGG AGGAAAAATA TGGTGATGTG GATCATATCC 3300
TCcGCGCGCG CGgCTTTATC GATGCGGTAT TTCCTCGTGG GGGGGCGGCG CTTATCCGGC 3360
GCGTCGTGGA AGGCGCCCAC GTGCCAGTTA TTGAAACCGG ATGCGGCGTG TGCCACCTAT 3420
ACGTAGATGA GAGTGCGAAT ATCGATGTGG CGCTGCAGAT TGCAGAAAAC GCGAAGTTGC 3480
AAAAACCGGC CGCATGCAAT TCAGTCGAAA CGCTGTTGGT GCATCGTGCG GTTGCGCGTC 3540
CTTTTTTGCA CCGTGTACAG GAGATTTTTG CCACCTGTGA GGAGACTACG CGCAAcCCGG 3600
TGGTGTGGAT TTTTTTTGTG ATGCTGAGTC TTTCTCCCTT CTCACAGAAA GGGGCGCGAG 3660
AAAAAATGTT TTTCATGCAC AGGCAGAGAC CTGGGATCGG GAATACCTGG ACTATCAGGT 3720
ATCCGTGCGG GTGGTGCCAA ACCTTGAAGA AGCACTCAGG CACATTGCTC GTcATTCTAC 3780
GAAACACTCA GAGGTTATTG TCACGCGCGA TCGTGCCCGT GCGCGTCGTT TTCATCAGGA 3840
AGTAGATGCT GCCTGTGTAT ATGTCAATGC TTCAAGTAGG tTTACCGATG GAGGGCAGTT 3900
TGGCATGGGA GCAGAnATTG GGGTCAGTAC GCAAAAATTG CACGCGCGCG GTCCGATGGG 3960
TTTGTGTGCA CTGACTACTT CAAAATATCT GATTGATGGA GAGGGGCAGG TGCGTCCGTG 4020
ATCCGTGCGC TTTTTGCTGC GGCAAAAAAA AtTGTGATAA AGATTGGGTC AAATACGCTT 4080
GCGCAkGCAG ATGGTACTCC TGATGAGGAG TTTTTGGCGG wGTGTGCTCG CGCCTGTGCG 4140
GCGCTGATGC GTGACGGCAA GCAGATAGTT GTGGTGTCGT CTGGCGCTCA GGTTGCAGGG 4200
ATTTCTGCGC TCCATTGCCT TTCATCTCCT CCTCAGGGGG CGGGTTTAGA GCGTCACGAA 4260
TCGCGCGGCG TTATTCCGGG TGATGGTGCG TCCTGCAAAC AGGCGTTGTG TGCGGTGGGT 4320
CAGGCGGAgT TGATAAGTCG TtGGCGTTCT GCGTTTGCAG CGCACCAGCA GTGCgTGGGC 4380
CAGTTTCTGT GTACGAAGGA GGATTTTACT GACTCGGACC gCGCGGCGCA GGTACGCTAC 4440
ACGTTGTCCT TTTTGCTCGA GCGCAGGGTA GTACCTATCC TTAATGAAAA TGACGCGCTC 4500
TGTTGCAGCG ACGTCCCCTC TGTAmCCGCC GACCGGcGGt GTCCCTATCA CCTCAAAAAA 4560
GGATTGGAGA TAATGACAGT CTGTCCGCGT TTGTAGCGCT GTTGTGGCAG GCAGATCTTT 4620
TGCTTTTGTT GAGTGACATT GACGGCGTGT ATGACAAAGA CCCAAAGGCA CACACAGATG 4680
CGCAgcACGT TCCTCTGGTG ACGGACGTGT CAGCGCTTGT GGGTAAAACG AGCATGGGTT 4740
CTTCCAATGT CTTTGGTACG GGTGGGATTG CTACAAAGCT GGATGCTGCG CGTCTTGTCA 4800
CGAGGGCGGG AATTCCTCTG GTGCTGGCAA ACGGGCGCCA TCTGGATCCG ATCCTGAGCC 4860
TTATGCGCGG GGATGCGCGG GGGACACTTT TCGTGCCTGT TTCTTAGAGA GCGACGTGGG 4920
TATGCGCAAG TGCACGCATT GTGCCCTATA ATGCGCGGCG TGCGGTCAAT TTCTGACGTG 4980
TAATTTTTCT CGGTGGGGCG ACGTCTCCGT CTGTCTGTTA ATTCGGTGGT GTGTTTCGAT 5040
GCGAGAAAAG GAAGGAGGTG TGGTGAACGA CGATTTTCAC TATGAAGTGA CGCGCAACTG 5100
GGGCACGCTT TCCACATCGG GGAATGGCTG GTCCCTCGAA CTGAAGTCTA TTTCTTGGAA 5160
TGGCCGGCCA GAGAAA ATG ATATCCGCGC GTGGTCCCCA GACAAGAGCA AGATGGGAAA 5220
GGGGGTaACg cTTACGCGTG CAGAGATTGT AGCCCTGCGC GATTTACTAA ACAGTATGTC 5280
CCTGGACCCG TACTAGGGAC AGTCTGCAGT GCTTTGTGCA GcGCGGCGCg cAGcgTCGGt 5340
GGCTAGCCGG TCGCACAGTT CGTTGTACGG GTCTCCTGCA TGTCCTTTTA CCCAGCGCCA 5400
CTCGACGGAT AGGGCGTCGG CGAGTGCGCT GAGCGCTTCC CACAAATCCT TGTTCTTGAC 5460
CGGTTGTTTG GCAGCCGTTT TCCAGCCGTT GTGTTTCCAG GTATGGATCC ACTGGGTGAT 5520
GCCTTTGCGT ACGTATTGGG AGTCGGTGAC CACTACCACC GCCTCTGCAG CGCGTCCGTG 5580
TGCCTCTTGC AGTGCGTTGA TGACCGCGCA CAGTTCCATG CGATTGTTtg TGCTCGGGTA 5640
GGCgCTGCCG CTTCTAGTGA ATGCGGCAGC TTCTGGTGCG GTTTGTCCGG TTTCTAGAAA 5700
GGGTACGTCT GAGGGCACCA GAGCAAACGC CCACCCGCCC GGACCCGGGT TTCCCAGACA 5760
GGCGCCGTCA GTGTACAGGG TAAGTGCAGC GTGCGCGTTC ATAGTCGCGC tACGGTAACA 5820
GTTTTGCGCC GTGGGGACAA TGTATTGGTC CGACAGTTGG TGATGGAGCG AAGATATTTT 5880
CGCAAGGAGG GAGAATGAGG CGCGCACGGA TTGTGCAGGA ACTTTGGTAC GCGGGACGAC 5940
GGTTTGGTTT TTGCGGTACG CTGTCTTATT CTGCAAGGCG GTGTACACGT GCGCGTTGCA 6000
CTTTCTCCTC GGGTGTACAT GCTGCACTGT TTTTAGAGGA AAGCTAACAC GGAGAGGGCA 6060
CAGATGAATA TTCTGCATAA CTTTGTTGTA TTCGAAGGTA TTGATGGCAC AGGCACGAGT 6120
ACACAGTTGC GTGCGCTCGA ACGCCATTTT CAGGCCCGTA AGGACATGGT CTTTACTCAA 6180
GAGCCTACCG GAGGGGAGAT TGGCACTCTC ATTCGGGATG TGCTGCAAAA GCGTGTGATC 6240
ATGAGCTCTA AGGCATTGGG ATTGCTCTTT GCCGCAGATA GACACGAGCA CTTGGAAGGT 6300
GCAGGAGGCA TTAACGATTG TCTTGCAGAA GGAAAGATAG TGCTCTGCGA TCGGTATGTT 6360
TTTTCCAGTT TGGTGTACCA AGGCATGGCG GTGTCGGGTA GTTTCGCGTA TGAATTAAAT 6420
AAAGAGTTTC CGCTTCCTGA AGTTGTGTTC TATTTTGACG CGCCTATCGA AGTATGTGTT 6480
GAGCGTATCA CCGCACGTGG GCTGCAAACG GAACTGTATG AGTACACGTC TTTTCAAGAA 6540
AAGGCGCGCA AGGGGTATGA AACTATATTT CGCaAGTGCC gTCaTTTGTA CCCTGCAATG 6600
AAAGTGATTG AAATAGACGC GCGCGAGGAA ATTGAAgTTG TGCATGAGCg TATTCTTCAC 6660
CATCTGCGCG AATACAGGCG TCTAAAATAG TGTGTGGACG TAGATACACT ATCTGAGGAG 6720
CAGTGGAGAG TATATATCAG GAACGTGCTT TGCAAGCGGA AGGCGCGTGC TCGGTAAAAC 6780
GGTGCTGCAC CGGCGCAgcA TaAGCAAAAT AATTGGAAAA TTTGTCCATA GGTTTTTGTC 6840
GTCCGGTCAC AGTGCTCAGT GCCTTTTTCT AGGCTGTTTT TCAATAACTG TTTATGTAGA 6900
CTGGACGGGT CTTCCTTTCT CAACTCACAT ATTCTTTTCG GGGACATGCT GCCGTTGGCA 6960
GACGTTGGGT GTGACGGGTG TTTCTCTGGT GTGTAAGAGG AAGATATATT CCCCTTTTGT 7020
ATCTGCACTG ACCCCTGCAC GGGGTACAGG CTATTGACGC TTCCTTTCGT CTGTGTGTCT 7080
TCACTGTTGC GTGTACGGCG CGTGAACGGG CCATATAGAT AGATGCTTGA CGGGGTCTGG 7140
TTGCCATGTT AGGATCCACC AAGCGTGACT ATTCTTTTCT GGCCGCGTGT GATGCATAAG 7200
ACACTCCCAT AGCACCGTTA AGAGTCTCGC GAAACCTCCT CCGTATGGAG AGGGGTAATC 7260
CAATTGCCGT GGAACGCGAA GGTTCTGTGT TATGTCCGCA AAGATTTACG TCGGTAATTT 7320
AAATTATGCC ACCACTGAGG CTGGATTGGC CTCCCTTTTT TCTCAGTTTG GGGAAGTGCT 7380
GTCCGTGGCT GTAATCAAGG ATAAGCTTAC GCAGCGGTCG AAGGGCTTTG GTTTTGTTGA 7440
GATGGAAAGC GCAGAATCAG CCGAGTTGGT TaTTAACGAG TTGAATGAGA AGGAGTTTGA 7500
AGGGCGTAnG CTTCGCGTTa ACTATGCGGA GGAGAAGCCG CGTTTTcCCT TTaAGAATTA 7560
GTGGAGGATG GGGAGGACTT TcCATCGTGG CGCATGTTTT TgGCGTAAGG TGCTTTCGCG 7620
TGCGTTaTCT CATTTcTCGT CGTCTTTTGG TTcTCCCCGT TTGTGTGCGT CGCGGTgTGT 7680
TTGGTTcCTG TTaGGAACCC CTTCGGGGcT TCTGTcTATT TTGcTCCCAA GACTGCTAkT 7740
ACTATGGaTG agGcTGcGTC TCGCGyCCCA GGGTTgyCaw GwAgGGTGCC gTCtTTTGCG 7800
CCTGGGTTGA AGCAAGGTTT GCCnGGAACG TTGGGTCCGT TGGGTTGAAC CCAAGAAAGA 7860
AAAAAGTTnG GGCC 7874
(2) INFORMATION FOR SEQ ID NO: 70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20682 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
GTTATGGTCC CTGTTATTGG CGATTTAAGG ATGCTGCcgT ACGTGCrCTA TCTTAtmCGA 60
AasTGCgCTG CGATGCgGTT TTTcATGCTG CCGCGTATAA GCaCGTTCCT ATGATGGAAC 120
TCAATCCTGT TTCAGTGATT GAAAATAATG TCTTCGGCAC CAAATTCTTG CTCGATGCCT 180
GTATTGCGTG TAGGGTTAAG CGCTTTGTAC TTTTGTCCAC TGACAAGGCC GTGGATCCTG 240
TTTCTATCTA CGGAGTATCT AAGATGCTCA ACGAGAAGAA TGTCTTGTAT GCTGCTGAGC 300
GTGTGCGCGA TTTCGGTCAC GATGCCGCGT ATATGTTTGT CCGTTTTGGA AACGTATTGG 360
GTTCCCGTGG TTCTATCATG CCGCTCTTTA TTGAACAAAT AAAGAAAGGG GGGCCCGTTA 420
CCGTGACAGA TCCTGCCATG ACACGATTCT TTATGACTAT TCCCGAAGCG TGTTCACTCG 480
TTTTGCAAGT CGGTGGAGTA GGAGTAAATG GAGCGTCGTA TCTTTTGGAC ATGGGGGAGC 540
CTGTGAGCAT TATGGAGACT GCGCAgcAAC TTATTCGCTA TTTTGGTTAC GAGCCAGACA 600
GAGATATTCC TATCCACGTG GTGGGCTTGC GTCCTGGCGA GCGTCTCAGT GAGCCACTCG 660
TTTCCAAAGA CGAGCGTATA GAGCCGACGG TATATCCAAA GGTTCTGCGT TTGCGTGAAC 720
GTGAACCTTT GGATTTTGCG CACCTTGAAC GCCTGTGGGA TCAACTGTAT CCTTACTGTT 780
TCCCTTCAGG AGAAAAGGTG CGGTACCGGC ACAAAGAAGG ACTTGTCCGC GTGCTATGCG 840
ACTCGTGCGC GACACTGAAA CAGCGGTATA TGCCAAATAG CGAGGCATAG GAAAATGGAA 900
GGTACCGTGA AAAAAAAGAA AGAGGGTGTT CGTGATGATA ACGCGCAGCA TGCGGTGTTC 960
AACAAACAAG TGCCGTTTTT TGTGCCCTCG TTTTCTGAAG CGGAAGAGCG CGCAGTCTGC 1020
GATGTGTTGC GTTCAGGATG GATTACGACG GGAACACAAG CACTCGCGTT TGAAAAAGAG 1080
TTTGCckTwT gTGGGtGCTC CCTATGCGTG TGCGGTTAAC TCAGCTACCA GTGGTTTGCT 1140
TCTCACCTTT GATGCAATGG GCATTGGGCC GGATAGTAAG ATACTTACCA GTCCTTATAC 1200
GTTTGTGTCT ACGGCGAGCT CTGCACTCCA CCTAGGTGCG CAGGTGGTGT ACGCCGATAT 1260
CGAGCGCGAC TCTTATAATA TCAGTGCAGA GTGTGTTGAA GCGTGTTTAA AAAAGGATGC 1320
GCGCATCCGT GCTATTGTAC CCATCCATAT TGCCGGGAAT GTATGCAATA TGCGTGATCT 1380
CAATGCTCTT GCGCGTAAGT ATCAAGTGGC AGTGGTGGAA GATGCAGCAC ACGCTTTTCC 1440
ATCGAAGACT GCGTGTGGGT ATGCAGGCAC ACTGTCACAT GCGGGGGTAT TTTCCTTTTA 1500
TGCCACCAAG CCGTTAACCA CCGGTGAAGG AGGTATGGTT TGCACAAATG ATGCGAAgcT 1560
TGcAGCGCGT ATTGCGTGTT TGCGTTCACA TGGCATTGAC CGGGCTATTT GGGATCGGTA 1620
CACAAATGGC ACCGCACCGT GGCGTTATGA CGTAACAAGC CTTGGGTGGA AGTGTAACCT 1680
GCCGGATATT TTAGCAGCAA TTGGACGCGT ACAGTTGCAG AAGGCGGCGC ATCTTTTTGC 1740
ACAACGCGCG CGTATTGCCG CCGCGTTCAC GCGTGCTTTT TCTCGTTATG AATTTTTTTG 1800
TACTCCGCCT GATGGGGATG GAAACGCGTG GCATTTGTAT TTGTTGCGCT TAGTTCCTGG 1860
AACGCTTTCT GTTTCTCGGG ACGAGTTCGT CAGATTATTG CAGGAACGGG GATTGGGCGT 1920
TTCTATGCAT TTTATTCCTC ATTTCGAGAT GACGTTTTTT AAGAAAAGTC TGTGTGTACG 1980
AGCGGAAGAT TTCCCTGAGT GTGCGCACAA GTATCAGCAC AcGcTTACGC TTCCGTTGTG 2040
GCCGGGAATG GATGACAGTT GCGTGGCGTA TGTGATAGAG ACCGTGGTGC GCACCGCACA 2100
AGAATGTGCA AAGGGAAGAG CATATATATG AGCGTGTTCG TTTCAGACGG TGCGCGCACA 2160
GGGAGCGTCT ATGCACAGCT TGTCCGTGCG CCGCGCGTTG CAGGATTGCT GCTGAACATA 2220
GATATTCCCT CTCTCCtGAC GGGTACTCTT TTTATACTGC AGCACATATT CCCGGATGCA 2280
ATGCCGTTCG GTGTGGGGAA AATACTGTGC CGGTTTTTGC GCATGGAGAG GTGGTGTACG 2340
CAGGGaACCG GTGGGTATCC TCATTGGGCC TGATGAGCAT GTGGTACGTA ATTTAGTGCA 2400
AGATGTGGTG GTGCATACGT GCGCAGAGCG GGCCTGTGCG TCGGAAATAC TCTGTGGAAT 2460
CAGTGAAGGG GAACCCCTCG CTCAAAAGGT GGCGGTGCAA GGAGATGCAG AAACTGCTTT 2520
TAAACGCGCA TCACACACGG TATGCTCCTC TTGTACATTT GAGCCGCGTG TACACTACTT 2580
TGCGGAAATG CCAGAAGTAC AGGCACTACC CGACGCGCAC GGTCTGCACG TGTACGCTGC 2640
TACGCAtGGc CTGCGCACAT GAGAAAAACT ATCGCGCAGg TACTGAATAT TTCTGAGCAT 2700
GCGGTGCACG TACATCCGCA GCAGGAAGCG CTTTCCTGTG ATGGGAGAAT ATGGTTCCCC 2760
TCAGTGATGG CAAGTCAGGC GGCGCTTGCA GCCTATTGTG CGAAAAAGCC GGTACGCTTG 2820
TCTTTTTCCT TTCAAGAGTA TGTGCAGTAC TGTCCTAAGA CTCCCAAGAT TACCATTGCA 2880
CATCGCACGG CGCTCAACGC CGCGCATGCG GTAGAAGGTA TGTTTGTTTT TATCTCCCTC 2940
GATGCAGGAG CGGGGAATTT ATTGATCGAT CGTATGGTTG CGCATATGGT CCATACTGCA 3000
TTAGGAAATT ATGAAATTCC TCGGTACCGC ATTGAATGCA CAGCGTTTCG TTCAAATGTT 3060
GGATTAACGG ATGTTTTTAA TGGATGGGCA GATGCATACA CTTCTAATGC ATTAGAAATG 3120
CATATTAATC AGTTATGTGC TGAGCTTCAT A ATTCCCTG ACGAGTGGCG TGTGGCGCAC 3180
ATGAAAGATA CGCGGGAAAC ACAGCGTTTT GCGCGGTTGC TCGCCTATCT GTGTGAGGAA 3240
GGAGATTTTC GTCGAAAGCA CGCAGCCTTC AGCATGGTCA ATGCAGTACG AAAAGCACAT 3300
GACACCCATG CCTGGCGTGG TATTGGACTC GCGTTGGGGT TTCAATATGA TCCGTCTGCG 3360
ATGTTAGCCC GTTCGGGTTT TTCCTATGTA TTACAAATGA CGCTGCACAC TGATGCGCGC 3420
ATTGTGGTGC ACAGCGTTCC GCTTTCTGAT TCGTTTAAAC GGGTAGTGGT TGCGTTTCTC 3480
ATCAGAGAGT TTGCGTGTCT GGAGGATGCG ATCTTTTTTA AAAGTAGTGA TGAGGCGTAT 3540
GGCGTGGATC TGTTGGGTCC GTCTGTGGAA TCAGTGGGGA TGAGGGTGTT TGCACGGTTG 3600
GTGAGAAAGT GTGTACGAGC AATTCAGAGA CAACGCTTCA GAAAGCCACT TCCTATCACG 3660
GTACAGGGGT CCTTTAACAC GGCCAAGAAG GGGCAGGTGT ATCAAGTGGT GACTGTTGCT 3720
AAGTCAGATG TTTCGGTGCC CGATGCGCAA TCTGAGCAGT GTGCCTCAAA GGTACCTGTG 3780
ACTGCTGATA CTAGCGGAAA ATGTGAGGAT ATGAACGGTT TTACCAAAAT GCACGGAATG 3840
AGCACGCACA CTCCTGCAGC CTGTATTATT GAACTCGAAT TAGATGCGTT GTGCGTGCAA 3900
CCTAAGATTG TCAGGTTGTG GTTTGTTTGC GATCCTGGGT ATGTCTTTTG TGAAAAAGAT 3960
GTGTACCGTA CCGTGAGTCG AAGCATTACT CGTGCGCTTT CGCACGTATC TGTAGAAAAG 4020
ATTTGGGAGC GTGCGCGCAC ACCCGAGTAT GTTATCATCG ATCCATCCGA TACTCCTCCC 4080
TATCACGTCA CCCTTTTGAG TTCAAATGCT GCTGCGCGTG CGGTGGGAAC GGTTGCCGAA 4140
GtATTGTTCC TGctGCGTAC TACGCAGCAC TGCGGCAAAT TTTGCCGATT TCTCAAAACG 4200
CTACCCATAA GGTTCCTTTT GTTGCGCGGG ATATTTTTTA TGAGATGTTT TCCCTCAGTG 4260
CAGACGATTC TCTATGAATA TTCGCTTTAC GTTGAATACA GAACAGGTAC ATGTGGATGC 4320
TATGCCtCAT GAGCGTCTTT CGACCGTTTT ACGGAGATGT TTTCATCTTC CTTCGATAAA 4380
AGGTTCACAC GGACATGGAG AGAATGGCGC GTCTACCATT TTGTTTAATG GGGAGGCAGT 4440
ATCCGCG AT ATCATACCCT TTTTTCTTGC GCACGAAACA CAGATAGTTA CACTTGATTT 4500
TTTTCAAAAG ACTAAAAGAG GACGGTGGAT TGTTGCGTGC TTTGCGCAgC ATgCATATTT 4560
CTTTTTGTGG GTATTGCGAT GCAGGAAAAA TTCTCACCGC AGAGAGCTTA TTACAAAGGA 4620
ATTCTGTGCC CAACGAAGAA GAGGTAAGAC ACGCCTTTTC AGGCATGCAA TGCnGGtGTA 4680
CTGATATCAA TGCGTTGATT CGAGCACTTC AACGTATGCC TGCTTGTCAT GAGTTTTCCT 4740
AAAACTG AT GCATCTGTAT AAGAACTCAG AATCAAGTGT TATTTATTAC GTAAAAAGTT 4800
TGGGTCAATT GTGTGCGGTA CTACGTAACG TCGCGCAGGT ACAGCCAGTT GGGGGTGGAA 4860
CGGGCTTGGT GCAATATCAG ATAACCCCGG TTTTAACGTT GCCTTCACAT TTGGTTGTTC 4920
TAAACGGTGT ACCAGAGTTG AAAGATATTT CTAAAACTGA GCACTTTCTT GAGTTTGGTG 4980
GTGCTGTGTC GTTGCAAGCA ATTGTGCGAT TGGGAAGAAA AAATATTCCC GTGGcACTGC 5040
ACGAGGCACT GTCGCACGCA GCAAATCCTG GGATACGGAC TCTGGCCACT ATTGGGGGGA 5100
ATATTGCAGG TACGCGTCCG CATGCTTCTG CTCTTGCGCC GCTTATCGCG CTTGATGCAA 5160
AAATGGAGGT GCGGACTGGA CATGAAAACT TTTGGATTTC TGTGGCACAC TATGCACATG 5220
CGCGTTCTGA CACGCTGCGA CACCGGAGTC ATGTAATTAC CCGTATTCGC CTTCCAACAG 5280
ATTACTGGGA CTTTTCCTAC TACAGACGTA TTGGGTCGCG TGCATTATTT GGTGAACGTG 5340
CCGATTTCGT GTTTCTTGCA CAGCAGCAGA AAAACGCGTT GTCTGAAATG CGTATGGTAT 5400
TTTTTTCAGA TGTAGTAATG AGAAATAGAG AATTTGACAA TTtGCTGTTA GGCAGAGCGA 5460
TTCCTCTTTC TGCAGGGGAT ATTGCGGCAA TCGTATATCG AAGCAGAGAG TTCTTTGCGC 5520
CTGAATCCTT TAAGAGTGCG TACATCGCGC ACTGCTTCTT TCATCTGCTG GAAGACTGTT 5580
TGCGCCGCTT AAGATGAAGC TACAGGTGGC GAGTTTTACC CAGGCACGCG CAAACAGcTG 5640
ACGCTAAAGA CCGAGTTTTT TCATTTTGCT TTGCAATGTA CTTGGCTTAA GACCAAGAAT 5700
TTCTGcTGCG CCATTTGCTC CGTATATCTT ACCGTTGCTT GCATCAAGCG CCGCTTGAAT 5760
TGCTGcGCGT TGCGCCTGAT GAAAGTTGAC CACCAGCGTA GTTTCTTCCT TTTTCCCCTG 5820
TGTGCATCGC ACCATGCAAG AGCTATCGCG TATGCACAGA GGTATAGCTG GTTGAACGCT 5880
TTCTGACACT TCGGTGTGCT CCCGAGGATA GACAGTAGTC TGCGTGCCGG ATTCCGGGGT 5940
CCTACATACA AGGTGTTCTG CGCCGATGGT ATCTCCGCGT GCAAGAAGTG CAGCACGCTC 6000
GAGTAGGTTG CGTAACTCAC GCACATTGCC AGGAAACGTG AGCGAGAAGA TTTTCTTAAA 6060
CGCGCTGGGA GAAAGCTGAG TGCGCTCAAA CCCCGGGCGG GTCTTAATTT TTTGGATAAA 6120
ATGCTCCGCT AGAAGCGCAA CGTCTTCTGC ACgCTCGCGC AAAGGGGGGA GACTGAGGGG 6180
AAAAACATCG AGCCGGTAGA GGAGGTCTTC CCTGAATTTC CCTTGGGTGA CTGCTTCTGA 6240
AAGGTTGATA TTCGTAGCTG CAATAATGCG AACCGAGACG CTTACCGAAC GCTCCCCTCC 6300
AACGCGCTCA AATACTCCGT CTTGGAGTAC GCGGAGGAGC TTCGGTTGCA GTTCCAGGGG 6360
GAGATCTCCG ACCTCATCGA GAAAAAGGGT GCCACCGTGA GCCAGTTCAA ATCTTCCCCG 6420
ATGGGTGCCG ACCGCACCTG AGAAGGCACC TTTTTCATGT CCGAATAATT CGCTTTCTGC 6480
AAGGCTATGG ACGAGTGCTG AGCAATTGAC GGGGACGAAG GGCTTGTCGC TGCGGGTGGA 6540
AAGTTGGTGA ACGGTTCGCG CAACAAGCTC CTTTCCAGTG CCGGTTTCTC CACAAACAAG 6600
GACAGGGAGG TCAGAGGCTG CTACGAGCTT TATAGCATCG AGTGTGCGTG TCCAAGCAGG 6660
AGAGGTTCCG ATCATATTTT TAAATGCAGG TGATTGGGGA GCTAAGAGCG CATTTCGTTC 6720
GGTCAAAAGA GCGTGACTCT TTTGACTCAG TGTCTCGGAC GCGTCGGTCT GGGCTACTGC 6780
GAGCGAGATA AGTTTAGAAA GAGTAGTAAT GAAGCGTACA ACGTCTGGGG TAAACTGCTC 6840
GCACAGGCGA TGGTCGAGCG TGAgCATGCC AATGGGAGTA TCATCGATGT AAAGCGGCGC 6900
GATGAGACAG GAATGATTTT GGGGCATGGG AATAAGCTCT GTGTAGGTAT CAGTGTGGGC 6960
AAGCGTCGGA TCGAAAAGGT ATGGACTCTT CTGTGATAGG ATGCGCGCAA GATCCTGCCT 7020
TTTGGTGAGG TCTATGgTGT GGTGTTGGAG GCGGGGAGTG TACAGGGGAC CACGCGCCTT 7080
GCGAACTCGC AGTATTTGAG AAGATTCAAA GCTGAGGACC ACGGCTAGCT CgTAACGGGC 7140
AATCTCATAG AGGcGTCCAG AATCATTtCC AGCGACTTTT CCGCAGcAGG GGGAGAGCGC 7200
GCGTGCAGGA CAGCCCGAAC AgTTCATGGG GCCCGAGTAT AGAAGAAAAA gGCATATCCG 7260
TGCAATTcTC CGCATGGAGC CTGTGGGCGT GTCGTGTGCA GGgGTATGGT ATTTGTTTTT 7320
CGAATCCTTT TCCtCGCGTT TTTTATGGGG TATAATCGCG CGCATGAGAC GCGTGTGGAT 7380
AAGTGTTCTG ATGTTTCCTT GCGTATGGGC AAATGCGCAG GGAGAATTTC TCGCAGGCGG 7440
yGCAAAGGGA TTGTACCGTA TTACTCCTTA CGCTCAAGAC GTACTGCTCT CTGGCGTTTC 7500
GGTTAGCAAG ATTATTGCTG CGGGAGAGAA CTGGTTCTTG CTTACGTCTC GAGGTGTCAT 7560
GACCTCGCGC GACTTAAGGA CTTTCGCGCA CGTGGGTGAG CAACTACCAA AGAAGGTAGT 7620
GAAGAAGATA GTCGATCGGG AAAAGGTTTT TGTGTCTCAG CCGCAGCCAT TGAAAGATCT 7680
TGAGGTACAT CCGGATAACG GAGCGGTTTT GGTTACCGCT ACCAATGACG CAGTGTTTCT 7740
CAGCAAAAAT GGGGGACGGA CTTGGCAAAA TCTGGGCTGT AATGCAAACA GCAGTGGGAT 7800
TAAGGCGGTc GCGGTGCTCG ATTTTCCTGA TGAAACGGGT AAGCCAGTGC TTACCGTGTT 7860
TGTTTCGCAT TCCCTGCGTG GTATTGCGTG GATGCAGCCA GAGAAAGGTC GTTTTTGGAC 7920
TGATATTnAn GCcTwCnCTT GCGCTTGGTC CTGAAGCCAC TGAAGAAATC TCAGACATTG 7980
CGGTGCGCAG GAGCGTGCAT GGCAATGAGC TTTTTGCAAG CTACACGTTC GTGCCCAAGA 8040
TCGTACGCCT TAACTGGGCC AAAAAACGCT TTCAGGACGT ACGTGTGTGG AnCGntGCGC 8100
TGAAAGATGC GCGCTGCATT GATGGATTGA GTGCGTCTGn CnTTCGCTCG TTGGGTGTCG 8160
GGATGGTAGT TTGTTTGAGA TCCCCCTCAT TATGCCTCGC CCCTTCGATT TGGCGCGTCT 8220
TGAACAGGAT TTGCGTCGGA TCCCGGATCA AATCTTATGT GCGTGGGTTC CGCGTCATGT 8280
GTCACAGACG GGTGATGCGC TGTCTCTTTC TGAGTTATGG CTTTTGCACG ACCGGTCTAG 8340
TCTTGCAAAA GAGGGACGTT TTGCGCGGGC AGATTTGAAA AAGGGTATCT ACGTACCGGC 8400
GCATCACATT AAGGATCCTA AGCTTCGTGC AATGCACTTC AAAACGATTG CGGACAATAA 8460
ACTTAATATG CTTGTGTACG ACATGAAGGA TGAGCTTGGA ATGGTGCGTT ATCAGTCGCA 8520
AGATCCATTT GTGCGATCGG TCGGTGCAGT TCGTCCTTTT GTTGATATGA AGACGTTTGT 8580
GAAGCAGGCA AAGGAAAAAA AGTTGTACCT AATAGCACGT ATTGTGGTGT TTAAGGACAA 8640
GTATTTGTTC CGTTGGAATG GCTTTGAGCT TGCGGTTAAG GCGGGTGGTA AGCCTTGGCA 8700
GGGGTATAAA AACGGAGCGC TCCGTAAGGA AGAAATTCAC GAGCATTGGG TGGATCCGTA 8760
CAACGAAAAG GTGTGGCGGT ACAATGTCGC CATCGCGAAA GAAGCAATTG AGTTTGGCTT 8820
CGATGAGGTA CAGTTCGATT ATATTCGGTT CCCTACCGAC GGAGATAACC TTCACCAGGC 8880
GGAGTATCCG GCAAGAGAGT CCGGGATGGA TAGGGAGAGT GCGTTGATGT CGTTCCTGGC 8940
GTACGCGCGC GAGCaTaGAC GCGCCAATCT CCATTGATAT CTACGGAGCG AACGGGTGGT 9000
ACCGTACAGG CGCGCGCACG GGCCAGGACG TGGAACTGCT AGCTGAGTAT GTGGATGTGA 9060
TCTGCCCGAT GTTCTACCCC AGCCACTTTA GCCAGAGCTT CCTAGCTTAC GCGCCTGCGC 9120
AGGAGCGTCC CTATCGCATC TACTACTACG GCGGTACCGC AACCGGGTGC TGGCCCGCAA 9180
CCGGGTGGTC ATCCGACCCT GGGTGCAGGC CTTCTACtAC CGGTCTCTTA CGACCGGGCG 9240
TACTACGGCG AGGATTACGT GCAGCGCCAG GTGGCTGGCA TCCGCGAATC AATCGATGAA 9300
GGATACACGT ACTGGAACAA CTCAGGGCGT TACTCAGACG TCCGGCCCGA CGGCGCGCGC 9360
CTCCGTTAGC CGCCAAGGCA ACCCGGCGAG CGGACGCCCC TTGTTGGGCC GTTTTCCCAC 9420
ACAGGACCGA AAATCAGCGT CCGTCCTCTC AGGAGAAkmA CTTCTCTAGC TCGCCTATGT 9480
CCGCAGTCCT AAAAAGCTCG CTGCTGTCCT CGTCAAGGAT AACAACCATT GGAACGCTAC 9540
CTATACCGAA ATCCGAGACC AACTTGGCCC CTTCGGGGGG AAGTCGATTC AGCAATCTTT 9600
TTTTCTACAT CTGGCAGACC ACTGCAATAC TCGCGCAGGC GCGCGCTtCG CCTCGTCTGC 9660
CCCTACCACA GCGTTGACAA CCATAGCAAA CTCCCTCCGT TTCCGTCAGT CTGGGGCTTA 9720
CCACCCCCCC CCTGCACCGA TACCGCACAA GCTACGCGAG TTCAGTCAAC TCGGCGCACA 9780
GCACACGCAA AAGACCGCTA CTCGGTGAGG ATTTTAATAA CCTCCGCTTT ACTTGTTGCA 9840
TCAAGGATCT CCCGACGTTT CTCAGAACTC TTAAACAGGA GACTGATCTC AGCCAAGAAC 9900
TGCAGGTGCG GACCAGTGAC GTCAAGTGGA GAAAGGGTCA TTATGAAAAT ACGACAAGGT 9960
TCTTGATCCA AAGAGTCGAA GTCAACCGGA CTGTCAGAAA CACCAACCCC TGCAACCAGA 10020
CTACTCACTG AATTAGTTTT ACCGTGGGGT ATGGCAATGC CATGCTTCAT CCCCGTAGAC 10080
ATTTTTCGCT CTCGATCGAG CACACACTCC CGCGCAgcAA CCTTGTCGCT CACCCTTCCT 10140
GCACGGACGA ACATCTCGAG CATTTCGTCG ATGATCTCCT CCTTGGTAGA ACCCTTCAGG 10200
TGCAGGCTTA CGGTTTCCGG CGTCAACACG GTCTCCAAAT TCATTCCCCC AAGCTAAAAA 10260
CTCTCAAAGG AAAAGTCAAG CTTTTTGGAA AAGCTCCCCA CGCTCTTTCC GCTCCCAGGA 10320
TATCTTGACC TTTTGCCTAC TTGGACTTTA CCATGCGCGC GTGGAGTTCG CCACCAGGGA 10380
GCAGCTGAAT AGGTACTACG ATTTGTACAA GGATGTCGAT GTAACTTTCT CAAAGGATGT 10440
GATGCAGGCG CTCTGTTTTA ATGCGCGGCA GGTGTGCGTG CGAACGCGGn GAGGTCAGTG 10500
TTCCTGCGTA ATGAATTCTG TGTCTATGGT GGGTGCGAAA GTTATTCTCA GCAGGAAGAG 10560
TAGTCTGCTT GAGAGCATTC AAGTGGAAGG GGCGAGCGTC AgCATACGGT TTTCTTTCTT 10620
TGAGTCCGAT GCGCGGGATG CGGTTTCCTT CTTCGTTACT GCCAGGGTTC TCGGTGTTGA 10680
AGACTATGCC CaGAGTACGG AGCTAGTGGT GTTAAGCGTG GCGTATACGC AGCGCATACC 10740
TGATATGCTC ATAGAGCGTT TGGGTTTGCT TGTTGAGGCC AACATTAGTT CCAAGAAGCG 10800
TAAGTCGGAG CGTATTGCGg TGAACAAGGA GAGTATGCGC AGGATCGGCT TGATGAGAGC 10860
GGAGACCATC GTGTTCATTC AGGCGATTCC TCGCCGCTGC GTTCTGCGGG ATGTTTCCTT 10920
TGGTGGTGCG AAGTTTATCA TGATGGGCGT TGCGCCGTTT TTGAAAGGCA AGGAGACGGT 10980
GCTGAAGCTT GATTTTGAGG AGCCGAGTAC GAGCATGAGT ATTAGGGGGC ACGTGGTGCG 11040
TGCAGATCAG GTTGAGGGGC GTAAAGACCT GGTGGCCGTG GCCATGGAGT ACGACTTTGA 11100
TGTGGTGCCT GTCGCGTATC GTATGTGTTT GAACCgsTAC GCATCGGACC gCTGTCGCCG 11160
TTTTCCCGGT ACGGACGAGG ACTGCTCTGC GGCGTCTGCC GGCGATCCAG GGCGGTCGTC 11220
AGCAGGCGCT GAAGGTATTG ACCTTTCTGT ACCCTTCTCT TTGTCTTAGT TTTAATGGCG 11280
CTTGTCACCG GATTGCCCCT TAGGGGGGTC CGCATGTCTG CGGATGCGCG CGAGGCTCCG 11340
TGCACTGAGC CTCGCTCCGA CCAGTAGTTT CGAACGCACA AACCAGCCTC GTgGCACTTC 11400
CAGTGCGTAC CGTACTGACC GGTTGCTGCG TACGCTGTGT GTGCTCAGCG GGACTAAGTC 11460
TTCTATCTGT ACGATCGCCC CTTGAGCATC CAGGAATGCG AGAGAGAgCG GGTGTGGGGT 11520
GTCCTTCATC CAAAaGGAGA GGCGTGTGTC CTGTTTATAC ACGAAAaGCA TGCCgTCCcG 11580
TCGGGGATCc GTGTACGCCC CATGTACtCg CGnCCTGCGC TTCTTCCGTG AGTGCGAGTT 11640
CTACAACCAC CGGCACGTAC TGCCCTCCTG TACAAAAAGC GATTTGCGCT GTTTCTAGGC 11700
TATTCGTTCT GCACGCCACA CAGGAAAGCA ACCCCAGTAA CAAAAGCGAC AGCGCagCAG 11760
GTGTCCTGTG CAGGTAGAAG GAGACGTGCT CTTCAAAAGC CTTGTACAAA ACGCTTTCGA 11820
TCCTTTTCAG CATTGCTCTT TTGAGTGTCG CGTGCACTAA GCTGTTCGTT AAAGGCGCGG 11880
ATATCTATGT ACTTGATAAT AAGAGGCCGC TCGAGCACTA GGCGTGTGGA CGCGTCTTCC 11940
CACAAGGCAA CCCGAGGACT GAGCTTATGA GGTTCGCCGT ATTTTTTACA CAAGTTCTCA 12000
TATACCGAGT AGTAATCTAT CGCGTCAGTG TTGAGTTTGA ACGTCATAGC ATATAGACGG 12060
TCACGGTAGA ACTGAAACCA GCTGCGTGCA ATAAAGTGGG GACCCGTGGT CTCAATCAAG 12120
ATACGGTTCT CACTCATTAG CAGTGACACG TCACGCTCGC CGCGATATCC AAAAATACTA 12180
TCCTTTTTCA GTGCTTCCTT TACCTCGGTT ACGCCCATAC CCAAACTCAG CGCGCGGTAC 12240
ACGGCGGGAA TTTTCTCAAG AGAATGTGTC TGCGAAAGAG GAGTGACTGT AGCCAGCATT 12300
CTCGCACCGC CTGCGCCGCG ACTAGGGGAA GGGGAAGAAC ATAAATAAGA CTGCATATGC 12360
AACACCCCGC CCAACGCAAG CGCCATACAA TTTTTAGCAT AATTCATGCC GTTTTCCTCT 12420
CCTTGTGTGG ATACATCTAC TGCCTGTTGT GTGGGTCCTT GCGTGCCTTG GCGAGGGTAC 12480
GCTTTTTCTT AAAATAGGCA ACGCGCTTCA CCAATTCTAG GTCGTCGGTG ACAGATATTT 12540
TGTTATCGAC ACAGCGGACA ATAGGTTCTC GTAAAAAgTC TtCGGTTACC TGAGAGACGA 12600
TGTCTTTAGG GAAACCGCAC ATGTGTGCGA GTTCTATGGG ACCGAATTCA AAATCGTAGG 12660
CCTTGCCGGT GCTAGGTATG TAGCGAATCT TTTCTAACTG GATGGCCAAC ATATCGTACA 12720
TTTTTTCAGT CGGTTCCGGA AGCAAAGTAT TTGCGAGCTG TCGGTACATC GACCAGATGC 12780
GATCTGCGAG CGTGGTGGTA AGACGCGCAg TCAATTGCGG TTGTGTGGCT ACCAGCTGTT 12840
GGAAGTTCTT TCGGTTCACG GCCAAAAGCT GGCAACCATC AGACATAACA ATGGCGCTTG 12900
CAGAACGCGG CTTGTTCTCC AGCAACGCCA TTTCCCCAAA CATATCTCCT TCTTTTAAAA 12960
TCGCCAGCAC TACCTCATTG TTATCAACAA TCTTAGTAAT TTTTACATGT CCTTTTTGAA 13020
TGATGTAAAA CTCATTTCCC AATTGACACT CACAGAACAC CATCGCCTCT CGATCGTAGC 13080
AGCGCGTGGC TTCAAGTATG TTAGGTTCGA GTATTTCTAC TGGTACCTTA ACTCCTGTGG 13140
ATTTAATCGC AACAAATCGT TTGCGTGCTT CCTCTGCATA CGTCCCCTTG GGACTTTCCT 13200
TGAGATAATG ATAGTACGCA TAGAGCGCAA GTTCAAACTT CGTCATTTTG ACGTAGTATT 13260
CGCCAATAGC GAAAAGATGC GAGACATCCA CATCAGTGTG TTTTTTCAAT GTCAATTGGG 13320
TAAGCGCCTC ATTGAGGTAG CGCATTTTCT TTGTGAAGGA AAGGATGATT TTCATAGCAA 13380
TCGCCGCGTT CTTTTCAATG AGCTGGGGGA ACTGCTCATA ACGAATTGCA ATAAGCACGA 13440
CATCAGTGAG CGCAACTGCA GTTTCAATCT GATTATGCCG CGACATGCAG GCAACTACAC 13500
CTAAAAAGTT ACCCGCGGTT AGGACGTTTC CTTCCTCCTC TGCAACTATC TCTACTTGTT 13560
TTGCAATACA TACCTGACCG CTGTGAATGA TATAGAAAAG ATCTGCGTCA gcTTTCCCCT 13620
CTACTAGTAT GTAAGAACCC TTCTTGAAGT TAACAAACGT CAGCTGTAAC AAAGTATCCC 13680
ACTCCTTCCT AATCGTCGCT CTATGCTCTG TTTACAAGAC AACCCGTACG TCTTGCAGTG 13740
CACAGGTGCC CGCCTCATGA GTCTCGCGAA CTGGAAGTCA TCCTCACAGA TTCTCCAGAA 13800
AAAATGATTT CCCGTTCGAG CCACACTCCG TGTGTTTCAA ACACGCGTTG CCGAACGACG 13860
CGCAAGAGTG TGCGCACCTG ATGTGCGGTG GCATTCCCCG TATTGATAAT GAGATTTCCA 13920
TGCCAGGGTG CTACCTGCGC AGcCCCACAG GAGGTGCCCC GTAAACCTGC CTCTTCTATG 13980
AGAATGCCAG ACGGTTTACC AAAAGCTGGG TTGTTTTTAA ACGCGCTGCC TGCTGACGGA 14040
AAGCGAAACT GCCCCTTTGA AATACGATCG GCAATCTTCT CCTGCATGTG CTTCCTAATC 14100
TGCGCCGGAT TGCCGGGAGT GAGACGTACA CACAGCGAGA GGATAAGACG CCTTCCTGCA 14160
TGGAGTTCAA CACCGTGAGG ACTCTGGAAA GGAGAGCGCT TGTAGCCCCA ATCCCCGCGC 14220
GCGCGAAGAC GGTCTGAAAA CTGCTGCAGG TAAACGGTCC GCCGTCGAGG CCAAGACATT 14280
CCCCTCTTTT GTCTTGTGCG TTTTTTCTCA CCTCTGGCAG TTCTTTTGCG CGCGAACGcA 14340
CGGGGTGAAG TaCGAGCGTG CGCGCAGAGT GAAAnCAAtC TGCGATTGCA CGCCCATAAC 14400
ATCGGGCGTT CATGTACGCG GCACCACCGA CACTACCAGG CAGCCCTGCA AAGGTCTCAA 14460
GCCCGCGTAG AGCGTGATGG GCACAAAAGG CCAGGAGGGC GGCCACAGGT AACCCCGCGC 14520
CTGCATGTAC GAGCACTGAG CCATCGCGCT GTGTTTGGGT GTGTAGACTG CGAAAgCGAC 14580
GAAGGCTCAA CATcAGACCC GGTACGCCCT cGTCTGCGAT TAACACGTTA GAGCCTCCCC 14640
CAATAAGGGA CAGCGGAATG CGTGCGCgCT GCGCTTcCTC AATAAGCGCG CGCAgcTGTG 14700
TGCAGGAGCG CGgcTCCGCC CAAAACTGCG CAGCgCaCCA ATGCGGAAAG AACATCGCTC 14760
TGCAAGTGGG ACGTTACGGC GCGTGATCCG ACGCGCGCGT ATCCGGTGCG CGGACATGGA 14820
CAGAAAACAT ATACGATTTC ACGCGCTTAT GCTACAATGG CGCGGTCTTG GCCTTCTTTG 14880
CGTTTCGAGG GAGGGTAGAC TGAAGCGCAG GCGGGCAAAG GCGTTTGCGC GACAATGGGG 14940
CTGGGCGTGG TCCGCACGTG TTCTGCGCCC GGTTGGGAGA AACTTCGAAG GGGCGCACAT 15000
GGCCTTGCGC GTATACAACA CCCTTACTCG TCAGCAAGAG CACTTTCAAC CCTGGGAGCA 15060
CGGGCACGTG CGTgCTCTAC GGTTGTGGGC CTACGGTGTA CAATTATCCC CATCTGGGGA 15120
ATCTGCGCGC ATACGTTTTT CAGGATACGG TTCGACGTAC CTTGCACTTT CTTGGATACC 15180
GCGTCACCTA CGTTATGAAT ATTACCGACG TTGGGCATTT AGAAAGTGAC GCAGACAGTG 15240
GTGAGGATAA GCTGGTAAGG AGCGCACAGG CGCATGGCCA CTCGGTGTTG CAGGTTGCAG 15300
CGCACTATCG CGCAcCTTTT TCCGCGATAC TGCACTGCTC GGTATTGAAG AGCCGTCCAT 15360
TGTCTGTAAT GCCaGCGATT GTATCCAGGA TATGATCGCG TTTATCGAGC AATTGCTCGC 15420
GCGTGGGCAC GCGTACTGTG CAGGAGGGAA CGTGTATTTT GATGTGCGAT CCTTTCCTAG 15480
CTACGAAAGC TTCGGTTCTG CCGCGGTAGA AGATGTTCAG GAAGGAGAGG ATGCGGCGCG 15540
CGCGCGGtGG CACACGATAC GCATAAGCtG ATGCACGTGA TTTTGTGCtG TGGTTTACCC 15600
GTAGTAAATT tGTGCGTCAT GCGTTGACGT GGGATtCTCC GTGGGGGCGG GGGTACCCCG 15660
GTGGCACATC GGGTGTTCTG CAATGAGCAT GAAGTTTTTA GGACCACGTT GCGACATCCA 15720
CATCGGAGGG GTGGATCATA TTCGTGTGCA TCACCGTAAC GAGCGTGCTC AGTGTGAAGC 15780
AATTACTGGT GCACCCTGGG TGAGGTACTG GTTACACCAC GAGTTCTTGC TGATGCAGCT 15840
GCAAAAGCGC GCAGTACATG CGGATATGGG CAGTTCGgTG GTGTCGTCTT TTTCTAAAAT 15900
GTCCAAGTCC TGTGGGCAGT TTTTGACGCT TTCTTCGCTG CAGGAgCGTG cTTTCAGCCA 15960
GCTGATTTTC GCTTCTTTTT GTTGAGTGGA CAGTATCGCA CGCAACTTGC TTTTTCTTGG 16020
GATGCGCTAA AAACGGCGCG TGCCGCCCGA CGGAGTTTTG TGCGGCGAGT GGCGCGTGTA 16080
GTGGACGCTG CTCGAGCAAC TACAGGCAGC GTGCGCGGCA CTAGTGCAGA GTGTGCCGCA 16140
GAAAGGGTGT GTGAATCGCG CGCATCAGAA TCTGAGCTGC TCTTAACTGA CTTTCGTGCT 16200
GCGTTGGAGG ATGACTTTTC TACGCCACGT GCTCTGAGCG CCTTACAAAA ATTGGTGCGT 16260
GATACCTCGG TGCCGCCATC GCTGTGTGTT TCGGCACTCC AGGTGGCGGA TACAGTGCTA 16320
GGGTTAGGCA TAATACAGGA AGCGACCGCA TCGCTATCTG CGCAGGTTCC TGCTGGCGAT 16380
ACGTTGCCGC AGCGTCCTTT ACCGAGTGAG GAGTGGATTG GACAGTTGGT GCGTGCGCGT 16440
GCACATGCAC GCCAAACGCG TGATTTTCCC CGTGCAGATG AGATCCGTCG GCAGTTGAAG 16500
GCTGAAGGGA TTGAACTTGA AGACACCCAT CTTGGGACTA TTTGGAAGCG CGTGTAACAT 16560
TTTGGGAGAT ACATTGTTGC ATGAGCAGGA GCTTTTAAGA GCACAGGATG ATGCAGATTT 16620
TAAGCTCATG TACGAGCAGC TTGTGCCAGT GCTCTAsCGC GTAGctAcAA CGTGGTGCGC 16680
GAGGAGGACA TCGCTGAGGG GCTCTGCCaT GATGCCTTCA TTGCAtGACA GAAAAGAGGA 16740
TGGAgTTTCC GTCTCTGTCG GACGCAAAGT ATTGGTTGAT CCGCGTGGTG AAAAATGCCT 16800
CGTTAAATTA CGCTAAGCGT CGTGTACGTG AGCGTCATTC TtGTGAGCAA GCGTCGCGCG 16860
AGCATGTGTG CGAGCCGGAT ACCGGTgrmT TCGCTTGTTA AGAATAGAGA CGATTGAGCA 16920
GGTGCGCGCG GCCTTAGATC GACTGCCCGA GCACCTCCGT GTGGTTTTGC AGTTGCGCGA 16980
GTATGGGGAC TTAAACTACA AGGAGATCGG ACGTATCCTG GGCATCAGCG AGGGGAATGT 17040
AAAGGTGAGG GTGTTCAGAG CGCGCGAACG ATTAGCGAAG TATTTAGGAG AGACGGATGC 17100
GTACCTGTCC TGATTGTGCT GCTTGGTGTG CTTATGTGGA CGGAGAAGGT TCGCAACTGC 17160
AACGCCGTGA GATGTGCGCG CATCTGCAGG GTTGCACACA CTGTGCCACG TGTGTGGCGC 17220
ACTATCGCGC CATGCGGAGT CTTGTCAAGC ATGCTGATCG CGTTTCTTCC CGTGATTTTA 17280
CAATGGCTTT TCCATATTTG CGCGTGCGTC ACCGTGTCGC TTCCTGTATG CCGAGGCCGT 17340
GGTGGCAGGC ACGTTCCTCT CCTCTTTCTG CTGCAGGACC GGTCCgTGCT GCGGCACTCG 17400
CTGTGGCGGT CGCATCTTTA TGTGTATGCA CCCTGTTGCT TACTCATATT GTTGAAAGGC 17460
GTCCTGTATC CCGTGCGGGT GAGGCGAGTT TTACCCCCAT TGTACCTATG CGTGTTCGCG 17520
CCCCTGTTGG GTACGCGCGC GGTGTGAAAG TGTTTGGTCC TGCCGTTAGT GCGAATTCCA 17580
ACGTgTGCGC AAACCAGCTG CGGTGTTCAC CGTCTGTGCG TTTGCGCAGT TGTATGGCTC 17640
AGATCCTGCG TATGAAATGG AAACAGTGCC GGTGAGGCTA TCGGTTATTC CTGTGCCTTC 17700
CTATGTGCTC AATGCTTCAA AAGCGCAGTT CTTTTCCCCA TAATCCAGGC AAATGTGTAG 17760
TAAAAATAAT GCGCCCGCGC GGACGTGTTT CCTGTTCTTT TCAAACCGTT CTGAtCGTTG 17820
GGTGTTCCTG TCTGCAAACT TGATTGACCT GCTTGTCAGG TAGCCATAAG GAGAATGTCT 17880
ATGACCTTCG TTGAATCAAT GCAGCGGCGT GCTGTGcTTG CGCAAAAACG ACTCGTGCTT 17940
CCTGAGGCCT GCGAGCAGCG TACGCTCGAA GCCGCCCGTT TGATTGTGTT CAGAAACATA 18000
GCCGCAAAAG TTTTTCTTGT CGGATGCGAG CGTGATATCA AAAACACCGC AGACAGGTGC 18060
GGTATCGACC TTACCGACAT GGTCGTCATC GATCCGAGCG TTAgCAAGCA CAGAGATCAG 18120
TTCGCAGAAC GTTATTTTCA GAAGCGAAAA CACAAAGGAA TAAGTCTTGC CCAGGCTGCA 18180
GAGGATATGC GCGATCCTCT GCGTTTCGCT GCTATGATGC TTGACCAAGG TCACGCAGAT 18240
GCCATGGTTG CCGGTGCAGA AAACACTACC GCGCGCGTTC TTCGTGCAGG CCTCACCATC 18300
ATCGGAACCC TTCCGAGTGT TAAAACTGCC TCTTCCTGCT TCGTTATGGA TACTAATAAC 18360
CCCCGTCTGG GAGGAACACG TGGTCTATTT ATTTTTTCAG ACTGTGCAGT GATCCCCACT 18420
CCCACCGCAG AACAGTTGGC TGATATCGCC TGCTCTGCTG CAGAAAGCTG CCGCACCTTC 18480
ATTGGAGAGG AACCGACTGT CGCACTTCTT TCCTACTCTA CTAAAGGATC AGGAGGTGAT 18540
AGTGACGAGA ATATCCTGCG TGTACGTGAG GCAGTCAGGA TTCTACACGA ACGGCGGGTG 18600
GACTTTACCT TCGATGGGGA ATTGCAGCTC GATgcTGCGC TCGTACCTAA GATTACCGAA 18660
AAAAAAGCGC CTCACAGTCC TATTACGGGA AAGGTGAACA CACTCGTGTT TCCCGATCTT 18720
TCTTCGGGTA ATATTGGGTA CAAGCTTGTC CAGCGCCTTT CAGATGCGGA TGCATACGGA 18780
CCTTTCCTGC AAGtTTTGCA AAACCACTGT CTGATCTCTC GCGTGGGTGC TCGGTTGAAG 18840
ATATCGTCGC CGCTTGTGCA GTCACACTTG TGCAATCGAA TGGACGCTAA TGACGTCCAC 18900
CCAGGCGCGT ATACGTGAGG CAGTCCGTGC AGGGAGCGTC CGAGATTATG CGCGTGCTAT 18960
CCGTATTCTT GAAGAGCTTG CCGCTTCAGG AAAGGCAGAA GGATGTCATC ACCCAGATGG 19020
CGGTGCGGTG TATGAGAGGG GGGCACAGGA AGAGTGGAAT GAGGGGTCGT CTGAGTCGCA 19080
CGCGCACGGT GGGGATGGTA CGCAGGACGC GTATCCTGAG ATTTATTTGT ATCTTGCGCG 19140
TGCATACCAC GCACAAAGGC AGTATGCGCG CGCGGTAgTA ACGCTACTGT GTATTCTAGG 19200
CGCGTGCCGC gcGrACGGCG CAGGTTGGTT CTTTTTGGGA AGGAGCTATC TTGCACTGCA 19260
TCAGGGGGGG TATGCGGTTG CAGCGCTTCG GCGCAGTGTA CGAGAAAATC CTGCCTCTCT 19320
TGGGGCGCAG GCGCTGTTAG GACTCGCCTA TCTGCGGAGT AAGAAGCCGC GTGCAGCGCG 19380
CATGGTGTTT GAGCAAGCAC TTGCGCAGTA TCCAGACAAT AAGCGTTTGA ACGCAGGGTA 19440
TTTGAATTCG CTTTTTGTAG AAGCAGTGCA GCATCTAAAA CGGGGGAGCG CAGATCTTGC 19500
GCGTCAGATG TTTACGTTTC TGATTAATCA GGATGTAGAC GGGGTTGCGC CACGTTTATA 19560
cTTGGCGCAC GCGTTTCGTT CTTTGAAACA TTTTCCTGAA GCGCTTACCC AGTATCGTGC 19620
AGCAAGCGCA TTTGCGCCGC ACGATCCTGC CCTCAAGTGG TACGAAGCGG CCATGCTTGT 19680
AGAAATGGGG TGTCTGTCGC AGGCGGCAGC GTTGCTGTCG ACGTTGGGTG TTTCCATCGA 19740
GCGTGATCAG ATTTCGGATC GTTTTCTAGT GATGGGCGCC GTGCGCAAGC ACATGGAGGA 19800
GGGGGCGTGG GCTCGTGCCG CTTCTGCAGC GCATTTATAC CTGAAAACTT TTGGGGGTTC 19860
TGTAGAAATT CACCTGCTAA TGGCAGAGGT TCACCGGCGT GCGGGGCGCG TGAACGTGGC 19920
TTTGAACCAC TACACGCGTG CGATGAAAAT AGAACCGAAA AATTGTTATC CGCATTATGG 19980
TCTTATGGTG TGTTTGCAGG AAGCGAGGCG CTGGCAAGAG CTGGCAAAGG CAATCAGACG 20040
TGCAGAAGGC GCAGGGTGCG ACGCGCAGGA TTGCTACTAC TACCGGGTGA TTACAGCTGC 20100
CCATTTGAGC AATCtCCCGA GGAGGTGTTA CCGCATCTGC AAGAACTTGC GCGTGGAGGG 20160
AAGGCCGATC AGCTTTTGTT CAATGCTCTT GGGGTAACGT ATGTGCGACT GGGAATGGCA 20220
GATCTCGCAc TTCGCTGGTA TGAAAAAACC CTTCTTCTGG ATGCAGAGGA CGAAGAAGCG 20280
TGCGTGGGAC TGATCGCCTG CTAcGAgGCG CTCTGCGACG AagCGcGCGC GTACACCCAG 20340
TATGGAGCGT ACCTGTCCCG CTGGAGGGAC AATCGGGTTA TCCGcAAGGA TTTTATAGCC 20400
TTTCTTGAGA GAACAGAACG GTGGTCcGAA GCGGCGGACC ACATCGAGTT GCTCGCCTCG 20460
GGTGAGCGAG GGGGTTTTTG GGGTACTCGC CTTGCGTTTG CGCGTAAAAA AGCCGGCCAG 20520
TACAGGCAGG CTGCAAT AT CTACCGGGCG CTCTTACGTC AGAGACCGGA CGAGCGGGTT 20580
TTACTGCACA ACTTGGTATA CTGTCTTGAC AAGATGGGGC AGGCAGACGC AGGGCTAAGG 20640
CTGTTCCGCG CTGCGTGCAA CGCGTTTGGG ACGAGCGTGG AA 20682
(2) INFORMATION FOR SEQ ID NO: 71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1356 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
TTTATGCACC CCAnTGAATC GACAGCCCGA CTTCAGAnCA CACAnCCCGC GCAgCCACAg 60
GATGAGCTcC TGCGGCAAaT GGTTACACAA ACACCGTCAC TTCACCGCAG CATATATCCC 120
TGCACCAATm rCGGCTACAC CGCACACAAC TGCAACCCCG ATAAGAACAT TATTCGTGAT 180
CTCTAAACGC CTCAACCGCA TGTTCAAykm GTTCACACGC TGCCTCAATA TCTCCAATTC 240
GCTTCTCAAT GTCTCTATCA ACTCTTTCGA TTCGCTCAAT GCGTGCTCCG ATTCCTcCAA 300
GgCCTTGGCG GCCTTCTCCA ATTTCACGTC GAGCGTCTGT AAGGCCTTTT TGAGCGCGGC 360
CGATTCGGCG TGCCGTTCCC TCAACTGCTG CTTGAGCATG TTTGATTCGA GCCGGATCGA 420
TGCCACcTCC CCCATAATTT CCTTTAAAAG CCCACCAGTA GCCGCCTGCG AATCCGCATA 480
TGCCACAAAA GAGCGCAACA ACACCATACC CCACAATAGT GCGCCCACAC CCCGCTTCCA 540
CATTCGGTCT CCTCACGACG ATGCGTTCAC CTTCCATTCA TGAATAAATC CAAGCATGTA 600
TTGCTGGATA TCCTCAAACC ACTTCTTTTG GAGTGCGCAC GCAGCaCCCC TACATAACGG 660
AAATGCCACG GCTCCCATAC ATACCCCGTC ACCTGCTCGT AACCAGGGGG AAAAGACAGC 720
GACCATCCAA AACGATGGGC GTTGCGCTGC GTCCACCTCC CTGCATCACT CCGTGCAAAC 780
GCCGGCGTGA TAGAACCGAA ATCCACTACC GTCCCCAACT GGTGCTGACT TGTTCCTTCT 840
CGCGCCGAAA AACGCATAGC CTCCTGCATG CCATGCTCCT GCGCATACCA GGAGAACAAC 900
TTTTTCTGAT ACGCAAAAGA GCGATAGGCA GAACCAACGG ACAGTGCCAC CCCGTCACGC 960
GCAGnCGCCT GAATCAGCTG ATGTAACGCT TCGTACGCAA TCTTAGTTAA AAGGAGCGAC 1020
CTCCCCTTTG AAAAAAAGAG CCACTGCTCA CGCACCGGCA CCAGATGCTT CGGCACGAAC 1080
GTTTCTGGCA GGGGATGTTT CTTGTCAACC AAACGCAACA GATACCCCTC CGTGGTAAGT 1140
ACCGCGTCGA GCTCTTGCAA AAACTCCCTC CCCTGTGCAC ACAAACGCGT ACAAGGCGCG 1200
CAGGAAGCGC CGCGCGtTCG CAGCAGCACG CACACGATGC AGATCCACCC GATCCACGCC 1260
CTGcGGcGAG ACCGCATTTC CCAGAGCAAC TAACACGTAC CCTGGCATAC GCACACGCAT 1320
TCCAACGCGC CAnTTAGTCC AACTCATCTA ATGATT 1356
(2) INFORMATION FOR SEQ ID NO: 72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
TACTGGTGGT ATCCACCTCA ATGAACTGTT TTTCCCTTTG TGGGAATCTT CTGAGGATCT 60
CCaTCAACGC GCAGGAGCTT GGCTGTATAC TTATGCACGT GATGCAGAAA AGGAGCTCTG 120
ACATACAATC CAGCGGTGTT ATTCGTTTTT ATGATTTCTC CGAACTGGGT GATGAgCGCA 180
AcCTGTCCTT CCTGGATGAG GTAAAACGGT TGCAGGwGCA CAACACCACC TAACAGCACC 240
CCAACGACTA TACCTATGTT CAGAACAGGT CGTAaCGTGC GTGTACCTGT AGTCCACGTT 300
TCCTCATACC CCTACTCCTC GCGTGTTCCT GCCACGACCT TCTTCGATAC CTTACTGATA 360
TCCTTGAGCG TTAAAAGATT CTCCAGTTTT TTGTCAATCA ACAGCACATT TTCAGTCTTT 420
TCCAGGATAG CCCCCAGTCC CTCAAGGTAC AAACGCGTTT TGGTAACATG AGGTGCTTTG 480
ACATATTCAG CATAGATTGA GTCAAAACGT GCTACATCTC CTTTTGCTCT ATTTACGCGT 540
TCATTCGCAT ATCCCATAGC CTCCTGAATC AACTTGTCCG CGTCACCTCG GGCCTTAGGA 600
ATTTCCCTAT TGTAGGACTC TTTTCCCTCG TTAATGAGTC GATTCATATC CTGAATAGCA 660
ATATTCACGT CTTCAAACGC TTGCTGTACC TCCTGAGGAG GAACAACATT TTGCAGCTGC 720
ACGGAGGAAA CAAGAACACC TAGGCCAATC CTTTTCAGGA GAACATTCAT CATATCCTTC 780
GCACGCATCT GAATCGCACT GCGCTCCGGC CCCATGATAT CAAGAATCGC TCGATCTCCA 840
ATTAAACTGT TCACCACTGC TTTTGAAATG TCTCGAATGG TTTGCCTTCG CTCCTGGGAC 900
TCAACATTAA ACACCCATGC TCTTGGATCT ACAATGCGAT ACTGAACCAC CCACTCGACG 960
TCTACAATAT TCAAATCCCC CGTAAGCATA AGAGACTCGT GACTGATATT ATTCACATAG 1020
TGACTCTGCT CGGAACTCTT CGACGTTCTG AACCCGAACT CTTCCTTTTG CACCTTGGTT 1080
ACCGGCACTT TATACACCCA CTCTACAAAG GGGATAAGAT AATGCAATCC CGGTTCTAGC 1140
GTCCGATGAT ACTTGCCAAA ACGGGTGACC ACCCCATTAT CAGTGGGAGA AATGATCCTA 1200
ATAGGGGAGG CAATTCCAAC AATCACGATA CCGAGCACCC CACCTATGCA TCCTGCCACC 1260
ACGCTCCACG TTGCTGGAGT CCACTTTGGT ATTCGCATCA CGGCACCTTC CTACACACGC 1320
TCGTCTTTTC GCCCATCTTA CGGGAAACAT TTTCCTGTGA CAACACTCAC CGTATTCACA 1380
CAGAcTTCGT TGTAGACAGA ATAAAAATTC TCACTCAGTA TAAAAACACA GGAGGCATGA 1440
TGTATCTTAC AAAGGAACTA CTCGATACGT TTGCGCACGA AGTCGCCGCA GATCCTATAC 1500
ACAAAGCGGT CGCAGGAGCT GTTGCGCGCG TCGGTCTTGA AGAAGCTGCA CTGAACACAG 1560
AAGTGGCGCG TCAgCACACA CATATTTTTT CTACCGAGAC AAAACGTGGA GAAATGACCA 1620
ATCAAAAAAT GAGTGGTCGC TGCTGGATAT TTGcTGCGCT CAACGCCGCG CGTGTAAACA 1680
CCATGAAAAA GTTGGACATT GAAACAGTTG AGTTTTCCCA AAACTATCTT TTCTTTTGGG 1740
ATAAATTGGA GAAAGCAAAT TTCTTTTTAG AAAATATCCT AGAAACACTT GATGAACCTC 1800
TCACCAGTCG GTTGATGGCA CACCTGCTTG CAAATCCCGT CCAAGATGGC GGGCAATGGG 1860
ATATGTTTTC AGGGTTATTA GAAAAATACG GTCTTGTGCC CAAAGAATGT ATGCCTGAAA 1920
CTTTTCACTC TTCCAACTCA CGCGTTCTTC TTGCAGTCCT CACTCGTCGG CTGAGGAAGC 1980
ATGCACAGCT TTTACGTTCT GCGCATGAAG AAGGCGTTGC GCTGCATACC CTGAGGGAGA 2040
AAAAGGAAGC GTTCCTTTCT TCCATCTACT CTATCCTCGT GAAGGCTCTC GGGAGACCTC 2100
CGGAGAAATT CGACTTTGTG TACAAGGATA AGGAAAAAAA ATTTCACAAA GTCAGAGACC 2160
TTACGCCGCA GAAGTTTTTT TGCGATTTCG TCGGATGGGA TCTTAAAAAC AAAGTGAGTT 2220
TGATTCACGC GCCAACTGCG GATAAACCGT TTGGCAGAGC ATACACGGTT AAATTTCTAG 2280
GCACCGTAAA GGAAGCCCCG TGCATCTGCT ATGTCAATAC TCCCATTGAA GTGCTCAAAG 2340
AAGCTACAGC TTCTGCAATC CGAGCCGGGG AGCCGGTATG GTTTGGTTGT GATGTAGGTC 2400
AAATGATGAC GCGCAAAGAT GGTATCATGG ATACGGAGAT ATTCGGGTAC GAGTCGATGC 2460
TCGGCACTAC CCCTGAATTC AATAAAGCAG AACGGCTTGA CTATGGCGAA AGTCTTTTAA 2520
CACACGCGAT GGTCATAACC GGTTTTGACG AGGATGCACA AGGTAACCCC GTACGCTGGC 2580
AGGTAGAAAA TTCGTGGGGA GATGACACAG GAAAAAAGGG CATGTTCTCT ATGAGCGATC 2640
GCTGGTTTGA CGAATATCTC TACCAAATTA CGATCGACAA GAAGTTCGTA CCACAGGTGT 2700
GGCTCGATGC GCTAGAGAAG CCAATAATAG CGCTCGAACC TTGGGATCCG ATGGGAGCGC 2760
TGGCGGACAC CCCTCTGTAT CTTAAAAATT AAGAAGAAGA ACAAGTGCGC AATTCTGATC 2820
GGTACTTATT TACGGTACGT CTTGCGCACT TGATGCCCTG CTCACCGAGC AACTGGGCTA 2880
TCcTTCGGTC GGAAAGGGAT ATACGCTGCG TTCGTACCTC TTGTATGAGA CGTGATATCC 2940
GGTACTTAAC TGATACTTTT GAGTGCGGAG AACTCGGGTA ATTCTGTCCA AGACTGGACC 3000
GATCACGATA TTCTTCGGTG GATAAAACCC GAGGGGAGAA AAAGTACCTT AAGGAAAAGT 3060
GTTGCGATCC GTACTGGAGC CATTTGTCGC GCACTATGCG GGACACTGTT GAAACGCTCA 3120
ATCCGGTCCT GTGTGCAACA TCTGTCATTC TCAGGGGCGT GAGctTCGCA GGTCCGTGAT 3180
CAAAGAAACC GCATTGGTAG TGAACTATTG TTTTCGCGAT ATCCAGCAAG GTACGTTCCC 3240
GGTATGAGAG CATACTTACA AGACTGAGCG CGTCGTGCAT GCATGCTTTC AACGCGTGGT 3300
TTTTTTCTGC CGCTTTTGAA TGCATGCAGT AATCGTTTCG GAAAACCACA GTTGGGATGC 3360
CCGTGCAGTT AATCTGTGTA ACAAACCCGT GCGCAGTTTT TGTAATCAAT ACATCTGGTT 3420
CAAGCAACAT GTTCGTGTCA GcCCGCTGAG CGTTCGACAC ACACTTACCT GGAAAGGGAT 3480
GTAGTTCCTT AATGAGGAGC AAAATATCTT TCACGTCATT TGACGAAACC TTCTGCACAC 3540
AAAGCCCCAT ACTATTAATC TGTGTCGTCA GCGCGTGCAC GGATACGCGT CCATCACACA 3600
TATTGTCAGA GCAGAAAAGC AATTCGCTGT GGTGTGTTAG TAGATTGATA ACACATCGAT 3660
ACAAGGGATC AGAGAAACGC TCAAAGCGCA gCCGCGCTTG GACTGCCAAT GATTCTTTAA 3720
AATTAAAAAC AGCACACCCT TGTGGCTCAA GTCTTTGAAT GAGCGCTATT GCCTGcGGTA 3780
TTTTTTCTTG AAGGGCTGTG GGCATACTAC CACACATGTT CTGAAAGATC GCAGGAGATA 3840
TGGAAAAAAA ACCGTGATCA TCTAACATCT GGATAAACGC GCACGCCAAA TCGAGCACAA 3900
TCGCTTCGTG TTTTTGATAA AAAACTTGTT CACGCAATAC AGCTCGGATA TTGTCAACCT 3960
GCTTATCGGG CTGATTTTCC AGCAATTGCT GAAAgCGATC ACGTGCGCGC ATGCgctCAC 4020
GTCTATCACC GAGCGACAGG TAACAAGCCT TCCCAGTGCG ACGAGCGGAC GAAGGACGTA 4080
TTTCTAAAAG GGGATTGCGT TGCACGGCAC GGAGAACCTC GGTCTTCAAA TCCCCCCGAG 4140
AAAGCTGCAG TAAACAAAGC CCGTGCACCA ACCGCTGATT GAGAACTAAC CGCTGCTGTT 4200
GAACAAGCTG CTGCATATCA CCACGTGCCC TGCAAACCAA TCCCACGAAA GGAATACAGA 4260
AAAGGAGCGA AACGCTCACA GGTGCGTAGT TTCTAGGTTC TCCATTTCTA CTAGCGCACG 4320
CGACGCCGTT TGCACATACG CTCTCGCCTG AGAAGCCTGC TCACGGAAGC TAGCATTGTC 4380
GCTCCCACGG ATATCAGACC AATTGGTAAG CGTTGCGTAG GCACTTGTTT TCGACGCCCC 4440
GATCATACTC GCCAACATAG CGCCAATGCA ATGAAATGCC TTTAAGGTTG CATCAAGCAA 4500
GACAGTCGCC TCAGAGTCAA TTACCTTTAT CAGAGCTTCC AGAATACCCT TACTTTTCAT 4560
GGTAACTGTG TGCTCTTTA 4579
(2) INFORMATION FOR SEQ ID NO: 73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1015 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
TTCCCCAAAA CGAAGCGTCC AATCTTTTAA tAATGTGCAA GTTCaATATT CaAGGTACGT 60
ATTGGGAACA GCGCAGAACT CTGTTTCATT TCATCCCAAA AGTATAACGA TGAGACAAAT 120
GCATCCCACA GCTTAGTTTG TGGAGATTGC GATGACTGGG TGGGAAAGTA AGGCGTTAGT 180
TTCTCATTTT TAATCACAGT GTTTATAGAA AGATCGAGAA ATTTATAGAT GGAAAAAGTT 240
ATTAGCGGAG AAAAACTGAT ATGACTTTTA TCCAGATCTT TAAAATTAAT CTCTAACGAG 300
GAAGACAATG TCCCCTGTAT TTTTATACGC CGCTTCCAAA AACGTACTGT CAACGGAAAA 360
TCATCATGAG ACAACGTGAG CTTTAATTTA CTTATAGAAA GACCATTATT TCCAGAAGGT 420
GCAGCCTTGC CAGATTCCCC CTTTAGAAGA TAAGAAAGAG AAAAATACTT CCAGCCGAGC 480
GACACTTCAT AAGAATCGCT CATACTTTTT CCAATATCGT ATACGTAGGT TTGTTTACAG 540
GTTATTTCGT AAGGCATTCG AAAATCTAAT TCAGCGCGTA CcTGCGGTTT ATCGAGAAAG 600
CGTGCAGAGA GCGTCGCAGA GATATACGGA AAAGAAAGAA ACGTCGCAAT GCTATACGCA 660
TATGGTCCGG GCGCAAAATG CACACTGACA CGTATCTGCT GTGTATATCC ATACAGTGAA 720
AAAGCAGCAT TTGCGGTGAT GCTATGATCA CGTACATGTA ATGAACTTTT CCCAGTAGGT 780
CGAGACGAGT CGTATAGCAC GGGGGTAATT GACCAACCGA GAAAACTTTC TCGAAAAAAG 840
GAATCTGCAG AAAAAGGATA CACTGCAAGA TTATTCGAAC TCTGCACTAA AGCGGAGTTC 900
TGAAATCCAT TTTGTTTTGC CAGGTGGTTG ACTATGAATC GGATATCGGT GCTGCAGAGT 960
GCACTGTATC CCGTTTGACA TGCGAATTAT ATCATTTACA CACTGAAACT GAGAG 1015
(2) INFORMATION FOR SEQ ID NO: 74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9974 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
AAAACAGATT TGTAATGTAC CATCTGCCCA TGGATATGGT ATCTGCGGCG TCCGCGCAAG 60
CCTACCCCCG CAGCCCCCTG ATTGAGCGCT CGTCCCCTAC AGTCTCACAC TTTTTGCCGA 120
GAATATCTTA ACCGTGCTCT GTCAGCTCAA TACTTTGTCT ACAAGGAGAC GCGCCTGCCG 180
TGAGAATCCA TCAACGCTCT GCTCCGTGCG TGCCTGTGCT TCTTTTTCTC TTCTTGCCGA 240
GTGCGCCGCT TTGTGCGCGG GGTAGCAAGG ACTGGACGCC GCCGCAACTG GGCGAGGTGA 300
TAGAGAGTAC CGAGCAGGAC CTTGCAGAGT TTGATGCCGG CCTTTTCCGT GCGGATCGCA 360
TCCTGGATCG CCATGACCTC TACCGCAAAA CCATGCACCA GCTGTTCTCC ACGCTCCTTG 420
AAGAACCTAA AAACCACGCT AAGCACCTGC AGCTCATCGA AACGTTAGAA AAGCTCGCCG 480
GTCCAGAGAG CAAAGAAATA CACGAGTTTC TCAATCGAcT GCGCAATTCT TCTACGTACG 540
CATGTACGCT GCCCGTTTCT TTCACCTCAT GGAGCGGGCG CGCATCCTCA TGGCTCGCCA 600
GGAATACCTG AAGGCCGCGC TCCTGTACCG AAGCGGCTAC GAGCTCTACT ACGATGAGTA 660
CCTTGCCGAC CCGTCAAGTC CGGGGAAAAA GGAgGTGCGT GCTCGCGTCG AGCAsgCAnA 720
TGcGCATGTT TCCCGCGCAA AGCCCCTCCT AGAAGCGGTC GCCGCTGCAC GGGCTCAGTA 780
TCAGAACACG CAGAAAAGGA CGTATGCTGC CAGCGCCCAT GAaGGCTGCG CGCGCGCGCG 840
AcGCGTACTC TGCCGCCCCC GTGCGCCTGC TGcACCgGTG CCCGCGcACC GAGTGCAGCG 900
TATCCTCACT CCTTAACGGT GGAGGCAGAA TTAAGGATTT TGCAGGACTT TTCTAAAACC 960
ACTGAGGAAA GCGCCGCGCT CACTTCCCTG GTCCaAgCGC TTGGAGCGCT TTTAAAGTTT 1020
TCTCGCGACA TAGAGCACAC CGGTGTTGTT TTTGAACAGC TATCCACACG CGCGCAGAAA 1080
AATAACGAGA CACAAGAGGC CTTCTTGGCC GTTGCACGCA AAATTACGCT CGGGCGCAGT 1140
AAACTTGAGT TCGAAGGTAT TCTCGGCGCG CTCCAGGCTC CTGCCTTTGA CGCTTTTGTA 1200
GATCTTTTTG AAGCAGGTCG CGCACATGTA GCGGCGCTCC ACGACCAGGC GCGCGCACAG 1260
TTTACGTTTG CACATCCTCC GCACTCAGGC AGAAACATTC CCGCACCCAC CGACACTGcA 1320
CTGGCAAGTG CAGGCGCATG GGCAGCAGTC GGTGCAGGAC CTGCAGGATC GCTCATTCCT 1380
GGCGCTCCTC TCAGTGCGGG AGTCGGCTCT CGCGGCGCGT GGGGAGCGTT GCCTGCGCCA 1440
GTAGAGCCGC TGCTCCGCCA GGCGGATGAC GCATTGGGTG CGCTTGCAAG ACTGTGGGCA 1500
GCGTGCGCCC CGCTCGGTGC CCAGCATGGC AGATTTCCCC GCGATTATGA GACCTTTGGC 1560
GCGCAGATTG TAGCGCTCAG TGCGCACGCC GAgCGTTGCG CGCCACAAAA CACGCGTACG 1620
ACTTTTACCA TGCACTGCTC GCCTTCCAGC GCGCCCCCAC CGTGCCTGTT TCGGCTGCAT 1680
TGCGGCGTCA GGACCTTTCC CAGAATGAAG CGTTCGCGCG GgATCTGAGC GAACTTGCAC 1740
ACCACCAGGA GTTTTTGCGT CGTGCCCTTG CAGAAACCGA GTCTCTTTCC CCGCCTGCAG 1800
ATACGGCAAG CaCCCcGTCA CCGGGGGGTG CAGGGGATAC TCCAGTGCCC AGCCAGGCTG 1860
ATAAGGGAGG GGCAAAACAG AGCGCTGCCC CTGATACTGC GCAAAAGGCA GTAGCCCAAA 1920
AAGCGGGTGC GTCGGAGGAG GCTGACGCGT CGTCTTCCCC CTCCGAAATG GCGGTGCGTG 1980
CGGCGCGTGC ACAGCTGCAT GCCATCCAAA GTGAGCTGTT GCGCcGCTTC ACGcgCtTCA 2040
AACGCAACCG CTATACCGCA CATATGGCGT TTCATCAGCA CTCAGGCGTT TCTGCGCTCG 2100
CTGAGTATGC GCAgnAGcTT aCCAGTGCCG AGGAAGCATT GCGCTTTGAC GCGAAAGACG 2160
AGCGCAGGGT ACGCGCGTTG AGCTTTGTGT CTGAAACGGG TCCTCAGCAG GTGAGTAAGG 2220
ATATGGAGGC ACTTGATCGG TTGCTTTCTT TTTTTTCTGG CGAAGAAGAG TTCCTGTCTG 2280
AGCGTGGCTA TGCGTATGGG CTGCAGTCCC TGCGTGATTT GCGCACTCAG TTTGAACAGT 2340
TCTCTGCACG CGTGCAGACA CTTTTTTTGG CAGCAGAACA ACGGGCTATT CACGAACGAC 2400
TGGCGCGTCA AGAAGCAGAG TACCGTTACC GACAGGCAGT GGAAGGTCTA GGTCAAGATG 2460
ACTTTGGCGG TGCCCGTAAG AATCTGGTGC TATCTCGAGA AAAGGCCGAT TTGGCGCTCT 2520
CGTTGCGGTA CGACACCGGc tACGCTACCG AAACTGACAC GCGATTGAGC ACGCTTGATT 2580
CCTCAATTAA CAGACGGGAA AATGAACTGG TTGTAAAGGA CGTGCGCGCG TATATCgCAC 2640
AGGCAAAAGA TAAaTATTAC AaGGGAGAGG TGCTCGATGC GGAGcGTTTG CTCATTCGTG 2700
CGAAAAATCG CTGGGCAGTT ACAAACGTCA CCGAGAATGG GGAAATTACA AATTGGCTTT 2760
CTGTCATTAG TACGGCGGTT GCGCTCAAAA TCGGGCGGGT AATTCCTGAC TTTGCACCTC 2820
TTTACCCGCA GATGAGTCAG TTGTTACACC ATGCAGAGCA GCTGTACTTG CACGCGGCAT 2880
ATTTGAACGC GTCGCAGCGC CAAGAGATGG AACGGTTACT CGCCACCTCG CGAGAGAATA 2940
TACACAAAGT ACTGCTTGTC TATCCGTTGA ACGAGCGCGC AGGGCAGCTG AGTCTGAGAA 3000
TAGACCAACT GCTCGATCCC CGCTCCTTCC GGCAGCAGTT TGCAAAAAAG CTCGATACCA 3060
TCAGAGGAAC GTACAAAACC GAATCAAAAA AGGCCTACAG TTTGCTCCTA GATTTGTACG 3120
CAATCGATGC ACGCTTCTCT GGTATCGAAA AGCTGAAGCA GGAAGTGGAA ATCTACCTGG 3180
GGGTTCGATT GCCGCCGCCA AACCCGCAGG CCATTGCACA ATCGTCGAAT TTTACGCTGG 3240
CTGCGCGTCG TATCTTTGAG CGTAGAGACG CGGCGCTCTA TCAGGTAGCA ATTCAGCAGT 3300
TAGACGAGGC GCTTAAGCTG AATCCTGATA ACGATGCGGc TGCGCAgCTG AAAGATCGTA 3360
TCCAGTCGCT CACCGGTGAC GGTGCGGTAA ACGTACTCAG TAGCGAAGAC GAAAAAGAGT 3420
ATCAGCGCGC cTTGCAGGAA CTCCAAAAAG GAAATAAGCT CGTCGCCTCC GCGGTGGTTG 3480
AGCAGCTGTT ACAGAAAGAT CGCAATAAGA AGTCGGCAAA GATTCAGCAG TTAAAAAAGA 3540
GGATTGACGC ACAATTATGA ATGCTCGTCT GTGCTTTTTT TCGCGTCTTA TCTTTTGCGT 3600
ACTTTCTATC tGTGcTTtGC CACTTGTTGC TCAGGAAGAT AAGCTCTACT GGGAAGATCC 3660
GTGGGCACTC AGCACTGAcG TGCCGCTTTC GTCAAAGTTG CGTATTCGCA CGATGTCGTT 3720
GCCGTCGTAT GGCAGGAAGT GACGCCAAAA AATGCTACCT CGGGAGAAAT ACGACTGTCT 3780
GCGTCTTTTT ACGATGGCAG TACGTGGCAT ACCGTGCGTA CATTTTCTCC ACCCCTTTTG 3840
TACAACCACC GTTCTCCTTC TCTTGCCTCC GTTGCTGTTA ACAGAAAAAA TGAGATTTTT 3900
GTTGCTGCCG CTTTTGATGC ACACACCATC ACCGTCTTTA AAACTACGGA TTTTGGAAAA 3960
TCATTTACGC ATACTGTATT GCGTTCTCAG GGAAGCGATA TTGTCGCCCC CTATGTGAGT 4020
GTTGCTTCAG ATGACTCGCT GCTGCTGTTT GCCTCTCACG GTTCTGAGGA TCACTTTTCT 4080
ATCTTGCTTT GCCGATCCGA AGATGGGGAG CGTTGGACTC CcTTTCAGGA GTTTTTGTCT 4140
ACCGAATTTA GCCGCAGACT CTTTTTGCCT TCGCATGTTT CAACGCAGGC CCAAGAAATA 4200
GTGGTGTTTC AGGCACATCA CCAAGAGGGT GAGAGAGCAA GCTATCAGTT GTATTCAACC 4260
GTTAGCTTTG ACCAGGGCAA TACGTGGTCT GCgCCTGTGC CTGTTACACA ACCTGATGAG 4320
TATCACAATC AGCGGCCCTT TTTGGATCGT CTCTCAGATG ATCGTTTTGC AGTTACGTGG 4380
GAGCGCTCTG AACGTACGTC GACGCGATAC GAGATGTGCT ATGCCGAGCT CGATCGCTAT 4440
GGGAGAAAAA TCGGGACTAC gCTCCGCCTG GCAGAACCTT CTGACCGTCT CATCACTCCC 4500
AACTTTGTGC ATATCGACGG TACCACATTC TGTGTGTGGG CAGGAGAGTC AGCCGGGCTC 4560
AATACCATTT TTCTCGCGCA GAAAAAGGAA GGCGCGTGGA GTACTACTGC CGTACGTTCT 4620
AGTGAGGATG CCTTGCTGTT TcCGCATGCG GTGCGCGTTG ACAATCACCT TGAGGTTTTT 4680
TGGCAAGAGG GAGAAGGGGC GCGTGCACGT GTGATGCGTT TGCGTCCAGA TCAGAGTGTA 4740
CAGCCACCGA CCCTGATTGC AGAAAATTTT TCGCCAAACG CGGTAAGAAA GGGGACGCGC 4800
GCGCGGtACG CATTGTATTT CCTCGGGATT CGTCAGGCAT TGCAGGGTAT AACTACGCGT 4860
GGCAATGCGG CGTGCAGCCT GCTGCTCCTC CTGATTACGT TGCACACTTT cCGGACAAAC 4920
CTCAGATAGA ACTGGAGGCA ACGCAGGATG GCACGTGGTT TTTGGCCGTA ACGGTGTGGG 4980
ACTTCGCCGG CAATAAGTCA GCTCCCGCGT ACCTTTCATA CACGCGGGGT ACTACGCCTG 5040
tGCGCGTCCA CAATTGCAAA CTCCTCTACT GGAGAACACG CATGCGCTGA AGAGCAACAC 5100
GTTTACACTC AGTTGGAATC AACCCAGTAC TGATGCGCAA GGAAACGAGG AGCGCGATCA 5160
CACCAGCTTC CTTTGGAGCT TACAACAGGT GGCACCGCTT TCAGCACTAA CGTCCCTGCG 5220
TGTGGATACT GATGTACGAA CGTTCGAAGA ATTTCAGCAG CGCTGCGTGC GCGCCTTTCC 5280
TATACCTGTG GAmGTGCACG GCAcGCGCAg CAGGcAGTCG TCCGTATCGT TCACTAATAA 5340
GGAGAACGGC ATCTATCGCT TTAGcGTATA TGCCCTTGAT CGCTCTGGAA ACGTGAGCGA 5400
GCCCGCAGTT GTCTTTTTTG CCTTACGGCA TTTCGTACCC TACACCGCCA TTCGCTATGT 5460
GGATGTGAAA AAAGATCCTG CCGGTTCATT GCAGATGTCG ATTGTTGGTA ATGGGTTTCG 5520
TGCGCAAGGG ACAGTCAGTC AGGTATACAT CGATCGGGAT CGCAAAGCTC CATATGACTT 5580
GGTATTGCAT GCGCAGGAGT TCGCCGTTGG TTCAGACAAC CTTATTTCAG ACATACACAT 5640
CGATAATTTA AAAAAAGGTT CTTACCACGT GGGGGTATGG CACCCTGCTC GTGGGGTGCA 5700
TTTTGCAGAG TCAAGAGTGA CGGTTTCTGA AATGGGAACG GTAAAATTCG GCGCGTACGA 5760
CTATGAGCAT CAGGTGCGGT GGAGTATCCC ACACACTGGT GGATTGAGAG TGAATTTTGT 5820
TTCACTGTTC ATGCTGATAG CGCTTTTTCT TGCGGGTGTG GTGTTTGCAG CGTCACTTAC 5880
CAGGATAGGT GATATCGTCG GAGAAGCGTT TGTACTTAAA AAGCAAGTGG AAGCGCTCAT 5940
GATAGGAGAG CTTATGCCGT CAGAGAAGAG ACGAAAGGCT ATGGCACTGA AAACACACGG 6000
TGCAGGATTG CGGGTGAAGT TCATCCTGTT TGCACTTACG CTGGTTATAT CTGTCATTTT 6060
TATTGTGTCC GTGCCGCTTG GAGTGCGGTT TTCAAAAACA CAAAAAGATT TGCTGGCTAA 6120
AAATCTTTTT TCTCGGGTTC AAGTGTTGCT TGAAAGTCTT GTGGCGGCAG GAAAGGTATA 6180
CCTTCCAGCG AAGAATAAGC TTGAGCTTGG CTTTTTGCCC AATCAAACAA CGGCATTGCA 6240
CGAAGCGCGT TACGCGtTAT CACAGGAGAA AGTGAAGAGC CTCACGAAGA AGGTATCGAT 6300
TTTGTGTGGG CAACGAATTT TAGCGATATT GAAACGGTGC TCAATGAGCC CGAATATCGG 6360
CAAGGCAATT CTCGTTTTGT TGACAAAAGG ATTGCGCAGA TTTTGCCGGC AATGGAGGAT 6420
TTGAACAGAC AGGTTAAGAA AGATGCAGAA AAGATAGCAA AGGGTATTGC GGATCTGACG 6480
CAGGAGGCAG TTGCGCTTGC GTTGCGCACT GATCAGGGGT CAGTACGTCG CCGAGATGAT 6540
ATTCAGTCCA TTACGCGGCA AATGGATCAA AGGCTTTTGG AAATTTTTTC TACATTTTCA 6600
AACAACGCGG TGGGCTCCTA CCCTGAATAT CGGGTTGATA ATTTATCAAA GCGTCACAGC 6660
TCCTACCTTT TCTATAAGCC CATCCTGTAC CGCCAACGCG GACACGcGgA TAGTTTTGTG 6720
CACGGCGTTG TGTTTGTAGA AGTCTCTACG CAGGAATTGC TCGAGCACAT TGAGGGTTTA 6780
CAGCGCGATC TCATTAAAAT GGTATTTTAC GTTTCTTTAA TCGCACTCGC CTGTGGGGTC 6840
TTTGGCGCGT GGATTCTTGC CTCTATTATC ATCAAGCCTA TACGCAGGCT GGCAAGTCAT 6900
GTGGCGATGA TTCGCGACAC GGAAAAAAAG GAAGAACTTG AAGGAAAACT GATTGCCATC 6960
AAAGGGCAGG ATGAAATCGC TCTCCTCGGA AGAACTATCA ACGATATGAC AGAAGGGTTG 7020
ATCAAGGCGG CGCTTGCCTC AAAGGATTTG ACGGTTGGAA AGGAAATTCA AAAGATGTTC 7080
ATCCCGCTTG ATACCAACAC TGAAGGGAGA AAGCTTACAT CTGGGTATAC GTGCGATGAT 7140
CACGTGGAGT TCTTTGGGTA TTACGAAGGC GCGCTCGGCG TTTCTGGGGA CTACTTTGAT 7200
TACATTAAGT TAGATGATCA GCATTATGCC ATCATAAAAT GCGACGTTGC AGGAAAGGGA 7260
GTTCcCGCAG CGCTTATCAT GGTTGAAGTG GCAACGCTCT TCCAGAACTT CTTTAAAGAT 7320
TGGAATATTC AAAGTCATGG TATCAACCTA AGCGACATTG TCTCTCGCAT TAATGATCTC 7380
ATTGAGGCGC GCGGGTTTAA AGGAAGATTT GCAGCCTTTA CCCTGTGTAT CTTTAATACA 7440
GTGTCCGGTA CGGTGCACTT TTGCAATGCa GGGGATAATA TAATTCATAT TTACGATGCG 7500
CAGmAAAGAA AAATGAAGCG TATTACGtTG CGCAAACTTC TGCTGCAGGG GTATTCCCGA 7560
GTTTTATGAT TGATATGAAA GGTGGGTTTG GTGTGGAAAC CCTCACCCTG CGTACAGGTG 7620
ATGTCCTGTT CCTCTATACT GATGGCATAG AAGAGGCGAA cGTCTTTTTA GAAACAAGCG 7680
GTTTGAACTG GTAcTGTGCC AGGAACAGGG ACTTGCGCAT GATGCGCCCC ATGAGACACA 7740
TACGGTAGGT CAGGCCGGAG AGGAGCTGGG AGCTGAGCGT GTCAGCAGCA TTATCGAATC 7800
AGTCTTTCTG AGGAAAGGTT TTTCCCTACA AAAGTGGCAT AACCCTGTCG AAGGCGAAAA 7860
GTTTGAATTT GATTTCTCCT CTTGTGAAGG AAATCTAGAC GAAGCGGTGC TCGCACTTGT 7920
GGCGGTGGAG CAGGTGTTCC GTATGTATAA GCACCCTCGG GCAACCAACC TTGATAAAAT 7980
CAGGGTGGAT AAAAAAGTGG ATATGTTTTT AGCACGGTAT TTTGTTCAGT ACCCTGAGTA 8040
CTGTGCGCGC AAAGAGGTAA ACAGCGAGTA CGAAGAGTAC CTGTATTATA CGTTCATTAA 8100
AGAAGACGAC CAATACGATG ATCTCACTAT CTTGGGAATA AGAAAGAGAT AGTGCCGCTG 8160
TTGTGCAGGT TATTGCATGG TGTGTGGGTT GTGACAAGGA GACGCAATGC AGATTATACC 8220
CATTGCGAGT GGAAAGGGTG GGGTTGGCAA GAGTTTGCTT GCGGCAAATT TGTCCATAGC 8280
GCTCGGTCAA GCGGGGAAGA AGGTAGTAGT AGCGGATTTA GATCTTGGCG CGTCGAATTT 8340
GCATCTGGCG CTTGGCCAAA AGGGAAATAA GCACGGAGTG GGAACATTCC TTATGGGTGC 8400
CTCTTCTTTT GAAGAGATTA TGGTGCCAAC TGGATATCCC AATGTATATC TTGTGCCAGG 8460
AGATTCTGAG ATACCTGGCT TTGCTGCATT GAAGGTTTCT CAGCGGCGGG CTCTAACAGT 8520
GGGTTTGTTA AAAACGCATG CTGATTATGT GGTGCTGGAT TTGGGGGCAG GCACTCATCT 8580
TGGAGTGCTT GAGTTTTTTC TCCTTTCTTC ACGAGGGATT ATCGTTACTG AGCCTGCAGT 8640
TTcTGCGGTT TTGAATGCCT ACCTTTTTCT AAAAAATGTG GTGTTCAAAA TGTTGTGCGC 8700
TGCCTTTAAG AAAGGGACTG GGGGAAGTAT TTTTTTAGAG AATCTCAAGT CTGATGCTGC 8760
GGCGGTACAG CGCATGTATG TGCCTAAGAT TCTTGCTGAG CTTGAGCGTG TGGATCAGCG 8820
GGGAGTTGCA GTACTTCTGG ATCGGATGCG GTCTTTTAGG CCGAGACTAG TCATGAACAT 8880
GATTGCAGAT CCGAAGGATG TGGATAAGGC GTTAAAGATT CGCCGCTCGT GTGAGCAGTA 8940
TCTGAATATT ACGCTTGAGT ACCTTGGGGT CATATACCAG GATACGCAGC AGAATGTCGC 9000
GCTCTCCTCT GGTCTTCCCA TTGTTGTGTA CAAACCGCAG TCACTGATTG CCCAGGCAGT 9060
GTACCGGATT GCCGATAAGA TTTTGCAGTC AGAGGGTGAG GAGGCGTCTT CCATTGAGGA 9120
TTATGAAGGG TTGGTGGAAC GAAGTTTTGC CTCTGCAGAA GCAGAAGCAG AAGTGGATTT 9180
CCAGTTTCGT ATGGACTATC TTGAGGATTT GATAAAAAGC AAAACAGTGT GTGTGGGAGA 9240
TCTTGCTGAG ATCATAAAAG CTCAGCAGTA TGAAATTGCT ACTCTGAGGA AGCAAAATCT 9300
GCTCCTCCAA AGGAAAATAA ATAAGACATT GCGCAATGCG TGAACTTCAT GAGGGTGGGG 9360
TATAACCCCT ATTTGTGGGG GTGTTTTTGG GAGAATACAG TTTACgCGGA GyGtGGTGAA 9420
TGGTGACAGG ATAAACGGAA AACGGTGGCG GGGTAGTGCG CGGTGCATTT CCTTACGCGG 9480
TGGAGGTTGT GTGATGTTGA GTATTGTCTA TCCGTCGTGG ATTCGTCCGG AAATAATTCC 9540
TTCTTTTCCC TATTTTCGCT GGTACGGCTT CATGTATGTG GTTGCATTCA GTATCGCGTA 9600
CATACTGTTT CGCTACCAGG TGCGGCGCGG TGAGCTTGAT AAATGGAGTC GGGTGAGCGA 9660
GCCTGTCACG CAGGATGACA TTATGAGTTT TTTTACGTGG ACGATTCTGG GCATTTTAAT 9720
AGGGGCGCGT GTTTTTTCCA CCATGGTGTA TGAGGTCGAT TTGCTGTATA TGCGCAAgCC 9780
ATGGCTGATT TTTTGGCCGT TTTCTTTGCA AACGGGTGAG TGGGTTGGAT TGCGAGGAAT 9840
GTCGTACCAC GGTGGGTTAA TTGGCGCGCT CGTGGGGGGT GGcTTGTGGA CTCAGTCGCA 9900
TGGGAGAAGC TTTCTTGCAT GGGCCGATGT CGCTGCAGCG TCAACTCCAC TTGGGTATAC 9960
TTTnGnAGAA TTGG 9974
(2) INFORMATION FOR SEQ ID NO: 75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5861 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
AGGAAGCACT GGAGCACGTC CGnAAGCACC GTCTCGCCCA TGCGCGTACC ACGGCAACAT 60
AACTTTGAGA TTCAGTATCC CCGGTCTCCG TGCACTGTGC AGTAAAGTGA TGCGTATACG 120
CTCCCTTTGG GAAATCCCAC ACATGCAGCT GCGCGAACAG CACACACTTC CAGCGTGCAT 180
CAGAGTCGCG TTCAAGACGC CTTACCTGAG TCAAGCCAAA CACAAGTCTA CGGACATTCC 240
CCCCAAACTT TTTACGCGCG CGCGrAwTTA CCGCGTGGTC CGGCACCTTA CCTACCCCGT 300
ACAAACTATC AAGCCCGATG TTCACAAACC GACGCTTCAA AAGCCCCCCG aGCAAACGGG 360
TGGCCGTCAG ATTCTCGTGC GTCAGGGrCC GTCCTGTCTT CCCATAATCT AGAATACATA 420
TCGTTGTAGA TACCTGCCTG CGCACGGGAT GCTGTTTTGG ATCAGACGAA GGCGCCGCAA 480
CGCGACACAC CAAATTAGCA AGCGCAGGCT CCCCCGCTGC GCTGTTTTCT TCCCCCTCGG 540
GGGACGGTAT GGTAGACACG ACAGgTTCGG GAGATGArGG TCGTTCCTGC AACGCAGAAT 600
CCTGTGCACA AAGGGAAGGA GCATGTACCA AAGCGGCAGT GTCAAACGAA AGCGTCACCT 660
CCTGCGAAAC GCCGCTGAcG GAAAGGGGAA AAAGAGCACT CCCTGcGTAT CTGCGCACAG 720
CGTCACCTGC ATCACATCGC CATGTTCAGG ACCAATTTGG TAGCATAGCG TaAAACGCCC 780
ACGCTGTAGT GGCCACACGC TGTCGCGATA CTGCAACGTG GCACTAACCC CTACATGGGT 840
CGGCGGTGCA ACGCGCGGAG cAAGGcGTAT ACCTTGcAGC AGCGCACGGA CATACGCGTG 900
CAGcGCAcGG CTaCGCGCCG TGTCAGTCTC AGAAGAAGGG TCATATAACG CTTCAAGCTC 960
ATTGAAAGCA GCGTATGTGT CCCCCGTACA GAGCGCATCC GTGATGCGCA TTACCCTCTT 1020
CTGTGTCAAA AAACGCCTGC AATGCCGCGT GGTGCTCGGC AGGGATATGT CCAACAGTAC 1080
CCTGcACCCA CGCAGACCCC CGCCCAGGAG AAAAAGACGC CGGGGATTCA GACACACTCG 1140
CATCTGCCGC GCCCGCCTCA TTCCCCATTG CCGCAGCATG AGAAACATGC TCAGAAGCAT 1200
GGGCCACACC CGAAGAATTC AGGCCATCCG CATTCAGGGA AAAACTTGAG CGTGTCTGTG 1260
ACGTACAGGC GCCAAAAAGC ACACTCCACA CCCACACCGT ACAAAAAAAA CGGTCCATGT 1320
ATAATCTACA CCTCTTTATT CTGCAGCGCA CACCACAGCC GCGTGCTAAA GTACCGTCAC 1380
GGGCCtTCGT TACAGCCACC CTACGATATC CACCAAAGAC ACATCACGTC TTCTTTTCGT 1440
TAGGGACGTG CAGGACGACG CCGACACCAC CCATACATGA ACGCAAAGCA AAATGCCCGT 1500
CGCATCGCTT CCGCACGGcT GCGCGCCTTC CGCGCCTCCC CTACCGCTCC TGCCTATCCA 1560
GCACACTCCC TTTTCACTGC ACAGACAGCG TACCAGCGTG CACGCACTGT GTTCTTCTAT 1620
GCGCCCCTGC CCCTAGAAAT AGACCCCTAC GCCCTTGCAT ACACTGCAGA AAACGCAGGA 1680
AAGCACGTAG CTCTTCCTCG CGTATCGGGA AACGACTTGC ACTTTCACGC AgTCACGTAC 1740
GCGTGCACTA CCCGCCCCTC TGTTTCCTGC TTCACCACCT TGTGCCCTAG GACCAGGGGA 1800
ATTAGAGAAC CCGATGCACA CAGTCCACGC CTCTACCCCC CGCACCCTTC GCCCAATACT 1860
CCTGCACAAA GAACACTTGC CCTACCGCTT TTGATCGTAG TTCCCGCACT GGCATTCAGC 1920
ACAAATGGCG CACGCCTCGG CCGCGGCGGA GGACACTACG ATCGCTTCCT CGCCCGGATC 1980
GCCGCTACCA TACCAGCAGG GAGCTACTAC ACGCTCGGCC TCTGCTTTGA TTGCCAAATC 2040
ATGGCTGTCA TTCCTCAAGA AGCACACGAC CAATCCGTAC ACGCGGTGCT CACCGAAACT 2100
CGTCTCATTT CCTGTGCCAC GGCGCGTGCA CCAGCGCCAC CGTTCTCTTT ATAGTGCCTT 2160
ATTCCTCCAT TCTAATCACA CACGTGCATG CACCAAGAGG ACAGCGCCGT GCTATCTTCC 2220
CAGAAAGGAG GATGAAAACA CGTGAAAACC ATTCTCA AC TGGGTGCAGG AACCATGCAA 2280
GCCCCTGCAC TTCGCGCAGn ACGGGAGCTT GGGCTGTGGG TGTGCGCGGT AGATGGGAAT 2340
CCGCATGCAC CsTGCGCGGC ACTTGCAGAC GAGTTTACCC CAATCGATTT GGCCGATAGC 2400
GCCGCGCTCG TncGCTnCAm gcGCcGCAAT TcGCGCGCrC sGCGGCTTGG ATGCTGTGTT 2460
CACCGCGGCA ACAGACTTTT CCGTTTCCGT CGCTGCCGTC GCCGAGGCCT GTGCACTCCC 2520
CGGCCACCGA TTGGAGGCAA CCAAAAACGC TACGGATAAA ACGCGCATGc gTGCCTGCTT 2580
CACACGCGCC CGACTGCGCT GCCCCCGCTT CACGTTCCTT GAGCCTGACT CGTTCGCCTG 2640
GGACACACCG CCTGGGCATG CCCGACTGTG TTCCCACCTG CATAGCGCTG GACTCTCGTT 2700
TCCTCTCGTC GTAAAACCGA CAGACAACAT GGGAGCCCGC GGCTGCACGC TCGCGCAATG 2760
CAAGGATACC CTCATAAATG CCtGCGCCGT GGCGCGCCAG TTCTCTCGCA GCGGCCGGGT 2820
GATTATCGAG GAATTTATTG TCGGAAGAGA GTTTTCCCTG GAAGGgCTCA TATTCGACGG 2880
GACGTTGTAC GTCACCGCAC TTGCCGATCG CCACATCTGC TTTCCTCCCT CATTCGTAGA 2940
AATGGGACAC ACGCTCCCGG CAGCGCTCTG TACACAAGAc GCACAAGCGC TCATCGACAC 3000
CTTCCACAAC GGTGTGCGGG CACTCGGGCT CACCCATGGC GCCGTGAAAG GAGATCTCTT 3060
CCTGAGTACC CCCTCCCCGA CGAAAACTCC ATCCACTGCC GCCACACCCA ACCCTTCTGC 3120
CCCGTACACA CCCGAAGCAG TATTGGGAGA AATTGCCGCA CGcCTTTCAG GGGGCTTCAT 3180
GTCTGGCTGG ACGGTGCCGT ACGCTCTGGG TTTCGACGTC ACACGCGCTG CATTGCACGT 3240
GGCGCTTCAC GGTCCTTCAG CTGCCGCCTC GGCTGCCACC GCGTCTGTCG CCCCCCCTCC 3300
TACTGCGCTc ACCtGctGCG CACACAGCTC ACCACTCTGT CTCCTCTTCC AGAAAAAAGC 3360
CCATACGCCA GCGCAGAACG CGCGTGGATT TCCATTCCTG GGGTAATACA CCGAATCTGG 3420
GGCCTTGCAG ACGCTCAACA GATCGCCTAC GTCAAAAACG TGTTCGTACG TATGCAGGAA 3480
GGAGCCgcgG TGCGCTTTCC TCGTAATAAT GTGGAAAAAT GTGGCAACGT GCTGAGTCAG 3540
GCCCCCACCC GTGCACAGnT ATCGCCGCAG CAGAAACCGC GTGTCGCTGC ATTGTACTCC 3600
GCCTTGTTCC TGCACACCCT GCAACAGACG CCTTTCTAGC AAGAAAACGC AGCGCAGAAT 3660
CAGCGGCCAG CCCAGCGCTC CAGGACGCTG ATTCTGAGTA CGCAGCGTCT GCATCACACC 3720
CCTTTGGGCA AGAGAGTATA CCGGACATCG TCTGCGATGC CTCAGGACGC TTCTTTACCT 3780
CTGAGGTTGC CTGTGCACCG CTCGTGCGCA CAGGACTCTT CCTTATCCCC GAGCCACTGG 3840
TGCGCGctGA cGCACGAGAC GTGCAGGGTC GCAGCATCCA TGCGCTGTGT ACCCTTGCAC 3900
TTAAGGTAGA GCCTGCGCTC GAACCTGCGC TGTGCTTTGC GCGTTCCCAA AACCTCGCAG 3960
AGTTATGGCG CGCACTTATT CGCGGTGGCA TTCAAGGATT ACTATACGCG TTTGACTCCT 4020
TTCAACTGTC CTGATGTTCA GTGCCAGAAA AAATAAAACG CGTGCGCAAA AGCTCTGCGG 4080
cCTCAGTCAT ACGCAGCCAC GTCAGTCCCT GTGCACAGAG AGTCTGCACG GACGCGGCCG 4140
TTTTTTGCTC AAAACCAGGC ACCACGCTAC TACCGTCCTT CAACAAACTC ACGCGTGAGA 4200
GGACCCCTGC GCGCTGCATA TCCTGCAGCG TACACACAAC ACAATGCGAG AGTGCTTCAC 4260
CTGCTACGAA AATACGCTCA TGTGCGCGGA GCATTTGGCA GCGCTTAACA AAAAGAGAAT 4320
CCGCAGTCCC TTGAGACGCT GGATACTCCG AGGACAGCAC ACTAAATTGT TCCACACAGG 4380
GGTTTTCTCC TTTAAAGAAA AACTGAGGAT GTCTTGTACG ATGCGCGCGT TGCCAAAAGC 4440
GCACCGCCTC CACGATAAGC GGGTGCACCG CCTGTCCCCA ACTGCCACGC ACACAATGCT 4500
CGGGCCAAAG GTATAGAGGC CCCTTTCCGG TATATGCACG GAATGCCAAG TACCCCGCTA 4560
CGGTCTGTAC ATAGCCGACA CGCACAGGcA CACACGCCCC AGAACGCAAA CGTTCGAACG 4620
AAACTGTATC AAAAGGACCG AGAGCATCTC CTGTCAGGGA GCGCCAAAAA CAGGGGTGCG 4680
CAACGTGCAT CCGCGGGTGC CGATCGCAAc TTACGTACAA TGCATCCACA TGCGCAGCGT 4740
GCaCGCGCAA GAACTCAGCA ACGCGCACAC AGTCCTGATC CGCGCCGGGA ACGAACAACG 4800
CACCGCGTGG ATCGCAAAAA TCATTTTGAA AATCAACCAA AAAAAAGGCT CTGCTCATAG 4860
GGAAGCGCGC GCAGGAACGC GCACGcgCGT GCGCgcAnTn ACCCCACCCC AGCAGCACAA 4920
GCTAGACCCG GAACACAGGC CCTATCTGCA CCGTCAGCGG ACACTCCAAA CAATACTTTT 4980
GCAAAAACTG TTCCCATTTC AAATCAGTGG AGCCTGCGCC GTTCTTACAC TGCTCGAGGT 5040
TCGCAGTAAT CGGGATACCC ACACCGCTTG CCACACCGAC AGACAGTCCC ACAACCCCCG 5100
TAAAAAAGAA ATGAAAGTCA AGGTTCACAG GAACACCGAC TATTTTGTCC TTGGTAGACA 5160
TGATATCCAC ACCCGTAGAA GGCAGAAAGA ACGCCCGCTC TCCCATACGG AGTACCCACC 5220
CCAATAGGAA CTGCGCGCGG AACAGGAGCG TAGAAAGCCC CGCATCTAGC TGcGTGACGA 5280
AAGTAAACCC GTTCTCCGCA GAAACCCCCA CCGAAAGCCC CAGCGTGGGG GTGAAGGCCA 5340
GTATATCGGT GCGTTTCGTA nCTTGGTTCC GCCCTCAnCG TTCGCGCCcT TTCCCCACAC 5400
AAaGAcTCCC ACCTGTCCAA cTCGGGGAGA AACAAAAACt GCGCGGCGTT CGCGTCAGTG 5460
CAAgACCCCA CAACACCGCC AAAAgAGCGC ACCCCCCCCC gCCACCGAAC GGCGCAgCGG 5520
CACATCACCT CACCCTCACA CCAACCACTC ATACACTACA CTCGGAACGT CGGCCCTATC 5580
GTGAGCGAGA GCGGCAAGGT AAACTCCTTG AAATTAAAGT CACGAACACC AACGGCCGTG 5640
CTCGCAGCAA CCGCAACTCC GGCAAAGGAA GTGAGATAGT ACTGCACCTC TAAGTTCAAC 5700
GGTACGCTGT ACAGCAGCTT GCTATACCAC GCAGACGATT TCCCCTCAGA TGTCGCACAC 5760
GAATCACCGC AGATATTCAC CCCACTGGAA ACGATGGCCC GCAGCCCTCC CACACGCACC 5820
GCGTAGCCAA TTAACGCCTG TGCACGCACA AAGACATTGG T 5861
(2) INFORMATION FOR SEQ ID NO: 76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3694 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
CGAGTAGGAG ATACATACCG ACACTCAGGG TTTACACACG CAGTATATGT GCCGACTCTG 60
CGATTTGATT TTTCAACCAA AAAGCATCGA CACTGAGGAC ACACTGCAAA GGTCGGTTTA 120
AAGTGAGTGA CAAAATCGCA GACAGGGAAA CGCGTGCAGC CATAGAATTC CTTTCTTCCC 180
CTTGTTTTTT TCCCCACGAT ATTCCCATCG CACGCAGGAC GCGGACACTT TGCAAGAGGG 240
ACAGGCTGAG TATTTCTACA CTCAGGAAAC TTTCCGCATG CAAGGAAAAA TCCAAACCTG 300
CCCAGTTTTT TCACCATCGT ATCACCACAC TGACTACACA CCACATCTGT TTTCTCATCG 360
AACACACCGC GCATGCTGTT AAGATCTTTC ATCACCGTTG AAACCTTTTC GCTGAAAGCA 420
GGATAGAAAT CCGCAATGAC ACAATTCCAC TTGATTTTAT CTTCCTCCAC CTCATCGAGT 480
TTACTTTCCA TGCGCGCGGT AAAACTTACA TCAACAACAT CATGAAAATA GGTGGTGAGA 540
AGATCACTAA TGACCTTTCC CAATGGGGTC GGCATTAGCT GTTTTTGAAT ACGAGTTACA 600
TAATAGCGAT CCAGCAGTAC TGAAATAGTC GGTGCATACG TTGAAGGGCG CCCAATTCCC 660
TTTTCCTCCA ACATTTTTAC GATACTTGCA TCCGTGTACC GAACAGGACC tGCGTAAAGT 720
GCTGTACGGA CTGCACGTTA TGTAGTGCAA CTACCTCACC TTCCTTCGTA GGGGGAAGTA 780
CAGCTTTAGA GAGATCTTTG GGGGATAACA TTTTCAGTAC ACGGTAGAAT CCCTGTTCAA 840
TAACCTGCGT TTCAGTTGCA CTGAAAACCG CCGGGCCAGC GGTAATTTCA AACGTCAAAC 900
TGCGCACTCT TGCATCTGTC ATCTGACTTG CAACAAAACG CTCCCAAATC AACGTGTACA 960
GACGTATTTG ATCACGCGTA AGGTGCGCTT TAATCCGCTC AGGAGTGTGG GCAACATATG 1020
TTGGTCGAAT CGCCTCATGT GCGTCCTGAG ACTTTCCCTT TGCAGCGTAC CGATTGGGAG 1080
TACCCGGCAG TGCGTCAGAA AAATGCGTTG CTATCCACGC GCGCACTTCC TTTACAGCAG 1140
CTTCAGAAAC GCGCACCGAA TCTGTACGCA TATATGTAAT GAGCCCCACG CGGTGGGTAC 1200
CAAGAGATAC GCCTTCATAG AGCTGCTGCG CAACCTGCAT CGTTTTACGC GAGGTAAACC 1260
CGAGCCTATT GGCAGCGCAT TGCTGCAACG TAGAGGTAGT AAAGGGCTGC TTCGGTCGAA 1320
CATTTTTTTC AAAACTGCGT ATTTGAGAAA CTCGTGCCTC ACTCTGAGAA AAAAGACCGA 1380
TAGCGCTTGT AGCCTCCTGT TTGCTTTTGA ATACAGCCTT TTTCCCTTGA ATCAGTATCA 1440
GTAGTGCAGA AAATGACTTT TTATCCTTTT CAAACGTTCC TTCAACCGTC CAGTATTCTT 1500
CTGGAACAAA GCGCTTTACT TCAACTTCTC GTTCACAGAT AAGACGAAGT GCAACCGACT 1560
GCACACGTCC TGCAGACAAC CCGTTTTTCA CCTTATGCCA CAGGAGCGGA CATAGGTGGT 1620
ATCCTACCAA ACGGTCCAGT AcGCGCCGCG CCTTTTGTGC ATTGACCTTT GCGGTATCTA 1680
TTGGAACCGG ATGGCCAATT GCCGCCCTAA TCGCGTGCGG TGTAATTTCA TTAAACACGA 1740
TCCTTTTGAT CGGCGTATCA CAATACGCCT GGATAGACTG TGCAAGGTGG TACGCAATCG 1800
CCTCCCCCTC TCGGTCACGA TCGCTGGCAA GAAACACTTG CAGTGACTGC TTAGATAGGG 1860
TGCGCAACTC TTTTAAACAC TGCGCACGAC CACGAACTGT AATGTACTCA GGCTGGAAAT 1920
CGTGCTCAAT ATCAATAGCT AAACGAGACT TTGGCAAGTC AATAACGTGG CCCATGGACG 1980
CTCGCACCAC GTAtGCGTTC CCAGATATTT TTCGATGGTC TGCGCCTTCG CAGGAGATTC 2040
CACAATAACC AAATGcTTCC GCGCAAATGT CTTCTGCCTT TTCGGTTGTA GCCCACGCAC 2100
TTCCATGTTT TCCGCCCCCT ATGCTTACCG AAACTGTCCT ACGTCCGACC CGTACATGCG 2160
TATTCCCAAT CGTCAAACAT ATCCTGcaCG CACGTCAGCG CGCGTGCACC TTCTGCATGC 2220
AGAAGTCTTC CTCCCTCATT TTGAAGACTT TCAAGCAAGG GTTCATACAC GTACACATCA 2280
CGTCCCTGTT CCAAGGCACA CAATGCGGTA ATCAACGCGC CTGACTTCTT TGGTGCTTCC 2340
ATAACAACGA GCGATCGTGC CAAACCTGAA ATGAGCCTAT TGCGTTCAGG AAAACGATAG 2400
CGCATCACAT GCTCAGATGG CGCATATTCA CTCAGAATGC ATCCTCCGGT TTCTATAATC 2460
CGTGCAGCAA GCGCGCTATT TGAGCGTGGA TATAACTGGT CTACACCACA GGCAAGTACT 2520
GCGAGGGTGT ACCCACCACC TGCTAATGCT CCTTTGTGAC AGAATCCGTC TATTCCACGT 2580
GCAAGTCCTG AAACAATGGC AATGCCCGAT TCAGCACACG CCTTGGAAAA GGCCAAACTG 2640
TTCCGAACTC CTTCACCGGT TGGAGTACGT GTGCCAACCA TCCCCACAAT AGGTTGGGTT 2700
GCACACGGCA AAGTGCCCCG ATAGAACAGC ACAAACGGTA CATCACTTAT TTCTCTTAAC 2760
CAGGAGGgAA ACGCATCATC GTCTTTGAAC ACCATTTTTA TCTGATAACA CCGCATGGTT 2820
TGTATACCGT GCCGAACGAG GTGAGGCAAC GCAGAAAGTT GCGCACCCGC AGTGcGTATG 2880
TGCCTTTCTA CAACACGCTC AAAATCACGC ACCTTCCATG CAGTAAGCTC CTGArAAGAA 2940
CCTACAGCTT TTGAAACGCG CAACCGCTCC CCACCTTTTA AAAAGTGACA GTAAGAGAGC 3000
GCAAGAGCAA TCTTGTGCGT TTCAGTGAGT ACAGTATCCG TGTTCATCCA CGGTCCCCAC 3060
GCGTAcACGC TGCGATACAT TCGTGACTAA TTCTTTATTC TGTCGCGCAC GATCTCTCTG 3120
ATGGGAGTCA CTGCTTTGGA CAAGGATGCG ATCGTAACAC TCTATTGCCT GAGCATATCG 3180
TCCAAGTGTG TAGTATGCGC GCGCAAGAAC GAACAAAGCT GCTTCTGTAT TTTCTTTCAA 3240
TTCGAGGAGC TTTTGCAAAT GAGGAACCGC GTTCTGTGGC TGATTCAGTT CAAATACATA 3300
CAATACAGAA AGACCGTACA ACGCGTGGGA ATACTGCGCG TCCAACGAGA GCGCACGCAC 3360
ATACGCAGAG ACTgcTAACG CGCGATAAGC AGCTTGCTGC TGCATTTTCT TCTCAGGATC 3420
AaTAGGAGCG ACGTACTTAG CCGCATATGC GGCACACAAC GCCTGGTAAA AAAAAAGATG 3480
CTTATTTTCG GGAGCAAAGG TAATGGCCTG GGTAAACGCA TCAAGCGCGT GCGTATACAT 3540
CTTACGATCG AAGTAGCGCA ACGCGAGCAT CTTGTACCAA ACACCCACCT GATnCnCnGT 3600
GCGTGCAAAC GCTCGAGGCG CTGTTCGTGC AGCTTTACTG CCCTCCGCAG TTCTTCAATA 3660
GAGGTTGGAT GGGGCACTCC TTTCTCTAAA TCCT 3694
(2) INFORMATION FOR SEQ ID NO: 77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6422 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
TTACCTaAAC CGCGAATTTC CATATGGTGA CCGaTACTTT TGTGCACCCC CCGCCCAATG 60
AGGCTATTTC CATTGACGCA AGAAATTTCT ACTAAGTCAT CTGCAACAAG TCGATGACCG 120
CGCTCAATTA ACTCCAGAGC AGTCTCACTT TTTCCTACTC CTGAATCTCC TGAAATAAGA 180
ATCCCGACGC CATACACCTC CACCAATACT CCATGAAGCG CTATCGTCGG TGCGAAGATA 240
TTGGAGAGAA CACGCATGAG ACGTAAAGAA AGCTCGCTCG ACGTAAGACG AGTGACCAAG 300
ATAGGGCAAG AAGAAGGCTC AGCAAGATGC AAAAACTTCT CCGGCGGGGT AATTCCATGG 360
GAAAAGATAC AACAAGGCAA GTCAAAGGTG AACATCTTTT CGATAGCACC GTATCGTCCC 420
TGCTCTAAAA GGGCGAGCAG ATACGCATGT TCTCCGCGGC CAAAAAGCTG GATCCGCCGG 480
TAGGnAAACA AGTCAAAAAA GCCTGACAGG ACAAGACCTG GTCGGTTCAG ATCCGAGATA 540
GTGATGGGAT TTGCCAGTCC ATGGTGACCT GCGATACAAC GCAGAtCAAG CGAATCGCGC 600
TCTTTCAGAT CGAGCTTGAG CACATCGAGA ACGGTAAAAA GAGGAGCACC CACGGCCGCT 660
ACTGTAGCAC AAAaGCCAGG ACCCGTAAAC GAGCCCACCG CAGTGAGAGC CTCTCTTCAA 720
GAGAGCCACA AGTCCGTACA GGAACTCTTG ACTTTTACAT ACAACTGGTG TCTGCTGACG 780
GCGCCCGGGT GTGGGGCACA CGATCCATTA GCTCAGCGGA GAGAGCGGCC GCCTCCTAAG 840
CGGCAGGTCG GACGTTCAAG TCGTCCATGG ATCAGGAACG GCGATGGTCG GGCGAGGGGG 900
ATTTGAACCC CCGACCTCTC AGTCCCGAAC CGAGCGCGCT AcCACTGCGC TACCACCCGT 960
GCACGCAAAG AAACCACACA GAAAGGGACG CGCACCGCAC AGTGCGCAGA CGGGAGCGAC 1020
GGGGCTCGAA CCCGCGATCT CCGGCGTGAC AGGctGGCGC GATAACCAAC TTCGctACGC 1080
CCCCAGAACT TGCGCGCATC CTACATCACC CGCACAAAGT TATCAAGCGG CGATGATAGA 1140
TCACCCAAGG AAATAGCGGC AATAGGGATT GAACCTATGA CAGCGCGGAT ATGAGCCGCG 1200
TGCTCTACCA ACTGAGCTAT GCCGCCAAAA AACCCCCGAC CGCACACCAC CGCCATCCTA 1260
TCCCTTTTTT TTACACTCTG ACAAGTCCTG CCTTCCCCCT GCCTCCCCGT GCTTGACGCA 1320
AGAAACAATA AAATTGCTGC CTATGATTTC TTTAGCCGCG CGTATAATGC GCCGAGACAC 1380
CGGGCGTGGT CTTTCGTCGC ACAGTGGCAC TCGCGCTTCT TCTGCGTACG CTCCCCTGCG 1440
CCGCGCACTT CGGAAACCGT GACCGCACGT TCTACGACCT TAACAACGCG CCCCTTGCTC 1500
TGCGCGCCAT CCAGGACGCA TATCCTCATC TCAACGCGGT CATTGCCTAT GACCCGCGGG 1560
AACAGGACTG GCTCATCCGT TCAGACGGGC GCACCCTCTA CTGGGCAGAG GGGCGTCTTT 1620
TACCTCGAGA ACACCGTGAT CAAGCCCACG ACTGGCGCCC CATCATCGAT TATGTCTACG 1680
CGCGAGAAGT CCTAGACCCC GCGCACCTTT TTCCAGAAGA AATACACGCG CTTAGGCCTA 1740
AGACGCTTGC AATTAAACGC AGCGCTACAA AACCCTATCA CGACGCTTTT TTCACGTGGC 1800
TCTACGGTCC TGCCACACGT TCGGAAATCA ACGCTCGTCT CGCGCGCGAC TATACGTTCT 1860
TAGGAAAGCC CGTATACGTA CACAAAGCAC TCATCACACG CTTAAACGCA GTACAGGAAA 1920
AAATCCTCAC TGCCGCGAAG ACGCATGCTC ACGTACAAAA GTTTATCGAG GATCTTTTAC 1980
GCGTCGACGG CTTTAACTGG CGTGAGATTT CTGATTCTAG ACAAAAGAGT AACCACAGCT 2040
GGGGGATCGC GTTGGATCTT ATGCCCAAGA ATTGGCAACG CCACACCATG TACTGGAATT 2100
GGGAArctGC GCATAACGAA GATTGGATGC ACATCCCCAT AAAAAAGCGC TGGGCTCCAC 2160
CTGCAGAAAT CATCAGTCTT TTCGAAAGCG AAGGGTTTAT CTGGGGCGGA CACTGGATGC 2220
TGTGGGACAC TATGCACTTC GAATACCGGC CGGAATTACT CGCTGTACGT AAAATCCTTG 2280
CCGAGGGGAA CCGCTATGAC TTTCAAGAAC AAAATATAGT GGTGCATGCA GATGATTTTC 2340
CTGCGCAATA CTTTTCTCCC AAAGAAGTAT TCGGCACAGA TGAGAAGGAG CACATTACCT 2400
ATGCAGAATC CTGCGTtCGT GCAAcGCAnG CAmaGTGTTA AAGAACTCGT TCGTGCACGC 2460
ACGCTGGTAG CGCGGTTTTC TCCTATGCGT CGGCTGCACG TGTATGCACC TCCTGAAAGC 2520
ATTCACAACA GCATAGATAC AGCCCTATTA CGCATGACCG CACAACTGAA GAAAAATTAC 2580
ACAAATGCGA AAaTACGGAA CAATTCTCGT TTGCTTTCAA AAAgCATGCT CAGACACGCG 2640
CGTCTCGCAG AAGCGCAGAT GTGTACACGG TATCGTGCCG TCATGCTCAG GCAATCGCAn 2700
TGCAGTATCC ACACGCCCTG TCTTGGCAAA GTAAAGAACG AAGCGATGCG CTGTGGATTG 2760
CGCTTTTTTC CgTACGGCAA GAAGCGGCAC GACGCTCCGT GTGCACACCC TCGTCTAAGG 2820
AACAATGCAT GACGCACGCA CTTTCTTCAT GCGTGGATCT TGCACGTACG CACATCCTGT 2880
TGCCATAGGG CGCTTCTTCC CCCTCTCTTC CCCTACTCAC ACACCACAGG GTACACTTAT 2940
GAAAAGTCAT GGCACCATGT GCTCAAGGAA TGCGCTTCTT TTGCCGAGAA GGGGCGCAGG 3000
GCTGCATGTT CTTACCCCAC GTATACgCGA GGCGCGACCG GTGAACACAG GCGTTAAGGT 3060
TATTCTCAGT CTATTCGCGA CGCTCGTCCT TATGGTGGGG GTGTTTTTCT GCGCACCACG 3120
CGCTTCTTTT GCCGAGTTTG AAAGACACTT TTACCAACCG ACTGTTCTCA GTGCGCTCTC 3180
TACCAACTTG CGTGAGGTCA GTAAGGCAAG TGAGGCTTGG CACAGTCGAT ATCGACCCCT 3240
GTTTTCTCAG TTCTGTGCGC TTGATGCAGT CAGAAGTAGT TTCGATCCTG CGCAAAAGGC 3300
TGAAGACATT ACACAACGTG CCCGGGAGGC CAGTGCGCTC TTGTCTTCTG TCGCTGGTCT 3360
CAAAGGGGTG CGTATTGTTG AGGCGCAGAA ACCAAATATC CATTTTTCCA CCTTTGAGTC 3420
CGACGTTCTC CTTGCTGACA GTGGTTCTGT AACCTACAGA AAGTACAACG CTGAGGAGCA 3480
CGACGTCCCT CTTCAGTTTC TAGGGGAGCA TTCCCCTGAA CCGAAGtTAT TATCGACGAG 3540
TACCATGATG CGCTGCTGTA CTCTTTCCCC TCCCTGGGGA ACTACGGGGA ATATCGTGGA 3600
CGCATTCTTT TCTACTTGTC CTTGCGTGCC TTGGGCACCC ACCTTATTGC GGAAAACAAA 3660
CTGAAGATCA CAGACAGCAT TGTTCCGCTT TCCGCTGATG ACtwaCCTTC GGTGGCATCG 3720
TTATTGGTAT CCCCCATGAG GGGGTACGTT CCCTCAAACC CTCTGTGCTC GCAGAGTGGA 3780
AGCGCAAGCA GTTCAGGGTA CAGACAGTCA GGAGTGAGCA GCACGAAGAC TGGGCACTGC 3840
TCAGTAATGC ATCAGGCGCC TTTGTCATTG CACAGGCAGT GCCCGTCTTG CTGTTTGGCT 3900
TTACCCCTCT GACGAAGGGC CTTGTCGCTA TGGTTGCTGT TGTGACTACT TTTTTGCTCG 3960
TATTCCAGTT GCTCAGCCTT CGCCAGGACC CCCTCACAAA ACTGAGGGAC AGGCTGATAC 4020
ACTTCCaCGC GCAGCTCCTA CACAGTTGTC TCGAACAGAA GGAATCACTC GAGTGGGAGG 4080
AGGTGCGAAC CCGACTTGAA CACCGCAGGC GGGAAACAGA TGCAGAAATG AAGAAGTCTC 4140
TTCCCAGGCG TCTCCGTATA AGGCGGGGAC GCGAGCTCGA TGCGCTCCTC AGTAAGGGTT 4200
GGGATGACGT CTTCTCCACC TTGGAGCATG GTTACGGTGG TGCGCGTGCT ATGAACCGCG 4260
CGCAAATCGA ACAGCTTGTC AGGGAAGTgc TCGCGCAGAG CCTTGCAAGT GGGGAGGCTG 4320
TGCTACCTGT GGCGATGCGT GCGGACACAG CCGATGAAGA GCTCGACGAG GTGCTAGAGG 4380
AACTCCCTGA CGAGGCAGCC TCTTTGCCTT CCGATTCCAG TCCGGAAGAG GACCTGGACC 4440
CCTTGGAGGA AGTCGAGAGT ATCGAGGGGA CTGCTGAAGA AAGCACACGC GAGTACGCGG 4500
CTGCGGGAGA CGCGCTCCTC TCGAAAACAC CCCAGCTTTC AACGCACAGC GAGTACGTGC 4560
CGGCGACACT CGCAGAACTC CTGGGCCGCA ACGCAGAGCC CGGCGACGTC GTGCGGGACT 4620
CAGCAGTCCT CGAATATATC GAAGGCtCTT CGACTATCGT CCCTGCTGTT TTTATGAGAG 4680
CCACGCTGTC CACGACTGCC TAGAAGTAGT CACGGGAGAA GACGGCCCCT CTCTCAGCCC 4740
TATGGAAAGC ATCGTCAGCA CCGAGGACGG TCTTTTCACC ATTCGGGTGA GTAAGGAGGA 4800
AGGAAACCAC CTCAACCGCG ATTTCAAGGC CCTGGTGGAT TCCGTACTGT ACTGAAGAAC 4860
ATATCTTTCC GCCGGTGGAG CCGGTCCTCT TACTCGGAAG CAGGCACGAA CGTGTGCGCC 4920
ACCACGTTGG TTTTTATGAG CTCATCGACT TCCGTCTGGG AAGACCTGAG CCAGTACTGG 4980
CTTCTGCCaA TCCCCCAGCT GATTTCCCCG CGCATCTGCA GCGTATCAAm CTTGTAGCTC 5040
CTCCCATCCG CcAGGGATGA AATTAATTTT GCAGTAATAG TACTTACCGC TTGCAGGATC 5100
TATTACGTAT CCACCACCCC AGGAGCCTGG AGAAGTACGC TCGAGGTTAT AGATGAAgGG 5160
CGTACCAACC AAGGGCATAT TGGCGACGTT TCCCTTCTTG GAGAAATCAG GATACGTTCT 5220
TGTACACGAA ACCACCGCCG CATTCGGAGC CCTCCCCATG CACACGAGGA TCTTGCCAAA 5280
GAGCTTACCA TCCTGAACAT ACAACCGCCA CACCCCAGTG GGTTTCCCTG TATTGTCATC 5340
AACGCTCTTC CAGATACCTT CCACCGGATC GGCCATTTCC TTCTGCACAG ACGGCACCTG 5400
CGCTGCCTTG TCCGAGCTTG CTGTGAAACA CGGCACACAC AGACACAGTA CGACACCATA 5460
TACGATAACC TTTCTCATAG CCCGCCCCCA CAAAATATAA TGCGCACAAC CAACAAGGGG 5520
AAATACGACC TGCAAAAAAG CAGGTACCAT ACGACGCACC CCCCACATAC ATCGAGCTCA 5580
CCGTGATTGT GCAAAACGCT CTTGAAACTC CAACACTACA TCCGATATCG GCGGAGCACC 5640
ATTAAGAGAA ACTAATTTCC CCCGCTCACT GTAAAAGTGG ACAATGGcTc CGCCTGCGCT 5700
CGATAGGCGG TAAGCCTCTG AAGAATTGCC GACATCTTGT CATCCTCCCG CACGACCAGT 5760
ACTCCTCTAC ACCGATCGCA CACACCCTCT CTCTTAnGsT GCGCAAAGAG CACATGATAA 5820
CTGCTCCCAC AGGCCGmACA CACCtGCGGC CAGTAAGACG CGCAACAAGG ACATCGTCCG 5880
GtACTACAAT ACTCACCGCG TAGTCTATCG GCACAATGTC CTCTAAGCAC CTAGCCTGCG 5940
TGACAGTGCG AGGAAACCCA TCTAGAATAA AACCGCTAAC CACATCTTCG TGACTGACAC 6000
GCTCCCGCAC TAGCTCCGTA ACGGTCTGGT CATCTACCAA GCCGCCCACT TCAACTACTT 6060
TTTGAACTTT TTTACCTAAT GCCGTCTGTT TCTGAATTGC TGCCCGAAGA ATACCCCCTG 6120
TGGAGATGTG CACAACGCCA CAACGCCCAG AAATTTCACC TGCAAGCGTA CCCTTACCGG 6180
CACCAGGAGG ACCAAGAAAA ACAAACCTCA TGAAACAAAC TCCACCTTAT CTTCCTACGG 6240
GGAAGAAAAA ACACACCTCA CCCCGTTCTC TCGCGCACAC AACCGACCCG AAGgCGTtAC 6300
ACGCCGCGCG CGCCGCGAAA CCCCGACCGC CCAAACAAGA GCGCACCGGT ACTCCAATTC 6360
TACACGGGGA GGATTCATTG GCAACACAAA ACTGCGACAT GCTCGATAnA nCCTTATGTA 6420
CA 6422
(2) INFORMATION FOR SEQ ID NO: 78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4646 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
CTTCAAAACC GCCAAACAGT ATAGAAACAA ACAGGTTTAT CATGTAAGGA AGAATTACGC 60
TCAGTAAAAC AAACGTAACT ACGATTGTAA GCGGAAGCCC TTCAGAGAGT AGATTCATTT 120
GTGGGGCCGC TTTTGTTAAT AAACCCATTG AAACGTGGAT TAACAGCAAT GCTCCCATGA 180
TAGGCAGTGC GATAGTCATC GCGTGTAAAA AAAGAGCACT CAACGCTTTG GTAAAAAACA 240
GCAGGAGCGC TTCCTGTTTC CGCAGAAAAA CAAAGCAATT AACAGCCTGA AAGCTCCGCA 300
GCACGCCTCC TAAAAACAGG ATTTGAAATC CTTTTATTTG CAAAAAAACA AGCATCGCCA 360
CAAAGTTCAA AAACTGTCCC ATCAAAGGAT TTTCTATTTG TGCAAAGGTA TCGTACATCT 420
CAGATGTTCC AAAACCCATC TGATACGAAA AAAACTGTCC TGCCGCACTA AAAGTCGTAA 480
AAATTACGCT AATAAAAAAA CCTGTTAAAA TCCCCAGCAA ACCTTCTCCG AGCAACAAAA 540
GCACATAGTA CGCACTAAAC TCACGAACCT GcATGGGTGC AGGGTACGCA AGCGGTAATA 600
CGAGGAATGC AATCAGGcCT GcGAGTGCCA CCCTCACTaC CCGAgaAACC GAgCGCACCG 660
AcAAGaGaGG TACCGTAAAC ATaAGcGcAA ACACGcGGAc CGCCGwCAAG aAAAAAAAGA 720
GAAGCCTGAG aAAAgAGTGC ATCAAAGGAC CGTTCCATCG CACATATCCA CTAGACAGGT 780
CCACTCCTCA CTAACTGAGG GATAATGTCA AACAGCcTTA CGGTATAATT CTGCAGCATT 840
GTCAGCATCC ACCCACCGAG GAGGGCAATC ATTCCCAATA TGGTCAACAT CTTAGGAACA 900
AAGGTAAGTG TTTGTTCCTG AATAGACGTC ACTGCCTGAA AGATAGCCAC TATTAAGCCA 960
ACGACAAGCG CTGTGCACAG AACAGGCGCG ACAAGTAACA CCACCTGAAA AACACCCTCT 1020
CGTATCAAGC CTAATACCGC ACCTTGCGTC ATCACACACT CCCGTCCTGT ATGTTATAAA 1080
AACGAATGAA AAAGCCTATC TATCAGCAGA TTCCAACCGT CCACCAGCAC GAACAAAACC 1140
AATTTAAACG GCAATGAAAT CTGAACCGGC GGCAGCATAA TCATACCCAT AGACATCAAA 1200
ATACTCGCTA CAACCATATC AACAATTATG AAAGGTAAGT ACAGGAAGAT ACCAATCTGA 1260
AAGGCTACGG TCAGCTCATG CAGGATAAAA GCAGGGATAA GGACATACGT GGGCACGTCC 1320
GCAAGTGTAT CTGGCTTAGG CAGCTTTGCC ATGGACATAC AAAGACGCAC AGAAGACGGG 1380
TCATGCGCCA TCTGACGATA CATGAAGACA CGCAGCGGTC TTTCTGCCTC CGTATATGCA 1440
GTCTGGATAT CTACCTGGCC ATCGGTAAGA GGTTTAAACG ATTTGGCATA AATCTCAGTA 1500
AAGACCGGCC ACATGATAAA CAGGGCGAGA AACAATGCTA TGCCGTGTAA AACCTGTGTG 1560
GGCGGCACTT GCTGCAGCGA CAATGCACGT TTGATAAAAT CAAGGACGAT AGACAAGCGC 1620
AGAAAGGCAG TCATCAAAAG CAAGATACTC GGCGCAAGGG AAATGAGCGT GAGCAACAGG 1680
AGAAGTTGCA CAGAAAAAGC CACTTCCCGA TTGGTCTGGG GCTCCCGGAT ATCAAAATTG 1740
ATGAAAGGAA TGCGTGAAGC CGGCCGCTCA GCATTGATAC CAGTAACGCC GCGCTCGACA 1800
CCTCCGCGCG CATCCTGTGC AAAAAGCGGG AAGAAAAACA ATGACACGAA AAAGAGCGCG 1860
CGGCGTACGC ACGCACGAGC ACGGATCACA AAGCATCCTG TGCAGCAGAT TCTTCAGAAT 1920
CATTACGAGG GATGCGACGC AACTTCTTTC TCGTATCTGC AAGAAAATCT GCCTCAACTA 1980
ACGGCTTCCC CTTACCGGTA ACGCGCGCGG GCAAGAGGCG CGCGAGCATC TGAGAAAAAT 2040
CCGCACGGGC GTCAGTGCCC TGCTCATCAG CGACGATGTT CATGGTATCG ATGAGCTCTT 2100
TGTCCTTGAC CTCTGCAATG AGTGAAATGC ACGTATCAGA CGCTGCCAAT ACAAAGGCGC 2160
GCTCTGCAAG TCCTACCACG TACACTGCGC GCCCTGGCGc AATGGGCAAA CAGGCAAGCC 2220
GCTTCAAAAA TGGATCGTGC GCGCTAGAAA GAAACGcAtG CGTCTGATAA GACGCAGAAA 2280
CCCGTATACA GCCGCACACA CCACGCAGAG CACGAGACTA AAACGCAACA GAAGAGAAAA 2340
CACCGAAGGG GACGGGTCAC GCGCAACCGG GGCAGCGTCG AAACGGAACG CCTGTTCAGC 2400
AGGGGTGAGC GGGAACGCGT cCTCCCTCGT GTCACGCGCG GACTCAGGCG CCGGCTGCTC 2460
CTTCTGTGCA GAAGTCGCTG ACACCGCTTC TGAAACGGCA GAAACATCTA CCCCCTGCTG 2520
CTGTGCCCAG AGCTCAAAGG ACACATGCAA AAAAACATGC ACCGAAAGGA ACAGCGTcGC 2580
GCACCGsGGT ACGATCCGAA GGgAGcAATT AAACGTCCGC AATACGTTCT CCCGGCGAGA 2640
GAATTTCCGT AACACGCACC CCAAAGTTTT CATCAATAAC CACCACCTCT CCTTTTGCGA 2700
TCAACTTGTG ATTGACCAAA ATATCAACAG GTTCACCGGC AAGCTTATCC AACTCGATAA 2760
TGTGGCCTTC CCCCATACCC AGGATATCTT TAATCATCAT GCGTGTACGC CCGAGCTCAA 2820
CGGTAACTTC CATGAACACG TCCATGATAA GCCCGATATT TCCCTGTTCT GCGCCACCTG 2880
ctGCATTCTG CAGCGGATGA AACTGGACTG ACTGCACACT CGGACTCGCG GCGCCTATCC 2940
CCATCTGCAT GTTCACGCCC CCCATTTGAG AATTGCCCAC CTGgCtGCGG GCcTGCATTG 3000
CCCCCCCCAT CCTCTCGATA ATTCGAACCA TCAGCTGCTC AGACACCAAC TCCCACAGCG 3060
TATACGAAGT GCCATCTAGC TCCACCGTAT AGGTAAAAAC GCACAGACGC TGCGGGGGAA 3120
AGCGAACCAT CGCCTTAGGC ACCTGCACCG ACTCTGCAGG AGCCACACTT ACATTCTGTA 3180
CGTTCCGCGC CTCAAGCGTA GAAAGCTGTG CGCTGACATA TTGGGTGATC GTTTCACTAA 3240
CAACCGAAAG TCCCATATCA TCAATTTGAT CGTTGTCCTC ATGACTGACC AAATTGACGA 3300
GTTTCTGCGC AAACTCAGGA GCCATGAGGA ACAAATGGTC CCCTGnAAAn TCTCCTTCAA 3360
AATCGATGAC AgTTGCCACT AACATGTCCG GAATGACGCG GGAAAACTnT TCCTTAGAGG 3420
AAATTTCCAC ACGCGGCGGG GAAATAGAAA CAnTCTTACC GGTCAAAGAn TCCAAGCTCG 3480
GGCAAAAGGA ACCCACATTC GCCTGACAGA AAGACTGCAA CAACTCGCTT TGTGCGCTGG 3540
AGAGCCCCCC ACCGGAGAAA GACGCGCCCG CAGCGGGGGA GTCGCCGGCT CCCATCTCAA 3600
CACCTGAAAG CAGGGCATCG ATTTCAGCCT GAGAAATAGA GCCGTCACTC ATACAATTCC 3660
TCCTCGTCCG CGGATAATTC CTCAAAATCC TCTTGGGAGG TACTTTCTAT TCGTTCCAAA 3720
ATCTGCGCGG CAATCTTTTT TCCCACCACC CCAGGcTGrs AsrrAAACTT CTTGCGGTTC 3780
CCAATACTGA GCACAAAAGG ATCGCCCACA TGGGTGTCGT GCAACCGGAT GATATCCCCC 3840
ACCCGGAGCC CAAGGATATC GCGCACTGAA AGGCGGAGCG ACCCAACTTC TGCCACCACA 3900
TCCATATCCA CCGTGGATAG CTTGTCGCGC AGAACCCCCA TGTAtGCGTG GTAGAACTCC 3960
TGCGCACCGA AGAAAACCAA AACTGACTCG ACAACTTAGA AATGATAGGT TCTATGGTGA 4020
TGTACGGAAT GCAAAAGTTC ATCATCCCCT CTTCCTCACC TACCTTTGTC TCGAGCGTCA 4080
CCAACACCAC CATCTCTGAG GGAGGGACGA TCTGCGCGAA TTGCGGGTTC GTTTCAATTT 4140
GACCCAGGCG CGGACGCAGA TCGATAACcT GCGTCCAGGA TTCACGCACA TTCGCCAGAA 4200
TACGGACGAT GACCCCTTCC ATTACTGAAT TTTCAATATC AGTCAAATCC CGCTGCACCT 4260
TGGCTGCCTG TCCTGTTCCT CCAAAGAGGC GGTCAATGAT AGAAAAAGTA ATGGAGGGAT 4320
CCACCTCAAG CAtGCGTTCC CTTTGAGCGG ATCCATAGTG ATCACCGCAA GCGTAGAAGG 4380
GGTGGGAATA GAACGGATAA ACTCCTCGTA CGTGAGCTGA TCTACCGACG CAACGTGCAC 4440
GTGCACCATA CTGCGCAGTG cGCCGACAGC GAGGTAGTAG TCAACCGCGC AAAAGTCTCA 4500
TGCATCAACG ACAGTGTACG CATCTGCTCC TTTGAAAACT TATCTGGGCG CCTAAAATCA 4560
TAGAGCGTAA TCTTaCGGGT GTCGCTGATA GGGCGCGCAT CTTCAATACT TGmATCCCCA 4620
GAaCTGAtAG CCGTTAgCAG CTGAnT 4646
(2) INFORMATION FOR SEQ ID NO: 79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11191 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
ATGGAGTAAT GAGCAGTTTA CCCAGTATCT TGAATATCTT TTTCGGGTAC GCAGGCTGTC 60
TGCGCATACG GTTTCTGCCT ATGCGCGCGA CTTGAATCTT TTTGAACGCT GGTTGCAACA 120
CGCGCAGAGA GCGTGCGCGC GCGTAACAGT TTCTGATATG CGTCTGTTTG TGTGTGAGTT 180
AGGAAGACGG GGACTTTCCG CAGCGAGTAT TAACCGAGTT TTGTCCGTGG TGCGAGGTTT 240
TTATGTGTTT GCTAAAAAAA AACATTGGTG CGCGGACAAT CCTGCACGCT TAGTGAGGAA 300
TATAAAAGGT CCTTCAAAGT TGCCTCGTTT TATGTTTCCA CCGCAAGCAA AGGCGTTTTA 360
CACCTTACCA AGTCGTACGG ATATTTTGTG GCAGGAACGG GATGCGGCAC TTTTTGCGAT 420
GTTGTATTCA ACAGGATGTC GCGTTTCAGA GATAGCGGCG CTCTCATTGA AAGATGTGCA 480
TCCGCATCTT AGTTCTGCGA TTGTGCGGGG AAAGGGTGAT CGGGAsCGGA CCGTGTTTAT 540
TGCTCCGTTT GCGCAGAATT TTTTGCACGT GTATATGCAG GCGCGTGCGC AcGAnTGTGC 600
GCGcTACGCC TCTTGCACAC CCGCGCTGTT TGTGAATCAG CGGGGTCGGT CGCTTTCTGT 660
GCGCGGAATA CAGTACCTTG TTAGTCGGTA CGTGCTTTTG GCCCAGGACG TGCACGCGCT 720
GTCTCCCCAC GCGTTTCGGC ACAGTTTTGC TTCGACGTTG ATCCGTCGGG GGGCTGATGT 780
GCgCGTTGCg CAAGAGTTAT TAGGACATGC GAGTGTGTCT ACCACCCAGC GATATGTGCA 840
TGTGACTTCA GAGCAACTGC AGGACTTGTA TCACCGTGCG CATCCGCGTG GATAGGGGGT 900
AGGAACGGAG CGTCCAAACG ATGCGGGGAA GCGAGCTGCA GAGAATGTAC ACCAGTGCGA 960
AGTGCTTTTT TCTGAGACTT TTTTGAAGAA GACTTTCTTA AGCTCGCTTT TTTTTGGTCG 1020
ACAATGGGTC GGGGGTAGTC GGATGAATAG TTTTACCAGA ACGGTGGATC TTTTGCATCG 1080
TGCTTTGGAT GTCAACgcGT TGCGCTATGA AGTGACGGCG AATAATCTTG CGAACGCAGA 1140
GGTTCCAGGG TTCAAGCGGA CGGACGTAAA CTTTGAAGCA GAGCTCAAGC GTGCTCTGGA 1200
TTCTCAAAGA AATGAGACAA GTTTTTTCAA GCAGGCAACT GCGGGGACGA ATATGTTGTC 1260
CAGTGATGTT ATCGACTAcC GcTCGGTGCG TCCGCGCCGC GTGTTAGACT ATTTGACGGA 1320
TGTGAAGGCG AACGGAAACA ATGTGGATGC TGAGCAAGAA GCCATGCATG TTCTCAAGAT 1380
TCAGATGCAC TATCAGATGT TGAGTCAGaT GGTAGGGTTC CAGTATCGTC AGGTTGAGTC 1440
CGTGTTACGT TAAGCGTATG GAGAAGCGTG ATGGGTTTGT TTAGTGGTAT CAATATTGCC 1500
GCGACGGGTA TGAGCGCGCA nTTTGGCGGG CCGATGTGAT CTCTGACAAC ATTGCTAATG 1560
CTTCCTCCAC GAGGACTCAA GAAGGTGGAG TGTTTCGGAG GAGCAGGGTA GTTTTGGCGC 1620
AGAAGAATCC TGGCATTGAC TGGCGTATAC CTTTTGTGCC CGAGCAGTTG GATCGGGGGG 1680
TAGGCACAGG GGTTCGTGTG GTAAGCATAG AAAAGGACAA CGCTCCTTCT CGTCTTGTGT 1740
ACGACCCAAC GCACCCTGgA TGCGATTCTA TCAGGGCCGA AGtGGGgTAC GTGGAGTATC 1800
CtAACGTGGA TATTGTGACA GAGATGGTGG ATCTTATTTC TGCCTCTCGC GCGTATGAGG 1860
CAAACATATC AGTTATTTCA GGATCAAAAG AAAaTGTTTC AGCGTGCGTT GGAGATTGCG 1920
CGCTAGGTGT GTTGCGCGTA CAgTCTGTGA AGATGTCTGT GCTGTGTGAG GGGAGGATAC 1980
AATGACGCCA GTTGGTACCA TTACGAATAG TGCGAATGTA TATAAAGTTC CATCTCTGAG 2040
GAAGGTGCCT GAAATCGGTC CAGTGTCGGT AGAAAGCGTA AGGcAGCGCA TGCGAGGGAA 2100
TACTGACGCG GTGGATCAGG CAGTGAACAA AAAGGCGATG AGTTTTGAGC AAACGTTGCT 2160
GCGCGCTTTT GATCAGGTAA ATCAAAAGCA GCAGAAGACT GCTGAGTTGA CCGAGCAAAT 2220
GATAGTAGAT CCTGAGTCTG TTGACGTGCA TGATGTAACA GTGGCGATGG CGGAGGCTAG 2280
TATGTCCTTG AAAATCGCGC AGACTGTCAT TGATAAAGTC CTTAAGAGCT GGAACGATGT 2340
CACCACTGCT CGGTAAGGTT TACAAGGCCG GGCTGTTCTG CAAAAAGAGT ACCGACGGTA 2400
TATCAGgTGA AAAGAGGGTG GGACGCGCTT AGTGCGCATT GGCTCGTTCT ATAGTGAGGG 2460
GAGGGGACAC GCGTGGGCGA ATGGTTGGGG CAGCTCGGAG TCAAACTCAA AACACAGTGG 2520
AAGAAGTGGA CGCTCGTGCA GAAGTCTGTG CTTGCCGGCG CGGCGCTCGT GTCTGTCATG 2580
GGGGTTGTTG TCTTGCTCAC GTgGtCGcGA AGCCGACkcT CGTGCCACTT ATCGACACTC 2640
CTATCACTGA TGAGACGGTG CGGGAAAAGA TTATCCTGCG CCTTAACGAA GAGAATGTGC 2700
GTGCAACCGT CTCAAGCGTT GGGTTGATTT CTGTCTCGGA TGAGAAGACA GCGCGTCGTA 2760
TGCGCAGCAT CTTAATTCGC GAAGATTTGA TCCCAAAAAA TGTGGACCCA TGGGCCATAT 2820
TCGACGTCGA GCGATGGACG CGTACTGACT TTGAGCGCAG GGTGGACGTG CGGCGTGCAA 2880
TTAATAATAC CGTTACCAAT CATATCAAAG CGCTCGACGA CATCGATGAT GCCCATGTAG 2940
TAATAAACGT GCCTGAGGAT GCGCTTTTTC AGGCAGACCA GAAACCTATT ACTGCGAGCG 3000
TTGTCATTTT CCCTAAACCG TCGAGCACGA TCGCCTCAGA AAGAAAAAAA ATAGAAGGCA 3060
TTCAGAAACT ATTAAAGCTT GCAGTTCCTG GACTGAAGGA TGAAAACATC ACGATTGTAG 3120
ATAGTGATGC TACCGTCCTA AATGATTTTG AAGGGTTCAA GGACGCTGAT CGGCTGAGTC 3180
TCATTGAAAA GCAACAGAAA ATGATTGCGA arCTGgAATC CCAGTATGAG GCAAAAGTGC 3240
TGGCTCTCTT GCAAAAGACG TACGGTAAAG ACCGGGTGCG CGACTTAAAT ATCAAAATTG 3300
AAATGGATCT TTCTGAAAAG ACGTCGCAGA tACGAAGTAT CTGCCTATAG AAATCCGTCA 3360
GGACAATCCG GATACCCCGT GGGATGATTC TCAGGTTGTG CCCTCTGTCA CTTCGATATC 3420
TGAAACGGCm ACcACtACGT GGnCAGGGTA CGGGGCTTAA CCCTGAAGGA CCGCCGGGAG 3480
TTGAGGGTCA AACACCTCCT GCATACAAAG ACATGAGCAA CCAGGTGGGA CTTTCTAACC 3540
AGTCGGTCGT TAAGAAGCAA GAGGCGATTA GCAAGAGTGA GATCAACGAA GTAGTGAGCC 3600
CGGTGCTCGG CCGCAGGACG GTGTCGGTCA ATATCGATGG AGAATGGCGC AAAAAGAGAG 3660
ACGAGCACGG AAGATTCATT GTGAAGGAAG GACACATTGA ACGTGAGTAT ATCCCCATCT 3720
CTGnTGAGGA GCTGCGGGAG GCAACGAAGG CAGTGCAGGA TGCAATCGGC TTTGATGCGG 3780
GGCGTAAGGA TTCGGTAAGT GTTTTAAATA TCAAATTTGA CCGGACGTCA GAATTTGATA 3840
GAGAAGATGA GCATTACCTG CGCGTCCAGC AGAGGAACAT GATCaTCtTa TACTCCCtTg 3900
ccAGtgtgGC AATCGTTTTA TTTATCTTCA TGGTATACAA GGTTATCAGC AAAGAGGTGG 3960
AGCGTCGCCG TCGTCTGCGG GaAaGGAGCT TTTAAGGCAG CAGCAACTGA TGAGGGAGCG 4020
TGCCCTGTGG GAGGCTGAAC AGGCGGGGAT GAATGTTTCC ATGTCGGTGG AAGAGCGTAA 4080
GGnCTTGAAT TGCAAGAGAA TGTGTTGAAT ATGGCGCGGG AGCATCCGGA AGAGTTGCGT 4140
TGCTTGTGAG AACGTGGTTG ATGGAGGAGT AGTACTATGG CCGTTACATC CGTGAAGGAT 4200
AAGCTCGCCA CGGGAGAAAA AAAGCAACGG GATATCAAGT CTCTCAATGG TCGGCAAAAG 4260
GCAGCGATAT TTCTAGTTTC TATTGGGGAG GAAATATCCG CTAAGGTCAT GGGAGAACTT 4320
AAGGAAGACG AGATTGAAAA GTTGGTGTTT GAAATAGCGC GTACAGAGTC aGTTGATGCA 4380
GAACTCAAGG ATGCagTTTT AGAAGAaTTC CAGGAACtGA TGACCGCACA AAACTTTATC 4440
ACCTCAGGAG GTATCGATTA CGCGCGGGGA TtGTTGGAGA AGTCGTTGGG AAGTCAAAAA 4500
GCAATCGAGA TCATAAATCG GCTGACAAgC TCCTTGCAGG TGCGTCCCTT TGACTTTATT 4560
CGCAGAACTG ATCCCACACA CCTGTTAAAT TTTATTCAGC AAGAGCATCC GCAGACAATT 4620
GCGCTTATTT TGGCGTACCT TGAGCCGAAT AAAGCTTCTG TTATTTTGCA GAACCTCCCT 4680
GATGAGATTC AGAGTGATGT GGCTCGGCGC ATAGCCACGA TGGATCGGAC GTCCCCTGAT 4740
GTGTTGCGCG AGGTTGAACG AGTACTTGAG AAAAAATTGT CAACGCTTTC TAGCGAGGAT 4800
TATACGGCCG CAGGAGGTGT CCAGAACATC GTGGaCATCT TGAATTTGGT CGATCGTTCT 4860
TCTGAAAAAT CTATTGTTGA AGCATTGGAA GATGAAGATC CAGATCTTGC AGAGGAAATT 4920
AAAAAACGTA TGTTCGTGTT TGAGGATATT GTAATGCTCG ACGATCGGGC CATTCAAAAG 4980
GTGCTGCGGG AGGTGAATAT GGAAGAACTC GCAAAGGCAC TCAAGGTTGT CGACACTGAA 5040
GTACAAGATA AAATTTTTAG GAATATGTCG AAGCGGGCAG GGAGTATGCT GAAGGAAGAA 5100
ATGGAATACA TGGGGCCGAC CCGCTTGAAA GATGTGGAGG AAGCCCAGCA GAAGGTTGTT 5160
TCTATCATCA GACACCTTGA AGATAGTGGT GACATTGTCA TCGCGCGTTC AGAAGAAGAC 5220
GAGATGaTTG TGTAAATGTT GTTCCTGATA AGCGATATGG GGTTCGAAAG GAAGCAGACA 5280
GTATGCCAAA GmTsATATTT CGGAACCATG AAGTGAAGAA TCTTGATCAG TTCTTGCTGC 5340
TTGATCTGAG CAGGTCTTTT GGTGTCGAGC CTCAGATTGA GGAGGTGCAA AGCGAACCTG 5400
TGTGTCCAGT TCCTGATATG CGTGAAGTGC AAGAGGAAGT TGAGCTGTTT CGAAAAAGTT 5460
GGGAAGAAGA GCAGGTGCAG CTGCGCGCGC GTGCAGAGCG TGAGGCACAA GATCTAAAGG 5520
AGCGTGTAGA GGAGGAAATC ACAGCATATC GCGAACAGTG TACGCAGGAG GCGGATCGTA 5580
TCCTTGCTCA GGCAAAGGAA CAGTCTGAGC TACAAATTAG CGAGGCGCAA CAGCAAGCTG 5640
AACGCATGAT TGCTGAGGCA GAGACGTCTC GTCAGAAAAT ATGTGATCAC AGTAAGGCAG 5700
AAGGTATTCG TCTTGGCAAG GAAGAAGGGT TTCGTGCGGG ACAGGAAGAG GTGCGGTATT 5760
TAACTGAGCG TTTGCATAAG ATGATCGAAG AAGTGATGGG GCGGCGTCAG GGTATTTTGC 5820
GGGAAACCGA AAGACAGATT GTTGATCTGG TGTTGTTGAT GACAAGGAAG GTGGTCAAGG 5880
TCATTTCTGA AAACCAACGC GCTGTTATCA GCGCAAATGT GGTGCATGCG TTGCGTAAGG 5940
TGCGAACGCG CGGAGCGgTG ACGCTGcGGG TAAACCTTGC GGATGTGGAG CTTGTTACCC 6000
AGCACAAGCA GGAGTTTATC GCTGCAGTGG AGCGTGTGGA TGATCTAACG GTAGTGGAGG 6060
ACACGTCAGT GGGTAGGGGC GGTTGCgTGG TGGAAACGGA TTTTGGAGAG ATTGACGCGC 6120
GGGTTGCAAG TCAGCTCCAT GaGCTTGAGC AGCGTGTTTT GGAAGTTGCC CCCATTGTAG 6180
TGTCATCAAT GTCAGCATCT AAGGGTTCTT GATAGAGAAA GAGGCGTGGG TGTGCGTGTA 6240
TGGAAGCAGA CCTGTTGTGC AAGTATGAGG TGGCgCTCCG CGAGAGTGAG CCGGTAAAGT 6300
ACGTTGGGCA TGTGACAGCA GTGAGGGGTT TATTGATTGA AAGTCGTGGC CCTCACGCGG 6360
TAGTTGGTGA ATTGTGTCGG ATTGTGTTGC GCCGCCAGGG GCGACCGTTG ATAGCAGAGG 6420
TAGTAGGACT TGCaGGATCG ACGGTAAAAC TGATGAGCTA CACCGATACG CACGGGGTTG 6480
AAGTTGGCTG TGCGGTGGTA GCAGAAGGGG CGGCAtTTCA GTCCCCGTAG GAGATGCTTT 6540
ACTCGGAcGC GTTTTGAACG CGTTTGGGAA GGCAATTGAC GGGAAGGGGG AGATATATGC 6600
cgTCCTCCGC TCCGAGGTGT TGCGCGCGTC TTCTAATCCT ATGGAGCGTC TTCCGATTAC 6660
GCGTCAAATG GTAACAGGAG TGCGGGTGCT TGATTCkTtG CTGGCAGTTG GTTGCGGACA 6720
ACGTCTGGGT ATTTTTTCCG GTTCGGGGGT TGGGAAGTCG ACGCTGATGG GGATGATCGC 6780
GCGCAATACA GACGcAGATG TGTCGGTCAT TGCCCTTATC GGGGAGCGTG GCCGTGAAGT 6840
GATGGATTTT GTTGCGCATG ATTTGGGTCC TGAGGGTTTG AAGCGCTCGG TAATAGTTAG 6900
TGCGACGTCT GATGAAAnGT CCTTGCGCGG GTACGAGGTG CGTACACGGC GACAGCGATT 6960
GCAGAGTACT TTCGGGATCA AGGCAAACAG GTGCTGCTGC TGTTTGATTC TCTGACGCGC 7020
TTTGCAAAAG CTCAGCGTGA GATTGGGTTA GCGTCGGGGG AGCTCCCTGC AACGCGTGGA 7080
TATACCCCGG GGGTATTCGA AACGTTACCG AAACTGCTTG AGCGTGCAGG TTCTTTTTCC 7140
ATGGGGAGCG TCACCGCTTT TTATACTGTT TTGGTAGATG GGGACGATCT CGATGAGCCG 7200
ATATCAGACG CCGTGCGTGG AATTGTAGAC GGGCACATTG TACTCAGTCG CGCGCTTGCg 7260
cAGcgCAATC ACTATCCTGC AATAGACGTG TTGCAAAGCG TTTCTCGCTT GGCGCACCGC 7320
GTGCTGGGTG CAGACATGAA AGAGGCAGTG CGCATAGTGC GTCGTGCGCT TGCAGTGTAC 7380
GCAGAAGTAG AGGATTTGGT ACGAGTTGGT GCGTACCAGC AGGGGAGTGA TGCAGAACTT 7440
GATCGAGCTA TTGCGATGCG CGCAGAGCTT GAACGGTTCC TAACGCAAGG AGCCCAGGAG 7500
CGCGTGCGTT TTCAGGATAC TGTAACGTCG CTGTCCATGC TGACAGGGCT CAGTATAGCA 7560
CAGCCGCCTT CGGGTGTGTG AATCTGCAAG AGCAGAGGAG ATAGCGCGTG TGAAAAGGTT 7620
TTGTTTTTCT CTTGAGCGTG TGCGACGCTT GAGAGCGTTT CGTGTACGCG AGCTGGAAGT 7680
TGAGTTAAGC AAAGTTCTTG CAGAATACGG AAGCATAGAT ACACAGATTC GATCGATTGC 7740
TGGCGAGTAT CGTGCGCGGA TGCAGGACGT AGCGCCAAAG CGTGGAGCAG TTTTTTCTGC 7800
TGCGTCGGTG AGCGCTGTGC AGGATCAAAT TGACGTGTTG CAATTACGCC GAGAACAGCT 7860
GCTCCATAAG CAGGCGCACC TTTCTTTTAC TCTTGAGCAA TTGCGAGAAC GATACGCGCA 7920
CGnGCGCCGT GCACACGAGG CTTTGCTCAT GCTTGAAGAA AAGGAGAAAA CACGCTGGCG 7980
AGAGCAGCGA CTGCGCGCTG AGGACCGAGC GTGTGACGAC CTGGTCAGCG CACGCGTaCC 8040
TGGTGCACCC AGCAAGCATT AATGGCTGGC GCGCTGCGTG CGCGCTcGGG TGTATGAGGA 8100
AGGCGGTCCA TGTCCGTGGA AGAGTATGAG CGTTTCGTGT GCCGTGCACG CTCGTTCCAA 8160
GATGGTGTCT GCCTCATTTC CCGCTTCTTC GTACCCTGCA GAACACAGAT CCCCCGTGAA 8220
CGCAAGGTGT GCAATACGGT ATAGGTAACA GCCATACGCA GGGGATGCAA AACAGGTAAC 8280
GGTAAAACCT GCGCAATTGA GGGAACAGTC TCCCTGTAAC AGGGTGCTCT CGTGGGCATT 8340
CCAGCACTTT TCCCCGCAAA AGATGTGCGG GGAGTATACG CGCAAAAGCG TGCGCACGCC 8400
CGCTTGGGTT GCGTCATCTG TGCGGGTAAG AAAAACAGCC GTCAGGGTGC AGGAATTCTT 8460
TTCAATGGTG TGGATCAAGT GAGTGCTCAC GTGCCCGGCA TCGATTAGGA GCGCTTCGTT 8520
TAAGGACAGG TTGCACAGCA GATAACTCGT GCGCTGGGTA TGCTGCTCTG CAGGGTGTAT 8580
GTAAACTTTC ATGGCAGGTC GCTCAGGTAT ATGCGGGCGT GTTCGGTACT AGATCCTTTG 8640
AATGTGCaGT GCGTAGTGTG CGAGGGCATA AAGTGCGTGT GTTGCACGGT GCAATTTTCA 8700
AAACGCACTG TTGCAAGCGT AGCGTGTAAG AAACGGGTGT GAGAAAGATC GCTCTCCTGA 8760
AAAGAAACAT CCTGCGCGTG GATACGGTTG AAGTTGTCAG ATAAGAGTAC TGCATGTGCG 8820
AAGCTCACCG TGTGCAGGAT GCTCCCCTCA AAAGAGGAGA ACTGAATGTC AGCTTCAGTG 8880
AAAGAGCTAT TGACCAGGTA GCAACGCTCA AAAAAACACA TACGCATCAC TACATTCGCG 8940
AACACACATG CATTAAATAC GGTATTAAAA AAATTGCAGC CGACGAAATG AAAACCCGAA 9000
AAGTCTACGC GGTTCAGGCG CATGCGTGGA GCGTACACGC CGTGTATGTG TGTTGACGTA 9060
TGCATGAGCT TAGCAAAGTC CGACGTAGAG CAATACGGTG TGGGTTCGAA CATGCGCGCA 9120
AAGCTTATAC GGGCTGGACG GGTGTGGTGT CAATGTATGC GGTGTGCAGG ACCTGGGGAT 9180
AACACGGTTC CTGTGAAATT TTTCAAAGAA CAGGCTAGGA TGATGTCTAT GTCGGACCGA 9240
CGTGAGCAAT TTCAATACGC ATTTTTGGAT TCAGGTATAG GAGGATTGCC CTACGCACAC 9300
GCCTTACGCG TGCGTGTGCC TGAGGCCTCA CTGGTGTACG TGGCGGACCG TGTATACTTT 9360
CCTTATGGGA ATAAAAGTTC TGCACAGATT ATTGCGCGTG CGTCTGCAGT TTTGCAGAAA 9420
GTGCAGACGA ATTTTTCACC ACACATAGTG GTACTCGCGT GTAACAGCAT GTCTGTCAAT 9480
GCACTTGAGT TTTTGCGTGC GCAGGTTTCG GTTCCAGTGG TGGGGGTGGT GCCTGCAATT 9540
AAGCAGGCGG TGGCGTGCAG TCATAAAAAG CACATTGGTG TCTTAGCTAC ACAATGCACG 9600
ATTACGCATC CGTACACAGC GTGTTTGAGA GCACAGTTTG GTGCaGGGTG TGTGTTTCAG 9660
AATgcTGCGG ATGCACGCCT TATTGAGTGT CTTGAGCGCG GGTTAATTTT TGAAgTCgAA 9720
GACATGCaGC GGGAGGCAGT GGCGCGCTCA GTTATGCCCT TCCAGGAAGC GGGGGTGGAT 9780
GTGCTCGTGC TCGCGTGCAC CCATTTTGTG CACGTGCGTC ATCTTTTTCA GGACTGTGTT 9840
GGTACCTCGT GTACGGTGGT AGATTCGCTA GAAGGTGTGG TACGCAGGAC GTTACGTCTG 9900
TGTCCACCGC AATCTCAATT GCGTGGGAAC GCCGCCTGTT ACGTAACTGG TGCGCGCGAT 9960
GCAGTGTGCG CGGCACGATA CGCACGGTAT GCGCAGCACT TTGGATTGCG CTGGGCGGGT 10020
TTTTTGGaCk TATGAACACG GCACTGGATA TCGGGTGCGT GCACTGTGTG TGTTTGTGTG 10080
GAGGCGGTAG ATAAgAgAgG CTGATAgACA GCGCGGTGCT GCGTGCGTAC AATGGGCCAT 10140
GGGGAAGCCG AGgTtTCGTG CAGTGGCCTT TGACATtGAT GGGACAcTGT ACCcTGGATG 10200
GCGCCTTGaT GCGTGTTAtG CCCTTtATGA TTCGCAATGC GCGCTTGATG CGTGCGTTCC 10260
GTGCGGTGCG TCAGGAGCTA CGTCGTGAGC AACGTACGGC ACTTATTCCT TTTGAAGACT 10320
TTTTTTTTGC GcAAsTACgC GCATCGCGCC GcGCGTGGGT TTATCTGCAG AAGAAGTGCG 10380
AGCCTTCCTC GACACAGCGC TGTATCGGGG GTGGAGGCGT CACTTTTTAC ATATAAAGCC 10440
ATTTCCTCAC GTGCTTTCCT CGGTGTTGGA GCTGAGGCGG CATGGGCTGA AGATAGCGCT 10500
TTTGTCGGAT TTTCCTCCGA GTCAGAAAGG CTGTCTATGG GGGGTGCGCG CGTTGTGCGA 10560
TGTAACGTTG GGCACAGAGG AGATTGGGTC CCTCAAGCCT TCTCCCCGGG CCTTTTACGC 10620
GcTGGCGCAG AGACTGAATC TGcGCTGTGA AGAAATTCTT TACGTGGGGA ACAGTGTTCA 10680
TGACGTGGAA GGCGCGCACG CAGCAGGTAT GAGGATTGCC TGTGTGCGCA GgCCCTTTAC 10740
GAGTCTTCGC GTTCGGCGCA cGCGGaCTGG CTCTTTTCCG ACTATCGCAC ATTGTGcGCA 10800
TATGTGATAG CATGAGCGCC GGCGCAGGGT AGTCTGCCGA ACCCCACACG TCCAGCGTGG 10860
CGCCCGCGGG TACCCGCTGT GCGTCGCGTG AAGACGAAGt GAGTGGAGCA TGGAGTACTT 10920
TCTGACGGTT GTCATTGCCT GCGCGATTTC CCTCGTGATG GTTGCGTTCT CCCGCCAGCT 10980
GGACAAGGGT AACCGTTCTC TTGAAAAGGT CAAGCGCTAC GCGkACTACA TAAAGGAAGA 11040
TCTTGAGTCA TcAGCGCAGA GAAGATTGCG ATGCTCAAGG ATGCGGCCAT CGAGTTAAAT 11100
GTAAAGCAAG AGCAGGCGAT TGCCTCAGTG AAAAAAATGG ATCACCTCTA CGACCAGTTT 11160
ATGAAgaAGT CTACTGCGCT TGCGGTGCAA A 11191
(2) INFORMATION FOR SEQ ID NO: 80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1773 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
GGAAAAAACC TTTAAATTCC CGGTACAATT GTGCAGAGAG CGTTTTATTA TGACTGATGA 60
CAAGnGTCGG TTTTTGCAnG CGTGCAATAA CGTTTGCAAC TGTAAACGTT TTCCCCGAAC 120
nGGTAACACC CTTGAGCGTT TGAAAACGCG CCCCCGCATG CAAGCCACGC ACGAGTGCGT 180
CGATTGCGGC AATCTGATCG CCTGCAGGTT GGAAAGAAGC GTGTAATTTA AATTCTTTCA 240
TGTGCTTCGC AGTATCGGTT ATTACCATAG GATTTAAAAA ATCTGTAGTA CATGCTGTGC 300
CTTTCCCTTT GATTTTGCTC TCTTAGAACC TCTGGTTGAC AAACCACACC CATGGTGAAT 360
AGAGTGGACC CCGCTTAAGA GGAGGCACGT TATGGTCAAG TTGCTTAGCA TTGGTGGATC 420
GGATGCTTCA GGTGGTGCGG GCATCGAGGC GGATCTGAAA ACTTTCCAAG AGTATGGAGC 480
GTTCGGTGTG GcTACGCTCA CCGCCATCGT TACTATGGAC CCATCCCGGA ACTGGTCGCA 540
TCGTGTACAT TCACTTGAGG AAGACTGTGT GCGCGATCAG CTTGAAACCG CATTTGCAGG 600
CGTGGGGGTC AGCGCGGTGA AAAGCGGTAT GCTTGCCTCT GTCCATGCAA TCGAATGTGT 660
CGCGGAGTAT CTCGAACGTT TTGCAGTTGC TGCATACGTC TTTGATCCTG TCATGGTATG 720
CAAAGGATCG GGAGATGCAT TGCACCGTGA GTTGAACGAA TTGATGATCC AGAAACTTTT 780
GCCACGCGCG ACAGTTGTTA CTCCCAATCT TTTTGAAACC GCCCAGATTG CCGGTATCAG 840
CGTACCACGG ACAGTGGACG AAATGAAGGA GGGTGCACGT TTGATTCACG AGCGCGGCGC 900
GTCGCACGTG TTCGTCAAAG GCGGCGGAAG ACTCCCCGGT TGCAAGCACG CTCTGGATGT 960
TTTCTACGAC GGCAAGACGT TTCACCTCGT TGAAGATGAA CTTGTGCAGA GTGGATGGAA 1020
TCACGGCGCG GGCTGCACCG TATCTGCGGC TATTACTGCA GGACTGGGCC GAGGACTCAC 1080
CGCCTACGAC GCGATACTGA GTGCTAAGAG ATTCGTGACT ACAGGCCTCC GCCACGGATT 1140
CCAAGTCAAC CAGTGGGTTG GAACAGGAAA CCTCAGCAAA TGGCGCGACC GCTTCCACTG 1200
ACTCAGGCGG TACATACGTG GGCGATCAGT GCTGGTATAG GTGCTTGAAG TATTCCAAGT 1260
CGGTTGAGAG GATCTTCTCC GTGTCGGGGA GCGATTTCTT GTACACCTCC AAGCTTTTCC 1320
AGAAGCCGTA GAACTCAGGA GATTTCCCGT ACGACTGCGC GTACACGGCC GCGGCGCGGG 1380
CGTcTGCTTC ACCcTTGATA CGCTCTGCCT CCTCGTACGC TTTTGAAAGT AAAcTGCGTT 1440
TTTCGTTGTC GAGCTTTCCA aGCCACTCTG CCTTCTTTCC TTCGCCTGTG GAGCGGAACA 1500
TTTGCGCGAT CTGGTTGCGC TCTTTTACCA TCCGATTGAA CACAGATGCT TGCAGCTCAT 1560
CTGAGTACTT AATCCCCTTG AAGATCACAT CGACAACGAC AATACCGAAA TCTTTTAACT 1620
GATCATTCGC CGCCTGTGAG ATCTCCCGCG CAAGAGACTC TCGCCCCTTT TCTATCGTCA 1680
TATGCGCAGT TTTCTCCGCA CCCCTATCAA AGGCAAGCTG CGACACCGGG ACGTCAAACT 1740
GCTCGGAGTG ATTGGACTCG TTGATAGCGn TTn 1773
(2) INFORMATION FOR SEQ ID NO: 81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19142 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
CAGCACATGC ACATGCATAT TCTTCCTTTT CGTTCATGAG CGGTATCCGT TTCAGGTCAT 60
TCAAATATGC CGTGAGAGTA TTCTCATCGT GGCGGAACTG CATTTGTCTC ACAGAGTACT 120
CCTTTTGTTC GAAAGGTACG TACCATATAC TCAGCAAATA TCATGCCATT TACGATAAAC 180
CAGAGGAGAT CTGTTCGGTG TCCCTATCAC GCCACTGGAC CCTCTGTGCA TACCTCCCTT 240
TCCCACGGCG CGCTACAGTT CTCTTGATTC TTCAAAAGGA AATGTATAGA ATGCGCCCCG 300
CGcGGTGCGT GTAGGCATAC GGCGCAAACG TGAAGATATG ACTCGTTATG AGGAGGAACG 360
CATGAAAATT ATACCGCTCG CGGACCGTGT CCTGGTAAAA ACTGATAAAT CGGAAACTAA 420
GACTGCTTCT GGAATCATCA TCCCGGACAC TGCGCAGGAG AAGATGCAAA GCGGTACCGT 480
CATTGCTGTT GGTTCTGACT CGGAAAAGAT AAAAGTTTCG GTGGGTCAGC GTGTCATGCA 540
CGATAAATAT GCCGGAAACC CAGTAAAGAT TGATGGAGAG GAGCACCTGC TGCTCAAGGG 600
TGCTGATATC CTAGCTGTCA TCGAGTAGTT TCATCTCTTT AACGGGtTGC GCGCGCTGGA 660
GCAGTGCACG CGCGGCCACC GTCTTTCCCT GTTGCTGCAA AAGTTGTCCT GCACGCTGGT 720
ACCATTCTCT CCAGCGCTGT GCTTTCCCCG CCGCCGGGcG CkCTCGAGGA TGTTCTTCGG 780
CCTCCCCCTC TCCCGCnTCC TTTTGAAGTT GTCGTACTGA TTCgCGCGTT GCTTCGTCTG 840
CCGCAAGCAC AAAGTGTTCT GCAGCCGCGG TTTTCTTGCC CATCCGCTCA AGCACTATCG 900
CTATGTTTGC GgArCGnCTG tGCGCGCTTC CACAGCGCGT GTCGCTTCCA CCAAGCCCCC 960
GAGGATACTC TCCACCCTCC CCGTACGAGA AGCTGcAGCA AGTGCCCGAT AGAAAGCATC 1020
CTCAGGACCA CCTCCTGCAT CACTGAGACC AAACGCTGAC TCATATTCCC CAATTCGAAA 1080
AAAGAACCAA CGTGCATACC GACGCACtGT GTGTCCACAG GgAAGACTTC AAGCAACTGC 1140
CACAGCGCGC CCCGACTGCC CTGTAGTCTC TTTTGTGCAG GATAACAAAA ACGAAAGTAT 1200
TCCAATGCAA AACGCACATC TGCCCGCCCA TGcATCGCCT CTTGTGCTAA TTCCGCCAGC 1260
AATGACCGAG CCTCCACAGG AAAATGGGGA GACAAAAAGT GCTCCTGTTC CATCTGCACA 1320
GAATACACCC CCCTTTCAGA GAGCAATTCC GTAACAGAAT CGCGCTGTCT GTGTGCTGCA 1380
CGAAAGGCAA TGCTCTCTCG CGCATAGCAC GCAACGGCAG GATAGTACGT TTTATCCCCC 1440
TCAACACACT GCGCGAGCAT CCGAACACGC TCCTGTGCAT GAGGAGCAGT GAGTGCTAGA 1500
TTGTAGAGTG CATGTGTGGA TGTTCCAGGA AAACGGTCAA CGTACGCGTA CCAAAAAGCA 1560
CGTGCACGCG CTCTATCCCC TCCATCGAAA GCGGCATCTG CCGcTAACAG CAAGTGAGTA 1620
CGGGCGTGCG TCGAGGCGGC AGAGTACGTA CCGAACAAAT CTGCACGTGC AAGACTTATG 1680
GGGAGCAACT CAAAGACCAA AGAAAACTGT CCTGCGTCAT AGGCGACTGT GGCCCAGAAT 1740
GCCGTATTCT CCGGTGCCTC AGTACCCAGT ATACGGGAGT ACAGACGGAA CGCAGCATGC 1800
AACTCCCCCA CGCGGGCGTA CGCACACGCG GCATTCTGGA GAAAGGCATG CATACCAGTG 1860
ACACCAAAGG CTGCAAGCaA AGCGCAGGAG CAATGCGGGC AAGTGCGATG CCAGTTGTAT 1920
GCCGTATTGG CGGCGACGGT TCTACACGAA ACTGCTCTTC TCCCCGGTCA AGAGGGAACG 1980
CCGTCACGTG CGCTTGATCT CTACCCCCCT GAACCTGAGA AAACTCAACA GGCTCCTCTG 2040
GCACGTGCTC GGCTCTCTCA CTGTGAAAAG AGCCCCGAGA TGGAGAGTCA CCTCCTGTGC 2100
GCGAACCTAT GAGAGCAGCC TTTATCAGTG CCTCTGCGCC AATACTCTCA TATCCCTTGT 2160
GCACCACCCG GGGGACATAG GAGATTGCTT CCTCGAAACG AGCCTCCCGC AGGAGAAGAT 2220
GTATCACCAA TGCAGCAAGC CTGCCATCAT CTGCACTTCT TCGTATGCCA CGCTGGAGTG 2280
TACGCAtGCC GCAGCGGAGG CTGAGAGTTC CATCTGACGC TTGGCAATAC TCAGATACTG 2340
CGACGACTCT CGCGCGTGTC CCACCATACG CGCAAGACCG CGCAAGGCTG CGCCTTTGCG 2400
CtTCCTGAGC AATCAGACTA TCTGCACGCG CAAGTCTCTC CAAAAAGGGC CGACTCCCCC 2460
AAACGGAGCG ATAGATGCAT AGCCCAGCAA TACCCACCCC AACTACTCCA AGGTAAGAGA 2520
GGAGGTGGCG CTTACTACGC ACTTTTAGCA ATGTTATCGA GCACACCATT TACAAAGCGG 2580
AAGGAGTCGT CcGTACCGAA ATCCCGAGCG ATACTGACCG CCTCGTGTAT AACGACGACG 2640
GGGGGAATGT CTTTTTGAAA GAGTAGCGAA TACGCACTTA GGCGCAGGAT GGCCTTGTCC 2700
ACCTTGTTCA AACGCACAAA ATCCCAGTGT TCCAACCGTG AACTCACACA ACCGTCAATT 2760
TCCCGCAGGT GCTCGAGCGT ACCGAGAAAG AGGAGTCGGG AAAACCCCAA ATCCTGGGTA 2820
GAAGGAGGCG GGTTTCTCCG CAACCAAGTA AACTGAGTTA ACGTCTCCGG CGTGATGCCC 2880
GCCGCGTCCC AGGCAAAGAG AGCCTGAAAA GCCAGAATCC GAGCGCGCCT CCTCCCTATC 2940
TTTGGGAATA CTTCACTCAC CAGTCGAGCA CCTTGCCGAG CTCTGCGTCA GCCAAGAATA 3000
CCTTATAGGT ACCAGACTTC CTCAGcTCCG CCGACACTTC CCGGATAGCG TCCTCGAGCG 3060
CTTTTTGTTG GATTTGGGAA GTAAGAAGAT TCTTGATAAA CTCGTAAAGC GAGACGGTCT 3120
TGTCAGGCTC CACCAAATCG CTGAGCGTTA GGATCTTCGC TTCCTCTTTC TTCAAAACTA 3180
TGAAGCACTG ATAGTCATTA GCCGTTTCAT TCACGTCCGA AACAGCACCA ACTCCCATGC 3240
CAAAAATTTC AAGCAACGCC TCCATCGTCA GACCAAGCTG CGTGGCAGTA ACCGCAGTCT 3300
TCCCCAGGTA TATTTCCCCG GCAGAGTAAC CCGCCTGTGC ACCATTTGCC TTACTCTTAA 3360
TGTCAGCCGT GGCTTTGACA CCGGAACTTT TAAGTTTCTT AACAAACTCT TGCGCTTTCG 3420
CCTTTGCAGC AGCAGGACCT GAAACtTCGG GACAGAAATA AGAAACAACT TAACCGTGTC 3480
AGGGCGGAAA AAGGCCTGTT TATTAAGCTC GTAGTAGGAA CGGATTTGAG AGTCCTCAGG 3540
TCCCTTCAAA TTCCGAAACT CGTCTGCCTT CTTTTGTGCC ACGTAGCGCT GCGTGCTCAC 3600
CTGCGTCTTT AGAAACTTTT TGTATTCTGC CATAGTCATG CCGTTTTGCT GCTTCATGAA 3660
CTGATCGAGC GAGATATTCT GCTTCTCCTT GACGTAATTG GCGAACTCAG CCTCCGTTAC 3720
CGCACGTCCA ATCTGTTGAG AAAGCATTCC ATTAAAATAC TGATTCACTT CAGCATCCGT 3780
TACCTGGATA CCCGCCTTTT CTGCCGCTTG AGCAAAAAGC TTTTCGTCAA TAAGACTGTC 3840
CATGAATTGT CTACGCTCAG CAGTGCTGAG CTTCTTTCCC ATCTCTTTCT CAATCGCAGA 3900
AATTCTTGCC TTAATCTGTC CGAGCGTCAc CGGCTCACGC CGGAATAAAT TCACTTCGGC 3960
GATAGGCTGC AGCGCCGACT GCGCGTGCGC AAACCCCATA CCCGCCACGC ACAACAGAGC 4020
GGGAACTATG TATCTGCCCA TGAGAACTCC CCGTAACAAC CGCGACTcAG TCACCAGCAC 4080
ACACGCCGGA GAAACTATAC GCAGCAGGGC AAGTTACGAC TTTTCCCCGC TTTTCTCAAG 4140
AATCGCGCGA ATACGCCGCA CCCCCGCCGC ACTCGACTGT TCCTTTTGAA TAGAAAACCG 4200
CCCGAGCTGC CCGGTACGTG CTACATGTGG ACCGCCACAC ACTTCCCGAG AAAAAGTTCC 4260
TATGGAGTAC ACCTTTACAG TTGATTCGTA TTTCTCACCA AACAACGCCA CGGCACCAGA 4320
ATTCATCGCA TCTTCGAGCG ACATCACTTC ACAGCACACC GGCAAGTCTG CCCGGATCTG 4380
CTCATTCACC AACTGTTCCA CCTGCACCTT CTCCTGCGCA CTCATCGGTC TTGGGTGAGA 4440
AAAGTCGAAA CGCAGGCGTT CTGCCGTAAT ATTTGAGCCT TTTTGCTGCA CGTGCGTACC 4500
AAGAACCACT CGCAATGCCT GGTGCAGCAG ATGCGTCGCC GTGTGGTACG CTGTCGTTTC 4560
CGCTGAATGA TCAGCCAACC CACCCTTAAA TACTCGCTGT GCACCGATCC GAGAGCACGC 4620
CTGGTGCGCC TGAAACGCGG TGTCAAACCC TGCACGGTCC ACCCGTAAAC CCGATTCACG 4680
CGCAAGTTCC TCGGTCAGCT CAAGGGGGAA TCCATACGTA TCGTATAGCC GAAAGGCAAC 4740
TGACCCAGGT ATTTCTCGCT CTGTCCCCTG TAAAAACTTG GGTATCATCC TCTCGTACTC 4800
TGCCTCACCC TTCCTGAGGG CGTCGAGGAA CTTACGTTCC TCGTTTGCAA GCTCCTGCGC 4860
AATACACGTA GCTTTCTCTT CCAGTTCCGG GTATACCGCA GCGTATTGCC CAATCACCAC 4920
GCGCGCGAGG GAGGACAGGA ACTCCCCATC GATACCGAGC TTCCTTCCGT GGCGGACTGA 4980
ACGGCGAATG ATTCTGCGCA GTACGTAGCC TGCACCCACG TTAGATGGGC GTACAGGGAC 5040
AGGATCGCCG AGGATAAAAG TGGCCGCACG GATATGATCG CATACAATCC GCATGGATAC 5100
GTCGTGCGCT CCCTGACACC CATACCTCTT CCCACATAAC TGACCTATCC GCTCCAGGAG 5160
CGGGGTAAAG ATCTCCGTAT CATACACTGA CCGCTTGCCC TGCAAAACCG CGACGGTGCG 5220
TTCAATACCC ATACCGGTGT CCACACAATA ACGTTCAAGC GGCCGGTACC TGCCGTCTGC 5280
GTCCTTACGA TACTGCATGA ACACGTCATT CCAAATCTCT ACGTACTTGC CGCAAGAACA 5340
TCCCGGACGA CAGCTCACAC TGCAAGGAGG AACTCCAGTA TCAAAGAATA TCTCGGTATC 5400
CGGACCACAT GGCCCTGTTT CCCCCGTAGG TCCCCACCAG TTATCCGCAC GTGGTAAAAA 5460
ATGAATATGG GTGCGCGCGA TACCAAGTCG TTCCCAGATA GCGGCAGATT CCTCATCACG 5520
CGCAACAGCC TCATCCCCTG CAAAAACAGT CACCGAAAGC CGGTCAGGGG ATATGCCGAG 5580
CCATGGAGCA CCAGTAAGAA ATTCAAAGCT GAACGCGATT GCCTCCTCCT TGAAGTAATC 5640
GCCCAACGAC CAGTTACCCA ACATCTCGAA AAAGGTCAGA TGCGAGTTAT CGCCCACCGC 5700
ATCGATGTCA CCGGTGCGCA GACACTTTTG CGCATTGACC AAGCGGGTAC CAGCCGGATG 5760
TGGCTCACCC ATAAGATAGG GAACCAACGG ATGCATGCCA GCAGTAGTAA AAAGCACGGT 5820
AGGATCGTGC TCGGGCACAA GGGACTTACC CGAGATAACC ACATGAGCCT TCTGGCTAAA 5880
GAAGGCGAGA TAACGCGAGC GTAgCTGATC GGCGCGAATA GGAATGCTCA TGGAGGGTAT 5940
TATCGCCTTT TCCCTGCTGC GGTCAACATC TGACCCTAAA CGGGAAAAAG AAACGGGGAC 6000
TCTCTGAGCA ACCTTGCGAC AAGATCCTTG ACAGAATTCG CACACACCTC TAGCCTCTCG 6060
CAAGACAGTT TCGCACCCTT AAAAAATATA AGGAGCACAC ACATGACCAC GTCTATTGTG 6120
ATCGGTGTTG TCCTTGTTAC TGTCGGTTTA AcCTTCGGAT GGACCATTCG CTGGCTCTAC 6180
GCCAGATTTC ACTTATCCGC CTGTGAGCAA CGTGCAGAAC GTATCCTCCA GGAGGCACAA 6240
AAAGAAGCTG AATCCAAAAA GAAAAGCATT CTCCTTGAAG CAAAAGAATA TGTCCTTCGC 6300
GAAAGAAATC AGCAGGAACG AGACGACAGA GACCGAAGAG CTGAGCTGCA GCGTGCAGAG 6360
CGACGCCTTC TTCAAAAAGA GGAAGCCCTC TCTACGCGCG CGGGGGAGCT TGATTCTCGA 6420
GAACGATCGC TAAAACAGCG GGATCAGTCC CTCTGTCAAG AAGAGGCCCG CTATCGCCAG 6480
GAGCTCGAGC GTGTCTCTGG CCTCACTCAG AATCAGGCAC GGGATCTCAT CATCAAAAAC 6540
CTTGAGAACG AGGCGAAgCA CGACGCACAG GCTCTCATCA ACAAGATAGA GGAGGACGCG 6600
GCTTTGAACG CTGAGCGTCG CGCGCGCGAC ATCCTCGTTA CTACCATGCA GCGTATTACT 6660
GCTGATGTCA CCGGTGATGT GACCGTCTCT ACGGTGAATC TACCCAGTGA AGAAATGAAA 6720
GGACGCATCA TTGGGCGCGA GGGACGTAAT ATCCGCGCGT TAGAGACACT CACTGGTGCT 6780
GACGTTGTCG TAGATGACAC ACCTGAAGCT GTCGTCATTT CCTGTTTCGA CCCGGTACGC 6840
AAAGAGATTG CGCGCATCTC TCTTGAGCGT CTTGTACTTG ACGGTCGAAT CCATCCGGCG 6900
CGCATTGAGG AAATTGTGCA GAAGGTGACG CAGGAAGTTT CTCAAAAAAT CTATGAGGAA 6960
GGGGAGAAAG TGCTGTTTGA CCTCGGTATT CACGATATGT GTCCCGAGGG GGTACGGGCA 7020
CTGGGGCGCC TGTATTTCCG TACAAGCTAC GGACAGAATG TACTCTACCA CTCAAAGGAG 7080
GTGGCTCTGC TCGCTTCCAT GCTCGCCTCG GAAATCGGCG CAGATGTTGC CATTGCCAAA 7140
AGGGGCGCGT TGCTGCACGA TATTGGCAAG GGAGTGGAAA CTGATTCAGA CCGCAACCAC 7200
GCAGAAATTG GTATGGAGAT GGCTCGCAAA ATGAATGAGG ACCCGCGAGT GGTAAACGCC 7260
GTTGGTTCTC ACCACAACGA CATAGAGCCG TGTTGTGTTG AGTCTTGGCT CGTTCAGGTA 7320
GCTGATGCTA TCTCTGctGC GCGTCCTGGT GCTCGGCGTG AAATGGTGGA TCacTACGTC 7380
AAGCGTCTAG AAAACCTCGA GGCGATTGCT GAGGGGTTCT CGGGTGTAGA GAAAGCCTAC 7440
GCTATTCAGG CCGGGCGCGA GTTGCGTGTT TTAGTGAACA ACGATAAAAT CCCCGACAGG 7500
GACGTGAAGG CACTTGGACG TGACATCGCA AAGAAAATAG AGAGCGACTT GAAGTATCCT 7560
GGGCGTATCC GGGTCACTCT TATTCGAGAA ACGCGCGTCG TGGAGTATGC CCGCTGAGCC 7620
TCAGGGAGAG GGGAGAGAGT GCACGGGCGT CCGTGCAGGT TTGCATCGGC TGCAGTGACT 7680
CTCCTACCTC CCTATTCTAG TCCGGCGATA TTGGTCAACA AGGCACATGG GAGTATCATG 7740
GCACAACAGC GTATTACGTC TGATATCTTT GCTCAGCTGC TCACCCTTTC TCACCTCGAA 7800
AGCAGCGAGT GTGCAGTAGG ACTTGCAACA CAGATCGAGG ACATTATCCA GTATTTTTCC 7860
GTTGTAGAAC AGTTCGACCC CGGTCCACGC GACGATCCTG ACACGGATAA CGCACAAGGC 7920
CGTTGCTCCC AGGGGAATAA AATTGACGTG GACTGCTGCC CGGACTGGGT ACGCAAGGAT 7980
GTCGCATTAC CTGGTCTTTC CGTTCACGAT CTCAAGCGGT TGTCCACAGA GTTTGCTGAC 8040
GGTTAcTtTy kCGCAcCGCG CGCGCTCGAT GGTAGCGCAT AAATGGACGC GCATGCTATT 8100
ACCTGTGCAA GCTGGAATAT GTTAAAGGCT CAGCTTGAAG CCGGTGCAAT CAGCTCTTTG 8160
CAGATTGTGC GTGCGTTTCG CAACGTATAC GAGGAAGACA CACGCAGCGC GTCCCCGCTT 8220
GGGGCTTTGG TCGAGTTTTT CTCTGATGCG GAGGAGCACG CGCGTACGGC AGACAATCTC 8280
CGTGCCTCGT GTGCCCAGAG TACTAAAACA GCTGGAGCAA ACGGGGGGAG TGTCTCAGGT 8340
AAGCCTTTGT TAGGTCTACC CTTTGCTGTC AAGGACAATA TTTCAGTGAA AGGAAAGCAC 8400
TGCACGTGTG GCAGTAAACT CCTTGCAGAC TATAGGGCTC CGTACGATGC CACCGTTGnT 8460
TGCcGnCTGC GCGcCGcAGG TGCTaTCCCG CTCGGGAGAA CGAACATGGA TGAGTTTGCT 8520
ATGGGCTCTT CCACCGAGTA TTCTGTTTAT GGGCCGACGC GTAATCCtCG GGATCGGAGC 8580
CGCACCAGCG GGGAAGTTCC GGCGGTTCGG CTGCCGCCGT nCGCAGGCGG TnCAGGCACC 8640
GTTTGCACTC GGTACCGAAA CGGGAGGCTC GGTACGCCTG cCAGCTGsTT aCTGCGGCCT 8700
cTATGGCtGA AgCCGACcTA TGGTCTCTTG AGTCGATATG GGGTGGTTGC CTTTGGCTCC 8760
TCTCTAGACC AAATCGGCTT TTTTGCTACC TGCATTGACG ATATTGCCCT CGCCCTCTCC 8820
GTCACCTCAG GGAAAGACCT GTACGACAGC ACGAGCACTT GCCCCCCTCC TGCGACGGGG 8880
CGACACGCTG TGTCTCACCA TCTTGCCCCT TTTTCTGCCC ACGAGTGCTC TATCCTGCGT 8940
GCTGCTGTTC CCCGCGAATT AGTAGATGCT CCTGGCGTGC ATCCTGACGT GTCTGCGCAA 9000
TTTCAACGCT TCCTCACCTG GCTGCGTGCC CAAAACGTAC AGGTAGAAGA AGTGACGCTT 9060
CCTGCACTAC AGGCGGCAGT GCCTGTATAT TATCTTGTCG CGACAsctGA AGCCGCCAGC 9120
AATCTTGCGC GTTTTGACGG TATTCGCTAC GGGCAGAGGG GAGACACTGA TGCTCTTTTG 9180
GAAAATTACT ACCGCGCCGT CCGTACCTCA GGCTTTGGAC CCGAAGTACA GCGAAGGATC 9240
ATTGTGGGGA ATTATGTTCT TTCACGCCAT TTCTCCGGTG ATTATTACCG AACGAGTGTG 9300
CGCGTACGTT CGCGTATAGA ACAAGAATGT ACGCAGCTCC TCTGTTCCTA CCACTTTATT 9360
GTTTGTCCTA CTGCCGCTAC CGGTGCCTTC CCGCTTGGAG AACGCATACA TGACCCGCTG 9420
GCCATGTATT GCTCGGATTT ATTCACCACC TTCGTTAACC TTGCCCGCCT ACCGGCGCTA 9480
TCAGTACCAG TGGGAACATC AGGCACTGGC CTACCCATCG GAATACAGAT TATCGGTTCT 9540
CAGTGGCAGG AGTGTGCCGT TCTCCGGCTA GCAAAACGTT GGGAGGAGGC ACCTCATGTC 9600
TGACCTCCAA ACAGGCACAG TTCCCTCCAT TGCAGGCGCC ACAGATGACA CACATGCCGC 9660
ACCCTTTTTC TACGAGGTAA TTATTGGCTG TGAAATTCAT TGTCAGCTTC TAACAAAGAC 9720
CAAAGCTTTC TGTGCtGTGC AAATCGCTCA GGAGGAATGC CGAATAGCCG TGTGTGTCCT 9780
GTGTGTCTTG GGTTGCCAGG AGCGTTGCCC GTTGTGAGTG AAGAGTACGT GCGGCTCGGG 9840
GTGCGCGCCG GACTTGCGTT GGGGTGCACT ATCCAGCTTT GGTCCGCTTT TGATCGCAAG 9900
CACTATTTTT ATCCAGATCT CCCAAAGGGT TATCAAATTA CCCAGTACGA CGCTCCCTTG 9960
TGTACGGATG GTGCAGTGGA TGTACAGGGA GTTGACATGC CCGTGCAGCG CGTGTCCGTA 10020
TTGAACGGAT ACATTTGGAG GAGGACGCAG GCAAAAGCCT GCATGCTGCA GACGCTTACA 10080
GCTATATTGA TTTCAATCGT TGTGGGGTGC CGCTCATTGA AATTGTATCT AGGCCGGATC 10140
TGCGCTCTGC AGAGGAGGCC GCATGTTTTA TGCAGACGAT CCGCGAGATT CTCACCTTTA 10200
TCGAGGTAAC GGATGGTAAT TTAGAAGAAG GCGCACTGCG ATGCGACGCG AATGTTAATG 10260
TGAGGATTCT GTACAAAGGG CAAGAACACC ACACTCCCAT TTCTGAAATC AAAAATATGA 10320
ACTCGTATCG TATGGTGCGG GACGCGTGTA CGTATGAGGT ACAGCGTCAA TTGCAGGAGT 10380
TTTGGCAAAA GGGTCCTGCG AGCAAAGAAG AGATGCAGAG AAAACGCACG ATGGGCTGGG 10440
ATCCGGTCGA AGGGGTTACG CTTTTACAGC GTACAAAGCA CTCACTGCGC GATTATCGTT 10500
TCATGCGCGA TCCAGACTTA CCTGACCTGC ACTTGACCCC TGCATATGTC CAGCATCTCT 10560
CTTACACAGT CGGGGAACTT CCGGCAGCGC GGCGTGCACG TTTCAAACTT GACCTTGGCT 10620
TGTCGGCGTT TGCAGCCCAA ACGCTTACCG GCAGCCGCAT GCTCGCAGAC TGGTTTGAGA 10680
AGGCAGCGCA TGCGTCTAAG AATGCGCGAC GAGTGGCAAA CTGGATTCTG TCGGAGGTTC 10740
TTGCGGTAGT AAACGAGAAG AATATCTGCA TTGCAGAGCT CAATCTGAGT CCTGAAGCAA 10800
TTGCCGAACT AATGGATGCA GTTGAAGATC AGCGCATTAC CGGAAAACAA GCAAAGGATA 10860
TATTTGCACA AATGCTTGCC ACCGGTGCGC GAGCGCAGGA CATTATCTCC GCACAGGGTC 10920
TGGCACAACT TTCAGATGAG GAAGAAATCG CAACGTTAGT GCAGACGGTG TTTCAAGAAC 10980
ATCCAAAGGc nCTGCGTGAT TGGCAACACG GTAAGACAAA CGTGGCTGCC TGGCTCATGG 11040
GGCAAGTAAT GAAGCGTTCC CGCGGGCGCG CACACCCTGC GCGAGTGGCG ACGCTCGTCC 11100
ACCAAGCACT CTCTCAGCTG TAACAGCTGG AAAAACTCCA CGGAAGAGCG GCGGTCTCTC 11160
TTTCAGCATA CGCCCGGyCC CGcTCACGCC AGGAGAAGAA AGACGCACCA AAGGGCTACT 11220
ACGCCTTCGC CTTGCTAAAT ATCTCCGCAA TTGTCTGTGC GAGCGTACCC ACCACCCGTA 11280
GGGTATAAGG AGGCGGTGTG TCGTGTTCAA CCCCTATAGG TACATATATC GTTGAAAATC 11340
CCAGACCATA CGCCGTCTTT AGCCGCGTCT TTAGCCGACG CACAGGTCGA ATTTCTCCTG 11400
ATAAACTTAC CTCACCGATA AACGCGGCAT TTGTTTTCAC TGGGGTGTTT TGCCGCGCTG 11460
AATACAAGGC CATTGCCAAC GCCACATCCA CCGCAGGCTC ATATAACCGG ATACCCCCTG 11520
CCACATTCAC GTAGATATCC TGATCTGAGA ATTTCAAACC CACACGTTTC TCAATTACCG 11580
CTGCAACACG ACTGACGCGG GCCGAGTCGA TACGATCAGA AAAAACGCGC GTAACACTAC 11640
TTTTTGCAGG AACGGTCAAT GCCTGTATTT CTACCATAAA AACACGGCTC CCCTCACACA 11700
CGGGCACAGT TGCAGACCCA ACAGGAAACA TTCCCTGCCT GGTACTAATA AAAAATCCTG 11760
CAGTGTCCTG CACAGCGGAA AGTCCATTTT CACCCATGGT AAAAATACCC AGCTCATCAA 11820
CAGAACCAAA TCGATTTTTC AATGCACGTA AAAAACGAAT ATCCTCTTCA TTCCGTTCAA 11880
AAGAAATCAC AGTGTCCACC ATATGTTCCA CTACTTTTGG CCCGGCAATA TTCCCATCTT 11940
TCGTTACATG CGCAGTAAAA AAGAGAACAG AGTCCCGTTC CTTTACCCAC GcTATCAACT 12000
CATTTGCGCA ATATTTCAGC TGATTGATAG TCATAGGAAT GGCACCTGCT TCGGGGGAAA 12060
AAACTGTCTG AATCGAATCA ACAATAACGA AGGTAGGGCA TCGTGTATTT AAAACACGCT 12120
CGACATCCTC GACCCGCGTC GCACAAAGCA ACTCGATGTT CTGAATTGGA ATATTCAGCC 12180
GATCCGCACG CCCACGAATT TGCCCCGGAG ATTCTTCACC CGAAACATAG AGAACCGATT 12240
TCCCGCAGgC TGCAGCGATT TGTAACAGTA ATGTAGATTT ACCAATGCCC GGTTCCCCGC 12300
CAATCATGAT CGCGGAGyGT CTTACGGCGC CTCCGCCGAG GACACGATCG AACTCTGCGA 12360
TACCACAACT AATACgCTGc gCATCCTGCG CGCGCACAGC ACACAGCGGG AACGCCTGTA 12420
CAGGAGAAGA AGATGCCTTT TTTACAGCAC GAACATCGCC GGAGGACAAC GAGGGTGTCT 12480
CTTCGAAGGA ATTCCACTCC CCGCACTCAG GGCAACGCCC AAGCCACTTA GGATGAACGT 12540
AACCACACCC CACGCAGGAA AAGGCACGTT CCGTCTTTTT AGCCACTCCA TTTCTCTTAC 12600
GTCAGAAAGA AAAAGCACTG CACGGCGGGC CGctACGCAC CCTCCGTTTC TACCGCAAGA 12660
CAGCGACGAA CGAGCGAAGG AGAAAATTGC CTCCGCTGCA ATGCACGCTG CAACGTATGC 12720
GGCCCATATC CCCGCcTTCG CAGCTTCTCC AGCAACCTCA AGCAGAGCGT TTCTTCATCT 12780
TGCTCTTGAA ACAAGAGATC TAACGCGCCT TCTGCATCTG CATGAGAAAC CCCTCGCCTT 12840
TTTAACTCAC CTAATAGCTG AGCACGGGAC GCAGGACGGG AATCGACGCG GTTTCGCAAC 12900
CACGCCCGCG CGAATCGcGT ATCATCAAGC CAAGAATATC TTTTAAGAAC CGGGAACACG 12960
CTCTCAACGA CCCTCTTCTC AAAGCCTCTT TTCAAAAGCT TAAAACCCAG CTGCTGAGCA 13020
CTCGTCTCAC TCCGAGCGAG CAACCGCACT GCGACGCACT CAGCCTCATA ACACCTACAT 13080
GCAAAGCACA CCGCCCCGTa CTGCTCATCA GTGGGACGAG TACCCACCAA CTCCTCTATC 13140
GGGCAGGAAA GCGCGCCAAG GTAACTCAAG CGAGTCTGCA GAACAGCACC CACCTCATCC 13200
GTAAGTTTAA GCACATCCTC CTGGAGGCTC TGGATAGCAC AGAGACAAAA GCGCCTATCG 13260
CTTACTGAAT TGAAATCTAC GCCGTGCACC GCGCTGGCCA TACTTCTTGC GCTCTACCAT 13320
GCGAGAATCA CGCGTGAGCA ACCCACCTGC ACGCAGCGAA GCCTGATTTG AAGCGTCAGC 13380
ACGCACGAGA GCGCGCGCAA TACCATGcGC ACACGCACCA GCCTGCCCGT CAAGTCCGCC 13440
GCCATACACA TTGACAATCA CATCGTAACG CCGCTCGTTC GCGGTAgCGA AGaGGGGTTC 13500
GCGCACCCGA CGCAATTGct CCGCCGTAGG AAAATACGCG CCGACATCCC GTCTGTTGAC 13560
GGTAACATTC CCGTTCCCCA TACGGATACA CACGCGAGCG ACAGCCGTTT TCCTTCTCCC 13620
TGTTCCGATC CCAAGATTCT TCACGCTCTT TACCACTCCT TAACACGACA GCGGCACGGG 13680
ATTCTGCGAC TCGTGCGGAT GCACAGATCC CGCGTATATC TTCACATTCT TGATAAGCTT 13740
GCGTCCCAAA GGCCCCTTGG GTAGCATACC CTTAACCGCG TGACGCAACG GTTCGACAGG 13800
CCGGCGCTTG ACCAACGCGC TAAACGACAC GCTCTTGAGC CCCCCAGGAT AGCCTGAGTG 13860
GCGGTAGTAC ATCTTGTCCT TCGGTTTAGT CCCACTCAGG AACACCTTCT CAGCGTTGAT 13920
TACCACAACG TAATCACCCA TTTCCTGGTT TGGCGTGTAC GATGCTTTAT GCTTACCACG 13980
CAGGAGACAC GCAACGCGCG cGGCAACACG CCCCAAGGGG CGCCCCGCTG CGTCAATCAG 14040
GTGCCAAGCG CGAACAGCCT CGCGCTCATT AACAAAGATT GTCCTCACCC CTCTGCTCCT 14100
TTATCCACAC GCTCCGGTAG CTTGCGCGTG GGCGCACGAT CACGCATGTT TGCCACCCCT 14160
ACAAGGAGAG ATATAAAACC AGCAAAAATA CCGCCGACTG GCGCAGCACC ACGCAAGAAC 14220
CGCAGTGCAT CCTGCGCCCA CCCGAGACCA CAACCGTACT CCATAGGGGA AAGAAGAGCG 14280
AAAACCGCCG CTCCCATCAG CAGCACACCA ACCAAAACCG CGCCCACCAC CGCctCCCGC 14340
GCAAAATCTC CCGATCGTCT CATAAAAATC ATTCTGTGTC AAGnTGCGCT AGcACAAAGA 14400
CAGTTAGAAT GCGTATACTC TCCAGATAGT TGCGGAAAAC TTCCTGAAGA AACCTTCTGC 14460
TGAAAGAAAT cAGCTAACTT GAACAAACGC TCAGTCCGCC TGATGATGGG CGCATAACGC 14520
ATGCAAAGAT TTGAAGACAT TCCGTACACG CGGCCACACA TGGACATACT CGAGCATGCC 14580
GTCGACGCGG CTCATGGGGA GTTCGCGCAG GCCCGTTGTG CACGCGATGC GTATGCGTCT 14640
ATACTCGCCA TAGAGGATCT CCAGCGCCAA TATCTTACCG CACAGGCACT AGCGAACATG 14700
CGCTGTTCCA TTGATACACG CAACACCTTT TACCGCAGAG AACAGGATTT CTTCGATGCT 14760
GTACATCCTC GCTTTGCCCG CTTAGATCAT GCCTTCAACC AGCTGcTGCT TGCATCACCA 14820
CAGCGCGACG GTCTTGAAAA ACTTATTGGC ACTCACCGCT TTACCCTTGC ACGCCTTCAG 14880
AGCAAAACCT TCTGcTCGGA GATTATGGAA GACCTCGCAG AAGAAAATCG TCTCACCAGC 14940
GCCTATGAAA CACTCCTCGC TTCCGCACAC ATTCTCTTCC GAGGCCACCA CTACACTCTC 15000
GCCCAGCTGT CCCCCTTTAT GGAACACACC GACCGCAACA CGCGGcGCGA CGCGCATGaG 15060
GCATACTATC ACTTCTTTGC TCAACATGAA TCGGAGCTCG ATACCCTCTA TGACACGCTG 15120
GTACGAGTGC GCACACGCAT CGCACGCACG CTCGGCTATG ACAATTACAT CCAACTCGGc 15180
TATGACCGCC TGTTACGCAG CGACTACGAT ATGCAAGATA TTGCGcgTTA CCGCACCTAC 15240
ATCCTGCGct ACGCCGTACC CCTCGCTGCG GAACTACATG AACAACAGCG ATCTCGACTT 15300
GGACTCAGTG AACTTCTCTT TTATGACGAG CCGTTGTACT TCCCTTCTGG AAATCCAGTT 15360
CCCCAGGGAG ATGCACCCTG GATATTGAAT CAGGCCGCTT GTATGTACCG CGAACTGTCC 15420
CCAGAAACAG ACCAGTTCTT TACCTTTATG CGCGAGTACC ACCTATTTGA TGTCTGTGCA 15480
CGTATTGCAA AAGCGAGCGG TGGATACTGC ACAACCTTGA GCACATATCG TGCGCCTTTT 15540
ATTTTTGCAA ACTTTAATCG CACTGCACAT GACGTGGAGG TTATGACGCA CGAGGTGGGC 15600
CACGCCTTCC AAGCCTACCA ACGCTATCGA GCGCGTCTTG ATCCCTGTTT GGAAGCGTAT 15660
GTGTGGCCCA CGTACGAAGC GTGCGAGATC CCCTCAATGA GTATGGAATT TCTCACCTGG 15720
CCGTGGATGG GGCTCTTTTT TGGTGAACAG AAAGAACGCT TCTACCTGCG CCATTTAACA 15780
CAGGCAGTGG AGCTTTTACC GTACGGGGCA GCTGTGGACG AATTCCAACA CTGGGTGTAC 15840
GCACATGCGG ACGCTTCTGC CACTGAACGC AAGAAGGCGT GGCGCGCATT AGAAACTCAG 15900
TATTTACCTC GCCGTCGGTA CGGAGGGCAG CACTACTTGT CCTGCGGGGG ACTGTGGATG 15960
CGTCAAAGTC ACATTTTCTG TATACCCTTT TACTACATAG ACTACACGyT CGCGCAGATA 16020
TGTGCGTTGC AATTTTGGGA TCGCAGCCGC GTTGCATACA CTCACCTTTC TACTCTCACC 16080
GGCGCGGCAC CGTACGCCAG CATAACTCCT ACTGCCTATG CGGAAGCCTG GCATGACTAT 16140
TGCGTACTGT GCAGCCGAGG CGGCAGCGAA CCGTTCATGC GTCTGCTTGC CACAGCGAAT 16200
CTGCACAACC CCTTTGAGGA AGACACGTTT GTTTCAACAC TCGCTTCCTG CCGTGCGTAT 16260
TTTCGCACGg TTGGTGACCG CCTTTCCTAG GTCTATGAAA AAAGGGTAAA AGATGCCTCG 16320
CCAAAAAGAG AACTACCTGT CACCGTCCCC CGTGGTCGGG ATTCTTCGTG ACTGGGGAGT 16380
GCTGTGCACC TTACGCTTAA AGGGGAAACA CATGAAGCTT GTCTACAGTA CGGATTGCGA 16440
ATACCACATT GGACTGAAAG CGTCAGACAT CGGACACTAT GTTATCTTAC CGGGGGATCC 16500
TGCACGAAGC GAAAAGATTG CCCAACATTT TTCTCATCCT CACAAAGTTG GCCACAACCG 16560
CGAGTACGTC ACGTACACGG GCACCCTCTG CGAAACACCA GTCAGCGTCA TGTCCACCGG 16620
TATTGGGGGG CCGTCAACTG CAATTGGTGT TGAGGAGCTC ATCCATTTGG GCGCACACAC 16680
CTTTATCCGC GTAGGGACCT CAGGGGGCAT GCAGCCTGAT ATTCTTGCCG GGACGgTAGT 16740
TATTGCAACC GGTGCGATTC GCTTTGAAGG CACCAGTAAA GAATATGCCC CCGTGGAGTT 16800
TCCTGCGGTG CCGGACTTTA CGGTCACTGC TGCACTCAAA CACGCTGCAG AAGACGTGCA 16860
GGTGCGCCAC GCGCTcGGTG TGGTTCAGTG TAAAGACAAC TTCTACGGTC AACACTCCCC 16920
CCATACCATG CCCGTCCATG CAGAACTCAC GCAAAAATGG cACGCATGGA TTGCATGCAA 16980
CACACTCGCA TCCGAAATGG AGTCTGCAGC GCTCTTTGTG CTCGGGAGCG TACGGCGCGT 17040
GCGCACCGGC GCAGTGcTCT TAGTCATTGG AAACCAAACC CGCAGAGCAC AGGGATTGGA 17100
AGACATTCAA GTTCACGACA CCGAAAACGC CATACGGGTT GCAGTCGAAG CGGTCAAATT 17160
ACTCATCACC CAAGACTCCC CGCGCTAGGC GCACTGCAGT GCTTTAGGCA AAGTGTCGCG 17220
GATCGGTTAT TTCTAACAGC GCGGAGTCAA GCTCGGTCAA AAAGCGGCTG GGCTTCATCG 17280
CCGTGTGTCG TCCCCACATT CTGCGGTACG CACACGCGGT AAGGTATAAG CTATCCATAG 17340
CCCGCGTGCA GGCAACGTAC ATCAAACGCC GCTCTTCCTG TATATCTGCT TCGTCATCAC 17400
GCGGAAACAC CCCGTTCTCT AGTCCGGTCA GAATCACCCG TCGAAACTCC AGCCCTTTTG 17460
TATTGTGAAT GGTGATCAAG TGCAtGCGTC AGCCGcTCCT CCCTCGTCGG CCATATTTTG 17520
GTCCAATTGG ATGTGTTCTA GGAAACTCAC TAACCCCTCA TGCGAACATG CATACAGTGA 17580
CGCTGCGTTC ATTAACTCCT GCACGTTGAC CGCGCAtGCG TCCCTTCTTC CTCATCCTTC 17640
TGTCGATACC ATTCTTCCAG CCCCGTGTGT TCCATTACCA CAGAAACAAA GCGCGCAAGT 17700
CCtTCTGCaT CGTGGGTGCG CTCCtCTACC GGCTCCTCAG GCGGCGCACT GGTGCGAGCT 17760
TCTTCTCCCG CTGCGGGGGC CTGTGGCATG CGTGCACGCA GCGCACGTAA CAGCGACAGA 17820
AAGCTACTGA CCTTTTGCCg CGCACGCGTG CCAAGCGCGG TCAGGTGGGT GGACTGGAGT 17880
GTGGTAAAAT CAGTTATGGC TGCCTGCTGT GCACAGACAA ACAATGCGTC TTGTGTCTTT 17940
TCTCCAATGC CCCGAGGCGG CTTATTCAGC ACCCGCCGGA GGGCCAGTTC ATCTGAGCCA 18000
TTGACTATGA GCTGGAGAAA CGCCAGCACG TCTTTTACCT CTGCGCGACT GTAGAATTTG 18060
AGCGTGCCGA CAATGCGATA CGGAATGCGA TTCCGCAAAA AACACTGTTC AAAACTCAGC 18120
GACTGTGCAT TTACCCGATA TAAAATCGCC CAATCCGCGT ATGGGATGCC gCGCGCACGC 18180
kcTTCTTGAA TGAGGTGCAC GCACAGCGCA gCTTCTTCAT CTTGATTATT CAGCAAGAAC 18240
AGGCGCGGCT TAGTACCTCC CGTGCGCTGG GCAATCAGCG CCTTTCCTAA GCGGTCTTGG 18300
TTTTTTTTCA CTACCGAATC AGCAACACGC AGAATTGCGT CTGTGGACCG GTAGTTGTAC 18360
TCCAGGCGGA TAATCTGGGT ATTTTGAAAG AACTCAGGGA AGGTCAAGAT ATTTTTTACC 18420
TCTGCTCCGC GAAAGcGATA GATGGACTGA TCGTCGTCCC CTACCACACA AAGATAGGTG 18480
TGCGcACCGG TGAGCACCTG CAAGAAATGA AACTGCGCCA CGTTTGAGTC TTGATACTCA 18540
TCTACCATGA CCACCTGCCA CCGTGCATGC AGCTGTTCGG CGACGTCCTG GTgcTCACGC 18600
AAGAGCTGCA CCGGAAGCAT AATCAGATCC CCAAAGTCTA CCGTTCCCAT TTCGCGCATA 18660
CGCCGATGGT wrCACGCATA TGCGTGCGCA AACTGCCTGT CACCCAGAAC GGCTCTGGCA 18720
GCGCATGCAG GAGCGGAGAC ACGCGCGTGC ACTGACTCAA ACGAGGCGCA GTCGAGCCCA 18780
TAGTCCTTTG CCTGAGAAAT TCCACGCGCG AGCATGCCTG CCCGACTGTG ATCGCAATGA 18840
GGCAGGATTT TTGGCAAGAG TGCACGGACG TCATGGTCGT CATAAATACT AAAATGGGGG 18900
TTCAATCCCA GACGGACTGC ATAGCGACGC AGGATCCACA CCCCCAGTGC GTGGAAGGTA 18960
CAGATAGTTG CCCCTGCGCG GCAGACTCAA GCGCGCAGGC ACGCGTGCGC ATCTCAgCGC 19020
CGCTTTATTG GTAAAGGTTA CTGCCAGAAT CTGCTCGGGG CGAACCTGCC GGGAACGGAT 19080
AAGATGGGCG ATTTTGGTGG TGATAACGCG CGTCTTTCCT GAGCCTGCGC CGGCAAGGAT 19140
AA 19142
(2) INFORMATION FOR SEQ ID NO: 82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2178 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
TATCTATTTC AGTTGGATTT TGGCGCCTGG AACTGGGGAT TTTTCGTGTG CCACTTTCTG 60
ACACGCTAGC TCTGTGGCGC ACATTGAAGG AGCGCCTcTC AgGGTTGTCG CTTCCCACGC 120
TTGCGgTGGA CTTGCCAGGG GGTGGAGGAA AGTTTCCGCT TGTGGCATTG GCCTTGCAGC 180
AAGATGTCAC GTGGCATCAG GAACGCGAgG CGTTCTCCGC ACGCGGcATC GATGGCGCGT 240
GGTACACGTA cCCGTTCTGA CCGTCAAAAC ACCCCGCGCC TCATCGAGTG GAGAGTGCGC 300
CGCGTGGTAC AsGGGGCAGC GCGAAACACC CCTGCGTGAA GGTAGGCGGA GGAGGGGGTA 360
CGGTGGTATG CACCGAACCC AGTGATGCAC CGCCGCCCCC CCCCGAGGGG TCCTCTTCTG 420
CCCCGCGCGC CCACGCAGCG CCGCAGAACG GTGAGAAAAG AGGACGCGCT CTCTACCGGC 480
TAACACATAT CACGCGTGCA CGCAGCCCCC TCGCTGCCGT TGATGCGAGC TTCCAAAACC 540
TCGAGTGCAT CGCGAAGCTG CTGTGGCAGC CGCGCGCCAA ACTTTGGGTA GTGATTCTCT 600
CGGATGTCTT TAATTTCCTT TTTCCATCCG GCTATGTCCA CCGACAAAAG CTCTTTCACT 660
GCCTGCGTGC TTACGTTTAA CCCCTCTGTG TTCAAGGCTC CCTCTTTGGG CATCCAACCG 720
ATCGCTGTTT CCACCGCGTT GTCCACACCA TCACAGCGGT CAAAGATCCA CGCGAGTACT 780
CGGcTGTTAT CGCCATATCC GGGCCACAGG aAGTTGCCCt CTGCATCTTT ACGAAACCAG 840
TTAACGCAGA AAATCTTTGG CAGGTTTTCG GCACGTGCCT GCGATCCGAG CTTAATCCAG 900
TGCGAAAAGT AGTCTGCCAT ATGGTAGCCG CAGAAGGGGA GCATCGCGAA cGGGTCTCGG 960
CGAATCTGAC CTACCTGGTC AGAGATAACT GCTGCAGTTA CCTCCGAGCC GATGATGGAA 1020
CCTAGAAACA CCCCGTGATT CCAGTCCCGG GCCTGATGCA CCAGGGGAAC CGTACTGGGG 1080
CGACGGCCGC CAAACAGAAA AGCGTCGATA GGGACCCtTC GGGATCTTCC CAGTTACTTG 1140
CAATTGCAGG GCACTGTCGC GCAGgAGCGG TAAAACGCGC ATTCGGATGC GCAATTTCTT 1200
CTCCTTTTGG ACTTTTATCG CGTGTAGGTG CGGGGCGCGA CACGCCGTGC CAATCAATGA 1260
TTGTTCCTTT AGCGGGATAG CCGATACCCT CCCACCACAC GTCGCCGTCT TCGGTCAGAC 1320
CACAGTTGGT GAAAATGGCG TTTTCCTTGA TAGAGTCCAT GGCATTCTTG TTCGAGAAAT 1380
CAGATGTCCC TGGTGCTACG CCGAAGAACC CCGCTTCAGG ATTGATAGCG TACAGGCGGC 1440
CGTCCTTTCC GAATTTCATC CACGCGATGT CATCGCCTAC GGTCTCGACC TTCCATCCAG 1500
GAAGGGTAGG GATCATCATA GCCAGATTCG TTTTGCCACA TGCAGAGGGA AACGCCGCAC 1560
CAATGTACTT GGTCTTTCCA GCAGGGTTGG TGATTTTAAG GATGAGCATG TGCTCTGCAA 1620
GCCACCCTTC GTCTCGTGCG AGTACTGAAG CGATGCGTAA TGCGAAACAC TTTTTCCCCA 1680
ACAGGGCATT CCCTCCGTAT CCTGAACCGA AAGACCAAAC CAAGCGCTCT TCAGGAAAGT 1740
GAGAGATGTA TTTGCGCTCC ATATCCGCGC AGGGCCACTG GCCTGCGTCA GTTACGCCCG 1800
GTCCTAACGG CTTCCCCACA GAGTGCAAAC AGGGGACGAA CTCACCATCA GTACCCAACG 1860
CCTCAAGCAC GCGGGTACCC ACGCGTGTCA TGATGTGCAT GTTGCAAACG ACGTACTCAG 1920
AATCGGTGAT TTCGATGCCA TTTTTAGAGA TGGGTGAGCC GACCGGTCCC ATGGAAAAGG 1980
GAATGACGTA CATGGTACGG CCCTTCATGC ACTGGGAATA GAGACCGGTC ATAGTCTTTT 2040
TTAATTCTGC aGGATCGGTC CAATGGTTAG TGGGTCCTGC ATCATCCTCC CTTTTTGAGG 2100
CGATGAnAGT GTTCGCTTCG ACGCGCGCAA CGTCGGAGGG CTGTGAGCGA AAGAGGAAGC 2160
AGTTCTTACG TTTTTTTA 2178
(2) INFORMATION FOR SEQ ID NO: 83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9365 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
TAAATATCTT GTAAGCTACG AGGAGCGAGA ATGTCTGCAT TGTTTTCCTT GGTTGCGGTG 60
TACGTGCTTG TGTGCGCGCT GCACAAACAG ATTAAAAAGT ACGCGTCTGT CTGCTACCTC 120
GGCAGTGCGT GTGTCAGTGT TGCGGTTGTC TGTGTCGTGT GGAGCGGCGC AACCAAGGGG 180
AATTTTGGCG TGCGTGTGCT TCTGCATCCG CTGACGAGTG CGAGTTTTTC CACCGCGATC 240
TTTACATTTG TGATGTGCGC GAGTGTGCTG AAGAACGGTT TGCTCAAGCA GCGTGTCATG 300
GGGTTGCGTG CGGAACTGGC CATCACCGCG GCGATTC CA CTCTCGGGCA TAACATCGCG 360
CATGGAAGGG ACTACCTGGT GCGTCTGTGC GGGAGTACCG GGGATTTGTC TACAGGGTTT 420
CTTGTCGCGG GTGCTGTCAG CATGGTCCTC GTTCTTTTGA TGAGCATTTT GGCGGTAACG 480
TCTTTCAAGg TAGTGCGCAG GCGCATGGGC GCAAAGACAT GGAAGCGTGT GCAACGCCTT 540
GCATATCTCT TTtACGGGCT TACGTATGTG CACCTTTCCT TTATCCTCCT ACCGACCGCT 600
TTGAGGGGGT ATATCCCGAG TGTCGTCTCC TACGTGTTGT ATACCGTCAT TTTTGCCACG 660
TACGCGTTGC TGCGTGTCCG CAAGGCGTTG GGAAAGCGGA AGGGTGCTTG CGCGTTGTGC 720
TCCGCGGCGG TTGCTGTGTC CTTCGTTGCG TTCGTCTTGG GCGCGTCTCA CATGGTCAGG 780
CACACGAGGC GTGCGCACAC GGAGAGGACT ACGCGTGCAA AGGCGCGGAA GTGTTCTCCT 840
GCAGAGATGA AGGACGGCGT CTATGAGGCT AGCGCGCAGG tCACAACGGA AAGCTAAGTT 900
TGAGGGTGAC AATCTCGCAG GGTAGGATTG AAGCTGTCAC CGTCGTGGGG CACAGCGACG 960
ATGATCCTTA TGCCTCCTGG GCGGTAgAGG GTGTCTCGGC GGCAATTGTA GGGGCTCAGT 1020
CTACCGATGT CGATGTGGTG AGTGAGGCAA CTTCCACTAG CGAGGCAATA ATTCGGGCCG 1080
TGGAGAAAAT TCTCCAGCAA CCGCAACCGT AGATCTAAGA AGAGCGGCGG TGGGATGTGC 1140
TTGTTTGTCT TCGTTTTGCT GGGCGCGTGC GGGGGTCGGT GTGTCTGTGA GTAAAGGTTT 1200
AAGGGGTACG CTCATTGGTG TGGGCGTCTC ATTGGCTTTC CTGTTGGTGG TA ATCGCCT 1260
CTTCGCGGGG AGCTATAACC GTATGATCTC GCTTGATGAG GCTGTCAAAA GTGCCTGGAG 1320
TCAGGTAGAG GCGGTCCTGC AGAGGCGCTT AGATCTGATC CCCAACTTGG TCTCGACGGT 1380
CAAGGGTTAT GCAGAACATG AGTCGGATAC gCTGAAGCGT GTTGCGGAAn GCcGTTCGCg 1440
TGCCGGTGGC GTTATGCAGG TGGGTGAAAG CGCACTCTCT GATCCGGAAA AGTTTGCTCG 1500
GTTTCAGCGT GTCCAAGCAG AAATAGGCGG TGCACTCGGG CGTTTGTTGC TTGTCAGCGA 1560
GCAGTATCCT GCCTTGCGCG CAAACGAGAG TTTTCTTGCG TTGCAAAGTC AGCTTGAAGG 1620
AACTGAAAAT CGTATCTCTG TGGAGCGGAC GCGCTACAAT CGCGCCGTCC AGGAATACAA 1680
CGCCTATATC CGTTCTTTCC CACGCAGTGC GGTGGCGCGT TGGGGTAATT TCAGCTCCCG 1740
CCCGTACTTT ACGGCACATG AGGGGGCTGT TGcTGCGCCT CGCGTTCGGT TTTGAGTGTG 1800
TGCGTTGCCG CGCCAAGAGC GGTGTGGTTT TCTTTGCGTA GGATCGGTGA TCTTCTCTCA 1860
TGGCTCGTGG GGAGTAAGGG GGTTCGTGGG TCTTTTTGTG CCGATCCTGT CCCTCGTGTG 1920
GATGGCGCGC ACGGTGGCGA TGACCCGAGG CGGGAGGAGG ACGCGTCGGG CGCTTTTCAT 1980
ATTCCCCGTT TTGCTGCTCG GTTAGGACTG GACTCTGCAG GACTGCTGCG TATCCAACGT 2040
ACGGTGGCTC GTGTCGAAGG ACACACTGAC GGAGAGGTGG CGCTCGCTCT TATCAAGGAA 2100
AGTGATTCGT ACTCGGTCTA TGAGCTGTTT TTCGCACTCG TACTTGCTGG CGCCTGTTTT 2160
TGCGTTCAGC TGAGCATGGT GTCGGTGCTT GAACAAGCGT TTGCgCGTTT CTTTTGGGCG 2220
CCGCCATCGT GGTATATGCC gGCGTTTATG GTAGCGTCGA GTGTGGCAGC GTGTGTACTT 2280
TTTTTCTTTA TCGCAAACAT TCCGTGGGTG GATCGCGCGC TGGTACCGGG ACCTCTCAAG 2340
CAGCAAAAGA GCTATGCgCg cGCGGTGCGC CaCTTTGTGG AAAGTGGTGT GTGCAATACG 2400
CGCAATCGCA CGGGGATTCT CATCTTTATT TCTGTGTTGG AGCGGCGTGT GCTGGTGCTC 2460
GCAGACGTGG GGGTGAGCGC GTACGTGCCT GCCCgTGAAT GGACTGAACT GTGTCAAATA 2520
ATTACTGCGG GGTTGCGCTC GCGTCGTGCA GCGGACGCCT TATGTGAGGC ACTTACCCGT 2580
GTGGAACAGG TGCTCgctAC GCgGATGCCA CCTCAGaAAA AAAGTTCCAA CGAGTTACCG 2640
GATGGGTTAG TGATTTTGAG TCACTAGAAT TGAATCGGTG TTCTGCGTAt GTGATTCCAC 2700
TAGGTAAGCG GCATGGGCAT TCTGCTGTGG CCGTGTGTTC GCGGGGTGTT TCTGTGTTCA 2760
GAGGAGAGAA GGCGATGGAT GGGGTGCGGG GGCGCCTGAG TGCCCTGCGT TCGCTGATGC 2820
GTGCGCAAGG GGTAGATATT TGTTATATCT CCGGTGAGAA CGCGCATGGG CAGATAGATG 2880
GTGCGCGTGA ATATTTTTCT GGATTTACAG GCTCTGCAGG AATAGTGGTG GTGAcCcACA 2940
ACGTGyGTTT TgTGGACAGA TGGACGGTAT TTTATCCAGG CTGAGCGTGA gcTGTCGGCA 3000
TGTGAAGTGT GCCTTTTTCG TACAGGGCAG GCGGGCGTAC CTCGGGTAAC TGAGCTTTTG 3060
CGTACAGAAC TCCGTGCTTT TTCATCCGGG CCTGGGCACG GAGGGGGAAC GCTCGCAGTG 3120
GATGGCCGTA CGATTTCTGC GGCTGTATGG GAGCAGTTTC AGCAGGAGCT TGTGGATGTA 3180
TCGctTCGCC TAGACTTTGA TGGGgCsCTC CTGCTACCGC AGaGCATCGC tTTCCGTAtT 3240
CCTTCCCtGC GTTTTTGTTG GATGAGCGCT ACACGGGgTT GAgTGCGGCG CAAAAGCTCA 3300
CCCAACTGCG CGCAGCGCTC AgTGCACGCA GCTGTGATGC AACGGTGTTG TCCACATTGG 3360
ATGATGTGTG CTGGCTCACC AATGTGCGCG CACACGATGT GCCGTGTACA CCGCTGTTGG 3420
TGGCATACAT GGTCGTCACG CACACCCGTG CCTTTCTTTA TGTGGATATG CGCAAAATTT 3480
CTTCTGCATT GCATCAAGCT TTGTATGCGC AGGCGTTGAG TGTATGCCGT ACGATACTTT 3540
TTTTGATCAG GTGTGTGCGC GCTGTGGGAT CAGGAACCGA CGGTACATGC TGTGGGGAAA 3600
GGAAGAGCAG GAGTGCAGGA AGTGGCAGGC AGAACGCCAG TGcGTGTTGT TGGACTTTGA 3660
GCGTTCGTGT GCTGCACTGG TGGATCTCTT TCGTGCATCG CCGCAGtGTG TAGACCGCaG 3720
tGGAAcGATC CGCTCCTTCC GGTTCGTTAT CTTCTGTGCT GGGCATAGAA GAGGATGGTA 3780
CTGAGACCAG CAGAGGTGGG AAAAGTGCAT GTGCGTTGCA GTCTGCaCGC GACGCTCTGG 3840
CAGCAGGGAA AGAAAAGGAG AACAAAGAGG AGCGGGGACA GACCATGTGC TTCTCCGTTT 3900
GTCGTGGACT TTTGCCAACT GTAGCGCTCA AAGCATTAAA GAACGACACC GAGCGAGCCA 3960
ATGTGCACCA GGCAATGATA CAGGATGGGA TTGCGCTGGT AAAAACGCTG CAGTGGGTGT 4020
ACCAGCAGCT TGaCGTGGGT GCAGACGTTG ATGAATGCGC TGTAGCGGAG TTTGTACGTG 4080
CTGCCCGGGC GGTGTCTCCG TCTTTCATTG AAGAAAGCTT TCACACCATT GCAGGATACG 4140
GGGCGAACGC AGCAATGGTG CATTACCGCC CCGTGCGTTT TTCAGCTTTA CACCCTGCTG 4200
CGGGTCAAAc GGCAGCAcTg cTTCGCGCGC GTGGTTTTTT ATTATTGGAT TCTGGCGCGC 4260
ATTATCGTGA GGGTACCACC GATGTGACGC GCACGCTGGC TCTCGGTCCT TTGACAGATG 4320
TGCAGCGTGC AGACTACACG CTGGTATTGC AGGCGCACAG TGCgCTTGCC GTGCGcGCTT 4380
TCCTGCAGGG ACCAGTGGGG CGGTGCTCGA CGGAATTGCC CGGGCTCCGC TGTGGGCACA 4440
GGGACGAGAC TACCCACATG GGACGGGGCA TGGGGTGGGT TTTTGTCTTT CAGTGCATGA 4500
GGGTCCCTAT AGTATTTCTC CGAGTGCTCC CGGGAGAGGA GGAACTGCAC GAGGCATTGG 4560
GGCAGAGCAC ACGGGAGATC CTCCCTTTTT TTCTGAGGAG GCGGCGTGGC AGCTGCGCCC 4620
GGGTATGCTC CTTTCCAATG AGCCTGGGGT GTATGTGGCT GGCTCTCATG GCGTGCGCAT 4680
AGAAAATCTT ATGTGGGTGG TACAGGCGCA TGAGTCTGAC GCGCAgTGTG TGTGGAAGGA 4740
AGGAGGGGAG GGAAAGGAGG AGAACGCGGC GGCGCGTGAG TGTACGGGTG CAGATAGGAT 4800
GCAACCGTCA CGATGCCGAA GTTTCTATGG ATTTCAAACT GCAACGCTGT GTCCAATtGA 4860
CACGCGGCCG CTCGTGCGAG AACGATtGCA CGATGAAGAT ATTGCGTGGC TGAATGCCTA 4920
TCACTACGGG TGTATGTAAC CTCGCGCCGT TTTTAGAATc cGTACGCGCG CCTTTTTGCG 4980
CACGTGCTGT CGTGCGCTAT AGCGTTTTCC TGTTGATTGG TGTTATACGC AGATCTTGAT 5040
TTTTGATCAT AAAGGACATT AGCTTAGGCG GGTGCACTGC ATGTGGTGTC TGAAAAGATG 5100
TCCGCGGGCA AGAGGAGGGA ATTGTGATTA CGATTTTCGA AGCGCTTGAG CGTGTGCGCG 5160
TCATTCCGGT GGTGACGCTT GAGCGCGTGG AAGACGCAgT GCCGCTTGCA CGCGCCTTGA 5220
TAACAGGTGG TATCAGGTGT ATGGAAGTAA CATTTCGAAC GTTGGTTGCT GCGGAGGCGA 5280
TTGCGGCAAT CCGTCAGgAA TGTGCTGATG TGTTACTTGG TGCAGGAACC GTACTGACGG 5340
TAGAGCAGGC GCAgCAGGCG CAGGCAGCAG GTGCGCAgTT TGTGGTCAGT CCCGGTTTTA 5400
ATCCGCGGGT TGTTGCGCAC TGTTTGGGGC ATGGCGTTCC GATCATACCG GGGATAGCAT 5460
CTGCAACAGA AATTGAGCGT GCGCTTGAGT TTGGTATTTC gGTAGTAAAG TTTTTCCCCG 5520
CTGAGCTTTT GGGAGGTACG GCAATGATGA GTGCGCTCGC AgTCCCTACA CGGCGGTGCG 5580
TTTTGTGCCT ACGGGGGGAA TTCATCTTAA TAATCTTGCT GAGTATGTGG CGCATCCTCG 5640
GGTGCTCGCC TGTGGGGGCA GTTGGATGGT ACCGGCGCAG TCAATAGCGG CAGGAGATTT 5700
CTCGCAGgTT ACTGCACTTT CTCAGCAGAC GTTACAGATT GTCGGGGTAA TGTAGGGGGT 5760
GGTGGCTCAA ACGTTGTCTT TCTCAGAGAG GGCACGTATA CTGCCGGCGG CACTCAGGTG 5820
CGGGAGGTAA CTGGTGCGCA GAAACTCGAG GACGATGGCA AAGATAGTGG CGcTGCTTGC 5880
GCTTATTCTT TTGCTCATAT TGGCAGGTTT TATTTGGTTT GATTATTTAG GAGTCCTCGA 5940
TGCAAAGCGG GCGATTTCTC CGTTGTACCG TCTTTTTGGA CGTTCGGTGC CGGAGGGAGT 6000
TGTGTCGACT GCTGATCCGG ATTTGGATGC GGATCGTTAT GCCAAGCGTC TTGAGGCGCT 6060
CGGGGAGCGT GCAGAGGAAT TAGATAAAAA GGACGCTGAG CTACAGGAAA AGGAAAAGGA 6120
TCACGAAAGG GTTTCTCAGG AGTTGGATGA GCGTCTGCGC GCGCTTGAGG ATAAGGAGAA 6180
ATCCTACAAC TTGCTTGTTG CGGAGACAAA CGAGCGTCGC GGGAATGTGC GTAAGATTGC 6240
AGAATACGTC AGTGGTATGC CTCCGGAGAG TGCGGTAAAG ATTCTGCTGA AAACTGATGA 6300
TCAAGATGTG ATTGAAGTGT TTCGTATGGT GGATGCGGCC GCTCGGCAAA GGGGTGTTAA 6360
CTCTCTTGTG CCGTATTGGC TTTCTCTTAT GCCTCCTGAC AGGGCAGCTG AGATTCAGCG 6420
GAAGATGGCA AATAAACCTG CTGACTTTCC CTAGGCGGTT TATACCAACT TGATAAAGTG 6480
CTCTTATGTT TTTTTCTCTC CTTTCGCCGG CTGCTTTTCT GTTTATTTTT TTTATCCTGT 6540
ATTGGTACGT TTTCCGCACC GCGACGCAGC GGGTCGTTGT GCTGCTTGTA GCAAACTTTC 6600
TTGCTATTGC AGCTTTTGAT ATTCGCTTCT GCATTCCGTA TCTCGTTTTA AGTGCGTTAc 6660
CTACAGCTGT GGGTTGCTCA TACTCATGCA GAAAAGTTTC TTATGGAGAA AAGTCCTGCT 6720
CATTGCGGGT ACCTTGTTGC AGATACTTTT CTTTTGTCTT TTTAAACATT TCTCTGATAT 6780
GCTCTCGCTC GTGCGTGCAT TTGCTCCTGC ATATTTTGCG CAGCACACAT GGCACCAACA 6840
TGTAAAAGAC TGGAATATAT GGCACCCAGT GGGTATTTCG TACTGTACAT TCAAATGTAT 6900
GAGCTATGTG TTTGACGTGT ATCTGTGCAA GATACGCAGA AGAGAGCCgT TTGCACGTGT 6960
GCTTTTGTAT GTGTCTTTTT TTCCTCAAAT GATTTCAGGA CCTATTGCAA ACGCATCGCA 7020
TTTTTTTACA CGTCTGCCGC ACAATTTGCG CGCTGGTGAA AGCCCCTTAG ATCGTCCTAT 7080
CCACTTTGAT CGTGCGGTGG TATTACTGTA CACGGGGTTG GTCAAGAAAG TTATTTTTGC 7140
AGATTTTCTT TCTATACTTG TGACTGATAA AATTTTTACG CTTCCTTCCG CATACAGTAG 7200
CACCGAGTTG CTCTTTGGCC TCATCAGTTA CAGTGCGGTT TTATACTGCG ATTTTTCTGG 7260
GTACAGTGAC CTGGCAATTG CAGTTGGGTT GCTTTTTGGG TTTGAAACAC CGGCGAACTT 7320
CAAACGCCCT TACATATCTC AGTCAGTTAC TGAATTTTGG AGACGCTGGC ATATTTCCTT 7380
TTCTCAATGG TTGAAAGAGT ATTTGTATTT CTCACTTGGG GGTTCACGTT TTGGGATCAA 7440
AAGAACGGTG TGTGCACTTT TTTTTACCAT GCTGATCGCA GGTCTCTGGC ATGGCGTACG 7500
CTTGACGTTT CTGTTGTGGG GTATGGCGCA GGGTGTGGCT TTGGTAATTG AGCGGGTGTA 7560
TAGGGAAAAA AGACGGGTGA ACGGTGCGAA TGCCTTTGGA TCAAGTAGTG TGATGGGAAG 7620
ATGGAAAGCG CGTGCTATGC GGTGTATACG CGTCAGTGCA TTGTTTCTTT TTGTCaGTGT 7680
TGGATGGCTT ATTTTTCGCG CACCGTCTTT TGCAGAAGTG TGGCGGTACG TTACCTTGCT 7740
GTTCCGAGGA AGTTGGCATG GGCCATTCCA AGTTATCACG CCATTTACCG CGTTGCTCGC 7800
GCTGTGTGCA CTGTGTGTAC AACTCCCTTC AGATCGTACG CGTGCGCGCG CGTTTGCTTG 7860
CTACTGCGCA GTGCCCTTAC CCGTAAAGGC TTTGTGTGCG GCGTTCTTTT TCTTTGTACT 7920
GTCGGTTATG ACTCCATCAG GTATTGCGCC CTTTATTTAT TATAGTTTTT AGCGAAGGGG 7980
CTATGATGAC AACAGTGCGT GTTATATTGC AAAGGTGTGC ACGGGGAATA TGTAGTAATA 8040
AGGGTCGGTA CAGTGCGAAT CAGGTACTGC TTTTTTGCAT ACTTACGCTG AGCCTGTGGA 8100
CGCCATTTCT CGGTCCGGCG TTGCGCCATC CTGCCGTGTA CATACGGCAC AAGAGCGTGA 8160
AAAACATCTA CCTTGCATTC GTAGATCCGC TCATGCACAC TGCAGAACAG TGGGGAATAG 8220
ATACGGTATT TCCTCTTTTG CGCGAGAGTT TTTTGCATGC GACAGGTTTG ATACAGCATC 8280
CGGAGTGGGA AGATACGTTT TATCACTGCG AGCAACGCTC TTATGAGCCA GCTGCTGCCC 8340
TGGCGGAGTC TGTGCCTTCT CCTGTCTTAC AGGAGGCCGT GGCCGTTTyC CCTCCGGGGG 8400
TGGCGGTTAA CGATTCTGTT GCGGAAAGAA AAACAACTGC TCCCGCACGT GTCTTTTCGC 8460
GTACAGCGCT TCGTATTTTG ATGTGTGGAG ATTCCCAAAT GCGTTACCTT ACCGGCGGTG 8520
CGTTGCAGGT GCTAGGGACG TCTTCGCACG TGCAGATTCA AGAAGTGACG GTTAGTTCTT 8580
CTGGTTTTGT GCGGACCGAT TATTACCACT GGCCACGAAA ATTTCTCGCG CTCCTGGATA 8640
CGCACACCCA ACAAGAACCA TATGCAGCGG TAATTATGGC ATTTGGTATG AATGACTATC 8700
AAAATTTTTA TGATGCGGAT GGCTCTTTGT GTGTGACGAA AACTGCACGC TGGGAACGCG 8760
CGTATGAGCA AAAAATGCGC GCCTGTTTGA ATATTATTCT GCACACAGTA CCGAAGGTGT 8820
ATCTGTTGGG TATGCCAGAG ACACGTAACA AACAGTTGAA TGAGAAGCTT GTGTACATCG 8880
AGCACGTACA AAAGAAAGTA GTGGCGCAAT ACGATCCGcA GCGGGTGCGC TATTACTCCC 8940
TCAAACCAAT TGTACCCGGT GTACACGGAA CATATGCAAG CGCGATAAGG GACACGCACG 9000
GCCGTTGGGT ACACGTGATG CACAAAGACG GCATCCACTA CACCATAGAG GGTGGTGCGT 9060
ACGTTATGGA AACTCTCTTA CCCCTTATTC TTGCAGATTT GGAACGGTCT CGTCACGGAT 9120
ACATGCGTTC TTCTCTGGGG TCGCATGAAC TCCCTGCGAC GAAGGGGATG GAAAGAGCAC 9180
GTCACGCGTC AACTCGAACA TAGGGATAAA CCGCACCGTT GTATCCTGCA TGACAGGGGG 9240
TGCGTCCTGT CAGGGGGTTA TCGTACGTCC ACACCGTTGC CTGCACTTGC ATAGTGCTGT 9300
TTTGGTCACC TGAGAATGTT ACCGTAAAGG GGAGTGGTGG GCGCGCCTGC GATATGAAnC 9360
GTACC 9365
(2) INFORMATION FOR SEQ ID NO: 84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5019 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
AGGGTGTnCn CCGCACGGAA AATCTCnTCT ACAGCnTCCC GATTGGTGGG ATCCTGCTCA 60
ATGACGAGTA TATTGCTTTC GAGCTGCTCA ACCTGTTGtC CGCCTCGCTG AAAAAATCCT 120
TCAGCAACTC TTCGTTGCCC GCGTCGAGGT AATCGCTCAT GGGCACAATC TTCCCGCATT 180
CTCAGCCTTT GGTCAAGGCA ACGTCCAAAA GACCCCGCCA CCCGCTGCAT CCCTTGACAC 240
TTCCATCAGC CGAGAATAGA CTCCGCCATG GCCCGCGCGC AGGTACGTTT GTCAGGTCGG 300
TTTCTCGCCC TCGGTGGTTA CTCCTTGTAT GCCGCGGGAC TGGTCCTACT CCTCTACCGA 360
GCGGTGGGGG TCTACGTTTG GATCCACTTT TATCAAAACC AGACCCTCTC GCAGCTTTCT 420
ACCCACGCGC GGTCGTCTTG GGTATCGATT GGCGGACACG CaGCCTCTTC CTCGGCGCGG 480
TGCTCTTGAC CGTTCGGGAC GTGCGCACCT TCAGAGACGA CAGCACGATG TGGATAGCCC 540
CCGTCGTTCT CTGGTCCGCG GCGACCTACT CCAGCATGGG aGmCTGCGGC TTCTCTCTCA 600
GGGCCGTAGC CAGCGGCGCA TGCGGCGTGC TTCTCTTCGC CGTTTGGTCG CTCGCGTcAG 660
AGCTTGGGCG CCGCCGCCCT TAGCGGCGCG TTCGCCCGmG GCTGAGCCTT TTTCATAAGA 720
GGGTTGcTGC GTAGGCAGCG CGCGCAAATC CGCATGTTCA GCGCAGAGCC ATCTAcAACT 780
ACTcTCAtGC GAgCArGTTC GGTTTCCACA CCCGCTTGCC AGTGGTGCAT CGATTTACTC 840
ACGGCGCACC CGCTGATCGT CCCCTTTCCA CACAGTCCAC ATCTTCTTGC caTCGkwmAT 900
GTCTCCGCCG CCGATGATAA AGAAAAGGTC CGCCAGAGTA AAGGCGgCAC GCGCGCTTTA 960
TGGCATAATG CTCCCCTGTG CGGTGGTCAA CCAACGCAGA AGGAGCGCCC TCATGTGCGG 1020
AATTCTCTCC ATCGTCGTAA GTATCGCTGT TCTTGCACCG CTGGCAGTTG CAAACACCCT 1080
cTGtACGCAC TGTACCGCCC GGGATGCAGA TGGGTTAATC ACGCGGTATC CACCTGGGGT 1140
GCACGGACTA TCTTTGCCGT TCTACATACt ACGGAAGACT TAAGTTTTAC AGCGACTACA 1200
CCCACGCTGA GCACCTGCCC ACGCAgTACC TTGTAGTCTC AAATCATCAA AGTGTGCTAG 1260
ACACCCCTGc ACTCATGCGT TACTTTGGcT ATATtGATGC GCCGAGGTTG CGCTTCGTAG 1320
CAAAGCGAGA GCTCGCGCGC TTCGTCCCCC TTGTGTCCAC CATGCTCAAA AGCGGTCGGC 1380
ACTGCCTGGT AGACCGGCAT CAGTCAGGAG CACAGGCGAT GCAAGCGCTC GAAACGTTTA 1440
CACACCACGT CATGCGAGAg CGATGGCTCC CTGTGATCTT CCCTGAAGGG ACACGATCAA 1500
GAAACGGCGC CCTGCgCATC TTCCACGCCG CAGTTtCCGT GCGCTCACGG CACGGCTCCC 1560
ACTCCCCGTA GTTGTATGCG CGTTGGACGG AGGGTGGAAC cTACGCCCGT TGCTGAAAAT 1620
AGGACAGAAG CTTAAAGGAA GCGCTTACCG CGTCAAAGTG CTCGGCGTTT ACCCCGCACC 1680
CCACAACAAA GACGAACAGC TCTCCCTCCT ACGCGAGGCA AAAGCGCTCA TTCAAGCGCA 1740
gcTAGACGTC TGGCGCTCGG AGACCACCTG TCCTCAAGAC GAGTGCCTCA CGCGCGCCTC 1800
CAAAGACTCC TAGTTAAAGT CACGCGGCCG ACGCTCAACA CGCCTGCACT TAGGACACTC 1860
AGCGATGTTC TTCACCTTTC CGTTATCAAC AATAGACCTC AGCACCACCT CTCCCCCGCA 1920
CGGACAGTTA AAGCGCTGCG TCGCCGCCGC AGTACGCGAA TTTTTGCTGG CACCAGCCAT 1980
TTTCATACCT CCTAGTAGCT CGCTCGCCAG GCTGCATCAC ATTGTCCTTT TGGTCAAGCG 2040
TATGGCGAAT TCTAGGGGGA AGACACGACA CTCGCCCCGC CTACCGGCGg TACGTACTCT 2100
CTGCATCGCG ATATAGAACC GTGGAATGAG AAAAAATGGC CTCATATTGA CAGAGATTTT 2160
TCCTGTTCCC ATAATGGCAC CCTGGGTGGG GTAGGCGTGG CGCTCACTAA GTCGTTTTTG 2220
GAGTCAAGAT CAACCGGGGA GCTGTTTGCG CTTGCAGATG AGCTCGGTCT TTGCTTGCCT 2280
GAGGATCTTA ATCGAAGACT TGTCATTGGC GAGATCCTTG ATTGTTACCA CAGCGCTCTT 2340
GATTTGAACC CTCCGTGCGC TCCCCAGTCC CTTGAATCAA AGGGGACTTC GTGTGCCTAC 2400
AATACCACCG AAATCCATAT CCTTGCCCGT GACCCGCTTT GGTTCTTTGT CTTTTGGGAT 2460
ATCCACGAGC AACTCTTTTG CACACTCACC CAGAGTCCTC AATTTAGGTC GTTCTTTCTG 2520
CGCGTGCACT CCCTCGGTGG TCATGGCTGG CACACCTCGC TCGACCACTT CGATATTGAT 2580
GTACCCCTCA AAGACAGAAA GCGTTACGTG CACCTGTCCT TGGCCGACGA TGCTAATCGC 2640
ATAGATCTTT GCTGCAAAAT GCTCCAACGC GAACGCATCC TTGCTCAATC CAGAgTTGTC 2700
ACGCTCCAGC GCAgCkTCAT AGAACGGAGT CTTwACCCCg AGGATCCAAC CGGCGCAGAA 2760
GTTCTCAGCC TCTGTGGGCT TCCCCTGCTT GAGGAAACCT ACCCAAGCAC GTCTCTTCCT 2820
GTGTGCTCAT AAGTGGGACC TTCATGAAGA AAATGCCaGT GCACTCTCTT GCATTTGTTC 2880
TCGATTGTAA TCTTCCcTTC GTCCGAGGGG CCGGCGCATC TTCTCTCCTC GCTGAATCCC 2940
GTTTTTTTCT CGAGATTTCC TATACCTACC TCCCCCTACT CCGCTTATGC GAAACACTCG 3000
AACGTGAGCG TGTTCCTTTT AACATCTCCC TCGCTATCGG GCCCGTTCTG TGCGAAATGC 3060
TCGCTAACCG CGTGCTTATG GACCGATACC GGCGTGCACT CGACGCACTC ATCGAATTCG 3120
GAGAACGGGA AGCCATTCGC CTGAGGAACA GTCTCCAAGA GCGCGTGCAA GCTGAAGCAG 3180
TGCTTCGGTC TCTTCGCTCT CACCGGGATT ATTTTGaTCA CTGmGATGGG GCACTCCTTG 3240
AACGcATCAA TCACTTTTTC CGcACAGGTT CCATTGAATT ACTCGcAACA ACGGCAGTTA 3300
ATTGTTTCTT ACCCTTCTAC CAAGaCATGc CCGAATCTAT ATCCGcCCAA ATCGAAATGG 3360
GGCTTATTAA TTACCGcAAA CATTTTTCCT CAATTCCCCG CGGTTTTTAT TTACCTGAAC 3420
TTGGCTATGC ACCAGCGCTT GAGCGCACTA TAAAATCATA CGGATTCTCG TACACCATAT 3480
TGGAAACACA TAGTTTCCgT TTGGcACTCG CGTACCCCGA CGTGGcATCT TgAGCCkGCA 3540
CAGACGTCCA AgGCTTGTGG TGCTTAGGAA AGGAsCGTGT CGCCACkGCA GAAGTGCAtG 3600
GCGCCACgCA TTCCTTCTGT ACACAGGCGG TGTACGGAAA TACAGAACAA GATGCTGGGT 3660
TTATACTGCC TGAGGAGGCT CTGTACCCTT TATTTGAACC ACACAAGGAA CGCATGGCTA 3720
CTGGGTATCT GTACCAGGCC CGTTCAGGCA CGCCGTATGA GCAAGAAAAA GCGCAACGCA 3780
CGTGTGTCGC TGATGCGCGG GCGTTCGTGC GTAATCGGAC AGAAATATTT GAAAAAGTAG 3840
TCCACGCAAC CGCTCCCTTC GAGGCTATGT CGGTGTGCGT ATTCCCTGCA TCACTATTTG 3900
GAGTTGCATG GGCAGAGGGG ATGGATTGGC TTGAAGCCGT ATTTCGCACC GTTGCAGAAA 3960
GCGCGCAAAT GCGnGTgTCC TGTACGGCGG CGCTCACCTG CCCCGCAgTG GGGGTGTCAA 4020
TCATTGAACC CTTTTTTGGA TCTTCACTGG GGGGAGGTTA TGCGGATGAG CTTATTAATA 4080
GCGCAAACGA CTGGATGCAT CCTGCAATAC AAAAAACCAC AGAACGCATG ATCGACCTCA 4140
CAGAGCGCTT TGCACACGAC ACCGGCTTTC GCGAACGCCT ACTGAACATG GCAGCACGTG 4200
ATGTGCTTTT GTGTCAGTCG CTGTTCTGGC CCCTTTTAGG GAACCATTGT CGCTACCCCG 4260
AGTACGCCGC TAGCGAGTGC GCCGACCACC TCAAGGCCTT TACGAGGGTA TACGAGGCGC 4320
TCGGCTCCGG AGAGGTAAGT GCACAGTGGC TAATGCGGCG CGAACGCGAG CTACCACTGT 4380
TCTCTGAAAT TAACTTTCGC TTTTTCAGTA AAAAGAAATG ACACCTCACC AAAAGGCGAT 4440
CTCCCCAGAG TAATACGATC GCATCGACCC AGCGCTATCT ATGCACAGCG CGCCGTCCGC 4500
GGCCAATCCG ACAACACGAC CCAGAATCGG AGGACGTGTT CCCGCACATT CGCGGAAACA 4560
GACGTACTCA CCCTGCTTCC ACAGACACGA CTCAAGCACG CCGATACTCG GcGGCGCCAT 4620
GaCACAGGCA TACAGCCGAT CCAACAGCAC AGGAAGGAAA GCAAACGGAT CAGGGCACCG 4680
CTCGCCTCCG ACAATCTGCG CGAGAGAGCA GGCATGCGAC AACTCGGGCG GAAACTTCAC 4740
CTGcAGGAGG TTACACCCAA TACCCACGAG AArCGCTCCC GCGCGcAcCT GGCAGAGCAC 4800
CCCGGnGATC TTACGATCGC ACACCAGGAC GTCGTTGGGC CACTTAaTGc GCGGTGCGCA 4860
CACACCTCCG AGGAAGGCCA TGTATGCAAG AGCGACCGCA TATCCAACAC AGAGCGAAAA 4920
CGCAGGGAAA GCAACACGCC GCAAAACGAC AGTACACAAA AGATTCTTTC CCGGCTCCGA 4980
TTGCCACTTT CTCTGGTCAC CACGAnCACG TCCAGCAAA 5019
(2) INFORMATION FOR SEQ ID NO: 85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3843 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
GTGATCACTC AGGTACGCTA TCACGGTTCT TTAGATCCTA AGAACGTGGA AATTGACGAG 60
CACAGCTATG TGGGTACCCA CCAAGCGGGC AGTCGCGTGT TCTGGGCAGA GCGGCAAATT 120
CTTCTCTCCT CGGTTGACGC GCGCGCCTAC GTCCTGCAGC AGGATGCGCG TGTTACGGTC 180
GACGGGGTCA GCATCCCCTT GGTCGCAGGG GATAACGTGT ACGCTATTAT TTCTAAAATT 240
AATGATTCAG GCGCGGCAGT CCGTGCGTAC CTTGACCCGG TTACCAACGG TTTGAATATG 300
GAGACATCTG ACGCCAGGCA GTTGTGGTTG CAAGATGAGG ACGAGGCAAA CGTGTTTGCC 360
TCGTTGGGTT TGATTACCGA AGGcAGCGTC CGCCGTACAA TGTGTCAGCA GGTGCACGGG 420
TTTCAGGCGG GTCCATTTTT GACATGGCCA TTGCGGTGCG CAATGCCCTG CTTGCAGGTG 480
ATCAAGAGTC GCTCGGGGGT AAGATTTTAG GAAGTGCAGA TGGTGCAGTT GACAACCTTT 540
CAcTGCGTCT TGCCGAGACA GGAGCGCGGT ACgcACGTGC ACAGGCAACA CTTGCACGTT 600
TTAATAGCCA CATTCCGAAT GTGGTTGCAG CGGAGTCTCG TGAGTCTGAT ATTGATTTGA 660
CGCAGGCGAT TACGGATTTG AAAATGTTTG AGTATACGCA CCAAGCAACG CTGAGTACGG 720
TGGGCTCTTT GTATAAGCAC ACGCTCTTGG ATTATTTGCG GTAGGGGAAG GAGCCGCACG 780
CTATGGAGAT TCAGACGAAG ACGCTCGGTA CACAAACGGT TGAGGCACAC CAGATTATTA 840
CGTTGGAGCG TGGTCTCTAT GGTTTTGAGA AATATCATCG CTTTGCGTTA TTTGATGCAG 900
TGCAGGTTCC GTTTATTCAT ATGCAGTCCT TAGACGATCC GGCGTTGTCC TTCATTGCTA 960
TTGATCCGTT TTTGTTTCGT CCGGACTACG AATTGGACAT TGACGATGTA TTGTTGCAGC 1020
CGCTCGATAT TTCTTCTCCT ACCGACGTCT TGGTGTTCGC GTTGGTGACC ATTCCTCCCG 1080
ATGGATCTGC GGTGACTGCA AATTTGCAAG GTCCCTTGAT TGTGAACAAG AAAAACCGCA 1140
AGGCGATGCA AGTGGCGATG GGTGGCGATC GGTGGAGAAC GAAGCACGAT ATCGTCGCCG 1200
AAATGGCAGA AAGAAGGGCG CAGGAACAAT GTTGATCCTT TCGCGCAAGA CAAATCAGAA 1260
AATCTTTATT GGGGACTCGA TTGAACTGAC TATTATTGAG ATTCGCGGCG ATCAGGTAAA 1320
AGTCGGTGTG GAAGCGCCGC GTTCGGTGAA AATATTCCGA CAAGAGGTAT ACGAAGAGAT 1380
CCAGAGAGAG AACCGCGCTG CGTCCGACTC CCCCTGGTCT CCTAACTCAT TGCCTCAGTT 1440
GCCTGTGTAG TTGCAGAGGA TACCCATCCC TCGGGGTGGG AGTGTTTTGC GCGGATGACT 1500
GAGTTCACCT AAATCGCACC GCGTTCCAGT GCGCATTGTA GTGTTCGATT GCTGCCCCCA 1560
CATCTGTTTT TAGCGTTCCT CGCTCGAGGT CATGCTCGCT GTAGTATGGA GTCTTCTTCT 1620
GATATGCTGC CGCTTCCCGG TGGATAGTTG AAGGGAATCC AAAGGTGTCC AAAAACTCCG 1680
CGTAGTGTGC AGGCTCTAGG AAGAAGTTGA TAAACGCATG GGCAAGGTCG CGATTGCGTG 1740
CCCCCTTGGG AATGCAAAAG CTGTCTACGT ACACTGGGCT GGCGACATCT TGTGGTATAA 1800
AGAAGTCTAT ATGTTCGTGC ATTGCCTCAG GAGTCTCTGC AAAGAAGGCC TCTGCAAAGC 1860
CATGAGCTAC AACAAAGTCT CCCGATGCAA ATGACTTTGC GTATCCGTCC GAATCAAACT 1920
TTACCAAGTT TGGTTTCCAG TGGTCGGTGA CAAGTATTGC TGCCTGCGCA AgCTCTTGTT 1980
CGTTTTTTGT GTTTACGTTG TAGCCAAGTG AAGCAAGTGC AGCACCCATT AcTTCGCGCA 2040
TATCGTCCAT CATGCTCATA CGATACGCCA GGTCTTTGCG TGAGAAGATA GACCACGTGC 2100
GCGCGTATGA CGGAACTGCT TTTTTGTTTA CCGCAATGCC TGCCGCTCCA AGATAATACG 2160
GCACCGAATA TTCCATTTTT GGATCGTAGG CTATGCGAGC ACGGACACTC TCTTTGATAA 2220
ACTGTACGTT GGGAATCTTG GATAGGTCAA TTTTTTCCAA CAGATGCTTG CGTTTCATGA 2280
TGCTGACAAA GTCACCCGAA GnGCCACTAA ATCATAACCA CTTGCACCAA TGCTCAGTTT 2340
TGCAAACATA TCTTCATTTG AAGCGTAATC ATCATAGACT ACCTGCACGT TATACTGTTG 2400
TTCAAACTTT TTAATGAGGG ACGTCGGGGT GTAGTACGTC CAGTTATACA GGTACAGGAC 2460
ATCCTGTCGT GTCTGCAGGC ATGATCCCAT CCAGAGGGAG AGAAAAAGGA GAGAArGAAT 2520
GCgCGAACTG CTCACACAAA AACGTTTCAT GCTTTGCTCC TTACGAGATT GTTTGCAGAA 2580
AAGTAGTTAC TTnGAATGGA CTATTGTTTT TAnAGAATTG CGCAGGAGGT AGGCGACCCC 2640
TACAATCCCT GCCATCATGA TCAGGGAAAG GGCATTGATG ATaGGAGAGA CCCCATAGCG 2700
GATCATTGAA AACACATACA GGGGGAGTGT GGTGGAGCCC GGTCCTGCAA CGAAAAAGGT 2760
GATGACAAAA TCTTCCAGAG AAAGGGTTAC TGAAAGTAAA AAGCCAGACA GTATGCCTGG 2820
CATGATGGCA GGGATCACGA TTTTTCCTAG CGCTTGCCAC TCGTTTGCAC CTAAGTCTTG 2880
CGCCGCCTCT ATGAGAGAGA GGTCAAAGGT GTCGATGCGA GTAAGGATGA GCAGGAGCAC 2940
GAAGGGCAGA CAAAAGGTGA TATGAGCGGT GATGnATGTT GCGCGCCCCA GCGGCAGGCG 3000
TACTAGGGAG AAAAAAACGA GCATTGCCAT ACCTGTGATA ACCTCAGGGA GCAGCATGGG 3060
CAGAAGGCTC ATTACCTGCG CATATAGCCG GCCTGAGAAA CGATACCAAC GAATTGCGAT 3120
GGCAGCGGCA GTCCCCACAA TTGTTGCTAC AAGTGCAGAA ACAGATGCTA TAAGCACGCT 3180
ATTAAGAAAG GAAGACCACA GTTTTTCTGA ATAAAAGAAT AGCTCTGTGT ACCAGCGCAG 3240
CGAGAAACCG GTCCAGATAA GGGATTTATC CTTGTTGAAG GAAAAAAGCG CAATAACTGC 3300
AAGCGGCAGA AATAGGAACG AGACAACTGC CGCCAGCAGC ACAGCAGAGA AGGAACACCG 3360
TAGGGGAgTG CGTGCGCGCG TGTGGATCAG ACGGCAATTA GGCATATGAG CGGCGTCCTC 3420
CGCAAGGGGG AGGTGTTGTA TTGCAGGCAC CGCTTTCTGC AAGTATCCAG AGTACTCCTA 3480
CCCCCCCTGC GAGGGTAACG AGCATCGCGA AGGCCGAAGC GAGTGGCCAA TTTCCCACGA 3540
TACGTACCTG GTCCACAATT GCGTTTCCGA TAAGGTAGGA ATCCTTTCCT CCCACCAGGA 3600
GGGGGACGGT GTAGGAACCG AAGACAGGAA TGAAGGTAAA GAACACGGCG GTGGCAATGC 3660
CGGTTTTGAT GTTGGGAAGC AACACGCGGA TAATGGCACC CGTGGGGGTA GAGCCTAGAT 3720
CGCATGCAGA TTCAAGGAGG GAGAAATCGA AGCGATCGAT AGCGGCGAAA ATAGGAAAGA 3780
TAGCGTAGGG CAGGAACATA TAGGTGAGCA CCACAATGAC TGCCCCGTTA TGGTACAGGA 3840
GCG 3843
(2) INFORMATION FOR SEQ ID NO: 86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2072 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
CnTCATCCGG CCCCAGAGGG AAAAAACGCG CGCCGCGTTT CCCGTTGTTT CAGGCCCACC 60
AGCnCGGTGC TCCACGTTCC AGTGTACCCG GTAGTGCGCA CGTACTATGC GCGCATGAGA 120
GATTGGGTGA TCCGTGCTGT GCGCGCTCAT CTTTTTTGCC AGAAGGAGTG CGCATGGCCG 180
TGCACACCCG CCGGCACAGC ACCACGCAGG CGGAAAGCTC ACTGCAGCAG ATAAgGGGGG 240
AGCATGTTTG AGCAACTGAG CGCGAgcTTT AAACGTATCG TAGGAGCGCT GGGAGGACGC 300
GCAACTATCA GCGAACGGAA TATTCAAGAA GCAGTAGAAT CAATTAAGCG CGCACTGTTA 360
GACGCCGACG TGCACGTTCG CGTAGTGCGT CGCTTTGTGA ATCAGACCAT ACAACACGCG 420
CAGGaCAGAC GGTGCTTGCG TCGGTTAGCC CTGCGCAgcA ATTCATAAAA ATTGTACACG 480
AAAGACTAAC TGCCTTCCTC GGTGAACATA CGCGGTCGCT GCATCTTAAG GGGCCCGATA 540
CGCAnTCGAT TATTCTCTTG CTTGGGCTCC AAGGATCGGG GAAAACTACC AGCGCAgCAA 600
AGCTTGCTGC GTACCTGAAG GATGCAGGTC GCTCCCCTCT CCTGGCCGCT TGCGATCACG 660
TTCGTGCCGC AGCGAGTGCT CAGCTGGCCG TTCTCGGCAC GCACATTGGC GTTCCCGTGT 720
ACCAGCATGC GcTGCCGCAC GAACAGCAGC CGTGTGCTCT TGATACgGcG CGCGGTGCGc 780
yTTCAGTACG CGCGCTCACA CGGCAATGAC GTACTTATCA TTGACACTGC TGGCCGTCTC 840
CACGTGGATG CCGCGCTCAT GCAGGAGTTA ATCCTTCTCA AAGAAACACT GGTTCCTGTG 900
GAAACACTCC TTGTTGCAGA CGCTCTAACC GGTCAGACTG TGGTGCGCAT TGCAGAAGAG 960
TTCCATGCCG CGGTGGGTAT TTCAGGCGTT GTGCTCAGTA AGTTTGATTC AGACACCCGC 1020
GGCGGAGCTG CACTGTCTTT GAAAAGTATT ACCGGTCAGC CACTGCTGTT TGTTGGAACC 1080
GGTGAACGAC CGCAGnACTT TGAACCGTTC CATCCCGAGC GAGCCGCCGG AAGAATTCTG 1140
GGTATGGGGG ACATCGTTTC TCTCGTGGAA AAGGCGCAAA AAGCCTTTGA TGCACAGGAA 1200
CATGCGCGTG CGCAGAAGAA AACGCAATCG CACGAGCGTT TCACGCTCAG CGATATGCTC 1260
GACCACCTGC AAACTATAGA AAAAATGGGA CCGCTGCACT CGTTGGTGGA GATGATTCCC 1320
GGTTTAGCGG TAGCCGTTTC TGCCGATGCT CTTGACGCGC GCGCGTTCAA GCGTCAAAAG 1380
GCGATTATTC AATCGATGAC CGTGCAAGAG CGTGACAATT TTCTCATTAT CGGCCCCTCA 1440
AGGAGGCGGC GCATCGCGGC AGGGTCAGGC ACTTCGGTGG CTGATGTTAA CCGTTTAATT 1500
AAGAATTTCG AGAGGATGCG CACGCTCATG CGCAAGACTG CATGGCAGTC ACGCCGGGCA 1560
CACCCTAAAG GAGATACACC CTATGGATGG CCACATCGCT AGATACGCCG CCTGCGCCCT 1620
TTTCGCACTC AGTGCGCTGT GCsCCCCgcT TCGCGCGCAG AAGACGCGGA CCACTCCTGG 1680
TCCCCCGTAC ATACCCCACC CTACGATCCT ACACTTTTCC AGTCGGACGC GCAGcGTGCC 1740
GCCTTCCACA CCTTGGCGGC GGAACACCTT TCCTTCCTTA CCGGTCACAT GTGCTTCTTC 1800
CGTCCTATCC CTACCCGCGA TCCTTTCCTC ACCCGTGCCT ACGAAATCTC CCCACATCCC 1860
CGCACACAGA AACCCACCGT GCTGCTCGCC TTTGACTCGG ATATCATCTA CCTTCTTTTC 1920
TACGATCACC GACCAACAGA TTTCCCCGCC CTCCGCTTCT TTCAAAACGC ACCTACTTTC 1980
CAAGAACTTC CGAGCACCTT CTACCCCTAC ATTGCCATGC ACAGCGACGC CGTTCTCGTG 2040
CGACATCCAA CGCCGCGCCC CCCTACCCTT CC 2072
(2) INFORMATION FOR SEQ ID NO: 87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
ATGnTATGAT CACATTCATC ACATCTCCTA CATTTTCTTG CGCACACGAA AGCTCCACGC 60
TCATGATAGG CTCAAGCACG TATGGCTCAG CGGCTACACA CGCCTCACCA AATGCTTGCA 120
CCGCTGCAGC TTCAAATATA AAAGGCGAAG ACGTCAGTTC CTGATACTCC ACCGACAGGA 180
GATGCACCCC CACGTCTACA CACGGATACC CACTCTGAAT ACCTCCATCC CACGCGCCGc 240
GGATAGCATG CTCTACCGCT CCGATAATTT CTGCCGGAGC AGTATGCGCC GTGCACACTG 300
TGCCCCGCAA CTCTTTCACC TGACAAAAAA ACTCATTTCC CGCACCAcgC TTCCGCGCCT 360
CCACACGCAA GgTTAACCCT GCCATATATT CTTTGCCACC GATTACTCGC TGTACTCGCA 420
GCGTACGCTC AACTGTTTTA CGAATAGATT CACGGTACGT CACGTGCGGC TTACCAACAC 480
GTACCTGCAC ATTAAAATCC TCGCGCATCC GGGTGGTGAG CACATCAAGA TGCAACTCCC 540
CCATACCGGA AATTAAAAGC TGCCCTGTCT CTGCATCTTC CCGAACCGAG AAAGTTGGAT 600
CCTCTCGCGA AAGAATACCA AGTGTCTCCT GTAACTTATC GCGCGATGAT GCGTCCATTG 660
GCTCAAGCGA CACAGAAATA ACCGGTTCAG GAAAATGCAT AGACTCCAGC ACTACCGGAC 720
ACGAACCATC CCCCACGCTG TCCCCTGTTT GTGCAGATTT TAGTCCCACA ATTACCGCAA 780
TATCCCCCGC CTGTATGCAT TCAACGGTTT CAGACTTATT CGAATGCATA CGCAAAATGC 840
GATACACCCG TTCACGTTTC TTTTTCCCAA TGTTAACGAT ACTGTCCCCC GTACGCAGTT 900
TTCCCGAATA CATGCGCACG TAGCAGAGTA AACCcGCTTC ACGTTCGTAC TGAATTTTAA 960
ATACAAGTGC CAGCAACGGT CCTTCAGCAG TAGGAGCGAT AAAAACAGGC TCCTTTTTCT 1020
GTACGTGAAA ACCTTCTACT GCTTTacGCT cgCGGCGCAG GCAAtACTCT AmCACTGCAT 1080
CGAGCaGTGG TtGCACACCC AAGTTATGAC GAGAAGAACC GCACAAAAAA GGAACATATC 1140
GwCCGtCGCG CACAGCCTTT CTAATTTCTG msTGCAgTAA CTGCACTGGA ACGTGCTCCC 1200
CTGCAAGCAC ACACTCGGTT ACCTCATCCG AATATATGGA AATGACATCA AGCATTTTTT 1260
CTcGCGCttc gGcCTGGGCA ATACGTGCGC TTTGAATAGG CCGGTACTCC ATCTGTTCGC 1320
CACTACTTGC CGCATCCCAG AAAATCTCTT TCATGGTGAT CAAATCAATT ACCCCCTCAA 1380
AAGAGGTACC AGAACCAATG GGTATCTGCA ACGCGACTGC ATCTATACCG AATTTATTGT 1440
GGACTTGGTC CAATACTGAG AAAAAGTCAG CACCGATCCG ATCCATTTTG TTAACAAAAC 1500
AAACACGCGG GATATCATAA CGATCTGCCT GGATACCATA CGGTTTCTGT CTGTGGGCTG 1560
TACTCTTCCT ACCGGCACAC AATACCACCA CTACCCCATC TAACACGGGG CAACGCAnTT 1620
CGACTTCTGC AGTAAAATCT ACATGtCCCG GcGTAcAATA AtGGTAATGT CTACTTCACG 1680
CCACCGCACC GTCGTTGCAG CACTCTGAAT GGTGATACCG CGTTCCTGTT CCTGTACCAT 1740
CCAATCCATC GTTGTCGCAC CATCATCAAT TTCCCCCATG CGGTGGATCT TGCCCGTGTA 1800
AAAAAGCATA CGTTCAGTGG TAGTAGTCTT ACCAGCGTCA ACATGTGCCA TAATGCCAAT 1860
ATTCCTCATC TGCTGTTGTC TCATATCTTC CTCTTTATCA CGTACCTGTA CGGATAAAAA 1920
ACAGTGCACA GGACTACTCC CCACCATCAT AGCAGCGGGA AAGAGAATAC CTAAAACGTA 1980
AAACAGTGCA CGCTACACCG CACGCGCATT CAGTGACAAA TCACAAATCG TCTGTGGAAA 2040
CAAAATCTGc AATTAAATGG GGTGGTACCG AATCTTTCAG TTTCCTTACA ATAGACTGCA 2100
GCACACGCAC ATCTTCGGCT TCACTTAACA GAGACTTTTT TTGACGCAAA AATTCAACCG 2160
CCTGAATATA TTTTTTCGCG TGCACCAATA CGTTTGCATA CTCAAGAATA CGTTGAGTAT 2220
TTTTCTGATC ACGCTCATAC AACTGTTGAT AAAGAAGTAG CGCGCGAGCT GTTTCACCAA 2280
AAGAAAGCaA cCATAnCGTA CGCGGCAGCA ATGGTTGCAT TGGCAGAATC CGCATCGTAT 2340
AACGGTTGCA GTGCATGAAC AGCaGTTTTC CAATCATTCT GCAATCCACA CACACGCGCA 2400
AAATTATACT GCGCTGCATG ATAGTACGCA GGGTCACGCG CTGCACGCAT ATAATAATCA 2460
CGCGCCTCGT CATAACGATG CAGCTGCACA TACGCATGGG CAAGATCAAA ATACTCTGGC 2520
GCAAGAGACG ACTCAGGCTT TCCGCATGCA AATAGTCCTG TATAAAACAG GTGAAACAAA 2580
AAAAAGCAAA AACACTGCTT ACCGTTCATC AGTCAGCGAG AACCTAATAC CGTTTCCTCT 2640
ATCTTGCGGA GCTGGGACCG GTACAGGCCA GAAATATTTG AACCATAGTC TGCCAAAAGC 2700
TGGATAATAT TCCGCGGATA GTCTTCAACA TCAACCCCGT TAACTCGATC ACCAGCATCG 2760
CCAAGACCAG GCATAATATA CGCACGCGCG TTGAGTACGG GATCCATCCA CAGCGTATAC 2820
ACCGTGCAAT TCTCTAGGGA ACGCACTACA CGAATCGCAC CTTTCAGTGC AGAAATCATG 2880
TGAAAACAGC TGATAGATTT TGGTTTCACA CCGAGATCTT GCAAATAACG CACTATGGTA 2940
ACCAAACTAC CACCGGTGGC GTTCATGGGA TCGGCGAAAA CCAGATCCTT ACCATCCAAC 3000
TCTCGTGCAG AAAAGTATGA TTTATCCAGA TCAAACACAT ACTGcATATC GCGCTCATCA 3060
CGGAGaTCAT CTCGCTTGaT TTTAAAAAGC GCAAACGGcG TTACGTACCC ATGcGAAGAA 3120
TATTCTTCTA TCTCCTTAGA AACAATCATC GAaGGTAACA GCGCTCCTCG TAACATGACA 3180
CACATCACCG TGTTCTCAAT TTTATAATCC ACGTTTGCAA TTTTATGTAC TGCATAGTTT 3240
TGTACAGGAA AAGCAACCGG TGTTTTTGTA ATAAGATATG TTTTATGC 3288
(2) INFORMATION FOR SEQ ID NO: 88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4238 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
AGCCGTTGCG TCCGnTGGGA GTGGATTCAG ACTGTGGCGn CAGTGGAGGT GGTGGAGCAC 60
GGGGAGCATC CGTGGTTTTT CGGCGTGCCA GTTCCACCCG GAATTTTGTT CTCGACCGAA 120
CCGGGCTCAC CCTTTGTTCC GGGCGCTGGT GGCGGCGGGC TTGGAGCGAA AAGACAGTCG 180
TTCTTGACGC GCGCAGGTAT TTTTTCTAGA CTCCTCTGTC CTTCTCAGCC GGTTGAAAGA 240
GCTTTGTGTG GGCGTGCGGG TTTTGTTTGC GCGCGCTCCG ATGCGGTTGC AGGTTTGAGG 300
GAGTGGATAT CCGTTTTGTG GGGAGGGATG CGGCATGCAG GGAGTTATGC GATGGCAGTG 360
AAGCGGGCAC GGGGAAAGAT TGTACGTCGG CTGGGGATTA ACATTTTTGG GAATCCAAAG 420
TACACGCGGC TTTTGGGCAA GAAGCCCGCG CCTCCAGGCA AGGAGCATGG GGTGAAGCAG 480
CGGGCAAAGG TGTCTGTGTA CGGGGAGCAG CTAAAGGAAA AGCAGAAGTT TCGCTTTGcg 540
TATGGAATGT CCGAGCGACA ATTCCGTAAT CTTTTTGCTC AGGCACATCG GATGAAGGGC 600
GTGACGGGTA ACAATATGcT GTCGCTAATG GAGCGGCGGC TGGATAACAC GGTGTTCAGG 660
ATGGGCTTTG CGATCAGTCG TGTGCAGGCG CGGCAGATGG TGTCACATCG TTACTTCCTT 720
ATCAATGGGA AGACGGCCAA TATCCCTTcC ATGCGCATTA GCGCGCATGA TGTCATCACT 780
ACTAAGAACC GGAAAGGTAT TCATAGCATC ATTCGTCACA ACCTGACCCT TTCTCAGGGG 840
CAGCGCGGTT CCTGGCTAAA CGTGGATGAG GAGCAGCTTT CGGCAACTGT CTCTGAGCTG 900
CCGCGTGCGC AGGATATCCA TCCGGTGGGG AATATCCAAC ATATCGTGGA GTACTACTCG 960
CGGTAGGATC CTTTGCCACT TTAGCTGGCG TTGCTCAATT ATCTCCCAAG TCTTACCAGG 1020
GACTTTGGGG CGTGGAAGGA TGGCGGCGGT GTATGGATGA TGGTCCCTTG AGGGTTGTTG 1080
TGCTTACCTC ATTTGTCATA CTCGTAGTAG TCTGTGCCGT TGCGCTGTGT ACTTTTTTTG 1140
TGTTCCTCAA AAGCCCTGAT CAGGTGATGA CTCCCCATAT CGTGGGCAAG GACTTTGTGT 1200
CTGCTGCTAT AGAGATGCAG GCAAAGGAGC TGTATCCCCG CGTTCAGTTG CGGTTTTCTA 1260
CCCGTGAGAA GCCTGGTGTT GTTCTTGAAC AGAACCCACC TGCGGGGGCC ATCGTCAAGG 1320
CTGGGCGCTA CGTGGACCTC GTAGTGAGCC AACAAGCAGT GACTACGCAC GTTGAGGACT 1380
ATCGGGGATT GCAGGTTGAA GAAGCGGTGG CGCGCATCGC TGCTGCTGAA GTTGAGCGCC 1440
GCATCTCAGT GAAAACACCC CACTTATATC GGTTCAGCAC TGGCGCAgCT GGcACCATTT 1500
TGGAGCAGGA CCCTGCTCCT GGCGCGGTTC TGTCTGCGGA TGTAGAGTTG CGTTTTGTCG 1560
TCAGTAAGGG GTCTGAGCGC GAGCAGACTA CAGTCCCCCT ATTGGTAGGA TATAGTTTGC 1620
CTGAGCTGTA CCGTGTTATG GCGCAGACGG CGCTCACCTT GCAGTTTACC GTATCTCCCC 1680
CGTCTCCTTC TGGGGAGAGA AAAGACGGAG AAGCACGTGG AAGAACGCGT GCCAATGCGC 1740
AGGACTACGC GCGGGTTTCA GCACAGGATC ATGACCCTGG TTCGCGCGTT GAAGCCTTTC 1800
GCGCCATGCA GGTGCAGGTG CTCTTTCCAG AGCGTGGAGA GGCTCACGAA ATATACGGTA 1860
TCTTAGCTCT CGATCTGCCG CGTTATCCGT ATCCTATGTC CTGTGTGTTG GATGTACAGT 1920
ATCCAGGGGG GGTGCGTACC GCGCTTGCAA TGTTTCAGCA TCCGGGGGGA CGTTTCACCA 1980
TCCCCTATGG ATTGCCTGCA GGGGCGACGC TCTTCCTAAC GGTGGGGGGG AAGGAATTGT 2040
TTTCTGGAGA GGTGGGTGCA TTGCCTCATG CAGGTTCCTA GCAGACGTGA TGGAGCACTG 2100
CGGGTGCAAA GGTGCGATGG CTGCGTGGTC TGGCGCAgcG TGTGTTGTGC ACTGCTGGTG 2160
GCGCTTTTGT GTCTTGCCGT CGGCTGCGAT TCCCCTGATT TGCTCGTAGA TAGCGATCTG 2220
TCTCTTTCGC GCGTGCGCgT GGCAAAAACG CTGGTTATGG GAGTGAGCGA TCGTACGCCG 2280
CCGATGTGTT TTCGCTATCC GAATGGGGAG AtTGTTGGTT TTGATGTTGA TCTTGCGCGC 2340
GCGGTCTGTC GTGTATTGGG GGTACGCCTT ATCATTCGTC CCATAAAGTG GACGCTGAAA 2400
AGGAATgcGC TGCGCTGTGG TCTTGTCGAT TGCGTTTGGa CGGCGTTTGC CGTAaCGcTC 2460
GGCGCCGCAC TGAGTTTTTA CTTTCCGAGC CATATCTGCG TACTGCGCAG GTACTCCTTG 2520
TGCGTGAgGG CAGGTTGCAT CCGGATTTGG CACACGTGGA ACGGGAATTG GGGCAGCGTA 2580
TGACAGGTGT TTCGGCTGTG CATACGCGCG GTGATATTTT GCCTATGCGC TCGTCGCATA 2640
GACAGGCTCG CATCGCCGTG TTGCGCGGTG GTcCGGTACC GGGAAATGAG AAGTGGCAGT 2700
TTGGATTTGA ACCACACGGG AAGGTTGTGT GGTACCGACA CCGGAGTGCC ATGCTTGrAG 2760
GCnTGCGCAC CCGGGCGGTG GACGCGGCAC TTGTGGATCT GGTTGAGGCT CATGACGCAG 2820
TGCATCGTCA GGGTGCGCCT CTGAGGGTGA TGCGGGTACC GCTTGGGTTG AGCCAGTATG 2880
CGGTTGCATT TCGGCGTGAG GATCGTGCGT TGCGTGACGA AATTCAGCGA ATCTTGTATC 2940
GTATTGCTGC CTCCGGTGAG GCATACCGTA TTGCAGAAAA ATGGTTTGGT GTTGGTCAGT 3000
CGGTTATTGG GATAGAATAA AGGTGCAAGG CAGCGGTGCG TTTTTTGCAC TGCGCTTTTT 3060
TTGCGTGTGT GCAGGGGTGG GGGGGATGCT CTGGTCGTGT ACTCCTCGTG CAAGGGTGTT 3120
TCACGCGCAG GATGCGTCGT TCGATGAGGC GCGCGTGCGG GGTACACTTG TGGTGGGCGT 3180
CGGTCGGGGC TTGGCACCCT TGGTGGATGC TGCCACTTTC TCTGCCTTCT CTCTTCCTTC 3240
TTCGGTTGTT CCTCCTCCTG CGCGGTGTTC GTTGCTTTTG CAGGAGGCGC GCGGCTACGA 3300
TGTTGAGCTG TTAGCTCAAG TGGCACGTCG TCTCCATATG GACGTGCAGG TGAAAGTCGT 3360
TCATTGGGAT GAAAAGGAGC GCGCCCTCCA TGCGGGGGTA ATTGACTGTA TCGCAGACGG 3420
ATTCACCTAT ACTGCAGATC ACGCGCGCAG ATTTGCACTG ACGCAGCCGT ACGTACGCGA 3480
TGTGCGCGTC TTTGCGGTGT TGCGCCAAGC CCCGTACGCA ACGGTTGCAG ACCTGCATGG 3540
AAAGCGGCTC GGGGTCCACG CAtgACCGAT GTGGAAGAAA ATGATGCATA CCACGCGTTG 3600
TTTGGGCAGG TGAAAACGTA TGCCCACTAT GTTCAGGCAC TCACTGCTTT GTCGCGAGCG 3660
GAAGTAGATG TGGTGGCGCT GAATTTGGTG ACGCTCTGCG CAGTGACGCC GCACCTGCGG 3720
gCTTGTATCG AATTTTGGAT GAACCGATAG ACACGTGTGA ATACGTGTTT GCGTTTCGTG 3780
CGGATGCGCG TGCTTTGCGC GACATGGTTG TGCGCACTCT GTCGCAGCTG CAGCGAGAGG 3840
GTTTTGTGTC AGCGCTCTCA AAGCGGTGGT TTGGCAGCGA TATGTCCATC ATCGACCGCT 3900
AAGGCGGGTG GAGGGGGAAT ATCGGTGGAT CCGTTGAATG CCGTTATTGT GGAGGGAAAT 3960
GTCGTTCCAT CTGCTTCCGC GCGCGTGCCG GAGGgGCCGT GTGTGCGTTT TGCATTCAAA 4020
CGCAACGGCG CGTGCAAGGG GAGGGGAGGG TGCACACAGA GGTTTCGTAT TTTGAAGTTG 4080
AGGCATGGGA TGCACTTGCG CGCGTGTGTG CGCAACAGGT GCGGCCAGGA GTGGGGTTGC 4140
GGGTGGTCGG CCGTCTCAAG CArGATCGTT GGCAGCAGGA GGACGGGGTG CGAGTGCAGC 4200
GGGTAAAGAT TGTCGCTGAG CATGTAGAGT TTCAGACT 4238
(2) INFORMATION FOR SEQ ID NO: 89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12411 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
TTCCGTTTCG GGATTGAAGA TTTCAATAGG TTTTCCTGAA TCGTTACCGG ACCACAGTGC 60
CTTTTCCTTG CGTTGTG AT TGTGTGCGAT AACGCCATTT GCGGCAGCCA AAATAGTTCC 120
GGTGGAACGG TAATTTTGCT CTAAGGTAAT TTCTGTGGCG TAGGGGAAGT CTTTTTCAAA 180
AGAGAGAATA TTGTCGTGGT GTGCTCCACG CCAAGAATAA ATTGATTGAT CATCGTCCCC 240
TACTACACAG ATATTTTGTG TAGCGAGCAT TTTCATTAAA CGGTACTGCT GTGCACTGGT 300
GTCTtGGAAT TCATCCACTA AAATGTAATG ATAGCGACTC TTATATGAAG CAAGGATATC 360
AGGATATTCA CTGAAGATCT GGATtGGCAG TACGATCAAA TCGTCAAAGT CTACCGCATT 420
AAATAATTTC AACGCAgTAT GATATTCGTG CCAAAGAGCA CGTTCCTTGT GCTGTAGCTC 480
TTTTAAGTTC TTACGCTGCA TTTTAATAGC GGAAAAGAGC GTgcTCACGC AGTTTGTGTC 540
AAGAACTTCT GGCAGGAGAT GTACTTCCTT TGcTGcTTCG CGAATGAGAG CACGTTTATC 600
ATTTTCATCA TAAATGCTAA AGTTCTTCCT CCAGCCGAGA ACATGGATAT GTTCTCGCAA 660
AATCGTAACG CCAAAAGCGT GAAAAGTACT GACGGTCGTG TTGCGTAAAG GTTTGCCCGT 720
GAGAGCTTTA ATACGTTCAG ACATCTCGTG TGCGGCCTTG TTAGTGAAGG TCAGTGCAAG 780
AATGCGCGAT TGCAGAATAC CGCATTCGAG CATATACGCG ATGCGCGCCG TGATTACGCG 840
CGTTTTCCCT GAGCCTGCGC CGGCGATAAT GAGAAGCGGT CCCTCaAGTG TGGTAACGGc 900
TCGACGTTGC TCAGGATTAA GTGTAGAGAG CATAGACGCA CAAGCGTAgm AnAATAGGGC 960
GGAAAAAAGA ATATCCGTAT GTGGAAGGGc ATAGACTGTG GGTAATACTT CAGATTGGGT 1020
AGAGATGGAT GGATTGCGCT ATGTGTATGC CGCGCAnGGT GCGGCCCCCA TGCCTGCTCC 1080
TACGGATAAT CCTGCTTGTG ATGCGCACAT GTCGCATGAC GTC TAGCGC GTACTGCCCA 1140
AGCAGTTTTT GGTATTCGTG CGCTGTTTCC TTGGCAGCGC TTGGTAATTG CTAACATACT 1200
GGATGCGGCG CATGCGTGTA CACATACAAC TCCGTTCGCT GCGGCAGGTT CTTCTCAAAC 1260
CGATGCTACG AGGGTGACTC ATGTGGATGA CGCGCACCTG AGGATAATTT CGTCGGTGCC 1320
ATGCAAGACA CACGTTTTGA TCAGGATGGC GTGTCACGCG CACATCAAGT GGTGCTATTG 1380
CCGACAGgTG CAGGAAAATC GCTGTGTTTT CAAGTACCTG CCCTCTTTTT AGAGGGGCCG 1440
ACGCTAgTGG TGTACCCACT GTTATCGCTC ATGCGTGATC AGTGCCGTCG GATGCAGGCA 1500
GTGGGATTTT CGGTCATCTT GTTACGTGGT GGACTGAATG CGCAAGAGCG CGCGTACATG 1560
TATGCGCAtT GGATAgtGTG CTGAgGCGTA TGGCCGGATG CGAGgCGTTA cGCcTCCtGc 1620
ACACCAGACG GCAGAATTTT CCCTTGCAGA TTCGATCTCT TTTGATGCGT CACTCTTCTC 1680
TGATGACGTG AGTACCTTCT CGAAGGTGGT ACATGTGGAT GAACGTCTTG CTCAGAAAAG 1740
GACAGAAAGT CGAAGAGGTG TATGCATCAT CGCAAGTCCA GAGATACTCA CACAAcCgck 1800
CTGCGCCACG CGTGCGTGCA TGTCGCGTTG CGCATTTGGT TATTGATGAA GCGCACTGTG 1860
TGTCCGAGTG GGGAGATTCG TTTCGTCCTG ATTACGTGCG ACTAGGCGAA TTGGTGCAGG 1920
ATCTTGCGCC TCAAGTGGTG ACTGCATTTA CGGCGACTGC AAGTCAAACA GTGCTTGCGC 1980
GCATCATGGA AGTGCTGTTT GGCGGTCGTG CGCACGTGTT GCAGGGAACA GTAGATCGCC 2040
CGAACATTCG ATACACCGTA CGCACGGTGC TGTGTAAGCA GACGGCACTG ACTCAACTTG 2100
TAGCGCGTTG TGTGCGCCCT GCAGTTATTT TTTGTGCTCG TCGGGTACAG GTGGAGCGTG 2160
TAGCCCACCA TTTGCGCACG TGTCTTTCTG ACACACAGAT ACGTTTTTAT CACGCAGGtT 2220
GCAGAGGGAA GAAAAAGAAA CAGTGGAGCG ATGGTTTCAT ACCCATGATT CTGCCGTTTT 2280
GGTAACTACT TGCGCGTGGG GAATGGGAGT TGATAAGCCG AATGTACGTA CGGTCATTCA 2340
CGTGGATGCG CCACTGACTG TGGAGGCGTA CGTACAGGAG GTTGGAAGAG CAGGACGGGA 2400
CGGAATGCGT GCAGACGCAT TTTTATTGTG GTCACCTCGA GATGCTCGCT CGATAGAAAC 2460
ACTGCCGCAT GCACAACGGG TGCGTGCGCA CGTGTTGCGC CACTTTGCTG AAAGCGGACG 2520
TTGTCGCCGC GCAGTTTTAC TTGAGTCTTT GGGGGAACAG AATGTGTGTG CCGGATGTGA 2580
TGTGTGTGCA GGCACTGCAC GTTTTGTATG TGAGGATGTA GAAGCGCTCT TACAGTTTTT 2640
GAAAAAGAAT GCGCGCAGAT TCACTGTATC ATCGTTGGTG CAGCACCTCG CGCTACATCA 2700
GAAAGTGCTC AGTGTGGCGG ATGTACGTGC CTTGCTATAT TACGCGCTCG AAACAGGACG 2760
TGTGAAAAAA AAACATTCAC TCTTGTGGGG TGATGTCCTG TATGTTGCAC GTTAACGATT 2820
CTGCGAGCAA ATCGTATCTG CAGGAAAGCA AGAAGGATGG CGAGAACATA CAGTTGCCTT 2880
GTATATTCCG CGAGTTACGC GCATTTTTAT GGCCGAGTGG TTAGCATTAC TTCTACAGGT 2940
TTTTTATGCA ATTTTATATA CTGAGCCGTC GTTCATGTGT CTCCGATACG GTGTGGTCTA 3000
GGTTCCGTAC gTGCGGGCAC GGAACACATC GAGCGGACGC GTCTGTTCGT GGAGGATATT 3060
ATGAAAAGGT TTATTCCCCA TCGGGTGATT CACGCGGTGT GTATCGGGCT TGCACTTGTA 3120
GGTTGTAGGA AACTCGATTC TCGTGCGGGG GATTTTGAGT TAACGATTAT ACATATCAAC 3180
GATCATCATT CGCATTTGGA ACCAGAACCC TTAGAGCTTG CAGTGGCAGG GGAAAGACTC 3240
AGAGCGGCTG TAGGCGGTTA TGCGGCGCTT GTGCACGAGA TACAACGGTT GCGTGCGGAG 3300
TCGAAGAACG CATTGGTACT GCATGCAGGA GATGCACTCA TAGGTACGCT GTATTCTACC 3360
CTCTTTAGAG GGCGTGCGGA CGCGGTGCTG ATGAACCATG CAGGATTTGA TTTTTTTACC 3420
CTTGGCAATC ACGAATTTGA TAATGGGAAT GAGGGACTCA AAGAATTTCT GCACTATTTG 3480
GAAGTGCCAG TTCTCTCTGC AAATGTGGTT CCTAATGCTG CCAGCACGTT GCATGGCTTG 3540
TGGAAGCCGA GCGCTATTGT GGAGCGTGCA GGTGAGCGTA TTGGGGTTAT CGGACTTGAT 3600
ACGGTAAAGA AAACCGTGGA GTCATCCAGT CCCGGTAAGG ATATCAATTT TATTGATGAG 3660
ATAGAGGCGG TGCGTCGTGC AACTGTTGAA ATGCAGCaGC AAGGAGTAAA TAAAATAATC 3720
CTCCTTTCTC ATGCAGGTTT TGAGAAGAAC TGTGAAATTG CTCAGAACAT TTCTGGTATT 3780
GACGTCATCG TGTCAGGTGA TACCCACTAC CTTTTGGGGG ATGAATCACT CGGACGGCTA 3840
GGTCTTCCGG TAGTTGGTGA ATATCCCAGA AAGATTATGT CCCCTGCAGG GGAGCCTGTG 3900
TATGTGGTAG AGGCGTGGGA GTATGGTAAG TGTCTGGGCG AGCTGAACGT AGTCTTTGAC 3960
CGAACAGGAG TAATAACGAG TGCAGTAGGC ATGCCGCGTT TTTTGTTACA TACGAATACA 4020
TTGCAAAAAA AAGGAGCGGA TAGAAAAAAT TATCCTCTTG AGGAGGCAGA GCGTGAAGCG 4080
CTGCTTGTGG CACTGAGGAT GACGCCAGAG ATTATATTTG CGCAGGAGAA TGATCAGATT 4140
ATATCTGTGT TGGAAGAATT TAAAAAGGAA AAGGAGGCGC TTGGTGCGCA GGCAATTGGC 4200
GTAATTACCG GTGCCTCAAT GCGAGGkGGn TCTGTGCATC GAGTTCCCGA TGCACAGAmT 4260
CCACAGGGTT CGGTTGCAAC GCGGTTTGTA GCAGAGACGA TGCTCTCAGA CATTCAAAGT 4320
TTTGGTGCGG GGAAGgTAGA TTGCGTAATT CAAAATGCAG GCGGTGCGCG GTCAAATATT 4380
CAGCCTGGTG AGATTACGTA TAATGACGCA TACACGCTCC TCCCCTTTAG TAACACGCTG 4440
GTGTTGGTGG ACGTCAGCGG TGCAGAGTTG AAACAAATTA TAGAGGATGC ATTGCAGTTT 4500
GCACTTGGTG ATGGTTCCAC GGGAGCCTTC CCCTATGGGG CGGGTGTCCG GTATGAAGCG 4560
CGCCAAGAAC CAGATGAACA TGGCAAACGA GTGATAAAGC TTGAGGTGCA AAAAAAAGAT 4620
GGAGCGTGGG TGCCAGTAGA TGAGCGCGCG CCGTATCGGT TGGGTGTGAA CTCGTACATT 4680
GCGCGGGGAA AAGACGGATA TAAAACGCTC GGAGAGATTG TCAGTACGCG CGGAcTGAGG 4740
ATACGTATCT GCGTGATGCG GAGTCTTTGA TTAAGTTTTT GCGTGCGCAT AAAAATTTTC 4800
GTGCATACAC AGATTCCAAT GTGATATTCC GTCTTAAATA GTAGGAAGTA ACTTACATTA 4860
GAGGcCTGTA AAGAACTACG TTCTTTACAG GCTGTGCCAA TCTGCTTTTC CGGGAAAGAC 4920
AAAGGGTATG CCACGTTAGG AGCGGAAAGA AGGGTGCTGC ACATAACCTT ATCTTTGCGA 4980
TTGACCGTGG TATACTCCTT GCACCTTATG CAAGAGAAAA AAACGCTTTA CCTTCTTGAT 5040
GCCTACGGAC TTATTTATCG GAGTTACCaC GCGTTCGCGC GTGCGCCGTT GATTAACGAC 5100
AGCGGTGCGA ATGTTTCTGC CGTATATGGT TTTTTTCGGA GTTTGCACAC GCTCCTGTGT 5160
CACTATCGAC CCCGTTATTT TGTTGCTGTT TTTGATTCTC TCACGCCTAC CTTTCGGCAC 5220
GTACAGTACC CAGCCTATAA GGCAAAAAGG GATAAGACTT CTGCAGAGCT TTATGCGCAA 5280
ATTCCCCTTA TCGAAGAAAT CCTGTGTGCA CTGGGCATTA CAGTTTTGCG TCATGACGGC 5340
TTTGAAGCTG ACGACCTCAT TGCAACCCTA GCAAAACGAG TTGCGGCTGA GCACTGTCAT 5400
GTTGTGATTA TCTCCTCAGA TAAAGATGTA CTTCAGCTTG TGTGTGATAC GGTGCAAGTG 5460
CTCAGACTTG ACATAGATCA TAAGTGGACA TGTTGCGACG CTGCGTACGT ACAGCAACGG 5520
TGGACGGTCA TGCCAACACA ATTACTTGAT TTGTTCTCTC TCATGGGAGA TTCCTCCGAC 5580
AATGTGCCTG GTGTGAGAGG GATTGGTCCT AAGACGGCTG CACATCTTCT CCACTGTTTT 5640
GGCACACTTG ATGGTATTTA TCGTCATACC TATTCCTTAA AAGAAaGCGc TGCGCACGAA 5700
GATAGTGTGT GGGAAGAAAG ATGCATTTTT TTCTCGTTCA CTCATTGAGT TGCGTGACGA 5760
TGTACCATGT GTTTTTTCGC TCGAAGATTC CTGTTGTATT CCGCTCGATG TAACGTCTGC 5820
TGCACGTATT TTTGTGCGAG AAGGATTGCA TGCGCTTGCA CAACAATATC GTGCTTGTGT 5880
GCAAGAAATA GATACAGAAG CAACAAACGA TACATTACAA ATGACAGAGT CTTCTGTGCT 5940
CACGTCTGGT CGATGTGCAA ATGAGTGTTT CTTATCTCAG GTAGAAGGGA GGGCTAGTAC 6000
ACCGGAGGTG AaCTCCGTAT TGAAGTCGGA GTTGAAGACG AGTGCTGTGT CTGGCGCCAT 6060
ACCTATAGAA AaTAGAGATC TTAGGCAGGA TGTTATGCTT gCACGCAGTG CaGGTCATTA 6120
TCGTGGTGTT ACTGACCCTG TAGAACTTAA ACGTATTATT GATTGCGCGT GTGCGAATGG 6180
TGTGGTCGCG TTTGATTGTG AAACGGATGG ATTGCATCCG CACGATACAC GTCTGGTCGG 6240
ATTTTCGATC TGCTTTCAGG AAGCAGAGGC TTTTTATGTT CCTCTTATTG TTCCGGACGT 6300
TTCTCTTCAT ACCGAGTCAA CTCAGTGTAC ATGTGCACGT AGCACTAATG TCGAGACTGA 6360
AAAGGAGTGC ACAGAACAGC ATGGGGTATC TGCATCTGCT GTGCAGGATC CGGCATATGT 6420
CCAAGCTGTC ATGCACCAGC TTCGACGTCT TTGGAATGAT GAGACGCTCA CACTTGTTAT 6480
GCATAATGGA AAGTTTGATT ATCACGTTAT GCATCGTGCA GGCGTTTTTG AGCACTGTGC 6540
ATGTAATATT TTCGATACGA TGGTTGCAGC TTGGTTGCTG GATCCCGATC GCGGTACATA 6600
CGGTATGGAT GTACTTGCCG CATCATTCTT TCAGATCAGA ACGATTACAT TTGAAGAAGT 6660
GGTAGCAAAA GGGCAAACCT TTGCGCACGT CCCTTATGAG TGTGcAGTCC GCTATGCAGC 6720
GGAGGATGCA GATATTACTT TTCGTTTATA CCATTATTTA AAACTCCGCT TGGAAACAGC 6780
AGGATTGCTT TCTGTGTTTG AGACCATAGA AATGCCGCTT TTGCCTATCC TAGCACGTAT 6840
GGAAGAAGTG GGGATTTTTT TACGTAAGGA TGTTGTGCAG CAGCTCACTC GATCTTTTTC 6900
AGATTTGATC CAGCAGTACG AGCACGATAT TTTTTCTCTT GCCGGTCATG AATTTAATAT 6960
TGGTTCTCCG AAGCAACTGC AGACAGTCCT TTTTCAAGAA TTACATTTAC CGCCCGGTAA 7020
AAAGAATACT CAAGGTTATT CTACTGATCA TTCTGTATTG AAGAAACTTG CACGTAAGCA 7080
TCCCATTGCA GAAAAAATAT TGCTCTTTAG AGATCTTTCA AAGTTACGTT CGACGTATAC 7140
CGAATCGCTT GCAAAACTTG CTGATCAAAC AGGGCGTGTA CATACTAGCT TTGTGCAAAT 7200
TGGTACCGCA ACTGGAAGGC TTTCGAGTAG AAATCCAAAT TTACAAAACA TTCCCATTAA 7260
AAGCACAGAA GGAAGAAAAA TAAGGCAGGC GTTTCAAGCT ACTGTTGGGC ATGAGTTAAT 7320
TTCGGCAGAC TATACACAAA TAGAGCTGGT CGTGTTGGCC CATCTATCTC AAGATAGAAA 7380
TCTTCTCAAT GCATTTCGAC AGCACATTGA TATTCATGCA TTGACTGCTG CATATATTTT 7440
CAATGTGTCT ATAGACGATG TACAACCTGC AaTGAGAAGA ATCGCAAAAA CTATTAACTT 7500
TGGAATCGTG TATGGAATGA GCGCTTTTAG ATTGAGTGAC GAACTTAAAA TTTCTCAGAA 7560
GGAAGCGCAG AGCTTCATTT ACCGTTATTT TGAAACGTAC CCGGGGGTGT ATGCTTTTAG 7620
TACACAGGTT GCAGAGCAGA CACGTAAAAC CGGCTATGTG ACTAGCTTGG CTGGAAGACG 7680
ACGCTACATC CGTACTATCG ATAGTCGCAA TACGCTTGAG CGCGCGCGTG CCGAACGTAT 7740
GGCGTTGAAT ACTCAAATTC AGAGTTCTGC GGCGGATATT GTGAAAATTG CCATGATAGC 7800
AATCCAGCGT GCGTTTGCGC GCCGACCGTT ACGTGCACAA TTGTTGCTGC AGGTACACGA 7860
TGAATTGATT TTTGAGGCGC CAGCTGCTGA GACAGCGATA GTGAAAGAAA TTCTCTTTGC 7920
TGAGATGGAA CATGCTGTTG AGCTCTCGAT CCCGCTGCGT ATACACGTGG AGTCTGGAAA 7980
TAGTTGGGGT GATTTTCATT AGCATACCCA TCTGAGGGAT GCAACAGGGC ACGTTATGAG 8040
GTTACCTCGG CGCGTAGTTC CTTAAAAAAT GATGCTACCA CGCACAACAT AATCAGCGCT 8100
AAAGGAAATG CCGCAATGAT GGCTAAACTT TTCAGGTGCA TGAGTGTGGA CTGGGAGAAT 8160
ATGAGAGAAG CGGGAAGGAG AATGCACGCA ACCGCCCAAA ACGATTTCAT TATTTGACGT 8220
GGTTCTTCTC CCGGTGCAAC GCTTTTTTGC GAATAGGAAG CGATGATGAG CGTTAATGCG 8280
TCAAAAGTAC TTGCATAAAA GGCGATCATG GTAGCTGCCA ACAGCGCCAT AACGATGTAC 8340
GCGCAGGcAG TGTCTGAATA ATTGCGATAA TCACCTCAGC GGGTGTATTC CCCGCGCGCA 8400
nAAGGTACGC GGCAGGAAGG AGGTGGTGCG TTTGTAAATA GAGCCCGTAG TTCCCTAAGA 8460
CGATGAAGGA GCCGTACGTA CCTGCGATAC CCCAGCAGAG CCCTCCGACG ATGGTATTCC 8520
GGATGGTTCT CCCTTCGGAT ATCGCGCCGA TGAAGAATGG GGTTGCAACA GACCACGTGA 8580
TCCAATACGC CCAGTAAAAG ATAGTCCACC GCTGTGGAAA TCCAAGCGTC CCATCCGTTT 8640
CCTGTAATGA AATACGAGAA GGATCCATCC ACGTTGCCAT AAGAAAGAAG TTTTGTAGCA 8700
TTTTCCCTAT CGCGGTGATA CCCGTCTCGA TAAGATACAC GGTTGGTCCT GCACACAAGA 8760
AGAAAACAAG AACGGTACTA AAGCAGTACA CCGCGGCACG cgAGAGCTTT GAGATCCCCT 8820
GGGTACCCAG CAGTACTGCT GTGGTGTACA CCAACGCAAT AACGCAGAGC AACGCTAAAG 8880
CGAGCAGCTG GGTGTTAGAA ATACCGAACA AGAGAGAAAC CATGAGCGAA AGGAGGGGCG 8940
TTGCTAAGGA AAACGTGGTT GCAACGCCTA GAAGCAATCC CACTACAGAA CAGATATCGA 9000
TTGCTTCTCC TATTATTCCG TCTACGTACG CGCCCAGCAG CGGACGACAG GCTTCAGAAA 9060
TTTTGTGTGT GTGTCGCTTC TTTACATGCA ACATGTAGCC GAAGcaCTGC GAGAAGAACG 9120
TAAAAAGACC AAGGTATGAT GCCCCAATGG AAAAGGGGAT ACGCTGCCGC CCATTCTTGT 9180
CTTTCCGTGG GAGGGGAGTG TTCTGCTATA AACGGAGCTT GAGTGAAGTA GTGCGCCCAT 9240
TCGATGAGCG ACCAATACAA AATATCCGCC GCCATGGTGG ACGTAAAAAT CATGGAGCCC 9300
CACGTAAAGT TGGAATAGCG TGCGGTGCGC GTAGTACCCA GATACACCGC ACCATACCGT 9360
GAAAAAGCAA TAGTCAGCGT CGTTCCTAAA AAAAACAGTC CGGTAAGGAT ATAGAAAAAG 9420
CCCAGTTTGT TTACCAGTAT ATTCAAGAGG GTACCGATTA CCCGATGAGA AATGTCAGGA 9480
AAGGAAATAA ATAGGAGTGC GCACGAGATC ACGATTCCCA ATGGGATGAG AGAGACGCTG 9540
AAGTCACATT TTTCTTTCTG CATGATCTGC CCTATGTGTT TTCAGGAGCG TCTGAACTGC 9600
TTCGTAATAT TCTTTTGCAT ATCGGTACAT TTTCAGTCCA TAATCTCCAA ACGAGATACC 9660
AAGCGCCTGT TTGTAGCTCG TCCACAAAGA CCACAAAAAT CCCCCAAGAG CAATATAGCA 9720
AAAGACGCGC AGGCGCTCTT CTCCGTGCGG ATTACGCTGA AAATAACGAT ACATGAGATC 9780
TTCTATCTGG GCGGTGTTAA AGTGGGCATA CAAAGAGAAC ATCGCGATAT CCACAAGCGG 9840
ATCACACATG CCCGCATATT CCCAATCGAT GAGCTGGGCA CTGCCATCGC AAAGCAGGAA 9900
GTTATCTGGG GTTAGATCGA CATGCGTAAG TACGCACGGT TTGTCCACTG AATCGACAAT 9960
AGCCAAGAGT GTATTCATTT TTTCTCTCAC TGAACGATAG TCAGCGTAAA GAATACTGTA 10020
CTGGTCGAGG GCAAGCTTTT CGTAATACGC AATACGCGAT CTGAAGTCGA ACCTATGGGC 10080
AACACAAATA CCTGATTGAT GGAGTTTGCG CGCAATCCTC ATGCACAAGT CAAGGTCTGC 10140
CGGATCACTA GGGTTTGCAC TGCGGCAGTC TTTGTGGAAG ACGGTGATTT TAATACCACG 10200
GGCAGGCTCA AGGTGAACGA GAGTGTCACA AATATTAAGG GGCTTAATTG CTTCATACAC 10260
GGCAGCTTCT TGGAAGCGAT TGACAAGGAG TTCGGTACCC TCTCCAGGAA TTCTGAATAG 10320
GTACGGCTTA TCATTCAGCT CAAAGATAAA AGATTTGTTC GTCATTCCTG CTTTAAGTGG 10380
ACGCAGCTTA CCGATTGTCA TTTCGGGTTG GTTGAAAACA TGGGAAATGA CGCGCATCCA 10440
CTCGTTATCG CTCGAAACCA TGTACGCAAA ATCAAACTTG CGCAGcTCAT CGAGCGACTC 10500
AAATTCGTAG ATAAGGTTGT CAGCCTGTTT GTTGGCAAAC ATGGTAAGGG ATTTTGTATT 10560
GCGCATAAAG ACGTCTTCCC AGAACCACGC GCGGTTTTCT GGGCGCATGT AGGCAGCTTC 10620
GATGAGGGGG ATGATCTTTT TAGAAAACGA TGCAGAAAAG TAAGCTGGAC CGTACATGAT 10680
CCACCCACTT CTGCCACCAA TTTTCACGGC GGTAATTTTG TCGTACAATC CGGTTTTTAA 10740
CACCCATTCC TTTGTTTTCC CTTGTGTCTT GACAGCCGTA TACCAGGAAT CCCACTCATG 10800
GGAGTGATAA ATGTTTTCGC GTAGCcAGTT ATCGCTTGAG AGTATGTAGG TATTTCGAAG 10860
TAAATGACGG GCGTGGTAGA GGGTGGATAG GTTGTTGGCA GTTTCATAGT CAGGGTTATA 10920
GGTGAGCGTC ACGTCATGCT TGTCGACGAG GTACTCAAAG GCCTCTTTGA GGTAGCCgAC 10980
GACGACCGTG ATGTCAGTAA TGCCGACTTC GTGTAACTGA CGAATTTGAC GCTCAATCAT 11040
GGGCTCACCA AAGACTTTCA GCAGGCCCTT TGGAGTAGCG TATGTAAGAG GCACGCAACG 11100
GGAGCCAAAG CCTGCAGCCA TTATTACCGC ATTATGAACC CGGCAAGACT GATACTGACT 11160
ATGTGCAAGG GGGGTGAGAG AaCGCGTCCG CGCACCAAAG GTCCCGTGGG TTTCAAGGAG 11220
TCCTGCAGAC TCCATAGCGT GTAGAAGGCG ATTGGTGTAA GCCAAGGACA GGCGCAACGC 11280
CTTGGCGATG TCACGCTGGC AAAGCCGAgG CGCATCGCGC AAAAGCTGAA AAATCTGGAA 11340
AAAACGACGC CCCACACTGA GGAAGTGTAC GGGCGTGTTC GCCCGCCGTC AAGCACGCAA 11400
AAAACAGGGG AAAAAGCACC CCCCCCCCTG CTCGCTTCCT GCACACAGCT GTGAGGAGCG 11460
CATCCTTCGC TCTCCTAGAT AATATTTTTC ATCACAATCG GCTGTCCTGT TGCATCAAGG 11520
GCGTCGCGCC ACAACGATCC TTCAGGATTT ACGTGTTTGC GTTGGCACAC CACCACATCG 11580
ATTGGGAGGT GTACAAACTT ATTGTGCACC AAGCCAATGA TCATTTTCGT TTTACCGCAC 11640
ATCGCCGCAT GCACCGCATT GTTACCGAGG CGTTCGCAGT AAATCGAATC TATGGGCGCA 11700
gcAACCGCGG aACGAATCAA GTAGCTCGGA TCGATGTACT TTAAATTGAT GTGTATACGC 11760
TTTTCTTTGA AATAGACTCC AATCTTTTCT TTTAAGAACA AACCGATATC CGCAAGGCGC 11820
TTGTTACCCG ACGCATCCGT GCCGCTGcTC ACGCGCAAAC TACCTCCTTG GGAATCACCG 11880
GAGGGCACAC CGTCCGCATT TACCATTAGG TCTTGCCCCG CACCTTCTGC TACAACGAGC 11940
ACCGCATGCT TACGTAGCGC GATTCGCTTC TCTAGGTGAG CCAAAAGCCC ATTTGGACCG 12000
TCAAGGTCAA AGCTCACTTC AGGGATGAGT ACGAAgTTTG TCTCATGGCT CGCAATCGCC 12060
GTGTACGTAG CGATGAATCC AGACTCACGC CCCATGAGTT TGACCAGTCC AATGCCGTTA 12120
ATCTGTGAAC GAGCCTCCAT GTGCGCTGCG GCAACTGCCT CTGTTGCTTT GACAATAGCA 12180
GTATCGAAGC CAAATGACTT TTGAACAAAA GAGATGTCGT TGTCCAcCGT TTTGGGAATG 12240
CCGATGATGG AAATCtTAAg GTTGCGGtGT TTTATTTCGT CGGCAATCTC TTTTGCTCCC 12300
TTCTGACTCC CATCCCCCCC AATGaTAAAG AGAATGTGCA gGTTGAGCCG CTCGATACCA 12360
TCGAAGATGT CAAnCAnAAG GTTCCCCCCA ACGCGnGAnG TGCCTAAGCA G 12411
(2) INFORMATION FOR SEQ ID NO: 90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 971 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
ACCACCGCAA CAATCGGGAT ACCAACTCGC CTCGCTTCAC GGATAGCGAT AGTCTCCTTA 60
CGCGTATCAA TGATAAACAC TACACCCGGC AGCTCCTTCA TTTCCTTTAT GCCGCCCAAA 120
TTCTTTTCTA GTTCGCGTGC TCCTTGCGTA AAGACGCCAC CTCCTTCTTA GAGAGATGCT 180
CGAAmGTACC GTCTATCTCC ATGCGTTCTA TCTTCTTGAG ACGCGAAAGA CTCTTCCTTA 240
TGGTGGAAAA GTTAGTGAGC ATGCCGCCGA GCCAACGGTT AGTCACATAA AACATCCCAC 300
nAGCgcTGCG CTTCCTTGGC AATGGTTTGC TGCGACTGCT TCTTTGTGCC CACAAACAAA 360
ACGGACTTGC CTGAGGAAAC AGTCCTGCGC ACCATGTCGT ACGCCTCGCG GATGGCCGTA 420
ATCGTCTTTT GCAGATCAAT GATGTGAATG CCGTtACGCT CCGCGAAAAT ATACTTTTTC 480
ATCCGCGGAT CCCACCGCCT GACCTGATGG CCAAAATGAA CCCCAGATTC AAGCAGATTT 540
TTGATAGTCA CCACTGCCAT ACCATTCCTC CACAGAGAGC GCCGAGGCTC TCATCCTTCT 600
CTCTTTCACC ACACGCACAC GCGCCGGGGA TATTCTCGAG AGGACAGAAA AACGCGTCCC 660
CTCACACACG GTCACCTTGA CTAAAATCAG ACGTATGGAA TACGGTTGCC CGCCTCTTCT 720
GTGGACCTTC CCAGGCCCGC AGCAGGTTAC AGAGCAGGGA AGATTAGCTC ATTCGGATAG 780
AGCGTTGGCC TCCGGAGCCA AAGGCGnTGG GTTCAAATCC CGCATCTTCC ATCCTCTTAC 840
CCACGTAAAC GGGCGCCCCG CGCmTrACGC TCCTTCCAGT CAAgCTGGCA ACTCAGGGGc 900
CGTCCGCCCG CTCTCCTTCA GCAGGTCACT TGCCGCCAGA AACGACTCCC TTCAnCACTT 960
CGAGCGCGCG A 971
(2) INFORMATION FOR SEQ ID NO: 91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1985 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
AGTGCCGACG CGCGAGAACG TATTTTTGCC CATCTTGAAA TGCGTGGCGC AGGTGCACGG 60
CGCGTCACCT ATCGCCTACG CGACTGGGTG TTCAGCCGTC AGCGCTATTG GGGAGAACCC 120
ATCCCTCTTG TGCACTGTCC TTCcTGCGGT GTTGTACCTC TCCCTGAGAG TtGCCTGCCG 180
CTTTTGTTAC CCGAAACCGC CGATTTCACT CCCACGGAAG ATGGGCAGGG CCCCCTTGCA 240
CGAGCGCGCA CGTggTGCGC GTTCCCTGTC CGCAGTGTGC ATCTGACGCA GTGCGAGAAA 300
CAAACACAAT GCCCCAGTGG GCAGGATCCT GCTGGTATTA CCTCCGTTAT ATGGACCCCC 360
GCAATAAGAC TGCCTTTTGT GCACCCGAGA AGGAGCGTTA CTGGGCgCCA GTGGCGTTAT 420
ATGTAGGTGG TGCAGAGCAC GCCGTACTGC ATTTACTGTA TGCACGCTTT TGGCACAAGG 480
TATTGTACGA CTTAGGTCTT GTAAGCACGA AAGAGCCCTT TGCGCGGTTG GTGAACCAAG 540
GCATGATTAC GTCGTATGCA TATCGCAGGA AAAATGGCGC GCTTGTACCT CACGACGAGG 600
TGCACACTAA TGCTCAAGGT ACCTACGTGC ATGCTCGTAC GGGGGAAAAA CTCGAGTGCG 660
TTGTGGCAAA AATGTCAAAA GCGTTAAAGA ATGTCGTCAA TCCTGATGAC ATGATTGCAG 720
CGTACGGTGc tGACGCGTGC CGGGTATACG AGATGTTCAT GGGACCTCTT GAGGCTTCCA 780
AACCGTGGAA TACGCAGGGG TTAGTGGGGG TTTTTCGGTT TTTAGAAAAA ATTTGGGTAC 840
TTGCGGGGCG CGTGGCGGCC GCAAACGGTA TTCCACAAGA CTCTCGTGCA GAGCCGCCAG 900
GTGACCTGCA CGCACAGAAA AAGTCTTGCA GCATGTACGC CCTCGAAACG CTGTTACACC 960
GGACTATTCA AAAAGTGaCg ACGATACGTC GGCGCTTAGT TTTAACACGG CAATCAGTCA 1020
GATGATGATA TTTGTAAATG AwGgTACGCG GGTGGCGCGG AGGATGCCTC TTCCCTCTAA 1080
AATGTGGGAG ATGTTTGTAA AAATCCTCTC TCCCTATGCA CCACATTTGG CAGAAGAACT 1140
CTGGGAAATG TGTGGGCACA CGCACACTAT CGCATATGAG CCTTGGCCAC AGGTGGACCC 1200
TGCGCGTGTG GCGCCGCATG TGTGCTCCGT AGTGGTGCAG GTGAACGGTA AAGTGCGCGA 1260
CACCTTCTCC GTAGCGCCGA ACGCTCCAAA TGAGGAACTC GAGCAAAAGG CGCGGGAAAC 1320
CGCCGGTGCG CGTAAGTTTC TTGGTACGCA GCAGCCAAAG CGCGTAGTGA TAGTTCCCAA 1380
TAAATTAGTA AATTTCGTTC TGTAGTCCGC ACTGCTCCTG CAGCGTTGTG CAGTACCTGT 1440
GTGGTGCCCT ACCCGCGTGC ACTACAATCG CGTAAAGGCA CAGCTGCATC AGCGGCCCTT 1500
GGAGAGGCGT CCCACCTGAG CGGATCATTG CGTCTGTCTG GAACAATCTA TCCAAAATAG 1560
CATGCGTATC GCTCAGGGTC CAAAGAGACG CAGCGCGCTG GTATTGCGCA ATCATACGCC 1620
GAGAAGTGAA CCCATAGCGC TTTAGCGTTG CAGGGTGCGT TTTGTCCTCC GGTATATGGT 1680
GCCAATGCGC AAGGCGACGG AACGCGTAAG nAGtCCTGCA AGGATTTGTA CAGGGGCAAC 1740
GTCTTTCGAA CAAAGCAGCG TGTTGAGAAT CATAAGTGAA TGCTCAAGAT CTCGTTTGGA 1800
CAGCGCATCG AATAAGGTAA ACGATGTCTC TTCTTTTGTG TGCACCAGCA ACGAACTAAT 1860
GTCGTGCGCA gTGATGCGGc GTCCTTTTTc AAAAAAAAGA GAAAGCTGCG TACAAACAGT 1920
TTTGAGCGCA CGAGTGTTGT TCTCCACCAA CTCAAGAAGA GATTCGATAG CCTCCCTGTC 1980
AATGC 1985
(2) INFORMATION FOR SEQ ID NO: 92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1043 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
AGGGAGTGGC TTATGGGATA CGGTTCATCA GCTGAGGAAA CGTCTTCTAC ACCGCATGCG 60
TCAGGCCAAA GGAAGGTTGG TTTTCTGTCC CTGCGCACCA AGCTCGCGCT GGTGTTTGGG 120
CTGCTTGCGT TCGTGTCTGG TCTGGTCCAG GGCGGTATAT TGGTCGTTTT TGCGCGTAAC 180
TCCATAGTTG GGGAGATTTC TAGTCACCTT GCTGGCCGTG CTCGGGATAC CTCCTCCATC 240
GTGGAAGGGC GGATCGGCGC GCTGTTCCAG TTTTTGGAAG GTTTGGCACG TCTTGAAGTT 300
TTGCAAGGCT CGTCCGACAG GCGCCGTGCC CAGGTGGACA GGCTAAAGAA GGAAGCGTTT 360
TTTAACCGGG ATATCGCGCG TCTTGCGGTG GTAGATCTCG cAGGCGTGTT GTACGGGGAG 420
GACGGGCGCA CGCATTACGT ACAAGATCGA AAGTACTTTC AGCGGGCGGT TAAAGGCCGT 480
TGTTACGTCT CTGCGCCCTA TCCCTCGCGT TCGTCGGATG ACATGGTCAT TACCTTTTCC 540
ATCCCGGTAT ATGACGAAaT CGGCGGGTTA TgCCGTGCTC GTAGCGGaTG TGATTTGGAC 600
GTGGCTGTGT GATATCACAG GGGATTTTTC TGTAgGgGGG TGGGGAGAAT CGCCGTTATT 660
GACGAGGTTG GTACCGTTGT CGCGCACCCA CGTCACGAGG TAGTGGCGCA CAGACAAATT 720
ATATCCGCCT GGCAAAGGAA GACCCGGCCA CGTACGCGTC CGTCGCAGAG TTCGTTGAGA 780
AGGTTATCAA GTCAGACTCT ACTGCCTCTC ACGTGTTCTC GTATGAAGGC TTAGAGAAAA 840
TCGGTTCATC TGCCAAGATG AAGAGCACAG GATGGACCGT CGTGGTGTTT GTGCCTGTCT 900
CCGAGTTTAT GGGGCCTGTG TACACCCTGG CAGAACTACC TGCTTGCGGT GGGTATCATG 960
TGGTACTCTT CTCCCTCATG TGGTGTATGC CGTTGCGCGC AAGATTGTGC GCCCGCTACG 1020
CTCTACCGTC AGGTGTTAGA AGA 1043
(2) INFORMATION FOR SEQ ID NO: 93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1357 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
CnTCTnTTAT TTATTTTGAA AAAGACACGT TGCTTTTCCT GCGCGAATGC GTCAGAATGG 60
CGCGGTCCCT ATGATATCGT GTTCGGTGCG CAGGAGGCCG AGATGGGAGC CTCAGGTCGG 120
CGCTGCGTTT CTTGCCTTTG CGCTCTTACC GGTCCTGGCG AGCGGACGTG GTATGCAGGC 180
GGCAGTGGCC ACAGCCGCAG GGTCCAGTGG TTCCGGCAGT GATGGCAAGC ACCCCGGCAA 240
GGAACAGTTT CTCCAGTTCC TCATTCCATC TGGCGGTCGC TACGAATACC TCGGGGTGAG 300
CTTTACAGCG CTGGCAGATG ACGCCAGCTT CTTTGAAGCT AACCCTGnCG GCAGCGCCGG 360
GCTCAGCCGC GGGGAAGTTG CTCTGTTCCA CCACTCGCAG ATCCATGACT CACACACCGA 420
AACGGTTTCG TTTGCGCGAC GTACGCAGAA CACCGGCTAC GcGCCTCCGT GCGCGCCTTC 480
TCTTCTGAGT CAGATCTCAA GTCCTTCTTC GGGGGCAACA GTGGTGGCAA TAAGAACGGC 540
GGACACCAGG GCAAACAGGG AAAAGGCTTC GTGGCAATAG CCAATGCGTC TCACACCTTC 600
TGTGGCCAGT ATCGCTTTAA GGGCGTAACT TTGGCTGCAA TTTCAAGATG GGATTCCGCA 660
AGGtAAAACT GACAGCCACG TGACCGTCGC GGGTGACTTG GGCCTGCGCG CTGCCTTTTC 720
TGTGGCAAAG AACTTTGGCT CAAATGAGCC GAACATGCAC GTGGGGTTGG TGCTCAAAAA 780
TGCCGGGATC TCGGTAAAAA CAAACAGTTG CCAAGTCGAA CACCTCAATC CGGCCATTGC 840
CGTCGGCTTT GCCTACCGGC CGGTGTATGC GTTTTTGTTC AGTCTCGGGC TGCAGCAAAC 900
CCTCACCAAA AGGGAGTCGC CGGTGTGCAG TGTTGGGTTC ATGTTTTTTT GTACCCAACA 960
CGTTACCCTC CTCGCCTCTG CTGCGTGTGA AGGAGGGGCC TACGCCCTCT CAGGCGGCGC 1020
AGAAATCCGC ATTGGCTCCT TCCACCTCGA CATGGGGTAC CGGTACGACC AGATTTTCCA 1080
AGCCGCCCAC CCACACCACG TGTCAGTAGG GCTGAAGTGG CTCATACCCA ACGGCGGCAC 1140
CCAGGCGGAT CAGGCCCTCT TGGTCAAAGA GTCCTATCTA GTGGGGCTGC GCTTTTATGA 1200
CCAGCGGCGC TACCAAGAAG CAATTACTGC GTGGCAGCTG ACGCTGCGCC AGGATCCGGG 1260
CTTTGAACCG GCTGCTGAAr GCATCGAgCG CGCACGACGC TTTTTAAAAC TACACGAAAA 1320
ACTTTCTCTC TTTGATATTC TCAACTAGCC TGCCGTG 1357
(2) INFORMATION FOR SEQ ID NO: 94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2442 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
CACCGaCCCT GrAATaAAGC GTTACAAAAC CTACTCCTAC CATAAGGGCG ACAAGCAGTA 60
TGAATACAAA ACTGTAACGC TCTTGATGTA CATTTTTCTC GATGGAAATC ATCGGTCTCA 120
TCGCTCTTTC CTCAAAAGAG GTTCAAGCCG CTCGAGCGcA AcACTGCGTG AACCTTTCAC 180
CAGAACAAAA TCACCCCGCC GTAATTGCGC GTCTAACGTT TCTTGCAGCG CAGACAGTTC 240
CTCTAGTGCA AAGGCGTACA CGCGCTTCCT CCCCCAGGAA ACTTTTCTTA CCGCACGACA 300
AAATTCGGGA CCAAACACAT ACACTGCACG CGCGTCAGAA GCGGCTGCCA AAACACACAC 360
CCGATAGTGA GCTTCGGCAG CGGTAACTCC TAGCTCTCCC ATGTCACCGA GCACATATAC 420
TTTTGATACT GCAGAAATAT GCGCACACAA ATGAAGCGCC GCAGCCATTG AATCAGGATT 480
CGCATTGTAA CAATCGAGCA AAAAGGTCAG CGACGCACAC ACCACATGAG AGCGGCCAAA 540
GGGCGGTTTT ACCCGCTCCA TCCCCCGCTG AATTTCCTCA GCAGGAAGTC CTACCTGTGC 600
GGCAAGCGCA ATCACTGCAA GCGCATTCTT TGCGTTATGC ACCCCAGGTA GTGGCACGCG 660
AATCCATCGT CCTTGATATA ACACGCGAGA ACCACGTAAA CCCTCATCTA TCACCTCAGT 720
TGCTAGACCA CGCCCCCCCT GATCGTAAAC TACAACCCTA CCGTACGGGA TATTAGACAG 780
GAATACAgAT ATGCATCGTC GGGGACAAAA CCCACGCTGT GTTCAGTAAA TTGAGAAAAA 840
ATCTCTTTTT TCTCTTCCGC AATTGCCTGC TGCGTGCCCA GAATGCcTAC GTGCGCACAA 900
CcTACGTTGG TAATGATCGC GTAATGAGGA ACAAGTATCT GAGCGAGCGT ACGCATCTCC 960
CCCCGACGAT TCATCCCCAG CTCAAAGATT CCTACCTCAT GTTCTGCACG CACAAAAAAC 1020
AGCGACTGCG GtAAACCTAt CTCTGAATTT AAATTTCCTG GCGTCGCAAC CACCCGATAC 1080
CGTTCACTGA ACACCGCGCG AGCcATTTCT TTTACGGTTG TCTTTCCGCT TGATCCGGTA 1140
ATGCCAATCC TAATAAGCGC AGGAAACTTT TTGCAGTAAA AGGAGGCAAG ATCTTGCAGC 1200
GCCCTGAGCG TGTCGTGCAC CGCAATACAG GCAGCTCCAA AGCGAGTGCA CCAAGCAACA 1260
TATTCCCCAG CATGGGGGtA CCTTTGATCT ATAAGCGTTG CAACTGCGCC CTTCTGCAGC 1320
GCTTCTTCAA CAAACGTATG TCCATCTACG TGCGCACCAC GGAGCGGAAT AAACAAATCA 1380
CGCGGCACAA CCGCACGACT GTCAAAGGAA ACCCCGTCAA AACCGCGCGC CCCTCGTGCA 1440
TCGCACACGC GAGCCCCTTG CACTGCCGCA CATACCTCAT CAAAACTCAG AAGCATGAGG 1500
GGAGGAGTCC TTGTGCGCAG GACGCAGCTT AATGCGCACA ATTTCGGCGT GCGACGCCCG 1560
GTGTAAGTGC AGTTTCTCTA CCGCGAGCTT TTCAATACGA TCCGGCTTAG ACAGGATCGC 1620
GATCCCAGAA ATCTTACGCT TATTTTCTGC GATAATACGG TGTTGCTCTG CGTCGTACTC 1680
ACGCACCACA CGCTCAACCG CCTGATAGCG CGAAGCCTGC CACACCCCCG CACACAGCAA 1740
CACGGGGATA CTCACCGTAA AGAACAACGC GCTCACCTTT TGCACCGTCA TCCCTCTTTG 1800
CCGGCGAGGT ATTCCCcTTC GCTGCTTTTC TCTCATCGCC CCTTTCTCAC GCATCAGCCC 1860
AGTCTCCTAC CGTTTCTCAA TGACGCGTAA CGTAGCACTC CgcGAAGCGG CATTCGCAGC 1920
ACGCTCCACA CACGAAGGTA CAAGCGGCTT CTTGGTAATC AGCGATGCAC GCGCCACACC 1980
GCCGCAGCTG CATATCGGCA CACGAGCCGG ACAGCTGCAG CGTTTCGCCC AGTGCCGAAA 2040
ATGTACCTTA ACAATCCTAT CCTCACGCGA GTGAAAACTA ATAACCGCAA GTCGACCCCC 2100
AGgCGCAAGC GCCGTAAACG CTGCCGTAAG AAGGCGTGGC AAACGCTCAA GCTCCCTGTT 2160
CAyCGCAATG CGCAATGCTT GAAACGCCTT GGTTGCCGGA TGGAGCTTCG GCAAAACCCC 2220
AAGAACACgC GCCGCTTCCC AAAmCGCGCC GTyCGCATCG GCGGCaCCAC GCGCGCAACG 2280
ACTTCTGCAA ATGCGCGCGC AGAGcAAAAG GGCGCCTGCC cGaaCTGCGc GCACACCGCC 2340
TGCGCAATCC GACGCGCGTA ACGTTCCTCT CCCCCTTCAA AAAACAATTG TGCCAAAnCG 2400
TCTGCGGCAG CCGATTCAGG AGGTCTGCAG CGGTCTGGGA GG 2442
(2) INFORMATION FOR SEQ ID NO: 95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1921 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
ATCCGGGATG CGCGGCCCCC TGGGGGAAAC GGAAATTGAA ATTCGCGCCG GGGCAGCCCG 60
GGTGTGCCGC TCGCCgTGCG CGAACGGC C kTGCATCGCG CACCCGCCGG TGCAGCGGGT 120
GGGGGAGTGG AACGCCTGTC TACCGAACGG CGTCTTTCTG TACtGCACGG CACgACGCGG 180
CTGAACCcGA AGCAGACGCC GTGCAGTAAC GCGCCGGcgC GCCCGCGCGC GTGCCGGCAG 240
ACGGCGCTGG TGGGACAGTC CGCAGGCGGC GCCCCGCTAC CACGCGATTT TAAGGCCGCA 300
CACAAAGGTG CCAAAGTGCT TATCTGAGGA AATATCTTCG GTAATTAGCA TGTACGGCGC 360
GGTGGCCAGC CGCCCCTGCT CCCACCGGGC GTCAAACTCC ACTTTTTCTA TTGGGCTAAC 420
GGTGAACCCC ACGTGGTACT GCATTGCCTT TTCGCGTAgc AGATTGTTAG CTTTGTCGGT 480
GTTGAAACGG TTAGTGGTTC CGTACACCAC CGCGTACGGC TTGAGCCAGG CGTGGGAGCC 540
GAGCGCAATC TGATAGCTCA GCCACGTTTT GCCCACGACC GGCAGgTTGA TAGGCCCCCT 600
CATTGTAGTC TTCTTGTAAT CGATGCCCCC GTTGTTTACG TAGGAGGTGT AGGTGAAGGG 660
AATGTACAGG CGCGCCTCCA CCCCCGCGTT CAAGCCGGTG AGCAGGTGCG TGTAGGGGTC 720
GCCGGACTTT GTTTCTAATT TTAAGAATGC GGCGCAGTCC AAGTAGTCTG TAAGCTGCCT 780
GGAAAAGACG CGCTTGCCAA ACATATTGGC GCCGGCGGTG GCAAAATACG CGCCGGAAGA 840
AAGCCACTTC CACTGCATGC TCAGCAAGGC GTCAgcGcCC AACTCGTGCA CGTGCAGGCG 900
GTGCAGCCAC GCAAAGAACG TGACGAGCGC AATGGCCGGG TCCGGCTTTT TAAGGAGCTC 960
GGCGAATCCC TGGGTGAGCT GGTTTACCAC CGCTCCCAGG CCCAGTGTGC TCTGAAGTAG 1020
ATTGGAGAGT TTTCCTTCCA AATGGGAAAC CACcAGTGCC AcTTGCGCTA TGAaGGCAGC 1080
GCCCGcAcTT TTTTGACAAA AAAcTCTCCG ACGCCACCGT TGGCCATaTC TGGGCCATTG 1140
TGCCCAAAAT cGGTTAACGy TGCAtTmAGG CTTcTGCCGm CTTGGsCTGC AgTTTACCGT 1200
CTTCCCTATA TCCGCGTCGC TCCGGTGAAT GTTTCCCACA TCAAGCGCCA GCACGAGCCG 1260
AAACCCGTAC CCCGGGGTCA GCGTCAGCCG TCCGCCGGCG CTCCACAGCA GGCGGTTTTT 1320
CTGCTCATTG TTTTTTCCCT GCTCAGTCCC CGTGGTGTAT TGGGGCTCCA GCGTGGCGTT 1380
CCCCGCCAGC TCCATCTTGA TGCGCTCAGC GCGGTGGTGC GTGTACGTAA GCGTGGCGTC 1440
TGCTCCAAAG CCGTACTTGC TGTGCGCAGT ACCACTATCC CACATACCGT TTGACGCAAA 1500
CGAGAGCAAG CCCACGTCCA ACCCAATGCC AtGCCGCCGA TATCCTGCGC ACGGTAGCCG 1560
AGCTTGCCCC CATAGCCGCC AAAGCCGGGs CATAGCGCAC GTCCTCCTGC TTGTAGTCGC 1620
TGGTCACGAA CGGGTCCCAC aGtGCGCAAA GTTAATAAAG CAGTTCGGGT CCTTGCCAAT 1680
GGTCAGGTAC GCGTTGTAGC AGTGGAGCGT TGCTTCAAAA GACGCTTTGG GTTTTTTGAG 1740
CGTAAAGGCC TGCCCCGGCC TGGGGGATTC AAAATCAACG GTCAGGTCCT TGAGCCGCAG 1800
TCCGCCCACA CGCCTGAGCG CGCCCCGCCG CGACGCAGGT GCGTGGCCTT GGGCACGAGG 1860
GGGAGCGAGA TTTTCAAATC ATTGGTGGTG CGAAACCCGT GCGTGTACTC GTTnnTTGCA 1920
C 1921
(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 658 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:
GTCAATTGCA TTAAGGTTAT CACGCGCGCC CACACCCTCG GCTATTGCGG CGGGGTGCGT 60
ATGGCGGTGC GCATGGCAGA ACACGCCCGC GCCCACCACC GCGGGAGtTC TACACGCTCG 120
GTCCGCTCGT CCACAATCCC GTGACGCTCG CCCGTTTGCG CGCGCGTGGC ATTGAGTGTC 180
TGGATCCTGC TCATCTATCT TTTGCGCTGC ACGCTCCTGC GGcACCGGGC GCACGCCGCA 240
TGCAGTGGAA GAAAAGACGG CGCGTACCGT GGTGATTAGA GCGCATGGCG TGGCACCTGA 300
GGTGTATGAG GCCCTCGAGC GTTCCGGAGC GCAGGTGGTG GACGCCACCT GCCCGCGAGT 360
TAAGGAAAGT CAGCGGCGTG CTCAGGGTTT TGCCGCGCAG GGACTGCACG TTATTCTCGC 420
CGGGGACCGC AATCATGGGG AAATCGTTGG CATCGAGGGG TATGTGCGCG CGGGAGCTGC 480
GCAGGCGTGC ArCCACTTGC CAGGCGGCGC ACCAGACGGC ATGCTGCCAC AGGTGCAGTG 540
CTTTGTGGTG CAAAACGCGC GTGAgGCTGC CGCGTTGCCG TGTTTAgCGC GTGCAGctTC 600
CTTgCCCAAA cTACCATTAC ACAgGGTGAA TAmGACGCGA TyGCCGCTGC GGgCGTAA 658
(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 763 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
TTCTGCAGTG TACGTTAAAT CGGACGCATA CTATGGGAGG TGAGCATTGA TAAGGCGTAG 60
ATATCGTGGT TGTACGCAGG AGCGTGGATA GTAAGTGTTG GTATGCTATT TGCATCGTGC 120
ACTTCAGGGG CGTGGAAGGC ATCAGTAGAT CCGTTGGGGG TTGTGGGATC TGGTGCAGAT 180
GTGTACCTGT ATTTCCCTGT AGCGGGGAAC GAGAATTTGA TTTCTCGTAT TATCGAGAAC 240
CATGAGTCAA AGGCAGATAT TAAAAAAATA GTGGACAGGA CTACCGCGGT ATACGGTGCT 300
TTTTTTGCCC GATCAAAAGA GTTTCGTTTG TTCGGAAGCG GTTCGTATCC ATACGCCTTT 360
ACTAATTTGA TTTTTTCTCG ATCCGATGGC TGGGCATCTA CGAAAACGGA ACACGGAATC 420
ACGTACTATG AAAGTGAACA TACGGACGTT TCGATTCCTG CGCCGCATTT TTCCTGTGTG 480
ATTTTTGGTT CCTCCAAAAG GGAGCGGATG AGCAAAATGC TGTCTCGGCT CGTTAACCCC 540
GATCGACCGC AGTTACCGCC TCGCTTTGAA AAAGAATGTA CGTCGGAAGG TACGAGCCAG 600
ACTGTTGCAC TCTATATAAA AAACGGGGGA CACTTTATTA CCAAACTGTT GAATTTTCCG 660
CAGCTTAATT TACCACTTGG GGCAATGGAA CTGTACTTGA CCGCGCGGAG GAATGAGTAT 720
CTTTACACGT TGAGCTTGCA GCTGGGGAAT GCAAAGATAA ATT 763
(2) INFORMATION FOR SEQ ID NO: 98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4968 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
GGCCCCAATC CCATAAAATA CTCTGGTTGT GTTTGCTGCG TATGACGTGA TAGTCAGCTC 60
TTCCACCGTA AATGAGATCG CAAATACGGG GGCGGGTGCG ATATACAATG AGCATACAAG 120
GATCATGCAT TCGTATCGTA AACGGTACAG TTTGATCTGC GCTCGATTCG AGTACGGACT 180
TTCTACTACT CACAACCTTC AAACTACGGG AATCGCTTTT GTCGACAATT AAAAGTTGAT 240
ACATAACGGA GTGAAATACT TCACCAACCG ACAGTATAAC CCCATTGTGT TTTCTAGTTG 300
TGTGAGATTC AAATCCCAAG GATGGCACGG AACTGAGTAC GCGCCAATAC ACCTGcCTTA 360
ATCGTATAAA GCTTGCAGGT ACCACCACGT CGCTAAAAAC AAATACAAAA AGCGCACTGC 420
ACAAACCAAT CGACAAAAGA GGGAATATCA AAAACCGCGG CGAAACGCCC AGCACTTGCA 480
GCGCAAgcAG TTCAAGGTTG TTACGCAGCC GCCCCAACCC CAAGAGGGAA CCGAcCAACA 540
CGGCAAAAGG AGTAACGTAA ATCAACGCAG CGGGAATGGA ATACAGCAAC AGCCGGACCA 600
CGTCCGCAGC GCTCACGTGC TTGGTAAGTA ATGTCTGCGC AAAAAGCAGT ACGTTGTTCA 660
CAAAAAAAAC CAAGAAAAAG CACAGCACGC CCACCAGGAT GTGTTTGAAC ACTACCTCGC 720
ACACGTACAC AAAAAGCACT CTTCTCCACA TCTAGGTAAC TACAGAACCC CTGTGGATCC 780
GTACCCTCCT TTTCCCCGTT CAGTCAGCGA GATACTACCG GAGCGCACAT ATTCAACACG 840
GAGCACAGAA GAAATCACCG CCTGCGCGAT ACGATCTCCA TGGGAGACTA CGAAAGCAGC 900
TAGCCCAAGG TTGACAAGCA ACACGCGTAT TTCCCCCCGG TAGTcAGCGT CTATCGTTCC 960
CGGGGAATTT AAAACCGTCA CTCCGTATTC GAGTGCTAAC CCAGAGCGGG GACGAATTTG 1020
CATCTCCAAC CCCACAGGAA GCTCCACACA AACACCCGTG GGGACGAGAA CCCGGCCCAG 1080
GGGATGAACC TCAAGCGGTC CTCCGGGAAG AAAGGCCCGC AAATCGGCTC CACTTGAGCC 1140
TAACGTCTGG TACTCGGGAA AAGAAGCTCC CGGATACACG ACAGCTCGCA CACGGATCAT 1200
TTCGTCGTGT CCAGACCTCC AGACTTCCCC TCAAGAGCAT CGATATAGGA AAGATTCAAA 1260
CGACCCATCC TGTCGATATC AATCAACTTC ACACATATCC GCTGACCCTC TTGCAGCACA 1320
TCGCTGACTT TGGACACGCG GCTGCGCGAC AGctTTGAAA CGTGGCAGAG TCCTTCCTTC 1380
CCTGGAAAGA TCTCCACAAA AGCACCGAAC TCTACGATTC GTTTCACTAC ACCCTGATAC 1440
ACCCTCCCTA CCCGAGGATC TTCAGTAAGG CCCACCACGG CGACCTTTGC GTCGAAAACG 1500
GACTGCGCAT CCCTTCCGGA GACGGTTACG GTACCGTCAC TATCAGTGTT GATAGTCACC 1560
CGATACTGGT CAGAAAGCGA CTTAACGGTT TTCCCCCCAG GACCGATGAG CGCGCCGATT 1620
TTTTCAACCG CTATTTTAAA ACTCTCAATA TGCGGCGCAT AGCGAGAAAT GTGcACGCTC 1680
GGTGCGCTGa TTGTCTGATT CATGACAGAA AGAATATGGA GCCTACCTAC ACGAGCTTGC 1740
TGCAAAGCCT CCTTCATCAG AGACGCAGAC ACCGCCTCTA CCTTCACATC CATCTGAAAA 1800
CCGGTAATGC CGTCACACGT ACCTGCTACC TTGAAATCCA TATCACCGAG ATGGTCCTCC 1860
TCACCCAAAA kaTCCGAAAG AATCGCATAT CGCACGCCAT CGGTGATGAG CCCCATCGCG 1920
ATTCCCGCAA CAGGCTTTTT GATTGGGACC CCTGCATGGA GAAGAGAAAG CGTCCCTGAG 1980
CACACAGTCG CCATGGAGGA AGATCCATTC GACTCCAAAA TTTCTGAAAC CACACGCACG 2040
GTGTAAGGAA ACTGTTCTGG ATCCGGAATG ACTGCCGAGA GGGAACGATG CGCTAGACAC 2100
CCGTGCCCAA TCTCCCTCCG ACCAACCCCC ATTCTCCCTA TTTCCCCCAC TGAAAAAGGA 2160
GGAAAATTAT AGTGAAGGAT AAAATTCTCC CGTCTATCCC CTTCGATGTC GTCGTACACT 2220
TGCCCGTCCG ACATAGCACC GAGCGTGACC ACCGCGAGCG ATTGAGTCTC CCCCCGGGTA 2280
AACACCGCAG ACCCATGCGG ACGCGGCAAC ACCCCGACCT CACAGGCGAT GGGCCGAATG 2340
GCATCAATGG CACGGCCATC GATGCGCAAA CCCCTGTCAA GAATGTTCAG CCGTAGTATC 2400
TCATACTCCA TCTCGTGGAA CAACGCGTCG AACAACCTGC GCTGCACATC GTTCTCAAGC 2460
TGAGCAGCAT ACTGCTGTGC AACATCACGC TTCAcCGCGT CGCAGGCACT GcgCCGCTCA 2520
CCCTTCCCCT GTGCATACAA AGCCTGCGCA AGACGCGGAT AGGCGAGCTC ATAAATACGA 2580
TCGCGACCTA CAAGcTGCGC AGAAGAAGGG ATAACCGTCT GTTTCTCCTT GCCACACAGT 2640
CCACGCAGAC GCTCCTGCAT ATCGCAAAGG GCTTTAATAT GCTCTTGTGC CTGTTCGAGC 2700
GCGCCGAGCA TGAGGTCCTC GGACACCTCT CGCGCACCAC CTTCCACCAT GGTAATGCCC 2760
TGCCTAGTGC CTGCAACGAC AACCTCCATA CTGGCGGCAT CAATCTGAGA AAAGGTAGGA 2820
TTAATAACAT AGGAACCGTT CAGATATGCA ACGCGGACTG CAGCAACCGG TCCATGGAAG 2880
GGGATATCCG AAAGAGTAAC GGCAGCTGAA CTGGCAACAA TAGCCAAGAC GTCATGAGGA 2940
TGGACCATAT CCGACGATAT GCACGTAGGG ACAACGTGTA TATCACGTCC AAACTCCTTT 3000
TCAAAGAGCG GCCGCATCGG ACGATCAATG AGGCGCGAAA TGAGAATCTC TCTGTCTTTC 3060
GGACGGCCTT CACGCTTGAT GAAGCCGCCA GGCATCTTCC CCACCGCATA ATACTTTTCG 3120
TTGAAGTCAA CAGTGAGCGG GACATAGTCG AGCCCTTCCT GTCGCTGAGC AGAGGAGCAT 3180
ACGGTCGCGA GAATCGCCGT ACCTTCACAC TGTAAATACA CGGACCCGTT CGCTTGCCGC 3240
GCCAGATACC CACTTTCAAG GAGAAGGGGG TGGTCCCCAA TAGTGCCGGT TATGCTGTGT 3300
TTCATACGTT TCCTAAGAAC AGAATGTATC GCAGGCCGCA CCAAGCCCTG GCACAGCCAC 3360
CTGCGAAGCT AGACAAAGAA AGCAGAACGG ACAACCGTCC TATGCACAAC CCCTCGGCGG 3420
CACGCACTGC GAAACTTCCC CATGCAAAGC CCCTACTTCC TCAAACCGAG GCTTTTGACA 3480
AGCGAACGAT ACGCCCCCAT ACTCACGCGC CGGGAATATC TCAACAGGCG ACGACGACGC 3540
CCCACCAGAA CGAGCAACCC CCGGTTGCTA CTTTTGTCtT CGGATGAACC TTACAGTGGT 3600
CAGTCAAcTG CCTAATCCTC TCAGTGAGAA GAGCGATCTG CACACTCGAA GATCCTGTAT 3660
CCTTTTCTCC TGAGCCGTAC TGCTGCACTA CCGAAGCAGT ACGTTCCTTT GTCAGTGCCA 3720
TTCTCCTTCT CCTTTCCTAC TCGAACTCGA TAGAGTCAAG CGCCTCCCGC CCATGCGACA 3780
GCGCCTCGGT ACCGAGCCTC AGCTGCACGA GAGAGGAACC CACACTAAGT GCGCACGCAG 3840
ACCCACCCCA CTCCCCTGTA CAAGGGTACC GCATACTCGC CCTCAGGTGG AAGCGTCTGA 3900
CTCACACGCC CACGCTCGGn GCACAACGTG CGGGCACACC CCyCGCTCTG CTGCCAGGGA 3960
ATTACCTCCC CGTCAAGCGA AAACGCACGC CCCAGCAGAC GCCGCGCGCT CTCAAAGTCT 4020
GCGCAGCGcA cckCACGGGT ACCGCGCTCG AACTCACCCG AACTCCCTCG AGAGTGTAGT 4080
GACCCACCGC GTCaGACAAA GCTATGCGCA TCACCAAGCC GACGCAGCTC ACGGACCCCG 4140
GTGTCCAGCC CATGTCCACA CCGAAAGTCG ACACCAACTG CGAGGTAGCA CACCCTCACT 4200
GCACGCAACA GGGTGTTGAA GAAGACACCA CCCGGGATTC TAGCAAAATC CTTGGAAAAG 4260
TCAATGAGCA CGACGAAATC AAAGCCGCGC GCACGAAAAT AACGGAGTCG CAAACGCAGC 4320
GTTGACAGAT CCCCCTCGTA AGAAGAGGTT TTGTGCTTCC TGGGAGGATG AGTAAAGGTG 4380
ATCAGCCCGG TGCACCGCGC CCGATCGGCC ACAGGAGCGC ACGCAGCGGC AAACACCTTA 4440
TCGAAGAGAA AAGCATGTCC CCGGTGAGGA CCGTCAAACC CCCCCACCGA TATTGCTGCT 4500
CCCCGATCAC ACGCTATACA TGCACCCTCC TGCAACTGAG ACCAACGAAA AATGCGnCAC 4560
GCCTCACTCC GTACAAAATA CCGCCATCAT ACGAAAACCC ATTCGCCGCT TTCCTTATTA 4620
TCCAAGACAG CGACCCTCGG AAAACACCAA CGCACGTTCC CCGGGCTTCG CGTACCAAAC 4680
GACTGGAACC AACATGCGCG TATTCTTTCC CCATGCAAAA ACCGCGAACG CATACTGCGC 4740
ATGCACTGAA CACGCCGTAA GTCCAATACG ATTCGCAAAG TCCACATCAC AGCTAACCAC 4800
TGCCTGCTTA ATCTCACGCA CCGTCAAGTC TTCACACCCG AAAGAGACTG AATCGGTCGC 4860
CACGCCGCCT GAGGACGGAG GCGGTGGAGC ACCCACATCA AAGCAAGCGG CACCGCAAGA 4920
CGCACGCTCT TTGCCCCATG TCCAGGACCC GAGACGAACA CACCCGCC 4968
(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6086 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGTTTCGCGC GGGCTAAAGT GGGTTGGTCG CGTGTGGGTG CCAGCGCGCT GTACTTGTCG 60
CGCGTTATTT CGTGCATTGA GCGTTCTAAG GTGGTTTAGC GCTTATGAAG GAGAATTCTT 120
GCACGGCGTG CAGCAGACGG CTCGCCTTGT TCGTGGGCGC TGCGGTGCTT GTGGTAGGCT 180
GTTCATCCAA GACGGATGTC ACGCTCAACC GTGACAAGCC CCTAGTGTTT TTTAACCGAC 240
AGCCCTCTGA ATCCCTCACG GGGAAGGTTG ACATGGCTGC CATGAACTGG AACGACAAAA 300
CCTATTACGT GGGTTTTGAC GCTAAGTTTG GTGGTTCTAT ACAGGGAAAG ATGATTCTAG 360
ACTTCCTCGC CTCTTCTGAG TCCTCGGTTG ACCGCAACGG TGACGGCATC ATCGGTTATG 420
TGCTTTGCAT CGGTGACGTC GGGCACAATG ATTCGAAAGT CCGCACCGAG GGTATTCGCC 480
GCGCGTTGGG CACGTGGACC GGCTCCTCGG ATCCGGGACA GGCGAAAGAA GGCCAGGCAG 540
TGGTAGGAGG GAAATCCTAC AAGGTGGTGG AGCTCGAGGG AAAGGCGATG ACGGGAACTG 600
ACGGTTCCAC TTGGAATACG AATTCTGCAA CCGAGTCAAT GGGAAGCTGG GTGGCAAAgT 660
TCGCGGATAA GATAGACCTG GTCATCTCAA ACAACGACGG GATGGCAATG GGCTGTCTGC 720
AGGCGTCCAA TTATCCGCGG GGGCTGCCTA TTTTCGGATA CGACGCAAAT GCGGACGCGG 780
TCGAGTCGGT TGGTAAGGGT GAGCTCACGG GGACTGTCTC TCAGAACGTC GACGCGCAGG 840
CTGTTGCAGT GTTGCAGATT ATCAGGAATT TGCTCGATGG CTCCAGCGGG GAAGATGTGG 900
TCGCCAACGG TATTTCAAGA CCTGACGCCC ATGGCAACAA GATAAGCGCG CCCGTGCAGT 960
ACTGGGaAGA TGTTAAAGCG ATTATGGCCG ATAACTCGGA GGTCACGAgC GCmAACTGGA 1020
AAGAGTACAC CAGGGGAGCA CGGGATGCAG GGGTGCGACA GGTAAGTGCG CCGACGAAAA 1080
AGGTGCTGCT CACTGTCCAC AACGCGAGCA ATGATTTCCT TGCTTCTGCC TATCTTCCCG 1140
CACTGAAGCA TTACGCTCCG CTCCTGAATG TCGATCTCAC TGTCGTGCAG GGCGATGGCC 1200
AAAACGAGCT AAGTTGCCTT GATAAGTTCA CTAATCTCGA CATGTTCGAC GCGTTCGCGG 1260
TaAACATGGT AAAAACGAAC TCGGGCGCTG ACTATACAGA CAAGCTCAAA TACTGAGCAG 1320
CCGGGTTTGG ACGTGCGTTG GGTAGCTGCT GTTCCTGGTG CACGTCCTGT TCGTTGAATA 1380
GGTAGGGTCT ACCGACCTCG CACCGCTTTC GCGCGCGAGA GGAGTGATAG TTGCGATGTG 1440
CGATGTACTC ACCATAAGGG ATCTTTCTAA GTCTTTTGCG AGGAACAGGG TTCTCAACGG 1500
GGTGAACTTC CGTATGGGAA AGGGTGCCGT GGTGGGGCTT ATGGGAGAAA ATGGTGCGGG 1560
AAAATCCACG CTtATGAAGT GCCTCTTTGG AATGTACGCT AAGGACACTG GTCAGATTCT 1620
CGTGGATGGA AGCCCGGTGG ACTTTCAGTC TCCCAAAGAA GCGCTAGAAA ACGGTGTCGC 1680
CATGGTCCAT CAGGAGCTCA ATCAATGCCT TGATCGCACT GTCATGGACA ATTTGTTTCT 1740
CGGCAGGTAC CCTGCCCGTT TCGGGATAGT TGACGAGAAA CGCATGTTCG ACGACTCCCT 1800
CACTCTGTTC GCTTCCTTGA AAATGGACGT AAACCCGCGG GCCGTCATGC GCAgcATkTC 1860
TGTcTCTCAG CGGCAGATGG TAGAGATTGC CAAGGCGATG TCCTATAACG CGAAGATTAT 1920
AGTCCTCGAC GAGCCTACTT CCTCTCTCAC GGAGAGGGAG ATTGTCAGGC TCTTTGCCAT 1980
TATACGAGAC CTGAGCAAAA AAGGAGTGGC ATTCATCTAT ATCTCCCACA AAATGGATGA 2040
GATCTTTCAG ATCTGCAGCG AGGTGATTGT GCTGCGGGAT GGTGTCCTCA CGCTCTCACA 2100
ATCCATAGGG GAAGTGGAAA TGAGCGACCT CATCACCGCT ATGGTCGGGC GCACTTTGGA 2160
CAAGCGCTTT CCCGACGCTG ACAATACCGT CGGTGACGAT TATCTTGAAA TACGAGGTCT 2220
TTCTACAAGG TATGctCCGC AGCTGCGGGA TATTTCCCTT TCTGTGAAAA GGGGCGAGAT 2280
TTTTGGCTTG TACGGGCTGG TCGGTGCGGG GAGGAGTGAA CTGCTTGAAG CGATTTTCGG 2340
CCTGCGTACC ATCGCAGACG GTGAGATCTC TTTAGCAGGA AAAAAAATTC GCTTGAAGAG 2400
CAGCAGGGAC GCAATGAAAC TCAATTTCGC CTTTGTGCCC GAGGAACGTA AGCTCAACGG 2460
AATGTTCGCA AAGGGGAGCA TAGAGTATAA CACCACGATT GCAAATCTCC CTGCGTATAA 2520
GCGTTACGGT CTACTCTCAA AGAAAAAGCT GCAGGAGGCA GCGGaGsGGG AAATAAAGGC 2580
CATGCGCGTG AAGTGCGTTT CTCCAAGCGA GCTTATCAGT GCGCTCAGCG GGGGTAATCA 2640
GCAGAAAGTC ATTATTGGAA AGTGGCTCGA ACGCGATCCC GACGTCCTCT TGCTTGATGA 2700
GCCGACCAGG GGGATCGACG TGGGTGCGAA ATATGAAATT TATCAGCTCA TCATTCGTAT 2760
GGCGCGTGAG GGAAAGACAA TCATTGTGGT TTCTAGTGAA ATGCCTGAAA TTCTTGGAAT 2820
CACCAACAGG ATCGCAGTCA TGTCCAATTA TCGATTGGCT GGGATTGTGG ATACAAAGAG 2880
TACCGATCAG GAAGCCTTGC TCAGACTTTC TGCGCGATAC CTGTAGGGAG GAGCAGATAC 2940
ATGCGCGATC GTACACAGTG TGTGGCGGTG CCAACTCAAG CGTTCAATGA GATTTTAGAT 3000
CAGGACGGTC AGCTCACCGC GTACGCCCAA AGGCTCGAGC AGTTACGAGA GCGCGGTTCC 3060
CATAGGGTTG CCTTGCTCCG CGGGGAGCTT GCGCGCATAC GGCAGGATCA GGTCTTGGGC 3120
ATGCCGGAGA AAAGGGTGCA GGTTGCGGCG CACAGGCTCA AGATTTCCGA AGCGCAGGCC 3180
GTTGCACGAC AGTrmAAAAC TGAGGAAACG CAgTTGGtTA GGAArGsTGT CGCGCGTGTA 3240
AgGGGGCTCt TTCGAGACTT TGACTGCTCT GTGCGCGACG CGATGCGCGA ACAGCGGCTC 3300
TTGCTAAAGC AAGTTGCGAC GGTGCaGCAC ACCTCTGCCT CATCTGACCA AAGAGAGCAC 3360
TGTCTGGCTC AGCTCCGGCA ATGCmAGGAG GCGCGACACC ACGCCTACCG TTCCTTGGTC 3420
GAAAAGaGCt GCGCTGCGGA ACGGGAAAAT GaCGTTTATC GAGCGCGTGG TGCGTGCTCT 3480
TAGAGAATAT TCGTTCAATT TTGACGCAAC CCAGTTCTTC CTCGCAAATG GTTTGTACAT 3540
TGCTATTGCG GTATTCTTTA TTGCGTGCAT CGTAgTTGCA CCTTTCTCTG GTAATGGCAA 3600
TCTTCTTACC ATTCCCAACA TTCTCACCAT ACTGGAGCAG TCTTCAGTGC GCATGTTCTA 3660
TGCGGTGGGA GTAGCAGGTA TTATCCTGCT GGCAGGAACT GACCTCAGCA TTGGGCGTAT 3720
GGTGGCAATG GGGTCTGTAG TCACGGGTAT TATTCTTCAT CCGGGACAGA ATATCGTTAC 3780
ATTTTTTGGA CTGGGGCCGT GGGATTTTAC CCCTGTCCCC ATGGCTGTCC GTGTAGTCAT 3840
GTCACTTGcA GTTTCTGTCG CACTTTGCGT TTCGTTCAGC CTATTTGCAG GATTCTTTTC 3900
TGCTCGCCTC AAAATACACC CTTTCATTTC AACTCTTGCA ACGCAGCTTA TCATCTACGG 3960
GGTTTTGTTT TTTGGGACAA GTGGTACGCC AGTTGGCTCT ATTGACCCAT ACATCAAAGA 4020
CCTATTCGGT GGGCGGTGGA TTCTAGGCAC CATGCAGGGC ACACTCGTGA CCTTCCCAAA 4080
GCTGATAATT CCTGCCACCA TTGCGGTGGC CATCGCGTGG TTCATTTGGA ACAAGACGAT 4140
TCTAGGAAAA AATATGTACG CCGTTGGAGG GAATGCTGAG GCAGCGAATG TTAGCGGCAT 4200
CAGTGTTTTC GGGGTGACTA TGAGCGTTTT TGCAATGGCA GCTGTGTTTT ATGGCTTTGG 4260
CGCGTTTTTT GAGACGTTCA AGGCAAATGC AAGTGCGGGC ACTGGTCAGG GTTATGAGCT 4320
CGACGCAATT GCCTCCTGTG TGGTAGGGGG TATCTCCTTC AACGGGGGAA TCGGAAAACT 4380
CGAGGGTGCC GTGGTAGGCG TAATCATTTT CACCGGTCTT ACCTATTGTC TGACTTTTTT 4440
AGGCATCGAT ACAAATCTTC AGTTCGTGTT CAAGGGTTTG ATCATCATCG CTGcAGTTGC 4500
ACTCGACAGT GTGAAGTATC TGAAACGCCG CTAGTTCTTG CCCcGCTGGG CGGGAcGTCA 4560
ACGTTCACAA TACGAATAAG CCGGGCGCCT TTCTGGGcCA TTGTTCCCTC TTTGGCTAAC 4620
TCAGGGTGTG GGCTGACAaG AAGGCcTCCG CTGTCCGAGC TCTACCGTGC TTCAGATGAG 4680
CCcTTTtCTT TTCTCAGTAG TTCGAACGnc yTCGCGCGCA ACTTGGAGGA TAGGGTAATC 4740
TCTTACTGGA TCCGCAACCC GAAATCCACT GTACCCAGAT TGCTCAAAGT TCTTGGTATC 4800
TCCCACATCC CcTGGTCCTC GTAACTTCAG ATCCTCTTCG GCGATAACAA ATCCATCCGC 4860
aGTACTTCCC ATAaTTTTCA GCCTGCGTTT CGCACACTCA GTCATTTCGT CCCCATGCAT 4920
TAaGAAACAA TACGACTGCA CATCACCCCG ACCAACCCGA CCACGCAGTT GATGTAGTGC 4980
AGAGAGGCCA AAAACTCCGC gTGCTCTATA ACGATACAAT TCGCATTTGG TACATCCACT 5040
CCCACTTCAA CAACGCTTGT AGCAACCAAG ATATGGACGG TACCTTCGCT GAAATACTTC 5100
ATGATACGCT GCTGCTCTTC CTCAGTCATT TTTGAGTGAA TCATCGCAAC AGCATATCGT 5160
GCAAAATAAT TTTTTAGATA CATATACATA CATTGCACCG ATTTTAAATC GGTTAATCCT 5220
ATGTCATGAA TACGTGGATA AATAAAATAT GCCTGCCTAC CTTTTTCTAT TTCATTTCCC 5280
ACAAACTCAT ACACCTTTTC TGCTTTCGTC TTTCTTGCAA TATACGTAAT CaCTGGTTTT 5340
CTTCCACCAG GCaTAGATTT AATTATTGAA ATATCTAAAT CACCAAATAC AGAAAGTGCa 5400
AGCGTACGTG GAATTGGAGT TGCGCTCATC ATAATAATGT GTGGAGTCTT TCCCTGAGGG 5460
TTCCCTTCCC TTCCTTTCTG AATCAAGGCC GAACGCTGTA ACACTCCAAA ACGATGCTGT 5520
TCGTCAATGA TAACCAACCT CAGATCATGG TATCTTACGC TCTTTGAAAA CAGCGCATGT 5580
GTTCCTACAA CTAAATTGAT TTCTCCTGCA ACAAGAGCTT CGAGCAAGTA CGCCCTTCCT 5640
TCACTTTTCA CATTACCTGT CAGGAATGCA AGTCGAATCC CAATAGGAGC AAGTAATCGA 5700
GCTGCaGTGT CAGCATGCTG GCGTGCAAGT AATTCAGTTG GAGCAAGCAG TGCGACCTGT 5760
CCACCTTGTT CAATAATTTT TAAACAAGAA AAAAACGCCa CTAACGTTTT TCCTGATCCA 5820
ACGTCTCCCT GAATTAGCCG TGCCATCGGT TCTTCTCTTT CAAGATCCTG CGTAATTTCT 5880
GTAATTACTC TTTTCTGATC CACTGTCAAC TCAAATGGCA AACACCGGTG AAGTTTCTTC 5940
TGTAACAAAG ATAATTCAGA AACAACTGAC GGAATAGCCG ACTGCTGATC AGATTCTCCC 6000
TGTGTAAGAG GCAATCTCCC CCGCTTCTGT AAAGAGCGCA TACCGATAGT CATTTGAAGA 6060
GAAAAAAATT CTTCAAATAT CAAAGA 6086
(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20757 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
GCAGCGGCCG GTGTGGGCGC CCACCGGCGG GCGGTATGCG TCTTTGGACG GTGCGTTTAC 60
CGCGCTGGCA ATGATGCAAG TTTCTTTGAG GCAAATCCGG CAGGAAGTGC GAACATGACG 120
CACGGGrAGC TGGCTTTCTT CCATACCACT GGCTTTGGCT CGTTTCACGC CGAAACGCTC 180
TCTTACGTTG GCCAGTCGGG CAACTGGGGA TACGGCGCGT CGATGCGTAT GTTTTTCCCT 240
GAATCTGGGT TTGACTTTTC TACCACCACG GAGCCCGTGT GCACACCTGC TTCGAACCCC 300
ATTAAGCAGC GCGrGGCAAT TGGAATCATC AACTTTGCCC GGCGTATCGG AGGTCTCTCC 360
CTGGGAGCCA ACCTGAAGGC GGGGTTCCGC GACGCGCAGG GCCTGCAGCA CACCTCTGTC 420
TCCAGTGACA TCGGCTTGCA GTGGGTGGGG AACGTTGCCA AGTCCTTTAC CTCTGAAGAG 480
CCCAACCTGT ACATCGGGCT TGCGGCCACC AACTTGGGAT TGACCGTAAA GGTCTCGGAC 540
AAGATAGAGA ACTGCACGAG TACCTGTGAA AAGTGTGGTT GCTGCAAGGA GAGGTGCTGC 600
TGCAACGGCA AGAAGGCGTG CTGCAAGGAC TGCGACTGTA ACTGCCCCTG TCAGGACTGC 660
AACGACAAAG GTACGGTGCA CGCAACAGAC ACCATGCTGC GTGCAGGGTT TGCATACCGG 720
CCCTTCAGCT GGTTCCTCTT TAGCCTTGGT GCCACCACCA GCATGAATGT GCAGACCTTG 780
GCTAGTAGTG ACGCCAAGTC GCTGTACCAG AACCTGGCTT ACAGCATAGG CGCCATGTTT 840
GATCCCTTCA GCTTCCTGAG CTTGAGTTCG AGCTTCCGCA TCAACCACAA GGCTAACATG 900
CGAGTGGGAG TGGGTGCAGA GGCGCGCATT GCCCGCATTA AGCTGAACGC GGGATACCGC 960
TGTGACGTCA GCGACATCAG CAGTGGGAGT GGGTGCACAG GCGCGAAGGC TTCGCACTAC 1020
CTTTCCTTGG GTGGCGCGAT ACTGCTCGGC CGAAATTAAT TCATAATATG CCGGGGCGCC 1080
CGCCGGTGCC CTGCTGAAGA ATGCGGaCGG cAAGACGTGG AGGGGGTTTT GCCGCTTTTT 1140
TGGTGCGGCG GCCGTGTGTG TCGGCTGCGC CAGTACGCGT AGGAGGACGA TTGGTGTTCG 1200
GTTTTGCACG CGTCGGTTCG CGCGGGCTCT GCTTGGGGGC CCTCCTGCTC TCCCyTCGCA 1260
TCgTGTTGGC ACAGCACGTT GCTGACGCTC CTTTGGGCGC ACGCGGGGTT GTTCCGCGCA 1320
GnTCCTTGCC TCGGCGCACG CGGGCGGCCC GGGCTACGAC GCTGCGATCT CGGGGCGGCG 1380
TGGTCAGTTC CgCGCGAGCG GGGGAACGCT CGTGGTCACC GCGCAGAAAC CGAAGGTCAT 1440
GGCACGAAAT GACGTGGACT ACCGTCCGCT CTCCCTGCAG GCCGGCGGCA GACAAGGCTC 1500
GTTGGACCTT GTTGCCACgC AACGGCGGAT GACGCCAGCT TCTTTGAAGC GAACGCCGcA 1560
GGAAGCGCCA CCATAcCGCG CATGACGCTC GCCTTTTTTC ACACCATGCG CATTTCCGAC 1620
TCCCACATAG ACGTACTTTC CTTTGTCGGG CGGGCGGGGC GCACCGGCTA CGGCGTTTCG 1680
GCACGCGCCT TTTACCCAGA CATGTCCAGC AAAACCACCG GCTTCGTGGG AATTTTTAAC 1740
GTATCGCACG CTTTCTCTTC CGCCTATCGC TTTAAGGGCG TGAGCGTGGG CGCAAACCTT 1800
AAGGTGGGGT ATCGCCACAC CCGGGGGGGG GGGTAGCAGC CAGTCAAAGA GCTCCAACGG 1860
GAAGGAGAAC CACCACATAG TCCTGACCGC GGACGTAGGG GTGCGCGGTG CGTGGACGGT 1920
GTCTAAAAAC TTTGGTGCGC ATGAGCCAAA CCTGTGGGCA GGAGTAGCAT TCCGCAACAT 1980
TGGCGCGTCA ATCAACGCCA CAAACCTTCA CGGAAATAAC GGCGCCGGAG GCAGCGGCGG 2040
CGGTGGAGGG GGCAATGGCG ACGGGAAACC TGCCCACGTC ACGGACTCCC GCGTTATCCT 2100
TGCGCTTGCG TACCAGCCGG TGCGGTATTT TCTTTTTGGC GCCGGGCTTG AGTGGCTCTA 2160
CAATGTGGGG TCTATCAAAG CCGTCAATTC GCTCCGGTAT GGGGCGGCGT TCATGCTTTT 2220
TCCGCTCAGG CAATTGGCAT TCAGCTCGAG CGTGGTTATG AAGGGGATGG GTCCACAGCA 2280
GGTCCGCGCG AGCGCAGGGG CAGAAGTGCA GTTTTCTCAC GTGCGGTGCA CCGCCTCGTA 2340
TTCGTATCTT TGGAGTGCGA CACCCACACG GCCsCACTAC GTTTCAATTG GGGTAGCCGG 2400
TTTTCTCAAA CCGGTTCCCG AACAACCCCT GTGGCAAGAG GTGTACCGCT CCTATTTGCG 2460
CGGTGCGCCA CTACCACGCG CAgCTaCGCA GAGGCCATCG CCGAGTGGAA GCGCACGCTG 2520
CAGCAGGGCG TCAGTTTTGA GCCTGCGCGG GAAGGCATCG AGCGCGCCAC CAAGCTTTTG 2580
CAGCTGAACC AAAAGGTTCA CGATTTTAAC ATTTTCTAGC CGCGCCGCCG cGCAtCATCT 2640
GCTCCGTCCT GTGCCACGCT GCCGGCACCG GCGGAAAGTG GGGGCGGACC CCTTACTCGA 2700
CCGTCACCGA TTTTGCCAGG TTCCGCGGCT GGTCAATGTC AGTCCCyTTG AGCAGGGCAA 2760
TGTGGTATGC CAACAGCTGC AGCGGTACGG CGTAAAAAAT CGGCGCGGTA AGGGGAGATA 2820
CCGAAGtACG GTGACTATCT GACTGCACGC CCCTTCCGGC GCGTCCGCCT CGGGCGTGCA 2880
TACCGGGCCA AAACGCTCCG GCACGTCCGT AAAGATGTAG AGCATCCCGC CGCGCGCGCG 2940
AACTTCCTCG ATGTTTGAGG CCATTTTTTC AAACAGGACG CCAGGTGACG CCGGCGCGAT 3000
TGCAACCACC GGCATCTGCG CGTCCACTAA TGCAAGGGGC CCATGCTTTA GCTCCCCCGC 3060
TGCGTATGCT TCGGCATGGA TGTACGAAAT TTCTTTCAGC TTGAGCGCCG ATTCAATTGC 3120
AATCGGATAC AATTCCCCAC GCCCCAAAAA GAGCgCATGC TGCGCATGCA CAAAATGCCG 3180
CGCGCACCGc GCAACGTCTG CCTCACACTC AAGCACGTGC TCCACATCCT GAGGCAGTCG 3240
CTGGAGCGCC GCAGAGAGCG CGTCCTCGGG CTCcTGCGTG AGTATCTTTT TTGCCTGCGC 3300
AATCATGCGG GTGAGCACGA GCAAGCACAC CAGCTGGGTG GTAAAAGAtT CGTTGAAGCA 3360
ACCCCTATTT CTGACCCCGC GTGGGTGAGC AGCAtGCGTC CGACTCACGC ACCAACGTGG 3420
AACGTGCCCC GTTGCAAATG GCAATCGCAC AGAGATACCC TTGCGTTTTT GCCAGGCGCA 3480
GTGCGGCAAT GGTGTCAGCC GTTTCTCCCG ACTGAGAAAT CGTCAGTACT ATTTCACGCG 3540
CGTGCACGAC GCTCGTGCGA TAsGGTACTC TGAGGCAATC TCCACCTGAC ATCCCACCCC 3600
TGCAAATGCC TCAAACCAGT AACGCGCCAC TAACCCTGCA TGGTACGAGG TACCACACGC 3660
GATAATGCGC ACCCGTGTTA TCCGTCTAAA CAGCCGCTCA AACGTCTTAC ACGAGGTACC 3720
GTCCAAGACC CGGTCCTCCC CGAACGTCCG CACCTGTGCG CGAGAAGACG AAGAAAACGA 3780
CATATAGGCA TTCAGCGTAT GCCGTATAGC GTGTGGCTGC TGCCATATTT CTTGATGCaT 3840
ATGGTGACGG TGCGTACCCT TATCCTGCGT ACAAAGCTGC ATCTGATACG TAACAACAGG 3900
ACGCGCCACA ACGTTTCCCT GCGCGTCGTG GACGCACACG CTATCTCGGT GGACGTCTGC 3960
GATGTCTCCT TCCTCAAGAT ACAAAAAACG CTGCGTAACA TGCGCAAgcG CAAGsgGGTC 4020
TGACGTAACA AAATTTTCCC CACAGCCGAG TCCTACCGCC AACGGACTGC CAGAACGCGC 4080
AgCAATCAAC CGCCCAGGAG ATGCAGCGTC CATGCAAAGT AACCCGTAGt ACCCCGAACC 4140
TGCGTCAACA CTTTTTTTAC CGCAAGCAGG AGGTGCGCCG TGTACCGCAA CTCCCAGTGC 4200
AAAAGATGCG CGAGCACCTC GCTATCAGTT TGTGAATGAA AAAAATAGCC ACGGGTCACT 4260
AGCATTTCAC GCAAAGACCG ATGGTTTTCA ACAATACCGT TGTGAACTAT CGCAACGGAT 4320
TCAGAACAAT GCGGATGCGC ATTCGCTGCA CACGGCTTGC CGTGCGTTGC CCACCGGGTG 4380
TGCGCAATGC CCATGGTCCC GCAAAGAGGA CTCTGACCTA ATAGCGCGCA GAGCGACTGA 4440
ACACGACCCT CACAGCGTAA AAGGCGGAGc GCACAGTCCG AGCCAACGAC AGCGATCCCT 4500
GCAGAATCAT ACCCGCGGTA TTCAAGACGA CGCAGCCCCT CAAGCAAGAG ACCTGAGACA 4560
TCACGCCCCG CCACCATCCC AACGATTCCA CACATAGACG CCTTTTCAGT GCAAAAGCAC 4620
ACGGGAGAAC GTTAAACTCA AACCCACAGA ACACACGCAC GCGTAGGAAC CATTCTGACT 4680
CAACCAAGCC ACGAGCCGCC TCGCAGGCTA CGGTCCAGCG CACCAACAGG GAGACTCACC 4740
ACCTCCCCCA AGGGGAGACC CGACGAACGA GCAAGGTAAC CCACAGGTCA CAGGCGATCC 4800
CATGCCCATC CGTTGCCTTG CTGCACATCG CTCACTTTTT CGAAGCTGAA TCTGGGAGTA 4860
TTTCCTGTAA CTCTATCTCA AAAACTAGGA GCGCACCTGG AGGGATAACC CCCTCGATAC 4920
CACGCTCCCC ATACCCCAAG GAAGAAGGCA CATAAAACCG ATAGGTAGAA CCCACCGGCA 4980
TCAGCTTTAA GCCCTCAGAT ACCCCAGGCA CCATACCATC CACCGGAAAC TCCGCAGGCT 5040
TATCTCGAGA GGCATCAAAC ACCGTTCCAT CAAGCAGCGT CCCCTTGTAC TGAGTGCGCA 5100
CCCTCTGAcC GCCtGCGGCT TTGGACCATC TGCAGCCTTT ACCACCTCGT ACTGCAACCC 5160
AGAGGAAGTT ACCTGCACAC CTGGCTTCTT CGCATTCTCT TCAAGAAAGG CCTTTGCCTC 5220
CTGCGAATTC TTCTCTACCT CTTTTTGCCG ATATGCCTCG AACGCACGCT GTAGCACGTC 5280
CTGCGCATCC GCAAGTGCCT GCTTATCCTT GTCTGCACTG ACTGTCTTCT TTAGACCCTT 5340
CCACACCTGA CCCAGGTCAA CGTCTAACTT CGAATCCTGC AGCGTCACCC CCATGAGAAC 5400
CCCAAAGGCA TACCCCACAC TCTTCTTTGA AAGCGGCGTC TTCTGcCGCT CCTGCTCCTG 5460
CACCTTTCGC GGATCGAGCT GATCAGCAGT CAGCGCCTTC TTATCCGCAG CCTCCCCGGC 5520
AGAGGACACC CCCTCAGCAC CCTTCCGACA GGAAAAAATA CTCATACCCG CAAGGAGCAG 5580
AAGCGCGAAA GAACACACCC CTGcAGCTTC TTTTTTCAAA ATCACCGCTC CTTTTGCGCG 5640
CTCATCCGCG CCGCCCCGAG GCGACCAGCC TACCCCATCC GTACTTCACA CGTCAAGGAG 5700
ATAGCTAATG CCCCGATCCT TCCTTGAGCC ACACCCACTT TTCAGGCATC GTACGCACCA 5760
TGAACCTCTA TTTGGATTGG GCTGCAGCTG CTATACCCGA AACCCGGCTT ATCCTACGCG 5820
AAACGCAACA GGCCTGTACT CTCTTTGCCA ACCCTTCTGC TGTGCATTGC TTGGGAAACG 5880
ATGCGcGCGA TGCGCTCGAA CGAGCACGAC ACACGTGTGC CCAGAGCTTA GGCATCAATC 5940
CAGAACGCCT TATATTCACT TCAGGCGGGA GCGAAGCAAA TCAACTTGCC CTCCTTTCTG 6000
TCCTTACTCG TCTTCCTCAC GCAGAAATCA GCGTCAGTAT GCTAGAGCAC GCGTCGGTCA 6060
CTGCACTTTT GCCCCGGCTT GAGCGGTTGC AGGTATCCGT ACGCCACATC CCCGTCAATG 6120
CCCGCGGTTT CATTACCCCT GAGGCTGTAC GTGCAACGCT CAGTCCCCGT ACCACGCTAG 6180
TGTGCGTGAG CGCCGTACAT AGTGAAACCG GCGCCATCCA GCCGCTCCCT GCTATTGCGC 6240
ACGTGCTTGC ACATACAGGC ACACGCGGAC GCTCTATCCA GCTCCACGTA GACGCCGCAC 6300
AGGCCTTTGG GAAAATACCG CTCAATCTGT ATATGGACCT TCCGCGCATA GAGGAACATG 6360
CACAGGAAAA CAACGCGCCA CAGACACCAC CGGGCTACCC CGCACCCACT GCACAACGCG 6420
CGCTTACCTA CTCGGTAGCA ATCAGTGGCC ACAAAATAGG CGCACCACGG GGTATTGGGC 6480
TACTGTGCGC ACACCGTTCA TTTACCCCCT TTGTCCTGGG AGGCGGACAG GAAAAAGAGC 6540
GCCGCCCGGG AACTGAGAAC CTTGCAGGTG CGCTCGCGCT CGCCGCTTGC GTGCGCGAAG 6600
GCGCCTTCTT CCGTACTCTA CATACCACTC CGGAAGGCCC TACACCGCAT ACGAAGCCCA 6660
CAGCTCCTGC AGGGTTACGC AGTGTcCGAG CGCGTACGTG CGCCTTTGTG CGTGCACTCA 6720
GCGATTTACC GCGGGTGCAA CTAGTCCCTG CAACGCGCAA AGAAGACGAA GCGCACTTCT 6780
CTCCCTACAT CGTCTGCTGT GCGGTACAAG ACGCCAGCGG cGAGGCATTA GTACGTGCAT 6840
TCTCAGACGC AGGTGTGTGC ATCTCCACCG GTTCTGCCTG CTCGACAAAG AAAGGTGGCG 6900
TTTCAACACG CCTTCTGCGT GCACTCGGGG TAGAATcCCg cGCAAgCGCG GCGtGCTGCG 6960
TTTCTCTTTT GGTCCACACA CCACCGCCGA AGATCTCGAT CGGGTCTTAA CGCTTTTCCG 7020
TACCCTGCTG CAAAAACTAT GACCGCTCCA CACAGAAAGG GTACACGCAC AGGAAGCACT 7080
CAAGCGCCGC ACACAACCCC TTATTTTGTC AAAATTTTTA ACGGATTTAC CACCGACCCG 7140
TTCTTAAATA CTGAAAAATG CAAATGCGGA CCAGTTGCCC GACCGCTAGC CCCCACGCGC 7200
CCAATGGTCG TCCCCTGGAC TACCCGCGCC CCACGCCCTA CCATCACCGA ACTTAAATGC 7260
CCATACATGG TTTGGTATCC ACCGCCGTGC ACGATGATCA GGTAATTCCC GTAAATTCTG 7320
CTATATCCAA TTTCTGCCAC TTTCCCATCG AGCGTTGCTT TCACCTGCGT CCCATAGGGT 7380
GCGGCTAGGT CAATCCCATT GTGAAAGCTC CTTTTGCCTG AAAAGGGATC TGAGCGGTAC 7440
CCAAACCCAG AGGTGCGCCG CCCGCGAATT GGATACATGA ATAACTCCCC CAACACCTTC 7500
CTCAAATCAA AAGCAGATAA TTTTGCACCC GGAATAAACA AACGCTGTCC AACTGTTAAT 7560
GCACGACTGA CTAAATCATT CGCATCCAGC AACGTATTCA GGGGCAAACG AAAGAGACTG 7620
GCAATTGCAC TAAGCGACTG CCCCTTTTGT ACCGTGTGCA TGAGTCCATC CATGGACGGA 7680
ATAGTAATTT GATCCCCTAC CGATAGTCTG CGCGCGTTTG AAATTCCGTT CACCGACAGC 7740
AGCGTCCCCA TATGCTTGAG TCCTGcGCgC ACAAtTGGcA CTAATAGTAT CTCCCTTACG 7800
TACGGTGTAC GTGCGGTAGc TCACCGTCGC AAGCTGCGTA TCCAACGCCT CTGCAAGCGC 7860
GGCAGGTTTC GCGCCGCCCT CGCGCGTTTC TACCTCCTCT GCACGAAAAA CCCCACGACG 7920
CAGCGCATGC GCCAGGTGCT GTCTGTTTCA AAACGCACCA GCGCAAgCCT TCTGCCACCG 7980
GTAACGTGCA mCGCCCACGG GATACACAGC ACCATGAGCA CAAAGAGCAG CGCGCAAGCC 8040
AACGCCGCAC ACAACGGCAC ACAAGAAGCA TCCTGCAAAG AACGTACAGG CACCGAAAAG 8100
GAAGGAAAAG AAAAACGAGG CACCcGGTGC GCGCAAGCGG CGTGCGTGCA CTCTTTTCTC 8160
TCCGGAAAAA ACGCATCCCC CCTATACaCT CCtGCGGCTC ACCCGAACAG AACGGTTTGA 8220
AATACTGCTG AGCAACCCCG AAGCGTTGAG CGCCTGCGCA CGTATCAAAC GCGCTTCTGC 8280
ACGTGTGCTC CATTTTGCGT AGCGCGCGCg TAAGCCCCGC CCTCCTCCGA GTCTCCCACT 8340
GACTCCGCTG CCTGACTCAG GAATATCGAG GGTATCAGGA CTTTCATGAA ACGAAATTAT 8400
CTCTTCCTCA AGGGGAAAAC GGGGAACAAG ACGGTTACGC GTACGAGCGC GCGCACCCCC 8460
GCTATCGGAA AACGCACTAC TTGAAAGAAC TGCGCGGCGC ACTTCTGCAG GGCTCGGACA 8520
CACGCTGAAC GGCTCACATC CCCGCGCATG ATCCGGAACT AAGTCCGCAC CCGTGTACGT 8580
AATCACGTCC ATAACGGGCG GTATCGGAAC ATTTATTCCA GGCGTTGAGC ATTTCGCCTG 8640
CAGCgCCACA GGcGGCGGaG TCCCCAGCAG CGTGCCACTT GTTTTTGAAC CGCACGGCGC 8700
ACGCCCCGTA TAAAGAACAC ACCTCGCCGG GCGCAACCGC CGAGGGTGTC TTTCCCGATA 8760
AAGCAGAGTT CAGTGAAGAA AATGAAAATC GTAAACCTGC CGGTTTAATT GCAAAAGCTT 8820
CGTTGCACGC TCGATACCTT CTCGCGCCGG CTCAAAGCTA CCGGCACGCT GCAGGGTGCG 8880
GCGCCATTCG GCAATCGCTT CCTCATAGTG cTGCGCATCA TAGTGACGCA GCCCCCGCAG 8940
gTATGAAGTG TATACTTCCT TTTCCAAGTG CTTGCGTCGG TCTCGATTAA AAAATCCTGC 9000
AATTCCGCAG GAAATGACAT GCTCATCCTT GTCTGACTCA TAGGTGTATG TGAGATCCAC 9060
ACGCACCCAC GTACTTTTGA ATTCTACTCC TGCACTTGCA CGGATATCTG AGGCGAGCCC 9120
GGTGAGAAAC ACATTTGAGC CAAACGCAAC GTACTGCACC GGCAGGAGCA GGAaCGctAC 9180
GCCGTATCTG AATCTATTGT TGTCTGCAAA TTCTTGCACG TTGTACTTCC ACTCAATCCC 9240
CGTGCCAAAG AGAAACCACC GTATCGGTTG ATACGCACAT GCGAGAATAA AACTCGAGTT 9300
TGTAGCGTGC ACCGTCCTGC CGCCACTCAT GCTAGAACCG CTGTTGGATG CGTCCACCTC 9360
AAcTGAAAGT CCAACGTTTT TGACAGTACC GCCCACCCAC AGATTCGGCT CATGGGAACC 9420
AAAATTCTTG GCCACCGACC ACGTCCCCTG CAGACCGATG TCTGCGGTCA CCACCACGTG 9480
CTTCTTACCC CCCTGGTTTT TTTTATTCCT CTCCCCGCCG GCTGAAGAAT CGCGGTAGCC 9540
AACTTTCACG TTGGTTCCCA CACTGATGCC TTTAAAACGA TAGGCAGACA AAAAACGGTG 9600
CGCCACATTG AAAATGGCAA CGCCGCCTAC TGCCTTTCCC TCCATTGTCA GGTAGGGATA 9660
CTGAACACTC GCAGAAAAAC CGTAGCCGGT CCGTCCTATG CTGTGCACAA GTGCAATCGT 9720
ATCTGTGTGC GATTGATTAA CCCGGGCAAA GTGAAACCCT CCCACCAGGA GATAGGGGAA 9780
CGCGGCACTC CCTGCTGCAT TCGCCTCAAA AAAACTTGCA TCGTCTGCTA AAGCGGTAAA 9840
TGCCAAGCCG AGCATTTCAT ATCTGCCGCC TGTCCTCAGC GAGAGGGCAC GACGGCCCAC 9900
TCCCTTCTGC TTCTCGGATG TCTTTGCTGC CATTGCAACA ACCGGCGTCA CGGACGTTCC 9960
AGACACGGCA CGCCGCACCt GCGGTGCGCC GGCACTCGTG CGCGAACTTC CCCCAGAACT 10020
TCACCCCCTT GCGAAGACCA CGCGCCCAAA GAAGCTGTCC GCGCAAAAAG CGGGGCAAGA 10080
CACAAGGCAG ACATCCCAAT ACGCATCCCC AGAACCCGAA CTGATAGAAA gCTGCGGCAC 10140
CGCGCACCGC AGCACATCCG CACGCGCCAC CGCTTGAACA AGCTCTTCTC CGCCTCATCC 10200
GTACCACTAG ACTACCTCAC CGCATCCGCT CCGTGCAACC CTTGCGcACA GTCCCTTCCC 10260
TTTCCCAytA CtGCGCCCCT ACCAGTGCCC CGATTAcGTC AGAGAATGGG GCAAAaGCGC 10320
AcTCCAcGTA cGAAGGGCTA CGGCyTCTTG CAgTGCGCCA CcTGTAAGCG CGCAACACcT 10380
TCTATATCTG CAaTTGTACG CGCTATTTTT AATACCGCAT GTCCTCCGCG CCCCGATAGC 10440
TGCTCCTTTC CTACTGCACG ATGGAACTCC CGCGCAGCAT CaTCTGTCAA CATACACCAA 10500
CGCTGCACAT TCTCAGGAGA AAGACGCGCA TTCCGGTATC GAATCCAATC TTCCACCGTT 10560
GATGACCGCC CGTTAATCGG CGCACATAGA CAGCTCCCCT GCCGTTCcCA cTGCGCTTCA 10620
AGTGCGCACG CAACCGTCTT GcgCAAGCGC GCGGTACAGC ATGCCGGTTC GGACAGCAGC 10680
GTATGCGAAG CAGGcGGCAG CACCTCCACC CGCAAATCAA CGCGATCTAA AAGAGGCGCA 10740
GTcAGctTGC GCCAATACCG CTCCAcCGCC TGAGGCGCAC AGGTGCATAC CTTGTGCTGC 10800
ACCCCAAAAT TCCCACATGC ACACGGATTG ACTGCCAAAA GCAACTGAAA CCGTGCAGGA 10860
TACGTGCTGC TTTTTCCTGC GCGACTGACT GTTATCTGCC CTGTCTCAAG CGGTGTGCGC 10920
AscgTCTCTA ATACCGGACG CTTAAATTGC GTCGCCTCAT CTAAGAAGAG CACTCCCCCA 10980
TGCGCAAGAG AAATTTCTCC CGGAAGACAG GTGCCTGCAC CCCCAATTAT TCCTTCTGCG 11040
CTCGCACTCG AGTGCGGCGT GCGACACGGC GGACGCgCAT GAGCGGGTCC TGCTCAGCAC 11100
CCTTTGGGAG GAGGCCTGCA ATACTGTGTA CTCTTGTCAC CTCAAGTGCC GTACGTGCAT 11160
CCAAGTCTGG CAGAAGAAGC GCAAACCTAC TCAACGAAAG CGTCTTGCCA CACCCAGGCG 11220
CCCCGTACGC AATAAGATGA TGTCCTCCAG CAGCTGCAAT CTGTAGTGCC CGGATCAGTT 11280
TTCTCTGTCC CCGCACGTCT TCAAACCCAC CGGTAACCCC CAACGCAGCG TGTGGCCAGT 11340
GCTGCACCGG GCGTTCTTGC CCTGTTCCCC CGGCAGAATC AGATGCGTCT CCCGTGGACC 11400
TGCCAACCGA AAATAGAAAT GGCGATACAC CCGGCGCTGT CCCCTCAGGG CGCGCGCCTG 11460
TCTCTCCCAC CGCACCCGGA CACCGCTTTG GGAACAACGC CATAATTGCC CCTGCATCCA 11520
TACCCGCGTC CTGGTGCACA GACGCACCTG TATCTGCCGG AAgGTGCACG CGATGCGCGG 11580
TGTCTTCCAC AGAAACCTGC TTCCCCTCCC ACAAGAGAAC TGAGTTTGAC TGACCCCACC 11640
CAGGAGCGGA CTGCTCATGC GGCGCATCCT GCGGTTCTAC AGCTAGCTGC TGGCAGGCAA 11700
CAAGAGCTGC ACGAAgTTCT TGCACGGCAA AAACGCGCAC TCCAGGCGTG ATACGCGCTT 11760
CTGCCTCATT TTCCTTTGGC ACGATGTAAT CGTAAATGTG CGCGCTTAAT CtGCGGCAAC 11820
TGCTGCAAgC GTGCCACGCA CTGGCCGGAT ACGTCCTGAA AGCTCAAGCT CTCCGAGTAC 11880
CATCACACGG CGAACCTCGC GTGcgCATCC GCGCGCACCA CCTGTCAGTT CCGGCGTTTC 11940
CTCTGCAGAG CTCGCGTGTA CCTGGGCGCG GAgcACTGCC AATGCGATGG GCAGGTCAAA 12000
CGCACTCCCT TCCTTTTTCA GATCTGCAGG GCTCAGATTG ATGAGAATGC GCTCCTGAGG 12060
AAAGGGAAGC GCTGCATTGC GGATAGCAGC GCGGATCCGC TCCTTCGCTT CTTTAACCGC 12120
AGACCCTGGC AGTCCCACAA TATCCACTAC CGGCAGTCCC CTCCGAAGAT CCACTTCCAC 12180
CTTTATGACC TCGCCTTCAT ATCCAAAGGC GGAAAAGCTC ATAATCTGCA CCGTTTACCT 12240
CCCATGCGAT GCGCAGGACG CAGTGCGATA CACGCGTGTA CTGTGCATGG AGCTGGGCGC 12300
TACACCTGCC ACCCTGCCTA CGGTAGCCGC CGCATACCGC ATACCCATAA CGATAGCCGA 12360
TCAACCTGCA TAGCCAACGC ACACCCGGAC GTACCTACCA AGGAACGTAA CAAAAAAGGC 12420
GAAAAAAACC TCAAGTCCTT TTCCGTTTTT CCGAAAATAG GATTGGCCCG TTTCTGCTCT 12480
CGTGCGGTGC AsGGGCCCCT GCAGGGCACG GACGTCTTGC AGGAATGTCA GAGATTCCAC 12540
GGAGGAATGC ATGATTATCA ATCACAACAT GAGTGCGATG TTCGCGCAAC GCACACTCGG 12600
GCACACCAAT GTCCAGGTTG GAAAGGGCAT CGAGAAGCTT TCATCCGGcT ACCGCATCAA 12660
CCGCGCAGGG GATGACGCTT CTGGTTTGGC TGTCTCAGAA AAAATGCGCA GCCAAATCCG 12720
CGGCCTCAAC CAGGCATCCA CCAATGCCTC AAACGGTGTG AACTTCATTC AGGTTACCGA 12780
AGCCTATCTG CAAGAAACCA CCGACATCAT GCAGCGTATC CGAGAGCTTG CAATTCAAGC 12840
GGCAAACGGC ATCTACTCTG CTGAAGACCG CATGCAGATC CAGGTGGAAG TTTCGCAGCT 12900
TGTGGCAGAG GTAGACCgCA TCGCTAGTTC TGCCCAGTTC AACGGCATGA ACTTGCTCAC 12960
GGGCCGCTTC TCCCGCACTG AAGGTGAGAA CGTCATCGGT GGCTCCATGT GGTTTCACAT 13020
CGGCGCTAAC ATGGACCAGC CATGCGCGTG TACATCGGCA CTATGACTGC GGTGGCGCTG 13080
GGCGTACGAA ACGGCGTGGA TGAGTCAATC ATGTCCATTG AGACTGCAGA CTCGGCCAAC 13140
AAGAGCATCG GCACCATCGA TGCTGCTTTG AAGAGAATCA ACAAGCAGCG TGCGGATCTC 13200
GGAGGCTACC AGAACCGTAT GGAGTACACA GTTGTCGGTC TTGACATCGC TGCGGAGAAC 13260
CTGcAGGcAG CTGAGTCTCG CATCAGGGAC GCAAACATCG CAAAGCAAAT GGTTGAATAC 13320
ACTAAGAATC AGGTGCTCAC CCAGTCTGGC ACTGCAATGc TTGCGCAGGC GAACACCAGC 13380
GCGCAGTCGA TTCTCTCAAT TCtCCGGTAA AGcCcTACGc CGCGTGCGCT CTTGTCCAAA 13440
AAGGGCAAGA GGAGTACACT GGGcCACAGG GGCTGCCCTG TGGTGCCCTT CTAGAATGAT 13500
CTTTGAAAAG ATTTCTCCcT TGCAGGCCTT CGTGTGGGCG GTTCTGAGGC TTTTTCTAAA 13560
AAGCTTCAGA ACCGTTTTCC GTGGCGCGGT GCGTGCAGGG TGCGGCGTGC TCGCCTGCGT 13620
CCGTGCATAC GGTTTTCCAC CCTATGGATC AAAGGAATAG AGGTGGAAGG ACTTCAGGGA 13680
GGGTCATATG ATTATCAATC ACAACATGAG TGCGATGTTC GCGCAACGCC AGGGAGGCAT 13740
CAACGGACTT GCAATTGCTA AGAACATTGA AAAGCTTTCG TCTGGcTACC GCATTAACCG 13800
TGCAGGAGAT GATGCTTCTG GTTTGGCTGT CTCAGAAAAA ATGCGTAgcC AAATCCGCGG 13860
CCTCAACCAG GCAGGGCAAA ATATCCAAAA CGGTATATCC TTCATTCAGG CTACCGAAGa 13920
TACTTGGCGG AGACAACTGA AATCGTCCAG CGCCTGAGGG AGCTTGCAAT CCAGGCGGCA 13980
AACGGCATCT ACTCCGCCGA GGATCGCATG CAGATCCAGG TGGAAGTTTC ACAGCTTGTC 14040
GACGAGGTAG ACCGAATCGC AAGCCAGGCC CAGTTTAACG GCATGAACTT GCTCACGGGC 14100
CGCTTCTCCC GcGAGTCTGc CCTTGGGCCC aTGcAGCTGC ACGTCGGTGC GAACATGGAC 14160
CAGAATGAGA AAATATTCAT TAACACCATG ACGGCAAGTG CTCTGGGCTT TTTCTCCGAT 14220
GAAGGGACAG ACGGCAGTCG TTCCATCAGC ATTGCGACCG TCGACGGGGC GAACAAGGTC 14280
ATCGGTACGC TTGATAGCGC GCTCAAGGAG ATTAACAAGC AACGTGCGGA TTTGGGTGCC 14340
TACCAGAATC GATTTGAAAC CGCGTATCaG GGAnATCcTA TCGCGGCGGA AAATCTGCAG 14400
GCAGCCGAGT CTCGCATCAG GGACGCGGAC CTTGCGCAgC AGATGGTCGA TTACACGAAG 14460
AACCAGATTC TCGAGCAGTC GACTATGGCA ATGCTCGCTC AAGCAAATAC ACAGCCACAG 14520
GCAGTGcTCC GCTTGATGCA GTAAGCGCAT ATCGCGACGG TATATTTTTA AACGTTCAGC 14580
GTCTTCATTA GAGAGTTTCG GATGTCATCT CTTGTGAACG CCCATGAGGG AAAGAGGAGC 14640
CGCATGCCTC TTTCCCTTTT CTTTTCTCTT TATCTTTCTG TCATCACACC cTGCGGTTGC 14700
CATTACGGCG TAAACGCTGG GGATGTACTT TGTATTTGTT AAGTTTGCGT ATCTGTAAGC 14760
CGACAAGCCG CGCGTGGGGT TTTTTTGAAG GGAACTCGGC TGGAGTCTTG ATCCACTCTC 14820
CGCAGCTTTC GAGGAGTCAT TGTGTCTGAT GTCCGCATCC CCGGAGTAGG GGCCGGTAAG 14880
TACGATAACC TCATCCAGTC GCTTATGAAA AAGGAGCGCA TTCCTCGGGA CAACGCTGCG 14940
GCAAAGGTGA AGGTTTcGAG GTTCAGAACA ACGCGCTCAA GGACGTGGAG CGGTATGCGC 15000
GCGATTTGCG TGACGCCGTC AAAGGACTCT wTTCCTTCAA CAACCCTTTC GCAGaGAAGG 15060
AAGCyCATTC TAGCAACGAG CGTGCGTTCA CCGTCGATGC TACTCGAGAC GCTGCCGAGC 15120
AGAaTCATAC AcTGCGCGTC AAAGACATCG CACAAGGGGA TGCGTTTCTC TCAGACCCCC 15180
TCCCTGAGGA TTTTCGCGTT CCCAGCGGGA CGTATACGTT CTGTATTGGA GAAAAAAAAA 15240
TATGCGTGTC GTGGAAAGGC GGGCaCTATC GTGATTTTAT AstGCCGTCA ACAAGCAGGG 15300
CAAAGACTCA CTCACCCTCT CAGAGATAAA AACGAGCGGT GCGAGCCGTG CGCTCCTGTT 15360
TCGCTCAGAA TTAACGGGAA AGAGCAGTCG TCTTTCCTTT GAAnGcaCTG CGCTGGACCT 15420
TGCACTGCGC cTGCGCGTCG TGCAGGAAGC ACGTTCTGAC GTTTTTACAC AGGATGTACT 15480
CAGTGTTGGA CCTGGAAAAC ACGCGCGTTT GGATTTTCCC CACCCTCTGC GCGCGCAGcA 15540
GGGcTTACGC TGGAGTTTGT CGCGTCTCTG GAAGGGGCAT CTATTGCAAA CGAAGAGTcG 15600
cGTGCGCACA CGCCCGCACA GGGAGGCGCT CCCACGTCTT CCCACGGAAA TACGGCGTCC 15660
GCTGCACATA ATCAGGACGG AGCAGCTGCT GTGCGCCCTA CTGAACCGGC AAACGGCGCT 15720
CCTGTACAGG AAGAAACCAG TTCAGTGTTC TTTGAGGGGG TCACAGTAAA GAACGAGGCT 15780
TCCCAGGGAG ATCTGCCTAC CACGGACGGC TTGGAAAAAT ACCCAGCTGT CGACGACAAA 15840
GGAGAcAATC CGCGCGCaCC TGGAGAGTCG CAGGGCACGG CCACCCACGA AGGTTCAGGG 15900
TCGTCCACAG ACAACGCGGA TGACACACGC TCAACTGGTG CCTTGGCAGG ATCGGGTAAG 15960
CTTGCACTTG AGTCTCTGCA GGGCCACGCG CTTCCTTTAC CACCGCTGGT GCTTACACAG 16020
AACGCACCGC AGATGGTATC CATTCCTTTG CGCGAGTACG GGGATGTTCG CGCGCTCATA 16080
CTGGATAACG CGCAGGCGCG AGGCGCACTG ACACTGCGCG CTATCCGTGT GCGTGCCGAG 16140
GATGCACCAG GTGGTTATGT CCCCGTGAAC CCTGCCTCTC AAGCACAGGA TGCAGCGTTT 16200
GATTTCGATG GGGTGCACGT TACGCGCGGA ACTAATTCTA TCACCGACCT TATCCCCGGC 16260
GTTACGCTTT CGCTGCACGA ACGTACAGAA AAAACCGAAA CGCTCTCTGT CACCCCCGAC 16320
GTGAACGCCA TGAAGAACGC TATTATAGAA TTCGTTGCTA AGTACAATCG ACTCATGGCA 16380
GAAATTAACA TTGTCACCAG TAACAAGTCA GCCATTATCG ACGAGCTTGC GTATCTTACC 16440
CCCGAGGAGA AAAAGAAAGA GACAGAACAA CTCGGCAGCC TCCACGGGGA TTCCACGCTT 16500
CTTATGCTGA AAGACAGACT GAGACGCAAT ACCAGCAATG CGTACCGCGC CGGCGATGAC 16560
GGTGCATCGC GGACACTTGC ACACATCGGC ATTTCCACAA AAGCGCACGC TTCGTCTGGC 16620
ATTAACACGG CACAGctACG CGGTTATCTT GAAATTGATG AAGAAAAATT ACATTCCAGT 16680
TTGAACGCAC AAAAGGATCA GGTGCGTGCT CTTTTTGGGC ACGATTCAGA TGGTGACCTC 16740
CTTGTGGACA ATGGCGTTGC ATTCACCCTA ACAGAACTGC TCAACCCTTA TTTGGGACGA 16800
TCGGGTATTT TTGCCATACG GTCAAACGGC GTTGACGAGC GTATTAAATC GACAGAAAAA 16860
CGCGTAGAAA CGTACGACAA GCAACTGGAA AAGAAGGAAC GGGAGCTGCG ACACAAGTAT 16920
CACACCATGG ATGGCGCGCT TCGTTCTCTA CAAAAGCAGT CTGACGCAAT TCAGAACTTC 16980
AACCAGTCTG TTCGCAACAG GAATTAGTGG GAGTCTTAAT GGACATTACG ATTAACGGAC 17040
ATACACTGCA GTATGTCATT GAACATGAAA AAACTATTGG GGAGGTTCTA GGCGCGATAG 17100
AAGCTGCGTG TAAAAAAGAA AAACAAACGG TATCGGCGGT GACGGTCAAT GGTAGGGAAC 17160
TGTCTGCTAA TGAATTGGAT ACACTTTTTT GCCAATCCTT GGATACCGAC GTCACCCTTA 17220
ATCTTACCAC TCTTTCAGGG GGAGACGTGC GTGCACTCTT GCGTGAGATT AGTACCACTC 17280
TCCTTGCACG CACAGCTGCG TTACAAGAAA TCGCAGTAAA CATGCATAGC GGTAATCTTG 17340
CAGAgAGcTA CGCTATGGTC AGTGACTTTT CTGCTCTCTT GAAAAGTCTT TATCACTGCT 17400
TTACTCTCTC GGACATCGCT GATTTGGATC ATGGCCTGAG AATTAAGGGA AAAGCCCTGC 17460
ACGATTACCA GCGCGAGATT TCTCCCCTGC TTAAGGGCTT ACTAGAAGCA ATGGAAGAAG 17520
GGGACAGCGT TGCTGTCGGT GATATTGCGG AGTACGAGTT GGCACCGGTT GTTCGGGATT 17580
TAAGTGACGG TATCTTGCAT ATGGACATGG GTGTACAATG AAGTTTGACG GACTGATTCG 17640
CAATCTCGAC CACATTACGC GAAAGGATAC GTATCTCTAC TACCGGGAGG AGTTTTCTGC 17700
TGTTGCATGT TACTCTCTCT TCGGTCGAAT TCATTCAGGA AGGGTTGAGT TTTCGGTAGA 17760
GACCaCTCCC GTTGGGGAAA AGAGCGTGCA GGTAAAATTA GTTGATGCAA TTGATTATCC 17820
GCTCTTACCG CTTGTACAAG CACTCAAGCG TGTAGTGAGA CTGTTGATCG AGAAGAATCA 17880
GTTGCCGCGT TAGATCTTGT CCAGTTTTTT TAAAAACGGT AGACTCGCCG CGGTGAGGTG 17940
CGTCAGTCGC TCAGCGCAGG ATACTGCgCG GtGGGGCACG GTAGTCGGAA GGCTGCTTGA 18000
GGAAGGTTCC GTAGTAGTCT TGCAGGGGGC GTTAGCGGCA GGGAAAACCT GTTTTGTAAA 18060
GGGGCTCGCT CTGGGACTCG GTATCCAAGA GGAGATTACG AGTCCTACCT TCACACTGCT 18120
GGCAGTCTAC CACGGcAGGc tGACGCTCTA TCATATGGAC GTGTACCGGC TCGCTTCCCT 18180
GGAAGACTTC TTTGATATCG GTGCGCAGGA GTGCGTATAC GGCACGGGAG TCTGTGTCAT 18240
TGAATGGGGA GAACGGGTCG CGTCAGAACT GCCGGAGTAC ACTGTTACCA TCTCGTTGCG 18300
TGTGCTCGCA GATGGTAACC GAGAGATTAC CGTAGCGTAs CgCAGAGTGC TTCCTGTCTT 18360
GCAAAAAGGC AAAGAGGGCG GGGTGTATGA ATATACTTGC CATCAACACC GTTGCGCATG 18420
CCCTCAACGT TGCAGCTGAA GGAGCACAAG GCACCGCTGT TGTGAGCATC GAAGGTGCGC 18480
ATTGTTGCAT ACAGCAACAG CTCGTGCGTG CGCTTGACGT TGTCGTAAAA CGCGCAGGAT 18540
TTCCTGTACA GGAAACACAA ATCGTTGCCT GTCCTCGGGG GCCTGGTTCA TTTACCGGCT 18600
TGCGTACCGG TTTTGCAGTT GCAAAAGCCC TACAGCTGGG TGTCGGAGCC CGTTTTATTG 18660
CCGTGCCTAC GCTGCGCcTT GCGGCACATC CGTTCCGCGC GTTCACAGGA CGGGTGTTGT 18720
CCATAcTAGa TGCAAAACGT GGTCGTTTTT TTTGGAACTG CTTTAAGTCA GGAGAGCCGC 18780
TCTTTGAAGA CTCTCACAAC CACGCACAAG AAATCGTAAA AAAAGTGGAC ACACGGGTTC 18840
CATGCCTGGT GTGCGGCACG GGAACAGCAC TTTTTAAAAG TGTAATGGAA AGCCAGGACA 18900
ACACGGTTCC TTTCATGTAC GTAGAAACTG ACGCTCATGA AGGAGCAAAG ACACTCCTTG 18960
CTTTGGTAAA AGTGCTCAAT CACAGCGCCG CCACTCCGGG GGAGCGCGGA GCGCCGCAGT 19020
ACACAACACG AACTTACGCA AAAGGAAGCT AATACTATGG GCAATTCAGA TATCTGTTCT 19080
GACATTAATG ATATCGAAGA ACTTCAATCT GAAGAAGGTG ATGCACCTAT ACGAGAAAAT 19140
GCCAATCCAA TCAGAGAGGA TTACAATTTT ATACGTGAAC AAAACCCCAT TCTCGGCTCA 19200
GGACTTGATC TTATCGGAAG TGCAAAACTG CCCATGCTCT TTTTAGACAG CAATCTGCTG 19260
ATTGAATATA TCAGCGCCGA AGCGAATTCT CTTTTTAGAG GTTATTACCA TCTGGAGAGA 19320
AAGCCGTTCT TTAATGTGTT TGGGAATATC CTCAGCCGTA AGGAACTTGA AGACTTTTTC 19380
TCTTGTGTCC GATCTCACTC TAAAGGATTT ACCTGGAGAG GCACGATGGC CCATAAAATT 19440
CGTGCAAAAA GAGCGCTATA CACGCGCACA AGTTTTATCC CGCTTTCCAT cAGCGACGCC 19500
CAACCTTCTG GATATATCGT TCTTTTCGAA GACATTTCAG ATATGTACTC GCAGCAGATC 19560
AGTAATATGC TGAGTAGTTT GCTACAAGCG TCAAAGCTTA AAGACAATGA AACAGGGTTG 19620
CACTGCGAGC GCGTTAATCA CTATTGCAGA CTCATTGCAG AATACCTGTA TGACATCAAC 19680
TTATACCCCC AAGTCGATAC GGACTTTGTA GAGAATATCG CCTTTCTTGC AGCTATGCAC 19740
GACGTGGGGA AAATTGGTAT TCCCGACTAC GTTTTGAAAA AACGTGGTGG ATTAAACGAA 19800
TTAGAGTGGG AGCTCATGAA GGAGCATACT ATCAACGGTG CGCTCATTCT TTCTTCTTAC 19860
CCTGACCCTA TGGCGAAGGA AATAGCGCTC AGTCATCACG AGCGCTGGGA CGGCACAGGA 19920
TACCCCTTCA AATTGGAAGG AGAGATGATA CCGCTTTCTG CACGTATTAC GAGCATCGCC 19980
GATGTATATG ATGCATTGCG TATGGAACGC TCTTACAAAA AGGGATTTTC TCATGAACAA 20040
ACTACACACA TGATTTTAGA ACAGTCTGGa CAAAGCTTTG aCCCCATTTT GGCACGTGTA 20100
TTTCAGAAAA TACATACAAA GTTCAACGAC GTGTGGGACA gCTACAGGAC TGAGCATCCT 20160
CAATCCTAGT CAGAGATAAG GTTTTCTTCG GTGTCAAATT GCTGCAAGGA GCTCATACCA 20220
GTTTCTGTCT GCATGCGGGA AATGAGAGCA AACAGGTACC GTGCGGTGTA AGCGCTCAGG 20280
GTGTATGCGT GCACGCTCTG TGCATCCCAC AGAGTCCACT CCTGGTGAGC AGCAGGCTGA 20340
AGACGAAACC GTATAATCTT TCCGTCTCCC CGGTGAATGA CAATTGATTC TTTCCATTGA 20400
GTCGGTGTAG GTCCAACATC AACATAAGAG AAACGTGCGA AAAACTGCTC GACCTCTTTA 20460
AACACTACAG ACGTCCGACT CCTCACGATG TATTTCCCTG AAACTTCCAT CATTTGTATC 20520
TGCGATTGTG TGAAGAGTTC ATGGAACTGC CGCGTATCAC ACCAAAACCG CTTTTCATCT 20580
CTGAAAAAAG GGGTAAAATC ATCCTCTGTC AAAAAAACAG CTGCGTTATC TCCAATTCGA 20640
ATATAACGCC CAAGTCCTGT ACCGTTCAAC GCACCAAAGT ATATGTCGCC TATAGCAGCG 20700
CCATTTTTTT TTGTAAACTT CATCTGCACT GCAGGTTGTT TTCCGAGAGC ATAGTCT 20757
(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22191 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
AGGAAGGAGA GGATTTTATC GTACGCGGTT TTCTCGATAG CGGATAGATT ACGATAATCT 60
TGCACGTCAC TGCTCATGTT GATTTCTTCG GGGATCCAGA AGTTGTTcAT TGCCtGCCGA 120
TACCACTTGC TGACCCAGGG ATACTTCATA TTGTTaAAGT CGTTGAGATT GGTAGTGTTC 180
CCCCCGACCA TGCGTCGCTT ATGAAGTTCA ATGTCTCCTG CCtCATTAAA CAGCGCGCGT 240
CTTTGCAGTA TCGTTGAACT TTCCATCATG ATGAAACCTC CTCCTGAGTG CGGGCACAGG 300
GTAGTATACT TGATGGACTT TGAATTCTCA ATTGCGGCGG GCGTAAAAAT GTCTGTCTGC 360
AGGGCTCTGT GTCTTGTGTG GTGGTCCCTG TTTTGTGGGT AGGTTTGGGG AAACCTAACA 420
GAGACGTAGG AAATCCGTTA TGCTCGAGCA CATGAAGCGT GAACAAGCAC GAAgTCAGCT 480
ATCACACGAG CCTCCTAAGC GGCGCCGAGC CTCTCTAACC GTCTGCGGCC TGCGTGCTGT 540
GGAAACGCTT GGCAGCATAC ATCCAGAGAA AATTCATCGG TTCTTTTTTA CCCCTGTGCG 600
TGCGAAACGC TTCGGACCCC TGTGTGCATA CCTTGCTGCA CGAAAGAGGC TCTACCGTAG 660
CGCTAATACG CAGGAACTTG AGCGATTGAC GCAATCCGTT CACCATCAGG GGGTTGCTGC 720
TACCATAGAC GAGCCGCGCT TTCCAGCCGT GACTCATTCT CAGGTTGAAT TTTGGGTACA 780
ACGGCGTGAG TTTGTTGTGT TACTCGATCG CGTAGGAGAT GCCCACAATC TGGGGGCGAT 840
TATACGTAGT GCTGCTTTTT TTGGAGTGCA CTCACTGGTG GTGAGTGACT GTCGACAGCA 900
GGCGCAGTTA CAAGCGCAAC ATATCGGGTT GCGCAGGGAG GAATGGAGTT TGTGCAATTG 960
TTGCGCTGTA CAAATGCGCA GGAAGTATTG GAAATGTGTG CAGGTAAAAT GACCCGTGTG 1020
GGAGCCTCCC CTCATGCGTT CAGATCGCTT ACACGGCTTT CAAACATACT CTCGCCTGAA 1080
GAAGCGGTAA TATTAGTACT GGGAAACGAG GAGACAGGGC TTTCTGAGCA TTTGACTGCG 1140
CATTGCGATC ATCTCTGTCG GATTGCAGGC AGTGGTCAGG TGGAAAGTCT AAATGTTGCG 1200
CAAGCGGGTG CGCTTTTTTT GTCCACTATC GTACAGTTGC GTCAATCTCC TCAGGACTAC 1260
ACGCAGGgAC ATCGGGCCAC GCCACGTGCA CAAGAGCGTG TGCACCGCTG TGGGCAATTA 1320
GAGGAAAAGG GGCAGAAAAA TGGAGCACGT GTTCTTATTC CCCGCTCGGG GGcGCGTGCC 1380
AATTCCCGTG AAAGTTGAGA GTAGGGAAAG TAGACGGGTG TGAGGTATGG AACCTACAGC 1440
GCAATTGGTT TTGCAGTGCG TGTTGACGTG CGTCCGCGTG TGCCTCTGAG AACGCAGCCT 1500
GCTCTGTTGT CGCGTAGGAA AACTTGTAAA GAGCTTGCAC AAAGCGTCTT TTTTTTTTAC 1560
CCTCGACCCC GTTCGCCTGA ATTCATCACG TAGGAGTTGC CCGCATGAAT CAGATCCGCC 1620
TGTTTGCCCA GAGTGCGCTT GTGAGCGTCA TGGGTATGGG GATGGTTTTT GCCTTCCTCC 1680
TTTTGCTCAT ATGCGTTGTG CGCTGTGTGG GCGCGCTTGT CTCTTCTTTC GGCTGGGATC 1740
GCGGTCCTGA CGAAGGTGTC GGCGCTGCAG TCCCTGCAGG AGGAGCACTC GCCGCGGCTA 1800
TCGCAGTCGC CGTTCATGAG AAGGCAAGGA GTACTTCATG AGTACCCCGG TTCGCATTAG 1860
CGAAATGGTC CTACGTGATG CGCATCAGTC TTTGcACgct ACGCGCATGA CTACCGAAGA 1920
CATGCTCCCT ATTTGTGACA AGCTAGATCG CGTTGGGTAT TGGAGTTTGG AGGCGTGGGG 1980
AGGCGCCACG TACGACGCCT GCATTCGCTT TCTAAATGAG GATCCCTGGG AGCGTTTGCG 2040
TGCTCTCAAA TCTCGGTTAC CTAAGACCCC TATTATGATG CTTTTGCGTG GGCAAAACTT 2100
GCTAGGCTAC CGGCATTACG CGGATGACGT TGTAGATGCG TTCGTAGAGG CCGCTGCACG 2160
CAACGGCGTT GATGTGTTCC sCATCTTCGA TGCACTTAAT GACCCACGTA ACCTCAGTCC 2220
AGsTGCGCGT GCTGCAAAGA AAACAGGCAA GCATGTGCAG ATGGCTATCT CTTACGCTAC 2280
CACACCCTAT CATACCGCAG AGAAGTACGT AGAGTTAGCA AAGGAGTATG CGCGCTTCGG 2340
TGCGGATTCT ATTTGCATTA AGGATATGTC GGGGTTGCTG AcCCCGTACG GGGCGTACGA 2400
TCTGGTTTCT GCCATTAAAA AGAGTGTCGA TTTGCCCGTT GAGTTGCACA CCCACGCCAC 2460
TACTGGTATG TCTGTTGCAA CCCTGGTGAA GGCGGCAGAA GCAGGTGTTG ATGTAATTGA 2520
CACTGCCATT GCTTCTATGT CCATGGGTAC TTCCCACAGC CCTACAGAGA CTTTAGTGGA 2580
AATCcTACGG CACACGGGCC GTGACTCAGG GCTCGACATA AATCTCCTGC TAGAAATAGC 2640
AGCCTACTTC CGTCAGtACG GAAGTGCTAT GCCCAGTTTG AGTCTAGTTT TCTGGGTGCA 2700
GACACGCGTA TCCTCGTGTC CCaGGTGCCT GGGGGTATGC TTTCCAATTT AGAAAACCAG 2760
TTGCGTGAGC aGGGAGCCCT GGATAAGATG GACCAGGTTC TTAAGGAAAT TCCCCTGGTA 2820
CAGAAGGACT GCGGTTATAT CCCGCTTGTG ACTCCTACGA GTCAGATTGT AGGTACGCAG 2880
TCAGTATTGA ACGTGCTGTT TGGCCGGTAC CACCGGCTTA CTGCTGAGAC AAGGCGTCTG 2940
CTCACGGGTC AGTATGGCCG GACTCCCGCC TCCTGTGATG CAGGTTTGGT GGAGCGGGCC 3000
TTGAAGGAAG AAAAGTTATC GCAGTCGCTT GTCTGCCGCC CAGCGGATGC CTTGCCTCAT 3060
GAGCTTGATC GCATGAGGTC TGAGGCcgCs CCGCAGGCGC ACAGGATACC ATTGAGGATG 3120
TGCTCACGTA TGCTATGTTT CCCAAGATCG CTCCCACATT CTTTGCTTCC CGTGCGCAAG 3180
GGCCTATTTC GTTCAGAGGA AAGGGGCAGG GGCAAAAACA GAAGGGTGAG AGTGCAGGGT 3240
CGGTAGCTTC TTATGTGGCT ACCGTAAATG GTAcTGCGTA CACAGTTGTG CAGGAAGGCG 3300
CTGTTCTCCG GGTAAATGGT ACTCCCTACA CCGTTAGGgT TGAGGCAGGC CCGTCCGTTG 3360
CTTCGGG AC GTCGCAGGGT ACCGTGACTA CGGCAAAGGT TGGGGCGTGT ACTACGCTAC 3420
CCGCGCCGGT CGCAGGTAGC GTAtTAAACA CACCGTGCAA GATGGAGCTA CGGTAAATTC 3480
GGGGGAGACG GTGCTCATGG TGGAGTCCAT GAAGATGGAA CTTGAAGTGA AGGCCACCGC 3540
TGCTGGTACT ATCCATTTCC TAATAGCGCC TGGCGCGCAT GTCAGTGCGG GGCAAGTCTT 3600
AGCAGAGATT CGCTAGAGGA TGTGACCATG AATACGCGTT TACCCCTTCG AGTACTCCAG 3660
TGCGTGTTGG TGGGATTGCT TGTGTGCGGG CCCCTGTGTG CAGctACGCG CCGCCCGGTa 3720
CGTGCTTCTG CGCCGGTGCC TATGGTACAG AGTTGTAAAG ACACGGGGGC ACGATGTGCG 3780
CCGGCGTCGT CCATGCGTGA GGACATGCGT GCGTcACACG GAGcTGCGCC GCTTCTCTCT 3840
GTAAGGAAAT TTTTACTCAA TACGTGGCAT AGTACCGGTC TCTACGCTTT CTTTCATGGC 3900
GTAACACAGG TGCCGGATCT TGCAAATCCG CaGCGAACAC ACAGCGTGTT CGGTTATCAA 3960
CAGcGTTGCT GCTCGTGGTT GGTCTGCTCc ATCATTTATC TCGGTGCTGC TAAGGGCTTT 4020
GAGCCGCTGC TGCTCATTCC TATTGGCTTT GGTACTGTCT TCGTCAACAT CCCTGGTGCG 4080
GGCATGTATA GTGAGCATGG TATGCTCAAA CTCATTTACG ATGCTGGGGT GGGGAATGAG 4140
TTTTTCCCTA TGCTCATTTT TATGGGTATC GGTGCACTTA CCGATTTTGG ACCACTGATT 4200
GCGAATCCTA AAATGGCAGT CCTTGGTGCC GCTGCCCAGT TAGGGGTGTT CCTTACTCTC 4260
TTTGGGGTTG CAGCGTTGAA CTTTGTACCC GGGATCCGCT ACTCCATCCT GGATGCCTGC 4320
GCCATTGCCA TTATCGGTGG GGCGGACGGG CCAACTTCCA TCTACGTATC TGCGAAgcTT 4380
GcTCCCGAAC TCATGGCCGT TATCGCGGTG GCGGCATATT CGTATATGGC TCTTGTACCT 4440
ATTATTCAGC CTCCGCTTAT GCGCCTGTTA ACTACCAGAA AAGAACGTCT TATTAGGATG 4500
AAACAGCTGC GTCCTGTTTC GCGGATAGAG AGAGTACTCT TTCCGCTTGT CTTGCTCTTG 4560
CTCTCGGTGC TGCTCATTCt GCGGCTTCCC CACTCATCGG TATGATCGCt TCGGGAACTT 4620
TGTTAAGGAA TGCGGTGTTG TGGAGCGGTT GTCTAAGACG ATGGCTAACG AGCTTTTGAA 4680
CATCGTGTCG ATCTTGCTGT CTTTGGGTGT TGGTTCTCAG ATGACACCCG ATAAGATTAT 4740
GAACCCCAAT GCCTTGGGCA TTATCGTGTT GGGACTCGTT GCCTTTTCTG TCGCAACCGC 4800
AGGGGGAGTA TTCATGGCAA AGTTAATGAA TTTGTTTTTG AGCGAGAAAA TTAATCCACT 4860
TATCGGTTCC GCAGGGTGAG TGCTGTTCCT ATGGCCGCGC GTGTTTCTAA TAAGGTGGGG 4920
CTAGAGGAGG ATCCTTCTAA CTTCTTGCTT ATGCACGCGA TGGGTCCTAA CGTGGCTGGT 4980
GTCATTGGGA CCGCGATACC GCAGGGtGTT CATCTCGGCC TACGGAGGGT AGGGAGGAAG 5040
AGTAACCGCG GGGTTTTGCC GCTTAGGTAC CCTTTCCTCC GTGCGCGGGC ACACCCTCTC 5100
AGGTGGCTAG GGGCTTTTGC AGACGAAGCG GGTAAAGCTC GCTTGGAATT CGAGCTCAAC 5160
TGTCCCAATG GGGCCGTTGC GTTGctTCGC TAAAATGAGT TGAGTCTCGG TTTCGTTGCG 5220
GTCGCGGTGT AGAAACATGA CCACGTCGGC GTCCTGCTCA ATTGCCCCCG AACCACGAAT 5280
GTCCGCCAGG TTTGGCGCAG AGCCCTCTGC CGGTCGACCG ACTTGCGAAA GTGCTACGAT 5340
GGGGATGTCT AGCTCGCGCG CGAGGCTTTT TAGTGATTGG GAGATCGCTG CAAATTGTTC 5400
GTAGCGGGGC GCAAAGGGAT TGTCTGCTAC GATAAGTCCC AAGTAGTCGA CAAAAATAAT 5460
CTGGATCTTT TCTTGTACGC ATAATCGACG AGCCACGGCA CGGAGATCCA GTAGCTTCAT 5520
GTTTGGCACG TCCACGATGT AAAGTGGGGC GTCGTACATC TCTCtGCGGC GTTTTGGATG 5580
CGCCCGAAAT CGGAAAGTTG TAAAAGCCCT TTGCGCAGGT TCGTCGCGGA TACTCCTGAC 5640
TCTGCAGCGA TAAGTCGCTG CATCAGAAGC AAATTAGACA TTTCCAGAGA AAAAAAGGCG 5700
GTTGGAATAC GTTGCCTTAT GGCAATGTTC GAGGCCATAG TCATGGCGAG CGCAGTTTTC 5760
CCCATGGAAG GACGCGCACC TATGACAATA AGCTCGGAGT TCTGGAATCC ACCGGTAAGA 5820
TTATCCAGAG CCGTTAGcCG GTGGCAATTC CGACCAGATC GCTTTGATTT CGGTAACGAG 5880
TCTCAATAGT ATTGACCAAA TCAGGAATGA GGTTTTTCAG CAATTTGAAG GTTGCTACTC 5940
TCCTTGcATT TGTTAGGTCA TAGATTTCCC TTTGTGCTGT TTCGAGTACG ATGTTGCCCG 6000
ACACGGTGTC ATTGAATGCC TCTGCGGTGA TAATGCGGGC TACTTTTAGT AGCGACCGGC 6060
GCATGGCAGC GTCGCAAACG ATGCGTGTGT AGTATTCAAC ATtCGCGGCG CTTGGGACCG 6120
CATCGgTGAG AGAGGCAACA TACGCGCTGC CACCGACGAA ATCGAGCGCC TCACAGGAGC 6180
GCAGtGCTCG CTGAGCACGA GGATATCAGG GCGTTGACCT AAATCCGATA AcTCTACGAG 6240
TGCTTGAAAG ATGCGCTGGT GCGCAgcGGA ATAAAAAGAG CTCGCAGACA ACTGCTCTGT 6300
TGCCGTGCTC AGAGCAGAGT CATCCAGTAG AACAGCGCCG AGCACAGCCC GCTCGGCCTC 6360
TAGGTTATGA GGGGGAATTT TTCCCTTGAG TTCCTGAGTG GGATTAGGCA TGCCCGGCAC 6420
AGAACCTCCT CCGAGGAATA CTCAGAGGAG GGAAGGTGGT GAAAAGACAG TCCCCCCCCT 6480
CTCCGTGGAA ACCcTCTAAC GAGGCAAAGG GTTGAGGCCC AACTGCCAAG AACTGTCTCC 6540
CTTCCGTGGA ATCCAGCCCC AAAGTGGCAC AGTTCTTTGC ACAACTAAAA AACGAACAGC 6600
ACATACTCCG TGCAGAGAGA TACTTCCGCA AACGGTCTAC TCACTCACAC TGTCCgCTTC 6660
GCTTTGGTTT TTGATGGTGA CAGGAACAAC AGCACATATT TCCTCGTATA GTCTTATAGT 6720
GACGTGATAG TTCCCCACAC ATTTCAGAGT AAGACCAGGG ACCTCCACAC GcTTGCGCTC 6780
AACCTCAAAT CCCATGCACG CAAGTTGTTC TGCAACGGTA TGACTCGTGA CAGCGCCGTA 6840
CAACTTACCG TTGGTACCGG CGGGCATGGC AATAACCACA GGCTGAGCCT CTAAACGAGC 6900
CTTAAGATTT GCGGCATCTT GTCGCTTGAG AGACTTTCGC ATCTCTATGT CCTGTTGTCG 6960
CTGTTTGAAG CGAGCCACGG TAAAACGATT ATGAGGAACA GCAAGGTTTC GAGGGTAGAG 7020
GTAATTACGA AAATAgcTGC GGCGACCTCT TTCACATCAC CTTCTTCACC AAGGATCTTT 7080
ACGTCTTGAT TGAGAATAAT CTTCATACGT TTTGTCCCCT CCTACGCAGG AATAAAAGTC 7140
CACGCAGAGC CACTCTGCCT ATGCACAGCA ACTTCTACTG AACTAGGAAA AACAACCGCT 7200
GTATACCAGC GCTACTCGGT CAGAACGAAA GGTAGGAGCG CAACGGCGCG GGAACGCTTG 7260
ACTTCGAGAG CAACACGGCG CTGGTGTTTG GCACACGTAC CGGTGATGCG TCTCGGCAGA 7320
ATCTTACCCC GCTCTGTGAT AAAGCGACGA AGCGTGTCCG GATCCTTATA ATCAGCTAAA 7380
AGCTTCTGCG TGCAAAAACG GCATACTTTC TTTCGATAGA ATTGTCTGTT TTTCTTGGGT 7440
GCGCTCTCTT CACTTTCACG AGGAGAACTG AGATGCGTAT CCAGGTCAAC ACTCGGATGA 7500
TCTTCTGCCA TGATATGCTC CTAGAAGTGG AAAAATTAGC ACGCCCAATT CCTTTCTGAG 7560
GTCCAGGCGA ACGCCAGGTT CGGTCATACT TTTATTCATT ATGCATGCTC CTTGTGGGGA 7620
AAGAGACCTA AAATGGAACG GTGTCCAGAT CTGAACTTGA AAAATCAGCC TCATCAAGGg 7680
ACGAGGTTGC GTCCAGTCCT CTTTTCTCAC CCAAGATATC GCCTGTGGGT GACcTAGAAT 7740
CTGAGGATGT GCCGCGGACA GACCAGCGGT CGACTCGGCG GCAACGCGCG AGGAAGAAAA 7800
CTCACCGTCC TCGGCACGAG CAGCGCCACC GAGGACAGAA CCGAGGAGTT GAACGTTAGT 7860
CGCAGAGATC TCTACCTTGC tGCGTGACTG CCCCTCTTGC TCCCAACGGC TTTGACGCAA 7920
TTCCCCCTCG ACGGCTACCT GCTTGCCTTT GATAAGATAC TGGCTGATGA CTTCGCCCTG 7980
GCGTCCCCAT AGAACGATAT CGAAGAAATT AACTTCCTCA ACCCAATCAT CACCACTTTT 8040
CCTGCGCCGA TTGATAGCAA CAGAAAAACG ACACAGAGCA CCACCTGCAG AAGTGTACTT 8100
GAGTCCGCAT CACGCGTAAG CCGACCAACG AGCACTACAT GATTGACGTC TGCCATATCT 8160
TCTCTCAGGA ATCGACACGC ACAAACAAGT GCGTGAGTAA GTCGTGTCGC AATCTGAGCT 8220
TATGATCGAG TTCGCGCACC TTCCCCGGCT CACACTGAAC AATGAAGAGC AGGTAACGGC 8280
CCCTCTTTTG CTTCTTCAGA GGATACGCAA GTTCCCGCTC TCCAATATGG TCTTCGCGGG 8340
CGATGAcTGC GTCGTTTTCC TGTAGGAGGG CACGAACGGC GGTGGAACCC TGAAGAAAGA 8400
GATCCTCGTG TGCACTGAAA ACGGCCATTA GCTCGTAGGT CCTCATAATA ACTCCTATGG 8460
ACGGCTCACC TTATCAGGCT GCACTAGACT CACCCGACAA GGAGGAGTAG CGTGCGCAGA 8520
ACGCGCGGCG AGaCGCCAGC GTAGTAArAA TCTGTGTGCC GCATCAAGTC TTTTGTCCTT 8580
TTATCCGATG CTGCCGACGG CGCACGTCGT cTGAGGGGGT ACATTCGTGC GTGTGCTCCT 8640
TACGGCAGAC GaGGTGAGGA ACCTGTTCGC AGAAGGTGCG CTTTCCTGTG TGCCCGCGTC 8700
GGTCACGTAT GTGTGGGGAG GTcTGTGGTG CAGAAtTTTG TGTTTCAAGA cAATATTTAT 8760
CACCTAGCAC GTTCTATTGA TGTCCTGTAT GAGGGGCTTC AGCTTAACTT AGATGAGTGC 8820
CTTTATGCGG AGAAAGTTGT GTATGACGTG CGTTTTTTTG ACCATGCGTT GCAAAAGTTG 8880
TGCGCGCATA TCGATCGCCA GTCTCACTTC CCCGATTACT TACCAATTCT TCATTGCCTA 8940
TTCTCCTGCG GTGCACGATT CTtGAACTTA TTGAATTTTC TTATTCATCG TGCCTCTCCT 9000
GTGACTGCGC AGGTTGAGTT TACCCGGATG CTCCCGTTTA TTGAGAAAAG ACACAGCGCA 9060
TTGCATGAGA ACCTCGCACG CTCCATTCAA GAGGTGGACA CGAGCGCAGA TGCGAGTCAC 9120
GTGGTTTCTC AAGATGAGAT AGATGAGTTG CTTGAGCATT AGGTAATCCG CTTGCTTGTT 9180
ACAAGGCGCG ACGTTTTGTG TACGGAGGTT GGCACGGGTG AGGATTACGG GGGGGATGCT 9240
AAAAAACCAC GTATTACGTT GTCCGGATGG TCCTATCCGT CCTGCAATGG ATCGCATGCG 9300
TGAGTCGTTA TTTGCGATTT TGGGTGATAT GCGCGGCTGT TCCTTTTTGG ATCTTTTCGC 9360
CGGATCGGGA GTGTGCGGCT TGGAGGCTTA TTCACGCGGG GCGTATCCGG TAGTGTTTGT 9420
AGAGTGGAAT GTGCGTTCTT TTTCTGTTTT GTTGCAGAAC GTGCAAGTGG CGCTGTGTCG 9480
TTTGGAATGT AGATGTATGG CAGTGGAGCG GTATATTGCG CGTGCACGTA CGCTGTTTCA 9540
TTTTGTTTAT CTTGATCCAC CTTTTCCCTA TCGCTTTCAC GCTGAGCTGT TGCAGCGGCT 9600
TTCTCGTGCG TCATTGTGTA GAGAAGGAAG CGTGGTGATG GTGCACCGAC CAAGAGAGAA 9660
AAAACTTGCG GATAAAATCG ATTCACTTGT GCGGACCGAT CAGCGTGTGT ACGGGCGCTC 9720
GGTAGTTGAT TTTTACCGCA GGATAAAGCC GGTCTTGCGT AGAACTGAAA CCCGTTGTAG 9780
GATCTTGCgT TTTTTGTGCG AGGGGATGCA GAGGGTGCTg AGAGCGACAC GCCGTACTTC 9840
GTTaAAGGTA TACAGCGCCC TGTTTcCACT CTCTCCGATC GCGACCGGGg CGCTGCTAAA 9900
CAGGCGGGGC AATGCATACC TGAATGAgGG GAAGCTGCAG GAAGCGGCAC GCGTTTTCAT 9960
TACCACTGGT TACCACGATG GATTGACCAG GATTGGCGAT GTGTATATGC GCAAGGCGGA 10020
CGTGTTGaCC GCACTGCGgT TTTATTATTT TGCTCGCAAT GAGCAGAAAA TGAGGCCCAT 10080
CGTTTCAGCG CTCTCAGTGT TGATTCGATG TCTTATATAA TTTTGACGCC GGGGGGGGGT 10140
ATGGGGCTTA TGGGGTGCGA GTGTAGTGTT CCGCCTTTTG AAACGGTTGA AACGGGCAAA 10200
CGGTCAGTGG ATGCGGACAG GATTATGCGT CTGTCCCGGG ATGGcTACGC GTTCTTAAAG 10260
GTCAATGACC TTGAGAGGGC AGAGAGCGAG TTCGGTAAAA TTCTCCAAAT TGAAGCGGAT 10320
AACAACTATG CCCTCGTGGG GCTCGGGGAT GCGGCGCGCA AGAGACGCGC GTACCAAGAG 10380
GCATCTGACT ATTACACGAG GTGCTTACAG CATTACCCTC GCAACAGCTA TGCGCTCTTT 10440
GGTCTTGCGG ACTGTTATAA AAACATGCGT CGGTACGTGA AGGCAGTGGA AGTGTGGCAG 10500
CAGTACCTGG AGCAGGATAG CCACAACATT GCGGTGCTTA CGCGCATGGC CGATGCTTAC 10560
CGTAAAATAC ATGATTTTCA AAACTCGAGA AACCTTTACT CCCAGGTTAT CGCCCTGGAT 10620
GAACATAATT CCTACGCGCT AATTGGGCTT GCTCACTTGC ACTACGACTT CAAGAAGTAC 10680
CGTGAAGCAC TCATCTACTG GAAAAAGCTC CTGGAGTGTG CAGAACACAG TGTGGATATC 10740
CGTGTACTCA CCTCTATCGG GAATTGTTAT CGTAAAATGA AACTCTTTAG TCAGGGATTG 10800
CCCTATTTTC AGGAAGCGCT GAAGCGTGAC CCAGGTAATT TTTATGGATT TTTTGGGATG 10860
GCTGACTGCT ACCGCGGCAT GAACATGCAG GAGCGTTCCA TCCAGTACTG GGAGAAGATT 10920
CTGGAGAAGG ACACGCAGAA TCGTGTCATC CTCACCCGTA TTGCCGACGC ATATCGGCAC 10980
ATTGGGGAAT ACGAAAAGGC CCATCAAACG TATAAAAGGG CGTTGGATAT CGATTACGAT 11040
GCCTATGCCA CGCTCGGGCT CGCAGTCCTT TGTAAACTCC AGGGGAGATA CGAAGAGGCG 11100
GTTGTGAGTC TTGATCGACT CGTGCAGCTT GATCGGAAAA ACTATCGCGT ATATGTGGAG 11160
CTTGCAGACT GCTACCGCAA GCTCGGGCAG AAGCAAAAGG CGCTTGAGAC ACTTCGTCCC 11220
TTTCAGCAGT TTGGGGTTAA GAACCGTGTT GTTTCTGAGC TTATGAGTGA GTTGGAGGGT 11280
GCATCGTAGT TTCCCCCCTT TTTCTTCCTG GTAAGCTTGG GTATGGGTGG CGTTTGCCAA 11340
GTGTGAACGG cTCGTTGCGC AGGTGTGGCT GCGCGATGCC TGTCGATTCG GCGCTTTTTG 11400
TGTGGTGGTC GGGGTCCTCC TCTTCGTCGG AGGTGGTGCA GGACTTTGCG ACGAGGGGGT 11460
GGAGCGTATA TGGAGTGGTG CTGCGCCCTT TCGGGGCTTT TGCCAGAAGA AATCCAGAAG 11520
GTGTGTGCGT TTGCTGAGCG CTTTCGTGGG GTGCAGGTGT TCAGATGGAT TGCCGCAGGG 11580
TGCACTGACT TCCATGCGAT GAGTGATCTC TCTTCTGAGA CGCGTGCACG CCTGGCGAGG 11640
GCGTGCGTCA TCTCGGACAC TCGTGTCTAT ACCACGCTGC GTGATGTGGA TGGTACGCTC 11700
AAGCTGGGTA TTGAACTGAA AGATAAACGG CGCGTAGAGG CAGTCTTACT CGTCGATCAA 11760
GTCTCGCGTA AGACTGCTTG TCTATCCTGT CAAGTCGGCT GCCCTATGGC GTGCGCGTTT 11820
TGTCAAACAG GCCAGTTGGG TTTCGCGCGA AACCTTTCTG CCTCAGAGAT CGTCGAGCAG 11880
TTCCTTCATC TGGAACGATG TGTCGGTACA TTGGATAATG TTGTGTTTAT GGGAATGGGT 11940
GAGCCCATGC TCAATCTGGA TGCGGTGTGT AGAGCTATTG AGATACTGTC TCATCCACAG 12000
GGTCGTGACC TATCTGAAAA ACGTATTACT ATTTCTACGT CTGGACATTG CCGTGGTATT 12060
TATTCGCTTG CTGACCGCGC ACTGCAGGTT CGCTTGGCGG TGTCTTTAAC CACCGCGAAT 12120
GCACCGTTGC GCGCACGCCT CATGyckCnt GCGCACGACA GTTTAGCAAA ACTGAAAAGC 12180
GCTATTCGCT ATTTTAACGA GAAGAGTGGA AAGCGTGTGA CACTCGAGCT CGCCCTCATG 12240
CGGGGAGTGA ATACTTCTGA ACGGCATGCG CAAGAAGTTA TCGATTTTGC ACATGGGCTT 12300
AACGTGCACG TGAACTTAAT TCCCTGGAAT CCGGTAGCAT CAATCCACTT TGAAACACCT 12360
CGGGAAGTGG AGGTTGCGCA TTTTGAGGCG CTTCTCATGC GCGCCCGCAT CCCCGTGACA 12420
CGCCGCTATC AGCGTGGGAA TGGCATTGGA GGCGCATGCG GACAACTAGG TAAAACAGCC 12480
GGCGTGTAAC TCTTTCGCTC CGTTTGTAGA TGTTTGTACG TGTGGCCATG CTCGTTCCTG 12540
TTTTTCAGGA AAGTTTCTTT GAGGCAGCCA GCGCTTTTGC CTTGCTCGAG TAGAGCTGTA 12600
CGCAGTCCCA GGGTAACCAA CCTCTTTCTA CGTTCACCCA CAGCGCTGAC TCTTTATCCT 12660
TTGAAAGCAA ATCAATGCCG AGCACTGGAA ATATGTCTTT GCGCCGTGCG TACGCAATGA 12720
CGATCCCCGT GACACCTGGT TTGTCACGAA GCAAAACATA AGCCTTGGTG ATTACTGCGA 12780
AGCGACGCGC TTCTCCAATT GGAGAGCTCG AGGGGAACCG TACATCTTCC AGTGAATTTT 12840
TCCGCACACA GGCGGAAGTG CTGCTGAGTA CGAGCAGTAC CAACGCCCAT GTGCGGCGCT 12900
GCAGCACGGG GATACCGCGT ACCTTCATAG AAAAAGCGTC GGCGCAAACA GAAGGTGTAG 12960
AATGCTCGCT GCGAGAGAAA ACAATCCCcA AGGGTTTGGC ATACTGCACC AGCGTGCGCG 13020
GAATAGGTGA CGGCTTGGAA AGGGCCACAC AATACCGATG TAGAAAAAGA GCACATCCTG 13080
TACCGCCGGC AAGGGTGGGT ATGAGATCCC CAAGCACGGG AAGTTGGGCA CCGATAGGAT 13140
TAACGCACTT GCACACTGCG GCAATTCCCG AGAGAAGCGC CACACTAGAG AGAAAGTCGC 13200
GTCATACAGA AAGGGACAGT CGTGCGCACT CTCCTCTTGG GAGTCGAGCA CCGTGAGGAT 13260
AAAGCCTATC GCAGCGTTGG TCGCCACAGA TAAAAAGTAA AACGGTAACA TGATGGTCTC 13320
CTCTTACATG TGGCATGACA TTGCCATATT TGAATATGCA CTCAGGATGT TTGTTCAGTG 13380
AAACACAGAG CATCTTTACA CATATCCACC ACGATAGTTG AGCCGCCTCT GAATCGGCCA 13440
CTGAGAATCT CACGCGCTAG GGCATTTTCC AATTCCGTTT GGATTGCACG CTTCAGTGGT 13500
CGTGCTCCGA AAGTGTCGTC GTATCCGCGC TCCGCAAGAT AGGCTTTCGC CGCGTCACGC 13560
ACACGAAGTT TTATATGTCG ACTTTCCAAA CGCTCCACTA CCATCTGCAG TTGGATGTCT 13620
GTGATGAGGC GAATATGTTT CCGTGTGAGA CGCTTAAAAA TTAACACTTC GTCAATCCGG 13680
TTTAAGAATT CTGGGCGAAA GTATGTGTGC AGTAATCCCC GTATCTGCTC TGGTAGAGTT 13740
TGTTCTTCTG TAGATTGTGT CTCGGGTACA GGCAAGTCCG ACGTGTGTGT GCGCGACTCG 13800
CGTGCAGAAA GAATATGCTC TGATCCGATA TTGCTGGTCA TGATGATGAT CGTGTTGCGG 13860
AAATCCACCA CCCTTCCTTG GCCGTCAGTC AAGCGCCCAT CGTCGAGTAT TTGCAGGAAT 13920
ATATTAAACA CATCCTGGTG CGCTTTCTCT ACTTCATCAA AAAGAAGTAC GCTGTAGGGT 13980
CTACGTCGTA CCGCTTCTGT CAATTGTCCC CCCTCGTCAT AGCCCACATA CCCCGGGGGC 14040
GCGCCAATGA GTCGGCTGAT CGCGTGTTTT TCCATGTATT CACTCATATC GATACGCGTC 14100
AGTGCACGCT CATCGTTGAA AAGAAAATCA GCTAACGTAC GTGCAAGTTC TGTCTTTCCT 14160
ACCCCCGTGG GACCGACACA TAAGAAACTG CCAAGAGGAC GGCGCGTATC AGAAAGTCCT 14220
GCCTTATTAC GACGAATCGC GTCGGAAATT ACCCGCACTG cTTCGTCCTG CCCTACCACA 14280
CGTTGCATGA GTACTGACTC AAGCTGCAGA TATTTCTGTT GCTCGCTTGC CATCATTTTG 14340
GaTACCGGAA TTCCGGTCCA CATAGAAATA ATTTTCGCAA TGTCCTCTTC ACACACTTCC 14400
TCGCGCAAGA GCTGTCCTTC TAGACCGGAT TTTTTCTCTA CTTCTGCAGT AAGAAGCATG 14460
ATTTTTTTTT CAAGTTCTGG AATTTTGCCA TACCGAAGTT CTGCAGCCTT GTTCAGGTCC 14520
CCTTCACGTG AAAACATGGT TTCCTCAATG CGGAGACGCT CAAGCTCCTC TTTGTAGCGG 14580
CGTGACTCTT CTATCCTCCC TTTCTCATTT TGCCATTGGA CCTGCATTGC AGCACGGCGC 14640
TCTAGGAAGC CTGCGAGCTC TTTTTCTAAC TTTTCCAAAC GTTCCTTTGA AGCCGGATCA 14700
CTTTCTTTAA GGAGAGAGGC CTTTTCGATA TTCAGCTGTA ATATCTTGCG CTCCACCTGG 14760
TCTAGCTCAA CAGGCTGACT TTCAATTTCC ATTTTCAGGC GGCTTGCTGC TTCATCCACC 14820
AGATCAATCG CCTTATCTGG TAAAAAGCGG TTGGTGATGT AACGGTCAGA CAAAACGGTT 14880
GCTGCAACAA GCGCTTCATC TTTGATACGT ACCCCGTGAT GCAtTCGTAC TTTTCTTGCA 14940
AACCGCGCAG GATAGCAATG GTGTCCTCCA CCGTAGGCTG TACGCAGTAC ACTTGCTGAA 15000
AGCGGCGTTC GAGCGCTGCG TCCTTTTCGA TATATTTGCG ATATTCGTTG AGCGTGGTTG 15060
CGCCGATTGA ACGCAATTCA CCGCGCGCAA gcgCAGGTTT CAGAAGGTTC GACGCATCCA 15120
TAGATCCCTC ACTTGCGCCG GCGCCTACGA GCGTGTGTAG TTCATCAATG AATAAAATAA 15180
CGCCACCGTC GCTTTTCTGT ACCGCTTCAA TTACCGCTTT TAGTCGTTCT TCAAATTCCC 15240
CGCGGAACTT TGCACCGGCA ACCAATGCGC CGAGGtCAAG GGAAAGCAAA CGCTTTCCCT 15300
TGAGGCTTTC TGGTACGTCT CCTGAAACGA TACGGCGTGC AAGTCCCTCG ACAATAGCGG 15360
TTTTCCCTAC GCCGGGTTCT CCAATAAGCA CTGGGTTATT TTTTGTACGA CGTGAGAGTA 15420
CCTGCATAAC GCGCCGGATC TCTTCATCAC GTCCAATAAC CGGATCTATT TTTTCTTCTC 15480
GGGCGAGGGT AGTAAGATCT CGGCAGTATT TCTCCAAGCA CTGGAATGTT GATTCTGGAT 15540
CCTGGCTCGT AACGCGCTTG CTGCCGCGTA TATCTTTGAG GGCGGCACTG ATAGTTTTAC 15600
TGGTAATGCC CTGACTGTGA AGGAGACGTG CAGTGTTGCT ATCTGTCTCA CTTATGGCAA 15660
GCAGGAGATG TTCGCAGGAG ACATATTCAT CTTGGTTCTT GAGCGCGAGG CGTTCTGCAC 15720
GTGcACAGGC TTTGCTCAGC GTTGGTGCAC AGCGCGTTTG GGCGGCAGGA CCGGTAACAC 15780
GTGGTTTGCG GCGCAGGCAT TGGAGTAATT CATCGTACAG GAAGTCCGGT TTTGCCCCAA 15840
TTTTTTCAAT GAGCGGAGAG ATAATCCCGT CTTTCTGGGA AAGTAGGGCG TGGAGTAGAT 15900
GTTCCTCCTC AACTTGACCG TGGTTCTCCG CTTCTGCCAG AGATATGGCG TCATTGAGCG 15960
CTTCGCTTGC TTTGACTGTG TACCTGTCTG TGTTCATGGC GTGATTATAG GTCTTTTGAA 16020
CGCTTTTTtC TCGTCATCGG TATGTTTTTT CTACCGCTTG CAGGGGACTT ACGGGAGTAG 16080
TCGCGGTGGA GAACArGGGT GTACATGGTA TGCGGTGCGC TTtGGCAGGC CGCGGTAAGg 16140
CGTAgCcTTT TATATTTTCT tGTTTTGAAG TAGGCTCCTG CGAGTTGGGA GGTTGGGAAT 16200
ATGGAAAAAA CGTTAACGCT TTTTGGGAGC AACCATTGTG GACAATGCGC TACGGCGCTT 16260
GAGTATTTGC GGAGCAATCA CATTAACTTT GAGTACGTGG ATATCACCGG CAGCGGGAAG 16320
AACCTCAAGC GTTTTTTAAA GATGCGTGAT TCAATGCCAC TCTTTGATGA CGTGAAGAAG 16380
GAAGGGCGCA TTGGTATCCC GTGTCTTTCG GTGAACGACG GAGAACAGGT CTTCCTTGGT 16440
GTGGAAGGTT TGGACCTCTC GGCGTTTCGC TAGGGGTTAG TGGGCCGGTG TGTTCGGGGG 16500
CGTGATGTCG TCCCCCTCTT TGCGCTCTTT TCGGGCGACC AGCTAGGCGG TTTGCTCTGC 16560
TTCAAAGCTG CTCTGGAGCG TCTGTGCGGT TGTGCGCAgC TGCTGTTCAG ATTGAATGCG 16620
GAAGAAGGAG AGGAGGCTGC GTGGGATCTG CTCGTAAGGA GTTTCGCCGT CGAGAAAGCG 16680
GATGGTCATG TTTGCAAACG CCACGGTGGA GATGAGCGGC TTATATTCCT CAGGAGCGTG 16740
TGGCAGATTG TGGTGAAAGC GAATGGTGTT GATAACCGGT TCAGGTAGAT TCCAGCGCTG 16800
TGCGAGCGCC GCGCCAATTT CCGCGTGGCC TACGTCGGAC ATGATGGTGT CGAGCACATG 16860
CGGGGGTATG TTACGctCCG CCTGAATTTC TGTGAGTTTG ATGAGCATCT CAGGGTATGC 16920
GGAGGTGAAA ACGACTTCCC CCAGGTTGTG CAGGAGGCCG CAGATGTAAG AGTCTGCGAT 16980
GAGCGCCTGG TCGCCAGTGG CTTTGGCAAG ACCGAGCGCA AAGTAGCCGG TGCGGTATGC 17040
TTGGTTCCAC AGCTGCTTTC GCTCATCGTC GGTAGACTGC AGCACGCGCC CGGCGCCGAC 17100
TGAGTACAAC AGATTCTGTA ATCCGCGCAG CCCGACGCGC TTCnAcCgcT TCGCTGATAT 17160
CCAAACAACG TTTGTTCATA CCGAAGCGCG CGGAGTTGAC GAGTTTGAGA AGGTCGGTTA 17220
CGAGCGCTAC GTCTTGGCTA ATGAGAGCTA CAATGTCTGA GAGTTGGACG TCAGGATTTT 17280
CAATGGCGCG TTGAATTTCC AGTAGTTTGC TTGGCAGCTG GGGTATGTCG TCGATGCGGT 17340
CGGCGATAGA CGCGGCAAGC TTTGTGGTTT GAGTTTGGAT CTCAATGTTG CGGGGAACTA 17400
CCATGCGCGA AATGGTTCGA TCCTCTTCCA CTACGAGGCG GTACACGTCT TCCTCAAGTC 17460
CGAGCTTTTT GAGCATGAGC ATCATAATTA CGAGCCCGAG GCCGGCCCCC TCGGAATTGT 17520
CCAGGATGTG CGCGAGTGCC TCTTCCAAAC CAGAATAACG GCGCGCACGA ACGCGCTTGT 17580
CAAAAACGCG CTTAAATTCC ATGGGAGTCA TTTTGCAGTT ATTGATCACC TCGATGACAA 17640
GCGCATGTGA GAGCACCTGC ATGGTCACCT TTACGTACAA CCCCTCCTGT TGTTGGAGCT 17700
TGAGGTAATG ATTGATGTTC TCAAGGCTTT CTTCCTTGAA GCTTTTCATA CCGCGCTTGT 17760
AGTCCTGTGG ATCAAAGATG TCCAGTCCTT TTTCGCGGAA GTAAATACGC TTGGTGTTGG 17820
CCTTTTTCGC GTTTGTGGTG AGTTCACTAA TGCAGTACGT TAAATAATCC TTCACGTGCG 17880
GCTGGCCAAT GAGATCTAGG ATTGTTTTTG CCACCTGTCC AATATAGATG TCCATATCAC 17940
GGGGGAGCGT ATAGGTGGTG ATGGCAATCG GGAGCTGCAG TTCAATTGCC TTACGAATTT 18000
TTTCAGTGTC AACAACTATT TGTGTATCAC ACATGGGCAG ACTGTACACG CTGTGTATGC 18060
TTTTGACAAC ACGCGCGAAA TGCACAGCTT TCCCTGTGCC AGAATGCAAA CGCGTGTGGG 18120
TGCGGTGTAT GTCTTTGTTG ACAACGGGGA GAAAAGCGTA CGAGAATCTG CCCCGTTAGG 18180
TTGAAGGGGG AGCAAACTGT ACGTGCTCAT GTAATCTTTG TTTTCCCGTA TCGATATGGG 18240
GTGCGATGCG CGCGGTGGAT ACGCTCCTTC TCGACTGAGG TCGCAAGATT TTGTAGGAtG 18300
CGTAGGGGCG TGCGTCGATT GGTGTGGTGC TATGTGTCGT GCTGGTTTGA TGTATCAGGA 18360
CTGGGGTGTG AGGGGAAAAA CGGACAGGAT GTGCTTTGCG CGGAGTCGGT TGTTTTCCCG 18420
TGGTGCGGTG GGCACGGTGT GCTGCACTGT GTTGTTTCTC GCGTGTCGTG TGCGCACTCC 18480
TTCTTCCGTG CCTCTGCGTT CTGGATCGGT CCGTGCCGCA GTACCCGAGG CCACATCCTT 18540
TCACTGGCGG CGCTATGCGG gTACGCGCcT GCGCGTGTGT TTTCCGTACC ATGCTTCTTA 18600
TCGCGCACTC AAAGCGATTG TTCCGGAATT CGAGTCACTG ACGGGTATCG CTGTTGAAAT 18660
TGACTGGCTT CAGTATGCGC GCATGCACGA TAAGCAGGTA CTCGAACTGA GTAAACCGCG 18720
TGGTGACTAT GACCTTATCG CTATGCTGTG TACGTGGAAG ACGGAGTACG CGTCGCGCGG 18780
GgCTACTCCG GTCACTCGAT TCGTTTTTTC AAAATCCTTC CCTGTGTATG CCGCATTACG 18840
ATTTTGAAGA TCTCATTCCT GTGTACGTAG AGACAATCGG GTATGTAGGT GGACGCAAAC 18900
CCTGGCTTGG GGGTCCGGGT GCGTTTTTGT GCGCCGTGCC CTTTGGAGCG GAGACGAGTA 18960
TACTTGCCTA CCGCAAAGAT ATCTTCCACA AGTACCGCAT CAAGGTACCA GAAAATTATG 19020
ATGAGCTTTT GGATGCCTGT AAGAGACTGC GTGAACGCGC GCAACTATAC GGTCTTGCAA 19080
GCCGCGGCGC TTCAGGACAG CAGGCtTGCA CGCATATTTA CTACACGCCg CACCCTTTGG 19140
TGCTAAGGTG TTGGACGATT CTTTAGTTCC TGCATTCCAT CGTTCGCGTT CTATCGCCAC 19200
ATTACGGTGG ATGCAAGAAA TGTTCGCGTA CGGTCCACCG GGTATGGCGA GTTTTGCCCA 19260
AAACGAAGCA TTGCAGGCCT TTTTGCAAGG GCAGACGGGG ATGTACTTGG ATACCAATAT 19320
GATTGGACCG TTAGTTCGTG ATCCGACACG CTCAGCCATA CGCCCGCACC ATGTGGGATT 19380
TGCgTTGCAC CCGATGGCGC AGGTGCGTGC AGGAGAAGTT GGCGGCTTCG GGCTTGCAAT 19440
TCCACACAAT AGTGctGCGC CCGAAGCAGC GTTCTTGCTT TTGCAGTGGA TTACGGCGCC 19500
GCAAACAGGA CGGCGAGTGG TGGAACAGGG TGCGCTACCA TTCCGCCAGT CGCAGCTTGC 19560
GGACCGCGCA TTGCGTGcAC GTTTTGCTGA GTTTGAGGTA TTAGAGCGTC AgcTTGcACA 19620
TTGCGATCCA GATTGGCGTC CTATCGTGCC TACCTGGGGG GAGTTGGGAA CCCTTTTGGG 19680
AATTGGGATA AATGAGGTGC TCACCGGTGT GAGTGAGCCA GAGGAAGCGA TGAGCGCATT 19740
AGTGCTGCCG GCACGTCGTA TTCTCCGTCG TCACGCGCAC GCACGTTATG TGCCGTAAAA 19800
AGTCTCTTCT TcTTGGGGGa AGAGAGGGCA GAAAAGATGC TGGATATGAG CGGAAACAGA 19860
ATGACGTGTT CTATCTGTGG AGCGCGTTAC GTTGGGTAGC GTAAGAGGGG GAGTGCCGTG 19920
GGGAAGAGTA TGAGCAAACC ACGTGATACC GTAAGCGCGT ATTGTTTTTA GCGCTCCAGC 19980
GCTTGTATTG CTGATGTTCG TGCTGGTGTT GCCGGCCTTT TTGGGGCTCC GTATGArTTT 20040
TTTTCAGTGG CAACTGaACG CGTTGCGCAC GCAACCTGTA TTTGTGGGTT TTGAAAATTA 20100
CCGGGAGCTT TTCTCCAGTA TTCACTTTTG GGCCAGTGTA AGAACGACGC TTGTCTTTAC 20160
GCTCTCAGTG GTGGTGCTTG AGGTTGTACT TGGACTTGCA CTCGCGCTGG TGTTGGAACA 20220
CGGAGTACCC GGGTTGCGTT TTTTTCGTAC AGTGTTTGTG TTGCCGATGA TGATCGCTCC 20280
CGTGGTGGTG GGAGTGCTCT GGGGGTTTTT GTATCACCCA CAGTTTGGTA AAATCAACTT 20340
ACGCTTGCAG GCTTTTGAGC TTGGACCGGT ATTGTGGCTT GCGAATCCGC GTCTTGCGCT 20400
TTTGTCTGTG ATACTCACTG ATGTGTGGCA ATGGACGCCT TTCGTATTCC TTGTGTTGCT 20460
TGCAGGTTTG CAGGGTATTC CGCAACATCT TTTGTATGCG GCGAAAGTGG ACGGAGCAAA 20520
TTATATGCAA ACACTCCTGC ATATAAAGAT TCCACATATT GCGCCTGTGC TTGGCATTGC 20580
GACCGTCCTC CGTTTGATAG ATTCTTTCCG TGGTTTGGTA GTGATTATGA CACTGACAAA 20640
TGGTGGTCCG GGAGTTGCAA CAGAAATCCT GnCCGCTTCA CTTGCAGCGT ATTGCCTTTG 20700
AGGATCACCG TCTGGGTAAA GCATCGGCAG TTGCTGTGCT TCTGTTTCTC CTGACAAGTC 20760
TTTTGACTTG TATTTTCATT CTCCTTACGA TGAGGAGACA GGCGCGGTGA GGGTTGCGTA 20820
CGGGTTAGAT AAGAGCAGGA GCAGATAGCA TGAACATGAT TTTTTTGAAG TGGCGTACCG 20880
CGTTGGTGTT GTGTCTGTTA AGCTGTATTG CGTTGGTGAG TATGTTCCCT CTCTATGAAA 20940
TGGTAGCTAC TTCTTTGAAG CGTGATGCGG ACGCATTTCG GTTGCCGCCA GCATGGTTTT 21000
TTATACCAAC AATTGAAAAC TATCGGCAAC TCTTGCAGGA ACATCATTTT GGACGTGCCC 21060
TGTATAACAG CTTGGTGGTG ACGTTGAGTT CCACGGTGGT GAGTGTCAGT GCAGGGGCTG 21120
CAGCAGCGTA TGCAATGCAG CGCTTTCGGT ACCGAGGTAA AAAGGCAATC ACGGTGGCGT 21180
TGTTGCTCTT GCGAGTGATT CCGCCGGTTG TGCTTGTAAT TCCTATCTTT GTGTGGTGGA 21240
CTGCGCTCGG GTTAGTGAAT TCTTTAGCAG GACTTGCGcT CGTGTATGGT GCGCTCAATG 21300
TTCCATTTAA TGTGTGGGTA ATCACTACCT TTGTTGCGGA AATTCCCCCT TCGCTGGATG 21360
AATCTGCAAA ATTGGATGGA TGTTCTCACT GGATGATTTT TACCCGCATT GTGATGCCAC 21420
TGATTACACC CGCACTTGCG TGGTGAGTAT TTTTACATTT CGTTTTGCAT GGAATGAGTA 21480
TATGCTTGGA TTTGCGCTGA CCAATCGGAA AACACGGACA CTGCCGGTGG CACTTTCACT 21540
TTTTCTCACG GATAGTGGTG TCGAATGGGG GCGGATTACC GCAGCAGCAA CGCgATTGCA 21600
ATTCCTGCAT GTGTTTTTAC CTTTGCGGCG GCGAAGTACT TGGTGGTGGG TTTGACCGCA 21660
GkGcGGTAAA GGGATAAACA CTCTGCgCGG GTGAGTACGT GCAGCAGATA TGTGcGGCGC 21720
ATCCCTGGGA gACTGCgTCG GGTCGTGTGC GCGTGTCAAT GCgTTGTATG TAGGGAGAGA 21780
TGGGGTGGGT GCaGAAAATG TATGGGGCTG TGGTAGGTCT GCGGTGAGAG AGAGTGCgCA 21840
CCGGGATGGC aTCTACATCG TTGGCGCAGG ATTCGCAGGG AGTGTCCTTG CCCGTGAGAT 21900
CCAAACGAAA AAAGTACTCG GCACAGTTAT TGCTTTTTTG GATGACGATC CGTGCAAAAT 21960
CGGATCGAAT CTTCACGGTG TCCCGGTGCT TGGTCCCATT TTTGAAGTTG CCCGGATTGT 22020
GCGTATTACT CCGCATGATC ACGCGCTGAT TGCAATTCCT TCTATCTCCA TTGAGCGTTT 22080
GCGTGACATT TACCTGnACT GCGCGCTGCG GGGTTTACGG TTATCAAACT TCTGCCGGCG 22140
CTTGCTCAAA TCATCGATGG TACTGCGCAT TTAGTGCAAA CACGTGAAAT T 22191
(2) INFORMATION FOR SEQ ID NO: 102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5420 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
ATCTTTTCTG CACAAAAGTT ATAAAAAGAT TTGATAAAGC TTTCGATTTC CTCCATCGGT 60
TTTGAGCnCG CACGCCAAAT GCGAGAAAAG TCCTTCATCC TTAAAAGnTT GCGCnTTCAC 120
ATTGGGGAAA AAACGCGGCC ATACGCGGCC GCACATCGGT AAAACGCCAG GGAAACTCTC 180
CACCAAACAC TTTGGCCAAC TGCGTGCGCA CCACCTGCTC ATCCCATACG AAAACTTCCT 240
GACCCTCCAG GAAGGCACCA AGACGCGTAT GAGATGGCAG GAGCATACGT GGGACCACCG 300
TTTCCGAAAG ACGGTCCACC ACCTCCATAT TACACATTTT TACAGCCTcA ATCGTATACG 360
CCCCaGTCAC CGGACGAACG TGTAGAAGCA CCACAGGCCG CTGATACCAG CCGCTGGTAC 420
TGAAACGGAG CGCCGTTACC ACCTTCCACG CAGTTGCACA GCCTGAGAGA AACTCGTGCG 480
GATACCCTGC ACCTGGCACT TTTGCATTAA TGATAATTGC ATCAGGGAGT GTCTCAGGGG 540
CCGCATGGTG ATCAATAACC AGTACGTCAA TACCTCTGTG ACGCGCATAC GCAATTTCTG 600
CGCGATTAGA AATACCGCAG TCAACGGTCA CAATGAGAGT ACCCCCTGCG GCCGCATGCT 660
CATCTACTGC ATTGCACGAG AGCCCATAGG GCTCATCGGC AACCGGCACG CGCCAGCACA 720
CCGTCAAGCC AAATGCACAG AGAGCCTCAA AAAGAAGGGT GGTGGCGCTA ATACCGTCTG 780
CATCACGATC ACCAAAGATG AGCACCTTTT CTTTCCGCTC ATGCGCAAGA CGCAAACGGT 840
CAACTGCAGC GCTCATCGCA TGGAAACGGA AcGGTcCTGG CAGATAGCGT AGATCACGCT 900
CAAAATGAAA AAGGAGTGCG TCAgcATnCC ACTACCTTGC GACGCACAAG GATGGCCGCC 960
TCAAGCGCCG AGCAGCTATA CCGCGCGGTA AGCTCGCGGA CACGCCCTTC GTCAAGTGCC 1020
TTTGCACGCC ACCGTTTCAT GCGTAsCCTT CCTCCGAGAG AGTACCACCC GACGCAAGAG 1080
GACCCGCCAG AACGGCCACA CGCACCACAC ACGCGCCCCG GTGCcGCGAC GCTTTCCGTC 1140
GGCGAAgcaT CtTCCTCTAA AGAGACTGCC CAATCTTTCT TACGGACCGG TATCCAGACT 1200
CTGCACACCA CGAGCAGGAT CGTCCTGCAC GTCTTCTGAC TCCTCCTCTG GAGCGTCACG 1260
CGACGGGTCG GATGCAGACG GCACAGGCAC ATGCGGCGCA TCAAAAAGCG GCGCTTCCTG 1320
CGGATGCACC GGCGCAAAAG GATTGCTCGG GACAAGCACC TCCCCCTGCA CGCCCGGTGA 1380
ACCCAGTGGC GGATCTTCAA AGACGCGCGG GTCTATATGC AGCCCTTCTT CGsTcACTGC 1440
CGGCCAGAAC CCAgTGCTGT ATCCAcTTTC CTTCAACCGT TCTTTAGCCA GTTCCTCAAG 1500
CATCTTCCCA CTTTCATGGT ACGAACAAAA GCGCTCAgGC AmCGTGcCAr GCAAAAATGA 1560
CAGAGAAACA AGCTCATCCC CACAGTACGG GGTAGGCAGC AATCCTGATT TTGCACACAC 1620
CGCCTGGTAC ATCAACCCTG TTTCAGGACG AACAAATGCC CGGTGCGGCT TATCCTGATG 1680
AATAGCCCGC ATAAAACGCG CCCACGGAGG ACCTGCAAGC GTCGCGCCCG TGCTATGCAA 1740
TCCGAGCGAT CGGTCTCCTT TGTCAAAGCC AAACCACAAC ACTGCAGTGT AATAAGGAGA 1800
GTATCCAACC GCCCACGCAT CAGACCAGTT TTGCGTAGTC CCCGTCTTCC CCGCAACCGG 1860
CATAACAAAC GATCGCcCCG TTGCAGGGTC TTGGTATGTA AATGCGCGCC CCCGCTCAGA 1920
GGCCACCGCC AACGTCCCCA TCGTTACCGT TTTCTCTAGC ATATTCGTCA TGAGCGCCGC 1980
GTTCTCCGCA GAGATCAGTT GCGTTGCCGC ACCcTGCGCG CGCAGgCGGG CCCGCACTTC 2040
CCGTTCTGGA TCCAAAATCA CCCGCCCTAA ACGATCCTCC ACTGAACGCA CTGCAATCGG 2100
TTCTACCGCT TTGCCCCCAT TTCCAAACGC TGCAAATGCA CGCGCAAGCT GAATCGGCCG 2160
GAGGGCAACT ACGCCCAACG CAAGCGGATA GACGCGTGGG AAGGTGCGCT CAATCTCCTG 2220
CCGATCGGTT ATATGCAGGA GCGTCGCCGC ACGCTGGATT ACCGCGTCGA AACCGACCAT 2280
ATCCAGTACA CGAATAGCAG GAATATTGAG CGACTGCGCA AGcGCCTTCC ATGCAAGCAC 2340
TACCCCCTGC CATTTTCCCC CATAGTTGTT GGGAATATAC GAAACACCAT TGCGGCTGaA 2400
mACCTGCGGT GCATcGTGcA AtGCGTTGCC ATCGTGAGCT TTTTGCTATC CAACGCCGCA 2460
GAATACACCA GAGGCTTAAA TACACTGCCT GGCTGCAACA ATCCTTGCGT TGCACGAATC 2520
ACTTGGTTAG AAGCACCGAA TCTGCTGCCC CCTACGAGAG CTGTAATGTA TCCGGTATCG 2580
TTCTCAAGCG CGATGAGCGC ACCTTCCACC CGCTTACGTG CAATTTCGTC GCGCACCAGA 2640
GAGGCGCCTT TGTCAGACAA CGTTTTAAGA TTGTCGAGGC CGAACATCAG CGCCATAACA 2700
TTCACCAGGG GACTGAGCGT ACTGCGGTAA TaCGCGCCGC TTTTTGCCTT CATGCGCCGG 2760
TCACCGACGT GCAACTGCGG AACGTTGAAC ACCAACCCCA ATAGCTCGCT GATATTGCTG 2820
TACAGTTCTC TGCGCGCAAC GTGTGTGAGA GAAGATTTTT GCACGTGTGC ATTGGCCTGC 2880
TCCAGCGTTT GCTCAACTTG CTGCTCTGCA ACCAACTGAT GACGCAAATC GCACGTGGTA 2940
TGCACGGTAT ACCCGTCCTG GTACAAATTC ATCGTGCCGT ACATCATGCG GTCCAGCTGC 3000
CTCCGCACAT ACTCGGAAAA CCAACGCGCC TTATCCGCAC GGGCATAGAA CGCAGAACTT 3060
GTGGTGCGAG TGTAATCGAA ATGCGCCCAG TAGTGCTCGT AGGACTCATC CCGTTCTTGT 3120
TCACTGAGAT AGCCAAGgCG CGTCATTTCA TGGAGTACGT AACGCTGACG GTCTTGAGCG 3180
CGGTTAGGAT ATTCAAAGGG ATTGTAGTGT GCCGGGTTAG AAAGCAAAAT AACCAAGAGC 3240
GCCGCCTCTG CTGCGCTCAT CTGACGTACC GAATGGCCAA AGTAGAAGCG GGCAGCCGCT 3300
CCTACGCCGT AGTGCCGCCA CCGAAGTAGA CGCGGTTCAA ATACAACTCC ATAATTTCGT 3360
TCTTGGAATA ACGCCGCTCC ATATGGAGTG CCCACCACAA CTCTTTGATC TTACGCCTGA 3420
GACTGCGGTC GCTGCGGTCT GAATAGAGAA GACCTGCTAT CTGCTGGGTC AGCGTACTCC 3480
CGCCCCCTAA GGCGCGACCG GTGAGGGTGC CGACAAGGGC ACGGAAAATA GCCTTGATGC 3540
TGTAGCCGTG GTGGGTATAG AAGGAGCGGT CTTCGCGGGT GAGTAGAGCG TGCACAAGGT 3600
GTGAAGACAA GTCAGCAAAG GAAACGATTT CGCGCTTTTC GTCTGAGGAA AACTCAGTGA 3660
TCAAATCACC CCGAATGTCC AGGATTCTGG TGGGAAGCGC CGGATTAAAG CGGGTGAACC 3720
GTTCGCTCTG CTTAATGTTT TCAATGGAGG CAAGCAAGAA CCCAAAGAGC GCAGcTCCCC 3780
CCACCAACAG ACCGCACAGC AACACCAGAT ACAGGTAACA AACGCGACGC ATGGGCCAAG 3840
AGCGGAACAC AAAAAGGCGA AAACTTCAAG GCAAGAGGCA GAAGACCGTC AACGGCGCAC 3900
ACCGCACGAT GCACTCAGGC AGACACACAA AAAGGCCGGC TTTCTCAAAC CGGCCCCACT 3960
ACGTACCGAC CGCGCGATAG GTGGGAGGGG AACCACGCAG CCGGCGCCCC CAGTATCGGT 4020
TCTACCTGCA AAACCTTGAG CCTGGTGCGA ACCCCACGCC CAACGGTAGT GCCACAGAGA 4080
GGAGGAGCGC GTCGACACCT CACCTGTCGG CGCCCCGACC GTCACGTATC AGCGGAACTT 4140
TTCCATGAAT CGCTCAGGAT TTTGGGGTAC CCCCGCTTCC CGTACCTCAA AATGCAAGTG 4200
TGGCCCGGTC GATGCGCCTG TCGATCCCAC ATTGCCGATC ACCGCTCCCA CACTCAGTTT 4260
TTGCTGTACA CGGACGCGCA CTGCACTCAA ATGTCCGTAT AAGCTGTGCC TTCCGTCTGT 4320
GTGCTGCAAA ATCACGTACT TGCCATACAG ACGATTGTAC GCAATCGTCG CCACCTGTCC 4380
GCTCGCACAC GCATACACCA GCGCGCCCAT GGGAGCGGCA AGATCTATAC CTGGATGATA 4440
ACTCAGCCgC CCGGTGAACG GGCTTTTGCG TGCGCCAAAC CCTGAGGTGA GTCTTCCGGA 4500
TGCCAACGGA AAACGATAAA ACGGCTTAAG AAAGAAGGCA CGCACCGTTC CGTCAAACAA 4560
GGCCTGAGGC GCGCACACTA CCTCCCGCTT TTCTCGCGTA TGcgTTGTCT GCGCAGTCCC 4620
CGGAAGGGAA AGAAAAAAAG ACGGCCCTTG ACCCTTTTTT ATCAGTGCGT AAATGAGCCG 4680
CTCCAAGGGA AGGTGCGGAT CCGCAGAAAC GTACAGTCCC GGAACGnTTG GCAAAAGCAA 4740
CGTACGCCCC TCAAGGGGGG TATGCAGCGT CTCAATACGG TTCAAACTAG CCACCGCGTC 4800
ATAGGGAATA CCGCAGCGCG CGGCGATACG GATGATGGTG TCTGCCTTTT TTACGCGATA 4860
TGCATAAAAA CGCAAGGGTA AATCGTTTCC CCGCTTGCCT TGCGCCAATG CCaCGCgCGC 4920
aGCAcGtCAT CGCTATACTG ACGAAAGAGC GCATCCTGCC CTTGCAGcTG CGCTATCAAC 4980
GGATAGGGAC CGCACACCGC AGGATGTGCC ACACACACAC ATAGGCAGGA GCACACgcTG 5040
CGCGCACCAT CCGCACGCGC AGGAGAAAGG AGAACGGAGA AAATGGGCAC AGACGCATCA 5100
CACGCAGCCG CTTAGCACTC CTGCGCGGGG CGAACTTCCG CGGAGGAAGA GAGCCCCATA 5160
GCACCGACTT CCCCTGCTTG CACTAAAACG CATGGCTCCT GCGCGGCGGC CCTACGcCCT 5220
GCACGTCGCA CCCGTGCATA AGAAAGArCG TTGCACACGC GTcTCAGTCC CCACAGGACC 5280
GCACCGCCGC CATAGAGCAA ACCGCTCAAC ACCGAGAAAA CACGCGGACG ATGCACCGCC 5340
AACGCCCAGA GAGGGTAGAC GAGCGCGCAG CCGAGCGCAA ACGAAAGTAT CAGCAAGCAT 5400
CGCAGCGAGT GCTCCAATGC 5420
(2) INFORMATION FOR SEQ ID NO: 103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6754 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
GGGATCAGAT TGAAAATGCA GAATATTTTA GCGCACACGG TGCTGCGTGC ATCCTTCGTG 60
CTCAAGATGA AAAGGGACAT CAACTGGTGT CTTTGCTCAC AGAACTGTTC CATCCTTCCT 120
GCGCTAGGAT AGAGGAGATG GCACGTGCGT CTTACACACT TGGAATTGGC AATGCCGCGT 180
ATGATATTGC GCAGCAGTTG CAGACTTTCA TAAAGGAGGG GATGTGACAA TCAGTACCCT 240
GGATCTCATT CTTGGAATCA TCATGGGGAT AGTGACCGTC CGTGCCACCA TGCGCGGGTT 300
TGTCGATGAG TTCTTTTCTA AGGCAAGTAT CCTGTGCGCA gCAGTTGTTG CAATACTGTG 360
TCATAAAAGG CTCGTGCCCT TGACACGTGT GTTGTTAGGC CACAGTATTC TGCTTCCGTG 420
TATAACGTTC TTGATTACCT TTATGGGCGT CTATTGCGTT ATGCTCTTCC TCCGTTCACG 480
TATGCGCACG TaTGctACGC GCGATCTTAT CAGTGGTTTT AATCAGGTGT TTGGCTTTTT 540
TTTCGGGATA ATTGAAGGGA GTGTACTACT CACTGTTATC CTTTTGCTTT TACACGTGCA 600
GCCTTTTGTA TCTGTTTCGC ATATGTTGCA TGAAAGCGTA ATTAACACTG TTCTCTCTCC 660
CCTTGTCTTA GATGGCGTTC GCTATATGCG CCTGAAGATG TAGGCGCGTT TTCGTGTTCG 720
AGAACATTGT CGGACAGCCA GCTACTGACC TTCTGCGCGA TGATGTTATG CACGCGCGTC 780
TTCCTGCTGC GCTCTTATTT GTTGGACCAC CCGCTAGTGG CAAGCTAACT GCTGCACTTG 840
AGcTTGCGCG CGTGCTTTCC TGTACGCAGG GGGGCGTGTG GCAGTGTCCT TGTGCACCCT 900
GTGTGCAGCA TACGGAGTTA CTTTCTCCAG AACTACTCGT GATGGGGATA AAAGATCTGA 960
TCCTCGAAAT ACGCGCGTCG GCACGTGCCT ATATGTCAGT GCACACGCAG GGTACGCGCT 1020
ATCTTTTTGT GCGGGCAGTG CGTAAACTCA TTACACGCTT TGACGAACGA CTGTGGGATA 1080
GTGATGATAC TCGTTTCTCT GCCGCGGTTT CGAGCATTGC TGAGCTCGAC CAGGAGCTCG 1140
CATCACTCCC CGcACAGGGA GACCGACGTA CCACTCCAGA GCAAAAAGAA AGAGTGCAAA 1200
GGATATGCGT GATCGCCGAA AGGCTAcAGC AGGAGTCACT CTATACCCAA TTACCCGTAC 1260
AGCAAATACG AAACGCAATT CAGTGGGTGC GTCTTACACC CTCAGGAAGA AAGAAAGTGC 1320
TGATCATAGA ACACGCGCAC GCGATGCATG AATCTGCACG AACTGCTTTT CTAAAGATAT 1380
TAGAAGAACC TCCGCGCGAT ACGCTTTTTA TCCTCACTAC CGCTACAAAG TATGCAATCA 1440
TGCCTACTGT CCTTTCGCGT GTGCGCAGTT ATTCTTTCAG AGAAAGAAGT GTTGAACTCC 1500
AATGTGCAGT AATTACACGC GTATTTCATG ACAGACCGAC CGATGCAAAA AACACACAGG 1560
GTGTACTCTT GCACTATTTA TATCAGTTTT TGTCGTTTTC TTTAGAAGAG GTGCAAACGT 1620
CCGTGATGTA CTTTTGGCTG TATGTGTGCC AGCACGCTAG GTGTATCGGA CGCATCATTC 1680
CTGATACTTG GGTTTCGGTA GGTACTCAAC CTCAAGTATC AGGAATsGAT CTTACCTCAC 1740
TTGATTCTAC GCTGCATTTT TTCCAAGCTC ACAAACAGCA ACACGCCGTG TCCCTTTTTT 1800
TCTCCTTACT CGTCAGGCAC ATGCGCACCC TGCAGCGCAC CACCGAGTAC TCCGCACGGA 1860
ATACAGAATG tTCGCACACA TTGCTCACTG TATCGAACAG GCACATCGCA ATGTGCAGTT 1920
GTGGAATCTA ACAATACAAG GGACTCTCGA ACACTTAGCA CACACCATCG CGAATCATCT 1980
ATGAGAGATT TTATTGCACG CGCGCTAAAA AAGTCAGCGA AAATGAATGA CTCTCAACTG 2040
AGAAATATGA TTGAACTTAT TGCCAACGAG TACACCTTGT TGGATGCACT TATGGATTCT 2100
CTGAATTAcG GACTTATCGT GTTGGACTGT TTACACATTC CATTAAAGAC AAACCGAGCA 2160
ATTGCACGGC TCTTGGGTAA ACCACTGCCT TCAAATCCTC GCAGGCCACT GTGGCATTAT 2220
CTTGATGACG AACACATTGC GCAGTTCATT GTGGCAATTA TTAAAAATGA GGTAGGAAAA 2280
GCACGCGCAG AATTCATTGT ACAAAGACAA GGTGAAACAT TGTATCTGGA AGTATCCTTA 2340
TTTCCGCTAA TTTGTGACCA AAAGATCCGC GGAAGTATTA TCGCAATACA TGATATCACA 2400
GAGAAAAAAC AAGAAGAAAT CTATAACCGA AGCTAGAAAG TCTTGCAAAT TTAACGAATC 2460
TTGCAGCAAC CGTTGCGCAC GAAATCAAGA ATCCCCTAGG AGCAATGAGC ATTCATCTGC 2520
AATTACTACG TAAGAATTTT AGTACCTGTA GTTTCGAAAC AAATAAAAGA ATCCAAAAAC 2580
ACCTCCATGT GGTAGAGGAG GAAATCGAAC GGCTCAATAG AATTGTCACC GGCTTCCTTT 2640
CTGCAGTTCG TCCCTTAAAA CTAAATATCA CACGGCTGAG CGTTTTTGAT CTTGTTACAT 2700
CCATACGAGA CACATTTATG AAGCCTTCAC CAAAGCAGAA CTGTCTTTCT CTGTACATAT 2760
GCCACACAAT CTTCCCCACA TACGAGGCGA TGAACACCTG CTAAGACAGG CATTGGTAAA 2820
CATTATCACT AATGCTAAAG AAGCCATGCA AAGAGGAGGG GCCCTTGAAG TCTTTGTCCA 2880
TAAACAAACT GACCACATCA GTATCAGTAT TTCGGATACA GGAGAGGGAA TTGATGCCCG 2940
AAATATTCAC AATATTTTTG AGCCGTACTT CACTACTAAA ACTGAAGGTA CGGGGTTAGG 3000
GTTAACCTTA ACGTTTAAGG TGATTAAAGA ACATGGCGGT GACATCAGTG TGTCCTCTAC 3060
TGTTGGACGG GGTACGTGTT TTACTCTCCT TTTACCCATA GATAAATTGG GACGATCGCT 3120
TTTACAAGAA AAAATATCCA CCCACCTAAG ACATACGAGT AAAGAATAAG GAAATGCGAT 3180
GAAATTCAGT ATTCTCGTAC TAGATGATGA AAAAAATATC CGTGAAGGTT TGCAAATGGC 3240
CCTCGAAGAT GAAGGATATG AGGTGTTTAC CGCAGAGGAT GGAAATACAG GGGTAGAGAT 3300
TGCCCTCAAA GGGGATATCG ATCtTATTAT CACTGATTTA AAAAtGCCtC GTATGAGCGG 3360
GGaATTGGTG CTGCaACATG TGCACGCGGT GTTGCCCGAT ATTCCTAtCA TTATTCTCAC 3420
CGGGCATGGC ACAGTAGAAA ATGCAGTTGA AGCAATGCAC AAGGGAGCTT ACGATTTTTT 3480
AACTAAACCA TTGGATCTTA ACCGATTGTC TTTGCTTGTG CGCCGGGCGC TACAAAACCG 3540
AGAGTTGATC GTTCAACATC GAGAGTTAAT CAAACAAATA GGAAATCGCA CCTCATTCGA 3600
GAACATTGTA GGAGAAAGTC CTGCAATGAA CAAAGTGTTT GACATGGTAA AAAAGGCAGC 3660
CGCCTCAAAA GCGTCCGTGC TCATTACTGG AGAAAGCGGG GTCGGTAAAG AACTTATCGC 3720
GAATGCAATC CATAATCTTT CGCCGAGGAA GGCAAAACCT TTAATTAAAG TACACTGCGC 3780
TTCTTTTGCA GAAGGAGTGT TGGAAAGTGA GTTATTCGGT CATGAAAGGG GTGCCTTTAC 3840
CGGTGCGGTC AATCGCATGA AAGGTCGTTT TGAACTTGCG CACGAAGGAT CAATGTTTCT 3900
TGATGAAATC GGAGAAGTAA GTATGGCTGT GCAAATAAAA CTACTCCGTG TGTTACAAGA 3960
ACGTTCATTT GAACGTGTAG GTGGAAGAGA AACAATAAAA GTTGATGTAC GCGTAATTTC 4020
TGCAACAAAT CGTAATCTTT TAGAAGAAAT TAAACGCAAT TTGTTTCGAG AGGATCTTTA 4080
TTACCGATTA AATGTTGTGC ACATTCACGT tCCTGnCTGC GCGAGCGcAA GGAGGATTTG 4140
CCATTACTGA TTGCAACATT TCTTAAAGAG ATTGCAGAAG AAAACGGTAA AAAAATTACC 4200
TCTATAGATC CTCAGGCCCA GTCTGCACTG CACGCGTATG ATTGGCCTGG TAATATTCGT 4260
CAGCTGAGAA ACTGCATTGA AAGCGCTGTC ATTATGAGCT CAGGTCCTGT TATCCACATA 4320
GAGGATCTCT CAGAGCCAAT TCGATCTCTC GGTGAAACCT CTTCCATACG CaTTCCTATA 4380
GGAGTGAGCa TGGAGGATGC aGAAAaGGAA aTCATCCTCC AGACACTGGA AGCACAAAAA 4440
GGTAATAAGA GCAAAACCGC AGACGTGCTT GGCATTGGGA GAAAGACGCT CTATCTAAAA 4500
TTAGATCAAT ACACGAATAC AAGCTTTGAA CCTGATGCCG CAGCAAAATC ATGaAACGTG 4560
cTTTGATAAT CACCGGAGGT GAATATGCAC CCTATGAGTT TGTGCAATAT TACCTGCCTG 4620
CGTACGATCT GCTCATTGCC GCTGATTCAG GGCTTGATAC CGCATTGCAA TTTGGTCTTG 4680
TGCCCGATTT TGTTATTGGA GATATGGATA GCGTTAAGGA CAACCTGTTC ATACAGGCGT 4740
GTGATAAAAC GCGCACACAC CTTTTCCCCC GAGATAAAGA TTTTACTGAT ACTGAGCTTG 4800
CAGTCACCCT TGcGCACCAA TTGGGAAGCG ACGATTTGAG CATCGTCGGA GGGGGTGGGG 4860
GAAGGGCAGA TCACTTTTTA TATTTCATGC GTCTTTTTGC CGCACCTCTG TCACCGCGTC 4920
TGTGGCTGTA CAGACATGGA CTGGGATATT GCTTTGGGGA AGGATGTGTT ACACAACAGT 4980
TATGTATTGG AGGAGTGGAT AATACTTCTT TTTCTTTCTT TCCCGTTGGA GATGCTACAG 5040
ACTATTCGCT CTCCTCTGAA GGATTGCATT GGCCCCTCGA TGGGgTGCCg TGGCACACTC 5100
ATGTAAGTAT GAGTAATCGC AGCAGCGCAC CTGTCGTGCG CGTCGAAGCA CACCGGGGGA 5160
GATTTTTGCT TTTCCTTTCT CCCCTCGGAC GTTACACCAT TGATCATCAC GAGCGGGGTA 5220
TTGCGTGCAC GCACAGAACG TAGATATTGC GCCGGGCAGT ACCTCGACCG TTTCCATCAT 5280
AGTGGGTATT GACCCAGGAC TTGAATCTAC CGGATACGGC GTTATAGAAG CAGGGGGAGG 5340
CAGTCTGCGC TGTCTTACTA CGGGGTGATT GTTACCCAAA GCAATCAGCC ATCTGCTGCA 5400
CGACTCAGAC ACATCTTCGA TACCCTGCAA CAGGTAATCT CAATATATCA ACCTCAGTAT 5460
TGCGCAGTGG AGACAATCTA TTTCGCAAAG AATGTAACCA GTGCGTTGTG TGTTGCGCAA 5520
GCGCGTGGGG TTGTATTACT TGCTATGGCA CAACAGCACA TTTCAGTAGC TGAATACGCA 5580
CCGAATGCGA TTAAAAAAGC AATAACTGGT ATTGCCCAAG CAGAAAAAAG ACAGGTACAG 5640
CATTTGGTAA AAATTTTACT CAATCTTAAG GATATACCTC ATCCTGATCA CGCTGCTGAT 5700
GCCCTAGCGG TTGCTGTTAC CCATGTACAC TGTTGTATGT CTTCAAACTA TGCGGTAGGT 5760
TCAACGCGCT CTAGGGGAGC GTACGTTACG CTGTACAAAA AAGGTAAGAG ATGAAAAGCA 5820
AGAGTTCTTT GTTGAAAAGT GGGTTGCTGC TTTCTCTTTT AACACTTGTC TCTCGTGTAT 5880
TGGGTTTAGC GCGAGAAGTA GTGAAGTCTA CGCTTATGGG GACCAGTGCG ACAGCAGATG 5940
CATTTACCGT TGCATTTATG ATCCCAAACC TTTTCCGCCG ACTGTTTGCA GAAAACGCCA 6000
TAAGTGTTGC CTTCATTCCC GTCTTCACAC AGCACTACTC AATGCCGAGT TCAGCGCAAG 6060
TGCCATGTTC TTCTAAAACG AAGGAGTTTC TTTCAGCTAT CTTCACACTG ATGAGTAGTG 6120
TCACTGCAAG CATTTCTCTT ATCGGTATAC TCGGTGCTCC GTACATCGTG CGATTATTTG 6180
ACACTGATCA GTCATTAACC GTTTCATTAA CCCGCTTGAT GTTTCCCTAT TTATGGATGA 6240
TCTCTCTCGC AGCTTTCTTT CAAGGTATGC TGCACAGTAT TAAGGTATTT GTCCCCTCAG 6300
GATGTACCCC AATATTTTTT AATGTCAGTG TCATTTTTTC GATGTACTTT CTGAATGTGT 6360
CACATATGAA CGTGGCTATT GCTGCAGCAA TAGGTGTTCT TATAGGAGGA TGTGCGCAAG 6420
CACTCTTCCA GCTAATATTT GTATATATGC ATGGGTTTCG TTTTACGCTC CAGTCTCCTT 6480
TAAAAGCAAT GCACGATGAA GGTGTGCGAC GAATCATTGC GTTACTTCTA CCGACAACTG 6540
TTGGCATTGC AACCTATCTT CTAAATGACC TGGTGTGTAC TGCGCTTGCA ACCTCTGTTG 6600
AGATAGGAGT TGCTGCGAGT GTGCAATATT CAtTCGTATA CAAGAACTTT TATTAGGrAT 6660
ATTtATCGkT TCTCyAAGCT CyGtGGkACT TCCTGAtCyT TCyTTCCaTG tTATGAGAAA 6720
AGATTGGCAA TCGTTTGAGG ACCTCCTGAT AACA 6754
(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9410 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
GCCAACGCGT TGGCGGCCCT ATGTGCATTT CCGTTCTGCT TGCGTGGTAG TGGATTGTGG 60
AACAGCGCTC ACCTTTACGG CGGTGGATGG CACGGGGTTG ATTtCAAGGG GTGGCAATTG 120
CGCCTGGTCt GCgcACTGCG GTGCAGTCTC TCCATACAGG AACGGCACAA TTACCACTTG 180
TTCCTCTTGC CCTGCCtGAT TCCGTTCTGG GCAAGGATAC TACGCATGCG GTGcAGGCGG 240
GTGTGGTGCG GGGCACGCTC TTTGTTATTC GCGCTATGAT TGCACAGTGT CAGAAAGAGT 300
TAGGGTGmCG CTGTGCAGCG GTGATAACGG GGGGGCTTTC GCGTCTTTTC TCGTCAGAGG 360
TGGACTTTCC TCCTmTCGAT GCACAGtGAC GCTCTCAGGT CTTGCACATA TTGCGCGGCT 420
GGTGCCGACA TCTCTCCTGC CACCTGCTAC AGTGyCaGGT TCATCGGGGA ATTGAGGAAA 480
CTGTTATCCG CGCTCCCCAT CTTCCGATAC TGGATCGGTG TCGGGGGGAG TAGGAGTGGG 540
GAAGCGTCTG TGCTGTATCG CGCTGGTGAT GCGCGCGTTC TGGTACCTCa kTsCGAAGGG 600
AgTCAGTaTc GCTTACGTGC CCgTTCaTcG CAGTGGGGGC TCTCAAGATT cGAGCATGAG 660
CACAGCAGTG GGCGATACGC TCyTTAACGC CTTyTTCGAC GAGGrAATGG TGGTTACGgC 720
AGTACCGCCG GGTGTACACG ACGGCCAGAC TATAGCAGAA ATTGCTGCAT GTTTTGAAGT 780
AATGCCCGAT TACGCGTTGT TGGTGCAGTT TCATTCCGCT CGTCTCCCTG GTGGGGAAAG 840
CCCTACCTCC CGTGCCCGCG GCGCTTGGTC TTCAGAGAGG .-..-.CCGTGCTG TGTGGACATT 900
AGTGGATTTG CATACGCAGC GCGCGTGTGT CTATGCGTGT GTCGCCCCAT ACAGGGAGAG 960
TATTCCCGTT TCTGAGTGTG TTGACGTCGT TACCCGTTGT ATTGCGGAgC AGGcAATTTC 1020
GTACATACGG GTGGGCACGA GCACCGATAC AGCCGGAGTT CAGTTATAGA AAATAGGGAA 1080
TACGTAAGGT GTCTGCAGCG TCGCTTCAGC TGGGAGGAGT CTTATGATTA AACGCCACAT 1140
GTTCGCAAAA AGGGGTGTCA AAGGAAGATC TTACCTGGTT AGGGTGAAna mTGCGTTCTT 1200
AGTGCTTTGT GTTGCTTCTG TCACGCCGCT TTGGGCTGTG TGGGAAGGGA ATGCAGAAAT 1260
TGGCCCCCAG GGAAGTTTTC TGCAGGACGG CATGTTTGTG CGCAGTGACA TGTTCCCCAA 1320
AAACACTGCT GTTGAAATTA GCAACTTAGA AAAGAATGCC AAGGCTCAGG CAGTGGTTAT 1380
TGGGCACGCA GGGATCCCCG GTCTTCTAGT TAGCCTTGCA CCCGCTGCTG CAGCACAGCT 1440
TGGGATTGGC GTATACCAAG CTGTGCGTGT ACGCGTACGT ACCTTGGGTA CCGTGCGCGG 1500
TGGGTCTCAA ACAAGTCAGG ACGGACTGTC CCTTGCATCT TTGCCGTCCC GTGTGCCTGC 1560
GCGCCCCGCC AGcgTGATCC TCTGTCATCC CCGCCGGCAG GTCACACTGT ACCGGAATAT 1620
CGCGATACGG TTATTTTCGA TGACCCGCGT TTGGTTTCCC CTTTGTCTCG TGAGGTGGAG 1680
GACGCGCCGA AGtAGTGGAG CCGGCCTCTG AGCGTGAGGG AGGGGAGCGT GAGGTGGAGG 1740
ACGCGCCGAA GtAGTGGAGC CGGCCTCTGA GCGTGAGGGA GGGGAGCGTG AGGTGGAGGA 1800
CtGCCGAAGg TtAGTGGAGC CGGCCTCTGA GCGTGAGGGA GGGGAGCGTG AAGgTGGAGG 1860
ACTGCCGAAG GTAGTGGAGC CGGCCTCTGc AGCcGTGAGG GcAGGGGAGC cGTGAGGTGG 1920
AGGACGTgCC GGGGGTAGTG GAGCCGGCCT CTGGGCATGA AGGAGGGGAG CgTGAGGTGG 1980
AGGACGTGCC GGGGGTAGTG GAGCCGGCCT CTGGGCATGA AGGAGGGGAG CGTGAGGTCG 2040
CTTCTCAGCA TACGAAGCAG CCATCCCACT CGGTTTCCAA CTCAGCTCCC AATCAGTTTC 2100
GGAACCCTGA GGGGGAACTC CCCTTTACGC TCCCTGACCT ATCCGAGTCA GAAATTGTGG 2160
TTCCGGAGGA ACAGAAAGGA CGTGCGCATC CCCAGGTGAT ACCCGAGGGT GCGCCACGTG 2220
GACTGCAACC TGGTGAATAc TACGTACAGA TTGCAGTCTT TCATGACGCT ATCCAGGTGC 2280
AGAGCATTGT CCACCGTTAC GGGGTAGAAT ACCCCATCGC AGTGGAGCAG GACATCCATG 2340
AAGGTAAGGT GCGTTTCACC GTATGCGTCG GTCCTGTCCA AAAAGACGAA CGCGGCGCGt 2400
ACTAGAGAAC TTCCAAAGGT TTGGATTCAA GGACGCCTTT CTGAAAAAGG CGCGATGATC 2460
AGGTCGGCCC TCCTCTTCCC CTCGTGACCG TGGTGACTCG CCCCGAAgGc GnCAcAGAGC 2520
CCGAAGaACG GAAGGGAAGG GGCAGACTTA ACTATTTCTT TGTTTTTTTG AGCACGTAAA 2580
ACGGCGCCAT CTCCTTTGAA GGCTTTCCTG CGCCGGGAGC GCCCATGTAG CGAACGGAgT 2640
TACTGTCTAT CAGCTCGTAC AGCTCTTTCT CGTGCGGTGC CTTCGATTGC TCCGAGGACA 2700
CAAGCGAGAG TTCGACAATT CCGTCTTCAC GTACCATCCA CGTACCGCGA TACGTAAGAG 2760
GAGAAGGTGC CGACTTCTTC TCAAGGGCAA GCTCTACCTT TTGCGCAtGC CATCCGCGTT 2820
GAACGTCACA GTCGTATCGA TTCCCGGGCA ATCGGCCGCA GTAGCGTACC CCGAAAGATA 2880
CCTCCCTTCA ACGCGCACTC TAsCTTTTCC GCTTtGGCCT TCCCGGCGTG CGGACACAgG 2940
TTGTGCACGA GACACACAAA GCGCTAmCGA GCGCTCCAAC ACCAAGGAAC GCGCACAGgs 3000
CGGsACAGAT CtTTCATCAC AGAAAACCCC CTTGTCACGT CTGTAAGnTC AGGGGAGAAA 3060
AGCCCAACGA TGCAAAAGTT ACGCTCCCTC TTGCCAAAAG TGAGAGAAGA GCGCAGGCCA 3120
CgCGCACGgr GGCAAACGTG GGGTTTAGGA GCGCACCTGC GCCCCGGCCG CGGGTGCACG 3180
CATCTGAGCC TCAAAGGCAG CAACAAAGCG GGGTAACAGT TCATCAACGG TGCCCTCGAT 3240
ACGAAGGCCA GCAGCGCAGC GATGACCTCC ACCCCCGAAG CGAGCGGCGA TAACGCTTAC 3300
GTCAATCGAT CCTCGCGATC GAAACCCCAC CGAACAATGT GTGGGAGATT CCTGACGTAC 3360
CACCACAATT GCCTCGACGC CCTGGATGCT CTGGATGAGC TGATACAGAG CATCGGAGTC 3420
TCGCACATCA AGCCCCAACT GGACCGCGTC TTCGCAAGTT TCGTACGAGG TCATGAGCGC 3480
GCCACCATAA TACGGCGTCA GTCGAGAAAG CACGCGCGCA ATCAACATGC GCGAAGCAAG 3540
GGATCTCCCA CCGTTCATGG CAAGAAACGT GTCTTCGGAT TGGCGCCTGC ACGTACAAGA 3600
CGCGCAGCAG AAGCGAAcGT GTCCGCACTG TGCTCGTCGA GGTGGCGAAA AAAACcTGTG 3660
TCCGTAGCCA ATCCCAAAAA AAGTGCGCGC GCCTCGGCTG CTTCAAGAGA TCCGGCCATC 3720
GTCTCGATTA ACGTTTGCAC CAATGTAGTG GTGGACGGAG CTGTTTTTAC GACGAACGAG 3780
TGCGCGCAGT GGTCGCCGCA CGTTTCGTGA TGGTCTATGA ACGCGCGCGC AAAGGGGGCA 3840
AGCTGCGAGG CGAGCTCAGC GCCAACGCGG CTGAGCTCAG AGCAGTCGAC CACGATGAcG 3900
gCCGTCTGAT CAGACGGCcG TnAtCTGCGC AGAGAGACTT GGACGGAAGA GTGTCGCGTA 3960
CGCAGCGATC TCTCTACGCT TGAAAGGACC CGCAGACAAA AGCTCAACCT CTTTCCCTAT 4020
GCGTCTTAGG AACGAGGCAA GCGCAAGACT GGAACCTACA CAGTCCCCAT CCGGCTTCTC 4080
ATGCCCCACG ACCGCAAACG CGCGATGCGT GcaATGAACT CGATGAGCCC GGAAAGACCC 4140
CTCCCCCGAT CCCCTGTACG CGCCGCGCAC GGAGAGGGGG AGGAAACGGA CTTAGGAAGA 4200
GAACTCGGGG GACTGGAACT GAACATCAAA CTTATCGTCG TACTCAGAAG GACTTTTCAG 4260
CGTTCCCCAC ACCGGGCTGC TCAGGCTGCG AGGGAGTGTG AGCTTTACGT AGGTAAGTGT 4320
GTTCCCTTCC ATAAAAAACG AAAGGTAGGG ATCGACGCGC ACTGGAGcGT GCCGCAAGAG 4380
CGCCTTACCC ACCCTGGGCG AGAACCAGGC GGACGGCAGA TAGATCTCGG CCATTTCGAA 4440
GTTAGCACGC CGGTAAATAA CCCGATAGCC CAAGCGGTGA GAGAAGACTT TCACCACAGG 4500
AAGATTGACG TAgTACATAC GAGAAGAGTT GCTCTCTGTG AGCCCTGGGG GAGGGGGCTC 4560
TAACACACGC CCAGCAGTAT CAGATGCATC CTGCGCACTC AGAGCGACAG CAGAAGCGCA 4620
CAGCACAAGG CGTAGACAAC GAGTTGAAAA TTTCATAGCA AAACCTCCGC GGCGGCAGGA 4680
TTCTAGCCTA GGTGTGTCCC TTAATACAAG CGTCTGTATC ACGCTCCTTG CAGGTAACTG 4740
CGCCGTGAAA AATGCGCGCG GTTGATTTCT TGCGTAGAGC TACAGTATGC TGGTACATGC 4800
GCATTTATCT TGCGTCAAAC AACGCGCACA AGCACGCGGA GTTCTCCTCA CTTTTCCCCA 4860
TGCACACGAT TCTCCTGCCG AAAGACGAGG GTATCGATTT TTTCTCGCCT GAGGACGGGT 4920
CTACTTTCTT TGCTAATGCA AGGCAGAAGG CTGACGCCCT CTATGACGTG GTACATGCGC 4980
CTGTGCTCGC CGATGACTCA GGTCTCTGTG TGGATGCTTT AGACGGGGAC CCAGGGgTGC 5040
ATTCGGCGCG TTTTGGTGCA CAGCATGGGG TACACACAGA CAtGCGCGCA TGCAGCTCCT 5100
TCTGGAACGT ATGCACGGAC GGCArGACCG TGCCTGTTCC TTTGTGTGTG TGGCGGTACT 5160
GAAGTTGGGA TCGGTGCCGT TGTGCGTTGG GCGGGGGGTG TGCCgGGGAG TGTTGACTAC 5220
AGAAATGTCT GGGGTAGAAG GTTTTGGCTA TGACCCGATT TTCCTGTTGC CACACCTGGG 5280
CAGGACGTTC GCTCAGCTCA GCATTGAGGA GAAGAACCGC GTCTCTCACC GGGCACTTGC 5340
GGCGctGCGC CTCGCACAGG TGTTGGCCAT GATGCAGCTA CCCCGTkgsT GCGCTACGAG 5400
TTAAAGCTTT TGCGTGGTGC TCGTCGTATG AaCGCGCGGC GGCGTGCTGC GGCCTGGCGC 5460
CCCCTGTGCG CAACGTAAGG GACAGACCGC GCAGACTGCC CGAAGACACA AATTTTATGC 5520
ACGCGCTCGG AGGTGTGCCC GTCGCATACA TCGTGCCTAG TGCTCTGCAA GGTGCGTGAg 5580
CGAACGGAGT GCaGGACACG CGCTTGACTG CAGCTGAGAG GAGGGATTGT ACGAAGACGA 5640
TGTTTTTTGT ACCATGGCCC CACGCTGCAC CATCGGAGGG TCCCCATGGC GGTAAACGAC 5700
GAACAGTTTC AACTCGTTAC CTTCCAGCTC GGGGAGGAGC TTTATGGCAT CGACATTATG 5760
GGTGTCAAGG AGATTGTGAA GGTTCAGGAC GTTCGTCCTA TTCCCTGTGC GCCTGCTTAC 5820
GTGGAGGGCA TTTTTAACCT GCGCAgcGAG ATTATCCCTA TTATTAACCT GCACAAGCGC 5880
TTTCACCTAC GCGAGGCTAC GCTCGAGTCG GGCGACGAGT ATCTCGGCGG CTTTGTCATT 5940
CTCAATGTGG AGGACAGTAA GCTCGGCATT ATCATCGACC GCATCGCGCG TGTTATCGCT 6000
GTCTCGCAGG AGGACGTGCA GTCCCCTCCC CAGGTTATCA CCGGcATCGG GGCGGAGTAC 6060
ATTCATGGGG TCGTGCGCCA GGGGACGAGT TATCTTATTG TTCTGGATAT CCACAAGCTG 6120
TTTAGCTCCA AAGAGTTGCA AAAGCTCGCG AACCTCTAGT GCCCCACCGC TGTCTGTCTC 6180
CTGCCTGCAG CTCCAAGCGC CGTGCGGGGT CCATCCCCGT CCACAGCCCC GCCATCCGAA 6240
GgTCCGAAAT GGACGCGGTG CTCACCTGCT TGGTGGATGA AAAAATTGGC CCTGGTTCGC 6300
TTGGCAGCAC CCTCATCCAG TTGGTGCGCG AGGTGTTTTC TCCAATCGAT GCATAcGTgc 6360
TGCGCAGCCC ssTATCGCAC TTTCCTTTgC ACTCCGTGCA CTGAAATTGC CTCCTGCTTC 6420
CCCTGTACTT CTTTCTGCGC TTGCGCCCTT CTGGCACTAC CGTGAGGTGC TTCACCAGGG 6480
GCTGCAGCCG CTTGTCCTTG ACGTAGACAT TCACAGCGGT TTGTTGTCCC GTGATGTGGT 6540
GGAAACTGGC AtCGCGCGTG GCGCTCGTGC GCTTCTTGTG CCTGAAACAC TTGGAAATGT 6600
GCCTCCTGCA GCGGTGTTTT TGGAACTGGG GATACCCGTC ATCGAAGACA GCTCTCAGAG 6660
TGTCGGTGCA GTATtGGGAG AAAAGAAGGT GGGAACcTTT GGCTCGTGTG TCATCGTGGG 6720
ATTGGAGGCA CACGATATGC TTACCGCAGG CGGCGGCGCG GTACTCATGG CCTTTGAGGC 6780
CGCCTGCGCG CGTCGGCTTC AGGCGCTTGT GCCAGAAGCG CTTGCCGTTG ATATGCTGCC 6840
GGATATGAAC GCGGCGCTCG CGTGTGTCCA AGTAAAGCAG CAAGAAAAAA ATATTGCCCT 6900
CAGGCGCGCA ATCTACGACC GATACTCCTC TGCGCTTTTG CGTACGCGTC ACGGTACGCT 6960
TCACCGGTGT GAGCAATTGG AACACAGTGC CTACGCTTTT CCTGTTGTCC TTGCTTCTGA 7020
TCTGAAGGAA GTGACGCGTT ACGTGCGGCA GGCGTCCATT GAGATTTCTC CTGCCTTTGA 7080
ACATTCCATT GTGGCAGCGT TTCAATTACC TGCTATGCGC AGACGGTGGC CTTTTCCGCA 7140
GTTTCTTCCT ACTTCTGCAT CGCACACGGC ACCTTTTCAG GGTGAGGACA GGGAGGTACT 7200
AGAGACCACG CAGGGCGCGG AAAAAACCTG TCAGGACTCT AGCTGGGAAA GGGAAGTGCG 7260
TGCGTCTGAG ATTACGCCTG AGATGTGTTG GCCACATGCA TCTGCGCTTT TGTTGCGCTG 7320
CGTGCGCTTT CCGTTGTACC CGCGTCTTGC GCCTGCACAC GCACAGGAAA TTGCGCGCAT 7380
CCTTGGGACA CTGCCGTGAG CAGCCGCGTG TGTCCTCAGC GGCCTGTTGC AAAATCATCG 7440
GGGGACGCGA AgGTGTTGCT TATTGTCAGC ACGTACAAAC CGCGCGCTGC GtGCTCGCTG 7500
CGGACGTTGT GAACTTTCTG AGCATACGTG GATTCCAGTG CCACACCATT GAGTATGATG 7560
GATTGAATAA AGAAAGCTGT GCTCGCGCAG GCTATATGTT TGCAGTCAGT ATTGGGGGGG 7620
ATGGTACTAC ACTGTTTGCC GCGCGCTGTG CTTCTCCTTC TGGTATTCCC ATACTTGCCA 7680
TAAATTTAGG GCGTTTCGGC TTTATCGCTC CTATTGAGCC ACGGTATTGG CAACAGGCGT 7740
TGAGCGATTA TTTGGCAGGG GGGGTGCGCC CTGCtGAGCG TGCGCTCATA TCGTGCACCG 7800
TCACGCGTGC GGGTAAAGAG ATTGCTTCGT GTCTGGCGTT AAACGATGTT GTCCTTTCAA 7860
GTGGACCGTC GCGCGTCTTA CCCGGGCAGA GGTGTGCTTC AACGACATTT CTTTTGGCGT 7920
GTATGAAGCT GATGGCATTA TTCTTGCGAC GCCTACAGGA TCTATGCGTA CTCGGCGGCC 7980
TGTGGCGGTC CCATCCTCGA TCCGGACCTT GATGCGTTTG TCCTCACTCC CATAAGCGCA 8040
CTGTGCCTTT CTAATCGTCC CGTGGTAGTT CCCTCCTCAG GGGTGGTGCG TATCAAGGTG 8100
TTGTCTATGC GACACAAAGA AACGGTGCTG TCTGTGGACG GACATGAATT GTGCACGTTG 8160
CAGGAAGAAG ATCAGCTGCT TGCAAGCAGG TCaTCGTGCA GCGCACGATt GGTTTTCTGT 8220
ACACCACACG TGTTCTACCA TGCACTGTGC TCGAAACTGG CGTGGTCAGG GAGTATTTTT 8280
TCTCGCAGGG GAAGACGTCA CGATGATTGA GCAACTTTCG GTGCGCAACG TTGCGCTCAT 8340
TCAATCTTTG GCGTTGGAGT TTGGTGCACA GTTTACTGCC CTCTCAGGGG AGACGGGTGC 8400
GGGTAAGTCA ATGATACTCG GCGCGCTGTC CTTTCTCTGT GGGCAAAAGG TAGGGCCTGA 8460
TCTTATTCGC AAGGATGAGA ACGAGGCATG GGTTTCTGCG GTGTTTCGCT GTGAtCACgc 8520
ACCGCGTGCG GTGCACACAT GGTTGGCAGA ACGGAGTATT GAGCCTGAGC ACCACCGCGT 8580
GCTCCTTCGT CGGGTGATGC GGCGTACCGG TCGTGGCACG GCGTGGATTC AAAACGTCCC 8640
GGTCTCTCGC GCAGATTTGG AGTTTTTCAC GTCATTTTTC ATAGACCTCC ACGGACAGCA 8700
TGAACACCAA TCGCTGTTTC GTGTTGCAGA GCATCGCCGC TTTCTGGATA CtACGGAGGA 8760
CTCCAGCAAG AAGTTGATGC GTTTACTGCG TGTTATGCGG CTCTTGCAGA GCGACGCGCG 8820
CAgcTGCAGC GGcTCGCTTC CTGTGAACAC AACCGGCAGG AGCGGCTAGA ATTCCTCTCC 8880
TTTGCCCTTG AGGAACTGGA GCACGCAGCG TTGGACGTGC ATGAGGAGCG TGCGTTGGAA 8940
GGAGAAGAGC AAAAGCTCTG CCAGCACGAA AAACTCTGTG ATGTGATGCA AAGGGTTGAC 9000
GCTGCAATTA GGGGGGTGGA CCTGCAAGAG GGCGCGCTGC TTTCTTCCTT AAAGAAAGCG 9060
CTTGGTGCAC TTGAAAGCGC CTGTGGGATT GATGGGAGTC TTGAGCCGGC GCGTGCCCGT 9120
TTAGAAAGTG CGTACTATGA AATCGAAGAC GTAGCGCATG TTCTGCGTAC GTATACAGAC 9180
GGTATTCAGT TTTGTCCCGA CCGTTTGCAG CACGTTCAGG AGCGTCTTGC GCTCATATAC 9240
CGGCTGAAGA AAAAATATGG AGGAACAGTT GCGCAGTTTT AGAATACCGT GCGCGTGCGC 9300
ACAAGAGATG CAGGATCTTT CACAGGCGGT GGGTGATAAA GAGGCGCTTG AGCAAGATGT 9360
TCAGCGTCTG ATGGCTCAGT ATTACACGCG GGACGTGCCT TATCGCTTAA 9410
(2) INFORMATION FOR SEQ ID NO: 105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3245 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
CTCTTAGAAA CCGGTnCACC GATcCACACT CCCTCTGAAC CGAGATATTC GGTACGCCGT 60
CCTTCAATCA AAAACTGCAC CGCATGCACC GTAGGAAACG CAGCTGCCGT GAACACCACC 120
TGCGAAAgtG CGCCAAGTAG CCCTCAATGC CATACTGATT GAACTGAAAC TCCTCGCTCA 180
GATCCAACAC CGCCACCCCT TCTTGCACGC GTGCAGAGCG CAGTCGCGTA CCTTCAGGGA 240
TGAGCGTACG CAgTCCGCGC ACCGCTTCTG CCTCAGACGG ACCAAGGAAC AATGCGCGCA 300
ACGCATCCGA AAGCGGCACC GTCGAATGAG GAAGCGTGCG AGCTACCTCC TCACGAGTGA 360
TCTTTCCATC TGCATCAATG CGCACCCAGC ACAGCGTAAT TGCGCGGGAC TGTTGTGCAG 420
AACCGGGCGC CCCAGAAACc GTGCCGACAT CTTTCCTTCG GTACCAGCGT cAsTGGCGCA 480
GGGGATCCTG CAACAGGAGC GTCCGGCTGT GCCACATCCT GCGTAGGGAG CGTGGCTTTA 540
GCATCACTTT CAACAGCCGA GCCGGCCTGA GAACGACGTG CAGAGTCAAA AACGCGCATG 600
GAATCCTGGT TGCCGAACAT CCCTTGCACT GCAGCGCTCG AAAACACACG GTCAATATTT 660
CTTCTATTGA GAAAGAAAAG CACCGCCATT AACAAGGAAA ACGCAAGCCA GAACAGCAAC 720
CCCGCTGATG AGTGTTTCCC CTTCCCGGCC ATACAGCGCC ATCGTACAGG AATTATAAAG 780
CTTTTCCAAG ACATGCTGCA CACACGTCCC TTATCCCCGC CTTGCAAAAA ATATAGGCAA 840
AAAGAAAATA AAGCAGCCGA TATTGGGTCG TGCGCATGGC ATTCATGTTA CTGGCCCTGC 900
TTTTCTCTTT TCGTAATGCG TCCATTCAGG CAGAGGACGC GCGCCTGCTG CAGCCAAAAA 960
CCAACGCTTT GGATCTTGTC GTGCAGGGGG TAGATCTTGT GCTGTTTGCC CAGGATAAGA 1020
CGGCTATCAG TATCAGTACC CCTCCTGAAA AAGACGTGTT CTTCACAGAA CACGAAGGGG 1080
TGCTTCGTGT CCGTACACGC ACAGAAAACG CGGAGGGTAC ACGCCGAGTG ATACGCATTG 1140
GCATACCGCG TGCACAAACG CTCGCATGGG TGAAGATCAT TGCGACGGGC GCACATACTA 1200
CGGTGCGCGG CGTACGCgCG GtGTGGTCAC TGCTTTTGTG CAACGAAGGC ACACTCGCCC 1260
TCACGGAAAG CACGCTCAAG TCATGCACGC TGACACACAC GCGCGGCGAA CTCCGCTTTG 1320
AAGCGGCGGT ACTAAAACGG GCGTCATTCT GTTTGAATGA CGTGAACGCT CGTTTCACTC 1380
TTCTCGGATC GCGCGCCGAC TACCGTCTTA TCTGTAGCCC AGGAGAACGT GCGTGGAAAA 1440
TTGAAGGCGC CGAACAACGA GGCGCGCACT ACACCGAGCC CGCACGGGCG AgaCGCCACA 1500
TGGTTATCAG CGCGAGTGCT TCGTCGATAG ATGTAATGTT CAAAGCGCCA CCTACACAAC 1560
AGGAAGCGGT AGACACGACA CAGAAGGGGT AATCCAGGAT AGACTGCCCC TTTCAATATC 1620
ACCTCAGATA GCAGATTCAC ACCGCGCACT ACTCAAGCAC GTCAGTGACG ATGCGCACCA 1680
CGCGTTTTTT CCCCGCGCGA ACTATTACCG TCCCGTCCAA ATCCAACGCT GACTGGTCAA 1740
TTACCGCACC GATATCTGCC ACGCGCTGGA GCCCGACAAA AGCTCCTCCT TGTGCAATCA 1800
AGCGCCGTGC ATCACTCTTC GTAGTACACA ATCCAACCTG TACAAACAAA TCAGTCACTT 1860
TGATCCCAAC TTGCAACGTG CACTGTGTCA GCTCGAACGT CGGCAATGCA CACTTATCGC 1920
CACACCCGCC GAATGCCGCG CGCGCTCCCT GCAACGCCAC CTGCGCGACA gcCGTTCCGT 1980
GCATGAGGCG CGTTACCTCG TATGCCAACA GCTCCTTTGC ACAATTAATT CCCTGAGTCA 2040
ATATCGCCTC GACATCGCGC ACAGACAAAA AGGTAAACAG GAGCAAGAAA CGCCGCACAT 2100
CTTCATCCGG AGTATTTCGC CAGTATTGGA AAAAGTCATA GGGAGACACC AAAGCCGGGT 2160
CTAAGAAGAG CGCACCTTGC TCGGTCTTGC CCATTTTTTG CCCATCCGCC CGGGTAATGA 2220
GCGGAAAGGT CAACCCATGC ACGGTTTTTC CGCGCACTCT TCGAACCAAA TCCGCCCCGG 2280
CAACAATGTT GCCCCATTGA TCATCGCCGC CAATTTGTAA CTCTACCGCG TAGcaTnCAC 2340
TGAGCGTTAA AAAATCATAG CTCTGCAATA GCTGATAATT AAATTCAAGA AAGGAAAGTC 2400
CTGTCTCCAG GCGTTTCTTG TACGCCTCAT AGGTAAGCAT TTTGTTTACA GAAAAATGCG 2460
CCCCAACCTC TCGCAAGAAA TCAATGTAAT TCAAATGTGC CAACCAATCA CGATTATTCA 2520
CATAGAACAC ATGCCTGTGA TCGAAGGAAA GAAAATGATc CAGcTGCGCA ACTATCGCTC 2580
CCGCGTACGC ATCGAGCGTT GCATAATCGA GCATCTTGCG CATACTGGTT TTGCCGGAGG 2640
GATCCCCAAT ACGCGCGGTA CCTCCACCGA TGAGCACGCA ACCGCGGTGC CCCGCATCAC 2700
ACAAATGCTT TAGCGCAAAC ATAGGGAGCA TGTGCCCAAC GTGCAAACTA CTGCCAGTTG 2760
GATCTACACC GACATAAAAG GTGAGTGGGC CTGCATCCAT ACGCGCCGAA AGCGCCGAAA 2820
GATCAGTACA TTGTCTAATA AAACCACGCG CCTGAAGACG CGCAAGCgCA GGaTTCATGG 2880
AGCCGATTAT ACCGCGCATC GCACACCCCG ATCCAGGGCA GGCGGTTCAG TCCCGAGAAA 2940
AAGACAAAAC GCGCTGATGC ACCCCCACCG TCGCCCCGCG TGTCACTATT CCTTTAAAAG 3000
CGCTGCAATC TTCGGACGAC TCCAGCGCAC CGCAACGTCA TAGGGTGTCT CTCCCGCCAC 3060
GTTTCGTAAA AACTTTCCAA AGCGATTCAT CGCCAGCAAA CtTGCAGTGT CTTTTCATCT 3120
GCAACCTTTG CTGCGTAATG GAGAATACTT TCTCCAGCTG AATCTGTCTT ATTTACCGCA 3180
AAACCCACCA GCGTTTTCAA GATTGACGTG TtCTTGCTCA GGACTAACAA GGCCGGAGnA 3240
CTTCC 3245
(2) INFORMATION FOR SEQ ID NO: 106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1347 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
ACAGGGAGGT TCAGGCATCA GAGAAGGACG TCTGTACACT CAGGGAACAG GTTGCCGTGG 60
CGCAGGGTCT GATTGCTGAG GTTAATAAGG AGTCTTCTTT CGTCGACTCG CTGAGTAAAC 120
gCGTTGCgGC TGCAAAGACG CAGCTGCAGC AAGTGTCCGC TGCGATTCCT GATATGCAAA 180
ATGCATTTAC GCGCGAGAAT ACCGCGCTTC TCCACCGGGT GCGAGATGGA GTACTTGCAG 240
ACGTACATAA GGAATTAGCG GTGTTGCAAA CAAGGCTGGA AAAAGCGCAG GGGGAAAGCC 300
AGTCTCTTTT TGAAGTTTCT GCAGTTAAAT TGCGTGAGTT GTATGAAGGG GCATTTTCTG 360
AAGCAACTGT GCGTGCACAG GTGCTGGAAG AAAATGGATT CGGTCAGTTG AAAGTACAGG 420
CGGAAAATCG CCTTCTCCAG TTGCAGGAGG AGTTTGAAGG GAGCCTCCTT TCTTTGCAGC 480
AGCACGTTAT GCAGCGTGTC GAACAAACGG ACCAGCACAT CCAGGATTGT GCATCCCAGT 540
GGTCTGTTCG GGCGCAGACA TGTGAGTCTG ATTTGAGTAT ACGTCTTGCG GACGTTACGG 600
CGTGTGTGGA TGAAAGCGTG GCGCAACTGA AGGAACAGAT TACTACACAG GAGCGTGAAG 660
TGCGTGCGCA CCTGGAAGGG ATCGAACAGT CGCTTTCAGG AGCAGAATCC GGTTTACcGA 720
GCGCGTGCAC AAGAGTGTGA CAAGTTTTCA CGAAAACTTA AATAAGATTG CAGAGGCTTC 780
TGATGCGCAG TTACAGCAGT ACAGGAAGGA GATGGATGGA CGCTGTAGCA AGTTTGACAG 840
AGAGCTTGAG GGTATTGATG TCCTTGAGTC TCAGTTGCaG CTTGCGCGTG AGCGTACAGA 900
ACAGAAGGTG CGCGAAGAAT TTGAAGCGTA TGCGCAGGAT CGTGAGCGGA AGCAGTTAGC 960
GTTTGAGGCA CAGTTGCAGC ACAGTATGGA TACGGTTGAG CACCGTATGA AGCAGCTGAA 1020
TGACGAGCTG CGTGAGCTGA AGGCAAGTGC GTATGCAAAT GCATCCGAGA AACTGCAGTC 1080
GGTGGAGGAT AACTTTTTTG AGGTACTTAC CAAGCGCAGm aCTCGTTGCA CGCGCGCTTT 1140
TCCGAGTGGA GTGAAGGGAT TGAGGGTCGT TTGACGCAAC TTGCTCTAGA GAGTGAGTCT 1200
GCGCGAAAGG ATCTTGAGGA TACGTACCGC AAAGAtTGCA CACGCGGCTC AAgGATTTTG 1260
TGGAArAATA CAAgGGGCAG TGTACAAAAC TGGGAGAGCA AATCCTCGCG ATTGAATCAA 1320
ACGTGAAGCA GCACATGCGC GCAAACG 1347
(2) INFORMATION FOR SEQ ID NO: 107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5230 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
AACTGTTTGC GAAACGGnAT TCCAACTAAC CGACACATTC AACGTACGCG CAACTTTCAT 60
CGCAGACACA AACGCTTCAT CAGCAGAAAG ACACAGTGCA AAAAAGAAAA CACCACCTAT 120
CAAACACGAA CGCAAAAACT TCATAATGCC CCTCGCTCGA GATACCCTGT CAAAGTACCA 180
ATGCACACCT GTCCCCTCTC CCTACTCACA ATAAGCACAT ACGTCCACGC ACCACACTAC 240
CATACCCTTT GCGCATGCAA TCTCTCACCA GATGACAACC GGTGCACTTT ACTGAGAACG 300
ACACGCAAAG TACGCAAAAG CTCACGCGGT AGATATTCCA ACcGCTCCCC TAACAATAGT 360
CCTAACTCCT CAGTGCTATT CCATACATCa TGGAAAAGCG CGGTTTCTTC CGGACTCATC 420
AATCCAATTT GGAACCCAAA AATCATCACT CCACGCCGAC GCAGTTCCTC TATCGTTCGT 480
CGACTCTCAT GAGGGTATGa GGAGCATCCG TCAGTAATTA CCAACACAAC CTTAACAAAA 540
GATGTATCAG ACGACACACG TGCACGCCTG CGTTCCACGT CTGCAAGAAT GTGTTTTAAC 600
ACCTCTGCAT CATTCGTCCC TCCGAATCGT CCATCTAAGT TCACCGATGC TTTGATAAgt 660
GCGCGCTATT GAAGTCTTTC GATTTACTCT TTCCAAATGA CTTCACTTTT ATAAAACTGC 720
TCCCAAAGTA ATACACCTCA GAATGAATAG TTGTTCTCTC ACGCCTACCT GCTGCGAGCA 780
TGTCTGAGTA CTCTCCAAAA TCTTTCAACG AAAGCATACT TACCGCAAGC GCTTCCCTTG 840
CACTCGCAAT CTTTTCTTTA TTCATAGAAC CCGAGTTATC AACAACGAAT GAAACTTCTA 900
TCTGTGCCGG ACGTAGCGTA CACGCACGCA CTTCATACAC ACGATCATAC ACGCGCAnTT 960
GtACCTCTGA TCCCTACCCC GAAATTCTTG AAAATCAGGG AAAACTGAAA CAAAATCGCC 1020
TACGTTTAAC CTCCCCTTCG GGCACTTCCC CTCGTAACAG TCTATCCACT GGACTGACTT 1080
TCCAATAAGT GATTTCCAAA ACGAGCGCAT ATCGATACGC AAACGATGCG TTGTACGCAG 1140
CAACCGCGCA AACTCTTTGC GTTCCTTCGG AGAAATAGAG AAGCGCTCGT CAAACTGAGT 1200
GGTACAAGAC TCAAAAACAT CTTCTCTTTT TTTCAGCTCT TGCTGCCCGT CCCTTGAGTG 1260
CACAGCGTGC AAATTACTTT TAGTGTATTC GAAGGCACGC TCGTTTCTGA AAGGATCATA 1320
ACTTCTGTCC GTTATATTCT TTAACTCGAA TGTGATATTT GCCACACTCA GATGCCAAAG 1380
CCGCACAAAA TGCTCTAAGA GAAACACAGA GACAATACGG TCCCGGTGCT TGGTTGGGAC 1440
TACCTCGCCA GAAAGGGGGA AAAATTCTGA GTGAATAAAA TTACTCATCT CAGTACCAAG 1500
CACTAAAGTG CGTAgcCTnC CTGTACTACA GGATCCCGTG CGATGCATGC GCGATTGCAA 1560
CCCTTTGGCA TATCATGGAA CAGCACGCGC TCTGCAAGCG CA ATGCAAA CTGCAAGTGC 1620
AATGCCGTTT CCGTATGTTC CCAATACACA CCCCGTTCTT TTTGATACAG AGAAGCCAGC 1680
AGCGCACTTC CTTCTGGAGT TGTGTAGTAC GGAGCTCTGT TCCGCACATA GTGGAAGGCA 1740
TACACCGCAT CGAAAAGGTA AAACAGTGAT GCCACTTCTG TGTGGCAGTA ATGCTGAACA 1800
AAGGCGCGAA CACGTATTTC ATCTCTCGCT TCGCGAGAra AAACTGCCTC ATACATAAGA 1860
CGTCGCATCC CCTGTGCAGC GTCTTCAAAA AAAGACACAC GGTTGCAGTA ATACGCGTAA 1920
TCCGCCTTTA ATTCTCGATA GTAAACAAGC TCCCGGAAAA GgTTCCACAA TATTTCTTTt 1980
CCGTCAAAGT CGTCCTTTAA AAACCACTGC AGTGGAACCC CTACCATCCC CTCCTGCGGA 2040
ACAAACACAA ACCGATCAGC TTGTAAAATA GGTACACAAC GAAAATGCGC ATCCCCCGCA 2100
AArGTGGCGA TATTCCGTGC TTCAGCGGAA AAAAAACGCG CAAAATTCTC TTGCGCACGC 2160
tTcGCATCAC ATTCATTCAT TGTCATGCAA CGGCGGCGCG ACGATCTTAC TCAGTCTCTC 2220
TTTAAACATA GAAGGTTCAC AAAATTCGTG CACCGTCAAA AGCCTGCACG TACCATTTTC 2280
TGCCGCAATA AACACGTTGA ATACCGACGG GTGTTCACTC TGCAATAAGA ATGCAGAAAC 2340
GACCGCTCCT GTTGTGTTTC CAAATGAGCG AGAATACTGC GCCGTGCCCG TAGCATCGAA 2400
TGACAGCAAA CGCGCCTCCC CACCCCGGCC GAAAACAAGC ACCATGCGTG AAGAGAGCGC 2460
CAACGCGCAT AACACTACAC TTTTGAAACC TGTAATCCAT TCGCCATACG CATAGTCCCC 2520
CGCAGCCATG CCATGCTCGT GCCGAAGCAc GCGACTTTCC CCGTCATCGC CAAAGACTAC 2580
CACTCGATCC TCAGAAAGCA CGACCGCTTC AGTGTATGGA TTTTCAAATC CATATATCGC 2640
GTCAGAATAC TCTTCATTGT GTAAAAGGCG TGTACCGCTA AAACCACTAA ACACCACAAG 2700
cTGCGCAGTA TTAAGAGCCC GTACTACCCG TATTCCCCCA CCTAAACCAA GAATCGGCCC 2760
ACGATACTCA TATCCGTGTG CACTCTTTTC TAAAATGAAG AGAGCACCTG TCTCATCACC 2820
TACCACCACT CGGTGAGCAT TTACTTTTTC CACACACAGC ACACGCTCAT ACACGCCTTC 2880
AACACGCGCA ACTACCTCAA ACGAAACCCT TCTTCTCTCT GCCTTCTTCG ATCGTTCCTG 2940
AATAGAAAGC ACGATCGTTT CaCCACtTCG GTAACGACAA CTACTTCTCC ATCACCCATA 3000
GAGATAAGCG CACAAATACG CCAATCCGTT TTCACGCTCT TATGCGTGGA TTCCGCAACA 3060
AGCCATTGAT CGTGCACCCG GTATAAAAGC CGTATATCAT CTGTGCCACC TGCGATAAGG 3120
ACAGAATCTT CCGCCAAAGG AATCGCTGCG GTGATGCTTT TTCCAAATTC AAAAAACACG 3180
TCACTGTACG TGCACTCATC TCCACAGATA TCGGGAAAGC AGTGCCGAAA CTCGGGAAGC 3240
AGCTCCTGAA TCAATGCGTC ACTTGTCTGT TCTGCACGTG CAAGATCATC CCACAGTGCG 3300
CGCGGAGTTC CTCCCTCACT GCGCAATCGC GCGTGCTCAG ACACCAGTTG GTCACACGAA 3360
CACTGCAGAT GGGCAATCCG ATCCGCACAC TCACGTGCGT ATGATTTCTC CTGCAAAAGT 3420
CTGAATGCTG CACACACGCG TACCAACTCG TGTATCCGAT CGTGCAGAAC CAACATCGCA 3480
CCGGCATCTA CCACATCCGG CACTGTATTC TGTGTATGCA TGGCGTACGT AGGTCCGTGT 3540
CCAAAGAGCA ATACCACCAC CTCAGCGAGA GTTTTTACCC CATCCGGCAG ACGCGTGTAT 3600
TCGTAAGGAC GGGTACGGAT CTCTyCGTAA GAgTAAAACt GCCCCawTCC CtTGGGCCGC 3660
ACAcTCCACC cGTCyTTTTC TTGAAAAAAA CCAAaTCGGA CTGCTTGCGA TAAAATATAA 3720
TTTTGATCGT CTGCATTCAA AACACCACCG ATAAAGCCGT CCCACAGCGC CTGATCCACA 3780
TCCTTCTCTT CGCCCCGATT CCACTCATCA AGCACACGGA TCACATTACG AATGGAAAGC 3840
ACCGCTTCAC GCAGCTCAGG CTCCACATCC CCAAGACCGT CATGCGTCTC ACTTCCCCCC 3900
GTGCTATCTT TCCATTTGCC GGAAAATACC TGCTGCGTCA CCTTTGCCAA TTGCGCAAGT 3960
TTAAACAGCT TGTCCAATGA ACCATCAATA TCCGGCAGGC AAAGTGATCC TCCAGAAGAG 4020
GCAAGACGCG CGAGAATGAT GCGAAAAAGT TCATTTTTCT CAGGATGCGC CTGATTCGTA 4080
TAAAGACCTT CTGTGGACTG CGGAGGGTAA TCATATTCGA ACGTGTTAAA GCGCGAAGCA 4140
AACGCGGGGT TGAGCGCCCC AGTGCCCTCG TAATGAACAA GTCCACTGCT GATGTTCCCC 4200
GTGGCAATTA CGCCAAACCC TGCAGCTATT TTTACCGGGC CTACCCCGGG GATATAGGCG 4260
AAATCCCCTA CCCGTTTTTG CAAAATATCA TTGAGTGCGA TGAGATGCTG CATGGGAATT 4320
GCGTTGATTT CATCGATAAC AAGAGGTCGT CCCTCTTTTA CTGCGCGCAg CACCTCACGC 4380
TCAATCTTTT GTACCTCAGT GCCAAAATTG CCATACTTTG CAAGATAGAT ATCCAGCATT 4440
GGATCAAATG CGTAACCACG CTCAGTAAAG GCATGCACAT CCTGGTAACA GTGTTCGGGT 4500
GTTTTTCCCT CCAGACTGTT TTTTAACACA AGCGTCTTTT CAAGAAACAG ATCCTCTGTG 4560
TCGATGTGCT TTGACCCCGA AATAAATAGG GGTTTGATGC GCTGCAAAAG CCGCTCGTCA 4620
CCGAGACGTA GCGCCTGTTC ATAATATGCC GCGCGCTGAG AAAAAAGGCG TTGAAACTGC 4680
GTACGCGCTT CCTGGGAGTT TTGTAAGGCA TCTGACACGC CACGCTCGCT GCCCCGAATC 4740
TCCGCACACG AGCCACCGGT TGCAGCAGTG CACCACTcAT TAAGCGCACA ACGGACATCT 4800
TCAGTGATTA ATTTATGAAG CGCAAAACgc tCCGCGGCGA GCACGGCGAG CTCAGTCTTT 4860
CCCGTGCCGA GATGCCCGCG GAGTAGCACC GCATCCCCGC GGGCAAGGCT TGTCGCTATC 4920
CTCTCAAGTG CAGCACTAAC AGAGGGGGTT TGCGCAAACA CCCCCGAGtG ACGCGTTCAA 4980
CTTCCCCAAG CCATTGAGCA GCATGGAGAA CGAAAAAGGA TTCAGGTGTT TCCAACGCGA 5040
GCGCTGCTCG TTGCTCCTCC AGATCCCGTA TGGCATCTGA GTGCACTCTG GTAAAAACCG 5100
TACTCCGCCT ATGGATACCG AAGTGGGCAC AGGCGACTGC ACACTCCCAG TtCcGCAAAA 5160
AGCGCTCATA CACAATACGC GCGTCAATAT CTGCCAAACG TGCCCGA An CCGTGCAAGT 5220
TCTATCCGCG 5230
(2) INFORMATION FOR SEQ ID NO: 108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1379 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
TTGGATTCGC CAGTCTATCC TCAAGGCACT GAACGAAAAA GCACGCATCA TCCGGTTGCC 60
CGTGAGTCGT ATTCGTGCTG CAAAGAGCAC GTATGAGTAC ATACCTCCTG TTACCTCGGT 120
GAATGTTGGT ATATCGAAAA CACaGGGAGG CaACAGATAT GGCaAAAGGC GTGCAACGAT 180
GTGCCGAGCT GTGCCaACAg TGGAAGAAGT TCACATCTGG CACACGCCGC TATCTCTAGA 240
TGCTCCGCTT GGAGCCTCGG GAAGTCGTGG TGAGTCTGGG GATGTTGATG GACTGCGTAT 300
AGGTGACTGC GTGCCGGACG ATCTTTACGC GCAgcCTGAG GAGCATATGC TTGCATGTGC 360
ATTGCAAGCT GATATTGCAA AGATTTTACG ACTTTTGCCT GCGCGTGATG CGCAGGTCAT 420
CCGCTATCGA TTTGGACTTG GCGGGTATGA ACGGCGTTCT CTGCAAGAGA TTGGAGAAAT 480
TTTTCAGATA ACAAAGGAGC GCGTTCGCCA AATAGAAAAA AAGGCTTTGT TGCGTATCCG 540
TAGCTGTGCC CGTCAACACA GACTGGATTC CTACATAGCG TAGACATCAG AGGCGTATGA 600
AATAGGAGCA GGACGCGTTG GGCATACAAC TTATAGTGTT TTTAGGAAAT CCTGGTGCAG 660
AGTACGAAGA AACGCGGCAC AATGCTGCAT GGTTGCTTTT AACGTACCTT TTCCCATCCA 720
TCGTGCTTCC TTGGCGATGC GGATGTCGGG GGTCGATTGC GCGTATTGAA GGGTTTGAAG 780
GGTCAAGCGA AGAAGTTTGG CTTTTGAAAC CGCTGACTTA TATGAACCGT TCTGGGAAAA 840
GCGTAGGGGC AGCATGTGCC TTTTTGCAGA CGGATGCGAA cAGCTcTTAG TAGTGCACGA 900
TGAATTAGAA TTACCGTTCG GTGTGGTGAG TTTAAAACAA GGCGGAGGGC TTGGAGGACA 960
CAATGGGTTG CGCTCTATCA AGGAAGTGCT TGGTACCGCA GATTTTTGGC GGTTGCGCAT 1020
AGGCATCGGG CGTCCACCCA GTGAGAGTGT GAATATAGCG CAgTACGTCC TCTCTGCCTT 1080
TTACCCGGCA GAGATGGCTG CATTCCCAAA GCTGGGGCGT GCCACGCGAG ATCTTCTGTG 1140
TCAGCTTGTA GTAACAGATC AGGCAGCGAC AGTCACCTTA CTCAGTGCGT GGAGAAAAAA 1200
ACGGTTGCTG TCTTTATGCG AATAAGGACA GGGTGACTCc CATACGGTGA AGGAAGGGTA 1260
AAAAGAGAAA GTGTGGGGAG GACTTGCATA AAGAAAGAAT GGTGGGTTAn TGGCGCGCCA 1320
CGTGTATTAG CTAGGACGAG GAAATCATAT GGCATTTGAA ACAATTTCGT CATGCTTAA 1379
(2) INFORMATION FOR SEQ ID NO: 109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1531 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
CTGCCCnAGG CGCGTGTGGG TGGTGCGCAT AACCAGGGGA AGGATATCCC ACCGTGTTGG 60
TAGTAGAAAC CCGTGCTTCA GATTGTAGGG ACCGGTGCCC ATAACTGCAC CAGGGGCCTG 120
GCCATGACTG CATACCCCCT TGCCATAGCT GATGCCCCAC TCAAGTGTGG CAGAGCCAGT 180
TAGCTTCGGG GAAAACTCCT GTCCGAGCAC TCCCCCGCTC GCTCCTACCC CCACCACCAC 240
ACACAGCACA CTCCCCCACC GCATGCACCC CATGCTACCT CACCCCCCCC CCGGCCCTGT 300
CTAGTAGCCC CCTCACCCTG CCACCTGCAC ACACGCAAAA ACTCACCACT CCTTGCACCT 360
GCCTACCCGC CnGCATCGcG CGcACCCCAG GCGCAGACCT TTGCGAGCGC AAACGCACCG 420
ACACACCCAG CCACACATCC CATAAAAAGC GTAAACTGAT CCTGCATCCC GGCTGCAGTT 480
CCCCGTGGGA ACAACGTCAC GCTCGATCCC AGCACGAGCC CACACACACA gCATGCGCGG 540
aCACCGGGTA CAACCGAAAC CATGCGCGGA GCACCCGCGC CGCCCCCAGC AACCCTATCC 600
CCATTCCCAC ACAGAGTGGA AACAGGAGCC CCCACACGTG CGCAGAACAC AACCTGCTTA 660
TTACCGACGC GCCAGTCACT CCTCCCCCCT GCTCCACCGC GCGCAAAGAC AACACGCTCA 720
AAAGCGGTTG GTACACCCCC GCCAGCAGCA ATACCAATGA ACCGGAAAAT CCTGGCGTGA 780
gCATTGCCGC CGCCGCTAGC GCCCCTGCGC ACACGGTCGC CACAAATCCG CGCGTGTGTG 840
CAGTAGTAAG GACCGCCGTA GTCTGAACAG ACGCAGGGGC GGTCGCAGAG GATGCGTCAC 900
GCGCGTGCTG CATGCGAGAA AACGCACACA CTGCCACAAA CCCAAGCAGC ACAAAGAAAA 960
TAATTCTAAC TGCACACCAC ACGCGGCGTG CCGGTTCCCC CACATGCCCG CGCGCTGTGC 1020
CTTCCGCCTG CACTGAGTGC TCCGCGGcAA CACTGGCGGC ACGCACCCGA TTGCGCAAAG 1080
ACGGCACACT TGCTAACAGC ACCCCGGCCA AAAAGGCATT CGTAAGATGA GGAAATGCTT 1140
CGTACAACGC ACGCATAAAG CGTGGgCACA CCCCTATCCC CAmCACTATT CCCCCCGCAA 1200
gGGCAAGCAA GCGTCGCCAC TGACGGCACA GATGCGCTCT ATCCAATGCT ATTGCTGCAA 1260
TCAGTAaTTC CCAcGTGCCA CAGAGCAGCG CAACCGTACC CCCCGAAAGA CCCGGTACCA 1320
CGTTTGCCGC TCCTATTAAC ATTCCTATCC ACACGTGCAT GACCGGTGAA CTCATGCTCC 1380
GCCTCCCCTG CTGGTCTTCT TTCTGCGAAA AAACGCACCC TCCCCTTCAT CCGGGCACTT 1440
CTCTCAGTTC AACCGCGTCC ATCGCATGCG CAACACACCC TCACCATACA AAAGGAGTCT 1500
GAAACTCTAC ATGCTCAGCG ACAATCTTTA C 1531
(2) INFORMATION FOR SEQ ID NO: 110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1398 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
TCTGCCCGGG TGTGGTGAAG CCGCnCAGTT GCCGGCGCGT TACnCGGATG GGATGCCAGT 60
CCAGTGAAAT GCCGCGTGCa GAAGCgTTGG TGCAAACTGC CcTTTGCcAG AAaGCTGCcG 120
TATCTAGAaT CcAGTTACAC TTTGAwCcGG GTTGTCCcTT TCTCTGAAaG GCTTCTACGG 180
TTGCaGAGAC GGTACGTGCC aACGAACTTT TTGCGCCAGC AGTCTGATAT TCCCTTGCAC 240
CTTGGAGTAA CAGAAGCAGG CCCGCTTGTT TCCGGTATTG TCAAAAGTAC ACTTGCATTT 300
TCCCAATTAC TGTCACGCAA TATTGGTGCC ACGGTGCGGG TGAGTCTTTC AGATAGCATG 360
GAGCATGAGG TGcTGGygCG CGAGAAATTC TTGCTGAATG CGGTAAACGG GCTGGTGGGG 420
TTCGTTTAGT GTCATGTCCG CGCTGTGGCA GGATTGGTTT TGACGTACAC GCATTTGTGC 480
GGAGGTGGCA AAAGGAACTG TTCAGTTTGA AAAAGGATAT CACGGTTGCG GTTATGGGCT 540
GTGTAGTGAA TGGTCCTGGA GAAGGAAAGC ATGCGGATCT CGGTATCAGC GGTGCGGAGG 600
ATTCGGTGAT TTTTTTTAAG CGGGGAAAGA TAGTGCGTCG CATTCAGGTA CGTGATCTTT 660
GCGCAGACGA GCGCACGCGC ATAATAGACG CAGCGTTTAA AGAGGAATTG TCAAGTTTAT 720
GAATAACCTG aTCAAAGCAT ATGCGGcGGG TGTCATGAGT GCTGCGTTTC TTTTTGGGTC 780
AGAGGGGCGG GTGCGCAgTG AATCCGATCG GGTGCGTGGG GAGGATCCGT GGCACCTGTT 840
ACAGTGGGCA CAGGTTGTCT ATGAGCGAGA GGAATTgGCG ATACGTTGCG CTATGCcAgC 900
GGGnCACGGG CGCTTCGGCG GGArCAAnTG GAGCACCAGT GCCgAaGTGC TACTGCGTGC 960
ACGCACACGG GCTGAGTCAG CGGGGATACC CGAAACACTG TCTGATTTAT ATGCACTTTT 1020
AAAAAGTCGA GGAGAGACAG ATGCCtGCGA AGTGCTTGAT GCTATTTTTC TCACTCATGC 1080
GCCGCACGTT TTTCAAAACT CCGTTTCCAA ACTGCTCCAG TGGCTGAAGG ATTCAGCCGC 1140
TTTTCCAGAA GCGGAGTTGC TCTTGGGAAA GGTATTCGAG GGTGAAGGAG AGTACGCCCA 1200
GGCTTTGCAG CATTATCGAA ATGCGTGGGA TACGCGAGCG CAGCTTGTAG TTCCCGACGC 1260
TCGCTTTGAT ATTATCTACG CAATGGCGAA TGTGTCTCGT CTGCTCAGTC AGCAGGATGA 1320
ACGGGAGAAG TACTTGCTCC TTGTGCTGAG CGAAGATCCT CTGTACAGTG CACGTGAGGT 1380
GTGGGGCAAG ACGCTGCA 1398
(2) INFORMATION FOR SEQ ID NO: 111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1900 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
AACCGATAGA CGAAGGGACC ACACGCGCTC CTCCCCTTTT TAAATAGAGA AAAGAATAGT 60
CCCTACGGGA ATTGAACCCG TCTTCTGAGA TTGAAAATCT CATGTCCTAA CCGATAGACG 120
AAGGGACCAC ACGCGCTTTC CCTAGGGAGG ACTCGAACCC CCACACACAG AACCAGAATC 180
TGCGGTGCTA CCATTACACA ACCAGGGAGG GAGGCCCAGG CTAGTACGAC ACGTATTTTC 240
TGTCAAGCAG ATAACGAACA CTCACTCAGA TCAAAAACTT ACCCGGTACC TCCACTTGAC 300
CGACGCAGAC TGGGCAGCTG CCATCAGCAG CCCCACAGGC ACACGCGnCA TTCCGGGCAA 360
AGACACACAA CTGGCACATT GCCGGATACA ATTGTGGCAG TCGCTCCTTC CAGTTCAGTA 420
GCGCTATTGc AAAGCGCGCC TGGCTGcGAT ACAAGGAGGA GAACACTTAC TCCGTATGGT 480
TATTTCAGGa TTACGCGCCA TCGAAGAGTT TCTGCGCGgC AGtCCTnTGC GCTAGAAGGG 540
TTACACGCGG GAGGAAAAAA CAGCAAGACG CAACAAAGCT GCGTGCGTGT GTCACGCTCT 600
ATTATGCAGC GGAAAATGCG CGCATCAAAC GACTCCTTGG TATGGCCGCG GCACGGGGGA 660
TACGGATCAA CACACAACGT GTGCTGTGCT TGATAAGTAG CGCGTAGTTT ACCCnCCTGC 720
TGCGCGATCA CCGCGGTATC CTTGCTGTTC TAAGTTCACC GAGCGCAACG TCCGCAGGGT 780
TCTCATACAA GAAAAAAACG ACtCCGCTGT GGACAGCTCG CAGGAACGAA CGTTGTTACA 840
TGCACTGGCA ACGCACAcGC ACGCGCTTGT GCTCGTCTTA GACGCAATTA CTGATCCCCA 900
CAACGTTGGG GCAATTGTAC GCATGCAGAC CAATTTTCTG TCGATGCAGT GCTCCTGCCG 960
CACCATCATG GGGCAGGAGG TACAGAAACT ATCACGCGAG TGAGCGCAGG CGCCGTTGCA 1020
TGGGTACCGC TTGTGCGTGT ACGCAACCTA GTGCGCACTG CAGGTATCCT CAAGCGTTCA 1080
GGATTCTGGC TATACGGTGC TGATGTAGCA GGAGAAGCAA TAGGCGCCCG TACTTTTCCT 1140
CCTAAGACAG CGCTTGTGTT AGGCAACGAG GGGCACGGCG TTTCGCTTTG CTGCGCACGC 1200
ACTGCGACGC ACTCATCTCT ATCCCAACGC AGGGaATGTA GACAGTCTGA ACGTGTCGGT 1260
TGCCGCAGTA TTCTGTTATA CGAAATACGC CGGAGTCAGC AGTCTCCCTA CTCCGTACAA 1320
AGGCAAAACG AAATGAACGC TCAATGAAAA CACCCAGGGc ATCTTCGCAT CTTTAATTCA 1380
TATTGCATCT TTCACCCCAT TGCGTTACCt GAGGGAGTCT CTACGCaCGC GGTAcGGAgG 1440
GACCCCATGG CACATCTTCC TAAAGAGTAC GATTTTTCTA TAGAGTCATT GGGGGAAAGC 1500
AAAATTCCCT CTCCCATCTA CCTGTCTCAC ACCCTTGGCG ACTTCATTCC TAAcTACGTC 1560
AgTGACAATG AGTACATCAG CCATGAACTG AGTGCGCGTC TGGGGGAGAC GGTAGGGCCC 1620
TTTACTCATA AAAACTTGAT GGAGCGTGCG GGCCCGCGCC AGAAGATTTT CTTCAACCCG 1680
CATCACGTTC ATGCAGGTAT TGTCACCTGT GGAGGGCTCT GTCCCGGCCT CAACGATGTC 1740
ATTCGCGCCA TCGTCCGCTG CCTTTGGGGC CGCTATGGCG TTAAGCGCAT TAGTGGTATC 1800
CGCTTGGCTA TAAGGGCCTC TTGCCCGATT ACAACTTCGA TATCCTGCCG CTCAACCCTG 1860
AGGTCATCGA TAACTGCCAC AAAACAGTGG TTCGCTGCTA 1900
(2) INFORMATION FOR SEQ ID NO: 112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13969 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
TTCGATAGAC GGACAATATC GTGTCCCTAC CGCCTTTATT CTGCCAGAAA AGAATGGAGA 60
ACGATGAAAA TTCTCACGGA TGAGCTGATG CGTTCTCTCC ATTGGTGTAA TTAATATAAC 120
AGTCCCGCAT GTGGACGATT GATTTCCcAC GTGAGCmAAG GAAAAAGGAC GCATGATAGC 180
GTCGGCTTCT TGCTTTTCCA TTACCGAAAG ATCTACAGAT TTTCGTAGTA CGCGCGCGGG 240
CGTACCCGTC TTCAGCCGTC CCATCTGAAA CCCCTTTTTT CTCAGCGCAG CTCCCAGCCC 300
TTCAGCGGCA TGTTCCCCTA ATCGGCCTTC AGGTGCCTCA TATTCCCCAA TATACACACG 360
CCCTTCCATA AAGGTCCCCG TAGTCAGTAC CACCGCACGT GCAGAGATGC GTCTGCCGCG 420
TGCGGTAACT AyCGCATGCG CTGCCCCATA CGCTACATAT CCTGCATCAG TGGTATTGGA 480
ACACACAACG TCTACCACTG TGTCCTGATA CAGGTGAAGA TGCTGCGTAC ATTCCAACGT 540
ATACTTCACC TTCTGGGCAT ACAAAAACTT ATCTGCTTGG ATACGCGGCG CCTGCACTGC 600
AGGnCCCCGG CTTTTGTTGA GCAGTCGATA CTGAATCATG CATGCATCCG CAAACTTTCC 660
CATCTCTCCG CCGAGTGCAT CGATTTCTCT TACAATATTC CCCTTGGAAA TTCCTCCAAT 720
GGAAGGATTA CATGAGAGCC TGCCGATACT ATCGATTGTC TGAGTGATGA GTAACGTGTG 780
CTCCCCCATA CGGGCAGCGG CCAGCGCCGC TTnCGGCACC TGCGTGTCCA CCACCGACGA 840
CGATAACGTC ATAGTCAGAA AATCTGAAAC CCATGGGCAG CGGATTATAG AAGAAAGGTG 900
CAAAAGGCTT CAATCAGGAA GCACGACAGA ATTCAGGAAG CAACGCACAC ACAACATCTG 960
AAGCCTGTTT CTGATAATAA AGGTACCCGT ACATCACCGC TGCGGATAGG ATCTTCCTCC 1020
CGTACTCCCG AGGTTCTGCA AGAGGCAGGG TTTCGAGGAA CAGATCATCG GGCAGACTTC 1080
CCCTCTGTTT TTTCCATTTG CGGACGCGCG ACGGTCCCGC ATTGTAGGAG AACAGGGCAC 1140
AGAACACGGA GCCATCAAAG CGACGAATGA GATCGGAAAA GAACAGACTG CCGAATCTAA 1200
CATTGATATC CGGGTCAGTT AAGTCATAGG TGTCGATATT GAGCTTACGA GCGATATCTG 1260
AAGCGGTTGG CCTCATCAAT TGAGCAAGAC CTACAGCGCC CGCCCGAGAG ATAACcTGCG 1320
GCTGAAACAG ACTTTCACTC CGGATGAGCG CAAACAGCAG ATACTCCTCA ATATGATACT 1380
TCCCTGcATA GCCTTGTATG ACGTCAAGCC ACGGACGCGG GTACGCAATC TTCAAGTGTT 1440
CTACAGAATA GCGCGCACCG TGAGAGCGTA TTGCGTACGA CTGAATGCGC ACGGCATCGG 1500
ACCATCGTGA TCTTTGTATG TGTGCGTTCG CAAAATGCTC TGCAAGGTGG AGAGGAATGT 1560
CAGGATAGAA CTGTACCATA GCTTGGTAGA ACATATCGTC CAAGTGATAG TCTACATAAC 1620
CTTGGAGAAT AGCACGCGAC TCATCGGGGG TAAGAAAAGG ATGCGGTGTT CTCTTACTCC 1680
GCACCTTATA CAAAGCCTCT TCCAGAGGAA TGCCCAATGC ACATGCAGCA AGAACACGAT 1740
AGTAAAGGGA AGAATGCGCA gTTTCGAAGA TGGTACGGTA AGCGCGGCGC gcAcTCTCTT 1800
CTGAAAGCGT ACCTGAACGC GCAAGGACAT ACGTAACGCG CGCGGAnTTC GGGTAATGCT 1860
CTATGCGTCA CACCTCTTGC AGTGTAGCCA GACGCCCCCA ATCTTGCTGC GTAGTCAGCT 1920
GGACAATTGC ATAATCGACA AGGTCAGAAA ACCAAGAATC CGAACGCCAA CGTGGTGCAC 1980
TTTCAACAAG CACCTTAAAG AAAGAATCAA AGTCCAATGC ACGCAgTACG TCCAAGTAAT 2040
ACCACAGGGC ATTATCAGCA TCTTGCCGCC TTGTGGCCAC TTTCTCTGCC TTTTTGAAAA 2100
GGGGAAGCGC TTGCTTCTTG TGAGAAGCGG ACCGCGAGTA AAGACGCGCC GCATAAAAAT 2160
AGCAATAAAA ACGCAtCGTG CCTCTAATTC CGCATTGGAC AACGTTGAGC GCGATAGATG 2220
TGCAAGATGG TCAAGAAAAA CCTGCGCGGC TTTAACACTG TTTTCACTAC CGTAAAGCGC 2280
TGCTTTCCCA AAATCGGAGA GAACCGAATT TGTAGCGTAC CgCGTGTGCC ACGCAGAAGA 2340
GCGCAACAAC AATCGCACTC TCTCCCACCC CCGTTTGTAA TTCCTGTGAA AGACGTCCAC 2400
ACGAGCACCA TGTACTTCCT TAAAGATACG CGGCAGATCC GGCAGTGCCG CAATAAGCGC 2460
AGCAAATTGT GCACTCGTAT GGGGCGCAAG TGCCCTATCG TTGTACCAAG ACTCAACAAC 2520
CGTTCTCAGA CTGTAGTAGC GTTCGAGTTC AAAGAGCACC CTTGAACGCA AAAGGCGAAG 2580
TGTCTGTTGC TGCGTTTTGG TTCGCGCATC GCTGCCTGCA TTAAGGACAT GCAGCTGTTT 2640
TTCAATAAAG GCGAGACGCT GTAGTGGACT CCCTGTATTG TGTGCCTCAT GCGCACATAA 2700
CTGGCGGTAC GGTGCGGCCT GCGctGCGCC ACGTACGAAG TACTCGTGAG CTTCTTCGGA 2760
AAACTTGGCC CGCTTGAGGT GTAACCCAAT GAAGTAACTT GCACCTTCCC gCGCGGCGAC 2820
CTGCTCGAGA AACTCATCTG TTGGCTTTAG AAACGGAGTG TAGTTTTTGT CTCTCAAAAA 2880
CTGAGGGATA TCAACCTCCC TAGCGCCGCG CGcGCACACA GTGCCCTCCC ACGAGACAAG 2940
GACACACACC ACGCACCACA CGCGCAGGCA CAACCTGCCC TCCCGTAAGA GAGAGAAAAA 3000
GCCACCTCCC CACACGAGAT GCTTACCGTG ACGGAATTTC AATTGAGGTT CCTGCAACGA 3060
TATGATCAGG GTTTTTCAGT CGGTTAAACT CAGCAATTCT CATGTAGCGC CACGGAGTCT 3120
TGTAATAACT CCGCGCCAAA TCCCAGAGCG TATCCCCCCG TTTGACCGTA TAACGCACAA 3180
CCTTGACGGA TCCAGACCCA GACTCCTCTA CAGGTTCTGC CGCTGCACTC GTAGCCGTTT 3240
CAGCCCCGGG AGACGCAGGG GGgACGTCCT TTTCCTTGTG GGTTTCTGGA AGAGCAGTCT 3300
CAGATGGGAG ATCGACGACA CGTGCCTCAA CCGCAGTACG CTCAAGCTCA CGCGCTGCAA 3360
CATGCGCAGA AGACCTGGGC AGTTCAACTT TTTGTGACGG CACGACCGCA GGACGAGAGG 3420
CGCCGTGCAT GTACAGAACC CAACCTACCA AGACACCACC GAGCAGAATG AGAAATGCGC 3480
ACACTGCCAA CAAAATCCGT CGTCTCCTCG TGAGGGTAGA ACCATCCCCT CCCTCGTCTT 3540
CAGGGAAAGA ACTGACATGC TCGATCTGAG CAGCAGGCTG CTCATCCGGT GTCGCGTCTC 3600
CTGAGGATCT CTCCAACACG GAGAAAGAGG TTTCCTGAAA AGCACCAGAA CCTACATCCG 3660
TCGCACGCGC ACGGCACCGA CTCCCATCAC ACTCAACCTC GAGGCGAATA GTAGCTCCCC 3720
CCGCCGCAGC AACGGACAAC GTGTCTATCG AAAGCGCACC CATAGGCTCA AGTTCTGCAG 3780
AGGTGACCGA CGCCTCTCCA TCGcCAGGAG AAACCCGTTT AAAAAAGCTC AAAAGAGCAC 3840
GAGACTGTTG GTCATTCGCG GTAACGAGCT CGAGCGTTGT gCGCTGCGCA CCCTCTGCAG 3900
TAAAGAGAGG GaAAAAAGAC CCGTCCGCGA GCTTGATACC GATCTCCTGA CTCATGAGCG 3960
CACCCCCGCC CGAGCATAGT CAGGCGAAAC CCTTTTGTCA ATCTTCAGTC CATAGGGGGA 4020
GGGGGAGGCC GGCGCTCGCT GAGACTGCAC GCTCAGAGAC AGCATGCGGT TCACGCTACC 4080
TGTAGGCTCC TCCTACCCCC CCCCCgTGCC TGTATTCCGA CTGAGGTTCG TCCGTTGGTA 4140
TATTGCTGTC CAAGAGCCCA CTCAGGCGGG CGATTACCTC CCGCGTATGC AACTCAACTG 4200
CCCGCGTTCG TGCAACCAcT TCCGTGAGCA CTTCCACGTT ACCGATCATC GTCTCGACCT 4260
CTGCGCACAC CTTTTGCGCA TCTCGTGTCA ACCGCTCAGC ATGAGCTGCA CCTCGCGTGC 4320
ATACCTCTGC TATCACTTTC CCCTGCTCAC GCGCCGTACA TAACTGCCCT GTCACCGTGC 4380
GCATACGCTC AACGGTACGC TCATTTTCTT CTCCCACCAC CTGAACGGCA CCGGAAATAT 4440
CCGCAAGGAT GGTGCTGAAA CCGTCCACCT TTCCTCTAAT GTCTGTAAAA CTCTGGGCAA 4500
CTCCTGCCGA CGCGTGCCCT GATTCTCCAA TCACTGCCTC TATTTCCTTT AACATTTTTC 4560
CCGTTGCACC AGACTCAGCT GCTGTTCGCG CTGCCAGGGA GCGGATTTCC CCCGCTACCA 4620
CCGCAAAACC GCGTCCTGCG TCACCTGCAT GTGCAGCCTC AATCGCTGCG TTCATTGCAA 4680
GTAAATTCGT CCGTCCTGAA ATATCTACCA CCAGCGCATT CGTCACCGCC AAACCACGCG 4740
ACCGACTCGT AATTTCTTCG ATCACCTCCA TCATGCGGCC AACGCATTCC TTGCCGGTTT 4800
TACTTGCCGT GGTAATTGCC TCAAAATCTG CGTGCACTGC AGAAAACTTA CCCCGCAAGT 4860
TGTGCACCGT ACTCCCGAGA CGCTCAACTG CCTCAAGCGC TTCTCGTGCA CGTTCCCCTT 4920
GTGCTTCCAT TTGTTCATCC AACTCTCGCA AGGAGGAACC AAGCTCTTCA TTCACCACAC 4980
GCGTTTGAGA AAACCCTTCC CCCTGCACCG CTAACTCTTC GCGTACAGCA CCCGTGTAGC 5040
GCGTCAAGTC TTGCACTACC TGCTCTGCAC GATCCATGGT CTCGGTCAGA TTCTGTTCAT 5100
TGTGCGAAAG cTCCTGcGCG GTGCGACGAA CTTCACCAAA CAACCGCTCC AAACGCTGCG 5160
ACGAGGAAGA TAAGTCCTGA GAAAGCCGCT GCGCAGTCCG CCCATCCAAA ACAACCTGCC 5220
ACTCGATGAG CGCaGTTTGC GTAACCAAAA ATCCAACAAA CCCAAAAATC AACAGGGGAG 5280
CTCCTGACAG TACCCCTACA GCCAGGAGCA CATCGTGCAG CGCTGTGACG CCTAACACCA 5340
GCACCCCCAT ACTCAAAGGc ACCGCGCCAC GCTTTTTGCG GTACAAGACT TTGCACATCA 5400
CCCAGAGCAC AATACCCAGC AAAAGCAATA CGAACAGTTG CTGTAAGGGA AGTAATCGGG 5460
CAAAAGAGGC AGGGGGAAGC AAAAGGATAA TCACCGCGTA CGCAAGCCCC TCTGCACCAA 5520
ACGCCACCAC AAAACTCTGA TTCACAAGCC CAGGGTACAA CGTAGAAAGA TAGTACAAAC 5580
AGGACACCCC CGAGAACGCC AGGGTCAGGT ATTCTAACCG CACCATCGGG TCCCACCCAA 5640
TCGTGATCAA GCGTGGCAGA AACGCATTCC CCGTCAGCAG CAAGCGAACC ACAATCAGCA 5700
ACGAAAAAAG CGAGCACGCA TACAGGCTCT TTTCGCTCAC TCCCAACACC CCTGCAGTAT 5760
CTGCCCGCTC ATTCTCCGCA TAAGACTTTC TGCTGCAGGA GGACAATCGC CGGAAACAAA 5820
ACATTGCCAG GTAATACGCG AATATCGTAA ACGCAAAACC AATTGTCATC GCCTCAAGCA 5880
CGTCCTTGCG CAGaCgCGCG TACGAACGCG CGAAACAGAC CCCAAGTGAA CTTCACCGAC 5940
AATGCCCGGT CTTAGACTGT GGTAATTGCT TACCTGAATG CACACGTCAA TTTCAGGTTC 6000
GTGCGTTGGC AACCACACTT CGGCAGGGTG GACGTAGGGG ACGGCATCTG CTCGGTTACG 6060
CGCAACGGTA CCAAGTTCTG TTAAGAGATG CCCGTTTGCG TAGATTCTTG CTGCGTAGTT 6120
AAGCGTATCG CAGGACAGCG CAAgCrACGG AGCGCGAGGG GGAAGCAAGA TTTTCAGCGT 6180
ATAGGTGGCA CATCCGTAGT GAGGATACGC CTGAATAGCG GGTATCTGAA CCGCGCCCCG 6240
CGgTTTGGTC CACAGGGAGG GAACGGTCAT AAACGCAGCA GGCTGCTCAG CTGAGTGAGT 6300
AAGTAACTCG TTCCAGTGAA ACCCCCACGT TCCCGAGAGA GAAAGCAGTG CATCACCAGA 6360
GGAAAAATCC CACTGACGGA GATCGAGCAC TCCGTTTTCT GCCGAAGaGG CGCCGGGAGC 6420
GCAGTGTGTG AAACCGCAGC AGAAAGGGCA AAAGCAAAGA CATACGAGTA CGTAATGGCA 6480
CGATATTTCA CGCCGCGGAC TATACGCGAC ACATCCAAAA TGATCTACGG ACTGCACTTG 6540
CTGTTCACGC AATCACACGG GCATGCCCAA ATGAGGAATT TTCATCGCCC GCATCAGGGA 6600
TAGTGCGCAC GTTTTTGGAA GGGTTCTCCC GCAGTGGGGA CTGCAACGCT TTTTCATCCA 6660
AAACAAAATA AAACTGTTGA CACGGTTTAT TCTCCTGCCT CACACTGGCT GCAGTCCGCG 6720
CCATTTTAAG TCTCTCCGTA AGGAGCGTCT CATGCACACG CAAAGCCTCA GCCCCAGGCA 6780
GTTCATGATG AAAATACTCA ACGGGTCTTC TGCCGGGATC GTCATCGGTC TTGTCCCCCC 6840
CGCTATCGCG GGGGAGTTGT TCAGAGCGCT TGCTCCGCTT TCGCCGCTGT TCGCCGCGCT 6900
CTACCATGTG GTGCTGCCCA TACAGTTCAG TGTACCGGCT CTCATCGGTA CCCTTGTTGG 6960
ACTTCAGTTT CACTGCTCCG CGCCCGAAGT GGCTACCCTC GCCTTTGTTT CTGTTATTGC 7020
CTCAGGAAAT GTCACGCTTC AAAATGGCGC CTGGTTGATC ACCGGTATCG GGGACGTCAT 7080
CAATGTTATG CTCATATCTG CACTTGCAAT CATACTCGTC CGTGCTCTGC GGGGGAAACT 7140
TGGTTCGCTG ACCATCATCG CGTTGCCCGT TATCGTAGCT GTTGTCGCAG GGGGTGTCGG 7200
CTCCTTTTCC CTGcCCTACG TAAAAATGAT TACGCTTTTC GTCGGCAGAG TTATCGCCAC 7260
GTTCATCGCG CTCCAGCCAT TACTCATGAG TATCCTGCTG TCCATGTCTT TCTCGCTCAT 7320
CATCATCTCC CCTGTGTCTT CCGTCGCGGT AGGAATCGCC GTGGGGCTCA CCGGTCTGGC 7380
AAGTGGAGCA GCAAACATCG GCGTCTCCTC CTGCGCCATG ACCCTCATTG TGGGAACCAT 7440
GCGCGTCAAC AAGATCGGTG TTCCGTTGGC GATGTTCGCA GGAGCGATGA AAATGCTCAT 7500
GCCAAATTGG ATCCGGTACC CGATTCTCAA TATTCCGCTC CTGCTCAATG GCCTCGTTTG 7560
CGGCGTGCTC GCGTGGCTTT TCAATCTGCA GGGTACTCcT GCAAGCGCAg GCTTCGGTTT 7620
TATTGGACTT GTTGGACCGA TCAACGCCTA CAGGCTTATG GCGTACACTC CTATGGTGCG 7680
CGCGGGTATT CTTTTCCTCG TGTATTTCGT TCTTTCCTTC CTTGCTGCGT ATCTTATCGA 7740
CTTTATTCTC GTTGACCGCC TCAAACTTTA CCGGAGAGAA CTCTTTATCC CCGAACAAGG 7800
GTAGATATCC TATATGTTAT GTGTTTCCGC CCAGGTCCTG CGTGAGATAC GTGCAGAACG 7860
TGGGTAAGGA ATGTTGTTTG CCTTACCAAG GAGGTGCGAA ATGAGGTGTG TTGTCTTTAA 7920
TCTTCGAGAA GAAGAAGCCC CTTACGTGGA GAAGTGGAAG CAGTCCCATC CAGGGGTAGT 7980
CGTGGACACT TACGAGGAAC CGTTGACCGC AAAGAACAAG GAGTTGCTTA AGGGGTATGA 8040
AGGGCTCGTG GTTATGCAGT TTCTCGCTAT GGAAGACGAG GTGTATGACT ACATGGGTGC 8100
GTGCAAACTA AAAGTCCTTT CCACACGTAC CGCAGGCTTT GATATGTATA ATGCAACTTT 8160
GCTGAAAAAG CACGGCATCC GGCTGACGAA CGTACCGTCC TATTCACCGA ATGCTATCGG 8220
GGAATATGCA CTCGCCGCCG CGTTGCAGct GACGCGACAT GCGCGCGAGA TTGAAACTTT 8280
TGTAAGGAAG CGTGATTTTC GCTGGCAAAA ACCAATTCTC TCGAAGGAGC TCCGCTGCTC 8340
ACGCGTAGGT ATCTTGGGAA CGGGCAGGAT TGGACAGGCA GCAGCAAGGC TCTTCAAAGG 8400
GGTTGGTGCT CAGGTAGTTG GTTTTGATCC GTACCCGAAC GATGCCGCAA AGGAATGGTT 8460
AACCTACGTG AGTATGGACG AGCTGCTGTC CACTAGCGAC GTGATCAGCT TGCACATGCC 8520
TGCGACAAAG GACAGTCATC ACCTGATCAA TGCGAAAACA ATCGCGCAGA TGAAAGATGG 8580
CGTGTACCTG GTGAACACGG CACGCGGAGC AGTGATCGAC AGTCAGGCGC TCTTAGACAG 8640
CTTGGAC AA GGCAAGATTG CAGGTGCTGC ACTGGATGCG TACGAGTTTG AGGGTCCGTA 8700
TATTCCTAAA GACAACGGGA ACAACCCTAT TACCGATACG GTCTATGCTC GGCTTGTCGC 8760
ACATGAGCGT ATCATCTATA CCCCTCATAT CGCCTTCTAC ACAGAAACAG CGATAGAGAA 8820
CATGGTATTC AATTCGCTTG ACGCCTGCAC CACGGTGCTG CGTGGGGAGC CTTGTGCCGC 8880
TGAAATCAAG CTGTAACTGA CGcCAGGTGT CCCTGGTCCC GTGTGAGTCT GACTGGCTAA 8940
TCGGTCAGTC TGGAGTCGCC AGCTCAGGGT GGGTTGTGGG ctCCGCGGGA CCCCGTCCAG 9000
CCGGTTACGT GCGGGCGCGG CCCACCTGTG TGAGCGCGAT AACCAACATC AGTACCACCA 9060
CCGAGACCAG TGCGGTGAAT CCCCCTGGGG CCACGTTCAA GTAATACGAG AAGACCAAAC 9120
CCAGCGCCGT GTCCAGCATA CTAAATAGAA ACGCCGCCAC CAACGTAAGC AGGAAACCCA 9180
CCCGCAGCTG TAGCnTGTCG CAACCGGTAC GGTCATGAGC GAGCTCAGCA CCAAAATACC 9240
GGTAATCTTT ATAGAAGCTG CTATAGTCGC TGAAATTACC ACCGACGCGA CGTAGTTTAT 9300
CCCGTCTGCT GCGACGCCAC AGATACGCGC GGTCTCTTCA TCAAATGCCA AGTACAGCAG 9360
CTGATGGTAG CGCAACGCTA GCGTACCTAC GCAGAACACG CTGAGTGCGA GCATGATCCA 9420
CAAATCGCGT GTAGAAACAA CCAGTATGCT GCCAAACAGA TAGCTGTCTA TATCCGCCTG 9480
GATAAGCCCA GAGCTCAACA GCGTGACAGC AATACCCACA CTCAGGGAGA GTACTATTGA 9540
AAGAATCAGG TCATGATGGT TTTTGAAAAA GGCGCGCAAA AACTCTATCA AAACCCCCAC 9600
CAAGGCAGTG AAAAAAAAGG ATCCCCATCC TGGATGGATG CCGCACGAAA CGGCAATAGA 9660
TACTCCTGCA AGTGAACCGT GCGCAAGTGC ATCTCCCATG AGCGCGTAAC GGCGGAGCAC 9720
TAAGTGCATC CCCACAAGAG GACACAACAA GGCTATGAGA AAAGAAGCAA CAAAAGCGTT 9780
GCGCATAAAT GCGTACTGCA ACATCACCGA CTCCGACACT GCGCACAGGC AAGCGCATCT 9840
TTTTTCTGCA TATCCAAAAA CTCACTGACG TACTGCTGAG GATTACACAA ATGGCCATGT 9900
CCTTCGCTGA GATGAAAAAT TTGCGTAGAG TTTGTAATCG CTGCATCAAG ATTATGCTCC 9960
ACCGATATAA CCGTTACGTT ACGTGATGTG TTCAATCCCT TCAGCAGAGC GTAAATATCT 10020
TTCTGTCCTC GAGAATCAAT ACCTGTTGAC AGCTCATCGA GCACCAGCAA ATCAGGATCT 10080
CCGATCAGGC TCCGCGCAAT GTACACCTTC TGTAATTCTC CTCCAGAGAG GGTATACACA 10140
AGCTTTTTTT TCGCACCCCG CATACCCACC TCCTCCAGCA CAGCATCGAC AACCCACTTG 10200
TGCGATATGC GCAGAAGTCT GCGATACGAG TTAAGCATTT CATATACCGT AAGCGGAAAA 10260
TAGAGCGTGT GCATCTTTGT CTGTGGAACA GAACCAACAC GCTGTACAAA GTGAGCGATC 10320
GTACCGGTGC TCGGCTTAAG TAATTTGAGG ACAAGCTTCA CAAGCGTGCT TTTCCCACTA 10380
CCATTTTCTC CTACAACGGA AAGGTACGCG CCTTTTGGTA TTGCAAGATC CACCTCGTGC 10440
AGTATAAAGC GCGCGTCTGC GGTGTACCTG AAACTAACGT TTTGTAAAAG CACCGCGAAG 10500
GGACTAGCCA TGATGCGCAT GGTTATACTC TATTTTTATG AATGCAACAC TACTGTCTCT 10560
GAGAAAAAAG AGACCACAGC TGCACAATGC AAATTGTCTC TCTACCATGG TTTTCGAAGA 10620
AATTCCCGTT CAACATCCTA AAAAGGAATT GCATATTTTG CAGAGAGTGT GCGAGTACGC 10680
GCCCGTTTGG GGTGGAAGCC CCTTCAAAGG AGCCATCATG CAACGCTGCT CAGTAGTTGC 10740
CGCCCTTGCG GGGGTGGTTT TTCTTGCACA GGCGTGTTCG CTATCAACAC CTTCTCGCAT 10800
AACCCACACG GATAAGCTGC CTGTTGTGGT GACATTTAAT GCTCTCAAAG AGTTAACACA 10860
GATGGTAGGT GGAGAAAAAA TTCATTTAGT GTCCATCGTT CCTGATGGGG TTGACTCTCA 10920
CGACTTTGAA CCAAAAGCAA AACACATGGC CTTCATTAGT GATGCCAAGG TCATCGTGTA 10980
TAATGGTCTT GGCATGGAAC CCTGGATACA CTCGGTACTC CATGCTGCAC GTAATAGCGG 11040
CAGTATACGC GTAGAAGcTG CGCAGGGCAT TGTTCCGCTG AAGGCTCACA CACGTGGGcA 11100
TACGGCGCAC CATGTACATG CACATGCATC GCACGGGTCT GCGTACGACC CTCACGTTTG 11160
GCTCAGCGTA TGTAACGCTC AAACGATGCT TCGTACCATC GGAAAGGCAC TGTGTAAGGC 11220
GGATCCGCAG CA ACGCGCT TCTACAAAAG GAATGCCCGT AATGCGGCCG CACGGCTTGA 11280
GGCGTTGTAC AAGGAATACC GCTCCAAGTT TGCAGCCTTA TCTCATCGAT ATTTTGTGAC 11340
CACGCATGCG GCGTTTGGTT ACTTGTGCAG GGATTTTGAC CTCCAGCAAA AGAGTATAAA 11400
GGACGTCTTT AACACAGAAG AACCTTCCAT CAAGAGACTC GTAGAGCTCG TCGAATTTAG 11460
CAAAAAACAC TCAGTGCGGA CCATTTTTAG TGAACGTGGT CCTAGTGAAA AAGTCGCTCG 11520
CGTTCTTGCG CAAGAGATTG GTGCTTCAGT TGAAACCATC TACACTATGG AAAAAAACGA 11580
GGAGAACCTT TCGTACTACG AAAGGATGAA ACACAACATT AACAGGATTT ATCGTGCCTG 11640
TTCAAAACAG GTGACACCCT CGCAATAACA ACCGCTTTGC ACATTATGCG TTTTTCTGTA 11700
CACTCACCGC CATGTACTCT TGCTTAAGGA GGCTTTTTGG CATACGGGGC ACGGGGACTC 11760
TGTGTGCCAT GTCCGTTTTT TGTCTACTTC TTTCCTTTGG AAGGCGCTGT GTGGCGGCGG 11820
ATAATTTCCT TTCTTTCCTT GTGTGGAATC TGGTTCTTGC CTTCATCCCC TGGCTCATCT 11880
CGGCTATCTT GCACGTGCnc GnCTTCGCTG TCCGCAGTGT ACAGCTGTTC CTTATGCTGC 11940
TCTGGCTATT GTTTTTCCCC AACGCTCCGT ACATCCTTAC CGATATTATC CACTTGGGAA 12000
AGGGTAAGTC ATTTTTGCTT TACTATGACC TTATTATTTT ACTCGCCTAT AGTTTCACTG 12060
GTTTGTTCTA CGCGTTTGTC AGCCTTCACC TTATTGAAAG CATATTAGCC CGTGATTTTC 12120
ATATCAAAAG GCCATTCATA ATTTCAGTAT TTGAATTGTA TCTCTGTGCA TTCGGTATAT 12180
ATCTGGGGCG TTTCTTGCGC TGGAATTCCT GGGACATTGT CCTACATGGA CGCACTATTC 12240
TTTCTGATAT TGGTATCCGC GTCATCAGGC CAGTGTTCTA TGTTGACACC TGGATGTTTG 12300
TGTTTTTTTT CGGCACCATG CTCGTTCTTT GCTATCAAAG CTATCGATCA TTTCTTACCC 12360
ACACAAGAAA TGACAAATGA ATATCGTTCT CTTTGAACAG GAAGAGGTAG TGCACGGTTG 12420
CGCTGTACTT TCTTTCAGGG ATAGTCGATT TTGCCATATC AAGCGTGTGC TTAAATTGAG 12480
TGCGGGAGCC TGCTTCAAAG CAGGGATTAT TAATGGGGTG AAAGGTTCTG CACGCATCTC 12540
CCTAGCCACA GAAAAGTATC TCGTAGCCGT TTTTGAAAAA CTGGAATACG AAGATTGTGC 12600
CCTTTTCCCC CTTCATCTTG TCATAGGGTT CCCTCGTCCC ATTCAGCTCA GGCGCATTTT 12660
ACGCGACGTG TCCAGCCTCG GGATCTCCTC TATCCATCTT GTAGGGACGG AATTAGGGGA 12720
GCGATCTTAC CTAGACTCAG GACTTGCTCA CATGGAAAAA ATGCACACGT ACCTCATACG 12780
TGGCCTAGAA CAGGCAGGAG GCACGAAACT TCCCCTCATT ACTGTTTCGG AGTCGGTGCG 12840
CACCTTTTGC TCACAACACA CCCACATACT CGGCGACAGC ACACACCAAA AACTAATACT 12900
TGATACTAAG AACACCCTAA CCGATCTAGG AAGCGCCcGC TGCGCGGGGA TGTACTGTGG 12960
ATTGCAATAG GGAGTGAGCG TGGATGGACC GAATCTGAAC GTTTACTTTT CTCCGCCaTG 13020
GGATTTAGAG CAGTAGACAT GGGAAGACGG ACCTTGCGCA CAGAGACCCG CGGCCTGTGC 13080
CsCGTGCGCC GTTGTACTCG CCAACGCGCA CGCGTGGAAA AGAAAAATCC CTCGGCCAGG 13140
CAAGAGATCT TCGCCCATAA GTCGAAAGAA TCCCTAGATC CGGATCACAC TCAAAAAGTA 13200
AACCAGAAAA GCCCGAGACA GGCCTGAAAC AAGGAAACAC AACCAAAAGA TCCACACCGC 13260
ACGCGTCAAC CGAAAGCCAT TGCTCCCGAC CGAGGCGCTA AGTTTCAACC TCGCAAACCC 13320
AACGCCGACC CCTTGAGGAG ACCTCCCAAA AAACCGCGGA AAGAAATCCA CACGGAGACC 13380
ACCGGCGTTG TATACACGGA CCGGAAGCGG CGCCAGGGGA ATTACCCGCA CCACACGCTG 13440
GCCGAGATCC ACGGGAGACG CTCATCTCAA GAGATCATGC CCTTACGTAT GACCAATCGA 13500
CCGGCGCTAA CTAGATCAAT ACATACCTCC CATACCTCCC ATGTCAGGAG CTGGCGGTGT 13560
AGAGGAACTC TTTTCGGGAA TTGCAGCAAT TGCACATTCG GTAGTCAACA AAAGCCCAGA 13620
AACCGAAGCC GCGTTCTGTA GTGCTGAACG TGTAACcTTC GCCGGATCGA TAATCCCGAC 13680
CTTAATCATA TCAACCCATT CCATCTTGGA TGCATCAAAA CCGATGCCAC GTTTCTCCTT 13740
TGCCTTCTCT GCCACAAcTG CGCCATCAAT ACCCGCGTTC TCTGAAATCT GGCGTATCGG 13800
CTCCTCGAGA GCACGACGCA CAATCTTAAA ACCAACCGCC TCATCTGGAG TCAGTCCACT 13860
CAAATCAGCT TTCTCGAGCG CCGCCGCAGC CTGAATAAGC GCTAAACCAC CACCAGCAAC 13920
AATACCTTCC TCTATTGCCG CACGTGTCGC ATTTAAGGCA TCTTnCATA 13969
(2) INFORMATION FOR SEQ ID NO: 113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3357 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
CACAGCACCC TACTGCGCCT TCAGGTACGC GACGCACACA TCGCAAGAGA AATTGCCGCC 60
TCAAAAAGTC TTGCATCGCA CCTTTCCCCC ACAACCGCTT GCCTGCAGGA CGGTCCGCTG 120
CCCACACCGG aCACTGCGCG CATTGCCCAG CCGGATACGA CACCTCAGGT GCAGGGTGCA 180
CAGACTGGAT GCAGTTACGA CTTTCTATTG CCACGCCTCC ACCGAGGCAC CGTTAAACAA 240
TTACTCTTGC GCCACGGATG GCCGGTGCAT GATGAAGTGC CTCTCCGCGA GGGAACTCCC 300
CTATCCTTGC GCTTACGCAT CTCTCCGGCT TCGTGTCCTC CTCCTTCTAC TGCGTATCCG 360
TGTTGTCACA CGCCAGGGAC ACCCTCGTTT GTGCCACGTG ACTACCAGTG GGAAGCAGCC 420
GATGCCTTCG TCGGGAATCG CACACAAGGG AGCGGATTTG GTGTGGTAGT TTTGCCGTGT 480
GGCGCGGGAA AAACGGTTGT CGGTTTACTC GTTATGGGTC TGTTGCAAAC GGATACCCTG 540
ATTCTGACTC CTAATAGCGC AGCTGCACAG CAATGGAAAC kTGAATTgTG TGAAAAAACC 600
GACyTGGACG GGACATCCAT CGGTATCTAT TCAGGAGAtG CAgGAAATCA GACCAGTGAC 660
TATCGCAACt ACCAGATAcT CACCTGGCGT GCGCATGCAG ACGCTCCcTT TTCCCATTTC 720
CGTCTCTTTA TGGAACGCAG TTGGGGTTTG ATTATTTACG ATGAGGTGCA CTTGCTCCCT 780
GCACCGCTTT TCCGTATCAC CGCAGAACTT CAGGTGGTAC GACGCTTGGG ATTAACTGCA 840
ACGCTCGTGC GAGAAGATGG CTGTGCGCAG GATGTGTTCA GCCTCGTAGG ACCGAAGCGG 900
TATGACGTGC CGTGGAAGGA TTTAGAAGCA CGCGGCTGGA TCGCACGGGT GCGGTGCGTA 960
GAAGTTCGGG TAACGATGGA CCGGTCACTC CAGTACCAGT ACATGACAGC TCCTGTGCGC 1020
CTGCGACATC GCCTTGCCAG CGAGAACGAA GCAAAAGTAG CGGTGGTACA GCGTCTATTG 1080
CGCGCACATG CAGGTGCGCC TACACTGATT ATTGGGCAAT ACGTGCAGCA GTTATTACAT 1140
CTCGCACACG TACTGCAGGT GCCACTGGTG AGCGGAAGAC AAACTTATGC GGCGCGTGAA 1200
GCCATCTATC AGCGTTTTCG CGAGGGCACG CTCCAGGTGC TCGTTGTATC AAAGGTGGcA 1260
AATTGTGCGC TTGATCTTCC TGACGCGTCG gTTGCAaTTC AAGTTTCCGG GaCATTtGGC 1320
AGCCGTCAGg AGGAGGCGCA ACGCcTCGGA CGCCTCTTAC GGCCAAAGAT ATGCGACGCC 1380
CATTTTTACT CGTTAGTTAC AGAACAAACG GTGGAAGAAG ACTGTGCAcT GCGTCGCCAG 1440
CGGTTTTTGG TAGAGnCAGG GTTACACGTA CGAAACCcTT CGCGTAAGCG AAGTaCACGA 1500
ATAAAGGATA CTCCGTGCAG AGTCCTCCCT GTGTGTGTGA GGgGGGGGGG AGGAGGGGGT 1560
GACCGTGCGG TCTCCCTTGT TTTTTTGGTT CAAGACCGCT ACAGTACTCC ATGCTCGTAC 1620
GCACTGCACT CAGGCTCATC TTTGGCTCCC AGCACGAGCG CGATCTGAAA AATCTCCTGC 1680
CTCTTTTGAA TGCCGTCAAC GCCCAGGAGT CCTGGGTACT TCCTCTCCAG GAGTCTGAGT 1740
TCAAACAAAA AACAGCTGAG TTTAAGGCGC GTGCCGCTGC AGGAGAAGCG CTTGACGCTT 1800
TTTTACCTCA GGCATTTGCG CTTGCGCGCG nAGGCAGCTC GTCGTGTTTT AGGCGAGCGT 1860
CCCTATGACG TGCAGATCCT CGGTTCCCTC GTCCTCCACC ACGGCAAAAT CGTGGAAATG 1920
AAAACGGGCG AAGGCAAAAC GCTCATGAGC GTGGCAGCGG CGTATCTGAA CAGTCTTTCG 1980
GGGAGGGGTG TGCATATTGT CACGGTCAAC GACTATCTTG CTGAGCGCGA CGcggAnTGG 2040
gATGCGTCCA GTATATGATT ATTTAGGCGT TTCCGTCGGC GTCATCCTCT CTTCCATGGG 2100
CAGTCAGGAG CGGCGGTGTG CGTACGCGTG CGATATTACc TACGGTACCA ACAATGAACT 2160
GGGCTTTGAT TATCTGCGCG ACAACATGCA ATTTTTAACG GAAGAAAAAA CGCAGCGTGA 2220
TTTTTACTTT GCCATTATTG ACGAGATTGA CTCCATTCTC ATCGACGAGG CGCGCACACC 2280
GCTTATTATC TCAGGGCCTg CAGAAAATGA TACCCAGCAT TACGCCGAGG TTGACAGACT 2340
CGTCGGGCAG TTACAGGAAG TGGAGCGAAA TCCTGCCACA GGTGACTACC CCAACGAAgT 2400
GGACGGAGAG GAGGTTCGCG GCGATTATAT CGTTGATGAA AAGAATCGCA AGGTTTCCTT 2460
CAGTGGTCCG GGGATGCTGC ACATTCAGGA wtGCTCACGC ACGCTGGGCT TATCCAAGGG 2520
AGTCTATTTG ATGAAGAGAA CTTCAAGTAT ATCCACTACT TTACGCAGGC aCTCCGTGCG 2580
CACTTACTTT ACCGCGCAGA CGTTGATTAC GTAtAAAAGA CGGACAAGTA CAGATCGTAG 2640
ACGAGTTTAC CGGTCGCATC TTGGAAGGTC GGCGGTATTC TGACGGATTA CATCAGGCAA 2700
TTGAGGCAAA AGAACACATC CGCATTGCGC AACGTAATCG CACTATGGCA ACTATCACGT 2760
TTCAGmACTT TTTTAGAATG TATAAAAAGC TTTCTGGAAT GACGGGAACT GCGGATACCG 2820
AGGCGTTGGA GCTCAATAAA ATTTATAAAC TTGAGGTGGT AGTTTTGCCc GACGAATCTT 2880
CCCGTAGCGC GGGTGGATGA GCATGACGTG GTATACCTGA GTGAAGAAGA AAAGTGGAGT 2940
GCCATTTGTG ATGAAATAAA GGAGGCACAC ACACGGGGAC AGCCGGTACT CGTGGGCACT 3000
ATTTCTATAG AAAAGTCCGA AAAACTCTCT GCTCTGCTGA GAACACGCGG TGTAAAACAC 3060
GAAGTTCTCA ACGCTAAAAA TCACGCGCGC GAGGCACTGA TTATCGCCGA AGCGGGGGCG 3120
AAGGGTTCGG TGACCATCGC AACCAACATG GCCGGACGCG GCACGGATAT CAAGCTAGGG 3180
GGTAATCCTG AATTTCGTGC ACGACAGAGC GCAACTGCCA TAGCATCGAA GCACGGTTCC 3240
TCCTCTGTCA CTGTGCAGGA ACATATGCAA GCGTGCTATG AGGCGGAATA CACACGGTGG 3300
CGCGCAGATT ACGAAGAGGT TAAGCAGCTC GGTGGTTTGT ACGTCATTGG CACAGAG 3357
(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1462 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
TTGCCCCCAC GnTGAAAGCG CTCCTGGTnA TCGGCGGCGC ACATTCGGCA AATACCCAGC 60
GTCTACTCCA CACCGCGCGC GAAACGTCGC TACCTACGTG GCTGGTAGAG CGTGTAGAAG 120
ATATTCCCCC CGATATCTAT GCCTTCAGTG CGGTGGGCAT CAGTGCAGGG GCTTCCACCC 180
CAGACTGTGT TATCGCTGCT GTGGAgCAgG ccTGCGCACG GGCGGCgCGC CTGTCGCTTC 240
TCGGGTGTCT TCCTCTGCTC TGCCCAAGGT GAGTACCTGC AGGGCTGTTT GTGCGGCGGC 300
TACTTCTTCC GTCGGTTCAG CGGGTGCATC CGGCGCGGTG TCGCCCGGTG CTGTCCGACC 360
TTTTGCTGTA GGCTCCGTGC GGTGAAAACC GyTTCCCTTT GTCTGCGAAg TGCATGGCTG 420
AACGTTGCGC TCGGCGGGGT GCTGGGTGTC CTTTCCGGTT GTCGTtCTTC GACGAAAGGG 480
CGGATGTGCG CGTCGACCGT ACGGGAAAAG AATACCTCGA TGCACAGGTT GCGTGTGCAA 540
AAGAGGAGCT GCGCGCGCGT CCTCTGCGGG CGCTTATGTG TGCAATTGCG CTCAAAAGAA 600
ATGCCCCAGC ACATCAGAAG GTGGCTCAGC TGTATGCCCc AGGgcTTGCG CGCGTCAAAG 660
AGGCGTTTCG CTATTCAGTG GAGAAACAGA AGTGGTCGGA GGCACTTGTG TTTTTTCGTT 720
CCCTCTCGGC ACTTCGCATT CCGCTGAAGG ACTGGACGGA GCGATCGCTG CATCGTGCGC 780
AAATTGAACA GTGGAAAAAG GAGGGTGCGC ACGTATTGGT TGCGGCGCAA GAGAAGCGCG 840
CCGGAACTTC TgCTGCGCGG AGTCCGGCAG CCATGATAAA GGGGACGGTC ACCATTTTGG 900
TAGATCGAGG AATTCGCGTA GAGCACGGAC GCGGGTTTGC AGATCGAGTT ATCGGGTCAG 960
GTTTTTTCAT CGACAAGAGG GGCTATATCG TCACTAACTA CCACGTTATC AGAAGCGAGG 1020
TAGATCCTGC GTACGAAGGt ATTCGCGTGC GTACATCAAG CTCCCCTCAG ACAACACCGT 1080
GAAAGTTCCG GTGCGCGTTG TCGGGTGGGA TGCGCTTGCA GATCTTGCAT TGCTAAAAAC 1140
AGAAATTACT CCTGAGGTGG TGTTTGGCTT AGGTTCCTCA AAGAATTTGG ACGTGGGGAG 1200
TAAAATCTAC GCGATAGGAT CGCCTGCTGG GCTTGAACGA ACGCTTACTT CTGGCATCGT 1260
GTCTGCGAAA AAGCGCaAAC TGCTTTCAGT CGGTGGGGGA GTGCTGCAGA TAGACGCATC 1320
CATTAATCGA GGGAACTCAG GCGGTCCAGT TATCGACGAG GAAGGGTGCG TTCAGGCAGT 1380
AGCGTTTGCA GGTGTGGAGC AGCATGCAGG GCTTAATTTT GCCATTCCTG TAGAATTGCT 1440
CAAGCAGGTG CTGCCAACTT GT 1462
(2) INFORMATION FOR SEQ ID NO: 115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
CTTTTTGATG ACGCACGAGC GCTTGTGCCC CCGCGGGAAT TTTACGGGGG ACAGGCTCAT 60
GCGCGGCAGC AACAAGCGTA ACCTGATGTC CCCtCGCGCT CAAACGACGC GTAAGTTCAC 120
GCCCATTCGT ACCGCATCCA ACAACAATAA CCCTCATGTC CGAAGnCCAT AGTAGCACGA 180
AATTTTTTTG CATGGCCAGC GCGCAGAACA CGGCGCACAA CGCCTGCCAC TCATATCTTT 240
TTCAAAAGTA CCACTACCTG TGCGGTAACC GCCGCACCAG ATCCAACAGG TCCGAGGCGT 300
TCGGCAGTCT TTGCCTTAAC AAAAACACGT GTTACGTGcG TGTCCAGGGc CTGCGCAAgs 360
GATGcGCGCA TCGcTTcCCG AAATGGGTGT AATGCAGGCT GCTCAAGACA GACAACAGCA 420
TCGAGATTCA CCAGCCGcCA GCACGCTGCG CGCACCAGTT GcCAGGTATG GCGGAGCAAC 480
GCGCAAGAAT GTGCGTCTTT CCATCGTCCG TCACAAGAGG GGAAAAACGT GCCAATATCC 540
CCCAGGCCTG CTGCACCCAA AAGGGCGTCA ATGCTCGCAT GCGCAAGAAC GTCTGCATCC 600
GAATGACCCT GCGCTCCCTT CTTACTGGGa ATATGTATCC CTGcAAGTAT CAGcGGTCTT 660
CCTGcACACA GcGCGTGCAT ATCAGTCCCC AGTCCAACGC GCAGGCACCT TCTTCCGTAC 720
ACGGcAGGAC TGATATCCCC CTCGTGAGAG CAGGCTCCGA TGCCCGCTGC TcCAGATCCT 780
CCGGATAGGT AATCTTTACA TTACTGCGTT CACCGGCGCA GACATGTACG GTCCCTCCGT 840
AGCGAGCGTA CAGCTCcTGt TCATCAGTAT ACTGTTCCCC ATCAGTAGCA GCGCGGTGAT 900
GGGCAGCGCA CAGCGAsGCG TAgcAAAAAC CCTGAGGGGT TTGTGCTAAG CGCACTCGAC 960
TCCGTATAAG ATGCGTTTCG ATACTCCCAT CGGCAGCAAC ACCCTTGGGA GTATCCGTCG 1020
CCTCTATAAC CGGCACTGCC GCTCCATAGC GACAGGTAGC CTCAAGTACA GAATGAATAA 1080
GCGCAACACT CACAAAGGGA CGTGCGCCGT CGTGCACCAG GACCACATCG GGCGCATGCG 1140
TAsCATCGCG TCAAGCCCCG CGCGCACAGA CGCACTGCGT GTATGTGCAC CCGGCACGTA 1200
AAGAATGACT GGACGCGTAC GAGACGGGAA CGCCGAAAGA CGCGAATCAC ACGCCACTTG 1260
ACTTTCTGCG TACGCAACTT CACCTGCAGG AACGGTAACA ACGACAAGGA AGAACGACCG 1320
CGCCTCAAGG GCACGCACGA GTATCTCAGA AAGGAGGCAG ACACCAGGCT GCCTGGAAGT 1380
TAACGGCAAG TACTCCTTTT TCTGCACACA CGCACCACCT CTGCGCATGC GTGCAGAGCT 1440
GCCTGCAGCG GTGACAAGCA ACGCGGCGCG GgTCACCCGG GcACATCCAC CGTACCCGAC 1500
ACAACTGGGT TTTCCTTTGC CTTTTCCAAA TAGCCATGGA TGAGCGTTTC AACCTcAGGG 1560
GGTCGCAGCC CCAGCGCAAA ACACATCTCA TCCTGAAAAA TACGGTATGC AGAATCATAG 1620
AGCCTTCGCT CCTGGATAGG CAGTTCCTTG ACTTTACTCC GGTGGTAGAG CGAGCGCACA 1680
ACGGCCGCAT TGTCCAAGAT ACCACCACTT TTAAAAAGGT TTAAATTGAC CTGATAACGC 1740
ATTTTCCAAT CAAGAGGACT AGGATCAAAA TCCTCAGACA GAAACCTCAA CGCGCGCTCT 1800
GCTTCCTTCC TTTTGACAAT GGTTCTAATA CCCAGTTCCT GTGCTTTATC CACCGGAATA 1860
AGCACCGTCA TATCTGACTC TTCCAAGTAA ATGACGTATA TAGCAGCGTC TCGTTCTTAA 1920
ATGTTTTTTC GCTTATTTCC TGcACCTGAC CGACGCCCTG TCCTGGATAC ACCACGTGAT 1980
CGTGGGGACG AAACGCACAC GCCTTACCCA TGGGGCCAGC GTACACAAAC ACGGAAAAAA 2040
GTCAGTGGGA AGAGGAAGGG GAAAAACGAG GGAACTCCAC CACGCCCGAG TAGCCATAAC 2100
ACAAAGAACG TGTAGACTGG CGCACCCTTT TGTACTACTA TGCGCGCCAT GGCTTGCGTG 2160
CGCCGAGTGC GAAATTTCTG TATTGTCGCG CACATTGACC ACGGTAAATC CACCCTTGCT 2220
GaCCGACTCA TCGAAAGGAC GCGCGCGGTA GAAGAGCGTC TGCAGCACGC GCAGATGACC 2280
GACAACATGG AACTCGAGCG AGAACGAGGT ATAACTATTA AAAGCCACGC CGTGTGTATT 2340
CCCTACACGG ATGCACACGG CACCGAGTAT GTGTTGAACT TTGTAGACAC GCCGGGACAC 2400
GCGGATTTTG CATACGAGGT GTCGCGCGCA ATTGCTGCCT GTGAGGGAGC GCTCCTGGTG 2460
GTAGATGCAA CGCaGGGAGT TGAGTCGCAG ACGATCTCAA ATCTCTACTT AGTTTTAGAG 2520
CACAATTTGG AAATTATCCC TGTTATCAAT AAGATcGnAC yCTAcGGcAG ACGTGnCCGC 2580
GTGTGCTCCA ACAGGTAGAG CACGACCTGG GCTTGGATCC CGCCTCTAGT GTGTTGATTT 2640
CTGCAAAAAC GGGAGAGAAT GTCGACGCGC TCTTTGATGC AATTATCACG CGTATTCCTC 2700
CCCCGCAGGG GAGTGGTACG GCCGCGCTCC AAGCGTTAGT ATTTGACTGT CACTATGACC 2760
AGTACCGCGG GGTAGTTGTC CaCATTCGTG TTTTCGAGGG ACAAGTCACA AGTGGCATGG 2820
TTATTCGTTT CATGAGCAAC GGGGCAGAGT ACCGTGTAGA AGAGACGGGT GTCTTTGTAT 2880
TCAACCTTAT TGCACGTGAA GCGCTGTGTG CAGGAGATGT CGGTTACCTG AGTGCAAATG 2940
TAAAAACGGT TTCAGATGTA CAGGTGGGGG ATACCATCAC AGACGCGTCC TGCCCATGTG 3000
ACACGCCGCG TGCTGGATTT AGACGGGTAA AGCCGGTGGT CTTTTCCTCG GTGTATCCGG 3060
TGGACACTGA TGAGTGTGAG CAACTGCGCG AAgcATTGGA GCGACTTGCC CTCAACGACG 3120
CarTATTTCC TGGGAACGAG ACTCATCCTT AGCGCTGGGG cACGGATTTC GCTGTGGTTT 3180
TCTAGGACTG CTTCATCTTG AAGTAGTGCA GCAGCGTTTA GAGCGAGAGT TCAACCAGAC 3240
AGTCATTTTT ACTGCGCCTC AGGTGCAATA CTATGTGTTT CTAAAAACGG GACAGCGCAT 3300
AGTGTGTGAC AACCCAGCCC ATTATCCTTT GGAGCAGGAG ATTGCACAGG TGCATGAACC 3360
CTACATCCGT GCAACTATCA TTACGCCGAC AGAGGTGCTC GGTGCTGTCA TGACGCTCTG 3420
TATTGAAAAG CGCGCGTACC AAACAGCGGT GAACTATTTA GATCAGAAGC GGGTGGAACT 3480
GGTATACGAG ATGCCCCTTG CGGAAATTCT CTTTGGGTTT TACGATAGGC TCAAGAGTAT 3540
TAGCCACGGC TATGCGTCTT TTGACTATGA GCTTATAGAG TCGAAGCTCA CAGATCTGGT 3600
GAAAGTTGAC ATCCTTATTA ATGGGAAGCC GGTAGACGCG CTTGCGCAGT TGTGCTATCG 3660
ACCGCATGCC CGCAGAAGGG CGCAGGCGGT GTGTGCTCGC CTGAAAGAGG AGATTTCCCG 3720
TCAGCAGTTC AAGATTGCAA TCCAAGGCTC AATCGGCGGG CAGATTATCT CGCGCGAGAC 3780
GGTTAGTCCG TTCCGCAAAG ATGTACTTGC TAAATGCTAC GGAGGTGACA TCACACGTAA 3840
GCGAAAGTTG CTGGAGAAAC AGAAGGAAGG GAAAAAGCGA ATGAAGATGG TGGGGGATGT 3900
GGAGATCCCG CAGACTGCCT TCCTGTCGGT GCTAAAAGAG GCTTCCGACG CCTAAGGGTT 3960
TCAGCGCTGT TTTTTAGAGT CCTCTCCGTC TTGCAGGGGa TGTTGCAAAA GCGATGGTCC 4020
GTCATGCTGC GGTGTAGACT TAGGTATCTG GATAAGTAGA CAGAACACAC ATTATACGCA 4080
GCAAAAACAG AAAAAGAACA GGCGGGGAGG GCGACGCGCg CCCTCCGGGC CGCAcTAAaT 4140
CTTACCGATT AAaTCAATAC CAGGCTTCAA CGTCTTTGCT CCAGGCTTCC AACGAGCaGG 4200
ACAAACtGAT CCCCATGCTT AGCCACAAAC TGTGCTGACT GAACCTTGCG CAAAAGCTCA 4260
TCCGCATCGC GCCCAATACC CATGTCGTGT ACCTCGAAAG CTTTCACAAG GsCTTCAGGA 4320
TCGACCACGA ACGTACCCCG CAGCGCATGC CAAGTGTCTG GCAACAACAC TCCAAAGrAA 4380
CCCgCAAGCT TTyCCGCCTT GTCAGAAATC ATCTCGTAGG GcAGATTCTT TATCGTGTCT 4440
GTCGCATCCG CCCATGCCTT GTGCACGTAC TCACTGTCCG TAGAAACCGA ATATACCTTA 4500
CAACCAATAA CTATAGGAAA CAAACGGGGA AA 4532
(2) INFORMATION FOR SEQ ID NO: 116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6923 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
GTGGTTAGAT TCCCTTTTGG GGATGAGTTG GATGGTGCTG ACTCTGTGTG TCTGTGCGCT 60
GTTGTTTTGC CTGAGGAGGA AGTACGTACA TCTCTTTTTT CCTCGTGGGG TTTCGGTGCA 120
CACGCCCCCT GCGTCTTCGG ACGTGCGGAG TGTGTTGCCG GATATGCCAG TGAGAAGGAG 180
GCGAGGAATC TTTGTCGTAC TCGAATGGGT TGACGCGCTC ACCCAGGCTG CGTGTTTCAT 240
GCTTTTGGTG AATTTGTTCG CGTTCCAGTT GTACGTTATC CCGAGCGAAT CGATGGTCCC 300
CAGCTTTATG GTCGGCGATA GACTCCTCGT GTTCAAGACC GCCTCAGGGC CTGTATTCCC 360
GCTTTCTTCG TTTCGTTTGC CACGCTGGCG TACCTACAAG CGCGGAGACA TCGTCGTTTT 420
TTCCAATCCT CATTACCCTG ACACTCCGCA GaTAAGCTCC GCGCCTTTTT AGCCCAATTA 480
GTGTACATGC TCACCTTTAC GCGCAAGAAC ATTAATGTGG ATCCTGTCAC CGGTGCGCCG 540
AAAGCTGATC CTCTCGTCAA ACGCATTGTT GCTCTGCCAG GGGAAAAAGT TATGCTCGTT 600
GACGGTGTGC TCTATACGAA GACCAGGCAT GATGCGCACT TCAAGCCTGT CGCACAAGAC 660
CGTACGTACG CCACGTGGGA TTTGAATGCG TTGCCCGCAC GCGATTTGGC GCGTGTTCAA 720
CGGGTCATAT TTAATGCTGA GGAGCTCGCC GCCATCCATC TGGTAGAgCG CCTGCGCGCC 780
CAGGTGGATT TTCGCGATTT AGCAGAGAAA ACGCGCGCGT TGGTTGCCCA AGCGCACGCG 840
TaCGCGGGGg CGGCGTCACG CACCCGACAG GGCATTGGCG TGGCGCAACC GATAACGCAC 900
ACATCTGACA TTCCTGCTTT ACCTCTGTTT GAAAAAGAAA TGCGCGGGGC GCGGGAGATC 960
ACACAGCTCT TCGCCACCGT TGCAGACGTT GCCACGCATA TCCGCGACAC CTCCCAGGGG 1020
TTCGCnCAtT CGCTCACTTT GTGCAAAGCT GGATCCCATT TTGGGGGCAA GGAACGTATG 1080
GCTTGGACAC GGGACAGGAA GGTCCGTCCC tGCACCGCGC AGGCCTCTCG CTCTACCAGA 1140
TAAGATTTGC GCAGCTGAAC GCGTTGGTGA AGTACACGTT CGCCCAGCTA GTGGTAAAAG 1200
GCCTCCAGGT GACAGCACAC CGAACGTCGG AGGCTGGGCA GGACGAAACG CTCACTACAC 1260
TTTTGCAGGA CGCGGCCCGG TACATCTTTT TCCTGGGTGC GGCGCGTGGA TTCAACATGG 1320
ACGAATTCCC CGCTGGCGCC GAGCAGTACC TTCCAGAACA CAACTACTTC ATGATGGGAG 1380
ACAACCGATT GAACTCTACT GATATGCGCC ACGCGTACAC CGAACACCTC GAGGCAATCG 1440
ACGCGCACGA CCCGTTCCCT ATTTTCTTTA GCTCCAATGT TGCGCCCAAG TACATTCCCG 1500
ATAGCCACAT CCTCGGTGTG GCGTCGTTCC GATTCTGGCC GCCCTCCCGC ATAGGCACCC 1560
CACAATAGGC TTAGGGGGAG CGGAGAGAAA GCTAAGAAAG GCGGAACCCG CCCAGGCGCG 1620
CAGCAGGCAG AAAGCCCGTA CCCTCAGACA sGTCCCGTTA CACAACAAGC GGGATAGGGA 1680
AGAGCCCTCG CTTACAAGGG CtCAGGAAAG ACATCCCCTT ATCGTGCGAA gcGCTcGCCG 1740
AATGCTTTAC AGCTTGCCTG GGCTTCTTCA CTGGGATCGT CGTACGCGAT CTCCCCCTTC 1800
CCCTCGAAGA CATCAGCGCC AgCCGCCTTA CAACGCTCGA CCCAGTTGAC CATCCATTCG 1860
CCGCCTTCCC CTTCTCCAGC CCACTCATAG GATCCGAAAA GCGCAACTTT TTTCCCCGAT 1920
AACCTTCCCT CAATAGAGGT AAAGAAGGGT TCAAACTCGC TTGACTCTAG CTCCTCAGAA 1980
CCAGctGCAG AGCAGCCAAA GGCGAAGsGG TCATAGGAAT CAAAAGTACC AACGTCGAAG 2040
TCCATGACGC TAAAAAGGTC AGCTTTTGCA CCACCGACAT TCAAACCCTC TACGATGCAG 2100
CGAGCCATCG TTTCAGTGTG CCCAGTGCCA CTCCAAAAAA TGACAGCAAC TTTTGCCACA 2160
AACTCCTCCT CGGGAACGTC ACGCAgTGGG TGcACTCGCA AAATAGTGCA GCCACGACAC 2220
GCGCACCCTG CCCGCGCAAG GGTAGGGGAA AGCTCTGTTG CTGTCAACCG CAGCCACAGC 2280
AGGATCCGgT GCCACCCTGG ACACCGGTAG ACTTGACGGG CCGACATTTT CCGGTACACT 2340
GGGGCCTGCG CGCCAACTTA GCTCACCTGG CAGAGCAGCA CCCTCGTAAC GTGCAGGTAC 2400
CCGGTTCGAG CCCGGGAGTT GGCTTTCTGT TTGGCGTAtT CCGCGCTGTG GGCCGGTAGG 2460
TGAGTCTTGG AAGAGGTGGG GGGGsGCGGg AACGGCGTGC TGTCCTGTTC CTACGCGTTT 2520
TTCCTCACTT CGGGTGGGGT GTTTTCCCTT TGAAGAACTG GGCAAACGGC TGkTATCGCG 2580
CGAAATCCTG TCCCGGCGCG GGGGATGTGC CCCGTGTCTT TGCGCGCTCA GGGGAGGTTT 2640
TCCCTTCAGG AGCCCGGGGA CGGGGTGCTC TCCGTGAAGG TGTCGCCGTG TGATCCGCAG 2700
gTGCACGCTG CTCCTGCCTG AGACTGAGGG AAATACGTCT CCTTGCCTCG TCAAGCCCGA 2760
TGATGGTGCA GGTGTACACG TCACCTACCT TCACTTTTTT GAGGGGGTTG GAAACAAAGT 2820
GGTCGCTCAT CTGCGACACG TGCAGAAGCG CCGTTTCCTT TATTCCAATG TCCACAAAGG 2880
CCCCAAAGTC CACCACGTTT TTTACCTTTC CCTGTACGGT TGCCCCCACT TTTAAATCTG 2940
CAAAGGATAT CAGACCTTGG CGCAGCACCG GTTTTGGATA ATCCTCGCGC GGGTCACGAT 3000
TAGGTTTTTG CAGCTCTGTA ATGATATCTT CGACGGTTCG ATCACTGACT GCGCATTGCG 3060
ACTGCACCTG TGCCTTTTGc GCTGCGCTCA CTGTACCGCC TGCGCGCAGT ATATCAAAAA 3120
TTATCTTTCC CGTTGCATAG TTTTCTGGGT GCACCCACGA GTTGTCCAGC GGGTTTGTGC 3180
TTTCGGGGAT TTTTAAAAAT CCTGCACATT GCTCAAAGGT TTTTTGTCCC ATACCACTGA 3240
CTGTTTTCAG TTGTTCGCGG CTAGTGAATA TGCCGTAGtG GCACGATGGT GCACGATCCT 3300
TTTTGCCAAC GCGCTATTAA CGCCAGATAC GTGCTTTAAG AGAGATACGC TAGCCGTATT 3360
GAGATTAACT CCTACGCTAT TGACTACAGC ATCTACTACC GCGTGGAGCT CCTCAGATAG 3420
CTTTTTTTGA TTAACATCGT GCTGATAGAG TCCCACCCCA ATGGATTTCG GATCAATTTT 3480
TACCAGCTCT GCTAGAGGGT CTTGCAGCCT GCGTCCAATG GAGATTGCAC CACGGATGGT 3540
CAGATCTAAG TCAGGGAACT CCTCTCGCGC AATATCTCCT GCTGAGTATA CGGAAGCTCC 3600
GTCTTCCTCT ACCACGGTGA ATGCAACGGC AGAGTGTGTT TCGCTAATTA TGGAGGCGAT 3660
AAGCTCCTGC ACTGCATGGG AGCCGGTGCC GTTCCCAACG GCTACGAGCT GAATGCGGTA 3720
gCGATCAAGC GCCTGCGTCA AAGCGGCGCG TGCATGGTCC GTGTTGTGCG GATATATGAC 3780
AAAGGAGCCG AGATATTGGC CCGTTTCATC CAGTGCCGCA CACTTAGTCC CTGTGCGGAT 3840
GCCAGGGTCT ATGCCGAGCA CGCGCGTGCC CTTGACCGGC TGGGTCATGA GCAGATTCGT 3900
AAGATTTTCA CTAAAAACGT TGATACCGTG TTGCTCTGCC GAAGCGGTAA GGTCTGCGCG 3960
TATCTCCCGC AGGACGGCAG GACTGAGCAG GCGCACCACG CCATCTGTAA TGGCATCGCG 4020
ATGATACCTG TTGTTGGGGT GCACCGCCTC TTGAACCTGC TCGACAGCGG CGTCTAAATC 4080
GACGGTGATT TTTACGTCAA GGATTCCCTC ACGCTCCCCC CGATTGATGG CTAACACGCG 4140
GTGCGCCTTG ATGTCGCGCA CTGCCTCTGC GTAATCCCAA TACATTTGkT AGACGGACGT 4200
GtGgCAGCGT GCGCGTcCCC GATTCCGGTA GCCGTAACGA CGCCTGCAGA AAGGTAAAAG 4260
GACTTCAGTG CGGCACGAtT GGCGTTGCAG TGTGCGGTCT CtCTGCGAGG ATATCGCAGG 4320
CGCCTGCGAT GGCGTCTTGA GCGCTGGAGA CGGCACGATC AGAATCTGCa GCAGGAGCGA 4380
cGAgCGCTGC GGCAGCGCGC TCGATTTCTG CCTGCGTGGC GCACTGCGTT TCTATCAAAC 4440
GCGCAAgCGG CTCGAGTCCT TTTTCGATCG CCTGCATGCC GCGTGTCTTT TTCTTTTTTT 4500
TGAACGGAGC CCAGAGGTCC TCGAGTGCTG CAAGGGTAGG AGCGCTCCTG AGGTGCTCGT 4560
AGAGCGTgGG GGTGAGCATG CCTTCTTTGA AGACGGCGCG TATAATCTCG AGTCTGCGTG 4620
TTTCGCGTGC AAGGTGGGTG TGGAAGAGGC GTTCGCAGTC GCGGATGAGC ACCTCATCGA 4680
GGCAGTGATG CGCTTCCTTC CGGTAGCGCG CAATGAAAGG AACCGTGCAG CCTTCTTTGA 4740
GGAGGGAACG CACGGCAGTA ACCTGCGCGG TGCGGATGTG CAGTTCGCgc GcTACGCGTT 4800
CTGCGAGCTC GTCCTCTTGC ACGCTGAGTG CGTCCACAAA GTCCTGGTCT AAAGTCATGA 4860
GGGGGAGTGT AACGCGTTTG GCTCTTTTTA GAGAAGCGCC GGCCTGCAAC CGGCTCCGGC 4920
GCGGACCCTG GCGTGGCACC GGCCAGAGAA GGGCGAGTGG AGAATAGGGG AGTCGAACCC 4980
CTGACCTCTT GATTGCGAAC CAAACGCTCT ACCAGCTGAG CTAATTCCCC AGGACTGCTG 5040
GCTCCAGCTA TACACCAAAT ATGCGCGTCC TGCAAGGGTG TTTcCTGCGG GTGTGATGCC 5100
CCTCGTGCAC CCTGTTCCCT GGCGCACTGC GGCGCGGTGT AGCGCTCTAG GCGCGTCGGG 5160
GGGTGTTGTA GAATAGGCCG CATGAGCTAT TCGTGGAAAG TGCGCGCGCT GTGcTGCGCA 5220
GGACTGTGTG TAGGTGCGGG GCTTCGTGCC CAGGAGGGCA GCGGAATTCG CGTGCGCGGT 5280
ATGCCGGAAC ACGCGCAGGT GACCGTAAAC GGATATCTGT GCGCAACACC AGAGGAAATG 5340
GTGCTCACCC CTGGTGAGTG TGAGGTAACC GTCTGTGCCT TTGGA ATAC CAAAAAGACG 5400
CTCCAGGTAG TGGTTGAGGA AgGCTCGTTC ACGGTGGTGG ATGGCCGTCT GGATACGGCG 5460
CGTTTGGAGC TCACGGATGT GACTGCGCAG AGGGCGCACT TTAATCCGCG GGATCCGGCG 5520
GgACTGAACA CGGAgTACGT CACgTTCCGG GTGACAAAAT CTGCAAAgTG TACGGTAACG 5580
gTAAAGGATG CCGAAGGAAA GGTGGTGTGC GAGGAGCCGG TGGAGTTAGT TGAGCTGGGG 5640
TTGAACGTGG GGGGAATATT CGGGGGCAGT AATAAGAACA GCGAGGATGT TAGCGTTAGC 5700
GCAAAGGTAG CGTTCGAAGG GAACGTTACG AGCGACCCGG CTATGGGCCA GCTCTATGCC 5760
TCAGCGCTGT GTTTGTACCG CATCGTGCAC AACAACGATA GCAGCGGCGC AAACAAGTGC 5820
TTCATGCGGA AGGGTTTGAC GTTTGCGACC ACCTGTGCGT ACGGCATTAA GGGATTCACC 5880
GTCGCGCTCT CCGGAGAACT GGGTGCCAGT TCAGAGACGG GGATAAAAAA GCCGGACTTC 5940
TCAACCGATG TCGGCCTGTC GCTCAAGTAC CAAAACAAAA TATGCTCCAT TGCCACGTAC 6000
AGCAAGTGCG GAACCACCAC GGGGAGCAAT AGTGACGGAG CGAACAGTGT GGCGGGTGTG 6060
TCcGGTTATG CGTGCTGCCT GCAAGTCTCG TGATGGGCTT GGAGAACAAT ACGCTTCAAA 6120
GGTAACTCTT ACGAGGGCTG GGnAnTACGC GCTTCCATTG GGTACGTTAT CAACACGAAG 6180
CTGAGAGTCG GGCGACCATA GCGGGGCAGG GTACCAGCCT GCTGCGATCG CGCGGGCAAG 6240
TGCCGCCGTC AnCGTGCGCG GTCTGCGGGT ACGTCATACC AAACCAGCGC GCGCTGCTGG 6300
GGTAGCAGCG CACCGTTCCT TTTCCCTGTG CAATGAGGCT GTTTACCGCC GCGGGCAAAA 6360
GGTATTCCCG CTCGTGCTGC GGCGCGcGcs CTtCTTGCAC GAATGTCTGC CAGCACGCTG 6420
CGAGGTGTTC GAACACGCGC GGAtGAAGCC AAAAAAGTTC ATAGACGCTA CTTCCTGCCC 6480
GGTGAGTGTG CAGGGCGCGC GTGCCGCTGT GGCGGGcAGT CTTCCGCGCG CTGTGnGrTA 6540
ATGACCCGTC TGcCGCCGTG CGCGTGCCAG CCGATGTGCG TGTGTTCGTG TATGGCACGC 6600
ACCAGGCGTC CTGCGGCGGG GACTGCGGGG GTGGGGGGAG AGACTGCGGC GTCAcGTcCG 6660
CAAAGGTGCA GATACCGCGC GAAACGCCAC CGGTTTCGCT GAGCgTGTGC ACAAGGGGgT 6720
AGCCGACCAT GGCGTGGCGT GTCGAGTCCA GCCCCTGCGC GGCAAGGTGC GCGGCAAGcg 6780
TTTTGTAcgC GTCGcnTCCG TAGTAGTCAT CAGCGTTGAT AACCGCAAAC GGTGCAGTCA 6840
GCTGTGTGCG TGCGCAagCA AGCGCGTGGC CCGTnCCCCC ACGGCGTGCG CGCGACAGGA 6900
TGCGnCAGAG CGGCGCCGTG TGC 6923
(2) INFORMATION FOR SEQ ID NO: 117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6986 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:
ATACCTTACA ACCAATCTCC ACAAAAGACG GATATACGCG CGCGAGATCG TCAAGCTCGG 60
TGGGGCACAC AAAGgTGAAG TCGGCCGGGT AAAaCmTAAA CACCGCCCAr cTACCCTTAA 120
TGGATGCGTT AGAGACCTCC GTAAACTTCC CCCCGACATA CGCAGGamGc TTAAAGTCTA 180
TAACCCTCTT GCCTATGAGA CTCTCCaTAG cAACCTCCCA AACGTTCGCA CTCTAGCGCA 240
AACTATAGGA AACAAACGGG GAAAAGTAAA GTCTTTCACA CGTATCCCCC GGGTCACCAC 300
ACCATCACGC AGTACGTTCA GAAACAAGGC GAACGCTGGC TTTTTCCTCT TGTGTAACTA 360
CCTTCTTCGT AATGCATAGC TCCTTTTTCC CTTTCAGAGA CGGTGCCTCA AACATAGCAT 420
CAAGCATTAA TCTTTCCACA ATAGAGCGCA AACCCCGCGC CCCCGTTTTT TGATCAATTG 480
CCTGCTGAGC TATTGCGTCC AAAGCGTCCT CATCAAAGAC aAGACGCACG kCATCCAACG 540
CGAATAACGC TTCAAACTGA CGGACAATAG CATTTCGCGG TCGTACCAAG ATATTGCGCA 600
GATCCTCTTT AGAAAGAGCA TCCAAGGCGA CCGTCACCGG CAGACGGCCG ATAATCTCTG 660
GGATTAATCC AAATTTCACC AAATCATCCG GAATGACGTC CTCGTGCATC AGTTGCAGAC 720
CTCGCTCCTT TACCGTTTTT ACATCTGCTC CAAAGCCAAC CGGATTCTTA CACAcTCTCG 780
TACcGACAAT ACCATCTAAC CCAACGAACG cACCACCACA GATGAACAAA ATGTTCGATG 840
TATCCACCCT GAGCATGTCt TGGTTTGGAT GmTTGCGACC CCCtGCGGAG GCACcGATGC 900
TATCGTTCCC TCAaTTATTT TCAAAAGCGC cTGCTGAACC CtTCACCCGA CACATCACGC 960
GTAATAGACA CGTTCTCGCT CTTACGCGAA ATCTTATCGA TTTCATCAAT GAAGATAATC 1020
CCCCGTTCTG CGAGGGCAAC ATCTCCGTTC GCATTCTGAA CGAGCTTTAA TAAGATATTC 1080
TCTACGTCCT CACCCACATA ACCGGCCTCG GTGAGCGTAG TAGCATCTGc tATCGCAAAG 1140
GGGACCTTCA TTTTCTGAGA AAGTGTCTTA GCCAACAGCG TTTTGCCTGA ACCTGTCGGC 1200
CCAATAAGCA GCACGTTAGA TTTTTCAATC AATACCGAAT CAATATCCAA AGACCTACCT 1260
GCCACCCGTT TGTAGTGGTT GTACACCGCA ACCGATAGCA CCCGCTTGGC CAAATCCTGC 1320
CCAATAACGT ACTGATCAAG GTAAGCTTTC AACTCTAAAG GAGTTGGAAT CTCCTCTTTG 1380
GTCATGAGCG CAAGTGCCGA CGGCTTGCGA TCACGCAAGT ATTCTGCACA CCGCTCCACA 1440
CAATAATTGC AAATAGAAAC CCCATGACCG GTCACAATCC GGCGTCGCTC ATCTTCTTTT 1500
TTTCCACAGA AAGAGCAGCC CAATACCAGA TCCCCCTTAG ACCTGAGCAT GCTTTCTCCT 1560
CTTCATTACC GTATCTACGA TACCATACGA ACACGCCTGC TCCGCGGAAA GGAAGAAATC 1620
TCGCTCCATA TCCTCCCGCA CCTGCTCCTC TGACTGCCCA GTGTGCAAcG CGAAATACGC 1680
AATCGTCAGC GTCTTTAGGC GCAGGATCTC CTGCGCCTgG GATGCACACA TCACTTGCCT 1740
GCCCCTGTAC GCCACCCCAC GGTTGATGGA TCATCACCCG AGAAGACGGA AGCGCAAAAC 1800
GCTTGCCAGG CGCACCTCCT GCCAGTAACA CTGCTGCCAT ACTCGAAGCC TGTCCTAAGC 1860
AAATGGTCTG CACCTCAGGG CAAATGTGCT GCATCGTATC GTACACTGCA AGCCCTGcAG 1920
TAACCGCCCC GCCAGGACTA TTAATGTACA GGCTGATATC CTTATCTGGA TTCTGAGACT 1980
CTAGAAAAAG TAAcTGCGCT ACAACTAAAT CCGCCACCGC GTCAGTGATC TCCCCGTCTA 2040
CGAAAATAAT ACGGTCCTTC AACAAGCGGG AAAAAATGTC ATAGctCCGC TCTCCACCCC 2100
CCGACTGTTC AATCACGTAG GGAACCAGAT TATGCATACG TTCACGCACG GGACTGCTCC 2160
TGAAGAAAGT CAGTCAAAGA CTGCTCCGGC CCACACTCAG TCACACATCG TCCGAGCAAT 2220
TTCTGGCACA GCTTCCGTTC TCGTATTCCT TCACACAGCG CACGCCGTTT TTCCTCCCCT 2280
GCATAATACT CGCGTACCCG CTCCTCTTTG GAACCTGTTT TGGACGCAAT GCGTACGTAC 2340
TCCGTCTCAA TTTCCTCAGC AGAAACAGAC ACCTGCTCCT GCTTAAGAAG GAGCTCAACA 2400
ATCACACGCT GCTTCAGGTG CTCTTCCACC TCCGGACGCC ACTGCTGAAA AAACTGCAGC 2460
TTATTCTGCG GGGTGCCCGA CAGGCTCACC CCAAACTGAC GCATCACCAA CGCCCAACGA 2520
GACTCCATCT CCCCCACAAC CAAAGATTCC GGCAGAGAAA AAGGATTCTC CCGCACCAAT 2580
ATACGCAACA GCTGCCGCCT CTTATACTCG TGCAGCGCTG CCTCCAAcGC TTCCGCGAGG 2640
TTTTGCCGCa aCTCCGTGTC AGATCGTCAA GTGTGCGAAA AGCATCGCTC ACATCTTGCG 2700
CAAGcTCATC ATCAAGACTC GGCAACTGAC GCTGCTTGAG CGCCTTAAGC GTTACCCTCA 2760
CCTGAGCGGC TTCGTCCTTC AGCATACCGG CCCTTTTAGC AAAGAGACAC CGCTGTCCTA 2820
ATTTCATACC CAATATATCT TGCCCAAGCG CAAAGGGACC TTCCTCCACC CCAAGCGTAA 2880
AGACAACGCC GGCGCGCTCA GTACCCGGAC GAACGGCACC TGAATCGTCA ACCTCGTGAT 2940
AATCGACGGT GGCAATGTCC CCTACCTCTG CACACGAATC TGCACCCTTA TCAGTAACCA 3000
GCGCATTGCG CTCCTGAATA CGCGTTAACT CTCGAGAGAC GTCCTCTTCT GTGACCGACA 3060
CAGTGGGCAC GGACAGCGAA AAGCCCGATG TGTTGCGTAG TTcAACGGAA GGAAATACGT 3120
CGTATATGAC AGCAAAaGAG AAaTCCTCGT CAGGATCGAA CACTGGCTTT TTCTTAAGCG 3180
AAGGACGGGA GATAGGAAGA GGCTGACTGT CCTGCGACGC CTGGGCAAAC CCCTCCTCCA 3240
GAGCTTTTTC CATGAGGGCC GCCGCTGCAT CTTGCCGAAT AGCACTTCCA TACTTCCGCT 3300
CAAGCACTGC AAGAGGAACT TTCCCCTTGC GGAAACCAGG AAGCCGCGCA CGCTCAAGAT 3360
ATTCCTCAAC AAAACGCTGA TAATGcCGct GCGCATCCTC GCGCGCGACG ACCACCTCTA 3420
GCTCAACCTG AGATTGTGCA AGCGCGGTGA ATTTTTTTTG AAGTTCCACA AGCCCAGATC 3480
CTTAGGAAGA AATACCTACG TCCGCAACGC cTCGCACGGT CCAAGCAGGa TGCAGCAAaG 3540
CGCTGaAAAA GCGGGAAACG GGGATCGAAC CCGCGaCTTC CACtTGGCAA GGTGGCGCTC 3600
TACCACTGAG CTATTCCCGc ACAGGCGCCt GCGAGAGGAG GGACTTGAAC CCTCATGCCA 3660
GAGGCACTAG ATCCTAAGTC TAGCGTGTCT GCCGATTCCA CCACTCTCGC ACGGAAGAsA 3720
TsCGGCAAGC AAAAACTCGC CCAACAGGAT GCAGACACCC AACCGCCCCT GAGCCATGcA 3780
GGCTTCGAAC CTGCGACCCA CAGATTAAGA GTCTGTTGCT CTACCAACTG AGCTAATGGC 3840
CCGTCCTCCG ACACCCTCCC CCCAGGATCA CATATCATGC AAAAAGGATC AAGATGAATC 3900
GTATCGTCGC GTCCCACGCA CCTCCTCTTT TCGCTCAACA TTTCCTTCAA TCAGTCCAAA 3960
CCTCTAGGAA GATATCCAAG TCGCCGAACA CAACAGGGGC GTAGTAGGGG ATTGACTGTG 4020
CAGTCACTGG GTCCGTCGGG TTCACCCTAC AAGGAACACT CCGTTTGCCG TACTCGTTCC 4080
GCATAGGCCC TGTCTTTAGG TATAAACTCC CTGCCCCATA CGCCGGCACC CCTTTTTTGC 4140
AGGAACGGTT TCCTAAACGA GCTATCCGTG CTACCCTGGC AGCCGACCAG GGAGGGCGCG 4200
TATGGATCAG CATACACGTA CACGCGATCT TGTTTCTGCA TTTTTTGGGC GCTTTCACTT 4260
TGATGTCCAG GGACCTTCCG TCCGCACGGT TGTCGACGTG TTGCGCGCAG ATATGGTGCG 4320
CGGCTTAGAG GAAGAGGCGC AGCTTCCTCC CCGTATGGGG AGTGCACTTG CGATGATTCC 4380
CACTTGGGTG GCGCCCCCCC GTGTATCCCC CTGCAACCGA CGCGTGATAG TTATCGACGC 4440
TGGAGGAACC AACTTTCGCT CGTGCCTCGT ACGCTTCGGC GACAGTGGCA CACCTCACAT 4500
CGAGAATTTA GAAAAACGTC CCATGCCCGG TACCACCCGT GAGTACTCAA GGACAGAGTT 4560
TTTTGGAGAA ATTGCAGACA ACCTGGCACG TCTGAAAGGT GCAGCGGACT GCATTGGCTT 4620
TTGTTTCTCT TACCCTATTC GTATCAGACt GACGGTGACG GTGAGGTTAT TCAGTTTGCG 4680
AAGGAAATCA AAGCTGCTGA GGTCATCGGC ACGTGTGTCG GTGCTGGTTT GACAGAAGCG 4740
CTAAGTGCTC GGAACTGGCC TGAACTCCGT TCTCTCAAAA TGCTCAATGA CGCAACGAGT 4800
GCGCTGCTTG CAGgTTTTTT TGCGGCACCA GAGGGGTGTT CGTTCAGTTC ATACGTAGgT 4860
TTTATTCTTG GCACTGGAAT GAATTCTGCG TATCTGGAGC CAGACCCTAT TCCTAAAATT 4920
CCTGCGCATC ACACACCTCA GGTGGTAGTG TGCGAATCGG GAAAAAGCAA CAAAGTACCG 4980
CGCAGTGTCT TTGACGAATT ATTCACTCAA ACTACTGCCG AGCCGGATAT TGCACACCTA 5040
GAGAAGATGT CCTCGGGCAC CTACCTCGGT CCCCTTGCTT CCGTTGTCGT GCGGCTTGCG 5100
GCACAAGAAG GTCTTTTCTC ACACGCAGTA CACGCTGcAC TCAGTACGGT TTCCTTTACA 5160
CTCGTGGATA TGGATCGTTT TTTATTTGCT CCCTCTGTGT CCACCACCAC GTTGGGCGCG 5220
TTGCTCGCAC CGGGCACCGA CACAGACCGA GAGATTCTCT TTCTTTTGCT CGATGCGGTA 5280
kTTGCACGTG CAGCACGCAT CGCTGCGGGA GTAATCGCCG CCTCAGTATT AAAAAGCGGT 5340
GCTGGGTATG ATCCGCTTCG TCCCGTGTGC GTGCTCGCAG AAGGCACCAC GTTCCAACGC 5400
ACCTACCGCC TACGCACCCG GGTTACTTCC CACCTGCAAG CCTTTTTGAC TGAGGAGCGC 5460
GGTGTGTATT TCGATATCAT TTCACTTGAA AACGCCGTAA CGCTCGGCTC TGCACTCGGA 5520
GGACTCAGTT CGTAGGCATA TGCCTAAACG GACTGATGAT CCTGTGAGAG ATAGCGCCGT 5580
GCAGTGCTTC CTGTCATCTT CTCGTCGGCC GCGTGTGGCT GAGCGGCCGT GCTCGCCTTC 5640
TGGTGCGAAC GCGCTCTCCC TGTCTCTAGG GGAGTAACTT CCACGCCGAG TGTATCTTCT 5700
CAATCTTGTA CACGAGTAGT GCCTTCCCTC CATGGTTCAT GATTACATCC ACTCGGGTCG 5760
GGCTCTCGAA GACAATGGAG TCCACCCGCA CGTTTTGCCG CGACGGCACG AACACGTGTA 5820
TAAAATAATC GCGTAGtGCC GCAAACGAAC ACCCTTCTTT GGCAGGGCGT CCGAGCTCTG 5880
CTGCAATATG TCAGGCATTG AGTACGCGCG CCGATATGCG TCGGAAAGAT ACACAAGCCA 5940
CTTGTGATAG TCACGTTCGG CAGTGATACG ATTCAAATGC GCCACCACAT CTTGCAATTC 6000
TGCCTTTGTA CGCTCATAGT CTGAGCGCGT GATGCGCACG GTGCCGAAGT GGCGACAACG 6060
CCCGCACGCT CCTGCGGACT GTGAACATTC ACCTTTCCAT CGTTCTGCAC CCACTTCTGG 6120
ACGCGGGGCT GTGTGCATGA AACACTGCCC CACAATACAC TTCCGATTAA GTAGCATACT 6180
TTTCCTCTTG CAAATGCGCT CCTCACTACA CGCCCACCAG CGTACACAGA GTAGGATCGT 6240
TGAAGAGCTT TGCGTACCCC TGTGCCGACA AGCCCAACTT ATCCTCCAGG TACGGATTTG 6300
CACCATGATC CATCAATAGC CGAATTAATA CATGATCTTT CCTACCCACT GCCAACACCA 6360
GCGCCGTTTG ACCATTTGAA CCTCGCACGT TTGGATCTGC TCCTGCATGC AAGAGCAGAC 6420
GCGCAACAGT TCGGTTCCCA ATTTGAGCTG CTTCCATCAG CGCAGAATAC GCGCGATCGT 6480
CAGATAACTG ATCTACTGGc gCACCGCGCG CAATAAGTTG CGCTGCCATC TCATCCTGAC 6540
CCTCCCGCAC TGCCAAAGAC AACACAGsGT ACCGCGTGCG TCTTTCAACG CAGCGCTAAA 6600
TCCTGCATCC AAAAAGAGAT TGACAATATC AATATTCCCA TCCATGACTG TCGCGATGAA 6660
ATTTTCTTCA AAACATGGAT AACCGCGCTC TAACAGCGCA GTGCGTGCGA CACGCTTCTT 6720
TTTCTGCCTT ACAAATCTcT CGTGCTCGAC ACGAAAGAAA TCCTcAAACG TCTCCTCCTC 6780
AAGTAAAAAG ACCAAGTCGC GAAATACATG GATATCCCTG ACCTCCGTTG TTGTAGCCAA 6840
GAGCAGCACG TGCATACCAC GCCCACAAGC AACTCCAGAA AAGAGnATAA AAGCCGGATC 6900
GCGCATGGGC TCATGCGGTA CAAAAAAAAC ACATGCGTTG CATCCTGTAC CAACGCCAAA 6960
GGGTAACTGG CACGGTGGAT GTGGTC 6986
(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:
ACTGTCTGCC ACGCTACACA CACTCACACA TCAGTATCTT AAATACGAAG AGTGTTCAAA 60
ACAGCTTGCA CAAAAGACGC AAGAAAGCGC AAAGCTTATA ACTCTTTCAG ATGAACTGAA 120
TGGGATAAAC CAAAAAAAAA TACAATTTGA CGCATGGGCA CTCATTTCTT TTCTGCACGA 180
AATTACTGCC TACGCAAACA TACGTTTGCA AAAAATGAGT GAAGGACGTT ACCATCTGAG 240
GGTAGCTGAC TCGCACGTCA ATGCACGAGG ATATCAAGGA CTTGCGCTGC TCGTTGCAGA 300
TGCGTACACT GGGAGCGTGC GCCtTCGGCA ACACTTTCAG GAGGCGAAAC CTTTATGGCC 360
TCTATCAGTC TTGCACTTGG TCTTGCAGAT TCTATCCAAA CCCGATCGGG AGGTATTGTG 420
CTTGACTCGC TGTTCATAGA TGAAGGATTT GGAAGTTTGG ATGAGGCAAG TTTAGATAAG 480
GCAATTGGCA TCTTAGATGA AATCAGAGAG GGAAGTCGCA TGATAGGCAT CATTTCTCAT 540
GTTCATGAAT TGCGCACGCG CATCCCTCAC AAAATTCTGA TAAAAAAAAC AAACGCAGGA 600
TCACACGTAA TGCAGGGGGA TGCAGAATGA AAACGAGCGC GCTCTTTCTT GATTTTTACG 660
AATTGACTAT GGCGCAGGGA TACTTTTTTC ACAAGCCGCA CGAGTGTGCG tGTTTGAAGT 720
ATTCTTTCGT AAACACCCCT TCGCGGGAGG GTACTCCATT TTTGCAGGAC TCGATCCGCT 780
CCTGACGGCA ATAGAGCAGT TCCGCTTCAG TGGAGAAGAT ATCGATTATT TGCGCACCTT 840
GCACTTATTT CATGATGACT TTTTGTCTTA CCTTGCTTCC TTCCGCTTTT CAGGAGATAT 900
ACACGCGCTA GAAGAAGGTT CAGTAATATT TCCTCACGAA CCGATCATCC GCGTGCACGC 960
GCGCTTGGTT GAAGCACTTC TGCTTGAAGG ATTGATACTC AACACCATTA ATTTCCAAAG 1020
CCTCATCGCA ACAAAGACTG CACGGATGTG GCGCGCGTCA GGTGAAGGTG TTCTTATGGA 1080
GTTTGGCCTC AGAAGAGCAC AGGGCTATGA CGGCGCGTTG AGCgCCACaC GCGCTGcTGC 1140
AATAGGTGGC GCAACAGGGA CAAGCAATAC ACTTGCTGCA AAGCTcTACG GTATTCGGCC 1200
AATGGGAACT ATGGcGCACG CgTGGGTGAT GTCTTTtGAC AGTGAAGAAG AGGcCTTCGA 1260
ACGCTATGCT GCACTCTATG GAAGCGCGTC CGTATTCCTC ATCGATACGT ACCATACCCT 1320
GGG 1323
(2) INFORMATION FOR SEQ ID NO: 119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3076 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
TACTnCTACT TCCATCCCCT CCAGAACTCC TCCCgGATAT TCCtTCATCA GTACATGGAA 60
TGCTCCCCTA CCGCCTTGCA CCAAAAATGC AAGACCCGTA GCTTCGGTAT ACCGCTTAGC 120
CCCGATACAT TATCTGCGCA TGCGTACTCG ACCAGTGAGC TATTACGCAC TCTTTCAaGG 180
AATGGCTGCT TCTAAGCCaA CCTCCTGGCT GTCCAGGTAC CCACACTTCA TTTCACACTC 240
AAGCGGTATT TCGGGACCTT AGCTGACGGT CTGGGCTGTT TCCCTCTCGA CTACGAACCT 300
TGTCGCACGC AGTCTCACTC CCACACGTTG ACTGCCGGCA TTCAGAGTTT GATTGGGTTT 360
GGTAGGCGAT GAAACCCCCT AGCCCATCCA GTGCTTTACC TCCGACAGTT TTGTATAAGG 420
CTGTCCCTAA AGGCATTTCG GGGAGAACCA GCTATCTCCA GGTTTGTTTA GCCTTTCACT 480
CCTAGTCACA AGTCATCCAT ACCTTTTTTA ACAGATTATA GTTCGGTCCT CCACAAGGCT 540
TCACCCCTGT TTCAACCTGC TCATAACTAG ATCACCCTGG CTTCGGGTCT ACGACGTACA 600
ACTCACCACG CCCTTTTAAG ACTCGGTTTC CCTCCGGCTC CAGGACTCCT ATCCCTTAAC 660
CTTGCTGCAC ACCGTAACTC GCAGGCTCAT TCTACAAAAG GCACGCTACC ACCCTCACAG 720
GCTGTAACAT CTTGTTGGTT TACGGTTTCA GGTTCTATTT CACTCCCCTC ACCGGGGTTC 780
TTTTCATCTT TCCCTCACGG TACTTGTCCA CTATCGGTAG TTGTCGAGTA TTTAGCCTTA 840
GATCGTGGTC GACCCAGATT CCGACAGGAT TCCTCGTGTC CCGCCGTACT CAGGTACCGC 900
ACCAGCAGGT CCGCCCCATT CCGCATACGG GGATTTCACC CTCTCTGTCA GGCTTTCCCA 960
AAACCTTTCT GCTATAGGCC GGATTATTTC ACCCACCGAA CGCAAGCCCG CGCGGCCCTA 1020
CAACCCCTGT TAGACACAGG TTTAGGCTCC TCCAATTTCG CTCGCCACTA CTTTCGGAAT 1080
CTCTCTTGAT TTCTTTTCCC AAGTTACTTA GATGGTTCAG TTCACCCAGT TTCGCCTTAC 1140
CCTCCCTATT CATTCAGGAA GGCAATGACA AGGCTTTACC TGTCGGGTTA CCCCATTCGG 1200
TCATCCCCGG ATCACAGGAC ATGTGCTCCT CCCCGAGGCT TTTCGCAGCT TATCACGACC 1260
TTCATCGCCT GACAACTCCA AGACATCCAC CGTAAACCAC TATTCGCTTG ACCATATTAT 1320
CCATCCCTTC TCAACTTCAC ACCCCACCCT AATACTCTCA AAAATCACCT ACCACCTACT 1380
CCTTACCCCA TAAACAAAAC AAaGGGACAT AAaGAATAAT AGTGGGCTTT CCCTGGAGAT 1440
AgGGGACTCG AACCCCTGAC tACGACCTGC AAAGCCGTCG CTCTAGCCAG TTGAGCTATA 1500
CCCCCTTTTC AAAAGGGAAG GGGAGAGACT GCCGTGCAGG AGCAGAAAAA CCtTaAGtGG 1560
CTTCCGCCAC ACGnCGAACA CGGCACCATG CCATGCCCAT ACCCTTTcTC TTAGAAAGGA 1620
GGTGmyCCAG CCGCACCTTC CGGTACGGCT ACCTTGTTAC GACTTCACCC TCCTTACCAA 1680
ACATACtTCG GCACCGCCCT CCtTGCGGGT TAGGCTAGTG ACTTCGGGTA TCTCCAACTC 1740
GGATGGTGTG ACGGGCGGTG TGTACAAGGC CCGGGAACAC ATTCACCGCA CCATGCTGAT 1800
GTGCGATTAC TAGCGATTCC AACTTCATGA AGTCGAGTTT CAGACTTCAA TCCGGACTAC 1860
GATTGCCTTT TTGCGGTTTG CTCCACTTCA CAACCTCGCA TCGCTCTGTA GCAACCATTG 1920
TAGCACGTGT GTAGCCCCGG ACATAAGGGC CATGATGACT TGACGTCATC CCCACCTTCC 1980
TCCGGTTTGT CACCGGCAGT TCCGCCAGAG TCCCCAACAC CACTTGCTGG CAACTGGCAG 2040
TAGGGGTTGC GCTCGTTGCG GGACTTAACC CAACACcTCA CGGCACGAGC TGACGACAGC 2100
CATGCAGCAC CTGTCAAGAG GCGTATcGct ACGCCACCGC ATTTCTACGG CGCTCCTCTT 2160
GATGTCAAAC CCGGGTAAGG TTCCTCGCGT ATCATCGAAT TAAACCACAT GCTCCACCGC 2220
TTGTGCGGGC CCCCGTCAAT TCCTTTGAGT TTCACTCTTG CGAGCATACT CCCCAGGCGG 2280
TACACTTAAT GCGTTCGCGT CGGCGCCGAG ACTCATGCCC CAACACCTAG TGTACATCGT 2340
TTACTGTGTG GACTACCAGG GTATCTAATC CTGTTCGCTC CCCACACtTC GCACCTCAGC 2400
GTCAATCATC GGCCAGAAAC CCGCtTCGCC ACCGGTGTTC TTCCAAATAT CTACAGATTC 2460
CACCCCTACA CTTGGAATTC CGGTTTCCCC TCCGTGATTC TAGACCAGCA GTACCCAGTG 2520
CAGTTCCCAA GTTGAGCTCG GGGATTTCAC ACCAGGCTTA CCAGTCCGCC TGCATGCCCT 2580
TTACGCCCAA TAATTCCGAA CAACGCTCGC CCCTTACGTG TTACCGCGGC TGCTGGCACG 2640
TAATTAGCCG GGGCTTATTC GCACGACTAC CGTCATCAAA CGGGCATTCC CTCCCGTCCT 2700
CATTCTTCGT CGGCAAAAGA ACTTTACAAT CTTTCGACCT TCtCATCCAC GCGGTGTCGC 2760
TCCGTTCAGC TTTCGCCCAT TGCGGAATAT TCTTAGCTGC TGCCTCCCGT AGGAGTCTGG 2820
GCCGTATCTC AGTCCCAGTG TGTCCGGTCA CCCTCTCAGG TCGGATACCC ATCGACGCCT 2880
TGGTAGGCCA TTACCCCACC AACAAGCTAA TGGGTCGCAG GCTCATnTCT GAGCGAGGCC 2940
GCAGCCCCTT TCCTCTCAAA GACTACGTCC AAAAGAGCGT ATTCGGTATT ACCCCCTATT 3000
TCTAGAGGCT ATCCCCATCT CAAAGGCAGA TTACCCACGC GTTACTCACC AGTCCGCCAC 3060
TCTAGAGAAA ACGAAA 3076
(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1091 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
CnGATGTGCG TAAACGCCTG GAGACGCTCA AGAACACGCA ACGTCAAAAG GGCAGAACGC 60
CGCGCCGCAC GgTCGAAAAC TCTTTATAAC TCATAGAGAC ATCCTGCTCA GCACCCGCAC 120
CGTCGGATAT ACAGCGTATA ATAACAAAGG GGACGCCATT AACTGACGCT ACATGCGCAA 180
AGGCTGCCCC CTCCATTTCC ACACCATGCG CACCAAATTC GCGTATAATG CGTGCACGTG 240
TTTGCGCATC TGACACGAAA AGATCCCCTG AGGCGACGCG CCCTTCGACT AAACGAGAAA 300
CGCGAGAAGG CGGATCCCCC GAGCCAGAGA GCGCACAAGC ACCCTCGGTC CACTCCGGAT 360
CCCGTGTGCA GAGATCAAAG GCTTCCCGCA CCAAATACCG CAATGCCGTG TTCGCAGTCC 420
ACTCTACAGA ATCCATGCGC GGAATACGCC CTTTCTGGTA ACCAAAAGCG GTAAcGTCTA 480
CATCATGCTG CACTGCATCG ACAGAAACTA GCACATCAAA AACACACAAG CGCTCATCGA 540
GAGCACCTGC AATTCCTGTA TTGATAAGCA CACGCGCACC AAACTCCGAA ATGAGTAGTT 600
GAGTGCAAAG CGCTGCATTC ACTTTCCCAA CACCGCCGCA CACATACACC ACCTGAAGCG 660
CACCCACCGA CACAACATAG AACGTGAGCC CTGCCCGCTC TGTACCTACT CCCCCGAGAC 720
ACTCACGTAC GCGCGCAACC TCCTCTCCCA GTGCAGCAAA AACGCCGACC GTCACGCACC 780
CTCCCCGTGA AAAACACGAA AACGCGCACT CGCAACCCAG GCACGGAAAA AAGCTGTCCC 840
TTGAAGGTCA GGAAAAAGCC CCGACCACAA GGCACACCGA TAAATGAACG GAATATAGCA 900
GGGAGAGGAC TCGAACCTCC GGCCTCCGGG TTATGAGCCC GACGAGCTGC CAACTGCTCC 960
ACCCTGCGGT GACGCACAGA GCGTACCACG ACTAGAGCCC GAAGTCAAGC CACAAAGCAG 1020
GACGCTCCGC CCCAGCTTGA AGCGGAGCCT TACAATCATA CATACGACCA GAGGATACGA 1080
CACGCAGTTT A 1091
(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19186 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
ACGACCTCGC CTGCAAAATC GCTATCCTTT CCCCATCTTT ATAGGTGAAT AGACCCGGCC 60
AGGGAGTAAA GGCCCGAATC TTGCGCTCCA ACACAAcTGC AGGATTAcTC CAGTCCGCCA 120
aTCCCaTCTC CyTACAGAGC TTACCACAAA aTGTTGCCTg CGAgTGaTCC TGtGCTGCAG 180
GGGCGAGCGT GTGCCGCTCA ACACCGACTA AAACATCATC CACAAGATCG GCAGCCACCA 240
AAGACAAACG AGAAAGAAGC GCGCCAGTAG TCTCTGTACC GTCGAGCTGA ACACGGGAtG 300
CGCAAGAATG TCCCCTGCAT CCATCTCCTC ACCAATGTAC TGGAGCGTCA CACCCGTCTC 360
GCAGTCCCCC GCTAAAATCG CCGCAGGGAC CGGTGTACTC CCTCGCCAAC GCGGCAACAA 420
CGAAGGATGA ACGTTAATAG CACCGCGCGG GAAAAGCGCA AGGAACCGAG GGCCAAAAAT 480
CTTACCGTAG GCAAAACACA CCAGCACGTC CGGACGCAAA GCCTCCACCG CGTCATAAAA 540
GGCGCGATCC AAACGCCCCG GAACGAAAAG AGACGCAGAC TCAGGAAGAA CCCCAGACGC 600
TTTTAAGCGA AAAAATTCCC TTGCAACCGc AGAATGGaCA AGCTTcCCCG aGCGTcCGaC 660
AGCAGCAGGA GGATTCGTcA AAACCCCCAC AACCCGaTGA GCGcACGCCA CCCGGCGCAA 720
AGAAGGCACG GcACACTCTG GAGTTCCCGC GAAGAAGACC CTcACCATGG GCTAACGAAT 780
CGTCCCGCGc AGCGCgnCAT ACCGACGAAG GGCATCATCC CTCCGCTTTT CGTCAATGCG 840
ATCCAAAAAA AGAATACCGT CAAGATGATC GTACTCGTGC TGGATCACTC TTGCCAGAAT 900
ACCATCTGCA TCAACGGCAC AACGTTTACC ATTCTCGTCG AGGTACTGCA CACTCACCCG 960
ACGCGGACGC AACACCCTTT CATAAATGTG AGGAATGCTC AAACAGCCTT CCTCGTAGGA 1020
AGATTGCTCT TCAGACGCGG CAGTGATCTG GGGATTGATA AAAGCGCGGA CGTGGTGCTC 1080
AACATCTACT ACAAACACGC GGACGGTACG CCCTACtGCG GCGCCGCAAG CCCCACACCG 1140
CCTGCCCCAC GCATCACACG AAACATACCC GAGATGAACG CGCGCAGCTG CTCGTCCACC 1200
TCCGAAACCG GCTCGGAAAC CGTTGTCAGG CACGGCTCAC CTAAAAACTT AAGCTCCACC 1260
GCCTCTTCCC GCTCTTACCC AAGAGAAACT CATACCCGCC TAGTACAGCC TTGAGTCGCT 1320
GACCTTGCGC CCCTTCTTCA TCAGtTGCGC CGCAACGCCT TCTTCTTGCG ATTCAGAAGC 1380
GTGGAAGGCT TCTCATAAAA TTCTTTTTTC TTCCACTCGC GAATAATACC TTCCTTCTCC 1440
ACCTGGCGCT TGAAGCGCTT AATTGCCTTC TCTAAATTCT CAGAATCATC CACCGTTATG 1500
TGAGCCACCG ACCCTCCCCT GAAAAAAACA CAGCCTACCA GGGGTAGnTA GCGCAAAAGA 1560
CGTCCCCTGT CAATCGTTTC TTCCGAAAGC ACTCCTGCCT CCCCTGCACG CACACCTtAC 1620
GTGTCTTGCT TATGCAACCG CTGACCTTCC TAACGTGCGT CTACTTGTGG TACAGTGCAG 1680
CGCTGTGCGT AGAAGTCAAA ATTGTCCGCA CCGACCTCCT TTCGAGTGCA TCCGTGGGAC 1740
GCTAGCCTAC CCTGCACAGG CGTCCACCCC TCCTGGGAAG AGGCCCGACC TGCCGTGAAC 1800
ATTACCCGAA GCCGCGTTGC GGTCTTTATC TCCTCCCTGA CCGCTGTGCT TCTGCTGCTC 1860
ACCGTTCAGT GCGCGCGGTA TATGCTCATG CGTGGCAACG AGACAAAGGA ACTGAACACA 1920
CTCACTGAAC GCGGCGCGAT CTTGGACCGA AATGGCCGGT TTCTTGCCGT TGGAACCACC 1980
GTCTACAACC TCAGTGTTAA CAAAAATCTT GTCTCAGACC CACGCACTGC AGCCCACGTG 2040
TTAGCACAGG TCCTTGACCT TTCAGAACAA GATATTGAAG AAAAATTCCG CACCGCGCGC 2100
GCTCACTTCT TTTACCTCAA GAAAAAAATG AGTGAAAcGG AAAAGAACCT TGTCGCTCAC 2160
GCTCTTAAGG AGCACTCCCT GAAAgGATTT CGCCTAGAGG CAGTGCGCAA CCGCATCTAT 2220
CCAGAAAGTA GCCTAGCGTC CACGGTCATC GGATACGTAG GTGATGACGG AAGGGGACTG 2280
AGCGGTATCG AGTACACtTG CAGGATGTTC TTTCTCCTGC CCCGTACCAC ACCGGGTATA 2340
CGGGCAAGGG GCATACTGTC ACCCTCTCGA TCGACCGAAC CATCCAGTAC ATGATGGAAA 2400
AAATCGCAGA TACTACGCTC CGGCGTACCC AGGCAGAAGG ACTCATGTTC CTTGCGGTGG 2460
AGGCAAAGAC AGGTCAGATT CTATCCTACG TCAGCAAGCC GTCTGCTAAC CTTTCACACT 2520
TTTCCCAAAG TACCCCTGCC GAACGCTTCG ATCGCCCCGC CCTTTTCATC TATGAGCCTG 2580
GCTCTGTGTT CAAAATTTTT TCCATCGCTG CACTGTTAGA ACTCGGGGTA ACTTACACCC 2640
ACGACACGCT CCACTGCGAC GGTTCCTTCT CCTTTACCTC CCCCTTTTTA AAACCAGGTC 2700
AAAAAGGCCA TCTCATCCGC TGcCTGCGCC CACACGGcAC CATCAGCGcT GaAGATATCA 2760
TCCGGcTTTC GTGTAATGaC GGcATGGcAC AAATTGcTGa ACGTGCCGAC AACCACAGCT 2820
TTGAGCAACT ATTGCGCGCT TTTGGATTTG GCGCGAAAAC AGAAATTGAG TTGCCGGGGG 2880
AAACCGTCGG TCTCTTCTCT CCCTCAGAAC GCTGGTCCCA CCGCAGCAAG CACACCATCG 2940
CAATCGGCCA GGAAATTGGC GTCTCTGCCT TGCAGtTGTG GCTGCCGcTA CCGCGCTCGC 3000
CAACGAGGGC GTACCGCTCG GCCTCTCCCT CCTCCATGAG GTCACTACCG CCGAAGtACC 3060
GTGGTGTACC GGCACAAAAA GAAACCCAAA ACACGCGTTA TCTCCGCAGT AAATGCGCAA 3120
AAGGTGTTGC GATACATGCG CACCGCCGCA GAACTTGGCA CCGGGAAAAA GGCGCTCGTA 3180
GACGGGGTGC CGATCGCAGT CAAAACAGGC ACTGCGCAAA TGGCGCACAG AAATGGTCGT 3240
GGGTACAGCG ACACCGACTA CCTTGCAAGT TGCATCGGCC TTTTCCCCGC GCACGATCCA 3300
GAAATTATCT TATACATTGC CATCATCCnt CCTATCGGAC AAGCCTATGG AGAGCTCATT 3360
GCAGCGCCTG TCATCTCTCA AGCGGCAAAC GAGATTATCG ACTACCGCGG TATGGTCCGT 3420
GCCAACGCCC CGTTAATCCA ACATAGCGGT CTCATCCATA CGTCAGAACG GACACCTCCA 3480
CGGTTAGGAA CCCATATGCC AGACCTCACC GGTCAACCTA AACGTTTACT CCTGGATATT 3540
GCAAAACGCA CCGACGTGCA CCTTGTCcTT ACAGGAGAAG GTTATGTGTA CGAGCAGCAT 3600
CCGCCTGCGG GCACACCTCT GACAAAAGGA ATGACCATTG AACTCAAACT CAAATAAAAA 3660
GCGAGATCCC GCGCGTTTCC CGGCCGGTGT TGCGCAAGGC TGCAGTACCA CACGCGCAGG 3720
GGATCTGAAA CACAGAAGAA AGCATCCCTT TGAGAAGTTT TTTGCGCAAA ACAGCGCTCT 3780
TTTTGCCCAG CGGTTTCCAG ATCTTGCACG TGCGCTGACA CTTCCAAACG AGCAGCTCCT 3840
GCAACGCATC CCTCCTGATT ACCTCCTTGC AGCGGCCCAT GACGGAGACG CAACGCTTGC 3900
AGTACGGGGC ACCTATCTCC ACTCAAAATA TCGGCCGCGG CAAGAGGCTG CACGTCTTAT 3960
CAGCCAGGAT TTTTTTACGC ACGCGATTGC AAAAGGCGGC TATGTAGGTG CAGGTTTAGG 4020
TCTTGGCTAT GTAGCAGAAC TGTACGCGCA nAGCCACCCT ACGCACACGG TAGTGCTTAT 4080
CGAGCCAGAT ATATTCGTGT TCCTGCTTTT TTTAGCCAtA GACyTyTCAC TCCCCTCcTC 4140
CGACACGAAC GTCTAAAAaT ACTGCCTGCa CAGACGGTAC CGGATGTGTT GCAGTTCyTG 4200
CGCGCCACGG GGGATGTGTC TCTCCCCCTG TTCCATTTCT TACCAGCCCA GGAGCTAAAC 4260
ACCGCGTGGT TTCACGATTT TACCCAGGCG CACCGGCACG CAACTGCACA GGCAGAAACG 4320
AATAAGCAAA CGCTGCAACG GTTCGGTCCG CTTTGGATAC GCAACACTAT AAAAAATGCC 4380
GCACAGCTGT GTGTGCGCAC GCCCGTAAAT GCGtGCGCAA tGCGCGGCAG GGAGTACTGC 4440
ACTCATCATT GCAGGTGGCC CTGGGGTGGA TGCGAGTATA TCGTTCCTTC CGTACCTAAA 4500
AAAAAAACAC TGATTATTGC GGTTGAaTAc CGCGTTnTGg CGTGTGCaGC GCGCGCGTCA 4560
CCCCGGACGT TGTGCTTCTG TTTGAtCCAC AATACTGGAA CTATTTGCAC GTAGCACGTG 4620
CGGTGGCGCC CCATGCACTG CTTATCACTG ACATTTCTGT GTTCCCTGCa GTGTTTCGAC 4680
TCCCGTGGGA GTACATTGTC CTTGCaGGTT CGGCATGCCC TTTCGCTACA TCCCTTGCGC 4740
ACCATGATGC CTCCTGCTCT CCTTCCCACG CTCTTCCTCT CCCTTTTGTC GCGCCCGATC 4800
TTCTCGCCTC AGGTGGGTCG GTTGCTACAA GTGCGTGGGA ATGTGCCCGC TACTTGGGAG 4860
CGACAACCAT AGTGTACATA GGGTTGGATC TGGCATTTCC TGGAGCGCGC ACGCACTTTC 4920
GCGGAGCGCT GTTTGAGGAG CGGGCACACC TGCAATCGGG TCGAGTGGCT CCTGCCGAAA 4980
CTACCTCCTT CTGTGCACTG CACTCCCTGC CCCTATATCC TGTCCCGGCT GCATCTGACC 5040
CCCATCCAGG AAAGAACTCC CCTGCTTnCC CTACCGGGGA GAACAAAGAG CAAACGGTGC 5100
TAACCGaCGC GCGTTTTTCA CTGTACGCCG TGTGGCTCGA AGCACACCTT GCGCGTTATA 5160
CGCACATTAA AACGTATGCG CTTGAGCCTG CAGGAAGGAG AGTTGCAGGC ATCACGCCCT 5220
TGCGCTTTTC CCAGCTTGTC ACGTTACTCA ACCGGTCTGC AGCTGTGCCG CACTGCCGGA 5280
CCATGTATAG CCGCCGACGC AGGCTATACT CCCGGTATAG AAACAGCAGC GTGCAGGTCA 5340
ATTGCACCGA TGCGGTGCGC TGGAGGCGCA CGATGAAGAG TACACCTGTG CACTGTCTCG 5400
GTCATTTCAC GAAGAAAAAG AAATCCGCAG AAGGTGCACG GAACCTATGG CGTGCTCTCA 5460
GACGTGCGGA TACGCGGTGC GCTTCTCCCC ATAACCTCGA GCACGCACTT GAGAGAACTC 5520
GTTCCTTCCT CAACGCGATG CCACTGACAC CCACAACGTA TGAAAACAAA ACACATGCAT 5580
TGTACACAGC GCTGTGCACG CTCCTCCCGA CCGAGCCTAC CTACCGCGCA CGGGCACACG 5640
CACACCTCTT TGAACTCCTC ACCCGAACAC TCAAGTTTTG CGCCGCCTAT ACCGAGGAGG 5700
AGGGGGAGTG GCGCGAGGCC tGACATCCGG GGCCACTCAA AGACGTGTGT GGGGCgCaGG 5760
GGCaGGGCCC GTGCGTTCCC CCTTTCAGGG AAAGAGCCAA GCGCGTGCGT CCTGCATCAG 5820
TTTTAGAAAC ACCAAGACGT CTACGTTTTG CTTCTTGCGA TTCTCGTAGT CCGCCCAGCG 5880
GCGGGGCGTG TAGTGTGCAT GGACGTCTGC CCTCACCTGT GCTGCATGCG CACAGCCGCG 5940
CGCAATCAGC TGGGCGTCGT GCGCGGGGGC GCGCCTAAGT TTTTCAGGAA GCACTTCTGA 6000
CAGAAGGTAT GCACTGAGTA CGCGAATCAA CGGtGACGCT TGGTGTGCAC CAACGTGAGA 6060
AGATGCGCGT GTGCCTGTTC AAAGTCGCCG CACAGGTGAC ACGCGAATGC GAGGTAAAAG 6120
CGAGTCCACG GCTGGACGGG CGCCTGTGTG GCGGCAACGC GCGTCATAAA GGTACGCAGC 6180
TCCGAGTACT TTCCCGCAAG GAGCTTGCCA ACGGCAAAGG TGAGCGCGCA CTTCATGATA 6240
TACCGAGGGC TGCGCGTGCA CAAAATGTTC TAGTCGCTCA AGTGCTTCAA AGTCACTTAA 6300
AAGGATCAGT GACTCTGAAA GGAGAGACAC CCTCCGCAAG GCAACCCTAT TGCGGGTGAA 6360
CACTTCCTCC TCCAAAAGAG CTGCCAATGC CGACCAATCT TCCGCATCCA GGTATCGAAG 6420
CAGTCTCCTG TTTTTAACAT AGAGTGTATT TATCAGCAAA AAGAGTAGGA GAAAAATGAG 6480
AAAAAAAATC CCATACGCGC GCACAGATTG CCAGCCTGTA CCCCGGACAG GAGTGAGAAA 6540
CAGAACATAT CCGGTACCCA TACACAGCAG CAGGAGTGCA CCGTTAAAAa GCACTACTAA 6600
TGCTTTGATC TTCATGGGAA aTCTCCTATA CTGCCTcCCC GCCTGTGCGG GGCGCCATTC 6660
TACTATGGGG GCGAGCGAAA CGTGAATACG TATGCAATTG TAGATATACC CGAGGAAACC 6720
CACTTCAGCG ATGACGTGTA TCTTGATGAC ACTTTTTTAG TGCTGACCCC CGCAGgTCTC 6780
TTTGACGAAG CGCTAAAGAG TCGACTGCAG GAGTGGGACT TCACCACGGT GCACACCGGc 6840
GGACAACTGC TTTCTTCTGA GACTACACTG GAGTGTCCTC TGCCTGCATC CGGCGATCCA 6900
AACGCGAAGG GTGCTCCCGG ATCCGAGGCC TTGCGTGCAC AACGAAGCCG TTTTCACGCA 6960
CTGAAGAAAC AGTACAACGA GTTTCAAATG TTCGTTGAGC ACGTTTTTGA TCAGTATCGT 7020
GCAAAGCAGa GCCTGAACAC GCGCGCGGTC ATTGACCGTG CGAAGATGCT TTGCGAATTG 7080
GTAAAAAAGC ATCGGcAAAC GCTACTGCGC GTCCTTCCCA CCATTCCCTA CCGGGAGCAC 7140
TACGCGCTTG AGACACATGC ACTCCGTTCG ACGTTGTATG CTGTTGTCAT TGGGTTACAG 7200
CTAAAAATGC AACCGTTCAA GATCATtGAG CTCGCAACAT CTTGCCTGCT TCACGAAATT 7260
GGCATGGCGC GCGTTATCCC GAAGCGTACA CCACGGAGGG GGAGTTAGAT CCAAAAACGC 7320
AGAAGGCGAT CTTTGCGCAC CCTATTATCT CCTACCATAT CCTGCGTGAC CACTCGCTCC 7380
CTCTGCCGGT GTGTGTTGGG GCACTTGAGC ATCGCGAGCG CGAAAATGGG CTCGGCTACC 7440
CGCGCAGGCT CGTGGGAGAA AAGATTTCCC TCTATGGCAA gAtAAtCGCA GTCgCtTGTT 7500
CtTACgAGGC AGCAACCGCT CCGCGCTCAT ACAAGGAAAT GAAAAACGCC GCTGAGGGCA 7560
TCGTCGACCT TGTGCGTAAC GCGAACACCC AGTACGATGC AGTCATTTCG CGCGCGCTTC 7620
TCTTTGCCCT GTCGTTTTAT CCTATCGGGA CCCATGTGCA CTTATCAAAC GGGAAAATCG 7680
CGCAGtGGTG GACGTTAACC CCGACGACCC ACGCTTCCCC ATCGTGCAGG TACACGGAGA 7740
AaTTCACCGC AACGGCAAGC CCATCATTCA CAGCACGAGC GCAGACGAAA TTTTTATCAC 7800
GCGGGCCCTG AGCGTAAAGG AACAGCGTCT TATCCTTCAG GAAGGTGGCG ACTGACCCTG 7860
CGCAGGGcAG ATGCCCCTTC CTGCTGCTGG CTTAGACGCT TTTTTAGCCC CTCTAGGTCG 7920
TGTTCGCAGC GAATAACTTC CTCAAACTCG TCGCGCTCGA TGGTTTCTCG CTCCAAAAGg 7980
CGCGtTGCGA TGTACTCAAG GAGCTCTTTT TTCTCCGTCA AGAGTGCTAC CACCGCGCGG 8040
TAGgTcCAGC TAGAACGCGC GCCACTTCCT CATCAACGTA CTGCTGTGTG CACTCCGAAT 8100
ACTCGCGCGC TAACTGCGGC TCCGCGAGAT ACCCGGTTCC GCGGCGAGTA AGTGCAAGgT 8160
TTGAAACTTT TCGCTCATCC CATAATCTGT AATCATCTTG CGGACAATGT CTGTTGCGCG 8220
AGAAATATCG TTCCCTGCAC CAGTTGAAAC TTCCCCAAAG GCTACAAATT cCGcTGCGnk 8280
TCCTGAAAGC AGCACATCTA CTTCTGCCAA CAACTGCTGC TCCGTAACAA TATGCCGATC 8340
GTCTTCAGGA ATGTGAAAAG TATATCCAAG CGCAGAGGTG CCCCGGGGAA TAATTGTAAT 8400
TTTGTGCACC TTGTCTGCAC CCTTCGTGAA GGTACCTGCA AGGcATGTCC TGTCTCGTGA 8460
TACGCAATAA TCCGGCGCTC TTCTTCTCGA ATTACCCGAC TTTTTTTCTG CAATCCTATC 8520
ATTGTCTTTT CGACCGCTTC GTCCAAATCC GTTTCAATCA CCTGCGCACG CCCAGACCGT 8580
ACCGCGAGCA ACGCTGCCTC GTTCACCACG TTTGCCAAAT CAGCACCTGA ATACCCACCG 8640
GTGATGCGCG CCACTGCCTT CAAATCCACT TCTGGCGCTA ACTTCACGTT CTGCGCATGA 8700
ATACGCAGAA TTGCCTCTCT TCCCTTAAGA TCGGGCCGAT CTACGCAAAC CTGTCGGTCA 8760
AAACGACCGG GGCGTAGGAG CGCAGGATCT AACACATCGG GGCGATTGGT AGCAGCAAGC 8820
AAAATGAGAC CGGTGGTGTT ATCAAACCCA TCCATTTCTA CCAGAAGCTG GTTAAGCGTT 8880
TGTTCCCGCT CATCGTTGGA ATGGATAGCG TTCAGGCGGC TTTTTCCAAT TGCGTCAAGC 8940
TCATCGATAA AAATAATCCC TGGCGCCTTC TCCCGCGCTT GTTTGAATAA ATCGCGCACA 9000
CGCGAGGcGC CAATCCCCAC AAACATTTCG ATGAAGTCTG AGCCACTGAT GCGAAAAAAG 9060
GGCACTGACG CCTCACCTGC CACTGCGCGT GCAAGCAACG TCTTACCCGT CCCTGGGGGA 9120
CCGACCAACA GCACCCCGCG GGGAATTTTC CCCCCGATTT CAGTATACTT TTTAGGGAAC 9180
TTGAGAAAAT CAACTACTTC CATCAGCTCT TCTTTTGCCT CATCCACCCC TGCAACGTCT 9240
GCAAAGCGTG TGGTGACCTT TCCTTCTTCC ACCGCCgCAG AGCGCGCGTG TCCGGCTGAG 9300
AAAATACTGC TCCCCAGCCC GCTTACATTT GAGGCCATCC GCTTAAAGAA AAAGCGCCAG 9360
ACAAAAAAGA GGATGAGCAG CGGAAAGAGA TATTGAAATG TCTCTATGAG GTAATTGCgC 9420
TCGCGCGGCT TAATACTGTA GACCACCTGC CGCTCATCGA GCATGCTCAA AAAGGAATCG 9480
GAGAGGACAC CGATGGCATG ATAGGTAGGA GCCTCCCGCT CAGAGAGCAA CGAGAAACCC 9540
CGCGCAGAAG GCGCAGGGCG CGCGGAAGTG TACCCGACAA AGTAAGGGGA ACCGACAACA 9600
ACCTTTACGA TTTCCCCACT TGCAATGCGA TCTTTAAATT CCGAGAACGG GATGATGCGC 9660
AAAGCACGGG AAAACAAAAA GTGGTTTGCA AGAGCAAGCA GCGCACAAAG AGCAAGGAGC 9720
ACGAGCGAGA GCACCTTACG CGAATTTCTG CGCGGAGGTC TTTCGCGTGA CGACGAAGGG 9780
CCTTTTTGAG GGCGGGGGGA GAACTTAAAA AACCCGAATG GATCTGAAGA ATCATCTGAC 9840
TGTTTGTAGC GCGTATTCAT CTCAGTTAAG GCTCCCTTTT GCAGTGCGCG CGGGCACGCC 9900
GGCCCATGAT ACACGAAAAG AAGGTGAAAC GTCGACAGAC TGTGTGAGAG CCGTACGGcg 9960
CAgCTTAGCG ACGTCCGCTG GGGCGGTATG CACGGTGCTG CAAAAAAACA CACGCGCGCG 10020
GCTATTGCTT ACCCTCATGC GCGCGTTCAC CGAAACTTCA CCTAATGACA CCTACCTATC 10080
ACCCGTACCC GGGCGACAAT CGCCCCTTTC TATACGCCGC GCTTCAGCGA CGTGAGCACC 10140
TGTCCCTTAT CCGGTACCGC GCAGAGCACG CGCAGGACCT GGCACCCTTG AGAGCGTTCT 10200
TGGCGCGCAT CGAGGCGCAC GCAACCGTCA TTGGTGCACG CACGCGTGGG GACACCCTTT 10260
TTATCCTTGC CGCATCGATG CCGTCTGATG CACTGCGCGA CGAAAAACAC GCATATGTGC 10320
GCACAATCTC CTGGGAACAG GCTCCTCAGA TACTTGAGAC GTTGGAGCGT CCGCCGCTCC 10380
CTCCTTATTG CCCGCCCGTT CCTACTTCCT GTTCTTCTTC GCGTCTTATT CCCGACGTGC 10440
CGCACAACAC CAGGTCACAC GCGCAGGAGA GTTCCTACAC CTCTCGGCAT GCGCTTTTGA 10500
CGCTGCTCAT TGAGTGGCGC GCGCTCATGG TAGAGATGGA CTATTCAGTG AGAGCGCACA 10560
GGGTGCAGCG TAGTTCTGCT CCGTTGCATG AAAGACACGG CACTCTTCCC TCTGACGTAC 10620
TGCTCTTCCA AACACAGGGG GAGGTCTGCG CTCTCTGTGC CTTTCAGTTG CAGCACGTGC 10680
GCGCAGTAGG CGGGCAGCGT CATCTTATCA TTCATGAAGC GGCAGGAGGC GGGAACATTG 10740
CATGCGAACG CATTTTCTCT CTCAAGGAGA TTGATTTTGC CACTGCAAAA TTTACTGAGC 10800
GTATCCGTCG CGGGCTGTAC CAGGTTGCTG TGCATACGGC ACACGCAGAC TTTACGGTCA 10860
ACCTCATCGT TCCCTCACTC AGAGAGCAGG GCGGCGCCTA CTCCCTTGCA GAGTCTTCTG 10920
CCTTTCACAG GAGGGTAAGC GCGTAAACTG CAGATTACGC AAAACTAGCG GTTCTAGCCC 10980
GGAGGGtGAA GCGCGCCGTG TGCATCGGCA cGACCCCGCC AGAGAAGGCG GGCGCCCGTC 11040
TTCCCCGCGG TgGCGsGCTG CGCCCTGCTA CCCCGTCTGC TGTGGAGCGG GGCCTTGTCT 11100
TTTTCCTGAG GGCTTGTTAC gcTGCGCGCC AGTCCCCGAG GAAGAAGGAA TTGCTATGAG 11160
TAGAGGTATT TCTACCTTCA GGAATATCGG CATCAGCGCG CACATAGATT CTGGAAAGAC 11220
AACCCTTTCT GAGCGCATTC TCTTTTACTG TGATCGTATT CACGCCATCC ATGAGGTGCG 11280
TGGTAAAGAC GGTGTTGGCG CCACCATGGA CAACATGGAG CTTGAGCGGG aGcGCGGTAT 11340
TACCATCCAG TCTGCCTCCA CCCAGGTCCA GTGGAAGGGA CACACTATAA ACGTCATTGA 11400
CACTCCCGGG CACGTTGACT TCACCATCGA GGTGGAGCGC TCCTTGCGCG TTTTAGACGG 11460
TGCCGTCCTC GTACTCTGTT CGGTTGCAGG CGTCCAGTCC CAGTCCATCA CTGTCGACCG 11520
GCaGcTCCGC CGCTATCaCG TGCCCCGTAT CTCATTTATC AATAAGTGTG ATCGTACGGG 11580
TGCCAACCCt TTCAAGGTCT GCGCTCAGct GCGCGAAAAG CTCTCCCTTA ACGCGCATCT 11640
TATGCAGTTA CCCATTGGGC TTGAAGACCG TCTAGAGGGT GTCATCGATT TAATTTCGCT 11700
CAAAGCCCTT TATTTCGAGG GAGAAAGTGG CGCGCACGTG CGTGAGGCGC CCATTCCCGA 11760
ACAGTATCAG GCAGATGTGA AAAAGTACCG GGATGAACTC ATCGATGCGG CGTCTTTGTT 11820
TTCTGACGAG CTTGCTGAGG CCTACCTTGA AGGAACTGAG ACCGATCAAT TGATTCGAGC 11880
GGCAGTACGT GCGGGCACCA TTGCAGAAAA GTTTGTCCCG GTTTTTTGCG GTTCTGCGTA 11940
CAAAAATAAA GGTATTCAGC CACTTTTGGA CGCTATCACA TACTACCTGC CAGATCCTAC 12000
CGAGGTAACT AATACCGCGC TCGATCTGGA TAGAGCCGAG GAGCCAGTTA CCCTCTCCAC 12060
CGATGCAGAC GCACCGGTAG TTGCGCTCGG GTTTAAACTA GAGGATGGCA AATACGGCCA 12120
ACTCACCTAT GTGCGTGTAT ATCAGGGGAC TATCAAAAAA GGGGCTGAGC TTTTTAACGT 12180
CCGCGCGCGC AAGAAATTCA AGGTGGGCCG TTTGGTACGG ATGAACTCTA ACCAGATGGA 12240
AGACATCAGT GAGGGAACCC CCGGAGACAT TGTGGCGCTT TTCGGCGTGG ACTGCGCGTC 12300
GGGAGACACC TTTTGCAGTG GAGATCTGAA TTACGCAATG ACTTCGATGT TTGTTCCAGA 12360
GCCGGTCATC TCGCTTTCCA TCACTCCTAA GGACAAGCGG TCCGCTGACC AAGTTTCCAA 12420
GGCGCTGAAC CGGTTCACCA AGGAAGATCC TACCTTCCGC AGCTTCGTAG ATCCTGAGTC 12480
TAACCAAACT ATCATCCAGG GGATGGGGGA GTTGCACCTG GATGTGTACA TTGAGCGCAT 12540
GCGACGCGAG TA AAGTGTG AGGTGGAGAC GGGCATGCCG CAGtGGCGTA TCGGGAGGCA 12600
ATTAGTGCGC GCGCGGATTT TAACTACACC CACAAAAAGC AAACCGGCGG TTCCGGGCAG 12660
TTCGGGCGTG TGGCCGGCTT TATAGAGCCC ATCGCCGGGC AGGACTATGA GTTTGTAGAT 12720
CAAATCAAGG GAGGAGTAAT CCCAAATGAG TTTATCCCTT CGTGTGACAA AGGCTTTCGC 12780
ACAGCGGTaA AGAAAGGAAC TCTTATTGGT TTTCCGATTG TGGGGGTGCG CGTTACCATT 12840
AACGATGGGC AGTCTCACCC GGTTGACTCC TCAGACATGG CGTTCCAGGC GGCAGCGATT 12900
GGTGCCTTTC GTGAAGCGTA CAATGGGGCA AAGCCAGTAG TCTTAGAGCC AATCATGCGA 12960
GTGTCGGTGG AAGGGCCCCA GGAGTTCCAA GGCAGTGTCT TTGGGTTAAT TAACCAGCGG 13020
CGGGGAGTGG TTGTATCGTC AGCGGACGAT GAACAATTTT CCCGCGTGGA CGCGgAGGTC 13080
CCGCTGAGCG AGATGTTCGG GTTCTCCACC GTGCTACGTT CTTCCACACA AGGTAAGGCT 13140
GAGTATTCTA TGGAGTTTGC TAAATACGGC AAGGCACCGC AAGGTGTGaC GGACTCGCTC 13200
ATAAAGGAAT ACCAAGAGAA ACGAAAAGCA GAACAAAGGT AAGCGTAACG TGCTAGGCGG 13260
CGCGTCCTTC TCGACGCGGT GGCGAAGTCT TGAATAAGGG GGCTTTCTGG TGTAcCCTCC 13320
CGGGCCGAAC GGTACTCTCC TCACATGAGC CGAGGAGGTA TCACGTGGGA GGTTAACATC 13380
ATGAATGCTC ATACGCTTGT GTACTCCGGC GTAGCACTTG CCTGCGCGGC TATGCTCGGC 13440
TCCTGTGCCT CGGGCGCCAA GGAGGAAGCT GAAAAGAAGG ctGCAGAGCA GCGTGCGCTT 13500
CTGGTCGAGA GTGCGCATGC TGACCGTAGG CTTATGGAGG CGCGTATCGG CGCGCAAGAG 13560
TCTGGCGCAG ACACCCAGCA CCCCGAACTT TTCTCCCAGA TTCAGGACGT TGAGCGCCAG 13620
TCTACCGACG CCAAGATTGA AGGGGACCTC AAGAAAGCTG CCGGTGTCGC CTCAGAAGCT 13680
GCGGATAAGT ACGAGATTCT CAGGAACCGA GTTGAAGTTG CTGACCTACA ATCTAAGATC 13740
CAGACTCACC AGctTGCGCA GTACGACGGG GACAGCGCGA ACGCTGCGGA AGAATCGTGG 13800
AAGAAGGCAC TTGAATTATA CGAGACCGAT AGCGCGCAGT GTCTGCAATC CACCGTCGAA 13860
GCGCTCGAGT CGTATCGGAA AGTCGCGCAT GAGGGATTCG GCCGCTTACT ACCCGATATG 13920
AAGGCACGTG CGGGTGCTGC AAAGACGGAC GTTGGCGGTC TTAAGGTAGC CGTCGAGTTG 13980
CGTCCACAGC TGGAAGAAGC TGACAGCCAA TACCAAGAAG CACGTGAAGC TGAAGAGGTA 14040
AATGCACGTG CCAAAGCTTT TAGCGGGTAC CACCGTGCCC TCGAGATCTA CACAGAACTG 14100
GGGAAGGTTG TACGCCTGAA GAAGACCGAG GCGGAAAAGG CGCTGCaGTC TGCAAAAACA 14160
AAGCAAAAGG CGTCCTCTGA CCTTGCGCGG AGTGCGGATA AGAGTGCCCC aCTTCCtGAA 14220
AACGCTCAGG GTTTCTCAAA GGAGCCGATT GAGGTAGAGC CGCTTCCAAA CGACAGGCTT 14280
AACACAACGC AGGCAGATGA GTCTGCGCCG ATCCCCATAT CTGACACCTC TTCACCTTCT 14340
CGCGTGCAGT CTCGGGGTGT TGAAGACGGA GGACGTTCTC CAAAATCCTC TATGAACGAA 14400
GAAGGAGCCT CTCGATGAAG ACACGTAATT TCTCGCTCGT ATCCGCGTTG TACGTACTGC 14460
TGGGTGTTCC TCTGTTTGTG TCTGCCGCTT CCTACGACGA CAATGAATTT TCTCGCAAGA 14520
GTCGTGCGTA CTCGGAGCTT GCAGAGAAGA CATACGATGC GGGAGAGTAT GACGTCTCTG 14580
CAGAGTACGC CCGGCTCGCT GAGGATTTTG CGCAAAAATC CTCGGTCTAC ATCAAGGAAA 14640
CTATGGCGCG CACCACTGCC GAGGACGCTA TGAACgcTGC GCGCACCCGC CACGCGTGGG 14700
CGAAAAATGA GCGCATCGAT CGCGCCTATC CGACCGAGTA TTTGCTCGCT AGCGAGGCTA 14760
TCAAGACCGG AGGcTCGCTT TTGACAGCAA GCAGTACGAC GTAGCGCTCA CGTGGGCGCG 14820
TAAGGCGTTG GACGCACTCA AAAACGTAAA GCCTGAAAGT CAGTTGCTTG CAAAGGCCGC 14880
GAAGGAGGAg GCTGCGCGCA AGGCCGCCGA GGCACGAAAA CTCGAAGAAC AAAGAATTGC 14940
AGCCCAGAAA GCGCAGGAAG AACGTAAGCG TGCGGAGGAG GAAGCTGCGC GCAAGGCCGC 15000
CGAGGCACGA AAACTCGAAG AACAAAGAAT TGCAGCCCAG AAAGCGCAGG AAGAACGTAA 15060
GCGTGCGGAG GAGGAAGCTG CGCGCAAGGC CGCCGAGGAA GCAGCGCGAA AaGGCGGAGG 15120
AACTCGAGAA GGGTCGTGTG CTACCTGCGC AATACAAGGT GACTACGTGG TCCATTGACC 15180
GGGAATGTTT CTGGAATATT GCCAAAAACC CCGCCGTTTA TGGCAACCCC TTCCTCTGGA 15240
AGAAGTTGTA TGAGGCGAAC AAGGACAAAA TTCCTCAGTC CAAAAACCCC AATTGGGTAG 15300
AGCCTGAGAC AGTCCTGGTC ATCCCCAGTC TCAAGGGAGA GGAGCGCGAG GGTCTGTATG 15360
AGCCCAACGT GAAATACCGT CCTCTGCCGT AACGGATAGA CAAGAGCGTA TACGCTTTTT 15420
CCCCTTTTCC ACAAGGGTGC AAGGGGCGTG GTTGGGAGCC CATAGAGAAA GaGCTyCCCA 15480
GAGCGCTGGA ACGCTACGGT GTCCaGCGCT CTTTGTGTGT TTTTGTCTCT ACAAGAAAGT 15540
TCCACTTTTT GCTACACTTC CCTTCTATGG ACGTGTCCTT TGAAGAGCTT GGTTTGAATG 15600
AACAATgCTT GCAGCGGTGC GACTCAAGGG GTTTCGGTGC CCAACTCCCA TCCAGGCTGC 15660
TGCCATTCCC CGACTGTTGG CAGGGGATGC GAATATCATC GCAAAAGCCC GAACCGGGAC 15720
TGGAAAAACG GyCCCTTCGG CCTCCCCCTT ATCCAAGAAC TGGGAAGCCC GTGCGAACAC 15780
CCAGGGGCCT TAGTGCTTGT TCCTACAAGG GAGCTCGTGC GCAGGTCGCA AGCGAACTGA 15840
GCTCCCTGAG GATACAAAAA ATACCTCGGA TTCACACCGT GTACGGTGGG GTCTCCATCG 15900
CGGAGCAGCT GCGTAATCTC GAACAGGGTG GAGAGATAAT AGTAGGAACG ACCGGGCGCG 15960
TCATCGATCA TATTGAGCGC GGTTCTCTCG AGCTGTCTTA TCTGCGCTAC TTCATATTAG 16020
ACGAAGCGGA TGAGATGCTA AACATGGGTT TCGTTGAGGA TATAGAGTCT ATCTTCTCTC 16080
ATGCAAATAA AGACGCACGC GTCCTTATGT TTTCTGCCAC TATGCCCAGG CAGATCCTTT 16140
CTATTGCCTC TACCTTCATG GGAAGCTACG AGGTTGTTGA AGAAGTCACT CCAGAAGAGG 16200
CGCGCCCGCT CATTGAACAA TTTATGTGGG TTGTAAGGGA CGCTGACAAA ATCGAgGCGC 16260
TTGTGCGCCT TATTGATGTG AGCGACAACT TTTACGGTCT GGTGTTCTGT CAAACCAAGG 16320
CGGACGCCGA cACTGTTGCG AAATCTCTAG ACGAACGCCA TTACCATGTT GCTGCACTTC 16380
ACGGAGATAT TCCGCAAAGC CAGCGAGAAA AAATTCTCGA GCGCTTTCGT ACAAAACGAG 16440
CGCGTATCCT CGTCGCCACT GATGTTGCCG CTCGCGGCAT TGACATCGAA GGAATTACGC 16500
ACGTGGTGAA CTACTCCATT CCTCATGATA GCGCTACTTA CACGCACCGc GTcGGcAGAA 16560
CTGGACGCGC AGGATCACAG GGTATCGCTA TCAGTTTTGT ACGCCCACAC GAGACACGAC 16620
GGATGGAGTA TCTGAGTAAA CACTGTAATG GCGAATTGAA AGCTAGTACG GTACCTTTGG 16680
TGGAGCACAT CCTTACTCAA AAGGAGGGGC GTATTTTCTC GTCCCTCAAG ACTCATCTTT 16740
GCCAATTACT CTCTGAAGGG GTGCACGGAA CCTTTACCCG TTTTGCGCAc GGCTGCTCCA 16800
AGAAGACCTT AAAGCTCGCG TGGCAGAAGC CCTGGGTaCT TCCGCCGACG TTCCTCAGGA 16860
ACCGAACGTG TCGCTTGTCG CCGCGCTCCT GCAAATCCAC TACGGTACTG CGCTGGACCC 16920
CAGGCAkTAC CGGGATATTA AAACGATTAC GCCAGAGACG GCCCGCGCAC GTCCCCATGA 16980
mGCGGAAAAG GCGTATGTGC GCATTGAGTA CGGAAAAAAA AGCTACCTCA CTCGGAAACG 17040
TGTTGTGCAG TTCATCTGTG CCCTGGTAAA AATCCCCGGT CATCTTGTAG ATCGCGTTGA 17100
CATAACCGAA CGTTkCGCGT TTGCcGCATa CCCCGACGCG CAGgaGGAAg CAGTTCGCTT 17160
ATCCAAGAAG CGCAAGGACC TGCCGCGCGT TTCCTTCGTT GGGCACGCCA GTcGCCTAAG 17220
AAATACCGCT ACCCCTGCAG AAAAGTCTAC CTATCCAAGG CGCCTCCCTT CCGGAGAAGG 17280
CCTAAGGGAG CAGATCTCAA GGAGAACCTC TTCCTCTAAG AAGGCTTCTG GGAAACCGGA 17340
GGATTCTCTT CCCCCTCCCC AAGAACATCG CCTTGATTGA TGCAGCGGCT CACTGCGCCA 17400
CTACAGCATT CGTGCAAGCC AGCGCGAGAT ACTAAGGGCA TAGTTACCGA CGGCTTCTAT 17460
ACCACGCACG ATGTCCATAT ACAGGAGcTC CGCCTTTACA TCTGCACCCT GCTCAAGACG 17520
TCTGCGCACA AGTCCtTTTA GATGGGCCCC CTTGCTTTCG ATAGAGTGCG TCATTTGGTT 17580
TACGTGCAAC ACCTGCTTAT CTTCCaGTGG ACGGTTCaAG TGCGAATACA CAAAGTCAAC 17640
GCACTCATCC ACCATGCCgA CGTACGGGAC TAACTCCTCG ATATCATCAC GCTTGAGCGG 17700
TACATTTCCC TTGATGCTCT TATGGAAGTA CAACCCTATA CCACACAAAT GGTCAGTAAT 17760
ATTTTCAATA TCGTCTGCAA TGGAAAACAT TAATTGCACG TTATGTTTTG CTTTCTCGCT 17820
CAAAGAAAGA TGCGATGTTT TAATCAGAAA GCGCGAAAGC TGTTCCTGCA TTTGATCTGC 17880
ATAATCCTCT TCCTTTGTCA GGCGTGTTAC GATCTCATCA GTAGAAAGCA TACACGTTCC 17940
CTGAATGGAC TTACGGATAG TAACAAGCAT ACCCTGTGCT ATTGAAAACA TTTTTTTCAG 18000
TTCAATTtCC GCACGAAAAA TATGTGCCTC AGCGCTCTCT TTTACCGCAG TTTCTTGAAA 18060
AACAAGCTGA TACCTwTCTG GAGCGTCGTC ATACCGAGGA CGAATTAACC ACTCTACAAA 18120
CGCTGCAAGG TGCTTAGTGA AGGGAAACAC AATAATAGTG TTGACGATGT TAAACATACT 18180
GTGAAAGAGC GCAAGCCGCA CTGTGATGTT ATCAAAACCC GAATTCTTTG GAGTCAAAAC 18240
ACACAAGAGT GCCAAAACTG GATGAAAAAA CATCAAAAAA ACCAATGCAC CAAACACaTT 18300
AAACAGCACG TGGAcTGCGG CAGtCTCCGT GCGTTCAATT TACTCCCAAT GGCTGCAATT 18360
GCAGCATCAA TGGTAGAGCC CACATTACTT CCTAATACGC TTGCTGCAGC GAACTCCACT 18420
CCGaTGACAC CACCGAACGC CATAGTCAAC ACGATCGCAG TGGTTGCAGA CGAGGAGTGC 18480
AAGATGACCG TTAACACAAA GCCTGATAGG AGTCCTACAA AAACACTGAG CGCACGATCC 18540
TCAACTGCAA TTTTAAGGAA GGAAAGCTCT TCTACAGAAA GTGGAGGAAT GAGCGAAGAG 18600
AGCAAACCAA GCCCGGTAAA GAGAAGACCA AAGCCCATGA TGCTCTCGCC CAAATGTCCT 18660
TTATGCAAGT GTTTAAAAAA AGTCAGAAAA TAGCCAATCC CAAAGGCGGG GACAGCGATT 18720
GACGCAAGCT TAAACTGAAA ACCCACAAGC GCAACAATCC AAGCAGTAAC AGTGGTACCG 18780
ATATTCGCAC CAAGAATTAC GCCGATTGAC TGCGTCAAAG AAAGCACTCC CGCGTTAATA 18840
AAAGAAATCG TCATAACCGT CGTAGCCCCT GACGACTGCA CAATAGCGGT AACTGCCATG 18900
CCGGTTAGCA CCGCGAAGAA ACAGTTACTG GTCATCACTT GGAGAaTTTT GTGGAGGCTT 18960
TCTCCAGTAC CCTTTTGGAT ACCGTCACTC ATCAGCTTCA TACCAAAGAG CATGAAGCCA 19020
AGGCTTCCGA TACCCTGCAA AAGGACAGCC ACAAGGTGCA TCGGCGCCCA CCATAGCAAA 19080
AACAGGGGAT ACGTATCAAT TGTCCGAAGC GGGACACTGC GCCGTACGGA CGTATGTTTA 19140
TTAGTCAATC TCTCTTTTCT CAAATAGTCT CGCCGTGACA TCGCTT 19186
(2) INFORMATION FOR SEQ ID NO: 122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4901 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
TGGTATGTGG TGCGCGATTC ACGTAAACGA AATGTTTCTT TTCGCAGAAC AGATGCTAAA 60
TAGCTGGCGC ACAGCGATAC ACGCCATAGG GCAACTCTTT TTGCGCACGA GTGCGCACGT 120
GCCGGTGACA CCCATTCATC TGTTAGATCC TGCTCACTAC CTGGAACGCA TTCCCATCGT 180
GTTGTCGGTG CACGCGCTCT GTGGCGTTGC AGGAGGCACG CTATGCCTAT CGTTGGCGGT 240
ATGCATCATT CCTGCGCTCC GTGCGGGCCG CGTGCGTCCC CTTGACCTCA TGCGCAAGGT 300
GTGATAGGAT GCCCCCAGTT TTTTTGATTT CTTACATCCG GCTCGCAGAC GGATGGGTAC 360
CATCTTTTAG AAGAGCACGC GGCCGGTGCG GTAGGGGCTA TGGAAGAACG CAAGAACTAT 420
ATGGATCGGC TTGCCGAAGC ACTTACGCGG CGCAAGGTGC AGCTCGATCG GGACATCCTA 480
CCAAAGGCGC TGGAGCAGTA CCGTGTCCAG GTCACAGCGG TAAAGGCAAT TCGCAGTAAT 540
TTGCTGCAAA AAGGTTTCTT GCACGATGAC GCGTACAAGT ACGACAGCAA GATGACTGAG 600
ATTGAGTTGC CAGAAACTTc CCCtACGGGG AAAATGAGAA GCCGATGGTC ATTGGTTCTC 660
GCCTTTCGCA CTATCAGACT ATGCTCGGTT TTTTGGACAA CTACTACCGG TTCGATTCGG 720
AGTTTCTAAT CCCGAAACGT ATTGCAAAGC TCGTTGCGCT CAACGGTACg TTCATGTGGA 780
AAGATTTTAC TGCTACCACC AAAGACGCGA ACACACGTGG GCTATTTGAC ATAGTGCAAT 840
CTTTCTATGG CGCTGCTGAT CCTATTTCGA TAGGACTGGT GAGGGATTCG TTGCAGTACC 900
TAGTAAAAGC CCATGAGGTA ATCAGCACAG CACTTAAGTC CCTTTCGGTG TTTCATCGCG 960
AGCGCTACAA ATTGCTTATT CGCCAGCATG CTCTGGATGG CTTGGACGAG ACAACGGTGG 1020
ACGTCAACAA CCCAGAGGTT GCGCTTGACG CGATGAAAAA GAACTTTTCA GAAAATGCAA 1080
AAGGCCATCC GTTCTACAGC GAGTTGGCAA CCGTGGTTTT GAGAGAAGAC TTCTCTGCCA 1140
ACGCAGAAAA GCTGCGGGCT GCAATCCTCC GCGAGTTTGA AGAATCTTCT GCACCCAAAA 1200
GATGcCGCGG TGCTATGCGC AATCCACACG CCGTACTACT TTCTGGTTTC AGATCGCTTG 1260
GAGCTACTTC TAGCCACTTT CATACTGCTC TGGAAAAGAT TCGCTTCAAT GAGGAGCTCG 1320
TGACTCAGTC TGAGGCGGCC TTCTTTTCAA AGGTAGTGTT AGCCTTTCTC AAAGCTTTCA 1380
ACATTCAGAC GCGTTCAAAG GACGTTGAAA TTGTCGTCGT CGACCCGGCA ACACAGATAC 1440
AGAAAAAGGA ATGCGTAAAC GTTGAGCTCT TTCAAAAAGA GCTGGCCCGG TGTGTCAAAC 1500
TGTATCGGGG TTTTGTGTCT CCAGACACTC CGATTCATGA AAAGTTAATG GCGCTCAAGG 1560
ACGAGCAGct CTTCGAGCTC CTTTTTAAAC ACGTAGCAGA GGCGCATACG CTGGTTAAAC 1620
AGCTTGCAGG TCTTGATGAG TACTACAAGA CAGTGAGGTC TGATGTGCGC GCGAAAATTA 1680
AAGGGGTCAA GATTGAAGTT ACAACTATCA CCACTTCTGT AACCAAGGCA AATAAGTGCC 1740
GCGCAGAATA TGCCTCGCAA CTAGAGGAGC AAAAACATAT GAAGCGTTTA GGGGTAGCCC 1800
GTGCGTAGAA TGCGGCTCTC GCGCCGCGGc ATTCTCACGG TAGTAGGTAC CCTTCTTCTC 1860
CTACCTCTCT TTCCTTCCGA AAAAAAAAAG ACTCACGCGC CGCTCCCTCG ATCTGAAAGA 1920
AAAGAGTTTG TGGTGTCCTT TTCTCCGTAT AGGCCTGTGC TACACCCGCA CGTGGCATCG 1980
CGCGTGGACG AAGCACAGCT GCTCACAGCC CTATATGAGG GACTTGTCAC CTATGATCCG 2040
TACGATCTCC ACCCAATCCC GGCGCTCGCA CAACACTGGT CGGTAAGCAC CGATGGGTTG 2100
ACGTGGACGT TCTATTTACG AGATCAGATT TTCTTTCAAA ACGGCGACCC TATCACTGCA 2160
GAGACGTTCC AGCAATCCTG GCTCAATTTG TTAAATCCTG AATGGAATGT GCCGTATGCG 2220
TCTTTTTTGG ATGCAGTTGA GGGGGCACGT GCGTACCGCA GCGGCACTAC GGCTGACTCT 2280
CACACGGTTG GGATTCTCGT AGAGGGGTCA GACAAAAAGA CACTCGTGGT CAAGCTCGCG 2340
TACCCAGCAG GACACTTCAT TCAGATGCTC TGTCACCACG CATTTGCCGC AGTCCACCCC 2400
ACCCAACTGG CAAGCGTCGG CACGCTGCAC GCGCGTACGG CAAGCGCCTC AGCACACAAG 2460
CCGTTCCATC CTATCGCAAG CGGTCCTTTT GAATTACAAC AAATGCAAGC AGATCGCGTG 2520
GTGTTGCGTG TTAACACCCG CTACTGGGAC AGGGaCGCGC TTGCCCTCCA CGCCATCGTG 2580
GCGCTgcATT GCACAAGACC CTGCAGCGCG CGATGCGGGG TTTAACGATG GGAGCATCCA 2640
TTGGATTAGT GGAGCGCTGG AGCACAGTTC TTTGCAGGAT GCAGCTACAC TTCAGATCGT 2700
ACCGCTTCTG GCAACAGAGT ATCTGTGTTT TAAAACGGCA CATGAGCCGA CGTGCAAgCC 2760
ACGcTGCGCA AGGCACTGCT TTTAGCTACT CCGGTGGAGG AGCTTACCGC GCGCTATTTA 2820
TTTCCCGCAC GAACGCTCGT AACTCCGTTT ACCGGCTACC CGGTACCGCC TGTAGTACAT 2880
GAATACAATC CTGCGCGCGC ACGCTnTtTT AGCAGAAGCG AAGATAGGTG GGAAGACAGC 2940
CCGTACTCCT CTTAAAATTC TCGTTTCCGA CACCGAGGCG TGCCGGGCAC TCGCACTTGA 3000
ACTTCAGAAG GCCTGGACAG CCCTCGCACT TGCAGTGGAA ATCTGGGCAG TGCGGCCTGA 3060
AACGTACCGG GAATATGTGC AGGATGAAAA ATACCACGTG AGAATCGTGT CTTGGGTTGC 3120
GGACTTTGCA GATCCGATGG CGTTTCTGGA GCTGTTTAGA AAGGGATCAA AGACACACTC 3180
AACCGGATGG ACCCATGAGG AATTTGAGGC ACTGCTGACA CGCGCAGGAG CAGAACCGCA 3240
CGTGCTTCGT CGTTGGGAAC TTCTTGCGCA GGCAGAACGT ATCCTCTTAC AGGAAGCAGT 3300
TGTGCTTCcG CTTTCGCGTT TGCATGCACT GCACGCGGTA CAGCGGCGCA CGGTGCGCGG 3360
CTGGTATGCA AATGTGCTCG ATGTGCATCC ATTTAAGTTT ATCTCGTTAC AAGAAGAAAT 3420
AAAGGTCAAC CTAGACTCAT AGAGGGGCTG CAACCCGTGC ACACCCAGGT GTACCTTGCA 3480
ACGTAGATGT ACCGGCGTGT ACAATGCCCT CTGCATACAC AGAGGGGATT ATGGGGTATC 3540
CGTTTCGCGC TCTAGAGAAA AAATGGCAGG CCTATTGGCG CGACAAGCgs GTCTTTTGTG 3600
TGTCCGAGGA TGAGCGCTTC CCTCCTGAGC GGCGTGCGTA CGTGTTGGAC ATGTTTCCCT 3660
ATCCTTCAGC GCAGGGACTT CACGTCGGAC ATCCAGAAGG CTACACTGCA ACTGATATTT 3720
ACTGCCGCTA CTTGCGCATG GGTGGTTACA ACGTGCTCCA CCCTATGGGT TTTGATGCCT 3780
TCGGACTTCC GGCAGAAAAC TTTGCACTCA AAACTGGTAC TCATCCGCGC GTCTCCACCT 3840
CCGCCAACTG CGACACCTTT CGCAGACAGA TCCAGTCGTT TGGTTTTTCC TACGATTGGG 3900
AACGTGAAAT ATCTACCGCA GATCCAGAAT ACTATCGCTG GACTCAGTGG CTGTTCCTCA 3960
AACTTTATGA AAAAGGATTA GCCTATGAAG CAACCGCGCC CATCAATTGG TGTCCCAGCT 4020
GCAAAACAGG CCTTGCAAAC GAAGAAGTAA GAGACGCGTG CTGCGAGCGC TGTGGTGCTG 4080
AGGTGACGCG GCGTGGTGTC CGCCAGTGGA TGGTGCGTAT TACAGCGTAT GCCGAGCGTC 4140
TCCTTTCAGA TTTAGATGAA CTTGACTGGC CTGAGTCAGT TAAACAAATG CAGCGTAATT 4200
GGATTGGAAA AAGCTGCGGC GCGGAAATTG ACTTTCCCGT AGATGCGCCT GCGTGTTCAG 4260
TGCACGATAA GCTACCACAG ACAATTCGCG TGTACACCAC GCGTGCGGAC ACGCTTTTTG 4320
GAGTAACGTA CCTGGTACTT GCTCCCGAGC ATGAAGCGGT AACGGCGCTC ACTACACACG 4380
CACAACGCGC AGCGGTACAG GCGTACGTGC AACGTGCAGC AAAAAAGAAC GATCTCGAAC 4440
GCACTGATTT AGCGAAGGAA AAGACCGGTG TTTTCACCGG CGCGTACGTG CGCAATCCAA 4500
TCAATGATAT GCGCATACCG GTGTGGGTAG GTGATTATGT GCTCGTTTCc TACGGCACGG 4560
GGGCAGTGAT GGCAGTTCCT GCACATGATC AGCGCGACTG GGATTTTGCC ACTCGGTTTG 4620
GCTTACCCAA GTTAACCGTG GTGTCTGCAG ACTACACTGC AACAGTTCCT AATAGCAACT 4680
CCCCTCAAGG CGCGGTACTC CAAAGATGCG TCTCAGACGA GGGTTTTGTC GTCAACTCTG 4740
GAGCTTTCAA TGGTCTTGCT AGTGCCGACG CGCGAGAACG TATTGTTGCC CATCTTGAAA 4800
TGCGTGGCGC AGGTGCACGG CGCGTCACCT ATCGCCTACG CGACTGGGTG TTCAGCCGTC 4860
AGCGCTATTG GGGAGAACCC ATCCCTCTTG TGCACTGTCC T 4901
(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2257 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
CCCACCTGTC TCACCGGCCT GAACGCCGGC GTCGAAGCAC GCGTGATACA TCCCCCTCAC 60
CTACATCCGT TACAGAAATA ACGGAGGGTA CGAACTGAAT GGAGCTGTGC CCCCTGGGAC 120
TATCAATATG CCAATTTTGG GGAAGGCGTG GTGCAGCTAT CGCATCCCCC TCGGTTCCCA 180
CGCCTGGCTT GCACCACACA CATCCGTGsT CGGCACAACC AATCGCTTTA ACATTATTAA 240
CCCCGCGGGC AACCTGTTGA ATGAACGAGC GCTCCAGTAC CAGGTGGGAC TGACGTTCAG 300
TCCCTTcGAG AAGGTGGAGC TCAGCGCCCA GTGGGAACAG GGCGTGCTTG CTGACGCTCC 360
TTACATGGGC ATTGCCGAGA GCATCTGGTC CGAACGCCAC TTCGGCACCC TTGTCTGCGG 420
AATGAAAGTG ACATGGTAAA AACGCGTGCT GTTCGATTCC ACCTCCCCTA TCAAACCCCG 480
TTTTGCTCGT CTTGCCTTTG CAGTTGCAAA ATTTTGTCTC GGATGAGGGC TGCCTCTTCA 540
AATCGTAACT CACGCGCACA AACCTTCATG TGTAGGCGCA ACGCCTGTAc CATTTTTTTG 600
CGTGCAGCAT GTGTGCGCAC GTCTGCGTCT GCTGCGCGCA ACAGGGGTGC GACCTGTACa 660
cGCGCAGtCT TTTTTTACTT CCTGCTCACG GACCAGAATA TCTTCAATAG ACTTTTTAAT 720
CGTACGGGGT GTAATCCCAT GAGCACGATT ATACGCCATC TGAATCTTTC TCCGTCGAGC 780
AGTTTCCTCT ATTGCTTCAC GCATCGCATC GCTGATTGCA TCCGCGTACA TTACCACAGT 840
TCCGCGAGCA TTACGTGCTG CTCGACCAAT AATTTGGATG AGACTCGTCG TCGAACGTAA 900
AAAACCGACT ATATTGGCAT CCAAAATAGC AATGAATGCC ACCTCGGGCA AATCAATACC 960
TTCTCGTAAT AAATTTATTC CAACTAATAC CTCACATTCC CCCGCACGCA GACTCGTGAG 1020
AATTTCTACG CGTTCAATAG TTTCAATTTC CGAATGAACA TACTTTGTCC TTATTCCCAG 1080
TCCATTGAAA TAATCTGTTA AATCTTCAGC CATTTTTTTT GTCAATGTTA GCACCAAAtA 1140
CGTTCGTTCC gCGcACTACA AGCTTTTACC CGcTGaCATA TATCTTCTAT TTGTCCATCC 1200
GTTTTTCTCA CTTCGATGCA TGGATCTAAA AGTCCAGTGG GACGAATCAG TTGTTCAACT 1260
ATTTGCACAG ACTGTGTGCG TTCCTTCACC CCAGGAGTTG CAGAAATAAA AACTGCTTGA 1320
TTTAACAATG CCTCAAATTC CGAATCTTTC AGTGGACGGT TATCTCGTGC ACACGGCAAG 1380
CGAAAGCCAA AATCGATGAG ATTCTGTTTA CGCACCCGAT CTCCTTCATA CATTGCACCA 1440
AGCTGCGGAA GTGTTACGTG ACTTTCATCA ACAAAGAGCA CAAAATCCTT TGGAAAATAA 1500
TGAAGAAGCG TCACCGGCGG TTCACCAGAT TTTCTACCTG CAATCGGCGC AGAATAATTT 1560
TCTATACCGT GGCAATACCC CATCTCTCCG AGCATTTCAA GATCGTATTC TGTGCGCGTT 1620
TTTAAACGTG CCGCTTCTGC AAGCTTATTC TCTTGAGTTA ATTGTACCAA CCGTTCATCG 1680
AGTTCTTGTC TAATACGGTC CATGGCGCGA GGGATTGCAT CCTCTTTAAG TACAAAATGC 1740
TTTGCAGGGT AAACGGTAAG TTCTTCAAAT TCCCTTAGAA CAGCACCGCT TACAGGATGA 1800
ATGCGACGGA TACGAACAAC TCGATCCCAA TCGCACTCGA TACGATAAAA TTCTTCTAAA 1860
TACGCAGGGA AAATTTCAAT AACGTCTCCC CGAACTCGGA AGCGACCGCA CTCGAGCACC 1920
GCGTCGTTAC GCTCGTATTG CAGAGATACA AGTTGCCGCT TGAGATCTTC AAGATCAAGA 1980
CACTGGTTGA CTTCCACGTG GATACGCAGA TCACGCCAGG ATTCAGGCAA CCCAAGACCG 2040
TAAATACACG AAACAGTTGC GACTACAATA ACATCACGAC GTTCCATGAG ACTAAACGTT 2100
GCAGATAAAC GCATTCTATT TATCTCTGcA TTGATAGAAG CATCTTTCTC AATGTAGAGA 2160
TCACGAGCAG GGACATACGA TTCAGGCTGA TAATAATCGT AGTACGACAC AAAATACTCC 2220
ACCGCATTGT CTGGGAAAAA ACCTTTAAAT TCCCCGG 2257
(2) INFORMATION FOR SEQ ID NO: 124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 992 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
TACTCATCGG CGTCGCGCnT TCnTACTCCT TGGGCATTGA CTACTTCAAC TACCCCGTCT 60
GATACCATGA CCATTACATC CCCAGGACAC AACGTACGCT CATGAATCTG TATAGGGGGC 120
AGATCCACAA TGCCAATAAC TGGACAATTC GAATGAAGAT GATACACTTC GTGCGCCTCT 180
CCCGCCTGGG TAAAGACAAG GGGACTTTCC ATCGATGCGT TAATGTAGCG AATATTCATA 240
CCCGCTGTGT CAATTAATCC CAAGAATAAG GTCGTGTACT TATCGTGGAG ATGCATACGC 300
TTTACTGCCC GGTCCACCGC ATACAAAATC TCAGGAAGAT TCTTTTTGTC TTCCACGATG 360
CGAATCGTAC TGAGCACAAC ACCCATAACT AACGACGCGG CCAAACCTTT GCCAGAAACA 420
TCTCCAATTA CAAATAAAAA CAGGTGTTCA TCAATTGAAA TAACGTCGTA ATAATCCCCA 480
GATACATTAA CCAGTGGCTG ATAGAATGCC CCGACGCATA TTTCCTTGGT ATGTGGGAGC 540
GCCTTAGGCA AAAGTGCGCG CTGTACACGC GCCATCATTG CCCATTCCTG GGATACATGG 600
GAGTACAACA ACAAAGTGCT CATGTTCCTC TTTCGATTTA AATACTCTTC AAACTCTTTG 660
AACAAGAGCG ATATAACTTC GCGCTCAACA GCACGGATAA AACGACATAC TATAAAAAGA 720
CGCAGCTCTC CACTGGAAAG ACATACCCCA CGAGCCCGAC GTCGGTCAGA CATAAGACAC 780
AGATCATCAT CAAAAAAATA TATACCGTCT GACCAGTGCC ACGTGTAGTC CATAGAAAAC 840
TTATGAAGCA CAAGATCATA CGTGCGTGTG TCTGAAACAA ATCGTGGCAG CACTGTTGTA 900
AATAACACGn TTACTTATCG TATCCATTAA GAGCACTGCG CAATCGGAGC GATATTCAAG 960
CACCTCCTGG AAGGCAGCAA CAAGTTGTTC AT 992
(2) INFORMATION FOR SEQ ID NO: 125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2291 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
GTGCAGGATC CGTCTCTTGT TCACGAAGAA CTCAGACACG GTTGGCGTAT CTGCAGATCG 60
CCTATGCTCC ACGGGGATAT ATTAAAAAAC ACGCGCACGC TAGAAaCCtC GCCGCATATA 120
AACgGCGGTC AAgAtTTGTC CAATACGCGA TTAAGCACAA TTACACCACT GAGTCAAAAA 180
GCAGTATCAG GATAGACAGG GGAACAGGTA TGCCTGATAC GGTAATATTT ACAAAAATAA 240
AGGTCATTGA CCAAAAGCTG GCTTCCTTTG TAAGGGAGCT CATGTCTCAC CAGGTAGATC 300
AACTCACGGT TTTACAGAAA ACGGAGGAAA TCTACGGTAT GCTCGTGGAC CTACTGATCT 360
AGCACTAAGG AAACTTATCA TGATACCTAA ACTGAGCCCA AATGCTGACC CCTGCACAGC 420
GTGTGCTTCC ACCCCTGCCT TTCCGCCGCA GCCCATGGAG GAGGAAGGAG AGTTTTTATA 480
TCAGTTGCGT CTTGAGTACT CCCGCGAAGT GTTATATGnC GTTTTCACGC GGnTACGCGT 540
GCACACTTTT GnTnGGTACC CACGCGTTAT GGCAACGACA TTGCTCGGTG CGTAGGGCGT 600
ATACGTACAC CGGTACAAAC CGATATCGCG AGTGTGGTAC GTGTTGCATC AGACCAAGAT 660
TTGTGCACAT GGCACATACA TAGGGAAAAG GAACGTGCTG CGGAAATGAT TTTTCGGGAT 720
CGCATTGAGC ACTATCAACT TGaGATGaAA TGTATTTGCT GTCACTATCC TTTAGAAgAA 780
gCACsCsTGG TaTTTCTATA CAgTGCGCCA GCACGTkTTG ATTTCaGAGA ATTAGTTAGA 840
GACTTAGGAG CTACATTTGG TACGAGAGTC GAACTGCGAC AGATAAATGA ACGGGAAGAA 900
GCGCGGATAG TAGGCGGAAT TGACTGCTGC GGGCGCGCGC TATGTTGTTG CTCAGTGTTC 960
AGCAGGTTGC GTCCAGTCTC GGTAAAAATG GTAAAGGAAA AAAATCTATT ATTTCGTTCA 1020
ACCCAGATGA TGGGTCGTTG CGGACGATTG CGCTGTTGTT TGACGTTTGA GGAATGATCG 1080
TTACACACGT AGCCTGTGTG GCGCACCTAA GTCGCTGCAC CCACTCGTAA CACCACACTC 1140
ACATGTGCAG TGTGTTCGCG TGTACAACCG ATGTGAGCGT GGTGGGGTTT GTTAGCAAGG 1200
GGACGTTGAT CTGTTCGTGC TGCTGAkTGT CTTTGTCCCG TCGTATAGAA CACACTCGCA 1260
GGTTTCCCTG CCCAATACCA TTACTTAGCC CACCTCCTCT AGGAAATTTT CGTAAATGCA 1320
GAACGTTTGG TAATACTTGA TTTTTTATAA GTCTCCATTG AATATAGGAA ACGAGTATAC 1380
CCGTTAAAAG CATAGCATTG CTCGACATTA TCTATTGTTT TACGCAAGAG GATGCAGAAT 1440
GAAGAGTCTT GAATATTATC GATCACAGCC AAAAGCAGAT GTGCACACGC ATCTGAATTT 1500
GAGTATGAAA TACGAACGAT ATAAGCAATG GTCAGGAGTA GTCATTCCAA ACTTTCCACG 1560
TAAAATGCGC GGGCTCGACG AAATGCATGA AATTATTGGT GAGTACACGC GTCCTCAGTG 1620
TAAAACTGCG CAAGACGTGT TGAATTTGTT TACCATGTCC ATAGAGGATG CCATTGCAGA 1680
CAATGTCGTC GTAATGGAGA CATCAGTTGA TATTGGCTTT ATCACCCATT ATGAAGAAAA 1740
TTTGGATCAT TTCTTATGTG ATTTAAGCGA TCTGCATCGA CGCTACAAGC GCAATGTTAC 1800
CCTTCACTTT GAGCTCGGTA TCTCCAAAAT ACGAGAGCGC AGyTnCGTAG AACAGTGGGC 1860
TGAGCCCATG ATGCGAAgCG GTATCTTTGA AAATATTGAC CTCTACGGTC CAGAGATTTC 1920
CGAAGGAATC GAAGATTTCA TCTATATTTT TAAACTGGCC GAGAAGTATC ACTTAAAAAA 1980
GAAAGCCCAC GTAGGCGAGT TCTCTGATGC GCAATCGGTA CGGCACTTTG TCGAAATATT 2040
TAACCTGGAC GAAGTCCAAC ATGGCATCGG AGCCGCTACT GACGAGAACG TTTTGCGGTT 2100
TCTAGCTGAA AGAAAAGTTC GCTGTAACGT ATGTCCAACC AGTAATGTCA TGCTCAACGT 2160
CGGTGGAATG CCCTAGAAAA ACATCCTATA AAAAAAATGA TGGATGnCAG GGGTCCGTGT 2220
TGGGTTAGGA ACTGACGATC TTCTCTTTTT TGGAAAAACA AATAGCGAAC AATTGTTTGA 2280
TATGGTTTCC T 2291
(2) INFORMATION FOR SEQ ID NO: 126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2169 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
CCGCCAGTCC ACCGGTGnCA CCCTTCAACG CCGGTTAGGG AAAGCGTGCG GCACGCCAGC 60
GAGCGCAGCG CATGCTGGAG CkCATTCCAA GAATGGAGTA TGCCACGCAC AGGCTCCTGG 120
GTACGAGCGT GCATTTCTTG AGCcGCtGTT GACACACGGC CGAGTGTAAc GGAGCACTGT 180
GCCTTGTGCT AGTCCCTGAG CGAGAAGGAT GCAGCAGGAG AGGGCAAGGG AGGAGAACAG 240
GAAGGAGCAC CGGTAGTAGT TGATGGGGTT AATTTTTTTC CGTTATGATC GCTGCTATAT 300
GTTGTCGGTG TTGTCGTATC TTCACGTGTA CTTTGGGCCG TTTCGTCTGT TGCAATCTTA 360
TGCGGTGTTG ATGGGGATTG CCCTGTATGC GGGATTCTTT TTTACGTATG GGGTGTTGCC 420
CAGTGCGTAT CGCTTTTTGC CCCAAGACAG GGGGCGTGCG TTTGCGCCGT GTGCACAGGA 480
AGCAGCGGGT AAACCCACAG GGGCAGGAGT GATTTTTGTG TCCGTCTTTG TGTTGTTAGT 540
GTACCTGCTT ATGCGTCCGA GTTTTGTGCA TGCGCTTATA TTGCTGCTGA CGTGGGGGGT 600
GATGCTCACC GGATACTTAG ACGATTGCGC GCAGGTGTGC TGGGGGGAGT ATCGCAAAGG 660
CGCGTTGGAC TTTTTGTTTG CGGTGCTGAC AGCAGCGCTG TTGGGTCATT TTTATTTTCA 720
CGATCAGGTG TTCTGGTGGT TTCCTTTTTT TTCAGATCCG GTGTTCGTCT CTCCTTTTTT 780
ATTTTTTGCC GGTTCGGTGG TGATTTTGTG GATGTCAATT AACGCAACCA ATTGCACAGA 840
CGGGGTTGAC GGGCTTTCGG GAGCGTTGGT GTTGATGGCG CTTCTTTCGA TGGGTACGAT 900
TTTTTACTTT TTGTTGGGAA ATGTGCGTGC GGCGCAGTAC CTACTGGTGC CGTTTGTAGT 960
GGATGGTGCG CAATGGGCAC TGATGAGTTT TGCACTTGCC GGGGCGCTGA TGGGGTATGT 1020
GTGGCGTAAT GCACACCCTA GTACGGTGTT GATGGGAGAC GCAGgCTCCC GTGCGCTGGG 1080
GTTTTTCATT GGGGTGTTGG TGTTGATCTC GGGCAATCCA TTTTTGCTGT TGATGACAAG 1140
CGGTGTTATT TTGGTGAATG GGGGTACGGG GCTTCTAAAA GTGGTGTTGT TGCGTTTTTT 1200
TCATGTGCGG ATCCTGAGCC GGGTGCGCTT TCCGCTCCAT GATCACATGC GTGAGAATTG 1260
GCACTGGTCT ACGGCGCAGG TATTGCTGAG GTTTATGATT TTACAGGGAC TGCTCACGAT 1320
TGGTCTTTTG GGGGTTTTGT TCAAACTGCG GTAGAGGGAG GGCaCCCCTT GCGGGGCACG 1380
CCGGGCCGAG CGAGGGCGAC GGTGCGGAGT ATCCGGCGCC TTGACGTGCG TTTATTTCTT 1440
TTGCTAGCCT GCCCCTAATT GCTTTCCGTT TCCGGAATGA TGGTAGAGGA GACAGGGCGG 1500
AAGGCGTGGG GTGTGTATGG TGCCGGTGAG AAGgTTCATA GCGGTGTGTG CGGTGACGGC 1560
GTGTGCCGGG CCGTGTTTTT GCGTTCAAGC GTTTATCTCT TCTCGGATCG GGTATGGGCG 1620
CTTTGGGATA TATGGGAACG AGATAAAGGA CTCCTACTAC AAACATGTTC CGATGACGGG 1680
ACTAGGGGTT GACGTGGTAA CGTCTTCAGG CGTTGCGATG GTGTTCAATG TGGAGAaTGC 1740
kTTGACGCaG CTCATGTTTC gCGCGCAGGC GCTGCTGGGG TACGCGTTTG AGGTTGGCAG 1800
GTTCCGCTTT ACACCTGCCA TTGGCGGCAG TTTCCTTGCG TCGCACGACC ACGCCGCAGG 1860
GGTGGCTCTG TCGCTTGACT TTCAGTATTT CTTTAATGAT TGGGTCGGGT TGGACCTGAA 1920
CATAGGCGCG GGGGTGGATG TTCCGGTGAA CAGTAACCTG CGTTACCTGA TGCGGGTGGG 1980
GACGCCGGAG TTAGCGAAGA TTCTCATCAC GCATACAGTG ACGCATGGAC TGGCTAATCG 2040
CTGGATATCA GGTCCCCACT GGTGGAATTC TCTTTCTTCG TGGGTCGGGA ATACCGCGGG 2100
AAAAGTGGCT GGATTTGTAG CGCGTTTGAT AGCAAATTAT CTGCTGAAAG GCTCACAGTA 2160
CAGCATGTT 2169
(2) INFORMATION FOR SEQ ID NO: 127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 693 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
TACCACCAGC AAACGCCGTG AGGTCAGTCC TTCCCGGCAA ATTTCAAAAA AGAGCGCAGG 60
GAGCCTGTAC ACATTGAGTG CCAACACCCA CAGCCCGAGG AAAACATAAC CCACAGCCAT 120
CACTGGTACT ATCACCGAAC TGACAACACT AATGCGCGCG ACCCCCCCAA AGATAACAAA 180
CGCAAGGAGA AAAGACAGCA ACACCGCTAG GATCTGGACC CACAGAGAAG AATCCTTTCC 240
GTAGTAAGAG AGAGTAGAGA CAATATTATA AGCCTGTAGT GCATTGAACC CGTACGCGTA 300
TGCAAAGACA AGACACAGCG CAAAAAGCAC CCCCATGGAG CGACTTTTCA GACCCATTTC 360
GATGTAATAG GCGGGACCAC CTCGAAAACC ACACGCAGTA CGCGTTTTGT ATGCTTGGGC 420
GAGCGTACTT TCGACAAAGG CACTTGCAGC GCCAAAAAAG GCACTCACCC ACATCCAAAA 480
CACTGCCCCT TTTCCTCCAA AGGCGATAGC GTwAGcAACG CCGACAATGT TCCCAGCCCC 540
CACACGGCTC GCAGTGGAAA TCATAAGCGC TTGAAATGAT GAAACTcCTT TCCCCCTCTT 600
TTCAGCCAGC GCTGCAAACG CrGGtTCAGA AGACTAAGTT GAACACAGCC AGTCTTTATG 660
GTaAAAAGAn ACCGCAGACG ACAAGCCAAC CGA 693
(2) INFORMATION FOR SEQ ID NO: 128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4835 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
CCACTTCACA TCTTTTCATC AAATTACCGC ATAATACGAT CAACCCTCTT TCCAAATACG 60
GACACCAATT GCCTGCAAAC GCTCCACTaG ACGTTCATAT CCACGCTCAA TTTGATACAC 120
GTTACGGATT ATACTTACCC CGCGCGCACA ACACGCTGCA ACACCATCGC CATTCCCGCA 180
CGCACATCAG GAGATACCAG GTCACTCCCA TGCAATGCAC TCGGACCTGA AACAAGCGCC 240
CGATGCGGGT CACACAGGAT GATACGCGCA CCCATGGTAA TTAACTTGTC CACAAAAAAC 300
ATACGCGACT CAAACATTTT CTCATGAATC AATATAACCC CTTCGACTTG AGTCGCCACC 360
ACTGTCATAA TGCTAGTCAG ATCCGGCGGA AATGCAGGCC ACGGGCCATC ATCTATTTTA 420
GGAATCATAC CGCCAAAATC GTAATTTACC CGCAGATCCT GCGACGCaGA GACGCTTACC 480
GCGTGCTCTT GCTCGCTCCA AATTACTCCA AGTTTTTTAA ACGCAAAACC TAGTGGACGT 540
AGATCACGTA CGTTCACATC CGAAATCGTC AATGCTCCAC GCGTTACTAC CGCAAGCCCA 600
ATAAGCGAAC CTACTTCCAT GAAATCAGCC CCAAGTGTAT ATGTGGTACC ATGCAACGCG 660
CTCACCCCCT CAATTGTTAA AACATTCGAT CCGATGCCAG AGACGCGCGC ACCCATCGCA 720
TTCAACAAAT GGCATAGATC TTGCACATGC GGTTCGCTCG CCGcGTTCGT GATTACCGTA 780
ACCCCCTCGG CAAGAACTGA AGCCATAAGC ACATTCTCTG TCGCCGTCAC AGATGCCTCG 840
TCTAAGAAGA CATCACATCC CACCAGCTTA TTTGCAGAGA AAGTAAACAC CCCATCTAAA 900
CGAACTTGTG CGCCAAGTGC GGCAAgCGCA AGAAAATGCG TATCAAGTCT CCTGCGCCCA 960
ATAACATCAC CACCAGGTGt GGGAAGCACT GCTTTTCTAC CACGCGCAAG AAGTGGCCCT 1020
GCAAAAAGAA TGGAAGCACG CACTTTCTGC GCAgCTtCGC ACGGCACTTC GCACGTCTGT 1080
AACTGAGGAA GATGTAACAT GTACTCATGA TTACCACGCC TTTCAACACT CCCACCAAAC 1140
GCACGAAAGA TAGTTAACAT CACCGCCACA TCCTCAATGT CGGGCACATT TTGCAGTAAC 1200
ACTGGCTCTT GTGTGAGAAC TGCCGCCGCA ATACAGGGAA GCGCTGCATT CTTATTTCCA 1260
CACACGCGAA TACAACCTGA AACGGGAAAC CCACCTTCAA CACGATAGCA ACTCATGCTC 1320
TCACCCCTTT GCGCACACAT TCCTGTAGAA CACCGATCAA CATACAACTT AATTGCACGA 1380
ATACCGCAAA AGTACGGCCC CATCCTGTGT ACTTACTTAG GCGTCAGCaA ACGCTTGACC 1440
TACATGCTCG AGTGTCCACG AGGGCAGACA CAACATACCG ATCGAGTACA CGGCGTATCT 1500
TAAAACGCAA TCCTTCAATT ACCAACTCAT CCCCTACCGT AGGaGACGTC CAAAACGTTC 1560
CAGGAGTAAA CCTCCCACCG TGTGGTACAc GCGCGACTGC AAACGCGTAC CTAACACTTC 1620
GTTAAAATAT TCAAGAGGAA CCGTTCCTGA ACATAAAATC TCATGAGTAC CGACGCTTTT 1680
CATCTGAAAA TCAACACGAG GTTTTGACAT TCGCTGcACC ATTGACCCAT GCACTGTGTA 1740
TCCTTCTACT CCTTCCGAAA CTCCAGCAAG TGAACCAAAA ATTACCTCCA TAATGTCTGT 1800
CATGGTTACT AATCCTTCCC CATCGCCCCG CTCGTCTACC ACTAACGCCA TCTGcTGCTT 1860
TGATACCGAG AGCATATCTA ACACTGAAAA AAGGTCAGCT ACATTCGGTA CACAACACAA 1920
CTCCTGTGCC AAATCACCCA CCCGATGCAC CGCAGCTCTG GTGTTCGCTA CACATGCCGA 1980
CACCATCTGA TCTGATGACT GCGAGGGTAA TACTTTACTT TCTCCAAAAA TATCCCAGTA 2040
GTGTACGAAA CCCCAGACGA TCCCCGCAGC ATCATGCACG AGCAACTGAG AAAAGGAACA 2100
GGTACGAAAA GCATCCCACA CATGAGATAG TGGTGTGTCC TGAAAAACTG ACACAAAGTG 2160
TGTGTGCGGT ACCATAACGT GTGCAAGCGG TATGCGTTTA AACTGTAAAG CGCGCTGGAA 2220
CAATTGATTT TCGGTTGACG AAATCACCCC TTCTCGTGCC CCAACTGCAA TAAGCGTTTT 2280
AATTTCTTCT CGCGAAAGAC ACGTCGTATG ACGCGGcAAA AAAATACCCT CCAGCACATG 2340
CATCAATGCT GACGACACAC ACGCGCAGGG GTACAGCAAC CAGTAACTCA ATTGCAAAAA 2400
AGGCGCAATC CACATCAAGA ATCCCAGTGA GTACCGTGCA CCCAGCGCCT TCGGGAACAT 2460
TTCTCCACAA AGAATAATCA CGCACGTCAC CGCAACCAGT GCCTTCCACA CCGACTGTGC 2520
ACCCCACAAC TCCATAGAGC CTAACGTCAC CACGCTAGAG AGCACCATAT TCAGTGCAGT 2580
GTTTTGCACA ATAACTGTGG TAATCAGCTG TTCGCGCCGG GCCAGAAGCC AGCATAAGCG 2640
TTGTGTACAA CGTGTACTGT GCCgCTTAAG CTTACGTTCA TCGTCTTGGT TCACCGACGA 2700
CAACGCGCTT TCTGAACCTG CACATAGGGC CGAGCACACA AGTAAGAACA TGAGTTCCAA 2760
TCCTACCAGA AAAGGGTATG CCACGCGCGC TTCCCTACTA TCCTTGAAAC TCAAACCGAA 2820
CACGTACGAT TCTGCGCACA TGCAGTTGTA GCACCACACA ACGCCAAGAA CCGAATACAA 2880
CCGTGGTGCC AGGATCAGGA ATACACCCTG TATACTCCAT GATAAGACCT GCAAGGgTCT 2940
CACTGGTACA CGAAGAAAAA TCGGTGCCGA GTAGATCATT TATCTCATCC AAGCGCAgcG 3000
ATCCAGGAAA TATATACGCC CGTACCCCCG CACGCGTAAC cTGCGGACCT GTTGAATTCA 3060
CAGGGAATTC ATGCGCACTG CTCTTAAAAA ATGCTTGGTA TATATTGTGC TTTGTCACAA 3120
GACCTGCCGT CCCGCCATAT TCATCAAGCA CAATGGCGAC TGCGCGTGAG TGTGCGCGCA 3180
ATTTGTGCTG CACATATGCA AGTCGTGTAC ATTCGAAGAC AAAAACCGGC GCGCTCACAT 3240
GTTGCATCAG TGTTCCGCAC TCTTCTAAAT CTCGTCCGTC TACCTCTTCT GAGCACAAGA 3300
ATTTCTTCAC ATCGAAAATA CCAATTATCC AATCAACACT CCGTTCATAC ACTGGAACAC 3360
GCGAAACGCG CATCTTCTGT GCACAGGCAA TTGCCTCCGC CAGAGAACTC GCGCGCGGAA 3420
CTGCAATCAA TTGCGCACGA CAGGTCATAA TATCTCGCGC AGTAAGGGAT GCAGAATGCA 3480
AAATACGTTG ATACAATGCG CGTTCGCGGG AAGTCACAGT GCCATCCGCC TCTCCAGCGT 3540
ACAGTACGGT GTGCAGGTCG TCATCCGTAA CACGCAGCGA GGGAGTGTGG CACGCGACAC 3600
GCGCAAGACG CAAGAGCGCA CTCCGCGCCA TACAGAACAC CTGTACAAAA GGAGTAAGCA 3660
TCAAAGCGCT CCACTGCAAG AATCGCGCAG TATGCAGTGC CAtGCGTTCG GCCGGCACAA 3720
GGCAAGTGAC TTCGGAATAA TTTCTCCAAA AAGAAGTGTA AGCACCGTTC CTGCACCGAT 3780
GCTCCACCCC ACTGCGTGGA TGCCAAAGAG GGCACGTGCA AAAAGCGCAA TGACTGCAGA 3840
CAACGCACTG CTCGCCAGGG TGTTCCCGAT AACCACAGCA GCAAGATAGA AGTTTTTCCG 3900
TCGAAGGATA CGCATTGCCA CTCGAGCGCG AgCATGACGT TTTTCGTACA GGTAGCGAAG 3960
TCTTAAGGTA TTTAGCGCAC AGAACGCTGT TTCCGCAGCA GAGAACAACA TGGAAAGCAC 4020
CAGCAGCACT ACCAACACAC CGAACGCAGC GGAAACGGAA AGAACACTCA CACGTAATTC 4080
CCCCGAAGGc tAAAACACCA GAACCGAAGA GACAACGCCA TACATCCTTG GACCCCTCCC 4140
CGCTGGGGGG GGGCACCTTT TAAGGTGCTC ACGCCCTTGT GTCAAGAGCA CACCCTCCAC 4200
TACAATGAAc TGCGTGTCCG GAGACCGCGC GGAGTCCTCT TTCTATGAAT AGAACCGAAT 4260
CTCCTCGTGG CTTAATCAAA GCCACCGTAC GTGAACAAGA CCGAGGCCGA ACCGTTTATA 4320
AAAAGATTGC CCAGTTCCTC TCCCTCATTG GAGAAGAGCA GGCGGCGCTG GTGCTCAAGC 4380
AACTTGAGCC TGCACAGATT GAGGCGGTGG TTGCCGAGCT CCTGACACTC AAACCCCTCA 4440
GTCCAGAAGA AGCGCGTGAG ATcTACGGGA GTTTTCTGCC CTCTGCGCTC GTGTGTCGCC 4500
TGTTACCGGT GGATGCGTnT GCGCAGTCGA TGCTTTCCAA AGCGTTTGGG GAAGAAAAGG 4560
CCGATCTTAT CTTGAAGCGG GCGGTGCCAG CGGCACAGCC GAAACCTTTT GAGTTTTTGG 4620
TGCGCTTGAa GCCTCCCAAC TTCTCCCCCT CCTGGAAGGA GAACTACCTG CCACCAAAAC 4680
ACTCATCCTC TCGCAGCTGC CTCCAGAAAG CtGCGCACTA TTTGAGTAAT ATCAGcACAG 4740
AGGAGAAGAA GGACTTGATC GTTCGCCTTG CAAAGTTAAA GCACGTTAAC CCTCAGGTGC 4800
TGCAAGTCAT GAGTGACTCC TTGCACAAAA AGTTT 4835
(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2355 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
GGCTTTCCCT GCGTTATAGG AGGTAACGAT TGCACCGCCT GTTCAAATTT TTCCCACACA 60
CCTGGTGGTG ACTGCGATCG CATCTGTGCG GCGAGTTCAT TGCGTGTCTT CGCAGCAGCA 120
ACAAGTAACC CAAGCGCGTG CATGAGCGCG GTGTCAAAAC GATTACAACC ACGCTGTGCC 180
GCAAGCGTTT CAGCGAGGGA CTCCCTCGCG GCACCCTTCT TGTACGGTGT CCCAGCATCG 240
AAGGCACACA CGATCTTGAA TCCTGCTCCC GGGGAAAGCG TCAAACGCGC GCCTACATTC 300
CACAGCACAG TATCTTTGTT TTGATGATGC GTCTGATCAG GTCCAGTTCT GTACCCTGcA 360
GAAAGAGTAG CACAACTTGc TGcCTCAAGC CTGaCCTGTT CCCGCCCCTT TCGGGCGTAT 420
ACGAGCGTGA ACTCAGTACC AAGGCCATAT TTACTGTGCG GGTCCGCTGG CATGCGCCGA 480
CTATTGGTGC CCTTTCCTTT CACCTTAGAG GTGACAGAAG GAGCACTTTT CCATATACCG 540
TTCGACGAAA AGGAAAGGGC ACAGTTGAAT GTAATGCCAC TGTTCGAAAT ATTCCGTGCC 600
TGATATGCGA TTTTTCCGCC AACCCCACCA AACCCTGGAG CGTATTTGAC GCTCCTTGAC 660
TCATAACTAG TAGTAATAAA GGGGGTCCAC AACTGCGCAA AATTAGAGGG AAAAACGGGA 720
TCTTTTCCTA CAGAAAAAGA GACATCGTAG AGGTGGAGTG TGGCCTCAAG GGAAAAATCG 780
CTTCTTCCTG ACTTTAAGAA AGGAGAACGC GTCGTCATAC TTGGATCCGC AGTTCCCGAC 840
CCTAAAGCAC TTTCAAAATC CACCTTCAAT CCCTTGAGAG AAAGCTCAAC CCATATGGGA 900
TCCTCACCTG AAAAGCTCGT ATACGTGGCG CCTTTCTTGG GCAACAAGGG AAAAGCAAGC 960
TTCCAGCTGC TCTTCGTGCG AAACCCATGT CGTATGCTTT TACCCGCTGT AACTGGAGAG 1020
GCACCTTCTG CATCAAAGAC GACACCCCAG CTAAGTTCAG CAGAAcCGCT GAACGCGGCG 1080
GAGGACGAAA AGGCGCGTGA AACCCCGTCC AAAACCGCAC CATGAAGGAA AAACACCCCA 1140
CACCCTAACG CAAAACGCGC ATATACGGAT ATACAGGCGC CCATAGGGCA GCCATTATGA 1200
CCCTTTCATA AACACGATCA ACATTTTTTC CCGTCAGCGG TAATTTTTTT CACTACTTCA 1260
CTCAGTTTTC GCACAAATGG CACAACACCA CGCCCCGTTC CTTCTATACG CCTCCCCAGA 1320
AAGTGACTCT TCTTCCCACC CACGAATATC TAATGTACAC CAACATACAT TACAGATACT 1380
GCTAGATCTG ACACATGACA TCGTTGTACT AAACTGTGGT GAAAACACGG TACACTACTG 1440
TATGCATGCG TCTAGTGTTA GGCTCGTGCA TTTTTATACT TTTACTCCGA GGGCGCGCTA 1500
CCGTATCGCG TCTGCACGCG AGCCCGGCCG TCACCATTTC GGGGAGTACT CGTCTTACTT 1560
GGGGCATTAA CTTAGGCGCG AAGGCGAACT TCGTGCTACC CGTAGCACCG CTTGGGGCAA 1620
CCGGCACTGT GCGAGAGAAC CCCAATCATC GCTTCCGTCA TCGCAGACAC GGTTTTAGGA 1680
GTTCCAGTAC TCTCTTTTTC TCGCTGACGC TTTGTCCACC GAAAACTCGG TCGAATCTGC 1740
ATAAAAGCAG CGGTGTGTAT GCAGAAATCC TGTTAAGGAA CCTAGAGTGT GCGCTCCCCC 1800
TCGGTTCCTT ATCTGGTGAG GCTTTAGGCG AACTCACGCC CACAGAAAAA CAAAGCTTCT 1860
CCGTAGAAGC GACCCTTCGC TTCTACGGCG CATATCTCAC TATTGGAAAA AATCCGACCT 1920
TTTCTAAAAA TTTTGCCAAA TTGTGGCCCC CGTTCATCAC CACACGATAC AAGGAAGCAG 1980
ACACCCAATA CGCCCCTGGC TTTGGGGGTT ATGGAGGGAA GATTGGTTAC CGCGTAGAAG 2040
ACGTCGGGAA TTCCGGGCTA GGTTTTGACT TTGGGTTCCT TTCCTTCGCT TCAAACGGCG 2100
ACTGGAGCAC GAGCGGGACT AGCCATAGCA AATATGGGTT TGGTAGTGAC CTCTCTATGG 2160
TACAAGAGAA ACAAGAAGCT GTTTTTAACT GTGGAACTCG CCGGTAATGC TACCCTCCAG 2220
GAGGGTTATG CCACGTTAGC TCCAACATTT TCGGGAGCAC CCAACAACAA ACGGGCATCC 2280
CACGCGCTCT TATGGAGTGT GGGAGGGCGT CTTTCGATCA TGCCTGGTGC AGGATTCCGC 2340
TTCATTTTAG CTACG 2355
(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 718 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
CGGGCGCGTC GAGTAGTCGG GTGCAGGAGG GATCTGCAGC GAATGACCAT TTTCAGGACG 60
AATCGACCAC GGCGAAACTC CTTTAGTCAG TGGCCGCCAG CGCGTGGAAA TAGATGTGGA 120
TGGATGAGTC TCCCGTCCCT GTGCAAGAGT TTAAACTTTC TCAGTTTGAG GGTCCTTTGG 180
ACcTCCTGCT GTTTCTCATC AAAAAGAACG AGCTGAGCAT TTACGATATT CCTATTTGCG 240
AAATTACTGC TCAGTATCTG CAGTACGTGG ATCAAACCGT CTCGCCCGAT CTCCGTGGTC 300
TGACGGAgTT TTACGCAATG GCTGCGGTTC TTCTGTACAT TAAAAGCTGC ATGCTCCTCC 360
CAATGGAACT AGATCTAGAT GGTGAGGATA TCGAGGATCC TCGGCAGTCG CTGGTGGAGC 420
ACCTTATCGA ATATCAAAAA TACAAGCAAC TTTGCAAGCT GATGGAGCTG TATGAGTGTG 480
AAGACATGTG GTGCGTTGAG CGAAAAAAGA CGCAGCATCT GTTTTTGTCT CCAGCAGAAG 540
TGCCTCTCCT ACACGGTGAC GTTCGTGATT TGCTGATGCT CTTTATTCGG TTAGTGAGAA 600
AGACGCCTCA GTGGATTATG GATTTGTACG AAGAAGTTTC GGTAAATGAG AAGCTGACAT 660
TGCTTTCGGA ATTGCTTGGG GTTCGGGGGC GGTGTGTATT TACTGAGCTT ATTAAGCA 718
(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1959 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:
CAACTGTGCA CCGCCGGnTA nAATAAGTGG ACTCGCACAA CGCCAACAGT GTGGATATGC 60
GTGCGAAATC TGTTCGCGTC TAAATAACGC TCCCTGTTTC TGCACACGTG CAATAATAGC 120
CTTATCAGCA GCTTTTACAA AAAGCCCCTG ATAATCTGCA ACCTCTGCGG TAAAGCGACA 180
CTCTGCATCA AGAGGACACT GAATCGAAAT ACCCGCATCC TTAAATACTT CATAATCATC 240
CTCTCCGAAC GCAGGTGCGA CGTGTACAAC ACCAGTGCCG TCTTCAGTAG AAACAAAATC 300
CGCAACTCGC GTACAGAACA ATCCCTCCTC TGAATCTCCT TGTACAGAAG GATCCGGTCC 360
CTGCCCAAAC ACAGGATAAG AAAAAAGCGG CCGATAGCGA ATCCCCGCAA GATGTTCACC 420
TCTTTTTTCC CACACTACGC GGTAnAGnAC GAATCTGGAT AaTAGAaTTC AgACGAGAAC 480
GAGCCAAAAT ATAGTGCTCA TCATTCGCCT CTATTAGCAC GTACAAAATT TGTGGTCCCa 540
GCGCAAgcGC TGCtTGCcGG AgCGTCCAGG CGTGGTGGTC CATGCAAGAA AGCACGTATG 600
CGCAGGAAGT GATGCACTTC CCCAAGACGC TGCCGCACAA AACTCCCGTG CAGCAGGACT 660
ACCAGGAACA ACAGATGTAC ACTCAAAACG CACAGTAATG GCAGGATCAG ACACATCCTG 720
ATATCCACCT AAATTCAATT CGTGATTAGA AAAGcTGTCG CACACCGTGG ACAGTACGGG 780
AGTATTTTAT AACCTTCATA CAGCAGTTTT CGCTGCCATA GTGCGCCACA ACCCACCACA 840
CGGACTCCAT GTAGCAGACA TCCATGGTTT TATAGTCATT ATCGAAGTCA ACCCAGCGCC 900
CAAGACGCGT GAGTGTGCGC TGCCACTCCT TCACATATCG CAGCACACTG GAGCGACATG 960
CCGCGTTAAA CGCGCTGACA CCATACGACT CAACATCACT TTTTGAATTC AAATTGAGCT 1020
CTTGCTCAAT CAGGTGTTCA ATGGGCAGAC CATGACAATC CCATCCAAAG CGACGCGGCA 1080
CGTACGCACC ACGCATTGTC TGATAGCGCG GAATAATATC CTTAATCGTG CTGGGCACAA 1140
AGTGACCAAA ATGTGGCAGT CCAGTTGCAA AAGGGGGACC GTCAAAGAAA ACATAAGACT 1200
TCCCCTGCGC ACGCTGCGCC ACAGACTGCT CAAACACCCG GCGTTCCCGC CAAAAGGCGA 1260
GAATACGCCG CTCCTGCGCG ACAAAATCAA CCTTTGGGTC CACAGGCGTA TACATACAAC 1320
CTCCGTTGCT CAGAATCGCA TAAGGAGCGT AAGGCATTAT ATCATTTTCG TCCTTCCTTT 1380
TCCCCATACG TCTTATGACC GGCGCCACAC CTTTCCCCAC CTGCACCAGA TACCCCACGT 1440
GTGCGTAATC GCACGTGCTC TGCCAATTAC TGCATTGAGT ATTACGTATG CAATAATGCC 1500
CCACATTACA CCTTCTGCAA TCGAATACGA AAAAGGCATC ATCAGAACTG CGACGAAGGC 1560
AGGAAACCCT TCCCCACATC TTGAAAATCC ACATTGCTTT CCATGCAGCT CGTTACCGTA 1620
ACAGTGCAGG GAGCGATGGT AGCCGACATT GCAGTTGCAG TGAGAACCGC ACCCCATCCA 1680
TTCCTGTATG CGAGCGTATT GCAGGATTCG CCGCAAGCAC GTAAGCCAGT GCGAGAAACG 1740
AGGTATCAAA AAATGAGCGC ATTCTTATGC ACATGATTCA AAATGACTAA ACCTCATAGT 1800
AAATGAGGCG CCACCACGGG ATGCAGAACG CACATCGGTA GAAAAACCAA ACATTTTCTT 1860
CATCGGAGCC TGTGCATGCA CAAGCTCCCG TCCGTGTTTT GAATCCATGC CCAGAATTAT 1920
TCCCCCCCGC TGTATGATCA CATTCATCAC ATCTCCTAC 1959
(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 722 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
GTTCCGTCAG CAATTTCAGT ACGCGGTTGA GGTATTGGGC GAAAAGGTTC TCTCGAAGCA 60
GGAGACCGAA GACAGCAGGG GAAGAAAAAA GTGGGAGTAC GAGACTGACC CAAGCGTTAC 120
TAAGATGGTG CGTGCCTCTG CGTCATTTCA GGATTTGGGA GAGGACGGGG AGATTAAGTT 180
TGAAGCAGTC GAGGGTGCAG TAGCGTTGGC GGATCGCGCG AGTTCCTTCA TGGTTGACAG 240
CGAGGAATAC AAGATTACGA ACGTAAAGGT TCACGGTATG AAGTTTGTCC CAGTTGCGGT 300
TCCTCATGAA TTAAAAGGGA TTGCAAAGGA GAAGTTTCAC TTCGTGGAAG ACTCCCGCGT 360
TACGGAGAAT ACCAACGGCC TTAAGACAAT GCTCACTGAG GATAGTTTTT CTGCACGTAA 420
GGTAAGCAGC ATGGAGAGCC CGCACGACCT TGTGGTAGAC ACGGTGGGTA CCGGTTACCA 480
CAGCCGTTTT GGTTCGGACG CAGAGGCTTC TGTGAtGCTG AAAAGGGCTG ATGGCTCTGA 540
GCTGTCGCAC CGTGAGTTCA TCGACTATGT GATGAACTTc AACACGGTCC GCTACGACTA 600
CTACGGTGAT GACGCGAGCT ACACCAATCT GATGGCGAGT TATGGcACCA AGCACTCTGC 660
TGACTCCtGG TGGAAGACAG GAAGAGTGCC CCgCATTTCG TGTGGTATCA ACTATGGGTT 720
CG 722
(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2308 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
CGACGAACAC TCCAACCCCA AATGACACCT TCTCCGGAAG ATTACCCAAC TCAACCAAAT 60
TCCGACAACC TTCAAACAGA CCATCTGCAA GATTCTTCAC TCCAACTGGA AGACGAACCG 120
ACGCTAATTC CGCGCAGTCT CGAAAGGCCC CAGATCCAAT TTCCTTCACG GAATTAGGTA 180
AAACCACGTC TACAAGGCTT CCACACGAAG AAAACACATC CTGCCCAATC ACCTGCAGCG 240
ATCTCGGAAA CACAATTCCT GCCAGACTCA CACACTTGTG AAATGCATGA TGACGAATCT 300
CTCGCAAGCC GTCCGGAAGT TCAATACGAG CGAGTTTTGT ACACCCATGC AAATGGCATG 360
GCTGGCCCAA TnCATGGCGC ACCGAAACTA GGCAACGnAC ACCCGACACC AGCGGAAACG 420
GCACCCATAA AACGCCTTGG TAGCCACCTC GGACAACCGG CAGCCCATCA ATCTGACTGG 480
GCAGTACCAC TGCAATGCCC TTGCCCAGAT ACCGCGTGaT AATTATCCCA TCACGGTCGG 540
GAGTCAGACC GTACTGaAAA AACTTCGCCG GCGTCTCTTT CACCTCCgCA CGAGAGACAG 600
CATTCTGCTT CGCAgAGACA GGATGCGATA CCAGTACGCC CAAACCGATA CACCACATAG 660
CAAAAACGAC TACTTGTTTT GGCATTATTT CGTCCTCCTT TTTAAATCAA CACGAGCCAA 720
AAACTAACAC TCACTTCAAT GGATACGCGT CCCACACAAG CGGTGCGCAA sATGgACGAA 780
CCCCCTTGAA AAGGTCAACT TTTAAAAAAT TAAGCACTCC CGCAGATATG CGCACCGTCT 840
CCTACAGGGA CTCCATACCT CGCCGGTAAA ACTAAGACCG TGAGCCCTCA CCAACGCAAG 900
ACAGTTGCAC AGCAGTATGT ACAACTGCGC GCAGACCCGC TCGATAGCAG GTAACTCAAG 960
TGCGCTGAAA cCACTCGAAG ATGCTGCTCA TCAATTGCCA CAGTTCCTAT TCCGCACTGT 1020
TCGAGCAGAT ATTGGCGCAG TGTTTCTGCA GAAAAACCTA TGCAACGGAT ACAGGTAAAA 1080
TAACCAGAAT TACACGGCAA AAACTCAACA CGCAGCGGTA CCGCAGCCCC ACACTCCGTT 1140
GTGCTCCAGG TAAGATCACG TACCACGCGC TGCACCTCTC GGTAGCGTGC ACACATCAAC 1200
TGAAAAAATT GGTGTCTCTC ACGTGCCGTG GCTGAACCCA ACCCAGCAGG CTCATCCTCT 1260
TCAGCCAACA GACGCAGCGC AAGGTTCTGC GTAAGGTGGC AGTACATGAA AGTGATGCAC 1320
GGATCATCCC CATGACTTTT TTTTCAAAAG CCTGGTACTG GGAAACACCA AGAGCGAGTC 1380
CTGCACAGCT TAGAAAACCC ACGCGTAATC CCCATGCGTA CTCTTCCTTC GTTAACCCAT 1440
CTATCTTTAA CGCGCAGATA TTCTTGTGCG CTTGCGcAAA GcGGGCAAAA AAAGAGCCAC 1500
GCATCAAAGA CGCCTCATAC TCGAACCCGC TA ACGCATC GTCACAAATC ACCAGTACCG 1560
CACACCCTGc GTCAgaTTAA GCATACACCA CCTCGTATAA TTGCTGTGCC TCCTCTTCCG 1620
TGGGGGTATA ACtGACGGAT TATGGGGGAA ATTCAAAATT AACCTTATCA CCCCATCCGA 1680
AGCCTGTGCG TCCAGCGCTT CCTTGCACGC GCTCAAATCA AAGCGCCCCG CTCGAAAAAG 1740
AGAAAAGGGA ACCGGcGTTG CCGCACAACG CACTGCTAGC ATGAGATCAT AATTTTCCCA 1800
GCGCGGTGCC GGCACCAAAA CCGTCTGTCC TGCACCCACA AATAAGTCCA TAGCACAGGA 1860
CAACGCTGCC GTAAGTCCCG GCAACACGAT AGGAAGCGAC ATCTTTGGAA AAACTGCCGC 1920
ACAAGAATGC TCTCCCTGCT CTGCTGCACT CTGCATTGCT GCAACGTCCG GTTCATCCGG 1980
ACACAACACA GGATCACGCG CACACAAACG cCGCGCCCaG cGCTCGCGGA GCgCAGGAAT 2040
ACCTGCAGTC GGCGCGTAGG AAACTATTTC AGAAGAAGAA AGATCAGGAA CAAGCGCATG 2100
CAACGTATCA CGAAGCACCG GCACCCCATG ACGCAGAACC ATGCCAACCG CCCCATTCAT 2160
ATCAGGCGCA GCATCCGCGC CTCTGCATTC TGTGCAACAA TCCCGTGGGG AAAATACGCG 2220
CGCAAACCAA GAGGAGATAA CAGCGCGTGC ACCACCGTTC CTTCAAGAGC AGCGTTTAAC 2280
GCACGCGCGC CTTCAGAGAG GTCCATGT 2308
(2) INFORMATION FOR SEQ ID NO: 134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1236 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
CTTTGACCAC TCGGGAAACT CCCCAAACGA ACAAGCCAAC TTAGGGATTC GAACCCTAGA 60
CCTCGAGATT ACAAATCACG CGCTCTAGAC CGACTGAGCT AAGTTGGCGA TGACTACGCG 120
CCCCGGACTG TACGCGCGAA CCCACGGCTT TGTCAAGCCG CTCTCACTGC ACTAAGAAAA 180
ACTACACGCG CAGAACAGCA GAGCACGTAC AATGCGGACT CTATACCTTG AATTTCTCCA 240
CCTCATCTGC AAGACTGCTG ATGCTCTGCt TGCTACGCTG TGTTATCTCA TTAACCTCGT 300
GTAtGCGTTA TTGATCTCGA TGGCACCAGC TGCCATCTCA TTCATACTTC TGGCAATTTC 360
ACTAGTTAAG TCGTCCAAGC GCTGCATCTC GCGTGCAATA ACCTCGCCCC CTTTGAGCAT 420
ATCCGCAGAC CCCTCCTTCA CGTCTAcGGT GGCGGCATTG ATGCTCTTAA TCGCAGCTAG 480
GACtCACGGC TTCCATCCGA CTGCTCTTTC ATCGCCTCTG TCAGCGACCG GCTCATTGTA 540
CGCACCTGAT CGGACAAACG GAAGaTGGTA TCAAACTGCT CCTCAACCGC TTTTGAAGAC 600
GTGGAAAGCG TATCTATTTC CACACTGAGC GTCTTGAGCG TCTCAGTAAT GGTCTTTCCT 660
TGGGTGCTAG ACTCTTCCGC AAGCTTACGG ATCTCATCCG CAACCACCGC AAAACCCTTT 720
CCTGCCTCGC CCGCATGCGC CGCCTCGATA GCGGCGTTCA TTGCAAGCAG GTTCGTCTGA 780
CTTGCAATGT GCTGAATAAC ACTGCTCGCC TCAAGCAAGC TTCCCGATTC CTsrcTGaTC 840
TTTTGCGTAA TACCGCTAGa GCTAACCAGC GTGTCACGCC CATCAGCGGT GGCAATAGCA 900
AGACTGTGCA CCGCCTCATC ACTCCGTTCA AGTGTCTGCG TAATAGACAC AATGTTGGCA 960
ACCATTtGCT CAACTGAGGA GGAAGACTGT GCGACGTTCA CCGCCTGCGT CTCAATGCTA 1020
CTGTTCAGCC CCTTAATTGT CTTGATGATA CTTCCACCGT GTCCGTGGCC TCAnACACCC 1080
CACTCACCTG CAATTCAACT CGGTGCTTGA CACCATCGAT ATTGGCGGTA ATCTCGTTTA 1140
CCGCACTGGC TGTTTCAGTC ATGTTACTGG CAAACTCGTC CCCGATGCGC CGCATATCGT 1200
CAGAGCTCGA CCCCACCGTG GCGATACAAA GCGAAT 1236
(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3856 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
CCTTGGTGTT GATTTTTTTA TATGTGGCGC nTCGTTTTCG GTGGTTCTTT GCGCTTGGAG 60
CGATAGTCGC GCTCGTGCAT GATGCGTGCA TTATGGTGTC ATTCATGGTG TGGTTTGGTT 120
TGGAGTTTAA TCTGCGAGTA TCGCGGCGAT TCTGACGATT ATCGGGTACT CGATCAACGA 180
CACGGTGGTA GTTTTTGATC GGGTGAGACA GACCATTCTC CTGGATCCTA TCGCGTCAGT 240
GACGACAGTA CTTGACCGAT CGCAAACAGA CATGCTCACG CGCACTGTAG TGACAACGGt 300
GACGACGCTG CTTGCAGcGC TGATGTTGTA TGTGTTTACC GAGGGAGGCA GTCGGGATTT 360
CTCACTCGCG CTAATGGTGG GGATGGTCAG CGGCGTGTAC TCGACCATTT ACATCGCCGG 420
TGGCTGTATC GCCCTGATCA GCCGGGGGAA GAGCGGAGGT CAGCTGCTCG GACTCTGAAA 480
CCCCGCACGC GCGGCTGTGT GTAGGAAAGG AAAGGAGAGA GGGAAGAGAG GGGACTCGgC 540
GCTGCCGACT TTGCTATTTC TTCCTTACAA TCGTTTTTCT TTCCCCGCGG GTGTGcAGGC 600
TGCGCGCTTG CCTAAGGTGG AGCCGTTCTT TTTTGTTCTT TTCCAGAAAG TCAAGTGACC 660
TGTTAAAACC TTGCAGAAAT TTCTGCATCT CTTCTTCAAA CACCACAACC GCGTGCCGGT 720
CAAACCCGGC ACCGTCGACG TTTTTGCTTT CTACTACCTG CAGAAATACA TCTCCGACCC 780
TATTTTCCTT CACATTGAAA AAGTAGGTCC GATTTTCCGC TTGCACCTCT GTAGTAAACA 840
GCTCACCGCG TATCCCCATG CTGTACCTTC CTTCTGCTTA TGAAGTCGTA TCCGCACACG 900
AAAGCGCACA GCGATCTGCC TCGCGCACCA TGTCCGCGTG CAGCTGCAAA AGCGCAGyTy 960
CCCCCGCCTC GTCATTACCT TTCAGATCCG CAGAAATTCT AAATATTGGT TCTGTCTTTG 1020
AACCGCGCAT CCACACGAAC GCCACCGCTT CTTGGTGCTC GTTGTAAAAT TGGATTTTCA 1080
ATCCTCCGTC AGCCGACCGG CTAAAATCCC CCGCCGCGTC CTGCTGCTCC CTCCCGCGGT 1140
ACAATATCGG GCGGTACGAA CAAATGCCGA AGCGTTTTTT TAACGTGTCT TTCTCCTGCG 1200
CCCACCTGCG TTCAAATACC CGCTGaTATG CGCTCTTGAG GAGTGCATGG TCAGTCGTTC 1260
GTATCTGcAA GAGCGCGCGC GTATCGTGCG TTGGTGTAGT CGTGTACGCA GgTAGGCTGG 1320
CAAATAGATC CGATATCGTG AAGGCATGCT GCGCGCGTTC TGTTTGGTGT GAGTGCGCGC 1380
ACCACCAGGC AAAGAGTCCT GGCTGTTTCC CATCCCCCCG CAGGAGCAAG AGCTTTAAGA 1440
GTGCAAAAAC AGTATGCAAA GGATCACGTA CTGCCGCCGG ATGGAGAATG CTGCCCCCGT 1500
TTGATCCTTC CCCTAGAATG CGCACGCAgT AGCCTTCTTT CCGTAGAAGG TGTGCCTTCT 1560
CAATGAGGTG CGCCTCTCCT ACCTCCGTGC GAAAAACGTG CACGTCTAAG AgCTGCGCGA 1620
TAGCTTCTAC ACGCAGCGAG GTGGGACCAT TGGTAACCAG CGCGATAGGC GGCGCGCGGT 1680
GTGGCTGCAT ACGGAGGTTG CGACTGAGTT CACAGATCTC CGACACAACC GAAAGAGCAA 1740
AGACTGCCTG TTCGTGCGGA ATGACAGCGC GATTAAGGGT TTGGTCGTAA TAGACAATAT 1800
TTCCCCGATC TCCGTCACAA TCTGGCACAA AGCCGAAGGc AATGGAGCGT TCTTCTGGGG 1860
ATGAACCTCG CGTGGCTGCC TCGGTCAGTG CCTGCGCGCA CGCGgTAAGA GAGGACCCTT 1920
CAGGAACTAT GCGATGGCGA ATATCCCCTG GCGTTTCGGC GATACTAAAA AGGGCGACAC 1980
CGAGTGATTC TAGGAGGCGC CTATCTATGG AAGCTGCACG CGCACTTCCA TTGAAATCAA 2040
TGAGGATAGA CAGGGGTGTT CCCTGTTCGC TGTAGGCGGC ACGCTGCTGG GTAAAGCGAT 2100
GGAAAAAGGC ACGGTGCTCA GTTTCTTTAG GGGAATTTGC GATCACTTCC CTTATGAAGA 2160
GATCATAGCT GTGCAGGCTT TCTTGTTTGT GTGTTTTGTT GTACAGCAGC GGTTCGAGTA 2220
TAGAGGCGTT GAGCGTGGCA CAATGCTCTC GTATTTTCTG GAAAGAGGCG GGCTGTAAAC 2280
ACTGTGCGAC GAAgTCTTCG CTCAGCTGTT GTGCCTGAGT AGGACTGAGC ACGCCGCCGT 2340
CATTTAGGCC AAATTTAAAG CCATTGTATT CGATAGGATT GTGACTTGCG GAGATGTAGA 2400
GGAAAGCGTC GTAGTTCCTC GCGTAGCTTG CAATTTCGGG TATTGGTCCG ACTCCGACAC 2460
AACGCAGCGA ACAGCCTTCC AGATGGAGGA TAGCGGTGCA GATAGAACTG ATAATCTCTC 2520
CAkTAGGGCG CGAGTCGCGC GCGATGACAA CACGTGGTTT TGGGACCTTT TTTTTCAAGA 2580
ATCGGGCGTA GCTGAGTGCT ACCTGTGCGC TGAGTACAGC GTCAGCCTTT TGCGTGAGGG 2640
TGACGGTGCG CTCAGGGGTA AAAGCGTAGG GGAAAACCTT GCGCCACCCT GAGGGTGACC 2700
GGGTGAGTGA TAGATGAAGC GCTGCGCaGG CTGCGGCAAG GTGAGGGAGG TGATGAGTGA 2760
GTTGAGCAGG GACATGCGCG GAGGATAGGG CGCGGGTGAA GGAAGTGCAA GGGAGGTGGT 2820
GGTCTTGACA GGATGAAATC AGGTGTGGTC GGGTGTAGGG GCGGGGGAGA GGTAGGAATG 2880
GGAGTAGTGG AATGGGTAGG TGAGTGGATG CACGCGGTGG TGTGGAGCTT TCCGATGGTG 2940
GTGCTGTTGC TGGGGACGGG GTGCTACGTG ACGGTGTGTA TGAAGTTTTT TCCTGTGGTG 3000
CGGCTGTGGT ATGTATTAAG ACAAACTATT GGGGGTCGCG GAGGTAAGAA GGGCGGCAGT 3060
GGTGAGGTGA GTGCGTTCCG TGCGGTGTCC CACTgCtTGC AGCGACGTTG GGGTCTGGAA 3120
ATATTGTAGG GGTTGCGACG GCGATTGCGA TTGGGGGGCC TGGGGCGATA TTTTGGATAT 3180
GGGTGACGGG GATATTCGGG ATGGGAACGA AtTCGtGGAA GTGGTGTTGG CGGTGTACTA 3240
TCGGCGTCAG ACTGGTGATG GGCGTTTTGT GGGGGGGCCG ATGTATTATC TGAAAGACGG 3300
AGTGGGAGTT CCAGGGGCTG GGGTACTTGC GAGTTTGTTC TGCATATTCA GTGTTATCGC 3360
GTCCTTTGGG ATAGGAAATA TGACGCAkCG AAcTCGGTGG CTCTAGTGTT CGAAGATGTG 3420
TTTTGTGTGG ACGTGCGGGT GACCGGGGCA GTGCTGATGG TCTTGGTAGG CTTAGTGAGC 3480
GTGGGTGGGT TAAAAAGTAT CAGTTGGGTG ACTGGGGTAA TGGTGCCTGs GATGGCGATT 3540
TTGTATGTAT GTGTGGGCGT ATGCTGgTGT GTTGCAATAC GCGACAgCTG GTGCCAGTGT 3600
GCTGGGATAT CGTGTCCGGG GCGTTTGCCG GGACTGCAGC AGTTGGGGGG TTTGCAGGGA 3660
GTGTGGTGCG TCAAGCGATA GCGGnTAGGT ATTAGCCGGG GGGTAGCGGT GAACGAGGCA 3720
GGGCTTGGGA CTGCTCCTAT TGCGCATGCG GCGGCTATTA CAGACCATCC AGTGnCGACA 3780
GGGGCTTGTG GGGTATCTTT GAAGTTATTT GTGGGGACAA TGGTGGTATC TTCGGTGACG 3840
GCATTTGCGA TACTGC 3856
(2) INFORMATION FOR SEQ ID NO: 136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4444 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:
CGACCTAGAG CGTAnACTGC TGGAGGGACC GCTCCGTCCA TTATACGGAC GATGGAGGGA 60
AAAGGCAATG TTCTGCCCGG CAGTTGCGTC CGGTGCACGT GGAGGGTCAG TGACGAAAGA 120
AGAAGCAATC GAGGTGGACG GAGTGGTTAA GGAAGCGCTG CCGAACACCA CCTTTAGAGT 180
GCAATTGCAG AATGGGCACG AGATCCTTGC GTATCTTTCA GGTCGGATGC GCAACACTAC 240
ATCCGTATCG TGCCCGGAGA CTCGGTGAAG GTCGCGCTCT CACCCTACGA CCTCTCCCGC 300
GGCAGAATTA TGTTTCGTGA GCGTTAGATT CCTCTCTCGC AGgAGAGAGG TGCTTGCTCC 360
TCTTTAGCAG CGGGGTTGCT CATCCTTCAA GGAGGATTGA GAGTCCTGCC CTCAGGGTGT 420
GCACTCCGCG CGCCAGGCAG GACACGCTGA ATGCGATCCC GAAGAAGCCA AATACTAAAA 480
AGAAGCGTAC CCGaATAGAG CCAGGAGGAC GAGCGCCGCT CCCTGAATGA GAAGGCGCGC 540
CCCGTCTTTC CTGGAGCCCG TGCGCGCGGT AGGACCCGCC GGATCGGGCG GAGGTCTGCA 600
CTGCTCCAAT AACTCCACCC ATTTCCAgGA AAACCACCgA AGAAAGTGTA CCATCCCTCG 660
AAGTGTTCGG TTTCCCGCGC GTGCCCTCTC CAGGGACCGT GCGTGTGTGT GTAGCGGGTG 720
CGCGCTCCTG CCTTGTCGTC CGTACCTGGA GGTCCAAGCG CCAAAGCCCT CTGaGGCGTc 780
GCGTcCTGcA CGcGaGGaCC AAAATGcGcC ACCTGcGcGT TcGGcATcGT aGCGTGCGCG 840
TGATGCACGG TCAGAGAGGA CGGCGTAAcT GcGTTGaTCC TTTTAAACTG GTCCTCCGcA 900
CACGCGTCGC CCGGATTTTT ATCCGGATGA TACTTAAGCG CCTGCGCGCG AAAAGCTTTC 960
TTAATGTGTT CCTCAGAAGC ATCGGCAGCA ACGCCCAGTA TGGCGTAATG GTCAGGAACA 1020
GTGCGCTCGC TCATCGTTTA CCCAATGGTG GTCCCCACGA ACTCTTGCCT GTTCAGGATG 1080
CGCTCCAAAT TCGGTAAGTC ACGCCCTGTT GCGCAAATGA CCTGAATTCC CGACTTTTTT 1140
GCCCGGACAC TAGCGACTGG GTCAAAGGGG ACATGGCTGC CCGGTACCCA CTCCTTGTCG 1200
ACAAGGAGGA GGAAATCATC CCAGGAGAGG GATGTGAGCG CCTTCGCATC CTTGTCTGAA 1260
CGAGGATCCC CCGTGTACAC ATGCGCAACG TCGGAAAGGT TAATAACCGT CTTTGCAGAA 1320
TAACGCTCTG CAAGGAGCAC AGCGTCGGTG TCGGTGGAAA AACCCGGTTT CCAACCAGCA 1380
GCAACGAGCA CCTGACCTGA AAAAACGTTC GCCGCAGTCG GGTCATACAC GACCGGATTT 1440
GGACAAAGGA TTCCAAAAAG GGATTTGAGC AGCTGCGCGT TCAAACGCGT AgCCATGaTG 1500
CCAATCCAGT CAAGTTCAAC GTGTTCGGCC GTGGCATACA GCTCCCTGTC TTCCTGATTA 1560
TCGCCCTCGT GCGCAGGTTT CGCACAGGCC GGAGAAGACA GGCTCCGGCG TAgcGCAcGA 1620
TAGGCGTTTT GATAAGTGCG CGCAGGTGCA CCACCGCCTG AAACGACAAT GAGCTTCCGT 1680
GAGGCGTCTT CGTATAGGTA CCGTTGAACG GAACGAACGA ACCGCCCGAG AAGCTCTATG 1740
TCGGGCGTCT CAGGCGCAAC GATGGAACCT CCAAGTGACA GAACGGTGAC CATGAAACCC 1800
TCTCGCCGGC ATCGTAACGC AAAGAGACCC TTTGGATCCA GGCCCTGTGT GTATCTGGCA 1860
TTGCGTCCCA GCGTGCACGG GGCGATGGAG TGTTCTACAC GGGCGACACA GACTCCTAGT 1920
TCTTGTATTC TGTGCAAAAA CCGACGTAAA AACTGTATCT CCGTAGTCTA GTGAGTGTTC 1980
TCTGTGCATG AGTTGCTCCC GTACGACCGG TGCTTTACGC GCGGTCCCCC TTGTGTTCCG 2040
TTCCGTCCTG GTGCTTGCGG TGTGGGGTGT TTCCTGCGTA CAAGCCGCCG ATGTGGCGCA 2100
CAATGCGGAT GTACCTTCCC GCTCGCTGAA GGCGCTCGAG CGTTTCCGTT TTTTTGTGTA 2160
TCCCAAGCCG CTCGACCTTT CTAGTGACTT TCATGCGAAG GCCTTGAAGG GGGAGGCACT 2220
GGTTCCTAGC CTTTTCAAGG GAAAGGTGAC GCTTTTGAAC TTTTGGGCTA CGTGGTGTCC 2280
GCCCTGTCGT GCGGAGATGC CGTCTATGGA TCGCATGCAG GCTCTTATGA GGGGGAATGA 2340
CTTTCAGATT GTCGCGGTCA ACGTTGGTGA CTCGAGAAAA CAGGTGGAAA GTTTTATCGC 2400
GCGTGGAAAG CATACCTTTC CTATCTATCT TGACGAGGAG GGGAGTTTGG GGAGTGTTTT 2460
TGCTTCCCGT GGTCTGCCAA CTACTTATGT TGTGGACAAG GCAGGGCGCA TCGTGGCAGT 2520
GGTTGTCGGG AGTGTGGAGT ATGACCAACC AGAGCTAGTG GCTCTCTTTA AGGAACTGGC 2580
GCGTGACTAG TGTCCCCGGC GTTGTGGGTT CCTTTTTGGC CGGGTTGCTT TCTTTTCTCA 2640
GTCCCTGCGT CCTGCCGCTT ATTCCGGCGT ACGTCTCTTT CATTTCGGGA GAATCGCTCG 2700
GTTCTATCCG GGCGGGGGCG GCGCGGCTCC AGGTTTTTCT CAGCAGTGTT TTTTTTGTAT 2760
TAGGACTGAC GACGGTTTTT GTGTTGTTTT CAATCGTATT TAGCGGAGGG GTGCAGCTTG 2820
CAGGTGCGGG TGTGCTCACT GTGCTCACGC GTGTAGCGGG CGTGGGGGTG ATACTCCTCG 2880
GCTTAAACAC AATCTTCGAC GTGGTTCCGT TTTTGCGTGT GGAAAGGCGT ATGCACACAA 2940
CGGTGCGACG GGTGGGTGTG TTTCGTGCGT ATCTTTTTGG GTTGCTGTTC GCAACGGGAT 3000
GGACTCCGTG CGTGGGGCCG ATTCTCTCTT CTCTGTTGTT CTATGCGGCG AGTTCTGGGC 3060
AGCTGCTCCA CGCAGCAGGG CTCCTGACCG TGTATGCACT GGGATTGGGA CTTCCCTTCG 3120
TGTTTGCAGG GATCTTTTTT GGACGTGCGG AGCGGGTGTT TGCGTGGGTA AAGAGTCACA 3180
TGCACGCAGT AAAGCTCGCC TCCGGGATGT TGATCGTCTT TTTCGGACTG CTGATGCTAA 3240
CGTCGGGGTT GCAGGCACTC AGTCGGCTTT TTCTACGGGC AGGATTCGCG TTAGAGGAAT 3300
ACTCGACGCG GGGAATAACC CCTCTTCGGC AAATAGCGGC ACTTCTTGCG CAtGGTTTTT 3360
GTACCAGGGG GTTTGAGCGC GAGcGGCtTT GGGGCTGTGC GGGTGGGTAG CCATCACGTA 3420
AATAGTTTTT TGATGCGTGT GAAGGCCCGC GTGACCTCTT CCTCGCTTAT TTTTCCTTCC 3480
GGTGCATGGG CGCGCACCAG GTGTTTGGGA ATTTCGTCCA GAAAGGGGGA GGCGGTGCGC 3540
TCTTCCCTTT TTCCGCCGCG TTTCCTTGTG CGGCAGTGTG TCAGGTACAA CTTCTCTTTT 3600
GCGCGGGTAA TTGCAACGTA GAAAAGGCGT CGTTCTTCCT CGATGCTGTG AACCTCTTCT 3660
ATACTTCGTT CGTGTGGAAT GGTTCCCGCT TCCACGCCGG CGATAAACAC TACGGGAAAT 3720
TCCAGCCCCT TCGACGCATG AATTGTCATG AGCGACACAG CGCCCTCTGT TTCTTTTTGG 3780
ACGTTATCGC GTGCTAGCAA CGTAACGCGG TTTAGGTAGT CGTACAAGCT TCCGTGTTCT 3840
GAACTCTGTT CCCAGTGTTC GATGGATTCG ACTAAGTGTT CGATTTGCAA GAATTTAAAA 3900
CGTGCGGCAT GTTCGTTTTT TTGGAATTCT TGGATGAGGT AATTAAAATA CTGAATGTCT 3960
TCAACAAATT TGCGTACCTT GTACGCAAGA TTTTTCCCCG AAAGTAGATG GGTACGCGCC 4020
TGGGTAATGA GCTGGAGAAA ATTTTCCACA GCGGTACGAT GTGATTCTTT TAAATCGACT 4080
GCATGTGTTT TATTGATAAT TTGATTCAGT GCATTGAACA CGGAACATTG TTGTGTATTA 4140
GCAATGTCGG AGACTAGATG CAGTGTTTTT TTTCCAATTC CTCGCCTCgG GGTATTAATG 4200
ATGCGTAATA GGTTGACATC GTCGTCAGGA TTAGAAATTA CCCGCAGATA ACTGAGCACA 4260
TCCTTTATTT CTTTCCTCTG AAAAAAGCTC ATGCCGCCTG AGACACGGTA TGGAATATTT 4320
TCTTGCAAAA ATACGTCTTC AATTATGCGC ATAAAACTAT TCGTGCGGAG TAGGACTCCA 4380
AAACTACTGA AAGAATAGGA TGCGCGTATT TGTTCTGCGA GAATCGTGTT TGCAATAAAG 4440
ATTG 4444
(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5695 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:
GGACGCAAAC GGCGTTCnCC TTCAAAGGCG CGCTCGATAC GCCGCGTTCA ACCCCATAGC 60
GTTGCCGGTT GCCGCTTCCn AGACTTcCCC CttACCGTAC AGTgCACGCG CGCGCGAACy 120
TTCCGCAAaG CAGACAGCGA TACTGTCCCA GACGCAGGCT TCCACCCGTT GGGgTATyTA 180
CACAACCGGG CAAAAGATCA ACAACCGGGT GAGGGGTGTC TACCGCAAAT TCTCGAGAGT 240
GTGCACTTGC CAGCAACAGT ACATTGCGCG CAAATTCAAT GACTGCAATC TGCATCCCCA 300
AGCAAATGCC AAGATAGGGC AGGTTCTGCA CACGCGCGTG CGAAACTGCA CAGATCATCC 360
CTTCAATCCC CCGCACGCCA AAGCCACCTG GAATTACCAG CGCATCTGCG TCCGCAAGAG 420
CGTGCCCAGC GTCCTGCACG CTGCAGATTT CCTCTGCGTC TATCCAGCGC ATGTCCACTC 480
GCGCACGATG GCAAATACCC GCCGCCGTcA ACGCTTCGCT CACGCTCAAA TACGCGTCCG 540
CAAGCGACAC ATACTTACCC ACCAGCGCCA CCGTAAGCTC CCGCCGCGGA TAGTACAGTG 600
CACGGACCAT TGCGCGCCAC GCCGTAAGAT CTGGCTCTGC CCCCTGCCCT GCTGCAGGAT 660
GCAGACCACC GTCTGTACGG ACCGTCCCCC CCGCTCCCAA CACAGCCGAC TGCGCGCCTC 720
CTGCACCGAG ATTACGCGCC AcCTGCCCCC CGCAGCAGGT ATCAAAAAGA CGCAAACGCT 780
CACACAAGAG CGCTCCCAGC CCTTCTGCTT CCAAAAGGAG CGGCACCTCA TAGATAGAAC 840
GAGCCGTCAC GTTCTCGACA ATAGCACGCC GCTCAACATT GCAAAAAAGG CTCAGCTTTT 900
CGCGCACCGC ATCCGTGATG TGACGCTCGC TGCGGCACAG GATGACATCC GGCTGCACAC 960
CAAGTCCTAG CAGCTCCTTT ACGCTGTGCT GTAGCGGCTT GGTTTTCATT TCACCACAAC 1020
TGGGTAAGTA GGGAACTAAA CCCAAATGAA TAAAAAGACA GCGCTCCTTC CCCAGAACAC 1080
GCCTGATTTG ACGAATCGCC TCGATGAACG GAAGCGACTC TATGTCACCG ACGGTACCAC 1140
CGATTTCGGT GATAACCACC CGAGCCCCCG TGGTAGCAGC AGCGGcGCGA ATTCGCGCcT 1200
GgATTTCATC CGTAACGTGA GGAATAACCT GTACGGTAGC ACCTCCGTAT CCTCCCGCGC 1260
GTTCACGGTC CAAAATAGCC CGGTACACGC TCCCCGCAGT CGTGCTATTG AATCTACTTG 1320
AAGGCACGTC CGTGAAGCGC TCGTAATGGC CCAGATCCAG GTCCGTTTCG CCGCCATCTT 1380
GCGTGACAAA CACCTCCCCG TGCTGATAGG GATTCATGGT ACCGGGATCC CCATTCAGGT 1440
AGGGATCAAA CTTTTGATTC ACCACGGATA TACCCCGGcT CTTAAGCAAA AGTCCGATGG 1500
CAcTGCGGCA ATACCCTTGC CCAGCGAGGA AACTACACCG CCTGTAATAA AGATAAAAGC 1560
CGGATCCATA CGGCACGAGT GTAGCGTGTC CATGCGTTTT TTAAAAGGGA GCACCCTGCC 1620
TCTCCCCTGC GCCTGTAGGG TGGACTGTCC CATCGGCTTG CATCGAACAT TGCTTTCGAA 1680
TACCATGGGC CCCATGAACG CGCGCGCTGC CCTCGCTCTT GCCATGGTCT TCACTTTCCA 1740
GAGGTTGTGC GCAGAAGAAC GTTTCGTTAT CTCCACTGAA TACTTCGACA TCATCTACAC 1800
CGAGGCCTCG ACTGAGTCTG CGCGTATACT GGCGAAGCAC GCTGACCGGT ACGAAGAACA 1860
AATCAGCCTC ATGCTCAATC GGGTTCCTGA CAAGAAAAAG CGCACCACCG TCGTCCTGTA 1920
TGCCCACACA CAGGATGCAG GCGGGTCTTT CTCTTCCAAA CCGTCAAGGA AAATAATCAT 1980
CAACGATACG CGCGTTCCCA ATCTGGGTTT AGGCAGTTTT AAGGATTCGC TGCTGAGCAT 2040
TTTTTACCAC GAGcTTACGC ATAAGATTTC TCTCGATTTC TTCATGCCGC TACTCCCTCC 2100
CCTCTTTACC GAAGGGGTTG CCGTCGCCTT TGAAAGCAAC GACGGCACGC AAGGGCGACT 2160
GCACGACCCG CTGACAATGC ATTACTTAAT TCAGAACAAA CTGGAAAACG TCTCCCCCTC 2220
CTGGAGGGAG GTTGCCGAGC TCAGATACAA CTACCCCCAC GGTATGCCCT ACGTGTATGG 2280
GGGAAAATTC ACCGAATACC TGCAGAAAAT ATACGGCAAG GAGCGGTGCG CCCGGCTGTG 2340
GCAGAACTCC TGGCGTCTCT TTATTCGACA CCGTTTTTGG GACGTTTTTC AAAAGAATCT 2400
GGGAACTGCG TGGAATGAGT TCATCGACAG CATTCCAATC CCGGAGAAGG TGGCACAGCC 2460
GCAACTCCTT TCTGAACGGG AAGCGCAGGG TCACTACGGT GCACGCAGCG nTGCGCCGAn 2520
CGGctTCGCC TATTATGACC GTGACCGTCA CGCGGTGCGT TTCCGTGACA AAGCAGGTGG 2580
GGTGCGCACG CTGTTCTCTC ATGACAACAC GCTGCATCAT CTGAACTTCT CCGAGGACGG 2640
ACGCTATTTG GCAGTATCAG ACACCATTGA CACGTGGAGC GAGCGCACGC ACCGAGTACG 2700
CGTTTTTGAC ACACACTCAG GCTCGTTCTC GCCGGAGGTG TACACTGGCG CGTCCGAAGC 2760
TTGCTTTGTG GGGAACGGGC AGAAAATAGT ATTCGTCCGG GTGCAGGGGC AATACTCGCG 2820
CCTCACGCTC AAGGACCGCA CAGACCCTAC cTTCGAGAAG GTGTTATACG AAGCAGGTCC 2880
TGGTTTGCCA TTCGGTGCGC TGTACGCGCC GGcGTACGCg sTGaTGGCAC CGTCGCCATT 2940
ATTGGCGCAC GCGGTATGGA GCGTAACTTG CTTTTCATCC CGGTGGACGA CAGGCCAATG 3000
ATGCAGGTTC CGCGCGAGCA GATGCCACAC GCAATGCGGG AACTGCAGTC GCAAAAGATC 3060
aAAGGCTCGT GGACGCTCAC CTTTAGCTGG GCAAATATGA ACATGCTCTC TCGCCTAGGT 3120
TTTTACGACG TGTCCCGCCA TACCTTCAGA CTAATGGACC AGGACGTGTC AGGGGGGGTG 3180
TTTGCACCGG TGGTGTACGA GGCACTGCCT GCTGCGGTGC ACGAAGAGTC TGCAGCAGAG 3240
GCGACCATAC GCGGTGAAGA GCCGGTTGTG CGTGTGGCCT ACACCGGCAG ACACCGCATG 3300
CACATGAGCC AATACCAGCG GGATGACCsC GCCCTGCGCG AGCGGCGGGT GTCGCTGGTT 3360
CCCTTGCACC CGGCAGAGGC GGAGGAGCAG TCGCGCCCCG CCACGCTCAT GGTTAACGGA 3420
GAGTTCATAG CAGACGTGCA CGAAGCCGAC CGCGGCAGAC GGTACCGCGC AgCTCAGTGG 3480
ATGTGGCCTC CCACGTTCTC ACCCCGCTTT GTGCCGCCAA ACAGCTTCAG CAGCCTCAAA 3540
GACCTTGGAC ACACCGGACT GGGGGTGAAC ATGAAGTTTG CCGATCCGTT TGGGCTCGTG 3600
GAGGTGAATC TCCAGTCGGT TTCTCATTTC TATCCGTTTT TTACCTCGCT GGGGCTGAAA 3660
AGCTCTTTTT ATGTGGGCAA AACCACGTag CCCTACGTGC GTATCACGAG ATAGACACCG 3720
GAGGGTTCCG GTACACTAAG CTCGGCGGGG CCTTTGAAAC GCTGACAAAC TTTCCTATGC 3780
AGGATGACCG AAACGCGTTT TTTGTGCGAA CTGCCGTGGG GGTAGACTCC TACTCCTGCC 3840
TGTGCGCCAA CGGCGGGGGT AACGGGAATT GCTGCGGCAA CAACGGGGGG CAGCAGTGcT 3900
GCGCCTGCAA TGGACAGGGC GCAAACGGAC CACATTATTA CAAGAGCCTT GAAAGCCCCT 3960
TCATTCAGGC ACAGGTGGAA ATGGGGTATA GCTTTTCTCA GCGTGCAGAG CGCACGGGAA 4020
CAAACTGGTT CGTGGCGGAC GTCACGGGGG TGAGCCTGAA GTTACACGTT GCAAATAGTT 4080
TTGACACCGG TAAAACAAAA GACGCGGTGC TCGTGCAAAC GAAAGGGTCA TTCCGCCTGC 4140
CGGTGGTGCC CCTACGGGTG GGCGTCAgcG CGTACGTTGG GTATAACGCC GGGTGGCGCG 4200
GGGGCAAAGG AAACATTCTG GCGGAGCACC CAGTGTACGG CTTTCCCGGT CCTACGTATT 4260
TACCCAAGCT CGCAGGGGTT GGTGGTATGG AGGGCTCGTG TAAAAAAAGC AAGAGTGCAG 4320
GGTTCGGCGC TGAAGCAGTG CTCACCATCT TGGACTACGA CATCAGCATA TATGATCCCT 4380
ATCTGCCCGT CTTCTACCGA AATATTGTTT GGAACGTAAG CTGCGAGTAC GTGCTCAATG 4440
CGCCAGACTT TTCCTCACCC AAACACCTGT GCGTTGCAAG CACATCACTG GTTTTGGAGT 4500
TTGACCTGGC GGATGTAAAA GTACGGGCCG GGGTTCAGTA CGGCTTCCAA CTTGCAGAGA 4560
CGCAGAGCGC CACAACCCCC GGTTTCAGCC CGATTTTTTC CATGGCCGTG TAGGAAGGTT 4620
CCCCCGTGGC GTGTGCAAAG CGCGCTACGG GGGCAGCGCT TCGGCAAAAC GCGGGAGCTG 4680
GCTGTCTACG CTCTGGCAAG GGTTTTTCTA TGCATGAAAC GTCCTACAAA CCTATCCACC 4740
GGCTGTCCCC TGACACCGCT AAAAAAATCG CCGCAGgAGA AGTCATCGAG CGGCCCGCCT 4800
CCGTCGTGCG CGAATTGCTC GAGAACGCAC TCGATGCAGG CGCCACCAAA ATCCATCTGG 4860
AAATTAACGC AGGCGGcTGC GCGCTCATCC GCGTGAGcgA TaACGGCCAC GGCATGTCCC 4920
CCCAGGATTT GTTGCTATGC GCTGAAGCAC ACACCACGAG CAAAATATCG TCTGCAGACG 4980
ACTTATTGCA GcTGCGCACG TTAGGCTTCC GGGGAGAAGC ACTCGCCTCC ATCGCCGCAG 5040
TCAGCCGCCT GCACCTTACG AGCACCCGAT CAGGGCCCCT CGCGTGGCAC TACCAGCCAA 5100
AGGCTGCAGG CACTGCACGg CACGTACCGC CGGTGCCGCA GGgCACCGAA GCAGGCGTGC 5160
TAGAGCCTGC AAGTCTTGAG CGAGGCACCG TCGTACGCGT CGAGCAGCTT TTTGAAAACT 5220
TTCCTGCGCG CAAACGCTTT CTCGGACGCC AAAGCGCAGA GACCACCCTG TGCCGCAGCG 5280
CACTCATCGA CGTCTCCCTC GCACATCACC CCGTGGAGTT TCGCTTCACC GTCGACGGAA 5340
CGCACAAGCT CACCCTGCTC AGTCAGCAAA CCCGGAAGGA TCGGTGTCTT GAAACGCAAA 5400
TGCTCAAAGG AGATCCTGCG CTCTTCCACA CCATAGAAGG AGGTGACTGC TCGTTTCACT 5460
TTCACCTTGT ACTTTCAGAA CCCGCCATCT GCCGCAGAGA ACGCCGCGGT ATTTTTACCT 5520
TCGTCAACGG ACGACGCATT TTTGATTACG GTCTTGTCCA GGCACTTGTG TTAGGAAGCG 5580
AGGGATACTT CCCCAATGGC ACCTTTCCGG TCGCCTGCCT TTTCCTCACC GTTAACAGCG 5640
AACGTATTGn ATTTTAATAT CCACCCTGGC CAAAAAGGTA GGTTCACTTn ACAGG 5695
(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 659 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
AACTACTCTn AGTAGGATCC CCCTACACTn TATCGCACAC GGCGCACAGA CGGAACGGAG 60
CGCCTGTGTT TCACACATTG AATTTAAAAA AGATAACGTC CCCGTCTTGC ACCTCGTATT 120
CCTTCCCCTC CTGCCGAACG CGGTTTGCCT CCCTCACCTT TGCCACACTC CCACAGGACG 180
CAAGATCATC GAAAGAATAC GTTTCTGCAC GAATAAAACC ACGCTCAAGG TCGCTGTGGA 240
TCACTCCTGC CGCGTGCGGT GCAnAGnCCC TGCCCGAATG GTCCACGCGC GACACTCCTC 300
AGGCCCCGCG GTAAAAAAGG TACGCAACCC CATCAGGGAA TACACTGCGC GCGCAAnGCG 360
CACGTCCTGA TTCGCGCAAC CCTAATTCTT GCAAAAAGGC GTTTTGCTCT GCCACATCAG 420
AAAGCTGCGC AAGTCTGCTT CAAATTTTCC ACACATAACA ATTGCCTGCG TGTTATGCAC 480
ACGTGCGTGC TCTTGCACCG CGCGCACGAA ATCATTTCCG TAnTGCATGC CGCTTTCGTC 540
TGTATTGCAC ACGTAnAGGT GCGGCTTCAT TGTCAACAAG CGCATATCGC GCACCGGTTG 600
CGCTCCTCAT CCGACAGCGG CGCCATAnAT GnCGCTTTCC CATTTCTAAA TATTCGCGC 659
(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1229 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
AGnTTCCTCC GCCTCCTTGC gCCGGGCTtC CTCCGCCTCC TTGCGCCGGG CTTCCTCCgC 60
CTCCTTGCGC CGGGCTTCCT CCGCCTCCTT GCGCCGGGCT TCCTCCGCCT CCTTGCGCCG 120
GGCTTCCTCC GCCTCCTTGC GCCGGGCTTC CTCCGCCTCC TTGCGCCGGG CTTCCTCCGC 180
CTCCTTGCGC CGGGCTTCCT CCGCCTCCTT GCGCCGGGCT TCCTCCGCCT CCTTGCGCCG 240
GGCTTCCTCC GCCyCCTTGC GCTGCAGCCA CTGCTCAAGC GCCGCTATCA TACGCTGCAT 300
TCGTAGTGTT GTGAAGCAGT CGGGTTTTGG GACACGTACT TTTTATAGTC AGCCAGTGCG 360
TGTGCATAGG CTTGTCGCTT CATGGCAGTG TTTGCGCGAT TGAGCAAGGC AGGAGGGTAA 420
TCGTGCTTGA GGGCGAGGGC CTGCTCATAC GCATGTTGGG CTTCCTCGTA GCGATGTTGC 480
ACGAAGTATA CGTTCCCGAG ATTGTAGGCG TATAGGTGCG CATATTCCTG ACTGTGTACT 540
GGTGGGTTTT GCAGCCACTG GATTGCcTGC GTGTAGCGTC CTGTCTGTAG GTACGCCATT 600
CCCAAGTAGA GTGCTGCTTT CGGGTGGGCG GGTTTTTGCT GTGCTGCTTT GTGAAGTGGA 660
CCGATGGCCT GCTGTGCCTG CTGTAAGGAA AGGAGGCGCT CTCCTTCACG AAAGTGGtCA 720
AAGGCATCCT CTGCCCGCAG GAAGGACGCA AATGCGAGCG CAAGAAAGGA GAATAAGCGA 780
ACGGATGGAA GGGAGCGTAC GCkTGCGTAG GTGCACGGTG AGGATTTTTG GCACATTTGA 840
CATTCCTTCT AGGGTGCGCT ACCATGGGCG GCATGTCCGC GTACATGGCA CTGCTGGCAG 900
CGGCGTTCTC GTCAAGTATC GTCTTTTTGC TCGTTTTTTT GATGAGAGGT TTTTCCATCC 960
CGCGCAGACA ACTTTTGGTG GAAAAAAGTT TTCGAGACGG CAAGTACGCG CTTGCTATCA 1020
AGCATGCCCA TGCGGTTTTG GCTAAGGATC CCCATAACTG GGCAGTGCGT GTATTGCTCG 1080
GTCGTGCGCA TCTtGCGGAA GGgAAGcGCG ACGTCGCGCT TATGGAGCTG CGCGcTGCCA 1140
GCAGCAGAGC TkCGTTTCGG AAAgTGGtAr ATGAaGtTgA gTTTcGCAAG ACTATTGCAC 1200
AGCTTATCTC CAGTTTGACC AATCGAnGA 1229
(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1506 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
TTGGGACTAG CGCGTACTCT TACCGTCGGG GTTGTTTCTG CTCTCGCGCG TCCCATTCAA 60
AATAAGGGCA GTATCATCAG AAATATGATT CAGACGGACG CGGCGATCAA TCCTGGGAAT 120
TCTGGTGGTC CGCTTCTGGA TACCCAGGGG CGCATGATAG GCATTAATAC CGTTATTTAC 180
TCTACATCTG GAAGTTCTTC TGGTGTTGGC TTTGCGGTGC CAGTAGATAC CgCAAAGCGC 240
ATAGTGTCCG AGCTAATTCG CTACGGCCGT GTGCGTCGCG GCAAAATCGA TGCCGAACTC 300
GTGCAAGTCA ACGCATCTAT TGCTCACTAC GCGCAsTTaC AGTAGGCAAG GGATTGCTGG 360
TATCACAGGT CAAGCGGGGA AGCCCCGCTG CACAGGCAGG ACTGCGCGGT GGCACGACGG 420
CCGTACGCTA TGGACTGGGA CGGAGAGCAG CGGTTATCTA CTTGGGGGGA GACGTCATTA 480
CCGCCATCGA CAACCAGCCT GTAGCGAATC TGAGCGATTA TTACTCGGTG TTGGAGGATA 540
AGAAGCCTGA CGACGAATTC GCGTTACAGT ACTCCGCGGC AGACGGCAGC ATGTGGTAGC 600
CGTGCGGCTC ACAGAACGCT CAGATGAGTA GCGAGGTCGC GCGCCCCGTG CAGGCTGCCT 660
TTTTCACTGG TTACTTCATG ATGCTGCGTG GTCCTCTTTC CTCCTTTTTC CCCTTCTTTT 720
CCTTTCCCTT GTAGGCGGCG TTTTTATCTG TCTCTGGTGT CTTCTTTGGA TCCTGTTGCG 780
TGTCAAACGT TTCAAACGAT TGCGCCATTT CCTCTAGCAG CGCGTCAAGA TTTCCCTGAT 840
CGAGATACTG CATAACCTGT GAACATTGAG CAGGAGATAG TGCAATGCGC ACCGCAGGgC 900
AGTTGTAACC ATTTGCACCC GGAATCGTGC GGTTTGCAAT GATAAAATAA GGGCGATCAT 960
CAGTGATAAA CTGATACTCA AAGCGCATAG TGGGGGTGGC ATTGTGCGCA GAGCCAAGAA 1020
TACCCCAGGT CATCAGCGGA GTAGTTGTCC CAAAGTACGC CCGCTCACTA CTCTTTTCCC 1080
GCGTGAGyky TTGTGCTTCA TATTCGCCTA AGTACTTCTC AATAGcTTCG CGCAGTGCTG 1140
TACGGTCCTT ACGCTCAAGG TAGAGCGTAA TGCCGTCCAG TAGGAATTTG AACTGCATAA 1200
GCACAGTGTC GATCGGCGGG TCAAACACAA AGGTAAAGTC GCGTGGAGAA ATCGCAGTAC 1260
GCAAGCGGtC GACCGTGTAG GCATTCAGGA CTCCTAATTC CTTAGGAGGG TAGTCGTTCG 1320
AAACAGTCAT GTTCGTGCTT GAAGCACAGC AAACCAAGAC CCCCGTACCC AGCAGTGCAC 1380
CCAGAGCGCG ATAnCCTGCG CGACATAACC TGATTCCCCA CTTCCGTAAA GGnAGAGTGG 1440
AGGGAGAAAG CATACAAAAT CCTnAGCGTT TCCATGGGGA CGTCAGCGTA CACACAAGCG 1500
nTGTCA 1506
(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5380 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
TCAGCCTATG GTCTACGGGA ATTATGGGGG AGGCAGCTAC TCGGGTCGGT TTTTCAGTAG 60
GAATATTATA ACTGGAGAGA AGAAACTTCA GGGTCAG AT TTTGAGGAGC GGTTCGACGA 120
ATGCGATGCC GAAGGcAGTG aTGTAAATGC GATAAAGCCG GCTTATCTTA AGCAGTTGCA 180
GGATATTGCG TGGAAACTGG AGGATCACAG CCGAGAGATT CGGGAGGTTC GCTTTACTAT 240
CGAGGCGGGC AGTTTATGGC TTATTGAGCA AAAACCTGTC GAAGCGAAGA GCACAATCTC 300
TTTGGTACGG TTGCTGCTCG ACCTGTACGA GCGCGAGGTG GTGGATGCTG AATACGTGGT 360
CAAGTCGGTA AAACCGGGTC AGCTGAACGA GATTTTGCAC CCGGTCATTG ATATGACGAG 420
TGTGACAGGT TTGAAATCCT CGCAGGGGGG GATTATTGGT GTTCCTGGTG CGGCGGTTGG 480
GCGAGTGTAC TTTACCGCTG ATTCCCTCAT CGAGGCGTGG CGTGTGGCGA AGATGGGCGG 540
ACAAGATACA CGGTGTATCT TGTGTATGCC TGCAACGTAC GCGGGGGACG TTAAGGCAAT 600
TGAGGTGGCA ACTGGTGTTC TTTCTAACGA GGGGGGGTAC TCCGCCCACG CTTCGGTTGT 660
TGCCCGTCAG TATGGGAAGA TCTCTTTGGT CCGTCCAGAT ATGAAGATTT ATTCGGACAA 720
AGCGGTCGTT GACGGTATGA CTATCAACGA GGGCGATTTT GTAACGCTTA ATGTTCCTTA 780
CTACGGGGAA TCCACCCTGT ATATGGGAGC TGCGCAgcTC ATTGAGCCTG ATCCAGAGAC 840
GTCTGGCCTA GTGAGCTTCA TCGAGCTTGC GAAGGGTTTT GTGCGTTCGT TTCACGTGCG 900
GGCGAACGCG GACAGTCCAC ACGATGCAGA gCTCGCGCTC GCCTTTGGTG CGCAGGGTAT 960
CGGACTGTGT CGTACAGAGC ATATGTTCTT CAAAGAAGAT CGGATAAATG TGTTCCGCCG 1020
TATGATCTTC TCGGAGAATG CTGAGGAGCG GACGGGCAGT CTCAAGCAGT TGCAAAAGAT 1080
GCAGGGAGAG GATTTCTACG GCATCTTCAA GGTAATGCAG GGACATGAAG TGACTATTCG 1140
CCTTCTGGAT GCTCCTTTGC ACGAGTTTTT GCCGCACGGG GAGAGTGAAG TTAGCAAGTT 1200
TTTGGAGTAT CTCGAGAAGG TTTGTGGTAA AGGTCTGTCC CGGGAGGAGT TGCAGGAGCG 1260
GATCTCCATG CTATCTGAGG TGAATCCCAT GCTGGGTCAC CGTGGGTGCC GTATTGCGAT 1320
TTCATACCCG GAAATCTACG CCATGCAGGT GCGCGCCGTG TTCGAGGCAG TGTACCGGTT 1380
GCAGAAAGAG AAGATCTCGG TGTACCCAGA GATAATGATC CCCATTGTCA TGAATTGCCG 1440
TGAGTTAAAG CAGATTGTGT ATGGTAAAAA GATTGAGGGG CACGCATACC AGGGTATCGG 1500
CTCGATAGAG GAAGAGGTAC GTCTTGCGCT CAAGGCAAAG GAGGTTGACT ATAAGGTGGG 1560
TGCTATGATT GAGCTGCCTG CAGCTGCGTT GAGTGCAGAC GAGATTGyGC gcTACGngcA 1620
GTTTTTCTCG TTTGGGACTA ATGACTTGAC GCAGACAACG CTTGGACTCT CCAGAGACGA 1680
TTTCAATACG TTTATGCCGG ACTACACGAT GTATGATTTG GTTGACGGAA ACCCCTTTGC 1740
GATACTCGaT GCGCgCGTGC scgAGTTAAT TGAGGTTGCT ATGCaGCGTG GACGGCTGGC 1800
ACGGCCGGAT ATTCAGCTAG GTTTGTGTGG GGAGCACGGT TCACGGTCAG AAAATATTCG 1860
TTTTTGTATG GAAGTAGGAC TAGATTACGT TTCGTGTTCG TCTTACTCGG TGCCTATCGC 1920
TTTACTTGCA ATTGCACAGG CGGAGATTGA AAACGCAGAA AAGGAAGGCA GGAAGCCTGC 1980
ATGGCGGGGA AGGTCTTCCG CGAAGTCAGG CGGTAGGCGC GCTAGGTAAG GTGTCGTTCG 2040
TGCTTGGTGA GCGTCTCTGT GCGTTCCACT AACGGTGCGA GGGATGGCTG CGTCTGCGTG 2100
GTTAGAGGTG TAGCTGGGTG TTTTTTGGAG GTTTTGTGTA CGCGCTGATT GAATACAAGG 2160
GCAAGCAGTA TAAGGTGGAA CGGGGTAGTA GTATCGTTGT AGATAATATC TCCGAAGTTG 2220
CGCCGGGCGG GTGCATCGAT GTGCGTGAGG TGTTGATGAT TGGTGGCGAG GGTTTGACGC 2280
GGATTGGTTC TCCTTATCTT GAAGGAGTGG GTGTGCGCGC GGTGGTGGGG GAATGTTTTC 2340
GCAGTCGGAA GATTACCGTG TACAAGTATA AGAGCAAGAA GGATTACCAC CGAACTATCG 2400
GTCATCGGCA GTGGTACACT CGCTTGACCG TTAGTGACAT CTTGGGGGTG TAGGCTCTGG 2460
TCCGAGTGTT GCTTGAAGTC GGTGACCGAG GGCAGTTTTT ATCTGCAGTT GCCTCTGGTC 2520
ATGCTGCGCG TGGAACACGA GGCGGTGATG TTGTGTGTGC GGCAGTTAGT GTGCTTTTGC 2580
GCACTGCGGT GCTTGGGCTT GAGCGTTTGG GTCCTCAGAT AGAGGCGGCG GATCGGGGTT 2640
TTCTCTCCTT TCGCGTGGGG GGGTGTCCGG ATTCCGCGTT GGCTCTCTTG TGTTTCACTG 2700
CGGAGTTTCT AGAACGTGGT TTACGTACGT TGATGCAGGA GTATCCCAGT TCGGTGCATC 2760
TTTGCGTGCG GAGGGGAGTG GTGTGTGCGT AgcGTCGCGG TTAAGACAAA ACGGGGGGTA 2820
GTATGGCTCG AAAGAGAGGT GGCAgTGGAT CTAAGAACGG GCGCGATTCT AATCCGAAGT 2880
ATTTGGGAGT AAAGTTGTTC GGTGGTCAGC ACGCTCGTGC TGGTTCGATT TTGGTGCGCC 2940
AgCGGGGTAC CCGAATTCAC CCGGGAGAAA ATGTGGGAAG GGGGAAGGAC GATACGTTGT 3000
TTGCTCTTGC TCCTGGGGTT GTGACCTATC TTCAAAGGAA GGGGAGGCGC CTCGTTTCTG 3060
TGTGCGTGGA AAACCGGCCT TCTTGAGCTT TTATAGAGGG AACCAGGTGC CTCCTGTGCT 3120
TTGTGTCTGT GGTGTAAGAA GGGTCAGGGG GTATTTGCGT GTCTGTGGGA TCGTAGGGGG 3180
CGTGCACAGG TTTTTGAGAG AGCATGGCCA GTTTTGTTGA TGAGGTGCTG ATTCGTGTTT 3240
CCTCTGGTCG GGGTGGAAAT GGCTGTGTGG CGTTTCGGCG GGAAAAGTAT GTCCCgCGCG 3300
GCGGCCCCGC GGGGGGCGAT GGAGGGCGCG GCGGGGACGT TGTGTTCCAG GTACGGCGCA 3360
ACATGCGCAC GCTTGTGCAC CTGAGGTATG GACGCGTGTT TCGTGCAAAG AATGGGCAGG 3420
ATGGAGAGGG GGCACGCCGC TTTGGTGCAA AGGGGCACGA TTGTGTTATA CCGCTGCCTC 3480
CGGGTTGTCT TTTAAGGGAT GCGCAGACTC ATGAGGTTTT GCACGATTTT GGTCATGCCC 3540
ATGAAGGTTG CGTGACGCTC CTTTCGGGTG GAAGGGGTGG TTGGGGGAAT TATCATTTCC 3600
GTGGCCCAGT GCAGCAGGCT CCGCAACGCG CGCATTCTGG GCAGCCGGGG CAGGAACGTG 3660
TGGTGCACGT TGAACTGCGT ATTGTGGCAG ACGTTGGCTT TGTGGGGCTC CCCAACGCGG 3720
GCAAATCTTC TTTGCTGAAT TTTTTTACCC ACGcGCGGTC GnTtwGcCCC TTATCCTTTC 3780
ACTACCCGGA TTCCTTACTT GGGGGTGCTG CGTACGGGGG ATGGGCGCGA CGTGATCCTG 3840
GCAGATGTCC CTGGGATTCT CGAACGCGCC TCGCAGGGTG TCGGCTTGGG GTTGCGCTTT 3900
CTCAAGCACT TGaCCcGCTG TGCGGGGCTT GCATTTCTCA TTGATCTTGC AGATGAGCGT 3960
GCGCTGCATA CATACGATTT GCTTTGCAAG GAATTGTACG CTTTCTCCCC TGTCTTTGAG 4020
ACAAAGGCGC GCGTGCTCGT AGGTACCAAG CTTGATTTGC CGAATGCGCG TGAGTGTTTG 4080
CAGCAGCTTc tGCACAGCAC CCATCCACTG AGGTTTGTGG AGTCTCGGTG CACAATCGCT 4140
GGGGTTTAGA TGAATTGCAG GAGGCTTTTG TGCGTCTCTC TGACGCAGGT GCGGGCgcGT 4200
TGCGTTCCCC TGTGTGGCGG AACCAAGCTC CCAGTTTTAT GTACGCTCAG CTTGAGGATC 4260
CGGTGTGTCA GGTGCGTGAT GATTTTGGGG CAACGGTGAG CTTGAGCAGA AAACGAAAAG 4320
TGCGCGGATG AAGTTAGCCC TGTTTGGCGG TTCGTACGAT CCTGTTCATC TGGGCCACTT 4380
GCTCTTGGCT GATGCAGTAC ACCGGCACGC CGGGTATGAC CGCGTGCTGT TTGTGCCTAC 4440
CTTCGTTTCC CCCTTCAAAG AAAAGGAAGG AAGTGCAAGT GCGCACGATC GGGTGCGGAT 4500
GCTCCATTTA GCAATTGGGA CAACGCCGTA TTTTTCTGTT GAAGAGTGTG AGATTAGGCG 4560
TGGGGGTATT TCGTATACTG CCGAGACGGT GCAGCATGTG CGGGAAAAGT ATGGCGCACA 4620
GCTTGAGGGC AAGCTCGCGC TGGTATTGGG GGAAGATGCA GCGCGCAGTG TACCGCACTG 4680
GCACGCGTTC GATTCGTGGA GTACACACGT TGATTTtGTC GTGGGTGCGC GCCCTGTGAC 4740
GTCAgGCGAT GGGGGGAACG TAGAACGCGC CACACGCACT CTACAATCGT TTCCCTTCCC 4800
ATGGGTTTCG GCGGAGAAtG TGGcGCTTCC TATTTCGTCA ACGTACATAC GCACCGCAAt 4860
TCAACGGGGG CgTAgTTGGG GTTaTCTTgT ACCTtCCCCA GTTCGTGAGT ATATTATCGC 4920
GCGTGGACTC TACCGkTCGT GAGCCTCTGC CTTCTTTCTC CGGAGTGGAC GATTCTTCCT 4980
GTTCTTCTCG TTTTCTTTCT TTCCTTGGAC AGGCGCAGAT GCCAGCGTTC TCTTTTGTGT 5040
CTGCAGATAT GACGGCGCTG ATTGCGCGTG TGGmAcCGTA CGCGCGCGCA GTGCTTTCGC 5100
CACCTAGGTA TGAGCATTCG CGTCGTGTAG CGGAGTTTGG GTGCATGCTG GTACGGCGGT 5160
ATGCGTTGGA AGCGCAGCTT GAGCCGCACG TTTATTGCGC GGGTATTGCG CACGATATGT 5220
GCCGGGAGCA TTCGGAAGTG TTCTTGTTGC GTGCTGCcTG CGTGCGATGG TTTTCCCATT 5280
GATGTAACTG AGCGTGGTAC GCCATTGCTC TTGCACGGGC GCGCAGCCGC GTGTGTGTTA 5340
GCACAGGAAT TTGGCGTGCA GGATGAGGTG TTGTTGTCTG 5380
(2) INFORMATION FOR SEQ ID NO: 142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13954 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
CACGCTCAAC CGGGAACGAT TCGTGGTGAT TTTGCnTACG CACCACTACT AATATTGTCC 60
ATGCTTCTGA TTCTCCCGAg AGCsCTGCAC GAGAACTAGC GCTCTACTTT TCTGCGCAAG 120
ATTTTGTTGA ATGGCGTGAC GGGAATTACG ATTTTTTCTA AAAAGTTTAG GGTCTGCGGT 180
GCGTGTTCTT TTTCGATACG GTGATCATGT TTCCTCTTGC TGAAGAAAGG TGAGGATCAG 240
GGGAGGGGTT AGGAATCATG GCTTCGATTG CAaTACTCGG TGGAGGGGCA TGGGGCACGG 300
CGCTTGCTGC GTCTCTCACC GTAAACGGTC ACACCGTAAT GCTGTGGGCC CGTCGTAGGC 360
AGACGTGCGA TGCTATCAAT GCACGAAACG AAAATGTTCA GTATCTGCCG GGCATTACGT 420
TGCCCGCAGC CTTGTGTGCC TCTCCCGATA TGGCATATGT CTGTGCCGGC GCGGATCTTA 480
TTGTATTAGC GGTTCCTTCG TGCTATTTGG CTGAAGTAGC TGCGCTTATG AATACCACTC 540
CTCGTTTTCA GAGGTTGCGT ACTGCTGCCG TAGgACAGGA ATATCCCCTT ATTGGTATTT 600
TGACAAAAGG ATTTATTCCG GATCAGGAAG GGATGCCTCA TCTAATTACC GATGCGCTGG 660
GTGCGTTGTT GCCGTCTGGG GCGCACGGGC AGCTCGTGTA TATTTCGGGT CCAAGCCATG 720
CACAGGAGGT AGCGCAGGGA AAGGTGACCG GACTTATTGC AGCGAGCCAA AATCCTATGG 780
CGGCCATTCG GGTGCGGGAA TTGCTGCGCT CGAAGAGGgT GCAGGTGTAT TCCAGTCTTG 840
ATGTTGTTGG GGTGCAAGTG TGTGCAGCGG TAAAAAACGT GATTGCCATT GCATTTGGTC 900
TTTTGGATGC GATGGCTGAG CATTCTGAAG CTTTTGGGGA CAATACAGAG TCGATGCTGC 960
TCGCAGCGGG CTTGAATGAA ATTCAAACCA TTGGAAAACA GTTGGGTTCT ACACATCCTG 1020
AAACATTCAC ATCGCTTGCA GGAATAGGAG ATTTGGATGT GACGTGTCGC AGCGCnTATG 1080
GACGCAACCG ACGTTTCGGA CGCGACATAG TGCATAAGGG GATCCTTGAT TCCTTTTCTG 1140
GAATACAGGA TCTCGTGAGT CGTTTGCCCG AAGTAGGGTA TCTGGCGGAA GGGGTAGTTG 1200
CCTGTATGCA TGTGCAGCGC CTGGCTGAGC GGGATCGGTT GAAGGTTCCA ATTTGCGCGG 1260
GACTGTACGC TATTTTAAAT CGGGAAAAGG GTGCTGACAC CTTTATGCAA GAGATTCTTG 1320
GTTGGTAGCA CGGGGGTGTT TTCTTCCGCG CGTCTCTGTG GGGGAAGCGT AAAGAACGAG 1380
TCTAGGAGTG GAAGTGAGAA CTCTTACCCA AATGTTATTT TTGGCCCATA ATCCGCGTTC 1440
GGACGGAGAG ATATATGCAT ATCATCAAGC GAAATGGCGA ACCGCAACCT TACATGCGCG 1500
AGAAAATAAT TGTTGCTATC AGTGCTGCTT TTAGAAGTGT CCAGAATCCT CTTGCTCCTG 1560
AAGTTCCTGC TATCATCACA GATCTTGCCG CGGAGGTTGA GCGACAGCTT TTTGAGATGA 1620
ACCGTGCGGG CGTTCCTGTT CACGTGGAAA AGATTCAGGA CTTTGTCGAA AAGACTCTTA 1680
CCAAATACAA TCACAGCGAT GAAGTGAAGA GTTTTATCCT GTACCGTGAC GATCGCACAA 1740
AAAAGCGTAT TGCAAGAGAA CAGATTGCGT GCTGTTTTAC TGACTCTTCA GTGCTCGGTG 1800
TACTGAAAGA AATCCAACAA GACTTTCCGT TTCCTGAGTA CAGTCTCGAT GCACTCGCCA 1860
GTAAGTTCCT GCTCTTTAAA AAAGAAGTTA CGGACGAGCG TCGGAGTATG CAACTGCTTA 1920
TTAAGGCAGC GGTGGAACTG ACTGCCCAAG AGGCTCCCCA GTGGGAGCTT AtTGcTGCGC 1980
GCTTGCTTAT GCTCGACTTT TCACTCGCGC TAGGAACATC TTTGGAAAAG TTAAATATTC 2040
ACTCCTTCTA CGAGAAAATA ACTTATCTTG AAGAGGCCGG TCTATATGGG GTGTACATCC 2100
GCACGCACTA TAGTCGGGCA GAAATTGAGG AAGCTGCCAC GTATCTTGAG TGTAGTCGCG 2160
ATAAATTGTT TACGTACAGC AGTCTGGATA TGATTCTGCG TCGCTATGTG ATCAGAACGC 2220
GTGCGCATGT ACCTCTTGAA ACTCCTCAGG AGATGTTTCT CGGTATTGCA CTGCATCTAG 2280
CGATGAATGA AACCCAAGAT CGTATGCAAT GGGTAAAACG CTTTTATACA GTCCTCAGCA 2340
AGTTGCAGGT TACGGTCGCA ACACCTACGC TTTCAAACGC GCGCAAACCT TTTCATCAAC 2400
TTTCCTCGTG TTTCGTTGAT ACGGTGCCAG ATTCGCTCGA CGGTATCTAC CGCAGCATCG 2460
ACAATTTTTC CCAGGTATCT AAGTTTGGGG GAGGGATGGG GCTGTACTTT GGAAAAGTGC 2520
GTGCGGTAGG CGCTCCCATT CGGGGGTTCC AGGGTGCTGC AGGTGGTATT CTCCGTTGGA 2580
TTAAGCTCGC CAATGATACT GCAGTTGCAG TAGATCAACT AGGAGTACGC CAAGGCTCGG 2640
TGGCAGTGTA TTTGGATGTA TGGCACAAGG ATATTCCGGA ATTTTTGCAA TTACGGACTA 2700
ATAATGGGGA TGACCGCATG AAAGCACATG ACGTATTTCC TGCGGTCTGT TATCCAGATT 2760
TGTTCTGGAA GACAGTACGC GATAATTTGG GGGCGTCGTG GTATTTAATG TGTCCGCATG 2820
AGATTCTTAC GGTGAAAGGC TATGCTTTGG AGGATTTTTA TGCGGAGGAA TGGGAGAAGC 2880
GCTACTGGGA TTGTGTAAAG GATGCGCGTA TCTCTAAGAG GACCATTCCG ATTAAGGAGT 2940
TGGTGCGCTT GGTGCTAAAA TCTGTGGTGG AAACCGGTAC TCCCTTTGCG TTTTACCGAG 3000
ATCATGCAAA CCGTGCAAAT CCCAATGGGC ATCGGGGAAT TATTTACTGT TCTAATTTGT 3060
GTACTGAAAT TGCGCAGAAC ATGAGCGCTA TTAATTTAGT AAGCGTAAAA ATCACCGAGG 3120
TTGATGGACA AAAGGTAGTG GTGCAGACAA CGCGGCCGGG GGATTTTGTT GTATGTAACC 3180
TCGCGTCGTT GGTGCTGAGC AATATTGACC TTTCAGATGA TAAGGAGTTG CGCGAGGTAG 3240
TGCGTGTGGC GGTACGTGCA TTAGACAACG TGATCGATTT GACATATTAT CCGGTTCCCT 3300
ATGCACAGGT AACCAATGCG TATTATCGTG CTATTGGTTT AGGTGTTTCA GGCTACCATC 3360
ACGTGCTTGC CCAGCAAGGA ATCGATTGGG AAAGTGATGA ACATCTTGCA TTTGCGGACA 3420
GAATATTTGA GCGCATTAAC CGTGCCGCAA TTGAAGCGAG TATGACAATC GCGCGCGAGA 3480
AGGGTGCGTA TGGGTGTTTC ACTGGGAGCG ATTGGTGTAC CGGTGCGTAT TTTCGCAAAC 3540
GCGGCTATGT CTCTGAAGAC TGGCAACGTT TGCAGCGTGA GGTAGCAACA CATGGGATGC 3600
GCAACGGTTA CTTACTTGCA GTCGCGCCAA CTAGTTCTAC GTCTATCATT GCAGGGACCA 3660
CTGCGGGTGT AGATCCTATT ATGAAGCAGT ATTTCCTCGA GGAAAAGAAA GGCATGCTAA 3720
TGCCACGCGT AgCTCCTTCT CTTTCGCAGA AGACCTGTCC ACTGTATAAA AGTGCACACG 3780
CAGTGGAGCA GCGTTGGAGT ATCCGTGCTG CGGGTCTGCG GCAACGACAT ATTGACCAGG 3840
CACAGTCAGT GAATCTGTAC ATTACAACGG ACTTTACACT GAAGCAGGTT CTAGATTTGT 3900
ACGTGTATGC GTGGGAAGTA GGAATGAAGT CACTATAtAC GTACGAAGCC AGTCGCTCGA 3960
AATAGATTTG TGTGGGTATT GTGCCTCGTA GGAGCGTGCT TGCATAACAC TGTCCACTGT 4020
GTAGGCTTTC TTTGTGACGT TGCGTACGCT TCAAGCCGGT GTGGCGGTCA GTATCGCTCT 4080
GGATCGTGTG TGCTTTTTCT GTTATAACGG GGCGGTGGCA CACTGTgTAG TAGAAGCTGC 4140
CGAAGATATT TTGGACCGGC GTTTTTCTGT ATTGGATAAG GGTTTCGTGC GTTTGATAGA 4200
TTACCTGGGA GGGGATGCAC GCATTGTGCA GGCAGCGCGT GTTTCTTACG GTGCGGGGAC 4260
TAGGACTGCG CGTGACGATG CGGCGCTTAT CGATTTTCTT TTACGCAATA AGCATACGTC 4320
TCCTTTTGAG CAGGTGGTCC TTACCTTCCA TGTACGTGCA CCGATTTTTG TCGCGCGTCA 4380
GTGGATGCGG CATCGCACTG CTCGCATCAG TGAGGTGTCT AGTCGTTATT CGCTTCTTAG 4440
TCATGACTGT TATGTTCCGC AGGAAACTTC AGTTGCAGTT CAGTCCACGC GTAACAAGCA 4500
GGGCCGCGCG TCCGAAGGTA TCTCTCCTGA ACAGCAGCAG GAAGTGCGGG CAGCGTTTGA 4560
AGCTCAGCAG AAAGCGGCGT GTGCCGCTTA CGACGCATTG ATTCAAAAGA AC TCGCGCG 4620
GGAnCTAGCG CGTATTAACG TGCCgCTTTC GCTTTACACC GAGTGGTATT GGCAGATTGA 4680
TTTACACAAT CTTTTTCATT TTTTGCGTTT ACGTGCGAGC GCTCATGCGC AAGCAGAGAT 4740
TCGTGCGTAT GCAGAGGTAA TCATTGAAAT TACCCGTGCA GTTGCGCCGT GCGCTACCGC 4800
CTCTTTTGAA AATCATGAAA AAGATGGGGT GCAGTTTTCA GGGCGGGAGT TTGctGCGCT 4860
TAAGGCCTTA CTGGCTGGAG AGGGTCTCTC CCTTGAGGGG AAGGAACGTG CGCGCTTTGA 4920
AGAAAAATTA CGCTCTGGCC TGCAGCAGTA GAAGTCTATA GTGCGCTCGT CTGTGTGAGC 4980
AGCAAGAGTA TTGCCTTTCT GTGTCTTAAA AAGGTGAATG TGGTCATAGG TATGCTGAkG 5040
AAAAGGAGAG CGGTCAGTTA TGGGGATTGA GTACTCAGCG AGTAGCATTA CTGTATTGGA 5100
AGGTCTTGAA GCGGTACCAA GCGTCCGGGG ATGTATATCG GCTCTACCGG TCCTAATGGA 5160
TTGCACCATC TGGTGTACGA GGTGGTGGAT AACTGTATCG ATGAAGCCAT GGCTGGGTAC 5220
TGTGATCGTA TCACCGTGGT GCTCGAACAA GGAAACGTCG TGCGTGTTGA AGACAACGGG 5280
CGAGGTATtC CTGTTGACGT GCACCCTCAT GAGGGGGTTA GTGCGCTTGA GGTTGTACTT 5340
ACTAAGTTAC ATGCGGGGGG GAAGTTTGAC AAGAAATCGT ATAAGGTGTC GGGTGGACTC 5400
CACGGAGTTG GAGTTTCtGT GGTCAACGCG CTGTCGTTGT GGGTAGAAGT GACAGTGTAT 5460
CGTGATGGTG CTGAGTATTA TCAGAAGTTT AATGTGGGGA TGCCGCTTGC TCCAGTAGAG 5520
AAGCGGGGAG TGTCGGAAAA ACGTGGcACT ATTATCCGCT GGCAGGCGGA CCCATCCATT 5580
TTCAAAGAAA CGGTGGCCTA TGATTTTGAC GTACTCCTGA CGCGTTTGCG TGAACTTGCT 5640
TTTTTGAATA GCACGGTAGT TATTCAGTTG CGTGATGAGC GGTTGGCGAC CGCTAAACAG 5700
GTTGAATTTG CGTTCGAAGG AGGTATTCGT CATTTTGTCA GTTATTTAAA CCGCGGTAAA 5760
TCAGTTGTGC CCgAACGTCC TCTGTACATT GAGGGATCGA AGTCGGATGT TTTAGTGGAA 5820
GTTGCGTTGC AATATCACGA TGGTTATACG GAAAACGTGC AGTCATTTGT CAATGATATT 5880
AATACCCGTG AGGGGGGCAC GCATCTTGAA GGATTTAAGT CGGCACTTAC GCGTGTGGCG 5940
AACGATTTTT TGAAAAAAAG TCCAAAGCTT GCAAAGAAGA TAGAAAGGGA AGAAAAGCTC 6000
GTTGGGGAAG ATGTGCGTGC TGGATTGACA GTGGTGCTTT CTGTGAAAAT TCCTGAACCC 6060
CAGTTTGAAG GGCAGACAAA GACGAAgTTG GGAAACAGTG AGGTGCGGGG TATTGTTGAT 6120
TCTTTGGTGG GGGAGCGTCT GACGCTCTAT TTTGAGCAAA ATCCAGGTGT GCTTACAAAG 6180
ATTCTTGAAA AGAGCATTGC AGAGGCGCAG GCGCGTCTTG CAGCACGTCG TgCAAAGGAA 6240
rcTGCGCGCA GAAAAAGTGG AATGGATAGT TTTGGGTTGC CGGGAAAGTT GGCCGACTGT 6300
TCGCTCAAGG ATCCGGCGAA GTGCGAAgTA TATATTGTGG AAGGGGATTC TGCAGGAGGT 6360
TCGGCGAAAA AAGGACGGGA CAGCAAGACA CAGGCCATTT TGCCTTTGTG GGGGAAGATG 6420
CTGAACGTGG AAAAGACACG TTTGGATAAG GTCTTGCATA ACGAAAAATT ACAGCCAATT 6480
ATCGCAACGC TCGGTACAGG TGTTGGCAAG GATTTTGATT TAACAAGGAT TCGCTATCAT 6540
AAAGTGATCA TCATGGCGGA TGCGGACGTG GATGGCTCTC ACATCCGTAC GCTTCTTTTA 6600
ACGTTCTTCT TTCGATACCT GCCGCAAATA ATTGAAGCTG GTTACGTATA TCTTGCGATG 6660
CCGCCTTTGT ATCGCATTGC GTGGAGTAAA AAGGAACTGT ATGTGTATAG CGACACAGAG 6720
CGTGACGAAG CGCTAGAAAG TATCGGTAAA AAAAGTGGTG TCGCTGTGCA GCGTTATAAA 6780
GGTCTGGGGG AAATGGATGG CACTCAGCTT TGGGAGACAA CTATGAATCC AGTGCGTCGC 6840
AAGATGATGC aGGTGGTGCT CTCAGATGCG GTGGAAGCAG ACCGGGTGTT TAGTACTCTC 6900
ATGGGTGAAG ATGTCGAACC GCGCCGTAAG TTTATTGAAG AGAATGCAAT ATATGCGCGT 6960
TTGGACGTAT GAATGTTTTG TGCATTTGTA TTTCCATGTG AGTGGTGCCG TCTGACAGGG 7020
GGAAGGATCA GTGGCGTATC AAGTGACGGC AACACGGTAT CGGCCGCAAC GTTTTCAGCA 7080
CGTGTTGGGT CAGAAGTTTG TAGTGGCAAC ACTGCAAAAA TCTCTTGAGG AGAACAAAGT 7140
TTCTCCTGCG TATTTGTTTT CCGGCCCACA TGGGTGTGGT AAGACCAGCT GTGCGCGTAT 7200
CCTTGCAAAG GCATTGAATT GTGTGCAAAG AGAAGCGTCT GAACCGTGTG GAGAGTGTCC 7260
GTCTTGTAGA GAGATTGCCA CCGGTACTAA TTTAAATGTT ATCGAAATTG ACGGTGCGTC 7320
ACACACAGGG GTGGGCGACG TACGTCAGAT TAAGGAAGAG ATTCTCTTTC CACCTCATGG 7380
GACGCGTTAC AAGGTTTTTA TTATTGATGA GGTGCATATG CTTTCAAACA GTGCCTTTAA 7440
TGCACTGTTG AAGACAATCG AAGAGCCTCC GCCGTATGTG GTATTTATCT TTGCAACAAC 7500
GGAGGTGCAC AGGATTCCTG CAACGGTAAA AAGTCGCTGT CAACAATTTC ATTTTCGTTT 7560
GGTAGATACT CAGACGCTTG TTTGTGCGTT GGCGCAAgCT GCCCAGCAGA TGCAGATTGC 7620
AGTTGAAGAC GGAGTACTGT CTTGGATTGC GCGTGAGTCA GCCGGTAGCA TGCGAGACGC 7680
ATATACTTTG TTCGATCAAA CCGTGGTGTC TTGCGCAGGG CCGGTAACAC TTGAGAACAT 7740
TCAAAAAAAA CTCGGGCTAA TGACTGACGA CTCACTTGCA GCACTGTTTT CACATTGCTG 7800
CCGCAAAGAT GCTCGCGCCG CCTTGGAATT GGTAGATGCT TTGGTAAGTT CTGGTGTCTC 7860
CGTTGAACAG TGCGTAATCG ACTGTGTCCG CTATGCGCGT GCACTGTTGC TTTTCACGCA 7920
GGgAATTACA AATGAGTCAC TGGTAGGAAT AGCGGCAAAC CAAGTGCCTG AGTACGTGCG 7980
TACCACATGG AATGCGTCGC AGATTGAGCG GGCGCTTGGA CTGTTACTGC AACTGTTTCG 8040
CGACATTCGT TTTTCAGTAG ATCCGCGGTA TGAATTGGAG CTCGCAATTT CGCGTTTAGs 8100
tGGTTGAGTG AGTATGTCTC AATTCAAGAA GTACGCGTTG CATTGGATAG TGTGCAGCAG 8160
ATACTGGACA CGCATGCAGT TCCCGGGGTG TGTTCTGCGT CTGTAGGTTC GGACGATGGG 8220
GAAACAGGTG TCGTCTCCCC ACACGGTATA CGTCCCCCTA TGTCAACATC AGTATGTACC 8280
GTGCGTGCGT TACAAGATGC CTTGGTAGAA AAGTTGCGCG CGTCACACCA GATGTTGGcA 8340
ACAGCGCTTG GTTCTTCATA TTCTTGGCGC GAGGAAGATA CTTCTGTGTG CATGTGCGTA 8400
CGCAgCATTA TGAGCGCAGg TTATCTCTCA GCACGCGTCG CTGCTCAAGG AGTATGCGTC 8460
AGAATTACTG GGACGGGAGG TATGCGTTCG CGTACTTCTG GATTCGGTGC CTTCGTCAAA 8520
AGTCGCGCCC TCCCATCTTC CTCAGAGTCC TGcCCCATCT GCTCTCTTTA CAACTTCTTg 8580
CTTAcTCTGG GgCAGGAGTG TGATAgGGTG AtGGAGATCT GCCTGCACAG tGaAGCTTCT 8640
CTGTGATTGT GTGCAAGGGC ACGTGGTGCG TGTGTATGAA GGTACTGCAC GGTGTGTACC 8700
TGGGGAGGGG AAAATAGCAG GGGCTGTGCG TACTCCGTAT TGAgTAGCGA CAGACGGGTA 8760
GGAAGAGTGC GAAAGCTGTC AGTTCAATAT GGGGATTTTG TACCAGGGGA TGTGTGATGA 8820
TACCGGCGAT TGAAGAAGTA GTGGAGCATT TATCTCGTCT CCCGGGTATT GGAGTAAAGC 8880
TGGCGACGCG CCTTGCCTAT CACCTTTTGA AACGTGACCC CGCTGAAGCG CAGGTTCTTG 8940
CGCGCGGGAT TGCGTGTTTG CATGAGCGCG TATATCGGTG TGTGTGCTGC GGTGCTTTCT 9000
GTGAGGGGAG GACCTGCGCG TTGTGCACGG ATGCGTCTCG GGACCGAGGC ATCATTTGTG 9060
TAGTGGAGCG TGCGCAGGAC GTGGAAATGA TGGCGGGTGT GGGGGAGTAT CGGGGTTTAT 9120
TCCACGTGTT AGGAGGAGTC ATTGCACCGC TTGAGGGGGT CGGTCCTGAC CAGCTCCGTA 9180
TTGCGGCGTT GCTGAAACGG TTGCAAGAGA GTTCAGTACG GGAAGTTATT CTGGCGTTGA 9240
ATCCCACCGT GGAAGGGGAT ACCACCGCCT TGTATGTGCA AAAAATCCTT GCAAACTTTC 9300
CGGTAATAGT AACGCGTCTA GcgTCTGGTA TCCCCGTAGG nGGGGACTTA GAATATATCG 9360
ACCGAACGAC ATTGGCGCAC AGCCTGCGTG GGCGCCGGCC ACTTGATTGC TCGGAGGCTT 9420
AGAGCATGAT CTCTACCGCA GTGACTACTG CCATATGTGT ACCCGGTGGA CGCTGCGGAT 9480
AAAGAACACG CTTGCGGTGT TCAGCAAGAA GAGTCCAACG ATACAGGCGC TCCACGTGAG 9540
TCCGTGTGTG AGCAATCGGT AACCTGCAAT CACAAAAGGT AATGCTGCCG CCGCGATGTA 9600
GTAATGGGAG ATTTGCAACG TGTGGCCGAG CACCCCGTAA GAAAGTGCGC TGAGGACTGC 9660
CGTTGTCCAG AAAAACCACC GGTGCCACGC GTGACTCCCG TAGTAGAACG TGAAGTTCCT 9720
CTGTGGGGTC ACCAGGATCC GTGTTTTTCT GTGGGCGGAs TGCGCACTTT GAGCGTTTTT 9780
TCTTCACCAG TACTATCTAT GATAGATCCG GATTCTGAGA TCGGcTGTGC CAGCAGTATA 9840
GCTCCCTGTG TGCTCTGTGA GGGACGATAT GCGTGTATAG TGTGTATCGT AGTGTATCGG 9900
CTAATAAGAA ACGCCACCGT GCACAGCACA AAGACAACGG CCCCTTCCTG TCGTGTGAGT 9960
GTCTTTTTCG CAAAAATGCC ACTTGCAAAG AGAGCAAGCA GTGCCAGACT CCGGAAGAAC 10020
ATGACGATTT GCATGGGAAC ATTCCCTGCG TATGTGGTAA GCGTGCCTGT GTAGAGTATG 10080
GGGAGGAAAA GACGGACTCC TTCGAAGTTG AGTGCGCACA GAAAAAGAGA AAAATAGGAT 10140
ATCTCGACCG CATGTGTTTT TTCGAAGCGA CGAAGGAGGT AAGTAAGCAA GAGTGGTGCA 10200
GAAAGACCCn TACCGAGGAT TGCCACCAAG GTACTGCTAT TTGAGGGTGC GGTGGACACC 10260
GCAGCAAGCG AGCGAGCAGC TGCTTCCGGT AGCAAGAAAG TGAGCACATG CATGAACGCA 10320
CGGCCAGTGG TGTGTAACAG GGGAGGAACG CGCGCAACGG CAAGTTGTGG GAGTGAACCA 10380
CCAGATAGCA CGAACGCGGC CTGTAAAAGG GAAATGCTGC ACAGCAGGGC GGAAAGTACG 10440
ATGGAGAACG CGACGACTTT GTTTCGTCCA GCGATGGTCA CCGGGCCATT GTAGGCGGTG 10500
CATGCGTGTT ATGTAAAGTT GTGCTGCGAT CAGGAGTCTT CTTAAGGAGA TTGGCGAAAG 10560
GTGCCGAAGC TGCGGGCGCG GAGGACGTAT GGGTAGAACA GGATGTGTGG GTAATCGATG 10620
CAAACGGATA GGTACTGTGC GTGCGCGTGG CGAGAAGCAA AGGAGAATAG CATGGGTGTG 10680
CTATCTGGCA GCGGTGTTTT TGACGTTATG TACGAAGTCT GCCATGGCGG GAGATATCGC 10740
GACCTTTGTG AATTTGGGAT TTTCTGCGCA CGGAAAGACA TTTGCCTTTG GACAGCACGG 10800
GGTAACTGAT GGGTTGTATC aGGCGTATGC GGATATCTAT GTGGTGGACG TAGAATCAAA 10860
CCGTTTTGTG CAGGGAGGGG TAGTGCGCAC AACGCCGACG CGAGAAACAA AAGGCAAACG 10920
AAGCATGGAT GTCTTTCTTG CGTTGCAGAA CCGCGCGCAA TCTCTCTTGC AACGTGCAGA 10980
TATTTCTGCG CTGCGTTTGG GGCGTACTCT GTACGTGCAG GCTGAGGATC GGATGGGGGA 11040
AGAGACGCTA CTGTTTCGAG ATTTTAAAAC GAATGTAGAA TATGTGGTGG TTATGCATGT 11100
AGAACGGACA ACAGAGCTGG GTGTGTCGTT CTATTTGACG GTTGAAATGA CAAGACCGAA 11160
TGGAAGCAAA GTTTCGCGTG TGGTGGGGCA GCGCGGCTAT GTGCGACCGG GGGTGAAAAA 11220
TTACGCCTTA AAAAAAGTAC TTATTAATGA GCAGCAGGAC GCTTTGATTT TTGTTGTTGA 11280
AAAGCACGGA TATGCCCCtG ATGGAGCATC AGTTCGGTAC ATGGTAGAGG CGTGTCGTCT 11340
GTAACGGTTA TGGGTTCTTG CAATCGTACG AGGGTGGGGT ATGGCTTACG TGTGATATAA 11400
TGGGCGTTAG CCTATTTTTG GTGAGGGAGC ATGTCGTCAT GTGGGATCTG CATCGTGCTT 11460
TATGCGCAGG GTTGTTGAGT GTGCTGCTCT ACGGCGGCTG TTCTTCTGCG CGGGACTTTG 11520
TACACGTTAT GAAAACTGCG GGCAATGAAG TGGCGCTTGT GCATTACGCG TTGCAAGGTG 11580
ATTGCATTGT GTTTGGTTTT CGCGGTGAGG TATCTGATGT AGTTGCGCGC GTATATCGCG 11640
CAGAGGGTGT TGTTTCAGAG GAAATGAGCG ATGAGCTGTT GCAGTCCTTT ACAGACGAAA 11700
TCCCCTCTGC ACTTTCACGC ATTGCGGTGC CTGCAGCGTA TCAGGAGTCT GTGGATGCAG 11760
CCCGTGAGCG CCTTTCGTTT TTCGCAGTGC GCTTTTTACA AAAGCCGCAC GTGGGGCAGC 11820
AGGTGGTATT GCGTGGTAGT GCGCAGGACG TGGCGCACCA GAAGCATGAT TTTGTGCTTC 11880
CTTTTGAGGG GATAAATACA CAGCCTGCGC GTTTAGAAAT CAGCGAAGTC CGTCCCCTTT 11940
ATGGGAATAA ACGCTCGGAA TTTGTGGAAT TGCTTGTCGT GGAATCCGGT AATTTGCTGG 12000
GAATTACTAT CACAAATGTA GGTGGCAAGG GGAATCGCTG CGACTATCAT TTTCCTGCAG 12060
CCCAGGTGCG GGTAGGAGAG CGTGTGGTAT TACATTGGCG TAACAGGATC CTGCTTCGTG 12120
TGATGAACTG TCTGCTGAAA CGGTATCGGC AGGTTCGCAG GCCTGTGCGC GTGCACGCGA 12180
TTTTTGGGGG CAGGAACGCT CAATCCCAGG GCGTAACCCA AACGCAATCG TAGTGAAAGA 12240
ATCTGCCAAC GGAAAGATAC AGGATGCGCT GCTATTCTTT AATACCCACG TAAAAAAGGG 12300
AAAGGCGCCG ACGTTTCGCT GGGCTTTTCC AGAAATTGAA GCTGCCTCGC GGCTTGCATT 12360
GGAGCAAGGG GCGTGGCTGA GCACGCACGA GCACTTCCCT CTAAGAGAGA GTCACTTTTT 12420
TCAGkGCGAT TTGACGCCAG CGAAGAGTAT CGCGTTAAAG CGCAAACGGG GCGCAGGCCG 12480
ATCTGCGGCA GATTGTTTTG TGCTAAAAAA AGCAACGATG GGGTTGCCAA ATCAATAGCT 12540
CTTCAgGGTG CTGCTAACTT CAGCAAATTG ATCGTGTCGG cACCCTGcTT AGGTCTTGGC 12600
GCTTGCTCTG AAACCAAAAG CTTGTCGGGT TCTACAGCGG CCGCTTTTGC GTGTCCACAT 12660
CCCTTCTCCT ACGCTGaTCA TTCGAGTTCC TCAAGAATTC CTGCCATTTC TTGCCCACAG 12720
GATACTCCCC ATCTGTCCAA CACCTCGTTC ATCCGTCGcA GATCCTCACT TGaAAGCACG 12780
aAGCACCCTG cACTCCACGC ATAATTCGTG TCGCGCCCTA ACTTTTCTGA CCACCGGTCG 12840
TGTATTAACC ACCGACCGGT CTGATATCCT TCTGTAGATA TTTGCATCGC TTCCCGATCA 12900
ATCCACTCTC CATCCATATC CCATGTGCGT GTGATTGCGT GTATCCGTCC ATGAAACGCG 12960
CGGGCAGGCA CAAAGCAACG CACTACAAAT GCCCCCGGCG CAACCGATTC CCCATGTGCC 13020
ACCCCAGACC CCGGTGCCAT CCGCCCGAAA CAATAATTCG CCACGGTTTG TACCATCGCG 13080
CCATACAGCA CCCGTCCCGC TTTATCCTGT AGTTGCATTG TATCACAACT GTTGTTTGCC 13140
CAGTTGTTCT CGAAACTATC GCCTTTGAGT GGATCTACGC GGTAATGGAA ACTTATCCTT 13200
CTTCTCAATA CGTGCATCCG TACGTCCTCT CTCATCATTG TCTCTCCCCG CTGTACATCC 13260
GCGCGGTGCG CCGCACTGCA GACACCCTTC CTCCTGTGAG GCTGGTAATC GGCGTTGGAA 13320
TAAAAGGGCA GGGTATCGCC TGCTTGCATT CTGCCGCCCT TCCTGAGAAG AAGGCgCGCA 13380
TCGCGCAGTC ACTGACTACC CTTCCGGCAG CCTCCGGTGC ATCGTGCCTC ACCTTTTTTA 13440
CCCGTGGACA CATACCCCAA TTGCGCAGTT TCAAAAAGTC CGTTGAACAA TCGTTCGTCG 13500
TTTTCTTACA CGCAGATGTG CAACAACTAC GAACGCAAAA CATCACGTGG CTTGGATCCA 13560
TTCGGCGGAC CGACCACCCC CCTTGCTTCC ATTTCTTCGA TTAGGcGCGC GGCGCGATTG 13620
TAGCCTATCT TCAATTTACG TTGCACATAC GATGTGGACG CTTTACCCGC GTATTGCACT 13680
ACCTGCACTG CCTGCTCGTA TAAAGGATCG CTTTCATCCA CAAAATTTCC AGATATACTC 13740
GCGTCGTCAT CGTCAAAGAA AATTTCTTCA TCAAGATACT CAGGCGTTCC CCACGCGCGT 13800
ACATGGGCGA TCACGCGCGC TAATTCTCGC TCGGAAACAT ACGCACCTTG AATCCGCGTA 13860
GGAAAAGACT GACTCGGGTT CATGTACAGC ATATCCCCTC GTCCCAGCAA TTTTTCTGCG 13920
CCCATCTCAT CCAAAATAAT ACGGCTATCC ATTT 13954
(2) INFORMATION FOR SEQ ID NO: 143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7247 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
AGGGTGCCGT ACACCGGCAA CCACTCTAAC AACTCCACGC CAGAAAACCC ATACGACTTC 60
ACCCGTTCAA TGAGACTGAA AATAAGATCA GACTCAAGTA AATTGAGCTC GAGCTTTGAA 120
GCACCCTTGA ACAAGGCCTC GCGAAAATAC AACTTCGACC AGCGCTCATC TCCCGCAAGA 180
GCGTATGCAT CAGCAAGTTC AGCAAGAATT GCAGAAGAGT CTGGCTCCAA AGAGACTGCA 240
AACTCAAGGC ATTCACGCGC AgcATCGTAA TTGCCCAATG CTTTATAGCA CAGCCCAGTC 300
TTACGATAGA CTTCCGCGGG ATGCTCGCGC GCATCACCGG TGATAAGACT CCCATAAAAC 360
CGCAATGCGT AAGTGAACAC ACAACAGCGC AATGCATACA TAACAGGTTC GACAACGCCA 420
CCACGTTTTT GCATATATGC AACGAACGCA CCCCATTGTC CAACAAGATA CTCACCCTGa 480
CTAAGACCGC CAGAGATAGT ATGTACACGC TGCATCCGAT CCTCCCAAAA GCGAATACCG 540
GTAAGCGCGT ACAAAACATC CCGATTGTCT AGCTCTTTTT GAAGCAAATC CTGCAAATCC 600
TGGTGTGCCT GGCGCAGATC CCCCTGCTTC AGTAAATCCA GTGCTGACTC AAGCTCTACA 660
CGAATCGATC GTGCAACCAT AGCGGGATAT TGTATCGGTA TCAAAGCGCC ACGTCAACGA 720
AAGTACTTTT TCAAAAACAA AACCAACGCT ATGTAAATAA ACACGAGCTA CTCGCAGTGT 780
CCTTTCACTG TGACATAAAA ATCCCGCAAA CGCGCACGCA CACACTGGTA GAGCTGAGCA 840
CTTCCCTGCT CTTTCAAATC GTCCGGCACC ATCAATTCCA GCGCATCATC AATTGTGCAT 900
ACCGCCCACA CGTGAAAgTA CCAGCACGCA CCGCATCCTG CACACGTTCA GGCAAAAACA 960
ATTGTGCACA ACTACTCTTT GGGATCAAAA CTCCCTGCAT TCCCGTCAGT CCGTTCAGAG 1020
CACAGACGTC ATAAAAGCCA CTGATCTTCT CGCTCACTCC TCCAATAGCT TGCACTTGTC 1080
CCAACTGGTT CACACTTCCT GTTATTGCGC GGTCCTGCCG AAGAGGAAAC CGGCCAATCG 1140
CAGAAAGCAA TACTAAAAAT TCTGCAGCAG ACGCCGAATC CCCATCAATC CCGTGATACG 1200
ACTGTTCAAA ACAAAGCGCT GCAGAAAGAC ACAGTGGAAA ACTATCCTGC CCCTTTCCAA 1260
GGTCATCCAC AGCTGGATCA AATGCAGCAT CTGCAAAAGC AGACAAGCAC TTCTCACGCA 1320
ATAGCGAAGT GATGATTAAA TGCGCCTTAT CATAAATTTC CCCCGAAAGA CCTGCCTCCC 1380
GcTCaATATT CATCACTCCT TCCTTACCCG CGGATGCCTG TGCCGTTAAC GAAATTACTA 1440
TGCCAAAGCT ATGGCCGCAA TGTTCTTCGA TTGCCAGCGC GTTAATTCTC CCTATTCGAT 1500
ATCCCTGTAC CTCTACCAAT AACTCTCCAC AGGCTATCAT GCGCTGAAAA CGCTCCCGCG 1560
CGCGGGAACA CACGTACTGC CTGCGATTCA GCGCTTCCTG CACCACATGC GCAGTAATCA 1620
CCGACACATC TGGATGCATA TCCACGGCCA CTGCATGAGA CTCCAATACT AAATCTGCAA 1680
TTTGTACAAA AGAAGTACTG AGCCGTGTAT GACTCTCTGC CAGTTCTTCC GCATATGCTA 1740
ACAATCGCGC GTACGCTGAG GAATCAAGTG AGAAGGTGCC ATAACGCGCC ACAACTCTAT 1800
CAAGGTAGGC TATCAAAGCA ACTTGATTCT TATCAGAATT CGGCATGCTC GTATCAAATT 1860
CTGCACACAC CTTAAATAGT TCTCGGAACG AGGAATCTTC CTGAGATAGA CGTTCGAAAG 1920
AGCATGGCTC GCCAACCAAA ACAAGCTTGC ATGTGAGCGG AACTCCTTCA GGCCGCAACA 1980
TACCTTGGGA CTGGGAACTA CCCGCTGGGA GTAACACCTG CTTGGTACGC AGCGCACGCT 2040
TCAAATGTGT CCATGCTTCT TCCTCCGCCA GTAGATCTTC GAGCTGCACG ATGAGTACAC 2100
CTGCATGCGC TCGATGCAGC GCACCTGCAC GAATGCGTAA ATGCCCATTC TCCAGCGTAT 2160
CCCCTTCATT CCCCTTGCTT TCAATCGATC CACATAAATT CGCCAAATTC GGCTGATGCT 2220
CCGTAAACAC ATACCCCGCA TGTTCTGTGT GCACACACAC GAAATTTAAG GTATAGCGAT 2280
CGAAAAACCG CTTTTTTACT AGTGCAGAAA TACGCATTGA CATTAAACAC TGTACCCGCA 2340
CCTGCACATC GgTTTGTATC CGCTCTATGT ACGATATGAT GCGCGCACGC ATTTCTTCAA 2400
ATTCATGCGG CACCTCAATG AGGCGAGAAG AGGAAGTCTT CTCCCTTTCC GTCTGTTCCG 2460
CCGCACACGA AACAGGTGGA GGGGAAAAAA CCGGTGCATA ACACGCCACG CGTTTTTTAA 2520
TACAGGCCAT CTGCTCGAGG ATAATTAACC GCAAACGCGC ACGGTAGTAC TGCGCAAGAC 2580
GCCTAcGCGC GGCCCGycGC gCAGTGCGCA ACGTATGGAG CAGCGAAGAA ACCTCATCAC 2640
AGGAAAGACG ATATCGCGCA TGGAGTTCGT GTACTACACA TCTAGAAAGC TTCGCGCGGG 2700
AAGCTAAATC GTGCAGCGCT TCAAAGCTAC TGTCTTTTCC CTTTAATAGG GGAACTAAAT 2760
CAAAGGAGTA CGTACCACGT TTCCTATGCC ATCTCACACG AAACCCACGC GTATACAACT 2820
CCGCTTCAAT ACGTGAAAGC TCAGCACACT CACGTGTTTC AATATCAGCC AGCAATGTGC 2880
GCCGCTCACG TAAGAAGGCA TCACTTTTCA CAATATCCTG TGCTGTATTG AGAATAGCGT 2940
TAACCGATCT CCTCAACGCC GTAGCAAAAG GCATCCCTTC TCCCGCCGGA AACTGCAGCA 3000
CcTGCGGTTC GTGCGGATGC ACAAAATTGT ATGCATACGC TATATCCCAC ATTTGCTCGG 3060
GACGAGGTAC AAAATCTTTT AATAGATATT GGAGCGTAGT GCGCTTCCCG GTGCCTGACG 3120
CGCCAATCAC ACAGATATTG TAGCCGTCAC CGTACATTCT CATCCCAAGA CGTATCGCAG 3180
CGCAGGCACG TTCATGCGCT AGCAAAAAGG GGTGATCCTG CGCACGGGCT TTCAGATACG 3240
CAATAGTGTC AGGAGCAACG ATACCCGTAA CTTCTTGCCA GGAAAGCTCT CGCCAGAGCG 3300
AGCTACGGCA ATAGGGCCCC AACATCAAGT TCCCCAGCAA AGCACGGCAT GCGGATATCC 3360
CAATTACTCA GCACtGCGGA TGTATCTCCC CAAATATACC CACACGAGAA CCATGCACGC 3420
AGATACTCGC TTGTCTGCCT AAGACAAACC GCGGATCCTG CGCTTCCTCT ACCTGATACG 3480
GAAGTTTCAA ACAATACAAA AGTCCTGAAA CTAGGCTCGC CACCTCATTG TAACTTGCCT 3540
CCTGCGAGGC ATTGAGGAAT CCAAGTGATT GTTGCGTGCA TGTACCGTGT TCACCATGAG 3600
GGGAGCAAAA GGCAACCTTT CCAATTTCGA AAATACGATG TGGATACAAC GCATGTGCTG 3660
ACTTTAGCTC AGCTGACAGA AGACACGGAA TGATAGACCT GCGCACGAAT CGGTAGCTTT 3720
CAGTAAGAGG ATTTTCTATC TCGATTAAAT CATCTGCCGT GCATgCATAC GCGTGCAAAA 3780
CTCCCTGGCA GAGCCAAGAT AATGAAAAAT CATCTCCTGA TAACCAAATC CAACTAGAAG 3840
GTGCTTAATT TTGCGCGTCA AAAGGGTCAG ATCACTCAGT CTGCCCACTG TAAAAGAACA 3900
TGGAACTGCG GAGAAAAACG GTCAAGCGTT CTACCCAGCA TCACTTCTTC AATCAGATCT 3960
ACCGCATGGA GAAAATCATT ACGGTACGCT GCAGGCTCGA CGACATAGGT TCCCTGCTTA 4020
ACTTGCACGG CACAGTCCAT CCTTGCAAAG GCTTCGGTTA TATCTGCAAC AGGCAAGGGC 4080
ATCCCCAAAA GACGATCTAT CTGCTCGTGC GCAACCTCCC TTTTTTCTTG AAAATAAAAG 4140
GGGGTAGTAA CAGACCTCCC CCATGGGGTG tCAAAACTAT AGGTAATTTG CACAGGTTCA 4200
ATCTGCATAC CCATATCAGA AAGATCACAC GCAAGGCTAT TCACCGCTAC TAGCACTGCG 4260
CACATATCCG TACCACTACA CTCGATAAAT AGTTCTGTAT CCCCGGTCAC TACTGCACCC 4320
AGCGACGCGC TATTAGTGAT CGGAATAAGC GAAAGCACCT CACCACGAGC ATCGACTAAT 4380
AAGGGAACGA AAGAATAATC CTTGAGGAGA TGACCATATT CACGCCCTTT CGGATGTTGC 4440
GTCAGAATCT GCTCACCAGA CAGCGGCATG GACATACCGA GAGGCGTGAA ACTTACCTCT 4500
GCGAGTAGGA CAGCGCGGTA TGCGAGTGGC CAACATATAT CCTGTGCACG GTAACACCCC 4560
AGCGAAAGGG TACACCTCCT ACGTCCAAAA TTACTGGCGA GTCTTTCCTG TGTTTGTATT 4620
AGATCCCACA GTCTTACTTC GCTCAGTCCA GTACCCCGAG CCACAAAGCC AGCAATAAAA 4680
GGGCGTATAG TTTGCAAACG CGCATCCACT ACTACGCATC GCCCATACGA GTCGCGCACC 4740
GTACAGCGAT CTGAAAGAAA TGCACGATAC TGCGCTGCGC gsGAAgCCCC CCCTGCGTGA 4800
ACACGCAAGA GGCGCGCAAG CCCTGCTGTA GACCACAAAT CCGGACGATT TGTGTCGTTA 4860
AGTTCGATTT TTAAAAGACG CGCATCAAAC GCACCCTCAC GTACACAGCT GGcCTGaCCC 4920
CTCGTCCTCT GcACAGGaTy GcACACCTcC CCTcCTGGcG AAAGGCTCCC TCCCTcTTTA 4980
TCTTCCCTCA CGGAAAAAGA GCTGCACTCC AATTCAGCCT TAGCAtTCGC AAACGTCGCT 5040
CCAACGTATC ATCGTCGCAC CGTTCACCGA GCAGCTCAAA AAACAACGCC TCATGAATGC 5100
TAATTTTCGG CATGAGCCCT CCGACCGTGC TCAGTATACG GTCAAAAAAA GGACACCACA 5160
AGGGCCAGCC ACCACACCGT CTTTGcAGAG GATGCCACTA CACGGCACAC GTGCTCTACA 5220
CTTTAGTCAT AATACGCGTA ACACTCTTAA ACTGAGCGCT CATATGCGCA CGGCATAAGT 5280
ACCGTGCGCm CAAAGGCATG AGCTCACTGA CGACCGAAAC GAATACTCCG ATACTTTTCC 5340
GGAAATTCCT GGTCTGTTAC CGGTAAAGAA AACTTACCCA AAATATGTAT GTCAGAGAAC 5400
ACTGACACAT GGTTGGATAC TTCCTTCCCT GTGAGAAGAT CCGTGTACGG CGCAACCCTC 5460
CATCGTGTTC CTTTGGAAAA CAAACTAAAG GCATCAGAGA GAGCTACACG CATATCCTTT 5520
TCTCGGAGTT CATTTTGCAA TATCCTGCGC GCATATTCAG TTAACCCAAG TGTGCTGACA 5580
AAAATATGCG AACTTACATT CACCCCGAAA CGACCTACCA ACCTCTTCTC TTCTACCGCC 5640
GaCTCGACGC GcTCCAGGAT CCCCGACACA TCCTCATCGT CAGCATACTC CACCCCGAGT 5700
GCTTCAGCAC AGTCAAACAC ATTCGCCCCC AGAAGCATGC CATCTTTGCT TACCAGTTCT 5760
CTCATCAGGG AAACACGGTG CGTATCATTA GTGCAAAAGA ACATAGTGTT CTTCCCATAC 5820
TTACCGATCC ATTGCTTCGT GTGTGTGCCG ATAAACGCAC GCGCCCCTTC AACACCTGAC 5880
TyCCTGCGGG ATCTGGAGCC TCCTCAGACG CAAACTCTAT ACCGATATCT GCACACACTG 5940
CCTCCATTAC CTGACGCCTC GTCCTAAGCT CAGCCTTTGA AAGGTGACGA GGAAAAGAAA 6000
CGTGCACCAA CGTACGCGCG CCCATTTTCT TTGCAGCCCA GGGAATAGCA TACCCACCAA 6060
ATGCATAATC ATCACTAACC ACGATATCAG CAGATTTTTG GATAAGCTCA GGATACTCAT 6120
GCGGATCCCC TACAAAGAGC ATGATGTCGG AACGtTCGCC CTAATTTTGT TAAAAGCCTC 6180
TGCAGTTCCA TGAATTCCTT CGCTGACAAC AATAGCACCC ATGAGAGGAT CATCAGCAAA 6240
TGCAAGGAGC CGGGACACCG TTTCATCTTG CCGATTTACA AAATCTTCAG GATACGTGAC 6300
ATGCAAGATA CATCCTCGGT GACCGTCACC TACCGCACCG TACATGGCGC ATGCTCTTTT 6360
TGCCGCAACT ACATCATGGC TAGAGTGACT ACCCGGACCG GTCATAACCC CGATGTGATA 6420
ATCCTCCACC GCTTTCGTGC CCCCGGTATT TTGTGGAGCA CGCGCGCCCC CACAGCGACA 6480
CAACAGTGCA CCGACACATA CCACCCATAC CTTCAATAGA AACTGACTTT TCATAGTCTC 6540
CCCTTAACGA TCTGCACACA CCATCTCTCC AAAACGCTTA GGCGTATGGT CCACCCCCCC 6600
CCCACGGGAA AATCAATTGT GTATTTATGA TACAGCATCA AAAAATCATC GTGTGCAAAC 6660
CAAAAAGGTA TCCCTTCCCT CCCCCTTATT GAAGGCTCCC TCACCACAGA ACCTTACACG 6720
AACACACGAG cTyGtCCGCT TATCCGGCCA AGAACCATGC TCCAAACTTA TGTAGGGACG 6780
ACGCGCAAGC ATTCCCTGCG ACCACTCAAT GAGCAACTCA CATCTTTCAA ACGGACTCAC 6840
CGCCACCCCT ACCTGATAAT ACAGATTTTT ACCTGCGCGG AACGTGCGGT TCCTCTTTAn 6900
CGTTCCGGCA GCACCACCTG GTCCGTAAGA AAAACCcTGC GCACCATATG TAGCGCCATA 6960
CAGCGAAACC ATCGGTCTTA TCCATGCGTG AGcAGnCAAG AGTAATCCGG TAGTTAAGCC 7020
ACAGTTTCCC CACAACAGGC AACGCGCTGT GTGCACCCGG ATGGACAACA CCGGGTGGAG 7080
GAAGCACAAA CGCCTGAGGC TTCTTTCCCC GTGTGGAATT CGGCGTACAG GAGCCAGGTT 7140
CCAGGTACAA GCCATGAGTG AAAGGAATAT ATAAAcGCGC CTcGAcACyT CCGCTCAGAC 7200
CGGACACAAg CTTAGAAAGA GAATCGTACC ACTTACTTTC CACCCCG 7247
(2) INFORMATION FOR SEQ ID NO: 144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2898 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
AACCGnTTCC CGTTGTGCTT CTGCAATTTT TCTGCTGTGC TCGCTGCTCC TGAACTGGCC 60
TTCACGCGCG AGGCTCTCGG CCTGTTCGGT TGCCTTCTGC GCAGTGTGAG CCGCCTGCTG 120
AAGCTTCCCT GTGCGCACGC TCGGCGAACG CGCTCCTTCT CTTTTGCTTC CTGCGTACGC 180
GnCCGGGTCC GGTGAAGTTC TGCGCCGCCT GCTGAGCTTG CTGAGCTTCT CGGCGTGTCT 240
TGTCCGCGTG TGCCTGGGCG CGCTGAGCCT GAgTGGAaCA cTCCGCCGCG CGCTCCTGCG 300
CCTGCTGTGA CTGCTGTGCT AGTTGTTGGG CCCGGCGCGT TCCTTCCTGC GCCTCCCGCT 360
CTTTAAGGTC AATCATGTcC CTGCGCAGAT CAACGTTCTT GTCCGGATCT TCACGCAGCC 420
GCTCCACCAC CCGCTTGTCA GAAATGGCGC CGGTGTCCAC TGCGCTGAGT GCGTCAGAGT 480
GCCCGGAAAG GGGAATGACG AGCTGCGTTT TTCCCGGCCA CTGATCGAAA CGGCGCGCAA 540
TACCGACGTG CTCGGGGGAG AGATGCGCCG TTACCACCCC CTTGTAGCGC GCCTTGAATG 600
TGGGAAGGTC TGCACGATAC ACCGCGTTGT ACACCGTTGT GAAGGTTGCT ATCGTGCGTG 660
CGTCCTGCAG GCGATAGCaT AGGCGGCAGA AAGATAGGCG GCGACGATCT CGCGTATGTT 720
GTCGATATGG TCCACCCGTG CCTGTGCGCC AATGATGAGG ATGTCTGCAT CAAAGCCATT 780
GGTCGTGTGG GGACCTAcTG CGTGAATAAG cgCGTAACGC gcACGATCCC CCGCAACACC 840
GCCGCGCAGT GCAGAGGCCA AGCCCTCCCC GATGCGCCTG ATAGCGGCGG CGCTATCTAC 900
GTCCGTGTGG GTTCCTGCAA AATTTTCAAA CTCTACCGTG GCGTTTGCAC GCTCAAGCTC 960
ACGGCGATCC ACCTCGAGTG CAAAAAGACA GCCTGCGCCC AACACGGCAC ACATGGGCAG 1020
CACGTTCTTC ATGACCCTCT CCCCCTTCTA CACCTTTGTT TTTGAATACC GCACGCCTAG 1080
ACACCTGAGA TCCCAACCTT CCTGCGTCAT CAAAGCCCCA TCAGCGTACA TnACTCTGAA 1140
CGAGAGTTCT TGCATAGGCC CGTATGGTGT ACGGGAAAGC CCCCGTGGAG ATGCGGGAGA 1200
GTTCTGAGAC TGCCTTTTCA GAAAGGGTGC GCGGTGCACT GCGCGCGCAT GCAGCGAGTG 1260
CATCGCACGC TGCACGCATC AGCGACTCGG CGCGTGCGTC CTTAGCACGC GTGACTCGGT 1320
ACAGCGCAAG TAGTGCCCGC TCATCGAGGC CGAACGTGCA GCGGGTGAGC CCTTTGAGGA 1380
TACCTTCGAG CACGGTGTAG TCCTGcTCTT TGAGAAGCTC AAGGAGCGCG TCTCGGTATT 1440
CGTGTGCGCC TAAGGCTCCT AACAGCGTTG CTGCCTGCGC TCTCCTGGCA GAGGTCACTG 1500
CAGGCGCCCG TGCGCGGGgT GATGCGGTGA AGTAAGAAGC GCGTGCAACG AGCGGGGCAT 1560
AGGCCGGTTC TCGAGGTCmT ACATTcCTTC CTTCAGGGAT GTTTGCACTG CCTGAAGGAG 1620
AAGCTCAGGG GTGGCGTACG GTCCGGGCGG CGACGTTGTG TGCGTCTCTT GCGCGTTGTC 1680
TGTCTTTCCG CTGTTTTTTG CAGGAgCGGG GGAAGAGAAG AACGGGTCGG AaGAgATTGA 1740
GTCATAAGGC GACCGCTTAG CAGGGGCAGC AGGGCGAGAA GGGGGAGCCT CTTGCACGTG 1800
TGCACGGCGT GAAGAACCGA GCACGTGAAA ATACTATCGC GGTTGTATAT GGGCGTGTGG 1860
TCAATGAGTC TTGTTTCTGC ACGGTATGCG TTGAGTACCC AACCGCCTGC AGATACTACG 1920
AGCCCGTCCG TGGTGATAAG CGGAGTAAAC GGTGTTTCGT ACAGCGAAAT GTTCCACTTC 1980
AGCGCGCCTC CTGCGCTGAG GGTAATCGCC CGCGCTGTAT CGCTGAGCAt ATTTCGTCCC 2040
CCTGTACGCG GCACTGGGCA GCCGAGGAAA AGGTACACGG AAGTGTGAGC GAGAAAAGCT 2100
CATTTTCCCC ATCTGCGTCG TAGGCACGCA CCGCACCGTC TGCAAAAAGG GCGATGAGCA 2160
TGTTTTCAGT AGCAAAAAGA GCGACACAAG AGCGCTGTGT GCTGATGCGC ACCGGCACCG 2220
GCGCCTGGTG nCTGTTCGCG GTCTTTCCCG CCTGAGTCTT ATCAGGCGTA CTGTTTAGCG 2280
TGTCAGTGCG CACGCGCCAC AGGCCCTGAT CCGTGGCAAC GACGTACCCG CGTGGGGTGC 2340
TCGTCAGCGC ACGTACGGAC CCTGATGCgc GCAGGGCGCC TTTTTTTTCT CCAAATATGG 2400
ACCAGTACTC CACGGTTCCT GCGGGCCCGT GATGAAAAAT TCGCGCGTTT TTGCATCTGT 2460
CGGAGGAAGG AGGGGCTTTG CGCCAAGCTT CTGACTCCAC AGGCGCTGGC CGGTACTGCT 2520
GAAGCACAGC AGGGCGTGTG CgGTGGTGTA CAGCAGCCGT CCGTCTCTTA AAATTACAGG 2580
CGCGGTGCAC AGCCCCCCGT TCGCCGCGTT CACCGCACGC TTTGCACCGC GCGTGGCAGC 2640
AGGCAGGACA GCAAATTCTC GGACGGGTAT GCCCTGGCTG GAGAAAACCC ATACGCGTCC 2700
GCTAACCTCG CTTACGTACA CGAGTCCATC GTCAGAAACG CTGAGAAAGG GAGTGCCGAC 2760
TACGGGGCAG TTTCTGCGCC ACACGAAGGA TCCTTTGGAG TTTACGCAGT TGAGGGCGCG 2820
ATCTGCGCTA AATGTATAGA TGAGCGAATC GTAACGCACC GGCTGTGTTA CGTTCTGCCC 2880
TGCAAGGACC ACACTCCA 2898
(2) INFORMATION FOR SEQ ID NO: 145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3956 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
CACGATGTAC TGCATTCGGn GGCATGCCCA GGACTCGGnG GATGCGCGCG TCAGAGGTGA 60
TAGCGGATAT CTACCCCGGC GGTGACAAAC TGCGCATGGC GCGTGTTAGT CTGcTCtCTC 120
TCtTCTTCAA CCACTTTTTC GCACGACCGG GGTACGCCGC TATATGCAGC ACTGGCACCC 180
AAGGACCAGC GCTCTGTCAG CTGAAGgTAA CAGCCGGCCC CTGCTTTGAG CAAAAGACCG 240
TAGTACGTAG ACGTGTAGTA GTGCTGGTAG CTGAAGCCGG CCCCCACCGT CAGCGGCAGA 300
CGGATGCGCC AGAAGGCGAG CGTGTAACTT GCATTCAGCG TTACAGGAAC GGCAAGgTAG 360
GAACACAGCC GCTCCTTCTC GTAATAGAAC GGATAACTGC AGGAGTGCTG CGCGCTCGCA 420
TCAATCCCGA GCGACAAGCC GCGATACACA AAATACTCGT ATCCAAAAGA CACCCCAAAC 480
GCGGGGTAAA CTTCcTTATT CCCGTTGGCT GCAACGCCCC CAGATCCCCG CGGGCGGTAT 540
TGCACCAGTC CACCTGGAAA AGGGGCACCG CGCCAATAAA CGAAACGCGC GCGATCCCCC 600
GCCCCGCGnC GGTTaCATCC CACGATGCGG CAGGCGCCAC CTCCTGCGCA TACGCTGCAC 660
CCGCGGcAAC ACCCACCCCG CAGcGcAGAC CTGCAGAAAG CGTATCCCGT GTGCCGCGCA 720
CTGTTTTAGC TGTTTCATAC GTTTTTTCGC TCCACACCTT TCAAAAAGCT GCGGGCAACC 780
CACACAGCCT TCCAAAATTC TACCCCCCCC CGGCCAACAT TTGTCGAGTT CTTTTTTTGC 840
AGGAGGGTCG CGCCCCGCGC ACCACCGCTG cATACCCAGT GCGCGCCTGT AACATTCTGA 900
CCGGGGAGGT GTTTTCTCAC ACCGGGGAGG TTTTATGTGC AAACCGCGCG TGTGGCGCAT 960
CGCCCACACC ATCGTCCATG TAGGCGCGTT GCTGCTCGGC ACCAGCCAGC TGACAACCTG 1020
TGATTTCTCC GGCATTTTTG CCACCATTCA GCAGGAAGTT GCCATTAAGT CGCCGTCTAT 1080
TCCGGGGGCG ATTTATGGCC TGGTCAAGGC CGGGGATAAG CTCTATGCCA CCAACGGTCG 1140
GCTTTGGGAA AAGGAGCTGA ACGGCATTAA GTGGAAGCCG GTGCCTTTTC TTGACGGCCA 1200
AGATAAGCGA ATTGATAGCC TTGCAGCCAG CAACACGTGC GTATTTGCCT GTGTTTCAGG 1260
AGACGGTGTG TACAAATACA CCGCCGGCAC CACCTCTTCG CAGAAGGAGA GTAATACGGA 1320
TAAAGCGCAG GCGGTGGTAC AGATGTCGGA CGGAAAAGTG GTCCTGCAGT GTGCCTTGGG 1380
GGATGAAAAG ACGACCCCGA GCGACGCAGA CGAAAGGTTG CTGGGGGGCG GCCAGGGCTA 1440
CCTCGTCACA TCCAAGGGAT TTTACACCCT CCCAGGGTCA GCCTCCTGCG AGGTTATCTC 1500
CGAAACAAAG GACGTCACCT GTAAGGCAGA GGCGCCGATC CTCGCCAGCG CCTGCGATGG 1560
CAGCAATACC TATATCCTTA CCAAGGACAA GGTGTACTGC CGGTATACGA ACGGCTCAGG 1620
GAGCACCCCC aCTACGTGGT GCGACGTGGA ACACAAGGTA TCAGAGCCgc TTGCGCTTGC 1680
AGTGTTCAAA AATAAGGGTG AGACGTTCTT GCTCGTTGGG GGACAGCAGG GATACGGGGA 1740
AATAAAAATA GCCACGGCAA GCGGCAGcTC CTCTTCTTCC TCATGCGTTC cCCTCACGGC 1800
GGAAAACGTG CACGCCACCA CCGGGTGGGG CGCCAACTGC TCCACCCCGG AAGGCAGCGC 1860
CGAGCAGTAT CGTAGTACGA TCGGCCGCTG GGCAGTGAGC GGTATTTACG TAATCAAAAA 1920
AGACACTAGC GGTGGGCGGA AAAAGCGGAg CACCTCAACA GACTGCGAAA GACCAGACCT 1980
CTACGTGGCG GTGGGGGATG CGAGCGACAC CTACACCGGG CTCTGGAAGT TCGACACCGC 2040
TACGAATACC TGGAATCGCG AGTGATGGCG CGCAgCAGAT GcGTGCACCG CGTGGTGCAC 2100
CAGGCAGCGT GCATCGGGGT GATAGGCCTG AGCACCAGCG CGCTGACCAC GTGCGATTTC 2160
ACTGGCATCT TTGTGGCCAT CCAGTCGGAA GTGCCCATTA AAACGCCGTC TATTCCGGGG 2220
GCGATTTATG GCCTGGTCAA GGCCGGGAGC AAGCTCTATG CCACCAACGG CCAGCTTTGG 2280
AAAAAGAACG TAGCAGAAGA AGGTAAAGAC TGGGAGCGGG AGTCCTGTTT CGACTCGGTG 2340
ATAGGCGACA GCCGCATCAC GAgcTTGCGG CAGACAACGG CGAGAATGGC GTGCTCGTTG 2400
CCTGCATTCT TGGCAAGGGG GCGTACAAGT GGTCGCAGGG TAGCGCCGAC CAGACAAGCG 2460
GAAATCCGTC TGCCCTGAGT GGCACAGAAA AAGCACTCAG CGTGGTAGGG ACCGGGACAT 2520
CATGCGTGTA CCTTAACCAC ACGGATGATA AGGTTGGGGA AACCAGTAGT TCGGAAAGTG 2580
GTGGAaTGcT GCGTCAGGAG AAACGAATGA GTTCTGCCTG CACGCCGGTA ACGGtTTTTA 2640
GTTACCACCA AAAAGGTGTG TGTCGGTAGT GATGGTTCTC CCGTGGCAAA GAGTGATGGC 2700
GAAGAACCAG TTCCGCCGAT TCTTGCGGCA ACTGACGACG GGAGCGGGCA CGTTTATATC 2760
CTCACGAAAG ACAAGGTGTA CTGCAAAAAA GTTAATCAAA GCGAAGGGAA AATTCAGGAT 2820
TGCCCACAGT CTGCCGCAGC AGCGCCGGAG CCAACCGGGG CACACAGTGT TGCCCATAAG 2880
GTAGCAGACG CGCACTCCAT AGCGTTCTTC AAAAACGGCA GCGACGAGTT CTTGCTCATC 2940
GGGGGCCGGC AGGcTACGGA GAGATAAAGC TGGAAAGAGG TTCAGGAAGC AACGGGAACG 3000
GAGCACAGTG CGTCCACCTG AAGGAAGAGA ATGTACACGA TCAAACCGGC TGGCATGAGA 3060
AgGGCTCCAC CCCGAAAGGC AGCGCCGAGC AGTATCGTAG TACGATCGGC CGCTGGGCAG 3120
TGAGCGGTAT TTACGTAATC AAAAAAAGCA CTAGCGGTGG GCGGGGAAAG CGGAGCACCT 3180
CAACAGACTG CGAAAGACCA GACCTCTACG TGGCGGTGGG AGACACGAAC GATACGTACA 3240
CCGGGCTGTG GAGGTTTGAC TCCGCCGCGC AAAAGTGGAA CCGCGAATGA GTGGCTCTAA 3300
CCtACGCTTC CCCTACTCCC GCGCGTACGG CGGGGCAGGC GTAGGGCGTA ATTTTGAAAA 3360
ATCGAGCAGA TTCTCAGTAC AAAAAGAGGG TATAGGTGCG CCGGGTTCtG cGCnGTACCG 3420
CTTTGCAGTT CAAATTGTTT TTGCTTTCCC GCCTCTTTTT TATTTTTACG TCACATATTC 3480
CCtAGACGGG TGGGGGGGGG TGAGGTAGAA GTGAGAGGAG GGGGAGTGAG TGGGCAGGCA 3540
GGTGATGCAA GCGGGGkyAC TTGCGGGCAT GGTATGTGCT GCTTCTGGTT ATGCAGGCGT 3600
ACTCACTCCG CAGgTCAGTG GCACAGCCCA GCTCCAGTGG GGCATTGCGT TCCAGAAGAA 3660
TCCACGCACT GGCCCGGGCA AGCACACCCA TGGGTTTCGC ACTACCAATA GTCTGACTAT 3720
TTCCCTGCCG TTGGTGTCAA AGCACACCCA CACCCGCCGA GGGGAGGCAC GCTCAGGGGT 3780
GTGGGCACAG CTGCAGCTGA AGGACCTGGC AGTAGAGCTT GCGTCTTCTA AAAGCTCAAC 3840
GGnCCTGTCC TTTACCAAAn CTACCGCTTC CTTCCAGGCA ACCCTGCACT GTTATGGGGC 3900
CTACCTGACA GTGGGTAnAG TCCTTCCTGT GTGGTTAACT TTGCCCAGCT GTGGAA 3956
(2) INFORMATION FOR SEQ ID NO: 146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1314 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
GGAAAGCACT GCGTACTAGT GAAGCTAACG CGGAGACTCC CGCGAAAGTC AAGAAAAAGC 60
GCGCCTTTGC TCGCTGGGAT ACCAGAGAGG ATAGCTTGGC GGGCGAAACC TTTTTTTTCC 120
GCTTCTGGAT TGGGGTAAAG TCCTTTTTTC GGTCTTCGTC GGTGGAAGCG TTGTACGAGG 180
AAACACTTCT CCAGCGTCTG GGCAATGATC TTAAGGCGCA TTACGCGCAG TACATTGACG 240
TAAAGGAAAA AACCTTCACA AAGGTTTTTT ACGACAAGAT GGGCGAGTTG CGCAAAACAC 300
AGGTATTTTT TGATTCATTG CTCGCCTGCT ACAATAGCGA TAAGGGGGAC TTCTACCTCC 360
TGTTGAGCTC TTTTATCACT CCCGTTGTGT ATGAGCAGCT GATGGCGTGC AGCGATCCGT 420
TTGCTGCGCA GGGAGATGGT ACTCCTTCTG GTTTGCGTGC GTCCCTCCTT AAAAGAATGG 480
ACGTCGCCCT TGCCACCCTG AGTGGTCCTC ATAAAACTGA GTTGTATCAG GCGGCGCGCG 540
CTATCGAGTG GATGAAGGTT TTTTGTGAAG TGCCTGTCGA TCGTATTCTG CTGCGTTTTA 600
CCGTCATCTC CCCGGCGAGT GCCGTGTGTC CCATTACTAT TTTGCAATCT GAATTGGAAA 660
AGCTTGCGTG TGTTATCCAT GACAGCAAGC ATATTCCCGA CGCGGTATTG CAGGGGCTTT 720
TTGTGCTGAA GAGTAAAACA TCGCTGCATG ACGCGCAGGT GGATAACGCT GCGCACGCTG 780
CTGCCTTTCT TAAGGAGGCG AGTGCAGCGC TCGTTGTTAT CAAGGATTTA TCGCACAGCA 840
TTCCTATTGA GGATTTTGTC CGTTTTGCAG GTAGGAACAT TCGTTGGCAA CCTCGGGCAA 900
TTGCCGGTGG TGAGGATTGG TTTGTCCTTT TTAAGAAAGC GTGGAAAAAA CGTTTCAATG 960
AAAAATGGGC GCTGTGGTCT ACCGCACAGA AGCGTTTAGT GCTTAAGGAG CAAATGCTTT 1020
CTCTTCTCGG AAGGGAGCAG TTCTCAGAGT TGACTCATCG GCCATGGGAA GGGTTTTGGT 1080
ATCAGCTTGT GTTTAGGCGA GAGATGTCTT TTGTCTTTTT AAAAAATTTG TTCGAAGgTG 1140
CGTATGCGCG CGTCGTTTCA CCGCCGCTGA ACGTTATTCT TGCTGAGGGC AGTTTCTATC 1200
GTCGAGATGA GTTGATAGAG TACACTGACG CGGTAAATGT ACTGGAGCAG ATGGGAGCAA 1260
AGATTAGGAA TTTCGAGGTA AGACTTTCGC CGGTGGGGGA ATGGGGAGTC GCCT 1314
(2) INFORMATION FOR SEQ ID NO: 147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1058 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
GTCGAGTATT TCGTTTGAGT AGGTAATTCC TCCTTGTGTG TTGTCCTGTC TTGTCGGGGT 60
GCGAAACATC AGATTACAGA ACTGTGTTTC AAGCGGCGAT ATGCGCAGTA TTTTCTGTTC 120
GAACTGTTTA GAAAGATATG TCCCTCGTGC ATATGGCTTT CCGTTGGTGT AGGAAAAGTC 180
TGTTCCAACA GAAGTGATAC GGGTAAACCC TAGAAAGGAA GCCACCGAAT AGGCGGCGCC 240
AGCAACCGTT nCCCGAACTA GTTTCTAAAA AAGGTAGGGA AGAAAAACGA CTTGCGTATA 300
CACTAAAAGG GTGACCTCCT CCGGTGAAAA TAAGTGAATT ACCATTCAAA TAAAAACTGC 360
GTACTGCGCA TGGACACGCG CATATGTCGA ATATTGCAGT TACCTGTGAA GGAATGAACA 420
CATAGTGTGT GGCATACGAG ATGTATTGCG GATCAATGCT GATAAAAAAA TCTGGCATAA 480
GACCGCCTGT GCAaCACACG GGAAAAGCAG TATCGCATGC AAAAATAGTG TACACATCCC 540
TATGTGAACG TATTTTTTCA ATACCTTTTT CAAGACTTGG CCCTGCACCT AATATGATCG 600
CCTCAGTTTG TGTATTGATG TTCGGTAGCT TTGGTGTGAA TATCTGTGCA TATCTTAGAT 660
TGAGTAATGC GTTTCTCATC CAAAGCTTTC CAAAGTGCAC TTGGACTGAA AAGTCGGCAC 720
GAATGATTTT CATTGCTTGG CTGGTAAGCT CAGCGATCTT TTGTTCGCTT GTGGGAAAAA 780
ATGCCTTCCA GGGTTGTAGA TAGTGAACAA AAAAATTGCC GTGGATGATG GGCATATAAC 840
ACTGTGCAAT TTCTTGTGAT GCACTACCGT GGGTGAGCGG ATGAAGAAAG TGGACGCGTT 900
CTTGGACGAT AATATGCGTG AGGTCCACCT GGCTCCAGGA GTGCACGGAA ACTCGCGTAA 960
TCAAATTCAC A ACCGCGCA cATGCGTGGA CGAAATTTTT CTAAGAAAAC ACTGATGTGG 1020
GATACCTGCC CCGATGCCAC AAAAAAGAAT GGAAGCAT 1058
(2) INFORMATION FOR SEQ ID NO: 148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1145 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
TGACCGGCCA GCAAAAGGCT CATGCTCAGG CCGGCnCCGT TGCAGGACCG GGCTCGTACC 60
CCGnTGGGTG CCGGTTTATC CCCCGACCAG GGTTTTTTGG ATGAAGTCTG CCGCTGGCCA 120
TTGCGTGTTG ATGGACAgTT TCtGGCGATT GGCGAGCCtT TTTTACGCAA GArTgCgGTG 180
CaCGATGTGA CAGCGTTCaT TGATGCrCAC AATGCgCGCA AAAGGGCGCC gCGTGCCCCg 240
CTGTTGGTTC CgGAcTTTTA GCTGTCaTGG GGAAAgCGAA TTTATATGCg CGTATTCGTG 300
TCTTGGCATA AGCaAAAGGA TAAGGGGGaG CGCGAAcGCg CAGGTGCGCT GCaGGGAGCG 360
TGCATAGGTA CGCCTGAGCA CCGCCAGtTT TTGCGCTGGT AGGAAGAAGT GGCAGCGGAA 420
AGAGTTTTCG TGCGCAGATA GTGATGCGGC GTTTTGATAT AGCGTTCGTT ATAGACGATG 480
GATTGCTTAT TCGTGAGGAT AAGATTATTG CAGtCGCTCG GCAAAGCAGG AAAGGACGCT 540
GCTtGCGGCC ATACGGGTTG CGCTCTTCGA GAACACGATG CACCGTCGCG TAtTGCCCGT 600
GTGTTGACGT ATTTTCTGTG TGGGTCACAG AACAAAAAGG TGCTCATTCT GGGGACTTCA 660
GAGAGGATGG TGCGCAGAAT AGCCCTTCGT GTGGGGCTTC CTGAACCTGG GCATATCATC 720
CGCATTGAGG ATATTGCCTC GAGCGAAGAA ATTGCCtGCG CgCgCaCACG GCACGTGaGG 780
GCACGCACGT CATTCCAGTT CCCTCTGGGG AGGTACGCAA GAGCTATCCT AAGATCTTTT 840
ATGAAAGGAT AAAACTCTTG cTGCgTAGAG AGGCAGGTGC GGAACGAATa GGACGGTGGG 900
CGCACGCCAT ATGGCATGAG GGGCTCAAGC GTGcGTGCAG CGGCGCGCAC cGCATGtATT 960
TGAAAAATCA ATAGTGCGTC CACCTTTCTC ATGCAATCTG CGCGTGGTGG CGGCGACAGA 1020
GGGCACACAG GATGCGTCGC CTGTTGTGGT GCCGCCTCAC GAGGAGCTCG CTCTCCAGCA 1080
GTAGGAGAGC GATGACAGGA GAGGCAGTAG TTTGTCCCGT AGCGTGTGGA GCCGGAnGGT 1140
CGAAG 1145
(2) INFORMATION FOR SEQ ID NO: 149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 860 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
TAGGAGTTTG TTTTCTAGAA nTTGTTTTTC AATAAAGGGC CGGTTCTAAG TCACGCTCTC 60
CCTTCTGCCG ATAGAGGGnG CGGAGGTGGG GATGCTATGG TGATGGCCGC GGCGCCTTTA 120
CAGAATTTGT GTTACGACGG ACTTTTACAG GTAACCTCAG GGGGCAATAT CTCGCTCCCT 180
GtACCCTCGA ATCAGGTGAT ATACGCTCAT TTCGAACATG TTGATGCGAC TCCTGCGGAg 240
cGGAGrCAGG CAGGGGTGTC GGTGTCTGAG CTGCAGATTT TGGACGCGTT GGTCGAGCGG 300
CTGATAGTGC AGCGTCGGGT AGCGGCAGAA GCGGCAGACA TGGCGGTGCA GAAGCGGCAG 360
GAGACACTGC TCCGCGCCGC AGAGCTTTTT TCTCAGAAGC AAGTGGACGA GACCAAACGG 420
CGGGGAGAGT CTCTTCCTTA CACCTCAGTA GAAGTACAGG GGCCTGAGCT TTTTGACTTG 480
CGCGCGTAGG GCACGTGCAG GCAGAAGTAC TTGACTTTTT TGAGGATAGG GGGGAGACCC 540
CGCCTGGCCG GAGTTCTGAG GCTGGGTATG GACTCTTTTG GAGAGTGGGG GTTGTGTGGG 600
TATGAGAAGA TTGCTGGCAT GTTCGGCGGG GGTGCTGTGT TTTTCCCAGC TTGGCGCGCT 660
TGAGTTGTTT CTTTCTCCTA AGATTGGGAT CACGAGTGTG TATCAGTTTG GGAGTAACGG 720
TGGTTCGGAC GGTACgTCGT CGGGTAAGGG TGTGTCTTTT GATAGACTGA TTGGAAGGGT 780
TGACCTGGGG CTAATTTTGG TGAACGGCCT AACGATTTCA GCTTCGGCGG AAAGTTCGTT 840
GACCAATGTC TTTGTGCGTG 860
(2) INFORMATION FOR SEQ ID NO: 150:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13811 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:
TAAAAAAGGT ATGGATGACT TGTGACTAGG AGTGAAAGGC TAAACAAACC TGGAGATAGC 60
TGGTTCTCCC CGAAATGCCT TTAGGGACAG CCTTATACAA AACTGTCGGA GGTAAAGCAC 120
TGGATGGGCT AGGGGGTTTC ATCGCCTACC AAACCCAATC AAACTCTGAA TGCCGGCAGT 180
CAACGTGTGG GAGTGAGACT GCGTGCGACA AGGTTCGTAG TCGAGAGGGA AACAGCCCAG 240
ACCGTCAGCT AAGGTCCCGA AATACCGCTT GAGTGTGAAA TGAAGTGTGG GTACCTGGAC 300
AGCCAGGAGG TTGGCTTAGA AGCAGCCATT CCTTGAAAGA GTGCGTAATA GCTCACTGGT 360
CGAGTACGCA TGCGCAGATA ATGTATCGGG GCTAAGCGGT ATACCGAAGC TACGGGTCTT 420
GCATTTTTGG TGCAAGGCGG TAGGGGAGCA TTCCATGTAC TGATGAAGGA ATATCCGGGA 480
GGAGTTCTGG AGGGGATGGA AGAGAGAATG CAGGTATAAG TACACGAAAA GGAGGGTGAG 540
ATTCCTTCCC GCCGAAAACC TAAGGTTTCC TGGGTGAAGG TCATCTGCTC AGGGTAAGTC 600
GGCCCCTAAG GCGAGGACGA GGGTCGTAGT CGATGGGAAT CCGGTTTATA TTCCGGAACC 660
TCTTGCAATT TCGATGGCAG GACGCGTGAG GTGAAGCCCG GCCAAAGATT GGTAGTTTTG 720
GTCTAAGTAT CCGAGCCGTT TTAAGAGCGA TAGGCAAATC CGTCGTTCGA GGTAAGGTGC 780
GAGTGCGACT GGAGCGATGA GCGAAGGGAA GCAGGTGTAG TCATGGCGAC GGGAAATACT 840
GTCTAAGGTT AGGTTGCAAG AGACCGTACC GCAAACCGAC ACAGGTAGGT AGGATGAGTA 900
ATCTAAGGCG CTCGAGAGAA CTCGCGTCAA GGAACTCGGC AAAATACACA CGTAACCTCG 960
GGAGAAGTGT GACCCTTGCC TTTGGTGAGG GTGGCAGAAA GCAGGTCCAG GCGACTGTTT 1020
ATCAAAAACA TAGCCATCTG CAAATCAGTA ATGAGACGTA TAGGTGGTGA CACCTGCCCG 1080
GTGCTGGAAG GTTAAGAGGA GAGGTTCGTG GTAACACAAC GCTTTGAATT GAAGCCCCAG 1140
TAAACGGCGG CCGTAACTAT AACGGTCCTA AGGTAGCGAA ATTCCTTGTC GGGTAAGTTC 1200
CGACCCGCAC GAATGGTGTA ACGACTCTGG ACACTGTCTC GACGCGAGAC TCGGTGAAAT 1260
TTATGTACCG GTAAAGAAGC CGGTTACCCA TAGTTAGACG GAAAGACCCC GTGAACCTTC 1320
ACCGTAGCTT ACTATTGGAA CTTGGTTTAC CATGTGTAGT ATAGGTGGGA GACAGAGAAG 1380
CTTGGCCGTC AGgTTAGGCG GAGTCAACAG TGAAATACCA CCCTTGGTAC GTCAGGTTTC 1440
TAACCTTTGG CCGTGGATCC GGCAAAGGGA CCGTGGTAGG TGGGCGGTTT GACTGGGGCG 1500
GTCGCCTCCT AAAAGGTAAC GGAGGTGCGC GAAGGTCTCC TCACACCGGT TGGAAATCGG 1560
TGCGCGAGTG TAAAGGCACA AGGAGGCTTA ACTGCGAGAC CGACAGGTCG AGCAGATACG 1620
AAAGTAGGTC TTAGTGATCT GGCGGTAGCG TGTGGAAGCG CCGTCACTTA ACGGATAAAA 1680
GGTACTCCGG GGATAACAGG CTGATTTTCC CCAAGAGTTC ACATCGACGG GAAAGTTTGG 1740
CACCTCGATG TCGGCTCATC GCATCCTGGG GCTGAAGCAG GTCCCAAGGG TTTGGCTGTT 1800
CGCCAATTAA AGCGGTACGT GAGCTGGGTT CAGAACGTCG CGAGACAGTT CGGTCCCTAT 1860
CTGCTATGGG CGTTGGATAT GTGAGAGGAG CTGCTTTTAG TACGAGAGGA CCGAAGTGGA 1920
CGAACCTCTG GTGTACCAGT TATCCTGCCA AGGgTACGTG CTGGGTAGCT ATGTTCGGAA 1980
GGGATAACCG CTGAAGGCAT CTAAGTGGGA AGCCCGCCTC AAGATTACAT ATCCCTGAAG 2040
GTTGACCTTC CTGAAGACTC CTTGCACACT ACAAGGTCGA TAGGCTGGAG GTCTACGTAC 2100
CGTAAGGTAT TAAGCCGACC AGTACTAATA AGTCGTGAGG CTTGACCATA TTATTCATCC 2160
TTCTCCTTCA CCCTACCCCT TTGCGTAAAA TATTTCGCCT GGTTGCCATG GTGGAGAGGT 2220
CATACCCGTT CCCATCCCGA ACACGGAAGT CAAGCTCTCC TACGCCGATG ATACTGCTCC 2280
TTCGCGGGAA AGTAGGTAGT AGCCAGGCTC CCCTTTGCCC ATCACTCGCT TGACAAAATA 2340
TTATCCGAAA CCTTAAAGTC CGCAGtGGCT TCGGCGCATT CGTTTGACCT GGTAGATTTG 2400
GATGATGAGG ATTACGGTAC TATCCTTGCT GAGGATGTTG AGTTTTCAGG GACGGTTCGC 2460
TTTGAAGAGC CCTTCATGGT ACGTGGTGTT TTTGAAGGGA GCATCGAGTC CTCGGGTGAT 2520
CTTATGGTTG AGGAGCAGGC GCGTGTTCGC GCTGAGATAG TTGCAGATCG TGTTGTCATT 2580
AAGGGAGAAG TAATCGGGAA CGTCACGGCT ATGTCTGTTG TTCGCGTCTT CCCATGTGGG 2640
AGGTTGATAG GGGATGTTAC CGCGCCAGAA GTTGTGCTTG AGAGTGGGTG CTTTTTCAGT 2700
GGCGTGTGTA GTATGCCCGA ATCGGGTTGA TCGGTTTCCC TTGCAGACGG GTAGCTCTCT 2760
TGGTCGCTAG GTTTCTTTTG CCCGTTGTGT TCTGTTTCGT GTGTAGGGGG CTCTGTTCGC 2820
AGCAGTCTGG TCTTGCTTTC TACCGGGCGG GTAGTTACCG TGAGGCGATT AGATGCTGCG 2880
AGGAAGCTAT CGCACAGGGG TCTAAGGACC TTGAGCATTA TCTCGTGCTT GCCTGGGCGC 2940
TTGTTGCGGT TGGGAAGTAC CATGAGGCTG CCCAGTGGGC CCTCGAGGGG CGCGCCGTTG 3000
CCGCCCATGA TCCTCGCCTC ATAGAGGCGC AGGCGGAGGC TTTCTGCCAC TTGGGGAGAA 3060
ATGAAGAAGC GCTTAGGTTG TTTCAAGACT ACATTGCGTG TGCCCCGAAT GGAgCgcGGC 3120
AGACCGCCGC GTATTATCTT ATGGGGGAGA TTTACCTGCG CACTTCGCGC TTCTTGCACG 3180
CAGATATCGC TTTTTCCGTC GCGCTCCAGT TAGATTCGCT TAACGATTTC TGGTGGGCCC 3240
GCTTGGGCTA TGCaCgcGAG CGCGCGGGGG ACTATCGCTA TGCGTTACAG GCCTACGACC 3300
GAGCGCTCCA GCTGAATAGA GATCTGGCTG ATGCTCGCCG GGGCAGGGAG CGGGTGCTGA 3360
GGTACGTGTT CCGTAGGTAG GTTTTCTTCG TGTTTTCCGT TCGCATCGAG ACGCTTGGCT 3420
GTCGTCTGAA TCATGTCGAG TCTGAGTCGC TGGCAGCGCT CTTCTTGCAG GAGGGCTTTG 3480
CGGTATGCCG TGGCAATACC TCCACTGCCC CAGTGGTTCT GTGTGTGATC AACACCTGCA 3540
CGGTCACAAG TAAGGCAGAG CAAAAGGCGC GGCGCCTCGT TCGTCTCCTG TTGCGCACAT 3600
ATCCTACTGC AATAGCGCTT GTCACTGGAT GTTATGCGCA gcTTGAGCCT GCTTCTCTTG 3660
AAGCCATGGA TGATCGTGTC CTTGCTTTTC CAGGAAAACA AAAAGATGCC CTCAGCCTTC 3720
TGCCCTCGTG TCTCCGTGCG TTACTTGTGC AGCGTGGTCC TGCGCCGATA GATCAGTATG 3780
TATGCGGTAT GCGTGCGCTG CTCGCTTCTT TGAAGAAAAA AATTATTTCT TTGGAACTAA 3840
CGTCTGAGTT TCCATCGCAG ACGCATATGC CCACAAGGAA TGCCCTTCCT CAGTTAACCG 3900
GTGTGCCTCA TGCGCCGCGC GTTTCTGTAT CTTCGTTTTC AGAACCTACA GCCGTTCCCC 3960
GTTTTGCTTT GTATGCCCCT CGTTTTTTGT TCCACTCACG CGCAAGCATT AAGGTGCAGG 4020
ACGGTTGTAA CAGTGGATGC GCTTTTTGTC GCATTCGTTT TGCACGCGGT CGCGCTGTAT 4080
CACTTGAAAC ACACGAGGTA ATTGGGCGGG TGCAGGCCCT TGAAGCTCGT GGTATGAGCG 4140
AGGTTGTACT TACAGGGGTG AACTTGTCTC AGTATAGAAG TGGCAGTATA GATTTTGCGG 4200
GTTTATTAGA GTTGATTGTG CAGGAAACGC ATACGATTCA TATTAGAATT TCGAGTTTGT 4260
ACCCAGAAAG CGTAACATCT GCTTTTTTGC GTGCTATTGC GCACACGCGC GTGTCGCCTC 4320
ATTTTCATTT ATCGGTTCAG TCGGGCAGTG ATCGCGTGTT ACGACGCATG CGACGCGCTT 4380
ACACACGTGC GGACATT AT CAGGCAGTTT CCGATTTACG GAGTGTGCGT GAAGAACCCT 4440
TTTTGGGTTG TGACATAATC GTCGGCTTTC CAGGGGAAAC AGAGGAAGAT TTTGCAGACA 4500
CCCAGCGTAT GTGCAAAACT TTGCGTTTTG CAGGTATTCA TGTATTTCCG TTTTCTGCAC 4560
GCCCCGGTAC AGAGGCGTTT GCTATGGATG CAAAAGTGCC TCAGCGTATT GCAGGAGAAC 4620
GCGTTGCTGC AATGCAgCAA CTGGcAGAGA AAAACTACCG TGCGTATTTG GAATATTGGA 4680
ATGGGAGGGA ACTATGTGCG GTGGTAGAAC AGTCCGTCGC ACGTGTTTTG ACAGAAAATT 4740
ATTTGAGCCT CCCAATCATT GAACGGGGTG GCGTCGCTGC CTCAGCAGGA TCACACGTAA 4800
GGATTAGAGT TCATAACGAG GGTGCTATCC TCTTGTGAGT TTCTGATTCG CAGTGGAGAT 4860
TCATTGTTGT TAGAAAAAAA CGAGCATGCA GTAGAGCGTG AGGAAGGGTC TGTGCGCGTG 4920
TTTCCTGTCT GTCCGCTGCT GCGTGAAGTG AGTGAGCGTT TCATCCGTGC AGGATTTTGC 4980
ATTTATGTTG TCGGGGGTGC TGTGCGTGAC TTTCTCCTGA ACCGCAGTGC TCATGATTGG 5040
GATCTTGCAA CTGACGCTCC CCCTGAGCGT GTGCGTATGC TTTTTAGACG CACGGTTCCT 5100
ACTGGTATTG AACACGGCAC TGTTACAATT CTTTTCAGAG GGCATTCTAT TGAGTnGCAC 5160
TACGTTCCGT GTTGAGTCGG ATTATTCCGA TAGGAGACAT CCGGATTCTG TTTGCTTTGC 5220
CGCGCGCATT GAGGACGATT TGGCAAGGCG CGACTTCACT GTCAATGCTT TCgCTGCCGC 5280
GCTCCCCTCG GGGGAAATCA TCGACGTATG TGGCGGTTAC GCcGATTTGC GTAACGGTCT 5340
TATCTGTAGC GTTGGGGATG CACATGCTAG ATTTTCTGAA GATGCGTTGC GTCCTTTGCG 5400
TGCTGTGCGT TTTGCAGCGC AGTTATCTTT TTCCATCGAA GCGCGCACGC GTGAAGCAAT 5460
TtnAckcTAC GCGCTCaTAC TGCACGTATT TCTCGTGAGC GGGTGCGTGA TGAACTTTCT 5520
AAGATGCTTT GTACTCCCCG TCCGAGTATT GCCTCCGCCT AATGGAAGAG ACTGGATTGC 5580
TGCACACACT TTTTCCTGCC TGGCGCAGGT GTGTGGGAGA AACGAAGGGG AGGAGAAGGA 5640
GACGCAGGAC AGTCTGCAGG TTTCACCGCA GCAACGACGT GACGCTCGCA CCTTTGCTGC 5700
GTGCGCGTTT GCTGCGTGCG ATCGGGTGCC TGCAGAGCTT GCGGTCCGCC TTGCAGCACT 5760
TCTCTTTCCG CTGGCTCACT ATCGTACGCT TCCTGCTTCA GGAACGGGGG CAGCGTTGTT 5820
TATATGCCCT CCTGCACTTG CTGAAGCGCG CGAGCTTTTG CGTGGCCTGA AGTACCCGAA 5880
TTACcTGACG GCTCAGGTGT GTCACTTGGT TGCACACGCG CGTTTTACTC CACACGAATG 5940
TTGGTCTGAA GGAACGCTCC GTCGTTTCGT TGTGACGGTA GGTACTACTC AGCTAGAATC 6000
GCTGTGTGTC CTGCGGCGTG CCTATCTTGC TGATGACGAT TACCGCATAT CTGCTCAGCC 6060
TGGAGTGTCC TGCGAGAATG CGCGGGAAGG AAAAAAAATG CAGGAACAGT TTGAAATATT 6120
CGTAACACGC GTGCGCAGGG TCGCCGCACA ATTGCCTGTG CACGGCATAC GCGACcTTGC 6180
GGTGAACGGA CGCGACTTGA TAGCGCAGGg CATTCCCCCA GGTCCCACCA TAGGGCACAT 6240
CCTGAACGCA CTGTTTGATA TGGTCCTCAT GCAACCGTCG CGCAACACGC GTCCGCAGCT 6300
TTTGGAACAT GCGCAGGAGA TTGTGCGCAC CATGGCGCAG AATTAGTGCG CCTTGAGCAA 6360
CCCGTAGAGT CTTTCTAATT CTGCTTGGGA GAAATACGTG ATCTCAATTC TCCCCCGTTG 6420
TAAATTCCCA CTGATGCGCA CCTTGGTTCC CAGTTGTTCC AACAATTGCT GTTCAATGTC 6480
TGCGATGTCT GCGTTACGTA TGCGCGCATC TAATTGCGCA TCCGTCGATG GGGAAGAAGG 6540
AGGAAGGCGG GTGATGTCTG TAGCAGAACC GCCGGGTGAA GGGGAAGACG CCGCCGTGTG 6600
CGCGCGAGCT CCAGCGTAAT CGTGCAAACT TCCGCCACGA TTAAGACATG CAACGCACTC 6660
CTCTGCGGCA CGTACAGACA ACGCATGGGT AACTACATAC TGAGCAACGC TTACACACAA 6720
CTGCATGTCG GTGAGTGACA GAAGTGCGCG CGCATGCCCT GCGCTCAGCG TTCGAGACGA 6780
AAGCGACTGC TGAACTTCAG GAGGCAGTTT TAAAAGACGC AGCGCATTGG TAATGGTACT 6840
GCGGTTTTTT CCAACCCGCT GTGCGAGCTC TTCATGACTT AAATTACCCA GATCCATGAT 6900
ATGTTGATAG GCGCGCGCCT CTTCCAGGGG ATTCAGGTTT TCTCGCTGAA CATTTTCAAT 6960
GAGCGCGATG GCAAGCTTTT TTTCATGATC GCAGGTGCGC ACGATAACAG GTATCCGATT 7020
CAACCCCGCA AGGATGGCAG CGCGTGTTCT ACGCTCCCCC GCGATAATGA CCCAGCTTCC 7080
GTCCTGGTTT TTTTCCGCAA GGACTGGCTG GATTACCCCG TGCTCACGAA TAGACGCGGC 7140
AAGCTCCTCG AGAGATTCCT GCGCAAAGGT GCGACGCGCC TGATGTGGAT TCGCCTGCAG 7200
CAGCGTGGGA TCAAGATAGT GTACAGTCTG CACACCGCCT GAATCACGAA CATCGTATCG 7260
ATCTGAGCTT TCTTGCAGCA GCGCGTCAAT GCCTTTGCCT AATTTATCTT TGCCCATCGC 7320
GTGCCACTAT TTCCCGTGCA AGCTTCTCAT AACTCCGTGC CCCTGCGCAC TGTGCATCGT 7380
AAGAACTAAT GGGTAACCCG TGCGAAGGGG CTTCGGATAA CTTTACGTTG CGAGGAATGA 7440
TAGTATTGAA CACCTTGTCC CCAAAATAGG TTGTCACTTG CTTAACCACT TCTTGCGCCA 7500
gCTTAGTTCT AGTATCATAC ATGGTAAAGA AGATGCCTCC GATCGAAAGC GCGGTATTCA 7560
GACCACTTTG TACACGCTTT ACCGTCTGCA GTAAGAGTGT GAGGCCTTCA AGTGCAAAAT 7620
ACTCACACTG CAATGGGATG AACACCTCGT TTGCCGCTGC AAGTCCATTC AATGTGAGGA 7680
TACCCAGCGA GGGCGGACAA TCAATCAGGA TAAAGTCGTA CGTGTCTTTT ACTTCTGCCA 7740
A ATCTTTTT AAGGTAGAGC TCGCGGTCTT GTTCATCTAC TAGTTCCACC GTCGCGCCAG 7800
AAAGATCGAT GGAAGCGGGG ATAGCAAAAA GGTTATGCAC TGGTGTaGTG CGCAcGCtGt 7860
TGATGTGTGC CTTACCTGCA AGAAGATCAT ACACGGTCAA CCCTCTAGCT AAGCCGAGAC 7920
CCGAGGACAT GTTCCCTTGA GGATCAAAAT CAACGAGCAA GGTTTTCTTT CCTGCAAGCG 7980
CTAAATACGC ACCCAAGTTA ATGGCAGAGG TTGTCTTACC CACCCCTCCC TTTTGATTTA 8040
CGAACACCAA GGTTTTACCC ATGGTGCCGG AGTGTACTCC ACTTTCTAAA TAAATAAAAG 8100
GACTTCGAAA GCGCGTCTGA ATGGGAAGCT ATGCGATGGG GTAAATCCTG GTAAGGAGAT 8160
ACGGTGTGCA AAAGCTAGAG TGATACGATG AACTGTGCTA CCTTACGCGG AGGTGATCCT 8220
TTTTCT ATA AAAAAACATT GCATTGATTG TGGTGAAGCG TTTTTCGGTA TTGTAGTGCT 8280
CAATGCGTTG TGTCTTGCAG GGGTAGGGTA CAGTCTGTTA TGGCACCAAG GGCCTGGTCG 8340
GAGCGTGTTG TTTGTCTTAG TACTGGCTAC GCTGTACGCA TGTCTGTGCG CGTTCTGTGT 8400
CGTTCGCGGA GAACGGGGAT GTGATACTCT GGCAGATACG AATCTACGCG TCTTCACACA 8460
CGCACTGCGC GAGGTGTGGC TCCAGAGCCT GTGGTGTGCA CTGCTACAGT GTGTGTTGTT 8520
TCGAACTGGA AAATACGTGT GTACATATTA CTTTGCACGC ACGCATTCTG TATTTACCGC 8580
GTGCGgTATA CTCAGTGCCT GGACATACGC GCTCGCgTGC GGTGCACTCC TGTGGTTTGT 8640
GCCGGTGCGT GCACGCTACA GAACGCATTT TCGCCAATGC GTATATCTCT CGGCACGAGT 8700
ATTTTTTGAG CACCCGTGTA TTACCTTCCT TATGGTTTTG TACAGCATGG GTGTGCTCGC 8760
GCTGAGTGTA CCAATGGCTT TTTTATTTCC AGGACCGTGC GGCATTGTGC TTCTGTGGCA 8820
GGaTGtgCTG CGCACGCTCT GTTTTCGTCG TGCATGGCTT GCTGCACACG AGGGGCGCAA 8880
GGCGGCGTGC GCACCGCCTA TTCCCTGGGA GCAGCTCATG TGCCAGATGC GTGCACAATC 8940
CCGCGCGCAC ACCGTCGGCG AGCTCTTCTC TCCGTGGAAA TCGTAATGTT CATTGTGTTC 9000
GAGAGAAATA ACAACCCCGC AGTCTATGAG GGGGTGCCGT ACTCAGCAGG ATTTTGGTAT 9060
ATGTGCTCAA GCGCGATCTG TGCTTGTGCA AACAGCATGA TACTTTCTGT CGCAGTCTGA 9120
TAACGCTTCT CAGCAACGAC ACGCACAATG CCGCAGTTAA TGATGAGCAT TGCACAGGTG 9180
CGGCACGGTG TCATGGTACA GTAGAGTGTT GCGCCCTCTA GACCGATGCC CAAACGCGCT 9240
GCCTGGCAGA GGGCGTTTTG CTCTGCGTGC ACGGTGCGAA CGCAATGCTG CGTGCACGTC 9300
CCGTCTTCAT GCTGCACCGT GCGTAGCTGG TGTCCATGTT CGTCACAATG cAGGAGACCT 9360
CGCGGTGCGC CTGCATACCC AGTTACAAGC AGATGGTGAT CGCGCGCTAT GACGCAGCCT 9420
GCGCGTCCGC GATCGCAGGT GGCACGCTTG GCAATTGCAT GACACACTTC CATAAAATAC 9480
TCATCCCAAC TCGGTCGCGT AACCAACTCC TCACGTTGTC CCATTaTCGT ATGCTCCCTT 9540
TTTCCCTTAA AGATGCACAT GcCCTTGTGC TGCGGTGCGC AGCACACTCT GGCGGGTGAG 9600
TTTAAACATC GCTcaCTTCG TGCGGAATAT ACGGTAGAAA AGCAAGAGAA AGATATACCA 9660
TACGTATCGC ACGCTCCTGT GCCAGCGCAA TGGATGCAGT GTTTGTCCCC AGGacTGCGT 9720
CCGTGTAACA GTTCTTTTGC GCGTGGTCGT TTGCGCCGTT TTGAGAAAAT GTCGATAACC 9780
TCGCTGTCTT GCACAAAAAT GCCACCGGCG AAGTGTGcGC GATTGCGCGC AATCCATCGT 9840
ACGCCTCCTG GTACCCCATT GCCGACAATT ACCAGCGATG CGTTGGATTT TACAATTGCC 9900
GTGATAATGT TGTTTTCTAA CGTCTTACGA TAAAAGCCGT GAAATCGACC TACCACGCTC 9960
AGTCCGGGAA AGGTAGAATG CACGtTACGC TCCGCCTCGA GCAGACTCTG TCCCCGACCG 10020
CCGAACAAGT AGAGTGATTT GTAGTGCGTG TCCATAACGT TGAGCAAGGT GATGATGAAT 10080
TGAAACGGGT GCCGAAAGAT AGGTGTCGGC AATCCCAAAA AGCGTGCACC GCGCACAATA 10140
CTGTAGGATG TGGGCAGACA CAACGCTGCG cGGCCTACCA TGGTGCGGAA CTCGTGATTA 10200
TGCCGTGCCT TTAGCACGTC CCAGAGAGAA AGAAATATAA TGTGCTGCGG CTCTTTTTTT 10260
TCTAACAGGG AGAGAATGAC GGTGCACAGA TACTCTTCAC GCACCACATC AAGGGGTACG 10320
GAAAGGAACT CAATGCGCGT TACCAGCTGC GTTTGTTCCA TAGTGCTGCC TCACTAAGAA 10380
GGAGATGTGC GCCTGCAAGA GCATAGAGCG CGGCCGTTTC TACGCGwCAs ACATTCGTGC 10440
GAAAGTGCAG GGGTGCAAAG TGCGCGTCAT GAAGCGTGCG CTCTTCTGTA TCAGAAAAAC 10500
CCCTTCAGCG CCTACTGCAA GTATAACCCC TGCGCGCGCA CATCCGGCGT GTTGTGCGGC 10560
GCATTTGGCT CTGTGGGCGT ACGGTGCGCG GTGCAGTGTG TAGAACGTCC TGTCCAAAGC 10620
AGTTCGTGAA AGGAACCTCC CACGTGTTTT TCACTGcAAT AGATACGCAC TGCAGAAGGA 10680
TAAGGAAATT CCTGCTCGAG CATATGCAGT GCGCCAAGGA GGCTGTGGAG CGTGTCCACA 10740
CGGGTGTTCA CGGcAGAACC AGACTGTTGA CGCGCTTCCC GCACGATGCG CTGCAAGyGt 10800
GCATGCTGGG TGTCTTTTTC ATCGCGGATG AGCGAACGTT CTCCTCGAAT CGGTATGATG 10860
CGCGCAGTTC CAAGTTCTGT CGCATGCCGA ACGAtCGCGT CAAAGACGnG CGCACGGGGC 10920
ATCCACTGAA GCAACACAAG CGTTACCTGC GTTTTCTCAT CCACTGGGGG CATGGGCGTG 10980
CAGTGGAGGA TTAGTGAACG CGTATGTATG TCGAGAGCGG ATACTACTGC GCTGTACTGC 11040
GCACCGGCCT GCGTACGCAG GGTGAATGTA TCCCCcAgCG TGCGCGTcGA ACACGCACCA 11100
TGTAATGAAA ATCCTTGCCT GAAAGCTGCA CATACCCGCC GGAATCCGGT TCTGCGGTTA 11160
TCAGCAATTG TTTCATAATA TCnCaCGgCT GAGcGCGCCG GACGCTGTGC GACAACACCA 11220
CCGTAAGGAA CTTCTCTCAC CGTAAACCCT ATACATATCT GAAAgTAGTG CGGCGCCGAG 11280
CACGCACAGT CTGGCTCCCA TAGCCTGGGC AGCGGGACCG ACCTGATAAG GgCCACACTC 11340
ACCAAGTCTT TGCAAGGGGT ATAACAAGGG CGCAgCGTAc rAACTGGACG TTCAAGATAC 11400
CTTCTTTTTT TCCTGTATTC CCTCTTGCAA TTCTTTGACC GTTTTAGTAC GCGCCACCAG 11460
CGCkTGaATA TCCTGTGTGC GTATCAACTC CACCGCCTTT CGCAGCTGTT CATCATATTC 11520
TAAGTCATAT ACAAAAGACG CATTGTGCCG GTGATACTCC TGTGCGATGA GCTTTTCCAC 11580
CAGTATTGAT CGGAGTTTAA ATTCGCGTGC CAGGCCACGT GCATACTGTT TTATGCGTTC 11640
GGAGTTCATG GTACGGTTCT GCTCTGCAAA GAGTTTCACT TTTTCAGACT TAAATAGACT 11700
GACAAGTGCC TCTTCTTCCT CGGGAGTAAA CTCTGTTTCT CGTACCTCGA GGTCCGGCGG 11760
AATGCCGCtC TTATCGATAT TTGCGTCGCT TGGGGTGTAG TAGCGTGAGA TAGTCATTTT 11820
CAAACTCTCC CGTTCATTGA GGTCAAAGAC TTGCTGCACG ACTCCCTTTC CGTATGTGGT 11880
TTGGCCGACC AGATATGCGC GCTTGTGGTC TTTTAGCGCA CCTGAAACGA TTTCTGAAGC 11940
GGACGCAGAA TCCCTGTTGA TGAGAACAAT CACAGGCATA GACGGAGGTA GCTTCTGCGC 12000
ACGTGCGTTA ACGCTGAACG TTATGGAGTG TCCCTGCACG CGTGACTTAG TGGTTACAAC 12060
GGTTCCGGAG GGAATGAATG AGCTGGTAAC GTCCACTGCG GCGGTGATAA GGCCGCCTGG 12120
ATTGTTGCGC AGgTCGAGAA TGAGACGGTC GCAACCTTGC CTACGCAkTT CGGTAACGGT 12180
TTCTACCATG CGGGTAGCGG TGACTGGGTT GAACTCTATG AGGCGCACGT ACGCGATGTT 12240
TGGATCGATT TTTGCGTACT TAACAGTGGG AATTTCAATT TTCGCACGGG TGAGGGTGAA 12300
TGGTCGGAAA ATCTTTGTAC CCCGTTTGAC AACAAGAGTC ACCTTGGTGC CTATCGGCCC 12360
CCGTAGTTTC TTTAAAACCT CGTCCATAGT CATGGTGTCA GTGCTCATGG TGCCGATTTT 12420
GACGATGAGG TCCCTCGGTC TGATTCCTCC TTTCCAGGAA GGAGTGTCTT CAATGGCGGC 12480
GGTCACTTCC ACGTAGGCGG GTTTGCCCGG AGTGGATGTT CTGGACTTAG AAATGACAAC 12540
GCCGATGCCG CCAAAGAAAC CCTTGGTAGT GTCTTGGAGA TTAACGCCGC TGATGCTGTC 12600
GCTTTCGACG AAGGTTGTGT AAGGGTCCTG AAGAGAGTTA ATCATGCCTT CTAGTGCCCC 12660
CTTGTAGAGG ATGTGGGGGT CTACCTCGTC GACGTAtATT TGCGGAGGAA TTCGTAGACA 12720
TCCTGCACCG TCTGCATGCG CTCGTCTTCC TCGGAAGACT GTGGCAAATA AGCGGCAGTC 12780
CATGTGGGGA AGGGAGCTGC GCTGGTGATC AGACAACAGA CGAGGAGCGC AAACAGTCTC 12840
AGAACACTCA TGGGCGGATG ATGGAATGCG TGCCCGTCTG TGTCAAGAAT CTGCGTTGGA 12900
CAGTTTCCTT TCCATGCGAC TAGACGGAGG AGGGCACTGG GGGAGGTGCA GGCGCGCCCC 12960
AGACCCGCTC TGGGAGGAGC AGGATATAGC GTGACTTCCG TTTTCTGCCC TGTACTTCCC 13020
TTCTTCGTTT TTCACCCCTG TGCATGGGGG GGGGGAGAGA GAGAGAAGTT TCGTTCATCC 13080
TCGCACCCGT TGCGTTGCGG TGCGTGCAAG GACTGTGCTA CAGTGCCGGC CGATGGGGAC 13140
CGTGATCATC GCTCTTGATG GACCTGCAGG CTCTGGGAAG AGCAGCGTCT GTCGTCTGCT 13200
CGCGTCTCGC CTTGGCGCGC AATGTTTGAA CACGGGTTCT TTCTACCGTG CATTTACCCT 13260
CGcCGCATTG CGTAGGGTAT CGGAGTTGGC CGTGCAAGCG TGCTCTCCTT CTCCGGACCC 13320
TGATGCGGCG GTCGGGTGCG CGGCTGTTCC ACACGCAACA AATCTGGACA CATCATATGC 13380
TCCTCTGACG GCCCAGAAGA AGGTTGCACT TTTTGATGAA GCGTATTGGG TTTCGTTTGC 13440
GCGCACAGTT GCGCTTTCTT ATCGTGCGGG TGTGATGTAC GTGGGCGAAG AGAACGTGGA 13500
GTCACTGCTG CGTTCGGATG AGGTGGAGTC GGCAGTCTCG TACTTCGCGG CAATGCCGGC 13560
TATTCGGGCA ATTATGACGG GGAAGATCCG GTCGGCCGTT TGTGGTGCGC GGGTAGTTTG 13620
TGAAGGGCGT GATCTAACGA CGGTTGTGTT TGTGGATGCG GACTTGAAGT GCTACCTTGA 13680
CGCTTCTATT GAGGCGCtGT GGCGCGTCGT TGGGCGCaGG GAACGAGCCG GTTATCGAAg 13740
CAGGAACTCG AGCAGGCGCA TGCGCGCGAC GTGACGCACA CGACAGCGAC GCnCACCGTG 13800
GGGGGGCTCA G 13811
(2) INFORMATION FOR SEQ ID NO: 151:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1233 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
TTTCCATACG CTCGCTCCAC ACCTTTGCTT TCTTTTTTAC CTCCTCTGCA TCTTGTCGCT 60
CGTTGAGCAA CAGGTGGACC CTGCGCAGCG CGTGCAGGTA GCAGACAGCC AATTTAAAAT 120
CTTCAAAACT GGAGGACGTC ATCTCATACT TTTCCCGGTA ACGCTCAGCT GCCTGCAGGA 180
GTAATTTCTT TACCAAACGC AGGTGATAGA CCACTGTCTC GTACTCGTCA GCCTGTGGAT 240
TCAATCCAAT CTCAGAAGAA TTCGCTAAAT CCAGAATATT CTTCGCCACC GTTGCGTAgc 300
GCGCCTGGAG CTCTACAAAA GACCAATACC ATTTAGTATT CGTCCCATAG GCGCTAATAA 360
TGGAATCAAT GGCGAGTCCC ATCTTACGGA TAAGATAATA ACGCTGCTGC TGACTTACGT 420
TCGCTATCTG CGCCACCTGT TCCTGGTAAT CGGAAAACGG CGTGTCAATT AAATTGGTAA 480
CAATTTCTTC AAGATAGATG AGCGCCTTAT AAAGCGTTTT CCGCGCCTCA TTCAGCAGAT 540
CCTCCTGCTT ACCACCTACT ACTACAACCT GTGTCTGATA TTTAGCAGCG TACAGCGTAC 600
TGAGGTAAAT CATGTCATCT ACCAACGCGA GCTTCTTGTA CGCCGCGCCC GTCCCATCCC 660
GCCTGATCAG TTCTAAAATG TTTTTTTCAC GCGTAAAGAT TTGATCAATA GTCCCTTGGT 720
ATACGTTCAA CTTTTTCTGG TACAGAGCAT TCTGCTCTTC CTCAGTCACC GCACTCCCCC 780
CTTCTCTGCC GCGCACCTGC GGCAACAACT CCTGCGCATG CGCTAGGGAG GTTGCACTGT 840
GCGTGTCCCC TGCTAGCATG CGCGCAACTT CTTGAACCCG TCGTTCCCCC ACTACATGCG 900
CCGCGCTCGT ATTCGTGTGT TCTCCACTCG ACTCTTTTTT CACACACACG TGCGCATCCG 960
CGTGCGCCGC TATCATAGCC AAATGCGTAA TGCACACAAC CTGCTTGTGC TCAGACAACG 1020
CTTGCAAATG CTCTGCAACC GCACGCGcCG TTTCACCTCC AATTCCCACA TCAATCTCAT 1080
CAAAAATCAA CGTGCCCACT TCATCGACCG ATGAAAGCAC AGTCTTTAAA GCAAGCATCA 1140
CGCGGGAGAG TTCCCCCCCT GAAGCAATCT TTGCTAGCGG ACGCGCAGGC TCTCCTGCGT 1200
TGGCGCTAAT TAAAAACTCA ACGTCATCAA AGC 1233
(2) INFORMATION FOR SEQ ID NO: 152:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2946 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
GTAAAGCAAA ATCACTCAGG ACAAGACTCA CTTCTTACTT TCGCTGCAGG CACGACCCAA 60
AGACACGCGT GCTGATGTCG CGCGCCGCTG CACTTGAATA TCTACAAACA CAACACGAAT 120
ACGAGGCGCT GCTCCTTGAG AACACACTCA TAAAAAAACA TACTCCGCGC TACAATATCT 180
GCCTCAAAGA CGGGAAAACC TATCCTTTGC TCAAGCTAAC CTGCGAGCCA TTTCCGCGTA 240
TTTTTCGCAC ACGCCAATTC TGTCAAGACG GTGCACGGTA CTTTGGTCCC TTCCcTGACG 300
TGCAAATCCT CGATTCTTTT CTTAAACTCA TTTTACGCAC CTATAAAATC CGTACGTGCA 360
CCACcTTGCG GAAGAGAAAA AATCCTTGCC TCTACTATCA CCTGAAGCGC TGCGATGCCC 420
CGTGTTGTGG ATGGGTCTCT CCACGCACAT ATCAAAAGGA CATACATGAG ATTACCCTGC 480
TGCTCGAGGG GAATATTGAC GCGACTGTAG CGCGTCTAGA AAAGCGCATG AAACGAGCAG 540
TCCGCCAAGA AGCATTCGAA GCTGCCGCGC GCATACGCGA TGATATCCAG GCAATCCGCT 600
GTATTACACA CAAAAGTCTT GTTCAAGACA TGGACGAACG TGCACGCGAT TACATCGCCT 660
GGTCGAGCAC GGGAGCAATC GTCACCTTCG CCGTTCTACG CATGCGGGGA GGAAAATTAA 720
ACGGTAGAGA ACTTTTTCGC ACACGTTCAT TAAAAAATGA AGAGGAAATC CTTTCAGAAT 780
TTCTCATCAC TTACTACTCT GACCATACCA TACCCCCACA TCTATTTGTA CACTCGTCTG 840
CAGGGTTAGC AGAACACTGG CTCAGCCATA AAGCAGGTAC ACAATGTACC GTCACGCTCA 900
TCCCTTTGCA TACCTTTCCT ACGCCGCAGA CCCCTTCTTC CACTGTCACC ACAAACGCTC 960
CTACCCTTGC AGCTTCGCAA AATAGCAATG CAGTACAAGA TTCAGGGTTA CGTTCTTGCA 1020
GCGAAACGTC CACCATGCAT ACGCTTCAAA AAGCACACGA CGCCTGCACT GCAAGCGAAG 1080
GCACACGAGA AAACACACCG CACGAGAGCG CGCACACTCC TCATCACCGC GCCATTTTAG 1140
CCATGGCGCA GTTAAACGCT CATGAAGATA TTACTCGGTA TCTGAAAAAT CGCGGCGCTG 1200
ACGATGCACT CAAGGAATTG CAAAAGCAAC TGCATCTTGC ACGCATTCCA ACGCTCATTG 1260
AAGGATTTGA CATTTCCCAT TTGGGTGGAA AGTACACTGT CGCAAGTCTC ATTTGCTTCA 1320
AAAATGGGGC CCCCGACACA AAGAACTACC GATTGTTTAA TTTACGTGCG CACGACACCC 1380
GTaTTGACGA TTTTGCATCG ATGCGCGAAn AATTgCCCGC CGTTATACCC ACACACCAGA 1440
GGGCTACACT CTGCCCGATC TTATCCTTGT CGATGGGGGG AATCGGTCAC GTTTCTGCTG 1500
CACAGCACGT CCTCGACGCT CTTGGTCTTA GTATCCCGCT TGTAGGTCTT GCAAAACGCG 1560
CAGAAGAGCT ATTTATCCCC AATTCTCCTA CACCACTAGT TCTGGATCGT CGCAACCyTG 1620
CACTGCATAT GCTgCAACGC ATCCGAGATG AAGCACACCG cTTTGCAATC ACACGGAATC 1680
GGCATCTACG CACAAAGAAA GAGCTAGTCT TAAGCTTTGA GCGTCTCCCC CATGTGGGCA 1740
AAGTGCGCGC ACACAGACTG CTTGCTCACT TCGGTTCGTT CCGCAGCCTG CAGAGCGCAA 1800
CTCCCCAGGA CATAGCGACA GCCATTCATA TACCGCTCAC CCAAGCACAC ACCATCCTGC 1860
ACGCGGCAAC CCGCTCAACA ACCGCCCCTG TACGAGAAGA ATATAAAGAA CACGAGCACG 1920
ACCCCCAGGG AGAATCACCT GGACCAGGTC GGAAAACAGA CTAACGCGCA CCCGGCCTAC 1980
GACGACGCAT CCAGGAGTCG CTCAAGCTCA TTCTTTCCCA GAAGCTCGAG CGGACGATTT 2040
TCTGCAAACG AAAGCGCAGC GcgTGAAAAC GAAGAAGACG TCACCACCAC CCCACGCGCT 2100
ATACCACGCT CCTTCATGCG CTCCAAAATA GCGCGCAGGA AGCTGTCATC GAGCACCCGC 2160
GCTTCCCGGT AAAAAACAAC GACTTTTGGC TGCGTACGCA CGTTGCTCCA ACCCTTCTCG 2220
TCCCTTTCTG TGGCCACTAT TTCACAACAC CCGCGCACGT CCTGCACCGA GTCTATGGAT 2280
AAGCGAAGCG CcTTACTCAC AATTCGCTGA CACAGATCGA AAAAGGAATC GCCATCGAGG 2340
GTCAAATACT CCTTCATCCG GTCGTTCAAA CGTATATCCT GATATTGCGC GAGtTTATGG 2400
TCACATCCTT AAAACCCGCA GACCGTGCGT TGATAGCCTC CCACTGCTCA ATTGCCTTTT 2460
CCACATCCCG CTTTTTTTCG TAACAGGAGG CTAAGAGATA ACGTATCTGC AAATCCTCAT 2520
TTTCACCCGA CTGCGACTTG CAGCGCAGCg cAcGGTCAAA CTCAACAATG GCCCGATCCA 2580
CATCCTTCGC ATCCATGTAG CAACAGCCAC GCTCGGTAAA ACACTTCTGC CGCAGCACAG 2640
GACTCCTCGA CGCCTGCTCA AAGGAGCGAA GTGCCGCCGC ATACTCCCTT TCTTTCCGCA 2700
AGAGCTGCCC CTGGCAAAAG AGAGCCTCCG CATTTGCAGG TTCCAAGGCT AGCACAGTAT 2760
CGAGCTCTGC CCTTGCTTCT GCAAGATGAC CAGACCGAAG AAGCAACGAA CCGAaGCGmr 2820
CATGCGCTGC AGGATGCGCA GygTCCATAC CGaTGCACTG GcGATAGTAT GCCGCAGCCC 2880
GATCGGTCAC ATTCTTGCGT TCAAACAGCT GCCCAATACG GTAGTGATAA TCTGCACACG 2940
TAGGGT 2946
(2) INFORMATION FOR SEQ ID NO: 153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1905 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:
TACTGAAAAA CCGACCCGAG TAGCTGCCTC CCCCATAATT CCcGTAGACC aTAGGCTGAa 60
TAAGGATGGC AACTCCaTCT TCCCCaTCCT CCTCAAACGA CaGCAGTTTA CTAACAaGCC 120
TTACCGCTTC TTCAAGCTGA ACGTAGGCAT CATCGAAAAA GCCGTCTGGA aGCAAGGCTC 180
GATACCGATT CATCGCCTCG CGCCCAGAGG GAGAAGATTT TCCTCCCTGC AAAAAAACCT 240
CTATTTCCTT CAGGGTTTCG ACAAACTCCG AAGCACCCTT GGTGTTCTCC TCACTTTCTG 300
CGATACCAAG CAATACCCCA AATACACCCT TCAAAAGGAA GAACACCTCG TGCGTTGCAA 360
AATGCTCCCC GACACGCTCG GCAAACCCCG CGAAAGTATC CCGTGTCAAA CCAAAGTTAT 420
GTAATACCGG ATAATTCGAA ATGGCCAAAT TGGGGGATAA AACCACCTTG AGGAGCATAG 480
GATTCTTTGC ATCCGCATAC TCTTTTCTGT TTAAAAGCGT AAACTTCCTT AAATACGGAC 540
TTAGCGCGGA CCGAAGCTTC TCCCCATACA GACTACGAGA GACTGTCGCA TCAATAACCA 600
CGCTCGGAAG CACGGGCAAC CCCAGACTTG ACAACTCATC CGCCTGACGA CCCCGAATCC 660
CCAAAAGCCC TCGGTCAAGC CTCTTATCGA GAGCCTCTTT ATTGCTCAGA AAATGAATGG 720
ATTTGGCAAT GTTCATTGTG CACTCCTAAA ACAAATTCAA CACCGTGTCT GGGGAACCGT 780
GCAAAAACAC CTCAAAACGA GAACTCGCCC CCTCCTTGAA ATACTTAGAG AGCTTGGACA 840
AATCCTTAAT TTCCTCACCT GACACTGCAA ACTGAATCAC ACTCCCGTGC TTCACCTTTC 900
CCCATTTGAA GAGAGTGTTA ATATCCACAA TACGCTCACC ATCGTAGAAC ACAATGACCT 960
CTGAAGAAGG GTACCGCGCG TTGTAACTCC TGATAATACG CTTCCACGCT TCCACATTCC 1020
CGTTATGAAA CAACTCGTTA GAAACAGGCA CCGATATTAA CTGAGACATC CGAATCGGAC 1080
CTGcAGAGGC TTGAGCAGGA GGACGCGCGG TAGGAGACAC AGACTCAGAC GAAGCCCGAT 1140
CACAGGTAGC CCCTTCAGAC GCCCTGGCAG CGTCCGCACC CGACTTCTTT GCCGATCGAA 1200
CCCTCTTGCC AGAACCCTCA GACTTTTTCG TAGACACAAA TGCAAAAACC CCCGAAAGGA 1260
TTTCTTTGGG AACCACGAAC TTACGATTCT CTATCAGAGC AATCAAACCC TTGGCGrCtT 1320
CGTCTGCAAT CCGTTCATCA AGCGGTCCCC TATCCTGCTT TCCCACGTAC ACCAACAAGA 1380
GCTCATTTTT CCGAAAACCC TCGACAAGGG AGGCGTTCGA AGGATTCTTC GGGTTTATTG 1440
CCAAAAATCC AAGATCGGGG TGATGATAAC CAAGTACAAT ATCCACTCCC TTCCACCGCG 1500
CAGTTTCCTG AGCAAACCCG AGACCCTCGA CGGAAACCTT CCTTAGATTG TAAGAAAACA 1560
GCCGATACCC CCACGTGTCG ATGAGTAGCA CCGACAGGAG CGCTGCAACC TGCTCGCTTT 1620
GCATCTCcTT CGCACAGAGC AGCAGGTTGA CCTGGTCTGC GACGTTATTC TTCGGATCAC 1680
ACTCATCCAA CACGCTCAAA ACGCCCTTCA CATGACGCAC TAGAGCATTA AAGTTCTTCT 1740
TCACCGCTGC AGGCACGTAA AAACTAGATG CACTCACCAC GTAACACTCC TTTGATGATT 1800
AAAAAGCTCC CCGAAACAGA AAAGTTCCAG GGAGCTTCGG ATCACGACAA AGTCTATCTA 1860
CAATCTCGCT AGAGTTCGCG AATGGCAGAc CGTCTACCAC GTCGA 1905
(2) INFORMATION FOR SEQ ID NO: 154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1370 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:
AGGATCTTTT CCTCAGTCAA CTCCAAACGA TCTCCATCTA CATCCAGATA CAGCGTCGTC 60
CCCTCGAGTA TCTCCTGAAT TTCTGCAGAA GACAACCGCT CAATGCTGAG CGCTGCACGC 120
TTTGTCTTCG ATCCAAGCTC TTTTCCGAGC ACTCTAAAAT TCGCTTTTGC ACGGTACTCT 180
ACTATCTCGT CTTCTTTCTC ATGAAACACG AGCTCTTTTA CATTAAGCTC ATCAAGCACA 240
TCTTCTTCCA TTTCAAGAAG TGCTGAGCGC TCCATCGGGT TACGCGTAAT AACCTGCATA 300
GCTTTAAGTG GCTGGCGTAC TTTGAGGTTA CACTGCGCTC GGATCGCACG CGCCATAGAA 360
ACAACTCGCT GCACTGTTTC CATTTTAAAC TCGAGTGCAT CGTCTCGCAC CATTGGTGTA 420
CAAACAGGAT AGTCTGCAAG ATGCACAGAc TGCACATCAT CCGCGGCGCG ATTATTCTGC 480
CATATACTCT CGGTGATGAA TGGCACCACG GGAGCAATAG CGAGCACACA TCTTTTCAGC 540
ACGCAATACA ACGTGTTGTA CGCACATCGT TTATCTTCAT CGTTGATGCT TTTCCAAAAT 600
CTCCTTCGAG ATCGGCGGAT GTACCAGTTG TTCAGCTGAT CTACATACGA AACGATAGGA 660
TCCGCAACTT TCGATACATC GTAAGCATCA AGTGCACAGG CAATGTCTTG CACCAATTTT 720
TCTGTCAGCG ACAAGATCCA ACGATCTAAC GGGTTATTCA AATGCGTCGC TAAACGCGTG 780
ACCGCCTGaC CCATTcCGTc AACTTTTGCA CATACAGGcA GGATCGATAC CATCGATGTT 840
CGCATACGTA ACGTAAAAAC TGTAACTATT CCACAATGGG ATAATCACAG TCTTCAAAAT 900
ATCTTTCACC CCTTCGTCAG AATATTTTAA ATCATCCGCA CGGACAACCG CAGAACGAAC 960
AAGAAAGAGG CGGACGCGTC ACACCGTAGC GATCCATGAC TTCATTTGGA TCCGCA AAT 1020
TGCGCAGGcC TTGGACATCT TCTTTCCATC AGACGCAAGT ACCAACCCCG TAACGATACA 1080
GTTTTCAAAC GCAGGACGCT CAAAGAGTGC CACAGCCAAG ATGGTAAGGG TGTAAAACCA 1140
CCCTCGCGTT TGATCTAACC CTTCAGAGAT AAAATGAGCA GGGAAATACC GcTCAAAGTC 1200
AGTTGcATGT TCAAACGGAT AGTGcTGcTG CGcATAAGGc ATTGcACCAG ATTCAAACCA 1260
ACAATcTAGC ACCTCAGGkA acGCGTCGCA TcACACTCCC AcAGGsGcAA GGaATTGTTA 1320
CCATATCTAC AACGTGcTTA TGCAAATCTT CAAGCAACAT GCCGGrAgTT 1370
(2) INFORMATION FOR SEQ ID NO: 155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1073 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:
ATGAGCGATC TCTTTTTTAG AAAGAACCTG TCAAAAGGAC AAACGGAGCA TAATGCACAT 60
CGTAAGAGGA AAATAAAATT TCTTGTAGAA TCGGATTGAC AACTCTCTAT GAGGCTTTGA 120
CGCATATTTT AACGTATGAA GCGTCTCTTT TATTTTCTGG ATGGACATmC CTTTTGTTAA 180
AAAATCCTGA TGATCAAATA ATTCAAGCGG TTCTCTCAAC TCGAAAGAAT CAAGATATTC 240
AGAAACCATA AGTTTCATTT CATGAGTGAT AGTATATACA ACCGGTTTTA TArGAACCCA 300
CCTGTCTTTC TGCCAATGTG CAGTGTGTGC TTCGATAaGC TTCTGAATAG TACCGTTACT 360
ATTTTTCATA ATAACTAACA TATCAACGAG CACCTTTTGT CCTCTGTGAT ACGCACCCGC 420
AATGGAGATT ATCTTACCAT CAAAAGAAGA AGCAACAACG TCATTTTCAC TATTCCCCTG 480
AGCATAGCTC ACAGTACGTG CAATGATTTC ATCCTTTTTT GACTGAGAAA CAACTACCAC 540
ACTATTGTCA AATGCAAATA TTCCAATTGC TAAGAAACAC ATAAAACCAA ACCACGGGCC 600
TACGATACTA AAAAAAGATA ACCCACCGAA ATGGATGGCA AGAAATTCAT TTTGGACATA 660
CAAGATTCGC ACAGTATACG TTGCAGCAAA AAGCACAGAC AAAGGACAGC ACAGCAGTAC 720
ATAATGTGGG ATAGATACAT ACCATATCTT CACCAAAGAA GAAAAACGAT ACCCATTCGA 780
AATAnAmTGC GTAAGATTAA CCAGAATATC AATAGCATCG AGAATCA AC TCGCACACAC 840
GAGCAcTAGA AAAAAAATTG GTAAGAACAT GTCCAACATA TACACCTGCA GTATTTTCAT 900
ACTGCAGATT TTCTCATTGA CAAAATCAAT GCAGTAAAAA ACAATGCGGC GTTTGGAATC 960
CACAGTGCAA GGGAAGGAAC AATACTAAGC CTCAACGCAA GTTCTTGTCC TCCGATAGAG 1020
ATAATCCAAT ACCCTAGTAC AACAATAAGC CCTCCTACAA ATCCTTTCCC CTG 1073
(2) INFORMATION FOR SEQ ID NO: 156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 884 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:
TTTAAAAnTC AGGGAGCCCG AGAATTGTCA AAAATATCCT CAAAAATATn GAGCGTTTGA 60
TCAAGAAAAA TTCCAGGCAT TTCTGCAACA AATCCTCAGA aACGTaAAAA CGCAGGAAGA 120
CCCCCGcGTG CTCGACGCCT ATCGGCGCCT TTTTCGCAAG AGCGTCCCCT TCTCCATGCG 180
CTCATACGTG GCCGCACACC TGGCGCACAC GCACTGCCGC GCCGGCGCCA CCGCCGcAGC 240
ACGCGGCGCA CCGGAGCGCG GGAAGGTATA CGTACCGGCA GCGCGCACGC GCTTTCCTGC 300
GCCACGCGCC TGCCAGCGCC GCCCGCGCGC GCGACTTTGA GCGCAAAGcG cGCgGACTAC 360
CCCGCCCTCT GCCCGGGGGA CACCACGAGC ATCTTCATCA GTATTGGAAA AAATCGGCAC 420
ATCTATCCGC GCGACATCAT CGCCCTGCTC ATGCAGCGCG CCGATGTTGC ACGCGAGCAC 480
ATCGGCACCA TCCGCATCCT CGACCACTAT TCCTTTATCC AAGTCCTTTC GGGTGAGGCA 540
GAAGCGGTTA TCGCCCGTTT AAATGGCCTC TTTTACCGGG GGCGCACCCT GAcGGTGAGT 600
CACTCACGCA GGGCGGACGA GCATCCCGCT CCTTCTACAG AGCCGCACGC CGCTGCCGTT 660
GCACCAGAGC CTGgACTTTA TGGgCAGAGC CcATCCCCGg CCCTGGgAAG AATAAGGgAA 720
CATGgCCGTt CAGTGcAGGG gCGCCGgCCc GGGGCGCGGT GGGCAnTTCG GTGCGGTGCG 780
GCAGTTTTCG GTTGACAGGG CTGCGCGTGG GAATCCTACC TGnAGCGGCG CTCAATGGAG 840
GCGCACGTnC AGGGCCGTTT GCCATTATGG GCAGCCCGCT TTCT 884
(2) INFORMATION FOR SEQ ID NO: 157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3247 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:
GGGGAGGTTT ACATGCGTAA TACCGGATCC GCCACCTACG CCCGCATCGG CGGTGTGTTC 60
CTCTCTTTCT TTGTGCTGTT ACCCGTCTTC TGCCATGGCA GCAAAGAGAA GGGAAAGGAA 120
GAAGAACCGG TTCGCCTCTC AGTCCTCATA CGAGAGAAGC ATTACTCTTC GGGCCTGCAG 180
AATGTGTTTA CGAAGTTGGA ATTGGAAGAA GGAATCGCCG TCACCGTCGA AACCATCCAG 240
GACGATCAGT ATCCTACGGT GCTTCACGCG CGCCTTGCAG ACGGGACCGC TCCGGATGTT 300
GTAGAGGTGT CTCTTCCCTC GCTCCATGCC CTTGACCCAT ACCTTTACTT TGTAGATCTG 360
AGCAAAGAAG CCTGGATACC GGATCTACTG ATTCCTCCCA CAGATCCGTA CGGCAAGACA 420
TTTGCGCTTC CCTTAAACTG CGCCGTGTCT ATCAATGCAC TTTTCTACAA CAAGGACCTT 480
TTTGATCGCT ACGGGATATC CGAGCCCAAA AGCTGGAATG AACTCCTAGA AAGCTGCGCT 540
CTCATTGTAA AAAGTGGCAT TTCTATCGTA CCCCTCGCGC TCAGCACAAC GGAAAGCTTT 600
CCACATACGT TGCTTGCTGA CGCGATTACG AAAGTGCTCG GTGAGCAGGG CGCTCGAGAT 660
TTAGTCAAAC GTGCCACAGA CGACTCCATC GATTGGACGC ACGAGCGTGg CtCTATCCTG 720
TACTCGGAGC CTATCTGGAA CTGTTCAAGC GGGGATACGT AAACAAACAC CACCGGACTG 780
CGCGCGTGCg GAAaTCATTC ATGATTTTAC ACGCGATCGC ATCGCTATGT ACTTTGGCAG 840
TCACCTGGTT GCAGATGCAA TCATAAAAGA ACGTCCTGGA ATCAACTTGG GCGCGTGCGT 900
CCTCCCTATA ACCGAAAATG CACAAGACGT ACTGACTGGA AGTTTGGAAG TGCAAGGACT 960
CGCAGTGCAC AAAAAAAGCG CGCGTGTGGC AACCGCGTGT CGTGCACTCT CTGTGCTTGC 1020
GTCTGCCGCG TACCAAAACA GTTTCTTTGA AGAACACAAA GGGCTTCCTG CGTTTCGAAA 1080
CACCACCAGC GCAGTTATTC CTGCGTGCCT CAGTGcCCTG TTTAAAAGCC ATATAGAGAA 1140
AGGAAAAGTA ATACAGGCAA TCGACGCGTA CGrCAGGCGC AAAACACACC CCACAGAGCC 1200
TCTGTTTTTC CAGATTTCGC CGCGTATGTA ACCGACCCGG CACCAACTGC GCACACCATG 1260
CTGCACCGCG CCCAAACTGA AGCGCGGAGG AGAAGAGAGC CGGTACAAAA AAAAGAATGA 1320
GAGCTCCTGC GCGGGCAATA CGTGCCACGA GCCCGGCAGC TTAACCTACC GCGCGACAAG 1380
ACAATCTTGC CACAGCCGGT CGAGGTCATA AAaGGCgCGC ksCgCGyTcA TAAGATGAAC 1440
CAGGATAGAC CCAAAGTCCA GCACCCGCCA TTGCTCTTCG CAAAGGCCTC GTTTTTTTCG 1500
ATGTACTTCT CTTAGGCCAA AGCGAGCCGC CTGCTCGCAC ACCAGACGAT GAGTGCCGTG 1560
CAAGAGGCCA GGCACAGTGG CAACTACCGC AAAGTCCGCC CAGCCGCAGC GCGCGCTTAC 1620
ATCAAACACA CATACATCCT CCGCGCGCGC ATCACACAGA GCCTCTGCTA CCGCGGAAGC 1680
AGCTCCGTTA GCACTCACGT TCCCTCCTTT ACCAGATATC CATTAAAATC CTTCCCCAGG 1740
ATAAGCGTAA AATCGATGCC CGTTTCCACC CCATACTCAT CAAGCGAGGC GCTAGTTGTC 1800
TCAATATTCT GACAACGAAT CACcTgCGCC ACCACTTTAG CCACTGCAGG ATTCCCAATG 1860
CGATCGACGA GCACCGTCTT TTGCACACTT TGCTCCAATG CATTATCAAC GCGAACTACG 1920
TCGTAACCAA ATCCTTGGTA AATATTCGCC GTCGTGCGCG CAAGACCGTG CGATTCAGTT 1980
CCGTTAAGAA TTTCCAACGC ATACACACGC TCAAAGGCCG TACCATTCTC TGACGCAAGC 2040
ACCGCAAGCG TCTGGCGCAC GATTTCCTTA ATTTGCTGTC CATCaCGAGA TGGAAAAAGG 2100
AGTACCTTGC CGTCTACCAC TCGTTTAGTC CCTGAAAAAC GCTGCGGCAC TAGGCGTTCT 2160
GAATCCAACT TAGATAATTC ACCTATAAGC TTTTTTAAGT CAGCACGCCG AACGTTAGAA 2220
CGTATTAGCC TGTTCAAGGA AAAAGCACGA GTTGAATGAA CAAAAAACTC ACTGTGATCA 2280
TTAACACTAC GCAACAAAGC TAAAATAGCT TTCTGTTTCC TCGATGCTGA CTCCCCTTCT 2340
CCCTCATCCT CGTCTTCGTA TAAAAGATAA TCACGCATCT TATCCCCATC CAAAGACACT 2400
GATCcTGACG GTAACAGGAC ATGCCCTGCA CGTTCTGTGT GCACGTCGAT GGGCGTAGGG 2460
ATAAACACCG ACAGACCAGA GAGTAAATCT GTCAATTTAG AAAAGTTATC CAGCGAGCAC 2520
ACTACAGAAA AAGGAACGTT AATTCCCGTT AACTTTTCTA CCTCCCTTTT ATACTCCTCA 2580
ATACCACGCT CGCTGTAAAG cGAACCGATG CCATCCGTAC GGCCAAGACT CTGCAGAATA 2640
AGTCCCATAT TATGGGGAAT ATCAAACATT GCCGCGCGCC TCGTTGCAGG ATAATACGCA 2700
ACAACATTGC TGGAGATTGG AACGTTCTCG TGTTCAATGA CAAACAACAC CTTGAGAATA 2760
TTGTCGCTAG AAAGTGAGGA TTCAAGCGGA TCGCGCTTCA TACCAAAGAA GACTGCAAAG 2820
ACGGTGATAA CCAGCATAAA AAAAATGAGG AGTAAAAAAA GCCCATGTCT TCCCATATCC 2880
AACACTTTCA TTGCACACTC CCTGGCCTCT CGTCGAACTG CGGCAGA AC ACACACTGCC 2940
CTACTTTTGC CCGACTGCCA CCCGTTACGC GGGTTGACTT ACATCCTTTC CCAGTGCACG 3000
CAACATCGCA CACGTACGCG GATGGGGCGT GGTACGTTTT CGCTCTGCGC ACACAATACT 3060
CGCACTCACT ACTTTTGCAG CAAGCGCATC TAACGTTAAC GTTGACACTG cACCGCGCAA 3120
gGACGCCGCC CAAGCACGAG CTGGcTCAAt TTTGTCTGAT ATAAAAAGAA TCTTTCCTAA 3180
CACACCAAAG TCTTCACACC CAAAGGnATG CCAGCGCACC GCAGACAACA ACACCTCATC 3240
CTGnACG 3247
(2) INFORMATION FOR SEQ ID NO: 158:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1691 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:
CGAGTGCTAA AAGCTCATGA CACAGATCCG AGCGATCCTT TTTTaTGGCA AGCACCAATG 60
CCGAAaTCCC ATCCGATCCC GTTTGTCGAT TGGATATCCC TTTGCCAGTA AAAGCCGAAA 120
TGTCTGCACG GGCACATTTT CCATAACCGC AAGATGCAGT GGCGTCTGCC CGTCAGAATT 180
TGCCAGGAGC GGATCCTTCC CCACGACTGC ACCGACAAAC GTATCTCCtT CAACAGGGCA 240
AGTGACAGCG GCGTAGTCCC GTACGAATCG CGCGCAAAGG GGTTGCCTCC TGCAGCGCGC 300
AACGCCGCAA TAACTGAcTG CGAATTGCAC AGTACCGCCT CGTGCAAGGG AGTGCGTCCA 360
TACATATCCT GCTGCACAGG ATTTGCACCC GCAGAAAGGA GCATGTGCAC CGACTCTTCC 420
TGATCTGCCA AAACCGCATC CGTGAGCGCA GACTTACCCG TCTCATCCCC CATGTGCAGC 480
GCCACCCGAT GACTCAACAG CAACCGGATA AAGTCGACAT TCCCCGCACG AGCTGCAAGA 540
TGCAACGGGG GTTTCCCAGA AAGATTCCGA GCATTTAAAA GAGATACATG CCGCGtCATC 600
tTCGCGGATA AGCACGTCAG CCGAACGGAG CGCAGACCAA CGCACACACG CATGCAACAC 660
CGTATTTCCA AtGCGTCCCG TGCGTCCACA AGCGCAGGAT TACCTGCCTG AGGATGCAGC 720
AAAATTGAAA TAACCTCCGC TGCATCTGAC TTTACCGCAC TGAAAAGAGG CGTTTCTTGA 780
TTCAAATTGC GTGCCTCTAT CTCTGCCCCC TTGCGCAAAA TACCATTAAT CGCCTGCGTA 840
AGTTTCCACT CGCACGCCAG ATGCAGCGGC GTGTTTCCTC CTGTATCCTG AGCGTGCACG 900
TTTGCCGCAG TCAGAATCCA ATCCTCGCGA CCGCCACTAG TTGTCAACGC TGTTTTCAGC 960
GGTGACACGC CATGCACGTT TGTACTGAAG ATATCAGCTC CTTCCCTCAT CAAAAACTCA 1020
CCGACAGCAC GGTCATCGTT TGCAACCGCA TAGTGCAGCA GCGTGTTTCC CATCGTGTCA 1080
CGTGCAATCA TCTGTTTTGG TTCGCGAAAT AAGAATTTAA CTATGTCAAG GTGCGCACGC 1140
CTTGATACTG CAACGTGCAA AGGATCGCGC CCCACCACAT CTTGTTTATA CAAGTTGCTA 1200
TCGGTGACGA CAATTTTTAC TGTCTCCAAA CCGCGCGCAA GTGCCTTGGT AAGCGGCGTT 1260
TCTCCGCGCA TATCTTCTGC ATGGATATCT GCCCCAAGAC TGACAAAGTA CGCAtCAAAT 1320
CCCGATGATC ACGGTCAATC GTGAGGACAA GCGGCGTTTC GCCCTTTTTG TTCCGTTCAG 1380
AAATATCTGC GCCTGCGCCT AcTAGACGCT CTACAAACGC CCTGTCCATA CCTAAACGGG 1440
CCGCGACGTG CAAAGGAGTC TCGCCATAGT CATCCTTTAT GGCGACAGAA GCGCCTGCAT 1500
CAAGCAGTGC GCCAACCAAA CGGACACGGA AGGGAGCAGG AGCGACAAGG TGTAAACAGG 1560
TGTTCCCCGA CGCGTCACGC ACGTTTGGAT CGGCCCCACT GCGCAcnCAA AGACCGCaGC 1620
ATCCACCTGC CCTGCACGCA CTGCTTCGTG cAAAGGGGTG GcGCTGGaTA GTTTTTTGCA 1680
TTGAGATTGA C 1691
(2) INFORMATION FOR SEQ ID NO: 159:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1462 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:
TGCGCATCGy GTGaGAGGaT TGACATTTTT ACTTGCTGAC GAGGCAGGCG TATTCAAGAC 60
GGTGTATTCT GATTTTGAAG AGATGGCAAC GCATGAGATG GAACATCAGT CGTTACAGTA 120
TTTTTTCAAT CAAGATGTTC TTATGATCAG AAAATCGGAT CCAGTTATTG ACCATGCACG 180
TCAGGCAATT CGGACAGATA TCCTGCATTT TTTTGATATA ACGCACGCCC AGATGATTGT 240
CTTGATTCGG GAAGCTCAAA AACTTATTGG ATGTCTATGT TTTGGTATGA AAAACAACGG 300
CGTAGAATAT AGCAATCATG ATCAGCAtGT TTTAGAAAAG TTGTATtCAC ACTTtGTATT 360
GGTTTCCTAT TACTTACAGA aTATtGCAAA GCAAGACGTG GTTATCACGg TGGACAAAGA 420
ACTTAAAATG TCCCATCAGA TTATTGAGTC AATACAACGG aAAAGGGATT TTATTCAGGA 480
TGCCTCCGTT GAGGTGGATT CAATTGCGTA TTCTGCGCAC CAACTTGGGG GCGATTTTGT 540
TGATTTCATT AAACTGTCCG AGAAAAgATA CCTGCTAGTT ATCGGGGATG TATCGGGGAA 600
AGGTCTGGCA GCGAGTATGT CAATGGTGAT TTTGAAGTCT GTACTGAGTA CCTTTCTGCG 660
GGGACTGTGC CTGGAAGAAA CGGCAGTATT TACAACCTTT ATTGAGAAGA TAAACCGGTT 720
TATCAAAGAC AATTTGCCGT GTGGGACTTT TTTTGCGGGT GTATTCTGTA TTCTtGACCt 780
GGCAACCCAT ACGCTCTACT ATGCGAACTG TGGCATACCG CTCATGTCGA TGTACGTCGC 840
TTCATACAAG AACGTGGTGG AGATACAAGG CGAGGGGCGC GTGCTGGGTT TTGTTAAAGA 900
TGTTATGCCC TTTTTGCGGG TGAGGAAAGT TCAACTCGGT CAGGGGGACG TGGTGGTATT 960
TTCCACTGAT GGAATGGTAG AAGTACAAAA TTTGCAGCGG GAGCGCTTTG GTAACGAGCG 1020
TGTGGATAGG ATTCTACAGG AAAGTCATGG TCTTCCGGTT TCTCAAATTA CCCGTACTAT 1080
TTATGCTCGG CTGTGTGAGT TTATGGCGCG AGATATGCAG GATGATGTAA CTGTTCTGGC 1140
AATAAAGTGC CTTGGGCCTC GGTAGGAAAT GAGGAAAAGT TCGGGAGCGC CGTATGGATA 1200
ATATAAATAT CGCCAAAGAC GTTCGGCCTG GGTGCGTTTT ATTAACGGTG ACTGGAGCGG 1260
TCAGCTCCTA TACTTACGGG GAGTTTGAGT CGCGTGTGCA TGGGGCGCTC AAAGAGAATC 1320
ACGTTGTTTT GGATCTCTCC GGCGTGACGG CTATGTCTTC TTCGGGATTG GGGGTGCTTA 1380
TCTCTGCATA CGATGAGGGA CTGAAGTACC AGCGTCGTCT GTGCATTCTT AATCCTTCTG 1440
AGAGCGTAnC AGAGCGATAG AG 1462
(2) INFORMATION FOR SEQ ID NO: 160:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1013 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:
TGGAACGCGC GGTTATTACG CGTCCTGCAT ACGCGGTGGA TTATGCGGTG CTATTCCCTG 60
TACAACTTGG TATTGATTTG CAAACAAAAA GGGTGAGCGG GCTCTTTTCT GCAGGTCAGA 120
TTAACGGAAC ATCCGGCTAT GAAGAAGCTG GAGGTCAGGG TATTATCGCC GGGATTAACG 180
CTGCGCTGTA CGCGCGCAGT ACTAAAACCA AAGAGGAGTA TCATCCATTT GTTCTGAAAC 240
GCGACGAAGC ATATATTGGC GTCATGATAG ATGATCTTGT AACACAAGGA ATAGACGAAC 300
CCTATCGGAT GTTTACCGCG CGTGCGaGTA TCGTTTGAAA CTCCGTCACG ATACTGCGGA 360
TGAACGTCTT ACAGAAAAAG CTTACGCCAT TGGGCTGCAG AAGAAATCTG CTGTAGAAAC 420
GTTGCAAAAA AAGATGCGTA CGAAGCACGA GATCTTGCAT CTGCTTCAGA CCAACAAAGT 480
TAGTCTTACC CATGCAAACG CATATGTTCA GCTGAAGCCG CATATAGGTA AATCGTTTGC 540
AGcTACGCTA CGTGATCCGG TAATACCTCT TGGGCTTAwC kCTTCGCTGA ACGAGCAGAT 600
AGCGCAGTTC CCTTTGGAAG TGTTCCAGTC GGTTGGGGTG GAGATACGCT ACGAACACTA 660
CATCGCTGCA CAGGATCAAA GAATTGCACA AGTGGAGAAA ATGGAAGGAA TAAAGATACC 720
AGCGCATTTT GATTACGCGC GTATATCAGG TCTCTCTGTA GAATCCCGTA CACGATTGGA 780
ACACGTTCGC CCGGACACTA TCGGGCAGGT TGGGAGAATG CGCGGAATCA GACCCTCTGA 840
CGTAATGCTG TTGCTCGCCC ACTTAAAGCG GTAGCAGCTA CCGCAGAGAT AGAAGAACCG 900
CCTTATCAGG CAGGTGTTTG TACGTACTTT TAACGCACAG CAAGGAGCGC TTCGGCGTGA 960
AGTTCGGTGA TAAGGCCACA GGAGACCATA TCAAACAAnT GTnCGCTATT TGT 1013
(2) INFORMATION FOR SEQ ID NO: 161:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1129 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:
GTAGTCACGG CGTGTGCCCC GCGTCAACCA ATGCGGTAAT ATCTGCCAGC CGAATTTTTT 60
CATGTGCAAG GCAGTCCCCT ACGCGGTTTT GCGAATCGAC GAGTACTGAC AGATATACAT 120
CTTTCAGTTG TGCAGGCGCT TTATAACGCA AGAACTGAGA AGCTGCTGTC CGATCAGCAG 180
GCAGGCGCAT ACGTGCACGT GCAAGATCTA GCGTAAgcTG TGCGTTAATG GTACCGTTTT 240
TCCAATCCTG CACCGCATCG GTGTGTACGG CATCCTGGGT CTGCGCACCA GACGGCGCGT 300
GAGAAAACAG AACAAAAAGG AGGAGGAGAA TGAAACGCCT CGCACTGCTG CGTACCACAT 360
GGGGCGCACT TGCGTGCGTC CACACCAGGG AGAGAGCGCC GAACGGAAAA CGGCACTGAT 420
TTGAAAAAGA GGCAGCACGC GTCAGGCAGG ACACTCCCAT ACCCTACCGT CTCGGCGTTT 480
TGAGAATCAT TCCCAAGGAG AGCGTGGCGT CTACTGCTAG ATTGTTTTCC TCTGCCAGGT 540
TTTCTACTGA AACGCCGTAG CGCTTTGCTA AGGACCAGAG GGTATCsCCC TGCTGCACTA 600
CATGCTTGCC GACAAAGGGT GCTGCAGGGG TTGCCCGCGC CGCTGTGCTT TCCGGCGGTG 660
GTGGCACGCT CGGGGGTGCC GGCGCGTCTG CTCCCGCGCG CGTGTCCGCg CCCCCGGGCG 720
TCGGCGGCGT CATTGAGGAC GAGGGAAnTG CGGACAtTCC GGCAtGCATG CTCTTtCTGA 780
TCCATTACTC CAGGTGCACG CGTGCCTGCC CCCCGCAGGc GCGCGCCGGG TCTGCCCArT 840
CCGGATGGTG GGAATGATGA GCTTTTGTCC AATCTTCAGG TGAGTGGCGC TGTGTGCGCG 900
GTTATGGGsT TTTAGCGTAT CGACGCCCAA CCCATAGCGG CGGGCGAGTG CGTAAAGCGT 960
GTCTCCTGAA CGGATGGTGT GCAGGGTGTG ATGGACGAGC ACGGCGCCTG GGCGGTTGAG 1020
TACTGCCTGG ACCGCCTGCG CATGTGTGCT TGGCACCCGG AGGGTGTACG CTGCATTCGG 1080
GGGAGTAATT GAATAGCGGA GTGCAGGGTT GAGCGTGTGC AACAATTGC 1129
(2) INFORMATION FOR SEQ ID NO: 162:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1713 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:
GTGAGCTGCG TGCAGTAGAG ATTGGAAAGC TGAAACAGGC GATTTATTCC CTCGAAAACG 60
ATTTAAGGAA TGAGAAAGTT TCGGCTGAAA GGCGGCTCCG CGTCCTCGTC TCATCAATCA 120
TTTTTTCTGG CTCGTTGATC ACTATGTGAA TGTGCAAGAA GACAAGAGGA ACATCGACGA 180
AGTGTTACTC AAAATTAAAT TGCTTGATGA AGCGGTTTAT CATCGCTATG TGAGGATAGC 240
GTGAAAGTGA GGGTGAAGTA ATGTCAGAAC ACATAGAACA CGACGTTCGG GAAATGCTCA 300
ATGAAGAAAA ATGGACACGC GCGACCTTAC CGCGTATTCT GCGGAAAAGT TTAAGGAACT 360
TGACAGAATC ATTGCGGAGG CGAAAAGACA ATCTATCCTT GATGTACTGA AAGGTATCTG 420
TGACGAACAT CTGGCGCACT CGAAGAACAG TATAATCGCG TTATACATTT CTGGGATTAT 480
TTCGCTTTCT AaGCAGTTGT TAGATGATTC GTGTTTAGTG ACGCTGCTGA CTATCTTTGG 540
TGATAATCAC AAGAATCAAA TAGTTGAGCA CCTCTGTACC CGTGTGCTTG AGTACGGTGA 600
ATCAAAGCTT GCGTTGCGTG CGTTAGGAGA ATGTTACAAA ACCTCTGGAA ACGAACAGCT 660
CTATGATGTT TGGGAACGGT TAGTTAGGAT CGATTACGAA GAGGCGGAAA TCACTCGTGT 720
GCTGGCGGAT AAaTACGAGC mGGAAGGGAa TAAaGAGAmm sCTACGGAGT TTTACAAAAA 780
AGCGCTGTAT CGTTTTATCG CGCGGAGGCA GAACGCGGCC ATAAAGGAGG TTTGGACTAA 840
GCTTGTTGCA CTGATTCCAG ACGATGTCGA GTTTTTTTAT CGTGAGCAGA AGAAAATTTC 900
AGAGAAACTG GGAGAAGGGC GCGGGAGCGT GCTCATGCAA GATGTATATG TCTATTACAA 960
AGAAAATGAG GATTGGACAA CGTGCATCAA TATACTCAAG CATATTCTTG AACATGATGA 1020
GAAGGATGTT TGGGCGCGTA AGGAAATCAT AGAGAATTTT CGGTGTAAGT ATCGCGGACA 1080
TAGCCAGCTT GAGGAGTACC TAAAGATATC GAACATTAGC CAATCTTGGC GCAATGTCTT 1140
TGAAGCCATT AATGATTTTG AAAAGCATAT TTCCTTTGAC GAGGGTAGTT TTGTTTTTCA 1200
TCGAACGTGG GGGGTAGGTC GGATTGCGAA GGTGTGTAAC GATGAGTTAC TGATCGATTT 1260
TGCGAAAAGG CGTGCGCATA CCATGCTTTT GAAGATGGCT ATTAGCGCGT TGCAAACCCT 1320
TGGCAAAGAG CATATCTGGG TGCTTAAGTC GGTACTGAAG CGGCAGGATC TTGCTGCGAA 1380
AATAAGGcAG GATCCTGAAT GGGCACTGAA GGTGATCATC ACAAGTTTCG ACAATAACTG 1440
TAACCTCAAA AAGGTTAAGC AGGAATTAGT TCCTTCTTTG CTTtCTGTGG GGGAGTGGAC 1500
GAGTTGGAGT ACGAAAGCAC GGAAGATTTT GAAAGAAAGT ACTGGATTTG CTGCGAATCC 1560
CAGCAATATC GATTTTTATA CGGTGCGGAG CTGTCCTGTT TCCCTAGAAG AAAAACTtGC 1620
TGTGGAATTT AAGGCACAAA AAAATTTCTT CGCGCGCATC GACATCCTCA TACCTTTATG 1680
GACAAGGGCA GATACAGATT TCTGGAACGG GCC 1713
(2) INFORMATION FOR SEQ ID NO: 163:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 717 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:
CCGTTATGGT ACAGGAGCGG AAGACTTCGA GTAATAAAAC CAAGCGACTT GAGTGCTTCG 60
TTCAAAAACC CCTGGTTCCC CAGGATAGTG ATCCAGGCGT TTATCCTGAT GAGAGAATTG 120
GTCCAAAAGG GGACGATAAT TAGGAACAGG AAGAACGTTT GCCTCCGGCT GCGAGAGAGC 180
GCGTACCTGC AGGAAGTGCC AGGAGCACGC ACAAACAGGT GACGCCTGTA CTGATAAGCA 240
GGGTGCGAGC CAGCAGTGCG CCATAGCCGG AGGTAAGGAC TTGCGCATAC GCCCGGATGG 300
AAAACTTCCA CACAACGCCT CCGTACAGGC CCTTTTGCAA AAAGCTGTAC ACAGCGACCA 360
CCGTAAGGGG GCACAGAAAG AAGACTGTAA GCCACGCAGA CAAGGGGAGC GTGCACCAAA 420
AACCTAAACT GCCTACGGTA CGcGTCGCGC CACGCGCAGC CGGsTGCTcA CGaCTCGATG 480
TCCTCAACCA CGTAGCCGTC GTGTGCAGAC CAGGACACGT AAACAGTGTC CTTCCACGCA 540
ATCTCAGGGC CAGTGTCGAG ATACTTCTGG TGTTGCTGAA ACACTTGGAT AATAGCGCCA 600
CTTTCTAACT GGACGAAAAA CTTAGATTGG AACCCTGCAT ATACAGGCTC CTCTACAAAA 660
CCGCGAAAAA CATTGAGCGG CGCACTGGTG GTACCCGGGT CTTCAAGGGA AATGTGG 717
(2) INFORMATION FOR SEQ ID NO: 164:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1283 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:
GCTATTGTCT GTCCCAACTC AGTACGTGTC TTTGCAAATG CTTTCCCCAG GGCGTACCTC 60
AACTGTTTTT TCGTATCGTT GAGAACAACA TCAAGCATTA CCTTCTCGAT ATGCTGATAC 120
AAAGCACTGA TGGTATATTG CATCCCTGCT GCAGCAACGC GTGCACGCAG TGTGTCGCTT 180
TTTTTCTCAT CCAGGAGGAC ATCACGTAGC CAATcATTTC CTTCCCCAAG ATTATATTTA 240
GCAACAAGAT CAACAGCCTT ATGTAACACC GCATCAACGG GATCAGACTG TGCCCGGTAC 300
AAAACATACG GGAAATACGA CGCACACAAC ACCCGTTCTG CAACTGCTAa TGCCTCTAGT 360
CTTGCTCTGT AaTGTGTATC CCTAAAACCT TCAATGACAA TATTACGCGC TTCATCTGTA 420
TCGAGCAAAC CTGCTCCTTT AATCGAGGCT ATTCTCAGAA TAGGATCTGA CTCCTCAGAT 480
AACACGGATA ATATTTCCAC CGCATCGGAA CGTCCAATAC CTGCCAGTCC TACAGCAGCA 540
GAAGCTCGAA CGACACTATT TTCCTCCGAG CTCATTACCA CCAGTTTAAA AAAATCAAAA 600
GcATGCTCAG CATGCAACTG CTCAAGAGCT GCCATAACGT TTTGTTTCCT GATCAACGTC 660
TTTTTATCGT CATCGAAATG AATATTCTCA TAATACCTCA CTAAAAATTC GGAATCTTCC 720
GTACTCCCCA TGTTTCCAAG TGCGACAATG CATTCATCAG CATACTGAGA CACTTCACTG 780
CTCAACACTT CCCTTAATAA AGGAGTCAAC TCTCTCGCCT CAAGACCTGA GATGTATCGT 840
ATCGCTGACT TAACTACCAT CGGATTATGT TCCGTTACCT GTTGCAATAC ATCCACCGCA 900
ACAGACTGCG CACAATCATT TTTCTGAAAC AAGAAAAAAT CAAACAACAA CGCCTTTAAT 960
TCAGAACTTT TTGTGCGTCC GCACAACATA CACAACGTTT CGTTTAATGA GGCGTTATTT 1020
TCCTTCTTTA ACTCTTCTAT CAAGGAGATA ATATCAGAAA CTAATCCATA CTTGATGGTA 1080
TTCATCCTTT TCTTTAGAAG AACCGTTTTC TCATCAGTGA TATCTATATG ACGCTCCTGT 1140
TCTTGCACGT GCTGCGTTCC CGTTTCCTTA GCAAACATGG GGACACCCCT CCCTAAGAAA 1200
AGGATTGCnA CATGCACGCT ACAGTAGTGC ACGAAAAGAC TGCnCCCTTT CACCCCATTT 1260
CCCCTCCGTT CCATTCTATA TCT 1283
(2) INFORMATION FOR SEQ ID NO: 165:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2529 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:
CAAATCCAAC TGnAAATAGA TACCCCGGcT TAACCCCCAG ACAAGTGAGC GCGTGAAACA 60
GTGTGCACAG CGTCTCTACC GAATCGTTCA CCCCGCGCAA CAGTACCGAC TGCGATTGCA 120
CAGGGAGGCt GCGCcTACGC AGGCCnCGAG CACCGCGCGC TGCGTAGAAC CGAGCTCTGC 180
CGGGTGATTA ATATGCGGAA TTATCCACAC CGGCTTCATC TCCTGCAGAA ACGCAATCAG 240
CTCGGGAGTA AAGGCCTGCG GAGCAAAGGT GACTGCGCGA GTGCACAGAC GAATAATCAA 300
ATCCGGCGCT ACACTGgCAG TGCsCGGAAA AGCGATGTGA CCTGTGCAAA AGAACCAGTG 360
AGTGGATCAC CACCTGAAAC CAGGATTTCC TTCACCGAAG GGGTAGCACG AAGATACGTA 420
ATAATCTTCT CGCGCTCTTC GTTGGGGATC CACCCTGCAC GTTGGGCGAT GAAACCGCGG 480
CGAAAACAAT AGCGACAGTG TGAAAAGCAA CGTCCTGTTG CCAACATCAA CACACGATTC 540
GCATACTGAT GCACCAAAAA GGGTGTCACG CAtACCGGTC CTCACCCAAT GGGTCGGCAC 600
ACTCGCAAGC ATGCACCACA CGCTCCTGTG GCGCAAAGCA CACCTGACGT TTCAGCGCCT 660
GgCGTCCGCG CCCTGCGcTT GTGCAATTAA ATGCGCATAC GCTGGAGAAA TATGCTCCGT 720
CAGCGCATCT GCCGCGCAAG AGGCAGGACT CAACGTCCGC CAATGCTCAT CAGCACGCCC 780
TGCACCTCGT CTCTTTCTCT GTTCCCGGGT ACACTCAGCC ATAGACACGG ACCACGAGTG 840
CGCATCCTAC GGTAAAAATG CATGCGTGTA CAAGTCACGC CGTGCACCGC ACCCGTGCTG 900
CAGGATCGCA TACCGTAGTG TTGACACACG TACCTATTTG CAGAGACCCT GGTCCCCATG 960
GTCAGACTCG AGCGGCTAAA GAAGACATAC GCAGGTGTTC CTATACTTCG AGATATTTCT 1020
CTAGAGATCC CAGCGCACGG AATGTATGGA ATCATCGGCA AAAGTGGTGC AGGAAAATCA 1080
ACGCTACTGC GCATCATGAG TCTTTTGGAG AAACCTGACG AAGGAGCCGT TTTTTATCAC 1140
ACCACGAGGG TAGATTTACT GCGCGGTGCT GCCTTGCGTG CACAGCGCAG GCGCATAGGA 1200
TTGATCTTTC AACAATTTCA TCTGTTTTCT TCCCGCACCG TCTTTGGGAA TGTTGCCTAC 1260
CCGCTTGAGA TTGCACGGTA TGCACGTAAG GACGCCTACG CGCGCGTGTT GCATTTGCTA 1320
CACTTGGTTG GTCTTGCAGA CAAAGCACAG GCGCGTATCA GCACGCTGTC AGGTGGGCAG 1380
AAGCAGCGCG TAcCATTGCG CGCGCCTTGG CTGCAGAACC TGCAATACTC TTCTGCGACG 1440
AAGCAACAAG CGCTCTCGAC CCTCAAACAA CACAGTCAAT TCTGACGTTG CTGAAAAATG 1500
TGCAGTGCTC ACTGCGTCTG ACGGTCGTAT TGATTACACA CCAGATGGAG GTGGTACGCG 1560
ACTTGTGCGA TCGGGCCGCC GTATTGCATG AGGGAGAAAT AGTGGAAGAA GGAAGGGTGA 1620
CACAACTTTT TGCTGCGCCA CGGCGGCTGA TCACACAGCA GTTGTTGTCG GGCTGTTCTT 1680
TTGCCTCTTT TGCAAAGTCA GAACCCTTCC ATCGAATGTC TTCGGGTGCG TGTGCCGTGC 1740
ATGCTATTGA CAAGGCACAC TGGTAATGGC GAACCAGACA CTGTGGCTTT TAGTAGCTCG 1800
TGCAACCGGA CAGACAAGTC TGATGGTGTG TGCTTCAGCA AGTATTGCGC TAgCAGCGGG 1860
AACCCCGTTG GGGATATTGC TGTGCGTAAT GTCGCCTGGA CACGTGTGGG CGCATCtGCG 1920
TGGCATCGTG TGTTAAGTTC GTCAATGAAC GTcTGCGCGC TTTCCCATTT GTGATTTTGC 1980
TGGTGGTGTT GCTTcCgCTC TCGCGTATGC TCACAGGACG CACAGTGGGA ACGGCGGCGG 2040
CTATCCTCCC GCTTGCGAtA cTGCGCTCCC TTTCGTGGCA CGGGTGATTG AAAGTGCTCT 2100
GCTGGAGGTG GAGCCAGGGA TAATCCAAGC GGCGGTGGCA ATGGGTTCAA GCATGCGGCA 2160
ACTTGTACTA AAAATCATGC TGCCTGAGGC TGCTCCTGCA TGTGTTTCTG GTGTAGCACT 2220
GATGGTAATT AATCTAATTG GATACTCAGC AATGGCAGGG GCGATTGGGG GAGGAGGTTT 2280
AGGAGACGTA GCGATCCGCT ACGGGTATCA GCGCTTCCAA CCAGAGGTGA TGACAATGGC 2340
AGTGCTTGCA ATCCTGGCGC AGGTTGCGtA ACGCAATGGA TCGGGCGTAT AATCTGTACC 2400
CGAATACGAG CGCGtCAGgT AGTACCCCGC CAGAGTTAGG CAGGACGTCT GTCCTTGCAT 2460
GGGTAATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG TGGAACGAAn ACTCACGTTA 2520
AGGGATTTT 2529
(2) INFORMATION FOR SEQ ID NO: 166:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4060 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:
TGACGGCTCT TACCCGATTG AGCGAAAAGA AGGGAGCAGT ATATGGCAGA GATATCAGCA 60
ACAGCTTATG CTGkCCAGGT TGACGACCTG ACGCTTGCGT ATCGGCAGAA GCCGGTGCTT 120
TGGGACGTGG ATGTGCGTAT TCCAGAGGGG GTTATCGAGG CCATTATCGG TCCTAATGGG 180
GCGGGCAAGT CGACCCTATT GAAGGCGATC ATGGGTCTTC TGCCTCTCGC TTCCGGAGAG 240
GTGCGTGTCT TTGGGCGTCC TTTTTCAAAA GAGCGGCGAC kTGTTGCGTA TGTCCCGCAG 300
CGCAnTGCAG TGGATTGGGA TTTCCCTACT ACCGTTTTTG ATGTGGTGCT CATGGGTTCG 360
TACGGTTCGC TCGGTTGGAT TCTTCGTCCG GGAAAAAGAG AAAAGGCGCG TGCCCGGGAA 420
GCGATCGAGG AAGTAGGAAT GGGCGCCTTT TTAGACCGAC AAATCAGTGA GCTTTCAGGC 480
GGTCAGCAGC AGCGCGTGTT TCTCGCGCGG GCCTTGGTGC AGGACGCGGA TCTTTACTTC 540
ATGGATGAGC CATTTCAAGG TGTGGATGCA GCTACTGAAC AAGCAATCGT TACTCTTTTA 600
AAAACGCTGA AAGGGCGTGG GAAAACGTTG CTTGTTGTGC ATCATGATTT GCAGACGGTG 660
GCAGAGTATT TTGACCGCGT GCTGCTTTTA AATGTTCGCG TCATCGCTGA AGGGGCCGTC 720
GTGTCTGCCT TCACCGAAGA ATACGTTCAA AGAGCCTATG GCGGACGGAT TAGTTCCACC 780
CTTTTTCCGA GAGGAAATAA GGAGGATGTG CACGATGCAC GCGCTCATGC GTCTGTTCTC 840
TGACTATACG CTGCAAAATG TGGTGTTAGG CACGCTTTTT TTGGGTTTGG GTTCTGGGCT 900
GGTCGGCAGT TTTGCGGTGC TGCGTCGACA AAGCCTTTTC GGTGACGCAG TTTCTCATGC 960
AACCTTCCGG GGATTGTTAT CGCGTTTCTT TTAACCGGCA CGAAGTCTAC TGAGATACTT 1020
TTGCTGGGTG CTGCCCTCAG TGGTTTAGTA GGAACTGTGG TGATGCTAAT GGTGATGCGT 1080
ACTACAAAAA TTGATACCGA TGGTGCGCAG GGCATTGTGT TGGGTGTTTT TCTTGGGTTT 1140
GGGTTTCTAT TACTCACCCA CGTGCAGAAG TCGCCCCAGG CGGCAAAGGC TGGTCTGAAC 1200
AAATTCATTC TAGGGCAAGC GGCCACGATT TTGCAGCGAG ATGTCCTGCT CATCATTGCG 1260
ATGGAGGTGG TGATCGGTTT GCTTGTACTG CTGTTTTGGA AAGAACTGAA GCTTTCTACC 1320
TTCGATCGAG ACTTCTCTGC GGTGCAAGGT TTTTCTCCAC AGCTTATGGA GTTCATGCTC 1380
ACGGCACTCA TCGTAGTTGC AGTTGTCGTA GGGGTTCAGG CAGTGGGGGT TATCTTGATG 1440
AGCGCACTGC TGACTGCGCC TGCAGTGGCA GCGCGGCAGT GGACAAACAG TTTAAGGGTT 1500
TTATGCGCGC TTGCTGCTTT ATTTGGGGGT GTCTCAGGTG TTTCAGGTTC GGTTGTCTCT 1560
GCCCAGGTTC CCAGGCTTTC TACTGGCCCC GTGATAGTGT TGGTGCTGAC GGGTATTGCG 1620
CTTGTCTCTA TTATGCTTGG TCCTCAGCGG GGTGTTTTGT ATCAACTGTG GCGGAGAAGA 1680
CGGGTTTCGC TTCTTCAAGA GGAGGGGTAG AATATGACCA TGGAGGTTGT GCTTATTGCA 1740
GTGGTCGTGT CGGTTGCGTG CGCGCTGTGT GGGGTTTTCT TAGTGTTGCG TAGAATATCG 1800
CTGATGAGTG ACGCGATCAG TCATTCGGTT ATCCTGGGGA TAGTACTCGG TTATTTTCTG 1860
AGTCGTACGC TTTCTTCTTT CGTGCCTTTT GTGGGGGCAG TGATTGCGGG GATATGTTCG 1920
GTAATCTGTG CAGAACTTTT GCAGAAGACA GGGATGGTAA AGAGCGATGC AGCaGTCgGG 1980
CTTGTGTTCC CTGCAATGTT TGGGTTGGGG GTGATCCTTG TGTCGTTGTA tGCAGGGAAT 2040
GTACATCTTG ATACAGATGC GGTACTGCTT GGGGAAATTG GACTnGCGCC CTTGGATAGG 2100
nTTTCGTTTT CAGCTTGGTC CTTGCnTAGG AGTnTGGTAn AGATGGGGTC CGTCnTGTGT 2160
GGATTACTGC TGTTGCTTGC GCTCTTTTTC AAGGAACTCA AGATTTCtAC GTTTGATCCG 2220
GTGCTTGCCA CGAGTTTgGG TTTTTCTCCT ACGCTTATTA ATTATGGGCT TATGCTCGsG 2280
GTGAGTATTA CCTGTGTGGG AGCCTTCGAT TCGGTGGGTG CAGTGTTGGT CATTGCATTG 2340
ATGATTACAC CGCCTGCAGC AGCGCTTTTG TTGAcAGAtA mCTtgTwGTt GATGTTGGTC 2400
CTTGCTTCAT TGCTCGCCTC TTGTGCGTCC ATTAGTGGGC TTTTTCTTGC GGTGAAGATA 2460
GACGGCAGCA TTGCAGGAGC AATGGCTACC ATGGCGGGCG TTCTGTTCGC GTTGGTGTAC 2520
CTTTTCTCTC CAAAACACGG GGTTGTGCGC AGGTGTCTGG TAATGCGTGC TTTGAAACTT 2580
GATCTAGATG TGGTGACACT TGCCGTGCAT CTTGCAAcaC ACtTACACGG TGGAGCGCAG 2640
CGTGGAGTGC GCTGAAGTGC ACCTGACAGA ACATGTGAGT TGGTCTGcGC GCAGGGCGGC 2700
CCGCGTGGTG CGTACCGCGC TCAGGCGAGG GATGGTAGAG CGTCACGGTG CCTTGCTGCT 2760
ACTCACTGCG CAGGGTGTGT nCGCTCGCGC AGGCGCGATT GGATGTATCC GTGTAGGCTG 2820
AGTCGATGTC GTTAGTGTCA GATATTGCAG CAGAGAATTA TTTGAAGACA GTGGTAAAGG 2880
CGTTGGCGCG GTCTCGTCGG GAGCGCGTGG GTACCGGGGA GTTGTCTCGC CTTTTACACG 2940
TGACGCCGGG GACTATCAGC ACAATGGTGA AGCGCTTGGA AAAGGGTGGC TATGTGCAAC 3000
GCACGcATCG TCTTGGCTGT ACGTTAACCA GAAAGGGGGC AgTTTTTGGA TCTGCaGTGT 3060
TAAGGAAGCA TCGCTTGTTG GAGAGTTTTC TTTCCCAGGT ATTGTGTTTA GAAGCAGGGG 3120
TGGTGCACAA AGAAGCGGAA ATGCTTGAGC ATGCGTGTTC TGACGAGCTC ATCGACGTTA 3180
TTGATCGCTA TTTGCAGTAT CCTACGCGGG ATCCTCACGG GCAGCCGATC CCAAGAAAGG 3240
ATACGCTTTT GGATTTGTAT GTTGAGGACG ATGTGCCAGG TGTATGATCT TTTTGTATGG 3300
GGTGAGGATG CGCCTTTTGT CAGATAAAAG GGGATGTGCA AAACGTATTG TTGAGAGGAG 3360
AGGGCCATGA AGCTTGTGTT GATCCGTCAT GGAGAAAGTG AATGGAACAG GCTGAACCTG 3420
TTCACTGGTT GGACAGATGT TCCGCTTACC CCACGTGGGG AGTCGGAAGC CCAGGAAGGA 3480
GGCCGCGTAC TGCAAGAAGC GGGGTTTGAT TTTGACCTAT GCTACACTTC TTTCTTGAAA 3540
CGTGCCATTC GTACGCTCAA TTTTGTACTC CAGGCACTGG ACCGTGAGTG GTTGCCGGTT 3600
CACAAAAGCT GGAAATTGAA CGAGCGGCAT tATGGGGATC TACAAGgTTT AAATAAGACA 3660
GAGACGGCGC AGAAGTATGG TGAGCAGCAG GTTAGGGTGT GGCGTCGCTC CTTTGATGTG 3720
GCTCCTCCTC CGCTTACTGT AGGGGACGCA CGTTGTCCGC ATACTCAAGC CTCCTACCGG 3780
GGGGTATGCG CGTCTGGTCG GACGCCAtAC TTCCGTTTAC GGAAAGTTTG AAAGATACCG 3840
TTGCGCGTGT GGTGCCGTAT TTTGAAGAGG AAATCAAACC GCAGATGATT TCCGGACAGC 3900
GTGTGCTTAw TGTGGCGCAT GGTAACTCGT TGCGCGCACT GATGAAGCAC ATAGAGTCTT 3960
TGGATGAGAC TCAGATAATG GAAGTAAATT TGCCTACCGG TGTACCGCTT GTCTATGAGT 4020
TCGAGGCGGA TTTTACCCTG TGTGGGAAGC GTTTTTTAAG 4060
(2) INFORMATION FOR SEQ ID NO: 167:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2074 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:
CTTTTTACCC AACTCATTGC CAATGCGGTT ATGGAAGCGG GTAACGAGCA TGGTTTGAAA 60
ATCATTGAGA ACTTGCAGGA TGAGGAGAGT GGCGATGAGC TTGACGAGTC CGTTTCCTTG 120
CACGAGGAAG GGCGCGAAAT TACTGACTAT GAAAATTATA CTCCTCCTGA GGAGCGTGAG 180
TATTCTGTGA ACGATGAAGG CGATGTGTTT GATGAGGATG AGTCGCTCTA CGAGGGGCGT 240
TAGGTGTGCC CTCCGCGGTC TCTTTTGGGA TGTGTGTCTG GGTAGGCATG TTTGCATCGA 300
AGTCTGATCG GAAAATGCTG TCAGGAGGGG TACATGGAGA TTGCTGCTCG CGACGTTAAG 360
TCTTTGCGTG ATAAAACCGG GGCCGGGATG ATGGAtGTAA GCGTGCGCTC CAGGAGTGTG 420
CAGGGGACGC TCTGTGTGCA GAAAAGTATC TTAAgGAGAr GGGGCTTGCT GCCATCGAAA 480
ACAGGCGTGG GCGTGCCACT GCTGagGGAG TCaTCGTTAt TAAAGCACgG CaTGcAGAgG 540
GCgCgGCCTG TgGGGCGAGC GCTGTAGCAA TGGTTGAGCT TGTTTGCGAA ACAGATTTTG 600
TGGCAAAGAA CGCAGAGTTC ATCGCCCTTG CTGAGCGTAT AGCTCAGGCG GTGCTCGAGC 660
ACGCGTACAC TGAGGTAAAC CAGGTnTGCG CGATATGGTG GTGGACCTCG CAACGCGCGT 720
ACGGGAAAAT ATGAGCTTGn ACcGCCTTGC GCTCTTACGT GCCGcAGTGC CGGTGCAGGT 780
CAGTACCTTT CCTnTACGTG CACCCTGATA AAAAAACAGG GGTAGTGCTC TCCTTTTCCT 840
CCGATGCGCC GGATGTGTTC CTGCGATCCG ATGTGCGGGC cTTCGCGTAT GACTGCTGTT 900
TGCACGCGGC GGCATATACC CCTCGyTACG TGCGCGCAGA GGACGTGCCT GCTGAGTATG 960
TGCGGGAGCA GCGTGAGGTG TTCCAAGCGC ATGTTGCGTC TCTCCAGAAG CtGCGCATGT 1020
CAAGGAAAGT ATCGTGCAGG GTAAACTAGA GAAGCATTTG GCTGAGATCT GTTTTCTGAA 1080
GCAGCCCTTT GTTAAGGACG wCAAGCTTTC TGTTGAAAAA AAGATGGCAG AAGTGGGTGC 1140
CCGCGCAGGG GGTGCGCTTC GGTTTACTCA GGCACTGATA TACCAGCTAG GGGTACAGTG 1200
AGTGGGAAGC ACGGATAGAT CCTGCCaCCC TGCAGGATGG GGAGCAAGCA GGCGTGGGGG 1260
AGCTCGTGCT TCTCTCTTGC CGCACTGTGT TGTGAGGGGA AAAGATGGGT ATCGCTGAGT 1320
GCTATGAGCA GAAGATGAAG AAGTCCCTCT CAGCGCTGCA GGAGGGTTTT AACACGCTGC 1380
GTACTGAACg TGCGACTGCA CATTTGCTTG ATCAGATTAC TGTCGACTAC TATCAGCAAC 1440
CAACCGCGCT TAGTCaGGTG GCTACCGTTT CGTACCCGAG GCGCGTTTGA TCATTATCCA 1500
GCCTTGGGAT AAAACGCTCC TTGCGGATAT CGAGCGTGCA ATTTTAAAGT CAAAATTGTC 1560
GGTCAATCCC TCCAACGACG GCAAGGTTAT TCGTCTAGTG ATTCCTCCAC TTACCCAGGA 1620
GCGAAGGAAG GAGCTTGTCA GGCAGGCGCG CGCGTTAgCC GAGCAGGCGC GCGTTGCTAT 1680
TCGCAATATT CGCCGTGAGG GAATCGAGGA AGCAAAGCGC GGGCATAAGG AGGGACTGCT 1740
AAGCGAGGAT GCACTGAAAG CAGCAGAAGA GGCCTTCCAA AAAGCGACTG ACGCTTCTGT 1800
CGCAGAgTtG CACGGTACTT GGCCGAGAAG GAAAAGGATA TCCTGGAAGG TTGAGTGCCG 1860
TGCAGCACGT GGCCATCATC ATGGATGGAA ACGGGAGATG GGCGGAAAGG AGAGGGTTGC 1920
GGCGCAGTGC AGGGCACCGG CGGGGGCTGC AGACAGCGCG AGAGATTGTC GCGGCGCTGT 1980
GCCGATTCGG GTGCCTTTTG TTACTCTGTA TGTGTTTTCT ACTGAAAACT GGAAGCGCTC 2040
TGCnATGAAG TGCATTTCTT GATGAATTTA ATCA 2074
(2) INFORMATION FOR SEQ ID NO: 168:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2685 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:
CGCCGGGnTA TTCCTTGCGT ATGCATCGAT GATCATTTGC ACGTTTTGCC GACCTAAAAT 60
ATCCGCTGCA TTGGTACGGA TTACCTGAGT CAGGTGCGTG GCGATGATTG CAGATGGAGC 120
AACTACAGTG TAmCCGACAC GCTCAGCACG ATCGCGATTT TCTTCAGAAA TCCACACAGC 180
AGGGAGTCCA AATGTAGGAT CAATTGTTCG CTCCCCTGGA ACCTCCTCTG TAACTTGACC 240
AGACTTTATG GCGAGAAACC ATCCCAAGCG CAnTTTyCCC CGCGCAACTT CTAATCCTTG 300
GATTTTGAAA CAATAACTGC TCGGATCTAA ACGCATATTG TCAATAATTC GAATTTTAGG 360
AGCAACCAAT CCGAGATCCA ACGCGGCATC TTTTCTAATA ACTGTAATTC TACTTAGGAG 420
CTCTGCACCC TTTTCTTTAT CAACAAGAGG AATTAATCCA TACCCAAGTT CTAGCGAAAG 480
TGGATCAAGC GGCACAATAG GACCCATTTC AGAAGTACTA TCTTGAGTCT GTTGCATACC 540
CTTTTTATCT GAACTTTTTT GCATCTCATG TTCTTGAACG TGTACCCGTT CCCTTTTCCT 600
TAGCTGCAGT CCCACAAAGG CAAAACACAC GGCCATAAAA AATAAAATAC TGTGGGGAAA 660
ACCCGGCAAT ACCGCCATAA CGATCAATGC ACCTGAGCCA ATAAAATAAA CAAGTGCACT 720
TTTTGAAAAT TGTTCCTGTA CGTTTTGACC AAATGACCCT TGATCGCTTG ATCGAGTGAC 780
AATAAAACCT GTTGCAACAG ACAACAACAA AGAAGGAAGC TGTGCAAGCA ACCCATCTCC 840
TATCGTTAAA TTTGTATAGG TCTGCAATGC TGCCTGAAAA CCCTCCCTAC GAAATATGAC 900
ACCCACTATC AGGCCTGCAA TCACATTTAC AATGGTAATA AAAATACCAA TTTTGACATT 960
GCCCGATACG AACTTACTCG CTCCATCCAT TGcTCCAAAA AAATCTGcTT CACGCTGaAT 1020
TTGCCTcTtA yrCTcTCGCG cTTcTTCwTC GGTGATAACA CCTGnCATTA TATTCAGCAT 1080
CAATAGACAT GcTTTTGGTT GCATTGAAGT CTAAGGTAAA ACGCGcAnAA ACTTCTGcAA 1140
TACGCGTCGC ACCCTTAGTA ATAACAAAAG CTTGCACTGC AATTAAAATG ATGAATACCG 1200
TAAAACCAAT TACTAGACCT TGCGTCCCGG ATCCTCCCAC CACAAAAGAA CTAAACGCAC 1260
GGrTCATATA CCCGCTAAAC CGATCTCCTA ACGTTAAAAT CAACCGGGTG GAAGACACGT 1320
TCAGTCCAAG TCCAAAAACG GTTGAGGTCA ACAAGAGCGA GGGrAATACA GrAAAATCTG 1380
TTGGTTTTTC AACAAATAAC ACCATAAGTA ATATCAAAAG GTTAAAGATA AGATTAAaGG 1440
CCwTCAACGC aTCGAGAATT TGCGTGGGCA GAGGAACAmC AATAGAAAaG ACAmCCACCA 1500
ACACTGAAAT CGCAACAAAA GCGTCAGTAG TGAAAAgGCA CTCTTACCGT GCGCCATAGT 1560
ATGGGTTACC TCTTACGCTG TGCATGTGTT TTAAATTTAT CCAGCTTGGT AAAAATCAGC 1620
ACTAAgCATT AAAATATTCG TAGGGAACTT CTCTCCCGAT AGCAACtGCG TGTACAACGC 1680
ACGTGCAAGC GGTTTGTTTT CTTCTATCAA GATACCTGCC TCTTTTGCCA ACCGTTTGAT 1740
TCGGTATGCA GTCCCATCAG ATCCTTTCGC AACCACAGTC GGCGCAGTCA TGTATGCAGG 1800
CTCATATTGC ACCGCAcTGC AAAATGAGTC GGATTAGTGA TCACAACATC AGCGTCAGTG 1860
GTATTCCGAG CAGACTCTCT AACAAGAGAT TGCATCTGCT TTCTAACATA ACTTCTCACG 1920
AGCGGGTCCC CTTCCTGCTC TTTTAACTCC TCTTTCACTT CCTGCCGAGA CATTTTTAAC 1980
GAATCGATGA ATTGCCTTCT TTGGAAGAAA TAATCGGGAA GCGAGAACAC CACTAACAGC 2040
AAACTTACTT CGAGGAGAAC TTTACCCGCA AGGGATGTAA TGTAGAAAAT ACTCTGGGTA 2100
AGACTCACAC CCAATAAgAA ACAAACATAA AAAGATCATT ACGTATAGTA AAATACGATA 2160
CAAAAAaTAT CGCTGTAaTC TTTATGAGAG ATTTAAGTAA ATTGAAAAGC CCTTCTGTTG 2220
AAAAAAATGA GCGTTTGAAA AAACGAATTA CATCTGGAGA TATTTTCTTA AACTGCGGTC 2280
GAATCGACTT TACCGAAAAT AAAACGGTCT TGTTTTGTAC AATGTTTGCC GCAACGCCAG 2340
AGACCAGCGC AACAAAGGAT ATCGGAAGTG CAAGTTTCAT AAAATACCGC ACAAATACAA 2400
AAAACCATCC AGTATTCTGG ATGGACGCGG TAGTAGCACG CGTAAAGAAA AACCTGAGTA 2460
CACCGATGCA CTCTCTCAAT ATAAATGGTG CAAGCAAGAA CAAGGAAGTT GATGTGAAGA 2520
GCATCACAAA CGCTCCATTT AGATCCgGCT TTcGGAACAC GTCCTTCTTC TCGTGCTTAC 2580
GGAGTTTTGt TcGGTAGaTC CTCTGaCCTC CCTTCaTCCT CAGCGGCAAA CCACTGGCAA 2640
ATCAATAATA AAAAGAGGAA GCGGAAATGT TCCTTCTTGT TCTAT 2685
(2) INFORMATION FOR SEQ ID NO: 169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 634 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:
AATCTTGCAC GCCCTCGCTC AAGCCTTCGT CACTGGTCCC GCCACCAAAC GTTTCGGCGA 60
GAAGCAGGCC ATTATCGCCG GCATGGGGnC CGACGCGCTG GGCTACGTCT TGCTGGCGTT 120
CGCGACGCGA GGCTGGATGG CCTTCCCCAT TATGATTCCC AATAAAAATT GCGCGTGCCG 180
CACACCGTAA TCAGTTTACT GAAGAAGTGC TCCAATCCCT GCGGATTTGC CACCAATGCG 240
GTCTGTGTTC TGCCGCCTGT ACTGCGCGTA TTCCTCTTGC AAAACTTTTG CACGATGCAC 300
AAGAACGCGC ACTGCATCTT TCCCGTGCTC CAGTCACCAA AATAGAACCC CACTCCACAC 360
AAAGCGTCGG GAAAACTATC CGCGAGCACC TGCCAATGCG CACCGCTGGA GTACAAACAC 420
GCACCCTTCC TTTACACCGG CTTAAGTGCT GGACAGAACA ACAGTGTACT GTTGGCGCTG 480
CTTGTTGCGC ACGTGTTCGT CGTTGCAGCC ATGCGCGACA CGGTCGnTnT TTTTCCATCG 540
TCAGTACCGA ACTCGGCGCA CTGAGCGCGC GCTCGTTCAA ACACTACGCA CACCACATGT 600
GCCCCTGAGC GACTCTCTCG TACTGGGCCT GCTC 634
(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4042 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
CTGAACACGT TGCACGGTGC GACGCTCCTT ACAGGAATGr AAAAAAAAGA ACgCCGCGCA 60
CAkTGCACCG CCCACATATG ACGCACACCG CCTAGTTTTC ATCACGAAGA CcTTCTGAAG 120
AAAAACACGC GCGGTGACAC GCTTCGAGCT GTCCATTTTC CACGTGTGTG TACACCTGGG 180
TAGTTGCAAT ATCTGCATGG CCGAGCAGAC ACTGAACCGA ATGTAAATCG ACACCGCCTG 240
CAAGCAAATG CGTCGCATAT GAATGGCGAA ACGTATGAAC GTGCGTCTCC ACCCCTGAGC 300
GGATTTCAAT ATCCTGCAGG CGCTTCCATA TACCTTTACG ACTCAGACGA CCCCCCTGCC 360
TGTTTAAAAA GACAGCTCCT TTTTTTTCAC TATTCTCCGG ATGCTTCGCG CTCGTAAAAC 420
GAACACGCGC CTCACGTATA TACTGCaTGA GAAAGTAACA CGCCTGTTCT CCAAAGGGAG 480
CCATACGCTC TTTGTTCCCC TTACCTGTCA CTTTTAGCAG ACGTTCTGCA AAAAAAATGT 540
CAGAAAGGGA AAGAGAAACT GCCTCACTGA CCCGCAAACC TGCAGCATAA ATAAGTTCAA 600
AGAGCGCGCG ATCACGCACG CCACCTGGTG TGCACAGTGG AATGGATTGC AGAAAAGTGT 660
TGACCTGTTC GGGAGACAGC ACGCGCGGCA GTGAATACGC ACGACGCGGT GCATCAACGT 720
CTTGCATtGG GTTATCAGGA CGCACCTGCT CCAAAaTGAG AAACTGATAG AACGCGTGCA 780
GGCCGCCATG TCCTTcGCAA TCGTTTTGCC TGAGACACCC GCAGCACTCC GCTTTTCGAT 840
AAAACGTACA CAATCGTGCG CATTGGCGCT TTCAATGTTA CACGGCGGAT CAAGGTTTTT 900
CTGAAAAAGC ATCAACGTAG TCACGTACGT CCGTGCGGTC AAAAGCGCGT GCGCCTCTGC 960
CGAAATTAAG TACGCGTAAA AGGATTGCAC ACGGATATCG ACGTCTTTCA CAGACCCTTG 1020
TTCCTATTTC TCAAAAAGGT CGGTTCATCC AAATCTTCAT TTACCGAGGG GAGCGGAACG 1080
CCAAACGTGT GGCCCTtCAC TCCGTTCTTT TCCATACGCG TTTCTTGAAC AGCGCTATTA 1140
CGCGTAGCCA GACCAGGTAA GTTTGGCTGT TTAGAGCTTT TAGCTCTGTT CCATTCGTCA 1200
GAACTAATAT ACACACCGGT ACTCACCGCA CCGTACGATG ATGTTTTTAT CTTTTGAGAA 1260
CTGTGTGTAT CTCCCGCTAT CGAAATACTC GCTTGCGGTA CACCCGTTGC GATAACCGTA 1320
ACCCTCACCC TATCCTGCAT ACTCGCGTCG ATGGACGTGC CATGGATGAT AATCGCATCC 1380
GGATCAATGG TCTTTGCAAC CACAGACATC ACGCCATCGA CTTCTCCCAT GCTCAAGTTC 1440
TCTGAGCCAC GTACCGCAAC CAGCAGTCTG GTAGCACCTT CTATCCGCGT CTCTTCCAAA 1500
AGCGGATTAT TAATTGCAGC GGTTGCCGCA TCTACTGCGC GGTTTTCCCC TTCTCCCTCT 1560
CCCACACCGA TAAGCGCGTA CCCCTGCCCT TCCATGGTGT TTTTTACATC CATGAAATCT 1620
AAATTCACTT CTCCAGGAAG GGTAATTAAA TCAGAAATAC TTTGCACCGA CTTGCGCAgc 1680
aGATnCATCT GCAACCAGAT ACGTCTCTTT AATCGGGCAG CGCTTATCTA CCACACTGAG 1740
TAAATTCTGA TTAGGGATCA CAATCACGGT GTCCGAGTGC GTGCGCAATT TTTCGATCCC 1800
TCGCTCAGCG AGCATCATCT TTGCTCTGCC TTCAAAGCGA AACGGCTTCG TGACTACGGC 1860
AACTGTCAAA GCACCAAGTT CCCGTGCAAT CTTTGCAATA ACTGGGGCAG CACCTGTTCC 1920
CGTACCTCCT CCCATCCcTG CGGTGATGAA CACCATGTTC GCGCCCTGCA ACGCACTTGC 1980
AATGGCTTCA GCATCTTCCA TTGCAGCCTT CTCGCCAATC TCAGGATCAC CGCCTGCACC 2040
CAACCCCCTT GTCACCTTGG TGCCAATGGC AAGCTTTTTA GGCGCGGTAG AATAGCTCAA 2100
CGCCTGcACA TCTGTATTTG CTGCAATAAA CTCGACGCAC TGCAAACCGC AGCTCATCAT 2160
CCTATTTACC GCGTTTGACC CACCACCACC GGCACCGATG ACCTTTATGA CCGTTGGACT 2220
TAGGGTAAAC TCTTCGCCTG AAGGTGCAAG CTCTATATTC ATCATTCCCC TCCCATTCCC 2280
CTACACCGCG TGCCGCATGC TGTGCGCGGT TTAAAACAAG TTCCTCCAAA TATCCTTCAC 2340
TTTAGTAAAC ACTCCCGCAC GCTCCATCTC AGCACGGCCC TGATAAGCGC GCTGTCCCTG 2400
CTTATGGGTA TATTCTAAAA TCAGTCCTAA CACCACTGCA AACTCAGGAC TGCGATATTC 2460
CCCTGCCAAT CCTCCCAAAG TACCTGGTAT TCCAAGGTGC ACGCGCGGTG TATCAAAAAT 2520
TGcTGACGCA AGCTCTACCG CACCGGTAAG cTGCGCGCCA CCGCCGCAGA GAATAATATT 2580
TTCAATGATA CCACGACCGC TTTGcGTCTC CACCGTCGAA AGACGATCGC GCACTATCGT 2640
AAAAACCTCA CACATGCGCG CTTCAATTAT TTCGGCGATT TCTCGTTTAG AAATTTCTAC 2700
AGGAATCCGA TTTCCCTGGC TGGAGATGAG AACACTCCCT TCTCCCTCAA GCAGGGGGAT 2760
CCAGCAACAT CCATCTTTAA TTTTAATGCG CTCTGCAGTT TCAAGCGGGA GGTTTTTTAC 2820
CTTTGCAAGA TCAGAAGTTA CCTGACTGCC CCCAACAGGA ATCGAAGTGA TAAGCACCGG 2880
GGAACCCTTG TACATTGCAA TAACATCCGT AGTTCCCCCA CCAATATTAA TGAGCACACA 2940
CCCTACATTT CGCTCGTCAT CGTTTAACAC AGAACGAACA GCAGCGAGCC CGTTATGCAT 3000
TAAAAAATCG ATGTGCAAGT TCGCcCGTTc ACGCAATCGA TTACACTGCG CATACACGTT 3060
GCAGAACCGG TGATCATATG CACyCTcTTC CAGGCGAACC CCAATGATAT TGCGCGGATC 3120
GGTGATGCCG TGCTGATCAT CCACCGAATA AACTTTGGGA ATAACATGAA GAATTTTACG 3180
ATCGGGAGGA AGAGAAACTG CACAGGCAAC TTCAaGCACC cGATCAATAT CGCTTTGATC 3240
AACTTCGCGA TGCCCcTTCC CcTTATCTGC AACCGCCACA ACACcTTTTA AATTTCTACC 3300
CTCGATGTGG GTACcTCCAa GCCCCACAAg CAGTGCGCAA tTCGATACCG GACATCATCT 3360
CCGCAGCTTC AACCGCGTGG TGGATACCCA CAACTGTATT CTCAATATTG ACTACTACAC 3420
CCCGCCTCAA ACCCTTTGAA TGACCGACGC CTACACCTAC AACCTGTAAC GCACCACCTT 3480
CCAACCGCTC GGCCACTACC GCCCTGATCG ATTCGGTACC GATATCTAAG CCGACAATAA 3540
CCTCACCCAT AACTTTCCTC TTCTAGCGAT ACACCGCCGT CCCACCTCTC ACGTCAAGCT 3600
CCTTAATGCG CCTCTGCGTT TGCCACTCCC GCAGCGCATC AACGAGCAAT ATGACATACC 3660
GCAACTTTTC TTCGCTAAGG TTTTTGTCCA TGCGTACTCT GATAGGTGCA CGCACCAGGT 3720
AAAGCGCTAA ATCATATCCC CCGTGTCTTT TTTGTTCAAT GCTTATCTCA GAAATTTCAC 3780
CGAGCAAAAG AGGGTTCCGC TTGCTCAAAT TATCTAACTG CACAAAAAGA GGAACAAGCT 3840
GATCGTGCAC GCGTAGCCCC ACGCGCGGAT TACGAAATTC AAGACCACTT ACTACCGGCA 3900
ATACTGTATC AAGAGGTGCC GTCCCAACAC TAAAAACTGT TCCTGTCTTG TCAATCTGCA 3960
CCGGCATCGC ACGGCCTTGA ACGTGCACAA AACCAAGTGC AATAGCAACA CGCTCTACCA 4020
CATGAACATG CATCGTATCT GG 4042
(2) INFORMATION FOR SEQ ID NO: 171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
TTGCCTTTCA AATAAATAAG TATTTTTTTA AAAGGnGGGA GGGGGTTTAT ACTTCACAAT 60
ACTATTATAG GACATGGTG TTTATCCCTT GCGTAAAGAA CCTGGAAGTG TTAATGCAAC 120
TTGCAAAAAA AGAGGCATTG GTTACTTGGA GTCACAGGGA GTCAAAGGAA GGTATTAACT 180
CCCTTTTTTT GTTCTGTTTT CGTTTGTTTT CAAACAAGTA ACTGGCCATG ATGATACAAA 240
CACATACACA ATGACTTGTT AAATTACCTT TTCAAACAAA AAAAGTTATT TCACCCCTGA 300
GACCAACCAC CCATGAAACA AGGGAGAAAG ACAGAACCAA GTTAGAGAAG CCCCAAAAGC 360
AACATGCTGC ATTGCTCCAA AGACTGCCAG GTTCCCTTGc AAATAAAGTA CTTGCAACAC 420
CCCCCTTGAG CTATGTGGcT CTGTGTGTGT TTAcTAGCAA AGCCAGTCTT TGrAATCTTG 480
AAAC 484
(2) INFORMATION FOR SEQ ID NO: 172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3134 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
TCCTTTACAC GGTATTGCTC CACCCTGAGG GCATCGAACG CGCTGTCAAA CTTCTCCCGT 60
GAGCTCCCCG TTGCCAGAAG GCGATTACtG CATCGGCAGG GTCCTGGTGT TGGTTACCGG 120
CGTCGAGGGC GAAGGAGAAG CGCCGCACAT ACCCCTGCAA TCCCTCGCCT GTAAACTGAG 180
CGCGCACAGC CCCACGGGGA AAACGCGCAC CCCCACCGTA CCCGTTCTCT GGCCCACACG 240
ACAACCCCTC ACCAAACCAC CTTCACCCCA CAGATAAACG TGCCAAAGTA ACGCTCAGAC 300
CACATATCTC GGCAtACCCA TGTaAGGAGC GTCAGCAAGC ACCCCCTGTT CCCACTGGGC 360
GCTGAGCTCC ACTATCAGTG GCAACTTCGT CAGCGTAGAG ACGAGCGTAC CGTCCTGCAG 420
CGTGAGGGCA AGCCCCACCC GCTCGAGAAT CGAGGACTCC GCCATCTGTC CCAGCAAAAA 480
TTCAAGATAC TTATCCTTTA CACGGTATTG CTCCACCcTG AGGGCATCGA ACGCGCTGTC 540
AAACTTCTCC CGTGAGCTCC CCGTTGCCAG AAGGCGATTA CtGCATCGGC AGGGTCCTGG 600
TGTTGGTTAC CGGCGTCGAG GGCGAAGGAG AAGCGGAAGC CGGCGCCTGG TTCGAGGGTG 660
AGTCGGCCTC CTACTCCCCA CAGGAGTGCT GTTTTGTTTT CGTTCGTGGA GTCTTcGGTA 720
CCCTTACGGT AGTGCTGCTC CAGTGTGGCA TTCCCTGCCA GCTCCAACGT AAGCAGCCGC 780
TGACGGTCGA CGCCATAGGA AAGCGTTGCA TCGGCCCCGA AGCCATACTT GCTGTGCGTG 840
GTGTCAGTAC TATCCCAGGC ACCATTGGAA AGGAAGGAGA GGAAACCGAT GTCCACATCT 900
ACTCCGCTGT TTCCCACATT GTGGGCCTGG TAGCCGAGTT TTGCCCCGGA GCCGGAGAAA 960
CCAGGGGCAT AGCGAGTGTC CTTTTCTGAA TAGGCACGGG TGACAAAGGG TTTCCACAGC 1020
TGGGCAAAGT TAACCACACA GGAAGGACTG GTACCCACTG TCAGGTAGGC CCCATAACAG 1080
TGCAGGGTTG CCTGGAAGGA AGCGGTAGGT TTGGTAAAGG ACAGGGCCGT TGAGCTTTTA 1140
GAAGACGCAA GCTCTACTGC CAGGTCCTTC AGCTGCAGCT GTGCCCACAC CCCTGAGCGT 1200
GCCTCCCCTC GGcGGGTGTG GGTGTGCTTT GACACCAACG GcAGGGAAAT AGTCAGACTA 1260
TTGGTAGTGC GAAACCCATG GGTGTGCTTG CCCGGGCCAG TGCGTGGATT CTTCTGGAAC 1320
GCAATGCCCC ACTGGAGCTG GGCTGTGCCA CTGACCTGCG GAGTGAGTAC GCCTGCATAA 1380
CCAGAAGCAG CACATACCAT GCCCGCAAGT ACCCCcGCTT GCATCACCTG CCTGmCCACT 1440
CACTCCCCCT CCTCTcACTT CTAmCTCACC CCCCCCCCAC CCGTCTAGAA GACACGGAGA 1500
GCTCTATCTC ATGAGCACCT ACACACTCGC CTTCTCTTGG GGGACAGACA GAACTTCCGA 1560
AGAGAGAACA ATAGGTTCCG GCGATGTTTC GAATAGGGTA AGGCGTTCTG ACGCCTcCTC 1620
TAATGACTGC GCCAGACGCG CTACCGTGGC TTTGCTTAAC GCAACGTtGC TgCGCCGATT 1680
CGATCATCGC CTCGTACTCT TTGATCCCCC GTCGATATnC CnCGGGGGGG GGGGGGGnGn 1740
nnnnGCACAC ACGCTGGGCn GACTGAAGCT TTTGACGGAG CTTGGCGCAC TCTACCTGAT 1800
ACCCTGCACG TGCCACGCGC ATCCGCGCTT CTGCCACGGT ACGCAACAGC GTGCTGAGTT 1860
CATCGGGCAC CGTCGGTCTT GTCTTCTGAG ATAAAGGAGG CTGAGCAAAT CCTTCCTCGG 1920
TAAAGAAACA ACTTGCGTAT GCAGCACCGA TGCGCGCGCA CAGCCATTGA TATCCTCATC 1980
CAGTTCACTG ACCTGCGCTG TGAACGCCGC AACGCGCTTC AGCGGCTTGG CACCACAGTC 2040
AAGGGcGCGC AGgCGgCGTC GAgTGCTTCT TGCTCATCCA TAAGCTCcTG GCTATGTTGA 2100
AGGTTGCTCG CGTAAACGCC GCGGTCGGAT ATGAGCCGTG CGTACGCGGC ACTGAGTGCG 2160
GAGGAAAGCT CGCCTGCGTG ATACATGCGC TCGACGTCCG GATGCGCAAT GACATCCGGG 2220
GTACAGAGCG TAATAATCTT TTGAATTTTT GCTTCGAGCA CGCGAATCCT GCGCTGAACA 2280
ACCGCGCTCT TCGCCTGGAG GCCAACTCTT TCAAGAATTG AACCAAATGT GCATGTTTCC 2340
AAGAGCTGGT CACGCTTGGC GCGTAAATCC TGCAGAGTAG ACTCAAGCTC CGCCGTACGC 2400
GCATAGATCG GTTCGAGCGC AGGTAGGCCT ACGTGCGCGT AGGTGGCATA GTACTGGGCT 2460
ACAAAACTCC TGAGTACATC CCGCTCCTGA CGAGCATGGC GGTGCAATAC TTTGCTCACC 2520
CGCTTTCCAA GAGCGGCAAG TTCTTCTTGT CTTTGAAGTA TCGACTTAAT ATCAAGGATA 2580
GACTCAGCAA CCTGATCGCG CTGACGTTGA AGCGCGTGGC ATCGGCTGAT GTCAGTGTCC 2640
TGTACGCCAA GTCCGCTGAT GTCACACGCA GCACCGCCGC GCACAATATG TTCACCGAGA 2700
CTGCAACAAT GGCTCTGCAG ATCCTGCTGC GCACGCTGAC ACGCGGCATT CAGCGCGGAA 2760
AGACTCTTAT CCGCGAACAT GACCGCATTG TAAACACTTC CCCTGCGTAT GTACAGGGCA 2820
CCTCACTCCC TCTTTACCCA TGCAAGGACA GACGTCCTGC CTAACTCTGG CGGGGTACTA 2880
CCTGCGCGCT CGTATTCGGG TACAGATTAT ACGCCCGATC CATTGCGTTA GGCAAcCTGC 2940
GCCAGGATTG CAAGCACTGC CATTGTCATC ACCTCTGGTT GGAAGCGCTG ATACCCGTAG 3000
CGGATCGCTA CGTCTcCTAA ACCTcCTcCC CCAATCGCCC TGCCATTGCT GAGTATCCAA 3060
TTAGATTAAT TACCATCAGT GCTACACCAG AAAnCACATG CAGGAGCAGC CTCAGGCAGC 3120
ATGATTTTTA GTAC 3134
(2) INFORMATION FOR SEQ ID NO: 173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 635 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
ATTnTTTAAT nATAGGATCC CGTCACGTCA TACACAACTT TCTCCGCATA AAGGCACTCA 60
TCTAAGTTAA GCTGAAGCCC CTCATACAGG ACATCAATAG AACGTGCTAG GTGATAAATA 120
TTGTCTTGAA ACACAAAATT CTGCACCACA GACCTCCCCA CACATACGTG ACCGACGCGG 180
GCACACAGGA AAGCGCACCT TCTGCGAACA GGTTCCTCAC CTCGTCTGCC GTAAGAGCAC 240
ACGCACGAAT GTACCCCCTC AGACGACGCT GCGCCGTCGG CAGCATCGGA TAAAAGGACA 300
AAAGACTTGA TGCGGCACAC AGATTTACGC TACGCTGGCG TCTCGCCGCG CGTTCTGCGC 360
ACGCTACTCC TCCTTGTCGG GTGAGTCTAG TGCAGCCTGA TAAGGTGAGC CGTCCATAGG 420
AGTTATTATG AGGACCTACG AGnTAATGGC CGTTTTCAGT GCACACGAGG ATCTCTTTCT 480
TCAGGGTTCC ACCGCCGTTC GTGCCCTCCT ACAGGAAAAC GACGCATCAT nCGCCCGCGA 540
AGACCATATT GGAGAnCGGG AACTTGCGTA TCCTCTGAAG AAGCAAAAGA GGGGCCGTTA 600
nCTGCTCTTC ATTGTTCAGT GTTGAGCCnG GGAAA 635
(2) INFORMATION FOR SEQ ID NO: 174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1644 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
CGGTACCTTC GGTTCTTGGG CAGGGCTTAC TGCAGCCGAG CAGCTCGTCT TATTCAGCCG 60
GTAACTGGCA CCGCCCACGT TCATTGCTTG GCGTGCTAAG TGCACTGCCA AGGAGGTAGG 120
CGCCATACAC GAAGAGTCGG CGTATTAAAG GGGTCTGTCA GAACTATGCG GTGCCGGTGC 180
AGCTGGGGGT ACAGCACTAT TTTAGCGCGC ACTGGGGGAT AGACGCGACG GCTACCGTTT 240
CGTTTGGCAT TGACACCAAG CTGGCTAAGT TCCGCATCCC GTATACGTTG CGCTTTGGCC 300
CCGTCTTCCG CACCTAGGGG ACGGCGCTGG GArrAAAgAG TCCTGCCGGA AGGccCTGCG 360
GCGGGTAGTA GCTACCAGGA GAGGGTGACG CCGCACACGA TGCGGCCGAT TCCCTGGGTG 420
AGGCACTCGG ACACCAGCAG GTACGGGACA TCAGAGAGCA TACCCTGTTC CCAATCAAGG 480
GAGAATACCG TCTTCTCTAT GAGACTGGCT GAAATACCAG CACGCAGCTG TGCACAGTAC 540
TCCTTGGTTA GATAGGTAGC TCCTACTGCT CCACCTGCAG CAGGGGCATT CAGGTGTGCA 600
CGGTTGGTAG AGGCATGGAC CGTAACGCTT GGCTTCACCC AGCCGTAATC CTGCACCGGG 660
ATGCGATAGC TACACCACGC CTTCCCCACC ACCGGTGGAC GGATATACTC CTTTTCCTGA 720
ATGCCACGCA CAGCCGTCCC CCCGTTATTT TTGTATAGCG CATAGGTGAG GGGGATGTAC 780
ACGCGTGTTT CAACGCCGGC GTCCAGGCCG GTGAGCAGGT GGGTGTAGGG GTCACCGCTC 840
TTAGTTTCGA GCTTAAGGAA TCCGGCAAAG TCGCCACAGC TTGCGATGGT GTTATCTAAC 900
ACCCTGGTGC CAAAAACGTT TGCCGGTGCT GTGGCAAAGT ATATGCCAGA AGACAGCCAC 960
TTCCACTGCG CCGTAAACAG CGCATCGAAG GCGACATTGT AGGTGTCAAG ATACAGACAC 1020
ACGGCGCTGA CTCCCATTAG AAAGGCACGC CATGCAGACG CACGCAGGTT CTGTATAGCC 1080
TGACGTATCT GCTGCCCCGC GTCCAACGCA TCCGTCTTCT TCTTCACTTC TTCGGTTACA 1140
AACGTCTGAC CCTCAGTGAA AAACTTTGTA GCCTCAGCCG TAAGCTTTGG AATAAGATCG 1200
GCGAGATCGT CCTTCAGTAA TTTTTGTTGA TCCGGAAGGT TGAAGAGGCT CATCGGATTT 1260
CCCTTATGCG GTGCAAGCAA CGCATCAGCC ACGTCTTCTA TTGCCTTGGT AAGACCGTTG 1320
ATGAATGCCT GTGCTGCAGC CTTGCTCTGA GCCACAGCCG CTTGCACTTT CTGGTTAATT 1380
TCAGCAACGA TTTGCGTCTG TACCTGCTCA AACCCCTTCA CCACCTGcTC CGCATCGTAC 1440
TGCAGCAAAA CCTGCCCCAT CAGGGAAAAT GCAGGAAGCG CGGGAAGCGG TGACGGTGTA 1500
GGAGGGTCGG TAATGAACAC CTTCAGGTTC GTCAAAAGCG TGGCAGCTAC CGGAGGGGTG 1560
ACTTTTACGT TTGACACTGC CGTTATGTAC GCGTTTCCCC TCGACTGAAG GGCGGTTTCs 1620
GTGGyksCgT TACGGCATCA AGCA 1644
(2) INFORMATION FOR SEQ ID NO: 175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2535 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
CACGTACCGT CTCTGCAACC GTAGAAGCCT TCAGAGAAgG ACAACCCGAT CAAgTGTAAC 60
TGATCTAGAT ACGCAGCTTC TGCAAgGCAg TTTGCACCAA CGCTTCTGCA CGCGCATTCA 120
cTGAcTGCAT CCCATCCGCG TAACGCGCCG CAAtGCGCGC TTCACCACAC CCGGCAGAGA 180
ACCGGTATTC ACCCCAATAC GCAAAGCAGT GCCGGTTGCT CGAGCCTTTT CTACCACCGC 240
ACGCACGCGC TCACGCACAC CGATATTCCC CGGGTTAATA CGCACTTTCG CTACCGGTCC 300
GTCCATACAC CGCAGCGCAA GCCGGTAATC AAAGTGAATG TCTGCAACAA GCGGCATACG 360
CGTGCGCGCG CAAAGCsCcA CAAAAAGCTC AGCCGATTCC CTGTCAGGAA CGGCAAAACG 420
CACAACGTCA CACCCTAATT GTTCAAGTTC CAAAAGACGG TCCACAATGG ATTGGAGATC 480
TGCACCAATG AGCGGTTCTT TCCACATTGT TTGGATAGGA ATCGGTGCGT CACCCCCAAG 540
AGGCAACGCA CGCACATGTT CCTTTCCTCC AATTACCAAT GATCGTGCAC GGTAGTGGGA 600
CGCACGCGGT CTTAATTTTA ATGGAACATC CAAAAGTCCA CTCCCATACG GACTATTGCA 660
GGGAGACACA CCGGCAGATG AATCCACCTT TTCCTCTGGc tGCCGCGCAg CACGCTCGTC 720
CCTCTGATTC ACCGGGCCAC CACTTGCGCA CGCACACTTT GCGCAACTTC CACTCCACAG 780
AGTGCTTCGA TAACTGCGAA AGAAAACTCT TCTGCCGCAC CTGCCGCACA CGCGGTCAAA 840
AGATTTCCAT CACGTACCAC CCGCGCGCGC TCAGGTTTAC GAAGCGCACG CGATTTCTCC 900
TCTTCCGTCC TTTTCCCAAC CCCATCGTCA TGTGCGGAGA ACACCGCCGG CTCCATACCC 960
GGATAGCAGG TATAGCGACG TGACCCCAGG AGATTCCACG CAGAGAGCAC TCGCGCCGGG 1020
GCAGCACACA GCGCAGCGAC GAGTCCTCCG CGCAGGTGCA CGCGCATGAC AAAATCACGC 1080
ACCGCCGCAC AGGCGGCAAG CGTATGACAG TTTTCCAACC CCCCGGGAAG AAGAACCGCA 1140
TCTGCGGCGC ACGCGGCATC CGCGATGCCC GGAGAAGCAC AAAGCGCCTC AAGACTGCAG 1200
TCGCAGCTCA CACGCAAGCC ACGCGTAGAA ACAACCTGCT CTGCCCCAAC ACCCACGAGC 1260
GTTAACGCTA TCCCCGCACG TCTGAGATAA TCCAACGGGG TGATAGTCTC CACTTCCTCA 1320
AATCCGTGTG CAACAAAAAG GTATACCCGT ACGCTCACCG CAACACCCCC GTACCTGTTC 1380
TTACAGCAAC AGTGCAACCC CGCGCACACA GACTACGGAT AATCGTCCTC CGTTCAAAAG 1440
CGACGTACTA GCCGTGCGCC CCGCGAGCGC AgCACTACCT TAGGATAGAA GAACTAAAAA 1500
CCACGCGGAG CCTGTGCCGG ATCTATTGGT TTATTTTTTT TATATACCAT AAAGTACAAT 1560
CGTGATCGCG CCGCTGCTGC GTCAAACCCC AAATTACCGA GCACGTCGCC GGCAGAAACA 1620
TAATCTCCTG ACTTTGGCAA AATACGCTCA AGTCCTCCAT ACACGTACAC GTGCTTTCCC 1680
GCCGACTCTA CAAAGAGCAC CTGACCATAA CCCCGATGGG TCCCCCTGGA GATTACCTTA 1740
CCAGACATGA GTGCACGCAC GGCAGCATTT CTTTCCGAAT CAATAACAAC CCCGTAGgTC 1800
TTGCCCCGCA CGTACGCAAG CGAAGTCGGG CTCACCGGCC AGCGCGCATT CTTATCAACT 1860
TTCTTACTGA TATACTGACG CGGATCACGC AGAGAAATTT CTCTCCTCTC GGGGGCCGGC 1920
ACAGCAGCGG GGGGAGAAAC AGGAGTCTGT ACGGGATCGG CACGGGCACG CGCCCACTGA 1980
TCCCCATCGG GCAAAGAAGA CACAGATAAG ATGCGGTCCG CGCCCGCCAC GGTAGGAGGC 2040
GACTTTTCAC GCGGAGGAAT CACGAGCACA TCACCTGGGT GAATAGTGTG CGCAGCAGAG 2100
ATTCCGTTGG CTGCAAGAAG CGCGGCAAGT GAACAGTTTA GCATACGGGC AATCGAAAAG 2160
AGGGTGTCAC CCCGGCGCAC CGTGTATCCT CGTGGAACTA CTATGCGCTG CCCAGGCACA 2220
AGCTGGTGTA CGTTTGCCAG ATTATTTGCC TGCGCAAcGC AGAAAGGGGC ACGCCATAGC 2280
GACGCGAAAG TGAAAAGAGG GTCTCGCCCT TAGCAATCAC GTGCACGTCT GCGCCGCGCC 2340
GTAGGCTGCT GCCAACAAAA AACACCCCGC AAGCGACAGA AAAAAACCGC CCATACCCGC 2400
CCCCTTCTCG GCAGATAAAA AAAAAGCATG AGCGTCACCT ATGCGCGCCC GTTCCCCCTG 2460
TCGTAAATAA CAAACGTTCA CCCGCAGCCG AGACACTCCA CAGCCGGCAG GAGCACGCAC 2520
TAACCCTACT TGTCG 2535
(2) INFORMATION FOR SEQ ID NO: 176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1226 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
TGGTTCTTGC GCCGATTTGC TGACGTACTT GAGTCCAAAG ATGCGCAAGG AACTGGAAAA 60
TGGAACTGAT TAAGGATTAA TGCTCATGGC AAGGAAGAGG CGTGCTCCGC GCGGGGCGGC 120
ACAAGGGTGG CTAACGACAT ACTCAGATAT GGTCACGCTT ATGCTCTGTT TTTTTGTCAT 180
GCTGTTTAAT CCAACTGAAG TTGATATCAC GGTGTTGCAG AGTATTGCTG CATCGATTGT 240
AGGTGATCCT ACCGGTGGAG GGGTTTCTGC CTCCTCAGGG AGGCTCGCTG ACTTGGGAAA 300
CACCGTTAAC ACGCTGCCtT CACTGGAAAA GGGACAGAAG CTGGCGACTG CGCTGAAGAA 360
GGCGGTTTCG CTTTTCGCTC CTGAGATTAA AAGCAATAAG ATTGCGGTGA CCAGTGATGA 420
GCGCGGTTTA GTTATTTCGC TCACTTCGGA TTCGTTTTTT TATCCGGGGT CTTCCGATCT 480
GAATGTGGAG GAATCTCGGG AGGCGTTGTT GCGTGTTGCG CAGTTTTTGT CTGATCATGC 540
GCTCGCCGGT CGACGGTTTC GCATTGAGGG ACACACCGAC TCAGTTGAGG TGCCCGAAGA 600
TGGGAGTACA GACAATTGGG AACTTTCTAC CCGTCGGGCG GTGCGCGTGT TGCATTATCT 660
TACTGATTTT GGTGCaCAGG AAAATCGCTT TTCCCTTGCa GGGTACGCAG ACACACGCGC 720
AAAATTTTCA AACGAAAGCc TGAAGGGAGG GCGTACAACC GGCGGGTTGA TATTGTCATC 780
CTGGACGAGG GTCACTTTTG ATGGTACACT TCCGCTTCCC CTTTCCCGGA GGTGAGTTTC 840
CGGTATCTGC ACAGCCTGAG TTTGAGCTGG TGGCgCGTCT GCACGCCGCT GTAGTACGTG 900
AGGAGGATGT ATGGCAGAAA AGGACTCCAT AGGAGATATC GCTGATGATT TTGAGGAACA 960
GCTTGTCGCT CCTGCTGCGG ACAGGGTGGG CTTTCTGCCA GGATTGCTCA GATGGGTTGC 1020
CATTGCAGTA GGGGCGGTCA TCTTCATTGT GACGGTGGTG ACAGCCACCG CGCTGGTGCT 1080
CGCAAAGCAG GGGAGTAGCC ACACGGCGTA TCCGGTTcAC AGGAGTTTCG GgAGTCTCGC 1140
GAGCTTTTGC AATAcTACGA GTmCATGGGC CTATCCGTAC CAATACTGCA GATGCGCTAC 1200
CGGGGACGTT GTAGTGAGCG TTGCGT 1226
(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1079 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:
ATCAAATATT CCGGGGATAT CGTCACATTT CCTGAAAATT CTACCTTTGC AAAAGAAGTT 60
GCCGGAACAA CAAAAAGATA AAATACAAAC TCCACACACA ACACGAGCAG CAATATAACA 120
ATTACCACCT TCAGTACTGC AACTCTCACC CGCATAGACA TATACCCTAC CAGAGGGTAC 180
TTGCACGCTT CAGTATTTCT ACACAAAAAC TGTTTCTTCA CGCTCCCtCC tTATCTTCTC 240
ATCCCCTGAA ACATTGATGA TCAATCCACA TAAAGAAAGC GTTACCACAA TAGATGAACC 300
TCCTGAAGAA AAAAAGGGAA GAGGAATACC AGTGGCTGGA ACTAGACGTA CCACCACTGC 360
AACATTCAAA ATTGACTGTA AAACAATTGC CGCCGACGCG CCGAAAGCCA AAAAAGTATT 420
AAAGCGATTA GCACATCTGA GTGCGATAGA AATGCCAGTG AGGGTAAACG CAAACAACAA 480
CATTAAATAT AAACACACAC CAATGAATCC CATCTCTTCT CCAATAACGA CAAAAATAAA 540
ATCCGAATAT ACTTCGGGCA CGCTCGCAAT TTTCCTCACC CCATTTCCAA TACCACGCCC 600
CCACAACCCT CCATCCATCA GTGCCTCGAG CGCCGCGTTT ACTTGGTATC CTGCGCCAAG 660
CGGGTCTCTA TCTGGATACA AAAATGAAAG CACTCGACGC AAACGATTGG TGGACGTGAC 720
GATCATAAGC ACTGCTATCG GCGCAAGAAC CATTATGCCT CTTAAAAACC ACCACAGAGG 780
CGCACCTGcA ATGAAGAACA TAACCACTGT GATAAAAAGC AAAAACATAG CGGTAGAAAA 840
ATCGTTTTGA AAAAAAACTA CTGACACAAA AATCACGCTC ACAACAAAGG GAGGAAAAAT 900
TGATCGTATA GGTGTATCGA AATGCTCCCG GTGCTTATCA AAAAAGTTTG CAAGAAAAAC 960
AATGAGTACC AGCTTCACAA ATTCAGATGG CTGAAAATTA ATATCAAACA CCTTAATCCA 1020
ACGCGTCGCT CCATTGCGCG TTGAACCAAT ACCAGGGAAA AACGTGCACA CACAGAGCG 1079
(2) INFORMATION FOR SEQ ID NO: 178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 556 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
GnTTTGAAGC CGTAGGCATC TGTGCGCTTG CAGAATGTAT ACATGCAGGG AGTGCGCGTA 60
CGTGCCACGT GAGTnCTTGG GGGAAGCAGC TTATGTTTAG ATCGCGGGGG GGGGGGGGGG 120
nGriACTTGTA GTGCTTGAAC GGGATGTGGA CCGAGTACTG GAGTACCTGG GAAAGACGGC 180
GCTTGTCCAT TTGCGTCTTT CCGCCGCGGC GCGTGGCAGT TCTTCCCACT GTGCGCAGAG 240
CAAAGAGTAT GTCGGCCGTC TTGAAGAAGC GTGTAAGTAC CTGGGTGTCT CTGGCGAGTG 300
CGCGTTTTCT CCAGGGGATT CTTTGCCTAC CGAAGAAGAC TACACGTTGG CACAGCAGAT 360
ACTTGCAGAA GTTGACGCTT TGCACGCACG CGAACGAGAG GGTGATGCTC CCTCAGTTCC 420
CCGTGGGAAG AGTTCTGTAG CCCATGATTC TGCCAACGAA GAGCAGTTTC AGGGTGAGAA 480
ATGTGCGCTC GGCTCGATGC GAnGCCCGGC ACTGTGTGCG CTGCTTAGGC GTTTTGCGCT 540
GCAGGAAnGT GTGCAC 556
(2) INFORMATION FOR SEQ ID NO: 179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
TnTGACTAAA GTACATGCTT GCAGCGATAC TCAATGAAAT TGGATTCAAC ACCGGGCTAT 60
ACGCTTCTCC TCACGTTATG GATCCGAGAG AGAGAATAAC ACGCGCGGGT GTGTTTTTTT 120
CACCTGCCGA GTATGCAAGC GCGTGCACAC ACGTATACCA CACGGTGAAA AAAACAGAGA 180
ATCTGCGCGA CTACGGCCAG GCGACGTGGT TTGAGCTTAT AACGCTACTG GCGTTCATGC 240
TATTTGCACA ACAACGCATG GAATGGTCCG TTTTTGAAGT AGGACTTGGA GGAAGACTAG 300
ATGCAACAAA CATCATTTGT CCTAGTATCT GTCTCCTTCT CCCCATAGAA CAAGAACACA 360
CGCGCATATT AGGAACACGT ATAAAAAGTA TTGCAAAAGA AAAGGGCGGC ATTATCAAAC 420
CCTATACGCC TATTTTTGTT TTGATCAGCC 450
(2) INFORMATION FOR SEQ ID NO: 180:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 605 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
TGGGATCTTG CAACTGACGC TCCCCCTGAG CGTGTGCGTA TGCTTTTTAn ACGCACGGTT 60
CCTACTGGnA TTGAACACGG CACTGTTACA ATTCTTTTCA nAGGGCATTC TATTGAGTGC 120
ACTACGTTCC GTGTTGAGTC GGATTATTCC GATAGAGACA TCCGGATTCT GTTTGCTTTG 180
CCGCGCGCAT TGAGGACGAT TTGGCAAAGC GCGACTTCAC TGTCAATGCT TTCGCTGCCG 240
CGCTCCCCTC GGGGGAAATC ATCGACGTAT GTGGCGGTTA CGCGATTTGC GTAACGGTCT 300
TATCTGTAGC GTTGGGGATG CACATGCTAG ATTTTCTGAA GATGCGTGCG TCCTTTGCGT 360
GGCTGTGCGT TTTGCAGCGC ATTATCTTTT TCCATCGAAG CGCGCACGnT GAAGCAATTA 420
TCGCGCTACG GCTTCATACT GCACGTAATT CTCGTGAnCG GGTGCGTGAT GAACTTTCTA 480
AGATGCTTTG TACTCCCCGT CCGATATTGC GCTCCGCTAA TGGAAGAGAn TGGGATGCTG 540
CAAACACTTT TTCCTGCGCT GGCGCAGTGT GTGGGAGAAA CGAAnGGGAA GGAGAAGGAG 600
ACGCA 605
(2) INFORMATION FOR SEQ ID NO: 181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1265 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
ATGnCGTTTA AACGAAAAGG AGAGTTCGGT TTTTGCACGT ATTGAATAAA TGTTTTAAAC 60
CAAGTGGATG CATTGTCGTG TGTAAAAAAT AATTTTTGGT AGCAGTCGAG CGGAAGCATG 120
GAAAATGAGC ACAGTACCGG GGGTGTCGCC GTACCGTGCG ACAAGTTGCG CATAGAATGT 180
TTCAGCGGGG GTGTCAATCG TCAGTTCGCG GTACGTAATG CAACAAGGCA TTTGCTGCTG 240
TCCTCAAGTA CTGCAGGTTT AGTAGCGGGC ATATCAGAGG CTTCGAAGAG ACGGTAATCT 300
AGGAGACAGT ATTCCATTCC GGCCTTTACG AGTGAGGGAA TGAGCGCGGC GCTCCATGCA 360
GATGCTTCCA GAAAGCAGCC CCGTGGGCGT TTTATAACGT GTTCTCGCAA GGTAGAAGTG 420
AGTAGCTCAA TTTGACTAAT TACATCCGCC TGAGGGAGGA GGGGAAGATA CGGCTGGTAA 480
AAACCGCCGC TGAGTAATTC AAGCCGTTTG GAACCAAGCA GATTGTTGAT AAGATAAGGA 540
TAGGGAGAAT TCTTTTTTTG TATAAATTGA AAAAAAGAAC CGGTCATGTG CACCGTCAGT 600
GGGAGTTGTT CGTGCCTGCA CAGACCTGAA AAAAATGCCT GGTACTGAGA AGGATCGTGC 660
TCTGCGTGCA GGGACTGAAC ACATGCTTCC GTTAACTTCG CAGCAAATGC GATACGTATT 720
TTGGTATTGT CTCCTTTTTT CATAATCGGG TTCTCATCCC CGCCGCGGGT ATCGGAAAAA 780
GCCACAGCTC GTCAGTACCG TACAGGCTCA GTGCAATAAA mCCCATACTT GCTAAGAGAA 840
GTGCAGCTGT TTTCTCCACT GcAGTGCCGT TTCCCTGCGT GATGCGCTGA TGTATCGCGC 900
GTAAGATGGT GGAGAAACAT AAAAAGCCGA CACAGCTCAC TCCCCACATA ACGAGCGCCT 960
CAAAAAGGTT AAGGGCAGTG TACGTAATGT AAAATACGAG TGCGTATAAA AGACCCCACT 1020
CGCTCTGAGC ACGGTGTGAT GTACACGTGC GCGACAGTTT TCCGCTGCAT GTACTCAGAC 1080
AAAAGGCACT TGCGCCCACG CCCAACGGTG TAAGAAAACG CAACGCAAGC GGGTCATATA 1140
CGTAATGTAT ACAGACCCAG TTACCGGTTA TCAATATTCC GCnTAGGAAA AGGAAGCGAC 1200
ATGTATGGAA GAGCGTGCAA CGTGGAGAGA GCGCAGCGAT GCACAGGCGA TCTAATACCG 1260
ATGCC 1265
(2) INFORMATION FOR SEQ ID NO: 182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1299 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
GCACCCGGAT GACCGTTCnT GATCGAGCGG GTACGTCCAC CGTGCGCCCC ACACCGAATG 60
TnGAGCGTCC GCGCTCCCCC CTGCATGCGG TACACAAAAA ATCCGCTCnA CGCCGTCAGT 120
GCCAGCACCA GCGCCACGCA GCACCCGTCC AGCAGCGTCA GCATTCCGCC GCGCGCGGCA 180
GCCGACCCGA TTCGGGGGAG AGGTGGAGGC ACGATCATTC TCGCCGCAGG CCACGCGTCC 240
CCCCGTTAGC GCTCAAAATA CGCACGCGCC TGCCGCGcCA GACGCCGCCG CCCGAGCATC 300
CCCCGCACCC CGCGGGCAAA AAATACGTCC AACACTGCAC GGGCGTGCTG CACCGCCGCA 360
TACCCACGCG CCACCAGCTC tGCGGaCGAG AAAAATCACT CTGGTCccGC CGCGTTTCTA 420
CCCGCACACA CGCGCTCGGG TAATTGCGCG CCTGCGTGGC AACCGCATGC GACGCACACG 480
AAAGACTGCG CACAACCACG TCAAACCCCG TCCGAAAGTC ACTGAGCCCC AGCTGCTCAA 540
ACCGGTCAAA CGTCACCGCC AGCACCGCGT cAAAACCCTG CGCGCGCGCA ATCCACACCG 600
GCGTGTTGTT CAATATACAG CCGTCCGCCA AATACACCCC CTCCTGCCGC ACCGGCGCAA 660
AAACACCCGG GTACGCACAC GAGGCGCGCA ATGCACGGGC AAGCACCCCA CTGGAGAGCA 720
CAACCTCCGC GCCCGTACAC AGATTGACCG CATTGCACAA AAACGGAATT TTACAATCGT 780
GAAAGGATTT CCCCCCCGTC ACCCGCGTCA gsAGcGTGGC AAACTTTTCT CCCGAATCAA 840
GCCCCAGCCC CCGCACGAGC GTGTTCAAAC TCACCCCCAG CTGCACCAGC TTGCCCAAGC 900
GTTGAAACGC AGCCCGCGCC CCCCACACCA TCCCCGCCTC AACGCACGCA GAGGGATCCC 960
GTGCATTCAC ATAGTCTGAA ATAACAAAAT CACGCTGAAA AAACGCCTCC ATCTCCCGCA 1020
CCGACATCCC CAGCGCATAG AGCGCCCCCA CCACCGCACC CATAGAACAT CCTACGACAC 1080
ATTGCGGCGG CGGAACCTGT AGCGCTTCAA GCGCCTTGAG CACCCCAATG TGGGsAATTC 1140
CCCGCGCACC ACCACCTGAA AGAACGAGCG AsCACTTCAC GTGCGGTCAT TATGAGCGTT 1200
TTCCTCCCTG CTGTCCATTC TCCCCCCAGT GTGATACCGT TCCAGTACGC AgTATGGAAT 1260
CGTTTGTACG CAGCGCACTT GCGGCGCGCA CACTCCCCA 1299
(2) INFORMATION FOR SEQ ID NO: 183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
TTAACCGGGC GTTTCCTTGC AGGTTACGCA GGATTGCTTG CAGCAAGCGG GGCAGAGGGT 60
GCCTTTGATT GGCAGAGAGC GGACGCGATT GCAGCGTTAG ATCAGGCAGG AGCGCAACAT 120
GCGGTTTCTC CTATCGCTCA TCCTGCAGGg TGGATGGAAC TTAATCCGGC TGTGCTGTCT 180
GATGTGTATA GTGTGGCTGC AGGATATCCG AGTGCGGAGG GACTCCCTCA TGCGGGGGAT 240
AACCGGGCGG CGCTTGCAAT CGCATCGTTG AGAAATAGCC CGGTTATGAT TGGCAATCGA 300
CACACGTTTG ACGAGTTTTT TGcAGAAGTG ACAACAGCAA TCGGTCTTAA GGGAGAACAr 360
GCGGAgCGTT CGATGCAAAT GCACGCTGCA ATTCTCAAGG AGCTCACAGA TATGCGTGAT 420
GCGACGTCGG GAGTGAATAT TGATGAAGAG TTGGCGGACA TTATTAAGTT TCAGCACGGC 480
TATAATGCGT CTGCGCGTTT TATTGCGGCG GTGAATGAAA TGCTCGACAC CGTCATCAAT 540
CGTATGGGTG TTTAATTTTC AGATAGTGCA CGGTATGAGG AAGGGAGGAT GGGCGTGAAG 600
CGTATCAGCT CACACATGCA GGGCACAGAC AGTGCCTTTT TCTTAAGGGA GCAGGAAAGT 660
AGACTACGGA AGGTAAACAA TCAGCTTGCA ACGCAGCGTA GGATCCAGCA GtTCGCGATG 720
ATCCGCTCGC TGCAGGTCAT TCTGTGAGGT ACAAGTCGTC CCTGGCGCGT TTAGATCGCT 780
TTGAGAGAAA CACGAAAACT TTACGTGACC AGTATCAAAT CGCCGAGGGG TTTATGACTT 840
CTGCGCTGAA CGTAtACAGC GTCTTCGGGA AATGgCTGTC GCAGGAGCGA ACGGAACCTA 900
TACTCCTGAC GATTTAAAAA AAATGGCGAG TGAAGCAGAT GAGCTTTTAC AGGAGCTGGT 960
GCACAATGCA AATGCAGTGA GCGCAGATGG GGTGCGGGTA TTTAGCGGTA CCAAAGTTTT 1020
CACAGAGCCC TTTGAAACGG TCATGGGGAA TGTTGAGGGA TTAGGGTCTG AAGTGATCAC 1080
TCAGGTACGC nTnTTTCCCA AACCGGGGGG TTTTT 1115
(2) INFORMATION FOR SEQ ID NO: 184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 464 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
CAGCCTTGGG CGACTTCCTC TGTGAATGTT CAGGCCTATA CTAGGCAAAG ATTTGAGAGG 60
TTATCTCGGC AGACCTCCGC AGCTCTCCCT CCACCGTAAC AATCTCCTCT CAGGACCATG 120
ACCCATCTTT TACCCTCTAA GCTCAGATCT CTCACACTTC ACAGATTCCC AGGCCCTGGC 180
TTGGTTCCTC CTCCCTGCAC GACAGCCTGG AAATGGGCTC TTATAAATCA GTGCAGTTGT 240
GGGCATCACC TTGGCCTTCC CCCATCTTAG GGATCATAGT GCTAAATTGC ATGTTGTCCA 300
AAATCTGAAA GCCATgTTTT ATGTATTTCC CCACTTTTCT CATGTTTAAG GTGATGCAGT 360
TAATCTaCCT CTGTTAcTCC GCCTgACTGA AATTGGAAGT CCCgTCTgTG CTTCTCTCCC 420
AGATTcATAC CAkGCTCAGT TAnCTGGTCA ATTTTGTCTG TTGA 464
(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:
CGCAGGTTTT GCCGTGTAAT CTAATGCTTG AgCtCTGCGG ctAACTTCTG TCCaGAaTCA 60
GCAaTTTCCC CAAAAAACAC GCTACaTGCm TcAAaGGCAT AATCACGTAG AGCAAACAGC 120
GCATATACGA TAAAAAAACC AAATACAAAC TGCATATCAA GCGTAGAAAA ACGGTGCACC 180
CAAAATACCG ATATGTAACC TGCTGTACAT GAAAACCCCA CCGTGCATAA CACAAAACGC 240
GATAGCGCAA ACACCACTTC TTTAAAGGAA ACAATTCCAG TAGACGCAAA TGCCCACTGA 300
GAAAATAAAA CTATAAATAC ACCTCCTAAG GGAAAAAACA CAAACACCCA GGGAAAAACC 360
GcAAACAAGA ACAAGATAAT aACCCACGTG aTaAAACCCA AAGCCACTGc CCCCGcGAGa 420
AACAGGAGTT GTTGCCGATA TGcAATACTT TTCATCATGA ATGCTTTTTT cACCAAAGTT 480
GCGAACGTAA aCAGGGAATG ATTAGAAAAA AAAGCGCCGC GTTATAACAC AnTACCCGAA 540
ACAGAAGAAA ACAAATAATT TTGGTAAAAT TTAAAACTTC 580
(2) INFORMATION FOR SEQ ID NO: 186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1377 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
AAACCGCGCT GCGCAGCCTG GGTTAAACGT TGCGCTTGAA ATAGAAAATC AGCAGGTTGT 60
CGGGsCGAGT ATGGGGGAAG AGAGTATTCG GCAGGGTACG CGCGCGCTGG TGTGGGGGCT 120
GTGCGCGGTG CTCTTATTTA TGCTGGTGTG GTATCAAGAA GCGGGCGTAA ACGCTTGTGT 180
cGCGCAgCTT TTGAATCTGT ATATCATGTT CGGTGTGCTT TCAGCGTTTA ATTTGaCcTT 240
ACGCTTTCGA GTATCGCGGG GATGATTCTC ACTATTGGTA TGGCAGTGGA TGCGAATGTC 300
GTTGTCTTTG AGCGTATACG GGAAGAACTT GCGCTGGGCA AAAGcCGCGG GGCTGCTGTG 360
TGTArCGGCT TtGAgCGTGC GTTTTGGGCA ATTATGGATT CAAACGTGAC GACGTTTATA 420
GCAGCGCTTT tCCTTTCGgT GCTCGgTACC GGTCCTATtA AGGGTTtCGC ATACAGTTTG 480
GCTATCGGGG TGGTGTCCTC CGTATTTACG GCATTGTTCG TTTCCCGTCT GATGTTTGAC 540
TACGGGACGG AGGTTTTACA CAAAAAGACC GTGCGCATTG GATGGAGGAT TGCTCGCGTA 600
TGAGACAGGT GGTGCGTTTC AGTTTGCTGT TCCTGCCATG CGCGATACTC AGTGTAGTTC 660
TCATTGGTGC GGGAGTGCTC CGTTGGGCAT TGTGGGGGAT GAGCTTTGGT ATCGACTTTC 720
AGTCTGGTTT GATTGAACGG CTGAGGATAG CACCTCCTGC TTTCTCTCTC GTGTACACCG 780
GAACGCATCG ATGCAGTTTT TTCAGGATGA ACAGAAGGTT GTGTTTACTG TCTCTTCGCC 840
TGGGGTGCTC GGTGAgCGTT ATGAATTTTT GTATACGGAG TATCCAACCC TTCGTGCCTT 900
CTCCGAgGGA GCAAAGAAGG TGGAGCACCT CAGTGTTACG CTCCATGcCC CTGAgACTGT 960
GtAcATGCGT GAwACATTCT CCGGGGCGGA GGGCTCCACG TTGTCGAGTG CTTCGTGTTT 1020
TGTGCATTAC TTCTCGGAGG ACGTTCGTGC GCCAGGGGTG GAGGAGTTGC GCCGTGTGCT 1080
GAAGGATGTA CCGTCTGCGG TGGTACAGCA GGTAGGGGTG CGCGCTGAGC ATACCTTTCA 1140
AGTTCGCGTT GCAGCTGAAA CTGCCTTCCC GTCCTCCCTT TkGCCAGAGC AGGGAGGAAC 1200
TGCTCTGGCm CAGTCCGATG CTCCCGATCT TGTTACCCCT CAAGGTGCGG TGGAAAGCGT 1260
GGTGTAaCGC GGCGCTCGTG CGCGCGTATG GAGCAGATCA tGTGGTCCGT TTAGCGATGG 1320
ATTTtGTCGG ATCTCGTTTT TCTCATCTGT TGGTGCGTTC AGGCGTTGTT GTTGGTT 1377
(2) INFORMATION FOR SEQ ID NO: 187:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:
CTGCTGACCA TGGCGGCGGG GTGTCTGCTG GTGTGTGCTA AGTGTTCGAG AGTCCCTTGT 60
ATTGTTAGAT TCCACAACTG CACATTGCGA TGTGCCTGTT CGATACAGTG AGCAATGTGT 120
GCGAAnATGT CTGTATTCCG TGCGGAGTAC TCGGTGGTGC GCTGCAGGGT GCGCATGTGC 180
CTGACGAGTA AGGAGAAAAA AAGGGACACG GCGTGTTGCT GTTTGTGAGC TTGGAAAAAA 240
TGCAGCGTAG AATCAAGTGA GGTAAGATCn AnTCCTGATA CTTGAGGTTG AGTACCTACC 300
GAAACCCAAG TATCAGGAAT GATGCGTCCG ATACACCTAG CGTGCTGGCA CACATACAGC 360
CAAAAGTACA TCACGGACGT TTGCACCTCT TCTAAAGAAA ACGACAAAAA CTGGATATAA 420
ATAGTGGCAA GAGTACACCC TGTGTGTTTT TTGCATCGGT CGGTCTGTCA TGAAATACGC 480
GTG 483
(2) INFORMATION FOR SEQ ID NO: 188:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 846 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:
TTTTCAGCAT TACATTGGGC GCTTTCGAGA ACTGTGGCAC CAAGGAGGTA ACGGACGCCT 60
ACTCTATCTA TGATACCACC GGCCAAGTGA TGTGGGATAA AAGCTATCTA TTTGGGATGC 120
AGGTGCCATT TCGTTATTAT GCAGAATGTG CACTTCGTTT TCATAAGACA CCTTTGTATG 180
TAGATGTGCA TGTGCCAATC ATTTCTGATC CCTTTTTTTC TACTGATTTT TTACAACGGA 240
tGAaGATCTC AACTGGTTTC ATCTTGACTT TAAGTCCTCG ATCGCTCAGA GAGCTTACGC 300
AAAGCACCAT AAGCAGCTAT GACTGGAAAA TCTCTGCTTC GCTGCGCCCT GTCTGGCTTG 360
TACTTCATCC TTGGTTGCGC GATTTTTCGC TCGATCCGAT TTCGTTCACC GTACATTTTA 420
ATTCGAAGTC TGATAGTAAA AAAAACAACT CTTCTCCAGA GCGTAACTTC TTCTACCCTC 480
ATTCGATGGA ATCTCGAGCA GGATTGTCGT TCTCTGGTAC GCTGTTCTCT CATGTGTGGG 540
AAAGACAAAA ATCTCAACAA AAAGAATCGT ACGCGCCAAA AGArATACGT AATCCACTTG 600
CATACACTCC TGCAGACGGG TTATCTAGGG AGGGwTCTCC TCCTGAACAG TCCCCTGnCA 660
GTATCAAAGG AGAACAGCGA GACTGATTCT ACCTTCGATT TTTTTATGCC AGAATTTCGT 720
GAAGAAAATG AACGTCGTAC TGGTACTGAT CATGCGTATG TTTTTACGCG ATACGCTTTA 780
GATTACAAAG GTAAAGGTGA CATCGTGTAC GATGCACAGT TCAATCACGG TTCGTGGGGA 840
TGACGC 846
(2) INFORMATION FOR SEQ ID NO: 189:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 689 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:
CTGnCACAGG nACAAACGGC GACCCCGGCA GTTTAACGGT GAACCCTAGG TTAGTAGCGG 60
AAAGACCGAC GTACATGTTG GGTTCTGCAG AGCTAAAGAA CTTAGCAACG TTCCCCACCC 120
ATTGCAGACC CACGTCGGTA CCCAGCGACA GGTGCGTTAG gCCCTGCGCG TCGCGGAACC 180
CCGCCTTCAG GTTAGCCCCT ATGGAAAGTC CCCCAAACCG ACGGGAGAAG TTCACAATCC 240
CCAAGCCCCC CAGTTTTTTA ATTGGGTTGG AGGCAGGCGT GCACACGGGC CCCGTGGAGG 300
GAGAAAAGTT AAACCCGGAT TCGGGGAAAA ACATACGCAT CGACGCGCCG TATCCCCAGT 360
TGCCCGACTG GCCAACGTAG GAAAGCGTTT CGGCATGCGA ATTGTTAAAT CCGACGGTGT 420
GGGCAAAGGT TAGCTCACTG TGCGTCATGT TCGCACTCCC TGCTGGGTTC GCCTCAAAGA 480
AGCTGGCATC GTTTGCCAAT GCGGTGAACG AACCGTCCAG AACACTCAGA CGCCcTCCGG 540
AAnGGGAGGA AAcTGCGCGC CTCTTAAACT CACTCATCTT TGAGCGAGTC TTCGCCGCAG 600
CTTCTGAAGA GGCAATCGCC ACGCTGGCTC CCGCGAGCAC GAAACCACAG ACGAGCGCTG 660
CGCTCTTATA AAAAACAGAT TTGTAATGT 689
(2) INFORMATION FOR SEQ ID NO: 190:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 942 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:
TACTCCCCTC TCTGAACATG GGGAGCTGGG TGTTTGCGCA GACGCGGGGA CAGGTGGTGT 60
GTCCGAAGGA GCGGCGCTGC CGTGCGCACG GAGGGGAACG CGTTCCTAAC TCCCTGCCGA 120
AACAGGTTGC CGGGTGCGCG GTGCCTTCTG CAAAGGACTC ATCAAAGGCT ACCTCAAGCG 180
GCTACGGCTC GGGGTCCGTG TTGCGCGTAG ATGCGAAACG CATCGATTAT CTTCTGAATT 240
TGGTAAGTGA GACGGTGATT ATCAAGGCCT CGCTCAATCA GAGTGCGCTg AATTTGGGGA 300
GGTGTACACC CTATTCCAAA ACGCTAATGG CGCGTACAAG GAGCGTTTGC GTAAGTTTTT 360
TGATAGGGTT CCCGCTTACT TAGAAAAGGT AAAGAACGGT CAGGACGCAG ATGCGGTGCG 420
CAAGGGGATG ATAGCAGAGG CTGTCGGTGT CTTtGACATT TTTTCTTCGT TTGAGAATGG 480
ACTGAAACAG TCCGTCACTA AGTTTCGGTC TTCTGCTCAG AATTTGGGGC GTATTTCTGG 540
TGAGCTTCAA GAAGGTGTGA TGAAAATCCG CATGGTGCCT ATTAGCCAGA TTTTCAGTCG 600
TTATCCGCGT GTGGTGCGCG ATCTCTCGCG GGACTTGCGT AAAGAGGTGC GGTTGGTCAT 660
TGAAGGAGAG GAGACGGAGC TTGATAAGTC TGTGGTTGAA GATTTGCTCG ATCCCATTAT 720
GCACTGCGTc GTAATTCTCT CGACCACGGC ATAGAAGCGC CTGAAGTTCG CGCGCGCTCT 780
GGAAAACCGG CGCAAGGTAC GCTTcTCCTG CGCGCAACAA CGAAGGAAAT ATGATCGTaT 840
TGAGGTTGCC GATGACGGGC GTGGcATCGA CGTGGAgGCA tGAAGACGAA AagCAGTTGA 900
GCGArGTGTG TTGCAcCCAG GcAAGAACCT CACTGAGGTT GA 942
(2) INFORMATION FOR SEQ ID NO: 191:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 413 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:
CAAGTACAGC TACGCCGCGC ACAAGCTCAA ACAAGGACAG AGCCACTCAA TCCACAGTGC 60
ACACACAGCC GGACGCAACC GGCAACCTAT CTCACGCACG AACCTAACCG GGGTAACTCT 120
GCTGGCAGAG GTCAACAGGA CGGGGATTCT ACCTCCGATT TGCACAACGT TCAAGCACTC 180
ACGTACGCAA GATATACGCA CCGCTTTTTC CTCAAGTGCA CCCGGCGTAT TCCGATACCC 240
CTTACCAAGC ACGGGGTGCA ACGGCAGGGA TGCCAnGGCG GCTGGTCCGC TTGAAAGCAC 300
CAAAAGGAGT ATGCTTCATG ATTCATCATC ACAACATGAG CGTATGTTCT CTCCAAAGAA 360
CGCTCGGACA CAnAAATTGT CCGTCCAGAA GAACnTGGAA GTTGTCTTCA GGA 413
(2) INFORMATION FOR SEQ ID NO: 192:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 503 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:
TACATACCTC GCTCTGCTAA TCCTGTTACC AGCCCCCCTG GGCGCAGGAG TTGTCCAAGA 60
AGTATTGCTC TAAAACGGTC AATCTGTACT TCGTACATGG CAATATCCGT GACTTCCTCC 120
CCCATCGCGA TATTCAGGGC AGCTTCTCCT TCGTTCGCAT TAATGACTAC ATATCTGAGG 180
TTATCTTCGG TAACCGGGAT ATCATCGTCT TCTATGACAA ATCTGCAGGG CTCACATTTT 240
GTCTACAAGA AATGCTGAGC GCTTACTTAG AGCGTATGCA TGCCCAGTAT CCTACTGAGG 300
CACTTGCTGA CTTTCTTTCG CGTGATCCGG TGAAAGCTTT TGCGTACCTT GAGCGCTACT 360
TTATTATGAA CATGAAACAG AATAAGCGTA GGTCCTCATC ATCGACTATC TGAATCTCTC 420
GTTCCCTCAG AAGATATGCA AACTAAGCGA AACAGATCGC TATGCTCGTC ACCCTCAATC 480
GCGGGCAAAT GATCCGGTGT TCA 503
(2) INFORMATION FOR SEQ ID NO: 193:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1038 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:
CAATTGCGTC GGTTCGCCGT TCTTCTCAGT GCCTTGTTCG TGCTGCTTCC GTTTGCGCTG 60
CCGCGCTGCT TGCGCACAAG AACCTAGCGC GCGCGGCGGG TTGCAATCCT TTTTTTGCGT 120
CTGGGCCTCA TGCTGTGTGG TACGCTCTTA GACGGGCGTT CGTGGCGTAA CGAGCTTCCC 180
TTTCACCTGT GCCCCGCAGC GCTCATTnGA GGGTCGCTGT ATTTCATCAC CCGCCGACCT 240
ATCTnTTTCA ATCTGCTGTA CTTTTGGCAT TTTGGCTCTT TCGTTGCGGT ACTCTATCCG 300
GATCTCACTC GGGCGCACAC CATCTTGTAC GCGTACTTGT TCATGCTCAC CCATTGCCTT 360
GAGCCTGCGA TGGTCGTGTT CAGCCTGCTC CACTTGCGCG AsGCATTAGC AAGCGTGGCC 420
TGCAATGCGC ATGCTTGGCT TTCTTCTGCT TGCAGCAAAC GCACTCTTTT GGAATCGGAG 480
ACTCGGCGCC AATTACCTTT TCATTAGCAA ATACCCGCTT GAGATCCTTC GGGTAATCCG 540
TCCTTTTTTT GTGTATCAGC TGCTGTTTGT CAGTGCACTG TGCCTGTTAA TGCTGGTACT 600
CTACCTACCC TTCCGGCCAA GCCAACACGG AAGAAACCAG CTCTTCGTCA TTTAGCTGCT 660
CGCTGTGGTT CCATCGGACC CTTCTGCCAA GGCAGAgGGC ACGCGTGCCg TTGGGaCATC 720
GTGGTGGAgT TGCAGCGCGC TtGcGTCAmA mGGCCTtAmT CCGCGGTCAA cGCGAACGTA 780
TAATGGTGCG CtGTGACTTC TGTAmCAAAA mCGCTTATCA TCcAAGCAGA CCGCTCCATT 840
TTACTTGATG TGCACGCTCC TGAGGCGGTA GAGCACGCAA GGCGCTCGTT TCCTTTGCAG 900
AACTGGAAAA ATCTCCAGAG CATCTACACA GCTACCGACT CACTCCTCTT TCTCTGTGGA 960
ACGCCGCGAG CGCAGGGATT CAGCCCCCAG AATGGATTGC ACAAACACTG ACGCGTTTCT 1020
CACGGTTCAn nTCCCCCG 1038
(2) INFORMATION FOR SEQ ID NO: 194:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 441 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:
TTGTCGATAC GnGATATATC nTATCTGCTG GGCAAAATGA CCAAAATGTG GCAGTCCAGT 60
TGCAAAAGGG GGACCGTCAA AGAAAACATA AGACTTCCCC TGCGCACGCT GCGCCACAGA 120
CTGCTCAAAC ACCCGGCGTT CCCGCCAAAA GGCGAGAATA CGCCGCTCCT GCGCGACAAA 180
ATCAACCTTT GGGTCCACAG GCGTATACAT ACAACCTCCG TTGCTCAGAA TCGCATAAGG 240
AGCGTAAGGC ATTATATCAT TTTCGTCCTT CCTTTTCCCC ATACGTCTTA TGACCGGCGC 300
CACACCTTTC CCCACCTGCA CCAGATACCC CACGTGTGGC GTAATCGCAG TGGCTCTGCC 360
ATTACTGCAT GAGTATTACT ATGCAATAAT GCCCCACATT ACACCTTCTG CAATCGAATA 420
CGAAAAAGGC ATCATCAGAA C 441
(2) INFORMATION FOR SEQ ID NO: 195:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:
CAAACAGCGT TTTTGCACGC ACATGATGCA ACAAATATTC ACTGACTGCC TGCGCTATGG 60
ATAAACCGTC TTCAGTTGCC GTTCCCCGTC CTACTTCGTC CATGATAACA AGGCTGTCGC 120
GGGTTGCTGC ACGCAGGATG TGTGCTGTTT CACTCATTTC TACCAAGAAG GTAGATTCCC 180
CGCGCGCAAG GTTATCGGCC GCTCCTACCC GACAAAAAAT ACGATCGACG GGGGTGAGCT 240
CTGCCTTTTC TGCAGGGACA AAGGAGCCAA CCTGCGCAAT CAGGCAAATG AGCGCATCTG 300
ACGCAAAAAA GTACTTTTTC TGCCATATTC GGTCCGGTGA TGAGCGCAAA AnCGGGCAAC 360
AACGCATGTT CAATTGAAGA AAGTGTCAGA TCATTGGGTA CAAACTCCCG GAGGAGATGA 420
AATTCCACCA CGGATTCTGC CCCCGTATAC GATGCACGTC TTGATAAGAC GGTGATCAGC 480
AGTGACGCAn GGCGAAAGTG GGACTCAnTT GCAC 514
(2) INFORMATION FOR SEQ ID NO: 196:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 407 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:
TTTGTGACAA nTGGTACCGG CAACTGGnAA GGCTTTCGAG TAGAATCCAA ATTTACAAAA 60
CATTCCCATT AAAAGCACAG AAGGAAGAAA AATAAGGCAG GCGTTTCAAG CTACTGTTGG 120
GCATGAGTTA ATTTCGGCAG ACTATACACA AATAGAGCTG GTCGTGTTGG CCCATCTATC 180
TCAAGATAGA AATCTTCTCA ATGCATTTCG ACAGCACATT GATATTCATG CATTGACTGC 240
TGCATATATT TTCAATGTGT CTATAGACGA TGTACAACCT GCAATGAGAA GAATCGCAAA 300
AACTATTAAC TTTGGAATCG TGTATGGAAT GAGCGCTTTA GATTGAGTGA CGAACTTAAA 360
ATTCTCAGAA GGAAGCGCAG AGCTTCCATT ACCGTTATTT TGAAACG 407
(2) INFORMATION FOR SEQ ID NO: 197:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:
GGGTGGnGTT GGTCGGTCCC CCAGGAnGGT AAGACGTTGC TTGCACGCGC ATGGCCAGGT 60
GAGGCGTCAT GCCCTTTTTT CGCATCAGTG GCTCAGACTT CATCGAAATG TTTGTGGGGA 120
TTGGCGCCTC GCGTGTGCGC GATTTTATTC AAACAAGCGC GGGAGAAGGC GCCAGGGATT 180
ATTTTTATCG ATGAGCTTGA CGCAATTGGA AAAAGCCGCC TGAACGCTAT CCATTCCAAC 240
GATGAGCGGG AACAAACGCT TAACCAGCTT CTGGTAGAAA TGGATGGGTT TGATAACACC 300
ACCGGTCTCA TTTTGCTTGC TGCTACCAAT CGCCCCGATG TGTTAGATCC TGCGCTCCTA 360
CGCCCCGGTC GTTTTGACCG ACAGTTTGCG TAGATCGGCC CGATCTTAAG 410
(2) INFORMATION FOR SEQ ID NO: 198:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:
nGACGCAACA CTGACAACTG ATGGnTTCGC TCATCTCTCT CGGTTCCCTG AGTTCTTTTA 60
CATACGGTCT CTTTGCCACG GCGCGTATAG CAATGGGCGT GCATGCCAAC GACACGGCAC 120
TTGTATCGCA CTATGTGGCG GATTTGTATT TCGAATCAGC TGCAATGATC GTAACGCTCG 180
TCACGGTGGG TAAATACCTG TCCGCCTTGT CTAAAGGGCG CACTTCTCGC GCACTCACAC 240
AACTGCTAGA CA AAAACCT AAAACGCTCG CGTTATCGTC AGTATCTGTT CCGCGCGGAG 300
ATCCCTTCTT CCCCAACAAT GCAACGCTGC ATCAGCCCAT GAGACCCATG AATTGAGATA 360
GAAATCTGCA CAGGACGTAA TTGTTCGGAG AnACGTAATT GTAAAAGCCA GGTGAGTAAG 420
TTCCGG An 429
(2) INFORMATION FOR SEQ ID NO: 199:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 374 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:
CGCTATCGTA nGGAGGATCG CTGGGCTAAT TCGGTAAAGG CGGTCATTCT CTCCTTCATG 60
GATGGTGGGT ATCACCTGAC GGGGTTGAAC TGCACTATCC TTTCTCAGAT TCCTCCTGAT 120
GCGGGGCTGG GTACTCCCAA TGCGCTGAAG TTGCCATGGC CCTTGTGCTT GGAAGTTGTT 180
TGCCGCTACG CTGCCAAAGG AAAGTGTTGT TTCGATCGTG GAACACGCAA ATGAGCGCTA 240
TCTCAAGACC CACGCACATC GCGCGGATAT TCTGTGCGTG TTGTTTGCAA AGCAGGGTAA 300
CTGCGTGCGC ACTGATr-ACC GCAAGAAGCA nGCGGAACTG TGTCAATTCC CTCGGAGGGA 360
AACGTATGTG CTAC 374
(2) INFORMATION FOR SEQ ID NO: 200:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 382 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:
TGCAGCAGGC GTCTGGATGT GCAGGAGCTT GCGCAGTTAC GGTGTGATCT GTCTCGAGTA 60
CAGGAGCAGG TATATGCGCA GAATGAAAAA AACCGAGCCC TCCTCCGTGC GCAGTCAGCG 120
ATTTGCGCCA CAGnCTCGAT GAATTTCGTG CGGCGCGTGG GCTGGTGAGC TATGAGGACG 180
GGGAGGACTC AGGGACGGTA ATTGATCTGA GTTTATAAGC GCTTTTGCCA GACCATCTCA 240
GGGGAGGAGG TGCGGCAAGC TGGATTTTTC CAGTCGGTTG TGTGACAATG CGCTGTGGAA 300
CCCGCTGTTA TTTTGCGTCC CCTTTTGGAA AAAGGTGAGT TAAAGCAGAT GTGGAGCGTG 360
CGCAACGCCG CGGnTATGTG CT 382
(2) INFORMATION FOR SEQ ID NO: 201:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:
GGGGCCAAAT AATTCCGnTC CCAAATGGCC AAGnGGTTAA GCCAAAAAAA AAAAAGGCCC 60
CGGATTTTGG GTTCTTAAGG AGGAAGGGAG GCCCCAAAAA GGCCAAAACC CACCCCCCCC 120
ATTATTCGGA CTTCAAGGAG GACCATTAGG TCGGGCTTTC AGGCCATTAG AGGCCGCAGT 180
AAGCAGCTGG CAGGCGTACC GAGCCTCCCG TTTCGGTACC GAGTGCAAAC GGTGCCTGGA 240
CCGCCTGCGA CGGCGGGCAG nCAAACCGCC GGAACTTCCC CCGCTGGTGC GGCTCCGATC 300
CCGAGGATTA CGCGTCGGCC CATAAACAGA ATACTCGGTG GAAGAGCCCA TAGCAAACTC 360
ATCCATGTTC GTTCTCCCGA GCGGGATAGC ACCTGCGGGC GCGCAGnCGG CAAACAACGG 420
TGGGCATCGT ACGGGAGCCC TATAGTCTGC AAAGAGTTTA CTGCCACACG TGGCAGTGCT 480
TTCCTTTTCA CTGAAATATT GTCCTTGACA GCCAAAGGGT AGACCTAACA AAGGCTTACC 540
T 541
(2) INFORMATION FOR SEQ ID NO: 202:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 722 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:
CAGTAACAAC GCCAACCGGT CTCATCAGCA ACACGGTCCC AGATGnACCA CCTTCnGAGA 60
CGCTTCGGCT ACGGTCACCA ACTAGCACTT TGAAAAACAA GAACAACAAA AGAAAGAAAA 120
TGACGTtCAA ACTGACCAAC CGAACGGCAT TTAACCCACG AgCAGCCTCC TAAAGAGAAC 180
TCTCAATGAG TACATTGTAA CCTcCACCTc ATGTGTGTTA TGCCTCGTCT ACTCCCGGCA 240
GGCCATCTTT TCAGGAAATA CCAAATGTTC CTGAAAGAAG ACTGGCACAT TCCCAGCACG 300
ATATTGCTTG CGATAGTCAC GCATAGATAC AACGCCACCG GaTGCAGTGT CACGATAGCA 360
ATCAGGTTGA TGACCGTCAT aAATACCATA AAAAGATCCG CAACACCCCA AACAAAATGA 420
AAACTCGCAT GCGCACCGAC AAATACCGCA CTGACACAGG TAACTCTGAA AACACTCAAA 480
ACCATTTTAT GGTCCTTAAT GAAACGTACG TTTGACTCCG CGTAgTAATA GTTACCCATC 540
AGTGAaCTGA ATGCAAACAG AAAGATCGCA AGCGTCACCA AGTGCACCCC CACGGGGCCA 600
ACTTGCTTGG ACAGCGCTTG CTGCACGAAC TGCATTCCGC TCACATCCAC TGATCCTGnC 660
AACATCAGAG AGCAGCAACA CAAAAGCGTC ACACTACAAA TTAGCATCGT GTCTATAAAC 720
AC 722
(2) INFORMATION FOR SEQ ID NO: 203:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 648 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:
ATCnCTCCGA GAGTTTGGTC GAAGGAGGCA TGGCTGTGTT CAGCATTCTA GAAAAGAAAC 60
AATTTTCTCC GCAGTGTTTT ACTTGAAGGT GCATGCACCT GATATTGCAA AAAATCGTGC 120
TGCAGGACAG TTTGTGCTTG TTCAACTTGa TGACGAATAC GCTGAGCGCA TACCGCTAAC 180
GATTGCGGAC GCGCATGsGA TgAAGGGTGG ATTGCGCTAG TGATCCAGAC TGTTGGCGCC 240
ACTACTATGA GGCTGTGCGA AAAGGAAGTG GGCGATTCCA TCTCTGTAGT TCTCGGTCCG 300
TTGGGAAATC CAACTCTCAT TGAAAATGTA GGAACTGTCG CCTGCGTTGC AGgGGTGTTG 360
GGGCAGCTCC GCtGkATCCT ATTGCCCAGG CGCATAAAAG GGCTGGAAAT CACGTCATTG 420
TAATCCTTGG GGCGCGCAAT CGGGATTTAA TTATTTTTGA AGAGGAGATG CGCGCGCTTG 480
CAGAcGAGCT GGTCATTGTC ACAGACGACG GCTCATATGG ACGCAAGGGC TTAGTGACTG 540
AGCCCCTGCG TGAnTGTGCG AGCGCGCGTC CTGTCCACAG GAGGTGGTTG CTATCGGTCC 600
GCCGATTATG ATGAAGTTTT GTGCGGAAAC GAnGCGnCCC TTTGGGAT 648
(2) INFORMATION FOR SEQ ID NO: 204:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 366 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:
TGCACTGTAC TACACAGTGT ACGAATGCGT GAAGGTGATG TCTTCTGAGG TCGGTGCGTC 60
TTTGTACGTG CACATCCCCT TCTGTGCGCA ACGCTGTGCT TACTGCGATT TTTACTCCCT 120
GGTGCGTTCA ACCTATTTTA GGCCTCATCA GCCTTGTCCG CATTTTATCG ATCGGCTGCT 180
ACAGGATGTG GCATTGCAGC GGGAGTGCTT TGGGGTCCAG GGGTGGCAGA CAGTGTATAT 240
GGGTGGAGGT ACCCCTTCGC TATTGGCACC GCAGGACATT CGTCATTTTT GCGTACGTTA 300
CGCGCCGCGC AGGnATTCCG ATTCAGGAGT TCACTCTTGA GGTGAnTCCT GAGGATGTGA 360
CCGAAG 366
(2) INFORMATION FOR SEQ ID NO: 205:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 566 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
CATAGATGAC TTTGACGGTG CCTCCGAGGA TCAGGGTCTC GCCTGGCGTG CTGCGGGGAG 60
CAAGTTCATC ACAAAGGGCT TCCCTATCCT CAAGTATTTC GAGGGCATGC CACAGGCGGT 120
GCGCATGGCG GGCTCGTGGC AAGGGAAGGA CAAGGAAGCC CGGTTCATCG GAGTAGAGTG 180
CAAGTTCAAT CGACAGGGGA ATAACTGGCT GGACCTAATT CCGACTAAGG GTGGTAGCGA 240
TTACGAGATC CCCCTGCGTG GGGTGGTCAG TGGGTTCGAC GTGTGGGTGT GGGGTGCAGG 300
TTATCAGTAC TCGCTCGAGG CTTTGGTTAG GGACTGCACG GGAcGAGTCC ACACCCTCCt 360
AAtAgGCaAC CTCgAcTTCC aAGGGTGGAA rAAcCTTAgT GTTTCGGTTC CCACACACAT 420
CCCACAGACG TCGCGCTATT TGGGGAGCGC GCAACAnCTG AATTTTGTCG GTTTCAGGAT 480
CCGTACTAAC CCATCAGAnC GGGTGGATGA TTTCTACGTG TnCTTTGACC AGTTCAAGGC 540
GCTTGCTAAC ATGCATATCG ACTTTT 566
(2) INFORMATION FOR SEQ ID NO: 206:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1601 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:
CAAAAGGCAG TTGACAGATG CAGAGGCGGA TTTTATCCAG AATTTGTGTA AAAATGATGA 60
CCTTATCATG GCGCTGAATA TTTTTTCTAA CCGGGTGCAG TTGTTCTCCT GGACGTACGA 120
CGACCTCATG TTCGTTGCTG GTGGGGGAAG TGAGTnAGTA CCGAGAAGTA GCTCTCTTTT 180
CATAAAAATC TTTTCGGGAA GAGTGCCTTA TCAAGCCAAA GACGAATGCT TGTTGTTTAC 240
CGCAGCGTTG AGGGTACTCA CTCCCGCGAA TATCTTTTTT AGGGGTTGCG TGGTCTTTCT 300
GGTTATGCTC ACCCTCACCA TATCaAGCAT GGTCTTTTTG GTTATTGAGT ATCTGaGCGT 360
GCGCCGATTC CaGTGCGcAC AGGGGGACGC GCGCCCTCGT CtGCTGGACC TCAAGGTAnG 420
CGCGCTCTCG GACAGAnGAG GnCCCAGGTn TCGCCTGGTA CACCGCCTGA GTGTCCAGTG 480
TCCGTCTTTT GAAACTCATG GGCAGGACCT TGCTTCTAAG GAGAGCGTGA GCGACCTCGn 540
AAGGTTTACA TTTGTACAAC CATTCTGACG AGGAGCTTTT TTCCATACAT GACTCGGTGT 600
TCCAGGAACA TACAGGGsGG GAGGGCCGGA TGTCTTCCTC TTCCGGTGAG GAAGGGTGCG 660
GGCAGACGCT GCTTCCCGTT CATGAAAAAA AGAGCACCTA TGGTCTTTTC AATCCCTTrr 720
CCGGrGTCGG TTGGCGGGCC TATCTGGAGG AAAGGCTTGA GGCAGAATTA GGACGCGCCA 780
CCGCTTCCGA ACAAGACCkA AsGCTAATGA TAGTACAGGT GGAGCACCCC GCGCACCAGA 840
CAGCCGTTGC GGACGCAGCG AAAAAGCTCG TGGAGTTCTT CAAATTTCGG GATATGCTCT 900
TTGAATTCGA GGGTAGTTGC TGCTTCGCGG GTATCGTACA AGACGCAAGC CTCGAAGAGg 960
AnTGGTaCTC GCGAGGGaTA TACACAAGkA GCTGTGCGGC GCCATTGAGA GCGCACGCgT 1020
CCTTATCGGC ATCGCGAaCG ckTACGTCCA GACTAACTAC CGcGGcCCAC TTaATTGAGG 1080
ArGCGCACGC GTcGGTGAAG AGGGCGCGAG AAGACCCCGC ACACCCCATc ATCGcTTTCG 1140
AGGCCCCCCA CCAGTGTGGG CGCCCGTATC GGTCTTCAGC TTATAGGGAT GTCCCGGACC 1200
GGTCTACCTG CTAATAATGG CTTCCTTCTC GCCGCGCCGC GCACTCTGCA GAAGCGCAGC 1260
ACCCAACCTC CTGAAACTCT GGGAAGAAAC AGTTGCACCG GTGACCACAT CCACCATCTC 1320
GGGATTACCC TTTTCAAGCA AAGCATCGGC GAGCTCTCTG AAGGCCTTTT CAGGACCTAT 1380
GCCCGAGGAT GCATACATGA CCCGATGGTA GTCAGCGTCC TGGGACTTAA ACCGCCCTTC 1440
TTTATGCTGA TAATCGTAAA CCACCTGCAC CATCTTGCCA CCATCAAAAG TAACCTCGAG 1500
AAAGTCCTTC CAACCATTCT CATCAAAATC CTGATACGTC GCCCGGTACG TGCCATTCGG 1560
GATAGAACTA AATGAACACG CCCCAAGCAA CACCGCAAGA C 1601
(2) INFORMATION FOR SEQ ID NO: 207:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:
GTTGTnTCCT GTGCGACGCG nAAATACCTT CCCACACCTG CTGGTTAAAG GACTTGTCAT 60
CGATGTCCCC TGAATCGGTT ACCATTCCAA CGACGAATCA CCACCTTCTG CGCCCCCTGC 120
GTTTCCCATC TGCGGCCTGT CGCTCTTAGA GCAGCCGATG AGCAGCATGG CGCAAAAAAn 180
GCCCGCAAAn GCGCGTACCC ATTCTCTCTC ACAAGAATCC TCCCCCCTTT ATCGACAAAC 240
ATGCGnAAAA TAAnGGGTCA CAGTGTAACC CAAGGGACAA nGAGGTnCAA AGAGTGGTGA 300
GTTTTTGCGT GTGTGCAAGT GGCAAGGTGA AGGGGTACTA GACAGGCCCG GGGGGGGGT 359
(2) INFORMATION FOR SEQ ID NO: 208:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 516 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
TCTTGCAGAA AGTGAGGCGG ACTTCCGTGC AGGAACCGAA ACACTGCAGG ACCTTTGTCT 60
GTGAGTCTTT CCGGGCATAC GCACACAAAA GAGGAGGTGC AGGCTGAGGT GGAGGCAGGA 120
GAGGTTCAGT GCCCAGCGCT TTGCCATCAG CCGGTACTCG TGCAGGAATG CCTGACGTTG 180
TTGGAACCTG CAATTGTGGG TATCTCGCGA GGTGCAGACA GCACGAGAGA TGGGGCGGGG 240
GCGTTTTTTA TTGACGGGAC ACTGGGGGAT GGGGGACACA CACAGGCGTT TTTGCACGCG 300
TACCCTGcGC TCCGTGCGCT CGGTGTTGAA ATAGATCCGT CAATGCTCGC ACGGGCGCGA 360
GCGCGCTTGA CGCCGTTTGG CAAGCGGCTT cGCTATGTCC TGGGGTGGTC TGATGTCTTT 420
TTTGCCTCCG CATATGCATC AGCTCCTGCC TCTCCTGCAA CGGGAAGGAC TGCAGCTGGC 480
GCCGCAGTGT GCCGGGTGCG TATCCGGCGC CGCAGA 516
(2) INFORMATION FOR SEQ ID NO: 209:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:
CCACAnAnAG GACCAAAAAG GACCAATGGC CCGGGGCCGG CCCCAACCGG CCACCTTCCA 60
GGAACCGGCG GTGGCGGGTT GGCCCCCCGC GGTTTTTTAA AAAAAAAACC GGCCGGCCCT 120
TTTTTTAAAA AACCTTACCA AAACCGGCGG TTTTCCAACC CAGGCCAACC CGGGGGGGGA 180
AACTTGGCCC TTTACGGGGG AGAGGGTGTT AACACCCGGG CTTATTGGGC GCGATGGAGG 240
TACGCACACG TGCAGCGTTT TACTCCCGAC GGCTGGCTCC GTACAGGGGA CGCTTTGGGA 300
CAAAGACAGA AACCGGTAAT CTCCTCCCCT GGCAGCAGCT CGTGCCATAT GCAACTCGGT 360
GCGCGCGGAG AAGCGGTGTA CGCAGAnGAT CTTGTTTGTG TGCTTATGCA AnATCCGTGG 420
CGTGGTGGCA GCACACGTGC GGCGTTAGAC ACGCAAnGGG CAAGCGCACT GCGCCGTATG 480
GGTAAAACAA GGAGCCGAAC GAATACGGGC ACCCTTCAGA TAGTGTGCTT TTTGCGCACC 540
C 541
(2) INFORMATION FOR SEQ ID NO: 210:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:
GGTGACCGTC ACCTACCGCA CCGTACATGG CGCATGCTCT TTTTGCCGCA ACTACATCAT 60
GGCTAGAGTG ACTACCCGGA CCGGTCATAA CCCCGATGTG ATAATCCTCC ACCGCTTTCG 120
TGCCCCCGGT ATTTTGTGGA GCACGCGCGC CCCCACAGCG ACACAACAGT GCACCGACAC 180
ATACCACCCA TACCTTCAAT AGAAACTGAC TTTTCATAGT CTCCCCTTAA CGATCTGCAC 240
ACACCATCTC TCCAAAACGC TTAGGCGTAT GGTCCACCCC CCCCCACGGT GGGAGGAnAA 300
nAATTTTCCC AAAATTTTTG GGTTGGGTT 329
(2) INFORMATION FOR SEQ ID NO: 211:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 485 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:
TCGTTTGTAT CGTTGAGCGT GTTATCGCTT CAAGGGACAA AGGTGCCATA CTCATTTTTC 60
TGCCAGGGGA GCGTTCTATT AAAAATTGTA TTACCCGTCT TTCCCATGAA CGTTGGTTCC 120
GCAAGCTCTT TCTTTTGCCC CTCTATGGAA GATTGAGTAA AGAAGAACAA GAGCAAGTTT 180
TTAACCGCGC GCCATTTGGA AAAAGAAAAG TCGTCATCGC AACGAATATT GCAGAAACAT 240
CCATCACCAT TGACGATGTA ACTACCGTCA TTGACTCTGG TTTTGGGGGG TTnGGGAAAA 300
AGGGTTTTTn AAAAAAAATT TTTCCCGGTT nAAATTTAAA AAACCCCCCC GGGGCCTTTT 360
TTTTCCCCCC TTAAATTnAA ACCTTGGGCC CAAAAAGGTT TTTTTTGGGG GGAAAAACCG 420
GGAAAAAAAA AnCTTTCCCC CnTTTAAATT TTTTTTCCCT TCCCAAGGGG GGCCTTTTTT 480
CCCGG 485
(2) INFORMATION FOR SEQ ID NO: 212:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 808 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
TTGCATCGCG CGTCTGCAAG CGCGTGnACC AGCTTAnCCA TTGAAAAAGT ACACAAACCA 60
AAAACAGGAG GGATTACGCC GAGCACCATA GAGCACGTAt TGGCGGAGAG GCATACACCG 120
TGGTGCTCGA CCGCTCGTGG ACGTATGAAA AGGGCTGGAC CATTGCGCTG TTCCCCGATG 180
ATAAGGCGGA TGAGGCTGGA GCGTATGTAA AAyCCCTTTC GGTTTCAAAA AACAAAGAGG 240
GAAACTGGAT GGTTGCCATT CCCAACGCGG CGCTCAAAAC CGGGTCGTAC ACGCTGCGCG 300
CAATAACGCC GCGGAATATT TACGCAGAGG TGCGGGGGAT ACTGCATGTA GTGCTCTGGC 360
GCAGGCCCAT CTTTTTCTAT TATGACCTGA GCGTGGGATA CGCGCCGGTG TATCGGCCGC 420
AGaCCACGCC GCGACAATCA ACGGTGTTTC TGACTTTTTC AAAATCTGTT CTCCTATCGG 480
GTTTGTCGGC ACGTTTGAGA TGTGCTTTTT TAAGCGCAAC AGCAGCACCA TCAGCGCTGG 540
CTTTAACGCG CAAATGCACT CCGATTCAAA ACAAGTGGAC GTGAAGCTCG ATGGAAACTT 600
TGCGTATCTA TACGAACTTT ACCCGCGCAT CGAGGTAGGC GGCATGCTTG GGTTGGGGTA 660
CTCGCTGCCA TTCGGACAGC GCAAGGAAGA CGACAGCATG TACTCCTACG TGACAGGAAC 720
GATGAAGTAT TTTTTACTAA TAGCATTACC TGCGCGTTCA ACAGCAGCAC ATGTTGACCG 780
TAAAGCCGAG TTTCACAGGA GTGAGCCT 808
(2) INFORMATION FOR SEQ ID NO: 213:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:
CTGCGCCCGC AGGnAAAAAA GTGTGTGCAG CAATCCAGTC TCTTCCATTA GGCGGAnGCA 60
ATACTCGGAC GGGGAGTACA AGCTCTTAGA AAGTTCTCAC GCACCCGCTC AGAGAATACG 120
TGCGTATGAG CGCGnAnATA AATTGCTTCA CGCGTGCGCG CTTCGATGGA AAAAGAnAAC 180
TGCGCTGCAA AACGCACAGC ACGCAAAGGA CGCAACGCAT CTTCAGAAAA TCTTGCCTGT 240
GCATCCCCCA ACGCTTAAGA TAAGAACCGT TACGCAAATC GGAGTAnCCG CCACATTACG 300
TCGATGATTT CC 312
(2) INFORMATION FOR SEQ ID NO: 214:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:
CTGCTCTATG TAGCGTTTTG CCTTTCGCCC TGTTTTCATT TCCACGATTT TGCCGTGGTG 60
GAGGACGAnG GAACCGAGGA TCTGCACGTC ATAGGGACGC TCGCCTAAAA CACGACGAGC 120
TGCCTCGCGC GCAAnCATAT ATTGCCTGAG GnAAAAAAGC GTCAAGCGCT TTCTTCCTGC 180
AGCGGCACGC GCTCTTTAAA CTCAGCTGTT TTTTTGTTTG AACTCAGACT CCTGGAGAGG 240
AAGTTACCCA GGACTTCCTG GGTCGTTTGT ACGGTCATTT CAAAAGAGGC AGGTAGATTT 300
TTTTCAGAnC GCGCTCGTGT CTGGGTAGCC AAAGnAGTAG CCTGTAGTGT CAGTGTCGTA 360
CGTAGTCATG GTAGTTACTG TTAGCGGTTC TGTAACCAnA AAAACAAGGG TAGACCGTCA 420
CGGTCACCCC CT 432
(2) INFORMATION FOR SEQ ID NO: 215:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 631 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:
AGTGGGAATA AGAGAGGGAG GAGACATACA GTTCAGCACA TGCTCACTCG ATCTTACTCC 60
TAATGATAGA GCTAGAAACG TGCCATGGGA CTCCAAATCC CCTTAAATTG GCAAGTACCA 120
ATGCCATCTT AACTACTTAA ACTGATCAGT TTCAGTTCAT AATTGATCAA AAAGATAGGA 180
TTAAGTGTCA AAAGGATCAC ATGAATAAAA CCAGTGTCTT CTAATAATAA CTGATAGAAT 240
TAAAAAGGAG AGAATGACTC AACATGGGAA GTGGGATACA CAGCAGACTC AGAATGACAA 300
ATGTCCTAAA CAGCACTCTG GCCACACAAT CAGCTCTTAA GGCATTCGGA TCTGGCTAAA 360
AAGCCAATGA AAGTTTCTCA GGCATGGAAA GCCAAGAACT GTGGCAAAAA ATGGCCTAAA 420
TGAAACATCT GTGTGAGTGA GATCCCAGCA GAAAGAACGG GCCATCAAGG AAAGAGGTAC 480
CTTTCCCTGA AGGGAGGAGA GAACTTCCAC ATTGACTATG GCCTTGTCTA AATAAGGAGT 540
TGGCGAACTC AAGAGGCTTC CATAAGCTTG GCAACTCATG ACAAGAGCCT TGGGTGATTA 600
CTGATGCCAT AAACAAGAGT GTCAATTTGT T 631
(2) INFORMATION FOR SEQ ID NO: 216:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:
ACATAGTACA ACACATTAAG GACAGAGATC CTACATGGGG AGTAAGTGCA CAGTGACTCC 60
CTTGTTGATT TAACAACTGA CACTCTTATT TATGACGTCA GTAATCACCT GAGGCTCTTG 120
TCATGAGCCA AGGCTATGGA ATTCTCTTGA GTTCACAAAC TCTGACCTTA TTTAGACAAG 180
GTCATAGTCA AAGTGGAAGT TCTCTCCTCC CTTCAGAGAA AGGTACTGCC TTCTTTGATG 240
GCCCATTCTT TCCACTGGGA TCTCACTCAC AGAGATCTTT CATTTAGGTC ATTTTTTGCC 300
ACAGTGTCTT GGCTTTCCGT GCCTGAGAAA TTTTCATGGT TTTTTTTTTA GCCAGATCCG 360
AATGCCTTAA GGGCTGATTC TGAGGCCAAA GTGCTATTTA GGGCATCTGC CATTCTATGA 420
GTCTGGCTGT ATATCCTGGC TTCCCATGTT GGATTGTTCT CTCCTTTTTA ATTCTATCAG 480
TTATTATTAG CAGACACTGG TCTTATTTAC ATGA 514
(2) INFORMATION FOR SEQ ID NO: 217:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:
ATCCCTGTTA AATATAAGAG TGGGAATAAG AGAGGGAGGA GACGTACAGT TCAGCACATC 60
TTCAATTGGA CTTAGCCATA ATGGTAGAGT TAGAAATGGG CCAGGGGATT CCAATTCAAT 120
CCCATCAAGG TGGCATGTAC CAATGCCATC TCACTAGTCA AAGTGATCAG TTTCAGTTCA 180
TAATTGATCA TAATGATAGG ATTAAGTGTC AAAGGGATCA CATAAACAAG ACTAGTGTCT 240
GCTAATACTA ACTGACAGAA TTAAAAAGGA GAAAATGATC CAACATGGGA GTTGAGATAC 300
ATAGCAGACT CATAGAATGG CAGATGTGCT AAACAGCACT CTGGGCCTCA GAATCATCCC 360
TTAAGGCATG CGGATCAGGC TAAAAAGCCC ATGAGAGTAT TTTAGGCnGG AAAGCCAAGA 420
CACTCTGGCA AAAAACAAAA AAACAAAAAA AAAACAAAAC AAAAACACAC ACACACACAC 480
ACA 483
(2) INFORMATION FOR SEQ ID NO: 218:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 527 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:
ATTTTCATTT AGGGTGGTGT TTTTTTTTTT TTTTTTTGCC ACAGTGTCTT GGCTTTCCAT 60
GCCTGCAAAA CTCTCATGGG CTTTTTAGCC AGATCTGAAT GCCTTCAGGG CTGATTCTGA 120
GGCCAGAATG CTGTTTGGGG CATTGGCCAT TCTATGAGTT TGCTGTGTAT GCTGCTTCCC 180
ATGTAGGATC ATTCTCTCTT TTTTAATTTT ATCAACTGTT ATTTGCAGAC ACTGGTCTTA 240
TTTATGTAAT CCCTTTGACA CTTAATCCTA TCTTTTTGAT CAATTATGAA CTTAAACTGA 300
TCACTTTAAC AAGTAAGATG GCATTGGTAC ATGCCACCTT AATGGGATTG AATTGGGATC 360
CCCTGGCACA TTTCTAGCTC TACCATTAGG GGTGAGTCCG AGTGAGCATT TTCTGAACTG 420
TACATCTCTT CCTTCTCTTA TTCCCACTCT TATATTAACA GGGATCACTT TTCAGTTAAA 480
TTTAAATGAC TAAGAATAAT TGTGTGTTAA TTAAAGAGTT CAACCAA 527
(2) INFORMATION FOR SEQ ID NO: 219:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 460 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:
CTGAGAAGGG AGAAGCAGCT TCTACACAGC TGCCTCCAGT TCAACCAATA AACTGTAGGA 60
CCTGCTCCTG ATTGGAGGAG AGCAGCGTAC TCGGCGTGTG GGTAACAGAG TTGGGATTGG 120
TGGAAGAGGA CTATAAAGGA GGAGAGAGAC AATATGCACC AGGAACATCT AAGGGGAACA 180
TCTGGGGGAA CACCTGTGCA GCCCCCGAGA GAGCCGGCCG GCGGTGTGCC GCTTCCCCCG 240
CGGAAGTGGG GAAAGTGGCT AGGGGGAACC GCCCTTCCAC GGAGGTGGAA GGGTTGGTAG 300
CCAACCCGGG AAGAACCAGC AGCAAACCCG GGGAGGGCCG AGCAGACGAA AGAACAACGC 360
AGGTCCTGTG TTGTTCCTCC ACGAAGACGG GGAGCGACAC AGTATTCTGT GATAGAGGAC 420
TTTGTAGCAC CATCAGTTAT TGAATTCAAG TCACTTCCAG 460
(2) INFORMATION FOR SEQ ID NO: 220:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 327 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:
CCGTCTTCGT GGAGGAACGA CACAGGACCC TGCGCTGTTC TTTCTTCTGC TCGGCCCTCC 60
CCGGGTTTGC TGCTGGTTCT TCCTGGGTTG GCTGCTGGTC CTTCCCACCT CCGTGGAAGG 120
GCGGTTCCCC CTGGCCACTT TCCCCACTTC CGCAAGGGAG CGGCACACCG CCGGCCGGCT 180
CTCTCGGGGG CTGCACAGGT GTTCCTTCAG ATAGATGTTC CCCTTAGATG TTCCTCGTGC 240
ATGCCGTCTC TCTCCTCCTT TATAGTCCTC CTCCGCCAAT CCTAACTCGG CTGCCCACAC 300
GCCGAGTATG CTGCTCTCCT CCAATCA 327
(2) INFORMATION FOR SEQ ID NO: 221:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:
nGGAAATTTT TCCAAAAAAA GGCCCTTGGG TTAACCCTTA ACCCGGAAAT TCCnTTTTCC 60
CCCGGCCAAC CCGGCCGGGG AAACCGGTTA AATTTTGGGG GCCGGCCAnG TTTCCAAAAA 120
AACCAAGGTT TTAACCTTCC GGAnAAAACn TTAACCCCAA CATTCCGnGG GGCTGGGGAA 180
CCAACGGCTT CnAnAGGGGG AnCCACTGTG GATTCATGGG TTCGGCAGGC AACGTTGTCT 240
TGnATAGGAT TGCGCACAAT TGGGCTGCCA AnGCATCGAG CATACTGCTT GCGTTTTTGC 300
TCGTGCAATT TTACAGCGGC AGTCTGCTGG AACGGCGCGC CATTTCTGTT CCGTTAGTTG 360
TGAGAAATGA AGGGCGCACT AACTCCTGCG CTTCGCTTTC CTCAAAAGGT GACGGTGCTG 420
GATGCGCGCT TTCACGTGAT ACGCTCGGCG CACTGCCGGG ATCTGACATT GTCC 474
(2) INFORMATION FOR SEQ ID NO: 222:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:
GCGTCAGTCG CGTCAGATCC CCACGCGTCC GTGCCCGTTG CACGCGTGGA CACTGCCAAA 60
GCGCAGTGGG CTGCTGGATA CCGCTGCTGC GGGGCTATAT CCGCGTAACA GTGTAAGCCA 120
ACGGTGTAGG AAAGTGGCGC ATTTTGCGTG TATGCTACGG TGTCCGAGAA ATTACCAAAA 180
CGGTGGAGGT GTATTATGAT TATCCTCACG CTAAACTGCG GCAGTTCATC TGTAAAATAC 240
CAGGTGTATA ACTGGACAGA GCGTGCGGTG ATTGCCGTTC GTGGCCCGGn TTTGGGGGGT 300
TTAAAGGnAA AGnCCGGnTT GGGTTTTTAA ACCTTTCCCA AGGGGGCCCA AGGGGGAATT 360
TCCTTGGGnT TTTTAATTTC CCAACCGGGC CCAATTGGnn AAGGGGGTTG GGCCCnAACC 420
GGGGGGCCCC CGGGnAAAGG nnAGGGnAAA AAACCCCCAA ACCGGGTTTT TTCCCGGGGG 480
GGGGGGGAAA AAAAAAAAGG GTTTTCCCCC CCCCTTTGGG GnCCCCCCCC CCCTTTTAAA 540
ACCCCCCCCC CAAATTTTAA AACCCCCCGG GAAAAAAGAn 580
(2) INFORMATION FOR SEQ ID NO: 223:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 692 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:
AnTAnGCTTG GTCCTAAGGG TCGCAATGTC TTGCTTGAAA AAGGGTACGG GGCTCCCACA 60
GTCACGAAGG ATGGGGTTTC CGTTGCGAAA GAGGTTGAGC TCGAAGATCC GTTCGAGAAT 120
ATGGGTGCAC AGCTTTTAAA AGAGGTGGCT ACGAAGACGA ACGACGTAGC TGGGGATGGC 180
ACAACTACTG CGACGGTATT GGCGTATTCG ATGGTGCGTG AGGGTCTGAA GGCGGTTGCT 240
GCCGGTATGA CGCCCCTTGA GTTGAAGCGT GGTATGGATA AGGCAGTTGC GATTGCAGTC 300
GATGACATTA AGCAAAATTC CAAGGGTATA AAGAGCAATG AAGAAGTCGC TCATGTAGCG 360
TCAGTATCTG CGAATAACGA CAAAGAGATT GGAAGGATTC TGGCAAGCGC AATCGAGAAG 420
GTGGGGAATG ACGGGGTCAT TGACGTTGAC GAAGCCCAGA CAATGGAAAC GGTGACGGAA 480
TTCGTTGAAG GGATGCAGTT TGATCGTGGG TACATCTCGT CCTACTTCGT CACTGACCGA 540
GATAGGATGG AAACGGTGTA TTGAAAATCC TTACATCCTT ATCCTACGAT AAGTCCATCT 600
CGACTATGAA GGATTTGCTT CCGCTACTCG AGAAAATTGC GCAAACAGGT CGACCGCTGC 660
TTCATCATAG CTGAnGATGT CGnAAGGCGA AA 692
(2) INFORMATION FOR SEQ ID NO: 224:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1000 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:
AGCGGAGACT TACCAACGAG CCGATATGAG CCGCGTATGA CTTGCCATTT ATCGTATGCT 60
TTTGCGTATT CGCTTTTTCT GCAGCGACGG TTTGCTGGTT GAACACAAGA ACTTGCTGGA 120
TTGCAACCGC GGTCGACTGC GCAACCCGGG CCTGTACCTG CTGGGTGATA ACGTTCTGCA 180
CCACGCCAGT AATCTGCCCG ATAATGGGCC TCGCGGCGGC CGCCCCGGCA CTAAGGAAGA 240
AGAGCTTGGT TACATTTGGC AATTGCGCTG CGATTGCGGT AAGTGAAGGC AAGTCAGGTT 300
GCCGAACAGG ACTTTGCCGG ACGTTCACCG GAACGTCACG TcAATATTCA CTACTTGTGT 360
TCCATTGGTG CGTTTAATCT GATCGTGTGC CTGCTTCACC GCTCCTTCAG CGATCTGAGT 420
AATAACTGAT CTTGCCACCT GCACCGAGTT ATCTATGGTG CTACCAACCG TATCAGCCGC 480
CTGTTTAGCC TGTTCCTGCG CAcGTGcGTA AAAATTGCCA GTGCAACCTC CTGTGCACGC 540
TCGCGTGTCC TTTCGGTCCT CATCGCCGCG GTAGCCTCAC TCTGGTGTTG GTTACCGGCG 600
TCGAGGGCGA AGGAGAAGCG GAAGCCGGCG CCTGGTTCGA GGGTGAGTCG GCCCCCTACA 660
TTCCACAGCA GTTTATCCTT GTTCTGATTG TTTGCGTCCT TCTGTGCACC GATGAGGTAT 720
CCGTCTTCTA GCGTAACATT GCTGGCAAGC TCTACCGTGC ACAGAGGGTG TCCTGCACGC 780
GCATACATTA GCTTCAAGTC TGCCCCAAAG CCATACTTAC TGTGCGTGGG GTCAGTACTA 840
TCCCAGGCAC CGTTAGAGGC AAAGGAGAGA AACCCCACAT CAAGGCTGAC CCCACTGCCC 900
CCAATGTCCT GTGCCCGATA ACCAACCTTG nCGGCTAAAA CCCCAAAACC CGGGGCATAC 960
TGTAACGCAT CCTCCTGGTA ATGCGGTGGT CAAnCCAAGG 1000
(2) INFORMATION FOR SEQ ID NO: 225:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:
TCAAAAATCT GAATATCCTA CCATATGACC CAACCATCCC ACCCCTGGGA ATTTACCTAA 60
GGGAAATTAA ATCAATAAAT AAAAGAGTTA TCTGAACCCC CAGGTTTATT GCAGCTCAAT 120
TCACATAGCT AAGACATGGA ATCAACCTAA ATGCCCATCA ACCAATGACT GGATAAAGAA 180
ACTATGGGGT AnGTACACTA TGGAATACTA CATGGGCAGT AAAAAAAAAA ATTGAAATCT 240
GGTCATTGGA CAACAAAATG GATGAATCTG GAAAACAGCA TAATTAGTGA ATTAAACCAG 300
TCCCAAAGGG ACAAATACCA TATGTTCTCC TTGATCTGTG AGAACTAATG GAGTACCTAA 360
AAGGAAATCT GTAGAAGTGA AATTGACACT TTGAGAAGGG ATGACTTGAG CTGCCCTTGT 420
CTTGACTTTC AAGGAACAGT TTTTCTTTTT TCATTTTTTT TCTTCAkGCT ATTTGCTGAA 480
CTCTTTAGTT AACATAGAGT TAATCATATA AAGTCAtTGA GGATGGATCT CAGTAAAAAA 540
TAAGAGTGGG AATAAGAAAG GGAGGAGGAA GTTTTGTAAC TGTAAAGCTA TATAGTTATA 600
CATACATTCC TATGTACTTA CTTCTAAGGC ACAGTTTAAA AACTTGTCAT GAGATCCCAA 660
ATCTCATTAA GCTGGGTGGA AAAATGCCAT CTTAAGTGTT AAAGTGATCA TATTAGTGTT 720
AAAGTGAACA TATAGATACG TTTAAGTGTT AAAGTAAACA TATAAATAGG TTTAAGTGTC 780
TGGTAATAAT AATAGATATA ATTAAAAAGG AGAGAATTTT CCAACTTGGG AAATAGTCCA 840
CA 842
(2) INFORMATION FOR SEQ ID NO: 226:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 560 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:
TACAAATTGA GTGTCACTGT GATTTTCAGG TTAATTCATT CAGAATTCCA CAGGCAGTCA 60
GCTGGAGGGG TGGATGAATC CGGGAACAGC CGGTTTCCTT GCGACTAAAT AAAGCCTTAT 120
ATCATAATGG ATTCTTGCAT TTCTTTGGCT TTTTTCTCCG TTTTCTTATT TATCATGGTT 180
TATAATGAAT TTTCTGATGT GAGCTTTTGT TTTGTGGCAG GCATATTGAG TTCAGGGCAT 240
AAGAGGAAAT TTCTTTTTCC ATCAGGGCAC AGCTGTTGGC TTGCTCCAAT TCTGGTTTGC 300
TGTCATTGTT TTCGGTGCCT CAGCCGATTG TGGTTTCTTG ATCAAAAATT TCCTAAATGT 360
TTTCTGGGTC TGAGTTGGTT TTGCCTTCTT CCTTCCCAAA TTTTGTATTC GGATGCAATA 420
CAAACAAGTG TTTGGACAGA GTATGCATTA CGAAATAGAA ACACAACATT AAGGCGTTTG 480
AAGCTTAGAC ATCTACAGGT AGCACGAGCA AGGTGAGTTT TTGTGTTTTG GAAAATCAAA 540
TGGAACACTT TAGCTGAGGT 560
(2) INFORMATION FOR SEQ ID NO: 227:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 406 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:
TGnAAACTTC AGTTTGGGAG GCAATTAATA TCTGTCACCT CAGCCAAAGT GTTCCATTTG 60
ATTTTCAAAA ACAAAAAAAC TCACCTTGGC TCGTGTTACC TGTAGATGTA TAAGCTTCAA 120
ACGCATATTT GCTGTGTTTT TTATTTCATA ATGTCTACTC TGTCCTCACT CTTATTTGTA 180
TCACATCAGA ATACAAAATT TGGGAAGGAA GAAGGCAAAA CCAACTCAGA TCCAGAAAAT 240
TTTTAGGAAA ACACCACTCA AGAATCCACA AATCGCCTGA GGCACCCAAA CCAATGAAAG 300
CAAGCCAAAA TTGGAGCAAG CCAACAGCTG TGCCCTGAAT GnAAAAAAAA CTTCCTCTTG 360
TGCCCTGnAC TCAATATGCC AGCCACAAAA CAAAAGTTCA CAGCCG 406
(2) INFORMATION FOR SEQ ID NO: 228:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1425 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:
GAACTTGCAA AAAGCGTGGT AACCnGTATT GCACGGGTTA AGGATGATAT GTGTnAACTG 60
CTTGATAGAG GCTTCGTGCG CATTCTTGGC CAAGCAGCAG GCAGAGTCGC CACTCCGGGG 120
AAAAAGCTTT TGCTTCTCAG GCTCCCTGCA GAAATGGAGA TCGCGCGnCT ATACACCGTA 180
TACGCGCGCT CGGnGGCGTT nTGAGAACGT CGGGGATCCT CTAkAGTCGA CCTGCAGGCA 240
TGCAAkCTTG kCACTGGCCG tCGTTTTACA ACGTCGTGAC TGGGAAAACC CTGGCGTTAC 300
CCAACTTAAT CGCCTTGCAG CACATCCCCC TTTCGCCAGC TGGCGTAATA GCGAAGAGGC 360
CCGCACCGAT CGCsskTCCC AACAGTTGCG CAcCTGAATG GCGAATGGCG CCTGATGCGG 420
TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATGGTGcAC TCTCAGTACA 480
ATCTGCTCTG ATGCCGCATA GTTAAGCCAG CCCCGACACC CGCCAACACC CGCTGACGCG 540
CCCTGACGGG CTTGTCTGCT CCCGGCATCC GCTTACAGAC AAGCTGTGAC CGTCTCCGGG 600
AGCTGCATGT GTCAGAGGTT TTCACCGTCA TCACCGAAAC GCGCGAGACG AAAGGGCCTC 660
GTGATACGCC TATTTTTATA GGTTAATGTC ATGATAATAA TGGTTTCTTA GACGTCAGGT 720
GGCACTTTTC GGGGAAATGT GCGCGGAACC CCTATTTGTT TATTTTTCTA AATACATTCA 780
AATATGTATC CGCTCATGAG ACAATAACCC TGATAAATGC TTCAATAATA TTGAAAAAGG 840
AAGAGTATGA GTATTCAACA TTTCCGTGTC GCCCTTATTC CCTTTTTTGC GGCATTTTGC 900
CTTCCTGTTT TTGCTCACCC AGAAACGCTG GTGAAAGTAA AAGATGCTGA AGATCAGTTG 960
GGTGCACGAG TGGGTTACAT CGAACTGGAT CTCAACAGCG GTAAGATCCT TGAGAGTTTT 1020
CGCCCCGAAG AACGTTTTCC AATGATGAGC ACTTTTAAAG TTCTGCTATG TGGCGCGGTA 1080
TTATCCCGTA TTGACGCCGG GCAAGAGCAA CTCGGTCGCC GCATACACTA TTCTCAGAAT 1140
GACTTGGTTG AGTACTCACC AGTCACAGAA AAGCATCTTA CGGATGGCAT GACAGTAAGA 1200
GAATTATGCA GTGCTGCCAT AACCATGAGT GATAACACTG CGGCCAACTT ACTTCTGACA 1260
ACGATCGGAG GACCGAAGGA GCTAACCGCT TTTTTGCACA ACATGGGGGA TCATGTAACT 1320
CGCCTTGATC GTTGGGAACC GGAGCTGAAT GAAGCCATAC CAAACGACGA GCGTGACACC 1380
ACGATGCCTG TAGCAATGGC AACAACGTTG CGCAAACTAT TAACT 1425
(2) INFORMATION FOR SEQ ID NO: 229:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 362 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:
GGGGCTTTTT AACCATATCT GAATGCCTTA AGGGCTGATT CTGAGGCCAG AGTGCTATCT 60
AGGATATCTG CCATTCTGTA AGTCTGCTGT GTATCCTGCT TCCCATGTTG GATCATTCTC 120
TCCTTTTTAA TTCTATCAGT TAGTATTAGC AGACACTAGT CTTGTTTATG TGATCTCTTT 180
GAGACTTAAT CCTATCATTA TGATCAATTA TGAACTGCAA CTGATCACTT TAACTAGTGA 240
GATGGCATTG GTGCATGCTC AATTGGACTT ACCCCTAATG ATAGAGTTAG AAATGTGCCA 300
GGGAATTCCA ATTCAATCCC ATCAAGGATT TTATTTAATT TAATTTAATT TTATTTACTT 360
AT 362
(2) INFORMATION FOR SEQ ID NO: 230:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:
CGGATCCCCG GTCAATTGTG GAGACAnTAT TATCAATTTA CCAAGAGAAA GCCGAGGTGG 60
AAAGAGTTCA GAATTGAATT TGCCCACGGC nAATATnGTT AAAAAGTCTA GATTAAAGGC 120
CAACCAACCA AAGTTAAATA TAGGCATTAG GATCTGGCTG AAGAGCCCAT GAAATTATTT 180
TAGGCATGGA AAGTCAAGAC ACTCTCAAAA AAAAAAAAAA AAACTAnATG AAAGATCTCT 240
GTGATTGAGA TCCCAGTGGA AAnAATGGGC CATCAAAGAA nGGTACTTTT CTCTTAAGGG 300
nGGAGAGAAC TTCCACTTTG ACTATGACAT TGTCTAAATA AGATTGnAGT CAACAAACTC 360
AAAAGGTTTC CATAGCCTTG GCAACTCATG ACnAGAGCCT AGGGAGATTT CTGACGCCAT 420
AAACAAGAGT GTCAnTTTGT TAAGTCAACA ACAGGAGTCG CTGTGGCACT TACTCCTCAT 480
GTAGGATCTC TATTCnTAAT GTGTTGTACA AGGnGAATTA ATGCTATAAC TAGTACTCAA 540
ACAGTATTTT TCAC 554
(2) INFORMATION FOR SEQ ID NO: 231:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:
ACCCAACCTC AGCACTAACC TTGACGAGTC ATTTCTTTGA TTTGGTCATT GGTAAAATAC 60
TGACCAnCCG TTTGAGCTTG AGTAAGCATT TGGCGCATAA TCTCGGAAAC CTGTCTGTTG 120
CTTGGAAAGA TTGGTGTTTT CCATAATAGA CGCAACGCGA GCAGTAGACT CCTTCTGTTG 180
ATAAGCAAGC ATCTCATTTT GTGCATATAC CTGGTCTTTC GTATTCTGGC GTGAAGTCGC 240
CGACTGAATG CCAGCAATCT CTTTTTGAGT CTCATTTTGC ATCTCGGCAA TCTCTTTCTG 300
ATTGTCCAGT TGCATTTTAG TAAGCTCTTT nTGATTCTCA AATCCGGCGT CGTCAAAAAC 360
AGGAAGCCTG GGTAACCCAG GTAGTGCAAC AGGCGACGCA GACAGTAACG GCTGGAGTTC 420
GAAGCGCGCT GGAATCTCGG GGGACTACGT ACATAAACGC GCTAGAGGCA GTTCAGCCTA 480
ATCCTGCTAA ACCTACCGGT AAGGnTGTGC AAAATCTTCA CACCCCGCAG GAAGTCCGCC 540
G 541
(2) INFORMATION FOR SEQ ID NO: 232:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 628 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:
AAATAAAAAG AATCATCAGG AATTCCTACA AAGATCTATA TGCCAACAAA TTGGGAAACC 60
TATAAGAAAT GGATAGATTT CTGAACACAT ATAATCTACC CAAGCTGAAT CATGAAGACA 120
TAGAAAGTCT AAACAGATCA ATAACCAAGA CAGAGTAATA TCAGTAAGAA AAACCCTCCC 180
GCTTCCCATG TTGGATCGTT CTCTCCCTTT TTAATTCTAC AGTTAGTATT AGCAGACACT 240
AGTCTTGTTT ATGTGATCCC TTTGACTCTT AGACCTATCA TTACGATCCA ACATGGGAAG 300
CAAGATACAC AGCAGACTCA TAGAATGGCA GATGTCCTAA ACAGCACTCT GGCCTCAGAA 360
TCAGCCCTTA AGGCATTCAG ATCTGGCTCA AGAGCCTATG AGAGTATTTT AGGCATGGAA 420
AGCCAAGACA CTCTGGCAAA AAAAAAAGGG GGGGGGGCAA ATGAAAGATC TCTGTGAGTG 480
AGATCCCAGT GGAAAGAAAA AGAACGGGCC ATCAAAGAAG GAGGTACCTT TCTCCGAAGG 540
AGGAGAGAAC TTCCACTTTG ACTATGGCCT TGTCGAAATA AGATTAGAAT CGGCAAACTC 600
AAAAGGCTTC CATAGTCTTG GCAACTCA 628
(2) INFORMATION FOR SEQ ID NO: 233:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 614 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:
TAATGGAAGT AGAGAATGGG AGGGAACTGG TAGGAAGGGA GAGGTGTGTG AGGCTGGGAA 60
ACCACTACAA ACTTAATAAA AAATCAAATG CTGAGGTAGG ATGTCAACTA CGTAAAAGAA 120
AATAGACCAT AGAATAATAA ATGAAAATAT ACCAAAAGCA CTTAAACATT TTCCTACTGT 180
TGGGTAAATA GGTGAATTAC AGTTTTTAGC TTCAGGCAAT AAAAGAAAAT CTTTGTGGTA 240
AGATTTCAAG TTTTTAAAGA AGTTTATCTT CACAATTGAT CACACTGATA GGTCAAAGAG 300
TCAAAGGGAT CACACAAACA AGACTAGTGT CTGCTAATAC TAACTGATAG AATCAAAAAG 360
GGAGAGAACA ATCCAACATG GGAAGTGGGA TACACAGCAG ACTCATAGAA TGGCAGATGT 420
CCTAAACAGC ACTCTGGCCT CAGAATCAGC CCTTAAGGCA CTCGGATCTG GCTGAAGAGC 480
CCATGAGAGT ATTTTAGGCA TGGAAAGCCA AGACACTCTG GCAAAAAAAA GGnCCTAAAT 540
GAAAGTTCTT CTCTGTGAGA TCCCAGTGAG TGAGATCCCA GTGGAAAGAA CAGGTCTTCC 600
AAAAAGGAGG TACC 614
(2) INFORMATION FOR SEQ ID NO: 234:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 301 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:
GCCnGCCAAC CCAAGGnTTT TTTACCCGGT TTTTTGGCCA CCAAATTCCC CTTCCCCGGC 60
CACCTCCAAG GGCCAAGGAA ACCATTCCCC CGGCACCCCA ACCGGACCAC TGCACTGGCA 120
AGTnAAGnCG GCATGGGCAG CAGTCGGTGC AGGACCTGCA GGATCGCTCA TTCCTGGCGC 180
TCCTCTCAGT GCGGGAGTCG GCTCTCGCGG CGnTGGGGAG CGTTGCCTGC GCCAGTAGAG 240
CCGCTGCTCC GCCAGGCGGG ATGACGCATT GGGTGCGCTT GCAAGACTGT GGGCAGCGTG 300
C 301
(2) INFORMATION FOR SEQ ID NO: 235:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:
AACGTGGCAG ACGTGGTAAC TACGTCATCG ATAAnCAAGG CATCACGCGG CACACGAGCG 60
TGCGCCCCGA GCTCTATACT CCCTGCAAGA TTCTCAACAC GCGCAGCGCG ATCTAAAGTT 120
TTGCTGCGCG AAACGACCCT CTTACTCGCA CCAACGCACG ATTAACGGTA AAACCAGCCA 180
ATTCTAGTCG ACGCGACACG TCCGCAAnCG GGTCCCATCC TCTTCTCAGC CATCATGCAn 240
(2) INFORMATION FOR SEQ ID NO: 236:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 567 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:
AAAAAAAAGG GnGGGGGGAA CCCAAAATTC CCCCCAAAAA nGGGTTTAAC CCAAAAAAGG 60
GAAAAAAAAC CCAACCCAAA TTTAAATTTA AAAACCCTTC CCCCCTTTAA AAATTTTTTG 120
GnAACCCAAT TTGGGAAnTT CCAAAGGGAA AAAAATTGGG AAAAACCCTT TTTAAAAACC 180
CCCAACCAAA AACCCAACCC AAAATTTAAT TTAAGGTTTC CAAAAAAACC TTTTTTTGGG 240
AGGGCCAATT TTTAAAAAAA CCCCCTTAAA AAAAAGGAAA AAAAGGAATT CCCCTTAAAA 300
AAAATTTTTG GGCCAATGGG AnGAAAGAAA CCTGGCCTAG GnATTTAACC CCTnCCCAAA 360
AnGGGATTCC TCCCCAAATG GAAGAACCTG GGGCCAnGCC TGGAATTTTC CTCCATCCnG 420
ACCACCCTnC CCnGAnCTAG GGGGGAAGAA ATGGAAAACC AnCATGGTTT AAAAAAAAAA 480
TCCCTTGTCC AATCCCAGAA ATACCGGTAA CCCCAGTTAG AnGCCTCCTC CATTTAATTA 540
AAATGGAAGG GTGGAAATTT AAAAAAA 567
(2) INFORMATION FOR SEQ ID NO: 237:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 215 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:
CTCCATCCAG TCTATTAnTT GTTGCCGGGA AGCTAnAGTA AGTAGTTCGC CAGTTAATAG 60
TTTGCGCAAC GTTGTTGCCA TTGCTACAGG CATCGTGGTG TCACGCTCGT CGTTTGGTAT 120
GGCTTCATTT CAGCTCCGGT TCCCAACGAT CAAGGCGATT ACATGnTCCC CCAGTTGnGT 180
TGAAATAGTA ATCAGCAGGT TTTCGGGGCG AGTAT 215
(2) INFORMATION FOR SEQ ID NO: 238:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:
ACTGACTTTG CTACTAATCA GACCTACATT TAACTCAGAT TAATAATGCT TGCTTAAGTG 60
TCGCTCCCCC TCTTCGTGGA GGAACGACAC TAAATCCTGC CTAGGCTTCA TATCCGAGTC 120
ACGGCACCAT TATGTCGCTC CCCCTCTTCG TGGAGGAACG ACACAGGACC CTGCGCTGTT 180
CTTTCGTCTG CTCGGCCCTC CCCGGGTTTG CTGCTGGTTC TTCCCGGGTT GGCTGCTATC 240
CCTTCCACCT CCGTGGAAAG GGCAGTTCCC CCTGGCCGCA TCCCCATTTC CGCAGGGAGC 300
GGCAAACCGC GGCCGGCTCT TCTCGGGGCT GCACAnATGT TTCCCTTAAA AGTTCCCCAA 360
AAAnGTTTCT GG 372
(2) INFORMATION FOR SEQ ID NO: 239:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:
GCATGCCTGC AGGTCGACTC TAGAGGATCC CGGGCATATA AACAACGCTG CCCCCGTTCC 60
TGAAGCAGGA AGCGTACGAT AGTGAGCCAG CGGAAAGAGA AGTGCTGCAA GGCGGACCGC 120
AAGCTCTGCA GGCACCCGAT CGCACGCAGC 150
(2) INFORMATION FOR SEQ ID NO: 240:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:
GCATGCCTGC AGGTCGACTC TAGAGGATCC CGGGGAGACG CTGAGCGCTC TCCTCGCCCA 60
CGAAAGACAC CGTGnGnGCC CGGTCCCTAG AACGGACGGT CCGCAAGGTA CTTGTACTCT 120
TGCCACTGTC TCCGCGCCTT CTCTTCTGCG 150
(2) INFORMATION FOR SEQ ID NO: 241:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 311 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:
GAGGGAGGGG AGGAGGGAGA GAGAGAGAGA AAGAGAGAGA GGTCCTCTGT CCACTGCTTC 60
GTTCCCCAGA TGGCCACAAC GGGCCAGAGC TGAGTCGATC CGAAGTCAGG CGCCAGGAGC 120
TTCTTCCGGG TTTCACACTT GGGTGCAGGG TCCCAAGGAT CTGGGACATC TTCTGCTGCC 180
CTCCCAGGCC ATAGCAGAGA GCTATnAGAA GCAGCCAGGT ACTAGAACTG GTGCTCATAT 240
GGTATGCTGG CACTGCAGAC CAnAGCTTTA ACCCACTCTG CnACAGTGCC AGCCCTGAAT 300
GTTTTTGAAT A 311
(2) INFORMATION FOR SEQ ID NO: 242:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:
TCTTGCATGC CTGCAGGTCG ACTCTAGAGG ATCCCCTCTG GGGTAGAAGG TTTTGGCTAT 60
GACCCGATTT TCCTGTTGCC ACACCTGGGC AGGACGTTCG CTCAGCTCAG CATTGAGGAG 120
AAGAACCGCG TCTCTCACCG GGCACTTGCG 150
(2) INFORMATION FOR SEQ ID NO: 243:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 596 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:
GAAAATATCG GAGAGAGAGA CTAGCAAACA GCCTAGGGAA AAGCCGGACG AAAAAGGAGC 60
CGGAAGAAGC TATTGAAAGC CTAGGCATAG ACTTGGATAC GGACTACGGG GGGAAGTTGG 120
GAGAAATCTC TAAGGTCGAA AGCGAAAGTG AAAGCTAGAA CAAACAGATT CGGACGCGGA 180
CTGTGGGGAG AGGCCAGGAG AAATGAGGGA GGAATATCGT TGGAGATAGC TTGGGGAAAC 240
ATACCGGGTA GAGAAAACTG TTAGGGAAAT TGAAGCCGCG GGGGGCAGGC CAAGGCGGAA 300
ACGAAAGCCA CTTTGGGGTT CTCAGGTTAG CCCGGGAATA GGGGGCAAAA AGTTGAAACC 360
AGAAGCTGAG ACGTAAGCCA GATTGGGATC CGTCTGATTA GCCCGGGGAG CAAAGGACGG 420
GAAGCCAAAT CGTGGGGCGG AGACGTACGC TGGGTTGAAT TCGCCAGGCT AGCCCGGGGA 480
ACTTGGATTG AATGCTAGTG GTGGAGACGC AAGCTACGCT GTGTTACTCG CGGAAGCCGC 540
CGCGTGCAGA GAGAGCACGG GGCGTGAGTA GATAGGGAAC GGGGCTGGCG TAnGCC 596
(2) INFORMATION FOR SEQ ID NO: 244:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:
ATGCnTnGCA TGCCTGCCGG TCGACTCTAG AGGATCCCCG TCACTTGCCC CCAGCTCCAA 60
ACATTGCATC GTGACCCGTG CACCTTCTTT TGCAATGCTA GAGAGAATGA TTACTGGAAT 120
ATCAATGCGT AGACGTTTCC GCTGTTCAAG 150
(2) INFORMATION FOR SEQ ID NO: 245:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 489 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:
TGTTCTGGGC AGGGTTTTCT GAGTAAGAGC CCCAAAACAC AGGCAGAGAT ATTCAAAGTT 60
TCACATTTGG GTTAACTGGA TCATATTTTT GCACGTTCAT GGCTGAAACA AGGTCTTAAC 120
AAAACTCAAA ATTGACAAAT GAAATCATGT CACATTAAAA TGCTTCTGTA CAAAAGACAA 180
GGTTTTATTT GTTTGTATGT TTTTATATAC CTGACTCTGA AAACCTTATG CnGGGGCTGG 240
TGCTGTGGTG TAGCAGGTAA AGCCGCTGCC TGCnGTGCCG GCATCCCATA TGGGGGCCGA 300
TTTGAATCCG GCTGTTCCAC TTCTGATCCA GCTCTCTGTT ATGGCCGGGA AAGCAGTAGA 360
AGAGGGCCCA AGCCCTGGGT CCCTGCATCC ACTTGGCAAG ACCCGGAAGA AGCTCCTGGC 420
TCCTGCCTTG GACAGGCGCA CTCCTGCTAA GCGGCCAACT AnGGAGTGAA CCAACAGATG 480
GAAGACCTC 489
(2) INFORMATION FOR SEQ ID NO: 246:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:
CTAAAGTTnC CCCGGGGGCT AAGCCCTTGG CCGAAATTnC AACCCCAGCG TACGTCTCCC 60
GCCCCCACGG TTTGGCTTTC CCGTCCTTTG CTCCCCGGGC TAATCAGACG GATCCCAACC 120
TGGCTTGCGT CTCAGCTTCT AGTTTCAACT TTTCGCCCCC TATTCCCGGG GC 172
(2) INFORMATION FOR SEQ ID NO: 247:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 617 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:
ATCAGAAGTG GAGCAGCCGG GTCTGGAACC GGCACCCATA TGGGATGCCG GCACTTCAGG 60
CCAGGGCGTT ACGCnTGGCA CCACAGCGCC TGCCCTGAAA CCTTTTCTTT TACACAAAAT 120
GCAGATGGCT AATACTTCCA CTAACAATGT CCAGTATCAG GTTCAGCTAT GGTTTCTCTA 180
GCTGGGTGTG ATACTTCCTT ATTTTTACTT GAAAAGCACA GTGACAAAGA GAGAGGGAAA 240
GACACACATG GCTGGGCCAA GAGGAAGCCA GGAACCAAGA ACTCCACCCA GGTCTCTAAC 300
GTGGATGGCA GGGCCCCAAG TATTTGGGCC ATCCTGCACT GCTTTCCCAG GAACATTAAC 360
AGAGAGCTGG ATTGGAAGCA GAGCAGTCAG GATTCGAACC TGCACTCTGA TATGGAAGGC 420
TGGCATCGCA GGTGGCAACT TAGCCTGATG GACAACAATG CTGGCCTTGT GATGTTTATT 480
TTTATGATTT TCTACAGCAG AAACAGCAGT TCCCAAATGC AGATATTTCC AAGCCTGCAT 540
AGACTCATAC TTCCTTTCAG GTAGCAGTGA CTGAGAATAG AATCTGCAAT CCCAGTGTTA 600
TCAACATTAC ATTCTAG 617
(2) INFORMATION FOR SEQ ID NO: 248:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 170 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:
CCCCGGGCCC AATTTAAACC GGGCCCCnTT CCTTTCCCCC AAAAAAATTT GGCCAAACCC 60
CGGGAAGGnT TAAAACCCTT TTAAATTTTG GTTGGGCCTT TTTTGGGGGn CCAnTTAAAA 120
AACCTTTCCC CAAACCGGGG GAACCTTCCA AACCCTTTTC CTTCCCCCTT 170
(2) INFORMATION FOR SEQ ID NO: 249:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 585 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:
GAGCATTTTG AGAGGCAGCA AGTGATGGCT CATGTAACAG GTTTCTGGCC ACCTGTGTGG 60
AGGAGCTGGA TTGAGTTTAT GGCTGCTGGT TTTGATCAGG GCCAGCCCTC ACCATTCTGA 120
CCATCAGCAA AATGAACCTG TAGCTGAATG CTTCTCTCTC TCTCTCTCTC TCTCTCTCTC 180
CCCCCAACCC CATCTCTCTC TCCCCGGTCT CTTTCTCTGT CTCTACCTTT CAGATGAATT 240
TTTTTTAAAA AAATTAGTAT TTTGATGCAA AATTGTTTGA CATCTCTGAC CTTTTCATAA 300
TACACCTTCT CCATTATCTT TTTGAGGACT GCTTTAAGCA TAGATTTGTA TGTAGATATA 360
GATGTCTTTC GTCTTTTTTA AAAAAGATTT ATTTATTTGT TTTGAAAGTC AAAGTAACAG 420
AAAGAGAGAG AGAGAGAGAG CTCTTCCGTT AGCTTGGTCA CTCCCCAGAT GGCCTAACAG 480
CCAGCACTGG GCCAGGCGCC GGGTCTCCCA CACAGATGGC AGGGACCCAA ACACTTGTGT 540
CAACTTCTGA TGCTTTCCCA GGCCATTAGC AAGGAGGTGT ATTAG 585
(2) INFORMATION FOR SEQ ID NO: 250:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 566 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:
AACCAGGACA GAATCCGGCG CCCCGACTGG GACTAGAACC CGGTGTGCCG GCGCCGCAGT 60
GGAGGATTAG CCTAGTGAGC CGCAGCGCCG GCCAGGAATA AAGTTAATTA AGTAGGTGAA 120
AGACTTGTAC ACTGAAAATA CAAAACAAAT TAGAGAGTAC AAAAATAAGT GGAAAGACAT 180
TCCAGGTTCA TGGATTAGAA GGTTTAACAT TATTAAAATG TAGTTTAAGG GGACAGCATT 240
GTGGCACAGC AAGTTAAGTC ACCGCTTCCA ATGCCAGCAT CTCATATCAG AGTGCTGGTT 300
TGAGTCCCAG CTGCTCCTCT TATGAACCAA CTTCCTGCCA ATGCACTGGA AAAGCAGCAT 360
ATGATGGGCC CTACCACCCA TGTGGGAAAC CCAGTTGAAG CTCCTGGCTT TTGGTCTGGG 420
CCTGGCCCAG CCCTGGCAGT TGAGACCATC TGGGGAGTGA ACCCATGGAA GATCTGTGTG 480
TGTGTGTGTG TGTGTGTGCG TGACTGTGCA TGGAAGATCT GTGTGTGTGT GTGTGTGACT 540
CTGCCTTCAA AATAAATTAA GAACCG 566
(2) INFORMATION FOR SEQ ID NO: 251:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 441 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:
AGGAGGTTAA GACCTAGTGA ATGCGTGGAG CTTATGAACT GGAACTGTGA AAAAAAAAAA 60
AAAACTGAGG ATGTGTGGGA GAACTCAGGG TGTGCCTGAG AAGTGAGTAC TCTCCGTGGG 120
AGACACCACA AACTTGGTAC CCTTGGCTAC CCAGTGAGAG CCATTGCAGG GGAATCTGAG 180
CTTACACTGA GGACTGAACA GATCCTTTGT GTGGTCCTTG GGACAGAGCA GAGGAATATT 240
ATACACACTG GGGCTAGCGC CCAGGCACTG ATTGCCATCA AGGAGAAAAG CTCAGCTGAG 300
CAAAATTACT TCCCTTCTGA ACACAAAAAG AGAGAGAGAA GTTTACTATG CCTAACCTGG 360
GTGTGTCACC TTTGGGCACA CCCTTAACCC TGAAGAACTG AGCCGAGCTC TCTGGnCCAA 420
ACCCGTCAAA AGCCTCTAGn G 441
(2) INFORMATION FOR SEQ ID NO: 252:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 486 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:
ATGCCCAAGT CCTTGGGTCC CCGCACCTGC GTGGGAGACT GGAAAGAAGC TCCTGGCTCC 60
TGCCTTCGGA TCAGCGCACT CCTTCTGTTG CGGCCATTCG GGGAGTGAAC CAACGGATGA 120
AAGACCTCTC TCTCTTTCTC TGCCTCCTGC CTCCTGGnCC TGCCCCTGCC CCTTCCCCTT 180
nCCnTACCCC TACCTCTATC TCTACTTCTC TGTAAGACTC TTCATTTCAA ATACATAAAT 240
AAGTCTTAAA AAAAAAAAAG CCAAAGTTTT CTACAGTTTC ATTGGTTCCT GGGAAAAGAT 300
GCCACCACAG TGATTTGCCT CCCAGCTGTG AGCATTCCTC CTTACCCTTA TCGGACCCAT 360
CAGGATGCCT GGTCCAAGTC GCCCACCGTG CATAGGCATA CAGTGGATCT TGGGTGCCTG 420
CTTCTGTGCA CATCCAATCT ATCTTCCTGA CCTCTGGCCC AGAATTATGG TCCTTGATCC 480
TCCATG 486
(2) INFORMATION FOR SEQ ID NO: 253:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253:
nGGAACCCnT GATAAGCAAG TGGCCAGGTG AAGGTGGAAA AATCAAAAAA ATATGGCATG 60
ATATTATAGA AGCCAAGTAA AGAACATGGT TATATAGAAG GAAGTGACCA GCTATCTCAG 120
AAACTGCTAG CTAAGTCATG TACAATGAGA ACTGAGGGAT AATACTTATA AAATGAGAAG 180
GTAGAAGGAA TATGAAAATT GTCTACCAAC CTCTACCCAA AGCTATACCA CTTTCCAGGC 240
ACCCTTGAGA GATCTTCCAC CATGTCTATA CACACAGATT TACTTGTAAT GTTAGTAGTA 300
GTTAAGTCAT TTGCATTTTG GAGCTTTATA TGCCCATGGT TTATAAACAG AAACAGAAGT 360
TATTAAAATT TTAGAAAGCA TAGGAATACT GAGGATCAGT CTCCCAGTAC TACTATTTTA 420
AGATTTTATT TATTTATTTG GAAAGAGTTA CACAGAGAGA GGAGAGGCAG AGAGAGAG 478
(2) INFORMATION FOR SEQ ID NO: 254:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 464 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:
AATGCTGGAG AGGATGTGGT GAAAAAGGTA CCCTAATCCA CTATTGGTGG GAATGTTAAC 60
TGGTAAAACC ACTATGGAAA TCAGTTTGGA GATACCTCAG AAATCTGAAT ATAGACCTAC 120
CACATGATCC AGCCATTGCA CTCCTGGGAA TTTACCCAAA GGAAATAAAT CAGCAAAGTA 180
AGGAGCTATC TGCACCCCCG TGTGTATTGC AGCTCAATTC ACGATAGCTA AGACATGGAA 240
TCAACCTAAA TGCTCATCAA CTAAGACTGG ATAACGAAAT TATGGGATAT GTACTCTATG 300
GAACACTACA CAGTGGTAAA AAAATGAAAT CCAGTCATTT GCAACAAAAT GGATGAATTT 360
GTAAAACATC ATACTTAGTA CGATAAGCCA GTCCCAAAGG GACAAGTACC ACCTGTTCTT 420
CCTGATCTGT GATAAGTAAT AGAGCACCTA AAAGAAAATC TGTA 464
(2) INFORMATION FOR SEQ ID NO: 255:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:
CCTTTTTACT TGTTGAACTC TTTATTTAGT GGAGCGTTAA GCCTGTGATG CTAAAATAAA 60
TTAAAAATAT GTTATTGCCA AAATTAAAGG GAGAAGGGAG ACTGGGAnGG TGAGAAGAGT 120
GGAACTAAGT ATCAAATTCT TAGGACTGTA TATATGAACT ACTTGAAAAC TGTTCTCTTT 180
ATATTAATAA AAATTTAACA TAAAAGCACT GAAAAAACTA GTATATTTAA ATCCTCTACA 240
AAATCAATTG CTATGTATTT CTACCTTCAA ACCCATAAAT ACTTGCTTTG TGTGTGTGTG 300
CAGTGTGTGT GCATGTACAT ACCTAGACAC AAAAAAACTG TATGTGGGGC TGGCGCnTGG 360
CACGCTGGGT TAATCCTCTA CCTGCGGCAC CGGCATCCTC TATGGGCTCC GAnTCTAGTC 420
CCGGnTGCTC CTCTTCCATC CAGCTCTCTG CGTGGGCCCA GAAAGG 466
(2) INFORMATION FOR SEQ ID NO: 256:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 451 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:
CAAAGTCACT ACTTATCAAT AGTCATTTTG AATATAAATG GCCTCAACTC TCCAATTAAA 60
AGACGCAGAC TGGCTGAATG GATTAAAAAA CAAACCCATC TACTTGCTGC TAACAAGAAC 120
ACATCTTTCA ACAAAGGTGC ATGCAGACTG AAAGTGAAAG GTTGGAGAAA GATATTCCAT 180
GCCAACAGAA ACCAAAAAAG AACTGnCATA GCCATCTTAA TATCAGACAA AATAGACATT 240
AACACAAAAA CTGTTAAGAG AGACAAAGAG GGGCACTATA TAATGATTAA GCGATCAATT 300
CAATAGGAAG ATGTAACTAT TATAAACATA TATGCAACCA ATTACAGGGT ACCGGCAGTG 360
CAAAATAAAT GTTAATGGAC CTGAAAGGAA ACAAAACTCC AATACAATAG TAAAGAGGGA 420
CTTCAATAGT CCACTTTCAG CAnGGACAGA T 451
(2) INFORMATION FOR SEQ ID NO: 257:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 638 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:
CAGAACCATG GGAAACCACT ATGAAAGAAG GTGAAGGCAG AATATTTCCC AGTAAAAGTA 60
CAAAGGAAAC TCAAAGTAAA CTATAGGAAT ATCTGTGGGA AAATGGTAGG GCCAAGTTGT 120
TACTTATCAA TAATCACCTT GAATGTAAAT GGCCTCAATT CTCCAATTAA TAGATACAGA 180
CTTGCTGAAT GGATTAAAAA ACAAAGCTCA TCTATTTTCT GCCTTCAAGA AACGAATCTC 240
ACCAACAAAG GTACATGCAA ACTGAAAGCG AAAAGATGGA AAAAGATATT CCATGCCAAC 300
AGAAACCAAA AAAGAGCTAG TGTAGCTATC CTAACAGCAA AACAAAATAG ACTTTAACAC 360
AAAAACTGTT TGAAGAGATA AAGAAAGGCT TATGCAATGA TTAACGGATC AGnTTACCGG 420
GAGATGTGAC TATnTAAnGT ATAnGCACAC nTTACAGTAT ACTGAAAACT GCTCCTGATG 480
AnCATGGTTC ATAGAGAnAT CAnAAGAGAA ATAAGAAATA AAAAACTGTT GAGGGACTTT 540
AACAAAATAA GAAGTATGAC TCTGTGGATT ACTGTTATnA CTATTATGAT CATAAACCTG 600
CTCTGATAAG CGCTTTnCAT ATATCACTTC TATCTTTA 638
(2) INFORMATION FOR SEQ ID NO: 258:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 406 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:
AGTCTCTnTC TCTCTCACTG TCCACTCTGC CTGTCAAAAA ATAAAAATAA AAACAAAAAA 60
TAAAAAAAGT ATCTCTTGTG ATTGAAATTT GTGTTTTTTT TTATTTAACT CATTTGCATA 120
TCTTTATCAA CATTTTGGCC ATCTGTTTTA TTCTCTGAAA TATCTTTTTG TGAAGTCTAT 180
TTAATTTTTT TCTGAAGATT TACTTGTTTA TTTGAAAGGC AGAGTTACAG AGAGGGAGGG 240
TGAGACAGAA AGAGCTGAGA GAGAGAGTGT GAGAGAGAGA TGGATCTTCC ATCTACTAGC 300
TCCTTCCCTA AATGGCTATA ATGGCAAGGA CTGGGGCAAG TTTAAGCTAG GAGCCAGAAA 360
CTCCATGCAA GTCTCCCATG TGGGTGGCnG GGnCCATGTA CTGGGG 406
(2) INFORMATION FOR SEQ ID NO: 259:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 443 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:
AGAGAGCTGG ATCAGAAGTG GTGCAGTTGG GACTTGAACC AGTGCTCATA TGGGATGCCT 60
GTGTCTCAGG GTTAACCTGT ACCACAATGC CTACCCTCAT TATACCACTA ATAATCTGGA 120
ATCAGATTCA TTATTGCTTC AACTCCTGGT AAGGAGTATA GATCAGTAAA GGTTCTTAGA 180
GTAGATTAAG ACAAGCTGCC TAAAAAAGCA AAGGTTTGAA GTAATAAGCT CTTGGAGAAA 240
ATAGATGTTG TCGAAGGGAA TGCTCGAGTG ATTCTAATAA ACCGTCATTC CCTGTGATTG 300
CTTTACATGG CGTGCAGTGC ATTTGGAACA AGACAGACGC CAAGTCAAAT CCTATTCCTG 360
GCTATATGAT GTTGACTGGA GATAGTTCCC ATCTCTGAGA CCCAGTTCTT ACTGGGTAGA 420
CTGTGACACT GGCTGTCTCC AAG 443
(2) INFORMATION FOR SEQ ID NO: 260:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 656 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:
AGGTCTTTAA TCCATGCTGA GTGGATTTTT GTGTAAGGTG CAAGGTAGGG GTCTTGCTTC 60
ATGCTTCTGC ACGTGGAAAT CCAGTTTTCC CAGCACATGG AACCCCAAAC CCCCTTAAGA 120
TGGTATTTTT ACCACCCAGC CTAAGTGTTA AAGTGATCAT ATGGATAGGA TTGAGTGTCT 180
GGTAATAATA ATAGATAGAA TTTAAAAGGA GTGAATGCTT CAACATGGGA AGCAGTCCAC 240
ACAGCAGACT CATAATTGCT TTAAAAAGCA CTCTGACCTC AGAATCAGCC CTTAAGGCAT 300
TCTGGTCTGG CTGAAAAGTC CACGAGAGCA TTCAGACATG GAAAGCCAAG ATATTGTGAC 360
AAAAATGTCC TACACGAAGG ACTTAGATGG TGGAAAGAAG TGTCCATTAA AGAAGGAGGC 420
ATTTTCTCTA AAGAGAGGAG AGAACTTCAA CTTTGCTTAT GACCTTGTCT AACTACGGAA 480
TGAGTTTGTG GATTCAGAAG GCTTCCATAA CCTTGGTACC TCATGTCAAG AGCCTCAGAT 540
GATCACTGAC ATCATACTTA AGAATGTTAA TTGTTGGGGC TGGTGCTGTG GGCACAGCAG 600
GTTAAAGCCC TGGCCTGAAG AACTGGGCAT CCCCATATTG GGCACCAGTT CTAGTT 656
(2) INFORMATION FOR SEQ ID NO: 261:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 480 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:
ATCAGTGCCT TGTAATTAGG TGCATATACA TCTATAATAG TTACATCTTC CTGTTGACTT 60
GATCCCTTAG TCATTATATA GTATTCCTCT CTGTCTCCCT TAACTGTTTT TGTGTTAAAG 120
TTTATTTTAT CTGATATTAA ATGGCTATGC CTGCTCTTTT TTCATTTCTG TTTGCATGTA 180
ATATCTTTTT CCAAACTTTC ACTTTCAGTC TGCATGCATC TTTGTTGGAA AGATGCATTT 240
CTTGTAAGCA GCAAATAGAT GGGTTTTGTT CCTTAATCTA CTCAGCCATT CTGTGTCTTT 300
TAACTGGACA GTTGAGGCCA TTAACATTCG ATATGACTGT TGATAAGTAG TGACTTGCCC 360
TGCCCTTTCC CAAAGATATT CTAATATATG CTTGAACTCC GTGATCTTTA CGTGAGGTTT 420
TCTCCTTACC TCTTCATATG AGGCCAGTTT CGTGTGTAAC ACATATTATG CATTTTTGCA 480
(2) INFORMATION FOR SEQ ID NO: 262:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 538 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:
CnAAATTCAA AAACCCTGGA GGATGACATA TATTAACCAC TTATTCCAGT AGTAGCACAT 60
GTACGGCTAT TATAAATATA GATGTATTGG GGTAGCTATT GCTTACTTAA CTATTTAAAT 120
AGTCATTGGC TTAGCTTGTA TAGGTCAGGC TCAGCCGAGT AGTCCTGGCT TTCACTGGGC 180
TACCTGGTGA ATCTTGAACC CGTGATATAC CAGGGAGGTG TCTCTGCTTC AGAGTATGGC 240
TGGTTGTTGC CTGGGACAGT GGAACCAATG GCCCAAATGT CTCTCATCTC CAAAAATAGC 300
CCAAGCTTTT TCACATAGTG TTTCCAACGA TCCAACAGGA AGAAAAGCAG GCAAGGCCTG 360
GAGACTTAGG CTCAGAACCA GCTTACCTTC ATTTCTGCTG TGTTCCATTC ACAAAAGCAA 420
ATCACAAAGC CAGCCCACAT TGAAGGGGTG AGAAATAATT TTTTTTTTGA CAGGCAGAGT 480
TAGATAGTGA GAGAAAGAGA CAGAGAGAAA GGTCTTCCTT TTCCAATGTT TCACCCCC 538
(2) INFORMATION FOR SEQ ID NO: 263:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 681 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:
GCnCTTCCAG AAAAATCTTA TTTACTACTT TGGCCCTAGA AATCATAAAT GACTATGATA 60
GAATGTAACT TTTAACAGAG GTGAGGAAGG CATGGTTTAA TATCCAGTTT TGAGAATACA 120
ATTTTCATTT GTATTTTATT GATGTTTATA GTTCTTTTAA GATTTATTGA TTTATTTGGA 180
AGTCAGAGTT ACAGAGAAGG AGAGGCACAC ACACAGAGAG AGAGAGGTGT TTTCCATCTG 240
CTGGTTCACT CCCCAATTGG CAGCAGCACC ACAGCACCGA CCCCTTATAG TTCTTTTCAA 300
ATCATTTTAC ACTGTCTTTT TATTCATCTA GTACAGAGAC AGAAGAATGG ACACAGAGCT 360
AACATAGCTG AGTATTCAAG GAGACAGGTT AAGGGGTTAA ATGCCTCATA TCATATTATT 420
TATACCTTAG ATTAATTCTG GGGACAGTAT TATCTGGAGT TTACAAAGTA GAAACTGAAT 480
GGAAAGAGCT TAGAAAAATA CGGTGTTTTT TAATATCATT AAAAAGGCCA ATCAAGGGGC 540
CAGCACTGTG AAATAGCAGG TAAAGCTACT GCCAGCATGG ACATTCCATA TGGGTGGAGG 600
TTGAGCCCGG CTAGTCTGCT TTCAATCCAA CTCTCTGCTA TGGCCAGGGA AAGCAGTGGA 660
AGATGGCCCA AGTCCTGGGC A 681
(2) INFORMATION FOR SEQ ID NO: 264:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:
TGGCTTCTAG CTTGGGATTG GCTCAGCTCC AACTGTGGCA GCCATCTGGG GAGTGAACCA 60
GTGGATGGAA GACCTCTCTT TCTCTCTGCC TTTGCCTTTC TGTTAACTTT GCCTTTCAAA 120
AATAAATAAA TAAAAAGAAT TGAATTAGAG AACATCTAGC TAGTGTCCAC TATAGAACTG 180
AATATTCATT TAAAAAATGG TTAGAGAGTG TATAGGGCAA AAGAAAGGGC TGCTTTCTCA 240
AGTACATCGT CAATTCCCAG GGAGGCTGGG ATCAGCTTGC CTTGGGGTTG GTCAGGAGAA 300
TGTCAAGGAT TACAGCAGCT CTTAGAGCCT GGTTGTGAAG GGAAGTAATG GTGATCAAAT 360
GAGAATTGTA CCAGTGAAGG TCAGCAGAAA GGACCAGCTC TGCTGCTGAT GGTGGGGATG 420
TAAGAGGAGC TTGGAAGCTA CTGCAGTGAA TGTGAGCTTT GGAGATTTAT GTATTTGAGC 480
TGTACAACTA TGGGGAAGAC TTTTTTTTTT TATCnTGTAA GCTTCAATTT TTCAAACTGT 540
GAAATGGGGC GAATAATTAT AGACATAAAT ACAGAGCAAC AGTTTTGAAA TACCATAAAC 600
CTCATGTCCT TCAACCAGTG nTATTCATAC CnTATAGGGG TATTGTGAGT TCC 653
(2) INFORMATION FOR SEQ ID NO: 265:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 459 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:
TGGTGTTCAA AAGAGGAAAT CCAAATGGCC AACAGGCACA TGAAAAAATG TTCAGGATCA 60
CTAGCAATCA GGGAAATGCA AATCAAAATC ACAATGAGCT TTTACCTCAC CCCGGTTAGA 120
ATGGCTCACA TTCAGAAATC TACCAGCAAT AGATGCTGGC GAGGATGTGG GGGAAAAGAG 180
ACACTAACCC ACTGTTGGTG GGAATGCAAA CTCGTCAAGC CACTGTGGAA GTCAGTCTGG 240
AGATTCCTCA GAAACCTGAA GATAACCCTA CCATTCAACC CAGCCATCCC ACTCCTTGGA 300
ATTTACCCAA AGGAAATGAA ATTGGCAAAC AAACAAGCTA TCTGCACATT AATGTTTATT 360
GCAGCTCAAT TCACAATAGC TAAGACCTGG AACCAACCCA AATGGCCCAT CAACAGTAGA 420
CTGGGATAAA AGAAATTATG GGACATGTAC TCTATAAAA 459
(2) INFORMATION FOR SEQ ID NO: 266:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 707 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266:
GGAGTCTAAA AATATTATTT CATAGAAGTT AACAATATAA TGATGGAGAC CAGAGTTGTG 60
GCACAGTGGG CAAAGCTGCC ACCTGTGACA CCAGCACCCC ATGTGGGCGC CAGTTCATGT 120
CCCAGCTGCT GCACTTGCAG TCCAGCTCCC TGCCAATGGT TTGGGAAAGC AACAGAAGAT 180
GGCCCAAGTG TTTGGGTCCC TGCCACCCAC GTGGGAGACC TGGGTCAAGC TCCTGGTTCC 240
TGACTTTGGC CTGGCTCAGC ATTGGCCATT GCAGGTGTCT AGGAAGTGAA TCAGCAGATA 300
GAAGATCTCT CTCTCTCTCT CTCTAACTCT TTCAAAATAA GTAAATAAAT ATTTTTAAAA 360
TATATATGGT GGATATCAGA CGCTGGGGAG GGAAGTAGAG AGGGAGAGAT AGTGAAAGGT 420
CTATGGTGGG TACAGCTGAA GAAAAGTGAG AAATTCTGAG GTTGTATTGC ACCATGTGAC 480
AACAGATAAT GTGTGCCAAC AGATAATGTA CCATACAGTT CTACATTTAA GAAAAGAAAA 540
CTAGAAGATA CAATTTTGGA TATTTTCACT ACAAAAGAAA ATGTTAACAA TTTAAGGAGA 600
TAGATATATG TACCCCACTA GACATTTAAC AAGATAGACA TGTATCAAAA TGTCAAATGC 660
TACCCAATAA ATATTTATAA ATTGTAAATG TCTGTTAAAT TTTAAAA 707
(2) INFORMATION FOR SEQ ID NO: 267:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 639 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:
TGAGAGGCAA AATGTAGAAA GAGAACAGAG AGAGAGAGAG AGAGAGAGAG AGAGCGAGCT 60
TTCATCTGCT GGTTCACTCC CAATTGCAGA ACCAGCCAGA GACAGTATAG GCAGAATCCA 120
GGAGCTGGGA ACTCTGTCAG GGTCTCTCAT GTGGGTGGTA AGGGCCCAGA TACTTGGGCC 180
ATCTTCAGTT GCCTTCCCAC ACGCATTAGC AAGGAGCTGG ATCAGGACTT GAACCAGCAT 240
TCTGATATGG GATTCTGACA TTACACACAG CAGCTTAACC CACTATGCCA CAGTGGCGGC 300
CCTTGCCCTC ATCCTTAATA ACCTATATAC TACATCTGCC TGACTACTAT CAAAGTGGCA 360
GAACTTGGGA GCATCTCAAG ACATAGGAAT GGTGTAAGAA TTTTACTAGT GGGGCTGGTG 420
TTGTTGCACA GTGAGTTAAG CCGCTGCCTG CAATGCCGGA CTTCCCATAC GGGTGCCAGT 480
TCAAGTCCTG GCTGGCTCCA CTTCTGATCC AGCTCCCTAC TAATGCACCT GGGAAAGCAG 540
CAAAAGACAG TGCAAGTGCT TGGGCCGCTG TCACTCATGT AGGAGACCTG GGTGAAGCTn 600
CCTGGGCTCC TGGGGCnTTC AGCCTGGCCC AGTTCCTGG 639
(2) INFORMATION FOR SEQ ID NO: 268:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 550 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268:
TGCCTTTCAA ATAAATAAAA TAAACCTTGA AAAGAAAAAC TAATAGATTA TTAAAGGACG 60
AAAGAAGCTT GAATACAATT GCTTTTTAGC AAGTGTTAGT TATTTTTGTG AGGACACGAT 120
CCACCTCGGT GTCCTTGAGT GCAGCCTCTC AGGGCAGTCA CAAGTCTTCA GAATTGGTCT 180
TAAAAGTCCT TTAGGAAGCT GCAGCGTTGG AGGGGGTCCA AAATCTGACA CCTCCAGGTT 240
CTCCTCACTG ATGGGGAGAT CATGGCTTCT CAGCACAGCA CCAACTGGGG TTAGTGTTTT 300
ATTTTGTGTT TAGATTTATG TACTTGACAG GCAGAGGGAG AGAGAGAGAG GAGAGAGAGA 360
GAGAAAGAGA GGAAACCTTC ATCTGTTGAT TCACTCCCTA AATGTCCATA ACAACTGGGG 420
CTGGACCAGT CCCAACCCAG GAGCCAGAAA CTCTCTCTGG ATCTCCCATG TGGACTGTAG 480
GGACCCAAGC ACTTGGGCCA TCAACTCCTG CCTTCCAGAT ACATCAGCAG GAAGCTGAAT 540
CAAAAGTGCA 550
(2) INFORMATION FOR SEQ ID NO: 269:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:
GTTGGGGGAT CTGGGATGGA GCTCAAGAGT TCCTGGCTTC TGCTTGATTG AGCTCTGGCC 60
ATTTTGGGAA CGAATCAGCA GATGGAAGAT CTCTTGCTCA CTCTTTTATT TTTTCAAAGC 120
TTTATTTTAT TTATTTAAAT GGAAGAGTTA GAACGCTCTT CCATCCACTG GATTCACTCC 180
CCAAATGGCA GCAATGGCCA GCGCTGGGTC AGGCTGAAGC CAGGAATTTC TTCTGGATTT 240
CCCACATGGC TGCAGAGGTC CAAGGACTTG GGCCATTCTC CACTGCTTTC TTGGGCACAT 300
TAGCAGGGAG CTGGATCAGA ACTGGAGCAG CTGGGACTTG AACCAGTGCC ATATGGGATG 360
CGGGCACTGT AGGCAGCAGC TTTACCTGCT ATGCCACTGC GCTGGCCCCA TTCTCTCTTT 420
CTCTGTCTCT TTCATTCTCT CTCCCCATCC ACCATCTCTC TGTCACTTTG CCTTTGAATA 480
TATGAAAGTG ATTTTTAAAA ATnAAAGTAA TTCTTAATAA TATCAGGAAA TGAATTCTAT 540
A 541
(2) INFORMATION FOR SEQ ID NO: 270:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:
CGGTTTATTT AAAAGATATG TTAAGGGACT TAAAGGGAGA CTTAGACTCC AATACAATAG 60
TACTGGGGGA CTTCAATACT CCACTCTCAG AAATAGACAG ATCAACTGGA CAGAAGATCA 120
ACAAGGAGAC AGTAGATTTA AATGACACTA TAGCCCAAAT GGACCTAACA GATATCTACA 180
GAACTTTTCA TCTGACATTT AAAGATTTTA CATTCTTCTC AGCAGTGCAT GGAACCTTCT 240
CTAGGATTGA CCACATCCTA GGCCATAAAG CAAGTCTCAG CAAATTCAAA AGAATTAGAA 300
TCATACGATG CAGCTGCTCA GACCATAGCG GAATTAAGTT GGAAATTATC AACTCAGGGA 360
ATCCCTAAAG TACACAGAAA CACATGGAGA CTGGAACAAC ATGGCTCCTG GAATGAACAG 420
TGGGTCATAG GAAGAAATCA AAAGAGAAAT CAAAAACTTT CTGGAAGTAA AGGAGGGT 478
(2) INFORMATION FOR SEQ ID NO: 271:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 416 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:
GAATTGAGCn CGAAACAGGT GTAnCCATAA GTTTTGATAT GTTGTCATGT CACATCATTA 60
GTTTCCAGAA AATTTTTGAT TTCTATTTTG ATTTCTTCTA TTACCCAGTG TTCATTCAGG 120
AACTTGTTAT TCATGTGTTT GCATATGCTC TAGATATTCC CGAGTTGCTG ATTTCCAGCT 180
TTTTTCCACC ATGGTATGAG AAGCTGCATG GTATGATTCC AATTCTTTTG ACATTGTTGA 240
GACTTGCTTT ATGGCCTAGT ATGTGGTCAA TCCTAGAAAA AGTTCCATGT ACTGCTGAGA 300
AGAATCTGTA TTCTTCAAGT GTAGGAATAA AAGTTCTGTA GATATTAGAT CCATTGGGCT 360
ACAGTGnTGA TTAAATCCCT GnTTCCTGnT GGATTTCGTC GGGGACCGTC CATGCT 416
(2) INFORMATION FOR SEQ ID NO: 272:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:
TTGTTCTTAC TGAAGTGCAT CAGTGGCCTC TACCCTGCTT CTGCTGCCTT CCGCCCTCCC 60
TGGCCAACAA GCACTTGGAA GAGGAGCCCT GGATGAACGT TGTAAACAGA CCTCAAGACT 120
CCCTTGGGAT GCTGGCATCC CAGTTCCTAG CGCCTGGGTT CCAGTCCTCT GCTTTCCTGT 180
CCCTGCCGGC TAAGTGGGAG AACTGGATTG GGAGCAGGTG CAAGCATCTG GGGAGTGAAC 240
CAGCAGAGAG GGGACCACGT TTACTCCTTT TCTCTCTGCC TCTCAGATAC ATGCAGATAT 300
ATACAAGTTT AAAAGGAATG CTTCGGTTTT TGGCAAATTA TTTGTTAATA AAAATTTAAA 360
TATTTCTGTC ACTTTTTAAA CATTTATTTA TTATTTGAGA GAGTTACGAG AGGGAGAGAC 420
AGAGAGAGGT CTCTATCTGC GGTTCACTCC CCAGGTGGGC TGCAAGGGCC AGGG 474
(2) INFORMATION FOR SEQ ID NO: 273:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 427 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:
CGGACTGTGC CTGCATGGCC TGCATGTGAA CAGACTTGCC ATTGGCTGGG AGCGTCCTGC 60
TAGGGGAACC TAGGGTATAT AATGGGGATA GGATTGTGTA GGGGGATGAG GGCTTGGCTT 120
CTTTTCTTTC GCCTTGCATC TGGATGAATA AAGTTCCATG AGAACCGAGT AAGCAGCGAT 180
CGTGTCGTTA CTATGCTGGA CTCTCGCGGG CAAGCGTCCG GCAGCCCCTA GACTCAAATG 240
TTGTCAGAAG TTTGAGAACT GCCATCCTAA AGAATTTCTG ATGGGGCCAG TACGGTGGCA 300
TAGCAGGTTA ACACAGTGTC TGCAGTGCTG GCATTCCATA TAGGCGCTGG TTCGAGTCCT 360
GGCTGCTCCA CTTCCAATCC AGCTCTCTGG CTAATGGCCT GGGAAAGCAG CAGAGGGATG 420
GCCCAGG 427
(2) INFORMATION FOR SEQ ID NO: 274:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:
CATCACCACC ACAAATGAAA ACTTAAGCAT GCACTGGTGA TCCACAGGAG AATGGAATTC 60
CTCTCTCCTA TAGGCCTGAG CACCACCCTG CAGGAGGGGC TTTAAAAGGC AGATGCAGCA 120
CATCTTTGAA TACCTTGTTT CTACTTGTGG TTGTGTGATT TTCTTTTTTT TTAAGATTTT 180
CTTTATTTTA CTTGGGAGGT AGAGTTTGAG ACAGTAAGAG GGAGAGACGG AGAGAAATGT 240
CTTCCCTATG TTGGTTCATT CCACAAATGG CTGCAATAGC TAGAGCTGCA CCAATCTGAA 300
GCCAGGAGAC AGGCACCTCT TTCTGGTCTT CCAAATGAGG GCAGAGGCCC AAGGATCCAG 360
GCCATCCTCC ACTGCTTTCC CAGGCCATAG CAGAGAGCTG GATTGGAAGT GGAGCAGCCG 420
GGGCTAGAAT CAGCACCCAT ATGTGAACCT GTGCTGCAGG CAGAGGATTA ACCTACTGCA 480
CCACTGTGCT GGGCCTGGCT GTGTGATTTT CAGCATTTAG CATTGGAGCA TGGGTGTGTC 540
ATGGTTGGCC ATTCCCTAAC ACCATGGCTG ATCTCTACTC AGGTTTGCAG TTTTAGGCAA 600
CATTTTGAGC TGCACA 616
(2) INFORMATION FOR SEQ ID NO: 275:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275:
TGTGATAnTC TGnACTTTCA AATAAATAAA TAAATGAATC TTTTTTTTTA AAAAAGTGGA 60
TTAGGGTACC TTTTCCCCAC AGCCTTGCCA TCATTTATTG TTTTTTTGAT TTATATATGA 120
TAGCCATTCT AACTAGGGGG AGGTGAAACC TCATTGTGGA TTTTATTTGC ATTTATTGAT 180
GGCTAGTGAT CCTGAGCAGA TTTTAAATTT CACCCTTTGA AAAATGCCTG CTCATGTCCT 240
TTTTCCTTTG CCTATTTCTT AACTGGATCG TTTGTTGCTG CTGAGTTTCT TGACTCTTTA 300
TAGATTCTTG ACATCAATCC TTTATCAGTT GCATAGTTTG AAAACATTTT CTTTTGTTAT 360
GTCAGTTGCC TCTTCAGTTT GTTGGGTGAT CCTTTACAGT GCAGAATCTT CTTAACTTGA 420
TGTAATTCCA TGGTCTATTT TTGCCTTTAn TGCCAGTGTT ATGGGGTnTT TCCAAGAAGT 480
CTTT 484
(2) INFORMATION FOR SEQ ID NO: 276:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 586 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:
CAGTGTTGCC TTTCAACTTT CCATCTTGTC CGTGTGAGGG GGTCCCTCAC AGGGGACGGT 60
GGGCCCAGCC TGTCGTGTGC TCACTTGTCT GTGACATCAC GGATGGGAAC TGnGTGAAAT 120
TTCAGCTCCC CAGATCTGAA ATTTCACAGA GCAGCCTTTG TCACCTTGTT AGAGAATGTT 180
TTTCCATCTC AACTCAGCGG TAAACGGATC ATTTATACGC ACCTGTTTTC ATCTGGATGG 240
TGAATTTACC TGGTGGGTGG AAATTGGATG TGAGATTCAC ACAGCCTCAG AGCGTCTGGT 300
CAGAGGTACT TACTGAGGCT GGCGCCATGG TGCAGTGGGC TAAGGCACCG CCTGTGACAC 360
CGGCATGGCT GCTCCATTTG CAGTCCAGCT CCCTGCATAA AGCTCCTGGC TCCTGGCTTC 420
AGCCTGGCCC AGCCTTGGTC ACTGTGGCTA TTTGGGGAAT GAACCAGCAA ATGGAAAGAG 480
TCAGTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTGTA ATTCTGCCTT 540
TCAAAAAAAA TTTAGGATGT ATGTATGTAT GTATTTATnT ATnTAT 586
(2) INFORMATION FOR SEQ ID NO: 277:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 327 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:
TCACTTTAAC ACTTAAAATG CTATTTTCCC TCTTAGCACA AGAGGACTTG GGGTGTCATG 60
TCAAATTTTT AAACTATACC CTTAGAAATA AATCTGTGGT AATGTATACA GAATTATGCA 120
GCTTTGCAGT TACAAACTTC ATACACTTCA TAATTATGAC TTTAGGAACA TGGTGATTCT 180
TTCCACTCTG CCTGTCCTGC CACCCACATC CCCACCCCTC TTCCTCCTCC CTCTCTTATT 240
CCCTCTTTTA TTTTTGACTA GGATATATTT TAATTTAACT TTATACATAT ATGATTAACT 300
CTATGTTAAG AGAAGAGTTC AGCAAAT 327
(2) INFORMATION FOR SEQ ID NO: 278:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 481 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:
TTCTAAAAAG ATTTTCTACT TCTACTTATA TACAAAAAAA TAAAATATTT CTTTATCTAA 60
TTGACACAAC CCTCTGTCCT GCTTTTTTGT TTAAAAGATT TATTTATTTA TTTGAAAGAG 120
AGTTAGAGAA AGGTAGACAG AGAGAGAGAG AGAATCACCA GGAACTTTTT CCAGGTCTCC 180
TGCTAGAGAG TAGGGGTCCA AGGACTTGGG CCATCTTACA CTGCTTTCCC AGACCAAAAG 240
GAGACAGATG GGCTGGAAGT AGAGCAGCCA GATCTTGAAC TGGCACCCAT ATGAGATGCT 300
GGCACTGCAG GCTGTGGCTT TGCCCGCTAA GCCACAGTTC AAGCCCCAAT ATGTCCGGCT 360
TTTTCGGATC ATATCAGTCT ATGACGTGCC ACTTATATTA CTATTAATCA ATGGCACCTC 420
TTACTCTGAA ATGTGATTAT CTTGTATGAT AAATTATACA TAAGTTCTTA AAATAAGTGT 480
C 481
(2) INFORMATION FOR SEQ ID NO: 279:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 481 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:
GCGTGAGGAG CAAGATCAGC CTGGAAAACG TGTGACAGCG TACCACTTTG AAACAGCTGA 60
GTATAAAGCA ATAGAATTTC CTTGGAGCTA CACATGCCAA ACACTAAAAA AGCAAGTTCC 120
CCAGGCCAAC TGTTGCCTTC TTCTTCTTCA TTTTTTTTTT TAAAGATTTA TTTATTTGGA 180
AGGCAGAGTT ACGAGAGAGA CGGAAAAATA GAAAGAGATC TTCCATCTGC TGGTTCACTC 240
CTCAAATGGC CATTAACAGC CAGAGCTGGG CCAGGTTGAA GCTGGGAGCC AAGAGGTCCA 300
TCCCAGTCTC CCCCATGGGT GCAGGGGCCA AACACTTGGG CTATCCTCCT CTGCTTTTCC 360
CAGGCCCTTT AGCAGGGAGC TGGATCAGAA TTGGGGCAnC CGGGAnTTAA ACCCAGGCCC 420
ATGTGGGATG CCGGTGCTGT AGGTGGATGG CTAACTCACT GCACCACAAT GCCAGCCCCA 480
A 481
(2) INFORMATION FOR SEQ ID NO: 280:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 455 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:
AACAGACATT CTTCGCAGAT CTAGAAAAAA TGATGCTGAA ATTCATATGG AGGCACAAGA 60
GACCTCGAAT AGCTAAAGCA ATCTTGTACA ACAAAAACAA AGCCGGAGnA TCACAATACC 120
AGACTTCAGG ACGTACTACA GGGCAGTTGT AATCAAAACA GCATGGTACT GGTACAGAAA 180
CAGATGGATA GACCAATGGA ACAGAATTGA AACACCAGAA ATCAACCCAA ACATCTACAG 240
CCAACTTATA TTTGATCAAG GATCTAAAAC TAATTCCTGG AGCAAGGACA GTCTATTCAA 300
TAAATGGTGC TGGGAAAACT GGATTTCCAC GTGCAGAATC ATGAAGCAAG ACCCCTACCT 360
TACACCTTAC ACAAAAATCC ACTCAACGTG GATTAAAGAC CTAAATCTTC GTCCTGACAC 420
CATTAAGGTT ATTAGAGGAA CATTGGGnGA AAnCC 455
(2) INFORMATION FOR SEQ ID NO: 281:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:
GGCACTGGGG AGTGAAATAG CACGTGGGCA CTGTCTTTCT GTCTCTCAAA CATGTGAACA 60
CATTTTTTTA AAGAAGATGA CCAAAAATAG GTAAAGTCCT CTGTCGGACG TGTTCACACC 120
ATTTAATTCA GTCTCTCGAT TGTTTTTACA AAAATAAAAG CCTTTTTTGT TTTACAACAG 180
TTTTTTTTTT TAAGGATGTA TTTATTTGAA AGACAATTAG AGGGAGGTCT TCCATCTGCT 240
GGTTCAGTCC CTAGATGGCC ACAGCGGCCA GGGCTGGGCC AGGCCAAAGC CAGGAACCGG 300
GAGCTTCTTT TGGGTCTCTC AAATGTGTGG CAGGGCCAAG CAGTTGGGCC GTCTCCACTG 360
CTCTCCCAGG CCGCTAGCAG GGAGCTGGGT CGGAAGCGGA CTGACGTTGC TGGCCTCGGC 420
CTACCTGCTG GCACCGTAAG CTGGCTCCAG GACAGTTTGA TGGAGGTGCA GTCCAGCACA 480
CTGTGTGTGT GTAAAAGTCA CACTTCCAGC ATACA 515
(2) INFORMATION FOR SEQ ID NO: 282:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 585 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:
GGAATGCCAG CCCGCCACTG CCGTCCGTCC TGGGTTGTTT CATACCCTAG CTCTTCTCCA 60
CGTCGTGAAA GCCCAGGCGA GGAGACCTGG TGACTGTCCC GGGCACAGCA CTCGGTGAGG 120
CGTTCCAAGG GCATCCCAGG GTGCAGGCGT GGGCTGCCAC GTTCTCGCCC CACTCCACGT 180
GACCGCTTTT GGCCCTCAGG GACAGGGCGA GGATGTTGGC TGGCCCTGGC CGCCCTTCAT 240
AGGGGTGCTC CTCAGAACCT GAGGGAGAAA TTCTTTTTCT CTGAGATTTA TTTATTTATT 300
TGAAAGAGAC AGAGATCTTT CATCTACTGG TTTACTCCCC AAATGGCCTC AACAGTCAGG 360
GCTGGGCCAG GCCAAAGTCA GGAGCCAGGA ACTCCATCCA GGTCTCCCAC ACAGATGGCA 420
GGGACCAAAG TACTTGGGCC ATCCTCTGCT GCCTTCCCAG GCGCATTAGC GTGGAGCTGG 480
ATCAGAAGCA GGAAAGCCGG GATTCAGCTG GCCTCCAACG TGGGATGTGG GACAGAGCCC 540
ACCCCTGGCA GGTTTTTTTT TTTnnGnnTT TTTTTTATAT TTnAT 585
(2) INFORMATION FOR SEQ ID NO: 283:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 636 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283:
ACTCTGCCTG TCAAAAAAAA AAAAAAATCT TAAGGGGGCT CAGCTCAGCT GTGACTCCGT 60
GATCCGCTAA TATGAGGTCT ATTTCCAGTC CCTCCTCCTC CATTTCCAAT CCAGCTCCTT 120
GCTAATTCAC CTGGAAAGCC ACGGGAAGAT TGACCCACTA CTTGGACCAG CTGGGAGATG 180
TGGAACAGAT GAAGCTCCCG GCACCTGCCT CCTGGATTTG CCCTGGCCAC CCAGAGCCCA 240
TCCAGCAGAT GGAAGATCTC TCTTCCTCTC CCAACTCTCA GCCAGTCACC CCACAACTCT 300
TTCACATAAA TAAGAGTACA TTAAATTTAA AAGAGAATTG GCCAACTAAG TCTCTGAAGG 360
TGGGGGGGTT GGGCTGACCC TGGGGCATAG TAGGTTAAGC ATCTATCTGT GGCTCCAGTT 420
TGAGATGGAA GGACTCTCTG TAACTCTGAC TCTCCAGAAA AAAAAAAAAA GATAAAAATC 480
TTAAAAAGAA TTATTATTAT TATTATTATT ATTATTATTA TGGGGCCTGT GCTGTGGTGT 540
AGCAGGTAAA ACTGCCACAA GCAGTGCCGG CATCCCATAT GGGCTCCCAT TCGGAATCCC 600
ATCCCCAGCT GGGCTGCTCT CTGCTAnGGT CTGGCA 636
(2) INFORMATION FOR SEQ ID NO: 284:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 656 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284:
TAGAGATTTA GTAATCTTCT GAACTCTTTG CTTTCAACCA TTTTCAGTTG CCAAAGATCT 60
TTTCCTTATT AATACTTCAG CTTCCTTAAC CTTCTACTTT TCAACTTAAG GGTACTTGCT 120
TGAGATGGTC TGAAACACTG TGTTTGAAAG GGAAGGCATC CAGAATTAAA TTTTAGTTAC 180
TTAAAAATTT ATTTCTCAAA GTCTCTCTCA ATCTTGTCCC TTACTCTTTG ACATCCAGAA 240
ATGAGTAGTT TTCCAACAAT GGTACATCCA ATATCTGGAC TCCCTTTATT GTCTTTTATT 300
GGTTGCTAAC AATATGTGGA TGATTGCTTA TGCTCATCTA TTTTTTTAGA ATATTCATTT 360
ATnATTTAAA AGACAGAGGT GCAGAGAGGC AGAGGGCAGA GAGAAAGAGA AAAGGGGGGA 420
GGGTTCTTCA TCTGCTGGTn CACTCCCCAG ATGTCCGCAA TGGCCAGAGC GTGCCAATCC 480
GAAGCCAGGA GCCAGGATCT CTCGGGnCTC CCAnGTTGAT GCAGAnGCCC AAGAACTGGG 540
GTAnCTTCTA CAGATTCCTC AGGCCTAGAA GACAGCGGGA TGAGAAnTGG AGCAGCCGGG 600
ACTAGAACCA GCGCCCAAAT GGGGATTCTG GGCACTGCAG GCGGCAGCTT TACCTG 656
(2) INFORMATION FOR SEQ ID NO: 285:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285:
AGCTGAATCT TCAAATTCAC AGAAATTATG ACTTACAAAG TCATCACACA CTGAATTAGC 60
AAATACAGAG ACTCTGTCCC TAGTGGGAAT ACAGGATCAG GTGCCTATGA GCCTCTGGTC 120
AAAATATTCT CAAAAACAAT CAGCAAAGAT CCTGTCTAAA GAAACCTCAT TCAATATACT 180
GTGTTGCATC ACTGAACTCA TAGCCAGTAG CTCCTTTTTA AATTTATTTT TTAGATTTAT 240
TTATTTGAAA GTCAGAGTTA CAAAGAGAGA GGGAGACATA GAGATCTTCC ATCTGCCAGT 300
TCACTCCTCC AATAGCCACA ACAGCCAGGG TTGGGCCAGT CTGATGCCAG GAGCCTAGAA 360
CTCCATTCAA GCCTCCCACA TGGGTACAGG AGCCCAAGGA CTTGAGCCAT CCTCTGCTGC 420
CTCCCAGGTT CC 432
(2) INFORMATION FOR SEQ ID NO: 286:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 572 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286:
TGATGCCTGC ATTATAATTA AATAATGCCT ATCACCGTTT CTTAAGCAAC AAATGATCAC 60
AGATTCCCTC TTTGCTGCTA AGTAATGAGC ATGGTGATTC TTCCCTCCGT AACCACCTCC 120
CCACCTACAC TCCCTCCCCT CTTCCTCCTT CCTCTCCTAT TGACATGGAG TGCCATACAT 180
AGCTAGGGTG AACATTTATC CCGCTTGGCC CTAGGTGTTC ATACACTTCA ACTGTGTGTA 240
ATTATTCATA TGCCCTTTTT ATGACCAAGC TTATGTCAGA ACGACAAATT ATATATTCAC 300
CTAATACTGA GAATCATCAA ATTGATTTCT ATTTTCTTCT CTCTTAGGCA GTTCTTGAGG 360
GGCAGAAATA TTTATTTTGT ATATCATGTA TTTATTCAAC TAAAAGACTG ACTGTCAAGA 420
TGCTGAGTTT TAGTGTCAAA GCAAATATGT TTCTAGACCC CATGAAGTTT ACAGTCTGGC 480
AAGAAAGAGC CATCTACCCC AAGAATATTG TTTAACATAG TAAATGCTGG TGTGAAGTCA 540
GTGGATAAAA ATTGAAGCAT TCTTAATCTA GC 572
(2) INFORMATION FOR SEQ ID NO: 287:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 573 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287:
AAAAGGAATA AAAGAAAAAA CCAACACACA CACACACACA CACACACACA ACTGTTGAAG 60
GAAGCTCTCT TTGCTCTCAC TCTCACTGTG CCTTTCAAAT AGATGTGAGA AAAAATGTGA 120
AAGAGGGAAA CAGAAAAGTA GGACACAGAG ATATGACACG AGATGGACTG GACCTGCCAC 180
TGCTGTCTCT GAAGATGAAG GCAAGGGCCA TATGGTGGCA TCTCAAAGCT GGGAACTGCC 240
CTCAGCTGAC AGCCTATAGG AAACAGGGAC TTGAAACGAA AACTGATTTT GGCACTTGTA 300
AGTTAATTAT TACTGTTAAC AACTACCTAA AAATATGAAG GGGCTGCATT GTGTGAAGCA 360
GGTAAAACCA CTGCTTGAGA TGCTGACACT TCATATAGGA GTGCTGGTTC AAGTCCCTGC 420
TACTCCACTT CCnATCTAAC ACCTTGCTAA TGCACCTGGG AAAGTAGTGG ATGATGGCTT 480
AAGAACTTGG GCCCCTGCAA CCCATGGAAG AGACCAGGAT GCnGTTCCAG TTTCTTGGCT 540
TTGTCCTGAC CCAGnCCCAG CCATTGTGGC CAT 573
(2) INFORMATION FOR SEQ ID NO: 288:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288:
CAGTTTTGGA TAAGACAGCA ATGTGTTAAG CATATTCCTT TGACTCAGCA GCACTTGCCA 60
CTCAATTTTC TCAACAGTTT GTGTAATAAT ACAATTATTG TTGTTATTAA TATTTTTATT 120
AAATTCCACC AATAAATGCC TTAGTTGCAT CATCTTATTT CTTATTAAAA AGCTTTTTTT 180
AAAAAAAGTT TATTTATATT TATTTGAAAA GCACAGAGAG AGAGAGACAG AGAGATATTC 240
CATCAGTTGA CTCACTCCTC AAATGCCTGC AACAGCTGGG GCTGGGCCAG GACAAAGCCA 300
GGAACCAGGA GTCAGGAACT TCATGTGGCA GAGACCCATG TCCTTGGGTG ATCATTAGCT 360
ACTTTCCAGT GTGCACTTTA ACAGGAAGTT GGATCAAAAG TAGAGCCAAG GTTTGAGCCT 420
GGCACTTGAT CTGGGATACA GGTATCCCAG GCAGCGACTT GGACCACTCC TCCAAATGAC 480
CATACCCCAT TTGTACCAAC TCAGAGTGGT TCATGAATAG GCTGAAAGAG AGAACAGCTG 540
TTTAGTGCGT TACATATTTT CATCAAGGAA TACTTTTTTT GGTCTCTTGG GATGAACATG 600
GAA 603
(2) INFORMATION FOR SEQ ID NO: 289:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 614 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:
CAATGGGTnC AACCCAAGCC AGACCCCTGA AGGCCACTCA GAACAACAGG CTTGGGAGCC 60
AGTGTTGTGG TACAGCAGGT TAAGCTGCCG CCTATGATGC CAGCATCCCA TGTGAGTCCC 120
TGTTCAAGTC CCGGTGCTCC ACGTCCAATT CAGCTGCCTG CTAATGCACA AAAGAAAGTA 180
GCAGCAGATG GTCCGTGTGC TTGGGTCCCT GCACCCACAT GTGAGACCCA GACGGAGTTC 240
CAGGTTCCTG GGTTCAACCC AGCCTAGCCT CAGCTAGGTA GCCATGTGGG GAAGTGAACC 300
AGTAGATGGA AACCAGCTCG CTCTCTGTCT TTCCCTTTCT CTAATTCTGT CTTTCAAAGA 360
AATAAACAAA CAAAATTGGA AAAACACCTC ACTGGACTGA TAGCACCTGA GTCCCTCCCT 420
CTGCAGCCCA TGCCCCAGTG TACACTCCTC AAGACACCGG CATGCACCTG CCTAGCCCCT 480
CCACAAGGCA TTATCAGGGC CCCTCCCAGT GTTTCCTCTT CGATCCCAAG GnTCGAAGGG 540
ATTGAATCCC CAGGATTTGn GCCGGCGCTG CGGCTCACTA GGnTAATCCT CCGCCTAnAG 600
GCGCCGGGGA AAAn 614
(2) INFORMATION FOR SEQ ID NO: 290:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 504 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:
CCAGACCGGA AGGAAGGAGT ACGAATGTTA TTGAGTGTGG AAACGATAGT GCAGATTTAC 60
TTCAAAATAC ACCAGCAGGG GCAGGTGCTG TGGTGCGGCA GGTTAGGCCA CAGCTTGAGA 120
AACCCACATC CCGTATCAGA GTGCTGGTTC GGATCCCGGC TGCTCCAGCA TTCCTGCTAC 180
TGCGCCTGGG AGGCAGCGGA TCAGGACCCA GGTACTCGGG TCCTGCCACC CACGTGGGAG 240
ACCCAGATGC AGCTCTTGGC TCCTGACGTA GCCTGGCCTA GTCCTTCTGT TACAGGCATT 300
TGAGAAACAA ACCAGCAGAT GGCAGATCTC TCTCTCTGGC CTCTCTCCCT CTCACTCTCT 360
CTGTTGTTCT GCCTTTCCAG TAAATAAATA AATCTTTAAA AAGATAAATA AAGTACCCCA 420
GTAAAAGCAA CCACATAACT TCTACACAGA AATGAGCAGA TGTATCAGGA CATTAAAGAA 480
AGCATAAATG AATAGAAAGA TGTA 504
(2) INFORMATION FOR SEQ ID NO: 291:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:
TTTTTGAGAA CTTCTTAATG TTCTTATCAA TTTTTTGAGA TCTGCTTCTT GCATTTCTTC 60
TATGTCATCA TCTTCATAAT CTTGAATTGG GGTGTCTTTT TCATTTGAGG GCGTCATGGT 120
GACTTCCTTG TTTTTATTAC CTCGGTTTTT GCGTTTGTTA TTTGGCATAT TGGAGATATT 180
TGGTTTCTTC ACTGTGGTGC TTTTTCTTGT TATACTATGA CTCTAGATTA AGTGGACTAT 240
CTGTTTTTGA TGGAGCCTTA GGGCTTGAGA TGGGTGTGGC CTGAGAGCTC TGTTTGGTGT 300
GCCAAAGGTG ACACTCCCAG GTTAGGCGTG GTAAATCTCT CTCTCTCTCT CTTTTTTTTT 360
TTTGATTCAA AAGGGAAGTA ATTCCGCACA GCTGAACGAA GTGGAGGTAG TTAG 414
(2) INFORMATION FOR SEQ ID NO: 292:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292:
CTGATTTGAG TATTGGTCAT ATTCCTGAAA GACACAATCC TGAAAGCCAT CTACCCAAAC 60
ATTAGAATTC CAAAAGATCA AAATCTACTA TCTAAAGAAT CACAACCCGA AAATTAAAAT 120
CCTTAATTTT GAAAACCTGA AAGCCAAAAT CCCCAAAACT TATTGCCTGT TTTTTTTTTT 180
TTAAGAGAAT CACTTTCTAA AACTTATTAA GTATTTATTT GAAAGGCAGA ATGACCAAGA 240
GAGAAGGAAA AACAGAGAGA ATGAGCTTCC ATCTGCTGGT TCACTTCTCA CATGGTCACA 300
ACAGGGAAGG GCTGGGCCAA AGAAAGGAAC TCCATCCAAG TCTCCCACAT GGGTGGACAG 360
GGACCCAATT ACCTGGAATC ATTTCCTGGC TGGCCTTCCA GGTGGCATTA GCAGGAAACT 420
GGATAGGAAG TGGAGTAGCT GAGACTCAAC CAGTGGCTCC GATATATGAT GGCGGTGTCA 480
CAGGGCAGCT TAACCTGCTG TACCACAACA CCTGAACCAG TAAATGTCAT GTTTTTTAAA 540
AAGTTATATA AGAGGAGCTA AGTCAAATCC GGGGATAGG 579
(2) INFORMATION FOR SEQ ID NO: 293:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 583 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:
CAGCTTATTT TGATGATTTT TGATGGAGAC TACATTTAAG TCTGTAAAAT GTAAAAACGC 60
TTGTGTTCCT TAAAGGACAT TTATTGTTCT CAAGATATGG TTAAAAACAC ACATTTAGGA 120
CTTGAGAATC CCACAGTAGT CTGACATTTT ATTTCCACTC TAACACAATC ACCCGCCATG 180
CAGGCTCAGA ACTTTCTTTT TTGTTGTTGT GGCAGCCTTC TCTCTCTCTC ATATATATTT 240
CCTCTCTCTC TCTCTCTCTC TTTnGnTCTC TCTCTCTCTC TCTCTTTTAG AGAAAGAGTT 300
TATTGGGGAA ACCTGACAGG CTGGAGGGAA GGGGCAATGA ATGAAAAGAG GCAGTATGAA 360
AGCATTAGGG AGAGGCAGAG ACAGAGACAG AGGTCAGGGA GATGnGAGAT GTGGGATAGG 420
GATGGAGATG GTCCCGATGA GAGCAATGGA GACAGAAAGA GAGAGACATG TTCAGGAACA 480
GGTCCTTTTA AAACTTTGCC CAGGGGCAGG GGAGGGGAAG TAGGAACAGG GGAATCCCAT 540
TAGGAAGGGG GTGGAGCTTG ACACTGGTGG TTGGGCCATG TGG 583
(2) INFORMATION FOR SEQ ID NO: 294:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:
TCnTAAGAAG GAAACTTTAA GCCTTTGATA AATGAAAAAC AACCTCTTGT TCTCAGAGAT 60
AAATCATTAC AGTGAGAACT TTCTCTCCAA CTTTTAAAAA AAGCAGAATG CTAATCTTTG 120
GCTTTGCATG CTAGGGTTTC CCTACACTGC ACGCACTCCA GTTCTCTGCC ATTTTGGTGG 180
AGGAACAGCT CCTGTGGGTA AGTGATTTAC CTTCCAGCCC TTCCCCTGCG GGACCCCAAG 240
ATCTTTAACC ATTTAAGCAA TGGCATTTAA GTCTTAAGAA CCCCCAGAAG CATGCCCCCT 300
GGCTTTAGAC ATTCTCAAGT TTAAAAAAAA AAAAAAAAAG GCCAGCTTGG GTATCAACAA 360
AGCAGAGAAA GAGTAAGCCA GATAGTTCAG TTCTGATAAG GTTGGAGATC TGCAGCTCAG 420
GTAAGCTGAA CTGTAAACAA GACCCCTTCT TGCTTCTGCA GGGCCCTGGC TGCTGCACCC 480
ACG 483
(2) INFORMATION FOR SEQ ID NO: 295:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 513 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:
ATCCTACTAA ATTACCTTTG ATATACTTTT CAATTGACCT CTCTGCTTTT TGTTTTTCTT 60
ACCTTTGTGT AAATCCTCTT AGTTCATCCA AACCTTTTTC TGATTGCTCA GGTTCTCTTA 120
CTCCAATAAC TAGGGAAATC ATGTGGCTGG TAGTTCTTTC AACACACCAC ACAAAAGAAT 180
GCTCAGTAAA TGCTTACTAA ATAATGAATG AAATGCCTTA CAAAGCTAGG GTGAACATTT 240
AACCCGCTTG GCCATGAATG ATTTCTGTTT ATACATTTCA ACCCTGTGTC ATTATTAATA 300
TGTCTCTTTT ATGATCAAAT TTATCTCTGA GCAACAAATT ATACAGTCAT CTAATTGTGG 360
CAGTTATCAA ATATATATCT ATTTCATTGT CTCTTCAACT AGGTAGTTCT GGGATCAGAA 420
ATGTATTTGT TTAATTTTAT AGCCTCATTA TTCATCAACT AAAAGAATGA CTATCAGGAT 480
GCTGAGTTTT AAGTCAAGCA AACAGTTTCT GCC 513
(2) INFORMATION FOR SEQ ID NO: 296:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:
TAAAGAAATC TTAAAAAATA TGCCAAAAGT TCACATAATC AGCTCTGGGC AAGACAGGAA 60
GGTAGATAAA AAAATATGAG ATTATCATTC ATTTTAGAAA GGGATTGATG GTGTCAAAAG 120
TTAACTGTGG GGCCTGCACT GTGAGACAGC AAAGTTAAAG CCACAGCCTA CAGTGCTGGC 180
ATCCCACGTG CACACTGATT CAAAGACTGG CTGCTCCACT TCTGATCCAG CTCCCTGCTA 240
AAGCACCCAG AAAAGCAGTG GAATATGGCC CAAGTCCTTG GCTCCCTGCA CCCATGTGAG 300
ATACCTGAAT GAAGCTCCTG CATCATGGCT CCATGCTTCC AGCATGTAAA AAGAAACTAG 360
CTGAAGGCAG AACATGAGTC AAGATGTTAC AGGATTGGGG ATAATCTGAA AAGAGAAAGG 420
TAGTGTGATT CTGTATCCTT GCTTCCTCCT GCAGCTCTGA ATCTGTCATC AGGGATTAAT 480
CACACCTTAA CCATGTACAG CCATTCAGAG CCATGACATA TATTCTGGGA ACGTGCAGGG 540
GTCTGATCCA CCTGACTCAA GCAGCCCCTT CTTACTAATA GGAGTTCTAC CAGTCTAATT 600
TTGGGCAGGG AAGGnA 616
(2) INFORMATION FOR SEQ ID NO: 297:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 402 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297:
TTTCTGTGTG TAnTTTTTCA AATGAATAAA TAAATCTTTA AAAAAAAAAG GCAGCAGTCT 60
GCCATAATGG TGGATTATAC TTTGTAATCT ATCTGGAATT TGCTTTGATA AAAGATATGA 120
AGTAGTGCTT TAACATTTTT TTAACTTTTT TAAAGATTTT TTTTTATTTG AAAGACAAAG 180
TACAGAGAGA CAGAGAGACA AAGACCATCC ATCCACTAGT CTACTCCTCA AATGGCTGCA 240
ACAGTCAGGG CTGGCCCAGG CCAAAACCAG GTACTTGGAA CTCCGGGCAC AGTCTCCCAC 300
GTGGGTGCAG GGCCACAAGC ACTTGGGACC ATCTGCCGCT GCTTTCTCAG GTTCATTAGG 360
GAGCTGGATT GGAAGTGGAA CAGTGAGGAC TCAACCTGGG CA 402
(2) INFORMATION FOR SEQ ID NO: 298:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 465 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:
GTGGCAATGA TTAGAGTAGG TGGGTCCTTG CCACTGATGT GGGAGACCAG ATAGAGTTCC 60
TGGATTCCTG GATTTGGCCT GCACCAGCCC TGGCTGTTGT GTGCATCTGC AGAGTGAACC 120
AAAGGATGGG AGTTCTCTCC CTATTTCTCT GTCTCTCTTT CAAAAAATAA AAATAAAAAA 180
ATAAGAACAG AAAACCAATT GTGTGGGATA AACCCTGATT CCTCTCCTTT CTATAACAAA 240
CACCAGGAAG CAGGATCCAC TTCCCAGTTC CTTAAAAATG TGAATTCAGC CAGTAATGGG 300
GCTATTGAAA CTTCTGAGAA TAATTAGATT GAACTTTTCA TAAAGATAAT CAGAAAGTGG 360
AGGGAGAAGG GAGCAACATA ATTGAAAGTA GTGACGAAAA ACAC CAAGT GTCTTATAAA 420
ACAATAAAAT GTTTATGTTT TATTTTTATA CCATCTTTCC AGCAG 465
(2) INFORMATION FOR SEQ ID NO: 299:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 431 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299:
GCTGTCTTCT GTTACTTTCT CAGGCACATT TGCAAGGAGC TGGATTGGAA GCTGGGCAGC 60
CAGGACTTGA ACTGACCTTC TGACATGGGA TGATGGAGTC ACAAGTGTTT ACTGAACGTG 120
TTGTGCCACA ACACAGCACC CTTTTGAGAT AAGTGTATAA TTTGCTTAGG TTTTACAAGT 180
ACTGGGTTAA GTCCATTGCC CGAGTAATAT GGCTAATAAT GGCAGAACCT CTCTGTCTAT 240
GGCCAAATGC CATGCATCAG TCACTTCCAC GTAGGCATAT GTTATCAGGA GTGGTCAAAC 300
GGCTACGTCG GTGAGCTTTT TTCTTCCAAA ATAGGAAACT TTTGTGCTAA TCAGCAGAAA 360
GCACTGATTT AAGGAAGCAG TGATCTCTAT CAGCTCCAAA AGCATCTTCC ATAGATGTCC 420
ATTAGTTTCA C 431
(2) INFORMATION FOR SEQ ID NO: 300:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:
TGACCTTGCA CTCACTTGGG AGACCCGGAT GGAGTTCCAG GCTCCTGACT TGAGGCCCGC 60
CAGGCCTAGC TTCAGTGGTT GCGAGAATTT GGGGAGTGCA CGAGTGTGGA AGATCTTTCT 120
GTCTCTCTCT CTCTCTGCCA CTCTCTTTCA AATAAATAAA TAAATAAATC TTAAAAAAGG 180
AAAGAAAAAT GACTCCAACT GCTGTGGTGG nGAGGGGGGG AAGGTCCCCA GCTGTATTTA 240
CTCCAGGTTC TTATCTCCAT GTGAGATAAA ATTCAAAGTG GAGTCATATA TTGAATGCAA 300
AAATGAAAGG AGGATTTATT TAGAGAGAGA ACATTTGAAA GTTAAACATG GGTAGCTCTG 360
TGAGGAGAGG CACACACTAT AAGGAGCAGC ATTTGTAGCC CAGTTCAGTG GTCTCTTTTA 420
TTGGATAGGG GT 432
(2) INFORMATION FOR SEQ ID NO: 301:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 566 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:
CAAACAGAGT AAATTTACTT CTTTGATTCA CTATGATATT TTTGGAAGAC AATGAATTAA 60
AGACAACAGA ATGGGAAATG TTTACAATCT TCCAAAACAC GGCGCCATTT AATATCTATC 120
TAATTTTTCT AGAGTGAAAA GGAATCATAT AAATATTAAA TGGAACCACA CGAAAGTGCC 180
AATATTTGAC TTTTTTTACT GTTAAGGGAC AGGGTTTCAT ATGGCTCACT TTAATACATT 240
CAGAGAAAAT GTTAACAAGA AAACCCTCTG AAGGCACGCA TCAGAAAACG CTGATCGGAC 300
GAAAGCAATT CTAGACTTGG CACCCTTAAT GACTGCACAG TAATGGCCTG GTATTATAAA 360
GCCCCAAGCC CCACTCTGTC ATAAAACATG ATTTCCCTTT TAATGTTCAT TACTTAGAGA 420
CACTGACAAA AAAATCTGAA GATAAATTTA AGCTCATAAA TACCATGnAC ACAACATACT 480
GTAGCAGAAA TAGATAATTT GCTTCACTAA ATTAAGTAAA TAACACAGAG TAGGCTATAC 540
CAAAGAATTT TGGCAAATTA TGGATA 566
(2) INFORMATION FOR SEQ ID NO: 302:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302:
TTCCATGGTA CAATTTCATA TCATCTGCAA ATATGGATAT TTTGACTTCC TCTTTTCCAT 60
TTTTGATGCC CCTTATTTAT TGTTCCTCCC TAATTTCACT TGTTGATTCT TTCACTTGTT 120
GATTAAAAGT GGGGAAAGTG AACATCCTTG CCTTATTCCA GATCAGAAGG GAAATGCTTT 180
CAGCTTTCCC CATTCCATAT AATATTGGTC ATTGGTTTGT CATAATTAGC CTTTATAATT 240
CTGAGGTATT TTCTTTCAGT GCCTAGTTTG TGGAAGTTTA TATTTTATCC ATGAAAATGT 300
nTAATCTTGT CCAATGGCTT TCTCACACTT ATTGAGATGA CCATACTCTT TTATTTCTTT 360
ATTAAATTGA GAAGTATGGA TTAGCTCTTC CAGAAATGTT TTCTAGAATC ATnTGTAATG 420
TCATTAGGTC CTGGACTTTT CCTTAATGGA AGACTTAATT ACTGCGTTAA TTTCATTGGC 480
CTGTnATGGG TTGGTTTGAG GGTGT 505
(2) INFORMATION FOR SEQ ID NO: 303:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 528 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303:
AGAAGGGACT GCAGTGGGAT GGACCAAATC AGCACAGACA ACATCTTCAA CAGTGCTTTG 60
CnTnTAGTAG GGGTGAAAAT TGCCATTGAC TGATTAATGC TGTCTTGAGA ATACCGGATC 120
ACAGAATGCA AAGTCACTTG GGTTCTTCTG ATTCCTATGT GGATTGCAAA TCATCTCTGA 180
GCTGCAGGAA AACTGAATTT CATTAAACAG TTCCCTACGC TGCAGCTTTC ACAGTAATTT 240
GAATGTTTCC AGTTATTTTT CAATAGAAAT TATTCTTTTC TCCGGGGACA AAACAGGCTT 300
CATCCTGTCT ATTTCTTGGC TGGTCTAAGA ATCTAGGTCA TATATCATTT GGAGATTTCA 360
CTCGGGCTAG AAGGGAAGGG ATAGTGGTTA GCTATTGCAA CAATCTCTAC CTGGACCATG 420
GAGGAACACT GGATTAAGCA CAATTCTTGT CTTTGGGAAG TATGGCATAA CCACTACTTA 480
AAGGGCAACA GACTCTCACG AGTTCCTGGC ATTTTTCCTC CATGTAAG 528
(2) INFORMATION FOR SEQ ID NO: 304:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 561 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:
ATTCTGAATC CCATTTGCTA GGCCTGTAAT GTCAATTTCA CAATGAGTCA GCCCTCCATG 60
TCCACCTCCT GCTAAATCAG ACCTCCTCAT CTATTAATCC GTGGTAGAAA GCTCTCCCAG 120
CGTACTCCAG AGCTGGCTAT GAACTGA AC CCGCACCGTT TGGCTTCCAG AATTGCAGGA 180
CTCCAGAGTG CAAGACTTAG GCAGACACAG TCATTGTCAG AGCCTTGAAA GGAAGTGACA 240
GCCAGACGGA GGAGGAGTCA CAGCAGAAGT GGTCATCATT GCTAGTACCA AGGCTCCTGT 300
CTGAACCCTG CCTGCCATGT CACTAGCAAA AGATGTGAGG ATCAGTCTCC TATTTTTAGA 360
ATCTCAGCAT CGCATCGTGT TGAGGAATTC GCAGGCTAAA CTACTAGCCA AGCTGGGATT 420
CCTCCTGGAT GTCCCTGGAT CATGCTATCA GACTTGTCCT TTCTCCTTCC CTCTCCCTCC 480
AGCCTACATT TCAGCACACT CCTGTTCTAC CCTGGCTGTT CCATACTCCT CTAATCTAGT 540
CTGCTCTTTT TCTCTGTCCA T 561
(2) INFORMATION FOR SEQ ID NO: 305:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 524 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305:
ACGTCGTGGA GCACnTGGTT GTCAAGGTTT CCTCACTTGG GGTTTTCTTC GTCATTTCAT 60
CAACGTGTGC TGGTCCTAGG AGAGACAATT GGAAGTGTGT GCTCTCCAAT CGGTGGGTGA 120
GAGCAGTGTC ATCTTCGCAG GCTGATGAAC CACCTGGACC AGGCAAAGCC TGAACGGAAG 180
AAAATGGGGG CGATGCCAGC CAAGCCCATT TCTAAGTCAG CCTCTTCTGA ACCAAGCACT 240
CTGGCTCCTT CCTTCCTCTT TTCCTCCTCC CTCTTGCATC CTCTTTCCCC ATCCCTTCTT 300
ATTTTTTGCC TCCTCTCTGT CCTCCGGTTC TTTATTTCCT TCTCTCTTTG TTTGTCTCCC 360
TTATTCTTGG ATTCTTTGTT CCTTTTGTGG CAAGGCAGAA CTAAAATAAG GTAGTTCTCA 420
AAATTTCTGT TTTTGGGGAT TTGTCCCAAG GCGTAATAAA AATAAGCACA TTAACTTGTG 480
ATATCAGATC GGAAGTCTAA TACATGAGCT GGGTTTGAAG TTTG 524
(2) INFORMATION FOR SEQ ID NO: 306:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 563 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 306:
TAGCATTCTA CCCAGACACC CTTGCTTTTC TGCCAGACAT AATCTACATG AATCAATCCC 60
AGGTCTTCCA ATACAATGTA GCTTGCTACT TCAGGAGCTG GATGGTGTAC AATTGTTCAC 120
AGATGCTCCT AATTGCACCT GGATCCAATA ATAACTAAAT ATTGTTTAAA GTTTTGGCAG 180
CAACTATTTC AGGAGATATG TAGGATGAAT GGTTGAAAGT AACACTCCAG GCATAAACCT 240
GCATGAATTA GAACCTGTAG GAAACATAAA CACAATTCAG TAGGTGGCAC TATTCCCCCA 300
GTCAGAACCT TCTTCATTCT GCATGACTTG GCAATGACCT TTATTAATTT ATTCCATGGA 360
ATTCTGTGAC CTCAATTGAT ACCAATATAC TCCAAATACA TATTCTGGAT TCTTTGAGAT 420
ACCACAGCCC AGGTAAGAAA CAATCTATAG TAGCAGGTAT AAAAATACAA AATCATTGGT 480
TTCATCTGAA GTCCTAAGGT AATCAGATCA GCAGGTTACT TGAAGTTAAT TAATTTCCAT 540
CCCCATCAAA CAACCCAATT GGA 563
(2) INFORMATION FOR SEQ ID NO: 307:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:
ATTTCAGCTC TCATATCTTA CTTCTCAAAC TTTAGCTAGA TTCCAAAAGA ACTACTCTTA 60
TTGGCAACAA AAAGAACTAT AATTCCATTC ACAGAAACTC CATGTTAGAA ATGTATTCAA 120
TCCAATCCCC ATTACTAAAA TAACATGTTT CATTATTTTT GTTCCATTTA CTTTAATAAT 180
TGTGTAAGAA TTAAAAATCC TATGTCACTC AGAAAACTCA TATTTCACCC TTAAAATGGC 240
CATTTCTAAA TGGACTAATT AAAAAATTTA AAAAATTAAC TCCTCAACTT TAGAACATTG 300
GAAAATGAAG AATATGTCTT CTGGGAATTT AGAAATTGTT ATTGCAATCT TATTAATCCC 360
CTTATTTTTT GAGTGTATAA TGTGCTCACA GTCATCTAAT CCACACTAAT CAATCAAAAA 420
GCTTTTATTA CTGAATGTTG CAGTGTTCAG CAATGTCCTA GACACTGGCA GGCTTAAAAA 480
TTCTAGATTA AGGCCGGCGC CGCGGCTCAC TAGGCTA 517
(2) INFORMATION FOR SEQ ID NO: 308:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:
CATTGCTAAA CCTTAATCAT GGTATCATTC AAATAGTAAT TTCTTAGTCT TTCACTTGGC 60
CCTTCCTTTT GTTTTACATT AACGGTTTTT GTCTCTCTCT CTCTCTTTTT TTTTTTTTTA 120
ATAATCCATG CGCCCCAATT AAGTCCAACA TCGCTGATTT GGGGCAATAC AGAATCCACT 180
GCACCATCTA ATCTGAGTAT AATTAATCAC TGCATTTCGC ATACCAATGC CCAATCTGTA 240
CCCACAATCT TCCTCCCCTA AACTGTCGCT CCTGCTGGAT GCTCTCTAGA CACACCACTG 300
GCCCTCACTT GATATTCCAC ATCCCCTCCT GTCTGGCAAC CCCGAAATCA AACTTTCTCA 360
TCTCCATTGG AAAAGTAATT GCTCTTAGTA ATTAATCTTG TTATTTTCTT GCTTAAATAC 420
TTGAGTGAAT TCCAAGAATG TATAAGTAAA ATTCAATTTC CATAATACTG TACTnTA 477
(2) INFORMATION FOR SEQ ID NO: 309:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 647 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:
TGTAAAGTTT AGGACAGTAA GTGTAAAGTC ATCCCTGGAG GATAAAGTTG AGTTTCTGTA 60
TTTTGCTCAG GAGACAGCGC TGTTACCCCG GAAGGTCACA GCCTCTGTAT TCCTGTGTGG 120
TAGAGGTTCT GCTGTTTGTG GACACTTGAG CTACTTTTCC CAGCAAACGG AGCAGGACAT 180
AAAACCTTTC CCCTTTCTGG GTTGTGTAGG CAGAAACAGA TGGCTAAAGA ACATCTTAAG 240
ATTAGACACA GGAAAATGCA GGTCCCCTTG AATAGAAAGG AGTCAGTCCA TGCTGCTAAA 300
TGAGGCATCC AGGCATTTTG TAGAGCATAA CAGAGAGGAA GAAACCTAGG TTAATAGTTA 360
CTTTGATTTA AATGAAACTG TCCCTAAGGC GGCTTTGAAA TGAGACGGAA GAAGGACTAT 420
GACACCAGAA TGTATGTTCC CAGCACAGAA TAGTGTGAGC TGAGGGACAT GGCAGCAAGT 480
GCCGAGGCTC AGAACACAGA GATTCTCCTT GCTGTAGTCT AGAGCAGGGA TCTCATCAGC 540
TTCTCATCAG GTACAAATTG GAAAGAAACA CAACTTACTT TATATATATn AGCCnCATAT 600
ATATGTGTAT ACACACACAC ACACACACTG AGGGCTGAAA TTAATGC 647
(2) INFORMATION FOR SEQ ID NO: 310:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310:
AAAAAGAAGA GTATTCAATG GGACTGGGCC CCCAGGTAAT CACAGCAGTA ACCAGCTCAA 60
TCATACTCCC TTGGGTTGCA CTTTCTTCTT TCCATTCTAC ACTTTTCATT GTCTCACTAT 120
TGTCCCCCTG CCTTCCCTTG CCAAAATAAA CTCAAAACTC AGAAAATATT AAGAGCTTAT 180
CTTAGAGTGT ATGCAACTTT GAGTTGTCCA TTAGATGTCC AAGTGGAGAT GGCAAGGAGG 240
CAAGATGACC TAGAGTCTGG AATTGAACAC ATAATGTGCA GAAGAGACCA GGTTGCTTGT 300
CTTCAAAGAT CACTGTTTAA GGATGTGTCA GACACTCCCA CCTTCAGAGC CTCCCCCGAT 360
ACAATGCATC TCCATCTCCT GTTGCTTTGG TGAACACTCC TTTCTGGTTC TCAAGAGGAG 420
CCAAAGAAGA GGCCATGCTC ATGGAAGGCA CAAAAACCTG TCCTGCCATT CTGATGATT 479
(2) INFORMATION FOR SEQ ID NO: 311:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 646 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:
GGTCTAGTTC CAATATGGGC ATTTGAGGAA CTGCATCAGA TCAGAGGAGG ACATTTTACA 60
GGGCAGGAGG GCCTTTTGCA GCAATGTGGA CACTACAGAT ACTATCTGGT AGAATGCTCA 120
GGAGAATGGG CTCTGTGTGC CATATACCAT TTGAAACCCC TGCTGAAACC TTTCAGTTGT 180
GGAACATTTC TTTCATTTTA GGTAGAATAT GTAGAATTTT AAGTGTCAAA AATCCACTGA 240
TGAAGAAAAT GTTATTCAAC TTTTCCCCTT CCTGTTTCTT GTTTTCCCCT TCATGCTTCT 300
CATGAACAGT GGTCTCAGCA TTTCTGTATG GAAAACCTAG AGAGTTTTCC AGTGTTTTTC 360
CTTCTAACGG TCTCAAGTGA AATTTCAGGG ATATATTATC AAAGTGCAAG TTGGTTGGCA 420
TGTACTTTCT TGGCATGTCT TAAAGATAAT ACAATTCTAA CGGACAGTCT TCCAACCTAA 480
GGGATGGATA TGTAAGTTTC TCATCCCTAG GGCAAGATGG CTTATCTCTC AAGCCAACTC 540
CCAGTGGATT TTTGGTTATT TTTTGTGnTT CTGTTGTGAA CTGTGAAGTC ATTGCTACCT 600
CTGTAGGAAA GAGGACCGTG AGTGGCTAnG TAGGGTCAGC TGGAGA 646
(2) INFORMATION FOR SEQ ID NO: 312:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 312:
ATAGAATTGA GTACAACTTA AGCATGTGAA CTCTGAAAAA AGACTTGCCT AGGTCCAAAT 60
TGTAACTCTT CCATTTACTG GGCATGGGAT CTTTGGCAAG CTACCTGAGT GTAATGGGCC 120
TTCATCTGCC CCTCTGTCAA TAGGAGAAAA TAATGGTGCC AATTTAATGA ACAGCTATGG 180
TTGTTAAATG AAATGATGCA CATACAACAG CTCACAATAT GAACCTCGTG AGTGATTATT 240
CATTATTATT ATTTCCAAAG AAAACTTAAC AAGTTAATGG GACCAAATAC TTCTGTCTGG 300
GGTGCAGGTG GGGGCAAAAT ATCTAAATTC TGATTTTAGG AGAATTTGAT CCAAAGTTTT 360
ATAGTTTCCT GTTTTTAAGT CCTTGGTGCC ACTGTCTCCT CTGGTACAAA TTCAAGTGAT 420
CATGTCACCT CCCAAGAAAC TTCCTGCTGG AAAAAAAATG TTTTCTAGGG GAAGAA 476
(2) INFORMATION FOR SEQ ID NO: 313:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:
TGCCCCnATC ACGCATCCGC TCAAAGAAAA GCACATACCG AATACACCGC TCCCCCGCCA 60
ATTGCTGCAC CGCAAGTGCC ACACCAATTA TTAGCAAGAG AGTCCAACAC ACCGCCCTCA 120
CTACACCGTA CAATGCTGCC CCATCCCCTA TCATAAGGGT CCGGTAAATC CCCCCGAACC 180
TTCAAAATAC GTGATAAAAG AAACAAGCCC ATTGTAGATT CCCTGCGCGC ACCTTTTCAA 240
GTAGTCCGCA TCGTTGAGCA ACCGCGCCTC TACCGGATTA CTCACAAAGC CTAATTCCAC 300
CAAAACACTT GGCATCTTCG CGTTCCGTAC TACAAACCAG GCCTCCTCTT TTACTCCACG 360
ATTTTTACTT TGTGCACCGA CGCTTGCTTG CATTCCGTCA GCGATACTGC GCGCAATCAT 420
AATACTTTCC ATTGTGAATT CCTCTTCGAG CATCGAGTTC AAGATCGGGA GCACCT 476
(2) INFORMATION FOR SEQ ID NO: 314:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 507 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:
CTTACTCACC ATCCCTTGAA GGGCAGGCGA GGGATTGGAA AGGGGGGAGA AACAGGCAGG 60
GAGCAAAGGA GGCAGTGCAT ACTGTTGAAG ACTGGAAGGT GTACACAGAC ATAGAGGCTG 120
CATGGAAAGT CCATAGGAAT AAAGGAAAAC CCCAATCATT GGCTTACAGA AACGGGCAAA 180
AGGGACAATG TGCAAAGACA GACAAGTCCC TGTCACTTTT CTTCTGAGAC TCAGGAGAAT 240
GCCTGAAGCC CAACACATGT ATACATTGTT TCATATTTCA GTGAGAGAGA ATATTAATAG 300
TACCCATTTA CCAAGTACCT TCTGGATGCC ATCTGATCCT TATGGTCAGT GTTACCATCT 360
CATCTTACTC AGAGGACTCT GAGGCTCAGA AAGTTAAATA ACTGTCTAGA GTCACACAGC 420
TGTAnGGACA AAGCCCCTAT CTATACTCAG TCCATCTTCC AGTTTATTTC TCTGGCTGTC 480
CACTAGCATC TATTTCTAAA ACAGCAA 507
(2) INFORMATION FOR SEQ ID NO: 315:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 512 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:
AACCACTACT GTATATTTTA AAATCAAACC AAACTGATAC ATAGTTATCT GTTGCTATCC 60
AAACCTATTT GGCACCCAAG TTTGCACAGC AAAGGCTGGT AAGGGGTATG CATTCCTCAA 120
GGCTTTTTGA TTCCTGAATT TTGAACAGCA AGTGGTGAGA GTTTTGGAAA GCAAGAAATT 180
CACAATTCCT TTGTTACCTA ATTTTGCACT GGAAGCGTTC TGGTAGTGAG GTAGCTCAAA 240
TTGCAGTTTT GAGGCTGCAC TTGGATGCCT CATTTATGAC AATTACTTAA AGTGATTAGA 300
CTGGGTGCCA GGAAAGGAAC TGAGCATTTA TGTGTGCTCT TTCCTTTGGT CCCCATAGGA 360
ATCTAGTTAG GCACCTGTCT TTGTTATCTA GATGAGGAAA GTAAAGGAGA GGTATGCTGA 420
CTTGCCTCAG TCACAAAGTT AGTGCCAGAT GGAGCCAATT CACCGACACA AATATGTGAC 480
TCCAAAGCCC ATGGATCGGT TTTGTCAATC TC 512
(2) INFORMATION FOR SEQ ID NO: 316:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 499 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316:
AAGATTGTCT AACTGTGCGC ACGTCTGAGG CATTGTGAGG GTAATGCCCA CGGGTCTGTC 60
CCACCTGAGG ATCCTCCCTC ACCGCCCCTC CCACCTACGC CTCCCCCCCA CCCCCGGGGC 120
TGATGGCAGC TTCCTTCCTT CCAAGCAGTA ACGACTCTGC CCTTTTGTGT TAGCCACATC 180
TTTAATTTTC TTTCTATTTT TGGCACCCAA GTATCTCTCC AAATAATATA ATTTAATTTT 240
GTTTCTGAAC GTGAAATACG TGAGATCATC CAGGATTCGC TTGTGGAGTT TGCCACGAAA 300
GGAATAAGCT CAGCTATAGG CAGTGTTCCT CACTCCTGAC TCTGCAGCGG CCGTTGGTGG 360
AGGCTGGTGC GTGTGCCCGG GCGGCCAGAG CCTCTCCACA GGGCACCACC TTTCCCCGCG 420
TGGCCGTTTC AGCTCTGCCC TGCAGTCCGT CTGTGGTCCG CATTGTCCGA GGTGACCGGT 480
CATCGTGGTT TACGTCGCA 499
(2) INFORMATION FOR SEQ ID NO: 317:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 527 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317:
CTGCATAAAA CTGCTGCTCT GAAGTATCTT GCAAGACAAA TGCTTTTACA AAATTTGCAC 60
CTACACATAA TTAATTTGGA TAAATTATGA AAAAATGGTT TCATGGAGGG GAAATAAGTC 120
ATTTCTACTT TGATTTTGTG TATCTTCTAT GAACCCCATA ATTGTCTCAA TTTACTGTAC 180
TAATTTTCTC TTTGCCCAGT CTTCAGGTTT TCTTTTGCTT TCATTTCAGA CTTAAGGTTT 240
ATGACATTTT CAGCACCACC AAGGTTTGAC CAGAGTTCTT GTAATAAGAA AAATCAACAG 300
CTGTGATGTA CATAGTATTA TGATTACATC TATGTCCAAA TTTTATTTTA AGAATTGTGT 360
TTGTTATTAA CAAAATAAAC TCGCAGGAAT GATGTCTGCT TATATGATTG ATTAGTTTCA 420
GTCCTAAAAT TATAAAGAAT GTGTTTAAAA ATAAAGATGT TTTATGAAGC TCTTTCTTCA 480
TTTGAAGAAG CAGGATTTTT CCTCCAGGTC TCAGTATTTC ATTGTGG 527
(2) INFORMATION FOR SEQ ID NO: 318:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318:
ATTATTATAA TTAGTATCCC ACTTAATAGG TTCCAAAATT CCTACTTTTT AATTAAATAT 60
ATAAAGTGAT CTGCATAATA TAGAAGTTTC AAATCCATTT CTCTAAATTA AAAGTCCAAG 120
TTGAAAAATC GGGTTTGGTT AAAAGCTGTG GAAGGAGAGA AATAAAATAA AAACCGTTAA 180
GAGTTAGCTT TTATTGAGTG ATGGATTTAG AATGATTTTC TTCCTCTGTG CTTTTCTGCT 240
TGTCAAATTT CTCTAAAATG AGTACTAAAA ATGTAAACAC AAACAATTTA AAAAGCTGTA 300
TGTCAGAAAT GTGAATGCTT AAGTAAGCTT TTAATGTTAA AAAATAAATA AATAAAGTCT 360
GAATTATACT ACTCCAGATG GCTTCAGCTG TGATTCGTGC ATAGCATTTG AAAGATCGTT 420
TTTTTACATA AAATACCCAT AACGCTAATG TACTAACACG GAGGTCCACC GGACTCCCGC 480
TGGGTTTCTG AAGGGAATGA AATCTAAGCC GTTAA 515
(2) INFORMATION FOR SEQ ID NO: 319:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 159 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:
GGTTTGGTGA GGTTATTGAG TGTGGGATGG GTGTAGGTGG GGTTTGGAGG GATGGTTAGA 60
GGTATAGGAA GAGTTTAGAG GGGGGnTGAG TGTATATGAT GAAGGGGGGG GTTTTATGAT 120
GTTGAGTGTT GATATATATT AGGTAGGGTT AnTTAGGGG 159
(2) INFORMATION FOR SEQ ID NO: 320:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 365 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320:
AGCACTTGCG CGCAACCCTG CAGCTTGCGC nTCAGTCCCG CGGCTCGGCT TCCGCGGCTC 60
GGCTTCCGCG GCTCGGTCCC GCGGCTCGGT CCTGCGGCTA GGCTTTCGCG CGCGGTGGGC 120
GACCTTGTTC TCCCAGTAGG TCCTCCGATT TACGCCCACT GGATCCAGAA GAGTTTCGTC 180
TGCAATATTT TCCTGGTTCT TTTTTCTGAG GCTACCGTAA CTCCCCTTTT ATTAAACTAA 240
ATTTTCCCGG ACTATCGGTG CGCGCCCTCA CTATTCCGCC ATCTTGGCTC CGCCCCCCAG 300
TAGTTTTTTT TAAGGTTTTT ATTTTTTATT GATTTGAAAG ACAGAGTTAC AGAGAGAGGT 360
AGAGA 365
(2) INFORMATION FOR SEQ ID NO: 321:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 373 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321:
AAAGAAGAAA ACACTGCAAA TAATGTTGAT GAGCTCCAGC TACTGGTAAA GATCTTAGGC 60
TTGACCAAAT GTGATTTGCT GTGTTTTCTG TTTTGCTTCA TTCATGCTGC TGTAACAAAG 120
GACTAAGATT GGGGGGTTTA TAAACAACAG AAGTTTCACT GTTTTTGAAG TCTCAGGTCA 180
GGTTGCCAGC AGGGCCGGGT TCTGGTGAGA GCTGTCTTCT TGACTGCTAA CTTCTTAATG 240
TGCCCTCACA TGGTAGAAAG AGCACTGAAG AGCTCCCTGG AGATTGTGTT TTAAGGGCAC 300
TAATCCCGTT CTTGAGGACT GTACCCTCAT TAGCCAGTAA CCTTCCAAAG GTACCACACC 360
AAAATATGTA TGA 373
(2) INFORMATION FOR SEQ ID NO: 322:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 464 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 322:
GCTAAAATCT CACATTTTTT TTTCTTTTGT TTAAAGGGAA ATCCTCACAT TCACCTGATA 60
CTCAGATGAA TGTGAAAGTT AAAAATCTTC TCATTCAAAA GAGCACACAG CACAGATTGA 120
GTAAGAATGA AAATGCCTGT GGAGACACTT CTGGGTGGTC CACATTGTAT ATTGCCTGTA 180
AGTTTCACTC TTTTCTTGTC AAAGACTTGA GATTCCATTC TAAAAATAAA TCAACTCAGT 240
TGGGTTAAAC AGTCCAAAGA AAACAAAAAA TTTATTATCA AAATAATAAT CTAGGCCTTG 300
AATATTTCTT TCTCAGACTG AACTGAGATA TTTCTAGAAT CAAAGCAGAT GTCACCCATC 360
CTGACTAAAC TACACTATTT GCTCTGGACT TATATGAGAT TTTTCCATTG GTGGCTGCAA 420
TTTCCAATCT CAGTGTAAAA TTACTTCGGA TAAAGACAAC AATT 464
(2) INFORMATION FOR SEQ ID NO: 323:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323:
TCATTTTTAT TAATCAGCTC CGCATGAAAA TAGGGATTAT GTTTGGGAAT CCTGAGACTA 60
CTACnGGnGG CATTGCACTT AAGTTTTATT CCTCCGTGAG ATAGAGGTTA GGAAGGTGGA 120
AACGCTTTCA AGAGGTGATG AGGAAGCGTG GGGCAATAAG GTAAGGATCC GGATAGTAAA 180
GAATAAGATG GCACCCCCCT TCCGCAAGTA GAAACGGAGA TTCTCTTTGG GAAGGGTTTT 240
TCTGCCTTTT CGTGTTTGCT GGATGCAGCG GTTAAGCAGG AAATTATCGA AAAAAAAGGG 300
GCGTGGTACG CGTACCGAGA AGAAAAGATC GGACAGGGGC GTGACAATGC CGTGGGCTTT 360
CTGCAGCAGA ATATGGACAT CACCTTGGAG ATCGAACGGG CAGTGCGTAC GAAGCTTTTT 420
CCTAAGCAGG CGTTTATATC CAGCTTTCAG GAACATCGTC CTGCTC 466
(2) INFORMATION FOR SEQ ID NO: 324:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 191 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:
CGGATACTGG GGAGCGGTGT GCGGAAATTT TTGTGTGGAC CGCGCGTATG TCCGCTGTAC 60
CGGGCCGTGT GGTTACCGTT TGTGTACGTG GGTGCCGTGG GGAGTTTGAG TAGCGTGTGG 120
AATATCTCGG ATGCGTTCAA TGGACTGATG GCGTTGCCGA ACCTGGTGGG GCTGTTATTT 180
TTGGCTCGTC A 191
(2) INFORMATION FOR SEQ ID NO: 325:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 631 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325:
CATGTAACTC TGCCTTTCAA ATAAATAAAT AAATATTTAA AAAAAAGAAA ATGTACATTA 60
ATATAAAACA ACTTACAATC TACACCTAAT TTGTTGATGG TAAATAGGCA ATCTTCAAAT 120
TTTCAAATTT TCCAGTTGTG AATTTGTCTG CTTGCTGAAC TCTATTTGCA ACTCCAAACT 180
CAGTGCTCAC AACACTCACA GGCACACGGA GAGTAGGAAA AAAAAAAGGC AAAATTCAGT 240
CGCAGAGGAG GCTGCACGAG GCCACACTCG GTCTCCTGAC GTTACAGCCG CGTCCTTCGG 300
GGGTCTCTGT GTGCTGTCGC TGGTGATCCA ATTGTTGCAA CCAGTGTTCC TGCTCTGTTG 360
CACACAGGGC TGTGGTCTCA CAGTCGGGTG TGGTCACAAG GTCTCCTTGG ACAGACACAC 420
GTCCAGCCAG GGCGCACAGC GGCGCAGACC TGACAAAGCC GTGTGTGCAG TGACTTGATA 480
AGCGTAACTC CCGCCACCGG CAAGAACTGA GAATTGAAGG GAGACAGACT GGAGAGACCA 540
ACAGGGAGCT GACCGAACGA GTGATATCCT CCATGCCAGA GCAAACGGAA GAGGACAAAG 600
CACAGGCCAG ACCGCCTTGT CCTAAnTGGA G 631
(2) INFORMATION FOR SEQ ID NO: 326:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 516 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326:
CGCAGCGCGG GnCGAAGGCA CAGTGAGAGG TTTCAGACAC GGATGCGCCG TGTGACGTCC 60
GCTGGGGAGC TGAAACATCG CGAATCCGTT CCTCCAGCGT TTTTAGGGGA ACACGACAGG 120
GCGTGGTGCC TGCCCCTACG TCCCCAAGCT GTGCCGCTGA GGTCAGGACT TCTTTCTCCG 180
GCTCTCGCGC ACCGGAGCAG CTGGGGACCG CCTGTCCCAG GCCTCCCTCC CCGCTGCTCT 240
CCCCAGCCCG GTCCGAGGAC TGACCTCCCA CCGGTGAGTG AGGGCGGTCA GGCGGGGCTC 300
GTCTCTCTGG GTCTCGCTCC TAACACAGCG ACAACCTCTT AGGGGAAGAA ACGCCTCTTG 360
TCAGGGTTCA GTGAGACTGG GGACCCCCAA TCCCCCAGAC CCCCGGTGCT TGCAGCCCAG 420
CCGTGGGCCC TGCAGATCCG CGGGACAnGC ACGTCCACAG TGCTTCTTTT CCGGAAATGC 480
TCCCTTCTGA GCCAGTGCTT CTGGTACAGT CAGAAT 516
(2) INFORMATION FOR SEQ ID NO: 327:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 534 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327:
ACTGTGCCAG AGCGCCGGCC CCCGAGTGGG TTCTGATTAA GATGTTTGAG GACCACAGTT 60
GGCCTTCCCT CTTGATGAGA GCCCAGAAAT TTCCCAGTGA GAGGGGTGTG GACTTAGGAG 120
ACACCCTGAA GTCCGGTGGC CTTGGTGACT TTGTCAGCGT CCGTGGTACT CAGCCGCTGA 180
GTCAGGACAC CCGGGATTGC TCCTACAGAG CTTGGATTTT CTTCCGGGCT GAGCACGTCG 240
GGACCATCAA AGCCAGCATC TCAGCCGTGG GTTGGTGGGC AGCGGCTCTG AGCAGATGAG 300
GCTCAGGTGG ATGGGTTCCT CTTTCTGCCA GGGAAAATGC CTCAGGACCA CTTCTCTGCT 360
CTCCTGGGAC AGGAAGGCCA CAGGCTCATT GCGATTTTTA CGGACAGCAA GTCATCCTGT 420
CCGTTGGCTG GGAGGACTCC ATCTTCATAA TTCTGGAGTC CTGAGCTTGA CGTGACTGCA 480
GCCATTTTGG ATGCACTTGC TGTGTTGCCC TTGAGTAACT CTGAGGTCCT CCCA 534
(2) INFORMATION FOR SEQ ID NO: 328:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328:
TGTATGTGCA TACGTGCTTT CCCCCTCTGT TGAATGCCTT CTCCTTCTGG TGAGCTTTAC 60
CTGCACAGCC TTCTCCTTCA GGGCCATCAG CTGCGCGTGG CTGAAACCCT GGACAGCCCC 120
CAGCAGCTCC ACGCTCCTGA TGAAAGTGTC TCCGTCCATG CAGAGCAAGT CCTCAGGGGC 180
CCAGCAGGCG TTGGCCTCGG CCAGCCTGAA GATGTCATTG GGAGAGGGAG CCACGACTCC 240
ATGGCAACCT GAAAAGTGAA TGGGGAGACG CAGGAGGGGA CGGGGGAAGC AGAACTCAGG 300
GCAGGTTAGC CCGATTCTCT CCTGGAATGA GGAATAACTC ACCCCAGTCA GATTTATTCA 360
CAAGATGCCT TTCTTGCTGG TGACCAGCTA CTAAAATGGT GATACCAGCT TCTTGCACAT 420
CCACATTGCC ACTGGACCAC ACAAGCCAGT GCAGTGAGAG CTGGTGGTGG ATACTCCTGA 480
CATGAGATTC AAACCCAAAG CCACTnGnC 509
(2) INFORMATION FOR SEQ ID NO: 329:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329:
TTATGATAGG ATTGTCATAA AATCATTAAA TAAATTATTT ACCCATTTCA GCAGTGTGAT 60
TTTTTTTTTC TGGCTCTTGT GCATTAGATA ATATGAATGG AAGGGAAGAC TGTTTTTCTT 120
ATTTAACTTT TTTTTTATTT AGGGTAAACA AATTTCATGT AATTCACATG TACAGATATC 180
AGAATATAGT GCTATTnCCC ACCCTACCCT CCCTTCAGCC CACAGTCCTA CTCTTTCTCC 240
TCCTTCCTCT CTTATTTTCA CTCTTAATTT TTATAATGAT CTACTTTTAG TTTACTTAAG 300
ATTAACCCTA TATAAAATGA GTTCAACAAA TAGTAGGAAT TAAAAAACAT TGTTCCTCAA 360
CAGTAGAGAC AAGGGCTGTA AACAATCATC AATGCTCAAA ATGTCAATTT CATTCCTATA 420
CATTTCATTT TTGATATTTT ATTAGTTACT GCCAATAAGG nAAAACATAT GGCATTTGGG 480
ACAAGCTATT CTACTAAGTA TAATGGTTTC CAGTTGTATC CATTTTGTTG 530
(2) INFORMATION FOR SEQ ID NO: 330:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 537 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:
AAGCAAAAAC CGTGTTCATG GATTGGAAGA AATTAATATC ATCAAAATGT CCATACTAAT 60
GAAAGCAATT TGCAGATTCA GTGTGATCCC AAACAAAATA CTGACATTCT TCTCAGATTT 120
AGAAAAATGA CAATAAAATT CATATGGGTA CACAAGGGAC ACAGAATAGC TAAAGCAATC 180
TTAAACACAA AGCTGGAGGC ATCACAACAC CATACTTCAA GACATACTAC ATACAGTTAT 240
AATGAAAACC TGAAATTGGC ACAAAAATAG AGACCTGTAG ACCAACTGAA CAGAATAGAA 300
ACTCCTGAAA TCAATACACA CATCTATGCC AACTAATTTT TACAAAGGAT CTAAAACCAA 360
TCCCTAGATA ATTGACAGTC TCTTCAACAA ATGGTGCTGG GAAAATCAGA TCTCCTGTGC 420
AGAATTATGA AAGAAGACCA CTAGCTTACA ACTTATACAA AAATCTAAAA TGGATCATGA 480
CCTAAACCTA TGACTGGTAC CATCAAATTA CTAGAGGGAG ACATGAACAT GGGGAAA 537
(2) INFORMATION FOR SEQ ID NO: 331:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331:
TGCACAGATC AAGAAACCAA ACTTTAAATT ACAGTAATTC ATCCCATATC ACACAGTAGC 60
AGAATTAAAA GTGGAGTGGA TTCCCAGCAG ATTGTAAATT CTTTTGGGAA TGCCACCTTT 120
TTTTATTTTG CTTTTTAAAG TGAACTGGTC AGGGATGGTG TTTAGCAAAG CAACTGTAAC 180
ACTGCTAAGA TGCCTACTAC CTCTATCAGA GTGCCAACAT TCTAGAGTCT TAGCACTGCT 240
CCCAATTCAG CTTCCTACTA ATCAACACTT TCAGAGGCTA CAGGCGACAG CATAAGTATT 300
CGnGTCCCTA CCACCCACTT GGGAGACCTG GAGTCAGTAG TGTCTCCTGT CTTCAGCCTG 360
GCCCAGTGCT CACTGTTGCC GGTATTTGGA AGTAAATTAG CAGAAGGCAG ACCT 414
(2) INFORMATION FOR SEQ ID NO: 332:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332:
GAATCCCACA AATGAATTTG TGGATGAAAC AAAAACATAA nCTCTTGGCT TGCTGAATGT 60
ACCTCTATTC nCATCTCATT TGTGTTTACC TACTTAGATT TCATGAnTTT AAACTACATT 120
TCTAATAAGG GCTCATAAAT ATTCATACAT GTATTTTTTT TCCTCCAATA CATACTTTGA 180
CAATTGATTA TnTGTTACCT AGCTATGACA AGTTTTTGGC TCTTTATGGC CAGGCTCATT 240
TGGATGATAT CCTTCAGCTT GCTTAAGAGA ATTnTAACTT GA 282
(2) INFORMATION FOR SEQ ID NO: 333:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 583 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:
GTCTTTCCCT CTCCATCACT TTGTCACATA AATTTTTTTT AACTTTGGTA TAACAGGGAC 60
AGATGTGAAC TTTTTCAACC AGTTAAACAG CATTTATGAA AAACCTACAG TTAATGGCAT 120
ACTTACTGAT AAAAGACTAT ATACTTCCCA CCTGATATTT AGCCATGATG TAGCTATCTG 180
CTGTCAGCTT TTCTGCTGTT AATGTACTGG AGGTTTTAAC CCATGAAATA AGGCAATAAA 240
AAGACAAACA AAGCACTTGT AACAGCAAGG AAGCAGTAAA ATTCTCTTAA TTCACAGAGG 300
ATATGACAGT CTATAAAATT TCAAGAATTT TAACAAACTT CTTGAATTCT AGTAACAAAA 360
AGTTACTAGA GTTGAGTTTA ACACAGTTGC AGGTATTTTA AAAGCATAAT CTACTGTGTT 420
TCAATAAATT AGTGGCAAAT TGGTAAGAAA TATGAnTTTA AAAACAATAT AACTTAAAAT 480
ATCATCAGCA AATCCAAGAT ACTTAGGGnT AAATCTAAAA TAGTATACAC TGnAACTATA 540
TAAAATGCTA CTGAGATATT AAAGGAGAAT TAAATAAATA AGG 583
(2) INFORMATION FOR SEQ ID NO: 334:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 527 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334:
TGGnCAATAC AGCTATTTGG GGAGGGATTC AGTGGATGAA AAAGATCTCT CTCACTCTAA 60
CTTTCAAATA AATAAATATT TTTAAAAATC AAATAGAATT TCTTCATTCT TTTCAATTGT 120
TTAATTCCTA AAATTGCCTG CTTTCATGAT ATGATTGGTA ATGTAATTGA AGAAAAATCA 180
TAGGACTGTA CCTGCCCACA ACAATGTGCC TAGCCACCTA CCCATCCTGG CAAACTTGCC 240
ATGAAATCTC TGGCCCAGGG GAGTAAAGTG CTTATATTAA TAGCAATCAG ATTTCTTTTG 300
GAGATATTAC TCTTCAAATT CACAAAAATT ATTTGATGGG GAGGCATGGG ATGGAAGGAA 360
ATGATGTGGG AAGATAATCT CAATGAAGCT GATGTTCAGC TATACCAAAG TGTGATGGGG 420
ATCAAGTAGT GAATATGCCA GGGGGTAGCA CCCACCAGTG nCTCCATTTT GCCATTGGGG 480
nACCTTTTGG AGTAAGGAAG GAAGCCATGG TGGTGGGGAG CCAGGGG 527
(2) INFORMATION FOR SEQ ID NO: 335:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 584 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:
TTCACTGCTA GCATGATGCA GACTTACTGG AGGTATTAAT TTCATAACTG TTTGAAGACA 60
GAGTGCATTG ATTTCTATTT GTCAAGAATT TTTTCAGAAA GAATATTGAA TTCTATCAAC 120
TGCATATTTT TGGACTTATG GAATGATTAT ATGGATTTCC TCCTTTGCAG TTATATTATT 180
GGATTTGTTA ATGTAAAATC ATCTTTACAC TTTTAGAATA AAGTCTACTT GGCCAAGGAA 240
CACCCTTTTA AAGTGCTGCT AAGTATGTAT GCTAATATTT TATTTAAGTT TTAGAAAATG 300
CCAACGCCCA AAGTTAGCAC AGTGGCAGTG CCATGGCTCC GGACAGTTGG ATTATGTAGA 360
ATGGATCTTT TGCCAGTAGA TGGGATCATA GGAACAGGGC TCAGCCCCTC AGGCAGGGAA 420
CTGGTTAAAG CCACCCAGGG TGGATTTCTG TTTTCATGAA GATCTAGCAA ACTTCTCTCT 480
GTTTCTGGCA CTAAGTGAnA CTAGGAACTG TGGGACATTT TATATGAAAC AAATGTAAGA 540
AGACCAAAGG AGACAGAGAA GAGAGCAGCA GAGCCCTnGG AGCC 584
(2) INFORMATION FOR SEQ ID NO: 336:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 528 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336:
TAGAATGGAA GTAAAGATAA CCTGCTAGGA GGCTATTGCA GTGGTACAAG TAGATGACTA 60
TAGTGGAAAC AAGCAAGTTG GCCTCCAAGT TCTTCTGATC ATGTACTCCA TTTCTAAGAC 120
AAATTTGCTG TTTTACCTCC AAATTAAGTA CACTTATGTA TTCCTATTTA ACCAATACAC 180
ATGCTATATA TAAAATTAAA ATGTTGAGAT TTTAAAGGGA CAAAATAAAA ATGAAGTAAG 240
CTTTTATAGT CATTTTTAAA TTCCTCTTAT TACTGAAAAC AAAAGCATTC TTACACAAGA 300
AAATATAGTG CTTCAAAGGT CTGACACTAG ATGGACTTAC TCTGACATTT GGGTCCTTCT 360
GATGCCACAG TCACACAAAA GATACAATCA ACTACGTACC CAACTAAGCA CTAGCATATA 420
ATTTCTTTCT TTTATTGCAT TTCCACAACC AAATATTTGG GCACTCTTGG ACATAAAGGA 480
ATATTTTCTT CCnTTTGGGA TAAnCCTTTA CCAGGGAATT CCACCCCA 528
(2) INFORMATION FOR SEQ ID NO: 337:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 337:
CTTGGGAAAG ACTGGGGACT GGATAGGGGA TACTGTAAGA TCGCTGTGAG CTGCTTCCCT 60
AGGGGCCACC AAGCATCACG GTCTAGAGAG ACAGAGAAGG CCAGGCTTGG CCCGAACATT 120
TACAGGCCCA AAGATTGTTT TCTATTCCCA CCCCAGTCCT GTCCTACATC ACAGATTGGA 180
CACCCAGCTC CTATGTCCAA ACCTCATCTA TTCCTCCAAC AGCTGCCCTT AGCCACAGCT 240
CGGGTCTGGG CATTGGGTCT CTCTGGGAGG ATGGACTGGA GCAGGGGCCG GCGTGGGCCC 300
TGGAAGCGGT TTGGGGACTC TTGGACAGGG AATTCCAAGG TCGTGGGGAC CCACAGCTTC 360
TGTCTGGAAG ATGAGCTGTA GGCTGAGTGG AAAGTCCCTT GTCCCTGAGG GTTTCTTTGT 420
CCCTTGGGGA GTGGCATGGT GAGAAGGGGG CCAGAGCAGG AACCCAGTTA CCTTGAGCCT 480
CAGGGCAATC CCAGAAATGG GCTCCTTGAG CTGCACCTGA TGCTCTGACA TCAAAAGAAA 540
TACAAATAAG AGTGAACTCC AGGAGGGCAG GCCTCTGACA T 581
(2) INFORMATION FOR SEQ ID NO: 338:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 506 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 338:
TGGTTTAGTC TTTTCAACAA AAACAAAGTC AAAGGTTGAA GAGCATCGCC ATCACTAGCG 60
GTCAAGTTTC TCCGGTCGGA AAGGCCTGGA CAGGGATAGA TGGATAGGAC CTTCAGCTTC 120
ACCCTGGCTG GAGATGGCTG GGCATGGGCG AGCAGAACCA AACTGGCCGT TATTAGGCAG 180
CTGCTCTGTG CCTGGCCCTG GACGTGGCAG TGGAGATGAA CTGCAAAGAT GGGTAGCAGA 240
CACCAAGTTG TCAAAGAGTG GTACCAGGAC AACTCCCCAG GAAGTCAGGG AGTGAGTGGG 300
CGAGGGGCTT CCTGGAGGAG GTGGGGGAAC AGGGGATGGA GCCAGCCCCG ACGCAGAGAG 360
AAGGCCATCT GTGGAATGCA GAACCAAGTC CAGCACACCC TCCTGACTCC AAGGGAAGAA 420
GCTGGCAGGA GACAGGGTGA GAAGCAGGTT GGCTGGGACA ATGCAAAGTC TTCAAAGCAG 480
GCTTCAGACT TGAGGTCATA TTTTGG 506
(2) INFORMATION FOR SEQ ID NO: 339:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 634 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 339:
CGAAAAAGCT GATCTGAAAA GGCTACATAC TATATGATTT CAACACTGTA ACATTTTGGA 60
AAAGGCCAAT CTGTGGATAG AAGACAGGTT GTCAAGAGTT TGGGGAGAGA GGAAAGAACA 120
AGCAGAACAC AGAGGATTTT TAGGGCAGTG AAACTATTCT GATGAGAGTA TAATGGCAGA 180
CTTATGTCCT TACACATCTG CCAAAACCCC ACAGACTGCA CAATATCAAG AATGAATTCA 240
AATGTAGACC ATGGAAGATC ACGATCTATC AACGTAGGTT CCTCCACTGT AACACATCAC 300
TCTGTTAACA GGCTGTTGGC AGTAGGGGGA AGCATTTGGG TGTGAAGGCT ATGAGAATTC 360
TGCACTTTCT ACTCAATTTT ACTTGGAAAC TCAAACTGTT CTAAAAAATA TGGTCTATTA 420
AAAACAGTTT TTTAAGGGAC CAGCATGGTG ACATAGCAGG TAAAACCACA CATGTGATGC 480
CAGCAGAGGA TGGCCCAAGC ATTTGGGCCC CTACTACCAA TGCAGGAGAT nCAAACGAAG 540
CTCCTGGCTC CTAGCTTCGG AATGGCTCAG CGTTGGCCAC TGCGGTCATC TAGGGAAGTG 600
ACCAAGTAAA TGGGAGATCT CCCTGTGTnT CTCT 634
(2) INFORMATION FOR SEQ ID NO: 340:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 454 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 340:
ATTTGTGCAC ACAAGATGTC CGCTGCTCCT AAGGGTGCAA ATAATAGCTT TGCCAATATG 60
AAAGCAAAAC ATAATTACTC AACAATAAAC ACACTGTACA CTGGTAACAG CTGCTATCAA 120
CAGCATGTTG GAAGCCAAAA CAAAGGAAGT ATTACAGGAT TTTTTTTAAA CCCCTGCCTA 180
TCATTCTCAC GGAGGTAAAG TTAGTTTTTG TTTCTGTCAC AGCAAGTGTG ACTAAGCAAA 240
AAGTATTTTA CTCAGTTAAT ATTTCACTCT TGCTTTTAGG TCAGAAAAGA AGCTTGGCCT 300
CATTTTGTCT AGCCAGAAAG TGGGAGGATG AATTTTAAAG AATTACTTAG AAGATCTTTA 360
AGAAAAATCT GTTATATTAA GCATGTAGGG ATTTnATACT TTTCTACCTG GAATATTGCA 420
GAACTACCTT TGATAACTGC TTTACTCTGG CTGT 454
(2) INFORMATION FOR SEQ ID NO: 341:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341:
TGAGTGATGG CAGGAGCGGC CCTCTCTCCC TCCCATCCCC CAGATGTCAT CTGTCTACAC 60
TGCGGACACC GAGnCAGGCT GCGCTGGCCA CGTGCGCAGG ACAGAGCACT CAACAGCTGC 120
CGTGGGCTGA GGGAGGCTCT GCTTCTCTCC ATACGGGGTC TTTCCTCATT CCTTCACCCA 180
ATGGGGCTAT GCCAATGAGA GTGATGGTGC CCCCAGCGGA CCTGGGACAA CGAGGGGTTG 240
GTAGACACAA CGGGGCTCCA TACACCAGAC CCCCACTTCT ACCCTGCCTG GTGTCAATCT 300
CAAAATTCAA AATTCTCCCC AAGAAGAGAA AGAGTAGGAA AAAGCAGCAA AACAAGTACT 360
TCCACTTGTC AGCATCCCTG AGTAGACAGT GCTGCCGTCA TCAACCAACA CAGCCAAGGT 420
CGGTCCGGTC AGAGAAGGGC TCCCCGCAGG 450
(2) INFORMATION FOR SEQ ID NO: 342:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 555 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 342:
CTGnAGCTGT GGCGCCACTT AAGCCCAACT TTTAATCAAT AAGATTATGT CACTTGCCCA 60
GAAAAACAAT ACAAACAGAA GTGTTACTCT AGGATAATGT CACAATACTC TCTAATAAAC 120
CTTCCATTCA AGGTTCTTTG CACAGGATGA AGTCACACAA CTGAGTGTTT TCAAAATATA 180
AGCTACTTGA CTTTATGCCT ATTTGCTAAT GTGTGTTTGT GTGTTAAAGT CTAATCCTAG 240
AATTGCTCTC ACTTGACAAC AGGTTTGGGT AGAGGGGAAA AGGAAAGAAC TTTAATCCTC 300
AATTGTTTCA AATTTTACTT ACATCTAATG AAAGAAAGTA ATGTATGTGA CTATCAAAAC 360
TGGTTTAACT GTTTACAGTA TGCTTAGGCC ACAACATCAT TTATTCCCTT GTTATATCnC 420
TTCACATTAA AAAGCTCCAT TTTTCCAACT TCGTGTTAGG CAAACTGCTC CACTATTATG 480
ATAACAACAG TnATCATAGT TAGGATAACA AGAGTTATAA TTTTGATATG ATAnGGATAA 540
CAACAGTTAT AATTT 555
(2) INFORMATION FOR SEQ ID NO: 343:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 343:
AAATACTGGG ATTCATGAGT TTGATTAATG AAGAAAGATT ATTAATTTTG TGGTTTCCAA 60
AAATATGTAA GTAGAAACAA GGTCATTTAT TGTACTCAAA TACAAGAATC CCTATGATTG 120
AATGACACAG ATCAGAAATA AGCCTAGGAG ACATACTAAT AACTGTCTCT GCATCTGAAT 180
AAATGAACTG TTAGTAATTA GAATTCATTA ACCACATAGG CTAATGGATG GAATAACACA 240
ATTTTACTGC TGTTTTAAAG TTTTTGGTCC TCATTTATTC AAATGGCTCT ACTAAGGACA 300
ACACTGATGG AGTACTTTGA CCTTTTGTAC ATCACTTCTT TCCAAGGTGA AATTTCACTT 360
GTTCTCTTTC TCATTAAGGT TCTGCTTGAA AATGATTCAT GCTTGGCCGG ACGCTGACAG 420
GTCACTAGGG CTAATCCTCC GCCTGCGGCT CCGGCACCCC AGGTTC 466
(2) INFORMATION FOR SEQ ID NO: 344:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 465 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 344:
AATCTTATTT GTATACAACC AGCATAATTT TGTTATTGAA ATTTGAAAGA AAAGGAAAAT 60
AGCAGGTCAG TCTACTTATA TGAACATAGA TGAAAAAATA TAAATAGAAT ATTCATTAAT 120
ATTCATTTGC AGACAATATC TAGTGAAATA TGAAAAGAAT CATGACCAAG TGGGTTTATC 180
TGAGAAATTT AAGTTGATCT AGTATGTGAA AATTAACATT CACTGTATTA ACAAAATAAA 240
AAACCAAATG ATCATTTCAG TAGCTGCGTA AATCTTTTAT TTAAATTGTA TTCAACACCC 300
ATTAATGATA TAAAAAGCTT TAACAAACTT GTGATAGAAA GATACTTCTT ATATGATAAG 360
GGCTTTTTTT TAAATAATGG AGTGTAAATA TCATCCTGGA TGGTAAAATA CTGAATATTT 420
TCCCTGAGAT TAGGAACGTA TAGGATACTT CACCACCACC AGTAC 465
(2) INFORMATION FOR SEQ ID NO: 345:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 345:
CATTTAACTC GTAACAAACT ATTAGCTACA GTAGCACATT ATTAAACAGA TAATGTTTAA 60
CAATTTTATT CACAATTTGG ATTAAATGAG AAAAACATCA CATTTTAGCA CCCTAAATAA 120
CAGTTTTGGG GATTATTTGA ATTGAAACTA ATTCTAGCTT TAGAAAGACA TTACTAATTT 180
TCTTTAGAAA ATTGAGGCTA AACTACTTTA GTTCTTCTCT TTTTTTTAAA GGCAGACAAT 240
TCTATAAGAT GATATGGTTA ATTGTATTTA TATTGTAGAT AAAATTTATG TATCATCAAC 300
TTGGCAACCG GTCCGGAAAT GTCCATGGCA AGTGATTTAC AGTTGCAGGA GAGAAAGCGT 360
TCTCTGCCAG GCGGTTGAGC GTTTTGGAGG GGGGAAACCT GGGGTTGGGG CAGATAAATA 420
TTCAGCAAGT TTACTTTTGT GTTTCCCATA TTCAGnATTC ACTGCCTGCC AAAGCCTCAA 480
ATTAATCCAA TTAATAATAT CTAAGTAnGT GAACTTACAC AATCCATAAT CTA 533
(2) INFORMATION FOR SEQ ID NO: 346:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 346:
CAGATAATTT AATTAGGCAC TTCAAACGTG CAGTAAGATG GAAACTTTCC ATTTGAATAC 60
TTTATGAGTC TAAGAGGGCT TTTGGAGAAA GACTGGGTAA GAGTTTAAAC TAACTAATTT 120
TCAGAAAAAA TGATACCAAC TTCATGCTTA AACAGACTTG TTAAAAAGAC TATCTAGATA 180
TTCATTATTG TACACCAAAT ATGAAAACAA AATATATCTT GTTTATTTAA ACTCCCTTAC 240
TGGATTATAA TTTAAGTATT ATAATTTACT GATGTACTTC AGTTGTTAAG TTTCTATGGA 300
GACATTTACA TGGTATTTAT AGAACCACAC ATTAGTATGA CTTCAATAAT GTAGCAGATT 360
TAAAGCTGCA TCCCATCTAG ACAAGGATGG TATGATCTCA CTCACATGTG GAATCTTACA 420
AAGGTGATCT CATAGAAGCT GAGAGTAAGA TGGGGGGTTA CCCGACACTG AGGAGA 476
(2) INFORMATION FOR SEQ ID NO: 347:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 347:
GTTTCTGAAA TGAGGCACTG GGGGATGAAA TAGAAACACC ACTGTATCCT TTAAGCCTGC 60
TCTGCAGGTC TCCCTCCTCA ACCACCACTG TGGTTTCCTT CCTCAGGTGG TCGGGATGGC 120
AAGGCGTTGG TTGACTTAAC GACAGGCGGC TGTTCTTCAC TGTGTTCCAT AACTATAGCT 180
TGGAACGTCT TGTCTTCTTT CCAGACTACA AATCTGTATG TTTTGAGCTA TATAGAATTT 240
CAAAGGGCGT TGATAAATAA TGAAAGAAAA CAGAGATTTG GGGTTATTTA ATGTTTTGTA 300
TCTACCACAG ACTTCTCTCA AAAGAGAGCT AGATTTTGTT AAAAATGAAA CTTCTACAGG 360
AAAGAAAAGC TGTCTA ATT TCATGAGAAG TTGTCAGAAC ACAAAAAGGA ATAATCTTTT 420
TTTTTTAATC AATCTGATCT GTCACAGAGA AACAGAAAGA GTATCTCATC TAAnGGGAnT 480
CCCATGAATG CCAAAGTCCC GGGGAAGCCA GGAGACC 517
(2) INFORMATION FOR SEQ ID NO: 348:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 348:
AGTTAGATGT ATTGCTGGGA TATAGGTATG CTTTAACATA TGCAAATAAA TGTACGGGAC 60
ACGCTACAAA GTTAACAGAA TGTAAGGTAA TGTGATATAA TCATCTTAGT TATTTAGTAA 120
AATCATATAA TAGAACTCAG CATTCTTTCA TGATAAAAAC TCCCAGCAAG TTAGGCATAG 180
AGGAAATGTA TCTCAGTATA ATAAAAACAA TATTTTACAA ACCCATGAAT TATCATAATC 240
CATGGTCCAA AAGTGGAAAC TTTCTTCCTA AGATCAGGAG CAAGACAAGG ATGCCAACTA 300
TTACCATTTC TATTAAACAT GGTGTTAGAA ATCTTAACCA GAGCAATTAG GCAATAGAAA 360
GGAGTAAAAT GAATGAGAAT CTAAAAAAGT AAAATTATCT CTCAGGTGGC TGATCTTATA 420
TACAGAAAAT CCTAAATACT ACACTAAAAA CCTCATAAAA TTATAAATGA AATCATAAAC 480
AAATCCAGCG TGTTTCTATT CTCTAACAAC ACACTGAGAA AAAAGATGAA CA 532
(2) INFORMATION FOR SEQ ID NO: 349:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 417 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349:
TAATTTCATn TGGTTTTGAT TTGnACTTGA TATTTTTCAA CTAAATATTT ATTTTTGAAA 60
TAAAATATTT TAAGTTACAT TATAGTCACA TTCTTTATGT GCCACTAAAT AGAGTTCAAC 120
AAATAAAAAG TGAAAATACT GTAGTTTAGC AGAAATATAG GCAAAGCTCT ACAAAAACAA 180
TCAAATGAAA AAGATGTTAC ACAGTAAATT TTTAAATAGT nACAGATCCT TAGAACTGTA 240
GTGGTATATA ATTCTTAACC ATTTGATGTT GAAnACCTTT TTATAATCTG ACCATTTnAA 300
AGTCTTTTGA GAAATGTCTA CTGAAATCCT TTCCTCATTT TTTGTTTGGA nTTTTGGTTT 360
TTGGnTCTCC AGTTTTTATT ACTTCCTTGT ATATTTTATA CAGTTTAACT AGTTTAA 417
(2) INFORMATION FOR SEQ ID NO: 350:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 437 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 350:
AAAAATCAGT CTCATTATTT ATTGTCAGAC TTGATGCATT CAATATGGCT GATGTGTGAA 60
GTTGAGAGCA ACACCAGAGC AACCCATTGC TCAATTCCCT GTCGTGCTGT CACCTCTTTT 120
CCTATGACTG TCTCAACAGA TGGTCCCAGC TCCTAGCAAA TCTTACCCTA CTTGAGGAAG 180
CAACACGTCT ATTTTTCAGT GACAGTTAAA TTCATGGCCA AGTGCACTGG ATTCCATCTC 240
CTTCTTACTA TCATACCTGT GTCTTCCACT TCTCTCTCAG CCCCTATCTC ATGCTCTCCC 300
TGTAATTTTC TTAGGCTAAT GTAATTCCTT CCTCAATAAA GACCCTCTTG GTCTTTCTGT 360
ATTGGTCTGA CACCACCTCA ACCTGGATCA GTTCTTCATG ATTTCCAAAG TGACACAGTA 420
AAGATACATC AGGATTG 437
(2) INFORMATION FOR SEQ ID NO: 351:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 462 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 351:
CTAATGTATA AGCACTCATC AAGCATCCAC TCCCATCATG CTTATCAATA TCCCAGTGGT 60
CAAAGCCAGT TGCATGGCCA AACCCAGGGT CAGAGTTGGA GGGGCTGACA TAAAGGCTTG 120
AATGCTGAGA ATCCCAGTTT ATTGACCACC ACCAACATAA CAGGCTTTAT GAAGAAGTAT 180
TAGGGCAAGG TTCTGAGAAA CCAGAGAGCA TTTTTCTTCT CCTGTTATCA GTTATTTTTT 240
CTTCAGTACC TTATCTGTAG AACACTTGTA AGCCATAGAA GATAAAGTTA CTCATCCAAC 300
ATCAGCATAA ATTCTTAACT CTTTTAGCTG CTAGAAATAT TTCACAGAGA TAAACCTGTA 360
TCTGGCTTGT AGGAGCTGTG GTCCTATCAG AGAGATGCGT TTGCAAGCAT GCATTACAAG 420
ACAATGTGCA CGACAACAAA CATGTATACC TCAAGTAAAA AG 462
(2) INFORMATION FOR SEQ ID NO: 352:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 643 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 352:
GAATnGAGCT CGGATACAGG AATATGTATT GGAAATAAAA GACATGAGCC TCCAGATTAA 60
AAGGACCCTA CCAAATGCTT TTGAACTGTA TTTGTAAAAC ACCCGCCCAA GGCACTTCAC 120
TATGAAATTT CAGAGACAGC CTACAAGCTT CCAAAGATAA GATAATAAGT ATATTCCCTA 180
GGTGAAATCT TAGATTAGTA TTATGAGTAT ATGACCAGTG GAGCAATAGT TCAGTGATTC 240
TTTACCCAGC TAAACGATCA ATCAAATGAA AGAATATTTT AAATCCTTTT CATCCTTGCA 300
GAATATCAGA AATTTGCCTC TATGCACCCT TTCTTAGAAG ACCACAGGAG GAGGTATTTA 360
CACAAATGGC AGGGCAGAAT AAGAAAGAGG AAACTGGAGA TCCCGTAAGT AGAGGTTATT 420
CTTCCAACGA AGAGGCATGC TTTTGAGATA GGAGGTGAAT GGTGTAGAAA TGCATTTTGA 480
CTCAGTAGTT GAAAGTTGAG AGAATTTACA CCTCTGTTTT CTCAGTGGGT GAATGCAGGG 540
nTGCTTACTG TAGTTAGAAG CTTTGAGAAT AGACAGACTT GGGGGACAAG AAAAGGGTCT 600
CATAACCTGG GATAAAACTA TTGGnCAGTT TGGnATTGGG GGG 643
(2) INFORMATION FOR SEQ ID NO: 353:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 523 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 353:
TATACTTCTT CATTGACTCA AGCCACTGAA ATTTTAGAGG TTGTTATTAC AGCACACACA 60
CAGTTTTCCC TGACAGACAC AGATTCAGGT CTAGGAAGGC CATAAATGCC ATATTAAGGA 120
ATTCGAATTG TACTCTGAAG GAGGTGCAGA GATAACATTT TTGAAAGGAG TGACACGATG 180
TACATTTGAG TACAATAACT CTGGATACAA CATACATAGT ATATTGAAAA CAAGGTGACT 240
TTCAACCTTT GGTGTACATA AAAATCCCCC ATGAAGAACG GTAGTCACAT TCCAATTCCT 300
ACATATTACA CACTCTATAA AGAAGAAAAT ATTTATCCTC TGGGGAAAAA AGTTACCCAG 360
GAATTCTTCA AGATGCAATG CTCTGGGCCA GGATATCTTC AGATATCCTG AATCAGCATC 420
TCTAGAGGTG GGGTTTATGC ACAGTAACTG AnAGGACAAG CTGAGTTCAG AGGTAACCAA 480
ATTCACATGC TTGTGGTTCA AATAATCATT CnCCCTCATG GAT 523
(2) INFORMATION FOR SEQ ID NO: 354:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 592 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 354:
AGAAAAAAAA ATACTGCAAA TATAAAGGAA GAAGGAAGCA AAATGTAGAA ACTGATGAAA 60
TAAGAAATGT ACAGCGGTAC AGCTTACCCA GGTATAATCA ACACAGTCAA CACCAAACTT 120
GGAAAGAATC TGAAAGTGTG AGAGCAAAGC ACGAACAAAG ACAGCCAATG TCAGTAATGT 180
AAAAGTTGCT ATCATTCCAT TATCCCACAG ATAAGGAGAG GACATTATGA AAAAATTTAT 240
GCCATTTGAA AATATAGATG AAATAAAAAA GTCCCTAAAA TAATACAAAT CACCAAAACT 300
ACAAAGGAGA AAAGACATAG GGCTGTGACT CTTAAAGATA TTTAAAGTCT TACATACAAA 360
AATATTCCAG GCCAAGATTA CCTCCTCTGA GTTCTAAGAA ATGTTGAAAG AACAAATAAC 420
ACCCACCTTA CACAGACTCT TCTCAAGAGT AGACCTCCTA CTGAGGTCAG TGTAACCCnG 480
GAACCCGGCC nGnGCAnGGA CGTTGTAAAA CAGGAAAATT ACAGGGCTTG ATTTCATAAT 540
CAGAATGAGA AAATCTTAAA ATAGCAAGAT ACAGGAAATA ATATCATAAT CA 592
(2) INFORMATION FOR SEQ ID NO: 355:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 355:
TTGTGTTTTC CACTTGCTTC CGAGTTGCAC TGTTCAGATA TTTTAAATGT CCATGTGACG 60
CAAGGCTCCT AGAAAAGTTC ATGGAAATGG ATTTTAAAAC ATTGACTTGG GCAAAAACTG 120
TTTGGAATCC ACCTATGCAG ATAAGGGGTC TTCAGAAAGT TCCTGGGAAA ATAGGTAGGA 180
AGAAAAAGTT ATGCATGGCG TTCCAAAACA TTCTTTTTGC ACCAGAATGA GCTCAGTTTT 240
TGACTCCTAT TTCCATGCAT TTCTTAGTCT CTCTGTATTT TCTCCCATTT TCCTTTGCCT 300
TGGAAAACCC TAGTGTAGAG TGCATTTCTT GTCTGGAGAC ACAGACACAA TCTTTCTCTT 360
TGGAAAACCG TGTGTTGAAA GAGTGAGTTG ATGGGGCCGG CGCTGTGGTG CAGCGGCTTA 420
AAGCCCCGGC CTGnAAGGCA GGCATCCCAT ACGGGCGCCG CTTCGAGTCC GGGCGCCGCT 480
TCGAGTCCCG GGCTGGCTCC TTTTCTGATC CAGCCCTCTG GCTATGGCCT GGGAAAGCAG 540
TAGAACACAG CCCAAGTCCT TGGGGCCCCT GGCACCTGCA TG 582
(2) INFORMATION FOR SEQ ID NO: 356:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 356:
GTAACTGCCA ATACCCAGTC CTCAGCAAGA TTCCAGTTCA GTTGTTCAGT TATCAGTGGT 60
GATTTTAATG CTCCGCAAGT GATTCTAACT GCAACCAGCG ATGAGAAGTC CTTTAGGTTT 120
CCCTGGAGCT AATAGCTGTC CTTATTTCTG AAATCAGTCT TAGGATTTCA GTGACATGCA 180
AATTAAAGCA TCTTAAATAC GGACCGTTTT CTTTGAGTGC AATGCAAGGT TCAGTGTTTA 240
AATTTTCTTT AATTAGAACA ACTAGGAAAC TTAAAATCTC ACTTCCTGGC AAAAGAATTA 300
GGCTTGCTCA TTTATTTGAG AAAAAGATTA TGTGCCTGCC CAACCATCAA TCCTAATTCT 360
GAAGACCTTG CACTGnAGGG ATGTCAAAAG AGGAGCTGGT GACTAACAGG AGGGACTGAG 420
GGTAGGGACT TTGnAATATG GTGAGGAAAA AAAAAATCAA TGCCTCACTT ATCCTTGGGC 480
AACAAATAGA TATTAATGGT TTTAACACTG ACCAGTTGGG ATATTATTTT GTTGCCTACA 540
GCTGAGTTTA AGATCCATAA TTTCACATAG TTGTTCCAGG An 582
(2) INFORMATION FOR SEQ ID NO: 357:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 386 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 357:
CAGnAGCGGG CGCnGGCGCC CCCAGCCCTC CTTCCACCCT TGGAGCTCCC AGGCTGCCTG 60
GCAAACCCTC TCCTCCCTCC CTGCGCCCCA CCCCACTCTC ACTTCCTCTT TTCTGGTCCT 120
CAAAGGTACC CAGCTGCTCC AACCTCGGGG CCTCTGTGTC CGCTGTTCCC TCTGCCTGTA 180
ACACCCTTCC CAGGCACGCT GCCAGGCTGA GTCCTTCTCC AGCCGGGGCC TCCTCCCTGG 240
AGCTGnnCCG CCCTGGCCCT CCCTGGTTAA TCAGCGAnnC CCACCCCTTC ACCCCCGCCC 300
CAAGTCGCAC CACCCTGCTG GACTTCCTCT CTCCCAAACC GTCTTGTTTG TGAACCTGTT 360
GTGGTCACAG CTGTGGnnCC GTGCTC 386
(2) INFORMATION FOR SEQ ID NO: 358:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 358:
AATGAAATGG AGTAGAGAAG AGACTTTTCA GCCCAGCAGG GTCTCCCTCC TGTGCGGACG 60
TTGTTCTCGG GGATCTGCGA CGGGAGCTGT GGGCACTCGG AGCCGAGACG CAAAATGAGG 120
CTCACAGACC AGAAAAGGGT GCAAGCCACC GTCGCTGGCA GCCGCGGCCC TGGCTCGCCT 180
CGCTGGCCCT TCCTAAGGCC CCGCCTCTGC GTCCCCCACG GATGGAAAGG GTGCAGACCA 240
CCGAGCTCCG AGATTCGCAG AAAGAAGCGG CTTGGGGAGG CCCCGCGGGG AGGCTGGGGG 300
ACTTCCTGTT CCGGTCTACG CCAGGCGCAG CCAACACTCG CAGGAGCTCA GCAGCCCCCA 360
CCTGGTGACA GCCTGGGTGG GGCAGGGAGC CCCACCACCA GCCGCTGCTT GCTGTCCCCC 420
ATCCTCCCCA CACACAGGGG ACAGCTCCTA GCTGGCTGCT GAGGGTTGGG GTGGGGTGGG 480
CATGGTGGGG TGGGGAGAAG GCAGCCCCGG GCTCTGGTGA CCTGCCCAnG ACCTGCCCGG 540
AAGCCCTCTG TACACTGCCA CTGGTTGAAC TCATCACAAG TTCCGCCCAG AGCCCTGCAC 600
TAGGCACTGT GTGTTGAGTT CATGCGACCT GCACCACAAG CTGCACGCGC ATGGGCACAT 660
GCG 663
(2) INFORMATION FOR SEQ ID NO: 359:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 543 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 359:
TGCAGGATGT CCAGCGTAAn AAAAGGGTTG GGGGGATAAT TGTACCTCCA TGACAGGCTT 60
CACAAACTCC AGGAAGACGG GAAGGCCTAA TTCTTAGAAA ACGAAACTTA GTATATACTT 120
CCTGAAATTA ATGGTCTCCT ACTTCCTTGT GTGACCACTT CCTTCTGAGA CGCATAGTCC 180
TGGACTGCTG AGATGCACCA ATAAAATGCT CAAATTTAGG TAATTGACAT ATAAAGAATA 240
AGGAACTAGC ATATGTGCAT AATGACGAAT ATACTTGTTT ACTCAAGTGT ATAACAACTC 300
GGAGAAGGGG AGAGGCGGGG CTTCTCACCC CCGGCACTGn CACTATGTTC TATGTGTCGG 360
TGGGAGCCCC AGCTAGCTGG TAATAAAACA ATAAATCTCT TGGCCCTTGG CATCTGTGAC 420
TGTCTTTTGT GGGTTAATGG GAAAGATCTC AGATCCCATA ATTCAACACT ACATAGGAAA 480
CCTGGATTGA GTTCTCAATT CCTGGGCCTT GGTTACTAAC ATTTGAGGGA GTGAGCCAGA 540
GAG 543
(2) INFORMATION FOR SEQ ID NO: 360:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 584 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 360:
CTGGAATATT TCCCGTAAGT TAGGTATGGC ATTTTGTTTG GGAACAAATC AGAATTTGTT 60
TCTTCAATAT TAAGTTATCA TTTCACCTAA ACTACAAGTT GACTTCCTGA ATTGAGGTCT 120
AAAGCTCTTC AACTCTGTAT ATTTAAAGAA GTTTATTAGA TAAAGAAGAG ATTTATTTGT 180
AATTTTAAGT GCTTTAGAAG TAGGTTCTAT CTTGTTCTGT TTATCTTTGT TGTATCACAT 240
GCTATGAATA TTTTAAATGC ATATACTAGT GAATAAGTTA AAGAAAAAAA ATCAGTGAAT 300
GACTAAGGCC TATCCGCCTG ACAAGTTTTC CATCCATTCA TTTCTCTAAA GCATTTCCCA 360
GTCAACCTGA AGTTTTCCTT CTGAAGTTCC CCTACAAAAA GGCAAGGAAT GGAAGAAATG 420
AGAGGAGTGT CTCCAGGGAT GGCAGAGCTT TGGCCAGCAT TGGGCGTAnA GATGAATCTG 480
CTGTCTGTTG TGGGGGAGGA TTGAGTCCTG CTCATCCTCT ATGCAACCCT AGTCCAGCTT 540
TCAGGCGTGG nTAACCCCTT TTCAGAATGT AGCCAGCCCA GGCC 584
(2) INFORMATION FOR SEQ ID NO: 361:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 540 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 361:
AGGCCCCACC TCCAAATAAC CTAGGAATCT GGGGACTCAG TTTGCAAAAT GTTAGCATCC 60
AATGGTTTGC TTCAGAAATG CCTGAAACAG CTGGTGTACG GCTAGGTTAA AGCCAGGATC 120
CAGGAACGCA ACCCTGGTCT CACATGTGGG CAGCAGATAC CCAACTACGT GAGCCATCAC 180
CTGCTGCCTC CAGGCTGTGC ATTAGTAGGA ATCTGGAATT AGGAGCAGAG CCACTCTTGA 240
ACCAGGCACT CTGATATGGA ACACGGACAT CCCAACCAGT GTCCTAACGG CCGGGCCAAG 300
TGCCTACACA TGTCCTCATT TTGATGCAGG TTTTTTTTTC ACTCCTGGAT ATTGGTTTTA 360
AATCTTTATC CCTCAATTCA AGCAGCTCAT GGAAATATAG CATTTGAAGA CGTACCAGGA 420
GAACGCACAG CTGACCAGCC AGAGCTGAAA TGTGCATGTT GAAGACTGGA ATGGCTGTGG 480
TATCAACAGA CTGCAAGCTG GAGTCCACAG TAGATCCTCT GCATTTCTCA CTCACTGGGA 540
(2) INFORMATION FOR SEQ ID NO: 362:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 362:
TGCAGGTTAA GAAAGTACTT TTCAAGTTAC TTATCTGAAA ATACATCTGT CTAGAGAGCC 60
TCAAGATAAA AGAAATGTTG GCAAATTTTT AATAGAACAC AATAATATGG AAAAGCCTAC 120
AGCACTTCCT CTCCAAACTC CTACAAAATG TCAATGGTGT AGTCAGCAAG TAGTCAGAAT 180
TCTGTGAAAA CCATCGTAGG CATAAGAGAC TCTTAAGAAT AGGAAAAAAT ACCCACAGAT 240
GCTTAATTGG TGAAAACGTG CTGTAAAGTA TTTCTGAACA AACTTTTAGG AGGCTGAAGT 300
TCCTTGTGGA AAGCCCTCAG GCACCTGTGG TGAACCTGCA GTGGAAGTTA ACCCCAGGCG 360
GAAAATATGC TTCCTTCACA CTATGGTCAA AGCAACGAGG GTGGAGGGCA GGGACATCTG 420
GACATTAGCA AAGATTCTGC TTTGAGGGAC AACTGTCATA AGAGCTCCAG GCCTACGTCA 480
AAAGTGTCTC AGGAGAGGGG TGGGTGAACA AGCT 514
(2) INFORMATION FOR SEQ ID NO: 363:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 633 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 363:
ATAATGAGCA TTTTCCATGA CGTTTTTGTA GACGCCCTCA TGCACAGGAT TCCTGGGAGC 60
TGGTCCGTGG GGGCAGTGTG AACCAGTCCC ATCCTATGCC TCAGAGGTGA TTTTTTCACG 120
GAAATCTGTG GAGTGCTGCT CCCAGGACTT GTAGGCATCC ATAATGGTAT AGGAAGGCTT 180
GTCCTTTGGT CCGTAATGCC TTCAGTCCCG TGGCCTTCCT CAGATTCTGA CGGCGCGCAC 240
CCATCGTGTC GGGGACTGGC TGGACTGGAG CATCTGCCAG AGGATCTCGG TGATCACTTA 300
GCATGAGTGA ATTGACATGG CTGTTGTAGA GCTTTTGGGC AGCCATTTTA ACGTTGTGTA 360
GATAGGGTAC CCCCAACACT GGGAAGTTGC TACTTCATCA CGGTGTTTAA GGTTCCTGAG 420
AAGTGCCGTA GTCGCCTGTG GTTCTCTGAT CTCCAGAATG GATAAGAAGT CAGCCCTTAA 480
AATGTCCCCA AAGCTTCAGC CATGTTAATG AAGCCGGATG ATTTACAAGT ATTATTGGnA 540
CTTTTAAACC AGCCTGGCTG CTGGTATACA TATAATTCTG ATCTTACTTT ATTAAATTTC 600
TCAGTATGTA CCTTATGAAG ACATTTTTAA TTG 633
(2) INFORMATION FOR SEQ ID NO: 364:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 364:
AAAAAAAAAA AAAAAAGTAA TGACTTCTAA TCTCCATAAT TAGGTCATGG TTTGGCCTTT 60
ACACACTCCC ACTTCTTTTG TCAACTTCTA GCAAGAACAT CACAAATTTT AGAAACTGGT 120
ATTTAATTTT TTCATGGACT TAATGATTCT CCATCCAGCC CTAATGATCT CAGTAATTAG 180
TACACTAATC TAATACACTC TTTTGTAATT AAGAAAATCA CTGGGAATTA AAACTAGCTT 240
CCTATTTGAA TGAACATCCA TAATCATGCT TCTAGACTTT AGACATTTAC CTGATCCATC 300
TGTTTTCACA GGAAAACGAC CACTAAACAT AGACGCCAGC ATAGAGTCTT TGAAACGACA 360
CAAGGACTCA CGCCGGGCTG TGTACGTGCA GCCACCCACG TTCAGCCGGA GAATATCTAG 420
CACCTCTGTT TCTGCCTTGG GGCCTGCCAT TGCTGCCTCC CAGCGTTCGC nGAAACCTCA 480
AGGCAAGCAG TCCTGGTCCC CCTCTGTGAA AACA 514
(2) INFORMATION FOR SEQ ID NO: 365:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 584 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 365:
TAnAGTGCAT GTTTCCTGTG GGCCATATTA AAATCGCTTC ACTACATTGT CTCACATAAT 60
CCTTTTCCGA TAGGTCACTG GGCCTTAGGG GCCTTAAATA AATTGTCCAA AGTCAAACGA 120
GATAGCAAGA GGCAAATCTG GGCTTTGCAC TGATGCTAAC ACTTTCTAAA GTCCACGTTG 180
CATGCCTGGG TGATGCTTGA ATGCTCCTCT GCACATTTTG CTTTTAACTC TGCTATTACC 240
TTCTTGGTAT TGTATCATGA TTTATTGGTG GGTGAGTCTT GTCTACCCCA GTAAATTGTA 300
ACCTTCTCAA GGACAGGGAA TAAGTTATTC AAGCCTATAT CCCTAGCACT TAGCAAATTG 360
TCTGGCACAT AACAGAGCCT CAGTAAATGC TTGTGGCCTG AATAAATAAA AGCCTTGAGA 420
GAGATGGTCA GGGAAAAGCA GACAGAGTAA TCTAGATTGC AGTTTGAACA AAACACCTCC 480
TTTCCTATGG nCCTGATTAG GGAGGTGTCT GCTGGGCATG GAAGGCAGAT GGGTGAAGGA 540
GCAGGGACAG CCATGCCTTT CCCTGGTTCT CTGGAAATCC GGCT 584
(2) INFORMATION FOR SEQ ID NO: 366:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 462 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 366:
CAAGTGATCC AAGCAAACTG AAGGACATTT GCATTCACCC AATTTTCTGA ACCAATCTGT 60
ATTGGTTTAC TTTAATTAAC AACATAAAGG CTTGTATACA TAGAATTTAC TCTCATTTTT 120
ATAAGACCAC TCTAGCTGGA AAGCAAGAAC ATGAATACAA CATAGGTCTT ACATCATTTA 180
CAACATTTTA ATGCATTTTA CATAATTTGA ATTGACTTAA AAAGCTGTTA CTTTATCAAA 240
TTTAGAGTCT TCTCCTTTGA GAATTCCAGG GATCCTACTG GAAGCCTCAA AATTGGAAGA 300
CTCAGTTTAG GGCTTAAACT ATCAGAGAGT CAAACTATTT AGCCAAAAAT TAGTATGTTT 360
AACCTTGAAA GGAACGGCTG AGAGCCGAGA GCCAAGAGCC GCCATAAAAA AATGGCCTTA 420
ATGAAACAGT CCTGGGAAAT GGGGAGTAAA CTAAGTTCAC AG 462
(2) INFORMATION FOR SEQ ID NO: 367:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 614 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 367:
CTTAACGTAA TCCTATTTGT CTGAATTTCC CATTACATTA AGAATAATAT CTCATCGATT 60
CAAGCATTAA TTATGAGTTA AATAAAATAT CTGACAAGTG TTTATATCTG TGAGAGTCAG 120
AATTCTATCC TTTTTTAAAA ATTAGCTCAG GAAAGTTCTA TGATTATTAT CATCTTATAA 180
CACAAGAAAC TGAAACTTGA AAAATTTAAG TGTCTTGCCC AGGTTGTTTA AACTGCATGG 240
CACTGGACTA TGCTACAAAT TAGATTTCTT GGTTTTTCTA CATTGTTATC TCAAGTATCA 300
GTGACATCAA GAAACACAGA AATGTGCTAG AAAAGCTTTA AAAACATTAC AGCCTATAGA 360
ACTATCATAT AATTTAATTA GTGAAGACAT CTTTACCCAA CTAATTAACA AATGAGATGG 420
CTTTTCAAAA AAAAGTTAAG TTTAATTTTC AGTTCAAATA AATGAATATA TGATGTATGG 480
GGGTGTGGAG GCAAAAGTTC CTGAATCCTA ATTCCAAAAC AGAGTTAAAA AAAGAGGACC 540
CTGGCAACAA TTTTGGnCCC ACAATCATTT GTGGCCCATT ATGTGCTGTA AGAAATACTT 600
ACTGAAGCTT TTTT 614
(2) INFORMATION FOR SEQ ID NO: 368:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 701 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 368:
ACCAATAGAG CTCTGACTAT CACAATCACC CTGGAATCCC AGCCTTGGGA AGACTCATTC 60
TTCCACAACC CCATGGCTGC AGAGATGTGA AGAAAACAGC AACCCACAGA TTAGAGGCCC 120
AGCCTGACTG TGACATGCCA CTCGTGCTGA CATTCCACCA GCCAGGGTAA ATCACACAGC 180
TGTACCTGCA TTCTGTAAAG TGTGGATGTA TAATCTTTCT ACAGAAAATG ATATTCTGAT 240
AGTGGTGCAC AATAGAAGTC CACCACAGGG ACAAGCCTCA GGAAAGGACT CAGGTTAAAC 300
CTCAGCCACG TGGAATCTAA GGTTCTGGGC ACATTTCCAA GTTAAGGAGG AGTCAGACAA 360
ATATGGCTGT ATCTGTAGGG TGGGAGTCAC ATAAAACAGG TGGAAATAAG TTCAAGAAGC 420
CAGTGGTAAA AATCAATCAT GGTAATAAGA GTAAGAACAC CTACCACCTA TTTAACATCC 480
ACCCTTTTGA GTCAGACATT TTACATACAT GTTATTCTCA TCATACTTTA CCATAAATTA 540
TTGCATTCCC TTATATAATC TATGCATCAT TTTATCTACA TAAGTATTCT ATGAGTTAAA 600
TGTTTCTGCC ATTGTATTAC AGAAGTGAGA CTGGGAGTCA GATTACATCA CTTAACCCAG 660
GTCAAAAGGC AAGAAGAAGG CAGGGAGTTG AACCAGTTCT G 701
(2) INFORMATION FOR SEQ ID NO: 369:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 615 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 369:
TGCCTTATTT CTGTAGCTGG AAGGCACTCC ATGATGTCCA GATTTAACTG TTCTTTCTCA 60
TAGTTGGAAT ATTTTTAGTG ATTGATGACA AATCACACTT TAATTTGAAC TCTAGAAGGC 120
AAGAGGCTCA CATTAAGTGA ACACGTACAG CAGACATGAA GGGGAAGAGT GTCTTCGTTA 180
TGGGTGTGCA GTTTGGCAGA AATGATGCCC TGAGTGCATG TTAAATAGTC TAGCAAAGTG 240
AAATCCCACA GCTCAGCTGT GAAGTCAGTG AATATAGCAA TCTGTGTGTG GCAGAAGCCT 300
TCACAGCTGC TCCTCCATGG TGCACTCTGA CTTGGGGATT TGTTTTTGTT ATACAGCTCA 360
CTGGGCTCTC CATAAGGCAT GAAAGAGAAA AGTTTGACTG ATTTGCAGCA AATAACTCTT 420
TAGACTGCTT TCTATTGCAT TGGAAGCCTG CTTTAGAGTG TGTAGATGCT AAATAAATGG 480
TAACTGTTCT ATGTTTTATT TTTATCCCTG GCTTCAGCAA CTTACATTTA TAGCATAATT 540
TTTGATTTCT GCCTGCATTA GAGCAGGTAG GGAAATATAA TTAAGAGCAC ATTTGATTCT 600
AACCTGCTCC TATGG 615
(2) INFORMATION FOR SEQ ID NO: 370:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 523 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 370:
TGCGGATCAn TnCCCAGCCC CGTATCAGAG AATCGGTTGT TAAATGTTTA CCGACACACC 60
ACTGGCTTTG GTTATCATTT TCTTACTTCT TTCCTTATTA GTTGCACTAG TTTGCATACC 120
CATCAATGAT ATGATGGGTC TTTTTTTTGA GTTATAATTC ATTATTTTCT TTTAAAATTA 180
AAGTAGTAGT CATCCTTCCA TACAAGGAAA GGATTGGAAG GAAAAAAATC ATAATATTCA 240
AAGATAGGCT TTTGGTGAAC AAAGCAAGTA TGGACTTTGA GCTACATGAG ATCTTCAGAA 300
AGTGCTCAGA ATATATTTTT TTCTACAAAG AAACCTTTTA TTTAAGGAAG ATAAACTTCA 360
TACATTTCAT AAGTACGATT TTAGGGAATA TAGTGATCTT CACACCATAC CTGCCCTTCC 420
AACCACTCTC CCTGGCCTTC TCCGTCTCCC TCTCCCGTCC CCTGnCCCAT TCAACATGGA 480
AGATCCATTT CAATTAACTT TATAAACAAA AGGACCAACT CTA 523
(2) INFORMATION FOR SEQ ID NO: 371:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 586 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 371:
ATTAGTAGCA TGATAAGTCT GCCAAAATTA ATTTAAATAA ACTAAAATAA TTAATTCTCA 60
GTAAATATGA AGCATCTACT GAATAAAAAA TCGTGGTTAA CAACACATAC TTCCTATGCT 120
TGTGAAACTT ACAATCTACT CAGGTGGGGT AAGGCCAGAA TATATGCACA ATACTAACGG 180
ATTGGGGGTA GGGAAAAAAA TCTGTCCAAG GAAGGGACAA ATCTCAGAGA TACAGAGTAG 240
GGAAAAGCCA GGAAAGAGCA TGTACCAAGT CCTAAAACAG GAAGGGACAG AAATCCTGGA 300
GACGACAGTT GGGATGGCTG GAGAATGCAC CACCCTAACG GCAACAGGAA TCATTGCGGC 360
GGGGGTGGGA AGCTTTATTC TGCCAAGGGC CATTTGGATA TTTATAACAT CATCCACAGG 420
CCCTACAAAA TTCTCAACTT AGAAGTCTGC CTGCTCTAGA TTTATTGCAT TTCAAGTGCC 480
GCTTGAGGCT TCCTTGGCAG GGCAGACCAA ATGACCTTGG TGTTTTATAC GGCCCGTGGG 540
CCAGAGGTTC TCATCTCTGC ACTAAAGGGT TTGAGCAAGG AAGAAA 586
(2) INFORMATION FOR SEQ ID NO: 372:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 656 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 372:
TGGGTTTATA ATTCTAAGCA AGCACAACAG TAAATCATTT CTATATCTAA ACTGTAGCAC 60
TTTGAAATTC ACGAGTAGTT TTGCCCATTT GGTATGTTTT TTATACTGTG GTTTTTAAAA 120
AGTCACATCT GGTGCCATTT TATTGCTGTG AAAGAATCGC GTCTTCATTC TAAATCTAAG 180
ATAATATTCT AATACTTTTC CTCAAAAGAG TTCTTAAAAA TTTCTAGAAT GATCTTGTCA 240
TAGTTTCCCA TTCACTGACT CTGTAATGAA TAAATATTTG AAGCCATAGC AATAATTTTT 300
CATGCATATG ATACTGTCTT AATAATTCCG TTGTGGCTAT GCGCTTTATT GATTTCTGTG 360
CCACTTCTGT GGCCTGAAGT GAAATGTATT GCCCCACGGA AGCCATTATG GTTTCCTTTC 420
GAATTACGGC TTCACTGGCT CGCTCCTGTG CGCGGGCTTC TTCTCTGTCT GCAGTGCGTG 480
TTCGGTCCCT GGGAnTTACT GGGGnCCTGG ATTCCCCCCA TCGCCCTCGC CTTGTCCTGT 540
GnGTGGAAGA TGAnGACAGG AAAGTTGGGA AGAAAAATGC AGCAGGCAGC CAACCCTGGA 600
GAAACCGCTC TTCAGACTTC CGGGnCnCCT GGGCGATGGG GCATCCTTCC TGCAAC 656
(2) INFORMATION FOR SEQ ID NO: 373:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 584 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 373:
TAGGTTGTGT nCCAGAATTG CAATCTTCAT CATGTACTGG GGCTTAGAGC TGCTCTGTTC 60
CACACCTTCT CCCCCCCAGA AAAAAATCCC CACTGGACAC AACAGACATG CCTGAGAATT 120
TAACCACCTT GCATATATAG CAAATTCATT CACTTATTAA TACCCACGTA ATTTAATATT 180
ACTAAACATC AAGCAAGTAT GTATTAGCAT CAACTTCTTG CTAGACAGAC ACTATTTTAG 240
AAGTTGAGAT TCAACAGTAA CTAGAGCAGA TACCAATACC TGCTCCCAGG GGTCTTACAT 300
TCCATACAAA GAAGATAGAC CATCAAGGAG GAAGACAGAC ACTGGTAGGA GGGAAGTCCT 360
TCACCCAAAG CTTTGTTTCT TGGGGCCAnC TGGCTTCCAA AAnGAGTTTT AAGTGTGTCT 420
GGGCATAAGT CTGCCAAGTC CCCAGCTCCA TGTGCCTTCC TAAGCCCCGA GGTTAGTAGT 480
GGGAACCAAA CATCTCCACT AGAACAGGGC AACCGACAAG ATTTCCTAAC ATGAAAGGGC 540
TCGGTATGTA AGAAATGAAG TAAGTGAGTA GTTTGGTGGA TTAT 584
(2) INFORMATION FOR SEQ ID NO: 374:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 567 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 374:
AGAATGAGTT CTCTGTGTGG AGATTACACC AGCATCTGCG TCCATTAATG CAAACGTCAC 60
TTACAAGCAG GAGTATCTGG AAGGAGGATT CTTGAGCTGA AGTCCAATAA GAACAGTTAA 120
TCCACATTTT GAGCTCTCTT CCGAAGTCTG AGGCCAGAGG GAATGAATCC GCCAAGATGT 180
TTTTGATAGA AAATAACAGA AAATCCAACT CGAACTGACG CAAACAATAA GGAGATTTTA 240
TTGACTCACT TAACTGGAAA GTCCAAAGGA ATTAAGCTTC AGGTGAGGGT TGATCCAGTA 300
ACTTAATGAT TTTGTTCTCG GTATGGGATC CTGTGTTACT TGCCTATCAC TGTGTAGCAA 360
ATCACTGGAA ACTTAGTGGC TTATAGCAAC AAAGCCTTTT GCTTATCTTA CAGCCACGGT 420
GGGATAGGAT TGAAGAGAGG CTCAGCTTCT GCTCTGGCTC AGACTCTCTT ATGAGGGTAC 480
AACAACAAAG GTGCCTTTTG GTGGTGTGAT CATCTGAGAT CCTGAGGACT GGAAGGATGT 540
GTTTCTGAAA TGGTCCATTC ACACAGC 567
(2) INFORMATION FOR SEQ ID NO: 375:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 375:
TTATATATAT GGCTGTGCAA GTCCCTTCAT CCCTTGACAT GAAAGCAAGA AGGGAAGAAT 60
GTAGCCATGA TCATTGGGAT CAAGGTCAGG CCATGCTACT CACTTGCTGC ACATCTTGGG 120
AAATTACTTA ACTCCCTGAG CATCATGTCC TTTCCCTGTG TATTGCCGCC ACCTCATCTA 180
GCAATCCCAC TGCCAGGTAT GCGGCCAAAA GACACTAAAT CATTGTATCA AAGAGATACC 240
TGCCCCACCA TGTTTGTTGC AACGTTGTCC CCAATAGCCA AGATATGGAA TCAACCAAGA 300
TGCCCATTGT AAGATGAATA AAGAAA 326
(2) INFORMATION FOR SEQ ID NO: 376:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 627 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 376:
AGTGGAGCAG CCAGGACACA AACTGGCACC CCCGTGGGAT GCTGGCGCCC CAGCGCTnTT 60
CCCGGCTGTT GCTTTCTTTA ACTGAAGAGT AGGAACTGCA CACGTGTGTT GGGTACATGT 120
GCACATCGTG TCTTGCGGTT ATTCGTGGAG GTCAGGGTCT GAAGTCCTGG CAGACAGCCT 180
GGAGGGCGTG TCCCAGCTGG TGTGTGACAG GCCCCCAGGC GGATAGCAGT CTCTGTCCTC 240
ACCGCTCAGC AAGACGCCTG GCAGGGAGGG GCGGGCTGCC GCCCTCCTTG CTGGCTGGGC 300
AGGTGGGGTG AGGTCGCTCT GCGTGCCCGC CCGTCTCACC CCTGCGTTTG CCCCATGCAG 360
CGGACCGGCC CCGTCATCCT CGGCGCCCAG CAGTGGGAGC TCAGACGACG CCATCCGCTC 420
CATCCTGCAG CAAGCCCGCC GGGAGATGGA GGCCCAGCAG GCCGCCCTGG AGCCCCCCGT 480
GAAGCCCACC CCGCTGCCGC AGCCCGACCT CGCCCTGCTC ACCCCCAAGC TGCTGTCTGC 540
CTCGCCCATG GCGGCCGCGT CCAGCTACGC TCCTCTCGCC ATCTCCCTAA AGAAGCCTCC 600
GGnGGCCCCC GAGGCGGCGC TCGGCTC 627
(2) INFORMATION FOR SEQ ID NO: 377:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 402 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 377:
AGATTTGAAG AnACACTACA GAGAAATTAT CATCAAAAGA GCCTGGTACC GGTACAAAAA 60
CAGACAGATT AATGGAACAG CATAGAAACn CCAGAAATCC ATCCAAGCAT CTACAACCAA 120
CTTATCTTTG AAAAACAGTT AAAAACAGnC TCTTCAACAA ATGTTGCTGG GAAAACTGGA 180
TAGCCACATC TAGAAATATG AAGCAAGACT CCTCCACCTT AAACAAAAAA TCCTCATTAA 240
ATGGATTAAA AACCTAAATT TGnGACCAAA TGCAATCAAA TTATTAGGGA AAATGGGAGA 300
AACCCnGCAA GACATTGGTA TAGAAAAAAA TTCTTAGAAA AGACTCCAGA GGCACAATCA 360
CAGCCAAAAC TGGCAAATGA GATnACATCA AATTGAGAAG AT 402
(2) INFORMATION FOR SEQ ID NO: 378:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 628 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 378:
ACCCnTTTTT CATTTCTGGG AAAAACTTAA GCCTGTAACA TTGAAGAACC CAGAGGAAGC 60
CAGTGAGATA GCACTAGTGA GAGTTGGGAT GGGATGGAAG GTTAGCTGAA ACCTGGGGAC 120
GGAGGCGGCA CCCCTGTGGT TGGCCTGAGC GGGAAGGTAT GGAGGACGTG GGGGGACGGA 180
AGTCGTCACA GGGAGTAGGC GGGGCTGTGC GTTTCTGTGT GCATCAGCGG TTGTTTTGCT 240
GTGTGGAATG GGGATAGGGC ATATATTTTC CAGATTTTGC TTTTGGACTT TGAGCTGTCT 300
GTTTCCATCA TGGGCATCGG GCCCGCAGTT AAAGAGCGCC AGACTCCGGC GGATGGGAAT 360
GGACCGTGCA CCCCAGCAAG TCTGGCTGAG TTCACCTGGA GACCTTAGTA GGCGGGGCTG 420
CCTTTTCCAT GGCATTTGGT AGTTCCAAGT TTGACTAGTT AACCATTAAT TTTTATGGAG 480
CTTGTTCATG GGAAAGAATT TCATGTTGCT ATGGGAATGA CCATTGAATG CCGGCCCTGG 540
AAAACAGCTC AGGGGCCTTC TATCACCAGT CTATCATCTG TTAACTGTGA CAGACACTGG 600
GGATCAAAGA AGTTATGGCA CCTACCCG 628
(2) INFORMATION FOR SEQ ID NO: 379:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 328 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 379:
CATTAATCAT CCCACACTTC TGGGTTACAC CAACAGGCTC CCGTTGCACC ATCAACAAGA 60
AAGCTATAAG GCAATCATCT CCAGTAGTAA CATTAATTCT AGGAAGCTTA CTCTGCTGAA 120
ATAAACAAGT GAGTCAGAGG TGGGATGGGG AGAACCGCAC GGACCCAAGA CAGGAATCCA 180
CGTGGTCATG GCTGTGCTGT TGATCCAGCT GAGAAGTCAC GTGGACAGGG ACTGTGGCTG 240
TGACTATGGG TCTGACGAAG CAGCGGGACA GGAGGCAGGG GCTCTGAGCA TCCGTGTCCC 300
ACCCCAACTC ACATTCTACC CGACACTG 328
(2) INFORMATION FOR SEQ ID NO: 380:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 487 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 380:
AAGAGTTCTT CCTCCCTTTG AGAAAGAACA AAACTGGGTC CCCCGACAGG AATGGACACA 60
TAACAGCAAG GCCTGAGTGC TGGTCAGGAG ACATGGACCC AGGGAAAAGT TCCAACCGTG 120
GAACCCTGGC CTGGGAGGCA GAGGCCAGCG GAAGGTTGGT TCTTTAAAGA CTCTGATAGT 180
ATGCACGCAA GATCATGTAC ACTGTGACTT AAATTCAAGG TTACCTTCTA TTGAGTCTGC 240
ACGCTAAAAA CAACAACAAT AATAATAAAT GCAAACTGAT TTGATGATAG CCTCGGAAAC 300
CCAGCGAGTG AAAAAACCTA CTGGCAAACA CAGACTCTAA AACAAGCAAA AGCAACTGCC 360
AAGAGCCAGC TGCCCCAGCT TCAGGTGTGG CTGAGCCACA CATGGCTGTA AGCCACAGCA 420
CCGCCAGCCA GCCACAGGCC ATGAGTATCT GCCGCCGnCT TCTCCCTCTG CCCCTCTCCG 480
CTTCCGT 487
(2) INFORMATION FOR SEQ ID NO: 381:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 469 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 381:
GTGAAGTCAT GGACCCCAGT AGGCCTATTC CCCTGTGGAT GAGTGTCTTC TGTGGGGGTC 60
TTCAATGTTT TGGCACTCTG TGCACTGTGG TCGGTCCATG ATGGCAGGGA ATCGGCTCGC 120
TTTTCTTCAC AGCCCCTCAA GCACCTACCG CAGGGTGGGA ACTTCCAGCC AGCCCAGGCC 180
ACATTCGGCC AGCAAAAGCA TTTGGTTTGG CTCTGCCAAG GGCAGCCACA GGCGGGGACT 240
CAAAAGTCAG TAACTCTCTA GCAAGCTAAT TTTTAAGTTG ATAATTTTGT ATGGCCCACA 300
AATGATGTTA TAAATATCCC AATGGCCCTC GGCAGAAAAA AAAAAGATTT CCCCACCCCT 360
GCATAGCTGA AGTCCTAAGT TGTACTTGGC CAAATGCGCT CTGCCTGCCC AGGAAGGAGT 420
CCTTGGATGT CCTGTATGTC GCTGGCATGG GCACTGAGGG AGCAGCAGC 469
(2) INFORMATION FOR SEQ ID NO: 382:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 470 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 382:
TTTTAAAATG CTTTGTGGAA GTCACAATCT AGTTGGGGAA GTGTCGAGTT CATGCCTTTC 60
CTTTTAAAAA GATTTAAAAA GAAGAAATTC AAATTCCATC ATTTGAAAAA GAAAATATTT 120
AAAAAGAGAT TTAAAAAGAT GTTAAAAGAG GGAAGATTTT TCCCTTACCT CTCCTTCCTT 180
TTAATATAAT ATCCAAAATA TTAAAATATA TTATGATATA AGCATTTATA TTATTGTCTT 240
CATATCTAAT TTTTTTCTTC TTTAACGTTG TTTCCTTCAG TAACTCAGAC CTTTTCTGAG 300
TTTAAATTTT CAATAATAAG AAATGAAGGA ACTTGTCAGA TGTTGCTAGT CAGGAAGGCT 360
GCCATCAGGA CTCTGCTGGA AACTACCTTT CCCCATGCAA ATGATTTGAG ATATTGATAG 420
CATTAATTTT TTTCAGAAGG TACnTATATA GAATCCCATT ATTCAAAAGG 470
(2) INFORMATION FOR SEQ ID NO: 383:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 482 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 383:
ATACCACGTG GATCCAGCAA AGTGCTTCTG CAGAGCCAGT CTGACCTGGG GCTGGCCGGC 60
TGCTTCCTCT GGCTGGGCAT GGCCAGACCC GCTCCTGGAA AACCCCGAGA CATCAGAGGG 120
AGCCTGGGGG GCCCCGCAGA GGGAGGCGGC CCAGCGCTGA CTCCAGGGAG ACCCCCCCAC 180
ACCCCCACAG CGGGCTCAGG GCAGGGGCCT CGCTGAGGAG CCCAAGTGCT GCCCTCTAGG 240
GGTGACCCTG CAGCTTCCGG GGCCAACCTG ACTGCAAACA GAAGCAGCAA AGTGCACAGT 300
CCCTGAnCCC CCAGACTTCG AGACAACACC CGCTGTGCCC CTGCCAAGAT GAGGGTGCGC 360
CGGCATGGCT GAGATGCGTG GGGGCGGCTC TCCACCCCTG TTCATCTCTC TCCTCTTCAT 420
TGCAAATCCT GCTTCAAGAG CAGACCAGAG GGCAGGCATT TGGCACAGAG TTAAACTCTG 480
AG 482
(2) INFORMATION FOR SEQ ID NO: 384:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 384:
GGACCGACAG TGCGCACGGA GGCCAGGTCA CCCCTCTCTG TGTTTTCCAG ACACTTACAA 60
GACCCTCGGA ACAACCCTGA CAGCAAAGGC GCTGCAGGAG CTCTTTCGCA GATGAGAAGC 120
CTGAGTCGGG CAGAGCTAGT AAAGGCCAGG AATTGGCCCC AGGCCCCTTG GAGCCAGAGG 180
GCCTAGCATT TCCCTTCCTC AAGCCATCCT AGAATGCTCG GAAGCAGAAC CCGACCGCAA 240
GCTGATAAGG GAACCTGTCA CGCATAGCGG GAGCGTCCCG GAGATCTGAG CTGTGGTCCT 300
CAGGGGATGG GCCTGCGTCA TCGGCCACCA CTGCCGTCAG CATGCCGCCC GGTGTGAGGA 360
GCGAAGGCTG TGACGGGAAG GACTAACGGG AGTGAAGGAT TACTCAGATT GGAnTGTCCC 420
TGTTCCGGGG TTAGTCCAGG ATCAAAGGAC GGGCAAAAAT GGGACC 466
(2) INFORMATION FOR SEQ ID NO: 385:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 385:
TGAATTCCTA AAAACAGGAT GTTAGGATGC CTTGGTTTTT CTAAACAGAC AGAACACTGT 60
TCCCCTGTTT TAAGTATATT TGGGAAGGGG TGAAGGGAGG TGGAAGGGGA GAGGCAGTCT 120
AGTATACTTA TTTCAAGAGC AAACACAGTC TGATGAATGC CTTGGATTGT TCTTGATGCC 180
AGCTGACTTT TCAGCACTCA TTAATGGCTT CCGTGTGTGC ACAGTCTTCC AAAGGCATGG 240
AGAGAAACCA TCATGCACTT AACCTCCATG GCATTGCTGT GTGGTATCCG GGCCCAATGT 300
CCCCACCCCA TTCCAGTGTG TCCTCATTTG TAAGTGTGGG AAAGAGCACA GAGAAACCAT 360
CATATCAAAT CAGAGCTGGT GTTTATCGAA CCCCCATAAT GTCAGGCAGT AGACA 415
(2) INFORMATION FOR SEQ ID NO: 386:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 416 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 386:
GGCACACCTT TTAAAATGTG ACAAGCGCTC ACACAGCAGC TTGCGGGTGC CAGGCCAGCT 60
CTAAGCTCAG GAGCATCACC ACGGTAACGA TGGTACCCAC ACCACATCAC CAGCACACCA 120
CGGCCGCCGC ACACAGTGGG CACCCGGGCC AGCGGCACGC CTTACCATTG CTCATGCACC 180
CGCATTTACT CGCAGCAACG GCCTAGCGAG CGGGGGTCTA TTCTTAGCCT CACTCTGCAG 240
GCGAGGAAAC AAGGCCAAGC CCTTGGAAGC TTGTCCTGGA GGAGAGCCAC GCGGGAAGTC 300
CTGCAGCCTC TGGCCACTGG GACGCAGCTG CCCGGGGCGG CAGACACCCT GGGAGGGAGC 360
TGCATCCTGC CGGCCCTCCC CTTCCTGAGG CAGGCCTGCA nTTCCTGTTC CGCCTG 416
(2) INFORMATION FOR SEQ ID NO: 387:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 387:
AAAAGCACAG ATACTTTTTC CTGAGCTGAG AGCACTGTGA AGTCCCACAT CCATAACGGG 60
CAACCCTTCC TGTGAGAACC CTTCTGTGGA GCTCTGAAAA GAGAACTGTC AAGGGCAAGG 120
AGGCTTGTTT GGTGAACCAG GAGTCCTAAG CTTTGCTCTT TGCTCTGTCT AGCAAGTTTT 180
CAGTTACCTG CAGCCATTTC TGGGTTAGGC TCTTTTCAAT AGTCACTTCC ATGGTTTAAG 240
TAGGACATAG TTCCTGAACT CAAGCAGACA CTTAAAGTTT AAACCCCAGG GTCTTATGGT 300
ACTAGTGCAG AGAGAGTGAA AAATGGGCCT CATAGACGAT CCTTAGTTCA CTGAGACACA 360
CCCTTAGAnG GTAGCTCTCA TGAGTTGAGT TGGGTCGAAC TCTTGCTTCT GGGCTGCCAT 420
GTGACTATCC TCTGTCATGT CATnTCTGGT CTCACCAGAC AGGCAAACCA AGGGGCCTGA 480
CCAGTTTGGA GGGTCTACTC CAAATAAACA AACCTCC 517
(2) INFORMATION FOR SEQ ID NO: 388:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 344 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 388:
TTCCTTTGTT ACCTGTGACA TGGATACAGT GCTTTGCAAA CTGGATAACT GAGGCCATGG 60
CCTCTGAGAG GACAGAAACA GGCCCCACCA GCTCAGGGCC CTGCCAACCT GACTCTTCAT 120
CTGGTGACTT GATGATGCAT CCATCTCAGG CTCCAATCCA ATGAGAAGCA CTAGTGTCCA 180
GATCCTAACC ACAAGACTAC TGCTGTCCTT TTATGGGCTT CCCAAACCTG TGTGCTTGTC 240
CAGAGCAGCC CACCCAGCTG CTTTGTGACC TTCCTGGAGT GGCGGCACCA CCATTTTGAG 300
TGCACCGTGT GCGAAAACAA AAACAACAAC AAAAAGCAGA AGGC 344
(2) INFORMATION FOR SEQ ID NO: 389:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 487 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 389:
TTGGCTTTCT CCTTCCGCCT GCCTGGAATG TGGCTGTGGT GCCTGACCCG GCTGCAGCAG 60
TTTTGCAACC GGATGGAGGG TAAGGAGCCC AGAGCAGATG TTAGAAGGAG CCTGGCTTCC 120
CGGTACTGGG TCCCAGCTAC CACATCGGCC CTGGGCTGCC TGTTCCTTTT TGTTTGTTTG 180
TTTAAAAATT TATTTATTTG GAAGGCAGAG TTAGGGAGAC AGAGATCTGC CATTTGCTGT 240
TTCGGTCCCC AAATGGCCAT TAACAGCCGC GGCTGGGTCA GCCCGAAGTC AGGGGCCTGG 300
GGCCCAAGGA CTTGGGCCAT CTTCTGTTGA TTTCCCAGGC ACATCACTGG GGAGCTGGAT 360
GGAAGTGGAG CGCTGGGACT CGAACCGACG CCGCCATCGC ATGGTGGCAT AACCCGCTGT 420
GCCACAGCGC CAGCCTTCCC GCACCGCCTG TTGGTCCAGG TCCTGGCTGC CAGGAGGGAG 480
GCGGTTC 487
(2) INFORMATION FOR SEQ ID NO: 390:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 631 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 390:
AGGAAACAGG AGACAGGGTG CCTGGGCTTT GGCCCTGCCC ACTGCTCCAC GACACGGGGC 60
AGCTCACCCC CTGGTCCAGC CTGGTTTCGG GAGCGGTGCC TGGAGCAGAG TGCCCGGGGA 120
GAGGCCTGAG CTGAGGAGGC GGACAGGGCC AGGACGCGGA GGGCCTCCCT TTGTTCCATC 180
GCCTGCCAAG CAGGAAGGCT CTCCCCAGGT GATGCGCCTT CTCTGGGGCC GCTTCCCTCC 240
TGGGATTCGT CACAGCAACC TGAGTCTCAG GGAGTGTGCG TTAATTATCA AACGCTTCAC 300
AAGCATCAGG AGCCCGCAAG GTGCTTCACC ACCATAAAGC ATTTCCTCTC ATTAGTGTTC 360
ATGATCATAA AGCTTCAGAT GCGGCTCCCA TGGCAACCTT AACTCCTCTC CTCCCCCCTC 420
ACACACACAC ACACACACAC ACACACACAC ACACACAAAC ACACAGTGAC AGGATTTTTC 480
ATTGCAGTAA TAACTGCAGC TCGTAAACCT GCCAAATTAT GGGGAAACAT TTATTTTTAT 540
AAGCTGAGAT AAGTAGGCAG CGCGGGGGTA GCAGATCCTC GCCTTGGTGG CTCCAGAACA 600
AGCTGGTCCT CAGACTAAAG ACATTCGCAA G 631
(2) INFORMATION FOR SEQ ID NO: 391:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 391:
CATTTGCATA TATGCTAAAG CGATTTGTTT TGTGCTAACC ATGAGACTTG TATAACAGTC 60
ATAGTATACT ATTTTTAGTA CTTCTTGTAG CTTTAGTTAA TTGTGATCAT TCTGATATTT 120
AACTTTTCAT GCTGGGGATA CATCTAGGTA TTTCCTTTTG GTATTTCCAG CGCATTTTTC 180
TCACCCTCAA TCATTAGGAA GGATCCGGAC TTACAAGATA AAGCAGTGAG CAGCACTCAG 240
AAGCCATGTG CTATTAATTA GAGTGGCTCC ACAGCGATGG AACCCATCCC ATTGCAGGCT 300
GGCTTGCCAC GGCCATTCTC CTTCTTCGAC CTCTGTGCCA CCAACTATCC TGACACTCTG 360
ACCTGCGGAT TTACTCCTTC GTGTCCCACA ACCTGTCAAA GAGATAGGGA TGCAGTAATT 420
AGCAATGTTT TATGTATCAA TGGTGCTACT ACATCCCAAA TTCTTGTACT TGAGCTCCAG 480
CAATATGAAG TACTTTTTCG AGTGTTCCTA CACAGTAGAT TTATTTCATT AGCTACTCTA 540
A 541
(2) INFORMATION FOR SEQ ID NO: 392:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 392:
ATTTTAAGGA AGAATCAAAA TGAAATGAAG AAACTAGTAG AACAGGAAAA TGTGATAGTG 60
AAGAGAAATC AAAATGAAAT GAAGAGCTCA ATAGATCAAA TGACAAATAC ATTAGAGAGC 120
CTTAAAAACA GAATGGGTGA AGCAGAAGAG AGAATATCGG GACTTAGAAG ACAGAGCACA 180
GGGAAAACAT ACAGTCAAAC CAAAGAAAGG GAAGAGGAAA TTAGAAATCT AAACATATTG 240
TTGGGGAATC TACAGGGATA CTATTTAAAA AACCAACATT CGAGTTCTAG GGAGTTCCTG 300
AAGGCATGGG AGAGAGAGAA AGGCTTAGAA GGCCTTTTTA ATGAGATACT AGCAGAGAAC 360
TTTCCAGGTT TGGGAGAAGG ACAGAGACAT CCTACTACAG GGAAGCTCAT AGAACCCCCA 420
GTAAACCTGG ACCCAAAAGA GATCCTCACC ACGGACACGT GGTAATTTAA ACTTACCACA 480
GTGGAACAnT AAAGGAAAAG ATCCTAAAAT GTGCCAGAGA GGAAAnGGCC 530
(2) INFORMATION FOR SEQ ID NO: 393:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 208 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 393:
ATTCAATCCT GTTAnTTTCT TCTCTGATTT TCATTnTTTC TCTTCTCCTA CTAGATTTGG 60
GATTGGTTTG CTGCAGTTTT TCTAGGTCCT TGAGATGTGC TGAAAGTCAT TTATTTGGTA 120
CCTTTCCAAT TTCTTGA AT ATGCnCCTAT TGCTATAAAC TTGCCTCTCA GTACTGCTTT 180
TGCTGTATCC CATAAGTTTT GATATGTT 208
(2) INFORMATION FOR SEQ ID NO: 394:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 189 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 394:
GCCAATCCCA CCGCCCAAAG CCGAATATAA CTCATGACAT CTGCAAATAC ATTTACAATG 60
CCCAGCAACG CATTAATTAC GTTTTTCATA CTGTCTGCAA CACTCTGCCG CACACTCACC 120
CGATAATTCA CAAAGATAAA ATTCAGCACA AAACCCGCTA TTATAGACCC GACAATCATG 180
CCGGCAAGC 189
(2) INFORMATION FOR SEQ ID NO: 395:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 585 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 395:
CTCACCCATT TGTGCAGATG CTCCTCAGTG CTCTACTCTT CACCACCTGA GTTCACTGCA 60
TTCCCTAGGA AAAACTCAAA ATGCAAATAT TAGCCTCTCC TTGTAGTAAA ATGATTGCAT 120
CTCATGTCTT GTGTGAGGTG TGAGGACCAG GCCATGTCCT AGGGGTTTTT CATCCCTGCA 180
GCCAAGTGCC TTGGGTATGT TCAGTTTGCA TTCCCCTAAA AAGTAGTTCA GAAAGTATGT 240
TTCTCCCCAA AATAAAGTCA TAACTGATGT TGATTTATAA AGATTTACTT ATTTATTTTG 300
TTTGTTTCTT TTTTTTTATT TGAAAGACAA ACTGGCTGCA GCAGTCAGAG CTGGGCCAGG 360
ATAAAGCCAG GATCCAAGAA CTCCATCTGG ATATCCCATG CAGCTGACAG AAACTCAAAT 420
ACTTGAGTCA TTATCCACTG CTTACTAGGC ATGTTAGCAA AGTGTTAGAT TGAATGCGGA 480
GTAAGCAGGG ACTCAAACCA GCACTCATAT ATGGGTACAG GGGCCCCAGA GTGTCAGCTT 540
AACCCATTGn ACnATACTCA CCCCCCTATG GGGACTCTAA AACAT 585
(2) INFORMATION FOR SEQ ID NO: 396:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 396:
GTAGCAAAAA TATACCATCT GCAGTTCCAA TTATCTTGCT AGAATTATAT GTTTGACAAA 60
CAATACATTA ATTATAACCA TTTTTATCAA TTTTACAAGT GTCACTAAAC CAGTTATTTG 120
TATCACTTTG TTCCGTGATG AATTATGGAA AGTAAGCCCT TTAATTTTTA AGCATTTTAT 180
CAAAGTACAA TATACATCCC AAGAAATGCA CGTGAGGTAC AAATGAATGA GGTTTTTGCC 240
ATCTGATCAC ACTCATATCC AGCACTCAAA CTAAGAAAGG GTTAATGATA CCTCGAGAGT 300
TCTAACTATT CTCCACATTT CTATTGTTTT GAACATGTGA AAGTGTATCA TCATGTATGT 360
GGTATGATTC AGTGAAACAC AATTTATAAC TATATTTCAA ATTATACGAC ATCCACATAG 420
TATTTATACT CTCTATATTC ACTATTAAAA ACAAGAATGC ATGTGGTATT TTTCTTTTAG 480
GGTAAGACTT ATTTCACTTA GAATAATGGT CTCCAGTTGC ATCCATTTCA CTTTAAATGT 540
CAGGATTTCA TTCTTTTACA GCTGAGTAGC ATTCCATCAT G 581
(2) INFORMATION FOR SEQ ID NO: 397:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 617 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 397:
GTCAGAAAAA TAATCAAAGG CATACAAATT GGAAAAAAAG ATGACAAAAA TATTCCTATT 60
TTAAGATAAC ATAATTCTTT ATTTGAGGAA TCAAAAACTC CACTATTAGA ATTCATAAAC 120
AAAATTTGAT AAAATTTTAG GATACAAAAT CAACTTACAA AAATCAGTAA CATTCTTATA 180
CACTTACAGT GCTCTGGTTG ACAAAGAAGT TACAAGTTCA GTCTCATTCA CAGTCATTAC 240
AAAGCAATTA AATACCTTGG GATCAATTTA GCAAAGGATG TGAAAGCTCT CCACAACTGA 300
AATTACAAAA TGTTAATGAA AATTATAAAA GAGACACAAC AAAATAGAAA CATTTTCCAT 360
GATCATGGAT TGTAAGAATC AGCTTTATCA AAATGTCCAT ACTACCCAAA GCAATTTGCA 420
GATTCAATGT GGTTCCAATC AAAACACAAA GGnAATTCTT CAAATATCTA GAAAAAAATA 480
CTAATGTTCA TATGGnAACA GAAAACACCC TGGnAAACAA AAGCAATTCT AAAAAATTTA 540
AACAAAGCCA AAGGGTCACA ATACCAAATT TTAAAGCATA CTATAGGGGC TGTTATAATC 600
CAAACAGTCT GGnTCTG 617
(2) INFORMATION FOR SEQ ID NO: 398:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 486 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 398:
TGCTTTACTG CTACCAGACA GTGGGGTGTT TTCCTTCCTG CTGCACAGAA TTTTTACAGG 60
GATTTAGCAT TTCCTAAACG GTATTTCATA AAGCCTTAAC CGATTTCCCA CAAAATAGAA 120
CTGGGTCCTA GATTTTCACC TTCATTTAGA AATCCAGTCT ATCAGGATCA ATTTTACCAC 180
TTTTTAAATC ATTTTAGAAA GATACACACT TGCCTCCCAG TCATCGTGCT ATAGGAAAGA 240
TGACACTGGG GCTGATTTTG ATGTCTCCCT GAGTGGCTTT TCTTCTCATG ACCTTAAAGC 300
TGCTAACAGC AGTCAATAAT TGAGGAGAGG TAGGCGGCAT TATTTTTTCT ATTTAAGGGT 360
CAAAATTTAA GCTACTTATA CACTGCCATT TGTGCTCTAT CTCTGAACTG ATAACCTACT 420
CTAGTGTTAC AGCAGTCTAC CCAAAGAGGG AATCCAGGGG CCTCAATTTT CTCTTGACAC 480
TACACG 486
(2) INFORMATION FOR SEQ ID NO: 399:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 666 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 399:
ACAGGCCCTA TCCTATGGCT TTTCCTTCTG CATCTCTTCT AAGACAGGCT AACGGGGACA 60
CAGATGCCTC CTGGGCAGCC AATCCAATCA GGAAACCAAT CCAGAAGTTA ATGTGAAGGA 120
CAGTGAAACC TCAGGCATTG GTTCAGGAGT TACGGAATTT GTGATAGTTT TAGAACCTGG 180
GAAATAATTT TCTATGTGGT CTCAAGAAAA CTCACTTTCT ATTTACAGAT AATTCTCCTC 240
CTCCTCCTCC CCCTTCTCCT CTTCCTTCTT CTTCTTTCTC TTATTAAAAA AAATAACATT 300
TATTTCAAAG GAAGAGTTAC AGAGAGGGAG ACAGAGACAG AGTCTCTGCA GTAGCTGGGG 360
ATGGTCCAGG CGAACAGnGA GCCTGGAACT CTATTTGAGT CTCCTACGTG GGAGCAGGGC 420
CCAAGTACTG GGGCCATCCT CCCCTGCTTT CCCGGCACAT TAGCCAGTAA CTGCACTGGA 480
AGTGCGCAGC TGGGTCTCCA ACTACACTCA TACTGGATGC CGGCATTGCA AGTGACAGCT 540
TAGGCTGCTG CACCACAGGT GGTTCTCAAA TTTTCATTTC AATATATGTT CTGAAATTTT 600
AACACATAAT TGTACTTGTT TATAGGATAC ACTACAGTAT TTTAATACAT GTATTCACGT 660
GTAATG 666
(2) INFORMATION FOR SEQ ID NO: 400:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 485 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 400:
AGTTCTTCCC GAGACACAAT TCCTCATGCT GGCCGTTTTG TCTTGCCTTT CTTACCTTCC 60
TCCTCTAGGA ATGAGGAGAA AAGAGGGCAC TGAGAGAGAG GGTGAGAATA ATCGCTCTTT 120
TCCTCCCCCC CTCTGAGCGT GTGCTCTCTC TGAGGCTCAT TTCATGGGGT TGGGAGTATG 180
GGGAGAAGGG GTGGCAGGAA CAGAGGGTCC CGTGTGATCA CAAGAGCATG CACTGTTTCC 240
ACAATTATTC TAAGTTCAGT AACTACACAG CGTGGCTATA TGTTTCATTT TGCCTGTCTC 300
TGAGTGCCTA GAGAGCTGGC CACACGTTAT TCTGGGTATG TCTGTGACGA TGTCTCTGGA 360
TGCCATTACC ACACAGAAGG TAAACTGTAA ATGCATGCCC TCCCCAGAGT GGTGGGCTCC 420
GGCAATCTGC TGAAGGCCTG AACAGAACAA ACAGCAGAGT GAGGCGATCT ACTGCCTCTG 480
CCCGG 485
(2) INFORMATION FOR SEQ ID NO: 401:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 563 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 401:
GACATTTCTA AAACTAGCAT CTGAATTATG GGAAAAGACT TTTTTCTCAC CTTAGCTAGA 60
TTAATGCAGA CAGAAAATCG GTTTAATATT TGTGTGGACT AGGGGTGGTG TCTGTGTGGT 120
GTTTAAGTGA GGCATGTGTC CCATATTGGA GAGGCTTTTT CAACTACAGG CTACTCTGTC 180
TCCAATCCAT TTTCCTGCTA AAGTGAACTA TGGGAGGAGG CAGATGATGG CTCAAGTATG 240
TGGATCTCTG TCACCCATGT GGGAGACCTG TAATGAGTTC TGGGATCCTG TCTTCAGCAT 300
AA ATAATCT TGACTGTTGG GGGCATTTAG GGAATAAGCC AGCAGATAGA GATCTGTTAT 360
CTATCTATCT ATCTATTTGT GTTTCTGTAA TTCAAATAAA AATGGAAATA AAATCAAAAT 420
GAAGATAAAC ATTTGTGTTT ATAATAGTTT ATTCAAGATG ATTAAATTCT AGTGTACTGG 480
ACCACTATCA CTTGCAAAAC ATGAGAACAC TGTGGATGGT AAAATCCTCT TGCTATTCTT 540
ATGTGAAGAG TTAGACTGCA CTT 563
(2) INFORMATION FOR SEQ ID NO: 402:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 440 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 402:
CATCACTTTA AGTTGAAGGT ACAACCTCAT GTGAAACTCA TGAATTCTTT GAAAATGTAA 60
TGTGGCTCAC ATGGAATGTT CAGCGTGCCA TGCTGCCCTC TTACACTTGG AACAAAATGA 120
AGAGAGATGA CCAATTCTCA TGTGACAGTG ATGGAGCCCA ATTATGGATG ACTACCATGA 180
TATTCTTTTG AAGAGTCACT TGAGCTAATG AATTAATATA TTAATTTATA TTGTTAATTT 240
TAACTATCTA ACACAGTTAC TATTTATGTA TGTATGAACC ATAATATAAT AATTTAAGGn 300
ATTTCCACAA ACAATGTACT ATATTAGTAG AGAGTAACTA AACTTCTTCT TTGCnTTATA 360
TAATTATAAA GGATGTCAGA ATGATACATT TTCCAAAACA ACATATACTA TAATATTAGG 420
ATAAAATAAT GGAGTTAAnT 440
(2) INFORMATION FOR SEQ ID NO: 403:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 403:
ACCAnTGACG TTTTAGGnGA CACTTTATCA CCTTCTTACA CCCTAGATCA TCTACTTACC 60
TACTGTGGCT GCATGCCTAT GCCCTTACCA TGGCATTTTG ATCACCAGGT TAAAAACAGC 120
AGTTCTCAAA TTCCATGTCC CTCTACCATC AnnCATCACC TAGGAACTTG CTAGAAATGC 180
AAGATTTGGG ATCCACGTTG TGACACAGTG GGTTAAACCA GATCTGCAAT GCCAATATCC 240
CATGTGAGCA CTCGTTnAAA TCCTTGACTG CTCCACTTCT GATCCAnCTC CCTGCCAGTA 300
(2) INFORMATION FOR SEQ ID NO: 404:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 439 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 404:
TTATAAATAA TAATAATAAA AAAGTGTGCA CAGCACAAAA GAGAGAAATG nTTCAGGGAT 60
TCAGAAGCTA GAAATTCACT TCCAGGGTnA nTGGGCCTTG GCCTCAGAGG TTTTGATTTG 120
GGTGTAAGTT GATGCTGCnA GTCCTGGAAA ATTTGGGATG nnACTCCGCA nTGTTGTATG 180
TATTGTAAGA ATATACACAA TCTTAAAAAA GAAAACAAAC AAACCAGAAA AACCCCTCTC 240
GnCTCAGTAA CTATATTTAC CCTGAGCCAA GTGCAATACC TGGCTCCAAG AAGATTTAAT 300
AAATCTTGTG TCTTAGCCAG ATCCCTTCAG TTAGCATAGC TATCTCTAGG ACAAAAATCT 360
GGTGAAAAAG AGAAAATGTA ACAAnTGACA AGTCAGTAAA ATCACACTTA GCCTGCCCAA 420
ACCCTTGACA GGCnTGTGA 439
(2) INFORMATION FOR SEQ ID NO: 405:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 405:
AGGGCAATCA CTTAGATGCA ATGACAAGGn CAATATGAAC TCCAAGAAAA CAAAAATGTA 60
AAGTCTCTCT ACCATTCCTA CATACTCAGC ACAAAGTATA TATTCAATGA CTAAGGCAAC 120
ATTTAAGAAT GAAGGCGTAG TAGTTAAAAA AAAATAAAAA CAAAGAATGT AACCATACTT 180
ACCAAAACCT AGGTTTAAAT ATAGTGGATT ATATACATTC AAATATAGAT AGGTATTTTT 240
TAATTACTGA AATCATTTTT AAAGAGAGTG GCAAAAGTAT ACCATTTAAC TTCTCTTAAG 300
AGCGGTATCC ACTGTGTTTA TCCTTATAAT TTAAAGTTAA AATTCTGAGA GGACTCCTGA 360
TnCAGAGGAG GCATCAGATA TCAAGTCAGG TTAACACAAA TGnCACTTT 409
(2) INFORMATION FOR SEQ ID NO: 406:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 568 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 406:
AAATTGTTTT GTACCAAAAA TATTTTCATT TCATTTTCCA CATGACTTTT GAAGTACCTG 60
ACATACTACC CTGTACTCTT TCAGGACATT CTAGGCCTAA TAGGTTGCCC TTAATTCATT 120
TAAAGTATTT AATTAAATTA TTTCTCCCAT TTCACATATG AGAAAACTGA GGCATGAATA 180
GAGACTCATC CAAGATCATC CACTAAGTCA CTCTATGTGG TGGAATGAGA AGGCCATGCA 240
CAGATGGCAC TGGCAAACAA GTGTCAGTTA CTCAGGAAAT GGCTTTGGAA CCCACCTGGC 300
AAAGGGGCCT GCCTGGCAAC AGCnTnGATT GGTTAGGGCA TAAACCTCCC CTTGACCAGA 360
TTGGCTGCCT GGCTATATAA GCTGCTGCAC CAACTGAAAT AAATGAGTCT GCAAGCTGCT 420
CACCTTTGGC CCGCTTTCAC CTGACTCCTG GTGTCTGTGT CATGACTCCA TGCCTCTTGC 480
CTGCACTGCA CTCCTCCTCT CAGAATGAGT CCACAACAAC ATCTCTAACA GTGTTACCCA 540
GAGTCCACCC CCTCTTCCAA GGCTTCAG 568
(2) INFORMATION FOR SEQ ID NO: 407:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 635 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 407:
AGCTGTCTCC ATCCTACTTC AGGAGCAAAT GCTTTGTCCT CATGCAAAAA GCGCATACTA 60
TGATTTTACA CCTGGGATCT TCATCACGAT GTACGTGCAC AAATGTGCAC AGGACAACTT 120
TGTTTTTACA TTTTGACTCA CTTTCTAAAA ATGTAGGCTA ATAAAATTTG TTGATAACTA 180
CAAAACTCTC CAATCACACA CACCCCAAAG CGCCTTCAAA TTTTCCAAAC TGTAAATCTA 240
CAAGGTATCA ATATTACAAG AAGTACCTTC ATGGATGTCC ACTGAAATAA AGTCTCACTA 300
TTGTGCTTTC GGTCCTCCAT TTTCCACATC TGTATAAAAA AAATCACACC TCGCTCTATA 360
ATTGAGTTTC AGAAATAAAG CTCTACTACA TGCATGTGCG TTCATACCAT TTCTATAAAA 420
ACCAAGGGCT CAGATGCCTT TGCACAGTGA AACAGATCAC AGTGAAACAG ATGATGGCAG 480
GCACATACTC TTCTCCTTGC TTCTCAGCTC CTCCTTGCTT GAGAGAAGGC AGGAAAGCAA 540
GCTGAGGGCA CTGCCATGTA TGAGGACCAA CCACTGTTTT AGTCCCTTTA CATGTGTTGT 600
TCTTATTTAA CACCTACAAC AGACCGAGGG AATCA 635
(2) INFORMATION FOR SEQ ID NO: 408:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 408:
AACATCCGTA CCACTTGCCC TCAGGTAGAA GCTAACCACA AAAAGAAATG CAGCAACATG 60
AAAATGAAAT GAnAATACCA AATAGACTGA GGCTTGCCTG AAACCTGTTT GTGCTACCCT 120
AAAAGGAGAC TAGAGAAGGC TAGAGAGCCT TGAAGAAAAA ATAAGAAGCA AAGACTTCAT 180
TTTGGACCTA ATCGGAAGAC TGGAAAGACA ATTAAAAGAG GACTGCACTT CTGAGTGTTT 240
CAGTCGGACT TGCAGTCCCA TCCATGTTCT GTCTAAGATT GTCTTTGGAA GTGTGTTTGG 300
CCTAGTGGTT AAGATGCCTG AATTCCTGTG CAGAGCnCTG GGTTCTCATG CTGGCTCTGG 360
TCTTGATTTT GGCTTCCTGC TAATACGCTT CAGGGGAGGC GGTGAGCAAT GGTCCAAGTA 420
CTTGGGGTCC TTTTACCTAT GGATTGAGTT CCTGATTCCT AAGTTCACTT CAGCCTGGCC 480
CAGCCCTGGC TATTGGATGT ATCTGGAGAG TGACCAAAGG GATTGTATCT TAGTTTGTCT 540
GTCTCTTTCA CAAAGATTAA ATTC 564
(2) INFORMATION FOR SEQ ID NO: 409:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 637 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 409:
GGAATATTCT GATCATTTAG AACAAATTGA CTTGATAAAT TTCAATTGTA TTATTACCTT 60
CTAAATCAAA TAATAGTATA TATGTTCACA AAACAGTATT GATTTCTCTT AGTGAGTTCT 120
ACAAAATGCT AAATTTCCTT TTAAATTAAA AATATTTGGT TGGTAGAGGA TGGCACTTGG 180
TCATGAAGCT AAGACACCAG TTGGAATGCC CACATGTCAT AGCCTGTGTT GGACTCCCTG 240
CCCTGCTCCT GATCCCAGCT TCCTGCCAGT GTGAATCCTG GGAGGCAGCA GGTGCTGGTT 300
CAAGTGGTTG GGCTTCTGCT GCCCACATGG GAGATCTGGA TGGAGTTCCC GCCTCCTGGC 360
TTCAGCCCTG GATATTGCAG GAATTGGAGA AGTGAAGCAA TGGACAGAAG ATAGCGCCTC 420
TCTTTCTCCA TCATTCAGCC TTTCAAAGAA ATAAGATAAA CTTTTTTTTA AACTTTTATT 480
TAGTAAATAT AATTTTCCAA AGTACAGTTT ATGGATTACA ATGGCTnGnC CCCGCCATAA 540
TTTCCCCCCT ACCTnCACCC CTACCATCTC CCGCGCCCTC TCCCTCTCCC ATTCCATTCA 600
CATCAAGATT CATTCTCAAT TATCTTAATA TATACnA 637
(2) INFORMATION FOR SEQ ID NO: 410:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 486 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 410:
AGGACTTGAA GCTGATCTAG AAAGTGGGCT CTGCCTAGAC AGCAGTAGCA GCCCTGTAAC 60
AGGCGTCTGC CGGAAAAGCA CCAGGGCTTC TGCCCGGCTC CTCCTGAGGT GGGCTTTCCC 120
TACTTCTTCC TCTGCGGTTC TGTACTGCTT TCCCACGGTT TCCAAGCTGC CTTGGTTGGT 180
CGTTCTTTGT ATCTTTCTTG AAGGACCATC CTCAAATGCA GCTAGATCTC AAGTTAACAG 240
CCAGGCCAGG GTGGGTGGGG TGTCACATTC CTCCTCTGAG CACAGACCCC AGTCCCCCCC 300
TCAGTTCACA GTGCCGTTTG CCTCATCTGG GCTTTGCCAC TCACTAGCTG TGGGACTTGC 360
CATGAGTTGC TTAACCTTTC TGTGTTTCCC TCGTCTCAGC TATAAAAGGA AGCTAATAAC 420
CACCGCCACC TCCAGGATTC TGGGAGGCCT TAGTGAAAGC TAGGGAGAGC CGGGCGCTCC 480
GGCGGT 486
(2) INFORMATION FOR SEQ ID NO: 411:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 417 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 411:
AGGATTGTTT TGCTTATGCA AGGAAGGCCT TTGAAGATGT TTGTCAAACT CTCTGTGATC 60
TGATGCTCTA AGTTGCTCTG ATGCGGACAT GTATTAGCAG CACATGTGTC TGCGTGTTGC 120
TGTTCTTAAT GTAGTCACCG ACAGCCTAAA TGCTTCTAAC GCCAGCTCAC GGGCACACTC 180
TTAATTGGTC ACTGTCCTTA AATACCCAGA GTTCCTACAA CCACGGCCAC ACCTCCTGGG 240
AACTAAGAGG GCTGTTGGCA ATGCACAGGA AGGAGGTCCA ACTGGGGCCG GGGCTCATGT 300
CCCCATCGAA GGAAGGACCC TGACTGCGTT GCAACTGCAA AGCCTTTAAG AACTGTTTCT 360
TTTTATTATT TGTGCCTTTC TGGCCTCATT AATGAAAGAG CAGCTCAAGA TTCAGGG 417
(2) INFORMATION FOR SEQ ID NO: 412:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 481 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 412:
ATAGGCATGT ATACATATGC ATCACACACA TACAGATGCA TCGCAGGCAT CCATATAGAT 60
GCAAAGTATG TTTCTGCATG TGTATATATG AGTTTACACA TGTTTGTGAA TGTATTACCT 120
GGATTTCTCT ATATAATCCA GGGTGTGTGT GTGTGTGTGT GTGTGTGTGT GTTTTCATTT 180
CTCACCCTGA ATCTGGAACA GTGTCTCCTT GGCTAAGAAA AAAAATCACA AGGCTCAGAA 240
GGAGCAGATA AGGGATTCTC CAGAAATCCC CATCCCTCAC CTGGGAGCAA CATTCCCTAT 300
TGACTTGAAA GGTAAAACAA TCAGTTTCCA CCTCATCCTC CATCCTCTGT GTGTCTGAAT 360
CATATTTTTT CTCTGGGGAG AAGTGTTCCT GAGACTACAG AAAACTCCCT GGAGGACAGG 420
AAGACAGAGG AAAGACCTCT GTGGGTTGAA GAGGCAATTG CAGGAAGCCC AAGTGGGAGA 480
G 481
(2) INFORMATION FOR SEQ ID NO: 413:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 413:
TTATAAACTC TAAACATGTG CTCCCACAAG TACTGTTATT TCTCCAAGGC ATCACATTTC 60
CTGGTGGTGG CTTCTGGTTT AACAGTCTGT GTCTTACACT CGAGCAGAAA CCCTGTGAAG 120
CCAAGGAGTT AAAATCTTTC ACCATCCCCT GCTATAGTAA AATATGGGCA TGTATATTGC 180
ACTTCACAAA TAATTGAGGG GGGGGGGAGT TGTTACACAT GAATGAGTAA CAATTCAGGT 240
GGTCCAACTC TATAGTTAAT TTTTGATATA GTTAATTAAT TGATATAACT CCCTCAATTG 300
GAATGAAATC TTTCAGCAGA GAGATGATCA TCATCATCAA AATCTTCTCT GGGTCCCACA 360
TGCCTACACT CCATTGAGTT AAGAGAAAGA TACTATGGGG nGGnACCGTG GCTCA 415
(2) INFORMATION FOR SEQ ID NO: 414:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 414:
TACTTTTATT CATTCATGAA ACGAGTCATT GAGCACACGG AGCACCGGGG TGAACACGTG 60
GATGTCACTG TGAAGGTGCT GGCATTTGTC CCGGGAAGTG CAGGCTTGTG AACGGATCCT 120
GGAAGGCAAG CGCTGCAGAG GTGCTGTGCG CAGACCACGA GCGGCTCTCG CAGCCATCTG 180
TCCCAGGAGT TGAACATCTG CCGCTTTCTT CTGCGAAnTG TGGCGTTTCC AAGGGTTTTC 240
TTCAAGAGTG GATGTTGAGT TAATGATTGT TGAGATTAAT TACTTCATCA GAATTCTTTA 300
AGAAAATGAC ATTCCTATTA ATGTTTCCCT TCATGCCCAA ACCACAACTT TTACTCTTCT 360
TGATTTAACC TGCCTGGTGG CCTGTGAGTT AGGAGTTAGA TCCCCTTCAA ATTCCTGCTC 420
ACGTATTCCC GAAAGCTCAC GCTCCCAGTT TTATCTCGCC GCCTCCCGCT GAGAAGGAGC 480
GTGATGATGC GAACTTCCGC TGAGAACTAA TTATGGAGAG GCAGTGAAAT 530
(2) INFORMATION FOR SEQ ID NO: 415:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 415:
TGCTCGCCAC AGGGGACATC ACCTACTCCC TTGGCCTCCC ATCCAGTAAG CCTATGAAAG 60
ACCCCCAGAG GAGCTCTCGG GGATGGAGAA CACACGAAAG GCAGGATACG GAACACGAGA 120
ACAGCTTTAG GGGTATTTCC ATCAATGTCT GCCAACCGTC TCCTCACATT TTTGGAGATG 180
ATTTTGTTTT CCCCAGGTAA AGTCTGTATT GCTGGTGAAT GCAGAGGTCC CCTGCTATGG 240
ACGGGCAGCT CCCCGCCCCC TTCCCATCAT AACCAAGCCT ACTCCAAAGC TCTGACTGGT 300
TGATACCCAG CTAGCCTTCC TGTGAACACT CACCAGTCCT CATAACGTGA GTAAAGGATT 360
TTCAGAGACA CTTTAAGGTA AGGCCAAATG CGATTTCTCT ATTCTACAAA AGGGAGCCCA 420
GAAGTGGTTA AAGAAGTTCA CTCAAGGTCC CTATAACCCA GCAACAGCAC GCACAGTGCT 480
TGGGCCTTAG GATCCTCAGC ACAGTTCCAG TGCTC 515
(2) INFORMATION FOR SEQ ID NO: 416:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 416:
AGGACACTGT TCAACAAATG GTGCTGGGAA AATTGGATCT CTGCATACAA AAGTATAAAA 60
CAAGGTCCCT ACCTTATACC TTATACACAA ATCAACTCAA AATGGATCAA GGGGGGCTGG 120
CGTTGTGGCA TAGCGGATAA AGCTACTGCC TGCAGTTCCA GCATTCCATA TGGGTGCCAG 180
TTCTAGTCCC AGCTCTCTGC TGTGGCCTGG GATTGTGGGG AGCAGGGTCC ATCTCCCGCA 240
ACATGGTGGA GCACCCGGGA GACAAATAAG TACTGAGGCT GTGGACTGAn TGAAnCCTTT 300
(2) INFORMATION FOR SEQ ID NO: 417:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 417:
TGnGAAAAGA GAACAGAGAC CTTAAAAGGT GGCTTGCCCA GAGCTGCCTG CTTAGCCTGT 60
GGCAGTCAAG GGTACTAAGT AGGGCTCGTT GCTCCCAGTC CAGTGCTGTT TAAAATTGCC 120
TCTTTGTGTT TCTCTTTCTT TTTCTTTCTT AATGGGCAGG CCACCTTTTC TGGACTGCTT 180
TTAAAGAATT GAAAGCTGAA TATTAGCAAT TTTTCAAAGA CACGTATACC TCTAGTTGCT 240
CCTCTTAAGT GATAAGTCGT ATCTGTTTTT TGCTGTAAAT AAAGTTAATG TGTTCTACTT 300
AAGTATGTTT ATTCAGTAAA AATATTTGTT GAATACACTT TATCTTTATG TGTCTCTCTT 360
TACTCTTAAT AATAACCATG AGAATTGCGC TTTCACATTT ATTCAGCTTG TAGTGCATGC 420
CAGGCCTTCT GCTGAGTGCT ATGTCTGTCT GACTGTCACT TGGAACACAG TCCGTGTTCC 480
CTGTAGTCTA GAAAGGCAAG GGGACTTGAA GACTGCTGGT TGTCATAACT GAGTGCTGAA 540
GAATCGAGAC AGGACATTGA TAAATGATCA TTGAG 575
(2) INFORMATION FOR SEQ ID NO: 418:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 704 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 418:
CTATTTCTGT TGACCAAATT TCAGCCAAAT TATACTCAAT CAGCATGCTT TCAGCAAATA 60
TCAAACTCAA ATTACAGGCT GGTAAATACT TAATACCTTG CCTGTACATT TAGTTGGGCT 120
AATAGATAAG CAGAATCAAC TTGGTAGGCA GAGTGTGCCT GCTCCTAACT ACACTGTGGT 180
ATGTGAGTTG AACGAGAGGA ACGTGGTTTA TGTAAGCAGT GAGAGAACAC ACAAAGTGTG 240
AGGAAGAAGA TTTTGGAAGA CTACAAAGAC CAAACCCAGG AGAGGCCATC TCTAGCACCC 300
AAGGAGGTCG TGCGGAGGGT GAGTGTGGTG ACCATGACCC TGCTTACCGG CACCCAGCCC 360
TTGGCTnCCA TCTCCTCAAC CCCATCCCCT CTCCATTCCC ACATAAAGGG GGGAGAAAGT 420
CATTAGCAAA GTATCAGAAT GATAGCAGCA AGTGATGAGA ATGAAAGGAG AAGATGTTGT 480
AAGCAAATAC AGTGGAGCAG TTAGCTGAGA TTCTCTGAGC AATGCAGACT CAGCCTGGAA 540
GCCCAGCTCC ACCACAAACC ACGGGAGGTC TCCCCAAGTC ACCTGTCTCT GAGCCTGTTT 600
CCTCATATGC GATATAAAGA CAACAGTATT AAAGCATAAA TGGTAATAGT TCTATTCCTA 660
CCTTGGAAAA TATTCTTGTA AGCATTTGAT GTGATCATAT ACGT 704
(2) INFORMATION FOR SEQ ID NO: 419:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 516 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 419:
GAAGCCACAC GGGGCTCTGA CCACAAACCC AGGCTCCTGn CCCCAGGCCT GGCCAGGCCC 60
CTGCTCCCCT CCCCCTCTGC AGCCCCTGAC CAAGTGCCCT GGGCATGATT AGAGTGAGGG 120
AGTGAGGCCT GCAGACCTGC AGCTGTCCTG CCTTGCCCCA CGCAGCAGTT TGGAAGTTAC 180
CAAATGCCTT CCGTGACAGG CGGCCCCCAG GACCGCGGGG TGCGGGGTGC CGCGGGAGTG 240
CGTGCAGAGG GCCGGCTGTG TGACCCCAGG CAGCGGACAC CCTCTCTGGA CCTGGCCGGC 300
CCCTCGCTAA TGGGCTCTTG GTCGCTCTCC TCCCTGCAGC CTACTCTGCA ACCAACATGT 360
TCCCTGCAAA CCGTTCCTTT GGGATTAGGA CCAAGGGATC CTGTGGGCAG CTTTTAGGTA 420
CAGGTAGGGG CTGAGAAGCT GCCGGAGGGG GCTGGGAGCC TGGCAGAnGG GCCGGCCTGG 480
CCTGGGGGGC ACACGGAGAG GAGAAAGGGA nCCTGC 516
(2) INFORMATION FOR SEQ ID NO: 420:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 242 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 420:
CCACCTCCAG CCGTGTTCAA ATTAATCTTG GCAACTACCT CATGTCTGGT TTCCGTTGGC 60
AACCAGCCAG GCTCTGGGCA TGACCCTCTC CAGTGGACAG CCTCTGCCTC CTGGGGCCTG 120
GCCGCGGCTG CCCACGTGGC TTTTGGCATG TGACCCTGGC CTTCCTTGGG GTGATCCCGG 180
GGCCGGCGGG CGGCTGGGGG CTGGGGTGCC CAnGTGAnCC CAGCTCAGAG GGCTAGGCAA 240
AT 242
(2) INFORMATION FOR SEQ ID NO: 421:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 651 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 421:
AGGGTACAGn GTCCTGAAGA CAGAACAATG AAGAAAACAT AAACCCTGTC CTGAAGGAGC 60
TTAACTAGAG AATCCCCTTG CAGGGTCAAC GACCAATCCC AAnnCTGTGT GTACAATAAG 120
TGGTCCCTCA AGATCACAGA TCAGGGACAT TCATCTTAGG TTGGGTGGGT AGGGAGAACT 180
TCCCAGGGAG GGGACCTCTA TGTTGACCCT GAATACTAAA CTAGCCAAAA AGAGCCAGCT 240
GAGAGAGGGA GGTGGGGACA GCTAGAAGAA GCATTCCCAT ATTTAGTGAC TACCAGTAGT 300
TCCATATGCA CCTAGCGGAA GTGAAAAGTT GAGGCTGGAG GAGTTGGGAG AGGCAGATCT 360
GAGTGGGTGT GGTTTGTCAG GTTGAAGACA GGGAAGACCA AGGACAACCA AGGAGTAGCA 420
GGCAAGGGAG TGACGTGGTC AGATTTGAAC TTCAGAAGGT ACATTCTGAT TGTTTGATGT 480
TAATAGTGTA TCAGGGGAGT GCAAACAGCC AGGGGGACAT TGCAAGTGTC TGAGAAATGG 540
TGATGTCCCA GCATTCCTGA CCCAAGTCCT GCCAGTGGGC AGGCCTGGAC ACGTnCCAGG 600
GAAGAACGGC AAGACCGAGG ACTGTCAGGA TGTCAAGCGG GnGGTAAGGG G 651
(2) INFORMATION FOR SEQ ID NO: 422:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 303 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 422:
CTCGGGTCAC TTTGGTTCAG TGGTAAATAA nCAGTGAnGG ACGCTGCCTT AGGAAAGTAG 60
GGCATCCGTG GGAGCAAnGA GGGGCTGCTT TTGGTGCCAG AAnCTCCCAG CCCCTCACCG 120
GTCTCTGTGC GGAAGCCCAG AGTCTCCCTC CAnCCTCCCT GGTCTTACCT GACCCTGACC 180
CCAACAGCAT GGCTGCCAAG TTCTGAAGAC CCAGGGAAGG TAAAGGATAT GAAACCTCAA 240
GGGCTACTTC TGACTCCTGG CCGTnGGGGA AAGGGATGGA CTGATAATAT CCTTAnTCTT 300
TTG 303
(2) INFORMATION FOR SEQ ID NO: 423:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 628 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 423:
TTTCCTAGAA GTCTCTCTAG GACTTCCTTG GATTGGCTAG ACTGGAGCCC CATGCCCACC 60
CCAAGGCCAA TCGCGGACCA AAAAAGAACA GAAGTACCCC AGCTGGCTTC ATGGACACGT 120
GATTTGACCC CTGAGGTTGG GTCATTGCTC CCAGAACTCT AACAGAGAGG GCAGGAAACC 180
AGCTGTGTCT GTGACAGCCA GTCCCCCAGT CCAGTACCAT CTAAAGTCAG AACACAATCT 240
GAACCATGCT GTCAGTCAGG CTAGGGACGT CACCATCGCT GGAGACAGCA CCCCTGCTGT 300
GACCTGTGCT GCCCAGTGTG GTAGCCACTC CAGTTACCTG AAGTTAAATC AACTCCCCGG 360
TCACACCAGC CACACGGCAG GTGCTCCCCA GCCCCACGTG ACGGGCGGGC TACTGCTTGG 420
GTCCAGACTT GGAACATTTT CTTCAGTGCA GAAAGTCCCA CCAGTGAGCT CTGCCGGCnC 480
CCAnCCCAnG CCCCCAGTAA TGAGGCTCCC GGGCAGCTGT GAGGCGAGGG TCTTGCTGCC 540
CATCAGCGAC ACCAACAnCG GCTTCCCCCA GCGCAAACGG GACTGGCAAG AGGGGGAACC 600
CCCAGnCCCG GGGAGCCTTG AAGGGCAA 628
(2) INFORMATION FOR SEQ ID NO: 424:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 447 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 424:
TGAGCCACGG nGCCGGCTTA AATGGAAAAT TTTACACACA CACACACACA CACAAAACCT 60
AATTTCTTGC TTCTCTTGTC ATATCAGGAG ATCTGGCAGT AGTGGGGTTG GATTnTGGGG 120
TGGTGATGGn TGGnTGCnGT GGGGGTCnCC TTGAGCCATA ATGGCCACTG CTCTTCAGCT 180
CCCTACCACC CAGGATGCCA GGGGCCCCAG GGCACTCTCT GGnTTTCCTT GCATTGCCAn 240
AGTTGTGTTT TTCACAGTAG AGAGACATTT CTCTGTGCTT GTCGTTCTAT CTTAGATAGG 300
GATTGAAAAC AAATCCAGAG GCAGCTATAT TTTTCTAGAA AAAATGTGGT CTGTGTTGTG 360
CATGCnTGCC TGGTCCCTGC GTAAACACCT GTGnTGCTTT CCAGGGTGTC TATCCCAGGC 420
TTTCCAnTCT GGCCTCTTTG TCACACA 447
(2) INFORMATION FOR SEQ ID NO: 425:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 485 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 425:
CAAATACCTG ACCTCTCACC TCCCACATTT TATCATTAAG AATCTGAAGT CAAGAGTGAA 60
AGTAACTTGC TCAAAGTCCA ACAGATTTTC CGTTCATTTC TAATTAAGTA TAAGTTTGGT 120
TGTTGTTAAA TGGGTTCTGG TGCCATAAGC ACCTACCACA GTGATTAGAA ATGAAGGCAG 180
ATCTTGAGTC AGATTCTAAG CTCTACGGCC TCTGAGTACA GGGCTGACTG ATTCAGTGGC 240
TTTTGTCCTT GTTAGCTTGC TGAAATTGCT TGCTTAGCAG TGGACTGGTA GGCATTGTAG 300
CTGGAACGCA AGGACAGAGC CTATAACTCT TCTACCAACT CATGTCAACC TTGTGCAAGT 360
ACTGAGACAG CAGCTGGGAG AACAAAAAGA TACAGAGATA AAAGAGGGAT AGATAGCTGT 420
ACCTTCATCA TGGGAGTACA ATGACGGATA AGGGGAAATC CATTCTAAAA TTTAGGCCCA 480
GGCAA 485
(2) INFORMATION FOR SEQ ID NO: 426:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 484 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 426:
ATCCTGTTAC CGTGGTATCA GCTCCCTATA CTGTCTTTAC TAAACACAAA AAGCGTCTCT 60
CTGCGGCACT GGTGTCAAAG CCAGTTGAAT CCATGCTTGG GCAGTGCCGT GTCCTCTCAT 120
CTCCCAGTTG ATTAAAGTCT GTCCTCAGCG GGATCATGTG GAAGGCTGGG ATCATGTGAT 180
GAGTTGACAC ATAATGTGCC TTTGCCGAGG CAGTGGGGAG GGGAACAGGT CCAAGAAGTT 240
ACAAACTAGC TCAGAGTGTT CTCCGCAGAC ATCTTAGAGC TTATTGGAAC ACACACACAC 300
ACACACACAC ATTCTGATTA ACAAGCAGTT TTAAGACTTC AAGAGGTGCC AAAACCCGGC 360
ACTACTTGAT ATTGTTGTTA TTTATTGGGn AGACATTAAC ATCCACGGnA GACAACCATA 420
GTTCAGTGTC CCTGACCCAC TAAGCTGCTG CACTTTTCCT CATTGCTTTC CAGCCCTTGT 480
TCAG 484
(2) INFORMATION FOR SEQ ID NO: 427:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 551 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 427:
AGCCTGCACC AGGGCTGTGC TGGACTGTGG CGGGGTATGC AGGGGAGCGA TGGGATGTCC 60
ATGGACCCGT CCGCCCTCCC TGTCTGTATT AAGTCAGCGT CATGTTGCTT GAACTAAAAT 120
AGCTGTGAGA AGCCACTGAG GAAGAAGGTT TACTCGGCTC ATGCTTTTGG AGGTTCACAG 180
TCTGTGATTG GGAGGCCCCA TTGGTCCGGT GTCTGATACG GGCATTGGAT GGTGAAACAC 240
AGGGCAGAGA GAACCACACT GGGAAGCAGG AGGCAGTGAG AGGCAGGGAG CCTGCTCGTC 300
ATCTTCTTCC ATGTGGCTCT CCCTGGAGAG CACCAGGATT CCATCACAGG ACTCTACCCT 360
ATGGTCTGAT CCAATCTAGT CACTTTCCAA GGCCTCCCCT ATACGCATCG CAAAGGGTTA 420
ATTCCAGCCT TGACCCTGAn TCTGGTGAGC AGGTCTCCAG CACACAAGCC TTGGGGGCCA 480
ACTGGACTAT TATCCAATCC GTAGTACTGT CCCCAGACAn AGCTGGGCGT GCTCTGnCTT 540
CTGCGGTGAC A 551
(2) INFORMATION FOR SEQ ID NO: 428:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 531 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 428:
TGCCCTCAGG GCCCACGGTC AGGCACAGCT CCTCCATGAA GCCCTCCTTA GCCACTCAGC 60
CGTCCCTGCT GTCACTGCAC TCCATTTCTC CAACAGACCT CACCCCTCCC TAATGTCTCC 120
CTGGCTTTGA ACTCCAGGTG CCCTCCTTGG TTCTCCTGGA CTCCAGTCTC TGGGTTGACC 180
TCAGCTCTGC CAACCACATG GGCTCACAGG GACCCGTTCT CTGTAGGTCC CACCCTCAGC 240
TTCCAGAAGC CTCTGGGTTT ATAGGCTGAG AGTAAAGACA AACTCAGAGC ATGCAGGAGG 300
GCAGACTGAC AGTATCCTGC TACGAACCCA GCAGTGTAGC CCCGAGAACA CCTCAGGTGA 360
AGGCCAGAAA GTCTCCTAGT GCAGTAGCCC CTGCACCTCA GACCTTCCTT CTCAAGTGAC 420
AGTCTCTCTC CAGTGTCTCT CAGCTCAGCT GCTCTTGGCA AAAGCTTCCT GAGCCAGCTG 480
GAGCACTGAC AGTACCAAGC TGTTTCCAAG AGCATGAGGG GGTGTCGGCA G 531
(2) INFORMATION FOR SEQ ID NO: 429:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 526 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 429:
AAGAACTTGG CTTTGTAGGA CTGATAGTCT AACTATGTTC TCTAAGAACT AGAAAATGCT 60
TATATCGCCA GTACTCGAAA ACGTGTATGT GTTGTATGTA TGTGTGTGTG TGCGTGCAGA 120
TTTTGGATTT TATTTAAATA TAAAATGCAA ATTTATTTAC TCTAAGTTGA ATCTCACTGA 180
AATGAAAATG AAAAGGCAAT ACTATTCCTT TAATATGGTT CACTAACCTT GTTCACACCG 240
TTACTGATGC TCGGTAnGGG TTTGCCGACC CCAGTTCTTC TCAACCCTTG ATGATTATGG 300
AGGATTAATT ATAATTGAAA ATGTGAAAAA AAGAGACTAG ATCTAATCCA AAAATCAATT 360
TGGCATAGAA TTTAATAAGA GATAAAAGCT CAAAATTCTA TAACTAACAA ATGAAAAATA 420
TCAACACGGG AAAAAGGAGT TTAGTTCCCA GCATCCAGGC CCGTACCGTG TGGATGCCTC 480
CTGGCCTCTG CTCTTCATCA GCTGCGTGAC AAAGGCACGG nCTCAC 526
(2) INFORMATION FOR SEQ ID NO: 430:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 508 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 430:
AGAAGATATA TGAAAAAGCC TCTTGATTTT TAAAATTTAT TTATTTATTT ATTTGAGAGT 60
GACAGAGAAA GATAGATCTT CTATGCATTG GTTTAATTCT CAAGTGGCCA TAACAGTCAG 120
GTCCTGGCCA GTTCAAAGAC AAAACCAGGA ACTTAATCCT TCCCTTGAGA GGAATACAAG 180
GATCCAACTA CATTTGCCAT TGTCCACTGC TTTCCCATGT GCACAATCAA GAAGCTAGGT 240
TGGAAGCAGA GTATGTAGAA ATCTAACTGG CACTCTGCTA ACGGTATTCC AGAGTCACAA 300
TTAGTGTTTT AAACTACTGT GCCACAATGT CAGCTCCAGC CCCTTAATTT TGATAGAAAT 360
GTGATAGTTT TAGAATTATC AAATTTACAG ACTATGCGTA TGTACAAAAT TGCTAAATAT 420
TAAAATATCC ATTTTATGAA GAGGTCCAAA TTTGATAAGA GTAAATAAAG GnAATATATA 480
TAAAATGCAT ATAGTGAAAG GAAAAATT 508
(2) INFORMATION FOR SEQ ID NO: 431:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 431:
AGAACAAATT AAGTTCTTTC TTATTATCTT TCAACATTTT AGTTTTACCT CAAAAATGGC 60
TTACTTCTAC TGAAAACAAT TTTCTTTATT TCTCAGAATT AGGTTTTTGT AGACACACAT 120
CTTATTTTTT ATTGAGTTAC TAAACACTTG AGCTCCTATC TATTCAATGG TAACCCACTA 180
GAAGGACAAA ATGTATAAGA CTAATAATAA ACTTGCCTGG AAGATTAGAA AAATTTATTA 240
ACTATTTTTn CTTTTGTTGT TTAATGCCCT AGTGAAATAA TTTCATGTTT CTAATATACT 300
ATAGTGTGTG GGTnTGTGTT ATTATATGGC ATACCAATAC AATTCTTCAA AAGGCTATCA 360
CAGCCATGCT GATCATTTTG TTTGGATGAA TGGGTAATTT TCAGTGGAGG TAATGCAnAG 420
GTTTCCCATA ATTCATCATC ACAAAATCAC ATCTTTACAA TCAGAAT 467
(2) INFORMATION FOR SEQ ID NO: 432:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 432:
TGAAAACTAC ACTGTATTTC AGGAATGnAA AATAATCTTG GAGCTCCCTA TCTTGCCTTA 60
TGTAAGATAA GCACTCTGTG TGACATTGTC AGTTTTTAAA GAGCATATCA TATAGAATCG 120
ATTGGTTCTC TCAGTAGAAC TGCCAACTAT GGTATTTATT TCATGCCTTC ACATCACTCT 180
TAGCAGGTCA AAAATCATTG TCTTGTGAAC ATCTAAATTG CTTCTTTTAT ATATTTATAT 240
TAGCAATACC ACTGAAGGGC AACCCTTAGA TTCCGTGGAC ATATGGGTTG TTTGGCATCG 300
GCATCTGTCA GTTCATTAAT CACTAAAGAA AATTTGGCTG AATTAATTTG TTTCCGGnTA 360
TCCCTTTATA GTCGTGCGTA TTAAGTCCTT ACCAGATACC ATCTTAGGCA ACCCAAGACA 420
CCTTCCAAGA TAAGAGTTTC ACCATAGTTA GCAAAGTCAG AGCnAAAnGT AACATC 476
(2) INFORMATION FOR SEQ ID NO: 433:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 433:
TTTTTCAGAC AGAGAACACG TAACGTTATG TGGAAGACGG GGTGACAGCA GGGATAAAGA 60
GATACTCCAT GCTCTGGTGC TGTGGCACGC CGAGTGAAGC TGTTGCCTCT GATACAGGTC 120
CAGTCGCAGC TGCTCCGTGG CTTCGGACCG GCAGCGCCCC GGCTGGGGAG TGAACCAGCC 180
AATGGAAGCT CTCATTCTGT CCCTTTCGAT CTCCCCTCTG TCACCGTGTC TTTCAAATAA 240
GTGAAATAAA CCTTTCAAAA AGAAATTTTA TAGAAACACG ATGAGATTCT TTTTAAAAAC 300
AGTTGTAGAA AAACTTTGAA AACTTGTAAT GCTAACCAAG ATGGAGAAGA CAAGATAGAA 360
AAAAGGAAAA GAATAAAAAG GTATTTAGAA GAGAGTGAAA AGCATGGACA AAGATTGTTT 420
TTTCTTTTGT TTTTCTTCTC AAAGTAGTGG AGGGAAAAGA GAGAAATGGT GAGATGTAAT 480
AATGAGTACG AATACGGTTT GTCCTGTCTT ACTCT 515
(2) INFORMATION FOR SEQ ID NO: 434:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 508 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 434:
GGATCCATCT TGCTACTGAT GGTAGCAACA GTACTTGCCA ATCAGTATGC ATTGAGTGCT 60
TACTGTATCT GTGCCAGTCG CCAGGCTCAG CGCTTGAGAG ACACAGCCTC CCATTAAATG 120
GTAACGGACT GGCAGGACAC CTGCTCTTAA ACCTAGAAAG AGATTCACCA AACTTCTGTG 180
TTCAGGAGCC CAGCAGAGAG TTCAGGTCAG GAGGAGCCTG CCCACTCCCC ACGTCCAGTT 240
CTTTTAAGGG TCTCTATCTT ACTAGGGCAG ATAAGGAGGA AATGAAGGCA GGTGAATAGT 300
CCCTGCACCC CAAGTGGACA CTTTGCATCT CTAAGCAACT CACCTGTGTG CAGAAGCTTG 360
TGGCATTCCA GAAAGAATCC ACTTTGAGCT CACCTCTCAA GGGGACACAG AnGCAGTGCC 420
TCCAGGTCAA GCAGTGGAGG AAAGGGTGTG GGTCTTGGAG CAAGAGAAAA CTCAGGGGAC 480
TATGGCCAAA CGGGGGACAC TGTCCTTC 508
(2) INFORMATION FOR SEQ ID NO: 435:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 435:
AGCATAATCA AGGCGATTTA TAACAACCCA TGGCCAATGT CCTATTGAAT GGGGAAAAGT 60
TGGAAACATT CCCACTGAGA TACGGTACCA GGCAAGGATG CCCACTCTCA CCACTGCTAT 120
TCAATATAGT CCTGGAAGTT TTAGCCAGAG CCATTAGACA AGAGAAAGAA ATTACAAGGG 180
ATTCAAACTG GGAAAGAGGA AGTCAAACTA TCCCTAGTTG CAGATAATAT GATTCTATGT 240
ATATGGGATA CAGAAGATCC ACCAAGAGAC TACTGGAACT CATAGAAGAG TTTGGTAAAG 300
TAGCAAGATA TAAAACCAAC ACACAAAAAT CAACAGCTTT GTATACACAA ATAACGCCAC 360
AGCTGAAAAA GAACTTCAAA GATAAATCCC ATTCACAAAA GCCACAAAAA AACATCAAAT 420
GCCTGGGATA AATTTAACCA GGGTGTCAAA GATCTCTATG ATGAGAnTAC AAAACCTTAA 480
AGGAAAGGAA ACCGGAGGTT ACCAAAGGAT GGGAAAATCT TCCCnGTTCC TG 532
(2) INFORMATION FOR SEQ ID NO: 436:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 241 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 436:
CCTTCGTTCA AGCAAAATAT TTCCCGTGGG AATAACTTCA ACACCTTGGG ACTGCGTCAT 60
CCTTCCCTAA CTTCATGAGG GAACCTTGGC CAAACTGCTT TTCGATCTGG AGGCGCGCCG 120
CCTCCAAACG CCTTCATCTT TTCTTCAAAA GAAGAACTAA CGGGAACATC CCCTTTAAGG 180
CTTTTGACAC CACAAGCCTC CCATGAAAAA CCCCGCGCAA GGnGCAAGAG ACGCGCTAnC 240
A 241
(2) INFORMATION FOR SEQ ID NO: 437:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 437:
CTTCTGATTC TATCTCTGTA TTGGTTATTG GTCTGTTTAG GTTTTCTATG TCCTCATGGC 60
TCAATTTAGG TAGATTGTAT GTGTTCAGGA ATCTATCCAC TTCTTCTAGA TTTTCCAAGC 120
ATATATCTGT AGTTTGTGAT TATTCTTTTT ATTTCTGAGT TATCTGTTTT TGCATCTCAT 180
TTTTAATCTC TGATTTTATT GACTTGAATG TTCTCCTTTT TTTTTTTGTT AGTTGGGCCA 240
ATGGTTTATC AATTTTGTTG TTGTTGTTTT TCAAAAAAAA GCTCTTGATT TCCCTGATCT 300
TTTGTATTGT TTTTTGTTTG TTTCAATTTT GTTTATTTTT TCTCTAGTTT TAATTTTCTC 360
TAATTTTGGA TTTGTTTTCT TCTTGTTTCT CTAGGTCCTT TAG 403
(2) INFORMATION FOR SEQ ID NO: 438:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 613 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 438:
TGGCTCTTTC TACAGATAAA TGTATATTCC AGAGAGCATT TTTCATTTAT CCTTACCTAC 60
TGACAATCCT TATACTCTGG TATATGCAGT GAATGTGCAA GTCTCCTCAG AGGAAACAGT 120
CAAGAATGAT AGGAGCATCA GTTACAGAAT TTTCTGCTGA CATATGTGAT ACGATTGGGA 180
GAGTACCTTT TCTCTACCTT CTAAAACAAA GAGTTAATCA TAGCCATTAA TTTTTGAGCC 240
AGTGCAACCA AATCACGAGC AGCTCCAGAG TCAGTTTTAA GATCAACTCC AGAGGAAGAA 300
ATCCCGTAAC AGTCTCCCTT CTCCATCCTT CCTCCCCCAT AACTGGTGCT TGCAGTGCAG 360
TCAGAACTTC TACAGATCAC AGACATTGTT TGCCCAAGCT GACCAATCTC TAAGGCGGGT 420
TTTTCTTCCC CCCTTCAAGG ATACTTCATG TTAACAGCTT GAGGGCGTTC AATCAGCACT 480
GCTTTGATCT CGGGCTGGAA CTATAATTAT TTAACATTGT TACAGGATTT CTATAATCCA 540
CCCCACCATT AAATCCCTTT TACACCCTCA TCTTCTGAGG CCTAAATCTG GCATCCGTAn 600
ACTTGGTTTG TCT 613
(2) INFORMATION FOR SEQ ID NO: 439:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 439:
AAAAGAAAAC TTCTTAGAAT TCCACATTCC TCAATTATCA CCAACTTTTC TATTGCTTTT 60
CTTAGTAAAA ATGTTTTAAG TTCTCTGAAC TTATCCATAT CACTCAAAAT TCTATAGCAA 120
CCACCTATTG AAATGGTTCC TTAACAAATT CAATTATCAG CCACCTGTAA ATCATTCAAG 180
GTCATTGATA TAGTGAAATA TTTTCTTGCT ATGACCCAGG GAAGATACAA TCCTGTTTTT 240
ATTTCCAGAA AAAGTTATCA CTGCTAACAT TTTTGCAAAA GAAAGAAAAT GAAGGATTTA 300
GTTTACTTTT TTTATCCTGA TCACATAAAC AATTTTCATC ATTCATTTGT AGCAGACTCC 360
TGCATTAAAA TCAATTAACT GAGTATTTCA TTTAAAAACA ATGGAGAAAA GCAATTAGCC 420
TACTAATATA GACTGCATCT GTTGGTGAAC TAAAGGTAGA TATTTTGCGC CTCTGGACAC 480
ATTTTTGAGG TACGCTTGAT AGGATTTGTT TACAGACAAG AGAGGAGTTG TGAAAG 536
(2) INFORMATION FOR SEQ ID NO: 440:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 172 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 440:
GGGGTAAGAG nGCCGCCGGT TTTTGCCTTG AAnTTTGAAT TGAATTTTTA AACCTTGAAG 60
AATTGGGGGG GATTGCAATT GCCCCTTGCG GTTGAATTTT TTGCCAGGTA ACAAGCCGGG 120
GGGGTTATTC TTCTTCAAAG AAAGAnCGGT GGGTTTGTnC ATGGAAAAAA GT 172
(2) INFORMATION FOR SEQ ID NO: 441:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 644 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 441:
CCGGCACCGC AAGGnAAGGA TTAACCTACT GAGCTGTGGC GCCGGCCGAT TTATTTATTT 60
GAAAGACAGA GTGACAGACA AAGAGATCTT CCATCTGCTG GTTCAAGTCT CCAAACATCT 120
GCAACAACTG GGGCTGGTCC AGGCTGAAAC CAGGAATTCC ATCCGGGTCT CCCATGGGTG 180
GCAGGCACCC AACTACTCGG GCCATGACCC ACTGCCTTCT CAAGTGCATT AGTAAGAAGC 240
AGGATCCAGG CCTTGAACCA GCACTCTGAT GTGGAATGCA GGTGACACAA CCCCTGCCCC 300
CCAACAGATT TATTCTAAGA GGTTTCATTC TATACATTCT GACAAAAAGG GGAAAAAAGG 360
TGCCATTTTA GACTTTTTTT TAAGAAAGAT AGAGGTCTTG GTATGCAAAC TGTAATCTCA 420
AAAGAAAAAT AAAAAGCCTA AGGTTTTCAA AATAGTCCTT AAAAGTAATA TCATGACTTA 480
GCTTCTTTTT ACTGATTTGT ATTTTGGAGC GTTTCTAAAA TAGTTGCAAT TAAAGTTATA 540
TTTATGTATT TnAnATAAAT ACAAATGTAT ATAGTACATA TACTATACCT TTATACTTAA 600
AATATATATT CATTTAAGAA TGTGACACCA AGGGCAGGGC ACGG 644
(2) INFORMATION FOR SEQ ID NO: 442:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 442:
GTGCACCTCA CAGGGGACCC ATGCATGCTC TGCCTGCGGT TCCTGAGCCT GTCTTTTCTC 60
TAAAGCACTG GCAGTCTCTG CCTCCCTGTC CTCTGCCAGC CCCACCTCTC CTGTCCTCCT 120
CACCTCCACC TCTCCTTTTT CTCCTGCCCC TTCATGCCAT CACCATATTT TCCCTGCTGG 180
AAGCCCCTCT CCCTGCTGTG TCTTTACAGC TGTCTCCTCC AGCCCTCTTC CGCACAGGAA 240
CCAGGCTGAG TCCCTGCCCA GAGAACCTGG TGTCATTCCC CCAGCTTAGC CAACGCAGTG 300
CACCACAGAG GCAGAGAACA GGGGTTGACC CTTCCTTCCT CTGCCTCCCT TTTGTCTCCC 360
CACTACCAGC AGCTTCCCCA GAnATATCCC ACCTCCAGTC AGCACTGGGG GAnCTCCGCC 420
AGGCCAACTC CA 432
(2) INFORMATION FOR SEQ ID NO: 443:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 630 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 443:
TGTGTGTTAT TAAAGCAGGT TACTGATGCC TTGGGTTTGG TTTGTGTCTA ACCACAGGTG 60
GGTTTGGCTC TTTCCCTCTT CTCCCGTGTT CCCCATAGCA CATTGTGGTT TCTGGTGTGG 120
ACTCTGCACA GTACAGGAGT CTGGGGAGGG CAGCGAGTCC TGGGAGACCC TCACAAAGGG 180
CAGGTCCAGG TTCAGATGCT GGCTCAGCTC CAAGGCCTGT GTGGTCTTGG GAAAAGCCCT 240
TTAGCCACCT AGTTCTTCTT TCCATGCCCC CAGTGAGCAC GGCATCCACC TGCGCTTTCT 300
GGTAGTGCCT CAGTGTGGTG ACACATGGAG CACATGTCAC AATGCTGACA GGGGGAGAGA 360
CGATTAGGGT TAGAGCCAGG GCAGGCCTTG GGTTCTGGGA AGACTAGCAG TGTCTAGGGA 420
TTTGTGTGCT GTGGCCTGAG GCAGTCGCTG AAGGAAAGTT CTGGAGGAGG GGTTTTGGGT 480
GTGGTCTGTA TATTGGAGTT CGTTTTTGGT GGGAGAGACT GAAGTTGAAG GTGGTTCATT 540
TTCTCCTTAG ATGTCTTCAG GCAAGTTCTC TGGCATATTT ATTTTTGTGG GAAATGGCCA 600
AGAAGTTGAA AGAAGAGGCA GTnnCTGTTT 630
(2) INFORMATION FOR SEQ ID NO: 444:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 444:
TCCTTTGATT TGACAACTTA TGATAGTATT CTTATACTAG TGTTGAGCTA ATTGAGAAGT 60
ATGGAGCATA CAAAACACTC CCTCTTATCC CTTGTTCTAT AGGGAAAAAC AGGACATATC 120
CTTGATTTTT CAGGGGCTTT ATGGGAAGTC TCAGAGTTAG TTTTAGAGGA TATAAAAGTA 180
TCCTAAAAAT CATACTTTTA AAATTTGTGG TTCTGATCTT GAGAAGGTAA AATCAAAGGT 240
TGTTATATTT CTTACCTGAC TAAGACAGGA TCACTGGATT CCTGATATAT AAGACGTGGC 300
TATTTATTCA GCAAAAGATT CTACACAGTA TTTTAAAATG AATGTTACAG AGAGTAATAT 360
GACTGTAAGG AGACTCGGGT GGGATATGGG ACCATGAGCT GGCCTAGATT CCTGGTTCCC 420
TTGAAGGATT CCATTTCCTC AGTTGTCCAG TAGCCAGnCC TTTAAnCCTG GTGGGCT 477
(2) INFORMATION FOR SEQ ID NO: 445:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 508 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 445:
CAAGAATCTG CCTCTGCCAA TGTCTTGCAG CATTTACCCT ATGTTTTCCT GTAGTAATTT 60
GATGATTTCA AGTCTTAGGG TTAGTTCCTT CATCTATTGT AAGTTGATTT TTGTATAGAG 120
CATAAGGTAG GAATCTTTTT TCACACTTCT GCATATGGAG ATCCAATTTT CCCAACATCA 180
TTTGTTGAAG AGACTGTCCT TTCAGAAGGG AAGTGCATTT AATACACTTT AGCTACTGAA 240
CACCGTAGTT AGTAACATGG TGCACTGTGG AGTACTGGTT GTTTGCCCAT GTAACTGTGC 300
TGTGGTTCAC TGCCCCTGCC AAGAATCACC AGAGGTGTTA ATACCTCATA TCACTAACCT 360
GGGAAAAGAT CAACATTCAA AATATGAAGT GTTGTTTCTA CTGAACACAT AACACTTTCA 420
TACCATGGnA AATTGAAAAA ATGTTAAGTA TAACCATTTT AAGTCAGTGA CCATCTGTCC 480
TGGAAACCCn ATAAAGATAT AGTAAGTG 508
(2) INFORMATION FOR SEQ ID NO: 446:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 446:
ATGAGAAAAG ATTGGCAATC GTTTGAGGAC CTCCTGATAA CAGCGATAAA AATCGTCATG 60
CTTATTACAA TTCCTGCGAC ATTTTTCGTG TTATTTTCAA GTGACCGTAT CATAACGCTC 120
GTGTATAAAA ATGCTATTTT TAACGAACTA TCCGTGCGCA TGACCGCTAC CATATTTCGA 180
TGGCATAGCG TGGGAATGCT TGCTATTGCG CTGAATCGCG TTCTCATCTC CGCCTTTTAC 240
GCGCAnACAA CTCTTTTGCC CCTATGATTG CAGGAACTAT TTCATTTGTG ACAAATATCA 300
TTTTAGCAAC ACTGCTCTTT ATTCCCTTAG GAGGTAAGGG CATTGCATTT TCTCTGAGCG 360
CGGCGAGTAT GGTACAGACC GTTTTTTTAT GGATGTTTTT AAAACGATCG TGGCAGATAA 420
CTATCCCTTC ACTGTATAAA ACTTCCCTTT ACTATGGAGT GAAAATAACT TTATTTTCTG 480
TAATCGCGCT GGTACCCACA TGGGCAAGTT CTTTTTTTAC GGCGnATTTT 530
(2) INFORMATION FOR SEQ ID NO: 447:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 447:
CTACCTCTGC TTCTCAACAA CCCGACCTTG AGAAATCATG AACAGCCAGG GGACAGGCTG 60
TGCAAGGTGC TGGTCAGATT GGGAAAAGGC ACGGGCAGTG CACCTGGCGC ACGTACATTC 120
TGCACCCCAC CCTTCACTCA TCCTCGGCTT CAGCCTTCCC TGTGCGGAAT CAGGGGCCCA 180
TTCTTTGAAG GGGGGAAATC ACAGGAGCAC AGTTCATGCA GTGCAAGCTG ACCAGGGAAA 240
CGTGCGGAGG GGGATTCGAG CTCGGCCGAG CGGGTGCTGA CGTCAGAAGC AGGGTGGGCG 300
CAAGAGAGGA TGATTCTGGC ACATAGAGGT GAGAGGGCCA GGGCTCCAGG CAGCTCTCCT 360
GGCCACTCCC CGCACGGCAG GTCGGGGCCT TCCGTCCTGC AGATCCAAGC CTGCACATAG 420
CACCCCTCTT TTTAGCATTT CCCCCACGAG GCTGACGATT CCTGTTGGCT TCTGACTAA 479
(2) INFORMATION FOR SEQ ID NO: 448:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 405 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 448:
CCATTCCCAG GGCCCTCAGC CTGCCCTAGG CCCACAGGCC TGCCACGGAG GCTGTGGCCC 60
ACTCCTCATG GTCTTCGGTC CCCACAGGGC TGAGCGCTGA GTCTGGCACC CGGTGGTTTC 120
TGAGTCTGCG GAGGAGTTAG TGGTGGAGGG AATTAAATGA ATGAGAAACC CGCCCAGGGC 180
CTGGGATTGC TGCCTCCCCT CCCCCTACAG AAAGCCTTGC CTCTCTGGCT CTCCGGCATC 240
TCAGCGTCTA TGATTAAAGC TTAAATCCCT GGCTTAGGCC AAACGGGGTA CCTGGTAAAG 300
AGGAAATCCT TTGGCTCAGT CTGAGTCCTG ATAGAAGGTG CACTCAGGAG GTTTCCCTGA 360
AAAGCGAAAG AGCAAGCAGG GATTTGCAGC GAGACAGGGG GGCGG 405
(2) INFORMATION FOR SEQ ID NO: 449:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 449:
GGCAATGTTC TTGCTTATTC TTGTCATCAG ACAAGGGCCA TTCTCTAAGG GTTTTGATAG 60
GCAATCATTA AGCATCCACT TTACAGAACT GATTTGGCTG CTTCTAACTT CTTCTTGTTC 120
CCTAATCCTT GAAAACCTCT AAAGACATCT ATTTTTCTTC AGTTACTAAA ATGTAGACTT 180
CATTGACCTA GTTAAATTCA GGACCCTCAG TTTTTTGAAA ATGGAGTAAA TGACTGTGTG 240
ATTGCTAACA AAAGTGTCCT AAAATAAGAA TGTTATTTAT TGTTAAAAAA TTTGCAGACA 300
TATAAAAATT GTACTGATTT GGGAGTACTC TGTGACATTT TGATACAGCA TATACCTTGT 360
GTAATATCCA ACCAGCATAG ATATATTTAT CTCTTCAAAT ATTCAACAGT TTTTATAGTG 420
GAAACATCAA AAATGATATC TTCTAGTTCT TTTAATAGAG AAATAnACAG TACATT 476
(2) INFORMATION FOR SEQ ID NO: 450:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 571 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 450:
AAGTGAATTT GATCTCTTTA AAGCTAGGCT AAGTTTCTGC AGCAGCTATT CAAGGGGAAG 60
GCAAGACACA CTGACAATAA AGACACGTGT ATTCATTCAT TGGGGTGACC GAATGACAAC 120
AAAGCAGATG AGCCTATAAG AAAAAGCCAT TTGAATGTGA TAGTTGAAAA TCTCTTGTAT 180
TATCTTTAGG GATATCAGGT TCATCATACA TGTGATGTGT GAGTGGATAG TGTAAAAAAG 240
TTATCCTTTT TAATATTTGT TTCTCACACT TGGTTCTTCT TATGTGTGTG TGTATGTGTA 300
TGTAAGCTAA ATTAAGCTCT TGTTCTGTTA ACAGAAAGGT GGGTAGATGT GGTTAGAATT 360
TTCTTTTTCC TGTCACACTT TCTAAGGATT ATTTTTAAGA AATGGTATCT ATTTATGTCA 420
TGACTGATAA GTATTTCCTT AAGTATTCAC TGATATTTCA AAATTTAAGA CTTATGCTTA 480
ATTAAGAAAA GGATGAGAGA TCCAATTCAA TCAACCCCAT TTCTGTTCTA ATA ATGAGA 540
CTTCGGTGGC TAGTGGATTC CCTCATATAA G 571
(2) INFORMATION FOR SEQ ID NO: 451:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 634 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 451:
AGGTAACACA GTAAAGGATA AATCTATATG ACCTCTCAAT ATATGCAAGA AAAGAAATAA 60
TAAATATACT GCCCACTCAC TGTAAAAATG AGTACCATTA AAAGACATCA AAACCAAAAT 120
CCCATTGTGT ATCCATAGCA CATAAACAGG AAACAAGATA TCCTCACAGA AGCCAAAGGA 180
GAAAGGAGCA GGGGAACAAT GCGAAAACAG CTAGCTACAT GCATCAGGAC TTGTGTAGGA 240
CGGAAGGTGA AAGACTACTT ACTGGTAGGA GAAATTATAG CTAATTTGCA GCCATGTTTG 300
TACCCAGGAT CCACTCCCAT CAAAGTACGT CCTGGAACAG GGCTTGTTAA AAGAAGCTGG 360
ACGAAGGTTT CCGTCCAAAC ATCATTACTG ATTCCTTCTC TGGCATCGGG AAGTCAGTTT 420
GGGCTCTTCA AAAATATAAA AAGAAAAGAA TTACTGGAGT TTTTTCATTT ATCCAAGTTT 480
TCTTTCCAGA AAAAGAGAAT TTATTTTGTA TAGTCGGTCC TTGCAGGAAC ATTTATTATA 540
ACAAAATGAC ATAGTAGAAA AGCAACATAT AATAAAATTA TATTGTCTAT TTTTTGTnAA 600
TCCCTGTAAC ACATGGCACC ACAAACAATT CTGA 634
(2) INFORMATION FOR SEQ ID NO: 452:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 452:
TTAGTATCTG TGGTCGTGTC ACCTTACTGT GTGAATGACA AAGGAGTAGA TGTGGAGAGT 60
GGATTAGGGT GAGAAGGTGT AATTGCAGGT GAAGGTGAGA CAGGCTGGAA GGAGTCACAA 120
AATTGAGCAC TCTTGACCTG ATATGTATTC CTCTCCCACC CTAACATATG TCTGTCTCTG 180
ACAGAGCACA GCCTGCTCCA CTTAGCTCCA AACCAATCAG GAGCAAGGAG TGTACACACT 240
CTGACGACAA AAGACCCCAC CTACTGCCTG TGCACTTGTG CCAAGCAGCC TAATGCAACT 300
GAATTGTGAC CTTCTTGTGG TGGCAGTAGT GAGCCACACT GTGATTTTGA GTGTGTCTTA 360
ATGTAAAACA ACAACAGAAC ACACAGGTAC TCTGTGCTGA AGAGGGCAAG CCAGAACACT 420
GAGCTGGGAG CAGACTAAAG TGCTGTATCC CAAGGGTGTG CTGCTA 466
(2) INFORMATION FOR SEQ ID NO: 453:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 631 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 453:
AGAGACCAAG GGAATTATTT CAGCAGTGTT GCTCCATGGA TCAACCAAGA GACAGAGACA 60
GAAAGTGGAA ATCACCATTT GCAATGAGGC AGACACAGCA GAGACAGAAA GGAAGGATGG 120
GTGTGAAAAT GTGTAGGAGG CAAAGTGAGC AGGACTTGGT GGTTGGCTGG AGCGTGGAGG 180
TGAGAGACAG AAGAAAAAGA AAGAACTCAG GTTCCTGAGC TGTGCAATAG GATACATGGG 240
GCTGCACTTC CCTGAGCCGG GGCCTGTGTG AAAGGGGTGG AGATGAGGGT GGGAGAGAGT 300
GGAGAGAAGA TATAGTGATG AGTATTGACC TGGCACACCG TTTCTTCAGG ACACACACAG 360
GGAGAACATA GCCATTTCTG TACCTTTnCG CAAGCCAACA AGAGGGAATC ATAATTGAAT 420
TTCCATGAAT ATTTAAAACT GAAAGGAAAA GTCTGAACTA GAACTATCAT TTACAGTGAC 480
TATATGTCAA GATTCTTGCC AGCCTTGTTA ATTTTACATA TGCACTATTT TACCTACAAA 540
TTATATCTAC TCTATCATTT TTCCTCAAGA ATCnATTGnT ATTTTTTAAA GATTTTATTT 600
AnTTATTGAA AGAGATACnC nGAGAGAGAA A 631
(2) INFORMATION FOR SEQ ID NO: 454:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 637 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 454:
CAGAAGATTT TAGCTTTCCC TGGACTTTAA AGTAAAATTG GGCAGAAAAA TGCATTGAAG 60
CAGGATTTTT TTAATATCTT CCCGTTTAAT GTTCTCTAGT CAAACCACCA AAAGAATGGT 120
GAGCAGTAGA CAATTGTCCA CATCTCCCCT TGTGTGGTCC CTTCGTTGTC CTTGATGAGT 180
GCCTACAGAG AAACCGTAAG ATTCAAAAGC TGAGCAGAAA TACTCTGATA CAGAGAGGGG 240
TCGTAAGTTT CCCTTCATTG CTTCATTGTG GTTTCTACAT CAGACTGCAG CATATCGTTT 300
TTAGAAGTTC TGTGGTTTGC TCTGGTAGGA CTACTCCCCC CGTCCTGTTA CTCGAGAACG 360
TTTATCCAGC GATGAGTACT TCGGTGTTAG GGATACTCAA CACAGGCCCC GGGAAGCAAC 420
GGGATGATAT TTTTGGCTGG CACGTATCAT TTTGTGTACT TATTTTCCTA TCTATTCATA 480
TTTGACGAGC CAAAAGGGGT GAGGTAGGAT ATCCTATCTG CTGGTTCACT CCTCAAATGC 540
CTCCCAACAG CCAGTGCTGG GCCAGGCTTA AGTCAAGAGA TTGATCCTCA ATCCGAGTCT 600
CCAAAGTGGG TGACAGGGAC CCAAACACTT GAGACAT 637
(2) INFORMATION FOR SEQ ID NO: 455:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 465 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 455:
AGTTCCTAAA TGAAAATCCT TCTAAACTAA GGTCATTGCC TTTTTAAATG GAGGCCAACC 60
AATATCACGA TTAGTGCCTC TCTTCTGCTT CTGTCTCTCA TTTCTCTTCT TCCTTTCACT 120
TAATGCCTAC ATCATGCTCA CACACAGACC TCTTCTCTCC ATTTGTTTGT CACAGAAGAG 180
ATATCAGTCA GGGTACACCT GCAAAAGCAG AAGGTGGGTG CTTGTGCTCC ACCCATCCTG 240
CAGTTTGTAC CTGAAGATTG AAGACATATT TTGCGAATGT TTACCAATGT CTGACACATA 300
TTTATACAGT TAGCAAATAG TAATCTGAAC CATGAAAACA TTCCTAGAAT ATTTACACAA 360
AGTGAACTCT AAATAGAATT CTCGGTCTCC ACCTGAATTT CATACATCTG GATTTCCTCT 420
ATGGTTTCCG AGAGAACTCC ATATCTTTCC TGGGGTGAGT TCATT 465
(2) INFORMATION FOR SEQ ID NO: 456:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 625 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 456:
CAnTACTGTT TGCATAATTA GAAAATATAG CGTTTCCATG TATGTAGCTT TGACATTATA 60
AACTCTTTAA GATTACCACA GTGGATGTCT GATTTCTCCA CAATGTGTAA ACGGGCTTGG 120
AAGTTGGCAG ACTCAGTCTC ACAACATGAA AAATATCTGA ACAAACTGAA AATCAATGAC 180
TTTTGTTTGA TCCCTCAGAG AATTGTGGTC ATGAGATAAA CCACTACCCT GAAATCTAAA 240
AGCAGTAGAC AAATGGAATC ACAGGTGAGA TCAACTCAAT AGGGGTAAGC AGACAGAACT 300
GATAACTGCT AGGAAAATCT AACCGGTAAA TTTGAAAAAT CTTAGAGACC GAACTGGACA 360
GTTAAAAGAT CCTGGAAAAC CCAATTTTAG TTGGACCACC AAACCCCTAT CTTGTGGTTG 420
GCACAATGAG GGCAGAnGGG TGACCCTCCA AGGATCTCCT CCAGGTTCTC ACTCTAACCA 480
TCTGGATAAA ACTTCCCTCA GGTGTAGACA GGGTGGGGAG GAAATGAAAC ATTTGAGATG 540
CGCCCAGAGT ACTGGCCACn ACAAAGACCT AATTCCCTAA GGGGAATCAC TCAGAGCACT 600
AATCTGGCnT GCAGAAAGTT GGGGG 625
(2) INFORMATION FOR SEQ ID NO: 457:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 457:
nTATCGTAGG CAAGATAGAC ATCCGTAGGA AACACCGTGA GTGACAGTTG CCTGTAAGnA 60
CATGGGACAT GTAACCTAGC CTAGCGGTTC ATGATAGAAA GTGCACAATA AAACTCCCAG 120
GATCACATCG GCCTCCACGC AGCGCCTCTG TCTCCCCCGC CCCGGCnCTG GCCGCGAGCG 180
GCTTCCCAGG CTGGAGCTCG GCCAGGCCCC GCGGCGGCTG CTAGGGTCTA AGCnGACCCG 240
GGAGATGGAn CGCAGACACA GCCCCTGCCG CCTGGGGACC CG 282
(2) INFORMATION FOR SEQ ID NO: 458:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 531 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 458:
GGTTCAGGCA ACTCAGGCCT GTGTTGTCCT GCACCTGGGC CTGCAGCTGT CCCTCCAGGC 60
TCTCCAGCTC CTTCTGCAGC TGCTCAGCCT GCGCCTTCAG CTGCTGCTCC GCCGCCGCCG 120
GCCCCTCAGG GGAGGCTGGG GACTCTGGGG TGGGAGGCTC AGCTGAGAAA GGAAGCAGCC 180
ATCAGGGGCC CTGGCCTCTG GGTTTTCAAA AAGCTCCTGT CTTGGTCCTT AGCTCCTCAG 240
GCCAACTTCT TGCCCCCACC CCGGTGCTGG CTTCTTGGCT AACAACTTTC CATATTCAGA 300
TGGTCATTAC CCTTCCAAGC CAGGACACAG AGTGCTTGAC CGTGGCATAG TGGACAGTGG 360
GTGCGCCACC CCTATTTACA GATGAGGTCT CCTATGCTCA AAGAGATAAC AAGACTCGCC 420
ATCCACCGGC ACAGACCTGC TTCCCTCTGC CACAACACCC CTCAACGGGG CAACCAACAA 480
CCATCCTGCT CCCAGGCCAA CTCTGCCCTG CTCAACCATC TGGCTCTTGA G 531
(2) INFORMATION FOR SEQ ID NO: 459:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 519 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 459:
CCACCGTGTT ACCTGGAGTT GTAGCGCCCC CTGGTGCTAG TCTAACTAGC ATGGCACCAA 60
CTCAGGTTTG CCTCCGTCCT TGTGCCTGCT ACCCCTGTAT GCTTGCACCA TCTCCCTTCC 120
CAGAGGACTG CGCCTGCTGT GCCTTCTCTT CTCCTTTCTT CCGGCAGCTT GGTTCCCAAT 180
GTGAGGCACC CTTCCCGGCA CTGGGGTGGG GGAGGCCAAG ATGGGGCTTT CTCTTGAGCT 240
TGTGGCAGTG GCAGGTTGCT GGAAGGAGCT GGGCTCCTGC CACCCCTGGG CCAGTACCAA 300
CAGCTGGCAG TAGCCCTGGA CTCCAGCTAT CTAACAAACA ACTCAGTAAC ACCATAGAGG 360
AATTGGTAAG AGCCCAGTGG GGTCCCCTGA TTCCATGCTG CCCACCCTGG GCTTCTGTTT 420
CCCCGTGGGC TCTGAAGAAA GGGGCTGGGG GCCCCCTGGT GCTGAGGGAG ATGGGGTGCT 480
GGGTGGGCCA GGTCTCAGTG GAGGGACCCC AGAGCATGG 519
(2) INFORMATION FOR SEQ ID NO: 460:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 528 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 460:
AGATTTTTGC CTGTCACTAT GCAGGGGGAG GTGTGTATGT GGCTATTGGC ATCTGTAAGT 60
AGAGGCCAGG TGTGTCTCTA AGCATCCAAT AGTGCGTAGT GTAGGCTACC CCTCCCAACC 120
CCCAAGTCTA CAGGCCTGGA ATTGAGACCC TTGAAAAGAC CATTATAGTC CATATTTAGG 180
AATTTGGGCT GTAGCCTAAA AGTAGTGGGA AAGAACTAAA GGGATTTGTG TAAGGGTGCG 240
ATTTGATTAA ATTTGTGCTT TAAATTTTTC TTTCATTTTA TCCCTCTAAT TTGAAAGGCA 300
GAGAGGGAGA TCTGCCATCC ACTGATCCTC CCACCCCCAA TTCCTGGCAA CATCTGGGGT 360
TGGGCCAGGC TAAAGCCAGG AGCCTGGAAC CTAGGTCTCC TACTTGGGTG ACAGAGACCC 420
CAAGCACTTG AGCCACCGTC TGTTGCCTCC CAGGGTCTGC ATTATTAGAG AAGTCCAGAT 480
AGTGTGTCCC TGGTGGTCCA GTGGCCAGGA TTAGAGAAGG CCAGATAG 528
(2) INFORMATION FOR SEQ ID NO: 461:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 461:
AAACTGGCTG nATTCATGTA CTAATGnCCC TTGAAGGAAT CCTAAATGAC TAAAATCACA 60
ACAAGATTTG CTCTTTCAAG AGATTTGTCT AATAAGnGCA GAACTGAAGG GTTTTATTTT 120
TTAACCAGAT AGAAATGGAA AAGCATTCAC TTTTCATAAA TGAAAAAAAT AACTCACATT 180
TCATAATCTT TTATGAATGn TGTGGATCAA CACTGAGAAC ATTTCAAATA AAAGTAATTC 240
TGCCTCCTAT GGAGTTATTT TTTCAGATAA ATCCAATCTG TACAGAACCA TCTGAACCAA 300
ACAnATGGGC TAATACCAAA AGTATCCAAG TTAAGAACGA CTCTAGACTT GACCTTGGAA 360
ACnTGGTGAT 370
(2) INFORMATION FOR SEQ ID NO: 462:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 387 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 462:
TGCAGCGCGC GGCGCACGGG GCGATACAAA ACTCCCTCTG CGCACTGAGA ATTGCAAGTA 60
TAGCGCGCCC CCACCCCCCT GTCTACCTGA nGAnGGGGCG GGGCGTGCGG CGGTGACGCC 120
ACCCTGCGCC GGTTGGCCCC TTCTGGAGGG TGCTAGACAC GGnnnGGGGG GGGGGTGTTG 180
TAAAAATGAn ATGGGTCGGG AGGGAACTGC ACCGTGCATT CTnTTTTTGT GCGGGCGCCG 240
GGGCGCTGCT CCCGGGGTTT GGCGCACTGC CCGTTTTTTC TGAGCAGCTA GGCATTGCCC 300
CGCAGTAAnT GGTCACGCCC AGCTGCATGG GGCATTAAGT TTAAAAAGAA ACCTTCGGTG 360
CAACGAACGA nTAAACGCAA CGGTTTT 387
(2) INFORMATION FOR SEQ ID NO: 463:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 464 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 463:
CACCTTCCTT CTACACCTAA CTTTTAAGGA TTATTTTAAT TTTGAAGAGA TGTTGAATTT 60
ATTGAATGAT TTTTTTCTGC ATCTATTGAT ATGATAACAT GGTTTTTGCC TTTCATTTTC 120
TTGACATGAT GTATTAGATT TATCAATCTC ATTAAATTTC CTTAAAATCT TCTTCTGAAT 180
TTTTTTGAAA GGCAATTTTC ATTTCTTTTG TGATCCCTTG CAGGAGATGA TTGGGTTACT 240
TTGAAGGTAT GATGTTACCA TGCCTTTTTT TTATGCTTTC TTCTTTTTTT TnTTTTTTnT 300
TTTTTGTCAT GCTGCTTTGC CTCTTTCCGA TTTTTGTGCC TCGCTGGCTC AGGTGCTTCT 360
ATCAGTGTTT TTAGCTTGGT AGTAAAAGAT TTCGGCGGCA AGGGTACAGA GCATCAGTAG 420
AAGTGTGTCG TGGCTGGnTC CGGTGAGTAC TTTAAAGnGG CCAG 464
(2) INFORMATION FOR SEQ ID NO: 464:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 513 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 464:
ATAGACATTG CAAAATGTCA AGGTACACAA TGGAAAATAA GCTCACATAT TTGTGCACAC 60
ACACACAGAG AGTAGGGTGG GTGGGGAGAG ATGGGATAGG TAGAGAGAGA GAGAAATATT 120
GATCTTCCTT CCATTGTTTC ACTCCTAAAA TGACTGCAAT AGCTGGGACC ATGCTAGGTC 180
AAAGTTGGGA AACTAAAGTT CCTTCTAGGT CTCCCATATA GATGGAAGGG ACTTCGGTAT 240
TTGAACCATC GTCTGCTGCC TCCTAGAATG CACATTAGCA AGAAGCTAGA TTTGAAACAC 300
CATAGCTAAG AGTTGAACTA GGCAGTACAA AATGGCATGT GGGCTTCCCA GACAGTGGCT 360
TAACCCTTTG TGCCACAGTG CTCACCCTAA CATCTGCCAT TGTGACGCTC AGTATCAGCT 420
TTGAACCAAG AGGAACTCTA CTCTGTAGGG AAAGGGTTAA AAAGGAGGCT GGTGGCAATC 480
CCCAATCATG CCATAGACAC CTGAGGAATT TAC 513
(2) INFORMATION FOR SEQ ID NO: 465:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 446 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 465:
CAGnAAAAAG ACTTATAAAC TGAAAGTTTT TTATACATCT ATTGTCCAAG AACCCTGCAC 60
AATCTAACTT TCCAATTATA CTCTCTACTT TCATTTTTAT CACTTATATA TAGCCTTTTC 120
AAGATTTCAG AATGCTAACA AAGAAAGTGA AATCACCCTT GATGTCCAGA TAGTCTTAGT 180
CCTCATCAAT ATCATCATAT TCATTTTCCT AAGTTTCATT CTTTTAAACT GACACACAAC 240
AATCACACAT ATTTACAGGA TACAATGTGA TGTTCTGATG CATGTATATA TTGTACAATG 300
ATCAAATGAG GGTTATCAGC CTACCTACCA TCTGTAATAT TGATCTATCT TTGTAGTGAG 360
ATCAAAATCC TCTATACAAG CTATTTTATA TATACATATC ATTAATTATA ATCACTTACG 420
TGTAAACACC AGAACATATA CCTCTT 446
(2) INFORMATION FOR SEQ ID NO: 466:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 455 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 466:
CAnAAnTGCT ACTGAAAATT ATTTTTTAGA GATAGCTCAA TCACCATAAT TAACTCCACT 60
TTTTCCATGT GATACACACA CACACACACA CACATAATGT CTTGAACTGT TTTCAATATC 120
ATTCTGGCAA ATCTGTCCTC CCTGCCTCTT TCATACCTGT TATTAATTTG TTGTCTGACA 180
TATAATATCT CTGGCACTGG ATTGTCCATT GTTTTCTAGC TTTCTAATTT TGAAGTGCGT 240
AAGAAGTTTT TTAGTTGTCT AACTTGGCAT TATCCTAATC TCAAGTGAAC ATAAAACTTA 300
ACATGTAATT CTTGCCCCTT TAAATATCAT CATGTGAAGT TGGAGAAAAT GTAGTAACCT 360
ACAGGGCATC AGTGTACTTT TGAGGGGCTC CTTTCAATTA CTTTCTCTAT CTTCATATCT 420
TAATTGCCCA TGGGATGGTG TCTTCATGAC TCCCT 455
(2) INFORMATION FOR SEQ ID NO: 467:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 467:
CGAGATCATG TTTATACTTC TCTTCATAAT GAATGTTTCC TATAACACAA ATCTATGTGT 60
GTAAGGTGTG GAGGAACAGA AAAAACACTC AAGTAGGAAA GAGGAGATGA TGGAGAAAGC 120
CCCACTCAGG ACTCAGAGCA TGGCTTGCTA CATCAAGGGC AGCAGTTCAC CTCCTCGGTT 180
TTGGAAACTG TTTCCAGAAA CAAATGGAGA GTGGTTTAAT CAAAGACGCA TGCATGAATG 240
AGTTTGATGA ATAAATTCAG AACATCAATC TGATAATTCA CAGAAAGAAG TGAACTGCCA 300
TAAATAGTCC TATGTGCAAA TCCCAATATG ATCATAAGCT TGCTGAGAGA GGCCAAAGTC 360
AGATCAGGGA AACTGCTAAG TGATTTGAAT ATTTAGTGGC ATCAGGGAAG AGAGTTTGAT 420
GCGTTATCTC TGTACTCATA ATGATCCGGT AAGGAATAAC ACTCCTG 467
(2) INFORMATION FOR SEQ ID NO: 468:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 460 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 468:
CAGGGCTTTA TTTATGGGAA AAAGAGAAGT CTAGAGGCTA AGCTGGGTCC ATACCAAGCA 60
GAGAGCAGGC CAGGAGCCAC ATGGAGCAAG TGTTTGTTAT GAGTAGCCAC AGGTAGCTTA 120
GCATGAGTAG CAAAGTGAAG GTACAGAGCA GGCCAATAGG CCATGTGCCC AnAAGGTGTG 180
GGGCAAnAAG AGAnGGGnCC ACCATGTTCC AGGCCTTTTA TCCACTTCCA AAGAGGAGTG 240
GTTATGTAGC CTGATGGGCA GTGGGTTTAC AGGTGGGGTC AGGTAGGAAC ATGAGATCAC 300
ACAGGGGCAT GGTGAAGATG TGATCTTCCA GCTCACAAAC TTGATCAGTT TTATCCCATC 360
TGCCTGCCGA CATCAACCTC CCCTCAGAGA GATTCTAGCC CTTAATCCTA AGGGCTGTTG 420
AAnGGTGTAG AATTATCATA TACTCTATAG CTGCTTCCTG 460
(2) INFORMATION FOR SEQ ID NO: 469:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 473 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 469:
CAGAATGTTT TATGACTAGG AAGCAAAAAG AAGTCAGAAG GAGCCATATC AAGACTATAT 60
GTTGGATGCC ACATGATTTT CCATCAAAAC TCTTGCAAAA CTGTCCTTGC TTGATGGGAG 120
GAACAATAAG AAGCACAGTT GTGGAGAAGA ATCTGGAGGA GCTAGCGTTG GCTAACCTTC 180
TCAAAAACAC TTTCCCAGTA ATCAGATGGT CTCATTGTTT ATCGGTTAGA AAGTCAACAA 240
GCAAAATTCC TTAAACCAAA TATTAACTGT TGCTGTGGCA TTGACGCTGG CCAGTCCGCC 300
TTTGCTTTGA CTGGCGCACC TCTGCCTCTC GGTAGCTGTT GCTTTCATGT GCTTTGTCTT 360
CAGCATCATC CTGGGAAAGC TGTGCTCATG TGTTCTCACA GTGGTTTGAA GAAATGCTCC 420
AGGACTGGAT CCTGnCCTAA GTAGATTGTC TATCAAAACT GTACTCTTCT GCG 473
(2) INFORMATION FOR SEQ ID NO: 470:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 613 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 470:
TGCATTTTTA ATTCTTAGTA ATATTCAGTA TATTTACCTA TGTAATAATC TTTCATTGTA 60
GCCAGGAGAA TTTTCTTCCT GCCCACTGAA CTCTGATGAA GATGTGAATA ACTGGTTGCA 120
TTTTTATGAG ATGAAGGCTC CTTTGGTCTG TCTACCAGTT TTTGTCTCCA GAGATCCAGT 180
AAGTTTGTGT CTCTTATATT GCCGCAAATA CATTAATAAT GCTTCATTTT TAGGATATTC 240
ATTTAGATGT AGGCACCTTT TGTTGTGGTG GTATTTTTCA ACAACTAGAG CAGACTCTAC 300
AATCTTCTTG AAAGATAATG GAGCAGTGGC TTACCACTGA ATGAGTTAGG CAATGCAGAT 360
TTTACCTTTG AAAAAGTGGA AGGCAGGAGT GCAGGGCACT GTAGCACAGA GGGTTAAGCT 420
GnCACTCTGG GATGGCCTAA CATCCCCGAn GTnGATAGCC ACCCTGGGTG TGGAGTCCCC 480
AGGCCTAATT CCTGGAAACC TTTAAAAAAT CCCAnGCTTT nCCCTGGCCT nAAATGGCCA 540
TAATGGGGGG AnGGGGTnGG GGGGTnGGAA TGGGGnCCCC CAAGGGGCnA AnCCTGGGGG 600
GGATTAACCT GGG 613
(2) INFORMATION FOR SEQ ID NO: 471:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 617 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 471:
TGnGTAATTG CACTAATAGA ATGACCACTT TGTTGTTTCT GGTTCCTGAC AGAATAGTAC 60
GATAGGTGCT CTTTTGGCAA ATAAAACACT TGCCACTGAA GAAATATTTG ATGTTAAGCA 120
ATTTTATTAT ATTTTCTAGT GCCTGGTCCA TACTATGCAT ATAACCAGTA CTTAATGCTC 180
AAAAAATATT TTTAAATCAA CAGGTCACTA ACTCATTTCT AAAGTTACAT AATTCTGTAT 240
TTTTATAAAA TAGTAACAAG ATAAGTGTAT ATTAACAGCA CGATATACTC AAGTTCCTCT 300
TACCTATAAC TGACAACACA CATCCTCAAA TCATGTAACT TTAAGATTCA GCTATTGAAT 360
TAGTAGTGAA AACTTTCCAC AAGTGGAAAT ATTCTTTTCA ACTTTGTATT TTTTTAACAG 420
TCTGCCATTC CTAAAGCTGC TGGGTTTGCT TTGCTCCTCC AGTTGGTAAG AATGGAAACA 480
TCAACGATGT CATGGTGTAA ATGAGGAGAT GCACTCTCTA GACTGATGCA GCCTTGCAAG 540
TTAGTGCCTC TGTGTGAAGA GAAAGGCTCT TCTGCCTCTG GGCAGAGACA ACCAAACCCC 600
AGGCTGCATC AGCCCAC 617
(2) INFORMATION FOR SEQ ID NO: 472:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 491 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 472:
TTTTAGTTAT TGGGCCATGG GGTGTCAATT TTGTTTATTT TTTCAAAAAA CCAGCTCCTC 60
GTTTGGCTGA TTTTTTTGTA ATTTTTTTTG GATTCAATCC TTTTGATTTC TTCTCTGATT 120
TTAATTATTT ACCTGGAAGG GAGATACCAT GATCATGGTA GCTTTTACTG GTAGATTTTT 180
GCCTGCAGCA ATGATTGATG TGGTGCTCCA ATGATAAATT GTATTTCCCT AATTTCTTCT 240
ATATTAATTA GAATGTGTCT GTAAGGAAGA GCTGTGCCTA CTCCACAGTT ATTACTTCAA 300
CCACTTATAT CAGTAAGGAC TCCCTGATAT TAATTTTATT GTTTCATCAC AACCAAATGC 360
TTTGTTACTC ATATTTTTGC TCTGATTTTT CTCAGTTGGA CCCTCAGGAA CTTTTTTAGA 420
CTTGTTCCTG TGTTCTTTTC ACATCTTCTC AACCCCTTTG CCACCACCAC TACCACCCAC 480
CTTCCAGGCT T 491
(2) INFORMATION FOR SEQ ID NO: 473:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 491 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 473:
nACTGAGCTT TTCTTGTTTA TTAGCATATA GTTCTACATA GTGGTTTATT ATGACACTTT 60
GTATTGCAGT GGTGTCAGTT GTAATGTTTC TTTGTTCACT AATTTTATTT ATTTTAGTTT 120
TTTCTTTTTT GGTTTGTTAG TCATGCTCAA GTTTTGTTTA TCTTTTCAAA ATAATACCTT 180
TTTGCTTTGT TGATCTTTTG TATTGCATTT TTAGTCAATT TCATTTATTT CTTTTCTCAT 240
CATTATTTCT TGCCTCCTGC TCTTTCTGGG TTTGGTTTGT TCTTGTTTTT CCAAGTCTTC 300
AAGATGCATC ATTATATCCT AATTTGAGAC ATTTCTGTCT CTTTTAATCA TGTAATGCTA 360
TAAACTTCCC TCTACTGTAG CTTTTGCTGT ATATCTCAGG TTTTGATATG TTATGTTTTA 420
ATTTTnCACT TATTTCACAA AAGTATATAT TCATTTnAAA TTTCTTCAAT GACCTGTTGA 480
TCATTTGGTA G 491
(2) INFORMATION FOR SEQ ID NO: 474:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 474:
TGAGTCCCCA TGACTCTGCT TCATGTTTCT TTGGTACTAC ACTTTCAGAA CTGTACTGGC 60
AAATGACTTT GGGGACAGTC TGTTCTTGCT CTGTTGCTGC TCTGAGCCTT CAATCAAGGT 120
CCACACAAGG CCAATGATAG AAATGTTGCA TCCAGAACAG CCCCCTTATT ACTAAGATGA 180
CCTCCAGGCC CCATAAATGG ATCTTGGATC AGAGGTCAAG ACTTTCTGTG CAACTTGAAC 240
ATTTCCTGCC TTGGGAAGTT CACTGCnCCC AnCCCCAGGA CATGTTCTGA GCTGCTTGGT 300
CCTAGGTGTA nAGGCTCTGG CCGCTGGGAA TGTTGTACTG AGTTGACTTC TGAGTGTCCA 360
GATGCCAGAA AA 372
(2) INFORMATION FOR SEQ ID NO: 475:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 388 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 475:
AGATAGTATT CAAGCTTCAG GTAGGTGGGT TTCAAGCTAT ACACAAAAAT AAAGGAGTAT 60
AAAAGTCCTT CATTTGAGGA AAAAAGGGCC TGCTTCACCA ATTTTTTGTA GTACAGGTTA 120
GGCAAGATTC CTTTTTAAAT TTCAAATGTA TGACAAATTC TGAGTTATTC CCTAATGTCA 180
CCTTATAAAT GGATATAGAC CATTACCTGA ACATTGTTTT CTTGTGCTTG GTGAGGCTGT 240
CTTACATGAG GCAATGAAGG CTGAGTTCCT AATTTCCTAA TCCCAAAAAn nCTTCTTnGC 300
nGCAAGCATA ACACCAAACT CGAAGAGTGG TAAGGTTCAC AGTTAGATAn TGCTTGTCTC 360
CTGCATAATT CCACAGGAGA GAGAGTAG 388
(2) INFORMATION FOR SEQ ID NO: 476:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 563 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 476:
TTTTCTGAAA TTTAGACCTA AATAATGATA ACCAGAAAAA AACTTTGACA CCTAAAAGGA 60
TTTTGTAGAT AAACATACTT TTAAAACTTC ATTATTGATG AAACATTAAG CTTCAAACTT 120
TGGAAGGACT TGCAACTTAA TGGTGCTTTT GCACACTTTT TAATAAATTG CTTTGGTATT 180
TTATGCTTTT ATTAGAGGTC GTTTAACTTT TGGTTAAATC GATTGTAGAA AGTCATTGAC 240
ACATATAATC AAAACTTAAC TGTTAAGAGT TAACACATGT AATCAAAACA ACTGCAGATT 300
AATTGTAATG CTGTTACAAA TTTACGTGTT GTAAAATGTC TTATTTTCCT TCTTGTATAA 360
TTATTTATAG AATAGATGTT CATATGTTGG CTGTGGTGAA TCGCAAGTAG ATCACAGCAC 420
CATACATTCT CAGGTGTGTA TATATGAATT TTCAATGCAG ACATTTTTTA AATGTTTCAT 480
TTTGAGATAA TTGTAAAATG GAAGATGTGT TTTTAAAGGG AGACTGAAGG AAAAGGACAG 540
CCAATAATAC AACACTGTTA ATT 563
(2) INFORMATION FOR SEQ ID NO: 477:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 437 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 477:
AGGATGGCTG AAGCCCACTT GTCCATCATA CCAACCAGGA ACTTGGACAC AAGCTGATGT 60
AACCAGACCA GAAGACCAAC CATTGTGACT CCAAGCAGTC CTTTATCACC CCACCAGAGG 120
CAAACATATA GCTGTCTATC CTTTTCCCCT TCTTCCTGAC TTCCCCCTTC TTTATAGCCT 180
CTTTAAAACT CCCCAAATAC CCTCATCCAG GAGGGTGGTA ATTCCTAAGA CTTTGGTCTG 240
CTACCCCTTC ATCTGGACAA CAGATTAAAA ATTTTCTTTC CACAACCCTC AATCCTCCTC 300
CTGGTTAATT TAATTTGGCC ACAAGGGACn GGGACCAAGC TTTGGGGnAA ATCGGAnACC 360
ACTGGTATCT GGACATTCAT CCGATTTTCG TGAGGAGTTG CTGCCGAGTT TGGCTTGAAA 420
ACCAGAGTCT CAGAGTC 437
(2) INFORMATION FOR SEQ ID NO: 478:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 391 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 478:
AAnAACCGCG TTATTTTCCC TGTTTCTCAG AAGGGGAAGC CGAGGCACAG AGGGGCGGAA 60
CCTTGCCCAG AATGGTAGAG TGGGATTCAA AGTAAGGCAG CGCGGCTCCA GAGCCCCGCC 120
CTAAACCACC ACCCAGACGG CCAAACTGCA GACTGACTCG TGCATTCAGA ACTGGCTCAG 180
AAATCCCTTT GTTGGGnGGT AGGGGGGGCA GGGAAGGCCC CACACGCTCT CTGGGACTTT 240
CTATATGGCA AGTGGAnCGG CTGGCCCTGC TTTCTCAGGC AGAGCTGAGC ATTCTGGAAC 300
TCTTGCATTG GGCTGGACTC CAGGAAGGCC TGTGTCCCTG CTAAnCTCCC AGCAGGGTCA 360
GAGCGCACAA CGCTGTGTGT TCCTAGGAnC G 391
(2) INFORMATION FOR SEQ ID NO: 479:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 585 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 479:
ACnCCCATTC AATAnGATGC TGGGCCATGG GTTTTTCATA AATTGCCTTG ATTGTGTTGA 60
GGAATGTTCC TTCTACACGC AATTTGCTTA GAGTTTTCAT CATGAACGGG TGTTGTATTT 120
TATCAGATGT TTTCTCTGCA TCTATTGAGA TAATACTATG GTTTTTCTTC TGCAGTCTGT 180
TAATGTGGTG TATCACATTG ATTGATCTGT GAACGTTGAA TCATCCCTGC ATACCAGGGA 240
TAAATCCCAC TTGGTCTGGG TGGATGATCT TTCTGATGTG TTGTTGAATT CTGTTGGCCT 300
TATTTTATTG GGGATTTTTG TATCTATGTT CATCAGGAAA ATTGGTCTGT AATTCTCTTT 360
CTCTGTTGCA TCTTTTTCAG GTTTAGGAAT TAAGGTGATG CTGGCTTCAT TGAAAGAATT 420
TGGGAGGATT CCATCTnTTn CAATTGTTTT GAATAATTTG AGTAGAATTA AGTTCTTCTT 480
TAAATGCCTC ATAGAATTCA GCAGTGAATC CATCTGGTCC TGGACTTTTC TTTGTTGGGA 540
GGGCCTTTAT nACTGATTCA AATTCTGTCn CAGTTTTGGT CTCTT 585
(2) INFORMATION FOR SEQ ID NO: 480:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 396 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 480:
GCGGGCGTGG nGAACCACTG GCGCCCTGAC CTGCGGTCAC nAAGGAGGGC GAGGGCCACG 60
GACGAGGCGC GGAGGAGCCG CGGAGGGAGC GGGGAGCCCA GGTCCCGCGG CACAGAGCGC 120
AACCTGAGAG CCTGGGCCAG GGGAAGGGGG TTCATGAGGG GAGAAGAGGG CACAGCCTGG 180
AGCTGGGCTC ACAGACCTGC GCAnGCGAGT CCCCGTGCGA CCACGGCGCC CCGGTCCCGC 240
GCCACGTGCA AGGTGAAGGG AGCCAGGTGC GAGGCCGCGG GGACTCACGG CCCCGCTTCT 300
CCTAAGTCTG ACAGCAGCTT GGTGGACGCA GCACAGCGGT CAGGGACGCG TGGGACACCC 360
GCCGAGATCC TGGGGGAnCC AGCGGnTCCC TCTCCG 396
(2) INFORMATION FOR SEQ ID NO: 481:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 254 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 481:
CCCTCCCTGC CGCCTCGCAG CGAGCTGGGG TGGGAGGCTG AGCAGACGGG ACCCCGGCCC 60
AGCTCTGGCA CAGGAGGTAG GCATCCCAAG CGGTGCTTTC GCCACGCCGG ACGCCTGCCC 120
ACCCCACAGG TGCGCTTATT CAAGCCGGCC AGTTCCTCGC CCCGCGGTCT GGCTTCCCCT 180
CTCCAGTCCC AGGAGnCCGC GCAGGGnCCT GCTCCCGACC CAGAACCTGT CCTAGGTGCT 240
AAGGGGCCCC GGGG 254
(2) INFORMATION FOR SEQ ID NO: 482:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 552 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 482:
CCTCTGAGAC AGGAGAGGCG GACTGGGGTG GGATGGGGGA GGGCCCCCTG GTGTACCCTG 60
GGGCCACTGC TGGAGAAATC AGCCCAGGGC TTCTCCCCAG GGCAGATCTG ACCCGGACAG 120
ACCTGAGGGC TCAGCAGGGA CAGCTGCACA CTCACCGCAG GCATAGCCCC ACCCCACCAC 180
CACCATACAG ACCTGGCAGG ACCCCAGACA CGCCTGCCTT TGTCCCACAA ACATCTGAGT 240
GCCACGCTCT GTGCCGTGCT GGCTACAAGA GTGGCGGGGA CCTGAGGCTT CTCTGAGGAC 300
CCCTGCGAGT GGACGGAGGC GCGGTCAGTT CCTGGGGGAA CCACTGCAGG TGCGGCAGAG 360
CCTGGCCTCA TCCAGGGCTC GTCCAGGGCT CAGGGAGCTG TCCGCACGAC GGAGGCAGTT 420
TCGTTCGTAA CCACCGGCAG GCAAGAGGTT CAGGGCGGAA CCAAACTCAG CTCCACCCGC 480
AGCAGACAGA CGGCGTCGCT GGGGCGGAAC ATCAGAGAnG GGCCGCGGAA GGCGGGGGCT 540
CTGGCTGAnG CA 552
(2) INFORMATION FOR SEQ ID NO: 483:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 483:
AGCAAAATAC TTAAGAACTG AAATGAGCAA AAAGCAGTTT TAAGAATATA GGTCAGGACA 60
TTAATAAAAA TTTGATCGCC TCCTTTTTAT TCTCTAACAT TCCCCACAAT AAATCAGCCC 120
AGTTTCCCCC AGCCCCTCCT CACACACACT TCACCCTCCC GTCCTGATTT CGGATCGCAG 180
AATGTAAATC TGTGACAAGG GTATCTATAT ATAATATTAA TGGTGCCGAG GAGGGCTGGT 240
GTGAAGTGTG AGAGCTTGCC AAGAAAGGAG CGATCTGAGC CCAGCCGTTC ATCCTGCGCA 300
GTGTGCTCTA CATCATCCAC AGGACAAATG TAACATCATT AGGGGGAAAA AAAGGAAGAA 360
AGAAAGGGGG CAGGAGGAGG TAGGGAAGTA GCCATTCTGC AAGGAAATGG CCAAGTTGGA 420
G 421
(2) INFORMATION FOR SEQ ID NO: 484:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 521 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 484:
GGACTTACTA GATGTGATCA GGATAAAGCT GGATCCATGT CACCCAACAG TAAAAAACTG 60
GAGGAATTTT GCAAGCAAAT GGGGCATGCC CTATGATGAA TTGTGTTTCC TGGAACAGAG 120
GCCACAGAGC CCCACCTTGG AGTTCTTGCT CCGGAATAGT CAGAGGACGG TGGGCCAGCT 180
GATGGAGCTC TGCAGGCTCT ACCACAGGGC CGACGTGGAG AAGGTTCTGC GCAGTGGGTA 240
GAAGAGGAAT GGCCCAAGCG GGAGCGTnGG AGACTACTCC AGGCACTTCT AGATCCCTCT 300
TCTTCCTTCA TTGGCCTCTC TGGACTTTGA AACAACCACA AGTCAAAGAG GAATGTGAAT 360
CTGTCCTTTT GGAGTGTAGA ATAATGATAT GAAACTGTGG ACATTAGTTT TCCCCAAAGC 420
TGGTGATTTT GTGGAGGGGT AGATTTGTTT TGGTGGTGGA TATTGTTTCT TGGTTTTTGC 480
ACATCTGTTT TAATTTAATA TTGAATCTGG AGTTGGGAAA G 521
(2) INFORMATION FOR SEQ ID NO: 485:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 485:
AGGCACCATT TCATCCCAGA GTCCAAGCTG AGAGCAAAAC GGCCTGAACT GCACCCCGGG 60
GACAGTTCTC ACTCAAGAAA ACACATTTTC CTCTCCCAAC CTAAATTTGG AAGGAATATA 120
CTAGGGTTTC TCTAAGCAAA CAAACTTGTA AAACCATCGG GGGAGGTGAG GCCGCAGGCC 180
CGCTCTACCA GGAGCGCCAT ACCAGCTGCC TCTGGGACCA CCCCATCCTT CAGCTCCCAA 240
GGGGCTGCTT AAAGACTCAA CGTCTCATTC TTCATAAACC CACCTCCTAG TCATAAGCCT 300
CAGGGGAGAC TTTTTTTTAA GCTGGTAGTG ATTCTTGGGC ATAATAATAT ATGACAAAAA 360
TGAGGGCATG GAGGAGTTAG ATAGAGCTCA GGCCAGGTGA AGGTCCTTGC ATTCATCTCC 420
TGCAGCTGTG GCAAAGGAGC AAAGGAAACC AAGCCAGCAA GAGTGTTTCC GCTGAGCAGA 480
AAAGGAACGG CTCTGACCGG CATGCAAACA CGCCTGCTGA GACCTCCTGC GT 532
(2) INFORMATION FOR SEQ ID NO: 486:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 486:
TTGTTGGCCA ATATAGTTCA GAGGGGTCTG TGTAGGGGAA TTTCTGATCC CGACAGGAAA 60
TTCTAGGTCT GTGGCAAAAA GGTTACTACC ACTTTTGTAA CTCCCAAATC AACTGAAGCC 120
TACCTGGCCA TCATAAACCT TCCCTAAAGG GCACGGCATT CTTGACTAAT CAAGGGAAGG 180
TCACAAGCAC CACTATCAAC CAATAATTTG CACATGAGCT GGACATGAAC TTCTCAGACT 240
CAACTTCAAT CTGTTAAAAC CTTCACCCCT GGAGAGTCTG GGAATTAATC AATAGCTGCA 300
CGGCCTCTTA TTCTGTGCTT TGCAATAAAT GCCTACTCTC TTCCACCATC CGATGCTAGC 360
AATGGCTTCT CAGTTGGGCA ACCAGACCTA GTTTGGGGTT CTATAGAAAG TTCTCCAATG 420
AGTTTGGGGT GTCATTGGCA AACAAGTACT CCAAGGAGCT GACATCTCAT CCACTGGAAC 480
TGAGGGCTCA TGGCATATCA CCAGACCAAG TAAAATGAAA ACGGGGACAG TT 532
(2) INFORMATION FOR SEQ ID NO: 487:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 487:
CAACAAATGC AAATCAATAA AGGCAATACA CCACACCACA TCCACAGAAT AAAGAATAAA 60
AATCACAAGA CCATCTCAAT AGAGGCAGGA AAGGCATTTT ATAAAAATCT AACATCCTAT 120
CATGATTAAA ACTCTTAACA ATTTAAGAAT AGAAAGATAA TACCTCAACA CAATAAAGTT 180
TATATATAAA AAACCAGCAG CTAACATCAT ACTGAATGGA GAAAATCTGA AAGCTTTTTA 240
AGATAGGTCA AAAGACAAGA TTCAACATAG TACTGGAAGT CCTAACTAGA ACAGTTAGAC 300
AAGAGAAGGA AACAAGGTCA TCCAAATTGG AAAGGAGGAA ATTAAATTGT CAATGTTTAG 360
GCTGACATGA TCTTATACAA GGAAGAGTCT GGAAAACTGT TAGAACTAAT AGATTCAGCA 420
AAGTTGCAGG AC 432
(2) INFORMATION FOR SEQ ID NO: 488:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 450 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 488:
TTGGAAGTnA GAGCGnGAGA ACCTGATGTT GGACATGAAT TTTATTTTGC TAATGATCTT 60
CACTCAACTC TCTGAAGGTA TTTTACCATT ATCTTACTAT TGTTAATTCT ACAGTCCTTT 120
ATAAATCCAC ATAAATGACT TTTTTCTATT TGGCTGCTTT TAAAATCTAT TTATCTTTGT 180
TCTTGCCAAG ACCCTGTACA TCTAAGTACA ATTTTCTCTC TGCTGTCCTA CTTGGGACAG 240
TATGCAACTG TGGATTTATG TTTTTCATTA GTGCTCAAAA AACAGCAGCC ATTTCCTCTT 300
CTTAAAAGAT TATTCATTCA TTCGAAAGGC AAAATGAGAG AGACCAAGAG AGAGAGATGA 360
GAGACAGAAA GAGAGAGATT ATCTCCATCT ACTGGTTCAC TCCCAAGACG GCCACAAGAT 420
GAAGCAAGGC CCCAAAAACT CCAACTGGGT 450
(2) INFORMATION FOR SEQ ID NO: 489:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 615 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 489:
TGAATTAGTT GGACAAAGCA AAACAGAAAA ATTCTGGTGG TAATAAGTGA TAAGTCCACT 60
AAGAAAATAA GCAGAGCATA GGTAGGGATG TCTGTGTTAG GGCTGTCAGG GTCTATCCTG 120
CCTCTCTAGG CACTCAAATG TCCGCTAAAT ATGTCTGAGT TGGTTTGTTT ATATTATGAT 180
AGAACCTTAT GCTTATCTCT CCTACATACA CATGTAGATA CGAGAATTAT TTATGTGATA 240
GTTCACCCAG TTTTATGATT CAGAAGGTGA CACCTCAGTC TTCCATGTTC ATAATTATTA 300
AAGACTATAT GCCTTAAACA ACCTCAACCT TCCACTGCAG AACAGCACAT GTTTGTGGCA 360
TTGATGATTT TTCAAAAAGC AGAGGCCCTT TCTTCAGAAA AAGAGAAGAG GCAAATAGAG 420
CTTCATGAAC TAGATTTAAA ATAGGGGCCC ATGCTGATAG CCACTGTCAT CATACTGTGC 480
CTTTTGGCCA CCAGTCAGAT GCAGGCTCTG AGCTTAGATT CCCTGATTCT AGACTCTGGG 540
GGGACCTCTA AGCCACTGTC AGAATTAGAA ACCAAACCTG ACTGAATGCC CAGAGATTAT 600
CCTTAATCTG CTATT 615
(2) INFORMATION FOR SEQ ID NO: 490:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 490:
AGTAAGTTTC AAAATTATGC CAACATGCAG GTCCGTGATT GTCTTGTTTC CCTAAAGAAT 60
ACTGGAAGGA AGAGGTATTT TTACTTCCTG TAAATTTATC TCTTCAGGCC AACAGAAGTA 120
GACCCCTAAA TTCCAAACAG CCCACCCAGA TCACTATTGC TAGAGAATCA CTGGTGAGGT 180
ACTTGTGTCT CCCACCCCAT TTATTTCAAG TAGCAACAGG ATGATTTTTT AAAGAATTTT 240
GCTGGCCTTC TTTCCTAACT GCTTTTTTGG CCTTCTCCTG TCCACTTCCT TTTTATTATC 300
CTGCAAAATG ATTATAAACC CACAGAATGT TTTAGACTGG TGAGTGCCTG GAAGGCAGAG 360
CCAGTTCAGT AGCCATCCCG GTGAGCCCAG ATGGCCCAAC AGGAACGTGT GCTGGAAATG 420
GGCCCAGTT 429
(2) INFORMATION FOR SEQ ID NO: 491:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 491:
TTTCCTTCGA CGATGTCATC AGCGTGCGCT TTGCCTGAAA CACCTCCGTC AATGCCTTGT 60
ACATCATCGA GAATCTGAAA TCCTATCCCA ATAGGCATGA CTTGCGCTCC AAAGGACnGG 120
CCTCACGCGC AGAATAACCT GCACACAAAA ACCCCAATTC GCCTGATAGA GCGATGAGTG 180
CGCCAGTTTT CAGTGCAACC ATACGCAGAT ACTGCGCACG CGAGGGAATA AGTTCTGGGC 240
TGCGATGCCA TGCAATATCA AGCGCCTGAC CCATATGAAG AGCACGCGTT GCACTTATCG 300
TGGCAGAAAA AAGAGCCGCC TTAAGGGCAG GTTCTATGCT GAGCGTGTCA ATGAGTGCGT 360
GTGGCATGAA AATACAACCA GCTTGCTGGC ATTAAGAACG CAGTCAG AC CGTAGCGGCA 420
GATACGGCAC AT 432
(2) INFORMATION FOR SEQ ID NO: 492:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 638 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 492:
ACGGGACTTA AGGAATGTGA AGTGATCCTA GTTTAGACCA GTCTTGTACT TAGCCTCGCC 60
GTTAAGACAG TTCAGGTGCC CACGGCCCAC ACTGGAGTAC CTGTATTTGA CTCAGCTCCA 120
GCCCCTGGCT CCAGCTTCCC GCTAATGCGC ACCTTGGGAG GCAGCAGTGA CAGCTCAAGC 180
AGCTGTGTCC TTGTCACCCA GGCGGGAGAT CCAGACTGAG CTCCCCGCTC CCAGCTTCAG 240
CCCGAGTGAG GGCTGTTGTG GGCATTTGGG GAGGAAGCCA GACGATTGGA GGACACTCAC 300
TTGCCTTCCA CCCTTTCTCC CCCTTCTATT TCTGATTCCC AAGTAGGTAA AATTTAAAAT 360
AAATACATAA AATACTCCCC AGAACTAGCT CCTATTTTAA GTCCAATTAT TACAATGTCA 420
CAGATACTTA TCAACAGGTT AGTCATTCTT TTCCATTTTA ATAGGAAATG AAAAGGAAAC 480
TAGGGCAGAA AAAATTGGTT TTAAAAGTAT AGTGATGGGG GAAGAATGAG TTTTTCCTGC 540
TCTGCTCTTC TGAGTACTGA GGTTCATGGG GGACTTCCAC ACAGGAGCTG TGTCTCTGGG 600
TTCACCTCTC CCCCAGAACA CGAGTGACnT TAAACTGG 638
(2) INFORMATION FOR SEQ ID NO: 493:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 641 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 493:
GAATCCCGGG AGACTACAGT GAGGACAAGA GTGACAGTTC CTCCCCCATG GGCCACCAAT 60
CTGGTGGAGA GGACAAGAAA TGAACAAATA GGGGGCAGCA TTGTGACCTA ATGGGTAAAG 120
TCACTGCCTG CAGCGCTGGC ATCCCAAATG GGCACAGCTG CTCCACTTCT GATACAGCTC 180
CCTGCTAATG CTCCTGCGAA GGCAGCAAAA GGCGGCCCAT GGGAAATGAA CAAATAATCA 240
TTACAAAGCA TGGCATGTGC TTTCAAGGAT TCTGCGAAAG ATGAAAGTTG GGGTGGGGGT 300
GCTGGTGCCC AGGTGAGCGG CCAAGGAAGA AGCCGGATCA TTTGCTGGCA CCTGGCCACT 360
GACCTCCTGG GCCAGGCAGT ATGTCACACA CACCCAGGGA GAAGGGAGGG GAAGGTGACA 420
TCCAAGGACT GAGGCAGAGG GAGAAAGGGG ACAACTCCAC TGAGAGGAAC GGGGCATTGT 480
ATGCTAACAC AGTATTTGTG AAGCCGAGGG AGTGGAGGGG GCTCTGATAA CTCTCTAAAC 540
AAAAGAGCTT GTCCTGGGCA CGGGGAAGCC TCTTCAGTGC CTGCAGGAGC CAGCTCTAAC 600
CACCTGGAAA TCAGTCATGG GGTGACAGCT GAGCCCAGGA G 641
(2) INFORMATION FOR SEQ ID NO: 494:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 494:
ACATCAAGTG ACAGCAGATA AAGCACTTGA CTTTGCCTCT TAAACACACA TTCCTGGGAT 60
GAAAAGAGGC TTAGAAAAGC AGGGCTGGGT CTGTCTGTTT TGTCACTGTG TTTACCGTGT 120
GGTCTGCATT CATCAGGGTC TCCCACTGCA GTGTTTGGGG TCCTGGCCCA GAGAGCAGGA 180
CCAGGAAGTG GCTGGGAGTA CACGAGGGGG TGCATGTGCG CGCCCTCATC CAGACGGCAT 240
GAGGTCCACT TGGCTCCTTC CTATCACGAC TCTTCTGGAA GTTCCAGAGG ACGGGGTGGG 300
GGGAGCCGTG GGGGGCTGGG CAGAATAnTG TGCCTCTGGA CAGGAGCACA CAGCAGTTTT 360
CAGGGGGCAA nTGGGAAAGC AAAGTCAATT CTCTGACCCT GAGGGACTAG CCTCAGTAGC 420
CTCCTATCTT TCCTCCTGAA AAGTTGCnAT TCCACCCGTG AGCCTTTCAn CTGTCTTATT 480
TTTCAAAAAA GATTTATTTA TTCACTTTGA AAGTCACACT TGAGGAGAGA CAGAGACATC 540
TTTTTGCTGC nTCA 554
(2) INFORMATION FOR SEQ ID NO: 495:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 584 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 495:
ACCCTGCGTC GAAATCAGGA ACCGTCTTGA GCACGCTTGC GCAGGACTTC AACGCTGTAA 60
ACACCTGGAT ACCGCCGACC GTGTTCATCA GATTACTCTC ATCAAAACGC ACCGTAAGCG 120
CGTAGACAGA GTGCTCAGGC GGAACGATCT CGCGCAATTC CAACCGATCG TATTCAGAAA 180
GAAGAGACAG CGGCGGCACT CGCCCTACCT GAGGGGCTTC CGTCCCATCT TCACCTACCG 240
CGGTCCCCTT TGCAGCTTCG GCTGCATTTG GCGAAGACAC CGACAATCCA CGCGCCGCCC 300
CCGGCACAGT CCCAGACGCA GGCAGATACG AGCGCAGACG CGCAACGAGA TCCGAGACAT 360
CCTCCGCGTA GGACCACCGC CCGCACGGGA CTCGAGCATC GCCTTAATTA CATCCAACGA 420
CGTCAGGAGC AAGTCCACAA CCGCCCCGTC AACAGTTACC TTCTCTGAGC GAATCCCATC 480
CAGCAGATCC TCCACCGCAT GCGTGAAGCC GGACAACTCG TGCATCTCAA CAGTCGCAGC 540
ACCCCCCTTA AGGGTGTGCG CCGCAGGAAA ATCTCGTCTA CAGC 584
(2) INFORMATION FOR SEQ ID NO: 496:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 578 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 496:
AAGTACATTG AATGAAACTT CAGTGACTGC CCAAATGAAA ATAAATCCGC AGTCAGCCCA 60
CCTCCCCACC AAAAAGGAAA AAAAAAACAG GAACCCCGCT GCGATCTTAA CATAAACAAA 120
CAAAACCTCA GAACCGGTTC TTCAAGCTTC CCAGTCCTAC CnnnnCACCn GCnCCCCCCA 180
TCCCCATAGG TTCATACACC AGCTGCCAGT CCAGTCTTTA AGGATAAnAn ACCACTACAA 240
TGTAGCACTT TCACTTTCAT TTGCTGGGAG AAAGCCACTG GCGTTAAACT GTGGAGAATA 300
TGCCCAATAT GTTTTCCATG TCCTGAGCAA GCAAACAAAG TCTGGCACAC AAATCGGAAA 360
CAGGGAAAAA TGTGAAAATA GCTCAGGACA AATGCTGCCT TTCAGTGGTG TAATCTGCTG 420
CATCATTTAT CTGTCTCTAC CGCTGAGTGT AGGGTTTAAA AGTTTCTGAG ACTGGGGGAG 480
GGGAAAATTG ACGCCTTGAA AGTCCAGTCC TAGGTTAACG CCAAGTCAAT TTCCGATGGC 540
CACAACCAAA AAGCGGAATT GGACTTGAAG GAAGGGGG 578
(2) INFORMATION FOR SEQ ID NO: 497:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 619 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 497:
GnAGTTATTA ATGTTTTAAT AGTAATAGTG GTATGGTAGT TTTTTTTGTC CTTTTATTTT 60
ATTTTTTTAA GGGAAAGAGT TTATTGGGGG AAACCTGACA GACTGGAGGG AAGGGGCAAA 120
AAAGGAAAGA GGGAGAAGGA GAGCAAAAGA GAGAGAGAGA GAGAGACAGA GTGTTCAGGA 180
GACGGAGACA AAGAGAGTGT CCTTATATTT TAGAAATGCA GAGAAATAAT GAATGCAATA 240
AAATGATGTC TGAGATGGGC TTTAAAATAA TCTCATGAAG TAGAACAAAT CATAGAAATA 300
TAGATGCAAC AAAATCCACC AATCTTTGGC ACTTGGTAAA ACTAAATGAT AGCTACATTG 360
GAGTTCATTA TACCATTTTT TTCCTTTTGT CTATGTTTTA AATTTTCTGT AATATTAAAA 420
CATTTACATA TGTATAAGTA TGAAGTTAAA TCAACATTTC ATTTCTCATT CTCATAATTT 480
TTTACTCTTA CTCATAATGT TCTTAATTGT TAGGAAGAGA CAAGGAATTT GTTGGGAATT 540
TTTCATTACT CTGTTTAGAC GCAGGATGGC ACTCTTTGAA TAAGTAATTA GCTAAAGTGA 600
GTAAGTnCCT CTACAGTTC 619
(2) INFORMATION FOR SEQ ID NO: 498:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 559 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 498:
AGGACAGGAT GACAGCATTT CTCCCTGCAT GCAGAAGCCA AAAGTGACTT AAGGAACTGT 60
CCTTAAACAC AGACATAAGC ACAAAGACAG ATGGTGTCAC TTTCTTAATC TGCACAACTA 120
CACATAACAA CAGAAATAAA CTCTTTCACT CTGTATTGGA GCCTTGGGTA AAGCAACCTC 180
CTCTTAACCT TAACCAAGAC CTTGGAGAGG CCACTATTAC AGAATGAGTC CCATTGACTC 240
CGCTCACAGA TCTACAGTAG GAATCCTCTA TTAATGAGAA TGTGGTATGG ACTAGATCAC 300
ACAGAACTTG TTGCCTTCAA CTTAACCCTT CCTCCTGCAA CCCCACAAGC AAGTATTGTT 360
TCAACTTTTA AACATAAAGA ACCTCATACC CATAAAAATC AGTGTCTCCC TCAAGGTCAC 420
AGAATTTATT CATAGTGAAC TAAATTTAAT TTTnnnTTTT TTTTTTTTTT TTTTTACAAA 480
TCAAACCAGC AATCCTTATT TTAATTCTGT GGTAAATAAG ATTCAAATAA ATTATAATTC 540
TCAACTGAAT AGAATTCAT 559
(2) INFORMATION FOR SEQ ID NO: 499:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 619 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 499:
CTGnCAAAGA GCTTGTACTG GAAACTTAAT CTCCAAATTC ATCTGTAAAT GCTATTTGAA 60
AGTGAGGTCT TTGGGAAATA GGATCACATG AAGTCGGAGA GTAGCACCTC CATGATGATG 120
TCAGTAGGTT CACAAGATGA GGAAGTGAGA CCCAAGTTCG CATGCCTACT CTGTCTCATG 180
TGATGCCTCT GCTGGGCGAT GATAGAAGAA GACTGTCATC AGATGCAACA CCATGCTCTT 240
GGCTACGAAG CCTCCAGAAC TGTGAGCTGA TGAGGCTTCT GTTCTTTGTA AATAACTCAG 300
TCTCTGGTTA TTCTGACAAA AAAGTAAATT TCCAGGCTGA TACTGTGGAA ATCTAACAGA 360
TTTTTCGTCT CTTTCTTGCC CTGCCATTGA GAACACATGG GTCTCATGTT GAGATATAAA 420
AGCTTCAAGA TCCAGGTTAC TTGGCTCACT GCATCTCTGT ATGGAGGCCA CCTGCCACGG 480
AGAGTGGCTT GGACCCACTG CCTAACCTTG CATGTGTTTA GTAAAAATGA TGAGAGTACG 540
AGAGACTGTT TAGTCTACCT GAACACAGGC TGCTCTATCT GACACAGAAT GGATGCAGAA 600
ACTGCCCATG TCCAAGTCC 619
(2) INFORMATION FOR SEQ ID NO: 500:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 681 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 500:
AACAACACCA AATTCGTGCT GAGGTGGAAT AATCGCTTTG GCTTTGTCTC CCGCTTCAGG 60
CCCAGCTTGT GCAAGTCTGC AGAGGCAGGC CAGAGGGGAG CTGAAATTTG GAACCCATGG 120
ATTAAGCAAA CTACGGTCTG AGGGAAGAGG TTCTTCTAGC AGGGTTTCCC TGGGAAAAAA 180
ATCACCTCAA CCGAGCACCT AAGTCCTGAG TCCTCTAACA GGAGGCAGCG ATAGAAACCA 240
ACTAAACACC CAAGCCCTCC GAGGGGGGGA TGAACTGGTC ACGAGCATCG CCCCAGCATG 300
CCTGAGGAGA GAGTGACCTC TCTTACCAAT CTGATGGGCT GAGAACTAGG TCCGATGTGG 360
GCATGCAGCG ATGAGATCAG GCAAAGGCTC TGGCTGTGAA TTCGCTCCAA ATGAGAAGTG 420
CAAGAGGGAG TGAGAAAATG TTCTGGAGAG GCGTCTCTCA GGGCCTTCCA GGTCAGAAGG 480
GAAAAAGGAG ATTGTAGGAA AGAGAAGGTT AGGGGTGGGT GGAGACACTA AAAAGGGGAA 540
GGATTTGTCC AGATGAAGGG TGGGGGGGTC ACAGCTCTCA CCCAGAGAAT ACGAGATCGG 600
CAGATTGGGA CAAGATGCTG GCCCTTCCCT CTATTTCTCA CACGGGGCAC CAAATATAAG 660
CAAAAGGGTG CAGATCAGAG A 681
(2) INFORMATION FOR SEQ ID NO: 501:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 501:
GGnACAATTA CAGGGGAATG GAGGAAAAAT CAAATTACAT CAAATCCGGT CATTGAGTCA 60
ATTGCTACTT TTCCCCAGGA GGAAATTTTT ATGCCATGAA GTACTAGGGT TGTATAAAAC 120
CTTTCCAGAT CACTTTATAA AACTTGTTTT ATACACTTGT ATTTAGTATT TACATAGTGG 180
ATTCAGCAGA AGGTTGATGT TGACAGTGTC AGTCTGAGAG GACTGACCAG GAnTACTTTT 240
AnTGTTCAGG GAATAGTAAT ACTTACCCTT ACCCCTTAAC TAATAAGGGG nGGTAATCnT 300
TnTAGTTAGG TTTAAGTTTG AGTACCTCTA CTTGGACGAC ACTGGGCACA CTCCTGCGTT 360
TTTGTATATG TTTTCTTGGG AGTGTTACAG ATTTGACTGA CATCCTGTAT ATATAATACT 420
TTTAAAACAG TTAGTTAAAT GCTTGGTCAT TAATGCCATA AATTATTAGA TATACTCTTA 480
CTCAACTGGA TGTGCAGTGT CAAGGAGCAG TCATTTGTCA TTATTTGGGA CACTGACACC 540
CAGTCTTGGA ATATTGGAAT GCCTATTTGC AAGATC 576
(2) INFORMATION FOR SEQ ID NO: 502:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 681 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 502:
TCAGCCCGGA GTATGCTAGG TTCACACTCT GGGACTATGA TCCTCAAGGA GTGGGCATTC 60
CAGAGTTGAT GATGATGGTG ACAGTAGTGG TGGTGGTGCC AACGGCATTG CTGTGGGTTT 120
GAGGGGAGTC ACTGTGGGAG GAGTAGTGAT GGTGATGAGG TTTTGCCAAG AAAAGGATGT 180
GAATCAGCGA TGCTTTATTC AAGGAGATAT GAATCAAGTC TGGTCGAGAA AGGTCAACCA 240
CGACAGAAAG TCCAGAAATG TTCGTTTCCA AACTCTCCTA AAGTGAAGGT GAAAGCTTAA 300
GAGTGTAAAA TGAAGGGGCT TTATTAGATG ACATCTACGG CATTTCTAAA CCTACCCAAA 360
AAACTCATTT TATTTTTAAA TACCATGCTG CCTCTTAAAT TATTTTGTGT AATTGTCCTT 420
GTTACCAAAA AAGAATATAA AGACCTTTTA AGTTTCTTTA GTTTTAAGAA TGCAAAATTA 480
TATCAGGCTT GATGGAAATG GAAAGTTATA TGCAGATTTA CTACTCGTAG ATTGCATCTG 540
AGTTTTTTTT GTTTTGTTTT GTTTTTACAT TTCTATCACC TCTCTGCCAT GTAGTGAATT 600
AACATTCCAC CATTTCCAGG GAAAGGGATA ATAAGGGTGA AATGTGGGCT GGAATCCnAC 660
CAnCGTGGTA GGAAAATAAT T 681
(2) INFORMATION FOR SEQ ID NO: 503:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 629 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 503:
TGATGGTAAT ATGAGAAATG CTGCAAGGTG GAGCGGGGCT GAATCACAGA AAGTGGAAGC 60
CGGAAAGTAG ACTTATTGAG TTCAGAGAAT GGTTGGCGAT CAGGAAGGGT GATCCAGGAA 120
GGCTTCCTGG GGGAGGAAGG GCTTATTGTT GACCAAGGGA GCCCTCTCAC CATTTCACAG 180
GGTGCTGTGA GATTCAAGAC AGGTTCAGAG AAATTGCACA TCTATGATAA ACTGGTGAGA 240
TGAACAAGAA CCCAGGGAAG GCAAAACCTG CAGAGTTGGG GAGGGCGAAG GAGGAGAAAG 300
GATGCTTCCA GGATAGCAGT GGCGACTGGA TTGAAATCAC AAACGCAATC ATTCATGAAC 360
ATTCTATTCC CGGCTTGGAT CAGAGGCACC ACAGAGGCGC TTGGAAAAGT CAGCTGGAAA 420
GAGAGACCAG AGAGAGTCAG CTAGAGAGAC TGGACAGAGG TCAAGTGTCA GGCTGCAAGG 480
TAGAGAGAGC TGGAACTGGC TGCTGGCTTG CCTGCAATTC CAAATGAGCT ATAAAGTAGA 540
GAAGAATCCC CGGATCTTCT CTATCAGGTC CACAAACGTT AGAGGAnCGG CTATGTGACA 600
AGGTCCCTTA CTGGTGGAnA AGAACTTGG 629
(2) INFORMATION FOR SEQ ID NO: 504:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 572 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 504:
AGTCTCACCA TGGGTGAACT TCCTGCATGG AAGTTTAAGC TGAGAAACAG AGAAGCAGTG 60
GGCAGCCGGG GTCACAGAAT CCAAACCAGC AGGTACCAAA GGAGAGAGAT TGTGAAGGCT 120
GAGTGAGGAG AGGGCCTGGC AACATCACAG CCATGGTCAC ACTCATGGTC ATGGTCATGG 180
TCAGGGTCAG ACTGGTTCCA TTTTACTAAC ACAAGAAGAA ATGAAACATA AGCTCCTGCT 240
TCCTTGGTCA GGAGGCGGCT GCTCATCCGG AGGCCCATAC TGCCTCCCCC AGCCACCCTG 300
GCTTCACACC CGTGCCCTCT GTCAATGCTG TCTCTGAGAC AAGCCTTCTT CATGGCATCC 360
CTGCAGCCCA TGCTGTCCCC AGGGCCGCCT CAAGACGTAG CCTTCAGCCT GACCTTCCTC 420
CACGCCTCCA GCTACCTCCT GGAAACACTT GGTGCTTCTC GATGTAGATC TGGGGCTGCT 480
GCCAGGCCTA GGCCACCTCC TCGTCCTGGT CATGGACTTG GGGCTTTGAT GTCACAGTTA 540
AGGTTAAGCC CAGCCGCTGC CACACATACT TT 572
(2) INFORMATION FOR SEQ ID NO: 505:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 626 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 505:
AGGGTTACTG TTTTAGGTTA ATGCCCTGTG ATTTTTATGC CCCTTTATGC AAACAGGAAT 60
AGTGTTCTGC TTTCAGAGTC TAGAAGAATT GGTTTCCTGG CTAGAAACAA GTTGAGGGAT 120
CGGTTAAGGA ATGGGGTGAT CTTTCAGATG CCCTTTCCAC ACTGAGAGGA GATTTTGGGT 180
TCCTGAGCAG TGAAATCTCA AATACAGCTT TAATGGGTGT CCATTTGTAT GTAACACAGA 240
AGAGAGCTGG CCTTGAAAGA GAGTCATTTT CTTTCAACAT TTCTGAGAGA AAATCAACAA 300
CCTCAATTTG GCCCAACAAT CATTTATGCT GCTTTTAATA TATAACAGAC TTTAAGACCA 360
GCACTGGGGG TATTCTGATG AATAAACCAC ATTCTCTACT TTTAAGGATA GCCTAGCAGG 420
GGCTAAGTTA TTCATTTAAA TAATTTATGT CTTCATGAAA CTGGGAACAT ACAGGGAGAA 480
ATTGCTGGTG GATTAACAGG AATTTGACAG GTCAGGATCT GGTTATCAAT CACACTCACG 540
AAAAAGGCAA AAGCAAAGGA TGAGGTTGAC CTCAGTGCCA AGGTCATTGG ATAGTGAGGG 600
TCATTGGCCC TTAAGCCATA AAAGTG 626
(2) INFORMATION FOR SEQ ID NO: 506:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 583 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 506:
CAGTGCCTGn TAAAGAACAC TAAAGGGTAC CAAAGCTAAA TGCAGCAGAA TTGAAGTTTA 60
GCAGATGGAA GGGCATCTGA GCCTACACAG AACGAGCTGT CCACAAGGTA TAAGGGTCAG 120
CCGTACAAAA TACCTTGTGA GGCTACTACT GCAAGCATGC AACCTTCCCA GGAGAGCCCC 180
TCTGAAGTCG GGGCTCCATT ATCTCAACTC TCAAGATCCA AATCTGGGGC TACATGTCTG 240
ACAACCACAC TGGCCCCTTC AGAGAGCCAG TTATGGCCTC CAGGGAAGTG GTGTTAAGGA 300
CCTTCAAATA AGAGCCTTAG GATCTCCAGA ACTACCAAAA CCTACACTTG TAGCAGTTGG 360
GAAATTTAAG AGGTTAAGCA TTTCTCCTAT CAGCAATTAA AATGTTCACA GAGCAAGGnA 420
AACTCTTAAA CCTGAATGAG CCATAAGCAG ATACAGATCG CCACAAATCT ACACCATTCC 480
TGTGACACAA GTCTATACTC CTAAGCCCTC AGAGCAGCCT CTCTTTCTAG AATATGCAGT 540
CAATnACACA AAGTGnCTTT CAATGCTCTT CCACTCATGC TGG 583
(2) INFORMATION FOR SEQ ID NO: 507:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 607 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 507:
AGAAAACATC AAAGATATGC AACTAATGGC ACTGAAGAGT TAAGAAGAAA GTGAAAAATT 60
ATGTGTCCAG GATCTGAGAA AAGTACAAAG AGGATTATGA AGACGAACCC TTGCAGACAG 120
AAGTGGCATT TGAAACTGCT TTCTTCCTGG GAATATCTGC CAGTTCAGCA AAGAACGGCT 180
GACAGGCTGA GAGGTGCAGC TGACAGGGCC TTGGCAGGTG GAAACTGGAG TCAAAAGCCC 240
GCCAAGGGAG TTGAGCTGGA AGGAAAGCGA AAGGGAnAAT CCCTCTCAGG AGCTCAATCT 300
GTATCAGGTT GAAACCCTAA GGGTACACCT GAAATGGACA GCCTTCAACA TGCTGnAGCC 360
CAGCCATAAA CTTCATAACC CGGGCAATCA CACTGAGATG ACTTGAGCTT GCTGATGCAC 420
CCAGCAGAAG TGAAGACATA TCTTGATTTG GnAAAGATCT CAAGTTAAGA ATATAATAAC 480
CCTTCTAATA AAAAGTCCAG CACAGAAGGA GACAAAAAAT GGAACCAAAA CCCATGAGnA 540
ACAACAGACA AAAGAGACAG ACCCAAAAGG GCTCCAACAA TGGTnAATAT CAGGGAACAG 600
ACCTTTG 607
(2) INFORMATION FOR SEQ ID NO: 508:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 508:
TGGCTCAGCT GGACTGCCAA GGACTCCTTG AGCAGCTCAG CTCAGGGCTC AGCCGGGCCA 60
GAATCAGACA TTCAGGTGCC CCTCCCCCCG CAAAGTGTCT GGTGTACCCA CGCAGAGGTT 120
ACAATGGAAA TAACCCCGAT CCATGTAGCT TCATATTTAC ATGCACGAGA AGAGAAAATG 180
GCCTCTTTTT TAAGGACATT CATCACAGCA GAATCACCCT CTGCCGCTTC GCTGTGTTAT 240
TGCTGCTGCT GCCTTGCTGC CGGCCCAAGC TGGACTCAGT CTGCCCGCTG GCTCACTCAG 300
CGCAGGTCGG CAGGATTTGC CCGACCGGCT TTGGGATAAT GAGCAGCCAG GCTGTTGGTG 360
AGCAGTCCCA CAGGTGCAAG AGGCCAGTAG CCTGGTGGGC AGCTTGCTGA GGCCACACCA 420
GGTTTGGCAG GTGCTCAGGG TGAGGACTCT GGGGAACTTT CCCTGGGG 468
(2) INFORMATION FOR SEQ ID NO: 509:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 509:
GTCACCAGAT CTACTATATA CGTATTTCTT TCTTTCCCCA TTCTTGGTCT GCTCCATCAG 60
CATTTTAACT CCATGGGGGC AGAAGCTGTC TGTTCCATCG ACACCAGCAT CCTCTGCTCA 120
AACTATGCCA GTATAATTTG TTGAATGAAT GAAGCCAATA TATTTTCTCA AGTATAGAAT 180
CATTTAGATT AAGATAAGGA AATCTTATTT AATATGGAGG TAGAAATCTG GATGAGGCTT 240
TCGTTTTACA AAATTATCTC TAACTTAGAT CCATAAGTGG TGTCTGACAT CCCAAGTCAG 300
GAAGCATTCT GTTTAAGGAA ATCCACCCAC ACAGGGCGCC AAGAAGCCTG GGGCGGGAGA 360
GGGAAGTAGA CTTGCCGTAA AAAGCTG AC AATGTAAGCA GTAACTACCA TGCCCCGAAG 420
CCTCAAGTCC CTCAGCTACA AGATGAGCAT CATTACTAAC TCCTGCCCTC CTCTGGGACA 480
GTTTCTnTCG TCATCAAGGG ATCACAAATG TAAGCCCCCC TCCTTTTTTT TT 532
(2) INFORMATION FOR SEQ ID NO: 510:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 620 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 510:
GGGTACGCGT CACAGCATTT TATTTTTGTA AAATGCTCTG CTTAAGGGAT CTAGAGCCCT 60
AAAATTACCA AAAACGTTCA TTGCAAACGT CTCGGGGTGC CTTAAAAGTG CTTCAGTTCG 120
CCTGATTTAA AATCATGGCA TAATTTGATT TTCTCTGTCT CACCATCATT GCAGGGGCAT 180
TGTTAGTATG AAAATGAAGT TTAAAAATAT ACGCTGGAGG AGGTGAGCAC CCTGACAGGC 240
AGAGTTCTGC TCTGTGGTAA GACAGGCCAG GTTGGGATGC ACCGTGTGGC CAGAGTGCTG 300
GATGTGCATC AACCTGTGCA GTGGGGACAG CAAGAGGAAG TGAAATTAAA CGCATAGATA 360
GACGACAAAA CTAATAAAAA CCAAGTGCAT CAAAACACAG TGTGAAAGTA AAAATAAAGT 420
GATAGTACAA TCTCGCTATA CACAGAAATT CTAGGAAAGG GAACGGCCAG CCCCTGGGTA 480
AAAATTCCTT TGTTCAGTCT TCCAGAATGA CAGAATTTGT TGAAACATCT TGCCCTTTAT 540
GTGTGGGAAT GTAAAAACCT CCATTTTTTC CCATCACCCT TTGCCTCTCC AnTCCTCTCC 600
TCTTATCTCC TCCAGTCTGG 620
(2) INFORMATION FOR SEQ ID NO: 511:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 539 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 511:
GGGGTGCTGA GGCACACGCT GGCCTGTCCT AGGCACATGC CCCTGATGGG TCTGTGCAGT 60
ACCATGCACA TCCTCCTAAT GGTGGAGGGA GGACTTGACA GAGGACTGGC AGGGGCAGGT 120
GAGTGTATAC AAACAGAGCT TGGCTCCAAA GTCAGGGGCT TGGGTGGGAC TCGCCTCCAT 180
GTAATCTCTA AAATATGGTC TTAGGAGGTG AGGAATAAAT GCCTAAGTGA TGGGAGAATC 240
CTCTGTAAAC CCAGAAGTGG GGTACACCTA TTGGGAnGGT GGTGGTGGGT GCCCGGAATA 300
AGCTGGTTTC CCAGCAGTGC AGTGGCTCCC ACCCCTCATC CAAGGCTCGC AGGAGTGTCT 360
CTGGGCTGCA TCCTGTGTGG GTCAGGATTG AGCATCGTCC AGCTGTGATC AGGCATCTGG 420
GATGCAGGAG TGAAAGGATA GTCCTTGCCC TCTTGAAGCG TCATCGAGAG GGGACAGATA 480
GTGAACAGGC CACTTTAAAG CAGGACAATG TGATGAGGGC CGGTGGAAGT GGGGAAGAC 539
(2) INFORMATION FOR SEQ ID NO: 512:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 617 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 512:
GTCTCCCTTG GCAAAGCTGA GGAAGCCTGA TTCCTCCCCC GCTGCCATGC AGAGGTGATG 60
GGGTCGCATG AGGATAGAAA GAGAACATTA TTGGGTTGAT CTGTAAGCTG TTACTCCCCA 120
CTGAAGAACA GGCCCTGTCC CACTCATGTC TGCCCACCCG CCTTCACCTT ATCACAGCCT 180
TCCACACAGA CGCGGCCCCC TGAGTGGGCC GCTCATGCTC CTAAGCCTGC TGTGAGGGCC 240
TCTCTGATCC CTCCGCCCCT CACTCTCCTC CTCCTCCCTC ACCGCACCTG TTTTGGGCAG 300
TGCCAGCTGT GCACTTCCTG GACCTGTTGG CTCCCTCCGG CTTTACTCAC ACTTCACCTC 360
CCTGGGCCCC AGCTCCACGA AGACCTCTCT GCTGAGCCCC CATGATTTCT AGGCGTCCCT 420
GCTCCTGTGC TGTCCCCAGG GCCCTACTGG CTCTTTCTTG ACCCCCTCCC TACTGAAGGC 480
AGAGCAGCTG GCAGTTCATC TTTGTCTTTC CCTCCTTTGA CCAGGGGCCT AGTGACTCAG 540
AGCCACAGGG CGGTTGTTCC GGCTGCTGTC AACTGCAGAT GGAGATGTGG AGAACACTGA 600
GGCTGACCAC TGGCCTC 617
(2) INFORMATION FOR SEQ ID NO: 513:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 513:
CCTCTTCTnC ATTCTGCCTG TTCAAGTAAA TAATCATTAA AAATTTCCTC TATCAGGGGC 60
TGGCACTGTG TCACAGCGGG TTAAAGTCCT AGCCTGAAGC GCCGGCTAGG ACTCGAACCG 120
GTGCCCATAT GGGATGCCAG CACTGCAGGT GGAGGCCTTG CCCACTACGC CACAGTGCCA 180
GCCCTGCAAA GTATGTTAAA AGCAGAGTAC ATTTGGGAGG AGTGAGAATG GATCTCAAGA 240
GACAGGTTAA GGCTGTCCTT TTAGGGAGAG CACTTATCCA GGCCAGAGAT GGTGCAGCAT 300
GGACCAGGAT AATGACCAGT TACATAGACA GAACCGAGTA ATCAGCATCT ATTTCAGAAA 360
CAGAATCATC AGGATGTGGT GATGACTTGG TTGTAGGCGT GCAAGAGACA GGGTTGTTGA 420
GGATGCTTCC TAGGTTTCTA GCTTGAACAT TCGAGAGACT ATGATTAAGA GGCCTGTGAG 480
AGGTAAGGGC AGGTGTGTGG AAAATTCAAC CCTTTGCAAT AAGCTGCATT TGAAATACTT 540
ACTGAACATC TAAACTGAGA TGGCAAGGAG GCTAATGGAT ATGAGAATGT GTAAATTTGG 600
AAACAGTCAG CAGATG 616
(2) INFORMATION FOR SEQ ID NO: 514:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 670 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 514:
CGGTACACCC ACAATTATTA AATGACATTT TTTTTCCAGT GCAGGCTTTT GTAGAATTTG 60
GGTAGATTCT ATGAATTCCC AGACATCATA CTATGACCAC TATATGTATG TGTGTTCATA 120
ATCACATGTG TGTGTGTGCA TGTTACTTTG TAGAGGGAGA ATACATAGCT TTTCTCAATA 180
AAATGCATAT TGCAACAGAA GAAGGGCCAA AGGCGTATTG GCCTCAAGGC AGCGTGATCT 240
ACTAGAACTA AGCCTCACCA AGAGGTTGAG AGATTTGTGC CTTTGTGTCT CCGGGGACCA 300
CTAAGCAATT GTTTGACTCT AGGCAAGTTA CCTAATCTCT TCATATCTCA GTTTTCTCAA 360
CCATGAAATG GAGGAAATAA TACCCTGCAC CTGCCTGTTT CAGCTCATAG GGCTATGGTG 420
AGGATCAAAT AAGGAATGGT AAGGAAAAAC TTTGCACTGA AGAGTTCTAC ATATGCATGG 480
TTTAGGAGAT TAGGCTCTCC TGCTCTTCCC CACACCCCCC TCACCTTGCC CACCCTCAAA 540
AACTCCCTGT GAGAGAACCC GGCTCTCACT TTCAGGGAAT TTCCTCACTT CAAAGTAAAT 600
AATTATCAAT ATnTnCTAAA TTTTCACATT TCTCAGTTAT CAGGGAATGT TATTTAGGGC 660
TCTGCCCATA 670
(2) INFORMATION FOR SEQ ID NO: 515:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 638 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 515:
CTATCCTACC TATCAGTAGT ACAATGAGGC ATTGCCCAAA AATGTTGGAG GAACAGATCT 60
CAAGGGAGAG TACCCCAAAC CTTCTGCTGA TTGGTAGTTA TAAGTCATTT AGATAGTACG 120
GTTGGGGGAG GTGACAGAAC TCTGAGCCCT CAAGTCTAAA CCATGGGCAG TGAAAAATCC 180
CATGTGAAGG ATTTTAGCCA CTCTATCTTG CTGGAGAGGC TTCTAATCTT CCCCCACCCC 240
TTATCCACTA ACACCACAAA CACCCAATTA TTTTTTTCTT ACTAGTAGAG GACCCTAATT 300
AGGCTCTAAT TAGCATATGA ATAGAGGAAA CAAGAAGGGA GACTGCTTTT AGTGCTGGAA 360
GCTGCTATCC CAGTGATCTC TTAATACTTC CTGATGCTAG AAAATCCCTG GAAAACAAGA 420
ACACAAGGAT TAGGCCACAC TAGCCCATTT CAGGGCAGGT TAAAATAATC AAGGTAAGCA 480
ATGTGCTTCT TCCCTTAGCT ATACAATTTT GATATTATCT TGGAGTGCTT TTACTTACAA 540
TTGAATATAT ATATATnTGT GTGTACTGGT GCTGTTAGAC TAGGACTGAA CATAAGTTTC 600
TCAAACAGAn GCATGTTTGC AAAAATCTCT AAGGAATC 638
(2) INFORMATION FOR SEQ ID NO: 516:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 516:
GTCTGATCAA AAAAATAAAA TTAAAAAAAA TAAAAAAGAA TGCATACTAT GAAAAGCTAT 60
GCATAATTTC AATATTTTTT GCACAAAAAA ATAAGTTTAT CCTTAATTCC TATATTCCAC 120
TTTTCCACAG ACATTTTGGC ATGCCCTCAT GTAACTGACA GAATCCTAAC TGTGCTCATA 180
GCACCATATT ACTGCTTGCA ACAAGCAATT AGTAGCAACG GTGCTGGAGT GCAGCATTAA 240
GCCAATCCCT GCACTCTATA AAAAGTTAAA GGGCACATGC CCAAGCAGGC CCCTACATGG 300
GCACAGATGC CTGAGAGGGA AGAATTGTAC CGACACAACT CTGACCCTCC AGACCTCCCA 360
AAAGGCAGGG TGGGGCAGAA AGCGGCTCCA CTTAGGAGGA AGATCAACTA CCTTCTCTTA 420
CACCATGGTG AAGAAAGTAT AGGAGGCTGC AGACTGTTTG CCTGCCATAC TTTCACGTTT 480
CAAAAAAGAT GAnAACCATC AAGCCTGGGG GAGAGAGGTG CTCAGCAAAT CGGGTGnAAT 540
TAAGAGCAGA GACGTCTGCC CAGCTGTGGG CACTCA 576
(2) INFORMATION FOR SEQ ID NO: 517:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 587 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 517:
CCACCCGTGT TCTTAAATTT TTCTTGATTT CCTAATTATC TATTTTGCTT TTGGGCTTCT 60
CTTTTCCATC ACTTACCATA ATATTGAGAA TTTAAAAAAA ATTAGAAAGC AGGAAGCTGG 120
AATCAGGAGT GGAGCTGTAT TTTTTTTTAA GGTTTATGTA TACACACACA CACACACACA 180
CACACACGGC GGGTGGGGAG AGAAACTCAG AGTTGGACCA GGCTAAAGCA AGGAGCATGG 240
AACTCCTTCC AGTCTCCCCA TGTGGATGGG CAAGCGTCCA AGCGCTCAGA CCATTTCCTG 300
CTGGTTTTCC AGGTGCATCA GCAAGGAGCC GTATGGGATG TGGAGCAGCC GGAACTCAAA 360
CCAAAGTTCA TACGGATGTG TAGGCAGTGG CTTCATCTGC CGTGACATAG GCCAGCACCT 420
GGAGCTGGGA CTTGAACCAC ACATCCCAAG CAGCATCTTA ACTTCTACAC CAAACGCCTG 480
CCCCTCTTAT GGnTTCTTTG TTAATTGTCT GAAAGATACC TGAAGGGTGC GTTCCATCAC 540
AGATGTGGTA TTACAATGAA TCCTTCTGAG GGATATCACn ATTTTTT 587
(2) INFORMATION FOR SEQ ID NO: 518:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 246 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 518:
AATGATGGTG TAGGTCCGTT CAAAAATCTT ATGCCACAAA CAAAACACAG CTCTAATTAG 60
TATGTGATCA TAAACAAGGT AAACGCACCT CCCTAAAGCT GTCTTAAAGC CTTTGTTGTG 120
GTTCACTAGC TCAGAGAGAC GCTCTGAATT CTGTCTCCAT GCACTGTACC AGGGCAAAGA 180
TGTGGCATTC TCCAAATCTC ATCAAGAGGA TATATCTACT GTAAGGnAAG AnTGTTCTGC 240
AAAAAA 246
(2) INFORMATION FOR SEQ ID NO: 519:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 497 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 519:
TGTGCCGGCA GCCGGCGCCG CnGGAAGnGG ATTAGCCTAG TGAGCCGCGG CACCGGCCCA 60
TAGTTCCCTA TTTTTGAATG GGAATGATAA GTACCCATCT GATAGGGCTT TTGTGAGGAT 120
GAAAAGAGTT GGAAATATTA AATCATTACT TAGGGGCTGA CACTTTTTTT TCTTTTATGA 180
GTTGTTTATT CCAAAGACAG ACTGACAGAG AGGGTAGAGA GAGAGTGAGG TAGAGAGAGA 240
TCTTCCATTC GCTGTTTCAT TCACCAGATC ACCACAACAG GCTGGACTCA GTCAGGCCAA 300
AACCAGAAGC CAGGAACTGC AGCCAGATCT CTCACATGGG TTCAGAAGCC CAAACAGTTG 360
GGCCATCTTC TCTTGCCTTC CCAAGCACAT TAGCAGGGAG TTGGATCAGA AGTGAAGCAC 420
CTGGGGACTC CAGCCGGCAC TCGTATGTAG TGCTGGCATT GTGGTTAGTG GCTCAACGTG 480
TGCTGCAGCG CCAGCCT 497
(2) INFORMATION FOR SEQ ID NO: 520:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 520:
AGCGATGATA TTTTCTTCAG CGCGTGCGCT ACGCnTnCCC ATTCTGCTGA TCTTCAAACT 60
GCGCAAAAGA ACTCATCGCA GATGAGTnAC AACTCGTCCA ATTTACTCAG ATAATAGCTG 120
TCACGTTCTG TCAAACAGTC ATCAAGAAAC TTTGCCTCCG TGATCACTCT GCGCGCCGTC 180
TGCCCTTCCA TATAAACTCC TCATCCCCCA ATTCACGTTC CTCGCGCGTG TGCTATACCC 240
TACACCACTT AnAGCGTGAG AGAAAGCTTA TACGCAGCAC GCCGCGCGTA CTCAAACGCA 300
GGACACTTTT CTGCAAGGTT GTAATACATA TCACGCGCCT GGGCGTGCAA ACCCTTTTTG 360
TATAAGAGAT CTGCAAAATG GAGCAGTGTC TGCCACACTC CATGCTCnTG CGCCCACTTC 420
AAAATAGTCG GATGAGATCC AGTTCGCAAA AATAAATCAG CAAGATCCTA TACTGATA 478
(2) INFORMATION FOR SEQ ID NO: 521:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 647 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 521:
CCCGTTAAGG CCAGCTCCCG TTCAGTCCTC ACATCTGGGG AGGGTCACCA CCAACTCGGG 60
AGAAGTTCTA CTTAACATCC AGCTTGTAAC CTTACTGCTA TTCACGTTCA TGTCCACTGT 120
CTTGATTCTG GATAAGGACC CTTAGAGCAT TAGAAGAGGA CAGCTGTCCC GATGGGCACA 180
CCAGGCTGTC CTCCTTCTGT ATAACTGGCA GCTTCTAAGA GCTATGGCCT ACGCCCAGGG 240
AGGGGCCTGG CAAGTGGTGG GCACCAATGT ATGTTGAAAT AATGAACGTG ACTGCATGCC 300
TCCTCTCTGG GCACGCCTGG CTTCTCTCAG GCCTCAGTCC CCGGCCCTTT CCTTAGCCAC 360
AGGTTGGCTC TTCCACCCCG GACATAGCAC AGCACGTTCC CCTTCCCTGT CTAGTCCTGA 420
GCTGAGGACG TGGGGAACGG GGCCCAGGTT AGGCATCCAC ACCTCCCATG TCAGACTGAA 480
GCTGTTACAG TGTCATCTCA GGGACTGCTT CCCTGTGCTT AGTCCTTAGG TACCAAGGGG 540
GCCTACGTCT GGGAGAGGAG AAGCCCAAGA GTCCCCAGCT TGGGGTGCCT TGGGCCCGTG 600
GAAACCCCGG CCCACCTGCA GGGAAGCGAT GCTGAAGCCC TGAGCGC 647
(2) INFORMATION FOR SEQ ID NO: 522:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 476 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 522:
GAAGCAGAGG TGCAGTGTCC CGGAGCCCTG CCCGCCCTTG TGGGCCTCTA ACACTAGCTC 60
CTGTCCCGGC CCACCTAGAG CGCCCTGCGG TTGGAGGACG GAGACCTGGA GGAAGCCGCA 120
GCTGCTGCAG CTGCAGGTGG GCGGCGGGAG CGACGGAAGC CGTCTTCAGA GGAGGGCAAG 180
AGGAGCCGCA GATCTCTGGA AGGCGGGGGC TGTGCTGTGC GTGCCCCAGA ACCTGGGTAA 240
GCATGCATCC GGGTAGATGC GGAGGGGTTG GATCCGCCAG GCGGGTGGCC CTGCGCTCTG 300
ACAGGCCCCG CCTCCAGCCC CACAGGTCCC GCCTCCCCGA CAGGCTCCTC CTCCTCCGTC 360
ACGAGCCCCG CCCCGCTGGC CGGCCCGGCT TCCAGCCCTG TGAGCCCTGT GAGCCCTACC 420
TCTGGCCTCC GTACCTCCGT GAACTCTTTC CTACCATTTC CACGGCGCCG TCAGGG 476
(2) INFORMATION FOR SEQ ID NO: 523:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 523:
TAGTATAAGC CAGAAAAGAA CCCAGAACTG AGAACCCACT TTGCAGACGC TAAATTTAGA 60
ACTGCCTGAG GCAGCCTCTT TCAGTCAACT TTCGGGTAAA TTTTCTATGG CAAAAATATG 120
TGCATTGATT AGACTTTTTA ATCCATCCGA GAGCAAACAG TGCTGACACC TTGACTACCG 180
GGGTCCTCAT CCCTAGCCTG CGTTGACAGC CTTAGGCTCG TGTTCCCATA ATGTACTCAA 240
ATCACACGCT TCCGGGCTGC GAACTACAAA CCCCAGCATG CATCACTTCA TCTTCCACAG 300
GGGnGGGGGG GGCGCCTGCC GGGGACTGTA GGCAGCCGCC GCTCTGGTGA GTCCAGCCAG 360
GGACAAGAGC CTTTGGGGGA CACGTCCCAA ACGGGCCAGC TGCGGACTCG GAATCCCGTT 420
TGGCCGCCGC CCTGCTCCTG GCAnCCGCAA CCTCCGACAA nCGCACCCCC AGCGGCG 477
(2) INFORMATION FOR SEQ ID NO: 524:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 266 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 524:
nCCGGGCGAn CTCGAGCGCG AACAGGCGGC CCAGGCCGCT GCCGGCGCCC GTGATGAGGC 60
AGACCTGGCC GGCCACGCTC TTCTCCTTGG GCCGCACCAG CCAGCGCGCC GCGGnCCAGC 120
ACGAATGCCC ACAGCACTTT AAAAGTGACC ACGAAGAACT CCACCACGAT GTTCATCGCG 180
ACGCCCAGGT CCCCGnGCCA GTGCAGCGCC CGCGTCCGCA CCCAnCCCGC GCCGGGCAGC 240
CCGGCTCACC GCCCCGGGGC GCTTGT 266
(2) INFORMATION FOR SEQ ID NO: 525:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 587 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 525:
TGCCAGTGCC TACTGCTCCA TCTGGTGCAG AGTCTTTCCC CAGCTACCTT TTGATTCATG 60
CTTTTAGACC ATTCAGGTAG CGCCACTAGA AATTATTTCC TGTCTTCCTG TTCCCTCCCT 120
GACAGCCGGC AGCTTCCCCC TGTGGCGTTC TCTGTAGTTT TTATCACGGC ACTTCACATT 180
CATTTTATAT ATATCTTGCC TTCATCATCA AACCCGTCTT GGTTTCACCA AGACCTAACC 240
CAGTGCTCGA CAATGACAAT GAATGTGTAT AAATGAATGT CGCAGGGACT GGCCTTTTGT 300
CCTnAGGTTA AGATGCTGGT GTCCCATACG GGAGTGCCTG AGTTTGCTGC CGGCTCTGGC 360
TCCTGATTCC AGCTTCTCAC CAATGGCAGC TGCTGAGAGG CAGTAGTGAG AGGCTGTAGT 420
GATTGAGTCC CTGGCCACAC ACAGGAGACC GGGATTACGC CTCTGCCTCC GAACTTCAGC 480
TTGGCCCATT CTGGGACATT GTGAGCATTT AGGGAGTGAA CTGGTGGATG GAAGTGTGCA 540
CACACTCTCT TTTTCTCTCT GCTTCTCTGG TGGCCnCTCC TCCCCTT 587
(2) INFORMATION FOR SEQ ID NO: 526:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 561 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 526:
TTGAAAAAGC CAGGAGAGAA TGCTGGCTTA GAGTTTCTCT TATCATCAAA AGAGCAAGGA 60
CGAGGGCTCC CCTCGCGTTG CCTGGGGATT GTGTGGCTGG AATCCAACCC CTACTCCTGT 120
CTCTTTTACC AGCATCAAAA GAGGAAACAC TTGCTTTCAT TTCAGCTTTC CTAGATGTGA 180
GGCCAAAGGA GTGAGAGTGG GGAAGAGGGA AGGGATTTGA AATTAACAAC TGTTAGTAGT 240
GAAACATCGG ATGGGATGCG ACTCTTTATT ATAATTCTTG AATATTATTT CTTTTTGCCC 300
GTTGTAAATA CTTTTTCATA ATTTTATATT TGGATTGTTC ATTGTGAATA TATAGAAAGT 360
CACTTGATTT TTGTATGTTG CATTTCATTG TGCACCTTGC AAAGATCTTT TATTGGTTCT 420
AAGAGTTTTT CAGTGAACTT TTTTGGGATT CCTATATACA AGACCATGTC TTCTGCAAAG 480
AGAAAGTTTT GCTTTTTGTT TCTGGGnTTC ATACTTTTAA TTTTCnTGCC TAATTGCTGT 540
GACTAGAACA TCACAGACCA T 561
(2) INFORMATION FOR SEQ ID NO: 527:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 374 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 527:
CTCGCCGTCT AAATCCCACA GGAACCGACT GTGTCCCAGG CACCCCTGCA GGGCTCTGAG 60
GGTCCTGATC TCTTCCTGGG ACCCCCACAC CTCCAGCGGC CAGGGAGAGG CCAGGAAGGA 120
GGGAGGAGCT CAGGGACGGG GTCCATCCCT GGCCTGCGGT GGGACCCCGC CCAGCATGGA 180
CGCCATGGCC AGCCCGGCAG TCACGGGGTG CTGCTCCCGG AATCCTGGGC CACCAnCCCG 240
TCTCCCCCCA CnCCTCTTCT CCCCCACATC CCTACTGCAA GGCCCGGGGA GGGGCACACC 300
CTGGGGAGTT CAGAGGATGG ACAGCAGGTG GCCCAGTGGC TCCCAGGGAT AAAGGGnAGC 360
TnGGCGTGGT CCGA 374
(2) INFORMATION FOR SEQ ID NO: 528:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 528:
GTGCACATTC TCTATTCACT TTAGATCACC TCTAGATCAC TTACAATACA TCACAGAATA 60
TAACTATAAA CAAGTAGCTG TATTGTTTAG GGAATAATGA CAGGGGAAAA ACACGTTTAG 120
TACAGATGCA ACTTTTTTTC CCCTAGATAT TCTTAACCCA AGGTTAGTTG AGCCATAGAT 180
GTGGAATTCA CAGAAACTGA AAGCTGACTG CAATCAGTTC TTGCTAACAG GGCCTGCCCT 240
CAGAAGAACC TAGTTAATCA GAGAAAAATC TGTTCTAGGG GTATTCTAAG AGCATAACTG 300
ACAGAGAGAA GGGAAATACC CAAATCTAGT TAGCTCTAGT CATCCTGTCT CACCTAATCT 360
GGGACAAAAT ACTGACAAAC AACTGTGAAT ATCCTCTTCC AAGAAACATT AAGTCCGAGA 420
TCTAATCAGA GTAGTATAAA ATACTTACCC TGTCACTAAA GGACTATTTA CATAATTCCC 480
TTTTACCTGT AACATCACAT CCAACTAGCA AGAAGAAATT ACAGGGCTGA CATTGTGACA 540
TTATGGATTA AGCCACCACT TGTG 564
(2) INFORMATION FOR SEQ ID NO: 529:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 528 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 529:
CGGATGCAGC TTGTTCCTCT GCATGACGGA TGAATTGATG GTGCAGCCAG CCTCTTTCCT 60
GAAATGATAG GCCACTGTCT CCAGAGAAGC TTGTCTATGA GTGGCACTCC ATATCATCCC 120
CAGCTACTTT TGAGATTAAT GTGACAATGT GTCTTTGCTT CTTCTCAATG AATTCCAGCA 180
CTGTAAGTnT GTnTAGCTCT CCTAGTATCT GGAGCTGCAG ATTTCTGCAG CTCTGGGTAT 240
GTGGAGGACA CCCCTGCCTT CAGCAGGACC GAGTTAGGGC TGGGGGCTGG AGGTCAGGGA 300
GTATATTTCC AGTAGCCCCA TAGCAGGATG TTAATTTACA AACAGCTGGA ACTTTCTCGG 360
GGAGCACAGG GGCATCTGTT TAGGGCAAGG ACACCGTATG TGTGTGAGGC TAGAGAAAGG 420
GCGGCTGGTG GCAGGAGCAT GGTGATTGGG TATAGTGTTG CTGATACAGG CTTGTGCAGA 480
GAGGGCTGCC CGCCCTTGGC CAGATGCAGC CTGTAGCTGG TGCATCCC 528
(2) INFORMATION FOR SEQ ID NO: 530:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 416 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 530:
TTTTGTTTTC AAAATGGAAA TGTTTCTTAG GGGTTTTGGC ATGCTCTTTG ATATGTTGAA 60
AGATGAGAAA ACATCATTTG GAAACTCAGC ACTTATACCA TATTAGGAAA TTTGCAGTGC 120
CTCAGATTTC ACATTTAATG TTTTAATTGC CTGGTTTGGA AATAGAAAAC TTTCAAGCTT 180
CTCTCTTTCA TTTCGAATTT CCACTGCTAA CTTCCCCCTC TGCAGTGGTG GGCTGCTGCC 240
TGCCCCATTT CCTGCCCTGG TTCCAGCATG GACCAGTGGT TGCGAGAAAG CAGTTCATCG 300
CATGTCATCT GAAATGAATA TCATATTGAC TTCCGACAGA CTGGCTTACT CTTTTTTTGT 360
GATACCACAC ATGCAGTTGT TTCCTGTGGG ACTTGTAAAT ATCCTAAAAG TGCAAC 416
(2) INFORMATION FOR SEQ ID NO: 531:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 333 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 531:
TGCACACGTA TGGGCCGCCC CAACACAGCC GCCATGACAT TAAGTGTCTA GAGGGAATAG 60
ACACAGGTGA ACACATGTGG GCCGCCCCAC ACAGAGACAC AGCCGCCATG ACAGCGTGTG 120
TCAGGGAGGG AACAGACACA CGTGCACACG TnGnGACCAC CCCAACACAG CCGCTATGAC 180
AGCGTGTGTC AGGGAGGGAA TAGACACAGG TGAACACGTG TGGGCCGCCC CACACAGAGA 240
CACAGCCGCC ATGACAGCGT GTGTCAGGGA GGGAACAGAC ACACGTnCAC ACGTTGTGGC 300
CGCCCCACAC AGTAGACnCA GCCGCCATGG ACA 333
(2) INFORMATION FOR SEQ ID NO: 532:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 532:
AAATACTCGA CATATTTAAA AAGTCTTTTC ATTTCATTCC AAAATAACAG TCTCTACCTA 60
CTCATGAGCT TCCCAAAGTT AAGACTTAAC AACATTTTGG AAGACTACTG TGAACTTCAG 120
CTAGCTTTAC TAAGATGTTT ATGTGTTGTT TGGATTAATT TTGACCAAAG ACCTGTATGA 180
AGTTCAAGGC TGATCCTCTT ATGATGACGG TCATTTCTGA GCCAGCCTGT TATGATTTTG 240
TACTTATCTC AAGTGTAACT TTACTACTTT CTATAACGAG CAGAGTCCAA ATGCTTGGCG 300
CTTGTTTAGA GCTTTGTGCT GCCCCAACCA TCCACCGCTA ATTCAGATAA TTTGTTCAGA 360
GAGGTTAGAC GGCCATCAAC TCCATAGAAG CTCCTCCGGT CCCCCCAGAT CACAGAATTT 420
TGGCTTGCAC TTCTAGAAAC TGCCCAGGGC CTGGCACACA GTAGGCACTT AATTGTTTAA 480
TGTTGTTTGG GAAGGAAGTA GAGAGGCAAG GAGGGAGTTG GCAAGGGAGG GTGATGTGGG 540
AAGCCGTCCT GAAAGAGCCC ATTCTGACCA TGGCCGTGTC TGAATCGTAG ATTACAGCAG 600
AGAACGCGCG TGGATC 616
(2) INFORMATION FOR SEQ ID NO: 533:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 637 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 533:
CATCTTTGTC CCTCTGACTT GTCAGTCTGT CCAGGGGGCC ACATCTGAGC TTGTAAATTA 60
CCACTCTACC CTCTCTTCCC TTTCTGCCAC CAACACCCTT GCTCTCTCCT CCCAGCCCTG 120
TGTGGGGAGC CAGGCTGGGC ACCTCCCAGA AAGGCTATTG GCTGTTGGCT GCTCTCTGCC 180
CAGGCATCTC TGGGCAGCCC AGCATTGGCT TCTGCTGCCT CCTGGAGGTG GAAGCTCCTG 240
GGTCCGTGTC CTGCCTGGTT GGACCCGTGA ACAAGCCAGT CCCTGTGAGG GTCTTCCCTT 300
GGCAGGCCTC TGGCTGGGAA GAGGACTCTT GGCAGAGTCC ATGAGCTACC CTCGCTGGAA 360
CTTGAAATTC CTGCCCAGAA GGGGATGGCC CAAAGCTGCA GGAAGTACCA GGTGCAGCGG 420
GGCCCCTGGG TACCAGGATG CCACTCCCAT GCCATCTGCC TTCCCCGTCT GACCCTGCTT 480
CTCTGCTGGA ATGTTGCCTT CAGTTTTTCA GACAGCCCTC TTCACACGTA TGCTAGAAAC 540
ATCTTCCCTG GGGGTCTCAC AGTTGCATCT TCTTCCTCAn TCCCAATTCA GAAAATCCCA 600
GGGGAGCTTC TCATTGGGTC CATTTAGGGC CCATAnA 637
(2) INFORMATION FOR SEQ ID NO: 534:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 534:
TTCTACAAGT CCGGCACGTT CCGTTATGAG GATGTGCTCT GGCCGGACTG ACCAGCGACG 60
AGACGAAAAA ACGGACCGCG TTTGCCGGAA CGAATACAGC ATCGTTTAAC TTTACCCTTC 120
ATCACTAAAG GCCGCCTGTG CGCTTTTTTT ACGGGATTTT TTTATGTCGA TGTACACAAC 180
CGCCCAACTG CTGGCGGCAA ATGAGCAGAA ATTTAAGTTT GATCCGCTGT TTCTGCGTCT 240
CTTTTTCCGT GAGAGCTATC CCTTCACCAC GGAGAAAGTC TATCTCTCAC AAATTCCGGG 300
ACTGGTAAAC ATGGCGCTGT ACGTTTCGCC GATTGTTTCC GGTGAGGTTA TCCGTTCCCG 360
TGGCGGCTCC ACCTCTGAAT TTACGCCGGG ATATGTCAAG CCGAAGCATG AAGTGAATCC 420
GCAGATGACC CTGCGTCGCC TGCCGGATGA AGATCCGCAG AATCTGGCGG ACCCGGCTTA 480
CCGCCGCCGT CGCATCATCA TGCAGAACAT GCGTGACGAA GAGCTGGCCA TTGCTCAGGT 540
CGAAGAGATG CAGGCAGTTT CTGCCGTGCT TAAGGGCAAA TACACCATGA CCGGTGAAGC 600
CTTCGATCCG GTTGAG 616
(2) INFORMATION FOR SEQ ID NO: 535:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 544 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 535:
CCCTGGTCTC ACTCTGTTTA TGGAAACCAG CATTTTCCCC TCCCTTGTGT TGTTTCCCTT 60
CCTCCCTTCA TTCTTTCCCT CCCTCTCTCT CCTCTTTCTG TCACTCTATC TTTCAAATAA 120
ATTTTAAAAA ATGTTTAAAA GGAATACTTC TGTAAGGATG TCTGTCCAAC ACCTGCGTAA 180
CCATGTACTG GTGCCATCTT TGTTACTGAT CCTTTAGCAC ACTTGTTACC AATCCTTGTG 240
GTCAAAGATT GTTGTGTCAA AACAACTTAT GTAACCTTCC CCACGTTGTC TTTAAAAACA 300
CCCTTTGCCT CAGCTTTTAT GAATATACTC ATAGTTTACA GTGACACAGG TGCCCCCATT 360
GCAATGTCAT GTGATTCTCA GACAAACTTT TTGAATTTTG GAGAAACTGT CTTTGTAGGT 420
TATTTAGATC GACAAAGTTT ATTCTGCTAT GATTTGTATT CTTTTAAATT TGTnAAGTTT 480
TAATGGCCCA GAATATGGTC TATCTTGATG CATGGGCACC TAAAATTTTG AATTGTGCGA 540
nCTC 544
(2) INFORMATION FOR SEQ ID NO: 536:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 677 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 536:
TTTTCTAACC CTCTTTATCC ATATCTACCC CTCAACCTTT CTGAACCATG CTTGAGTTCT 60
AGCAGTCACG TTCTCTTGTT TTAGTCCTCC ATGATGTGCT TTCTGACTGG AACGTTCTCT 120
GCATTGTCTT TCTTCAGACC TCAGCTTCCC CACCCCTTCC ATGAGAAAAC TGGTGCTTTA 180
CACTCTCCCC ACTGCATAAT GCTTGCTTGA ACCTCCTCCT AGCTTCTACT GAGTGATTGT 240
GTCTTGGTCA TATGTGAATG TCTAGTTAGT GCCTAGCACA ATACTTTTCA TGTAGTAAAC 300
CATCTGTGAA TGTTGGTTTA AACGAGTTTC TGACAGTTTT TTTTGAGTAC TCTTCCAAGG 360
CCTAGTTTTC TTGTTTGTGG GAAGTCAACC AGAAGAGTAG GAGTTGGAAG CAGGCAGACA 420
AGAAAAGGTT CTTACTGGCT CCAGCTTTGC ATGCATTTTT TTTCCCCTCT TCATTTCCTT 480
CTTCCATAGC ACCGTTCAAA TTTGCAATCA TTTTACTACA CTAGTCTTCA CAGACCTTCC 540
AGGCACAGAT AAAGCTAACA GAGTTTTAAG GCAAATGGTC GGTTCCTGCA CCCAGCTGTT 600
AAGAGGGCTG CAGAGTGTGA GTGAAGACAG CTGAAGTGGn TAACTAATCA AAGGGnATCA 660
TTCCTCCTGG CAGGACA 677
(2) INFORMATION FOR SEQ ID NO: 537:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 615 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 537:
AATTCCCGGC TTCAGCTGCT AGGGGCATTT GAGGGAGTGA ACCAGCGGAT GAGAGCTCCA 60
TCTTCATCTC GATTTCTCTC CCTCTTACAA TTAAAATTTT TTTAAAGGTA ATTGATCATT 120
GTTCTGTATT TCTTTTGGTT ACAGGTGTTC TTAAGTTCAA GAGATTTTTT TAAAAGTAGA 180
TACTGCAACA CGTATTTTTA CTACGTTCGC GACTATTTTC TTAACAAGAG CCATCTGCCA 240
CTGATTAAAG ATCTTTCATG CATTATAATT ATTGCAAGGG ACTTCCCCAA TAAGCTTAGA 300
AAAGGTGTTG TGACTTCTGC CGTCTGCATA AAAGCGGGAG AACGAAGCCT CAGACGCATG 360
AACCGACGCG GCCAAGTCAC AGGTGCAGGA GAGGTGTGGA TTCTCATTCC GGCGCCCAGC 420
CCCACCGCGA CTCCTGCGGA GGCTTCTCTC ACCCCAGCAG AGCCACCATT AGCTCGCCGG 480
GCAGCGTCCG GGCCAGGTTC AGCCGCGGCG CCCCCGTGAG GCTGACGCGC TTGGTTGTTA 540
TGACGACACG AAGGGCAAAA GCCCGGGGAA AAGAACTTTG AAGGGCATTA GCGGAAAGCG 600
CTCCCCACCC CAGGT 615
(2) INFORMATION FOR SEQ ID NO: 538:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 550 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 538:
TGAGTTAAAC AAATCTCTTG TCTATAAGGT TGGCCTGCCT CGGTTATTTC ATTATATCAG 60
TAACACACTG ACTAATGCAT ATACGCAGAT ACAGGACTGC TGGGTCATGT AGGAATTGCA 120
GCTGTAATCC TCTGGGGAGC TGCCATGCTG TTTTCCACAC AGCTGCAACA GTTTCGGCGA 180
TGTACCAAGG GTCCCATTTC TCCCCATCGG CGCCAACACT TGCTGTTTTC CGTTTGTCTG 240
TCTGTAGGGG CCATCTGCAT GAGTGTAAGT AGACCCTCAG TGTGGTGATT TGCACTTCTT 300
TATGATTAGT AAGTTTTTTT CATGTTTTAA AATTTTATAT ACTTGTTGGT CATTTGCATA 360
TCTTCTGTAG GGAAAATATC TATTCAAGTC CTTTGCCCAT TTAAAAACCT AGATTATGTT 420
GTTGTTGTTG CnATATAAGA GTCCTTTATG CATTCTGGAT ATTGATCCCT TATGAGATAC 480
ATGATTTCTA AGTACTTTCT CACATTATTC AGGTTACCTT TTCTTTTCTT TTTAGAAGTT 540
ATTnATTTAn 550
(2) INFORMATION FOR SEQ ID NO: 539:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 458 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 539:
CATTTATTTG GTATGTCATT TTCCCCCATG TGCTTGCTGC ACTACCCTTG CTACTCCATC 60
CCTTAAATCA CAAACACATA TGTACCTGCA TCCATCCCCA CTCACATTAT ATTTCACTTT 120
TACTTCCATC CACCCAATGC TCGATTCTGT TAATTAGTTT TTTTAAAGAT GAATACTACT 180
GAAATTCTTG TTAAGTCTTC TAGCAAACTA TCAGAAGCCC TAAGGACAAT GAACTTGTTT 240
CCCTGAACTG ACAAACCTCA ACTGAAGCTG CACTAGAGTC AGCTCCTTTA AGGTGTGACC 300
ACTTCATTCA TCCATGTTGT CTGGCTGGCC CTTGGACATA TTAAATTTTT TTTTACTATC 360
ACTCTTCACT TGTTTGTATC TTGACCTGTC AACACTTCAT TGGAAAGTTG GTAAGGAAnC 420
AAACCAAAAA AAACATGGTA GGTAAGTATC CCCACCAA 458
(2) INFORMATION FOR SEQ ID NO: 540:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 458 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 540:
CAAAAGTACA GGAGGAGTAG AGGGACGATG ATCTGTCTTT AGAGTTACTT CCTGCTGACA 60
TGGGGCAATG AGGTACTCTT CATTGGCATG GGGTAATGTC TCGAGCAGCT GTCTCCTGGG 120
TTGGGAGCCG AGTATATCCC TGTAGCAGCA TTTGGTTGAC TGCCTGGTTG CTGAAGGCCT 180
TGGTTAGCTC CATTATCAAC TCTTTTAAAC ACTTAAACAG ACATGGTAGT TTAAAAGACA 240
GGGTGAGAAT AGCAACAAAT GATCCTAAGA GTGGCATTAA CCCGGTAATA AGAGGATTTC 300
ACAGAGGAGA GGAAACGAGG GGGGTGGTCT CTGGACCCTT GTGATGACTG TTTGATGGTT 360
TAAAGCTTAT CTCACCCAAG ACTGGGAGAA AGGCCACTCA ACTTTTGCAA GTAGAGGGGT 420
TTGTCTGTAG AGAAATTCTG AACAATCATT CAACATTA 458
(2) INFORMATION FOR SEQ ID NO: 541:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 645 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 541:
AGGGCATGAA AGGCTGTTCC GGCCACAAAA TAGAGACCCT GTTTAAATCT TATTAAACAA 60
TAACGCCCAC CCAGTCCTCC CCTCCCCCCC TTTGGAAAAA AATTCAGGAA GGAAATTGTG 120
GTTAAATCAT CTCATAGTTT GGCAAAAACT TCTACCTCTG AAAAATGAAA AGAAAAAAAA 180
AAACTCCATT CTGTGCTGGA CTCACACAGG CTAATAAAAA GCAGGACTGT ATTGCTTTGC 240
ACTGTAGCTG TCAAATTCTC AGCAATGAAA TACAGACCCT GACTCTCTCA GTCCCTTGCG 300
TCCTACCCCC TCAGACCACC AAAAAAGGAA TGGCGGGGGT GGGGGATGGG GTATCATTTA 360
TCCTCTCCCA TTCTAGAGGG AATCTTGTCT GTnGCTTTGT TTCACTTTTA ATTTCTTGGT 420
TTGAGGTCCA AACATCTCCT TTCCTGAATA AAGTTTCCAC TGTTGTTATA AACATACATA 480
TGCAAGGGGT GTTGGGAGCT GGTGCTCTGA AGATCTGTGC TTCTGCTGTC TTGTAGAGAT 540
ACCACACACT TGCCATCAGA GAAGAATGGA CATTGCAACA ATGAGAAAGA AAGAGAGGAA 600
GTAGAGAAGG GGGGAnAGAG GGAGGGAGGG AAGGAGGGAG GGAnG 645
(2) INFORMATION FOR SEQ ID NO: 542:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 681 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 542:
CAGCCAGGTA AGCAGTAGTG CTTTTTGGTG TGTGGCAGTA AGATGAGTTA TTCTGGGATC 60
CCAGTTCTGG TTTGGAGTGA CAGGAGACCC TAGAGTTCCC GAAGCAGCTG GTCAATTTAG 120
CAGATGTTTG TACCTCAACC AATTTGAAAT GGATTTGGCA ATACAAGGTG AACAGTGTAG 180
TCACATGTCT TGTCTCTCCA TTGGGACATC ACAGGGTGCC CCCTAGGGAC CCGTTGAGCC 240
CTGGAGAGAA AGCTGGTGGT GATAGGGCTG ACCATGCAAG GGAGGGCTGG CAGGCACAGC 300
CGCCCTAGGG AAGGCAGGGA GAGGAGGCAC CTGTTCTGGG CCAGACATCC CACTGCTAGA 360
TCATCTCATC CATTCCGGAG CTGATTTGGC CCTATCCCCT GGAAGATGGA AGTACGCCTT 420
AGAGAGCGCA TGGCAGAGCC TGGATGTCCA TGCCAAGTCC AAATCTCAAA TCGTTCACAG 480
ACTAGCTACA CGACACTGGG CGAGGGTGTC CTTCTCCGTG AAACCAGGAC AGTGAGACCC 540
TCCCTTGTAA ACCATACAGC AGCCGTGGGG CTCGAAGGCC ACTGCTGTCA GAGCCAGTGC 600
CCTCCAGGGT GGCGACTCCT AGGGAAGGTC CCAGCCTCAG GCTGTGCCAA GTGCACCTGC 660
TTCTCACCTG CAAAATTGAC C 681
(2) INFORMATION FOR SEQ ID NO: 543:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 553 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 543:
AGTAAATCTG TATTTGAAGG TGAGTCTGAA TTAACACTGT CTTCCTTTTA GGGCTTGATC 60
TCAGCAGTCA GATCAAGGGC TGTCTTGTGA CTGCATCAGC AGACAAATAC GTGAAGATCT 120
GGGACATACT GGGAGACAGG CCGAGTCTGG TTCATTCCCG GGACATGAAA ATGGTAAGAA 180
TCTCCCTGAG TCTCTTAGTT TTCTGTTTTT CACCTGTGTC GCTTGTCATT CTCGGGGAAT 240
CAGTGCAGTA AAGAATGTGG ACAAGTTGGG AAGTTAATTG TGGTGGGAGG TTTAGGGGCC 300
TCAGACCATT CTGGGAGAAG AGAGACTCGG AGGGTCAGTT TCTACTTCAC ACGAGTAACA 360
GCTGTCGAGT AAGGAGGGGG CGAGGGCTTC TCTAGGACAC AGGTGAAGTG GCCATTGTCT 420
CACCCTAGGA AATCCTGATG CTTAGAAAGA ACCAGGAATT TTTCTGTATT GAACACCTGT 480
TGTGTACCAG GCACTGAGCT TGATAGATCA CAGAACACCA CATAGCTGTT GAAAATCATG 540
TACCTGTGTA TTn 553
(2) INFORMATION FOR SEQ ID NO: 544:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 555 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 544:
CCTTTTCACA AAAAGCAGTC CTTATCACGG GGTGCGTATT CTTGTGCACC CCTGTGTCAT 60
CCGCACATGA TGACCCAGCA GTTGGTACAT GGTAAGTATG ACCAATACAA TCTGCAAAGT 120
GAGTCCGGGG TGATGCTGAG TCATAACCCT CCAAAAAAAG CTGATCTCTG CAAATTGGAC 180
CCTCGCTTAC CTGGAAGTAA TAATCTGGCT TTGCAAGGCT GCACAGCATA GTAAGTAACA 240
ACCCTTGTAG CATTCGGTTC TCATATCAAT CAGACTGGAG GGTAAGATTG TCTACCTTGC 300
TCGTCACTGG ATTCCTTCTA AGGGCCTTGC AACAGATTTG TCAGATGAAA GGGTGAGCGA 360
GCAACCAGCA GAAAGGCACA GGATGGGATC ATTGTTAACA GCACCGACCT GGCGCTGGTT 420
TTGTGGTGCA GTGGGTTAAG CCACCACCTG TAACACCGGC AACCCAGAGG AACACGGGTT 480
CAAGTCCGAG CTGCTCCACT TCCAATCCAT CCCTATTAAT GTAGAGGATG GGGGTCCCAG 540
CCATCCAnAA GAGAA 555
(2) INFORMATION FOR SEQ ID NO: 545:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 545:
AACTCTGCCT TTCCAATAAA TAAATCTTTT TTTTTTTAAT TAAAACCCAG AAACTTGGCA 60
CCCTCAGTGA TTGAGAAACT TTACCTGAAT GCTTTTTGGC ACGTTCCTTC ATGATCCAAC 120
CACCTGGACA AAACCCATTC TATGTGGTCT CAGGTGAGCT TGAAACTGTA ATGGnAGAGA 180
GGTCCACTAT GTGGCCTTTG TCAGTCCTCC TTTCTCTCCT CACCGCTCAT GTTTCAAAAG 240
ATAGATATCT ATTTCCAAGA TCTTTGGAGG AAACCATCAA GATCAGGGAA AACATAACAG 300
CAAAAGTTCT GTACCCTCCC TGGATGGTTT TCTCCCCAAC ACCATCCATG GGCGAGAGGC 360
ACAGAAATGG TTCCCTACTG GTATTGGCCA TGGCTTCCTG GAAGCCATCT TGGAAGGGGA 420
CTAGAACAAA CACATTTTTA GCCCCTCATA TCCAGCATCA AAAGGAAAAC CCAAGTAAAC 480
AATTAAATAA TCCTGGCATG GTGAGGATAT ACGGAGTCAC TTGGGTGAAC CTGGAGTGGA 540
CTATCTGTCG AGCT 554
(2) INFORMATION FOR SEQ ID NO: 546:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 556 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 546:
ATCCTGAAGA ATAATTGATT TGTTAATAGA AGCACTCAAT GCCTCAATCC CTAAATCTAC 60
TATGTCAAGA CATATTTCCA AAGTTGATTT TGTGAAGGTC CTTACAGAAT ATGTAACATT 120
AGCTCTCCTC AGGTAGCCAT TTGGTTCTGC AACACACTCC TCAAAACCTA TTTTACTACT 180
GACTACTACT GCTGACATTC TCAACTCACA CAAATAGTTC AAAAAGCAGT GCTATGTAAA 240
ACTTTTCAAA AAGTGTATTA AATAGTTTCA TCACACTTCA TTTTGCCAAC CAGGTGAATG 300
AAGCAATTTT CCTCAAAGGA CTTAAATTTT TAACAAAAGG TCAACTCTAG AATTCAGATA 360
TAGTGAGTTG AAGCATCAAC TTGCCTCCTG TAAACCACTT ATATTCTTAA TTTTCATTGT 420
TCTTTAAATT CTTTAGTTAT TTCACTTGAA AGGCAGATGG ACAGTGAGGG GTGCGGTAGG 480
GGTGGCAGGG AGGGAGTGGT TTCCAACTGC TGCTCCATTC TGTAAACGCT TACAACAACC 540
AGGGTTGGGC CAGGGT 556
(2) INFORMATION FOR SEQ ID NO: 547:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 552 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 547:
AATCGTTGAT CAGATCGCGA ATAACGCTTC GGGAGATTTC TTCGCCGGGA TACACTGAAA 60
TCCAGTGCTT TTTATTCATG TGATACCCTG GCTTAATGCT TGGGTA ATT TGCTGATTTA 120
ACAGGGATTT TTGTGGATCG GACTTCAGAT TGATAAAGGG GACGCCGCGT ACTCCGACGA 180
CAGCATAAAA ATCTTGCCGC CAATTTTAAA AACATCGAAC TCCGGGCCAA AAGGCCAGCA 240
AAGCTCGACA AAGGGTAACT CAAGGGCCAG GCGTTTCGCG TTTCGTGCAG TGATTGCTTA 300
TCCATAAACG TTCCTTTAGG CGAAGGAGAA TAAGCAAAGT ATGCCGCGAA GTACGGCGAT 360
AATCGACGTT TAATCCGCCA GCGAGAACCA GCGTCGCCAG ATAAAGCGCA GAACAAAATA 420
CTCAATAGCG CCCAGCACTA AAAACCACAG ACAAAACAAT AAAGTGTAAA GCTGACTAAG 480
ATCCATCAGA TGGAACATGG TCACCAGTTT TTGTGCCAGC GCCAGCCCCA GTGCGGGGGC 540
GGGCAGCAGC AG 552
(2) INFORMATION FOR SEQ ID NO: 548:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 443 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 548:
GGGTGAGCGC TTTGTGGCCA CTACCCACAC TAACGCCTAC TTTTGATTTT AATGCCAGTT 60
TAATGTAGGG GTGAATCTCT TTCTTTGGGA TAATTCAAAA AATGCATTTA CCTGGCCCTT 120
GGGGTACGGC ATTTGGGACA TCAAATGCTC CACTGCGTAG TCTCTCTGCC CTCAGCGAAG 180
TCTTCATCGT TGTCTCAGTC ACAGTGCTCC TGTGGGCCCT CCGTGGTCCG CTGTCGGTGC 240
AATGCGCTGT AGTTTGGGGG AGGAGTCGTT GTGAATGGAA AGGAAGCGTT GGATGGGCAG 300
TCATGTACAG AAGGTGACTG TTGCAAGGAC AGCGTACTGG ACACATAACG ATGTGTGTGG 360
GCTTGGGCAG AACCAGGCCA GGCTGTTGAA GCAAGTAGAG TGGCCTATTT TCCCCTAAGG 420
GGCAACTGGG GGCCnAAnTG GTT 443
(2) INFORMATION FOR SEQ ID NO: 549:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 579 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 549:
AAAGCTCTAC ATCTAGCTAG TTGGTAAGAG TATTCTTTGG GACAGATGAT GGGCAAGATG 60
GAGATAGCAA GTATGAATCA GCACAAAAGT CTGGGTGCAC TTAGAAAAAT GAGCACAAAT 120
CACATATTTT GTTAATATTT TGTCAGTATC TTTTTATTTA TATAAACAAT TGTTATAATG 180
AGTAAATCCA GTTCAACCAT TGACATTCTA TCTAAATATT TTCCAGGTTG CCTAACCCAC 240
AATTCTATTC CCATGGCAAA ATGTGTTCAC CCAACATTTC TGAGCATGCT TTGATTTTTC 300
TGGGCCCTGC AATTTCCTTC ATTTGAAAAT CTTCTTGCCC CCTTTAAATA TCCAATTTAT 360
TTATATGAAC TGTCTCTGAA GCCTTCCCCA ATAATCACAC TGGAGTGTGT CCCTAGACTT 420
CTACAATATT TTGTTTGAAG CAGTCATTCA ACATTTATGT GACATAATAT TTCACTTAAG 480
TTGTGTGTGT AAATATTCAA ATTTCCCTAC CCTATTCAAA TCTCTTGAAA AGTAAGGAGA 540
ATAGATTAAn AATATTATAA ACnTTATAGC TGCATAGCG 579
(2) INFORMATION FOR SEQ ID NO: 550:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 588 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 550:
TGTATCGCTG TTCAGATCAG TGCCTCTGAG GGGACATGGT CTCTGCAGCT TCTCATTGGT 60
CACTATCTGC TCCTGTCTCT TCAAGGGTGA TTTTAAATGC AGAGCAAGGT TATACAGCTT 120
CCATTTACAA TTTTTACAGG GAATGCAATG AATCAAAAAG AAATAATCCA GAAGAGGAAA 180
ATCTGTATGG TTTTACATAG GCAATTAGAC CTTCTCCAAT TTTGTTCAGC ATGCTTCAAA 240
TGGAAATCTT AGCAACAATT TCAAGTGGAA ATCTCAGCCG TGAACACTAA GTTCTTTTCA 300
CAAAAATAGA TACAGATTTT TCAAACATCA ATATGAAACA TTTTTAAATA TAGAAATAAA 360
GTACTGAAAG TATTTAGAAA AAAACAAGCA AAATATTGCC TATTCAGTTA TAAATACAGT 420
GCCAGTTGTG ACCCTTTATG GTCTGGAGAA TTTTTTTCCT ATGGATAAAA TTGTAATATT 480
AGGACAGATT GTCAATTCAT CTGGTCCAAT CTGATTTATT TTCnTATTCn AAAAATTTAA 540
AACATTGTTT AAAATATATG CnACATATAA GGCnAGACTA ATGGTATA 588
(2) INFORMATION FOR SEQ ID NO: 551:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 700 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 551:
nAGAACGTAG GATACTGATT CGTTTCAGAC GGATGCGCGC ACAGGGAGCG TCTATGCACA 60
GCTTGTCCGT GCGCCGCGCG TTGCAGGATT GCTGCTGAAC ATAGATATTC CCTCTCTCCC 120
TGACGGGTAC TCTTTTTATA CTGCAGCACA TATTCCCGGA TGCAATGCCG TTCGGTGTGG 180
GGAAAATACT GTGCCGGTTT TTGCGCATGG AGAGGTGGTG TACGCAGGGG AACCGGTGGG 240
TATCCTCATT GGGCCTGATG AGCATGTGGT ACGTAATTTA GTGCAAGATG TGGTGGTGCA 300
TACGTGCGCA GAGCGGGCCT GTGCGTCGGA AATACTCTGT GGAATCAGTG AAGGGGAACC 360
CCTCGCTCAA AAGGTGGCGG TGCAAGGAGA TGCAGAAACT GCTTTTAAAC GCGCATCACA 420
CACGGTATGC TCCTCTTGTA CATTTGAGCC GCGTGTACAC TACTTTGCGG AAATGCCAGA 480
AGTACAGGCA CTACCCGACG CGCACGGTCT GCACGTGTAC GCTGCTACGC ATGGCCTGCG 540
CACATGAGAA AAACTATCGC GCAGGTACTG AATATTTCTG AGCATGCGGT GCACGTACAT 600
CCGCAGCAGG AAGCGCTTTC CTGTGATGGG AGAATATGGT TCCCCTCAGT GATGGCAAGT 660
CAGGCGGCGC TTGnAnCCTA TTGTGCGAAA AAGCCGGTAC 700
(2) INFORMATION FOR SEQ ID NO: 552:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 555 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 552:
CACGTTGCAG GAAGAAGATC AGCTGCTTGC AAGCAGGTCA TCGTGCAGCG CAnATGTGGT 60
TTTCTGTACA CCACACGTGT TCTACCATGC ACTGTGCTCG AAACTGGCGT GGTCAGGGAG 120
TATTTTTTCT CGCAGGGGAA GACGTCACGA TGATTGAGCA ACTTTCGGTG CGCAACGTTG 180
CGCTCATTCA ATCTTTGGCG TTGGAGTTTG GTGCACAGTT TACTGCCCTC TCAGGGGAGA 240
CGGGTGCGGG TAAGTCAATG ATACTCGGCG CGCTGTCCTT TCTCTGTGGG CAAAAGGTAG 300
GGCCTGATCT TATTCGCAAG GATGAGAACG AGGCATGGGT TTCTGCGGTG TTTCGCTGTG 360
ATCACGCACC GCTGCGTGCA CACATGGTTG GCAGAACGGA GTATTGAGCC TGAGCACCAC 420
CGCGTGCTCC TTCGTCGGGT GATGCGGCGT ACCGGTCGTG GCACGGCGTG GATTCAAAAC 480
GTCCCGGTCT CTCGCGCAAT TTGGAGTTTT TCACGTCATT TTCATAGACC TCCACGGACA 540
GCATGAACAC CAATC 555
(2) INFORMATION FOR SEQ ID NO: 553:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 553:
CAATCCATCT GAACACCTGC ACCCCACGAA AGCGCTCAGC AAACGCACAC ACCTTCTGGA 60
TTTCTTCTGG CAAAAGCCCC GAAAGGGCGC ACACCACTCC ATATACGCTC CACCCCCTCG 120
TCGCAAAGTC CTGCACCACC TCCGACGAAG AGGAGGACCC CGACCACCAC ACAAAAAGCG 180
CCGAATCGAC AGGCATCGCG CAnCCACACC TGCGCAACGA GCCGTTCACA CTTGGCAAAC 240
GCCACCCATA CCCAAGCTTA CCAGGAAGAA AAAGGGGGGA AACTACGATG CACCCTCCAA 300
CTTCACTCAT AAGCTCAGAA ACAACACGGT TCTTAACCCC AAACTGCTGA AAGGGACGAA 360
TGTCTTCAAG CGCCTTTTGC TTCTGCCCGA GCTTGCGGTA GCAGTCTTGC AAGCTCCACA 420
TATACGCGAT AGTTTTTCCG ATTCAAGCTG CACGAGTCGA TTCAAGACTC ACAACCGGCT 480
TCTTCGTATC TCCCCTGGAG TTTACAAAGG ACTGCGAGCC CGAGCGTGGC ATAGGCATCG 540
TAATCGATAT CCAA 554
(2) INFORMATION FOR SEQ ID NO: 554:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 925 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 554:
CAGCATGGTG CGnTGAATGT GTTCCCCGGG CnTTGTACAC ACCGCCCGTC ACACCATCCG 60
AAGTTGGAGA TACCCGAAGT CACTAGCCTA ACCCGCAAGG GAGGGCGGTG CCGAAGGTAT 120
GTTTGGTAAG GAGGGTGAAG TCGTAACAAG GTAGCCGTAC CGGAAGGTGC GGCTGGATCA 180
CCTCCTTTCT AAGAGAAAGG GTATGGGCAT GGCATGGTGC CGTGTTCGnC GTGTGGCGGA 240
AGCCACACGG TAGGTTTTTC TGCTCctGCA CGGCAGTCTC TCCCCTTCCC TTTTGAAAAG 300
GGGCTTGTAG CTCAGTTGGT TAGAGCACTT CTCTGATAAG GAAGGGGTCA TTAGTTCAAC 360
TCTAATCAAG CCCACTATTA TTCTTTATGT CCCTTTGTTT TGTTTATGGG GTAAGGAGTA 420
GGTGGTAGGT GATTTTTGAG AGTATTAGGG TGGGGTGTGA AGTTGAGAAG GGATGGATAA 480
TATGGTCAAG CGAATAGTGG TTTACGGTGG ATGTCTTGGA GTTGTCAGGC GATGAAGGTC 540
GTGATAAGCT GCGAAAAGCC TCGGGGAGGA GCACATGTCC TGTGATCCGG GGATGACCGA 600
ATGGGGTAAC CCGACAGGgT AAAgCCTTGT CATTGCCTTC CTGAATGAAT AGGGAGGGTA 660
AGGCGAAACT GGGTGAACTG AACCATCTAA GTAACTTGGG AAAAGAAATC AAGAGAGATT 720
CCGAAAGTAG TGGCGAGCGA AATTGGAGGA GCCTAAACCT GTGTCTAACA GGGGTTGTAG 780
GGCCGCGCGG GCTTGCGTTC GGTGGGTGAA ATAATCCGGC CTATAGCAGA AAGGTTTTGG 840
GAAAGCCTGA CAGAGAGGGT GAAATCCCCG TATGCGGAAT GGGGCGGACC TGCTGGTGCG 900
GTACCTGAGT AACGGCGGGA CACGA 925
(2) INFORMATION FOR SEQ ID NO: 555:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 940 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 555:
TCTGGGCTCG GACGGTACAG TAGGCGCGAA TAAAAATTCA ATTAAGATTA TTGGTGAGGC 60
GACGGATAAT AACGCGCAGG CTTACTTTGC CTACGATAGC AAGAAGTCTG GTGGTTTTAC 120
TATTTCTCAT TTGCGTTTTG GAAAGCAGAA GATCCGTAAG CCCTACCTCA TTACGCAGCG 180
GATTTTGTAG CGTGTCATAA GTTTACGTAC CTTGAAACCT TTGACATGCT CAAAACGCTC 240
AAGCGTGGAG GGACCTTTTT GCTGAATGCG CCGTACAGTG AGCATGAGGT GTGGCATCAC 300
ATACCCATAG AAGTCCAGCG TCAGATCATT GAAAAGGAGG TGAAGTTTTA CGTCATCGAT 360
GCGATTTCTA TCGCTCAGAA GGCGGGGATG GGCACACGTA TCAATGTGGT GATGCAAACG 420
GCTTTTTcAA AATTTTTGGT ATCTgCCGGA AGCTGAGGCG ATTGACCTGA TTAAGAAATT 480
TATACAGAAG GCCTACGGCA AAAAGGgTGG GGAGGTTGTA CAGAGGAACA TCACCACTAT 540
CGATAtGGCG CTCGCTGGGG TGGGATTGGT GGAGTATCCG GGAGTTGCCG GTAGTTTGGT 600
GACGCGTCGT CCTGCGATGA GTTCCGATGC TCCGGAGTTT GTGCAAAGCG TGTTAGGTAC 660
TATTGCGCTC AATCAGGGGG ATAGTCTTGG GGTGAGCGCA CTACCAGAGG ATGGTACCTA 720
TCCTACTGGT ACCACGCATA CGAGAAGCGC TGTATAGCCG AGACTATACC CATTGGGATC 780
CGTCTGTTTG TATCCATGTG GTCATGCGCT AGGTGTGTCC TCACGCATTA TCCGCATGAA 840
AGCGTACGAT GGTAAGGAGC TCGAGCAGCG CCTTCTAAGT TTGCTTCCTG TGGACTACAA 900
AGGCAAGGAA TTGGGGAAGC GAATTTACGA TTCAGTTTCC 940
(2) INFORMATION FOR SEQ ID NO: 556:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 556:
ACAACCACTG CCACGATGCG CCCTGCCTTG TCCACAACAT AAGTAGTTGG CAGACCACGG 60
GAAGCAAAAA CACTCCCCAA ACTCCCCTCC TCGTCAAGAT AGATAGGAAA GGTATGCTTT 120
CCACGCGCGA TAAAACTTTC CACCTGTTTT CTCGAGTCAC CAACGTTGAC CGCGACAATC 180
TGAAAGTCAT TCCCCCTCAT AAGAGCCTGC ATGCGATCCA TAGACGGCAT CTCCGCACGA 240
CAGGGCGGAC ACCACGTAGC CCAAAAGTTC AAAAGCGTCA CCTTTCCCTT GAAAAGGCTA 300
GGAACCAGTG CCTCCCCCTT CAGGCCTTCG CATGAAAGTC ACTAGAAAGG TCGAGCGGGT 360
TTGGGATACA CAAAAAAACG GAAACGCTCG AGCGCCTTTC AGCGAGCGGG AAGGTACATC 420
CGCATTGTGC GCCACATCGG CGGCTTGTAC GCnGGAAACA CCCCACACCG CAGCACCAGG 480
ACGGAAGGAA CACAAGGGGA CGCGCGTAAA GCAnCGGTCG TACGGGAGCA ACTCATGCAC 540
AGGGAACATT CACT 554
(2) INFORMATION FOR SEQ ID NO: 557:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 573 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 557:
GCGTCAAGCA CCGGGAAGCG CACTACGCTC CAATATCTAT TAAAAGATTT TGTACCTCGT 60
CCCGAGCAAA TGTGGGATAT AGCGTATGCA TACAATTTTG TGCATCCGCA CGnAACCGCA 120
GTGCnTGCAG TTTCCGGCGG GAGAAGGGAT GCCTTTTGCT ACGGCGTTGA GGAGATCGGT 180
TAACGCTATT CTCAATACAG CACAGGATAT TGTGAAAAGT GATGCCTTCT TACGTGAGCG 240
GCGCACATTG CTGGCTGATA TTGAAACACG TGAGnTGCTG AGCTTTCACG TATTGAAGCG 300
GAGTTGTATA CGCGTGGGTT TCGTGTGAGA TGGCATAGGA AACGTGGTAC GTACTCCTTT 360
GATTTAGTTC CCCTATTAAA GGGAAAAGAC AGTAGCTTTG AAGCGCTGCA CGATTTAGCT 420
TCCCGCGCGA AnTTTACTAG ATGTGTAGTA CACGAACTCC ATGCGCGATA TCGTCTTTCC 480
TGTGATGAGG TTTCTTCGCT GCTCCATACG TTGCGCACnG CGGGGCGGGC CGCCGTAAGG 540
CGTCTTGCGC AnTACTACCG TGCGCGTTTG CGG 573
(2) INFORMATION FOR SEQ ID NO: 558:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 558:
TGGGCTGCTC TACCGGAGAG AAAGAGATGG AAAAGGCTGC AGACATCATT CGCATGTACG 60
TTACCGACAC GCTTTCTTCT GTGCGAACAT TTAGGCAGGA GTCACACACG CGCGCAACCT 120
ACGGCTTTTG CACAGAGGCT CACGGTACAG ATACCTCTTC AACGATAGAT TTTTCACAAC 180
TGGTGTTCAC CGCAGAACAG GTACGCACCA TTGAAATCGC GAAAACTATT ACAAAAATTG 240
AAGATCGGGT ACTTGCGTAT GCAATCAAAT ACTGGCACAG ATACGATGAT CGAGCCTTTG 300
AGAACACCGT GCTACTGTAC AGCAGAAAGT ATAAAACACA TCACTACAAT ATCTTCATTC 360
AATTAAAACT GGACGGGTCA CCAATGGAAA GACGAnGATA TTTTGCACTG GTGCTCGCGC 420
ACGTCGCCTC AGGTATACTA CAGCATCAGT GGTGATCTGT ATTGCAACGA AACGGCACGG 480
TTAAAAGCGC GCCGGTAGAA CGTGCGGAGC ATCCATCATC AAAAGGGTAA AA 532
(2) INFORMATION FOR SEQ ID NO: 559:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 537 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 559:
TTCAGATCAG AAGCAAGGAC AACAGGAAAA GCGTAGGCAC TGTGTTCCAA TTGCTCACAC 60
CGGTGAAGCG TACCGTGACG CGTACGCAAA AGCGCAGAGG AGTATCGGTC GTAGATTGCG 120
CGCCTGAGGG CAATATTTTT TTCTTGCTGC TTTACTTGGA CACACGCGAG CGCCGCGTTT 180
CATATCCGGC AGCATATCAA CGGCAAGCGC TTCTGGCACA AGCGCCTGAA GCCGACGCGC 240
GCAnGCGGCC TCAAAGGCCA TGAGTACCGC GCCGCCGCCT GCGGTAAGCA TATCGTGTGC 300
CTCCAATCCC ACGATGACAC ACGAGCCAAA GTTCCCAnCT TCTTTTCTCC CAATACTGCA 360
CCGACACTCT GGAGAGCTGT CTTCGATGAC GGGTATCCCC AGTTCCAAAA ACACCGCTGC 420
AAGAnGCACA TTTTCCAAGT GTTTCAGGCA CAAGAAGCGC ACGAGCGCCA AGCGCGAATG 480
CCATTTCCAC CACATCACGG GACAACAAAC CGCTGTGATG TCTAAGTCAA GGACAAn 537
(2) INFORMATION FOR SEQ ID NO: 560:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 560:
TTTAnTTAnG GATCCCCTTC CGTTTCCGTC GCTGCCGTCG CCGAGGCCTG TGCACTCCCC 60
GGCCACCGAT TGGAGGCAAC CAAAAACGCT ACGGATAAAA CGCGCATGnT GACCTGCTTC 120
ACACGCGCCC GACTGCGCTG CCCCCGCTTC ACGTTCCTTG AGCCTGACTC GTTCGCCTGG 180
GACACACCGC CTGGGCATGC CCGACTGTGT TCCCACCTGC ATAGCGCTGG ACTCTCGTTT 240
CCTCTCGTCG TAAAACCGAC AGACAACATG GGAGCCCGCG GCTGCACGCT CGCGCAATGC 300
AAGGATACCC TCATAAATGC CTGCGCCGTG GCGGCCAGTT CTCTCGCAGC GGCCGGGTGA 360
TTATCGAGGA ATTTATTGTC GGAAGAGAGT TTTCCCTGGA AGGGCTnCAT ATTCGACGGG 420
ACGTTGTACG TCACCGCACT TGCCGATCGC CACATCTGCT TTCCTCCCTC ATTCGTAGAA 480
ATGGGACACA CGCTCCCGGG CAnGCGCTCT GTACACAAGA CGnACAAGCG CTCATTCGAC 540
ACCTTCCACA ACGGTGTGCG GGCA 564
(2) INFORMATION FOR SEQ ID NO: 561:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 554 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 561:
CAACTTTATC TTCTATCTTG CCACTCCCCC TAGCCTGTAC GAAACTATCC CCACGCAGCT 60
TGCTATGCAC CACTTGAACC GGGAACAGGG TAATTTTCGC AGGGTAGTTA TTGAAAAACC 120
CTTTGGCTAC AACCTAGAAA CCGCGCAnAC CTTAATGCGA GCTTGCGTGC CCACTTTCAG 180
GAAAACCAAA CCTATCGCAT CGATCACTAT CTGGGTAAGG AAACGGTCCA AAACATCCTG 240
GTCACTCGCT TTGCCAATCC CCTTTTCGAG CCCACATGGA ACCGGACCCA TATCGATTAC 300
GTTGAAATTA CTGCAAGCGA ATCACTAGGT GTCGAAAACC GCGGCGGTTA CTACGACCAG 360
TCCGGTGCAT TGCGCGATAT GATCCAAAAC CACTTGTTAC TCCTCTGGGT ATATCGCGAT 420
GGAGGCGCCC GCCGTCGTGA GTTCAAGTCG TCTACGGATG AAATCGTAAA AGTCTTGACT 480
GCCTGCGCCC TATGGGGAGA ACGCGACGTC ATGCAGCATA CGGTGCGTGC CCAATACGTC 540
GCGGGCAAGA TACG 554
(2) INFORMATION FOR SEQ ID NO: 562:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 972 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 562:
ACATTGTAGG TGTCAAGATA CAGACACACG GCGCTGACTC CCATTAGAAA GGCACGCCAT 60
GCAGACGCAC GCAGTTCTGT ATAGCCTGAC GTATCTGCTG CCCCGCATTC AACGCATCCG 120
TCTTCTTCTT CACTTCTTCG GTTATGTGTT GCTGGACACT CGCGAAAAAC GCCGTTATTT 180
CCGTTCGCAT CATCGGCACT AAATCTGCCA GATCCTGTTT CACCTCATCT TGCACGCTCA 240
CCATGGTCAG AATGCTGTTC TGCTTACGTG ATAGTTGCTG CTTAATGAGC GCTTCGCCTA 300
CCATTCTCGC AGTATCGCCT AGACTACCCC CGACCGCTTG CATGTTTGCA TTATTTTTTG 360
CCACAGCCGC TTGCACTTTC TGATTAATTT CAGTGACGAT TTGCGTCTGT ACCTGCTCAA 420
ACCCCTTCAC CACCTGCTCC GCATCGTACT GCAGCAAAAC CTGCCCCATC AGGGAAAATG 480
CAGGAAGTGC AGGAAGCGGC GGCAGGTTCG GCGGACTTCC CTGCGGGGTG TGAAGATTTT 540
GCACAACCTT ACCGGTAGGT TTAGCAGGAT TAGGCTGAAC TGCCTCTAGC GCGTTTATGT 600
ACGTAtCCCC CGAGATTCCA GCGCGCTTCG AACTCCAGCC GTTACTGTCT GCGTCGCCTG 660
TTGCACTACC TGGGTTACCC AGGCTTCCTG TTTTTGACTT TCTCCCTGGA AGAGGTTATT 720
TGAGAGGGCG GTGAGTTCAC TCTGCGCCCT CTGTGTGCGA TTTTGAAAGT CCTGTGCACT 780
CTGGTGTTGG TTACCGGCGT CGAGGGCGAA GaGAAGCGGA AGCCGGCGCC TGGTTCGAGG 840
GTGAGTCGGC CCCCTACATT CCACAGCAGT TTATCCTgTT CTGATTGTTT GCGTCCTTCT 900
GTGCACCGAT GAGGTATCCG TCTTCTAGCG TAACATTGCT GGCAAGCTCT ACCGTGCACA 960
GAGGGTGTCC TG 972
(2) INFORMATION FOR SEQ ID NO: 563:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 619 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 563:
CAGGATGGAG AGGGGGCACG CCGCTTTGGT GCAAAGGGGC ACGATTGTGT TATACCGCTG 60
CCTCCGGGTT GTCTTTTAAG GGATGCGCAG ACTCATGAGG TTTTGCACGA TTTTGGTCAT 120
GCCCATGAAG GTTGCGTGAC GCTCCTTTCG GGTGGAAGGG GTGGTTGGGG GAATTATCAT 180
TTCCGTGGCC CAGTGCAGCA GGCTCCGCAA CGCGCGCATT CTGGGCAGCC GGGGCAGGAA 240
CGTGTGGTGC ACGTTGAACT GCGTATTGTG GCAGACGTTG GCTTTGTGGG GCTCCCCAAC 300
GCGGGCAAAT CTTCTTTGCT GAATTTTTTT ACCCACGCGC GGTCGCGnTn TGGCCCCTTA 360
TCCTTTCACT ACCCGGATTC CTTACTTGGG GGTGCTGCGT ACGGGGGAGG GCGCGACGTG 420
ATCCTGGGCA GATGTTCCCT GGGnTTCTCG AACGCGCCTC GCAGGGTGTC GGCTTTGGGG 480
TGCGCTTTCT CAAGCACTTG ACCCTGCTGT GCGGGGCTTG CATTTCTCAT TGATCTTGCA 540
GATGAGCGTG CGCTGCATAC ATACGAATTG CTTTGCAAGG AATTGTACGC TTTCTCCCCT 600
GTCTTTGAGA nAAAAGCGC 619
(2) INFORMATION FOR SEQ ID NO: 564:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 537 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 564:
GACAGACATT ATGGAGGTAA TTTTTGGTTC ACTTGCTGGA GTTTCGGAAG GAGTAGAAGG 60
ATACACAGTG CATGGGTCAA TGGTGCAGCG AATGTCAAAA CCTCGTGTTG ATTTTCAGAT 120
GAAAAGCGTC GGTACTCATG AGATTTTATG TTCAGGAACG GTTCCTCTTG AATATTTTAA 180
CGAATGTTTA GGTACGCGTT TGCAGTCGCG CGTGTACCAC ACGGTGGGAG GTTTACTCCT 240
GGAACGTTTT GGACGTCTCC CTACGGTAGG GGATGAGTTG GTAATTGAAG GATTGCGTTT 300
TAAGATACGC CGTGTACTCG ATCGGTATGT TGTGTCTGCC CTCGTGGACA CTCGAGCATG 360
TAGTCAAGCG TTGGCTGACG CCTAAGTAAG TACACAGGAT GGGGGCCGTA CTTTGCGGnA 420
TCTGCAATTA AGTTGTATGT TGGATCGTGT TCTACAGGAA TGTGTGCGCA AAGGGTGAnA 480
GCAGAGTTGC TACnTGTTGA AGTGGGTTCC CGTTTCAGGT GTATCGCGGT GTGGAAA 537
(2) INFORMATION FOR SEQ ID NO: 565:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 488 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 565:
CACGTTCGTA ACAATTGGAC TGGTAGTGCA TGAGCGCTTC CTCACCTTTA AAAGTACTGG 60
ACTATTTACG GCACCACAGG ATAGAGGGGC ATTGTAATGG GAAGGTGCTG CTCTGTGCAA 120
TGCTCACAAA AAGTGCATGT CTTGAAAAAG TGTACCAGAG CCACTACACT GGTGCGCGTG 180
GGTTCTGCTG TTTCTCCGAA AGTTTTAAAA GGCTTTCGCG ATCTTTTACC GGATGAAGAG 240
ATTGAGCGTG CATTGCTCGT AGAAAAACTG ACGGTGGCTT TAAGACAAAT GGGTTTTGTA 300
CCTATCGATA CCCCCGCGTT GGAGTACACC GAGGTTTTGC TGCGCAAAAG TGAGGGTGAC 360
ACAGAGAAGC AGATGTTTCG CTTGTTGATA AGGTGGAAGA GATGTGGCCC TCCGCTTTGA 420
TCTTACGGTG CCGCTTGCGC GTTCGTTGCA ACGCACTATG CGCGTTTGTA TTTCCTTTAA 480
GCGCTATC 488
(2) INFORMATION FOR SEQ ID NO: 566:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 566:
CCTTTGCCCT nTTGTTTGGA ACCCATTTCC CAAGGCCGGG GAAACTTTTn GGGGCCCGTT 60
TAACCCAGGG GGAACCAACC TTGGAACCCA ATTCCCAGGC CCCnGGAATT nGCCCCAACC 120
GGCCGGCTTT TnCAAAACCG TTnCCAGGCC GGCCGnATTC CAGGGTTTTG CCTTGGAGGT 180
TCTTAATCCC CTTTGGATTA TTTTAAAGGT CGGATTACTT TCCAAAGGAC CAGGAAAGAA 240
ACTTTGCCCG CTTCTTTGGA AAAGCCGGTT GAGCAAATTA TTTGCACCGC CATGCTCGTC 300
AGGAGCGGGG ATGTACTCAG GAGTTTTGTA GCCCCAGTAC TATGCCACCT ACGCGCACGT 360
AGGCCTACCT GCGCTCGAAC CGATCTATGC GCGTACGGCG GAGCTTGAGT CTACTCTGCA 420
GGATTTACGC GCCAAGCGTG ACCAGCTCTT GGAAACATGC ACATTTGGTT CAATTCTTGA 480
AAGAGTTGGC CTCCAGGCGA AGAGCGCGGT TGTTCAGCGC AGGATTCGCG TGCTCGAAGC 540
AAAAATTCAA AAGATTATTA CGCTCTGTAC CCCGGATGTC ATTGCGCATC CGGACGTCGA 600
GCGCATGTAT CACGCAGGCG AGCTTTCCTC CGCACTCAGT GCCGCGTACG CACGGCTCAT 660
ATCCGACCGC GGCGTTTACG CGAGCAACCT TCAACATAGC CAGGAGCTTA TGGATGAGCA 720
AGAAGCACTC GACGCGCGCC TGCGCCCTTG ACTGTGGTGC CAAGCCGCTG AAGCGCGTTG 780
CGGCGTTCAC AGCGCAGGTC AGTGAACTGG ATGAGGATAT CAATGCGCTG TGGGCGCATC 840
GGTGCTGCAT ACGCAAGTTG TTTCTTTACC GAGGAAGGAT TTGnTCAGCC TCCTTTaTCT 900
CAGAAGACAA GACCGACGGT GCCCGATGAA CTCAGCACGC TGTTGCGTAC CGTGGCAGAA 960
GCGCGGATGC gTAGGGCACG TGCAGGGTAT CAGGTAGAGT GCGCCAAGCT CCGTCAAAAG 1020
CTTCAGTCAG AGCAGCGTGT GTGCGAAasT TTTGCAGATC AATCGaGGAA TATCGACGGG 1080
GGATcAAAGA GTACGAGGCG ATGATCGAAT CGGCGCACAG AACGTTGCGT TAAGCAAAGC 1140
CACGGTAGCG CGTCTGGCGC AGTCATTAGA GGAGGCGTCA GAACGCCTTA CCCTATTCGA 1200
AACATCGCCG GAACCTATTG TTCTCTCTTC GGAAGTTCTG TCTGTCCCCC AAGAGAAGGC 1260
GAGTGTGTAG GTGCTCATGA GATAGAGCTC TCCGTGTCTT CTAGACGGGn GGGGGGGGGG 1320
TGAGGTAGAA GTGAGAGGAG GGGGAGTGAG TGGGCAGGCA GGTGATGCAA GCGGGGGTAC 1380
TTGCGGGCAT GGTATGTGCT GCTTCTGGTT ATGCAGGCGT ACTCACTCCG CAGTCAGTGG 1440
CACAGCCCAG CTCCAGTGGG GCATTGCGTT CCAGAAGAAT CCACGCACTG GCCCGGGCAA 1500
GCACACCCAT GGGTTTCGCA CTACCAATAG TCTGACTATT T 1541
(2) INFORMATION FOR SEQ ID NO: 567:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 567:
CTGTGCTGCG CACCGTGCCA GCTGAGCGTC ATACCCGGAT CATTTTTAGA GAACTGCTTC 60
TAGGACTGGT GCTCATGCTC TCCTTCCTTT TTTGCGGAAA AGTTTTCCTA TCTTTGTTCC 120
AGCTAGAAAC GGGAGTAATG AAAATGGCCG GAAGCGTCAT TCTCTTTCTC GTTGGCATCA 180
AGATGGTATT TCCTGATCAA CACGCGCTCC CCTCCACCAC AGAAGAGGAA CCGTTTATTG 240
TTCCCATCGC CACTCCCATG ATCGCnGTCC TTCGGCGTTC ACCACGCTGG TAATTATGGG 300
AGAGACGAnG GGACATCCCG TCTCGCCACC TGTGCTGCGC TGCTTGTTGC GTGGACGCTC 360
GCGTGTCTTA TTATGATAAG CGCACCGTGT CTATACCGTC TTCTTAAAGA AAAGGGAATT 420
ACCGCGCTGA GCGAATCACn GTATnTGCTG CTCATTCTTC CATCCAGA 468
(2) INFORMATION FOR SEQ ID NO: 568:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 507 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 568:
CCTGACATGG TGCACGGATC TGGAAGAGGC GCTCTATCAT TGTGGGGCGC TTACTGGAGA 60
ATGCACAGCG AAGATATCCT AGATGCGCTG TTTGAGAAGC TCTGTGTGGG AAAGTGACCT 120
GCAGTACGGG AGATACGGGC GCGTGTGACT GATTAGTTAT ATTTCTTGGG GATGCAGCGT 180
CAGGTAACTT TTTTCGGTAG AAAGGTGGAG GCGCGCGCCG TnCTGAGTCT ATGTGCACGC 240
ACGATCCTGC GCGCAACGTT CCGTGAATCA TGCGCACGCA AGCTCATTTT CAATTTCTTG 300
CTGCAATACG CGGCGCAGGG GCGTGCGCCC AAGAATGGGT CAAAGCCGTG TTCAAGACAG 360
TAGGCCTTTG CAGCCGCGCT GTAGCCAGCA CAATATCTTA CCGCGTAAGG TTCTGCGAGC 420
TCTCATTCGC ATCTAAAATT CCTGCAGGTC TCTCGCTCAA nGGAGCAAAC ATACGCATCG 480
TCAAAGCGAT TGAGAAACTC CGAGAGA 507
(2) INFORMATION FOR SEQ ID NO: 569:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 502 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 569:
TGCACAAnGA TGCGCGGTAA CACCGCATGC GATTGAGCAG GTGTGGAACG ACACATCACC 60
GTGCAGTACG CCTTTGGnTT GGTACAGGAT GCAACGCATG TGTTTTTTTT GTACGCGCAT 120
GAGCCCATGC GnGATCCGGC TTTTATTTTC TTTTCTGGAG TTGCTTGTGG GCGTGGTATG 180
CACGTGCTGC TCTTGGCTAC AACAACGGAG GTCAGGGATA TCCATGTATT TCGCGACTTG 240
GTCTTTTTAC TTGAGGAGGA GACGTTTGAG GATTTCTTTC GTGTCGAGCA CGAGAGATTT 300
GTAAGGCAGA AAAAGAAGCG TGTCGCACGC ACTGCGCTGT TAGAGCGCGG TTATCCATGT 360
TTTGAAGAAA ATTCATCGCG ACATCATGGA TGGGAATATT GATATGTCAA CTCTTTTGGA 420
GCAGGATTAG CGCTGCTTGA AAGACGCACG CGGTACCCTG TGTTGTCTTG GCAGTGCGGG 480
AGGTCAGGAT GAGAGGCAGC GC 502
(2) INFORMATION FOR SEQ ID NO: 570:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 570:
GACGAATGCG ATGCCGACGG GTGATGTAAA TGCGATAAAG CCGGCTTATC TTAAGCAGTT 60
GCAGGATATT GCGTGGAAAC TGGAGGATCA CAGCCGAGAG ATTCGGGAGG TTCGCTTTAC 120
TATCGAGGCG GGCAGTTTAT GGCTTATTGA GCAAAAACCT GTCGAAGCGA AGAGCACAAT 180
CTCTTTGGTA CGGTTGCTGC TCGACCTGTA CGAGCGCGAG GTGGTGGATG CTGAATACGT 240
GGTCAAGTCG GTAAAACCGG GTCAGCTGAA CGAGATTTTG CACCCGGTCA TTGATATGAC 300
GAGTGTGACA GGTTTGAAAT CCTCGCAGGG GGGATTATTG GTGTTCCTGG TGCGGCGGTT 360
GGGCGAGTGT ACTTTACCGn CTGATTCCCT CATGCGAGGA CGTGGACGTG TGGACGAAGA 420
TGGGCGGACA AGATACACGG TGTATCTTGT GnTATGCCTG CAACGnACGC GGGGGAnGTT 480
AAGGGCAATT GAGGTGGCAA CTGGTGTTCT TTCTAACGAG GGGGGGTACT 530
(2) INFORMATION FOR SEQ ID NO: 571:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 521 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 571:
nTCTTTTTnG CATACGGGGC ACGGGGACTC TGTGTGCCAT GTCCGTTTTT TGTCTACTTC 60
TTTCCTTTGG AAGGCGCTGT GTGGCGGCGG ATAATTTCCT TTCTTTCCTT GTGTGGAATC 120
TGGTTCTTGC CTTCATCCCC TGGCTCATCT CGGCTATCTT GCACGTGCnC nGGGGGGGGG 180
TCCGGGGGGG GGGGnGTTCC TTATGCTGCT CTGGCTATTG TTTTTCCCCA ACGCTCCGTA 240
CATCCTTACC GATATTATCC ACTTGGGAAA GGGTAAGTCA TTTTTGCTTT ACTATGACCT 300
TATTATTTTA CTCGCCTATA GTTTCACTGG TTTGTTCTAC GCGTTTGTCA GCCTTCACCT 360
TATTGAAAGC ATATTAGCCC GTGATTTTCA TATCAAAAGG CCATCATAAT TTCAGTATTT 420
GAATTGTATC TCTGTGCATC GGTATATATC TGGGGCGTTC TGCGCTGGAA TTCCGGGACA 480
TGTCCTACAG GACGCACTAA TCTTTCTGAA TnTGGTATCC G 521
(2) INFORMATION FOR SEQ ID NO: 572:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 520 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 572:
AGAAGTGGTC CTGCAACTTT ATCCGCCTCC ATCCAGTCTA TTAATTGTTG CCGGGAAGCT 60
AGAGTAAGCT TGCTGCACAC GAGGGGCGCA AGGCGGCGTG CGCACCGCCT ATTCCCTGGG 120
AGCAGCTCAT GTGCCAGATG CGTGCACAAT CCCGCGCGCA CACCGTCGGC GAGCTCTTCT 180
CTCCGTGGAA ATCGTAATGT TCATTGTGTT CGAGAGAAAT AACAACCCCG CAGTCTATGA 240
GGGGGTGCCG TACTCAGCAG GATTTTGGTA TATGTGCTCA AGCGCGATCT GTGCTTGTGC 300
AAACAGCATG ATACTTTCTG TCGCAGTCTG ATAACGCTTC TCAGCAACGA CACGCACAAT 360
GCCGCAGTTA ATGATGAGCA TTGCACAGGT GCGGCACGGT GTCATGGTAC AGTAGAGTGT 420
TGCGCCCTCT AGACCGATGC CCAAACGCGC TGCCTGGCAA GGGCGTTTTG CTCTGCGTGC 480
ACGGTGCGAA CGCAATGCTG CGTGCACGTC CCGTCTTCAT 520
(2) INFORMATION FOR SEQ ID NO: 573:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 573:
GCGGGTTTAT TAGAGTTGAT TGTGCAGGAA ACGCATACGA TTCATATTAG AATTTCGAGT 60
TTGTACCCAG AAAGCGTAAC ATCTGCTTTT TTGCGTGCTA TTGCGCACAC GCGCGTGTCG 120
CCTCATTTTC ATTTATCGGT TCAGTCGGGC AGTGATCGCG TGTTACGACG CATGCGACGC 180
GCTTACACAC GTGCGGACAT TTATCAGGCA GTTTCCGATT TACGGAGTGT GCGTGAAGAA 240
CCCTTTTGGG TTGTGACATA ATCGTCGGCT TTCCAGGGGA AACAGAGGAA GATTTTGCAG 300
ACACCCAGCG TATGTGCAAA ACTTTGCGTT TTGCAGGTAT CAGTATTCCG TTTCTGCACG 360
CCCCGGTACA GAAGCGTTGC TATGGATGCn AAATGCCTCA GCGTATTGCA GGAGAACGCG 420
TGCTGCATGC ACAACTGGCA GAGAAAAACT AACGTGCCGT ATTGGAATAT GGGAAGGGAG 480
GAACTAGTGC GGTGGTAnAA CATCCGTCGC ACGTGnTTTG ACAGAAAATT AAT 533
(2) INFORMATION FOR SEQ ID NO: 574:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 562 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 574:
TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCGGCT CGTATGTTGT GTGGAATTGT 60
GAGCGGATAA CAATTTCACA CAGGCAATAT TCCCATCTTT CGTTACATGC GCAGTAAAAA 120
AGAGAACAGA GTCCCGTTCC TTTACCCACG CTATCAACTC ATTTGCGCAA TATTTCAGCT 180
GATTGATAGT CATAGGAATG GCACCTGCTT CGGGGGAAAA AACTGTCTGA ATCGAATCAA 240
CAATAACGAA GGTAGGGCAT CGTGTATTTA AAACACGCTC GACATCCTCG ACCCGCGTCG 300
CACAAAGCAA CTCGATGTTC TGAATTGGAA TATTCAGCCG ATCCGCACGC CCACGAATTT 360
GCCCCGGAGA TTCTTCACCC GAAACATAGA GAACCGATTT CCCGCAGCTG CAGCGATTTG 420
TAACAGTAAT GTAGATTTAC CAATGCCCGG TTCCCCGCCA ATCATGATCG CGGAGTCTTT 480
ACGGCGCCTC CGCCGAGGAC ACGATCGAAC TCTGCGATAC CACAACTAAT ACGCTGCGCA 540
TCCTGCGCGC GCACAGCACA CA 562
(2) INFORMATION FOR SEQ ID NO: 575:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 575:
GTGCAACTTT ATCCGCCTCC ATCCAGTCTA TTAATTAAAT CTATCGTTGA AGAGGTATCT 60
GTACCGTGAG CCTCTGTGCA AAAGCCGTAG TTGCGCGCGT GTGTGACTCC TGCCTAAATG 120
TTCGCACAGA AGAAAGCGTG TCGGTAACGT ACATGCGAAT GATGTCTGCA GCCTTTTCCA 180
TCTCTTTCTC TCGGTAGAGC AGCCATTGTG CATAATGCGT GTGCTTTCTG GAGAAATGAA 240
CGGACTCAAA GCGGTTGAGA GAAATAGAAC GCACAATCTT TTCGCAAACT CGGATCAGCG 300
TGCGCACGTC GGTCAGGTCG AGGGGCGCAA CCGCGGCGGC GAAAAAATCA ATAATTTTCT 360
TTATTTCTTG GATTTTCTTC TGAGCGAGCT CTGCTAGGGC ATCGGCAAGC ACTTCCTTAA 420
ACCGCACGTC CCTTTTGCGA TAATAGGACA TGAGGAGTAC CCTACGCTCC TTCTGAG 477
(2) INFORMATION FOR SEQ ID NO: 576:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 569 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 576:
TTCTTTAGCA CCTCCAGCGC TTAGCCTGAA AATATGAAAC AGGTCTGTGC ACCGCACCAC 60
TTGCGAGAGT CCTGGGTGAG AGTGAGGAGC AGGGnCCCGn AGGGGGGGGG AGGAGGACAG 120
AGCAGCCTGA GCAGCACTGC CTGCAGAAAC CCACAGGCTG CAGCAAAGGC CAATACAAGG 180
TTTACTCAGG GCCACATCTG AGCAGCCCAG GGCCAGGCTC CCCAGATGGC CACAGCGGCC 240
AGGCcTGGnC CAGGCCAAAG CCAGGAGCCA GGAGCTGCAT CTGGGTCTCC CACGTGGGTG 300
ACAGGGGCCC ATAGACTTAG GCCATCTTCC CCTACTTTCC AAGGTGCATT AGCAGGGAGC 360
TGGATCAGAA GTGGAGCAGC CAGGACTCGA ACTGGCGCCC ATATGGGATG CCGGCACTGC 420
AGGGTGCCCT GGAGAGCTGC ATCTGAaCGC CCTCACACAG GGCTCCGCAG GGGTTTCTCT 480
CGAATGCTTT GCCGGGTTTA TGGGGATGTG TTTGTTCTCA CTGCCAGTnG GAnCCTGAGA 540
TCCCCGGCCT GCTGTGCAGG AGCTCCTGC 569
(2) INFORMATION FOR SEQ ID NO: 577:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 602 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 577:
TGTTTTCTCA GGTTTCTAAA GCTTCTTCCT AGAAAACTCG AATGTGTGGA GGATTTGACT 60
CCAGGTGGAA CCAATTAGCG TTTGGCAGCT AAAAACAAAA ACTTACATGC TAAAATGCAT 120
TCAAAACCGT AAAGTCCATA GAGAATGTCC AGAAAACACA AACACAGAGG CAGTAGCAAG 180
ATCTGGGATT GAGATAGCCA CAACACACCA GATAGTTTTG TTTTCATTAA GGAGTATCTG 240
GACAAATTGT TGTAGTTTTG AAGTGAAATT TAACCAAAAA ATCACCGTGA AAGTGGTTTT 300
GGAGAAAAGC ACAATCTTGC TGTTCAGCAA ATGCATCCAA TGTCATGTTT CCAAATACAA 360
ATATCATGTT TTCTCAGGTT TCTGGAGCTT TCTTCCTGGA AAACTCGAAT GTGTGGATTA 420
TTTGACTCCA GGTGGAACGA ATTAGCGTTT GGCAGCTAAA AACAAAAACT TACAGCTAAA 480
ATGCATTCAA AACCGTAAAG TCCATAGAGA ATGTCCAGAA AACACAAACA AACACAGAGG 540
CTGTAGCAAG ATCTGGGGAT GAGATAGCCA CAACACACCA GATAGTTTTG TTTTCATCCA 600
GG 602
(2) INFORMATION FOR SEQ ID NO: 578:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 587 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 578:
GTGTTTTCTG GACATTCTCT ATGGACTTTA CGGTTTTGAA TGCACTTTAG CATGTAAGTT 60
TTTGTATTCA GCTGGCAAAC GCCAATTGGT TCCACCTGGA GTCAAATCCT CCACACATTC 120
GTGTTTTCTA GAAAGAAGCT TTAGAAACCT GAGAAAACAT GATATTTCTT TCTGTAAACA 180
TGACATTGGA TGCATTTGCT GAACAGGAAG ATTGTGCTTT TCCCCAAAAT CACTTTCACT 240
GTGATTTTTT GGGTTAATTT TACTTCAAAA CTACAACAAT TTGTCCAGAT ACTCCTGGAT 300
GAAAACAAAA CTATCTGGTG TGTTGTGGTT ACCTCATTCC CAGATCTTGC TACAGCCTCT 360
GTGTTTGTGT TTTCTGGACA TTCTCTATGG ACTTTACGGT TTTGAATGCA TTTTAGCTGT 420
AAGTTTTTGT TTTTAGCTGC CAAACGCTAA TTCGTTCCAC CTGGAGTCAA ATAATCCACA 480
CATTCAAGTT TTCTAGAAAG AAAGCTCCAG AAACCTGAGA AAACATGATA TTTGTnTTTG 540
GAAACATGAC ATTGGGATGC ATTTGCTGAA CAGCAAGATT GTGCTTn 587
(2) INFORMATION FOR SEQ ID NO: 579:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 703 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 579:
CGCGCTTGCA GTTCAGGCTC AGTTTGCGCT GATTTTCTCT ATCACCACAG ACATAAAGGC 60
CCAAGAAGAA GTGATCAGCC AGTCGATGGT AGAACAGACG AAAGACAGTG TGCACGTGTT 120
GCACGCTATA CAGCACATCA CCGAGATCAC GCGCACGTTG CAGGAAAATT CGGGCGCCAT 180
CTTGGACAAC AGCAAGCACG TAGAGGAGgG CGATGCTAGC CCTTTCGCGC ATCACGTCTG 240
AAATCGACAG CAGCGTGTCG TCCATGCACA AAAACTCAGA ACAGGTTAAA AAGTATGCTT 300
CCTCAATCAC TGAAATCGGA CAGAAGAACA AGGATTCCAT AACGGACCTA GTCACTGAAT 360
TGAGTAACAT GCGACTCTAG AGTCGCGGGG GCGCCTGTTA CCCTTCAGCT GCCATGCGTC 420
TGCGCACTTC GTGcaGGTTA GGCGTGTGTC CTTTCGCGTA GAAAGGTCGC GCAtACGAGC 480
AGGCAGTCCT CCACCACGGC GTTGCGCTGT TCAACAAAGA TCCCCCACCT AATCCCCTTC 540
TTCTCTGCAA AGGCGTACTG TTGGCTCAAc TTCCGCGGAT CAGGGAAGAC TTCCGTCGCC 600
ACCTGCACTG CAAAGTATGA ACACAGCTTT TGGTACACAT CCATGAGCGC ACTAnCCTGA 660
CAGAAAGATA AGCGCCTGCA CAAAACAAAC GTGCTCTCGG GAC 703
(2) INFORMATION FOR SEQ ID NO: 580:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 433 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 580:
AGTCAnTTCA ACGTGTTCGG CCGTGGCATA CAGCTCCCTG GTCTTCCTGA TTATCGCCCT 60
CGTGCGCAGG TTTCGCACAG GCCGGAGAAG ACAGGCTCCG GCGTACGCAC GATAGGCGTT 120
TTGATAAGTG CGCGCAGTGC ACCACCGCCT GAAACGACAA TGAGCTTCCG TGAGCGTCTT 180
CGTATAGGTA CCGTTGAACG GAACGAACGA ACCGCCCGAG AAGCTCTATG TCGGGCGTCT 240
CAGGCGCAAC GATGGAACCT CCAAGTGACA GAACGGTGAC CATGAAACCC TCTCGCCGGC 300
ATCGTAACGC AAAGAGACCC TTTGGATCCA GGCCCTGTGT GTATCTGGCA TTGCGTCCCA 360
GCGTGCACGG GGCGATGGAG TGTTCTACAC GGGCGACACA GACTCCTAGT TCTTGTATTC 420
TGTGCAAAAA CCG 433
(2) INFORMATION FOR SEQ ID NO: 581:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 452 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 581:
GATACGCAGC TTGCGACGGT GAGCTACCGC ACGTACACCG TACCTAAGGG AnCATACTAT 60
TAGTGCCATT GCGCTGCGCC AGGTACTCAA GCATATGGGG ACGCTGCTGT CGGTGAACGG 120
AATTTCAAAC GCGCGCAGAC TATCGGTAGG GGATCAAATT ACTATTCCGT CCATGGATGG 180
ACTCATGCAC ACGGTACAAA AGGGGCAGTC GCTTAnTGCA ATTGCCAGTC TCTTTCGTTT 240
GCCCCTGAAT ACGTTGCTGG ATGCGAATGA TTTAGTCAnT CGTGCATTAA CAnTTGGACA 300
GCGnTTGTTT ATTCCGGGTG CAAAATTATC TGCTTCnGAT TTnAnGAAGG TGTTGGGGGA 360
GTTATTCATG TATCCAATTC GCGGGCGGCG CACCTCTGGG TTTnGGTACC GCTCAGATCC 420
CTTTTCAGGC AAnAGGAGCT TTCACAATGG GA 452
(2) INFORMATION FOR SEQ ID NO: 582:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 582:
CGCGATCCGC ACGGnCAnGG GGCAGAAGCC TGCACTGGTG GTGTCGGTGC GCGATTGCTC 60
GGTGGTGACT TCTGGTGCGT ACGAGCGTTT CTTTGAGCGT GACGGGGTAC GCTACCATCA 120
TATCATCGAT CCGGTTACCG GGTTTCCGGC ACACACTGAT GTGGATTCTG TGTCTATCTT 180
TnCACCCCGT TCCACAGATG CAGATGCGCT TGCTACCGCC TGTTTTGTAT TGGGGTATGA 240
GAAAAGCTGT GCGCTCTTGC GTGAATTTCC CGGTGTTGAC GCGCTGTTTA TTTTTCCTGA 300
CAAGCGCGTG CGCGCAAGTG CAGGGATTGT CGATCGCGTG CGTGTGCTCG ATGCACGTTT 360
CGTGTTAGAG CGTTAGGACA GCACGTGTGC TGTTCGTGTG TAAAAAATGT GGCGGATGTC 420
CTCATCAGGT GT 432
(2) INFORMATION FOR SEQ ID NO: 583:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 583:
CTGTGCTTAT CAGCACGCCA ACTAAAAGTC CAACGATGAC GCACGTACGA CACTTGACTT 60
CTCGTGCGTA GTGAAAAAGA AGCGCGCCGT ACATGCATGG CAGCGCGGTA AAAATCCAGC 120
GTAACGGCAC TCCCCACACA GCTTCTTCTT TGCGAAAAAC TGCGACGATA TTCGGTCCAG 180
ACGCAAAAAA GCAAGCGCTG AGCACTGCCA CCGTACAGAT CGCAGAGAGG AAGGAAAGAA 240
CGCGGTGCAT CGGTCTGTCC ACGTCGCACG AGAACAGGGT GACACTCAAG TGTTTACGTT 300
CACGCGCAGT AAAATGCCTG CAACACAGGA AAACACGAAG CTAACTGTGC AACCGCGGCG 360
TTCAnAnGAG GAGAAAAAAA GGAAATGAAT CTnCGTGCGC GGAGAAAACC TCGGAACCTA 420
CGCCCCGCTA AGAAA 435
(2) INFORMATION FOR SEQ ID NO: 584:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 434 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 584:
TGCAACACAC CCCGAGCACG CTCGGCGTTG AGCTTACTTG AGAGCGnnGG CGTTGCACCC 60
AGAAAAGGAC AGAAGGAGCA GCAGGCTGCG TGGACGCACC GCGCACGCGC GCATCCCGCT 120
GTGCAACAAG CGGAGCACGG GTACGAGTAG CGGGAGGCGA CTTGGCAGAC GCTCGGTCAC 180
TCCGTGCATG CTTTGACGCA CCCTTTGACT CTGCAGATTC CCGAGTGTCA GAGGCAGGAG 240
AAGTTTTCCT GTCCTGCGCC GCGTCTGTGC GCTCAGCACG CGCCGGAGGA GAAACATCAA 300
GAnTTTTCGC ACGAGCGGTA GGGACnTCCT TTACCACCGT GAGnTCAGGA nTTGCCCTCT 360
GTGTTGGCGC AGGAGTnTTC CCCAGCTCAG GAnTTTTTTC AGGnTTTTTA GCCATAAGCT 420
CGGnTCCATG ACGG 434
(2) INFORMATION FOR SEQ ID NO: 585:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 427 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 585:
TCTCTCATAC TGCTCnTACT CCTCTCTCAT CATTGTCTCT CCCCGCTGTA CATCCGCGCG 60
GTGnCCGCAC TGCCAGACAC CCTTCCTCCT GTGAGGCTGG TAATCGGCGT TGGAATAAAA 120
GGGCAGGGTA TCGCCTGCTT GCATTCTGCC GCCCTTCCTG AGAAGAAGGC GCGCATCGGC 180
AGTTCACTGA CTACCCTTCC GGCAGCCTCC GGTGCATCGT GCCTCACCTT TTTTACCCGT 240
GGACACATAC CCCAATTGCG CATTTCAAAA AGTCCGTTGA ACAATCGTTC GTCGTTTTCT 300
TACACGCAGA TGTGCAACAA CTACGAACGC AAAACATCAC GTGGCTTGGA TCCATTCGGC 360
GGACCGACCA CCCCCCTTGC TTCCATTCTT CGATTAGGCG CGCGGCGGAT TGTAGCCTAT 420
CTCAATT 427
(2) INFORMATION FOR SEQ ID NO: 586:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 430 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 586:
CGGTAACACG TGGTTTGCGG CGCAGGCATT GGAGTAATTC ATCGTACAGG AAGTCCGGTT 60
TTGCCCCAAT TTTTTCAATG AGCGGAGAGA TAATCCCGTC TTTCTGGGAA AGTAGGGCGT 120
GGAGTAGATG TTCCTCCTCA ACTTGACCGT GGTTCTCCGC TTCTGCCAGA GATATGGCGT 180
CATTGAGCGC TTCGCTTGCT TTGACTGTGT ACCTGTCTGT GTTCATGGCG TGATTATAGG 240
TCTTTTGAAC GCTTTTTTCT CGTCATCGGT ATGTTTTTTC TACCGCTTGC AGGGGACTTA 300
CGGGAGTAGT CGCGGTGGAG AACAGGGGTG TACATGGTAT GCGGTGCGCT TTGGCAGGCC 360
GCGTAAGGCG TACCTTnTAT ATTTTCTGTT TTGAATAGGC TCCGCGATTG GGAGTTGGGA 420
ATAGGAAAAA 430
(2) INFORMATION FOR SEQ ID NO: 587:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 439 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 587:
GnCCCCATCA GGGAAAATGC AGGAAGTGCA GGAAGCGGCG GCAGGTTCGG CGGACTTCCC 60
TGCGGGGnTG TGAAGATTTT GCACAACCTT ACCGGTAGTT TAGCAGGATT AGGCTGAACT 120
GCCTCTAGCG CGTTTATGTA CGTATCCCCC GAGATTCCAG CGCGCTTCGA ACTCCAGCCG 180
TTACTGTCTG CGTCGCCTGT TGCACTACCT GGGTTACCCA GGCTTCCTGT TTTTGACTTT 240
CTCCCTGGAA AGGTTATTTG AAAAGGGCGG TGAGTTCACT CTGCGCCCTC TGTGTGCGAT 300
TTTGAAAGTC CTGTGCACTC TGGTGTTGGT TACCGGCGTC GAAGGGCGAA AGGAGAAAGC 360
GGAAACCGGG CGCCTGGTTT CCGAGGGTGA AGTCGGCCCC CTAACATTCC ACAGCAGTTT 420
AATCTnGGCG nGCGTTGTT 439
(2) INFORMATION FOR SEQ ID NO: 588:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 558 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 588:
TTCACCATCG AGGTGGAGCG CTCCTTGCGC GTTTTAGACG GTGCCGTCCT CGTACTCTGT 60
TCGGTTGCAG GCGTCCAGTC CCAGTCCATC ACTGTCGACC GGCAGCTCCG CCGCTATCAC 120
GTGCCCCGTA TCTCATTTAT CAATAAGTGT GATCGTACGG GTGCCAACCC TTTCAAGGGA 180
GnACTTGTGG CGACAGCGCG TGCAGCAACG TGAAAAACAG CGCTCGTAAG CTCCCCTGCG 240
ATCTTATACA TATTCGGGAT CATAAGCAGC AATCCCTGAG AAGCAGTAAA AGTAGAAGAG 300
AGCGCCCCCG TCGTCAGTGC GCCATGAACA GCTCCCGAAG CGCCTGCCTC AGACTGAAGT 360
TCTACAACGG TGGGAACGGT ACCCCAGATA TTTGTGCGCC CCCGTGCGGA ATATTCGTCT 420
GCGATTTCTC CCATAGGACT GGAGGGAGTG ATAGGGAAGA TAGCAATGAC CTCACTAAGC 480
GCGTGAGCAA CGTGCCCCAn TGCGGTGTTA CCATCCATCA TGACGAGGTT CTTCTCAGAC 540
ATACGAnCGT CCTCTCTC 558
(2) INFORMATION FOR SEQ ID NO: 589:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 392 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 589:
TAATTCCCGA ACAACnGTGC CAATCGTACT CCGCAATATT CACCGACAGC AACGCTGTCC 60
CCTTCTTnCC ATTGCTTCTA GTAAGCCCTT AAGCCCAGGG GAGAATCTCG CGCTGGTAAT 120
CGTGCAGGGC TTTTCCCTTA ATTCTCAGGC CATGATCCAA ATCAGCGATG TCCGAGAGAG 180
TAAAGCAGTG ATAAAGACTT TTCAAGAGAG CAGAAAAGTC ACTGACCATA GCGTAnTCTC 240
TGCAAGATTA CCGCTATGCA TGTTTACTGC GATTTCTTGT AACGCAGCTG TGCGTGCAAG 300
GAGAGTGGTA CTAATCTCAC GCAAGAGTGC ACGCACTCTC CCCCTGAAAG AGTGGTAAGA 360
TTAAGGGTGA CGTCGGTATC CAAGGATTGG CA 392
(2) INFORMATION FOR SEQ ID NO: 590:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 507 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 590:
CCGATAnCAT TACCTGGGAG GGGGATGCAC GCATTGTGCA GGCAGCGCGT GTTTCTTACG 60
GTGCGGGGAC TAGGACTGCG CGTGACGATG CGGCGCTTAT CGATTTTCTT TTACGCAATA 120
AGCATACGTC TCCTTTTGAG CAGGTGGTCC TTACCTTCCA TGTACGTGCA CCGATTTTTG 180
TCGCGCGTCA GTGGATGCGG CATCGCACTG CTCGCATCAG TGAGGTGTCT AGTCGTTATT 240
CGCTTCTTAG TCATGACTGT TATGTTCCGC AGAAACTTCA GTTGCAGTTC AGTCCACGCG 300
TAACAAGCAG GGCCGCGCGT CCGAAGTATC TCTCCTGAAC AGCAGCAGGA AGTGCGGGCA 360
GCGTTTGAAG CTCAGCAGAA AGCGGCGTGT GCnCTTTACG ACGCATTGAT CAAAAGAACA 420
nCGCGCGGGA GCTAGCGCGT ATTAACGTGC CGCTTTCGCT TACACCGAGT GGTATTGGCA 480
GATTGATTAC ACAATCTTTT CATTTTT 507
(2) INFORMATION FOR SEQ ID NO: 591:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 591:
TCCACCTGGA GTCTATACGC CACACATTCC GTTTTCTAGG AAGAAgCTTC AGAAnCcTGA 60
GAArACaTGA TATTTsTwTy TGTAAACATG ACATTGGATG CATTTGnCyG AAyAGGAaGA 120
TtGTGCTTTT CTCCmAAACC wCTTTCACGG TGATTTTTGG TTAArTtTCA CCcGAAwTCC 180
ACAAsAATTT GTCCAGATAC TCtGGATGAA AACAAAACTA TCTGGTGTGT TGTGGCTATC 240
TCAATCCCAG ATCTTGCTAC AGCCTCTGTG TkTGTTTGTG TTTTCTGGAC ATTCTCTATG 300
GACTTTAmGG TTTTGAATGC ATTTTAGCAk GtAmsnTTTT TGTTTTCAGC TGGCAAACGC 360
TAATTGGTTC CACCTGGAGT CwAATmCkCC ACACATTCGA GTTTTCTAGG AAGAAAGCTT 420
CAGAAACCTG AGAAAACATG ATATTTCTTT CTGTAAACAT GACATTGGAT GCATTTGCCG 480
AACAGGAAGA TTGTGCTTTT CTCCAAAACC wCTTtCACGG TkATTTTTGG TTAAATTcAC 540
ycGAAwwCyA CAACAATTGg TCCAGATACT CCTGGATGAA AACAAAACTA tCTGGTGwkT 600
TGTrGCTATC TCAATCCCAG ATCTtGCTAC AGCCTCtGTG TGTGTTTGTG TTTTCTGGAT 660
ATT 663
(2) INFORMATION FOR SEQ ID NO: 592:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 409 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 592:
AAAAAATAAC TGCGTGTCAA AACCCACACC CTCAAAACGG ACCCGGCGAG CAGCGCCACG 60
CACTGCACTA CCCGCCTGTC TTACTGAACA AAGAACGCCC TTATGCGCGC GCACCCTCGG 120
ACCTTCTGAA CGTCCCTACC ACTGAGGCCC CTTCCATTCA AAATCATCCC ACTCTGCAAC 180
TAATGGCTCG CTCCAGAACC TTCCGGAAAC CCCAGTTTGC TCATCAATAT GTAGCGAGTA 240
CTCATCATGC GAGGGTGCAG CGATCTCGAA ACGTACCTTG TACGTACCAA GCCCCTCTTC 300
AAACTTCACT TCGCCCCATA ATGCGGACCG TCCCCTGCGT TCAGGGGGGA AACATCAACT 360
TTGCAACTTC TCAGAGCCAT GTTTTGGGGG GAAGCAAAAA ATTCGGGAT 409
(2) INFORMATION FOR SEQ ID NO: 593:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 521 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 593:
GTTCGGCAAA TGCATGCAAT GTCATGTTTA CAAATACAAA TATCATGTTT TCTCAGGTTT 60
CTAGAGCCTT CTTCCTAGAA AACTCGAATG TGTGGAGGCT TTGACTCCAG GTGGAACCAA 120
TTAGCGTTTG GCAGCTAAAA ACAAAACCTT ACATGTTAAA ATGCATTTGA AACCTTAAAG 180
TCCATAGAGA ATTTTGCAGA AAACACAAAC ACACAAAGAG ACTGTAGCAA GATCTGGGAT 240
TGAGATAGCT GAAAATCACC AGATAGTTCT GTTTTCATCC AGGAGTATCT GGACAAATTG 300
TTGTGGATTT TGGTTGAAAT TTAACCAAAA ATAACCGTGA CAGTGGTTTT GGAGAAAAGC 360
ACAATCTTTC AGTTCGGCAA ATGAATCCAA TGCCATGTTT ACAGAAAGAA ATATCATGTT 420
TTCTCAGnTT CGGAGCTTCT TCCTAGAAAA CTCGAATGTG TGGAGGATTG ACTCCAGGTG 480
GAACCAAAGA GCGTTTGCCA GCTTAAAACA AAAAGGGTCC T 521
(2) INFORMATION FOR SEQ ID NO: 594:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 487 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 594:
TGCTGGGTGT TCTCTGTTTG TGTCTGCCGC TTCCTACGAC GACAATGAAT TTTCTCGCAA 60
GAGTCGTGCG TACTCGGAGC TTGCAGAGAA GACCTACGAT GCGGGGAGAG TATGACGTCT 120
CTGCAGAGTA CGCCCGGCTC GCTGAGGGTT TTGCGCAAAA ATCCTCGGTC TACATCAAGG 180
GAAACTATGG GCGCGCACCA ATGCCGAGGG ACGCTATGAA CGCTGCGGGC ACCCGCCCAA 240
GCGTGGGGCG AAAAATTGAA GCGCATCGAn TGGCGCTATC CGACCGAGTA ATTGCTCGCT 300
AnGCGAnGGC TATCAAGACC GGAGGGCTTC GCTTTTTGAC AnCCAAGCAG TACGACGTAG 360
CGCTTCACGT GGGGCGCGTn AAGGCGTTnG ACGCACTCCA AAAACGTAAA AnCTGAAAAT 420
TCATTGCTTG CCAAAGGCCG CGAAGGAAGA AGCTGCGCGC CAAGnCGCCG AAGCACGAAA 480
ACTCCGA 487
(2) INFORMATION FOR SEQ ID NO: 595:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 377 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 595:
ATCnGTGTGC GTGCCGAGGA TGCACCAGGT GGTTATGTCC CCGnGAACCC TGCCTCTCAA 60
GCACAGGATG CAGCGTTTGA TTTCGATGGG GTGCACGTTA CGCGCGGAAC TAATTCTATC 120
ACCGACCTTA TCCCCGGCGT TACGCTTTCG CTGCACGAAC GTACAGAAAA AACCGAAACG 180
CTCTCTGTCA CCCCCGACGT GAACGCCATG AAGAACGCTA TTATAGAATT CGTTGCTAAG 240
TACAATCGAC TCATGGCAGA AATTAACATT GTCACCAGTA ACAAGTCAGA CCATTATnGA 300
CGAGCTTGCG TGATCTTACC CCCGAGGAGA AAAAGAAAGA GACAGAACAA CTCGGnCAAC 360
CTCCACGGGG GAATCCA 377
(2) INFORMATION FOR SEQ ID NO: 596:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 366 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 596:
CGTGCTTTCA CCTTTTCCCC ATTTnAAGAG AGTGTTAATA TCCACAATAC GCTCACCATC 60
GTAGAACACA ATGACCTCTG AAnAAGGGTA CCGCCGCGTT GTAACTCCTG nnAATACGCT 120
TCCACGCTTC CACATTCCCG TTATGAAACA ACTCGTTAGA AACAGGCACC GATATTAACT 180
GAGACATCCG AATCGGACCT nCAGAGGCTT GAGCAGGAGG ACGCGCGGTG AGGGGGGGGG 240
GGGCAGACGA AnCCCGATCA CAGGTAGCCC CTTCAGACGC CCTGGnAGCG TCCGCACCCG 300
GTTCTTTGCC GATCGAACCC TCTTGCCAGA ACCCTCAGAC TTTTTCGTAG ACACAAATGC 360
nAAAAC 366
(2) INFORMATION FOR SEQ ID NO: 597:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 953 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 597:
ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA TCCCTTAACG 60
TGAGTTTTCG TTCCACTGAG CGTCAGACCC CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA 120
TCCTTTTTTT CTGCGCGTAA TCTGCTGCTT GCAAACAAAA AAACaCgCTA CCAGCGGTGG 180
TTTGTTTGCC GGATCAAGAG CTACCAACTC TTTTTCCGAA GGTAACTGGC TTCAGCAGAG 240
CGCAGATACC AAATACTGTT CTTCTAGTGT AGCCGTAGTT AGGCCACCAC TTCAAGAACT 300
CTGTAGCACC GCCTACATAC CTCGCTCTGC TAATCCTGTT ACCAGTGGCT GCTGCCAGTG 360
GCGATAAGTC GTGTCTTACC GGGTTGGACT CAAGACGATA GTTACCGGAT AAGGCGCAGC 420
GGTCGGGCTG AACGGGGGGT TCGTGCACAC AGCCCAGCTT GGAGCGAACG ACCTACACCG 480
AACTGAGATA CCTACAGCGT GAGCTATGAG AAAGCGCCAC GCTTCCCGAA GGGAGAAAGG 540
CGGACAGGTA TCCGGTAAGC GGCAGGGTCG GAACAGGAGA GCGCACGAGG GAGCTTCCAG 600
GGGGAAACGC CTGGTATCTT TATAGTCCTG TCGGGTTTCG CCACCTCTGA CTTGAGCGTC 660
GATTTTTGTG ATGCTCGTCA GGGGGGCGGA GCCTATGGAA AAACGCCAGC AACGCGGCCT 720
TTTTACGGTT CCTGGCCTTT TGCTGGCCTT TTGCTCACAT GTTCTTTCCT GCGTTATCCC 780
CTGATTCTGT GGATAACCGT ATTACCGCCT TTGAGTGAGC TGATACCGCT CGCCGCAGCC 840
GAACGACCGA GCGCAGsGgT CAGTGAGCGA GGAAGCGGAA GAGCGCCCAA TACGCAAACC 900
GCCTCTCCCC GCGCGTTGGC CGATTCATTA ATGCAGCTGG CACGACAGTT TCC 953
(2) INFORMATION FOR SEQ ID NO: 598:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 598:
CCGAAGTAAC TGGCTTCAGC AGAGCGCAGA TACCAAATAC TGTTCTTCTA GTGTAGCCGT 60
AGTTAGGCCA CCACTTCAAG AnCTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC 120
TGTTCACAGA GATACCCTTG CGTTTTTGCC AGGCGCAGTG CGGCAATGGT GTCAGCCGTT 180
TCTCCCGACT GAGnAAATCG TCAGTACTAT TTCACGCGCG TGCACGACGC TCGTGCGATA 240
nCGGATACTC TGAGGCAATC TCCACCTGAC ATCCCACCCC TGCAAATGCC TCAAACCAGT 300
AACGCGCCAC TAACCCTGAC ATGGTACGAn GTACCACACG CGATAATGCG CACCCGTGTT 360
ATCCGTCTAA ACAGCCGCTA CAAACGTCTT ACACGAGGTA nCGTCCAAGA CnCGGTCnTC 420
CCCGAACGTC CGCACCTGTG CGCGAGAAGA CGAAGAAAGA CGACATAT 468
(2) INFORMATION FOR SEQ ID NO: 599:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 599:
GGAAGAGGCA CACCTGACAT GGAAGGAAGC TGCGCGTGCG GCAGTAGACG CAGGAGCACA 60
AGCGCTTGCG TTGCACCCGC GCACCTGCnC CAGTGTTACG CGGGAGAGGC AAACTGGGAC 120
ATAATCGCAG ACCTCGTGCA GTGCGCGCGT GGGTGGGGAG AGGTTCCCGT GTTCGGCTCA 180
GGGGATCTGC ATGCGCCTGA AGACGCACGG GCAATGTTAG AACACACCGC ATGCGCGGGG 240
GTTATGTTTG CCCGCGGTGC TATGGGCAAA CCGTTTATTT TCAGACAAAC CCGTCAAGCT 300
TTTAAACTGA AAGGATACTA ACACGCCCCG TGAACGTTTT GAAGCAAAAA GCTTAAGCGC 360
CAACTTGGCC GCGAAGCTTT CAACTTCTTG GCAACAAAGA ACGTTGGGGA AGAAAAAGCT 420
TCAAGCCCTT GGCAAAACCA AGAATnTCCG CCAAAACGGT TTTTTTGGnT TTTCCGG 477
(2) INFORMATION FOR SEQ ID NO: 600:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 600:
CTCCAGAAAC CTGAGAAAAC ATGATATTTC TTTCTGTGAA CATGACCTTG AATTCATTTG 60
CTGAACACAA AGATTGTGCT TTTCTCCAAA ACCACTTTCA CGGTGATTTT TTGGTTAAAT 120
TTCATTTCAA AACTACAACA ATTTGTCCAG AGACTCCTGG ATGAAAACAA AACTATGTGG 180
TGATTTGTAG CTATCTCAAT CCCAGATCTT GCTACAGCCT CTGTGTGTGT TTGTGTTTTC 240
TGGATATTCT CTATGGACTT TAAGGTTTTG AATGCATTTT AGCAGGACCC TTTTTGTTTT 300
CTGCTGGCAA ACGCTAATTG GTTCCACCTG GAGTCAAATC CTCCACACAT TCGAGTTTTC 360
TAGGAAGAAA GCTCAGAAAC CTGAGAAAAC ATGATATTTG TATTTGTAAA AATGACATGG 420
ATCATTGCTG AACAGAAAGA TGTGCTTTTC TCCAAAACCA CTTTCACGGT GATTTTTGAT 480
AAAATTTCAC CGGATATCCA CAAAAATTTG TCCAGATACC CCTGGATGAA CAC 533
(2) INFORMATION FOR SEQ ID NO: 601:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 430 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 601:
AAAACCACTT CCACGGTGAT TTTTGGTTAA ATTTCACCCG AAATCCACAA CATTTTGTCC 60
AGATACTCTG GATGAAAGCA AAACTATCTG TTGTTTTGTG GCTATCTCAA TCCCACATCT 120
TTCTACAGCC TCTGTGTTTG TGTTTTCTGG ATATTCTCTA TGGACTTTAC GGTTTTGAAT 180
GCATTTTAGC AGGACCCTTT TTGTTTTCAG CTGGCAAACG CTAATTGGTT CCACCTGGAG 240
TCTAATACGC CACACATTCG AGTTTTCTAG GAAGAAAGCT TCAGAAACCT GAGAAAACAT 300
GATATTCCTT TCTGTAAACA TGACATTGGC TGTATTTCCC ATACAGGAAG CATGAGTTTT 360
TCTCCAAAAC CACTTTCACG GTGGATTTTG GTAAAGTTTC ACCCACAATA CACAACAATT 420
TGTCCTGGAT 430
(2) INFORMATION FOR SEQ ID NO: 602:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 361 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 602:
GAnTTGAGCA TGTCAAGTTC TTnCCCTGAA GACGAGGGAG GGGATGGTTC TACCCTCACG 60
AGGAGACGAC GGATTTTGTT GGCAGTGTGC GCATTTCTCA TTCTGCTCGG TGGTGTCTTG 120
GTAGGTTGGG TTCTGTACAT GCACGGCGCC TCTCGTCCTG CGGTCGTGCC GTCACAAAAA 180
GTTGAACTGG CCCAGGTCTT CTGGCGGCAT GTTGCAGCGC GTGAGCTTGG AGCGTACTGC 240
GGTTGAGGCA CGTGTTCGTC GATCTCCCAT CTGAGACTGG CTCTTCCAGA AACCCACAAG 300
GGAAAAGGGA CGTTCCCCCC TGGCGTTCTT CCCGGGGGCT GAAACGGGCT AACGAGTGCA 360
G 361
(2) INFORMATION FOR SEQ ID NO: 603:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 338 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 603:
AGCGnGnAGA GGATCCCGTG TACAGTGTAA ATGGCTGGTG GTGGGATTTA CGTAAACGTC 60
GTATGCGCCT GTCACGTGGC GCACTGCTTC CTCTATGCGC CGCACGCACG CAGnAAGCAT 120
ATACCGTGAA CAACAAATGA CACTTGCATG AAAGACTACC TCCTATTCAG GACGGGTTTT 180
TTATGTATCC AAAAGCTCTG GGGAGGnAAC GGCTGGCAGT GACGGCAAGA AACTTGCATG 240
TACCGGTTAA AAAACCGTAC ACTTTTCATC CTATCTnGCT GTGAAATGGG AGCTCAACGA 300
ATTATGACCC AAAAAACTGn CAAAAAATAG TGCTGCCT 338
(2) INFORMATION FOR SEQ ID NO: 604:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 959 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 604:
CGGATnCCGA CTGCGTAATT TTGAATCGAG GAGTACAGTG ATGGAGACGT TTTTTACCTC 60
AGAGTCTGTG AGTGAGGGTC ATCCTGATAA GCTGTGCGAC CAGATTTCTG ACGCTGTTCT 120
TGATGCCTGT CTTTCGCAAG ATCCTCACAG TTGTGTTGCG TGCGAAACTT TTGCCTCCAC 180
GTCCCTTATC CTGATTGGAG GTGAAATTAG CACGCGGGCG CATATTAATC TTACCCAAAT 240
TGCGCGTGAT GTTGCCGCTG ACATTGGATA TGTAAGCGCT GATGTCGGTC TTGATGCAGC 300
GTCCATGGCT GTTCTTGATA TGACTCATCA TCAGTCGCCT GATATTGCGC AGGGGGTGCA 360
CGGTGCAGGA CTGAAGGAGT TTGCAGGATC GCAGGGGGCA GGGGATCAGG GGATTATGTT 420
TGGTTTTGCG TGCCGCGAGA CGCCGGAGTT TATGCCCGCC CCCCTCATGT GCGCGCACGC 480
GGTTGTGCGC TATGCTGCCA CGCTTCGTCA TGAACGCCGT GTGCCGTGGC TGCGTCCTGA 540
TGCAAAAAGT CAGGTTACCG TACAATACGA GGGACATCGA CCGGTACGTA TCAGTGCGGT 600
TGTGTTTTCT CAGCAGCATG ATCCGTCACC TTCATACGAA ACCATTAGAG AAACGCTCAT 660
AGAGGAGATA GTGCGTCCGG CGCTTGCACC TACAnGTCTG TTAGATGAAA ACACGCGTTT 720
TTTTATCAAT CCAACCGGTC GTTTTGTCAT GGCGGTCCCT TnGGGACAnT GGTTTnACCG 780
GGAGAAAGAT CATCGTAGAC ACGTATnGGG GAATnGGGCG CCATGGAnGA GGTCCTTTCA 840
GTAAGGGTGC ATCTAAGGnA GATCGTCTGC AGCGTATATG CGCGTATATT GCAAAAAAAT 900
TTGGCAGCCG ACCTTCTGAC GCnGTTAGTG CAGCTTGCAT ACGCAATCGG GGTACAnAT 959
(2) INFORMATION FOR SEQ ID NO: 605:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 378 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 605:
AGTATGCCCG CGCGGCAAAT GAAGACGnAA CGTGACGAAA CCCTCGTGCT ACTCAGTAAA 60
ACCCGAGACC CTGACCCGAC AGACCGnCAG CCGCAGACCG GCAGTGCAAC AACGACATCT 120
TATAGGATGG CAGGCGTACA TGCCCGTCCA CTACACGGTC CTGACCGGAC CCCAAGCCCC 180
AGCCGCAGCC AACATCAACT TCCCGGTATG GGGATGCCTC ACGCACATCG CAGCCAGCAA 240
TGTATTTCAG GGAGTATTTC TCAACATGGC CATGACCGGC ACACGACTGC GCCAGCCTCG 300
TGGGGCGTAA GAAAGACGGA GCGCAGGCAC CTnAGTCGAG GACTGCGTGA ATGTGGAAnG 360
CTCGTTTAGC TTCTCAGA 378
(2) INFORMATION FOR SEQ ID NO: 606:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 445 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 606:
CCACCCnTAT CAGTTACATC AATGGGAAAA CCATCGCACG CAGCAGCACG CAACAAGAAC 60
ACTTCCGAAT GCTCCCGGCA CATATCGTGC GCAATACCCG CGCAATAAAC GTGCGGCTCA 120
AGCTGCGCTT CCAACGCATA CCGCCGTACC AGCATGCACC CAAACTCCGC TACACGACGC 180
GAATGCTCAT GCTGCACCGT CTCGGCAGTA TACGAAATAC CCCCACGCCT AATCTCACAC 240
TCTTCAACAG AAAAATACGG CGTTGTCCCA ATTGCTAAAT GGAGCATCCG CACCCGATCG 300
TGCGCACTTG CACTTCCTTC CTTTTCTTTG AAGGGGGAAA CGAnGTAGGC ACAAACAGCA 360
CGCGGTCATA CCCGGCGTGC GGTGTACTGG CATCAGCCAA GAGCAAGTGG CCCAGATGAA 420
CAGGATnGTA CGAACCGCCA AACAG 445
(2) INFORMATION FOR SEQ ID NO: 607:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 607:
GCCATTCGCC ATTCAGGCTG CGCAACTGTT GGGAAGGGCG ATCGGTGCGG nnCTCTTCGC 60
TATTACGCCA GCTGGCGAAA GGGGGATGTG CTGCAAGGCG AT AAGTTGG GTAACGCCAG 120
GGTTTTCCCA GTCACGACGT TGTAAAACGA CGGCCAGTGC CAAGCTTGCA TGCCTGCAGG 180
TCGACTCTAG AGGATCCCCA GTCTTTTCAG ACTGTCCGCA TCATTGGGCA AAACGATGAG 240
CGCAAAGTAC TTACCCGAGA CACTCGCCCA AGAGACAGGC GTATCTACCT GTTCACGTCC 300
ATCTCCTTCA GAGCATACGT TTTCGCCTGC CAACTGCACT ACCATGAAGT GCGAAACTCA 360
TATTGTCCGC CGCATCCGCT CAGGCCCGAT CTCAGGCGGT GTGCGCAGGT ATAAnTGCTG 420
TCCCAAGTCA AAGCC 435
(2) INFORMATION FOR SEQ ID NO: 608:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 608:
CTnCGGCAAC GCATACAAAA CACCGTCGnC GAGTTTTTTC AACTGCGCGT CATTCTCCAC 60
AATTACGGTG CGTGCCTCTA CCGCACCGAG AATCTGACAC AAGATCACGC TCGGTAGCGT 120
CAGAGCCACG CGTGTACGTC TGCTGCACCA AGCGCCTGAA TACCAAAGCT CGCGTGGAGC 180
CACTCAACCC GATTGTCGGA AATCAAACCA ATGTCAATCA CCACGTACCA CACCCAATGA 240
CTGGCCAG 248
(2) INFORMATION FOR SEQ ID NO: 609:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 357 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 609:
CAGAGGCTGT AGCAAGATTC TGGGGATTGA GATAGCTACA AATCACCAGA TAGTTTTGTT 60
TTCATCCAGG AGTTTCTGGA CAAATTGTTG TGGATTTCCG GTGAAGTTTA ACCAAAAATC 120
ACCGTGGAAA GTGGTTTTGn GAGAAAAGCA CAATCTTCCT GTTTAGCAAA TTCATTCAAT 180
GTCATGTTCA CAAAAAGAAA TATCATGTTT TCTCAGGATT CTAAAGCTTT CTTCCCTAGA 240
ACACTCGAAT GTGTGGGAGT ATTTGACTGC AGTGGGACCA TTAGCGTTTG CCAGCAGAAA 300
ACCAAAAAGG GTCCTGCTAA AATGCATTCA AAACCTTTAA AGCCATAGnG ATATCCn 357
(2) INFORMATION FOR SEQ ID NO: 610:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 370 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 610:
GTGAGTTTTC GTTCCACTGA GCGTCAGACC CCGTAGAAAA GATCAAAGGA TCTTCTTGAG 60
ATCCTTTTTT TCTGCGCGTA ATCTGCTGCT TGCAAACAAA AAAACCACCG CTACCAGCnn 120
GTTTGTTTGC CnnAATCAAG AGCTACCAAC TCTTTTTCCG nAGnAACTGG CCTTCAGCAG 180
AGCGCAGGGG ACCAAATACT GTTCTTCTAG TGTAGCCGTA GTTAAGGCTC CCCACGnAAC 240
ACCGTGGTGC AGGCGTCAAG CGAATTGAAT ACCATGTTCT CTATCGCTGT TTCTGTGTAG 300
AAGGCGATAT GAGGGGTATA GATGATACGC TCATGTnCGA CAAnCCGAGC ATAGACCGTA 360
TCGTAATAGG 370
(2) INFORMATION FOR SEQ ID NO: 611:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 611:
TGGTTAACCC CCCACCATTG GCCTTAACCC CCCAAGGCCC CGGTTTGGAA AAAAGGGACC 60
CACCCTTAAT TCCCCCAAGG CCCAAGAGGG GGGGTTTAAC CGGGGGTTCC CAAGGGGGTT 120
CCCAAAAAAA GGGAAGGGAA TTGGGGGGAA AAAAAAAAGG CCCAACCCAA GGGAAAGnGG 180
GCCCGGGAAT TTGGGGGGAA ATTCCCCCTT TGGnTTTGGG CCCTTTTTnA ATTTCCCAAA 240
GGCCCAACCC GGGCCCCCCA AAAACCCTTT AAAAAAAAAA AAGGTTTCCC CCCAAAAACC 300
CGGGAAATTT GGGAAACCGG GCCCAAACCG GGTTTnAACC GGGGAACCCn AACCCTTTTT 360
GGGGAACCCT TTTTTCCCTT CCCGGGTTTG GnCCGGGTTT AAGGATTTGG GAAAAAAAAA 420
AAAAGGGGAA AAAGGCCCGG GCCGGGCCCC CGGGTTTAAA CCCAATTTGG nCCAAATTGG 480
GGGGCCCAAG GnCCGGGCCG GGGGGTTTnA AAAAAAAAAA ATTTCCCCCA AGGACCGGGT 540
T 541
(2) INFORMATION FOR SEQ ID NO: 612:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 330 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 612:
CAATTTTGTT GCGGTTATGG CTACGGnTGA nAGTACTTGC AGGAACCGGC CCACAGTTCA 60
GGAGGGCCTC GAAGGCGAGG CCCGCGGTGA GTGCGCCGTC TCCTACGACA GCGACTACCT 120
TACCTGATTT ACCCCGGTAT CGTAGGGCGC TGAGGATACC ACTTGTCGGG CAGAAAGTGC 180
CGTGGAAGAG TGACCGGTAC CAAAAGCGTT CGTACGGGnC TTTnCATACG TCGCCGCGGG 240
GAACCCCGnA AATCACCATC CTTCTGGACG GTAGGGTCAT GGGAAGCGGC CCTGGCGCGT 300
GCCAGTGAGG nAGCTTGTGC GACGTACACT 330
(2) INFORMATION FOR SEQ ID NO: 613:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 565 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 613:
TGGCTGTTCC ACTTCTGATC CAGCTCCCTG CGACTGGCCT GGGAAAGCAG AAGATGGCCC 60
AAGCGCTTGA GCTCCTGCAC CCACATGAAA GACCCAGAAG CTCCTGGCTT CTGATCAGCC 120
CAACTCTGGc TGTTGCAGCC ATTTGGGGAG TGAACCAGTA GATGGAAGAA GATCTAACTC 180
TCCCTCTCTA ACTCTACCTT TCAAATAAAC AAATATTTTC TAAAAATTTA TACTTTTGCA 240
AAAAATCTGG TCAGTTTATG TGGTTCCAGA GTAATTATAA TATTGTTAGA ATTACTCTTT 300
ATTCTTAGTG TTTATTCTGC TGTATTGAAA TCACTTGGAC AGGATCTGGG AAGAAACCAG 360
CCAAGGAAAG AGGAAACAGA AGTAAACTCT TAAATTCTGT AATTCTTAAT AGATTATTTA 420
TTTGAGAGGC AGAGTTAGAG GAGAGACAGA AAGGTCTTCC ATCTTCTGGT TCACTCCCTA 480
AATGGGCCAC AATGGGGCAG AATTGGGGCC AATGGCCAGG GAACATCTTC CAGGTCTCCC 540
TTGTGGGTAC AGGGGCCCAA GCACT 565
(2) INFORMATION FOR SEQ ID NO: 614:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 614:
TGTGCCTTTT ATCTCTTTCT GGCCCCTACT ACTCAACAAC ACTGATTTTA AAAAATTATA 60
TTGAAACAAT ATCAATATTA GCGTTTTTCT AGGTTTTTTA CATGTGTTAT CTCACCAATA 120
TTTACAAAGA AGATCCTAAA CAATACAGAT AACACCAAGA CCAAAAAAAA AAAAAAGTAA 180
TATATGCCAT TAGTTTAGAA ATTCAAACAA TATAAAAGAC ATAAAAAGCA AAAACAGAAG 240
CAACTTGGTT TCCATCCCCA AAGGAAACAG CACCAACAAT TTTTTTTTTT TTTTTTTTTT 300
TTTTTTTTTA CAGGCAAAGT GGACAGTGAG AGAGAGAGAG AGAAAGGTCT TCCTTTTGCC 360
GTTGGTTCAC CCTCCAATGG CCGCCGCGGC CAGCATGCTT GCAGCCAGTG CACCGCGCTG 420
ATCCAAAGCC AGGAGCCAGG CTGGCAACAG ATGGCTGGCA ACAGATG 467
(2) INFORMATION FOR SEQ ID NO: 615:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 615:
TTGCATGCCT GCAGGTCGAC TCTAGAGGAT CCCCGGGTTC GTACGTTCGT CTTATTTCCG 60
CGCGGGCATA CTCAGCAATA TTCTGCCTTC CACTTCGTAG GAGCAGCAGG AGCAGCAGGG 120
GGCGTGGCCT TTTTGTTCGT ACCGCCGTAC 150
(2) INFORMATION FOR SEQ ID NO: 616:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 613 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 616:
TGCCTGGGTT CAAGTCCTGC CTCCTCTACT TGAGTTTCCG CTCATGTGCA CCCCAGGAGG 60
CAGCAGATGA TGCTGGCTCA AGTACTGGAT CCCTGTCCCC CATGTGGGAG ACCCAGACTG 120
AGCTCTGGGC TCCTGGCTCC AGCCCTGGAT GATACAAGCA TTTGAGGAGT GAACCAGAAG 180
ATGGAAGATC GATCTCTCTC TCTCTCTCTC TTTCTCTCGT GTGCACACGC GCACGCATGC 240
TCATGGCCTG TCAAATAAAG TGAAAAAAGA AATCTGTGCA CCCAAGATTT ATGCATCTAT 300
ATATGTAAAC TTTCCTTCAA TTAAGAAACA TTAGGGGTCA GCATTGTAGC ACAGTGGGTA 360
AAGCTGCCAA TCGTGACACC GGCATCCCAT GTGGGCGCCG GTTCATGCCC TGGCTGnCTC 420
CACTTCTCAT CCAGCTCCCT GCTAATGGCC TGGGAAAGCA ACAGGTGATA ACCCAAGTGT 480
TTGCGTCCCT GCCACTCAGG TGGGAGACCC AGATGAAGCT CTTGGTTTTG GCCTGGCCTA 540
GCCCTGGCCA TTGAGGGCCA ACAGGGCAGT GAACTACCAG TTGAAAGGTA TCATGTGCAC 600
TGGGCTCTCA CGC 613
(2) INFORMATION FOR SEQ ID NO: 617:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 196 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 617:
GTGGGTGCAG GGGCTCAAGG ACTTGAGGCA TCTTCCACTG CTTTCCCAGG CCATAGCAGA 60
GAGCTGGATT GGAAGAGGAG CAGCCAGGAC TAGAACCGGC ACCCATATGG GATGCCGGCG 120
TTTCAGGCCA GGGTTTTAAT CCTCTGCACC ACAGTGCCAG TCCCAGTGTT GCAATTTTGA 180
TTGGTGTnGA CTTCAG 196
(2) INFORMATION FOR SEQ ID NO: 618:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 618:
GTATTTGAAA GGCAGAGTGA CAGGnGGGAG AGAAGGAGAG AAAGAGAGAA AGAGAGAGAG 60
AGAGAAAGAG AAAGAGAGAG AGAGAGACTG CCCATTTCTT GGTTCACTTC CCAAGTGGCC 120
ACAAGAGGCA AGAGCCGGGA CTGGGCCAGG CAGAAGCCCG GAGACAGGAA CTCCATACAG 180
GTCTCCCATG TGGGTGACAG GGGCCCGAGT ACTTAGACCA TCATCTGCTG CTTTCCCAGA 240
CACATTAGCA GGGGACTGGA AAGGAAGCAG AATAGCCAGG AATCAAACCA GTACTCATAT 300
GGGATGTTAT TATCATAGGC AGTGGCTTAA CCCACTGTGG CCACTATGTC AGCCCCATAA 360
CTGGGCTTTT TTATTAAAGC AGACACATTT TGCGCTGCCC TGTCTACCTC TTAATCTACA 420
TTTTTTCATC ACAGCATAAA GAATTGTCTT TATTCATGTA CTCTCCCAAG CCTAACTAAA 480
TAATCATATA TTTAAATAAT TCATCAATAT AGACAAGCAA GAAATTTTAT CTTCAGAGAA 540
TTTATGTGTT AGCATAAAAG AGAAGAGAAT GGGGCAAAAT GTGTCCAAAA ATCTTATCCT 600
TTT 603
(2) INFORMATION FOR SEQ ID NO: 619:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 619:
AGAGCGGTCT TCCAACCATT GGTTCACTCC CCAGTTGGCC GCACCGGCCA GnAGCTGTGC 60
CAGTCCGAAG CCAGGAGCCA GGAGCCTCCT CTGGGTCTCC CATGCAGCTG CAGAGGCCCA 120
AGGACTTGGG CAATCTTCTT CTGTTTACCC AGGCCATAGC TGAGAGCTGG ATCGGAAGTG 180
GAGCAGCCAA GACCCGAACC AGTACCCATA AGGGATGCCT GCACTGCAGA TGGCAGCTTT 240
ACCTGCTACA CTACAACGCC GGCCCCATCT TTCTTTATTA TTGAAGTATA GGAGCTTTTA 300
TATGGTATGG AGACCAGTTC CTTGTCAGAT ACATGGTTTG TAAATATCTC CTGTTCTGTA 360
GGTTTTTTGC TTTCTTGTAT TTTTTGAAAT ATAAAAGTTT TTTAATTTTG ACATCTGATT 420
TACCTACTTT GTGGTGATGG TTATACTTTT GATATTATAC CTAACAAACC AAGTCACAAT 480
CCAAAGTTAC ACAGATTTAT ACCTGGTTTT CCAAGAATTT TACTGTTTTA GCTTTTTTTA 540
TTTTAAAGGn TTATTTATTT ACTTGAGATG CAAAGTTATA G 581
(2) INFORMATION FOR SEQ ID NO: 620:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 583 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 620:
CTTGAAGGGG TGTGTTTTCT CAGATCATTT GGAAAATTTC TTCCATAGAT GCGTGGTATT 60
ACTAGCACTC TGTGTGCAAG TCATTCTCTC TTTTTTTTTA AAAAAAAGAT TTATTTATTT 120
ATTTGAAAGG CAGAGCAACA GAGGGAACAA GAGACAGAGT GAAAGACAGA GAAAGAGATC 180
TTCTTTCCAC TGGTTCACTT CCCAAATGGC TACAACAGCA GGGCATTGGT CTAAGCCGAA 240
GCCAGGAGCC TGGAACTCCA TCCAGGTCTC CCACGTGGGT GGCAGGAGCT TCCACTGCTT 300
TCCCAGGCTC ATTATGAGGG AACTAGATGT GAAAAGAGCA GCTGGGTCTT GAACTGGTGC 360
CCTGATATGG TTTGCCAGCA TCACAAAGTT CTGACTTAAC ACACTGAGCC ACCATACCAG 420
CCCCTCAATC ATTTCTTTTC CCTATTGTGC CTATCTGTCA TACATTCCTT TTGCTTATCA 480
TGCATTTGTG TGCTTTGCAA CAAACTGATT AATTCAGGAA CTGTCTTTAA CTCACTTGGC 540
TTGTGATTAG ATTAAAGGGT AAAGGGACCT GCCCCTCCTA GAT 583
(2) INFORMATION FOR SEQ ID NO: 621:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 591 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 621:
TGTCACTCCT TAAATATATA ATAGTAATTA TTATTTAAAG ATTTATTTAC TTATTGGAAA 60
GGTAGAGTTA CAGAGAGAGA GAGGGTGAAA CACAGAGAAA GAGAGGTCTT CCATCCACTG 120
GCTCACTCCC CCAATGGCCA CAATGGCTTA AGCTGGACTG GTCAAAAGCC AGGAGCCAGG 180
AGCCAGGAAC CAGGATCTTC CTGCGGGTCT CCCATGTGGG TGCAGGAGAC CAAGCACCTG 240
GACCATATTC CACTGTCTCC CAGGCACATC AGCAGGGAGC TGGATTGGAA GAGGAGCAGC 300
CAGGACTCAA ACCAGTGCCC ATGTGGGATG CCGGCACCAC AGGTAGAGGC TTAACCTAAT 360
ACACCACAGT GAGAGCCCCT AATTATTATT TTTATATTTA AAATAAAACT TAAAAGAAAA 420
GACATACAGA TAGGAAATAA GCATTGAAAA ATATACTCAA CATCATTAGC TATTAGGAAA 480
ATGCAAATTG AnATCCCAAT GAATATGACT GAACATCTAC TTACAATGGA CATATTTAAA 540
ACCGCCCTTT GTGACATTCT GTGTATTTTT CAGAAACCAT AGAATTGTAT A 591
(2) INFORMATION FOR SEQ ID NO: 622:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 622:
GAATAAAATA ATGAAAGCAT ATCTTCTTTA GTCATGTTTT TTCTTGTTGA ATTATATTGA 60
GGGGTTTTAA TTGTTCACCT AAATAAAGCA ATGTTTTATT GTTTTGAGTG TTATTTTAAT 120
GTTTCTGTGA AATTTTTCTA ATTTATATCA CACTTTTTTT TAAAGATTTA TTTATTTTGT 180
TGTTTGAAAG GCAGAGTTAC AGCGAGAAAG AGGGAGATAC AGAGAGCTCT TCCATCTACT 240
ACTTCACTCC CCAAATGGCC AGAGCTGGGC CATTCCACAT CCAGGAGCTA GCAGCTTCCT 300
CTGGGTCTCC CACATGGGTG TAGGGGCCCA AGTACTTGGG CCATCTTACA CTGCTTTCCC 360
AAAAGCATTA CCTGGGAACC CGATTGGAAA TGGAGCTACC GGGACTCAAA CTGGTGCCAA 420
TATGGAATGC CAGCACCACA TGCAATGGTT TTGCCCCTTA TATCACAGTT TTGTGTCTTT 480
CTGGCTTCCC TCTCTCCCTT TATCCTGTTT TTCTTTTTAA AATTTTTTTA GTTTTTTGnC 540
TTTTAGACAG TGnTTTGTAC TTTG 564
(2) INFORMATION FOR SEQ ID NO: 623:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 424 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 623:
AATTGATTAA TTAAAAAAAT TATGTCTTCC TGGAGAATTG TTCCCATTGT TATTTGAAAT 60
CTGCCCTGTC TGAAATTAAC CTAGCTACTC TTGCTTTATT TATTTATTTG AAAGTCAGAA 120
TTAGAGAGAG AGAGATCTTC CTTCCACTGG ATGACTCCCA GATAGCTACA ACAGCCAGGA 180
CTAGGCCAGG CTGACATCAG GTGCTGGGAG TTTCATCCAG ATCTCCCATG TGAGTTGCAG 240
GTATCCAAAC ACTTGGGTCA TCTTCTACTG CTTTCCCAGA CCATTAGCAG GGAGCTGGAT 300
TGAAAGTGGA GCAGCTGGGA CACAAACCAG TGCCCATATG GGATGCTGGC ATTACAGACA 360
GCTATTTTAC ACCCTATGCC TCAATGCTGG GCCCCAACTn CTGCTTTCTT TCAATTAnGT 420
GTTA 424
(2) INFORMATION FOR SEQ ID NO: 624:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 648 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 624:
GCCCGACTGT ATCATTTCTT GAAGGTATAA TTTAAAAGGC CTTTCTCTGG GAAGCTTCTG 60
CAGGTTGCAG ACAAGTGTAA TTCTGCCTTC CTTGTGCCCC ACTGCTTAGT TGAGCCCCTA 120
TTTCAGCGCT CACTGCTGTA TGCTATCATT TACCTCTTTG CGTGTCTGTG TTGCTCCCAG 180
GTCTGCGAGC CTCTGCGGTG CAGGCAGGGA CTGTATCTTT ATTCACCTCT ACATCCATCA 240
GCAGCACCTA GCACAGGACC TGGTATTACA TGTTGAAAGA ATGTGTTTCA AGATTCAAAT 300
CAATTTTTTG TATTGCTTCT AACTTTCAAT ATAACTCATG GTGCATGCTC TAAAACCTGA 360
GCCCTTTAGC TCATAGAGAA TTTATATTAA AGTTATGAAT TATATAGATG TGTATGCATA 420
AACCTTGTTC TTTAACTGGC TGGGATCATC CTTTTTTAAA GATTTATTTT ATTTATTTAA 480
AAGACAGAGT TACAGAGAGA GGTAGAGATA GAGACAGAGA GAAAGGTCTT CATCCATAGT 540
TCACTCCCCA AATGGGCTAC AGTGGGCCAG AGCCAACCCA ATCCAAAGCC AGGAGCCAGG 600
AGCTCTTCCG GGTCTCCCAG TGGGTGCAGG GGTCCCAAnG ACTTGGGG 648
(2) INFORMATION FOR SEQ ID NO: 625:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 706 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 625:
CAGGCTTGTA ACAGTTCTGT TTTATGCAAT GTCGGCGCAT TCAGTACTAT CGCTTTTGGA 60
CGCGCACCTG ACGCGCGCGC TATGTCTTCA CACAAAAGAC GGGTAACTTT CGGTAACAAA 120
AACTCTTCGT GCTGCTGCAG TAGTTTAGGC TCAGAGAACA GTAACACACC CAAATGCGCA 180
GAGTGCAGGC CGCCATCTTT ACGCCGTAAC GAAAGGCCGC GCGCACCGCA GCCACTTGAA 240
ACCGCTCTAC TATAGAGTCG CCATAAGATT CCAAAAGTTC ACGCGTGCGC GCATnCCGCG 300
TTCAATACAG TAACATCCCC GTTCTGCAAG CAGGCGCGAG ACCACATTCT TCCCCGCACC 360
ACTTnCGACC GATGACACCA ATTAGTGGAC AAAACTCGCG CACAGCGAGA GCGTnAACGT 420
CAAACACGCA CCGTCCTCAA GTGTTCAGAA TGTGCTGAAC ACCCGATATT CCTGCGTATC 480
TTGCGCTGAC TACCCCTGTG CCTCACGCGC AGACGnCAAA AGACGCTTCT GTGGCATACT 540
GTTCGTGTCT GGCACGGTTG GGAATAAACT CACGCTTGAG ACGACACAGC CGTTGCGCTG 600
CTGTTTGCAT ATCCGCATCA AAAATCCAGG GCAACACACG CGAGCGCCGG TTTCCTGTCA 660
GTTCTGCGTC ATGAATTTCT GGTAACACAA AACAACGCCC ACTGAC 706
(2) INFORMATION FOR SEQ ID NO: 626:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 972 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 626:
CACTGGGCGC TGAGCTCCAC CTTCTCGAAG GGACTGAACG TCATCCCACC TGGTACTGGA 60
GCGCTCGTTC ATTCAACAGG TTGCCCGCGG GGTTAATAAT GTTAAAGCGA TTGGTTGTGC 120
CGAGCACGGA TGTGTGTGGT GCAAGCCAGG CGTGGGAACC GAGGGGGATG CGATACTGCA 180
CCACGCCTTC CCCAAAATTG GCATATTGAT AGTCCCAGGG GGCACAGCTC CATTcAGTTC 240
GTACCCTcCG TTATTTcTGT AACGGATGTA GGTGAGGGGG ATGTACACGC GTGCTTCGAC 300
GCCGGCGTTC AGGCCGGTGA GCAGGTGGGT GTAGGGGTCA CCGCTTTTGG TTTCGAGCTT 360
AAGGAATCCG GCAAAATCAA AGTAGTGCGC ACGAGTGGTA GCAAAGACGC GTTTGCCAAA 420
GATATTAGTG CCTGCGGTGG CAAAGTATAT GCCAGAAGAG AGCCACTTCC ACTGcATACG 480
CAgGAGCGCG TCTATGTTGA GCGCGTTCAT AGGTGCGCGC TCAAGGAAAG CGAGAAGTTT 540
AGCAGTGACA ACTCTTGGAT CGGAAGAGCG GAAGACATCA CGTACTCCTT GCTCTATGTT 600
CGGTACAAGT TGCGATACAA GCGCCGCGAG CGCGCCAgGC TaGCACGGTT TGAATGGCGC 660
TGCCGAGCGT TCCTTCTGCA ATCAAAgCAG CAAGTCCTAC CATCTCTATG AGAGTGGTTT 720
GTTCGGTGAT TCCTGGTGGC ATCATGATAT TGGGGAAAGT TCTGCACGAG TTTTCCCTnC 780
CAACCCGTCT AAACACTTTC CCTTGnTTTG AGGATAGCTC TCTCTTGGGT CTGAnGCATG 840
TGCGTTACTC TGGTGTTGGn TAnCGGCGTC GAAGGGCGAA GAAGAAACGG AACCCGGCGG 900
CCTGGTTCGA GGGTGAATCG GCCTCCTAAT CCCCACAGGA ATGGCGGTTT TGGTTTTTCG 960
TnCTTGGGAn GT 972
(2) INFORMATION FOR SEQ ID NO: 627:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 911 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 627:
CCATGCCCGC AAGTACCCCC GCTTGCATCA CCTGCCTGCC CACTCACTCC CCCTCCTCTC 60
ACTTCTACCT CACCCCCCCC CACCCGTCTA GCCGCGTGTG ACTACCAGGA GAGGGTGACG 120
CCGCACACGA TGCGGCCGAT TCCCTGGGTG AGGCACTCGG ACACCAGCCA GGTACGGGAC 180
ATCAGAGAGC ATACCCTGTT CCCAATCAAG GGAGAATACC GTCTTCTCTA TGAGACTGGC 240
TGAAATACCA GCACGCAGCT GTGCACAGTA CTCCTTGGTT AGATAGGTAG CTCCTACTGC 300
TCCACCTGCA GCAGGGGCAT TCAGGTGTGC ACGGTTGGTA GAGGCATGGA CCGTAACGCT 360
TGGCTTCACC CAGCCGTAAT CCTGCACCGG GATGCGATAc TACACCACGC CTTCCCCACC 420
ACCGGTGGAC GGATATACTC CTTTTCCTGA ATGCCACGCA CAGCCGTCCC CCCGTTATTT 480
TTGTATAGCG CATAGGTGAG GGGGATGTAC ACGCGTGTTT CAACGCCGGC GTCCAGGCCG 540
GTGAGCAGGT GGGTGTAGGG GTCACCGCTC TTAGTTTCGA GCTTAAGGAA TCCGGCAAAG 600
TCGCCACAGC TTGCGATGGT GTTATCTAAC ACCCTGGTGC CAAAAACGTT TGCCGGTGCT 660
GTGGCAAAGT ATATGCCAGA AGACAGCCAC TTCCACTGCG CCGTAAACAG CGCATCGAAG 720
GCGACATTGT AGGTGTCAAG ATACAGACAC ACGGCGCTGA CTCCCATTAG AAAGGCACGC 780
CATGCAGACG CACGCAGGTT CTGTATAGCC TGACGTATCT GCTGCCCCGC GTCCAACGCA 840
TCCGTCTTCT TCTTCACTTC TTCGGTTACA AACGTCTGAC CCTCAGTGAA AAACTTTGTA 900
GCCTCAGCCG T 911
(2) INFORMATION FOR SEQ ID NO: 628:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 628 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 628:
GTACTCTGGT GTTGGTTACC GGCGTCGAGG GCGAAGGAGA AGCGGAACCG GCGCCTGGTT 60
CGAGGGTGAG TCGGCCTCCT ACTCCCCACA GGAGTGCTGT TTTGTTTTCG TTCTTGGAGT 120
CTTCGGTACC CTTAACGTAT TCTGGTCCAG TGTGGCATTC CCTGCCAGCT CCAACGTAAG 180
CAGCCGCTGA CGGTCGACGC CATAGGAAAG CGTTGCATCG GCCCCGAAGC CATACTTGCT 240
GTGCGTGGTG TCAGTACTAT CCCAGGCACC ATTGGAAAGG AAGGAGAGGA AACCGATGTC 300
CACATCTACT CCGCTGTTTC CCACATTGTG GGGCCTGGTA GCCGAGTTTT GACCCCGGAG 360
CCGGAGAAAC CAGGGGGCAT AGCGAGTGTC CTTTTCTGGA ATAGGACACG GGTGACAAAG 420
GGTTTCCACA GCTGGGGCAA AGTTAACCAC ACAGGGAAGG ACTGGTACCC ACTGTTCAGG 480
TAGGCCCCAT AACAGTGCAG GGTTGCCTGG AAGGAAGCGG TAGGTTTGGT AAAGGACAGG 540
GCCGTTGAGC TTTTAGAAGA CGCAAGCTCT ACTGCCAGGT CCTTCAGCTG CAGCTGTGCC 600
CCACACCCCT GAGCGTGGCC TTCCCCTC 628
(2) INFORMATION FOR SEQ ID NO: 629:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 691 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 629:
CGGATACCCC TACTACCATG CAGCGTATTA CTGCTGATGT CACCGGTGAT GTGACCGTCT 60
CTACGGTGAA TCTACCCAGT GAAGAAATGA AAGGACGCAT CATTGGGCGC GAGGGACGTA 120
ATATCCGCGC GTTAGAGACA CTCACTGGTG CTGACGTTGT CGTAGATGAC ACACCTGAAG 180
CTGTCGTCAT TTCCTGTTTC GACCCGGTAC GCAAAGAGAT TGCGCGCATC TCTCTTGAGC 240
GTCTTGTACT TGACGGTCGA ATCCATCCGG CGCGCATTGA GGAAATTGTG CAGAAGGTGA 300
CGCAGGAAGT TTCTCAAAAA ATCTATGAGG AAGGGGAGAA AGTGCTGTTT GACCTCGGTA 360
TTCACGATAT GTGTCCCGAG GGGGTACGGG CACTGGGGCG CCTGTATTTC CGTACAAGCT 420
ACGGACAGAA TGTACTCTAC CACTCAAAGG AGGTGGCTCT GCTCGCTTCC ATGCTCGCCT 480
CGGAAATCGG CGCAGATGTT GCCATTGCCA AAAGGGGCGC GTTGnTGCAC GATATTGGCA 540
AGGGAATGGA AACTGATTCA GACCGCAAnC ACGCAGAAAT TGGTATGGAG ATGGCTCGCA 600
AAATGAATGA GGACCCGCGA GTGGTAAACG CCGTTGGTTC TCACCACAAC GACATAGAGn 660
CGTGTTGTGT TGAGTCnTGG CTCGTTCAGG T 691
(2) INFORMATION FOR SEQ ID NO: 630:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 632 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 630:
GTCCGTGCTC TGCGGGGGAA ACTTGGTTCG CTGACCATCA TCGCGTTGCC CGTTATCGTA 60
GCTGTTGTCG CAGGGGGTGT CGGCTCCTTT TCCCTGCCCT ACGTAAAAAT GATTACGCTT 120
TTCGTCGGCA GAGTTATCGC CACGTTCATC GCGCTCCAGC CATTACTCAT GAGTATCCTG 180
CTGTCCATGT CTTTCTCGCT CATCATCATC TCCCCTGTGT CTTCCGTCGC GGTAGGAATC 240
GCCGTGGGGC TCACCGGTCT GGCAAGTGGA GCAGCAAACA TCGGCGTCTC CTCCTGCGCC 300
ATGACCCTCA TTGTGGGAAC CATGCGCGTC AACAAGATCG GTGTTCCGTT GGCGATGTTC 360
GCAGGAGCGA TGAAAATGCT CATGCCAAAT TGGATCCGGT ACCCGATTCT CAATATTCCG 420
CTCCTGCTCA ATGGCCTCGT TTGCGGCGTG CTCGCGTGGC TTTTCAATCT GCAGGGTACT 480
CCTGCAAGCG CAGGCTTCGG TTTTATTGGA ATTGTTGGAn CGATCAACGC CTACAGGCTT 540
ATGGCGTAAA ACTCCTATGG TGCGCGCGGG TATTCTTTTC CTCGTGTATT TCGTTCTTTC 600
CTTTCCTTGG CTGCGTAnCT TAATGAnTTT AT 632
(2) INFORMATION FOR SEQ ID NO: 631:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 619 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 631:
CCACTCGAGC GCGAGCATGn ACGTTTTTCG TACAGGTAGC GAAGTCTTAA GGTATTTAGC 60
GCACAGAACG CTGTTTCCGC AGCAGAGAAC AACATGGAAA GCACCAGCAG CACTACCAAC 120
ACACCGAACG CAGCGGAAAC GGAAAGAACA CTCACACGTA ATTCCCCCGA AGGCTAAAAC 180
ACCAGAACCG AAGAGACAAC GCCATACATC CTTGGACCCC TCCCCGCTGG GGGGGGGCAC 240
CTTTTAAGGT GCTCACGCCC TTGTGTCAAG AGCACACCCT CCACTACAAT GAACTGCGTG 300
TCCGGAGACC GCGCGGAGTC CTCTTTCTAT GAATAGAACC GAATCTCCTC GTGGCTTAAT 360
CAAAGCCACC GTACGTGAAC AAGACCGAGG CCGAACCGTT TATAAAAAGA TTGCCCAGTT 420
CCTCTCCCTC ATTGGAGAAG AGCAGGGCGG GCGCTGGTGC TCAAGCAACT TGAGCCTGCA 480
CAGATTGAGG CGGTGGTTGC CGAGCTCCTG ACACTCAAAC CCCTCAGTCC AGAAGAAGCG 540
CGTGAGATCC TACGGGAGTT TTCTGCCCTC TGCGCTCGTG TGTCGCCTGT TACCGGTGGA 600
CTGCGTGCTG CGCATCGAT 619
(2) INFORMATION FOR SEQ ID NO: 632:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 649 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 632:
ATCCAATAGC AGGAGCCAGG AGCCAGGTGC TTTTCCTGGT CTCCCATGGG GTGCAGGGCC 60
CAAGCACCTG GGCCATCCTC CACTGCACTC CCTGGCCATA GCAGAGAGCT GGCCTGGAAG 120
AGGGGCAACC GGGACAGAAT CCGGCGCCCC AACCAGGACT AGAACCCGGT GTGCCGGCGC 180
CGCAAGGTGG AGGATTAGCC TATTGAGCCA CGGCGCCGGC CGAAGTATTC AAGAAAAAAG 240
AGGGAAAGAT GTCTCACAGG ATGTCAAGAT TTATTCGATA GACTCATAAA TGCACAGTAA 300
GTAAAGTCAT TTGATGTTAG TGCAAATAGA CAAGTAGAGC TTGAACAGCA GGAAAGAGAA 360
GGAATAAnGn AnGnAGGnGG AGGAAGGGTG GAATGTGGGT GGGAGTCAGG GAGACCTTAA 420
ATTAGGAGAG GGTGCTCCCT GGGGACAGTT TTGTGTTACA GCAGGTTAAG CTGCCACCTA 480
GGATGTCAAC AGCACATATG GtGCCAGTTC CAGTCCTGGC TGCTCCAATT CCAATCCAGC 540
TCCCTGCTAT GGCCCGGGAA ATAGAAGAAG ATGGCTCAAG TACTTGGGCT CCTGCACCCA 600
GGTGGGAGAC CTGGAAGAAG CTCCTGGCTC CGGGCTTTGG CCTGGCCCA 649
(2) INFORMATION FOR SEQ ID NO: 633:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 611 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 633:
GTGTGCCTGC TTCACCGCTC CTTCAGCGAT CTGAGTAATA ACTGATCTTG CCACCTGCAC 60
CGAGTTATCT ATGGTGCTAC CAACCGTATC AGCCGCCTGT TTAGCCTGTT CCTGCGCACG 120
TGCGTAAAAA TTGCCAGTGC AACCTCCTGT GCACGCTCGC GTGTCCTTTC GGTCCTCATC 180
GCCGCGGTAG CCTCACTCTG GTGTTGGTTA CCGGCGTCGA GGGCGAAGGA GAAGCGGAAG 240
CCGGCGCCTG GTTCGAGGGT GAGTCGGCCC CCTACATTCC ACAGCAGTTT ATCCTTGTTC 300
TGATTGTTTG CGTCCTTCTG TGGCACCGAT GAGGTATCCG TCTTCTAGCG TAACATTGGC 360
TGGGCAAGCT CTACCGTGGC ACAGAGGGTG TCCTGGCAnG CGCATACATT AGCTTCAAGT 420
CTGCCCCAAA GCCATACTTT ACTGTGCGTG GGGTCAGTAC TATCCCAGGG CACCGTTAGA 480
GGCAAAGGAG AGAAACCCCA CATCAAGGCT GACCCCACTG GCCCCCAATG TCCTGTGCCC 540
GATACCCAAC CTTGCCGCCT AAACCCCCAA ACCCCGGCGC ATAATGTAAC GCATCCTCCT 600
GGTATGCGGT G 611
(2) INFORMATION FOR SEQ ID NO: 634:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 634:
CGGTATGGCG TTGATCTCTT CGGTTCTGGT GTTTTAGCCC TTCGGGGGAA TTACGTGTGA 60
GTGTTCTTTC CGTTTCCGCT GCGTTCGGTG TGTTGGTAGT GCTGCTGGTG CTTTCCATGT 120
TGTTCTCTGC TGCGGAAACA GCGTTCTGTG CGCTAAATAC CTTAAGACTT CGCTACCTGT 180
ACGAAAAACG TCATGCTCGC GCTCGAGTGG CAATGCGTAT CCTTCGACGG AAAAACTTCT 240
ATCTTGCTGC TGTGGTTATC GGGAACACCC TGGCGAGCAG TGCGTTGTCT GCAGTCATTG 300
CGCTTTTTGC ACGTGCCCTC TTTGGCATCC ACGCATGGAG nTGGAGCATC GGTGCAGGAA 360
CGGTGGCTTA CACTTCTTTT TGGAGAAATT ATTCCGAATC ACTTGCCTTG TGCCGGCCGA 420
ACGCATGnCA CTGCATACTG CGCGATTCTT GCAGTGGAGC GCTTTGATGC TTACTCCTTT 480
TGTACAGGTG TTCTGTATGG GCGCGGATGC GCTCTTGCGT CTTGCGCGTG TCGGTGCCAn 540
ACTCCCTCGC TGCGTGTTAG GGATGACGAC CTGCACACCG T 581
(2) INFORMATION FOR SEQ ID NO: 635:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 866 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 635:
ATTAGGTGGG AGACAGAGAA GCTTGGCCGT CAGTTnAGGC GGAGTCAACA GTGAAATACC 60
ACCCTTGGTA CGTCAGGTTT CTAACCTTTG GCCGTGGATC CGGCAAAGGG ACCGTGGTAG 120
GTGGGCGGTT TGACTGGGGC GGTCGCCTCC TAAAAGGTAA CGGAGGTGCG CGAAGGTCTC 180
CTCACACCGG TTGGAAATCG GTGCGCGAGT GTAAAGGCAC AAGGAGGCTT AACTGCGAGA 240
CCGACAGtCG AGCAGAtACG AAAGTAGGTC TTAGTGATCT GGCGGTAcGT GTGGAAGCGC 300
CGTCACTTAA CGGATAAAAG GTACTCCGGG GATAACAGGC TGATTTTCCC CAAGAGTTCA 360
CATCGACGGG AAAGTTTGGC ACCTCGATGT CGGCTCATCG CATCCTGGGG CTGAAGCAGG 420
TCCCAAGGGT TTGGCTGTTC GCCAATTAAA GCGGTACGTG AGCTGGGTTC AGAACGTCGC 480
GAGACAGTTC GGTCCCTATC TGCTATGGGC GTTGGATATG TGAGAGGAGC TGCTTTTAGT 540
ACGAGAGGAC CGAAGTGGAC GAACCTCTGG TGTACCAGTT ATCCTGCCAA GGTACGTGCT 600
GGGTAGCTAT GTTCGGAAGG GATAACCGCT GAAGGCATCT AAGTGGGAAG CCCGCCTCAA 660
GATTACATAT CCCTGAAGGT TGACCTTCCT GAAGACTCCT GCACACTACA AGGTCGATAG 720
GCTGGAGGTC TACGTACCGT AAGTATTAAG CCGACCAGTA CTAATAAGTC GTGAGGCTGA 780
CCATATTATC ATCCTTCTCC TTCACCCTAC CCCTTTGCGT AAAATATTTC GCCTGGTTGC 840
CAGGTGGAGA GGTCATACCC GTTCCC 866
(2) INFORMATION FOR SEQ ID NO: 636:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 641 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 636:
AGGAGnAnAA GGCGGACAGG TATCCGGTTA AGCGGCAGGG TCGGAACAGG AGAGCGCACG 60
AGGGAGCTTC CAGGGGGAAA CGCCAGGGGT ATGCGGTCAC CGGGGTTCGC ATCACGTCCT 120
TTGATGCGGA CGGGGTTGCG CACTTCATTT CAAGCGAGTT TGAACAGATT CCCCACGTAC 180
GGGAAGATAC CCTCGAGATT CTAAATAATT TTAAGCGTCT GCGTTTTCTC CTGCCGCAGG 240
GGCAGAGTCT AGTACGTTCA CGTATGAGTT TCGCGGCGCG TGTCTTTGAC GGGGAAGGAC 300
TTTGCTAAGA AGTTTCAACT CGAGGTTCTG TCTCAAGACC TGCTCATCAT GGAAATGATG 360
GACGGTGCGC ATGTTGAAGT AGAGCTACAC GTCGAATTCG GGCGTGGGTA TGTACCTGCT 420
GAATCGCACG ATCGGTATGC CGATTTAGTT GGGGTTATCC CTGTTGACGC AATTTTTAGT 480
CCCGTGTTGA GAGTCCGCTA TGATATTCAG TCTTGCCGTG TAGGTCAGCG GGGGGATTAC 540
GATCAGTTAT CCCTTGAAGT GTGGACAGAT GGTACGGTGC GTCCCGAAGA CGCGATACCG 600
AGGCAGCGAA AATTATCAAG GAGCACTTTA CATTTTTGTT A 641
(2) INFORMATION FOR SEQ ID NO: 637:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 637:
ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA TTCAACATTT 60
CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG CTCACCCAGA 120
AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG GTTACATCGA 180
ACTGGATCTC AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT 240
GATGAGCACT TTTAAAGTTC TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA 300
AGAGCAACTC GGTCGCCGCA TACACTATTC TCAGAATGAC TTGGTTGAGT ACTCACCAGT 360
CACAGAAAAG CATCTTACGG ATGGCAGACA GTAAGAGATT ATGCAGTGCT GCCATAACCA 420
GAGTGATAAC ACTGCGGCCA ACTTACTTCT GACAACGATC GGAGGACCGA AGGAGCTAAC 480
CGCTTTTTGC ACAACAGGGG GATCATGTAA CTCGCCTGAT CGTTGGGAAC CGGAGC 536
(2) INFORMATION FOR SEQ ID NO: 638:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 638:
CGTCCACTAC ACGGTCCTGA CCGGACCCCA AGCCCCAGCC GCAGCCAACA TCAACTTCCC 60
GGTATGGGGA TGCCTCACGC ACATCGCAGG CCAGCAATGT ATTTCAGGGA GTATTTCTCA 120
ACATGGCCAT GACCGCACAC GACTGCGCCA CCTCGTGGGG CGTAAGAAAG ACGGAGCGCA 180
GGGCACCGTA GGCGCGGACT GCGTGAATGT GGAGGCTCGT TTAGCTTCTC AGACGTGTCG 240
GGGGCATTGC ATCCGATGGT GGCGCCATCA AGCAGGGAAG TGCGCACTGG GAGGGCAAAG 300
ACAGCAAGGG CGTCGTTCCA AGCAGGAGCA AACCACAGCA CGTACGGCGG TAGAACAAAA 360
AAGCTGCTGC TGCAGCCCCT GCTCCTGGTA CGAATGGAGC ACAAGAACAG GACACGCGCG 420
CACTCCCTTT CACACAAAAA ACCTCTTCCA CGCCCGAGCC CGAGCCGGAG CCGCATGCCA 480
GCCGGCAGCC GACGAnTTTT ATAGGACGCA GGCGTACATG CCTGTCCATT AnAAATCTAA 540
AAAGCCCACG CCCGAGCCCC AACGGAGATC CAATTTCCCG 580
(2) INFORMATION FOR SEQ ID NO: 639:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 620 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 639:
TTGTATTCGG TTCTTGGCGT TGTGTGGTGC TACAACTGCA TGTGCGCAGA ATCGTACGTG 60
TTCGGTTTGA GTTTCAAGGA TAGTAGGGAA GCGCGCGTGG CATACCCTTT TCTGGTAGGA 120
TTGGAACTCA TGTTCTTACT TGTGTGCTCG GCCCTATGTG CAGGTTCAGA AAGCGCGTTG 180
TCGTCGGTGA ACCAAGACGA TGAACGTAAG CTTAAGCGGC ACAGTACACG TTGTACACAA 240
CGCTTATGCT GGCTTCTGGC CCGGCGCGAA CAGCTGATTA CCACAGTTAT TGTGCAAAAC 300
ACTGCACTGA ATATGGTGCT CTCTAGCGTG GTGACGTTAG GCTCTATGGA GTTGTGGGGT 360
GCACAGTnCG GTGTGGAAGG CACTGGTTGC GGTGACGTGC GTGATTATTC TTGTGGAGAA 420
ATGTTCCCGA AGGCGCTGGG TGCACGGTAC TCACTGGGAT TCTTGATGTG GATTGCGCCT 480
TTTTTGTAAT TGAGTTACTG GTTGCTGTAC CCCTGGCGCG TGTGTGTCGT CAGCATTGAT 540
GCATGTGCTG GAGGGTATTT TTTTGCCGCG TCATACGACG TGTCTTTCGC GAGAAGAAAT 600
TAAAACGCTT ATTGCAGTTG 620
(2) INFORMATION FOR SEQ ID NO: 640:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 710 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 640:
GGGCCAGTGC GTGGATTCTT CTGGAACGCA ATGCCCCACT GGAGCTGGGC TGTGCCACTG 60
ACCTGCGGAG TGAGTACGCC TGCA AACCA GAAGCAGCAC ATACCATGCC CGCAAGnGnC 120
CCCCGACTTG CATCACCTGC CTGCCCACTC ACTCCCCCTC CTCTCACTTC nACCTCACCC 180
CCCCCACCCG TCTAGCCGCG TGTGACTACC AGGAGAGGGT GACGCCGCAC ACGATGCGGC 240
CGATTCCCTG GGTGAGGCAC TCGGACACCA GCAGGTACGG GAACATCAGA GAGCATACCC 300
TGTTCCCAAT CAAGGGAGAA TACCGTCTTC TCTATGAGAC TGGCTGAAAT ACCAGCACGC 360
AGCTGTGCAC AGTACTCCTT GGTTAGATAG GTAGCTCCTA CTGCTCCACC TGCAGCAGGG 420
GCATTCAGGT GTGACACGGT TGGTAGAGGC ATGGACCGTA ACGCTTGGCT TCACCCAGCC 480
GTAATCCTGn CACCGGGATG CGATAGTACA CCACGCCTTC CCCACCACCG GCAGGCCAAT 540
GTGCCCTGAG GAACCGCCGG AAAGGAGAGG GTTCCCGTTA TTATTTTTGT ACAGGTCATG 600
GGTGAnGGGG ATGTACACGC GTGTTTCAAC GCCGGCGTCC AGGCCGGTGA GCAAGTGGGT 660
GTAAGGGTCA CCGCTCTTAG TTTCGAGCTT nAGGAATCCG GCAAAGTCGC 710
(2) INFORMATION FOR SEQ ID NO: 641:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 574 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 641:
CAGGTATCCG TACGCCACAT CCCCGTCAAT GCCCnCGGTT TCATTACCCC TGAnGCTGTA 60
CGTGCAAGCG TCAGTCCCCG TACCACGCTA GTTGCGTGAC GCCGTACATA GTGAAACCGG 120
CCGCCATCCA nCCGCTCCnn GnGATTGCGC ACGTGCTTGC ACATACAGGC ACACGCGGAC 180
GCTCTATCCA GCTCCACGTA GACGCCGCAC AGGCCTTTGG GAAAATACCG CTCAATCTGT 240
ATATGGACCT TCCGCGCATA GAGGAACATG CACAGGAAAA CAACGCGCCA CAGACACCAC 300
CGGGCTACCC CGCACCCACT GnACAACGCG CGCTTACCTA CTCGGTAGCA ATCAGTGGCC 360
ACAAAATAGG CGCACCACGG GGTATTGGGC TACTGTGCGC ACACCGTTCA TTTACCCCCT 420
TTGTCCTGGG AGGCGGACAG GAAnAAGAGn GCCGCCCGGG AACTGAGAAC TTGCAGGTGC 480
GCTCGCGCTC GCnGCTTGCG TGTGCGAAAG CGCCTTCTTC CGTACTCTAC ATACCACTCC 540
GGAnGGCCCT ACACCCGCAT TACGAAGCCC ACAG 574
(2) INFORMATION FOR SEQ ID NO: 642:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 561 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 642:
TATTTGTTTA TTTTTCTAAA TACATTCAAA TATGTATCCG CTCATGAGAC AATAACCCTG 60
GATAAATGCT TCAATAATAT TGAAAAAGGA AGAGTATGAG TATTCAACAT TTCCGTGTCG 120
CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT TGCTCACCCA GAAACGCTGG 180
TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT GGGTTACATC GAACTGGATC 240
TCAACAGCGG TAAGATCCTT GAGAGTTTTC GCCCCGAAGA ACGTTTTCCA ATGATGAGCA 300
CTTTTAAAGT TCTGCTATGT GGCGCGGTAT TATCCCGTAT TGACGCCGGG CAAGAGCAAn 360
TCGGTCGCCG nCATACACTA TTCTCAGAAT GACTGGTTGA GTACTCACCA GTTCACAGAA 420
AAGCATTTAC GGGnGGACAG ACATAAGAGA ATATGCAGTG CTGCCATAAC CAGAGTGATA 480
ACACTGCGGn CCAnCTTACT TCTGACAACG ATCGGAGGAC CGAAGAGCTA ACCGCTTTTT 540
GCACAACAGG GGGATCAGTA A 561
(2) INFORMATION FOR SEQ ID NO: 643:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 620 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 643:
CCTTATTCCT CCATTCTAAT CACACACGTG CATGCACCAA GAGGACAGCG CCGTGCTATC 60
TTCCCAGAAA GGAGGATGAA AACACGTGAA AACCATTCTC ATACTGGGTG CAGGAACCAT 120
GCAAGCCCCT GCACTTCGCG CACACGGGAG CTTGGGCTGT GGGTGTGCGC GGTAGATGGG 180
AATCCGCATG CACCGTGCGC GGCACTTGCA GACGAGTTTA CCCCAATCGA TTTGGCCGAT 240
AGCGCCGCGC TCGTCGCTCA CGCGCGCGCA ATTCGCGCGC ACGACGGCTT GGATGCTGTG 300
TTCACCGCGG CAACAGACTT TTCCGTTTCC GTCGCTGCCG TCGCCGAGGC CTGTGCACTC 360
CCCGGGCCAC CGATTGGAGG CAACCAAAAA CGCTACGGAT AAAACGCGCA TGGTGGCCTG 420
nCTTCACACG CGCCCGACTG CGCTGCCCCC GCTTCACGTT CCTTGAGCCT GACTCGTTCG 480
CCTGGGGACA CACCGCCTGG GGCATGCCCG ACTGTGTTCC CACCTGCATA GCGCTGGACT 540
CTCGTTTCCT CTCGTCGTAA AACCGACAGA CAAACATGGG AGCCCGCGGC TGCACGCTCG 600
CGCAATGCAA GGATACCCTC 620
(2) INFORMATION FOR SEQ ID NO: 644:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 527 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 644:
TTCCCAGAAA GGAGGATGAA AACACGTGAA AACCATTCTC ATACTGGGTG CAGGAACCAT 60
GCAAGCCCCT GCACTTCGCG CnACGGGGAG CTTGGGCTGT GGGTGTGCGC GGTAGATGGG 120
AATCCGCATG CACCGTGCGC GGCACTTGCA GACGAGTTTA CCCCAATCGA TTTGGCCGAT 180
AGCGCCGCGC TCGTCGCTCA CGCGCGCGCA ATTCGGnCCC GnCGGCTTGG ATGCTGTGTT 240
CACCGCGGCA ACAGACTTTT CCGTTTCCGT CGCTGCCGTC GCCGAGGCCT GTGCACTCCC 300
CGGCCACCGA TTGGAGGCAA CCAAAAACGC TACGGATAAA ACGCGCATGG TGnCCTGCTT 360
CACACGCGCC CGACTGCGCT GCCCCCGCTT CACGTTCCTT GAGCCTGACT CGTTCGCCTG 420
GGACACACCG CCTGGGCATG CCCGACTGTG TTCCCACCTG CATAGCGCTG GACTCTCGTT 480
TCCTCTCGTC GTAAAACCGA CAGACAACAT GGGAGCCCGC GGCTGCA 527
(2) INFORMATION FOR SEQ ID NO: 645:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 747 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 645:
TGTTCGGTGA TTCCTGGTGG CATCATGATA TTGGGAAGGT TCTGCACGAG TTTCCCCTCC 60
ACCCGTCTAA ACACTTCCCT TGCTTTGAGG ATAGCTCTCT CTTGGGTCTG AGCATGTGCG 120
TTACTCTGGT GTTGGTTACC GGCGTCGAGG GCGAAGGAGA AGCGGAAGCC GGCGCCTGGT 180
TCGAGGGTGA GTCGGCCTCC TACTCCCCAC AGGAGTGCTG TTTtGTTTTC GTTCTTGGAG 240
TCTTCGGTAC CCTTAACGTA tTCTGGTCCA GTGTGGCATT CCCTGCCAGC TCCAACGTAA 300
GCAGCCGCTG ACGGTCGACG CCATAGGAAA GCGTTGCATC GGCCCCGAAG CCATACTTGC 360
TGTGCGTGGT GTCAGTACTA TCCCAGGCAC CATTGGAAAG GAAGGAGAGG AAACCGATGT 420
CCACATCTAC TCCGCTGTTT CCCACATTGT GGGCCTGGTA GCCGAGTTTT GCCCCGGAGC 480
CGGAGAAACC AGGGGCATAG CGAGTGTCCT TTTCTGAATA GGCACGGGTG ACAAAGGGkT 540
TCCACAGCTG GGCAAAGTTA ACCACACAGG GAAGnACTGG TACCCACTGT CAGTAGGGCC 600
CCATAACAGT GnCAGGGTTG CCTGGAAGGA AGCGGTAGGT TTGGTAAAGG ACAGGGCCGT 660
TGAGCTTTTA GAAGACGCAA GCTCTACTGC CAGGTCCTTC AGCTGCAGCT GTGCCCACAC 720
CCCTGAGCGT GCCTCCCCTC GGCGGGT 747
(2) INFORMATION FOR SEQ ID NO: 646:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 896 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 646:
GTGTATGTTG CCGGGTGTGG CGGCGCGTGT TTCTCTCTCC CCCAAGCTCG GGGTGTACGG 60
GGACGCACGC GGCGGTTCTG ACCTGTGGGG CATCTGCATA CAAGCTCCCA CAATGCCAGA 120
TACAGAGAAC CAGGCGCCTC CGCGCTATGC CCGGAGACAC CGTTGGTGGG GCTGGACGTG 180
GCGTTCCGTG CGGAAAATGG CTTCCTGCTC CAACTGACGG TGGACGCGGC ACTCACGCGT 240
TTAATGTTCT GCGGCCGGTG TTTGGCCGGT TATTCGTTCA GACCGGGGGA AGGTAGTACG 300
CATCTGTCGG TAGCGGCGGG TTTTGAGTGC ACCGCGCTCA TCTACGATAG CCAGCACTTT 360
CTTTCGGTTC TTGGGCAGGG CTTACTGCAG CCGAGCAGCT cGTCTTATTC AGCCGGTAAC 420
TGrCACCGCC CACGTTcATg CTTGGCGTGC TAACGTGCAC TGCCAAGGAG gTAGGCGCCA 480
TACACGAAaG aGTCGgCGTA TTAAAGGGGT CTGTCCAGAA CTATGCGGTG CCGGTGCAGC 540
TGGGGGTACA GCACTATTTT AGCGCGCACT GGGGGATAGA CGCGACGGCT ACCGTTTCGT 600
TTGGCATTGA CACCAAGCTG GCTAAGTTCC GnATCCCGTA TACGTTGCGC TTTGGCCCCG 660
TCTTCCGCAC CTAGGGGACG GCGCTGGGAG GAAAGAGTCC TGCCGGAAGG CGCCTGCGGC 720
GGGTAGTAGC TACCAGGAGA GGGTGACGCG CACACGATGC GGCCGnCCCC CCCCCCCCCC 780
CCCTCGGACA CCAGCAGGTA CGGGACATCA GAGAGCATAC CCTGTTCCCA ATCAAGGGAG 840
AATACCGTCT TCTCTATGAG ACTGGCTGAA ATACCAGCAC GCAGCTGTGC ACAGTA 896
(2) INFORMATION FOR SEQ ID NO: 647:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 584 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 647:
TATCCAAGTT CTGTTTGTGG GCTGCCTCGC GGTAGTCAAC AATACCAAAA ACGCTAAAGC 60
CCTGCGTGCG CAAAGCGCTT CTCCAAAAGC TGCATAGTGC GATCGAAGGG GTAGCGGCTG 120
GTGATCGTCA TACGCACCCC CGGCCGCGTG TTGGAGGTTG ACCCGGCTGC GCTTGAAGTG 180
CAGGCAACAA GCCCTCCTAA AAGGAGCACT CCCAGGGAAG CAACCTTTCC CCACACGCGC 240
CACCTGTGAT GGCACAAACG TCTACACACC CCGCCTACCT CCGCTGCGTG TCATGCCCGG 300
CCAGATATGC GCCGGCGCAA AAATTCGCCC CTTCGGAAAG AAAAGAACCA TACATCCCTA 360
CCAAATCAAG CGACATGCCC CTACGAATTG CCAACGCTCC GTCACGTTCT TCTCTATCAC 420
TACATTGCTG TTCTCTTGTA GCTTGCCTTG CTCCCAGCTC AGCCGAATCT CCACCTTCTC 480
TAACGGACTG ACCACTACCC CACACTCGTA ATACCCACAA TATTCCTTGC TCCACTTCGT 540
AGCAGGAGCA GCAGGGGGCG TGGCCTTTTT GTTCGTACCG CCGT 584
(2) INFORMATION FOR SEQ ID NO: 648:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 562 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 648:
CACGTCTTCA nnCCCACCGG TAACCCCCAA CGCAGCGTGT GGCCAGTGCT GCACCGGGCG 60
TTCTTGCCCT GTTCCCCCGG CAGAATCAGA TGCGTCTCCC GTGGACCTGC CAACCGAAAA 120
TAGAnATGGC GATACACCCG GCGCTGTCCC CTCAGGGCGC GCGCCTGTCT CTCCCACCGC 180
ACCCGGACAC CGCTTTGGGA ACAACGCCAT AATTGCCCCT GnATCCATAC CCGCGTCCTG 240
GTGCACAGAC GCACCTGTAT CTGCCGGAAG GTGCACGCGA TGCGCGGTGT CTTCCACAGA 300
AACCTGATTC CCCTCCCACA AGAGAACTGA GTTTGACTGA CCCCACCCAG GAGCGGACTG 360
CTCATGCGGC GCATCCTGCG GTTCTACAGC TAGCTGCTGG CAGGCAACnA GAGCTGCACG 420
nAGTTACTTG CCACGGnnAA AAACGCGGCA CTCCAGGCGT GnATAACGCG CTTCCTGGCT 480
TCATTTTCCT TTGGGCACGA TGTAATCGTA AATGTGCGCG CTTAATCCTG nnGCAACTGC 540
TGCAAGCGTG GCCACGCACT GG 562
(2) INFORMATION FOR SEQ ID NO: 649:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 534 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 649:
CGGATACCCC GGAGAAATGC ACTTGAGTTA TTTGATGAAG CGTACGAGAA ACGGTATTTC 60
TCTCGGATTA TTGGCACAAA TGCGGTGTTC CACACACAGC TTTCGCACAA GCAGTGGTAT 120
ACTGAAACCG ATGTGTCAGG GTTGTTCGCG CGCGTCATCG CnAATnTCAT CATAATCAAT 180
CGTTGAGCAG TCTCTTGGAT GATCGCAGTA TCATCGAGCG ACTCCTACAC GCTCGCTnGT 240
CCGTTGCGGG GACACCGCGC GCATAGGGTA GCGTCGCAAC GGACATGGGT TCGGGGATCT 300
TTGTCGCGGA CATCGGTACG TCTTCCCTAA AAGCGGCGAT TATTTCCCAA GATGGAAAGG 360
TGTTACAGTA CCAGCGCGTG TTCTTTCCTC AGCCGGTGAA GGCCCAGGAT TGGGTGCGTT 420
CATTTTTTAC GGTGTTTGAG CGGTTGCGTG CCGTGCATCA CGTTATTGCC ATTACTATTT 480
CGGGCAATGG ACCGAGCGTC GTTGCCGTGC ACAAGAAGAG TCATGCCGAG GATC 534
(2) INFORMATION FOR SEQ ID NO: 650:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 535 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 650:
GGGAGnAGTG ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG TTTCTTAGAC 60
GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT TTTTCTAAAT 120
ACATTCAAAT ATGTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG 180
AAAAAGGAAG AGTATGAGTA TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGnGGC 240
ATTTTGCCTT CCTGTTTTTG CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA 300
TCAGTTGGGT GCACGAGTGG GTTACATCGA ACTGGATCTC AACAGCGGTA AGATCCTGAG 360
AGTTTTCGCC CCGAAGAACG TTTTCCAATG AnGAGnACTT TAAAGTTCTG CTAGTGGACG 420
CGGTATTATC CCGTATGACG CCGGGCAAGA GCAACTCGTC GCCGGCATAC ACTAATTCTC 480
AGAATGGACT GGTTGGAGTA CTCACCAGTC CAnCAGAAAA GCCATCTTAC GGATG 535
(2) INFORMATION FOR SEQ ID NO: 651:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 555 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 651:
ACGCTTACCG AACGCTCCCC TCCAACGCGC TCAAATACTC CGTCTTGGAG TACGCGGAGG 60
AGCTTCGGTT GCAGTTCCAG GGGGAGATCT CCGACCTCAT CGAGAAAAAG GGTGCCACCG 120
TGAGCCAGTT CAAATCTTCC CCGATGGGTG CCGACCGCAC CTGAGAAGGC ACCTTTTTCA 180
TGTCCGAATA ATTCGCTTTC TGCAAGCTAT GGACGAGTGC TGAGCAATTG ACGGGGACGA 240
AGGCTTGTCG CTGCGGGTGG AAAGTTGGTG AACGGTTCGC GCAACAAGCT CCTTTCCAGT 300
GCCGGTTTCT CCACAAACAA GGACAGGGAG GTCAGAGGCT GCTACGAGCT TTATAGCATC 360
GAGTGTGCGT GTCCAAGCAG GAGAGGTTCC GATCATATTT TTAAATGCAG TGATTGGGGA 420
GCTAAGAGCG CATTTCGTTC GGTCAAAAGA GCGTGACTCT TTGACTCAGT GTCTCGGACG 480
CGTCGGTCTG GGCTACTGCG AGCGAGATAA GTTTAGAAAG AGTAGTAATG AAGCGTACAA 540
CGTCTGGGGT AAACT 555
(2) INFORMATION FOR SEQ ID NO: 652:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 509 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 652:
AAAAACATAG CCATCTGCAA ATCAGTAATG AGACGTATAG GTGGTGACAC TGCCCGGTGC 60
TGGAAGGTTA AGAGGAGAGG TTCGTGGTAA CACAACGCTT TGAATTGAAG CCCCAGTAAA 120
CGGCGGCCGT AACTATAACG GTCCTAAGGT AGCGAAATTC CTTGTCGGGT AAGTTCCGAC 180
CCGCACGAAT GGTGTAACGA CTCTGGACAC TGTCTCGACG CGAGACTCGG TGAAATTTAT 240
GTACCGGTAA AGAAGCCGGT TACCCATAGT TAGACGGAAA GACCCCGTGA ACCTTCACCG 300
TAGCTTACTA TTGGAACTTG GTTTACCATG TGTAGTATAG GTGGGAGACA GAGAAGCTTG 360
GCCGTCAGTT AGGCGGAGTC AACAGTGAAA TACCACCCTT GGTACGTCAG GTTTCTAACC 420
TTTGGCCGTG GATCCGGCAA AGGGACCGTG GTAGGTGGGC GGTTTGACTG GGGCGGTCGC 480
CTCCTAAAAG GTAACGGGAn GTnCGCGAA 509
(2) INFORMATION FOR SEQ ID NO: 653:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 499 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 653:
ACGCCCAGCG CCACCGCAGT CATAGTGCCG nTGTACACGC GCATGnCGTG GGTCCATGTT 60
AGCGCCGATG TGAAACCACA TGGAGCCACC GATGACGTTC TCACCTTCAn TGCGGGAGAA 120
GCGGCCCGTG AGCAAGTTCA TGCCGTTGAA CTGGGCAGAA CTAGCGATGC GGnTCTACCT 180
CTGCCACAAG CTGCGAAACT TCCACCTGGA TCTGCATGCG GTCTTCAGCA GAGTAGATGC 240
CGTTTGCCGC TTGAATTGCA AGCTCTCGGA TACGCTGCAT GATGTCGGTG GTTTCTTGCA 300
GATAGGCTTC GGTAACCTGA ATGAAGTTCA CACCGTTTGA GGCATTGGTG GATGCCTGGT 360
TGAGGCCGCG GATTTGGCTG CGCATTTTTT CTGAGACAGC CAAACCAGAA GCGTCATCCC 420
CTGCGCGGnT GATGCGGTAC CGGATGAAAG CTTCTCGATG CCCTTTCCAA CCTGGACATT 480
GGnGTGCCCG AGTGTGCGT 499
(2) INFORMATION FOR SEQ ID NO: 654:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 636 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 654:
TGGGGCGACG GCGGGTACAA TGTATTTGGT AAGCGCGTGC TGCCTGCGCT GCGGTCCTGG 60
CATTTTGATT TTGCCGGATT CCTCAAACTC GAAACCAAAA GCGGTGACCC CTACACCCAC 120
CTGCTCACCG GCCTGAACGC CGGCGTCGAA GCACGCGTGT ACATCCCCCT CACCTACATC 180
CGTTACAGAA ATAACGGAGG GTACGAACTG AATGGAGCTG TGCCCCCTGG GACTATCAAT 240
ATGCCAATTT TGGGGAAGGC GTGGTGCAGC TATCGCATCC CCCTCGGTTC CCACGCCTGG 300
CTTACACCGC ATACATCCGT GCTCGGCACA ACCAATCGCT TTAACGTTAT TAACCCCGCG 360
TACACCCTGT TGAATGAACG AGCGCTCCAG TACCAGGTGG GACTGACGTT CAGTCCCTTC 420
GAGAAGGTGG AGCTCAGCGC CCAGTGGGAA CAGGGGGTGC TTGCTGACGC TCCTTACATG 480
GGTATTGCCG AGAGTATGTG GTCTGAGCGT TACTTTGGCA CGTTTATCTG TGGGGTGAAG 540
GTGGTTTGGT GAGGGGTTGT CGTGTGGGCC AGAGAACGGG TACGGTGGGG GTGCGCGTTT 600
TCCCCGTGGG GCTGTGCGCG CTCAGTTTAC AGGCGA 636
(2) INFORMATION FOR SEQ ID NO: 655:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 513 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 655:
ACAAGATGCC GCAAATCTTA AGGCTCGTTT AGAGGCTCAG CCTGTGGTTA TTGCCATGCC 60
CGCCGGTACC AACGGTAAGT TGTACGGCGC TGTCACGAGT CATACCGTTG CAGAACAACT 120
TGCGTGCATG GGATTTGAGG TTGAGCGCAA CnAGTGGAGG TCCCTGGTCT TACTCTGAAA 180
TGTGTGGGGA ACTATCACGT CACTATAAGA CTATACGAGG AAATATGTGC TGTTGTTCCT 240
GTCACCATCA AAAACCAAAG CGAAGGAnCA GTGTGAGTGA GTAGACCGTT TGCGGAAGTA 300
TCTCTCTGCA CGGAGTATGT GCTGTTCGTT TTTAGTTGTG CAAAGAACTG TGCCACTTTG 360
GGGCTGGATC CACGGAAGGG AGACAGTTCT GGCAGTTGGG CCTCAACCCT TGCCTCGTTA 420
GAGGTTTCCA CGGAGAnGGG GGGGACTGTC TTTCACCACC TTCCCTCCTC TGAGTATTCC 480
TCGGAGGAGG TTCTGTGCCG GGCATGCCTA ATC 513
(2) INFORMATION FOR SEQ ID NO: 656:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 563 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 656:
CGCGCACGGT GGTGACTGAC CGGACTCTCG TGCGAAnGAG AGTAGGAGTC TAAAATCTTA 60
TCAGGGGCTC CGGGTGGGAC TCCGGCTGCC AGGGCTCGGG CTTGGGCGTG GACGGGCACG 120
CAACCCATAA GATGTTGGCT GGGGGTTGGG CTGGGGGAAA TAGACGTGCG GCGTGCGTTG 180
GGGCGATACC CACAATATCC TGCCTTCCAC TTCGTAGGAG CAGCATCGTT TTTCTTGTTC 240
GTACCGCGCC TGTGTGCGTT TGTGCGCGCA nGTnCCCTGC TGCTGGGGCG CCCGTGTTTG 300
ACTTGCCCTC CCAGTCGGTG TGAGGCAGGC CGCGTCTATC CCTCAGTGCG CATGTCCTCC 360
CTGCTTGAGG GGTTGGGCGC CACCATTTTC ATATGCAATG CCCCAGATGC AGTCGTCCTT 420
CTGCATGGGT GTGGTGAGAA AGACACCCTG AAATACATTG CTCTACTTCG TACCAGGAAC 480
TGCAGCAGCA GGGGGAACAG GGACACCTGG GTGAAAAGAC TGCACCATGC TAGGATGGGG 540
AATGGATATG TCCAAAAGTG TGA 563
(2) INFORMATION FOR SEQ ID NO: 657:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 527 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 657:
GAACCCTACT CATATATACT TTAGATTGAT TTAAAACTTC ATTTTTAATT TAAAAGGATC 60
TAGGTGAAGA TCCTTTTTGA TAATCTCATG ACCAAAATCC CTTAACGTGA GTATnCGTTC 120
CACTGAnCGT CAGACCCCGT AGAAAAGATC AAAGGATCTT CTTGAGATCC TTTTTTTCTG 180
CGCGTAATCT GCTGCTTGCA AACAAAAAAA CCACCGCTAC CAGCGGTGTT TGTTTGCCGG 240
ATCAAGAGCT ACCAACTCTT TnnCCGAAGT AACTGGCTTC AGCAGAGCGC AGATACCAAA 300
TACTGTnCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC AAGAACTCTG nAGCACCGCC 360
TACATACCTC GnTCTGCTAA TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG 420
TCTTACCGGG TTGGACTCAA GACGATAGTA ACCGGATAAG GCGCAnGGAT CGGGCTGAAC 480
GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAAnGACC TACACCG 527
(2) INFORMATION FOR SEQ ID NO: 658:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 620 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 658:
CTGnCAGTTC CTGGTACGAA nAGAGGCAAT GTATTTCAGG GTGTCTTTCT CACCACACCC 60
ATGCAGAAGG ACGACTGCAT CTGGGGCATT GCATATGAAA ATGGTGGCGC CCAACCCCTC 120
AAGCAGGGAG GACATGCGCA CTGAGGGATA GACGCGGCCT GCCTCACACC GACTGGGAGG 180
GCAAGTCAAA CACGGGCGCC CCAGCAGCAG GGACGCGCTG CGCCACAAAC GCACACAGGC 240
GCGGTACGAA CAAGAAAAAC GATGCTGCTC CTACGAATGG AAGGCAGGAT ATTGTGGGTA 300
TCGCCCCAAC GCACGCCGCA CGTCTATTTC CCCCAGCCCA ACCCCCAGCC AACATCTTAT 360
GGGTTGCGTG CCCGTCCACG CCCAAGCCCG AGCCCTGGCA GCCGGAGTCC CACCCGGAGC 420
CCCTGATAAG ATTTTAGACT CCTACTCTCC TTCGCACGAG AGTCCGGTCA GTCACCACCG 480
TGCGCGTATA TTCGGGGGAG TATTTCTCAC CAATAACATG CTGCAGCACG ACTGCGCAGT 540
CAGACGTGGG GCATAAGAAA GAGAATGCAG CGAACGTCAA TGGCACCGTG AGCGCCGGCA 600
CGCGGGGGCA TTGCATCCGA 620
(2) INFORMATION FOR SEQ ID NO: 659:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 503 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 659:
CCGGTCTGCC TCCTGACGnA TCAGGTCTGC GGATGCAAGG GTCCGCGTGT TACCGTCTTT 60
GAGAGCAGAT TTTTGGAATG TCTCGTTTTT AATTGCGGAT TCAGGGAGTC GTTTTGCCTC 120
TTATATGCGG CGCACCCATG CTCAAGGGTT TnGGGAACGG ATGGGTCAAA TTATGGCGTT 180
ACCTTTTCAG CTGCATGATG CGTATCCCCC CAnCGTGGTG GGGAGAAGGG AGACAGCTGG 240
TAGAGGATCT TGCCTTTGAG GTGTGTGCAG GTCTGGAGTA TCTGGAGTCT GTGACCCAGT 300
TGCAACCGGT ATACACCGTT TCAGTGGACA nGCAAAnGAT AGTCGTTnGT TGCAGCTAAG 360
GCTGATGTCA TGGGGCGTTG TTTTGTGTTA CCAGAAATTC ATGACGCAGA ACTGACAGGA 420
AACGCGGCGC TCCnTGTGTT GCCTGGATTT GATGCGGATA TGCAAACAnC AGCGCAACGG 480
CGTGTCGTCT CAAGCGTGAG TTT 503
(2) INFORMATION FOR SEQ ID NO: 660:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 587 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 660:
ATTCAGTTCG TACCCTCCGT TATTTCTGTA ACGGATGTAG GTGAGGGGGA TGTACACGCG 60
TGCTTCGACG CCGGCGTTCA GGCCGGTGAG CAGGTGGGTG TAGGGGTCAC CGCTTTTGGT 120
TTCGAGTTTG AGGAATCCGG CAAAATCAAA ATGCCAGGAC CGCAGCGCAG GCAGCACGCG 180
CTTACCAAAT ACATTGGTAC CCGCCGTCGC CCCATACAGT CCAGCCGACA CATAGGTCCA 240
CTGCGCCGTA AGCAGCGCAT CAGCATTGAA CCGGTCCAGC CGTGCACGCT CAAGCCAGGT 300
TAGCAGCATG ATAAGCACGA TGTCCGACTG ACTCGGCGCC ACAATCTCCT GTAGTCTCTG 360
CTCCGCCTGC GCCCTCTGGG GCAAGAGTTT TACCAACGCC ATTTGGATAA ACCGATCTCC 420
ACTATCAGTG GCAACCTTCG TCAGCGTAGA GACGAGCGTA CCGTCCTGCA GCGTGAGGGC 480
AAGCCCCACC CGCTCGAGAA TCGAGGACTC CGCCATCTGT CnCAGCAAAA ATTCAAGATA 540
CTTATCCTTT ACACGGTAnT GCTCCAACCC TGAAGGGAAT CGAAACG 587
(2) INFORMATION FOR SEQ ID NO: 661:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 661:
AGGAAGTAAA AAAAGACGCT GCGCTGGTAC AGGTCGCACC CCTGTTGCGC GCAnAGACGC 60
AGACGTGCGC ACACATGCTG CACGCAAAAA AATGGTACAG GCGTTGCGCC TACACATGAA 120
GGTTTGTGCG CGTGAGTTAC GATTTGAAGA GGCAGCCCTC ATCCGAGACA AAATTTTGCA 180
ACTGCAAAGG CAAGACGAGC AAAACGGGGT TTGATAGGGG AGGTGGAATC GAACAGCACG 240
CGTTTTTACC ATGTCACTTT CATTCCGCAG ACAAGGGTGC CGAAGTGGCG TTCGGACCAG 300
ATGCTCTCGG CAATGCCCAT GTAAGGAGCG TCACAAGCAC GCCCTGTTCC CACTGGGCGC 360
TGAGCTCCAC CTTCTCGAAG GGACTGAACG TCAGTCCCAC CTGGTACTGG AGCGCTCGTT 420
CATTCAACAG GTTGCCCGCG GGGTTAATAA TGTTAAAGCG ATTGGTT 467
(2) INFORMATION FOR SEQ ID NO: 662:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 662:
GCCAAAGTAT GCTGCACCCC CAGCTGCACC GGCACCGCGT ATTCTGGCAA ATCCCCGTGA 60
TGCCCAATTC CTCAGACTCA GCCCTAGCAT CCTGACACTT CCGGCGCCTC AGCAGGCTCA 120
GCACTGTCTT TGGAACGTAC CACCCTCCAT GTTCGAACGA ACACACCGAA CCCTCATTGG 180
GGGCCTGGAT GGTGATGTAA TGGTAGCTGT CGTAGATGAG CGCAGTGCAC TCAAAACCCG 240
CCGCTACCGA CAGATACGTA TTTACCCCCC CCCGGCCTGA ACGAATAAnC GGnCAAACAC 300
TGACCACGGA ACATCAGGCG GGTGAGCGCC GCGTTCCACC GTCAGCTGGA GCAGAAAACC 360
ATTCTCCGCA GGAACGCCAC ATCGAGTCCC ACCAnCGGGG TCTCCGGCGC ATAAGGGGAG 420
TAATACTCCA TCTCCGTGTC ATCGGGATCC CCATTACCTC CTTGCATCGG TCGCCTTAAT 480
ACACAAGCCC CACAAGTCAA GAACGCCACG TGGGTTnCCA nAAGGCCCCA 530
(2) INFORMATION FOR SEQ ID NO: 663:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 535 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 663:
ATTCAAAGCG TTGTGTTACC ACGGAACCTC TCCTCTTAAC CTTCCAGCAC CGGGCAGGTG 60
TCACCACCTA TACGTCTCAT TACTGATTTG CAGATGGCTA TGTTTTTGAT AAACAGTCGC 120
CTGGACCTGC TTTCTGCCAC CCTCACCAAA GGCAAGGGTC ACACTTCTCC CGAGGTTACG 180
TGTGTATTTT GCCGAGTTCC TTGACGCGAG TTCTCTCGAG CGCCTTAGAT TACTCATCCT 240
ACCTACCTGT GTCGGTTTGC GGTACGGTCT CTTGCAACCT AACCTTAGAC AGTATTTCCC 300
GTCGCCATGA CTACACCTGC TTCCCTTCGC TCATCGCTCC AGTCGCACTC GCACCTTACC 360
TCGAACGACG GATTTGCCTA TCGCTCTTAA AAGGCTCGGG ATACTTAGAC CAAAACTACC 420
AATCTTTGGC CGGGCTCAAC TCACGGTCCT GCCATCGAAA TGCAAGAGGT TCGGAATATA 480
AACCGGATTC CCATCGACTA AGACCCTCGT CCTCGCCTTA AGGGGCGAnT AACCT 535
(2) INFORMATION FOR SEQ ID NO: 664:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 641 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 664:
GCAGTGCCAG CCGCAGCCGA CGACATCTTA TAGGACGCAG CGTACATGCC TGTCGATTAC 60
AAAGTCCTAA AAGCCCACGC CCGAGCCCCA GCCGACATCC ACTTCCCGGT GTGGGACGTC 120
CGCCCGCATC GAAGGCCAGC AATGTATTCG AGGGAGTATT TCTCGCTAGA AACATAGCCA 180
TGCGAGAGCA CGACTGCGCA AACCTCACGG GGCATTGCAT CTGAAAAAAA TGGTGGCGCC 240
CAACCCCTCA AGCAGGGAGG ACGCGTGCAC TGGAGCTAGG CGCACCCCCT TAACACCGAC 300
TGGGAGGGCA AACCAAACGG GCAACGTTCC CAGCAGGAGT AACCCCCAGC ACGTTACGGG 360
CGGTACGAAC AAGCAAGCTG GCTGCGGGTC CCCGGGCGTC GTCCCTGGTG GCGGTTCCTG 420
GCTCTTACGA ATGGGGAGGC AGGAACAAGA CACGCGCGCA CATCCCCCTC ACACAAAAAA 480
CCTCGTTCCA CGCCCGAGCC GTACCCCCAG CCCGGAGTTG ACATCTACTT CTCGGGTCGG 540
CAGTCGCCCG GAGGAACTCC ACACTGTTCT AGCGGGTATG CCCAAGCAGC CGGGAGCCGG 600
CAACCGGGAG TTCCAGCCCT AACCGGGAGC CGGCAnCCCC A 641
(2) INFORMATION FOR SEQ ID NO: 665:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 434 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 665:
TTTACACGGT ATTGCTCCAC CCTGAGGGCA TCGAACGCGC TGTCAAACTT CTCCCGTGAG 60
CTCCCCGTTG CCAGAAGGCG ATTACCTGCA TCGGCAGGGT CCTGGTGTTG GTTACCGGCG 120
TCGAGGGCGA AGGAGAAGCG GAAGCCGGCG CCTGGTTCGA GGGTGAGTCG GCCTCCTACT 180
CCCCACAGGA GTGCTGTTTT GTTTTCGTTC GTGGAGTCTT CGGTACCCTT ACGGTAGTGC 240
TGCTCCAGTG TGGCATTCCC TGCCAGCTCC AACGTAAGCA GCCGCTGACG GTCGACGCCA 300
TAGGAAAGCG TTnCATCGGC CCCGAAGCCA TACTTGCTGT GCGTGGTGTC AGTAnTATCC 360
CAGGCACCAT TGGAAAGGAA GGAGAGGAAA CCGATGTCCA CATCnACTCC GCTGTTTCCC 420
ACATTGTGGG CCTG 434
(2) INFORMATION FOR SEQ ID NO: 666:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 540 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 666:
GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT 60
ACAGCGTGAC TATGAGAAAG CGCCACGCTT CCCGAAGAGn AAAGGCGGAC AGCTCTTGCT 120
AGATGTTGCG ACACGTAACC GGTTTGTGCT TTCGGATCCT GCGCCGGCGG TTTTGTGGAA 180
TGCCTTCGCT GACTCGGGTA TTGACGTAAC GCTCCTGACC TGGACTCACA TTGAGCATTT 240
CAATGATTTG CGCAATGCTA TCTTCGTGGA TATCGACGAA TGCTTCAAAC AGGCGGGCAT 300
TGAGGTTCCC TTTCCGCATG TGGACGTACG GGTGCAGGGG GCGTGCGATG CGCCACGTGC 360
GGAAAnGGTG TGAAATGCAG GGTGAGTCTT GAnGTGCGCT TTTTCTTTGG ACATTGACAG 420
GATGGATAGA GGGACAGGGG GAAGCCGAAT GAGATGAAAG GAAAAACGGT GAGCGCTGCG 480
CTCGTAGGGn AACTCATTGC CCTAAGCGTA nGGGTGGTTG CGTGTACTCA GGTGAAGGAT 540
(2) INFORMATION FOR SEQ ID NO: 667:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 667:
AGCAGGTGTA TCATGGCGAC GGGAAATACT GTCTAAGGTT AGGTTGCAAG AGACCGTACC 60
GCAAACCGAC ACAGGTAGGT AGGATGAGTA ATCTAAGGCG CTCGAGAGAA CTCGCGTCAA 120
GGAACTCGGC AAAATACACA CGTAACCTCG GGAGAAGTGT GACCCTTGCC TTTGGTGAGG 180
GTGGCAGAAA GCAGGTCCAG GCGACTGTTT ATCAAAAACA TAGCCATCTG CAAATCAGTA 240
ATGAGACGTA TAGGTGGTGA CACCTGCCCG GTGCTGGAAG GTTAAGAGGA GAGGTTCGTG 300
GTAACACAAC GCTTTGAATT GAAGCCCCAG TAAACGGCGG CCGTAACTAT AACGGTCCTA 360
AGTAGCGAAA TTCCTGTCGG GTAAGTTCCG ACCCGCACGA nTGGTGTAAC GACTCTGGAC 420
ACTGTCTACG ACGCG 435
(2) INFORMATION FOR SEQ ID NO: 668:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 668:
CCACCGCTAC CAGCGTGGTn TTGTTTGCCG GATCAAGAGC TACCAACTCT TTTTCCGAAG 60
TAACTGGCTT CAGCAGAGCG CAGATACCAA ATACTGTTCT TCTAGTGTAG CTTGTGTATT 120
AAGGCGACCG ATGCAGAGGA GGTAAGTGGG GATCCCGATG ACACGGAGAT GGAGTATTTA 180
CCTCCCCGTT ATGCGCCGGA GACGCCGCTG GTGGGACTCG ATGTGGCGTT CCGTGACGGA 240
GAATGGTTTT CTGCTCCAGC TGACGGTGGA CGCGGCGCTC ACCCGCCTGA TGTTCCGTGG 300
TCAGTGTTTG GCCGGTTATT CGTTCAGGCC GGGGGGGGGT AAATACGTAT CTGTCGGTAG 360
CGGCGGGTTT TGAGTGCACT GCGCTCATCT ACGACAGCTA CCATTACATC ACCATCCAGG 420
CCCCCAATGA GGGTTCGGTG TGTTCGTTCG AACATGGAGG GTGGTACGTT CCAAAGACAG 480
TGCTGAGCCT GCTGAGGCGC CGGAAGTGTC AGGATGCTAG GGCTGATCTG AGGAAT 536
(2) INFORMATION FOR SEQ ID NO: 669:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 669:
TGGGGCGCTG AGCTCCACCT TCTCGAnGGG ACTGAACGTC ATCCCACCTG GTACTGGAGC 60
GCTCGTTCAT TCAACAGGGT GTACGCGGGG TTAATAACGT TAAAGCGATT GGTTGTGCCG 120
AGCACGGATG TATGCGGTGT AAGCCAGGCG TGGGAACCGA GGGGGATGCG ATACTGCACC 180
ACGCCTTCCC CAAAATTGGC ATATTGATAG TCCCAGGGGG CACAGCTCCA TTCAGTTCGT 240
ACCCTCCGTT ATTTCTGTAA CGGATGTAGG TGAGGGGGAT GTACACGCGT GCTTCGACGC 300
CGGCGTTCAG GCCGGTGAGC AGGTGGGTGT AGGGGTCACC GCTTTTGGTT TCGAGTTTGA 360
GGAATCCGGC AAAATCAAAA TGCCAGGACC GCAGCGCAGG CAGCACGCGC TTAC 414
(2) INFORMATION FOR SEQ ID NO: 670:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 433 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 670:
GGCAACATAC ACCCGAGCAC CATCACTTTA CCCAGACGCT TCCCGCACAT ACCCCTGCAA 60
TCCCTCGCCT GTAAACTGAG CGCGCACAGC CCCACGGGGC AAAACGCGCA CCCCCAGGGn 120
nACCCGTTCT CTGGCCCACA CGACAACCCC TCACCAAACC ACCTTCACCC CACAGATAAA 180
CGTGCCAAAG TAACGCTCAG ACCACATACT CTCGGCAATA CCCATGTAAG GAGCGTCAnA 240
AGTCACCCCC TGTTCCCACT GGGCGCTGAG CTCCACCTTC TCGAAGGGAC TGAACGTCAT 300
CCCACCTGGT ACTGGAGCGC TCGTTCATTC AACAGGGTGT ACGCGGGGTT AATAACGTTA 360
AAAGCGATTG GTTGTGCCGA GCACGGATGT ATGCGGTGTA AGCCAnGCGT nGGAACCGAG 420
GGGGATTCGA TTA 433
(2) INFORMATION FOR SEQ ID NO: 671:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 671:
ACCTAGGGGA CGGCGCTGGG AGGAAAGAGT CCTGCCGGAA GnGCCCTGCG GCGGGTAGTA 60
GCTACCAGGA GAGGGTGACG CCGCACACGA TGCGGCCGAT TCCCTGGGTG AGGCACTCGG 120
ACACCAGCAG GTACGGGACA TCAGAGAGCA TACCCTGTTC CCAATCAAGG GAGAATACCG 180
TCTTCTCTAT GAGACTGGCT GAAATACCAG CACGCAGCTG TGCACAGTAC TCCTTGGTTA 240
GATAGGTAGC TCCTACTGCT CCACCTGCAG CAGGGGCATT CAGGTGTGCA CGGTTGGTAG 300
AGGCATGGAC CGTAACGCTT GGCTTCACCC AGCCGTAATC CTGCACCGGG ATGCGATACT 360
AACACCACGC CTTCCCCACC ACCGGTGGAC GGATATACTC CTTTTCCTGA ATGCC 415
(2) INFORMATION FOR SEQ ID NO: 672:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 653 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 672:
TCGTTGGTAA GTCTCGCTAC CAACAGGGCG CTGCCCTACT ATACGACAGC CGTGTTGAAG 60
CAAGCTGTTC CAGGAAAATA TACGGAGGAT CAACGCTGTG GTGCCAGCAA AAAGCGCAAA 120
CGCTCACCTC TTCCCAGGAA CTGGAAAAGG CAGTGTATTC GTTGTTCGTT CCCACGTTTG 180
AAAACCTGGT GTTGGGTGCA GGCGCGCTGC TGGCTCTTTT GGATATGCAT CAGATTGCGG 240
TGGACGCGCT GTTTACGGCG CAGTGGAAGT GGCTGTCTTC TGGCATATAC TTTGCCACAG 300
CACCGGCAAA CGTTTTTGGC ACCAGGGTGT TAGATAACAC CATCGCAAGC TGTGGCGACT 360
TTGCCGGATT CCTTAAGCTC GAAACTAAGA GCGGTGACCC CTACACCCAC CTGCTCACCG 420
GCCTGGACGC CGGCGTTGAA ACACGCGTGT ACATCCCCCT CACCTATGCG CTATACAAAA 480
ATAACGGGGG GACGGCTGTG CGTGGCATTC AGGAAAAGGA GTATATCCGT CCACCGGTGG 540
TGGGGAAGGC GTGGTGTAGC TATCGCATCC CGGTGCAGGA TTACGGCTGG GTGAAGCCAA 600
GCGTTACGGT CCATGCCTCT ACCAACCGTG CACACCTGAA TGCCCCTGCT GCA 653
(2) INFORMATION FOR SEQ ID NO: 673:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 457 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 673:
ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT GAGTACTCAA 60
CCAAGTCATT CTGAGAATAG TGTATGCnGC GnCCnAnTnG CTCTnGCnCG GCGTCAATAC 120
GGGATAATAC CGCGCCACAT AGCAGAACTT TAAAAGTGCT CATCATTGGA AAACGTTCTT 180
CGGGGCGAAA ACTCTCAAGG ATCTTACCGC TGTTGAGATC CAGTTCGATG TAACCCACTC 240
GTGCACCCAA CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 300
CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT TGAATACTCA 360
TACTCTTCCT TTTTCAATAT TATTGAAGCA TTTATCAGGG TTATTGTCTC ATGAGCCACG 420
TTACGTGCTT CGnTTTTCTC CCGCTCACGC TTATCAn 457
(2) INFORMATION FOR SEQ ID NO: 674:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 487 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 674:
TGATGCCGCA TAGTTAAGCC AGCCCCGACA CCCGCCAACA CCCGCTGACG CGCCCTGACG 60
GGCTTGTCTG CTCCCGGCAT CCGCTTACAG ACAAGCTGTG ACCGTCTCCG GGAGCTGCAT 120
GTGTCAGAGG TTTTCACCGT CATCACCGAA ACGCGCGAGA CGAAAGGGCC TCGTGATACG 180
CCTATTTTTA TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT 240
TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 300
TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT 360
GAGTATCAAC ATTTCCGTGT CGCCCTATTC CCTTCGAAAT ATCTGTGAGC GGAAnTCTGC 420
GTGTTTACTC ACCTGGCAGC GACTCAACGC GCAACACCCA AGGACGAGCA GAAGATGTGC 480
GCGGCAC 487
(2) INFORMATION FOR SEQ ID NO: 675:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 675:
ATTCTGAGAA TAGTGTATGC nGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA 60
TACCGCGCCA CATAGCAGAA CTTTAAAAGT GTACGCTCCT ATGATTAGAC GTTCCATCGT 120
CTTCTGGTTA AAGAGACTGG AGGTGCCCTT GTGATCGGCG TGCAnGTGCT CAATCGTCTT 180
GAATATCACG TCAAGGGAGT GGAGCGGnTT TGGATCGCGG TAAGTTAGTG AGGAAAATAT 240
GCCATGTACT GGATCTGGCA GGGTGAATGC ACCGTAAGCA CCACCTATCG TTCGAATTTT 300
TTCCCAAAAn GGGnTCAGTA CTTAGATATC GGGGnAAACA CCTGGCTCTA CCCCGCGTCT 360
CTCCAAAGGA AGCCGTGGGA TGTGCAAnGG ACAGCGCTGC AAAACCCACT TGCACAGGGn 420
TGGAAGCAGC GTTCACATAT TGCGGGTGCG CATGTGCTGC AGGGCTCTTG AAAGAGCA 478
(2) INFORMATION FOR SEQ ID NO: 676:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 443 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 676:
AACCAGTCAC AGAAAAGCAT CTTACGGATG GCATGACAGT AAGAnAATTA TGCAGTGCTG 60
CCATAACCAT GAGTGATTAC AGACCCGTCA CTGCTTCTCA CACACGCATC GGCAGTGCTC 120
TGCGCACGCT TTCCTGCATG ATTACGCGGT GAACAnGCAG CATTCGGTAA AATTACCGAA 180
TCATTCCTAG GACGGGGCCC AATACCCTCG GTGCGAAACA AATCCAGACA ATCGCGCACC 240
ACACGCAnGA AGCGTGCGAT ACTCTCTCTG TCCCCAGnGT GCATAGGACT GAGCCTACCA 300
TAAACTGCGT ACATGACGGT ACCCGATCCT TGACATCTGT CAATAGACAC TACACGAGCT 360
CTGCCCGCGG CACAGTGGGA ACGAACCGAC GCGTACGCAC CTGGnAGCCG TTCGGGTAAG 420
TCTGTTAAAA nCTATACCTG TGG 443
(2) INFORMATION FOR SEQ ID NO: 677:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 518 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 677:
CTGCTACAGA CGTGGGGCAT AAGAAAAACG GAGCGAATGG CGACATAGGC GCAGATGCGT 60
TGTTGACGTT GGGGTATCGT TGGTTCTCGG CGGGAGGATA TTTCGCATCG AAGGCCAGCA 120
ATGTATTCGG GGGAGTATTT CTCAACATGG CCATGCGAGA GCACGACTGT GCTGCCTATA 180
TTAAGCTCGA AACCAAGGGG TCTGATCCTG ATACTTCTTT CCTTGAGGGT CTTGATTTGG 240
GTGTTGATGT GCGTACGTAC ATGCCCGTCC ATTGGAACGC CTTCACCCAA GCCCGAGCCC 300
TACCCGGAGC CGACATCCAC TTCCCGGTGT ATGGAAAAGT CTGGGGTTCG TATCGTCATG 360
ATATGGGTGA GTATGGTTGG GTTAAAGTGT ATGCAAACTT GTACGGCGGT ACGAACAAAA 420
AAGCTGCTGC TGCAGCCCCT GCTGCTCCTA CGAATGGAAn GGCAGAATAT GTGGGTATAC 480
GAnTGTGGGG TATGGTCATC CGTTAGAGAA GTGGAGAT 518
(2) INFORMATION FOR SEQ ID NO: 678:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 384 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 678:
CGAAACGGTA GCCGTCGCGT CTATCCCCCA GTGCGCGCTA AAATAGTGCT GTACCCCCAG 60
CTGCACCGGC ACCGCATAGT TCTGACAGAC CCCTTTAATA CGCCGACTCT TCGTGTATGG 120
CGCCTACCTC CTTGGCAGTG CACGTTAGCA CGCCAAGCAA TGAACGTGGG CGGTGCCAGT 180
TACCGGCTGA ATAAGACGAG CTGCTCGGCT GCAGTAAGCC CTGCCCAAGA ACCGAAAGAA 240
AGTGCTGGCT ATCGTAGATG AGCGCGGTGC ACTCAAAACC CGCCGCTACC GACAGATGCG 300
TACTACCTTC CCCCGGTCTG AACGAATAAC CGGGCCAAAC ACCGGGCCGC AGAACATTAA 360
ACGCGTGAGT GCCGCGTTCC ACCG 384
(2) INFORMATION FOR SEQ ID NO: 679:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 679:
CTGACGTCTA AGAAACCATT ATTATCATGA CATTAACCTA TAAAAATAGG CGTATCACGA 60
GGCCCTTTCG TCTCGCGCGT TTCGGTGATG ACGGTGAAAA CCTCTGACAC ATGCAGCTCC 120
CGGAGACGGT CACAGCTTGT CTGTAAGCGG ATGCCGGGAA GGTAAATACT CCATCTCCGT 180
GTCATCGGGA TCCCCACTTA CCTCCTCTGC ATCGGTCGCC TTAATACACA AGCCCCACAG 240
GTCAGGACCG CCACGTGCGT CCCCATACGC CCCCAGCTTG GGAGAGACAG AAATACGCGC 300
GGCCACGCAC GGCAACGCAC ACCCGAGTAG CACCACTTTA CCCAGACATT TTCTCCACAT 360
ACCTTCACTC CTCCCCGCAA TTCTTCGACA GGACCCGTTC CTCCCGGCGC CTCCCCTAGG 420
TGCGGAAGAC CGGGCCAACG CGCAACGTAT ACGGGATGCG GAACTTAGCC AGCTTGGTGT 480
CAATGCCAAA CGAAACGGTA GCCGTCGCGT CTATC 515
(2) INFORMATION FOR SEQ ID NO: 680:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 680:
CCCCTGACGA GCATCACAAA AATCGACGCT CAAGTCAGAG GTGGCGAAAC CCGACAGGAC 60
TATAAAGATA CCAGGCGTTT CCCCCTGGAA GCTCCCTCGT GCGCTCTCCT GTTCCGACCC 120
TGCCGCTTAC CGGATACCTG TCCGCCTTTC TCCCTTCGGG AAGCGTGGCG CTTTCTCATA 180
GCTCACGCTG TAGGTACAGT AGGCGCGAAT AAAAATTCAA TTAAGATTAT TGGTGAGGCG 240
ACGGATAATA ACGCGCAGGC TTACTTTGCC TACGATAGCA AGAAGTCTGG TGGTTTTACT 300
ATTTCTCATT TGCGTTTTGG AAAGCAGAAG ATCCGTAAGC CCTACCTTCA TTACGCAGGC 360
GGATTTTGTA GCGTGTCATA AGTTTACGTA CCTTGAAACC TTTGACATGC TCAAAACGCT 420
CAAGCGTGGA GGGACCTTTT TGCTGAATGC GCCGTACAGT GAGCATGAGG TGTGGCATCA 480
CATACCCATA GAAGTCCAGC GTCAGATCAT TGAAA 515
(2) INFORMATION FOR SEQ ID NO: 681:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 681:
TGGCAGTTTC TGTGGTCAAC GCGCTGTCGT TGTGGGTAGA AGTGACAGTG TATCGTGATG 60
GTGCTGAGTA TTATCAGAAG TTTAATGTGG GGATGCCGCT TGCTCCAGTA GAGAAGCGGG 120
GAGTGTCGGA AAAACGTGGG CACTATTATC CGCTGGCAGG CGGACCCATC CATTTTCAAA 180
GAAACGGTGG CCTATGATTT TGACGTACTC CTGACGCGTT TGCGTGAACT TGCTTTTTTG 240
AATACCCATT GGGCTTGAAG ACCGTCTAGA GGGTGTCATC GATTTAATTT CGCTCAAAGC 300
CCTTTATTTC GAGGGAGAAA GTGGCGCGCA CGTGCGTGAG GCGCCCATTC CCGAACAGTA 360
TCAGGCAGAT GTGAAAAAGT ACCGGGATGA ACTCATCGAT GCGGCGTCTT GTTTTCTGAC 420
GAGCTTGCTG AGGCCTACCT TGAAGGAACT GAGACCGATC AATTGATTCG AGCGGCATAC 480
GTGCGGGCAn CATTGCAGAA AAGTTTGTnC CGGTTTTTTG CGGTTCTGCG TACAAAAATA 540
AAGTATTCAG CCACTTTGGA CGCT 564
(2) INFORMATION FOR SEQ ID NO: 682:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 682:
ACACAGCCCA GCTTGGAGCG AACGACCTAC ACCGAACTGA GATACCTACA GCGTGACTAT 60
GAGAAAGCGC CACGCTTCCC GAAGAGnAAA GGCGGACAGG TATCCGGTAA GCGGCAGAGC 120
TGCTCTTAAC TGACTTTCGT GCTGCGTTGG AGGATGACTT TTCTACGCCA CGTGCTCTGA 180
GCGCCTTACA AAAATTGGTG CGTGATACCT CGGTGCCGCC ATCGCTGTGT GTTTCGGCAC 240
TCCAGGTGGC GGATACAGTG CTAGGGTTAG GCATAATACA GGAAGCGACC GCATCGCTAT 300
CTGCGCAGGT TCCTGCTGGC GATACGTTGC CGCAGCGTCC TTTACCGAGT GAGGAGTGGA 360
TTGGACAGTT GGTGCGTGCG CGTGCACATG CACCCAAACG CGTGATTTTC CCCGTGGCAG 420
ATGAGATCCG TC 432
(2) INFORMATION FOR SEQ ID NO: 683:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 691 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 683:
TGnTAACTCT GCCTTTAAAT AAATAAATAA ATACATCTTT TAAAAAATTG AATGGAAAAG 60
CTCAACATCA AAAAATCCTT CCTGTGCATA TAAATAAGTG CAGGCAATAG GGGAGGGTGA 120
CATTATTCCT ACTTGTATAA AGAACTGTTT AGGCCGGCAC CGCGGCTCAC TAGGCTAATC 180
CTCCGCCTAG CGGCGCCGGC ACACCGGGTT CTAGTCCCGG TTGGGGCGCC GGATTCTGTC 240
CCGGTTGCCC CTCTTCCAGG CCAGCTTTCT GCTGTGGCCA GGGAGTGCAG TGGAGGATGG 300
CCCAGGTGCT TGGGCCCTGC ACCCCATGGG AGACCAGGAA AAGCACCTGG CTCCTGGCTC 360
CTGGCTCCTG GCTCCTGCCA TCGGATCAGC ACGGTGCGCC GGCCGCAGCG TGCCGGCCGC 420
GGCGGTCATT GGAGGGTGAA CCAACGGCAA AGGAAGCCCT TTCTCTCTGT CTCTCTCTCT 480
CACTGTCCAC TCTGCCTGTC AAAAAATAAA AAATTAAAAA AAAAACTGTT TAGTTTTTTG 540
TTGCATTAGT CTCATAGTAT CTTACTGGAA AnGTGTTCCA GTGTCCTAAT GGnCATTCAG 600
GGGCTGAACT TGCCATGATG GTAAATTTTT GGGATAATTC ATAAATAATG CAATTTTTCT 660
TCTCTAGAAG AATGGnnTTT CTCCAACCCC T 691
(2) INFORMATION FOR SEQ ID NO: 684:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 684:
TGCTGCAGTG GTTAATGAGA CGTAATCAAC ATCATCATGG CCTTGCACAC ACCATGGCAT 60
CACTTCCCTG AGACAGTGCT AGCCTGGGTC TTTAAGCATG CTTTTAATCC GACAGGTCAG 120
ACTTTATATA AACTATCCCC CCCCCCTTTT TTTTTTGACA GGCAGAGTGG ACAGTGAGAG 180
AGAGAGACAG AGAGAAAGGT CTTCCTTTGC CGTTGGTTCA CCCTCCAATG GCCACCGCGG 240
CCAGCGCGCT GTGGCCAGCG CACCGCCTGA ATCCGATGGC AGGAGCCAGG AGCCAGGAGC 300
CAGGTGCTTT TCCTGGTGTC CCATGGGGTG CAGGGCCCAA GCACTTGGGC CATCCTCCAC 360
TGCACTCCCT GGCCACAGCA GAGGGCTGGC CTGGAAGAGG GGCAACCGGG ACAGAATCCG 420
GCGCCCTGAC CGGGACTAGA ACCCGGTGTG CCGGCACCGC TAGGCGGAGG ATTAGCCTAG 480
TGAGCCGCGG CGCCGGCCCC GTTTTTTCTT TAATTTAGTG CAAGGAACTC AGGTTTATTG 540
TCAGAAATGG AAATGATGGC TGATAnCTGC GTGCGT 576
(2) INFORMATION FOR SEQ ID NO: 685:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 578 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 685:
CnGAGAGAAA GGTnTTCCTT TGCCGTTGGT TCACCCTCCA ATGGCCGCCG CGGCCGGCGC 60
GCTGCGGCCG GCGCACCGCG CTGATCCGAT GGCAGGAGCC AGGAGCCAGG TGCTTTTCCT 120
GGTCTCCCAC GGGGTGCAGG GCCCAAGCAC CTGGGCCATC CTCCACTGCA CTCCCTGGCC 180
ACAGCAGAGG GCTGGCCTGG AAGAGGGGCA ACCGGGACAG AATCCGGCGC CCCGACCGGG 240
ACTAGAACCC GGTGTGCCGG CGCCGCTAGG GGGAGGATTA GCCTAGTGAG CCACGGCGCC 300
GGCCCATAAA ATAAATCTTT AAAAACAGTT TATTTTTAAA AAAGGTGGCT TTAATTTTCT 360
TTTATTTTTG GCAAAGCTAT GATTGTAAGA GTGTTTGAAA AGGTAACACA TCTCCTTTTC 420
CTTAGCCACA CTTTTTAACC TTTTGCACAG AGGTTAGTTT TTGTTCATCT CATAGTTAAT 480
GACATGAAAT AGAGGTTGGC AAACTTTTTC TCCAAAGGGC CAGATGGTAA ATAGTTTGGG 540
TCTTTTGGGC CACATGTGGT CTCTGTTGTG TATTGTTC 578
(2) INFORMATION FOR SEQ ID NO: 686:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 461 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 686:
TACGCCTATT TTTATAGGTT AATGTCATGA TAATAATGGT TTCTTAGACG TCAGGTGGCA 60
CTTTTCGGGG AAATGTGCGC GGAACCCCTA TTTGTTTATT TTTCTAAATA CATTCAAATA 120
TGTATCCGCT CATGCAATAA TGAATACGCT CAGGAGACAG CCGCTAACCC AAATTGCGCG 180
CAAnGGGTAA TGAAGGTGAG GGATATATAT CTTGGTGTTG AGTTTAGTCC AGAAAATCCT 240
GCGCAGACAT CTGCCCTTTC TTTTCCTCAG AGTAACGTGC TGCAGTATTT TGTACAGGGT 300
GGCAGTATGT AAGTATACGA CCGAGTGGAA CAGAACAAAA AATAAAGTGT TATATCATCC 360
ACCCTCTGGA CCGTCATACC TCGATAGAAG AAGCAGAACA GGCGGGGCAA CAGGTTATCA 420
CCGCnTTGAA CCAGAGGnGG GGACATATCT GCCATGGTAT A 461
(2) INFORMATION FOR SEQ ID NO: 687:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 508 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 687:
CCCTGATAAG ATTTTAGACT CCTACTCTCC TTCGCACGAG AGTCCGGTCA GTCACCACCG 60
TGCGCGTATA TTCGGGGGAG TATTTCTCAC CAATAACATG CTGCAGCACG ACTGCGCATC 120
AGACGTGGGG CATAAGAAAG AGAATGCAGC GAACGTCAAT GGCACCGTGA GCGCCGGCAC 180
GCGGGGCATT GCATCCGAAG ATGGTAnCGC CGGAAACCTC AAGCATGGAA AGCCGCGCGC 240
AAAAGACGCA ATTCCACCCT GACGGGGAGG GCAAGTCAAA CACGGGCGCC GCCCCAGCAG 300
GAAGAAACCA CAGCGCCGCG TTTGCACGCT ATGCTCCTGC TACGAATGGA AGGCAGAATA 360
TTGTGGGTAT TACGAGTGTG GGGTAGTGGT CAATCCGTTA GAGAAGGTGG AGATTCGGCT 420
GAAGCTGGGA GCAAnGCAAA GCTACAAGAG AACAGCAATG TAGTGATAnA GAAAAACGTG 480
ACGGAGCGTT TGCAATTCGT AAGGGCAT 508
(2) INFORMATION FOR SEQ ID NO: 688:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 436 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 688:
ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG TTACCAGTGG 60
CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA TAGTTACCGG 120
ATAAGGCGCA CGGGTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA 180
CGACCTACAC CGAACTGAGA TACCTACAGC GTGAGTATGA GAAAGCGCCA CGCTTCCCGA 240
AGAGAAAGGC GGACAGGTAT CCGGTAAGCG GCAAGGTCCG AACAAGAGAG TTCGCGTAAC 300
GTTTCACATG GAAnAGCAGC GCTTCCCGAA GCCAACACAC GTTGCAAAAG GCGGCACACG 360
AATTTTCAAG CACTGGGGCA nCTGGnCAGA GGGCACGGAT CGCTAnCTGC TCTCCGATTC 420
AAGCCGGCGC AAAGTG 436
(2) INFORMATION FOR SEQ ID NO: 689:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 689:
TGCAGGTAAC ATCTCTAGGA CCTCACTATC ATTGCTTTAT ATCACTATTT TATTTAATAG 60
GCTTTTTTTT TGTTTTTGAC AGGCAGAGTG CACAGTGAGA GAGAGAGAGA GAGAGAGAGA 120
GAAAGGTCTT CCTTTGCCGT TGGTTCACCC TCCAATGGCC GCTGTGGCCG GCACACGGCA 180
CTGATCCGAA GCCAGGAGCC AGGTGCTTCT CCTGGTCTCC CATGGGGTGC AGGGCCCAAG 240
CACTTGGGCC ATCCTCCACT GCACTCCCGG GCCACAGCAG AGAGCTGGCC TGGAAGAGGG 300
GCAACTGGGA CAGGCTCCGG CGCCCCGACC GGGGCTAGAA CCCGGTGTGC TGGTGCCGCA 360
GTGGAGGATT AGCCTAGTGA GCCACGGCGC CGGCCTTTCA TAGGCTTTTA ACCCAAGCCT 420
GGCACCCCAA GATTTCAGAA GCTCCAAGAG GACTTTGCTG TTTACATTAG CACAGGTTTT 480
ATTATAAAAn nGCTGATTTG GGCCTCCTTC TCTAATTAAT AGTACTTTTA GnCACATTTT 540
TAAGATGTTT ATGAAGATGT TACTGCATTG CTGCATTTAT GATTACnGTA AGACACCTCA 600
AAG 603
(2) INFORMATION FOR SEQ ID NO: 690:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 531 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 690:
ATAATTCTAA AATTTCTGTC ATTTCAGGGT TGACACGAGG CTGGTGCCGT GGCTCAATAG 60
GCTAATCCTC CACCTAGCGG CGCCGGCACA CCGGGTTCTA GTCCCGGTCG GGGCGCCGGA 120
TTCTGTCCCG GTTGCCCCTC TTCCAGGCCA GCTCTCTGCT GTGGCCAGGG AGTGCAGTGG 180
AGGATGGCCC AGGTGCTTGG GCCCTGCACC CCATGGGAGA CCAGGAGAAG CACCTGGCTC 240
CTGCCTTCGG ATnAGCGnGG TGCGTGACCT GCAGCGCGCC GGCTGCGGCA GCCATTGGAG 300
GGTGAACCAA TGGCAAAGGA AGACCTTTCT CTCTGTCTCT CTCTCTCACT GTCCACTCTG 360
CCTGTCAAAA AAAAAAAAAA AATACTGTGT CTTGGGGCTG GCATGTGGTG CTGAAAATCC 420
CATATGGGCG CTGGTTCGAC TCCCAGCTGC TCCACTTCCA TCCAnCTCTC TTCTATGGCC 480
TCAGAAAGCA GCAGAAGATG GCCCAAGTCC TTGGGGCCCT GCAACCATGT G 531
(2) INFORMATION FOR SEQ ID NO: 691:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 629 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 691:
GAGCAAGGCA AGACTACAAG AGAACAGCAA TGTAGTGATA GAGAAGAACG TGACGGAGCG 60
TTGGCAATTC GTAGGGGCAT GTCGCTTGAT TTGGTAGAGA ACGGCATGCC CGTCTATTAC 120
TTCGCAGCCC GAGCCCAACT CCCAGCCGTA GCCCCAGCCA ATGACATCTT ATGGGACGCA 180
GGCkTACATG CCCgTCCATT GGAACGCCTT cACCCAAGCC CGAGCCCTGC CCGGAGCCCC 240
AkTCCCAGCC ATCTACTTCC CGGTTGTAGT TACAATTCCA CGCTTTCTGG CGACTATGCC 300
CGAGCCGCAG CCGCAGCCGG GGCTGGAGTC GACATCAACT TCCCGGTGTA TGGGGGTGTC 360
TTGCACGCAT CGCAGGCTAG TAATGTATTT CAGGGTGTCT TTCTCACCGA TACCACACCC 420
ATGCGGACGC ACGACATACC CCGCAGTCCC CTCGTGGGGC ATAAGAAAAA CGCAGCTCCC 480
GATGGCATAG GCGCCTCACG CGCGTGCTGC CCAGCGCGCG AGAACGAACC CTTTAAAAAG 540
GGTTCGACAA ACAGCCGTGG GGGGGGGGTA GAATGGAGTA GGTCCTCGAC GAGACGCGTA 600
AGAGGATCGG CGTTGGAGCG GGGTATGAA 629
(2) INFORMATION FOR SEQ ID NO: 692:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 692:
GATACCCAAT AAATTCAGTG TCTGGATTTC CTCGGGAGAG TTTCCATTGG ATTCCTGTGG 60
CTGTATCATA CCTTCTTTAG TGTGTGTCCC ATTTTTTGTT GTTGTTGCTG TTGTTGGTAA 120
TTGGATATAT ATATTTTTTT TTTGACAGGC AGAGTGGACA GTGAGAGAGA GAGACAGAGA 180
GAAAGGTCTT CCTTTGCCGT TGGTTCACCC TCCAATGGCC GCCGCGGCCG GCACGCTGCG 240
GCCGGTGCAC CGCGCTGATC CAAAGGCAGG AGCCAGGTGC TTCTCCTGGT CTCCCATGGG 300
GTGCAGGGCC CAAGCACCTG GGCCATCCTC CACTGTACTC CCGGGCCACA GCAGAGAGCT 360
GGCCTGGAAG AGGGGCAACC GGGACAGAAT CCGGCGCCCT GACTGGGACT AGAACCTGGT 420
GTGCCGGCGC CGTGGCACTG GCCGGTAATT GGATATTGCA AATAATTGAT ATTTGGCAAC 480
TTTAGGAAGC AGATGCTCTT AATGAACAAG GTTGCTGTTG TTGGCTTCAA TGTTTAATGC 540
C 541
(2) INFORMATION FOR SEQ ID NO: 693:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 505 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 693:
AACGCCTGGT ATCTTTATAG TCCTGTCGGG TTTCGCCACC TCTGACTTGA GCGTCGATTT 60
TTGTGATGCT CGTCAGGGGG GCGGACTAAT GGAAAAACGC CAGCAACGCG GCCTTTT AC 120
GGTnCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA TCCCCTGATT 180
CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC AGCCGAACGA 240
CCGAGCGCAn GGTTCAGTGA GCGAGCCGCG CGTGTTTCTA ATAAGGTGGG GCTAGAGGAG 300
GATCCTTCTA ACTTCTTGCT TATGCACGCG ATGGGTCCTA ACGTGGCTGG TGTCATTGGG 360
ACCGCGATAC CGCAGGGTGT TCATCTCGGC CTACGGAGGG TAGGGAGGAA GAGTAACCGC 420
GGGGTTTTGC CGCTTAGGTA ACCTTTCCTC CGTGCGCGGG CAnAnCCTCT CAnGTGGGCT 480
AAGGGGnTTT TGCAGACGAA GCGGG 505
(2) INFORMATION FOR SEQ ID NO: 694:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 526 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 694:
AAGAGATTTT TTTTTGACAG GCAGAGTGGA CAGTGAGAGA GAGAGAGAGA GACACAGAGA 60
AAGGTCTTCC TTTTGCCGTT GGTTCACCCT CCAATGGCCG CTGCGGCCGG CGCACTGCAG 120
CCAGCGCATC GCCTGAATCC AAAGCCAGGA GCCAGGTGCT TTTCCTGGTC TCCCATGGGG 180
TGCAGGGCCC AAGCACTTGG GCCATCCTCC ACTGCACTCC CTGGCCACAG CAGAGAGCTG 240
GCCTGGAAGA GGGGCAACCG GGACAGAATC CAGCAnCCCA ACTGGGACTA GAACCTGGTG 300
TGCCAnCGCC GCAAGGGGAG GGATTAGCCT ATTGAGCCAA AnCGCTTGGC CAGCAAAGAG 360
ATTTGGATAT TTCATTTCCA TTACAGCCAA GGTTTGGTCA GGTCAACTAG GAGCCAGGAA 420
TCTTATCCAG GTCTCCCCAC GTGGGTGACA GGGACCCAAA TATTCAGCTT TCATCGTTTG 480
CTCTAnGCTA ATGnATTAAC ATGAAAGCTA AATTGGATGT TGTTAA 526
(2) INFORMATION FOR SEQ ID NO: 695:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 452 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 695:
GGTGCAGCGG CTCAGTAGGC TAATCCTCTA CCTTGCGGCG CCGGCACATC GGGTTCTAGT 60
CCCGGTCAGG GCGCCGGATT CTGTCCCGGT TGCCCCTCTT CCAGGCCAGC TCTCTGCTGT 120
GGCCCGGAAG TGAAGTGGAG GATGGCCCAA GAACTTGGGC CCTGCACCCC ATGAGAAGAC 180
CAGGAGAAGC ACCTGGCTCC TGCCATCGGA TCAGCGCGGT GCACCGGCCG CCGCGCGCCA 240
GCCGTGGCGG CCATTGGAGA GTGAACCAAC TGCAAAAGGA AGACCTTTCT CTCTGTCTCT 300
CTCTCTCACT GTCCACACTG CCTGTCCAAA AAAAAAAAAA AAAGAnnAGA AGAAAAAAAA 360
AAAACACTTT GATGTAAATA TGTTCTTTAA AAAAAAGTAT GCTCAATTTT TATTATATTA 420
TTAAnAGTAT TCTAAAAACn ATATAAAGGG GT 452
(2) INFORMATION FOR SEQ ID NO: 696:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 482 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 696:
CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA TAGTnACCGG 60
ATAnCnCGCA CGGnTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA 120
CGACCTACAC CGAACTGAGA TACCTACAGC GTGAGCTATG AGAAAGCGCC ACGCTTCCCG 180
AAGAGAAAGG CGGACAGGTA TCCGGTAAGC GGCAGGGTCG GAACAGGAGA GCGCACGGGA 240
AATAAAGGCC ATGCGCGTGA AGTGCGTTTC TCCAAGCGAG CTTATCAGTG CGCTCAGCGG 300
GGGTAATCAG CAGAAAGTCA TTATTGGAAA TGGCTCGAAC GCGATCCCGA CGTCCTCTTG 360
CTTGATGAGC CGACCAGGGG GATCGACGTG GGTGCGAAAT ATGAAATTTA TCAGCTCATC 420
ATTCGTATGG CGCGTGAGGG AAAGACAATC ATGTGGTTTC TAGTGAAATG CCTGAAATTC 480
TT 482
(2) INFORMATION FOR SEQ ID NO: 697:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 697:
GTTGGTTCAC CCTCCAGTGG CCGCCGCGGC CAGTGCGCTG CGGCCAGCGC ACCACGCTGA 60
TCCGATGGCA AGAGCCAGGT ACTTCTCCTG GTCTCCCATG AGGTGCAGGG CCCAAGCACT 120
TGGGCCATCC TCCACTGCAC TCCCGGGCCA CAGCAGAGAG CTGGCCTGGA AGAGGGGCAA 180
CCGGGACAGA ATCCGGCTCC CCGACCGGGA CTAGAACCCG GTGTGCCGGC GCGGCAAGGT 240
GGAGGATTAG CCTAGTGAGC CGTGGCGCCG GCCGCGATTG TGTTTAAACA TGCGTGCACA 300
TCTGCCTGAA GACAGTTCAA TTCGTATCTG CCTTGAGTCG CTGAGAATCT TTCTTCCCAG 360
TCTGTTATTT ATCATCTGTC ATAAGCATGA CCTGAAATGC TGATTGGAAT CAnTCATCTG 420
ATAAGATCCT AACATCTCCT TCTCTGAAAT TTTTCTATAA TTTCTCTGGA ATAAATTGTG 480
AATATACAAG GCTTACTAAT AACATTTCCT TATCAGATAT TAATAACATT GTTCGTCTGC 540
TTTTGGCTCC CTTTGTCTCT CTATTGAGGG CCTTATTGCA 580
(2) INFORMATION FOR SEQ ID NO: 698:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 569 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 698:
CGCTGCGGCT CACTTGGCTA GTCCTCTGCC TGCGGCGCCT GCACCCTGGG TTCTAGTCCC 60
GGTCGGGGCG CTGGATTCTG TCCCGGTTGC CCCTCTTCCA GTCCAGCTCT CTGCTGTGGC 120
CTGGGAAGGC AGTGGAGGAT GGCCCAAGTG CTTGGGCCCT GCACCTGCAT GGGAGACCAC 180
GAGGAAGCAC CTGGCTCCTG CCTTCGAATC GGCGCACGTG CTGGCCGCAG TGCGCCACCG 240
TAGCAGCCAT TTGGGGAGTG AACCAATGAA AGGAAGACCT TTCTCTCTGT CTAACTCTGC 300
CTGTCAAAAA AAAAAAAAAA AAAAAAAAAA GGATGATAGA CTATGAGCTG TGACTATTTT 360
AAAATTTATT GTATATGAGT GAAATAGACA TCTTTTCATT TATTACTGCT TATGGCCTTG 420
CCTATATTCC TGCGGAACTA TGGTGTTTTT ACTTGTTGAA CTCTTTATTT AGTGGAGCAC 480
TAAAGATTTG ACTATTTGTA ATGnATGTTA AAAATATGTT ATCTTGGGGC CGGAGCTGTG 540
GCACAGCAGA TTAATGCCTT GGCCTGAAG 569
(2) INFORMATION FOR SEQ ID NO: 699:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 421 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 699:
TGTAAATTAT GTGTTGAGAG GGTTACCTTT TTTTTTTTTC AAAATAAATT GTTGCCACTA 60
GGAAAAGAGG GCAATGTATT TTGTTAAATT TGGTTGCTAA GAAATGATGT GTTAGTCACG 120
TGAATTCTCT CAAACATCAG ATACTTTTCT GCTTCAAGGC CTTTATTTTT GTAGGTACAT 180
TGCCTAAAAA AATCTTTTTT TTTTTTTTTT TTTTTTTTTT TTGACAGGCA GAGTGGACAG 240
TGAGAGAGAG AGACAGAGAG AGAAAGGTCT TCCTTTGCCG TTGGTTCACC CTCCAATGGC 300
CGCCGCTGCA GCCGGCGCAC CGTGCTGATC CGATGGCAGG AGCCAGGAGC CAGGTGCTTT 360
TCCTGGTCTC CCATGGGGTG CAGGGCCCAA GCACCTGGGC CATCCTCCAC TGCACTCCCT 420
G 421
(2) INFORMATION FOR SEQ ID NO: 700:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 701 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 700:
CCCTTGGGTA AATTCCCAGG AGTGAGAGGC CTGGGTCATA TGATAGGTCT ATATTAGATT 60
TATTTTAAAT ATTTATTTAT TTGAAAGAGT AACACAGAGA GAGGAGAGGC AGAGAAAGAG 120
GGATCTTCCA TGCAATGGTT CACTCCTGAG TTGGCCGCAA CAGCCGGAGC TGTGCCAATC 180
TGAAGCCAGG AGCCAGGAGC TTCCTCTGGG TCTCTGACGT GGATGCAGGG GCCCAGGGAC 240
TTGGGCCATC TTCTACTGCT TTCCCAGGCC ATACTAGAGA GCTGGATAGG AAGTGGAGGA 300
GCCAGGACTA GAACCAGCGC CCATAAGGGA TGCTGGCGCT TCAGGCCAGG GCATTAACCC 360
ACTGnCGCTA CAGCGCCGGC CCTGGTCTAT ATTAGATTTT GAGATATCTC TATACTGTTG 420
TCCACAGTGG GCTTTACCAG TTTACATTCC CACCAGTAGT GGATTAGGGT ACCTTTTCCC 480
CCACATCCTC GCCAGCATTT GTTTGTTGAT TTCTGTATGA AAGTCATTCT AACTGGGGTG 540
AGGTGAAACC TCATTGTGGT TTTTGATTTG CATTTCCCTG GATTGCTAGT GATCCTGAGC 600
ATTTTTTAAT GTATCTGTAG CCATTTGGAT TTCCTCTTTC GAGAAATGTC TTTTTAAGTC 660
CTTTGCCCAT TTCTTGACTG GGGCTGTTTG TTTTGTTGAT G 701
(2) INFORMATION FOR SEQ ID NO: 701:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 701:
CCACCGTCTA CCAGnGTGGT ATCTGTTTGC CGGATCAAGA GCTACCAACT CTTTTTCCGA 60
AGTAACTGGC TTCAGCAGAG CGCAGATACC AAATACTGTT CTTCTAGTGT AGCTTGTGTA 120
TTAAGGCGAn CGATGCAGAn GAGGTAAGTG GGGATCCCGA TGACACGGAG ATGGAGTATT 180
TACCTCCCCG nTATGCGCCG GAGACGCCGC TGGnGGGACT CGATGTGGCG TTCCGTGCGG 240
ACAATGG 247
(2) INFORMATION FOR SEQ ID NO: 702:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 573 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 702:
AAGTCACTTG CTGCTGCTTT CTTAATAGCA TTAAGAGGGA GATGGATCAG AATTGGAGCA 60
ACTGGGACTC CAACCAACAC CCATATGGGA TGCTGGTGTT GCAGGCAGCA GCCTTAACCA 120
TTATGTCACA TCACCATCCC CAAAGGGAAC TTTATAGCAG TAAATGCTAT AGAAAACAAA 180
ACCCGGGGCC AGCACTGTGG CATAGCAGGT AAAGCCGCCG CCTGCAGTGC CAGCATCCCA 240
TATGGGCACC AGTTCGAGTG CCAGCCACTC CACTTCAATC CAGCTCTCTG CTGTGGCCTG 300
GGAAAGCAGT AGAAGATGGT CCAAGTGCTT GGGGCCCTGC ACCCACGTGG GAGACACAGA 360
AGAAGCTCCT GGCTCCTGGC TTCGGATCTG TGTAGCTCCA GGTATTGTGG TCAACTGGGG 420
AGTGAACCAG CGGATGAAAG ACCTCTCTCT CTCTCTCTCT CTGCCTCTTC TTCTCTCTCT 480
GTGTAACTGG ACTTTCAAAT AAATAAAAAT AAATCTTATA AAAnAnAAAA ACACAAAGAT 540
CTCAAATCAG CAACCTTAGT TTTGTACAAT AAA 573
(2) INFORMATION FOR SEQ ID NO: 703:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 305 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 703:
CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG CCTTTCTCCC 60
TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG TnCGGCTCCC TCnCCTTTGC 120
TGATGAAACA GAAAAGCTTG CCTGGCTCGC GCGCCAAGGT TTCGTGACGG TACATTCGCA 180
TCGCTGCGCT AACGCACAGG AAGTTGTTGC ACTCCGATCT GAGATTATGC GCACGCGCGA 240
GCTGTTGnCT TACAnGATCG ATGGCCTGGT AGTAAAGAGT ACCGATCTTG GACTTCCAGG 300
ACGTA 305
(2) INFORMATION FOR SEQ ID NO: 704:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 490 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 704:
AnATCGACGC GGAACTGGCC GCCACGGCGG CGCACCACGC TGATCCGAAA CCAGGAGCCA 60
GGTGCTTCCT CCTGGTCTCC CATGCGGGTG CAGGGCCCAA GCACTTGGGC CATCCTCCAC 120
TGCACTCCCG GGCCACAGCA GAGAGCTGGA CTGGAAGAGG AGCAGCCAGG ACAGAATCCG 180
GCGCCCCAAC CGGGACTAGA ACCCAGAGTG CCGGCGCCAC AGGCAGAGGA TTAGCCTAGT 240
GAACACGGCG CCGGTCCGGG GCTTTATTTC GCTTGAGGAC CTCTATTGGT CATTATTGGT 300
CACTCCGTCT CCTCCACTGC CCTGTCTGAG CTTTGTCTGC TCCTGTCTCT TTTAGGCCTA 360
GAGGTGACTT TTCACCCTGG GGTGCCTCTG ACACCCTTTC CTGCTnCTCC TTCAACCCCA 420
AGCACAGCTT TGCAAATCCA TCTTCTATGA TGTGGCCAGA TnGGGTGTGG TATCAGCTCT 480
TGCCGGGGCT 490
(2) INFORMATION FOR SEQ ID NO: 705:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 594 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 705:
GCTCTGGCCC AGAAGTACTG TGGAGGATGG CCCAAGTGCT TGGGCCCTGC ACCCACATGG 60
GAGACCAGGA GAAGCACCTG GCTCCTGGTT TCGGATCAGC ATGGTGTGCC AGCCACAGCG 120
CGCCACCACA GCCACCAGTG GAGGGTGAAC CAACGGTAAA GGAAGACCTT CCTCTCCGTC 180
TCTGTCTCTC TCACTGTCCA CTCTGCCTGT AAAAAAAAAA AGAAGTAAAC ATGAAACATG 240
TCACCACCAA ATTCACTAAC ATTCAGGTAG GATCATTATC CAACCAACAA AATGACAGGT 300
CTTAATTTTT ATTTCTCAGT ACTAACCTTG AATGTAAATG GATTAAATTC ACCAACCAAA 360
AGACTTAGAG TGGCTGAATG GATTAAGAAC CATGACCCCA TTATATGCTG CCTACAAGAC 420
ACTCATTCCA CAAACAAAAG TACACACAGA CTTAAATTGA AGGGTTGGGA AAACATATAC 480
CAAGCAAATG GAAACCCAAA ATGAGCAGGC ATAGCTATCA CAATATTCAA TGAAACAGAC 540
TATAAATCAA AAGCTATTAA AAAGATTAAG AAGGnCATTA TATTTTGATA AAAG 594
(2) INFORMATION FOR SEQ ID NO: 706:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 706:
ATATAAGTGA GATCATGTAG TACTTGTCTT TCTGGACTGG TTTATTTGTA ATTTCATGAT 60
CTCCAAGTCC ATCCATGTCA TTGCAAATGA CAGGAATTCA TTCTTCATAA GGCTGAATAG 120
TATTCCATTG TGTTATG AC CACGATTCCT TTATCCATTC AGGGTTTATG GACACCTAGG 180
TTGATTCCAC ATCTTGGCTA TTGTGAATAG TGCTGCAGCA AACGTGCGGC TTTAGATATC 240
TCTTCAACAT ACTTATTTAA TTTCCTTTGG ATATTTTTTT TTTTGAGCGG AGTTAGACAG 300
TGAGAGAGAG AGACAGAGAG AGACAGAGAG AAAGGTCTTC CTTTTGTTGG CTCACCCCCA 360
AATGGTTGCT ACGGGCTGGT GCGCTGCGCC GGATCTGAAG CCAGGAGCCA GGTGCTTCCT 420
CCTGGTCTCC AATGTGGGTG CAGGGCCCAA GCACTTGGGC CATCCTCTAC TGCCTTCCTG 480
GGCCATATCA GAGAGCTGGA TTGGAAGAGG AGCAACCAGG ACAGAATCCG GCA 533
(2) INFORMATION FOR SEQ ID NO: 707:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 707:
GnGAGGCAnA GAGAGAGAGG TCTTCCATCT GCTGGTTCAC TCCCCAATTG GCTCCAATGG 60
CTGGAACTGA GCCAATCGGA AGCCAGGAGC CAGGAGCTTC TTCCGTGTCT CTGACACAGG 120
TACAGGGGCT CAAGAACTTG GTCCATCTTC TACTGCTTTC CCAGGCCATA GCAGAGTTGG 180
ATCAGAAATG AGGCAGCCAG GACTTGAACC AGCACCCACA TGGGATGCCA GCACTGCAGG 240
CAGCAGCTTT ACCCATTACA CCACAGTGCC AGCCCTTCAA CTATTTTTAA TAGATCACTT 300
TTTTAAAAAT AGAATTTATC AGG 323
(2) INFORMATION FOR SEQ ID NO: 708:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 630 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 708:
ATGATAAAAT ATTTGTAATG ATCCTTAAAA TCTTCTTAGA ACAATTATGT ATGTATGTAT 60
GTAACAATTC AGGTTTATTA TTTATAAGTT TGTTTCTTTG AAAGAGTAGG GGGCCAGCAT 120
TGGAGCACAG TAGGTTAAGC CACTGCATGT GGTGCTGGCA TCCCATATGA GCAGTTGTTT 180
GAATCCACTT CCAATCCAGC TCCCTGCTAA TGTGCCTAGG AAAGCAATGG AAGATGGTTC 240
AACTGCTTGG GTCCCTGCTA CCCTGTCTCT ATATGGGAAA CCTAGATGGA GTTCTAGACT 300
CCTGGCTTCT GTTTGGCCCC ATCCTGGCTG TTGCAGCTAT TTGGGGAGTG AACCAGCAGA 360
AAAGTTGTAA GTGATAGAAT TAGGACTTTA GTTATGGGAG GATAAAGCAA ATACATTTTG 420
GTCTACAAGA ATCGCAGATT TGGAGAAAGT AAGAGTAATA AGGAAAAGTT TAAAAACTGT 480
AGGTGAGGCC GGCACCGCGG CTCACTAGGC TAATCCTCCA CCTAGCGGCG CCGGCACACC 540
GGGTTCTAGT CCTGGTCGGG GCGCCAGATT CTGTCCCGGT TGCCCCTCTT CCAGGCCAGC 600
TCTCTGCTGT GGCCAGGGGA GTGCAGTGGA 630
(2) INFORMATION FOR SEQ ID NO: 709:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 575 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 709:
CCAAGTATAA GCTCATGAAA GCATTTGTTT ATTTTATCTT TATTGATTTG CCTTTCATAC 60
TTAGACCTAT AATGTATCTG GGCTTGATTT TTGCATATAG TGTGAAATAG AATTCTACAT 120
ATCTGGGGCT GTTGCTGTGG CGTAGTGGGT AAAGCCACCG CCTGCAGTGC TAGTATCCCA 180
TATGGGCGCG GGTTCAAGTC CTGGCTGCTC CTCTTATGAT CCAGCTCTCT GCTATGGCCT 240
GGGAAGGCAG TAGAAGATGG CCCAAGTCTT TGGGCCCCTG CACCCACATG GGAGACCTGG 300
AAGAAGCTCC TGGCTTTGGA TTGGCTCAGC TCTGGCCGTT GCAGTTAATT GGGGAGTGAA 360
CCAGCAGATG GAAGACCTCT CTCTCTCTCT GTGTAACTTT GACTTTCAAA TAAATAAATC 420
TAAAAAGTAC ATATCCAGTT GGATGAAACA CTTTATTGAA AAGACAATTG TCTGTCCCAC 480
TGCTATGTAG TATCTTATAA TTAGAACAAT GACCATTTAA TGACTGATCT GTTTTCAAAT 540
TCTGnATTTT TCTGCTCCTG TCCTTTTTAA nTTTT 575
(2) INFORMATION FOR SEQ ID NO: 710:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 691 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 710:
AGGAAGACCT TTCTCTCTGT CTCTCTCACT GTCCACTCTG CCTGTCAAAA AAAAAAAAAA 60
AGGTATTTCT TCTGTnGGTT CACCCCCCAA ATGGCTGCTA TGGCCGGCGC ACTTGCCGAT 120
CTGAAGCCAG GAGCCAGGTA CTTCTCCTGG TCTCCnATGC AGGTGCAGGG CCCAAGGACT 180
TGGGCCATCC TCCACTGCAC TCCTGGGCCA CAGCAGAGAG CTGGCCTGGA AGAGGGGCAA 240
CCGGGACAGA ATCCAGCACC CGGACCAGGA CTAGAACCAG GGATGCTGGC ACCGCAGnGA 300
GnGATTAGCC TAGTGAGCCG CGGCGCTGGC CACAACCCCT GAATTCTTGA CCTGTGGAAA 360
CAATCTAGTC AATTTGTATT GCTGTTTCAA GTCACTGCCT TATGGGTCAT TTGTTTTGCT 420
GCAGTTGACA ATTTAGTGTA TGTGGTAGGG TGGAAGAGCA TAGAACATGC TAGCTCCAGA 480
TAAAATTCCC GAGTCTAAGG TCGATAGCAG GGATACAGAG AAAGGAATCA GATTCTCTTT 540
CACTAGCAGG TCCAGACAAG CATACAGCAG GCAGGTCCAA CAAACAGGTG CTAGAGAGGG 600
GATGCCTCCT CCCCACTTCT CTTGGGTTTC CAGAGAAACC GATGGATTAG CCCCTGCTTG 660
TCTAAAAGAG GGGGGAAAAT AGCTTGAGAA T 691
(2) INFORMATION FOR SEQ ID NO: 711:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 667 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 711:
GAGAGAGAGA GAGAGAGAAA GGTCTTCCTT CCGTTGGCTC ACCCCCCAAA TGGCCGCAAT 60
GGTCGGAGCT ATGCCAGTCC GAAGCCTGGA GCCAGGTGCT TCCTCCTGGT CTCCCAAGCG 120
GGTGCAGGAG CCCAAGTACC TAGGCCATCC TCCACTGCCC TCCTGGTCCA CAGCAGAGAG 180
CTGGACTGGA AGAGGAGCAA CTGGGACTAG AACCCAGCGC CCATATGGGA TGCTGGCGCC 240
GCAGGAnAGG ATTAACCAAG TGAGCCATGG CCCCCAGTAT AATGTTTTAA GTTGTATTTT 300
TCCTGAAACA AAGAGTTGAG TGTAAATGGA TTATTTGGAA AGTGATCCAG AATCACTGTA 360
TGGGGAAGTA GGAAAGAGAG TGGGAAACGA AGGAAGTCCA AACAGGGTAC AGGATTAATT 420
AATTAAGCTA TTACTGTGGG TAACGAAGGC TCAAGCCTAC TCAGGGAGAC TACATGAAAT 480
TTGCCCCACA GGTGCCTTAC CCAAGGAAGT TGGGGTATTT ACAAATTCGT ATCCATCACT 540
GGATGATGGG CTTGCTTGAC TGAGTGGTTG TTAATTCCCT ACATAAGTGT GGACCTGCCT 600
CTCTCCTCCC CCTGAGAAAG CCTGAGGGGC AGAGCCACAC TGCTTACAGT AAAGAGATCA 660
CATGTTT 667
(2) INFORMATION FOR SEQ ID NO: 712:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 358 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 712:
nAACCTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTCTTTC TTTCTTTCTT TCTTTCTTTC 60
TTTCTTTCTT TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTTTCTCTT TCTCTTTCTn 120
nnTTTCTCTC TTTCTCTCTT TCTTTCAAGA TTTATTCATT TATTATTTGA AAGGCAGAAT 180
TACAGACAGG CAAGGnAGAC AGAGAGAGGC TGGTCCTCCA TGGGACGGTT CACTCCCCAA 240
ATGGGCAAAT CGACTGGAGC TGGACGGATC CGAAGCCAGG AGCCAGGAGC TTCTTCCTAG 300
TCTCCCATGT GGGTGCAGGG GCCCAAnGAC TTGGGCCATC TTCTACTGCT nTCCCAGG 358
(2) INFORMATION FOR SEQ ID NO: 713:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 471 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 713:
TGATTTGAAA CAAAATGTCT AAATCCTTCT AGAAAGTGTG TATTTAATAA ATACGTAAAT 60
ACAAAACCCT GACAGCTTCA TTCCCAGAAA ACATCTCCTA TGCCTGGTGA ACACAGACCC 120
CGGGGCCTCA GGCCAGnnGA CCTGTCCAGC TCCTTGTTGT TTCTGATCAT GAAACAGGGT 180
TAAGATGTTG CACAGTCAGA CTCACCAGTC TCACTTCCTG GCCTTTGGGA ATTATGCACT 240
GGAAGGCTGT CACATCTTCA TTTTTTAAAG GAATTCATTT TTTTCAACCT AAATATCTTT 300
TATAAGAAAT AAGGCTGAGG CCGGCGnCGC GGCTCACTAG GCTAATCCTC CGCCTAnTGG 360
CGCCGGCACA CTGGGTTCTA GTCCCGGTCA GGGCGCCGGA TTCTGTCCCA TTGCCCCTCT 420
TCCAGGCCAG CTCTCTGCTG TGTCCAGGAG TGCAGTGCAG GATGGCCCAA G 471
(2) INFORMATION FOR SEQ ID NO: 714:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 651 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 714:
ATTTTTTCTT TTTTAAAAAA CCTTTTTTGG GGCCAGCACT GTAGCGTATG GnGTAAAGCT 60
GCCGCCTACA GTGTCAGCAT ACCACATGGG CGCCAGTTTG AGTCCCGGCT GCTCTACTTC 120
CAATATAGCT CCCTGCTATG ACCTGGGGAA GCAGTGGAAG ATGGCCCAAG TGCTTGGGCC 180
CCTGTACCCA CATGGGAGAC CCGGAAGAAG CTCCTGGCTC CTGGCTTTGG ATCAGTGCAA 240
CTCCAGCCGT TGCGGCCATC TGGGGAGTGA ACCAGTGAAT GGAGGACCTC TCTCTCTCTC 300
TCTCTCTTTC TCTCTGCCTC TCCTCTCTCT CTCAATCTCT GCCTCTCTGT AACTCTGCCT 360
TTCAAATAAA TACATAATTT TTTTTTAAAA AACACCTCTT TCCTATTTTA TGGCATAACA 420
AGATGGTACC TTATACCTGT ACCTGCCCTG CCTGGGCCTG GGAATCACTC ACTTCTCTGA 480
GGGGTCCTGA TTCTGGTTTA ACAGACCAAG ATGTAAGTGC CAGATGATGT TTTTTCTGAT 540
GAGTAAAATC AATGGTAAAG CCATGTCTGT AAAGGTTTGG TCTTTTGATT ATTTTTGCTA 600
AAGTATTACC TTTTTTTTTT TGACAGGCAG AGTGGACAGT GAGAGAGAGA C 651
(2) INFORMATION FOR SEQ ID NO: 715:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 582 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 715:
TCACATTGGG GGCCCATGCT GTGGCTTAGT GGGTTAAGCA GCCGCTTGCA GTGCCAGCAT 60
CCCATATAGA CACAGGTTCA AGTCCTGGCT GCTCCACTTC TGATCCAGCT CTCTGCTATG 120
GCCTGGGAAA GCAGTAGAAG ATGGCCAAAT GCTTGGGCTC CTGCACCCAT GTGGGAGACC 180
TGGAAGAAGC TCCTGGCTCC TGGCTCCTGG CTTCGGATCA GCACAGCTCT GGTCGTTCTG 240
GTCATTTGGG GACATCTATC ATTAATTCAG TAAACACATA ATCCAATGGA ATTTTAAGCT 300
CATGATATGT GTCCTACTAC CCATCCATTT TTATTTTCAA GACCTCTTTT TCTCTCTCTG 360
CCTCTCCTTC TCTCTGTGTA ACTCTTTCTT TTTTTTTAAG ATTTATnTnC TTTATTTGAA 420
AGAGTTACAG AGCAAAGTAG AGCCAAAGGA GGAGAGAGAG AGAAAGAGAG AGAGAGAGAG 480
AGAGAGAGAG AGAGAGAGGT GTTTTCCATC TGCTGGTTTA CTCCCCTAAT GAnCAGAATG 540
GCCAGAGCTG TGCTGATCCA AAACCAGGAG CCAGGAGCTT CT 582
(2) INFORMATION FOR SEQ ID NO: 716:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 328 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 716:
AAGATCTTTC TCTTTCCCCA TCTCCCTCTG TAACTCTGCC TTTCAAACAA AATAAATAAT 60
TTGTAAAAAG GCTAGTTTAT TTGAAAGGCA GAGTTACAGT GAGAGACAGG GAGAGAGAGA 120
GCTTTCTCGT CCGCTGGTTC ACTCCCCAAA TGGCCACAAT GGCCGGAAGT GAGCCAGTCT 180
GAAGCCAGGA GCCAGGAGCT TCTTCTGGGT CTCCCATGTG GGTGCAGGGG TGCAAGCACT 240
CGGGCCGTCT TCCACTGCTT TCCCAAGCAC ATTAGCCGGG AGCTGGACGG GAAGTGGAGC 300
AGCCGGGATT CGAACCAGTG CCCATATG 328
(2) INFORMATION FOR SEQ ID NO: 717:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 717:
AGGAGAGGAC AACTTCTCTC CCGGGTAGGG GCTGAGTCTG ATTCCTTTTG GGGTCCGTCC 60
CAGAGCAGTG TGGTTTGGTG CTGAGTAGGG GAGGGAAGAA TGGAGCCTGT GAAAGCCCCT 120
GTGAGAGGGA CAGGAGAGCC CGGCCCTGAG CCCCAGAGCC CAAGCGCGGA nGnGCCCAGG 180
CTCTGGAGTC CTGGGTGCTG GGTCTGGGTG TGCTATGATC TACAGAACCC CTGAGGATCC 240
TGACATGAGG GTCCATCTCT AGGCCCCAGA AAGTGGGTCA TGGGAATGGT GAGCTTAACG 300
ACTCGGGGCC CAGCAAGGCC ATGAGAGAGG GGGATGGGGG TAGGGTCCAG CTCCACTGTT 360
CCTTTTTTTG TTTGTTTGTT TGTTTTGACA GGACAGAGAG AGACAGAGAG AAAGGTCTCC 420
TTACCATGGT TCACCCCCCA ATGGCCGCTG CGGCCAGCGC ACCGCGCTGA TnCCGAAGCC 480
AGGAGCCAGG AGCTTCTCCT GGTCTCCCAT GTGGGTGCAA GACCCAAGGA CTTGGGCCAT 540
C 541
(2) INFORMATION FOR SEQ ID NO: 718:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 620 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 718:
GAGGTAAGTC TCATAATCCT ACACTTTTTG AAGTGAGATA ATACCTACTC CAGACAGTTA 60
ATGTGGATAT TAAATATGAT GTAACTAAAC AGTGCCCAGC ACATAGTAGA TGTTTAAAAT 120
ATGGAAGTTT TGGGGCTGGC ACTGTAGCTT AGCAGGCAAA GCTGCTGCCT TCAGTGCTGG 180
CATCCCATAT GGGAACTGGT TCGAGACCCA GCTGCTCCAC TTCTGATCCA GTTCTCTGCT 240
ATGGCCTGGG AAGCCAGGGT AAGAGGGCCC AAGTCCTTGG GCCTCTGCAC CCACATAGGA 300
GTCCTGGAAG AAGCTCCTGT CTCCTGGCTT CAGATCAGCA CAGCTCTGGC AATTGTGGCC 360
AAATGGGAGA ATGAACCAGC AGGTGGAAGA CCTTTCTCTC CCTCTGTCTC TCCTCTCTCT 420
GTGTAATTCT GACCTCCAAA TAAATAAATA AATCTTTAAA AAATGCAAAG TTTCTCTCCT 480
CTCCACTACA CTCCAACTCT TTCCCTCACT TAATAAATAA GACTCAGCTT GACCCCACAG 540
GnACTCATCT TACATAGAAG AGCTAATACA TATGTCTAAA GCTGAAAAGT GAAGAAATGT 600
ATAGCAGACC TCTGCTCTCT 620
(2) INFORMATION FOR SEQ ID NO: 719:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 719:
GGGCCCAAGC ACTTGGGCCA TCCTCCACTG CACTCCCGGG CCACAGCAGA GAGCTGGACA 60
GGAAGAGGGG CAACCAGGAC AGAATCTGGC GCCCTGACCG GGACTAGAAC CCGGTGAGCC 120
AGCGCCGCAA GGAGAGGATT AGCCTGTTAA GCTGTGGCGC CAACCTGTAT ACAAATTTAA 180
AGTATGATTT TGTTTGGTTG GTTTTTTGAG GTATAATTTA AAAAAAAAAA GCTGTCTTTT 240
CCTTTTCTTG CTTTGTTTTC CGAAAAGCTA TCACGAGAGA AAACAACCTT TGTTTTTTAA 300
AGTTTAAAAA ACTTGCTTTG ACAAAAGCTA ATAAATTATT TTTATTTTAA AATCAAATTA 360
ATATACTTTG GGGCGGTGCT GTGTACAGCA GGTGAAGCCA CnGCTGTGGT GCGGCGTCCC 420
ATATGGGCAC CAGACGCCCA TCAGCCCCAA CGCCACAAGC ACTACCGnAC CCAACAGGCA 480
CCTGCCACCC CACGCAGACG CATCAAATGG GCTCCTCCTG CCTCGCCGTT AG 532
(2) INFORMATION FOR SEQ ID NO: 720:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 602 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 720:
AACGGGnCTG TGTGTAGGAT GTTAAGCTTC TGCCTGTGGT GACGGCATCT CATATGGATG 60
CCGATTCAAG ATCCAGATGC TCCACCCAAT CTAGCTCCCT GCTAATGGCC TGGGAGAGCA 120
GTGGAAGATG GCCAGAGTAT TTGGGCCCCT GCACCCACAT GGTAGAGCCA CAAGAAGCTC 180
CTGGCTCCTG GTTTTGGATG ATCCCAGCTC TGGTCATTGC AGCCATTTGG GGAGTGAACC 240
AGTGGACGGG AGATCTCTCT TTCTGTCTTT CCCTCTGTCT GTAACTCTGC CTTTCAAATA 300
AATAAATAAT TTTTAAAAAT AAATCTTTCT TTAAAAAAAA GGGAAAACAT GGGGCTGGTT 360
CTCTGATGTA GTGGGCAAAG CTGCCACCTG TGGTGCCAGC ATCCCATATG GGCGCCAGTT 420
CCTGTTCTGG CTGCTCCACT TCCGATCTAG CTCTCTGTTG TGCCCTGGGA AAACAGTGGA 480
AAATGGCCCA ATCCTTTGGG CCCCTGCACC CACGTGGGAG ACCCGGAGGA AGCTCCTGGC 540
TCCTGGCCTC AGATCAGCCC AGCTCCGGCC GTTGTGGCCA CTTGGGGAGT GAGACAAAAG 600
AT 602
(2) INFORMATION FOR SEQ ID NO: 721:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 635 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 721:
AGTACAAGGG GGGCTTCAAC GAGTTCATGG AAAAATAGAC TCAAAAATGC TTAATTTTTG 60
AAATCCACGC ATAGTTACTT CTTGAAGACC CTGTCATGTA CTAGCACTCT GCCTCCTCTG 120
GCCTATTACG TTAAAAACAA CAACAACAAC AACAACAACA ACAACAACAG TTCACCTTTT 180
GCCTAGTAGC TCAGGTATCA CTTCCTCTGG GAAGACCACC AAGAATACAA TCAGAGAAGC 240
ATGAAACACA TAAGGAAACG GTAACTTGTG ATACACTCAC ACCAAGAAGT ACAACACAAC 300
CATTAAATAT CATATTTATT TTATTGGCAT AGGAAGAAAG TTGGCATTTA AAATATGTAT 360
ACGGGGGCCA GCACTGTGGT GTACTGGGTA AGCCACTACC TGCAGTGCCA GCATCCATTA 420
TGGGTGCCAG TTCAGGTCCT GGCTGCTCCA CTTCCAATCC AGCTCTTTGC TATGTCCTGG 480
GAAAGCAGCA GAAGATGGCC CAAGTCCTCG AGCCCCTGCA CCCACGTGGG AGACCCGGAA 540
GAAGCTCCTA GCTTTGGATT GGTGCAGCTC TGGCCATTGC AGCCATCTGG GGAGTAAACT 600
AGCAGATGGA GGACCTCTCT TTCTCTCTCT GCnTC 635
(2) INFORMATION FOR SEQ ID NO: 722:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 633 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 722:
ATCCTCTGCC TCTGTGAACT TTAATATGTG TGGTTCCTCT TCTACGCACC AGACTGCATG 60
CTAGCTGCCA TCCACACAAT GTGCTCCCCC TGCCAACAGC CATAGGTCAT GTTCCAATGT 120
CAGCATTACT ACTGTTCAGT TTAATATATC TAAAATCTAA CATAAAAGTA TTTATAATGG 180
GCTGGCCCTG TGGCACAGCA GGTTAACGCA CAAGCCTGAA GCCCCAGTAT CCCATATGGG 240
CACCAGTTCT AGTCCTGGCT GCTCCTCTTC TGATCCAGCT ATCTGCTATG GCCTGGGAAG 300
GCAATAGAAG ATGGCCCAAG TACTTGGGCC CCTGCACCCA CATGGGAGAC CCGGAAGAGG 360
CTCCTGACTT CGGATCAGTG CAGTTCCAGC CATTGTGGCC ATCTAGGGAG TGAACCAGCA 420
GATGGAAGAC CTCTCTCTCT GTCTACCTCT CTCTGCAACT CTGTCTTTCA AATAAATCTT 480
TTAAAAAAAA GTATTTGTAA CAATAATGGT GCTGTACCCA TGGCACCTAC TGTCTCCTTT 540
CAAGTCTGTA CTCTGTGAAA TGGAAGTAAC ACAGTTCAGA ACAGGTTTAT GGGCTACAGT 600
GTGACACCTT CATGCATGTA TATGATGTGT GCT 633
(2) INFORMATION FOR SEQ ID NO: 723:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 723:
TTCCCATTTT CATTTTTGCT TAAGCACTAG ATATAACTAT TGTTATCATT TTCCAGATGA 60
AGAGGAACAG AGATTACCCA AGATGCTATT GCTAAGCATG ACAAAGTCAG CATTCAAACC 120
TGGAAGCATT TCTTAATGAC GATGGTTAAT TTTTGTTTGG ATTTGTCAGT CTAGTTGATT 180
GTATATACTA TATATATATA TATATATACT GTCTACAGAT TCCAAAAGGT TTTCCTTTAA 240
AAAGATTTAT TTATTTGGCC GGCACTGTGG CTCACTAGGA TAATCCTCCA CCTTGGGGTG 300
ATGGCACACC GGGTTCTAGT CCTGGTCGGG GCGCCGGATT CTGTCCCGGT TGCCCCTCTC 360
CCAGGCCTGC TCTCTGTTGT GTCCAGGGAG TGTAGTGGTG GATGGCCCAA GTGCTTGGGC 420
CCTGCACCCC ATGGGAGACC AGGATAAGTA CCTTGCnCCT ACCATCG 467
(2) INFORMATION FOR SEQ ID NO: 724:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 724:
CGTACCCTAT TGTTTATTTT CTAAATACAT TCAAATATGT ATCCGCTCAT GAGACAATAA 60
CCCTGATAAA TGCTTCAATA ATATTGAAAA AGGAAGAGTA TGAGTATTCA ACATCAGCCT 120
TCTTTTCAGC TTCCTCCTTG GCGCCCGAGG CACAGnGAGC CGAGCATAGC CGCGCAGGCA 180
AGTGC 185
(2) INFORMATION FOR SEQ ID NO: 725:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 189 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 725:
CGGTACCTGG TCAGTTTGCA AACGGCCACG GTGGAGATGA GCGGCTTATA TTCCTCAGGA 60
GCGTGTGGCA GATTGTGGTG AAAGCGAATG GGTGTTGATA ACCGGGTTCA GGGTAGAATT 120
CCCAGCCGCC TGGTGGCGAA GCCGCCCGGC GCCCCAAATT TTTCCCGGCG GTTGnGCCCT 180
TACnGGTTC 189
(2) INFORMATION FOR SEQ ID NO: 726:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 726:
TAATTATGAA AACTGAATCA AGAAGACATA GAAAATCTGA ATACACTTAA AAAAAGAGGT 60
TGGATTAGTA ATTAAAAATT TCACACAAAG AAAAGTCTAG TCCCAGTTAT ATACTTTTTT 120
TAAAGATTTA TCTATTTATT CAAAAGTCAG AGTTACACGG AGAGAGGAGA GGCAGAGAGA 180
GGTCTTCCAT CCACTGGTTC ACTCCCCAGA TGGCCGCAAC GGCCAGAGCT GTACCGATCT 240
GAAGCCAGGA GCCAGGAGCC TTCTCTGGGT CTCCCATGTG GGTGCAGGGA CCCAAGGACT 300
TGGGCCATCT TGTACTGGTT TCCCAGACCA TAGCAGAGAG CTGGATTGGA AGAAGAGCAG 360
nTGGGACTCG AACCAGCCCC CATATGGGGT GCCGGCACTG CAGGCGGCGC TTTAnCCGCT 420
ACGCCACAGA GG 432
(2) INFORMATION FOR SEQ ID NO: 727:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 428 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 727:
TCCTCCTGCA GCGCCAGTAT TCCATATGGA CACCAGTTCT TGTCTGGCAG TTCCACTGCC 60
AATCCAGCTC TCTGCTATGG CCTGGGAAAG CAGTGGAAGA TCATCCAAGT CCTTGGGCCC 120
CTGAACCCAC GTGGGAGACC TACAAGAAGC TCCTGGCTTT GGATCAGCAC AGCTCCAGTC 180
GTTGCAGTCA TTTGGGGATT GAACCAGTGG ATGGAAGACC TCTCTGTCTT TACCTCTCTC 240
TGTCTGCAAC TCTGCCTCTC AAATGAATAA AATCTTTTAA AAATTTTTAT TTATGTCTGT 300
ATTTTATTTG AAAGAGATGA GGAGGGAGAG AGCGTGAGAG CCAGTGCACA TGATACCTTT 360
CAACTGGTAG TTCACTGCCC TGTTGGCCTC AATGGCCAGG GCTAGGCCAG GCCAAAACCA 420
AGAGCTTC 428
(2) INFORMATION FOR SEQ ID NO: 728:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 463 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 728:
TTTACTTGCT ATGTTCACAG TTCCTACAGG AAAACTTGGA ACACAATAGA CATTCAAGCT 60
TTATTTGTTG AAGATTATGA ATATTTAGGA TGAGACACTG TGCTTTACGT TCCCCCCCTA 120
AATCCCAAAG AAATGATATT ATTTGTGTAA TAAGTGACTT AAAACATATT TTCCTGAGGC 180
CAGTATTGTG GCATAGCAAC TAAAACTGCC ACCTGTGATG CCGACATCCC ATAGGGGTGC 240
CAGTTTGAGA CCCGGATGCT CCACTTCTGA TCCAGCTCCT TCCCTAATGC ACCTGGGAAA 300
GCAGTAGAGG ATGGCCCAAG TGCTTGGGCC CCTGTACCTA CGTGGAAGAC CCATAAGCTC 360
CTGGCTCCTG GCTTTGGCCT GGCCCAGCTC CAGCTGTTAC AGCCATCTGG GGAGTGAACC 420
AAAGGATGGA AGACCTCTCC ATGTCTCTCC CTCTCTCTGT GTA 463
(2) INFORMATION FOR SEQ ID NO: 729:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 583 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 729:
AACTCACGCA AAGAGAAATC GACAATATCA ACAGGTGTGT GTTTAAAAAA ATTGAATCCA 60
TAATTAAAAC CTTCCAAAAC AGAAAGCACA AGGCTCTGAT GACTTCAATC AAACATTCAA 120
TGAGGTATAG GAGCAGGTCT TTGGCCTAAT GGTTAAGATG TCCACGTCCC ATATTGGACT 180
GACTGGGTGT GATCCCTGGC TCTAGACTCA ATTTCAGCTT TCTGCTGATT TAGACCCTGG 240
GAGGCAGCAG GTCATCATTC AAATGGTTGG GTCACTTCCA CCCATGTGGG AGACATAGAT 300
AGAATTCCTG GCTCTTGTTT CTGGCCCTGG CCCAGCCTAC CAGTCATTTC AAGCATATGA 360
AATGTGAATG AAAAATTAGA ATGTGTGATC TCAAATAAAT AAATACTTAA AAAAAAAAAA 420
AAACTAAATT ATCTAGGGCT GGCGCTGTGG CACAGTGGGT TAACGCCCTG GCCTGCAGCG 480
CCAGCATCCC ATATGGAAGC CGGTTCTAGT CCCAGCTGCT CCACTTCCAA TCCAGTTCCC 540
TGCTATGGCC TGGGATAGCA ATAGAAGATG GCCCAAGTCC TTG 583
(2) INFORMATION FOR SEQ ID NO: 730:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 590 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 730:
TTGGAGTTGT AATAAAAGGA ATTTTCCCTT CTACCTCCTT TCTGTAAGAA ACATCAGGCT 60
TTCCTGCC C TGGTTAAACA TGATTAAGTG GTAAACAGTG ATGAATATTT CCCAGATAAC 120
TACATCTCAT GATGTTTTTT TAAAGATTTA TTTATTTATT TGAAAAGTAG AGTTAGAGAG 180
AGAGAGAGAG AGGCCTTCCG TCTGCTGGTT CACTCCCCAA ATGGCTGCAA TGGATGGAGC 240
TGGGCTGATC CAAAGCCAGG AGTCAGGAGC TTCTTCTGGG TCTTCCACAT GGGTGCTGGG 300
GCCCAAGGAC TTGTGCCATG TTCTACTGCT TTTCAAGGCC ATGGTAGAGA GCTGGATCAG 360
AAGTGGAACA GCAGAGACTT GAATTGGTGT CCCTATGGGA TnCTGGCACT GnAGGCGATG 420
GCTTTACACA CAATGCCACA GTGCCAGTCC CTCATGATAT TTTTATTAAT GTTAATAGGn 480
TTCTGGTGGG CAAAAAATAG TTTTGTAGAA AAATATTTTG GGAATACACT CAGATAACAT 540
TTGAGCAAGT TTCTTTCCTG CAGGACATAT CAGACAnTCA ACTATATTAn 590
(2) INFORMATION FOR SEQ ID NO: 731:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 710 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 731:
ATGCAAAAAT ATATACTAAA GGAAGTTTTT CAGGCAGAAG GAACATTGTA TCAGAAACTT 60
GAAGCTAAAC AAAGGCAGAG CACTAGAAAT GGAAAATAGT AAAATAAAAT TCATTTTTTC 120
ACAGTTTGAG TTGTTCTAAA AGATAACTGA CTTTCAGGGC CGGTGCTGTG GCACAGCAGG 180
TGGAGCTGCT GCCTTCAACA TGCCATCATC CCCTATGGGT GCCTGTTCAA ACCCTGGCTG 240
CCCCATTTCC TATTCAGCTC CTTGCTGGTG TGCCTGGGAG GGCAGCGGAA GATGGCCCAA 300
GTGCTTAGGC CCCTGCACCC ATGTGGGAGA CCCAGAGGAA GTTCCTGGCT CCTGGCTTTG 360
GACTGGTCCA TCCCTGGCTG TTGTGGCCAT TTGGGGAATG AATCAGCAGA TGGAAGATCT 420
CTCTTTCTCT GTCTGTCTGT CTGTCTCTCT CTGTAATGCT ACCTTTCAAA TAAAAAAGTC 480
TATTTTAAAA AGATAACTGA TTTTCTAAAA TGGTAGAAAT GCATTATGTG TTTATAACAT 540
GTAAAAATAA AATGTATAAC AAATATAGCA CAATGGATTG GATAGAGGAA TTATGAAAAT 600
TCCTTTAACA TAGAATCATT ATATGTTACA TGAAATGTTG TAATATTATT TGAAGGnAGA 660
TTCTGATTAT ATATGCGTAA GGATAAACCC TAAGACAACT GCTAAAACAA 710
(2) INFORMATION FOR SEQ ID NO: 732:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 621 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 732:
TTTCACAGCA AATGCAACAA TTAGAAGAGA CACCTACAGA ATAGGAGAAA TTATTTACAA 60
ATCATACATA TAACATAGTG AGAATATCTG GAATATATGA AGAACTTAAT AGTAACCTAA 120
CCCCCCAAAT AATTTGATTT AGAAATGGGC AAAGGTGGGG CCGGCGCTGT GGCTCAGTGG 180
GTTAACACCC TGGCCTGAAG TGCCAGCATC CCATATGGGC GTCCGTTCGA GACCCGGCTG 240
CTCCACTTCC AATCCAGCTC TTTGCTGTGG CTTGGGATAG CAGTAGAAGA TAGCCCAAGT 300
CCTTCGGCCC CTGCACCCGC ATGGGAGACC CAGAAGAAGC TCCTGGCTCC TGGCTTCGGA 360
TCAGAGCAGC TCCAGCCGTC ACGGCTAATT GGGGAGTGAA CCAGCAGATG GAAGATCTCG 420
ATCTCTCTCT CGATCTCTCT CTCTCTCTCT AACGCTGACT TTCAAATAAA TAAATAAATC 480
TTTTTTTTAA AAAAATGGGG CAAAGGGCAT GAATAGACAC AGATATCCAA GAAGTATCTT 540
AAAAATGCTC AACATTACTA ATCATCAGAG AAATGCAAGT CAAAACCTCA TTGAGATATC 600
ACTTCACCCA GTTTTTTTTT T 621
(2) INFORMATION FOR SEQ ID NO: 733:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 606 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 733:
TCAGAAATCA TATTTGAATT AAAATCTTTA AAATGGAGAA GCAACATCAT CTGAAGTATA 60
ATAATAACTA AGCATTTTAA ATTGTCAGAT ACAGGAAGCC ATAGAGGAGA ACAAGAGAAA 120
AAAAAAAAAG GACTGAGGCC AATGATGTGA TGTAGTGGGC TAAGGCTCCA CCTGCAGTGC 180
CAGCATCCCA TATGGGTGCC GGTTTATGTC CCAGCTGCTC CTCTTCCAAT CCAGCTCTCT 240
GCTTATGGCC TGGGAAAGTG GCAGGAGATG GCCCAAGTGC TTGGGCCCCT GTACTTGTGT 300
GGGAGACCCA TATGGATTTC CAGGCTCTTG GCTTTGGCCT GACCCAGCCC CAGCCATTGC 360
AGCCATTTGA GGAGTGAACC AGTGGACGGA AGACCTCTCT CTGTCTGTCC CTCTGCCTGT 420
AGCTCTAACA TCTCAAATAA ATAAATAAAT CTTTAAAAAA GGAnGGGGGA GGAAGAGAAC 480
AAGAAGAAGA AAAAGAACTT ATGAAGAGGA AAGGGATGAA CCTATAGAGT ACTGGTAAAC 540
ACTGGGTTCA CATTTCAAAC CCAAGGCTGC CCATGGCAGT nAGCATTCAC TCTAAAAGGG 600
AGCATG 606
(2) INFORMATION FOR SEQ ID NO: 734:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 734:
TGCTATGGCC TGGGAAAGCA GTAGAAGATG GCCCAAGTGT GTGGGCCCCT GCACCTGTGT 60
GGGAGACCCA GAATAAGCTC TTGACTCCTG GCCAAGATTG GCGCAGTTCT GGCCACAGTG 120
GCCATCTGAG AAGTGAACCA GCGGATGGAA GACCTCTCTC TCTCTCTCTG CCTCTCCTCT 180
CTCCGTGTAA CTGTGACTTT CAAGTAAAAT AAATACATAA AAAAAAAAAA AAAAGAAAGA 240
AAGAAAGAAA GAAAGTAAAC AGCTAGGTTG CTAGACACCA GAATCAGGAT GCACCATAGG 300
TTCCTTACAT TGGAGAGTGG CATGGGGAAG TGATTGCTGG AGAAATGGTG AAGGnGACTA 360
TGAAAACTTA ATCTAGAATA ACCTGTACCT TTTCCGAATT TACCATTTCT TGTTGAGATT 420
TAGGAGGnTA ATAACCAACA CCCCTTCCTC GnTCCCCCAA ATGTGG 466
(2) INFORMATION FOR SEQ ID NO: 735:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 567 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 735:
ATGCTACATG ATAATTTCAA GGGAGAAGCC AGTAGAGAAG AAACAAGAAG CCTAGAGATT 60
CTCCAAATGC ACAAAAATGC TGCCGTTCCT GAATTGCAAG CTGACTTGAA GCAGAGGAAT 120
GAGATTTCCT TTCCAAGTTA AAACAAATGC ACAGATTATC ATTGACTTCT TTCATTAAAA 180
GCATTTATTT TATTTATTTG AAAGAGTTAC ATAGAGAGGT AAAGTCAGAG AGAGAGAGAA 240
AGAGGTCTTC CAACAGCTGG TTCACTCCCC AAATGACTGC AATGGCCAGT GCTGAGCTGA 300
TCCAAAGCCA GGAGCCAGAA TCTTCTTCTG GGTCTCCCAT GTGGATGCTG GGGCCCAAGG 360
ATTCGCGCCA TCTTCTGCTG CTTTCCCAGG ACATAGCAGA GAGCTGTATC GGAAGTGGAG 420
CAGCTGGGAC ACGAACTAGC ACCCACATGG GATGCCAGAA CTGCAGACCA AATCTTTAAT 480
CCATTGTGTC ACAGCACCAG CCCTAGCATT AAGTTCTTTT TTTTTTTTTT TTGACAGGCA 540
GAGTGGAnAG TAAGAGTGAG AGTGAGA 567
(2) INFORMATION FOR SEQ ID NO: 736:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 537 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 736:
GAATTTCTTC TGGCTTCTAG GGTGTTGTCC CAGGTCTTCC TCCTGAGGTT ACCTTCAGAT 60
GTTAGAATAG AAGTTGCAGG GCCAACGCTG AGGTGTGGCG GGTAAAAGCT GCCGCCTACA 120
GTGCCAGCAT CCCATATGGG CACTGGTTCG AGTCCCGGCT GCTCCACTTC CGATCCAGCT 180
CTCTGCTGTG GCCTGCGAAA GCATAGAAGA TGGCCCAAGT TCTTGGGTCT CTGCACCCAT 240
GTGGGAGACT CTGAAGTAGC TCCTAGCTCC TGGCTTCGGA GATCAGCACA GCTTTGGCGG 300
TTGCGGCTAA TTGGGAAGTG AACCATTGGA TGAAAGACCT CTCTCTCTTC nCTCTCTCTC 360
TGGCTTCTCC TTCTTTCTCT GTGTAACTCT TTCAAGTAAT AGTTAAATAA ACCTTAAAAA 420
AAAAAAGAAT AGAAATTGCA TCTTTTTCTA GTTAGAGCAA GCCTGAGTCT CATTCCTCAC 480
AACTTTAAAG ATGGCTCTTC TCACCTGCAT AATGCATCAA GGTCTCCTCT TTTACTA 537
(2) INFORMATION FOR SEQ ID NO: 737:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 622 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 737:
TAACAAAGTG AAGAGGTAAT GAAAACAATG AGAGAAAATA TTTGAAAACT ATGCATCCAA 60
TAAAGGATTA A ATCAAGAA TATATAAGGA GTTCCAGAAA CTCAATAACA ACAAAATAAT 120
CCAGTTAAAA ATGTGCAAGG GCAGGAACAA GCATTTTTCA AAGGATGAAA TAAAAAGGGC 180
CAACATGGGG CCAGCACTGT GGACATAGCA GGTAAAGCCA CCGCCTGCAA TACCAGTATT 240
CCATATGGGT GCTGGTTCAA GTCCCAGCTG CTCCACTTCT GATCTGGCTC TCTGCTATGG 300
CCCAGGAAAG CAGTAGAAGA TGGCCCAAGT CCTTGGGCCC CTGCACCCAC CTGGGAGACC 360
CAGAAGCTCC TGGCTCCTGG CTCTGGACTG GCTCAGCTCT GGCCATTGnC AGCCATTTGG 420
GGAGTGAACC ATTGGGTAGA AGCCCCCCCC TCTGTGTGTA ACTCTGACTT TGAAATAAAT 480
AAATAAATCT TAAAAAAAAA AAGGCCAACA GACAGATGAA AAAACTCCAG GATGACTGCC 540
ATCAGGGAAA TGCAAATACA AATTACAGTG TGGTGTCACC TTACCCCAGC TAGAATGCTA 600
TCATTCAAAA ATCAAAAATG GA 622
(2) INFORMATION FOR SEQ ID NO: 738:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 738:
GAGCAAAGTG GCATTGAGAA GAAAAAAATC GCCACATTTT GGACACTTTC CACAACCCTA 60
ACTATTCTTT TGGATTATGA TTTTGTTTTG TTTTGTTTTA AAGATTTTTT TATCTGTTTA 120
TTTCACAGGT AGAGTTAGTT ACAGACAGTG AGAGAGGGAG ACGGAAAGGT CTTCCTTCCA 180
TTGGTTCACT CCCCAAATGG CCACCACAGC CAGAGCTACG CCGATCCGAA GCCAGGAGCC 240
AGGTGCTTGC TCCTGGTCTC CCATGTGGGC GCAGGGGCTC AAGCACTTGG GCCATCCTCC 300
ACTGCCCTCC CAGACCACAG CAGAAGGTGC TGGACTGGAA GAGGAGCAAC TGGGACTAGA 360
ACCTGGCACC CAACCTAATC ATTAATCCAT AAATACCTTA AATATATCCC CCTGGGGAAT 420
CTTGGAGAGT TTATATACTA GAAAAAGCAT TTATTCATGA TTTAAAATTT TTTTAAAGTT 480
TATAAAAACA TAACATAAAT CTTACCTTAA ATATCTGTAG nATGGGGTAn CTC 533
(2) INFORMATION FOR SEQ ID NO: 739:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 517 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 739:
TTTGTTGCCA CAAATGCCCA TTCTCGGGGC TGGCGCTGTG GCACAGTGGG TTAACGCCCT 60
GACCTGAGGC ATTAGCATCC CATGTGGGCA CTGGTTCAAG ACCCAGCTCT CTCCTGTGGC 120
CTGAGAAGGC AGTAGAAGAA AGCTCTAGTC TTTGGGCCCC TGCACCCATG TGGGAGACCC 180
AGAGGAGGCT CCTGGTTCCT GGCTTCGGAT CAGCACAGCT CCAGCCATCG CTGCTGGTTG 240
GGGAGTGAAC CATCAGATGG AGGACCATTC TCTCTCTCTC TCTCTGCCTC TCCTCTCTCT 300
GTGTAACTCT GACTTTCAGA TGAATAAATG AATCTTTAAA AAAAAATGCC CACTCTCTAC 360
ATAATGCTTT AAGATTCATC CATGATAGAG TACATGTGAT ATTTTGTTTA TTACTGAATA 420
GTATTCCAAG CATGTGGTTT TGAAGGGACT AAATGAGCAG TTCTGGCCAA GGCTGGCTCA 480
CCTGGATCTT CCTTAGAGAC GAAATCCTAC AGCCCTC 517
(2) INFORMATION FOR SEQ ID NO: 740:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 643 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 740:
AGACCTTTCT CTCTGTCTCT CTTTCTCACT GTCCACTCTG CCTGTCAAAA AAAAAAAAAA 60
AAAAAAAAAA AAAGGAAAAA AAAGGAGAGG AAAGGATAAG GCCCCTGGGA GTGCTGTTCC 120
CTGGAATACA GTAAGGGGTG GCTGCTGGCT GCTCTCTGGG CTGTGGGCCT GGGTCAGCCC 180
AAGGTTCCTG GGGAGATACG CAGCTGGGGT TGTCAGAGTT AGCTCAGGAG AGAACTTACT 240
CCTCAAGCAC ACTTAGAGTG AACCTTTGCT AAGTTGGCTA CACAACTTCT CTATTCTGTA 300
AACCAGTTGA AATAAATCTT ATGGGTTTTG TTTGAAAGGA ATTTATATGA GCAATTTTAC 360
TAAATCAGGA ATATTTTTAA AGATTGTTTA TTGGGGCCAG TGCTGTGGCA GCGGCTCCTC 420
TTCAGATCCA GCTCCTGCTA ATTTGTCTGG GAAAGCAGTA GGAGATGGCC CAGGTGCTTG 480
GCCCCTTTAC TCACATGGGA GACCTGGAAG ACGCTCCTGG ATCCTGGCTT CAGCTCGGCT 540
CAGCTCTGGG nGTTGCAGCC ATCTGGGGAG TGAACCAACA GATGGAAGAC CTCTCCTTCT 600
CTCTGCCTCT GCCTCTATAA nTCTGCCTTT CAAATAAATA AAT 643
(2) INFORMATION FOR SEQ ID NO : 741:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 531 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 741:
TGCCAGTTCG AGTCCCGACT GCTCCACTTC TGACCCAGCT CTCTGCTATG GCGTGAGAAA 60
GCAGTGGAAG ATGGCCCAAG TCCTTTGGGC CCCTGCACTC CCATGGGAGA CCCAGAAGAA 120
GCTCTTTGCT CCTAGCTTTG GATCGGCTAG GAGCTCCAGC TGTTGCGACC AACTGGGGAG 180
TGAACCAGCA GTTGAAGATC TCTCTCTCTC TCGGCCTCTC CTTCTCTCTC TCTCTGTGTA 240
ACTCTGACTT TAAAATAAAT AAGTAAATCT TCAAAACAAA AACAAAGTTT GGTTCCAATT 300
ATGATTACTT TGTTATTGCC AGTTTGTTGA TTAGGGTTCA CTTAAAACGA GATACTGTAA 360
ATCTGAGAAT A ACAGGGGC ACCTGGCGTC ACATCACAAA AAGTCTGGCA CATTTCAGTT 420
TATTCAAGCA ACTATCCATG ATCTACATAG CTAAATGAAA CCTTATTCGT ATCTAAATAG 480
GCATCTGCCT CTAAATATTT TAATATGCAA TTCTGTCTCT ATTCTAATAA T 531
(2) INFORMATION FOR SEQ ID NO: 742:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 633 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 742:
AGGGAGTTTT TCCCTATGGT TGACCAAAAG ATAATAAAAC TACCTTGCTT ATTCAGACAG 60
AACGCAAAAT CAGCATTCCT CTCTTCAACT GCTCAGCTGT AGGTCAATTG ATCCAACACG 120
TCCTATAAGT TAACATTTAA TATTTCCAGT GGATCTATTA AATTTTATTT GCGGAAGTGA 180
TTTCTTTTTA AGATATATAT GTATATATAA ATGCTTATTT TAAAAAATAT TTATTTGAAA 240
GGCAGGGTTA CATAGAGGCA GAATTAGAGA AGAGAGAGTG AGGTTTTCTA CCCATTGGTT 300
CACTTCCCAA ATGGCCACAA TGGCCGGAAC TGCGCCGATC TGAAGCCAGG AGCCAGGAGC 360
TTCTTCCAGG TCTCCCACGT GATTGCAGGG ACCCAAGCAC TTGGATCATC TTCTGCTGCT 420
TTCTCAGGCC ACAGCAGAGA GCTGGATCAG AATTGGAGCA GCCAGGACTC AAAATGGTGC 480
CCATATAGGA TGCTGGCACT GCAGGCAATG GCTTTCCTCT GTATGTCACA GTGCTGGCCC 540
CAAAATGCTT GTTTTTATAT ATGTGTTTAT GTGTGTATTT TAATTGGAGA GGCACAGAGG 600
GAGAGAAAGC ATGCTACTAC CTGCAGGnTC ATT 633
(2) INFORMATION FOR SEQ ID NO: 743:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 681 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 743:
TTATCATGTA ACACCACCCn TGCAGCCAAC TCAGGACCCC CCAACAAAGG TCACCACCAG 60
ATGCAACCCC TCGACCTGAG ACTTCTCTGT CTCCCAGAAT TATAAAAAAA TATACATTTC 120
AGGGCAGGTG TTTGGCGCAG TGGGTATAGC TCTGACCTGT GATGTTGGCA CCCATATGGG 180
TGATGCTCCC TGTCCCAGCT GCTCTGTTTC CAATCCAGCT CCCTGATAAC GACCCAGGAA 240
AAGTAGCAGA AGTTGGCCCA AGTGTTTGGG TCTCTACCAC TTGTGCGGGA GACCTTAATG 300
AAGCCCCTGG CTTAGATCTG GCCCAGCTCT GGCCATTGCA GCCATCTGGG GAGTGAACCA 360
GTGGATGGAA GATGTCTCTG TTTCCATCTC TCCTTTTTTC TCTGCAACTC TTTCAAATAA 420
ATAAATAACA CACATTACTT AGCTTTGTGA ATTGCTTGAT CTCAGGTATT TTGTTTTCTA 480
AGAGAGAACA GGTAAAGATA TAGAGAGTGA AACCACCATA TGCAGTGCCG GTATCCTAAG 540
GGTGTTGGTT CGAGTCCCCG CTGTTCCACT TCCAATCCAG CTCCCTGCTG ATGCACCTGG 600
GAAAGCAGGG GAAGGTGGGC CAAGTGCTTG GGACCCTGAA CCCATGTGGG AGACCCAGAA 660
GAAGCTCCTG GCTCCCAACT T 681
(2) INFORMATION FOR SEQ ID NO: 744:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 651 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 744:
TGGCCATCAA CATGGGGAAC TTGGATTGAA TTCCTGTATT CTGATTTCAC TTTTACCCAG 60
TCGTAGTCAT TGTGGACAAT GCTGAATGAA CCATTGAGTG TGAATGTGTT ATAATAAAAC 120
TGTATTTGAA AAAGGATTGA TTTTTGTAGA CCAAAAATTA AATAAAAATA AACAGAAAAG 180
ATTAAACACT TGATTTCTTC CTGAAAACAT GGTGAACACC TTAGAACTGG GATGCTCCAA 240
TATGACTTGG AGTCTCAGAA TATAAATAAA GATACTTAGG AAGAAAAGTA GCCTTTATCA 300
CATTACTTTT TTTAAAAAAA TATTCATTTA TTTATTTGAA AATAAGAGTC AGACACACAC 360
ACACACACAC ACACACACAC ACACAAAGAT CTTTCATCTG TTGGTTCACT CCCCAAATGG 420
CCACAATAGC CAGAGCTGAG CCTATCTGAA GCCAGGAGCC AGGAACTTCA TCTGGGTCTC 480
CTACAnGGGT GCAGGAGCCC AAGGACTTGG GACATCTTCC ACTGCTATCC CAGGCCCGTC 540
TGCAGGGAGC TGGATCAGAA CTGGAACAGC TTGGGACATG AACTGGGCAC CCACATGGGn 600
TGCTGGGCAC TGCAGATGGT GACTAAACCT GCTGCATCAC AGTGCCTGCC C 651

Claims

What Is Claimed Is:
1. Computer readable medium having recorded thereon the nucleotide sequence depicted in SEQ ID NOS: 1-744, a representative fragment thereof or a nucleotide sequence at least 95% identical to a nucleotide sequence depicted in SEQ ID NOS: 1-744.
2. Computer readable medium having recorded thereon any one of the fragments of SEQ ID NOS: 1-744 depicted in Tables 2 and 3 or a degenerate variant thereof.
3. The computer readable medium of claim 1, wherein said medium is selected from the group consisting of a floppy disc, a hard disc, random access memory (RAM), read only memory (ROM), and CD-ROM.
4. The computer readable medium of claim 3, wherein said medium is selected from the group consisting of a floppy disc, a hard disc, random access memory (RAM), read only memory (ROM), and CD-ROM.
5. A computer-based system for identifying fragments of the T. pallidum genome of commercial importance comprising the following elements: a) a data storage means comprising the nucleotide sequence of SEQ ID NOS: 1-744, a representative fragment thereof, or a nucleotide sequence at least 95% identical to a nucleotide sequence of SEQ ID NOS: 1-744; b) search means for comparing a target sequence to the nucleotide sequence of the data storage means of step (a) to identify homologous sequence(s), and c) retrieval means for obtaining said homologous sequence(s) of step (b).
6. A method for identifying commercially important nucleic acid fragments of the T. pallidum genome comprising the step of comparing a database comprising the nucleotide sequences depicted in SEQ ID NOS: 1-744, a representative fragment thereof, or a nucleotide sequence at least 95% identical to a nucleotide sequence of SEQ ID NOS: 1-744 with a target sequence to obtain a nucleic acid molecule comprised of a complementary nucleotide sequence to said target sequence, wherein said target sequence is not randomly selected.
7. A method for identifying an expression modulating fragment of T. pallidum genome comprising the step of comparing a database comprising the nucleotide sequences depicted in SEQ ID NOS: 1-744, a representative fragment thereof, or a nucleotide sequence at least 95% identical to the nucleotide sequence of SEQ ID NOS: 1-744 with a target sequence to obtain a nucleic acid molecule comprised of a complementary nucleotide sequence to said target sequence, wherein said target sequence comprises sequences known to regulate gene expression.
8. An isolated protein-encoding nucleic acid fragment of the T. pallidum genome, wherein said fragment consists of the nucleotide sequence of any one of the fragments of SEQ ID NOS: 1-744 depicted in Tables 2 and 3, or a degenerate variant thereof.
9. A vector comprising any one of the fragments of the T. pallidum genome SEQ ID NOS: 1-744 depicted in Tables 2 and 3 or a degenerate variant thereof.
10. An isolated fragment of the T. pallidum genome, wherein said fragment modulates the expression of an operably linked open reading frame, wherein said fragment consists of the nucleotide sequence from about 10 to 200 bases in length which is 5' to any one of the open reading frames depicted in Tables 2 and 3 or a degenerate variant thereof.
11. A vector comprising any one of the fragments of the T. pallidum genome of claim 8.
12. An organism which has been altered to contain any one of the fragments of the T. pallidum genome of claim 8.
13. An organism which has been altered to contain any one of the fragments of the T. pallidum genome of claim 10.
14. A method for regulating the expression of a nucleic acid molecule comprising the step of covalently attaching to said nucleic acid molecule a nucleic acid molecule consisting of the nucleotide sequence from about 10 to 100 bases 5' to any one of the fragments of the T. pallidum genome depicted in SEQ ID NOS: 1-744 and Tables 2 and 3 or a degenerate variant thereof.
15. An isolated nucleic acid molecule encoding a homolog of any of the fragments of the T. pallidum genome of SEQ ID NOS: 1-744 and Tables 2 and 3, wherein said nucleic acid molecule is produced by a process comprising steps of: a) screening a genomic DNA library using as a probe a target sequence defined by any of SEQ ID NOS: 1-744 and Tables 2 and 3, including fragments thereof; b) identifying members of said library which contain sequences that hybridize to said target sequence; and c) isolating the nucleic acid molecules from said members identified in step (b).
16. An isolated DNA molecule encoding a homolog of any one of the fragments of the T. pallidum genome of SEQ ID NOS: 1-744 and Tables 2 and 3, wherein said nucleic acid molecule is produced by a process comprising steps of: a) isolating mRNA, DNA, or cDNA produced from an organism; b) amplifying nucleic acid molecules whose nucleotide sequence is homologous to amplification primers derived from said fragment of said T. pallidum genome to prime said amplification; c) isolating said amplified sequences produced in step (b).
17. An isolated polypeptide encoded by any of the fragments of the T. pallidum genome of SEQ ID NOS: 1-744 and depicted in Table 2 and 3 or by a degenerate variant of said fragments.
18. An isolated polynucleotide molecule encoding any one of the polypeptides of claim 17.
19. An antibody which selectively binds to any one of the polypeptides of claim 17.
20. A method for producing a polypeptide in a host cell comprising the steps of: a) incubating a host containing a heterologous nucleic acid molecule whose nucleotide sequence consists of any one of the fragments of the T. pallidum genome of SEQ ID NOS: 1-744 and depicted in Tables 2 and 3, under conditions where said heterologous nucleic acid molecule is expressed to produce said protein, and b) isolating said protein.
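Illustrative note (not part of the claims): the storage, search, and retrieval elements recited in claim 5 can be pictured with the minimal Python sketch below. It is an assumption-laden illustration only. The names `percent_identity` and `search_database`, the toy sequences, and the use of a simple ungapped sliding-window comparison are all hypothetical; the specification's own definition of "at least 95% identical" may rely on a gapped alignment program and is not reproduced here.

```python
# Hypothetical sketch of the three elements of claim 5:
# (a) data storage, (b) search means, (c) retrieval means.
# Ungapped comparison is an assumption made for brevity.

def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def search_database(database: dict, target: str, threshold: float = 95.0) -> list:
    """Return (sequence id, offset, identity) for each stored sequence whose
    best ungapped window reaches the identity threshold against the target."""
    target = target.upper()
    hits = []
    for seq_id, seq in database.items():
        seq = seq.upper()
        best = None
        # Slide the target along the stored sequence, one base at a time.
        for start in range(len(seq) - len(target) + 1):
            window = seq[start:start + len(target)]
            identity = percent_identity(window, target)
            if best is None or identity > best[1]:
                best = (start, identity)
        if best is not None and best[1] >= threshold:
            hits.append((seq_id, best[0], best[1]))
    return hits


# (a) data storage: toy sequences, not taken from the sequence listing.
toy_database = {
    "TOY_1": "TTTT" + "ACGT" * 5 + "GGGG",
    "TOY_2": "CCCCCCCCCCCCCCCCCCCCCCCC",
}
# (b) search and (c) retrieval for a 20-base target with one mismatch.
toy_hits = search_database(toy_database, "ACGTACGTACGTACGTACGA")
print(toy_hits)
```

A production system would more likely use an established alignment tool so that insertions and deletions are handled; the sliding window is used here only because it keeps the identity calculation easy to follow.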
PCT/US1998/013041 1997-06-24 1998-06-23 Treponema pallidum polynucleotides and sequences WO1998059034A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP98931511A EP0990022A1 (en) 1997-06-24 1998-06-23 TREPONEMA PALLIDUM POLYNUCLEOTIDES AND SEQUENCES
CA002296814A CA2296814A1 (en) 1997-06-24 1998-06-23 Treponema pallidum polynucleotides and sequences
AU81623/98A AU8162398A (en) 1997-06-24 1998-06-23 Treponema pallidum polynucleotides and sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US5066797P 1997-06-24 1997-06-24
US60/050,667 1997-06-24

Publications (2)

Publication Number Publication Date
WO1998059034A2 (en) 1998-12-30
WO1998059034A3 (en) 2000-06-29

Family

ID=21966650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/013041 WO1998059034A2 (en) 1997-06-24 1998-06-23 Treponema pallidum polynucleotides and sequences

Country Status (4)

Country Link
EP (1) EP0990022A1 (en)
AU (1) AU8162398A (en)
CA (1) CA2296814A1 (en)
WO (1) WO1998059034A2 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999054470A2 (en) * 1998-04-22 1999-10-28 Glaxo Group Limited Bacterial ygjd polypeptide family
FR2798386A1 (en) * 1999-09-10 2001-03-16 Didier Raoult SINGLE-SIDED OLIGONUCLEOTIDES, PROBES, PRIMERS AND SPIROCHETES DETECTION METHOD
CN102277418A (en) * 2011-01-26 2011-12-14 宁波基内生物技术有限公司 Primer, kit and method used for detecting Treponema pallidum
CN109021082A (en) * 2018-07-27 2018-12-18 南华大学 The expression and purification of microspironema pallidum recombinant protein Tp0971 and application
WO2024050428A3 (en) * 2022-08-31 2024-06-27 University Of Washington Compositions, kits, and methods for detection of syphilis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wisconsin Sequence Analysis Package, User's Guide, Version 8, 1995, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, Wisconsin 53711, pages i-xvii, 1-1 to 4-12, and A-1 to A-17 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999054470A2 (en) * 1998-04-22 1999-10-28 Glaxo Group Limited Bacterial ygjd polypeptide family
WO1999054470A3 (en) * 1998-04-22 2000-03-30 Glaxo Group Ltd Bacterial ygjd polypeptide family
FR2798386A1 (en) * 1999-09-10 2001-03-16 Didier Raoult SINGLE-SIDED OLIGONUCLEOTIDES, PROBES, PRIMERS AND SPIROCHETES DETECTION METHOD
WO2001020028A1 (en) * 1999-09-10 2001-03-22 Universite De La Mediterranee (Aix-Marseille Ii) Single stranded oligonucleotides, probes, primers and method for detecting spirochetes
US7141658B1 (en) 1999-09-10 2006-11-28 Biomerieux Single stranded oligonucleotides, probes, primers and method for detecting spirochetes
CN102277418A (en) * 2011-01-26 2011-12-14 宁波基内生物技术有限公司 Primer, kit and method used for detecting Treponema pallidum
CN109021082A (en) * 2018-07-27 2018-12-18 南华大学 The expression and purification of microspironema pallidum recombinant protein Tp0971 and application
WO2024050428A3 (en) * 2022-08-31 2024-06-27 University Of Washington Compositions, kits, and methods for detection of syphilis

Also Published As

Publication number Publication date
CA2296814A1 (en) 1998-12-30
AU8162398A (en) 1999-01-04
EP0990022A1 (en) 2000-04-05
WO1998059034A3 (en) 2000-06-29

Similar Documents

Publication Publication Date Title
KR102644935B1 (en) Microbiota composition as a marker of reactivity to anti-PD1/PD-L1/PD-L2 antibodies, and use of microbial modifiers to improve the efficacy of anti-PD1/PD-L1/PD-L2 Ab-based therapy
AU2024202444A1 (en) Ammonia-oxidizing nitrosomonas eutropha strain D23
AU2021269424A1 (en) Composition comprising bacterial strains
AU2021290210A1 (en) Compositions comprising bacterial strains
KR102523805B1 (en) Immune modulation
AU2021202753A1 (en) Isolated polynucleotides and polypeptides and methods of using same for increasing plant yield, biomass, growth rate, vigor, oil content, abiotic stress tolerance of plants and nitrogen use efficiency
AU2021200769A1 (en) Compositions comprising bacterial strains
AU2018203260A1 (en) Isolated Polynucleotides and Polypeptides, and Methods of using same for increasing Abiotic Stress Tolerance, Yield, Growth Rate, Vigor, Biomass, Oil Content, and/or Nitrogen use Efficiency of Plants
TW202222339A (en) Compositions comprising bacterial strains
AU2016274683A1 (en) Streptomyces endophyte compositions and methods for improved agronomic traits in plants
KR20170005829A (en) Compositions for mosquito control and uses of same
KR20070086634A (en) Industrially useful microorganisms
AU7280598A (en) Enterococcus faecalis polynucleotides and polypeptides
KR102363357B1 (en) Microorganism for producing a short-chain fatty acids, microorganism with antivacterial activity for antibiotics resistant pathogenic bacteria and antibacterial composition using the same
AU2016295174A1 (en) Genetic testing for predicting resistance of salmonella species against antimicrobial agents
AU2016295176A1 (en) Genetic testing for predicting resistance of gram-negative proteus against antimicrobial agents
KR20210068484A (en) Microbiota composition as a marker of reactivity to anti-PD1/PD-L1/PD-L2 antibodies in renal cell carcinoma
JP2002355074A (en) Nucleic acid molecule and polypeptide specific to enteropathogenic escherichia coli o157:h7 and method for using the same
KR20200038970A (en) Composition comprising a bacterial strain
CN107208149A (en) The biomarker of colorectal cancer relevant disease
KR20220004117A (en) Probiotic strains with increased storage stability
KR102411381B1 (en) Novel bacillus subtilis strain with high productivity of surfactin and enzyme and use of the same
WO1998059034A2 (en) Treponema pallidum polynucleotides and sequences
KR20190059562A (en) Novel Bacillus subtilis having proteolytic activity and uses thereof
KR20190057790A (en) Novel Bacillus subtilis having proteolytic activity and uses thereof

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 2296814

Country of ref document: CA

Ref country code: CA

Ref document number: 2296814

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1998931511

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1999504981

Format of ref document f/p: F

WWP Wipo information: published in national office

Ref document number: 1998931511

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

WWW Wipo information: withdrawn in national office

Ref document number: 1998931511

Country of ref document: EP