AUSTRALIA
Patents Act 1990
THE INSTITUTE FOR GENOMIC RESEARCH, CHIRON CORPORATION
COMPLETE SPECIFICATION
STANDARD PATENT
Invention Title: Neisseria meningitidis antigens and compositions
The following statement is a full description of this invention including the best method of performing it known to us:-
NEISSERIA MENINGITIDIS ANTIGENS AND COMPOSITIONS
This is a divisional of NZ 508366, the entire contents of which are incorporated herein by reference.
This invention relates to antigens from the bacterial species Neisseria meningitidis and Neisseria gonorrhoeae.
BACKGROUND
Neisseria meningitidis is a non-motile, Gram-negative diplococcus and a human pathogen. It colonizes the pharynx, causing meningitis and, occasionally, septicaemia in the absence of meningitis. It is closely related to N. gonorrhoeae, although one feature that clearly differentiates meningococcus from gonococcus is the polysaccharide capsule that is present in all pathogenic meningococci.
N. meningitidis causes both endemic and epidemic disease. In the United States the attack rate is 0.6-1 per 100,000 persons per year, and it can be much greater during outbreaks (see Lieberman et al. (1996) Safety and Immunogenicity of a Serogroups A/C Neisseria meningitidis Oligosaccharide-Protein Conjugate Vaccine in Young Children. JAMA 275(19):1499-1503; Schuchat et al. (1997) Bacterial Meningitis in the United States in 1995. N Engl J Med 337(14):970-976). In developing countries, endemic disease rates are much higher and during epidemics incidence rates can reach 500 cases per 100,000 persons per year. Mortality is extremely high, at 10-20% in the United States, and much higher in developing countries. Following the introduction of the conjugate vaccine against Haemophilus influenzae, N. meningitidis is the major cause of bacterial meningitis at all ages in the United States (Schuchat et al. (1997) supra).
Based on the organism's capsular polysaccharide, 12 serogroups of N. meningitidis have been identified. Group A is the pathogen most often implicated in epidemic disease in sub-Saharan Africa. Serogroups B and C are responsible for the vast majority of cases in the United States and in most developed countries. Serogroups W135 and Y are responsible for the rest of the cases in the United States and developed countries. The meningococcal vaccine currently in use is a tetravalent polysaccharide vaccine composed of serogroups A, C, Y and W135. Although efficacious in adolescents and adults, it induces a poor immune response and short duration of protection, and cannot be used in infants [eg. Morbidity and Mortality Weekly Report, Vol. 46, No. RR-5 (1997)]. This is because polysaccharides are T-cell independent antigens that induce a weak immune response that cannot be boosted by repeated immunization. Following the success of the vaccination against H. influenzae, conjugate vaccines against serogroups A and C have been developed and are at the final stage of clinical testing (Zollinger WD "New and Improved Vaccines Against Meningococcal Disease". In: New Generation Vaccines, supra, pp. 469-488; Lieberman et al. (1996) supra; Costantino et al. (1992) Development and phase I clinical testing of a conjugate vaccine against meningococcus A and C. Vaccine 10:691-698).
Meningococcus B (menB) remains a problem, however. This serotype currently is responsible for approximately 50% of total meningitis in the United States, Europe, and South America. The polysaccharide approach cannot be used because the menB capsular polysaccharide is a polymer of α(2-8)-linked N-acetyl neuraminic acid that is also present in mammalian tissue. This results in tolerance to the antigen; indeed, if an immune response were elicited, it would be anti-self, and therefore undesirable. In order to avoid induction of autoimmunity and to induce a protective immune response, the capsular polysaccharide has, for instance, been chemically modified by substituting the N-acetyl groups with N-propionyl groups, leaving the specific antigenicity unaltered (Romero & Outschoorn (1994) Current status of Meningococcal group B vaccine candidates: capsular or non-capsular? Clin Microbiol Rev 7(4):559-575).
Alternative approaches to menB vaccines have used complex mixtures of outer membrane proteins (OMPs), containing either the OMPs alone, or OMPs enriched in porins, or deleted of the class 4 OMPs that are believed to induce antibodies that block bactericidal activity. This approach produces vaccines that are not well characterized. They are able to protect against the homologous strain, but are not effective at large where there are many antigenic variants of the outer membrane proteins. To overcome the antigenic variability, multivalent vaccines containing up to nine different porins have been constructed (eg. Poolman JT (1992) Development of a meningococcal vaccine. Infect. Agents Dis. 4:13-28).
Additional proteins to be used in outer membrane vaccines have been the opa and opc proteins, but none of these approaches has been able to overcome the antigenic variability (eg. Ala'Aldeen & Borriello (1996) The meningococcal transferrin-binding proteins 1 and 2 are both surface exposed and generate bactericidal antibodies capable of killing homologous and heterologous strains. Vaccine 14(1):49-53).
A certain amount of sequence data is available for meningococcal and gonococcal genes and proteins (eg. EP-A-0467714, WO 96/29412), but this is by no means complete. Other menB proteins may include those listed in WO 97/28273, WO 96/29412, WO 95/03413, US 5,439,808, and US 5,879,686.
The provision of further sequences could provide an opportunity to identify secreted or surface-exposed proteins that are presumed targets for the immune system and which are not antigenically variable. For instance, some of the identified proteins could be components of efficacious vaccines against meningococcus B, some could be components of vaccines against all meningococcal serotypes, and others could be components of vaccines against all pathogenic Neisseriae including Neisseria meningitidis or Neisseria gonorrhoeae. Those sequences specific to N. meningitidis or N. gonorrhoeae that are more highly conserved are further preferred sequences.
It is thus desirable to provide Neisserial DNA sequences which encode proteins that are antigenic or immunogenic.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates the products of protein expression and purification of the predicted ORF 919 as cloned and expressed in E. coli.
Fig. 2 illustrates the products of protein expression and purification of the predicted ORF 279 as cloned and expressed in E. coli. The predicted gene 279 was cloned in pGex vector and expressed in E. coli. The product of protein expression and purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 279-GST purification. Mice were immunized with the purified 279-GST and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that protein 279 is a surface-exposed protein. Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 3 illustrates the products of protein expression and purification of the predicted ORF 576-1 as cloned and expressed in E. coli. The predicted gene 576 was cloned in pGex vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 576-GST fusion protein purification. Mice were immunized with the purified 576-GST and sera were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that protein 576 is a surface-exposed protein.
Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 4 illustrates the products of protein expression and purification of the predicted ORF 519-1 as cloned and expressed in E. coli. The predicted gene 519 was cloned in pET vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 519-His fusion protein purification. Mice were immunized with the purified 519-His and sera were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that protein 519 is a surface-exposed protein.
Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 5 illustrates the products of protein expression and purification of the predicted ORF 121-1 as cloned and expressed in E. coli. The predicted gene 121 was cloned in pET vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 121-His fusion protein purification. Mice were immunized with the purified 121-His and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that 121 is a surface-exposed protein.
Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 6 illustrates the products of protein expression and purification of the predicted ORF 128-1 as cloned and expressed in E. coli. The predicted gene 128 was cloned in pET vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 128-His purification. Mice were immunized with the purified 128-His and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that 128 is a surface-exposed protein. Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 7 illustrates the products of protein expression and purification of the predicted ORF 206 as cloned and expressed in E. coli. The predicted gene 206 was cloned in pET vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 206-His purification.
Mice were immunized with the purified 206-His and sera were used for Western blot analysis (panel B). It is worth noting that the immunoreactive band in protein extracts from meningococcus is 38 kDa instead of 17 kDa (panel A). To gain information on the nature of this antibody staining we expressed ORF 206 in E. coli without the His-tag and including the predicted leader peptide. Western blot analysis on total protein extracts from E. coli expressing this native form of the 206 protein showed a reactive band at a position of 38 kDa, as observed in meningococcus. We conclude that the 38 kDa band in panel B) is specific and that anti-206 antibodies likely recognize a multimeric protein complex. In panel C) is shown the FACS analysis, in panel D) the bactericidal assay, and in panel E) the ELISA assay. Results show that 206 is a surface-exposed protein. Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 8 illustrates the products of protein expression and purification of the predicted ORF 287 as cloned and expressed in E. coli. The predicted gene 287 was cloned in pGex vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 287-GST fusion protein purification. Mice were immunized with the purified 287-GST and sera were used for FACS analysis (panel B), bactericidal assay (panel C), and ELISA assay (panel D).
Results show that 287 is a surface-exposed protein. Symbols: M1, molecular weight marker. Arrows indicate the position of the main recombinant protein product.
Fig. 9 illustrates the products of protein expression and purification of the predicted ORF 406 as cloned and expressed in E. coli. The predicted gene 406 was cloned in pET vector and expressed in E. coli. The product of protein purification was analysed by SDS-PAGE. In panel A) is shown the analysis of 406-His fusion protein purification. Mice were immunized with the purified 406-His and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that 406 is a surface-exposed protein.
Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band.
Fig. 10 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 919 as cloned and expressed in E. coli.
Fig. 11 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 279 as cloned and expressed in E. coli.
Fig. 12 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 576-1 as cloned and expressed in E. coli.
Fig. 13 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 519-1 as cloned and expressed in E. coli.
Fig. 14 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 121-1 as cloned and expressed in E. coli.
Fig. 15 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 128-1 as cloned and expressed in E. coli.
Fig. 16 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 206 as cloned and expressed in E. coli.
Fig. 17 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 287 as cloned and expressed in E. coli.
Fig. 18 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the products of protein expression of the predicted ORF 406 as cloned and expressed in E. coli.
Fig. 19 shows an alignment comparison of amino acid sequences for ORF 225 for several strains of Neisseria. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. The Figure demonstrates a high degree of conservation among the various strains, further confirming its utility as an antigen for both vaccines and diagnostics.
Fig. 20 shows an alignment comparison of amino acid sequences for ORF 235 for several strains of Neisseria. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. The Figure demonstrates a high degree of conservation among the various strains, further confirming its utility as an antigen for both vaccines and diagnostics.
Fig. 21 shows an alignment comparison of amino acid sequences for ORF 287 for several strains of Neisseria. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. The Figure demonstrates a high degree of conservation among the various strains, further confirming its utility as an antigen for both vaccines and diagnostics.
Fig. 22 shows an alignment comparison of amino acid sequences for ORF 519 for several strains of Neisseria. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. The Figure demonstrates a high degree of conservation among the various strains, further confirming its utility as an antigen for both vaccines and diagnostics.
Fig. 23 shows an alignment comparison of amino acid sequences for ORF 919 for several strains of Neisseria. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. The Figure demonstrates a high degree of conservation among the various strains, further confirming its utility as an antigen for both vaccines and diagnostics.
THE INVENTION
The invention provides proteins comprising the N. meningitidis amino acid sequences and N. gonorrhoeae amino acid sequences disclosed in the examples.
It also provides proteins comprising sequences homologous (ie. those having sequence identity) to the N. meningitidis amino acid sequences disclosed in the examples.
Depending on the particular sequence, the degree of homology (sequence identity) is preferably greater than 50% (eg. 60%, 70%, 80%, 90%, 95%, 99% or more). These proteins include mutants and allelic variants of the sequences disclosed in the examples. Typically, 50% identity or more between two proteins is considered to be an indication of functional equivalence. Identity between proteins is preferably determined by the Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford Molecular), using an affine gap search with parameters: gap penalty 12, gap extension penalty 1.
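By way of illustration only, the affine gap search referred to above can be sketched in Python. This is not the MPSRCH implementation: the match and mismatch scores (+5 and -4) are placeholders standing in for a real amino acid substitution matrix, the convention that the first gap position costs 12 and each further position costs 1 is an assumption, and dividing identical positions by the length of the local alignment is only one of several possible definitions of percent identity.

```python
def smith_waterman_affine(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Smith-Waterman local alignment with affine gaps (Gotoh recurrences).

    Returns (best_score, aligned_a, aligned_b). Scores are illustrative only.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best local score ending at (i, j); E: ending with a gap in a;
    # F: ending with a gap in b.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the local score drops to zero.
    i, j = best_pos
    state, out_a, out_b = "H", [], []
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":  # gap in a, residue consumed from b
            out_a.append("-")
            out_b.append(b[j - 1])
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"
            j -= 1
        else:  # state "F": gap in b, residue consumed from a
            out_a.append(a[i - 1])
            out_b.append("-")
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)
```

For two hypothetical peptides that differ by a two-residue insertion, the sketch opens a single gap rather than splitting the alignment, and the resulting identity can then be compared with a chosen threshold such as 50%.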
The invention further provides proteins comprising fragments of the N. meningitidis amino acid sequences and N. gonorrhoeae amino acid sequences disclosed in the examples.
The fragments should comprise at least n consecutive amino acids from the sequences and, depending on the particular sequence, n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20 or more).
Preferably the fragments comprise an epitope from the sequence.
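As a minimal sketch of the fragment criterion, the following Python enumerates every run of n consecutive residues in a reference sequence and tests whether a candidate protein contains at least one of them. The function names and the example sequences are hypothetical and are not taken from the sequence listing; n defaults to 7, the lower bound stated above.

```python
def fragments(sequence, n=7):
    """Yield every window of n consecutive residues from sequence."""
    for start in range(len(sequence) - n + 1):
        yield sequence[start:start + n]


def shares_fragment(candidate, reference, n=7):
    """True if candidate contains at least n consecutive residues of reference."""
    return any(frag in candidate for frag in fragments(reference, n))


# Hypothetical 12-residue reference, used only to exercise the functions.
reference = "MKKTAYIAKQRQ"
print(len(list(fragments(reference))))               # number of 7-residue windows
print(shares_fragment("GGGKTAYIAKGGG", reference))   # shares KTAYIAK
print(shares_fragment("GGGKTAYIAGGG", reference))    # longest shared run is 6
```

Whether a given fragment also comprises an epitope is an experimental question that this check does not address.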
The proteins of the invention can, of course, be prepared by various means (eg. recombinant expression, purification from cell culture, chemical synthesis etc.) and in various forms (eg. native, fusions etc.). They are preferably prepared in substantially pure or isolated form (ie. substantially free from other N. meningitidis or N. gonorrhoeae host cell proteins).
According to a further aspect, the invention provides antibodies which bind to these proteins. These may be polyclonal or monoclonal and may be produced by any suitable means.
According to a further aspect, the invention provides nucleic acid comprising the N. meningitidis nucleotide sequences and N. gonorrhoeae nucleotide sequences disclosed in the examples.
According to a further aspect, the invention comprises nucleic acids having sequence identity of greater than 50% (eg. 60%, 70%, 80%, 90%, 95%, 99% or more) to the nucleic acid sequences herein. Sequence identity is determined as discussed above.
According to a further aspect, the invention comprises nucleic acid that hybridizes to the sequences provided herein. Conditions for hybridization are set forth herein.
Nucleic acid comprising fragments of these sequences are also provided. These should comprise at least n consecutive nucleotides from the N. meningitidis sequences or N. gonorrhoeae sequences and, depending on the particular sequence, n is 10 or more (eg. 12, 14, 18, 20, 25, 30, 35, 40 or more).
According to a further aspect, the invention provides nucleic acid encoding the proteins and protein fragments of the invention.
It should also be appreciated that the invention provides nucleic acid comprising sequences complementary to those described above (eg. for antisense or probing purposes).
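For illustration, a complementary (antisense or probe) sequence can be derived from a DNA strand by reverse complementation. The sketch below assumes unambiguous DNA bases (A, C, G, T) written 5' to 3'; the function names and example sequences are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def probe_can_pair(probe, target):
    """True if probe is perfectly complementary to some stretch of target."""
    return reverse_complement(probe) in target


print(reverse_complement("ATGCCG"))              # CGGCAT
print(probe_can_pair("CGGCAT", "GGATGCCGTT"))    # True
```

Perfect complementarity is a simplification; hybridization under real conditions tolerates some mismatches depending on stringency.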
Nucleic acid according to the invention can, of course, be prepared in many ways (eg. by chemical synthesis, in part or in whole, from genomic or cDNA libraries, from the organism itself etc.) and can take various forms (eg. single stranded, double stranded, vectors, probes etc.).
In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing modified backbones, and also peptide nucleic acids (PNA) etc.
According to a further aspect, the invention provides vectors comprising nucleotide sequences of the invention (eg. expression vectors) and host cells transformed with such vectors.
According to a further aspect, the invention provides compositions comprising protein, antibody, and/or nucleic acid according to the invention. These compositions may be suitable as vaccines, for instance, or as diagnostic reagents or as immunogenic compositions.
The invention also provides nucleic acid, protein, or antibody according to the invention for use as medicaments (eg. as vaccines) or as diagnostic reagents. It also provides the use of nucleic acid, protein, or antibody according to the invention in the manufacture of (i) a medicament for treating or preventing infection due to Neisserial bacteria; (ii) a diagnostic reagent for detecting the presence of Neisserial bacteria or of antibodies raised against Neisserial bacteria; or (iii) a reagent for raising antibodies. Said Neisserial bacteria may be any species or strain (such as N. gonorrhoeae) but are preferably N. meningitidis, especially strain B or strain C.
The invention also provides a method of treating a patient, comprising administering to the patient a therapeutically effective amount of nucleic acid, protein, and/or antibody according to the invention.
According to further aspects, the invention provides various processes.
A process for producing proteins of the invention is provided, comprising the step of culturing a host cell according to the invention under conditions which induce protein expression.
A process for detecting polynucleotides of the invention is provided, comprising the steps of: (a) contacting a nucleic acid probe according to the invention with a biological sample under hybridizing conditions to form duplexes; and (b) detecting said duplexes.
A process for detecting proteins of the invention is provided, comprising the steps of: (a) contacting an antibody according to the invention with a biological sample under conditions suitable for the formation of antibody-antigen complexes; and (b) detecting said complexes.
In another embodiment, the present invention provides a substantially purified protein comprising an amino acid sequence selected from the group consisting of SEQ ID Nos: 2534, 2536 and 2538.
In a further embodiment, the present invention provides a substantially purified protein comprising an amino acid sequence which has 50% or greater identity to an amino acid sequence selected from the group consisting of SEQ ID Nos: 2534, 2536 and 2538.
In yet another embodiment, the present invention provides a substantially purified protein comprising a fragment of 7 or more amino acids of an amino acid sequence selected from the group consisting of SEQ ID Nos: 2534, 2536 and 2538.
Preferably, the fragment comprises an epitope from SEQ ID Nos: 2534, 2536 or 2538.
In a further embodiment, the present invention provides a substantially purified protein comprising the amino acid sequence of SEQ ID NO:2944.
In a further embodiment, the present invention provides a substantially purified protein comprising an amino acid sequence which sequence has 50% or greater identity to the amino acid sequence of SEQ ID NO:2944.
In a further embodiment, the present invention provides a substantially purified protein comprising a fragment of 7 or more amino acids of SEQ ID NO:2944.
Preferably, the fragment comprises an epitope from SEQ ID NO:2944.
In a further embodiment, the present invention provides a substantially purified antibody which binds to a protein according to the invention.
Preferably, the antibody is a monoclonal antibody.
Also provided is an isolated nucleic acid which encodes a protein of the invention.
Preferably, the nucleic acid comprises a nucleotide sequence selected from the group consisting of SEQ ID Nos: 2533, 2535 and 2537.
In another embodiment, the present invention provides an isolated nucleic acid comprising a fragment of 10 or more nucleotides of a nucleotide sequence selected from the group consisting of SEQ ID Nos: 2533, 2535 and 2537.
In another embodiment, the present invention provides an isolated nucleic acid comprising a fragment of 15 or more nucleotides of a nucleotide sequence selected from the group consisting of SEQ ID Nos: 2533, 2535 and 2537.
In yet a further embodiment, the present invention provides an isolated nucleic acid comprising a nucleotide sequence which has 50% or greater identity to a nucleotide sequence selected from the group consisting of SEQ ID Nos: 2533, 2535 and 2537.
In another embodiment, the present invention provides an isolated nucleic acid comprising a fragment of 10 or more nucleotides from SEQ ID NO:2943.
In another embodiment the present invention provides an isolated nucleic acid comprising a fragment of 15 or more nucleotides from SEQ ID NO:2943.
In another embodiment, the present invention provides an isolated nucleic acid comprising a nucleotide sequence which sequence has 50% or greater identity to the nucleotide sequence of SEQ ID NO:2943.
In another embodiment, the present invention provides an isolated nucleic acid comprising a nucleotide sequence complementary to a nucleic acid sequence according to the invention.
In yet a further embodiment, the present invention provides an isolated nucleic acid which can hybridise to the nucleic acid described herein under high stringency conditions.
In yet another embodiment, the present invention provides a substantially purified protein encoded by the nucleic acid sequences according to the invention.
In another embodiment, the present invention provides a composition comprising a protein, a nucleic acid, or an antibody according to the invention.
Preferably, the composition is a vaccine composition or a diagnostic composition.
Also provided is a composition of the invention for use as a pharmaceutical.
In yet another embodiment, the present invention provides for the use of a composition of the invention in the manufacture of a medicament for the treatment or prevention of infection due to Neisserial bacteria.
In a further embodiment, the present invention provides a method of treating or preventing an infection in a subject due to Neisserial bacteria, the method comprising administering to the subject a composition of the invention.
Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.
Methodology - Summary of standard procedures and techniques
General
This invention provides Neisseria meningitidis menB nucleotide sequences and the amino acid sequences encoded therein. With these disclosed sequences, nucleic acid probe assays and expression cassettes and vectors can be produced. The expression vectors can be transformed into host cells to produce proteins. The purified or isolated polypeptides (which may also be chemically synthesized) can be used to produce antibodies to detect menB proteins. Also, the host cells or extracts can be utilized for biological assays to isolate agonists or antagonists. In addition, with these sequences one can search to identify open reading frames and identify amino acid sequences. The proteins may also be used in immunogenic compositions, antigenic compositions and as vaccine components.
Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained in the literature, eg. Sambrook, Molecular Cloning: A Laboratory Manual, Second Edition (1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.), especially volumes 154 and 155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. (1987), Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Scopes, (1987) Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.); and Handbook of Experimental Immunology, Volumes I-IV (D.M. Weir and C.C. Blackwell eds. 1986).
Standard abbreviations for nucleotides and amino acids are used in this specification.
All publications, patents, and patent applications cited herein are incorporated in full by reference.
Expression systems
The Neisseria menB nucleotide sequences can be expressed in a variety of different expression systems; for example those used with mammalian cells, plant cells, baculoviruses, bacteria, and yeast.
i. Mammalian Systems
Mammalian expression systems are known in the art. A mammalian promoter is any DNA sequence capable of binding mammalian RNA polymerase and initiating the downstream (3') transcription of a coding sequence (eg. structural gene) into mRNA. A promoter will have a transcription initiating region, which is usually placed proximal to the 5' end of the coding sequence, and a TATA box, usually located 25-30 base pairs (bp) upstream of the transcription initiation site. The TATA box is thought to direct RNA polymerase II to begin RNA synthesis at the correct site. A mammalian promoter will also contain an upstream promoter element, usually located within 100 to 200 bp upstream of the TATA box.
An upstream promoter element determines the rate at which transcription is initiated and can act in either orientation (Sambrook et al. (1989) "Expression of Cloned Genes in Mammalian Cells." In Molecular Cloning: A Laboratory Manual, 2nd ed.).
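The promoter layout described above can be turned into a simple coordinate check. The Python sketch below is illustrative only: it treats the literal string "TATA" as a stand-in for the TATA box consensus, assumes the transcription initiation site is already known as a 0-based index, and uses a made-up sequence; the function names are hypothetical.

```python
def tata_candidates(seq, tss_index, motif="TATA", min_dist=25, max_dist=30):
    """Return start indices of motif lying 25-30 bp upstream of tss_index."""
    hits = []
    for dist in range(min_dist, max_dist + 1):
        start = tss_index - dist
        if start >= 0 and seq.startswith(motif, start):
            hits.append(start)
    return sorted(hits)


def upstream_element_window(tata_start, min_dist=100, max_dist=200):
    """Index range 100-200 bp upstream of the TATA box (clamped at 0)."""
    return max(0, tata_start - max_dist), max(0, tata_start - min_dist)


# Made-up sequence: 300 bp of filler, a TATA box, a 22 bp spacer, then the
# position taken here as the transcription initiation site (index 328).
seq = "G" * 300 + "TATAAA" + "C" * 22 + "ATGAAA"
hits = tata_candidates(seq, tss_index=328)
print(hits)
print(upstream_element_window(hits[0]))
```

Real promoter prediction relies on position weight matrices and experimental mapping of the initiation site; the fixed distances here simply mirror the typical spacings given in the text.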
Mammalian viral genes are often highly expressed and have a broad host range; therefore sequences encoding mammalian viral genes provide particularly useful promoter sequences. Examples include the SV40 early promoter, mouse mammary tumor virus LTR promoter, adenovirus major late promoter (Ad MLP), and herpes simplex virus promoter. In addition, sequences derived from non-viral genes, such as the murine metallothionein gene, also provide useful promoter sequences. Expression may be either constitutive or regulated (inducible). Depending on the promoter selected, many promoters may be inducible using known substrates, such as the use of the mouse mammary tumor virus (MMTV) promoter with the glucocorticoid responsive element (GRE) that is induced by glucocorticoid in hormone-responsive transformed cells (see for example, U.S. Patent 5,783,681).
The presence of an enhancer element (enhancer), combined with the promoter elements described above, will usually increase expression levels. An enhancer is a regulatory DNA sequence that can stimulate transcription up to 1000-fold when linked to homologous or heterologous promoters, with synthesis beginning at the normal RNA start site. Enhancers are also active when they are placed upstream or downstream from the transcription initiation site, in either normal or flipped orientation, or at a distance of more than 1000 nucleotides from the promoter (Maniatis et al. (1987) Science 236:1237; Alberts et al. (1989) Molecular Biology of the Cell, 2nd ed.). Enhancer elements derived from viruses may be particularly useful, because they usually have a broader host range. Examples include the SV40 early gene enhancer (Dijkema et al (1985) EMBO J. 4:761) and the enhancer/promoters derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus (Gorman et al. (1982b) Proc. Natl. Acad. Sci. 79:6777) and from human cytomegalovirus (Boshart et al. (1985) Cell 41:521). Additionally, some enhancers are regulatable and become active only in the presence of an inducer, such as a hormone or metal ion (Sassone-Corsi and Borelli (1986) Trends Genet. 2:215; Maniatis et al. (1987) Science 236:1237).
A DNA molecule may be expressed intracellularly in mammalian cells. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If desired, the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide.
Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion of the foreign protein in mammalian cells.
Preferably, there are processing sites encoded between the leader fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The adenovirus tripartite leader is an example of a leader sequence that provides for secretion of a foreign protein in mammalian cells.
Usually, transcription termination and polyadenylation sequences recognized by mammalian cells are regulatory regions located 3' to the translation stop codon and thus, together with the promoter elements, flank the coding sequence. The 3' terminus of the mature mRNA is formed by site-specific post-transcriptional cleavage and polyadenylation (Birnstiel et al. (1985) Cell 41:349; Proudfoot and Whitelaw (1983) "Termination and 3' end processing of eukaryotic RNA." In Transcription and splicing (ed. B.D. Hames and D.M. Glover); Proudfoot (1989) Trends Biochem. Sci. 14:105). These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA.
Examples of transcription terminator/polyadenylation signals include those derived from SV40 (Sambrook et al (1989) "Expression of cloned genes in cultured mammalian cells." In Molecular Cloning: A Laboratory Manual).
Usually, the above described components, comprising a promoter, polyadenylation signal, and transcription termination sequence are put together into expression constructs.
Enhancers, introns with functional splice donor and acceptor sites, and leader sequences may also be included in an expression construct, if desired. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as mammalian cells or bacteria. Mammalian replication systems include those derived from animal viruses, which require trans-acting factors to replicate. For example, plasmids containing the replication systems of papovaviruses, such as SV40 (Gluzman (1981) Cell 23:175) or polyomavirus, replicate to extremely high copy number in the presence of the appropriate viral T antigen. Additional examples of mammalian replicons include those derived from bovine papillomavirus and Epstein-Barr virus. Additionally, the replicon may have two replication systems, thus allowing it to be maintained, for example, in mammalian cells for expression and in a prokaryotic host for cloning and amplification. Examples of such mammalian-bacteria shuttle vectors include pMT2 (Kaufman et al. (1989) Mol. Cell. Biol. 9:946) and pHEBO (Shimizu et al. (1986) Mol. Cell. Biol. 6:1074).
The transformation procedure used depends upon the host to be transformed. Methods for introduction of heterologous polynucleotides into mammalian cells are known in the art and include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.
Mammalian cell lines available as hosts for expression are known in the art and include many immortalized cell lines available from the American Type Culture Collection (ATCC), including but not limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells (COS), human hepatocellular carcinoma cells (e.g. Hep G2), and a number of other cell lines.
ii. Plant Cellular Expression Systems
There are many plant cell culture and whole plant genetic expression systems known in the art. Exemplary plant cellular genetic expression systems include those described in patents, such as: U.S. 5,693,506; US 5,659,122; and US 5,608,143. Additional examples of genetic expression in plant cell culture have been described by Zenk, Phytochemistry 30:3861-3863 (1991). Descriptions of plant protein signal peptides may be found, in addition to the references described above, in Vaulcombe et al., Mol. Gen. Genet. 209:33-40 (1987); Chandler et al., Plant Molecular Biology 3:407-418 (1984); Rogers, J. Biol. Chem. 260:3731-3738 (1985); Rothstein et al., Gene 55:351-356 (1987); Whittier et al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the regulation of plant gene expression by the phytohormone gibberellic acid, and of secreted enzymes induced by gibberellic acid, can be found in R.L. Jones and J. MacMillin, Gibberellins, in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984 Pitman Publishing Limited, London, pp. 21-52. References that describe other metabolically-regulated genes: Sheen, Plant Cell, 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-3452 (1990); Benkel and Hickey, Proc. Natl. Acad. Sci. 84:1337-1339 (1987).
Typically, using techniques known in the art, a desired polynucleotide sequence is inserted into an expression cassette comprising genetic regulatory elements designed for operation in plants. The expression cassette is inserted into a desired expression vector with companion sequences upstream and downstream from the expression cassette suitable for expression in a plant host. The companion sequences will be of plasmid or viral origin and provide necessary characteristics to the vector to permit the vectors to move DNA from an original cloning host, such as bacteria, to the desired plant host.
The basic bacterial/plant vector construct will preferably provide a broad host range prokaryote replication origin; a prokaryote selectable marker; and, for Agrobacterium transformations, T DNA sequences for Agrobacterium-mediated transfer to plant chromosomes. Where the heterologous gene is not readily amenable to detection, the construct will preferably also have a selectable marker gene suitable for determining if a plant cell has been transformed. A general review of suitable markers, for example for the members of the grass family, is found in Wilmink and Dons, 1993, Plant Mol. Biol. Reptr, 11(2):165-185.
Sequences suitable for permitting integration of the heterologous sequence into the plant genome are also recommended. These might include transposon sequences and the like for homologous recombination as well as Ti sequences which permit random insertion of a heterologous expression cassette into a plant genome. Suitable prokaryote selectable markers include resistance toward antibiotics such as ampicillin or tetracycline. Other DNA sequences encoding additional functions may also be present in the vector, as is known in the art.
The nucleic acid molecules of the subject invention may be included into an expression cassette for expression of the protein(s) of interest. Usually, there will be only one expression cassette, although two or more are feasible. The recombinant expression cassette will contain, in addition to the heterologous protein encoding sequence, the following elements: a promoter region, plant 5' untranslated sequences, an initiation codon (depending upon whether or not the structural gene comes equipped with one), and a transcription and translation termination sequence. Unique restriction enzyme sites at the 5' and 3' ends of the cassette allow for easy insertion into a pre-existing vector.
A heterologous coding sequence may be for any protein relating to the present invention. The sequence encoding the protein of interest will encode a signal peptide which allows processing and translocation of the protein, as appropriate, and will usually lack any sequence which might result in the binding of the desired protein of the invention to a membrane. Since, for the most part, the transcriptional initiation region will be for a gene which is expressed and translocated during germination, by employing the signal peptide which provides for translocation, one may also provide for translocation of the protein of interest. In this way, the protein(s) of interest will be translocated from the cells in which they are expressed and may be efficiently harvested. Typically, secretion in seeds is across the aleurone or scutellar epithelium layer into the endosperm of the seed. While it is not required that the protein be secreted from the cells in which the protein is produced, this facilitates the isolation and purification of the recombinant protein.
Since the ultimate expression of the desired gene product will be in a eucaryotic cell, it is desirable to determine whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's spliceosome machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing a portion of the genetic message as a false intron code (Reed and Maniatis, Cell 41:95-105, 1985).
The vector can be microinjected directly into plant cells by use of micropipettes to mechanically transfer the recombinant DNA. Crossway, Mol. Gen. Genet, 202:179-185, 1985. The genetic material may also be transferred into the plant cell by using polyethylene glycol, Krens, et al., Nature, 296, 72-74, 1982. Another method of introduction of nucleic acid segments is high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface, Klein, et al., Nature, 327, 70-73, 1987 and Knudsen and Muller, 1991, Planta, 185:330-336 teaching particle bombardment of barley endosperm to create transgenic barley. Yet another method of introduction would be fusion of protoplasts with other entities, either minicells, cells, lysosomes or other fusible lipid-surfaced bodies, Fraley, et al., Proc. Natl. Acad. Sci. USA, 79, 1859-1863, 1982.
The vector may also be introduced into the cells by electroporation (Fromm et al., Proc. Natl. Acad. Sci. USA 82:5824, 1985). In this technique, plant protoplasts are electroporated in the presence of plasmids containing the gene construct. Electrical impulses of high field strength reversibly permeabilize biomembranes allowing the introduction of the plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.
All plants from which protoplasts can be isolated and cultured to give whole regenerated plants can be transformed by the present invention so that whole plants are recovered which contain the transferred gene. It is known that practically all plants can be regenerated from cultured cells or tissues, including but not limited to all major species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables. Some suitable plants include, for example, species from the genera Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Lolium, Zea, Triticum, and Sorghum.
Means for regeneration vary from species to species of plants, but generally a suspension of transformed protoplasts containing copies of the heterologous gene is first provided. Callus tissue is formed and shoots may be induced from callus and subsequently rooted. Alternatively, embryo formation can be induced from the protoplast suspension.
These embryos germinate as natural embryos to form plants. The culture media will generally contain various amino acids and hormones, such as auxin and cytokinins. It is also advantageous to add glutamic acid and proline to the medium, especially for such species as corn and alfalfa. Shoots and roots normally develop simultaneously. Efficient regeneration will depend on the medium, on the genotype, and on the history of the culture. If these three variables are controlled, then regeneration is fully reproducible and repeatable.
In some plant cell culture systems, the desired protein of the invention may be excreted or alternatively, the protein may be extracted from the whole plant. Where the desired protein of the invention is secreted into the medium, it may be collected.
Alternatively, the embryos and embryoless-half seeds or other plant tissue may be mechanically disrupted to release any secreted protein between cells and tissues. The mixture may be suspended in a buffer solution to retrieve soluble proteins. Conventional protein isolation and purification methods will be then used to purify the recombinant protein.
Parameters of time, temperature, pH, oxygen, and volumes will be adjusted through routine methods to optimize expression and recovery of heterologous protein.
iii. Baculovirus Systems
The polynucleotide encoding the protein can also be inserted into a suitable insect expression vector, and is operably linked to the control elements within that vector. Vector construction employs techniques which are known in the art. Generally, the components of the expression system include a transfer vector, usually a bacterial plasmid, which contains both a fragment of the baculovirus genome, and a convenient restriction site for insertion of the heterologous gene or genes to be expressed; a wild type baculovirus with a sequence homologous to the baculovirus-specific fragment in the transfer vector (this allows for the homologous recombination of the heterologous gene into the baculovirus genome); and appropriate insect host cells and growth media.
After inserting the DNA sequence encoding the protein into the transfer vector, the vector and the wild type viral genome are transfected into an insect host cell where the vector and viral genome are allowed to recombine. The packaged recombinant virus is expressed and recombinant plaques are identified and purified. Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, inter alia, Invitrogen, San Diego CA ("MaxBac" kit). These techniques are generally known to those skilled in the art and fully described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987) (hereinafter "Summers and Smith").
Prior to inserting the DNA sequence encoding the protein into the baculovirus genome, the above described components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription termination sequence, are usually assembled into an intermediate transplacement construct (transfer vector). This construct may contain a single gene and operably linked regulatory elements; multiple genes, each with its own set of operably linked regulatory elements; or multiple genes, regulated by the same set of regulatory elements. Intermediate transplacement constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as a bacterium. The replicon will have a replication system, thus allowing it to be maintained in a suitable host for cloning and amplification.
Currently, the most commonly used transfer vector for introducing foreign genes into AcNPV is pAc373. Many other vectors, known to those of skill in the art, have also been designed. These include, for example, pVL985 (which alters the polyhedrin start codon from ATG to ATT, and which introduces a BamHI cloning site 32 basepairs downstream from the ATT; see Luckow and Summers, Virology (1989) 17:31).
The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al. (1988) Ann. Rev. Microbiol., 42:177) and a prokaryotic ampicillin-resistance (amp) gene and origin of replication for selection and propagation in E. coli.
Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus promoter is any DNA sequence capable of binding a baculovirus RNA polymerase and initiating the downstream (5' to 3') transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site and a transcription initiation site. A baculovirus transfer vector may also have a second domain called an enhancer, which, if present, is usually distal to the structural gene. Expression may be either regulated or constitutive.
Structural genes, abundantly transcribed at late times in a viral infection cycle, provide particularly useful promoter sequences. Examples include sequences derived from the gene encoding the viral polyhedron protein, Friesen et al., (1986) "The Regulation of Baculovirus Gene Expression," in: The Molecular Biology of Baculoviruses (ed. Walter Doerfler); EPO Publ. Nos. 127 839 and 155 476; and the gene encoding the p10 protein, Vlak et al., (1988), J. Gen. Virol. 69:765.
DNA encoding suitable signal sequences can be derived from genes for secreted insect or baculovirus proteins, such as the baculovirus polyhedrin gene (Carbonell et al. (1988) Gene, 73:409). Alternatively, since the signals for mammalian cell posttranslational modifications (such as signal peptide cleavage, proteolytic cleavage, and phosphorylation) appear to be recognized by insect cells, and the signals required for secretion and nuclear accumulation also appear to be conserved between the invertebrate cells and vertebrate cells, leaders of non-insect origin, such as those derived from genes encoding human α-interferon, Maeda et al., (1985), Nature 315:592; human gastrin-releasing peptide, Lebacq-Verheyden et al., (1988), Molec. Cell. Biol. 8:3129; human IL-2, Smith et al., (1985) Proc. Nat'l Acad. Sci. USA, 82:8404; mouse IL-3, Miyajima et al., (1987) Gene 58:273; and human glucocerebrosidase, Martin et al. (1988) DNA, 7:99, can also be used to provide for secretion in insects.
A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is expressed with the proper regulatory sequences, it can be secreted. Good intracellular expression of nonfused foreign proteins usually requires heterologous genes that ideally have a short leader sequence containing suitable translation initiation signals preceding an ATG start signal. If desired, methionine at the N-terminus may be cleaved from the mature protein by in vitro incubation with cyanogen bromide.
Alternatively, recombinant polyproteins or proteins which are not naturally secreted can be secreted from the insect cell by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion of the foreign protein in insects. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the translocation of the protein into the endoplasmic reticulum.
After insertion of the DNA sequence and/or the gene encoding the expression product precursor of the protein, an insect cell host is co-transformed with the heterologous DNA of the transfer vector and the genomic DNA of wild type baculovirus usually by cotransfection. The promoter and transcription termination sequence of the construct will usually comprise a 2-5kb section of the baculovirus genome. Methods for introducing heterologous DNA into the desired site in the baculovirus virus are known in the art. (See Summers and Smith supra; Ju et al. (1987); Smith et al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and Summers (1989)). For example, the insertion can be into a gene such as the polyhedrin gene, by homologous double crossover recombination; insertion can also be into a restriction enzyme site engineered into the desired baculovirus gene. Miller et al., (1989), Bioessays 4:91. The DNA sequence, when cloned in place of the polyhedrin gene in the expression vector, is flanked both 5' and 3' by polyhedrin-specific sequences and is positioned downstream of the polyhedrin promoter.
The newly formed baculovirus expression vector is subsequently packaged into an infectious recombinant baculovirus. Homologous recombination occurs at low frequency (between about 1% and about ...); thus, the majority of the virus produced after cotransfection is still wild-type virus. Therefore, a method is necessary to identify recombinant viruses. An advantage of the expression system is a visual screen allowing recombinant viruses to be distinguished. The polyhedrin protein, which is produced by the native virus, is produced at very high levels in the nuclei of infected cells at late times after viral infection. Accumulated polyhedrin protein forms occlusion bodies that also contain embedded particles. These occlusion bodies, up to 15 µm in size, are highly refractile, giving them a bright shiny appearance that is readily visualized under the light microscope. Cells infected with recombinant viruses lack occlusion bodies. To distinguish recombinant virus from wild-type virus, the transfection supernatant is plaqued onto a monolayer of insect cells by techniques known to those skilled in the art. Namely, the plaques are screened under the light microscope for the presence (indicative of wild-type virus) or absence (indicative of recombinant virus) of occlusion bodies. Current Protocols in Microbiology Vol. 2 (Ausubel et al. eds) at 16.8 (Supp. 10, 1990); Summers and Smith, supra; Miller et al. (1989).
Recombinant baculovirus expression vectors have been developed for infection into several insect cells. For example, recombinant baculoviruses have been developed for, inter alia: Aedes aegypti, Autographa californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and Trichoplusia ni (PCT Pub. No. WO 89/046699; Carbonell et al., (1985) J. Virol. 56:153; Wright (1986) Nature 321:718; Smith et al., (1983) Mol. Cell. Biol. 3:2156; and see generally, Fraser, et al. (1989) In Vitro Cell. Dev. Biol. 25:225).
Cells and cell culture media are commercially available for both direct and fusion expression of heterologous polypeptides in a baculovirus/expression system; cell culture technology is generally known to those skilled in the art. See, Summers and Smith supra.
The modified insect cells may then be grown in an appropriate nutrient medium, which allows for stable maintenance of the plasmid(s) present in the modified insect host.
Where the expression product gene is under inducible control, the host may be grown to high density, and expression induced. Alternatively, where expression is constitutive, the product will be continuously expressed into the medium and the nutrient medium must be continuously circulated, while removing the product of interest and augmenting depleted nutrients. The product may be purified by such techniques as chromatography, e.g. HPLC, affinity chromatography, ion exchange chromatography, etc.; electrophoresis; density gradient centrifugation; solvent extraction, or the like. As appropriate, the product may be further purified, as required, so as to remove substantially any insect proteins which are also secreted in the medium or result from lysis of insect cells, so as to provide a product which is at least substantially free of host debris, e.g. proteins, lipids and polysaccharides.
In order to obtain protein expression, recombinant host cells derived from the transformants are incubated under conditions which allow expression of the recombinant protein encoding sequence. These conditions will vary, dependent upon the host cell selected.
However, the conditions are readily ascertainable to those of ordinary skill in the art, based upon what is known in the art.
iv. Bacterial Systems
Bacterial expression techniques are known in the art. A bacterial promoter is any DNA sequence capable of binding bacterial RNA polymerase and initiating the downstream transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site and a transcription initiation site. A bacterial promoter may also have a second domain called an operator, that may overlap an adjacent RNA polymerase binding site at which RNA synthesis begins. The operator permits negatively regulated (inducible) transcription, as a gene repressor protein may bind the operator and thereby inhibit transcription of a specific gene.
Constitutive expression may occur in the absence of negative regulatory elements, such as the operator. In addition, positive regulation may be achieved by a gene activator protein binding sequence, which, if present is usually proximal to the RNA polymerase binding sequence.
An example of a gene activator protein is the catabolite activator protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli (E. coli) (Raibaud et al. (1984) Annu. Rev. Genet. 18:173). Regulated expression may therefore be either positive or negative, thereby either enhancing or reducing transcription.
Sequences encoding metabolic pathway enzymes provide particularly useful promoter sequences. Examples include promoter sequences derived from sugar metabolizing enzymes, such as galactose, lactose (lac) (Chang et al. (1977) Nature 198:1056), and maltose.
Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp) (Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 9:731; U.S. Patent 4,738,921; EPO Publ. Nos. 036 776 and 121 775). The beta-lactamase (bla) promoter system (Weissmann (1981) "The cloning of interferon and other mistakes." In Interferon 3 (ed. I. Gresser)), bacteriophage lambda PL (Shimatake et al. (1981) Nature 292:128) and T5 (U.S. Patent 4,689,406) promoter systems also provide useful promoter sequences.
In addition, synthetic promoters which do not occur in nature also function as bacterial promoters. For example, transcription activation sequences of one bacterial or bacteriophage promoter may be joined with the operon sequences of another bacterial or bacteriophage promoter, creating a synthetic hybrid promoter (U.S. Patent 4,551,433). For example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter and lac operon sequences that is regulated by the lac repressor (Amann et al. (1983) Gene 25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21). Furthermore, a bacterial promoter can include naturally occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-bacterial origin can also be coupled with a compatible RNA polymerase to produce high levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA polymerase/promoter system is an example of a coupled promoter system (Studier et al. (1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074). In addition, a hybrid promoter can also be comprised of a bacteriophage promoter and an E. coli operator region (EPO Publ. No. 267 851).
In addition to a functioning promoter sequence, an efficient ribosome binding site is also useful for the expression of foreign genes in prokaryotes. In E. coli, the ribosome binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon (ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon (Shine et al. (1975) Nature 254:34). The SD sequence is thought to promote binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the 3' end of E. coli 16S rRNA (Steitz et al. (1979) "Genetic signals and nucleotide sequences in messenger RNA." In Biological Regulation and Development: Gene Expression (ed. R.F. Goldberger)). To express eukaryotic genes and prokaryotic genes with weak ribosome-binding sites, it is often necessary to optimize the distance between the SD sequence and the ATG of the eukaryotic gene (Sambrook et al. (1989) "Expression of cloned genes in Escherichia coli." In Molecular Cloning: A Laboratory Manual).
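The spacing constraint on the ribosome binding site can be checked with a short sketch. It assumes the DNA-form consensus AGGAGG for the SD sequence and counts spacing as the number of nucleotides between the end of the SD-like motif and the A of the ATG; both are assumptions for illustration and are not stated in this specification. The function name and test sequence are hypothetical.

```python
SD_CONSENSUS = "AGGAGG"  # assumed consensus (DNA form); not given in the text

def find_sd(seq, atg_pos):
    """Return (motif, spacing) for the longest SD-like motif found
    3-11 nucleotides upstream of the ATG at atg_pos, or None."""
    seq = seq.upper()
    best = None
    for spacing in range(3, 12):                       # 3-11 nt upstream
        for length in range(3, len(SD_CONSENSUS) + 1):
            end = atg_pos - spacing
            start = end - length
            if start < 0:
                continue
            motif = seq[start:end]
            # Accept any contiguous piece (3 nt or longer) of the consensus.
            if motif in SD_CONSENSUS:
                if best is None or length > len(best[0]):
                    best = (motif, spacing)
    return best

# Hypothetical construct: SD-like motif, a 6 nt spacer, then the ATG.
construct = "TTTAGGAGGTTTTTTATGAAA"
print(find_sd(construct, construct.find("ATG")))
```

A check of this kind only reports whether a candidate motif sits within the stated window; the optimal distance for a given gene still has to be determined experimentally.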
A DNA molecule may be expressed intracellularly. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine N-terminal peptidase (EPO Publ. No. 219 237).
Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence encoding the N-terminal portion of an endogenous bacterial protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the bacteriophage lambda cII gene can be linked at the 5' terminus of a foreign gene and expressed in bacteria. The resulting fusion protein preferably retains a site for a processing enzyme (factor Xa) to cleave the bacteriophage protein from the foreign gene (Nagai et al. (1984) Nature 309:810). Fusion proteins can also be made with sequences from the lacZ (Jia et al. (1987) Gene 60:197), trpE (Allen et al. (1987) J. Biotechnol. 5:93; Makoff et al. (1989) J. Gen. Microbiol. 135:11), and CheY (EPO Publ. No. 324 647) genes. The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for a processing enzyme (e.g. ubiquitin specific processing-protease) to cleave the ubiquitin from the foreign protein. Through this method, native foreign protein can be isolated (Miller et al. (1989) Bio/Technology 7:698).
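The release of a foreign protein from a fusion partner at a processing site can be sketched as follows. The sketch assumes the commonly cited factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR), with cleavage after the arginine; that sequence is not stated in this specification, and the carrier and foreign protein sequences below are hypothetical placeholders.

```python
FACTOR_XA_SITE = "IEGR"  # assumed recognition sequence; cleavage follows the R

def cleave_fusion(fusion):
    """Split a fusion protein at the first factor Xa site.

    Returns (carrier_with_site, released_protein), or None when the
    fusion contains no recognizable site.
    """
    idx = fusion.find(FACTOR_XA_SITE)
    if idx == -1:
        return None
    cut = idx + len(FACTOR_XA_SITE)
    return fusion[:cut], fusion[cut:]

# Hypothetical sequences, one-letter amino acid code.
carrier = "MVRANKRNEAL"
foreign = "MKTAYIAK"
print(cleave_fusion(carrier + FACTOR_XA_SITE + foreign))
```

Because the cut falls immediately after the site, the released fragment begins with the first residue of the foreign protein, which is the sense in which a native foreign protein can be recovered from such a fusion.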
Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA molecules that encode a fusion protein comprised of a signal peptide sequence fragment that provides for secretion of the foreign protein in bacteria (U.S. Patent 4,336,336). The signal sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The protein is either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, which can be cleaved either in vivo or in vitro, encoded between the signal peptide fragment and the foreign gene.
DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the E. coli outer membrane protein gene (ompA) (Masui et al. (1983), in: Experimental Manipulation of Gene Expression; Ghrayeb et al. (1984) EMBO J. 3:2437) and the E. coli alkaline phosphatase signal sequence (phoA) (Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212). As an additional example, the signal sequence of the alpha-amylase gene from various Bacillus strains can be used to secrete heterologous proteins from B. subtilis (Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. No. 244 042).
Usually, transcription termination sequences recognized by bacteria are regulatory regions located 3' to the translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Transcription termination sequences frequently include DNA sequences of about 50 nucleotides capable of forming stem loop structures that aid in terminating transcription. Examples include transcription termination sequences derived from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes.
Usually, the above described components, comprising a promoter, signal sequence (if desired), coding sequence of interest, and transcription termination sequence, are put together into expression constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as bacteria. The replicon will have a replication system, thus allowing it to be maintained in a prokaryotic host either for expression or for cloning and amplification. In addition, a replicon may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy number plasmid will preferably contain at least about 10, and more preferably at least about 20 plasmids. Either a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign protein on the host.
Alternatively, the expression constructs can be integrated into the bacterial genome with an integrating vector. Integrating vectors usually contain at least one sequence homologous to the bacterial chromosome that allows the vector to integrate. Integrations appear to result from recombinations between homologous DNA in the vector and the bacterial chromosome. For example, integrating vectors constructed with DNA from various Bacillus strains integrate into the Bacillus chromosome (EPO Publ. No. 127 328). Integrating vectors may also be comprised of bacteriophage or transposon sequences.
Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for the selection of bacterial strains that have been transformed.
Selectable markers can be expressed in the bacterial host and may include genes which render bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline (Davies et al. (1978) Annu. Rev. Microbiol. 32:469). Selectable markers may also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine biosynthetic pathways.
Alternatively, some of the above described components can be put together in transformation vectors. Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector, as described above.
Expression and transformation vectors, either extra-chromosomal replicons or integrating vectors, have been developed for transformation into many bacteria. For example, expression vectors have been developed for, inter alia, the following bacteria: Bacillus subtilis (Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. Nos. 036 259 and 063 953; PCT Publ. No. WO 84/04541); Escherichia coli (Shimatake et al. (1981) Nature 292:128; Amann et al. (1985) Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EPO Publ. Nos. 036 776, 136 829 and 136 907); Streptococcus cremoris (Powell et al. (1988) Appl. Environ. Microbiol. 54:655); Streptococcus lividans (Powell et al. (1988) Appl. Environ. Microbiol. 54:655); and Streptomyces lividans (U.S. Patent 4,745,056).
Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and usually include the transformation of bacteria treated with CaCl2 or other agents, such as divalent cations and DMSO. DNA can also be introduced into bacterial cells by electroporation. Transformation procedures usually vary with the bacterial species to be transformed. (See use of Bacillus: Masson et al. (1989) FEMS Microbiol. Lett. 60:273; Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. Nos. 036 259 and 063 953; PCT Publ. No. WO 84/04541; use of Campylobacter: Miller et al. (1988) Proc. Natl. Acad. Sci. 85:856; and Wang et al. (1990) J. Bacteriol. 172:949; use of Escherichia coli: Cohen et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res. 16:6127; Kushner (1978) "An improved method for transformation of Escherichia coli with ColE1-derived plasmids," in: Genetic Engineering: Proceedings of the International Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia); Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta 949:318; use of Lactobacillus: Chassy et al. (1987) FEMS Microbiol. Lett. 44:173; use of Pseudomonas: Fiedler et al. (1988) Anal. Biochem 170:38; use of Staphylococcus: Augustin et al. (1990) FEMS Microbiol. Lett. 66:203; use of Streptococcus: Barany et al. (1980) J. Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infect. Immun. 32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th Eur. Cong. Biotechnology 1:412.)
v. Yeast Expression
Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is any DNA sequence capable of binding yeast RNA polymerase and initiating the downstream transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site (the "TATA Box") and a transcription initiation site. A yeast promoter may also have a second domain called an upstream activator sequence (UAS), which, if present, is usually distal to the structural gene. The UAS permits regulated (inducible) expression. Constitutive expression occurs in the absence of a UAS. Regulated expression may be either positive or negative, thereby either enhancing or reducing transcription.
Yeast is a fermenting organism with an active metabolic pathway; therefore, sequences encoding enzymes in the metabolic pathway provide particularly useful promoter sequences. Examples include alcohol dehydrogenase (ADH) (EPO Publ. No. 284 044), enolase, glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK) (EPO Publ. No. 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides useful promoter sequences (Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1).
In addition, synthetic promoters which do not occur in nature also function as yeast promoters. For example, UAS sequences of one yeast promoter may be joined with the transcription activation region of another yeast promoter, creating a synthetic hybrid promoter. Examples of such hybrid promoters include the ADH regulatory sequence linked to the GAP transcription activation region (U.S. Patent Nos. 4,876,197 and 4,880,734). Other examples of hybrid promoters include promoters which consist of the regulatory sequences of either the ADH2, GAL4, GAL10, or PHO5 genes, combined with the transcriptional activation region of a glycolytic enzyme gene such as GAP or PyK (EPO Publ. No. 164 556).
Furthermore, a yeast promoter can include naturally occurring promoters of non-yeast origin that have the ability to bind yeast RNA polymerase and initiate transcription. Examples of such promoters include, inter alia, (Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078; Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119; Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical, Environmental and Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon et al. (1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109).
A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide.
Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian, plant, baculovirus, and bacterial expression systems. Usually, a DNA sequence encoding the N-terminal portion of an endogenous yeast protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the yeast or human superoxide dismutase (SOD) gene can be linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. See EPO Publ. No. 196056. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for a processing enzyme (e.g. ubiquitin-specific processing protease) to cleave the ubiquitin from the foreign protein. Through this method, therefore, native foreign protein can be isolated (PCT Publ. No. WO 88/024066).
Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion in yeast of the foreign protein. Preferably, there are processing sites encoded between the leader fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell.
DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, such as the yeast invertase gene (EPO Publ. No. 012 873; JPO Publ. No. 62:096,086) and the A-factor gene (U.S. Patent 4,588,684). Alternatively, leaders of non-yeast origin, such as an interferon leader, exist that also provide for secretion in yeast (EPO Publ. No. 060 057).
A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor gene, which contains both a "pre" signal sequence, and a "pro" region. The types of alpha-factor fragments that can be employed include the full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated alpha-factor leaders (usually about to about 50 amino acid residues) (U.S. Patent Nos. 4,546,083 and 4,870,008; EPO Publ. No. 324 274). Additional leaders employing an alpha-factor leader fragment that provides for secretion include hybrid alpha-factor leaders made with a presequence of a first yeast, but a pro-region from a second yeast alpha-factor. (See PCT Publ. No. WO 89/02463.)
Usually, transcription termination sequences recognized by yeast are regulatory regions located 3' to the translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Examples include transcription terminator sequences and other yeast-recognized termination sequences, such as those from genes coding for glycolytic enzymes.
Usually, the above described components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription termination sequence, are put together into expression constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g. plasmids) capable of stable maintenance in a host, such as yeast or bacteria. The replicon may have two replication systems, thus allowing it to be maintained, for example, in yeast for expression and in a prokaryotic host for cloning and amplification. Examples of such yeast-bacteria shuttle vectors include YEp24 (Botstein et al. (1979) Gene 8:17-24), pCl/1 (Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646), and YRp17 (Stinchcomb et al. (1982) J. Mol. Biol. 158:157). In addition, a replicon may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy number plasmid will preferably have at least about 10, and more preferably at least about 20. Either a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign protein on the host. See Brake et al., supra.
Alternatively, the expression constructs can be integrated into the yeast genome with an integrating vector. Integrating vectors usually contain at least one sequence homologous to a yeast chromosome that allows the vector to integrate, and preferably contain two homologous sequences flanking the expression construct. Integrations appear to result from recombinations between homologous DNA in the vector and the yeast chromosome (Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245). An integrating vector may be directed to a specific locus in yeast by selecting the appropriate homologous sequence for inclusion in the vector. See Orr-Weaver et al., supra. One or more expression constructs may integrate, possibly affecting levels of recombinant protein produced (Rine et al. (1983) Proc. Natl. Acad. Sci. USA 80:6750). The chromosomal sequences included in the vector can occur either as a single segment in the vector, which results in the integration of the entire vector, or as two segments homologous to adjacent segments in the chromosome and flanking the expression construct in the vector, which can result in the stable integration of only the expression construct.
Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for the selection of yeast strains that have been transformed.
Selectable markers may include biosynthetic genes that can be expressed in the yeast host, such as ADE2, HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene, which confer resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable selectable marker may also provide yeast with the ability to grow in the presence of toxic compounds, such as metal. For example, the presence of CUP1 allows yeast to grow in the presence of copper ions (Butt et al. (1987) Microbiol. Rev. 51:351).
Alternatively, some of the above described components can be put together into transformation vectors. Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector, as described above.
Expression and transformation vectors, either extrachromosomal replicons or integrating vectors, have been developed for transformation into many yeasts. For example, expression vectors and methods of introducing exogenous DNA into yeast hosts have been developed for, inter alia, the following yeasts: Candida albicans (Kurtz et al. (1986) Mol. Cell. Biol. 6:142); Candida maltosa (Kunze et al. (1985) J. Basic Microbiol. 25:141); Hansenula polymorpha (Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302); Kluyveromyces fragilis (Das et al. (1984) J. Bacteriol. 158:1165); Kluyveromyces lactis (De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et al. (1990) Bio/Technology 8:135); Pichia guilliermondii (Kunze et al. (1985) J. Basic Microbiol. 25:141); Pichia pastoris (Cregg et al. (1985) Mol. Cell. Biol. 5:3376; U.S. Patent Nos. 4,837,148 and 4,929,555); Saccharomyces cerevisiae (Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163); Schizosaccharomyces pombe (Beach and Nurse (1981) Nature 300:706); and Yarrowia lipolytica (Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49).
Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually include either the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures usually vary with the yeast species to be transformed. See [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic Microbiol. 25:141; Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et al. (1990) Bio/Technology 8:135; Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; U.S. Patent Nos. 4,837,148 and 4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach and Nurse (1981) Nature 300:706; Schizosaccharomyces]; [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49; Yarrowia].
Definitions
A composition containing X is "substantially free of" Y when at least 85% by weight of the total X+Y in the composition is X. Preferably, X comprises at least about 90% by weight of the total of X+Y in the composition, more preferably at least about 95% or even 99% by weight.
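The weight-fraction test in this definition can be sketched in a few lines of Python. The function names and the example weights below are hypothetical and are introduced only for illustration; the 85% default is the threshold stated in the definition, and the preferred 90%, 95% and 99% levels can be passed in its place.

```python
def x_fraction_by_weight(x_weight: float, y_weight: float) -> float:
    """Return the weight of X as a fraction of the total X+Y (hypothetical helper)."""
    total = x_weight + y_weight
    if total <= 0:
        raise ValueError("total weight of X+Y must be positive")
    return x_weight / total


def is_substantially_free(x_weight: float, y_weight: float,
                          threshold: float = 0.85) -> bool:
    """True when X makes up at least `threshold` of the total X+Y by weight."""
    return x_fraction_by_weight(x_weight, y_weight) >= threshold


if __name__ == "__main__":
    # Hypothetical composition: 9.0 mg of X and 1.0 mg of Y.
    print(is_substantially_free(9.0, 1.0))
    # The same composition checked against the more preferred 95% level.
    print(is_substantially_free(9.0, 1.0, threshold=0.95))
```

Both weights must be in the same unit; only their ratio matters.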
A "conserved" Neisseria amino acid fragment or protein is one that is present in a particular Neisserial protein in at least x% of Neisseria. The value of x may be 50% or more, e.g. 66%, 75%, 80%, 90%, 95% or even 100% (i.e. the amino acid is found in the protein in question in all Neisseria). In order to determine whether an amino acid is "conserved" in a particular Neisserial protein, it is necessary to compare that amino acid residue in the sequences of the protein in question from a plurality of different Neisseria (a reference population). The reference population may include a number of different Neisseria species or may include a single species. The reference population may include a number of different serogroups of a particular species or a single serogroup. A preferred reference population consists of the 5 most common Neisseria strains.
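The comparison across a reference population can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length, and it takes the most frequent residue at a position as the candidate conserved residue; both are simplifying assumptions made here, not requirements stated in the specification. The function names and the four short sequences are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def conservation_percent(aligned_seqs: List[str], position: int) -> Tuple[str, float]:
    """Return the most frequent residue at `position` and the percentage of
    reference sequences carrying it. Assumes pre-aligned, equal-length sequences."""
    if not aligned_seqs:
        raise ValueError("reference population is empty")
    residues = [seq[position] for seq in aligned_seqs]
    residue, count = Counter(residues).most_common(1)[0]
    return residue, 100.0 * count / len(residues)


def is_conserved(aligned_seqs: List[str], position: int, x: float = 50.0) -> bool:
    """True when the most frequent residue occurs in at least x% of the sequences."""
    _, percent = conservation_percent(aligned_seqs, position)
    return percent >= x


if __name__ == "__main__":
    # Hypothetical reference population of four aligned fragments.
    reference = ["MKTA", "MKSA", "MRTA", "MKTA"]
    for pos in range(4):
        print(pos, conservation_percent(reference, pos))
```

Raising `x` toward 100 tightens the definition until only residues found in every member of the reference population qualify.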
The term "heterologous" refers to two biological components that are not found together in nature. The components may be host cells, genes, or regulatory regions, such as promoters. Although the heterologous components are not found together in nature, they can function together, as when a promoter heterologous to a gene is operably linked to the gene.
Another example is where a Neisserial sequence is heterologous to a mouse host cell.
"Epitope" means antigenic determinant, and may elicit a cellular and/or humoral response.
Conditions for "high stringency" are 65 degrees C in a 0.1x SSC, 0.5% SDS solution.
An "origin of replication" is a polynucleotide sequence that initiates and regulates replication of polynucleotides, such as an expression vector. The origin of replication behaves as an autonomous unit of polynucleotide replication within a cell, capable of replication under its own control. An origin of replication may be needed for a vector to replicate in a particular host cell. With certain origins of replication, an expression vector can be reproduced at a high copy number in the presence of the appropriate proteins within the cell. Examples of origins are the autonomously replicating sequences, which are effective in yeast; and the viral T-antigen, effective in COS-7 cells.
A "mutant" sequence is defined as a DNA, RNA or amino acid sequence differing from but having homology with the native or disclosed sequence. Depending on the particular sequence, the degree of homology (sequence identity) between the native or disclosed sequence and the mutant sequence is preferably greater than 50% (e.g. 60%, 90%, 95%, 99% or more), which is calculated as described above. As used herein, an "allelic variant" of a nucleic acid molecule, or region, for which nucleic acid sequence is provided herein is a nucleic acid molecule, or region, that occurs at essentially the same locus in the genome of another or second isolate, and that, due to natural variation caused by, for example, mutation or recombination, has a similar but not identical nucleic acid sequence. A coding region allelic variant typically encodes a protein having similar activity to that of the protein encoded by the gene to which it is being compared. An allelic variant can also comprise an alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory control regions (see, for example, U.S. Patent 5,753,235).
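As a rough illustration of what a percent-identity figure measures, the Python sketch below counts matching positions between two sequences that have already been aligned. It is a simplified stand-in and not the calculation the specification refers to as "described above"; the function name, the gap symbol and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of compared columns with identical residues.

    Assumes the two sequences are already aligned to equal length.
    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical native and mutant fragments differing at one position.
    print(percent_identity("ATGCATGC", "ATGCATGA"))
```

A mutant sequence would then be compared against whichever threshold applies, for example greater than 50%.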
Antibodies
As used herein, the term "antibody" refers to a polypeptide or group of polypeptides composed of at least one antibody combining site. An "antibody combining site" is the three-dimensional binding space with an internal surface shape and charge distribution complementary to the features of an epitope of an antigen, which allows a binding of the antibody with the antigen. "Antibody" includes, for example, vertebrate antibodies, hybrid antibodies, chimeric antibodies, humanized antibodies, altered antibodies, univalent antibodies, Fab proteins, and single domain antibodies.
Antibodies against the proteins of the invention are useful for affinity chromatography, immunoassays, and distinguishing/identifying Neisseria menB proteins.
Antibodies elicited against the proteins of the present invention bind to antigenic polypeptides or proteins or protein fragments that are present and specifically associated with strains of Neisseria meningitidis menB. In some instances, these antigens may be associated with specific strains, such as those antigens specific for the menB strains. The antibodies of the invention may be immobilized to a matrix and utilized in an immunoassay or on an affinity chromatography column, to enable the detection and/or separation of polypeptides, proteins or protein fragments or cells comprising such polypeptides, proteins or protein fragments.
Alternatively, such polypeptides, proteins or protein fragments may be immobilized so as to detect antibodies bindably specific thereto.
Antibodies to the proteins of the invention, both polyclonal and monoclonal, may be prepared by conventional methods. In general, the protein is first used to immunize a suitable animal, preferably a mouse, rat, rabbit or goat. Rabbits and goats are preferred for the preparation of polyclonal sera due to the volume of serum obtainable, and the availability of labeled anti-rabbit and anti-goat antibodies. Immunization is generally performed by mixing or emulsifying the protein in saline, preferably in an adjuvant such as Freund's complete adjuvant, and injecting the mixture or emulsion parenterally (generally subcutaneously or intramuscularly). A dose of 50-200 µg/injection is typically sufficient. Immunization is generally boosted 2-6 weeks later with one or more injections of the protein in saline, preferably using Freund's incomplete adjuvant. One may alternatively generate antibodies by in vitro immunization using methods known in the art, which for the purposes of this invention is considered equivalent to in vivo immunization. Polyclonal antiserum is obtained by bleeding the immunized animal into a glass or plastic container, incubating the blood at for one hour, followed by incubating at 4°C for 2-18 hours. The serum is recovered by centrifugation (e.g. 1,000g for 10 minutes). About 20-50 ml per bleed may be obtained from rabbits.
Monoclonal antibodies are prepared using the standard method of Kohler & Milstein (Nature (1975) 256:495-96), or a modification thereof. Typically, a mouse or rat is immunized as described above. However, rather than bleeding the animal to extract serum, the spleen (and optionally several large lymph nodes) is removed and dissociated into single cells. If desired, the spleen cells may be screened (after removal of nonspecifically adherent cells) by applying a cell suspension to a plate or well coated with the protein antigen. B-cells that express membrane-bound immunoglobulin specific for the antigen bind to the plate, and are not rinsed away with the rest of the suspension. Resulting B-cells, or all dissociated spleen cells, are then induced to fuse with myeloma cells to form hybridomas, and are cultured in a selective medium (e.g. hypoxanthine, aminopterin, thymidine medium). The resulting hybridomas are plated by limiting dilution, and are assayed for the production of antibodies which bind specifically to the immunizing antigen (and which do not bind to unrelated antigens). The selected MAb-secreting hybridomas are then cultured either in vitro (e.g. in tissue culture bottles or hollow fiber reactors), or in vivo (as ascites in mice).
If desired, the antibodies (whether polyclonal or monoclonal) may be labeled using conventional techniques. Suitable labels include fluorophores, chromophores, radioactive atoms (particularly 32P and 125I), electron-dense reagents, enzymes, and ligands having specific binding partners. Enzymes are typically detected by their activity. For example, horseradish peroxidase is usually detected by its ability to convert 3,3',5,5'-tetramethylbenzidine (TMB) to a blue pigment, quantifiable with a spectrophotometer. "Specific binding partner" refers to a protein capable of binding a ligand molecule with high specificity, as for example in the case of an antigen and a monoclonal antibody specific therefor. Other specific binding partners include biotin and avidin or streptavidin, IgG and protein A, and the numerous receptor-ligand couples known in the art. It should be understood that the above description is not meant to categorize the various labels into distinct classes, as the same label may serve in several different modes. For example, 125I may serve as a radioactive label or as an electron-dense reagent. HRP may serve as enzyme or as antigen for a MAb. Further, one may combine various labels for desired effect. For example, MAbs and avidin also require labels in the practice of this invention: thus, one might label a MAb with biotin, and detect its presence with avidin labeled with 125I, or with an anti-biotin MAb labeled with HRP. Other permutations and possibilities will be readily apparent to those of ordinary skill in the art, and are considered as equivalents within the scope of the instant invention.
Antigens, immunogens, polypeptides, proteins or protein fragments of the present invention elicit formation of specific binding partner antibodies. These antigens, immunogens, polypeptides, proteins or protein fragments of the present invention comprise immunogenic compositions of the present invention. Such immunogenic compositions may further comprise or include adjuvants, carriers, or other compositions that promote or enhance or stabilize the antigens, polypeptides, proteins or protein fragments of the present invention.
Such adjuvants and carriers will be readily apparent to those of ordinary skill in the art.
Pharmaceutical Compositions
Pharmaceutical compositions can comprise (include) either polypeptides, antibodies, or nucleic acid of the invention. The pharmaceutical compositions will comprise a therapeutically effective amount of either polypeptides, antibodies, or polynucleotides of the claimed invention.
The term "therapeutically effective amount" as used herein refers to an amount of a therapeutic agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a detectable therapeutic or preventative effect. The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include reduction in physical symptoms, such as decreased body temperature, when given to a patient that is febrile. The precise effective amount for a subject will depend upon the subject's size and health, the nature and extent of the condition, and the therapeutics or combination of therapeutics selected for administration. Thus, it is not useful to specify an exact effective amount in advance. However, the effective amount for a given situation can be determined by routine experimentation and is within the judgment of the clinician.
For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is administered.
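The per-kilogram ranges above scale linearly with body weight, which the following Python sketch makes explicit. It is illustrative arithmetic only: the function name, the constant names and the 70 kg example weight are hypothetical, and the two ranges are simply the figures quoted in the preceding paragraph.

```python
from typing import Tuple

# Ranges quoted in the text, in mg of DNA construct per kg of body weight.
BROAD_RANGE_MG_PER_KG = (0.01, 50.0)
NARROW_RANGE_MG_PER_KG = (0.05, 10.0)


def dose_window_mg(body_weight_kg: float,
                   range_mg_per_kg: Tuple[float, float]) -> Tuple[float, float]:
    """Convert a per-kg dose range into an absolute (low, high) window in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return low * body_weight_kg, high * body_weight_kg


if __name__ == "__main__":
    # Hypothetical 70 kg individual.
    print(dose_window_mg(70.0, BROAD_RANGE_MG_PER_KG))
    print(dose_window_mg(70.0, NARROW_RANGE_MG_PER_KG))
```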
A pharmaceutical composition can also contain a pharmaceutically acceptable carrier.
The term "pharmaceutically acceptable carrier" refers to a carrier for administration of a therapeutic agent, such as antibodies or a polypeptide, genes, and other therapeutic agents.
The term refers to any pharmaceutical carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition, and which may be administered without undue toxicity. Suitable carriers may be large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in the art.
Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts such as hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates, malonates, benzoates, and the like. A thorough discussion of pharmaceutically acceptable excipients is available in Remington's Pharmaceutical Sciences (Mack Pub. Co., N.J. 1991).
Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids such as water, saline, glycerol and ethanol. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.
Typically, the therapeutic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. Liposomes are included within the definition of a pharmaceutically acceptable carrier.
Delivery Methods
Once formulated, the compositions of the invention can be administered directly to the subject. The subjects to be treated can be animals; in particular, human subjects can be treated.
Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the interstitial space of a tissue. The compositions can also be administered into a lesion. Other modes of administration include oral and pulmonary administration, suppositories, and transdermal and transcutaneous applications, needles, and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule.
Vaccines
Vaccines according to the invention may either be prophylactic (to prevent infection) or therapeutic (to treat disease after infection).
Such vaccines comprise immunizing antigen(s) or immunogen(s), immunogenic polypeptide, protein(s) or protein fragments, or nucleic acids (eg. ribonucleic acid or deoxyribonucleic acid), usually in combination with "pharmaceutically acceptable carriers," which include any carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles. Such carriers are well known to those of ordinary skill in the art. Additionally, these carriers may function as immunostimulating agents ("adjuvants"). Furthermore, the immunogen or antigen may be conjugated to a bacterial toxoid, such as a toxoid from diphtheria, tetanus, cholera, H. pylori, etc. pathogens.
Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: aluminum salts (alum), such as aluminum hydroxide, aluminum phosphate, aluminum sulfate, etc; oil-in-water emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides (see below) or bacterial cell wall components), such as for example MF59 (PCT Publ. No. WO 90/14837), containing Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not required) formulated into submicron particles using a microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton, MA), SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP (see below) either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion, and Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); saponin adjuvants, such as Stimulon™ (Cambridge Bioscience, Worcester, MA) may be used or particles generated therefrom such as ISCOMs (immunostimulating complexes); Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); cytokines, such as interleukins (eg. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (eg. gamma interferon), macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc; detoxified mutants of a bacterial ADP-ribosylating toxin such as a cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT), particularly LT-K63, LT-R72, CT-S109, PT-K9/G129; see WO 93/13302 and WO 92/19265; and other substances that act as immunostimulating agents to enhance the effectiveness of the composition. Alum and MF59 are preferred.
As mentioned above, muramyl peptides include, but are not limited to, N-acetylmuramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)ethylamine (MTP-PE), etc.
The vaccine compositions comprising immunogenic compositions (eg. which may include the antigen, pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles. Alternatively, vaccine compositions comprising immunogenic compositions may comprise an antigen, polypeptide, protein, protein fragment or nucleic acid in a pharmaceutically acceptable carrier.
More specifically, vaccines comprising immunogenic compositions comprise an immunologically effective amount of the immunogenic polypeptides, as well as any other of the above-mentioned components, as needed. By "immunologically effective amount", it is meant that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated (eg. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials.
Typically, the vaccine compositions or immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed above under pharmaceutically acceptable carriers.
The immunogenic compositions are conventionally administered parenterally, by injection, either subcutaneously or intramuscularly. Additional formulations suitable for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal and transcutaneous applications. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.
As an alternative to protein-based vaccines, DNA vaccination may be employed (eg. Robinson & Torres (1997) Seminars in Immunology 9:271-283; Donnelly et al. (1997) Annu Rev Immunol 15:617-648).
Gene Delivery Vehicles
Gene therapy vehicles for delivery of constructs, including a coding sequence of a therapeutic of the invention, to be delivered to the mammal for expression in the mammal, can be administered either locally or systemically. These constructs can utilize viral or non-viral vector approaches in in vivo or ex vivo modality. Expression of such coding sequence can be induced using endogenous mammalian or heterologous promoters. Expression of the coding sequence in vivo can be either constitutive or regulated.
The invention includes gene delivery vehicles capable of expressing the contemplated nucleic acid sequences. The gene delivery vehicle is preferably a viral vector and, more preferably, a retroviral, adenoviral, adeno-associated viral (AAV), herpes viral, or alphavirus vector. The viral vector can also be an astrovirus, coronavirus, orthomyxovirus, papovavirus, paramyxovirus, parvovirus, picornavirus, poxvirus, or togavirus viral vector. See generally, Jolly (1994) Cancer Gene Therapy 1:51-64; Kimura (1994) Human Gene Therapy 5:845-852; Connelly (1995) Human Gene Therapy 6:185-193; and Kaplitt (1994) Nature Genetics 6:148-153.
Retroviral vectors are well known in the art, including B, C and D type retroviruses, xenotropic retroviruses (for example, NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J. Virol. 53:160)), polytropic retroviruses (for example, MCF and MCF-MLV (see Kelly (1983) J. Virol. 45:291)), spumaviruses and lentiviruses. See RNA Tumor Viruses, Second Edition, Cold Spring Harbor Laboratory, 1985.
Portions of the retroviral gene therapy vector may be derived from different retroviruses. For example, retrovector LTRs may be derived from a Murine Sarcoma Virus, a tRNA binding site from a Rous Sarcoma Virus, a packaging signal from a Murine Leukemia Virus, and an origin of second strand synthesis from an Avian Leukosis Virus.
These recombinant retroviral vectors may be used to generate transduction competent retroviral vector particles by introducing them into appropriate packaging cell lines (see US patent 5,591,624). Retrovirus vectors can be constructed for site-specific integration into host cell DNA by incorporation of a chimeric integrase enzyme into the retroviral particle (see WO96/37626). It is preferable that the recombinant viral vector is a replication defective recombinant virus.
Packaging cell lines suitable for use with the above-described retrovirus vectors are well known in the art, are readily prepared (see WO95/30763 and WO92/05266), and can be used to create producer cell lines (also termed vector cell lines or "VCLs") for the production of recombinant vector particles. Preferably, the packaging cell lines are made from human parent cells (eg. HT1080 cells) or mink parent cell lines, which eliminates inactivation in human serum.
Preferred retroviruses for the construction of retroviral gene therapy vectors include Avian Leukosis Virus, Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell Focus-Inducing Virus, Murine Sarcoma Virus, Reticuloendotheliosis Virus and Rous Sarcoma Virus. Particularly preferred Murine Leukemia Viruses include 4070A and 1504A (Hartley and Rowe (1976) J Virol 19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC No. VR-245), Graffi, Gross (ATCC No. VR-590), Kirsten, Harvey Sarcoma Virus and Rauscher (ATCC No. VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190).
Such retroviruses may be obtained from depositories or collections such as the American Type Culture Collection ("ATCC") in Rockville, Maryland or isolated from known sources using commonly available techniques.
Exemplary known retroviral gene therapy vectors employable in this invention include those described in patent applications GB2200651, EP0415731, EP0345242, EP0334301, WO89/02468, WO89/05349, WO89/09271, WO90/02806, WO90/07936, WO94/03622, WO93/25698, WO93/25234, WO93/11230, WO93/10218, WO91/02805, WO91/02825, WO95/07994, US 5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US 4,777,127, US 5,591,624. See also Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res 53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J Neurosci Res 33:493-503; Baba (1993) J Neurosurg 79:729-735; Mann (1983) Cell 33:153; Cane (1984) Proc Natl Acad Sci 81:6349; and Miller (1990) Human Gene Therapy 1.
Human adenoviral gene therapy vectors are also known in the art and employable in this invention. See, for example, Berkner (1988) Biotechniques 6:616 and Rosenfeld (1991) Science 252:431, and WO93/07283, WO93/06223, and WO93/07282. Exemplary known adenoviral gene therapy vectors employable in this invention include those described in the above referenced documents and in WO94/12649, WO93/03769, WO93/19191, WO94/28938, WO95/11984, WO95/00655, WO95/27071, WO95/29993, WO95/34671, WO96/05320, WO94/08026, WO94/11506, WO93/06223, WO94/24299, WO95/14102, WO95/24297, WO95/02697, WO94/28152, WO94/24299, WO95/09241, WO95/25807, WO95/05835, WO94/18922 and WO95/09654. Alternatively, administration of DNA linked to killed adenovirus as described in Curiel (1992) Hum. Gene Ther. 3:147-154 may be employed. The gene delivery vehicles of the invention also include adenovirus associated virus (AAV) vectors. Leading and preferred examples of such vectors for use in this invention are the AAV-2 based vectors disclosed in Srivastava, WO93/09239. Most preferred AAV vectors comprise the two AAV inverted terminal repeats in which the native D-sequences are modified by substitution of nucleotides, such that at least 5 native nucleotides and up to 18 native nucleotides, preferably at least 10 native nucleotides up to 18 native nucleotides, most preferably 10 native nucleotides are retained and the remaining nucleotides of the D-sequence are deleted or replaced with non-native nucleotides. The native D-sequences of the AAV inverted terminal repeats are sequences of 20 consecutive nucleotides in each AAV inverted terminal repeat (there is one sequence at each end) which are not involved in HP formation. The non-native replacement nucleotide may be any nucleotide other than the nucleotide found in the native D-sequence in the same position. Other employable exemplary AAV vectors are pWP-19 and pWN-1, both of which are disclosed in Nahreini (1993) Gene 124:257-262.
Another example of such an AAV vector is psub201 (see Samulski (1987) J. Virol. 61:3096). Another exemplary AAV vector is the Double-D ITR vector. Construction of the Double-D ITR vector is disclosed in US Patent 5,478,745. Still other vectors are those disclosed in Carter US Patent 4,797,368 and Muzyczka US Patent 5,139,941, Chatterjee US Patent 5,474,935, and Kotin WO94/288157. Yet a further example of an AAV vector employable in this invention is SSV9AFABTKneo, which contains the AFP enhancer and albumin promoter and directs expression predominantly in the liver. Its structure and construction are disclosed in Su (1996) Human Gene Therapy 7:463-470. Additional AAV gene therapy vectors are described in US 5,354,678, US 5,173,414, US 5,139,941, and US 5,252,479.
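The D-sequence modification described above (a 20-nucleotide native D-sequence in which 5 to 18 native nucleotides are retained and the remainder are deleted or replaced with non-native nucleotides) can be illustrated with a minimal sketch. The function name, the substitution table and the choice to retain the first nucleotides of the sequence are assumptions made purely for illustration; the text does not state which positions are retained, and the input used in any example should be treated as a placeholder rather than an actual AAV D-sequence.

```python
# Illustrative sketch only. Assumptions not found in the text: the retained
# nucleotides are taken from the start of the sequence, and each replaced
# nucleotide is swapped using an arbitrary fixed table.
D_SEQUENCE_LENGTH = 20

# Arbitrary mapping that always yields a nucleotide different from the native one.
NON_NATIVE = {"A": "C", "C": "G", "G": "T", "T": "A"}


def modify_d_sequence(native, retain=10, delete=False):
    """Return a modified D-sequence keeping `retain` native nucleotides.

    The remaining nucleotides are deleted (delete=True) or replaced with
    non-native nucleotides (delete=False).
    """
    native = native.upper()
    if len(native) != D_SEQUENCE_LENGTH:
        raise ValueError("a native D-sequence is 20 nucleotides long")
    if set(native) - set(NON_NATIVE):
        raise ValueError("sequence must contain only A, C, G and T")
    if not 5 <= retain <= 18:
        raise ValueError("between 5 and 18 native nucleotides are retained")
    kept = native[:retain]
    if delete:
        return kept
    replaced = "".join(NON_NATIVE[base] for base in native[retain:])
    return kept + replaced
```

With the default `retain=10`, the sketch corresponds to the most preferred case in which 10 native nucleotides are retained.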
The gene therapy vectors comprising sequences of the invention also include herpes vectors. Leading and preferred examples are herpes simplex virus vectors containing a sequence encoding a thymidine kinase polypeptide such as those disclosed in US 5,288,641 and EP0176170 (Roizman). Additional exemplary herpes simplex virus vectors include HFEM/ICP6-LacZ disclosed in WO95/04139 (Wistar Institute), pHSVlac described in Geller (1988) Science 241:1667-1669 and in WO90/09441 and WO92/07945, HSV Us3::pgC-lacZ described in Fink (1992) Human Gene Therapy 3:11-19 and HSV 7134, 2 RH 105 and GAL4 described in EP 0453242 (Breakefield), and those deposited with the ATCC as accession numbers ATCC VR-977 and ATCC VR-260.
Also contemplated are alpha virus gene therapy vectors that can be employed in this invention. Preferred alpha virus vectors are Sindbis virus vectors. Togaviruses, Semliki Forest virus (ATCC VR-67; ATCC VR-1247), Middleberg virus (ATCC VR-370), Ross River virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250; ATCC VR-1249; ATCC VR-532), and those described in US patents 5,091,309, 5,217,879, and WO92/10578. More particularly, those alpha virus vectors described in U.S. Serial No. 08/405,627, filed March 15, 1995, WO94/21792, WO92/10578, WO95/07994, US 5,091,309 and US 5,217,879 are employable. Such alpha viruses may be obtained from depositories or collections such as the ATCC in Rockville, Maryland or isolated from known sources using commonly available techniques. Preferably, alphavirus vectors with reduced cytotoxicity are used (see USSN 08/679640).
DNA vector systems such as eukaryotic layered expression systems are also useful for expressing the nucleic acids of the invention. See WO95/07994 for a detailed description of eukaryotic layered expression systems. Preferably, the eukaryotic layered expression systems of the invention are derived from alphavirus vectors and most preferably from Sindbis viral vectors.
Other viral vectors suitable for use in the present invention include those derived from poliovirus, for example ATCC VR-58 and those described in Evans (1989) Nature 339:385 and Sabin (1973) J. Biol. Standardization 1:115; rhinovirus, for example ATCC VR-1110 and those described in Arnold (1990) J Cell Biochem L401; pox viruses such as canary pox virus or vaccinia virus, for example ATCC VR-111 and ATCC VR-2010 and those described in Fisher-Hoch (1989) Proc Natl Acad Sci 86:317; Flexner (1989) Ann NY Acad Sci 569:86, Flexner (1990) Vaccine 8:17; in US 4,603,112 and US 4,769,330 and WO89/01973; SV40 virus, for example ATCC VR-305 and those described in Mulligan (1979) Nature 277:108 and Madzak (1992) J Gen Virol 73:1533; influenza virus, for example ATCC VR-797 and recombinant influenza viruses made employing reverse genetics techniques as described in US 5,166,057 and in Enami (1990) Proc Natl Acad Sci 87:3802-3805; Enami & Palese (1991) J Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see also McMichael (1983) NEJ Med 309:13, and Yap (1978) Nature 273:238 and Nature (1979) 277:108); human immunodeficiency virus as described in EP-0386882 and in Buchschacher (1992) J. Virol. 66:2731; measles virus, for example ATCC VR-67 and VR-1247 and those described in EP-0440219; Aura virus, for example ATCC VR-368; Bebaru virus, for example ATCC VR-600 and ATCC VR-1240; Cabassou virus, for example ATCC VR-922; Chikungunya virus, for example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus, for example ATCC VR-924; Getah virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus, for example ATCC VR-927; Mayaro virus, for example ATCC VR-66; Mucambo virus, for example ATCC VR-580 and ATCC VR-1244; Ndumu virus, for example ATCC VR-371; Pixuna virus, for example ATCC VR-372 and ATCC VR-1245; Tonate virus, for example ATCC VR-925; Triniti virus, for example ATCC VR-469; Una virus, for example ATCC VR-374; Whataroa virus, for example ATCC VR-926; Y-62-33 virus, for example ATCC VR-375; O'Nyong virus, Eastern encephalitis virus, for example ATCC VR-65 and ATCC VR-1242; Western encephalitis virus, for example ATCC VR-70, ATCC VR-1251, ATCC VR-622 and ATCC VR-1252; and coronavirus, for example ATCC VR-740 and those described in Hamre (1966) Proc Soc Exp Biol Med 121:190.
Delivery of the compositions of this invention into cells is not limited to the above mentioned viral vectors. Other delivery methods and media may be employed such as, for example, nucleic acid expression vectors, polycationic condensed DNA linked or unlinked to killed adenovirus alone, for example see US Serial No. 08/366,787, filed December 30, 1994 and Curiel (1992) Hum Gene Ther 3:147-154, ligand linked DNA, for example see Wu (1989) J Biol Chem 264:16985-16987, eucaryotic cell delivery vehicles cells, for example see US Serial No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796, deposition of photopolymerized hydrogel materials, hand-held gene transfer particle gun, as described in US Patent 5,149,655, ionizing radiation as described in US 5,206,152 and in WO92/11033, nucleic charge neutralization or fusion with cell membranes. Additional approaches are described in Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc Natl Acad Sci 91:11581-11585.
Particle mediated gene transfer may be employed, for example see US Serial No. 60/023,867. Briefly, the sequence can be inserted into conventional vectors that contain conventional control sequences for high level expression, and then incubated with synthetic gene transfer molecules such as polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell targeting ligands such as asialoorosomucoid, as described in Wu & Wu (1987) J. Biol. Chem. 262:4429-4432, insulin as described in Hucked (1990) Biochem Pharmacol 40:253-263, galactose as described in Plank (1992) Bioconjugate Chem 3:533-539, lactose or transferrin.
Naked DNA may also be employed to transform a host cell. Exemplary naked DNA introduction methods are described in WO 90/11092 and US 5,580,859. Uptake efficiency may be improved using biodegradable latex beads. DNA coated latex beads are efficiently transported into cells after endocytosis initiation by the beads. The method may be improved further by treatment of the beads to increase hydrophobicity and thereby facilitate disruption of the endosome and release of the DNA into the cytoplasm.
Liposomes that can act as gene delivery vehicles are described in U.S. 5,422,120, WO95/13796, WO94/23697, WO91/14445 and EP-524,968. As described in USSN 60/023,867, on non-viral delivery, the nucleic acid sequences encoding a polypeptide can be inserted into conventional vectors that contain conventional control sequences for high level expression, and then be incubated with synthetic gene transfer molecules such as polymeric DNA-binding cations like polylysine, protamine, and albumin, linked to cell targeting ligands such as asialoorosomucoid, insulin, galactose, lactose, or transferrin. Other delivery systems include the use of liposomes to encapsulate DNA comprising the gene under the control of a variety of tissue-specific or ubiquitously-active promoters. Further non-viral delivery suitable for use includes mechanical delivery systems such as the approach described in Woffendin et al (1994) Proc. Natl. Acad. Sci. USA 91(24):11581-11585. Moreover, the coding sequence and the product of expression of such can be delivered through deposition of photopolymerized hydrogel materials. Other conventional methods for gene delivery that can be used for delivery of the coding sequence include, for example, use of hand-held gene transfer particle gun, as described in U.S. 5,149,655; use of ionizing radiation for activating transferred gene, as described in U.S. 5,206,152 and WO92/11033.
Exemplary liposome and polycationic gene delivery vehicles are those described in US 5,422,120 and 4,762,915; in WO 95/13796; WO94/23697; and WO91/14445; in EP-0524968; and in Stryer, Biochemistry, pages 236-240 (1975) W.H. Freeman, San Francisco; Szoka (1980) Biochem Biophys Acta 600:1; Bayer (1979) Biochem Biophys Acta 550:464; Rivnay (1987) Meth Enzymol 149:119; Wang (1987) Proc Natl Acad Sci 84:7851; Plant (1989) Anal Biochem 176:420.
A polynucleotide composition can comprise a therapeutically effective amount of a gene therapy vehicle, as the term is defined above. For purposes of the present invention, an effective dose will be from about 0.01 mg/kg to 50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is administered.
Delivery Methods
Once formulated, the polynucleotide compositions of the invention can be administered directly to the subject; delivered ex vivo, to cells derived from the subject; or in vitro for expression of recombinant proteins. The subjects to be treated can be mammals or birds. Also, human subjects can be treated.
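As a worked illustration of the weight-based dose ranges stated above, the following minimal sketch converts a mg/kg range into an absolute dose window for a given body weight. The function names, the constant names and any body weight used are hypothetical and serve only to show the arithmetic; they are not part of the described method.

```python
# Illustrative arithmetic only. The two ranges are the ones stated in the text;
# all names below are hypothetical.
BROAD_RANGE_MG_PER_KG = (0.01, 50.0)     # about 0.01 mg/kg to 50 mg/kg
NARROWER_RANGE_MG_PER_KG = (0.05, 10.0)  # 0.05 mg/kg to about 10 mg/kg


def dose_window_mg(body_weight_kg, range_mg_per_kg):
    """Return (lowest, highest) absolute dose in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def is_within_range(dose_mg, body_weight_kg, range_mg_per_kg):
    """Check whether an absolute dose falls inside the weight-scaled window."""
    low_mg, high_mg = dose_window_mg(body_weight_kg, range_mg_per_kg)
    return low_mg <= dose_mg <= high_mg
```

For example, `dose_window_mg(70.0, NARROWER_RANGE_MG_PER_KG)` gives the window, in mg of DNA construct, that the narrower range implies for a hypothetical 70 kg individual.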
Direct delivery of the compositions will generally be accomplished by injection, either subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the interstitial space of a tissue. The compositions can also be administered into a tumor or lesion.
Other modes of administration include oral and pulmonary administration, suppositories, and transdermal applications, needles, and gene guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose schedule.
Methods for the ex vivo delivery and reimplantation of transformed cells into a subject are known in the art and described in eg. WO93/14778. Examples of cells useful in ex vivo applications include, for example, stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic cells, or tumor cells.
Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be accomplished by the following procedures, for example, dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei, all well known in the art.
Polynucleotide and polypeptide pharmaceutical compositions
In addition to the pharmaceutically acceptable carriers and salts described above, the following additional agents can be used with polynucleotide and/or polypeptide compositions.
A. Polypeptides
One example is polypeptides, which include, without limitation: asialoorosomucoid (ASOR); transferrin; asialoglycoproteins; antibodies; antibody fragments; ferritin; interleukins; interferons; granulocyte-macrophage colony stimulating factor (GM-CSF); granulocyte colony stimulating factor (G-CSF); macrophage colony stimulating factor (M-CSF); stem cell factor; and erythropoietin. Viral antigens, such as envelope proteins, can also be used. Also, proteins from other invasive organisms can be used, such as the 17 amino acid peptide from the circumsporozoite protein of Plasmodium falciparum known as RII.
B. Hormones, Vitamins, etc.
Other groups that can be included are, for example: hormones, steroids, androgens, estrogens, thyroid hormone, or vitamins, folic acid.
C. Polyalkylenes, Polysaccharides, etc.
Also, polyalkylene glycol can be included with the desired polynucleotides or polypeptides. In a preferred embodiment, the polyalkylene glycol is polyethylene glycol. In addition, mono-, di-, or polysaccharides can be included. In a preferred embodiment of this aspect, the polysaccharide is dextran or DEAE-dextran. Also, chitosan and poly(lactide-co-glycolide) can be included.
D. Lipids and Liposomes
The desired polynucleotide or polypeptide can also be encapsulated in lipids or packaged in liposomes prior to delivery to the subject or to cells derived therefrom.
Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or entrap and retain nucleic acid. The ratio of condensed polynucleotide or polypeptide to lipid preparation can vary but will generally be around 1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers for delivery of nucleic acids, see Hug and Sleight (1991) Biochim. Biophys. Acta. 1097:1-17; Straubinger (1983) Meth. Enzymol. 101:512-527.
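The ratio stated above (around 1:1, mg DNA to micromoles lipid, or more of lipid) can be expressed as a simple check. The sketch below is an illustration of that arithmetic only; the function names and the default parameter are hypothetical, and "around 1:1" is treated here as an exact lower bound purely for simplicity.

```python
# Illustrative sketch. Assumption: "around 1:1 ... or more of lipid" is read
# as at least 1 micromole of lipid per mg of DNA.
def minimum_lipid_micromoles(dna_mg, micromoles_per_mg=1.0):
    """Return the lipid amount (micromoles) implied by the stated ratio."""
    if dna_mg < 0:
        raise ValueError("DNA amount cannot be negative")
    return dna_mg * micromoles_per_mg


def meets_ratio(dna_mg, lipid_micromoles, micromoles_per_mg=1.0):
    """Check that the lipid amount is at or above the stated ratio."""
    return lipid_micromoles >= minimum_lipid_micromoles(dna_mg, micromoles_per_mg)
```

A preparation with more lipid than the 1:1 amount satisfies the check, consistent with the "or more of lipid" wording.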
Liposomal preparations for use in the present invention include cationic (positively charged), anionic (negatively charged) and neutral preparations. Cationic liposomes have been shown to mediate intracellular delivery of plasmid DNA (Felgner (1987) Proc. Natl. Acad. Sci. USA 84:7413-7416); mRNA (Malone (1989) Proc. Natl. Acad. Sci. USA 86:6077-6081); and purified transcription factors (Debs (1990) J. Biol. Chem. 265:10189-10192), in functional form.
Cationic liposomes are readily available. For example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes are available under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also, Felgner supra). Other commercially available liposomes include Transfectace (DDAB/DOPE) and DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily available materials using techniques well known in the art. See, eg. Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; WO90/11092 for a description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.
Similarly, anionic and neutral liposomes are readily available, such as from Avanti Polar Lipids (Birmingham, AL), or can be easily prepared using readily available materials.
Such materials include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE), among others. These materials can also be mixed with the DOTMA and DOTAP starting materials in appropriate ratios. Methods for making liposomes using these materials are well known in the art.
The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs), or large unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared using methods known in the art. See eg. Straubinger (1983) Meth. Enzymol. 101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos (1975) Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:77; Deamer & Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem. Biophys. Res. Commun. 76:836; Fraley (1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch & Strittmatter (1979) Proc. Natl. Acad. Sci. USA 76:145; Fraley (1980) J. Biol. Chem. 255:10431; Szoka & Papahadjopoulos (1978) Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder (1982) Science 215:166.
E. Lipoproteins
In addition, lipoproteins can be included with the polynucleotide or polypeptide to be delivered. Examples of lipoproteins to be utilized include: chylomicrons, HDL, IDL, LDL, and VLDL. Mutants, fragments, or fusions of these proteins can also be used. Also, modifications of naturally occurring lipoproteins can be used, such as acetylated LDL. These lipoproteins can target the delivery of polynucleotides to cells expressing lipoprotein receptors. Preferably, if lipoproteins are included with the polynucleotide to be delivered, no other targeting ligand is included in the composition.
Naturally occurring lipoproteins comprise a lipid and a protein portion. The protein portions are known as apoproteins. At present, apoproteins A, B, C, D, and E have been isolated and identified. At least two of these contain several proteins, designated by Roman numerals, AI, AII, AIV; CI, CII, CIII.
A lipoprotein can comprise more than one apoprotein. For example, naturally occurring chylomicrons comprise A, B, C, and E apoproteins; over time these lipoproteins lose A and acquire C and E apoproteins. VLDL comprises A, B, C, and E apoproteins; LDL comprises apoprotein B; and HDL comprises apoproteins A, C, and E.
The amino acid sequences of these apoproteins are known and are described in, for example, Breslow (1985) Annu Rev. Biochem 54:699; Law (1986) Adv. Exp Med. Biol. 151:162; Chen (1986) J Biol Chem 261:12918; Kane (1980) Proc Natl Acad Sci USA 77:2465; and Utermann (1984) Hum Genet 65:232.
Lipoproteins contain a variety of lipids including triglycerides, cholesterol (free and esters), and phospholipids. The composition of the lipids varies in naturally occurring lipoproteins. For example, chylomicrons comprise mainly triglycerides. A more detailed description of the lipid content of naturally occurring lipoproteins can be found, for example, in Meth. Enzymol. 128 (1986). The composition of the lipids is chosen to aid in conformation of the apoprotein for receptor binding activity. The composition of lipids can also be chosen to facilitate hydrophobic interaction and association with the polynucleotide binding molecule.
Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for instance. Such methods are described in Meth. Enzymol. (supra); Pitas (1980) J. Biochem. 255:5454-5460 and Mahey (1979) J Clin. Invest 64:743-750.
Lipoproteins can also be produced by in vitro or recombinant methods by expression of the apoprotein genes in a desired host cell. See, for example, Atkinson (1986) Annu Rev Biophys Chem 15:403 and Radding (1958) Biochim Biophys Acta 30: 443.
Lipoproteins can also be purchased from commercial suppliers, such as Biomedical Technologies, Inc., Stoughton, Massachusetts, USA.
Further description of lipoproteins can be found in Zuckermann et al., PCT. Appln. No. US97/14465.
F. Polycationic Agents
Polycationic agents can be included, with or without lipoprotein, in a composition with the desired polynucleotide or polypeptide to be delivered.
Polycationic agents, typically, exhibit a net positive charge at physiologically relevant pH and are capable of neutralizing the electrical charge of nucleic acids to facilitate delivery to a desired location. These agents have in vitro, ex vivo, and in vivo applications.
Polycationic agents can be used to deliver nucleic acids to a living subject either intramuscularly, subcutaneously, etc.
The following are examples of useful polypeptides as polycationic agents: polylysine, polyarginine, polyornithine, and protamine. Other examples include histones, protamines, human serum albumin, DNA binding proteins, non-histone chromosomal proteins, coat proteins from DNA viruses, such as (XI 74, transcriptional factors also contain domains that bind DNA and therefore may be useful as nucleic aid condensing agents. Briefly, transcriptional factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, AP-3, CPF, Prot-1, Sp-1, Oct-1, Oct-2, CREP, and TFIID contain basic domains that bind DNA sequences.
Organic polycationic agents include: spermine, spermidine, and purtrescine.
The dimensions and of the physical properties of a polycationic agent can be extrapolated from the list above, to construct other polypeptide polycationic agents or to produce synthetic polycationic agents.
Synthetic Polycationic Agents Synthetic polycationic agents which are useful include, for example, DEAE-dextran polybrene. LipofectinD, and lipofectAMINEE are monomers that form polycationic complexes when combined with polynucleotides or polypeptides.
Immunodiagnostic Assays
Neisserial antigens of the invention can be used in immunoassays to detect antibody levels (or, conversely, anti-Neisserial antibodies can be used to detect antigen levels).
Immunoassays based on well defined, recombinant antigens can be developed to replace invasive diagnostic methods. Antibodies to Neisserial proteins within biological samples, including for example, blood or serum samples, can be detected. Design of the immunoassays is subject to a great deal of variation, and a variety of these are known in the art. Protocols for the immunoassay may be based, for example, upon competition, or direct reaction, or sandwich type assays. Protocols may also, for example, use solid supports, or may be by immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the labels may be, for example, fluorescent, chemiluminescent, radioactive, or dye molecules.
Assays which amplify the signals from the probe are also known; examples of which are assays which utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such as ELISA assays.
Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by packaging the appropriate materials, including the compositions of the invention, in suitable containers, along with the remaining reagents and materials (for example, suitable buffers, salt solutions, etc.) required for the conduct of the assay, as well as a suitable set of assay instructions.
Nucleic Acid Hybridisation
"Hybridization" refers to the association of two nucleic acid sequences to one another by hydrogen bonding. Typically, one sequence will be fixed to a solid support and the other will be free in solution. Then, the two sequences will be placed in contact with one another under conditions that favor hydrogen bonding. Factors that affect this bonding include: the type and volume of solvent; reaction temperature; time of hybridization; agitation; agents to block the non-specific attachment of the liquid phase sequence to the solid support (Denhardt's reagent or BLOTTO); concentration of the sequences; use of compounds to increase the rate of association of sequences (dextran sulfate or polyethylene glycol); and the stringency of the washing conditions following hybridization. See Sambrook et al. [supra] Volume 2, chapter 9, pages 9.47 to 9.57.
"Stringency" refers to conditions in a hybridization reaction that favor association of very similar sequences over sequences that differ. For example, the combination of temperature and salt concentration should be chosen that is approximately 12° to 20°C below the calculated Tm of the hybrid under study. The temperature and salt conditions can often be determined empirically in preliminary experiments in which samples of genomic DNA immobilized on filters are hybridized to the sequence of interest and then washed under conditions of different stringencies. See Sambrook et al. at page 9.50.
Variables to consider when performing, for example, a Southern blot are the complexity of the DNA being blotted and the homology between the probe and the sequences being detected. The total amount of the fragment(s) to be studied can vary by a magnitude of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10^-9 to 10^-8 g for a single copy gene in a highly complex eukaryotic genome. For lower complexity polynucleotides, substantially shorter blotting, hybridization, and exposure times, a smaller amount of starting polynucleotides, and lower specific activity of probes can be used. For example, a single-copy yeast gene can be detected with an exposure time of only 1 hour starting with 1 µg of yeast DNA, blotting for two hours, and hybridizing for 4-8 hours with a probe of 10^8 cpm/µg. For a single-copy mammalian gene a conservative approach would start with 10 µg of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate using a probe of greater than 10^8 cpm/µg, resulting in an exposure time of ~24 hours.
Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid between the probe and the fragment of interest, and consequently, the appropriate conditions for hybridization and washing. In many cases the probe is not 100% homologous to the fragment. Other commonly encountered variables include the length and total G+C content of the hybridizing sequences and the ionic strength and formamide content of the hybridization buffer. The effects of all of these factors can be approximated by a single equation:
$$T_m = 81 + 16.6\,\log_{10} C_i + 0.4\,[\%(G+C)] - 0.6\,(\%\,\text{formamide}) - 600/n - 1.5\,(\%\,\text{mismatch})$$
where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base pairs (slightly modified from Meinkoth & Wahl (1984) Anal. Biochem. 138: 267-284).
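The melting temperature estimate can be sketched in a few lines of Python. This is a minimal illustration, not part of the described method: it assumes the Meinkoth & Wahl style form with coefficients 81, 16.6, 0.4, 0.6, 600/n and a 1.5 (%mismatch) term, and the function names `hybrid_tm` and `stringent_window` are hypothetical. The second helper applies the rule of thumb of working roughly 12° to 20°C below the calculated Tm.

```python
import math


def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp,
              mismatch_percent=0.0):
    """Approximate Tm (deg C) of a DNA-DNA hybrid.

    Assumed form (coefficients are an assumption of this sketch):
    Tm = 81 + 16.6*log10(Ci) + 0.4*%(G+C) - 0.6*%formamide
         - 600/n - 1.5*%mismatch
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)


def stringent_window(tm_c):
    """Temperature range lying 20 to 12 deg C below the calculated Tm."""
    return (tm_c - 20.0, tm_c - 12.0)


if __name__ == "__main__":
    # Hypothetical probe: 600 bp, 50% G+C, 1 M monovalent salt, 50% formamide.
    tm = hybrid_tm(1.0, 50.0, 50.0, 600)
    print(round(tm, 1), stringent_window(tm))
```

The example inputs are illustrative values only; real experiments would still confirm conditions empirically, as the text describes.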
In designing a hybridization experiment, some factors affecting nucleic acid hybridization can be conveniently altered. The temperature of the hybridization and washes and the salt concentration during the washes are the simplest to adjust. As the temperature of the hybridization increases (ie. stringency), it becomes less likely for hybridization to occur between strands that are nonhomologous, and as a result, background decreases. If the radiolabeled probe is not completely homologous with the immobilized fragment (as is frequently the case in gene family and interspecies hybridization experiments), the hybridization temperature must be reduced, and background will increase. The temperature of the washes affects the intensity of the hybridizing band and the degree of background in a similar manner. The stringency of the washes is also increased with decreasing salt concentrations.
In general, convenient hybridization temperatures in the presence of 50% formamide are 42°C for a probe which is 95% to 100% homologous to the target fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% homology. For lower homologies, formamide content should be lowered and temperature adjusted accordingly, using the equation above. If the homology between the probe and the target fragment is not known, the simplest approach is to start with both hybridization and wash conditions which are nonstringent. If non-specific bands or high background are observed after autoradiography, the filter can be washed at high stringency and reexposed. If the time required for exposure makes this approach impractical, several hybridization and/or washing stringencies should be tested in parallel.
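The temperature guidance for 50% formamide can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, the lower bound of the middle band is taken to be 90% (an assumption where the text is unclear), and homologies below 85% return `None` because the text calls for lowering formamide and recalculating instead of giving a fixed temperature.

```python
def hybridization_temp_c(homology_percent):
    """Convenient hybridization temperature (deg C) in 50% formamide.

    Returns None below 85% homology, where formamide content and
    temperature are to be adjusted using the Tm equation instead.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:  # assumed lower bound of the middle band
        return 37
    if homology_percent >= 85:
        return 32
    return None


if __name__ == "__main__":
    for h in (99, 92, 86, 70):
        print(h, hybridization_temp_c(h))
```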
Nucleic Acid Probe Assays
Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing nucleic acid probes according to the invention can determine the presence of cDNA or mRNA. A probe is said to "hybridize" with a sequence of the invention if it can form a duplex or double stranded complex, which is stable enough to be detected.
The nucleic acid probes will hybridize to the Neisserial nucleotide sequences of the invention (including both sense and antisense strands). Though many different nucleotide sequences will encode the amino acid sequence, the native Neisserial sequence is preferred because it is the actual sequence present in cells. mRNA represents a coding sequence and so a probe should be complementary to the coding sequence; single-stranded cDNA is complementary to mRNA, and so a cDNA probe should be complementary to the non-coding sequence.
The probe sequence need not be identical to the Neisserial sequence (or its complement). Some variation in the sequence and length can lead to increased assay sensitivity if the nucleic acid probe can form a duplex with target nucleotides, which can be detected. Also, the nucleic acid probe can include additional nucleotides to stabilize the formed duplex. Additional Neisserial sequence may also be helpful as a label to detect the formed duplex. For example, a non-complementary nucleotide sequence may be attached to the 5' end of the probe, with the remainder of the probe sequence being complementary to a Neisserial sequence. Alternatively, non-complementary bases or longer sequences can be interspersed into the probe, provided that the probe sequence has sufficient complementarity with a Neisserial sequence in order to hybridize therewith and thereby form a duplex which can be detected.
The exact length and sequence of the probe will depend on the hybridization conditions, such as temperature, salt condition and the like. For example, for diagnostic applications, depending on the complexity of the analyte sequence, the nucleic acid probe typically contains at least 10-20 nucleotides, preferably 15-25, and more preferably at least [...] nucleotides, although it may be shorter than this. Short primers generally require cooler temperatures to form sufficiently stable hybrid complexes with the template.
Probes may be produced by synthetic procedures, such as the triester method of Matteucci et al. [J. Am. Chem. Soc. (1981) 103:3185], or according to Urdea et al. [Proc. Natl. Acad. Sci. USA (1983) 80: 7461], or using commercially available automated oligonucleotide synthesizers.
The chemical nature of the probe can be selected according to preference. For certain applications, DNA or RNA are appropriate. For other applications, modifications may be incorporated, eg. backbone modifications, such as phosphorothioates or methylphosphonates, can be used to increase in vivo half-life, alter RNA affinity, increase nuclease resistance etc. [eg. see Agrawal & Iyer (1995) Curr Opin Biotechnol 6:12-19; Agrawal (1996) TIBTECH 14:376-387]; analogues such as peptide nucleic acids may also be used [eg. see Corey (1997) TIBTECH 15:224-229; Buchardt et al. (1993) TIBTECH 11:384-386].
One example of a nucleotide hybridization assay is described by Urdea et al. in international patent application WO92/02526 [see also US patent 5,124,246].
Alternatively, the polymerase chain reaction (PCR) is another well-known means for detecting small amounts of target nucleic acids. The assay is described in: Mullis et al. [Meth. Enzymol. (1987) 155: 335-350]; US patent 4,683,195; and US patent 4,683,202. Two "primer" nucleotides hybridize with the target nucleic acids and are used to prime the reaction. The primers can comprise sequence that does not hybridize to the sequence of the amplification target (or its complement) to aid with duplex stability or, for example, to incorporate a convenient restriction site. Typically, such sequence will flank the desired Neisserial sequence.
A thermostable polymerase creates copies of target nucleic acids from the primers using the original target nucleic acids as a template. After a threshold amount of target nucleic acids are generated by the polymerase, they can be detected by more traditional methods, such as Southern blots. When using the Southern blot method, the labelled probe will hybridize to the Neisserial sequence (or its complement).
Also, mRNA or cDNA can be detected by traditional blotting techniques described in Sambrook et al [supra]. mRNA, or cDNA generated from mRNA using a polymerase enzyme, can be purified and separated using gel electrophoresis. The nucleic acids on the gel are then blotted onto a solid support, such as nitrocellulose. The solid support is exposed to a labelled probe and then washed to remove any unhybridized probe. Next, the duplexes containing the labeled probe are detected. Typically, the probe is labelled with a radioactive moiety.
EXAMPLES
The examples describe nucleic acid sequences which have been identified in N. meningitidis and N. gonorrhoeae, along with their respective and putative translation products. Not all of the nucleic acid sequences are complete, ie. they encode less than the full-length wild-type protein.
The examples are generally in the following format:
• a nucleotide sequence which has been identified in N. meningitidis
• the putative translation product of said N. meningitidis sequence
• a computer analysis of said translation product based on database comparisons
• a corresponding nucleotide sequence identified from N. gonorrhoeae
• the putative translation product of said N. gonorrhoeae sequence
• a comparison of the percentage of identity between the translation product of the N. meningitidis sequence and the N. gonorrhoeae sequence
• a corresponding nucleotide sequence identified from strain A of N. meningitidis
• the putative translation product of said N. meningitidis strain A sequence
• a comparison of the percentage of identity between the translation product of the N. meningitidis sequence and the N. gonorrhoeae sequence
• a description of the characteristics of the protein which indicates that it might be suitably antigenic or immunogenic.
Sequence comparisons were performed at NCBI (http://www.ncbi.nlm.nih.gov) using the algorithms BLAST, BLAST2, BLASTn, BLASTp, tBLASTn, BLASTx, tBLASTx [eg. see also Altschul et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25:3389-3402]. Searches were performed against the following databases: non-redundant GenBank+EMBL+DDBJ+PDB sequences and non-redundant GenBank CDS translations+PDB+SwissProt+SPupdate+PIR sequences.
Dots within nucleotide sequences represent nucleotides which have been arbitrarily introduced in order to maintain a reading frame. In the same way, double-underlined nucleotides were removed. Lower case letters represent ambiguities which arose during alignment of independent sequencing reactions (some of the nucleotide sequences in the examples are derived from combining the results of two or more experiments).
Nucleotide sequences were scanned in all six reading frames to predict the presence of hydrophobic domains using an algorithm based on the statistical studies of Esposti et al. [Critical evaluation of the hydropathy of membrane proteins (1990) Eur J Biochem 190:207-219]. These domains represent potential transmembrane regions or hydrophobic leader sequences.
Open reading frames were predicted from fragmented nucleotide sequences using the program ORFFINDER (NCBI).
Underlined amino acid sequences indicate possible transmembrane domains or leader sequences in the ORFs, as predicted by the PSORT algorithm (http://www.psort.nibb.ac.jp).
Functional domains were also predicted using the MOTIFS program (GCG Wisconsin & PROSITE).
For each of the following examples: based on the presence of a putative leader sequence and/or several putative transmembrane domains (single-underlined) in the gonococcal protein, it is predicted that the proteins from N. meningitidis and N. gonorrhoeae, and their respective epitopes, could be useful antigens or immunogenic compositions for vaccines or diagnostics.
The standard techniques and procedures which may be employed in order to perform the invention (eg. to utilize the disclosed sequences for vaccination or diagnostic purposes) were summarized above. This summary is not a limitation on the invention but, rather, gives examples that may be used, but are not required.
In particular, the following methods were used to express, purify and biochemically characterize the proteins of the invention.
Chromosomal DNA Preparation
N. meningitidis strain 2996 was grown to exponential phase in 100ml of GC medium, harvested by centrifugation, and resuspended in 5ml buffer ([...] sucrose, 50mM Tris-HCl, 50mM EDTA, pH 8). After 10 minutes incubation on ice, the bacteria were lysed by adding 10ml of lysis solution (50mM NaCl, 1% Na-Sarkosyl, 50µg/ml Proteinase), and the suspension incubated at 37°C for 2 hours. Two phenol extractions (equilibrated to pH 8) and one CHCl3/isoamyl alcohol (24:1) extraction were performed. DNA was precipitated by addition of 0.3M sodium acetate and 2 volumes of ethanol, and collected by centrifugation.
The pellet was washed once with 70% ethanol and redissolved in 4.0ml TE buffer ([...] Tris-HCl, 1mM EDTA, pH [...]). The DNA concentration was measured by reading the OD at 260 nm.
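Converting an OD reading at 260 nm into a DNA concentration is a one-line calculation. The sketch below assumes the widely used conversion factor of about 50 µg/ml of double-stranded DNA per absorbance unit (1 cm path length); that factor, the dilution handling and the function names are assumptions of this illustration and are not stated in the text.

```python
DSDNA_UG_PER_ML_PER_OD = 50.0  # assumed standard factor for double-stranded DNA


def dna_conc_ug_per_ml(od260, dilution_factor=1.0,
                       factor=DSDNA_UG_PER_ML_PER_OD):
    """Estimate DNA concentration (ug/ml) of the undiluted sample."""
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("OD must be non-negative and dilution positive")
    return od260 * factor * dilution_factor


def total_yield_ug(od260, volume_ml, dilution_factor=1.0):
    """Total DNA (ug) in a preparation of the given volume."""
    return dna_conc_ug_per_ml(od260, dilution_factor) * volume_ml


if __name__ == "__main__":
    # Hypothetical reading: OD260 of 0.5 on a 1:10 dilution of a 4.0 ml prep.
    print(dna_conc_ug_per_ml(0.5, dilution_factor=10))
    print(total_yield_ug(0.5, 4.0, dilution_factor=10))
```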
Oligonucleotide design
Synthetic oligonucleotide primers were designed on the basis of the coding sequence of each ORF, using the meningococcus B sequence when available, or the gonococcus/meningococcus A sequence, adapted to the codon preference usage of meningococcus as necessary. Any predicted signal peptides were omitted, by designing the primers to the sequence immediately downstream from the predicted leader sequence.
For most ORFs, the 5' primers included two restriction enzyme recognition sites (BamHI-NdeI, BamHI-NheI, EcoRI-NdeI or EcoRI-NheI), depending on the restriction pattern of the gene of interest. The 3' primers included a XhoI or a HindIII restriction site (table 1).
This procedure was established in order to direct the cloning of each amplification product (corresponding to each ORF) into two different expression systems: pGEX-KG (using BamHI-XhoI, BamHI-HindIII, EcoRI-XhoI or EcoRI-HindIII), and pET21b+ (using NdeI-XhoI, NheI-XhoI, NdeI-HindIII or NheI-HindIII).
5'-end primer tail:
    CGCGGATCCCATATG      (BamHI-NdeI)
    CGCGGATCCGCTAGC      (BamHI-NheI)
    CCGGAATTCTACATATG    (EcoRI-NdeI)
    CCGGAATTCTAGCTAGC    (EcoRI-NheI)
3'-end primer tail:
    CCCGCTCGAG           (XhoI)
    CCCGCTCGAG           (HindIII)
For cloning ORFs into the pGEX-His vector, the 5' and 3' primers contained only one restriction enzyme site (EcoRI, KpnI or SalI for the 5' primers and PstI, XbaI, SphI or SalI for the 3' primers). Again restriction sites were chosen according to the particular restriction pattern of the gene (table 1).
5'-end primer tail:
    (AAA) AAAGAATTC      (EcoRI)
    (AAA) AAAGGTACC      (KpnI)
3'-end primer tail:
    (AAA) AAACTGCAG      (PstI)
    (AAA) AAATCTAGA      (XbaI)
    AAAGCATGC            (SphI)
5' or 3'-end primer tail:
    AAAAAAGTCGAC         (SalI)
As well as containing the restriction enzyme recognition sequences, the primers included nucleotides which hybridized to the sequence to be amplified. The melting temperature depended on the number and type of hybridising nucleotides in the whole primer, and was determined for each primer using the formulae:
$$T_m = 4\,(G+C) + 2\,(A+T) \quad \text{(tail excluded)}$$
$$T_m = 64.9 + 0.41\,(\%\,GC) - 600/N \quad \text{(whole primer)}$$
The melting temperatures of the selected oligonucleotides were usually 65-70°C for the whole oligo and 50-55°C for the hybridising region alone.
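The two primer melting temperature formulae can be sketched as follows. This is an illustration under stated assumptions: the operators are read as Tm = 4(G+C) + 2(A+T) for the hybridising region and Tm = 64.9 + 0.41(%GC) - 600/N for the whole primer, the function names are hypothetical, and the example primer is invented for demonstration (it is not taken from table 1).

```python
def _counts(seq):
    """Return (GC count, AT count) for a DNA sequence, case-insensitive."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    if gc + at != len(s) or not s:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return gc, at


def tail_free_tm(hybridising_region):
    """Tm (deg C) of the hybridising region alone: 4(G+C) + 2(A+T)."""
    gc, at = _counts(hybridising_region)
    return 4 * gc + 2 * at


def whole_primer_tm(primer):
    """Tm (deg C) of the whole primer: 64.9 + 0.41(%GC) - 600/N."""
    gc, at = _counts(primer)
    n = gc + at
    return 64.9 + 0.41 * (100.0 * gc / n) - 600.0 / n


if __name__ == "__main__":
    tail = "CGCGGATCCCATATG"            # BamHI-NdeI tail from the text
    region = "ATGAAACGTCTGGCTTCA"        # hypothetical hybridising region
    print(tail_free_tm(region))
    print(round(whole_primer_tm(tail + region), 1))
```

Keeping the two calculations separate mirrors how the values are used: the tail-free value for the first cycles and the whole-primer value for the later cycles.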
Table 1 shows the forward and reverse primers used for each amplification. In certain cases, the sequence of the primer does not exactly match the sequence of the predicted ORF.
This is because when initial amplifications were performed, the complete 5' and/or 3' sequences for some meningococcal B ORFs were not known. However the corresponding sequences had been identified in Gonococcus or in Meningococcus A. Hence, when the Meningococcus B sequence was incomplete or uncertain, Gonococcal or Meningococcal A sequences were used as the basis for primer design. These sequences were altered to take account of codon preference. It can be appreciated that, once the complete sequence is identified, this approach will no longer be necessary.
Oligonucleotides were synthesized using a Perkin Elmer 394 DNA/RNA Synthesizer, eluted from the columns in 2.0ml NH4OH, and deprotected by 5 hours incubation at 56°C. The oligos were precipitated by addition of 0.3M Na-Acetate and 2 volumes ethanol. The samples were centrifuged and the pellets resuspended in either 100µl or 1.0ml of water. The OD260 was determined using a Perkin Elmer Lambda Bio spectrophotometer and the concentration adjusted to 2-10pmol/µl.
Amplification
The standard PCR protocol was as follows: 50-200ng of genomic DNA was used as a template in the presence of 20-40µM of each oligonucleotide primer, 400-800µM dNTPs solution, 1x PCR buffer (including 1.5mM MgCl2), 2.5 units TaqI DNA polymerase (using Perkin-Elmer AmpliTaQ, GIBCO Platinum, Pwo DNA polymerase, or Tahara Shuzo Taq polymerase). In some cases, PCR was optimised by the addition of 10µl DMSO or 50µl 2M Betaine.
After a hot start (adding the polymerase during a preliminary 3 minute incubation of the whole mix at 95°C), each sample underwent a two-step amplification. The first 5 cycles were performed using the hybridization temperature that excluded the restriction enzyme tail of the primer (see above). This was followed by 30 cycles using the hybridization temperature calculated for the whole length oligos. The cycles were completed with a 10 minute extension step at 72°C. The standard cycles were as follows:
                  Denaturation    Hybridisation    Elongation
First 5 cycles    30 seconds      30 seconds       30-60 seconds
                  [...]°C         50-55°C          72°C
Last 30 cycles    30 seconds      30 seconds       30-60 seconds
                  [...]°C         65-70°C          72°C
Elongation times varied according to the length of the ORF to be amplified.
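The two-step cycling scheme can be written down as a small data structure, which makes the step order and total run time easy to check. This is only a sketch: the names are hypothetical, the denaturation temperature defaults to 95°C as an assumption (the value in the cycle table is not legible here), and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def two_step_program(tail_free_tm_c, whole_oligo_tm_c, elongation_s=60,
                     denaturation_c=95.0):
    """Build the hot start, 5 + 30 cycles and final extension as a step list."""
    def cycle(hybridisation_c):
        return [
            Step("denaturation", denaturation_c, 30),  # temperature assumed
            Step("hybridisation", hybridisation_c, 30),
            Step("elongation", 72.0, elongation_s),
        ]

    program = [Step("hot start", 95.0, 180)]
    for _ in range(5):       # first 5 cycles: Tm without the restriction tail
        program += cycle(tail_free_tm_c)
    for _ in range(30):      # last 30 cycles: Tm of the whole oligo
        program += cycle(whole_oligo_tm_c)
    program.append(Step("final extension", 72.0, 600))
    return program


def total_minutes(program):
    """Sum of programmed hold times in minutes (ramping not included)."""
    return sum(step.seconds for step in program) / 60.0


if __name__ == "__main__":
    prog = two_step_program(52.0, 68.0, elongation_s=60)
    print(len(prog), total_minutes(prog))
```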
Amplifications were performed using either a 9600 or a 2400 Perkin Elmer GeneAmp PCR System. To check the results, 1/10 of the amplification volume was loaded onto a 1-1.5% agarose gel and the size of each amplified fragment compared with a DNA molecular weight marker.
The amplified DNA was either loaded directly on a 1% agarose gel or first precipitated with ethanol and resuspended in a volume suitable to be loaded on an agarose gel. The DNA fragment corresponding to the band of correct size was purified using the Qiagen Gel Extraction Kit, following the manufacturer's protocol. DNA fragments were eluted in a volume of 30µl or 50µl with either H2O or 10mM Tris, pH 8.5.
Digestion of PCR fragments
The purified DNA corresponding to the amplified fragment was doubly-digested with the appropriate restriction enzymes for: cloning into pET-21b+ and expressing the protein as a C-terminus His-tagged fusion; cloning into pGEX-KG and expressing the protein as a N-terminus GST-fusion; and cloning into pGEX-His and expressing the protein as a N-terminus GST-His tagged fusion.
Each purified DNA fragment was incubated at 37°C for 3 hours to overnight with [...] units of appropriate restriction enzyme (New England Biolabs) in a volume of either 30 or 40µl in the presence of suitable digestion buffer. Digested fragments were purified using the QIAquick PCR purification kit (following the manufacturer's instructions) and eluted in a volume of 30µl or 50µl with either H2O or 10mM Tris, pH 8.5. The DNA concentration was determined by quantitative agarose gel electrophoresis ([...] gel) in the presence of a titrated molecular weight marker.
Digestion of the cloning vectors (pET22B, pGEX-KG, pTRC-His A, pET21b+, pGEX-KG, and pGEX-His)
The vector pGEX-His is a modified pGEX-2T vector carrying a region encoding six histidine residues upstream of the thrombin cleavage site and containing the multiple cloning site of the vector pTRC99 (Pharmacia). 10 µg plasmid was double-digested with 50 units of each restriction enzyme in 200 µl reaction volume in the presence of appropriate buffer by overnight incubation at 37°C. After loading the whole digestion on a 1% agarose gel, the band corresponding to the digested vector was purified from the gel using the Qiagen QIAquick Gel Extraction Kit and the DNA was eluted in 50 µl of 10 mM Tris-HCl, pH [...]. The DNA concentration was evaluated by measuring OD260 of the sample, and adjusted to [...] µg/µl. 1 µl of plasmid was used for each cloning procedure.
[...] of plasmid vector was doubly-digested with 50 units of each restriction enzyme in a volume of 200µl with the appropriate buffer overnight at 37°C. The digest was loaded onto a 1.0% agarose gel and the band corresponding to the digested vector purified using the Qiagen QIAquick Gel Extraction Kit. DNA was eluted in 50µl of 10mM Tris-HCl, pH [...]. The DNA concentration was evaluated by measuring OD260nm and the concentration adjusted to 50µg/µl. 1µl of plasmid was used for each cloning procedure.
Cloning
For some ORFs, the fragments corresponding to each ORF, previously digested and purified, were ligated in both pET22b and pGEX-KG. In a final volume of 20 µl, a molar ratio of 3:1 fragment/vector was ligated using 0.5 µl of NEB T4 DNA ligase (400 units/µl), in the presence of the buffer supplied by the manufacturer. The reaction was incubated at room temperature for 3 hours. In some experiments, ligation was performed using the Boehringer "Rapid Ligation Kit", following the manufacturer's instructions.
In order to introduce the recombinant plasmid in a suitable strain, 100 µl E. coli competent cells were incubated with the ligase reaction solution for 40 minutes on ice, then at 37°C for 3 minutes, then, after adding 800 µl LB broth, again at 37°C for 20 minutes. The cells were then centrifuged at maximum speed in an Eppendorf microfuge and resuspended in approximately 200 µl of the supernatant. The suspension was then plated on LB ampicillin (100 mg/ml).
The screening of the recombinant clones was performed by growing randomly-chosen colonies overnight at 37°C in either 2 ml (pGEX or pTC clones) or [...] ml (pET clones) LB broth + 100 µg/ml ampicillin. The cells were then pelleted and the DNA extracted using the Qiagen QIAprep Spin Miniprep Kit, following the manufacturer's instructions, to a final volume of 30 µl. 5 µl of each individual miniprep (approximately 1 µg) were digested with either NdeI/XhoI or BamHI/XhoI and the whole digestion loaded onto a 1-1.5% agarose gel (depending on the expected insert size), in parallel with the molecular weight marker (1Kb DNA Ladder, GIBCO). The screening of the positive clones was made on the basis of the correct insert size.
For other ORFs, the fragments corresponding to each ORF, previously digested and purified, were ligated into both pET21b+ and pGEX-KG. A molar ratio of 3:1 fragment/vector was used in a final volume of 20µl, that included 0.5µl T4 DNA ligase (400 units/µl, NEB) and ligation buffer supplied by the manufacturer. The reaction was performed at room temperature for 3 hours. In some experiments, ligation was performed using the Boehringer "Rapid Ligation Kit" and the manufacturer's protocol.
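A 3:1 fragment/vector molar ratio translates into a mass of insert that depends on the relative lengths of the two molecules. The sketch below shows that arithmetic for double-stranded DNA, where mass per mole scales with length; the function name and the example sizes are hypothetical and are not taken from the text.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA the mass of one mole is proportional to length,
    so the insert mass is vector mass x (insert length / vector length) x ratio.
    """
    if min(vector_ng, vector_bp, insert_bp, molar_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Hypothetical case: 50 ng of a 5000 bp vector and a 1000 bp insert.
    print(insert_mass_ng(50.0, 5000, 1000))
```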
Recombinant plasmid was transformed into 100µl of competent E. coli DH5 or HB101 by incubating the ligase reaction solution and bacteria for 40 minutes on ice then at 37°C for 3 minutes. This was followed by the addition of 800µl LB broth and incubation at 37°C for 20 minutes. The cells were centrifuged at maximum speed in an Eppendorf microfuge, resuspended in approximately 200µl of the supernatant and plated onto LB ampicillin (100mg/ml) agar.
Screening for recombinant clones was performed by growing 5 randomly selected colonies overnight at 37°C in either 2.0ml (pGEX-KG clones) or 5.0ml (pET clones) LB broth + 100µg/ml ampicillin. Cells were pelleted and plasmid DNA extracted using the Qiagen QIAprep Spin Miniprep Kit, following the manufacturer's instructions.
Approximately 1µg of each individual miniprep was digested with the appropriate restriction enzymes and the digest loaded onto a 1-1.5% agarose gel (depending on the expected insert size), in parallel with the molecular weight marker (1kb DNA Ladder, GIBCO). Positive clones were selected on the basis of the size of insert.
ORFs were cloned into pGEX-His, by doubly-digesting the PCR product and ligating into similarly digested vector. After cloning, recombinant plasmids were transformed into the E.coli host W3110. Individual clones were grown overnight at 37°C in LB broth with ampicillin.
Certain ORFs may be cloned into the pGEX-HIS vector using EcoRI-PstI cloning sites, or EcoRI-SalI, or SalI-PstI. After cloning, the recombinant plasmids may be introduced in the E.coli host W3110.
Expression
Each ORF cloned into the expression vector may then be transformed into the strain suitable for expression of the recombinant protein product. 1 µl of each construct was used to transform 30 µl of E.coli BL21 (pGEX vector), E.coli TOP 10 (pTRC vector) or E.coli BL21-DE3 (pET vector), as described above. In the case of the pGEX-His vector, the same E.coli strain (W3110) was used for initial cloning and expression. Single recombinant colonies were inoculated into 2ml LB+Amp (100 µg/ml), incubated at 37°C overnight, then diluted 1:30 in [...] ml of LB+Amp (100 µg/ml) in 100 ml flasks, making sure that the OD600 ranged between 0.1 and 0.15. The flasks were incubated at 30°C in gyratory water bath shakers until OD indicated exponential growth suitable for induction of expression (0.4-0.8 OD for pET and pTRC vectors; 0.8-1 OD for pGEX and pGEX-His vectors). For the pET, pTRC and pGEX-His vectors, the protein expression was induced by addition of 1mM IPTG, whereas in the case of pGEX system the final concentration of IPTG was 0.2 mM. After 3 hours incubation at 30°C, the final concentration of the sample was checked by OD. In order to check expression, 1ml of each sample was removed, centrifuged in a microfuge, the pellet resuspended in PBS, and analysed by 12% SDS-PAGE with Coomassie Blue staining. The whole sample was centrifuged at 6000g and the pellet resuspended in PBS for further use.
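The vector-specific induction conditions can be collected in a small lookup table, which helps when several constructs are grown in parallel. The sketch below encodes only the OD windows and IPTG concentrations given in the paragraph above; the dictionary layout and function names are hypothetical.

```python
# OD window for induction and final IPTG concentration (mM), per vector system.
INDUCTION = {
    "pET":      {"od_min": 0.4, "od_max": 0.8, "iptg_mM": 1.0},
    "pTRC":     {"od_min": 0.4, "od_max": 0.8, "iptg_mM": 1.0},
    "pGEX":     {"od_min": 0.8, "od_max": 1.0, "iptg_mM": 0.2},
    "pGEX-His": {"od_min": 0.8, "od_max": 1.0, "iptg_mM": 1.0},
}


def ready_for_induction(vector, od):
    """True when the culture OD lies inside the induction window."""
    window = INDUCTION[vector]
    return window["od_min"] <= od <= window["od_max"]


def iptg_mM(vector):
    """Final IPTG concentration (mM) used for the given vector system."""
    return INDUCTION[vector]["iptg_mM"]


if __name__ == "__main__":
    for vector, od in (("pET", 0.5), ("pGEX", 0.5), ("pGEX-His", 0.9)):
        print(vector, od, ready_for_induction(vector, od), iptg_mM(vector))
```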
GST-fusion proteins large-scale purification.
For some ORFs, a single colony was grown overnight at 37°C on LB+Amp agar plate.
The bacteria were inoculated into 20 ml of LB+Amp liquid culture in a water bath shaker and grown overnight. Bacteria were diluted 1:30 into 600 ml of fresh medium and allowed to grow at the optimal temperature (20-37°C) to OD550 0.8-1. Protein expression was induced with 0.2mM IPTG followed by three hours incubation. The culture was centrifuged at 8000 rpm at 4°C. The supernatant was discarded and the bacterial pellet was resuspended in 7.5 ml cold PBS. The cells were disrupted by sonication on ice for 30 sec at 40W using a Branson sonifier B-15, frozen and thawed two times and centrifuged again. The supernatant was collected and mixed with 150µl Glutathione-Sepharose 4B resin (Pharmacia) (previously washed with PBS) and incubated at room temperature for 30 minutes. The sample was centrifuged at 700g for 5 minutes at 4°C. The resin was washed twice with 10 ml cold PBS for [...] minutes, resuspended in 1ml cold PBS, and loaded on a disposable column. The resin was washed twice with 2ml cold PBS until the flow-through reached OD280 of 0.02-0.06. The GST-fusion protein was eluted by addition of 700µl cold Glutathione elution buffer ([...] reduced glutathione, 50mM Tris-HCl) and fractions collected until the OD280 was 0.1. 21µl of each fraction were loaded on a 12% SDS gel using either Biorad SDS-PAGE Molecular weight standard broad range (M1) (200, 116.25, 97.4, 66.2, 45, 31, 21.5, 14.4, 6.5 kDa) or Amersham Rainbow Marker (220, 66, 46, 30, 21.5, 14.3 kDa) as standards. As the MW of GST is 26kDa, this value must be added to the MW of each GST-fusion protein.
For other ORFs, for each clone to be purified as a GST-fusion, a single colony was streaked out and grown overnight at 37'C on a LB/Amp. (100g/ml) agar plate. An isolated colony from this plate was inoculated into 20ml of LB/Amp (100 .g/ml) liquid medium and grown overnight at 37°C with shaking. The overnight culture was diluted 1:30 into 6 0 0ml LB/Amp (100pg/ml) liquid medium and allowed to grow at the optimal temperature 37 0 C) until the OD55onm reached 0.6-0.8. Recombinant protein expression was induced by addition of IPTG (final concentration 0.2mM) and the culture incubated for a further 3 hours.
Bacteria were harvested by centrifugation at 8000xg for 15 min at 4°C.
The bacterial pellet was resuspended in 7.5ml cold PBS. Cells were disrupted by sonication on ice four times for 30 sec at 40W using a Branson sonifier 450 and centrifuged at 13 000xg for 30 min at 4 0 C. The supernatant was collected and mixed with 150ul Glutatione- Sepharose 4B resin (Pharmacia), previously equilibrated with PBS, and incubated at room temperature with gentle agitation for 30 min. The batch-wise preparation was centrifuged at 7 00xg for 5 min at 4 0 C and the supernatant discarded. The resin was washed twice (batchwise) with 10ml cold PBS for 10 min, resuspended in lml cold PBS, and loaded onto a disposable column. The resin continued to be washed with cold PBS, until the OD 2 80nm of the flow-through reached 0.02-0.01. The GST-fusion protein was eluted by addition of 7001l cold glutathione elution buffer (10mM reduced glutathione, 50mM Tris-HCl pH 8.0) and fractions collected, until the OD 28 0nm of the eluate indicated all the recombinant protein was obtained.
20µl aliquots of each elution fraction were analyzed by SDS-PAGE using a 12% gel. The molecular mass of the purified proteins was determined using either the Bio-Rad broad range molecular weight standard (M1) (200, 116, 97.4, 66.2, 45.0, 31.0, 21.5, 14.4, 6.5 kDa) or the Amersham Rainbow Marker (M2) (220, 66.2, 46.0, 30.0, 21.5, 14.3 kDa). The molecular weights of GST-fusion proteins are a combination of the 26 kDa GST protein and its fusion partner. Protein concentrations were estimated using the Bradford assay.
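The mass bookkeeping described above (26 kDa GST plus the fusion partner, read against a marker ladder) can be sketched as follows. This is a minimal illustration only: the function names and the 16 kDa example partner are hypothetical, and the ladder values are the M1 masses quoted in the text.

```python
# Minimal sketch: expected SDS-PAGE mass of a GST fusion and the closest
# ladder band. Function names and the example partner mass are hypothetical.

GST_KDA = 26.0  # mass of GST as stated in the text

# Bio-Rad broad range standard (M1) masses quoted in the text, in kDa
M1_LADDER_KDA = [200, 116, 97.4, 66.2, 45.0, 31.0, 21.5, 14.4, 6.5]


def expected_fusion_mass_kda(partner_kda: float, tag_kda: float = GST_KDA) -> float:
    """Return the expected mass of tag + fusion partner in kDa."""
    if partner_kda <= 0:
        raise ValueError("partner mass must be positive")
    return partner_kda + tag_kda


def nearest_marker(mass_kda: float, ladder=M1_LADDER_KDA) -> float:
    """Return the ladder band whose mass is closest to mass_kda."""
    return min(ladder, key=lambda band: abs(band - mass_kda))


if __name__ == "__main__":
    fusion = expected_fusion_mass_kda(16.0)  # hypothetical 16 kDa partner
    print(fusion, nearest_marker(fusion))
```

Such a helper only indicates where a band is expected; apparent mobility on a gel can deviate from the calculated mass.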
His-fusion soluble proteins large-scale purification.
For some ORFs, a single colony was grown overnight at 37°C on a LB+Amp agar plate. The bacteria were inoculated into 20ml of LB+Amp liquid culture and incubated overnight in a water bath shaker. Bacteria were diluted 1:30 into 600ml fresh medium and allowed to grow at the optimal temperature (20-37°C) to OD550 0.6-0.8. Protein expression was induced by addition of 1 mM IPTG and the culture further incubated for three hours. The culture was centrifuged at 8000 rpm at 4°C, the supernatant was discarded and the bacterial pellet was resuspended in 7.5ml cold 10mM imidazole buffer (300 mM NaCl, 50 mM phosphate buffer, 10 mM imidazole, pH 8). The cells were disrupted by sonication on ice for 30 sec at 40W using a Branson sonifier B-15, frozen and thawed two times and centrifuged again. The supernatant was collected and mixed with 150µl Ni2+-resin (Pharmacia) (previously washed with 10mM imidazole buffer) and incubated at room temperature with gentle agitation for 30 minutes. The sample was centrifuged at 700g for 5 minutes at 4°C.
The resin was washed twice with 10 ml cold 10mM imidazole buffer for 10 minutes, resuspended in 1ml cold 10mM imidazole buffer and loaded on a disposable column. The resin was washed at 4°C with 2ml cold 10mM imidazole buffer until the flow-through reached the OD280 of 0.02-0.06. The resin was washed with 2ml cold 20mM imidazole buffer (300 mM NaCl, 50 mM phosphate buffer, 20 mM imidazole, pH 8) until the flow-through reached the OD280 of 0.02-0.06. The His-fusion protein was eluted by addition of 700µl cold 250mM imidazole buffer (300 mM NaCl, 50 mM phosphate buffer, 250 mM imidazole, pH 8) and fractions collected until the OD280 was 0.1. 21µl of each fraction were loaded on a 12% SDS gel.
His-fusion insoluble proteins large-scale purification.
A single colony was grown overnight at 37°C on a LB+Amp agar plate. The bacteria were inoculated into 20 ml of LB+Amp liquid culture in a water bath shaker and grown overnight. Bacteria were diluted 1:30 into 600ml fresh medium and allowed to grow at the optimal temperature (37°C) to OD550 0.6-0.8. Protein expression was induced by addition of 1 mM IPTG and the culture further incubated for three hours. The culture was centrifuged at 8000rpm at 4°C. The supernatant was discarded and the bacterial pellet was resuspended in 7.5 ml buffer B (urea 8M, 10mM Tris-HCl, 100mM phosphate buffer, pH 8.8). The cells were disrupted by sonication on ice for 30 sec at 40W using a Branson sonifier B-15, frozen and thawed twice and centrifuged again. The supernatant was stored at -20°C, while the pellets were resuspended in 2 ml guanidine buffer (6M guanidine hydrochloride, 100mM phosphate buffer, 10 mM Tris-HCl, pH 7.5) and treated in a homogenizer for 10 cycles. The product was centrifuged at 13000 rpm for 40 minutes. The supernatant was mixed with 150µl Ni2+-resin (Pharmacia) (previously washed with buffer B) and incubated at room temperature with gentle agitation for 30 minutes. The sample was centrifuged at 700 g for 5 minutes at 4°C. The resin was washed twice with 10 ml buffer B for 10 minutes, resuspended in 1ml buffer B, and loaded on a disposable column. The resin was washed at room temperature with 2ml buffer B until the flow-through reached the OD280 of 0.02-0.06. The resin was washed with 2ml buffer C (urea 8M, 10mM Tris-HCl, 100mM phosphate buffer, pH 6.3) until the flow-through reached the OD280 of 0.02-0.06. The His-fusion protein was eluted by addition of 700µl elution buffer (urea 8M, 10mM Tris-HCl, 100mM phosphate buffer, pH 4.5) and fractions collected until the OD280 was 0.1. 21µl of each fraction were loaded on a 12% SDS gel.
Purification of His-fusion proteins.
For each clone to be purified as a His-fusion, a single colony was streaked out and grown overnight at 37°C on a LB/Amp (100µg/ml) agar plate. An isolated colony from this plate was inoculated into 20ml of LB/Amp (100µg/ml) liquid medium and grown overnight at 37°C with shaking. The overnight culture was diluted 1:30 into 600ml LB/Amp (100µg/ml) liquid medium and allowed to grow at the optimal temperature (20-37°C) until the OD550nm reached 0.6-0.8. Expression of recombinant protein was induced by addition of IPTG (final concentration 1.0mM) and the culture incubated for a further 3 hours. Bacteria were harvested by centrifugation at 8000xg for 15 min at 4°C.
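The induction step above fixes only the final IPTG concentration, so the volume of stock to add follows from C1 x V1 = C2 x V2. The sketch below is illustrative: the 1 M stock concentration is an assumption (the text does not state one), the function name is hypothetical, and the small volume added by the stock itself is ignored.

```python
# Minimal sketch of the C1*V1 = C2*V2 calculation for IPTG induction.
# The stock concentration is an assumption; the text gives only the final
# concentration (1.0 mM, or 0.2 mM for the GST protocol) and culture volume.


def iptg_stock_volume_ml(culture_ml: float, final_mM: float, stock_mM: float) -> float:
    """Volume of IPTG stock (ml) needed to reach final_mM in culture_ml.

    The volume contributed by the stock itself is neglected.
    """
    if culture_ml <= 0 or final_mM <= 0 or stock_mM <= 0:
        raise ValueError("all quantities must be positive")
    if final_mM >= stock_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / stock_mM


if __name__ == "__main__":
    # 600 ml culture, 1.0 mM final, assumed 1 M (1000 mM) stock
    print(iptg_stock_volume_ml(600.0, 1.0, 1000.0))
```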
The bacterial pellet was resuspended in 7.5ml of either (i) cold buffer A (300mM NaCl, 50mM phosphate buffer, 10mM imidazole, pH 8.0) for soluble proteins or (ii) buffer B (8M urea, 10mM Tris-HCl, 100mM phosphate buffer, pH 8.8) for insoluble proteins. Cells were disrupted by sonication on ice four times for 30 sec at 40W using a Branson sonifier 450 and centrifuged at 13000xg for 30 min at 4°C. For insoluble proteins, pellets were resuspended in 2.0 ml buffer C (6M guanidine hydrochloride, 100mM phosphate buffer, 10mM Tris-HCl, pH 7.5) and treated with a Dounce homogenizer for 10 cycles. The homogenate was centrifuged at 13000xg for 40 min and the supernatant retained.
Supernatants for both soluble and insoluble preparations were mixed with 150µl Ni2+-resin (previously equilibrated with either buffer A or buffer B, as appropriate) and incubated at room temperature with gentle agitation for 30 min. The resin was Chelating Sepharose Fast Flow (Pharmacia), prepared according to the manufacturer's protocol. The batch-wise preparation was centrifuged at 700xg for 5 min at 4°C and the supernatant discarded. The resin was washed twice (batch-wise) with 10ml buffer A or B for 10 min, resuspended in 1.0 ml buffer A or B and loaded onto a disposable column. The resin continued to be washed with either (i) buffer A at 4°C or (ii) buffer B at room temperature, until the OD280nm of the flow-through reached 0.02-0.01. The resin was further washed with either (i) cold buffer C (300mM NaCl, 50mM phosphate buffer, 20mM imidazole, pH 8.0) or (ii) buffer D (8M urea, 10mM Tris-HCl, 100mM phosphate buffer, pH 6.3) until the OD280nm of the flow-through reached 0.02-0.01. The His-fusion protein was eluted by addition of 700µl of either (i) cold elution buffer A (300mM NaCl, 50mM phosphate buffer, 250mM imidazole, pH 8.0) or (ii) elution buffer B (8M urea, 10mM Tris-HCl, 100mM phosphate buffer, pH 4.5) and fractions collected until the OD280nm indicated all the recombinant protein was obtained. 20µl aliquots of each elution fraction were analyzed by SDS-PAGE using a 12% gel. Protein concentrations were estimated using the Bradford assay.
His-fusion proteins renaturation
In the cases where denaturation was required to solubilize proteins, a renaturation step was employed prior to immunization. Glycerol was added to the denatured fractions obtained above to give a final concentration of 10% (v/v). The proteins were diluted to 200µg/ml using dialysis buffer I (10% (v/v) glycerol, 0.5M arginine, 50mM phosphate buffer, 5.0mM reduced glutathione, 0.5mM oxidised glutathione, 2.0M urea, pH 8.8) and dialysed against the same buffer for 12-14 hours at 4°C. Further dialysis was performed with buffer II (10% (v/v) glycerol, 0.5M arginine, 50mM phosphate buffer, 5.0mM reduced glutathione, 0.5mM oxidised glutathione, pH 8.8) for 12-14 hours at 4°C.
Alternatively, 10% glycerol was added to the denatured proteins. The proteins were then diluted to 20µg/ml using dialysis buffer I (10% glycerol, 0.5M arginine, 50mM phosphate buffer, 5mM reduced glutathione, 0.5mM oxidised glutathione, 2M urea, pH 8.8) and dialysed against the same buffer at 4°C for 12-14 hours. The protein was further dialysed against dialysis buffer II (10% glycerol, 0.5M arginine, 50mM phosphate buffer, 5mM reduced glutathione, 0.5mM oxidised glutathione, pH 8.8) for 12-14 hours at 4°C.
Protein concentration was evaluated using the formula:
Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)
Purification of proteins
To analyse the solubility, pellets obtained from 3.0ml cultures were resuspended in 500µl buffer M1 (PBS). 25µl of lysozyme (10mg/ml) was added and the bacteria incubated for 15 min at 4°C. Cells were disrupted by sonication on ice four times for 30 sec at 40W using a Branson sonifier 450 and centrifuged at 13000xg for 30 min at 4°C. The supernatant was collected and the pellet resuspended in buffer M2 [8M urea, 0.5M NaCl, imidazole and 0.1M NaH2PO4] and incubated for 3 to 4 hours at 4°C. After centrifugation, the supernatant was collected and the pellet resuspended in buffer M3 [6M guanidinium-HCl, 0.5M NaCl, 20mM imidazole and 0.1M NaH2PO4] overnight at 4°C. The supernatants from all steps were analysed by SDS-PAGE. Some proteins were found to be soluble in PBS, others needed urea or guanidinium-HCl for solubilization.
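The absorbance-based concentration formula given at the start of this passage can be written as a small function. This sketch assumes the standard reading of that formula, in which the OD260 term is subtracted to correct for nucleic acid absorbance; the function name is hypothetical.

```python
# Minimal sketch of the concentration formula quoted in the text:
#   protein (mg/ml) = 1.55 * OD280 - 0.76 * OD260
# The subtraction corrects for nucleic acid contribution at 280 nm.


def protein_mg_per_ml(od280: float, od260: float) -> float:
    """Estimate protein concentration (mg/ml) from OD280 and OD260 readings."""
    if od280 < 0 or od260 < 0:
        raise ValueError("absorbance readings must be non-negative")
    return 1.55 * od280 - 0.76 * od260


if __name__ == "__main__":
    print(round(protein_mg_per_ml(1.0, 0.5), 3))
```

A negative result would indicate a sample dominated by nucleic acid, for which this estimate is not meaningful.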
For preparative scale purifications, 500ml cultures were induced and fusion proteins solubilized in either buffer M1, M2 or M3 using the procedure described above. Crude extracts were loaded onto a Ni-NTA superflow column (Qiagen) equilibrated with buffer M1, M2 or M3 depending on the solubilization buffer employed. Unbound material was eluted by washing the column with the same buffer. The recombinant fusion protein was eluted with the corresponding buffer containing 500mM imidazole then dialysed against the same buffer in the absence of imidazole.
Mice immunisations
20µg of each purified protein are used to immunise mice intraperitoneally. In the case of some ORFs, Balb-C mice were immunised with Al(OH)3 as adjuvant on days 1, 21 and 42, and immune response was monitored in samples taken on day 56. For other ORFs, CD1 mice could be immunised using the same protocol. For ORFs 25 and 40, CD1 mice were immunised using Freund's adjuvant, and the same immunisation protocol was used, except that the immune response was measured on day 42, rather than 56. Similarly, for still other ORFs, CD1 mice were immunised with Freund's adjuvant, but the immune response was measured on day 49. Alternatively, 20µg of each purified protein was mixed with Freund's adjuvant and used to immunise CD1 mice intraperitoneally. For many of the proteins, the immunization was performed on days 1, 21 and 35, and immune response was monitored in samples taken on days 34 and 49. For some proteins, the third immunization was performed on day 28, rather than 35, and the immune response was measured on days 20 and 42, rather than 34 and 49.
ELISA assay (sera analysis)
The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and inoculated into 7ml of Mueller-Hinton Broth (Difco) containing 0.25% Glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were allowed to grow until the OD reached the value of 0.3-0.4. The culture was centrifuged for 10 minutes at 10000 rpm. The supernatant was discarded and bacteria were washed once with PBS, resuspended in PBS containing 0.025% formaldehyde, and incubated for 2 hours at room temperature and then overnight at 4°C with stirring. 100µl bacterial cells were added to each well of a 96 well Greiner plate and incubated overnight at 4°C. The wells were then washed three times with PBT washing buffer (Tween-20 in PBS). 200µl of saturation buffer (2.7% Polyvinylpyrrolidone 10 in water) was added to each well and the plates incubated for 2 hours at 37°C. Wells were washed three times with PBT. 200µl of diluted sera (Dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) were added to each well and the plates incubated for 90 minutes at 37°C. Wells were washed three times with PBT. 100µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted 1:2000 in dilution buffer were added to each well and the plates were incubated for 90 minutes at 37°C.
Wells were washed three times with PBT buffer. 100µl of substrate buffer for HRP (25 ml of citrate buffer pH5, 10 mg of O-phenylenediamine and 10µl of H2O2) were added to each well and the plates were left at room temperature for 20 minutes. 100µl H2SO4 was added to each well and OD490 was followed. The ELISA was considered positive when OD490 was times the respective pre-immune sera.
Alternatively, the acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and inoculated into Mueller-Hinton Broth (Difco) containing 0.25% Glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were allowed to grow until the OD reached the value of 0.3-0.4. The culture was centrifuged for 10 minutes at 10000rpm. The supernatant was discarded and bacteria were washed once with PBS, resuspended in PBS containing 0.025% formaldehyde, and incubated for 1 hour at 37°C and then overnight at 4°C with stirring. 100µl bacterial cells were added to each well of a 96 well Greiner plate and incubated overnight at 4°C. The wells were then washed three times with PBT washing buffer (Tween-20 in PBS). 200µl of saturation buffer (2.7% Polyvinylpyrrolidone 10 in water) was added to each well and the plates incubated for 2 hours at 37°C. Wells were washed three times with PBT. 200µl of diluted sera (Dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) were added to each well and the plates incubated for 2 hours at 37°C. Wells were washed three times with PBT. 100µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted 1:2000 in dilution buffer were added to each well and the plates were incubated for 90 minutes at 37°C. Wells were washed three times with PBT buffer. 100µl of substrate buffer for HRP (25ml of citrate buffer pH5, 10mg of O-phenylenediamine and 10µl of H2O2) were added to each well and the plates were left at room temperature for 20 minutes. 100µl of 12.5% H2SO4 was added to each well and OD490 was followed. The ELISA titers were calculated arbitrarily as the dilution of sera which gave an OD490 value of 0.4 above the level of preimmune sera. The ELISA was considered positive when the dilution of sera with OD490 of 0.4 was higher than 1:400.
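The titer rule stated above (the serum dilution giving an OD490 of 0.4 above preimmune, positive when higher than 1:400) can be sketched in code. The text does not say how the dilution between two measured points was obtained, so the log-linear interpolation used here is an assumption, as are the function names and the example readings.

```python
import math

# Minimal sketch of the titer rule described in the text. Interpolating on
# log10(dilution) between bracketing points is an assumption; the text only
# says the titer was "calculated arbitrarily".


def elisa_titer(dilutions, od_values, preimmune_od, delta=0.4):
    """Return the reciprocal dilution at which OD490 equals preimmune_od + delta.

    dilutions: reciprocal serum dilutions in increasing order (100, 200, ...)
    od_values: OD490 readings at those dilutions (expected to decrease)
    Returns None when the readings never cross the cutoff within the range.
    """
    cutoff = preimmune_od + delta
    points = list(zip(dilutions, od_values))
    for (d1, od1), (d2, od2) in zip(points, points[1:]):
        if od1 >= cutoff >= od2 and od1 != od2:
            frac = (od1 - cutoff) / (od1 - od2)
            log_titer = math.log10(d1) + frac * (math.log10(d2) - math.log10(d1))
            return 10 ** log_titer
    return None


def is_positive(titer, threshold=400.0):
    """Positive when the titer exceeds 1:400, as stated in the text."""
    return titer is not None and titer > threshold


if __name__ == "__main__":
    titer = elisa_titer([100, 400, 1600], [1.3, 0.9, 0.3], preimmune_od=0.1)
    print(titer, is_positive(titer))
```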
FACScan bacteria Binding Assay procedure.
The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile dacron swab and inoculated into 4 tubes containing 8ml each Mueller-Hinton Broth (Difco) containing 0.25% glucose. Bacterial growth was monitored every 30 minutes by following OD620. The bacteria were allowed to grow until the OD reached the value of 0.35-0.5. The culture was centrifuged for 10 minutes at 4000rpm. The supernatant was discarded and the pellet was resuspended in blocking buffer (BSA in PBS, 0.4% NaN3) and centrifuged for 5 minutes at 4000rpm. Cells were resuspended in blocking buffer to reach OD620 of 0.07. 100µl bacterial cells were added to each well of a Costar 96 well plate. 100µl of diluted (1:100, 1:200, 1:400) sera (in blocking buffer) were added to each well and plates incubated for 2 hours at 4°C.
Cells were centrifuged for 5 minutes at 4000rpm, the supernatant aspirated and cells washed by addition of 200µl/well of blocking buffer in each well. 100µl of R-Phycoerythrin conjugated F(ab)2 goat anti-mouse, diluted 1:100, was added to each well and plates incubated for 1 hour at 4°C. Cells were spun down by centrifugation at 4000rpm for 5 minutes and washed by addition of 200µl/well of blocking buffer. The supernatant was aspirated and cells resuspended in 200µl/well of PBS, 0.25% formaldehyde. Samples were transferred to FACScan tubes and read. The FACScan settings (laser power 15mW) were: FL2 on; FSC-H threshold: 92; FSC PMT Voltage: E 01; SSC PMT: 474; Amp. Gains 6.1; FL-2 PMT: 586; compensation values: 0.
OMV preparations
Bacteria were grown overnight on 5 GC plates, harvested with a loop and resuspended in 10 ml 20mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes and the bacteria disrupted by sonication for 10 minutes on ice (50% duty cycle, 50% output). Unbroken cells were removed by centrifugation at 5000g for 10 minutes and the total cell envelope fraction recovered by centrifugation at 50000g at 4°C for 75 minutes. To extract cytoplasmic membrane proteins from the crude outer membranes, the whole fraction was resuspended in 2% sarkosyl (Sigma) and incubated at room temperature for 20 minutes. The suspension was centrifuged at 10000g for 10 minutes to remove aggregates, and the supernatant further ultracentrifuged at 50000g for 75 minutes to pellet the outer membranes. The outer membranes were resuspended in 10mM Tris-HCl, pH8 and the protein concentration measured by the Bio-Rad Protein assay, using BSA as a standard.
Whole Extracts preparation
Bacteria were grown overnight on a GC plate, harvested with a loop and resuspended in 1ml of 20mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes.
Western blotting
Purified proteins (500ng/lane), outer membrane vesicles (5µg) and total cell extracts derived from MenB strain 2996 were loaded onto a 12% SDS-polyacrylamide gel and transferred to a nitrocellulose membrane. The transfer was performed for 2 hours at 150mA at 4°C, using transfer buffer (0.3% Tris base, 1.44% glycine, 20% methanol). The membrane was saturated by overnight incubation at 4°C in saturation buffer (10% skimmed milk, 0.1% Triton X100 in PBS). The membrane was washed twice with washing buffer (3% skimmed milk, 0.1% Triton X100 in PBS) and incubated for 2 hours at 37°C with mice sera diluted 1:200 in washing buffer. The membrane was washed twice and incubated for minutes with a 1:2000 dilution of horseradish peroxidase labelled anti-mouse Ig. The membrane was washed twice with 0.1% Triton X100 in PBS and developed with the Opti-4CN Substrate Kit (Bio-Rad). The reaction was stopped by adding water.
Bactericidal assay
MC58 and 2996 strains were grown overnight at 37°C on chocolate agar plates. 5-7 colonies were collected and used to inoculate 7ml Mueller-Hinton broth. The suspension was incubated at 37°C on a nutator and allowed to grow until OD620 was between 0.5 and 0.8. The culture was aliquoted into sterile 1.5ml Eppendorf tubes and centrifuged for 20 minutes at maximum speed in a microfuge. The pellet was washed once in Gey's buffer (Gibco) and resuspended in the same buffer to an OD620 of 0.5, diluted 1:20000 in Gey's buffer and stored at 25°C.
of Gey's buffer/1% BSA was added to each well of a 96-well tissue culture plate.
of diluted (1:100) mice sera (dilution buffer: Gey's buffer/0.2% BSA) were added to each well and the plate incubated at 4°C. 25µl of the previously described bacterial suspension were added to each well. 25µl of either heat-inactivated (56°C waterbath for minutes) or normal baby rabbit complement were added to each well. Immediately after the addition of the baby rabbit complement, 22µl of each sample/well were plated on Mueller-Hinton agar plates (time 0). The 96-well plate was incubated for 1 hour at 37°C with rotation and then 22µl of each sample/well were plated on Mueller-Hinton agar plates (time 1h). After overnight incubation the colonies corresponding to time 0 and time 1h were counted.
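The assay above ends with colony counts at time 0 and time 1h. The text does not state how the two counts were compared, so the following is only an illustrative sketch under the assumption that killing is expressed as the percentage reduction in colonies relative to time 0; the function name and example counts are hypothetical.

```python
# Minimal sketch, assuming killing is reported as the percent reduction in
# colony count between time 0 and time 1h. The text itself only says the
# colonies at both time points were counted.


def percent_killing(cfu_t0: int, cfu_t1h: int) -> float:
    """Percent reduction in colonies after 1 hour relative to time 0.

    Negative values mean the bacteria grew during the incubation.
    """
    if cfu_t0 <= 0:
        raise ValueError("time 0 colony count must be positive")
    if cfu_t1h < 0:
        raise ValueError("colony counts cannot be negative")
    return (1.0 - cfu_t1h / cfu_t0) * 100.0


if __name__ == "__main__":
    # Hypothetical counts for one well with active complement
    print(percent_killing(120, 30))
```

Comparing the same serum with heat-inactivated and active complement, as the protocol provides for, would separate complement-dependent killing from other losses.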
Gene Variability
The ORF4 and 919 genes were amplified by PCR on chromosomal DNA extracted from various Neisseria strains (see list of strains). The following oligonucleotides used as PCR primers were designed in the upstream and downstream regions of the genes:
orf4.1 (forward) CGAATCCGGACGGCAGGACTC
orf4.3 (reverse) GGCAGGGAATGGCGGATTAAAG
919.1 (forward) AAAATGCCTCTCCACGGCTG or CTGCGCCCTGTGTTAAAATCCCCT
919.6 (reverse) CAAATAAGAAAGGAATTTTG or GGTATCGCAAAACTTCGCCTTAATGCG
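Basic properties of primers such as those listed above can be estimated with a few lines of code. The sketch below uses the Wallace rule (2°C per A/T, 4°C per G/C), which is a rough approximation intended for short oligonucleotides; it is not stated in the text to be the method used to obtain primer Tm values, and the function names are hypothetical.

```python
# Minimal sketch: GC content and a Wallace-rule melting temperature estimate
# for an oligonucleotide. The Wallace rule is a rough rule of thumb and is an
# assumption here, not the method named in the text.


def _clean(seq: str) -> str:
    seq = seq.strip().upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return seq


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = _clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace-rule Tm estimate in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = _clean(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primer = "CGAATCCGGACGGCAGGACTC"  # orf4.1 (forward), from the text
    print(len(primer), round(gc_content(primer), 2), wallace_tm(primer))
```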
The PCR cycling conditions were:
1 cycle: 2 min. at 94°
cycles: 30 sec. at 94°, sec. at 54° or 60° (according to Tm of the primers), sec. at 72°
1 cycle: 7 min. at 72°
The PCR products were purified from 1% agarose gel and sequenced using the following primers:
orf4.1 (forward) CGAATCCGGACGGCAGGACTC
orf4.2 (forward) CGACCGCGCCTTTGGGACTG
orf4.3 (reverse) GGCAGGGAATGGCGGATTAAAG
orf4.4 (reverse) TCTTTGAGTTTGATCCAACC
919.1 (forward) AAAATGCCTCTCCACGGCTG or CTGCGCCCTGTGTTAAAATCCCCT
919.2 (forward) ATCCTTCCGCCTCGGCTGCG
919.3 (forward) AAAACAGCIGGCACAATCGAC
919.4 (forward) A-TAAGGGCTACCTCAAACTC
919.5 (forward) GCGCGTGGATTATTTTTGGG
919.6 (reverse) CAAATAAGAAAGGAATTTTG or GGTATCGCAAAACTTCGCCTTAATGCG
919.7 (reverse) CCCAAGGTAATGTAGTGCCG
919.8 (reverse) TAAAAAAAAGTTCGACAGGG
919.9 (reverse) CCGTCCGCCTGTCGTCGCCC
919.10 (reverse) TCGTTCCGGCGGGGTCGGGG
All documents cited herein are incorporated by reference in their entireties.
The following Examples are presented to illustrate, not limit, the invention.
EXAMPLE 1
Using the above-described procedures, the following oligonucleotide primers were employed in the polymerase chain reaction (PCR) assay in order to clone the ORFs as indicated:
Table 1: Oligonucleotides used for PCR for Examples 2-10
ORF   Primer    Sequence                                              Restriction sites
279   Forward   CGCGGATCCCATATG-TTGCCTGCAT ACGATT <SEQ ID 3021>       BamHI-NdeI
      Reverse   CCCGCTCGAG-T1TAGMAGCGGGCGGCMA <SEQ ID 3022>           XhoI
...   Forward   CGCGGATCCCATATGTCATCCmTGTCGTCA <SEQ ID 3023>          BamHI-NdeI
      Reverse   CCCGCTCGAG-TTTGGCGGTTFI-IGCTGC <SEQ ID 3024>          XhoI
...   Forward   CGCGGATCCCATATG-GCCGCCCCCGCATCT <SEQ ID 3025>         BamHI-NdeI
      Reverse   CCCGCTCGAG-ATTTACTm-1--1GATGTCGAC <SEQ ID 3026>       XhoI
...   Forward   CGCGGATCCCATATG-TGCCAAAGCAAGAGGATC <SEQ ID 3027>      BamHI-NdeI
      Reverse   CCCGCTCGAG-CGGGCGGTAT--T-GGG <SEQ ID 3028>            XhoI
...   Forward   CGCGGATCCCATATG-GACACAGCTTT1ACAT <SEQ ID 3029>        BamHI-NdeI
      Reverse   CCCGCTCGAG-ATAATAATATCCCGCGCCC <SEQ ID 3030>          XhoI
128   Forward   CGCGGATCCCATATG-ACTGACAACGCACT <SEQ ID 3031>          BamHI-NdeI
      Reverse   CCCGCTCGAG-GACCGCGTTGTCGAAA <SEQ ID 3032>             XhoI
206   Forward   CGCGGATCCCATATG-AAACACCGCCAACCGA <SEQ ID 3033>        BamHI-NdeI
      Reverse   CCCGCTCGAG-TTCTGTAAAAAAAGTATGTGC <SEQ ID 3034>        XhoI
287   Forward   CCGGAATTCTAGCTAGC-CTTTCAGCCTGCGGG <SEQ ID 3035>       EcoRI-NheI
      Reverse   CCCGCTCGAG-ATCCTGCTCTTTTTTGCC <SEQ ID 3036>           XhoI
406   Forward   CGCGGATCCCATATG-TGCGGGACACTGACAG <SEQ ID 3037>        BamHI-NdeI
      Reverse   CCCGCTCGAG-AGGTTGTCCTTGTCTATG <SEQ ID 3038>           XhoI
Localization of the ORFs
The following DNA and amino acid sequences are identified by titles of the following form: [g, m, or a][#].[seq or pep], where "g" means a sequence from N. gonorrhoeae, "m" means a sequence from N. meningitidis B, and "a" means a sequence from N. meningitidis A; "#" means the number of the sequence; "seq" means a DNA sequence, and "pep" means an amino acid sequence. For example, "g001.seq" refers to an N. gonorrhoeae DNA sequence, number 1. The presence of the suffix "-1" to these sequences indicates an additional sequence found for the same ORF; thus, data for an ORF having both an unsuffixed and a suffixed sequence designation applies to both such designated sequences. Further, open reading frames are identified as ORF #, where "#" means the number of the ORF, corresponding to the number of the sequence which encodes the ORF, and the ORF designations may be suffixed with ".ng" or ".a", indicating that the ORF corresponds to a N. gonorrhoeae sequence or a N. meningitidis A sequence, respectively. The word "partial" before a sequence indicates that the sequence may be a partial or a complete ORF. Computer analysis was performed for the comparisons that follow between the peptide sequences; and therein the "pep" suffix is implied where not expressly stated. Further, in the event of a conflict between the text immediately preceding and describing which sequences are being compared, and the designated sequences being compared, the designated sequence controls and is the actual sequence being compared.
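The title convention described above (a one-letter organism code, a sequence number, and a "seq" or "pep" suffix) lends itself to a small parser. The sketch below is illustrative: the class and function names are hypothetical, and the placement of an optional "-1" style suffix directly after the number is an assumption about how suffixed titles are written.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Minimal sketch of a parser for sequence titles such as "g001.seq" or
# "m279.pep". Names are hypothetical; the position of the optional "-1"
# suffix (after the number, before the dot) is an assumption.

ORGANISMS = {
    "g": "N. gonorrhoeae",
    "m": "N. meningitidis B",
    "a": "N. meningitidis A",
}
KINDS = {"seq": "DNA", "pep": "amino acid"}

TITLE_RE = re.compile(r"^([gma])(\d+)(?:-(\d+))?\.(seq|pep)$")


@dataclass(frozen=True)
class SequenceTitle:
    organism: str
    number: int
    suffix: Optional[int]  # e.g. 1 for a "-1" additional sequence
    kind: str


def parse_title(title: str) -> SequenceTitle:
    """Parse a title like 'g001.seq' into its components."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        raise ValueError(f"unrecognised sequence title: {title!r}")
    code, number, suffix, kind = match.groups()
    return SequenceTitle(
        organism=ORGANISMS[code],
        number=int(number),
        suffix=int(suffix) if suffix is not None else None,
        kind=KINDS[kind],
    )


if __name__ == "__main__":
    print(parse_title("g001.seq"))
    print(parse_title("m279.pep"))
```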
ORF: 279   contig: gnm4.seq
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3039>:
m279.seq
1 ATAACGCGGA TTTGCGGCTO CTTGATTTCA ACGGTTTTCA GGGCTTCGGC
51 AAGTTTGTCG GCGGCGGGTT TCAI-CAGGCT GCAATGGGALA G3GTACGGACA
101 CGGGCAGCGG CAGGGCGCGT TTGGCACCGG CTTCT'rTGGC GGCAGCCATG
151 GCGCGTCCGA CGGCGGCGGC GTTGCCTGCA ATCACGATTT GTCCGOGTGA
201 GTTGAAGTTG ACGGCTTCGA CCACTTCGCT TTGGGCGGCT TCGGCACAAA
251 TGGCTTTAAC CTGCTCATCT TCCAGCCGA GATCGCCGC CATTGCGCCC
301 ACGCCTTGCG GTACGGCGGA CTGCATCAGT TCGGCGCGCA GGCGCACGAG
351 TTTGACCGCG TCGGCAAAAT TCAATGCGCC GGCGGCAACG AGTGCGGTGT
401 ATTCGCCGAG GCTGTGTCCG GCAACGGCGG CAGGCGTTTT GCCGCCCGCT
451 TCTAAATAG
This corresponds to the amino acid sequence <SEQ ID 3040; ORF 279>:
m279.pep
1 ITRICGCLIS .VFIRASASLS AAGFIRLQWE GTDTGSGPARZ LAPASLAAAM
51 ARPTAAALPA ITICPGELYL TAS'rTSLWAA SAQMALTCSS SKPRIAAIAP
101 TPCGTAlCIS SA.RRTSLTA SAKFWAPAAT SAVYSPRLCP ATAAGVLPPA
151 SK*
The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3041>:
g279.seq
1 -atgacgcgga tttgcggctg cztgatttca acggtttcga gtgrtzcggc
51 aagtttgzcg gCggcgggtt tcatcaggct gcaacgggaa 9gaacggaza
101 ccggcagcgg caggcgcgt ttggctccgg cttctttczg ggcaqccatg
151 gtgcgtccga cggcggcggc q~zcctgca atcacactgcgga
201 gctgaagtcg acggcttcga ccacttcgcc ctgtgcggat tcciccacaaa
251 tctgcctgac ctgttCatct tccaaaccca aaatagccgc cattgcgcc
301 acqccttgcg gtacggcgga ccgcazcagt tcggcgcgca ggcggacgag
351 tttgacargca tcggcaaaaz ccaatgcttc ggcggcgaca aqcqcggtgt
401 a:tcgccgag gctgtgt ccg gcaacgqcgg caggcgtttt Qccccccac--
451 tccaaatag
This corresponds to the amino acid sequence <SEQ ID 3042; ORF 279.ng>:
g279.pep
1 MRICGCLIS TVLSVSASLS AAGF:RLQWE GTDTGSGRAX' LAPASLAAIJM
51 VRPTAAALPA ITTCPGELKL IAS7TTSPCAD SAQICLTCSS SKPiC4AAIAP
101 TPCGTADCIS SARRRTSLTA SAE(SNASAAT SAVYSPRLCP ATAAGVLPPT
151 SK*
ORF 279 shows 89.5% identity over a 152 aa overlap with a predicted ORF (ORF 279.ng) from N. gonorrhoeae:
m279/g279
m279.pep  ITRICGCLISTVFASASLSAGIRLQWEGTDTGSRLPASLMAPTAAP
g279      MTRICGCLIS-VLSVSASLSAGFILQWEGTDGSGRALAPASAAMRTUP
m279.pep  IT1 CPGELKLTASTTSLWAASAQMALTCSSSKPRIAAIAPTPCGTDCISSARRRTSLTA
g279      ITTCPGELKLTASTTSPCADSAQICLTCSSSOIAPTPCGTADCISSARRPTSLTA
m279.pep  SAKNAPAATSAVYSPRLCPATVLPPSK
g279      SAKSNSATSAVSPRLCPATAAGVLPPTSKX
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3043>:
a279.seq
1 ATGACNCNGA TTTGCGGCTG CTTGATTTCA ACGGTTTNNA GGGCTTCGGC
51 GAGTTTGTCG GCGGCGGGTr TCATGAGGCT GCAATGGGAA GGTACNGACA
101 CNGGCAGCGG CAGGGCGCGT TTGGCGCCGG CTTCTTTGGC GGCAAGCATA
151 GCGCGCTCGA CGGCGGCGGC ATTGCCTGCA ATCACGACTT GTCCGGGCG.
201 GTTGAAGTTG AC'-CT'2CAA CCACTTCATC CTGTGCGGAT TCGGCGCA.AA
251 TTTGTTTTAC CTGTTCATCT TCCAAGCCGA GAATCGCCGC CATTGCGCCC
301 ACGCCTTGCG GTACGGCGGA CTGCATCACT TCGGCGCGCA NGCGCACGAG
351 TTTGACCGCG TCGGCAAAAT CCAATGCGCC GGCGGCAACN AGTGCGGTGT
401 ATTCGCCGAN GCTGTGTCCC GCAACGGCGG CAGGCGTTTT GCCGCCCGCT
451 TCCGAATAG
This corresponds to the amino acid sequence <SEQ ID 3044; ORF 279.a>:
a279.pep
1 MTXICGCLlS TVXRASASLS AAGFMRLQWE GTDTGSG.RAR LAPASLAAS:
51 ARSTAAALPA ITTCPGtLKL TASTTSSCAD SAQICPFCSS SKPRIAAIAP
101 TCGTAZ)CIS SARXRTSLTA SAKSNAPAAT SAVYSXLCP ATAAGVLPPA
151 SE*
m279/a279
ORFs 279 and 279.a showed a 88.2% identity in 152 aa overlap
m279.pep  ITRICGCLISTVFRASASLSAAGFIRLQWEGTDGSGPA-LAPASLAAAMA-?TAALPA
a279      MT GL7SVRSSLAGFMLWG TSCPPL PAS AS!RTAL?
m279.pep  T-I-GLLA-SWAAMLCSSP-A.,TCTDISRRST
a279      :ICGLLATSCDAIFCSKR;,IP-CTD-S X ST
m279.pep  SAKFNAPAATSAVYSPRLCPATAAGVLPPASK.<
a279      SAKSNAPAATSAVYSPXLCPAT.-GVPPASE'<
519 and 519-1   gnm7.seq
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3045>:
m519.seq (partial)
1 .TCCGTrATCG GGCGTATGGA GTTGGACAAA ACGTTTGAAG AACGCGACGA
51 AATCAACAGT ACTGTTGTTG CGGCTTTGGA CGAGGCGGCC GGGgCTTgGG
101 GTGTGAAGGT TTTGCGTTAT GAGATTAAAG ACTTGGTTCC GCCGCAAGAA
151 ATCCTTCGCT CAATGCAGGC GCAAATTACT GCCGAACGCG AAAAACGCGC
201 CCGTATCGCC GAATCCGAAG GTCGTAAAAT CGAACAAATC AACCTTGCCA
251 GTGGTCAGCG CGAAGCCGA ATCCAACAT CCGA.AGGCGA GGCTCAGGCT
301 GCGGTCAATG CGTCAAATGC CGAGAAAATC GCCCGCATCA ACCGCGCCAA
351 AGGTGAAGCG GAATCCT-GC GCCTTGTTGC CGAAGCCAAT GCCGAAGCCA
401 TCCGTCAAAT TGCCGCCGCC CTTCAAACCC AACGCGGTGC GGATGCGGTC
451 AATCTGAAGA* TTGCGGAACA ATACGTCGCT GCGTTCAACA ATCTTGCCAA
501 AGAAAGCAAT ACGCTGATTA TOCCCGCCA TGTTGCCGAC ATCGGCAGCC
551 TGATTTCTGC CGGTATGAA ATTATCGACA GCAGCAAAAC CGCCAAaTAA
This corresponds to the amino acid sequence <SEQ ID 3046; ORF 519>:
m519.pep (partial)
1 .SVIGRMELDK TFEERDEINS TVVAALDEAA CAW~GVKVLRY EIKCLVPPQE
51 IRSMQAQIT A-ERLKRARIA ESEGRCIEQI NLASGQREAE IQQSEGEAQA
101 AVNASNAEKI ARINaLAKGEA ESLRLVAEAN AEA:RQIAAA LQTQGGADAV
151 NLKIAEQYVA AFNNLAICESN TLIMPANVAD IGSLISAGMK IIDSSKTAK*
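The correspondence between each DNA sequence and its amino acid sequence follows the standard genetic code. The sketch below is a minimal illustration, assuming the standard code and reading frame 1 from the first base; the function names are hypothetical. A codon containing "N" is resolved only when every possible base gives the same amino acid, and is otherwise reported as "X", which is one reasonable way to handle the ambiguous positions seen in the N. meningitidis A sequences.

```python
from itertools import product

# Minimal sketch: translate a DNA string with the standard genetic code.
# Reading starts at the first base; trailing bases that do not fill a codon
# are ignored. Names are hypothetical.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate_codon(codon: str) -> str:
    """Translate one codon; 'N' is expanded to all four bases.

    Returns the amino acid when all expansions agree, otherwise 'X'.
    """
    options = [BASES if base == "N" else base for base in codon.upper()]
    aminos = {CODON_TABLE.get("".join(c), "X") for c in product(*options)}
    return aminos.pop() if len(aminos) == 1 else "X"


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) in frame 1."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate("ATGGAATTTT TCATTATCTT G"))  # spaced blocks as in the listings
    print(translate("TCTAAATAG"))
```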
The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3047>: g519.seq 1 atggaatttt tcattatctt gttggcagcc gtcgccgttt tcggcttcaa 51 atcctttgtc gtcatccccc agcaggaagt ccacgttgtc gaaaggctcg 101 ggcgtttcca tcgcgccctg acggccggtt tgaata-ttt gattcccttt 151 atcgaccgcg tcgcctaccg ccattcgctg aaagaaatcc ctr:tagacqt 201 acccagccag gtctgcatca cgcgcgataa cacgcaatzg actgt--gacg 251 gcatcatcta tttccaag:a accgatccca aactcgcczc ataccgttcQ 3Cd agcaactaca ttatggcaat tacccagctt gcccaaacga cgctgcgttc 351 cgttatcggg cgtacggagt tggacaaaac gtztgaa~aa cgcgacgaaa 401 tcaacagtac cgtcgtctcc gccctcgatg aagccQccgg g9CttQoggt 451 gtgaaagtcc tccgttacga aatcaaggat ttggztccac cacaagaaat ccttcgcgca atgcaggcac aaattaccgc cgaacqgaa aaac9cgccc 551 gtattgccga atccgaaggc cgtaaaatcg aacaaatcaa ccttqccagt 601 gstcagcgtg aagccgaaat ccaacaatcc gaaggcgagg ctcaggctac 651 ggtcaazgca tccaatgccg agaaaatcgc ccgcatcaac cgcgccaaag 701 gcgaagcgga atccctgcgc ctzgttgccg aagccaa-gc cgaagccaac 751 cgtcaaattg ccgccrccct tcaaacccaa agcggcgg atgcggccaa 801 tctgaagatt gcgggacaat acgttaccgc gt:caaaaat cttaccaaaa 851 aagacaatac gcggattaag cccgccaagg ttgccgaaac cgggaaccct 901 aattttcggc ggcatgaaaa attttcgcca gaagcaaaaa cggccaaata 951 a This corresponds to the amino acid sequence <SEQ ID 3 048; ORF 5 19.ng>: g529 -pep MEFFIILLAA VAVGKSV V:-PQQvV-r !DRVAYRI4SL KEIPLEVPSQ VCITPflNTQL SNYIMA:TQL AQTTLRSVIG RM~ELfKTFEE VKVLRYE =a LVPPQEILRA
MQAQITAERE
GQREAEIQQS EGEAQA-AVNA
SNAEKIAII
RQ7AAALLQTQ SGADAVN:-KI,
AGQYVTAFKN
NFRR.HEK?-SP EAKTAK*
ERLGRFHRAL
'vDG: IYFQVJ
RDEINSTVIVS
I.AIAES KG
R-AKGEAES:,R
LAKEDNTR:
K
TAGL7ILIPF T: PKLAS YGS A-:DEAAGAwG RK7EQINLAS
LVAEANAEAN
PAKVAEIGNP
ORE 519 shows 87.5% identity over a 200 aa overlap with a predicted ORF (ORE 5l9.ng) from N. gonorrhoeae:m519/g519 mn519. pep g519 10 20 SVIGRMELDKTFEERDENS2.JVALDEAA YFQVTDPKL.ASYGSSNYIMI,QL'
ITTLRSV:GRELDKTFEEEINSVSLDEA
100 110 120 130 140 s0 60 70 80 m519.pep GAWGVKVLRYE::,
LVPPQEILRSMAQITAERERAIAESEGRIINA
II I I II I I i I I III I II M 1 111 Hi II H 1 1 1 11 1 1 1 g519 GAWGVKVLRYEI KDLVPPQEI LPMQAQITZKRIASEGRKI
EQINLSGQREAE
160 170 180 190 200 100 110 120 130 140 150 m519.pep I Q G-QANSA IRNAGASLLMNE-R
ALTGAA
g 519 IQSGAAVANEIRNAGASRVENENQAALTSAA 210 220 230 240 250 260 160 170 180 190 200 19 .pep NLKIAEQYJAAFNNLAKSTIMPANV~rIGSL SAGMKI IDSSKTAK gS 19 NLKIAGQYVTAF!CMLAKEDMTR: KPAKVAEI GNPNFRRH-cvS PEAKTAK 270 280 290 300 310 The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3049>: a: ±19 seq 1 51 101 151 201 251 301 351 401 501 551 601 611 701 "51 801 851 901
ATGGAATTTT
ATCCTTTGTT
GGCGTTTCCA
ATCGACCC
ACOCACCAG
GTATCATC-A
AGCAACTACA
CGT TAT CGGG
TCAACAGCAC
GTGAAGGTTT
TCATTATCTT
GT CAT CC CAC
TCGCGCCCTG
TCGCCTACCG
GT CTG CAT CA
TTTCCAAGTA
TTATGCCGAT
CGTATGGAAT
CGTCGTCTCC
TGCG TTATGA
GCTGGCAGCC
AGCAGGAAGT
ACGGCCGGTT
CCATTCCCTG
CGCGCGACAA
ACCGACCCCA
TACCCAGCTT
TGGACAAAAC
GCCCTCGATG
CATTAAAGAC
AAATTACTC
CGTAAAATCG
CC'AAC~tTCC AGAAAiATCGC
CTTGTTGCCG
TCAAACCCAA
ACGTCGCCGC
CCCGCCAATG
TATCGACAC
GTCGTTGTTT TCGCTCAA% CCACGTTCTC GAAAGGCTCG TGAATAT-TT GAT7'-CCTTT AAAGAAATCC CTTTAGACGT TACGCAGC:23 ACTGTTGACG AACC"CCTC ATACCGTT-G CCCAACG-A cCCC-'rT GTTTGAAGAA CGCCAfGAAA AkAGCCGCCGG AGCTTGCGT ITGGTTCCGC C-C.AACAAAT TGAACCCGAA AAAC~cCCCC AACAAATCAA CCTTGCCAGT GAAGGCGAGG CTCAGC=2 CCGCATCAAC CGCGCCAA AAGCCAATGC CGAAGCCATC GGCGGTGCGG ATGCGG-CA-A GTTCA.ACA-A- CTTGCCAAAG TGCCACAT CGG:AGCCT"- AGCAA;AAC.3 CCA:ATAA CCTTCGCTCA ATGCACCCGC CTATCGCCCA ATCCGAAGGT GGTCAGCGCC AAGCCCC.A4T GGTCAATGCG TCAAATGCCG CTGAAGCGGA ATCCTTGCGC CGTCAAATTC CCGCCGCCCT TCTGAAGATT GCGGAACA.AT AAAGCAATAC GCTGATTATG ATTTCTCC GTATGAAAkAT This corresponds to the amino acid sequence <SEQ ID 3050-: ORE 5 19.a>: a519. pep 1 51 151 201 301 MEFFIILLAA VVVS'GFKSFV IDRVAYRHSL KEIPLDVPSQ SNYIMAITQL AQTTLRSVIG VrK7LRYEIKD LVPPQE:LRS GQREAE1QQS EGEAQAAVNA RQIAAALQTQ CGADAVNL-K- IS-AGMEIIDS SKTAK VI PQQEVHVV
VC:TRDN-CL
RMFZCE)KTF-E
MCAQ:TAZRE
SNAEKIA-RIN
AE-QYIAA?-NN
ERLGRFHFRAL
TIDG IIYFQV 9 DE N ST S KRAR2AEs7-
RAZ'KGEAESLR
!AKE SNT1L:M .TAG 11 L~ P- T I OK LA--S .ALSE;AG WG
PANVADI)GSL
m519/a519 ORFs 519 and 519.a showed a 99.5% identity in 199 aa overlap
1220 an519 .pep Eq-GSNIAIQAQ-RVIGAMELIKT FE-RP2E :NSTVVSAL0EAA\ 100 11i20 130 140 50 60 Is 80 m519. pep GAWGVKVL7RYE IDLVPFE LRSMAQTAREKFIASER:b:ITAQGCaAE a519 GAWGVKVLRYEIKLVPQEILRSMQAQTAEREKqRA:A-EGPVI..QTNLAGQREAE 150 160 i170 0 190 200 100 110 010120C 130 140 inS19 .pep
IQQSEGEAQAAVNASNAEKIARIN(GESLRLVAEAQIAALQTQGAA
a519
!QQSEGEAQAAVNASNAEKIAR:NAKGEAESLRLVAE-AEAIRQTAALQTQGAA
210 220 230 240 250 260 160 17C 180 190 200 m519.pep NTKAQVANLKST--PNAISIAM7!SKAK a519
NLKIAEQYV-FNNLAESNLIMPNVADIGSL:SAGMKISKTX
270 230 290 300 310 Further work revealed the DNA sequence identified in N. meningitidis <SEQ ID 3051>: 1 e 51 101 151 201 .z 251 C 301 351 C 401 T 451 G 501 C 551 G 601 G 651 G 701 G 751 C 901 A
A.TGGAATTTT
ATCCTTTGTT
GGCGTTTCCP.
kTCGACCGCG
~CCCAGCCAG
CATCATCTA
~GCAACTACA
:GTTATCGCG
'CAACAGTAC
;TGAAGGTTT
OTT CGCT CA
TATCGCCGA
GTCAGCGCO
GTCAATGCG
TGAAGCGGA
GTCAAATTG
CTGAAGATT
AAGC~- Tzc
T-TCTGCCG
TCATTATCTT
GTTGGTAGCC
OTCATCCCAC-
AACAGGAAGT
TCGCGCCCTG
ACGGCCGGTT
TCGCCTACCG
CCATTCGCTG
GTCTGCATCA
CGCGCGACAA
TTTCCAAGTA
ACCGACCCCA
TTATGGCGAT
TACCCAGCTT
CGTATGGAGT TGGACAAAn.C TGTTGTTGCG
GCTTTGGACG
TGCGTTATGA
GATTAAAGAC
ATGCAGGCGC
AAAT-TTACTGC
ATCCGAAGGT
CGTAAAATCG
AAGCCGAAAT
CCAACAATCC
TCAAATGCCG
AGAAAATCGC
ATOOTTOGG
CTTGTTGCCG
CCGCCGCCCT TCAAACCCAA GCGGAACAAT
ACGTCGCTGC
GCTGATTATG
CCCGCCAATG
GTATGAAAAT TATCGACAGC GT CCC CTTT
C-ACGTTGTC
TCAATATTTT
AAAGAAATCC
TACGCAGCTG
AACT CCCCT C
GCCCAAACGA
GTTTGMAAA
ACCOTCCCG
CGAACGG-
AACAAA-C:A
CA.AG-CGAG
CCC.CAWC
AAGCCAA:C
CTTCAACAAT
T -C CT
TCGGTTTCAA
CAAAGGCTGG
CATTCCCTTT
CTTTAGACGT
ACTGTTGACG
ATACCCTTCC.
CGCTCCGTTC
CGCCACGAAA
CCCTTGGCGT
CGCAAGAAAT
AkAACCCCCC
CCTTGCCAGT
CT CACCC TG C
CCGCCAAAC'
C CAAGC CAT C
ATCCTCAA
CTTGCCAAAC
C GCCAG COT C
This corresponds to the amino acid sequence <SEQ ID 3052; ORF 519-1>:
1 MEFFIILLVA VAVFGrKSV V!PQQEVHVV ERLGR'HPAL TAGLN17J2?-
51 ICDRVAYR.HSL KE:?DVPSQ VCITRDNTQL TVDGIiYQV TDPK-7ASYGS
101 SNYIMA:TQL ACTTLRSVIG IELDKTFEE RDEINSTV1/A
ALDEAAGAWG
151 VKV7RYEIKD LVPPQE::RS MQAQ-TAERE K RA_AS= RK17QINLAS 201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGE-AZSLR
LVAEA-NAEAT
251 RQIAAA7QTQ GGADAVNLK:- A2kQYVAAFNN LAYESNT:IM
PANVAOIGSL
301 73AGMKI:S
SKTAFK
The following DNA sequence was identified in N. gonorrhoeae <SEQ ID 3053>:
9519-I. seq 1AT GGAAWTT TCATTATC:: G-T'OACCC G-CCGOT":
TCCCGCTTCA
51 ATCOTTTCTC CTCAT.CCC AGCAGCAACT CCACTT37C
SAAAGCCTCG
1)I GGCTTTCCA TCCCCTG ACGGCCGGTT "C ATATT'T GAT-CCCTTT 151 AT7CCACCGCG TCCCCTAOCC CCA:TCGCTC AAAGA-AATC-
OTTTAGACGT
201 ACCCACCCAG GTCTCCATCA CCCATAA TACCA..TTG,
ACTCTTCGACG
251 CCATCATCTA TTTCCAAGTA ACCGATC CCA AACTCCCCTC
ATACCCTTCG
301 AGCCACTACA TTATGGCzLT TACCCACCTT GCCCAOCGA
CGCTGCGTTC
351 CCTTATCGCC CGTATGGACT TCCACAAC GTTTGAAGpA
CGCGCCAGA
401 TCAACAGTAC CGTCCGTOTCC CCCCTCCATG A.ACCCCCG
CCCTTGGGCT
451 GTGAAGTCC TCCTTACGIA AATCAAGGAT TTGGTTCCCC
CGCCGAAT
501 CCTTCGCGCA ATGCACGCAC AAATTACCGC CGAACGCCAA
AAACCCCCC
551 GTATTCCCGA ATCCGAACGC CGTAAAATCG AACAAATCA.
CCTTCCCACT
601 GCTCAGCGTG AAGCCCGAA.T CCAACAAMCC CAAGGCGAGG
CTCACCCTC
651 GGTCAATGCG TCCAATCCC AGAAATCGC CCCATCAAC
CGCGCCAA
701 GCGAAGCGCA ATCCOTCC CTTGTTGCCG AAGCCCATG-
CGAAGCCATO
751 CCTCAAATTG COGCCGCCOT TCPACCCAA GGCGGCCO
ATGCGGTCAA
801 TCTGAAGATT GCGGAACArAm AF1AAAT TCCAT -U -GC GTTCAACAP.T CTTGCCAAAG 851 AAAGCAATAC GCTGATTATG CCCGCCAATG TTGCCGACAT
CGGCAGCCTG
901 ATTTCTGCCG GCATGAAAAT TATCGACAGC AGCAA-AACCG
CCAAATAA
This corresponds to the amino acid sequence <SEQ ID 3054; ORF 519-1.ng>:
g519-1.pep
1 MEFFIILLAA VAVFCFKSFV VIPCQEVHVV ERLGRFHRAL
TAGLNV-T
51 IDRVAYRHSL KEILDVPSQ VCITRDNTQL TVCGIY7QV
TDK-'ASYGS
101 SNYIMAITQL AQTTLRSV:G RMSLDKTFEE RDETNSTVVS
ALDEAAGAWG
151 VRVLRYEIKD LVPPQEILRA MQAQITAERE K(RAIAESEG
RKIEQINLAS
201 GQREAE:QQS EGEAQAAVNA SNA-EKIAa:N RAKGEAESLR
LVAEANAEAI
251 RQIAAALQTQ GGADAVNLKI AEQVAFN LAKESNTLIM
PANVADZGSL
301 ISAGMKIIDS
SKTAK*
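The pairing of each DNA sequence with its amino acid sequence follows the standard genetic code. As an illustrative sketch only (it is not part of the specification; the function name `translate` and the lookup table built here are assumptions introduced for this example), the following Python fragment translates the first 30 nucleotides and the last 18 nucleotides of a519-1.seq (SEQ ID 3055) as printed in this section. Unreadable or ambiguous codons are reported as X rather than guessed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in reading frame 1.

    Whitespace (the ten-base blocks of the listing) is ignored.
    Codons containing anything other than A, C, G or T become 'X'.
    A stop codon is written as '*', as in the peptide listings.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


# Opening of a519-1.seq (positions 1-30) and its closing 18 bases.
print(translate("ATGGAATTTT TCATTATCTT GCTGGCAGCC"))  # MEFFIILLAA
print(translate("AGCAAAACCG CCAAATAA"))               # SKTAK*
```

Both results agree with the start and end of the a519-1.pep listing. Because damaged codons come out as X, the same function can also be used to flag positions where the printed DNA and peptide listings cannot be reconciled.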
m519-1/g519-1 ORFs 519-1 and 519-1.ng showed a 99.0% identity in 315 aa overlap
20 30 40 50
g519-1.pep MEFILAA7IKFVPQVVERGFPLALIIF
RARS
m519-1 MEFFhILLVAVAVFGEKSVI
PQQEVHVVERGRFHRATGNLPIRARS
20 30 40 50 700s 90 1C0 110 12 g59-1 pep KE-LVSVI'DTLVGIFVCKAzGSYZ T7,TLSI M519-1 KE PDPSQVCTRDNTQL-VGIIYFQVTDKLASYGSSNYMIATLQ-TRV- 80 90 100 110 120 130 140 150 160 170 180 g519-1 .pep RMELDKTFEERDEINSTWISALDEAGAWGVKLRYE:KDLVPPQI
AMAIAR
M519-1 PMLKFEDIN VALEAAGKVREKLPQIPt/Q,
-AR
130 140 150 160 IS0O9 190 200 210 220 230 240 g519-1 .pep KPRASGi:QNAGRAIQEEAAV ANE7?:P
KFEL
rn51 9-1 XRRA-ER7QNAGRA7QEEQAVANEI .IR.
S
190 2 00C 210 220 2 30 240 250 26J 270 280 290 300 gI 19-1 .pep LVE.AA.,
TALTGAANKZ-QVANLP~,TIPNPIS
m519-1* LVENE: IALTGAANKAEQVA-N7KQT1-AV:1 250 260 270 280 290 300 310 g519-1.pep ISAGMKIIDSSKTAjKX rn519-1
IS?GMKIIDSSKTAKX
310
The following DNA sequence was identified in N. meningitidis <SEQ ID 3055>:
a519-1.seq
1 ATGGAATTTT TCATTATCTT GCTGGCAGCC GTrGTTGTTT
TCGGCTTCAA
51 ATCCTTTGTT GTCATCCCAC AGCAGGAAGT CCACGTTGTC
GAAAGGCTCG
101 GGCGTTTCCA TCGCGCCCTG ACGGCCGGTT TGAATATTTT
GATTCTTT
151 ATOGACCGCO TCGCCTACC CCATTCGCTO, AAGAATCC
CTTTAGACGT
201 ACCCAGCCAG GTCTGCATCA CGCGCGACA -ACGCAGCTG
ACTGTTGACG
251 GTATCATCTA TTTCCAAGTA ACCGACCCCA MACTCGCCTC
ATACGGTTCG
301 AGCAACTACA TTATGGCGAT TACCCAGCTT GCC-AAACGA
CGCTGCGTTC
351 CGTTATOCGGG CGTATGGAAT TGGACAAC GTTTGAAGAA
CGCGACGA
401 TCAACAGCAC CGTCGTCTCC GCCCTCGATG AAGCCGCCGG
AGCTTGGGGT
451 GTGAAGGTTT TGCGTTATGA GATTAAAGAC TTGGTTCCGC
CGCAAGAAAT
501 CCTTCGCTCA ATGCAGGCGC AAATTACTGC TGAACGCGA
AAACGCGCCC
551 GTATCGCCGA ATCCGAAGGT CGTAAAATCG AACAAATcAA
CCTTGCCAGT
601 GGTCAGCGCG AAGCCGAAAT CCAACAATCC GAAGGCGAGG
CTCAGGCTGC
651 GGTCAATGCG TCAATGCCG AGAAATCGC CCGCACAAC
CGCGCCAA
701 GTGAAGCGGA A7TCCTTGCGC CTTGTTGCCG AAGCCAATGC
CGAAGCCATC
751 CGTCAAATTG CCGCCGCCCT TCAAACCCA GGCGCTGCGG
ATGCGGTCAA
802 TCTGAAGATT GCGGAACAT ACGTCGCCGC GTTCAACAAT
CTTGCCAA
851 AAAGCAATAC GCTGATTATG CCCGCCAATG TTGCCGACAT
CGGCAGCCTG
901 ATTTCTGCCG GTATGAAAAT TATCGACAGC AGCAAAACCG
CCAAATAA
This corresponds to the amino acid sequence <SEQ ID 3056; ORF 519-1.a>:
a519-1.pep
1 MEFFIILLAA VVVFGFKS'V \JIQQEVHVV ERLGRFHRAL
TAGLNILIPF
51 IDaVAYRHSL KEIPLDVPSQ VCITRDNTQL TVDGIIYFQV
TDPKLASYGS
101 SNYIMAITQL AQTTLaSVIG RMELDF(TFEE RDEINSTVVS
ALDEAAGAWG
151 VKVLRYEIKD LVPPQEILRS MQAQITAERE KRARIAESEG
RKIEQINLAS
201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGEAESLR
LVAEANAEAI
251 RQIAAALQTQ GGADAVNLKI AEQYVAAFNN LAKESNTLIM
PANVADIGSL
301 ISAGMKIIDS SKTAK*
m519-1/a519-1 ORFs 519-1 and 519-1.a showed a 99.0% identity in 315 aa overlap
1020 30 40 C0 a5l9-: .pep MEF-LA~VGKrVIQEHVELRHATONLPIRARS T-519-1 ME--ILAA7-KFVPQVVELGF4ATG -L'F7DV-RS 20 30 40 5 C 80 90 100 10 120 a519-1 .pep KEIL0VPSQVC:7TRDNTQLTVDG7IYFQVTDPKLASYGSYNaTLQTTSI M319-1 P-EIPLDVV 'RN' TD-IFVDK AYSN~MIQ~
TLSI
80 90 100 110 120 130 140 150 160 17 a519-1 .pep PMER-'TFEPED-xINSTVV3JALDEAGAWGVKVLRY- _K 7 VP
PE-SMQAQITAER
M519-1 -RM-LDKT r7ERDEINS:-VVA0EAGAWGVKV YIDPI-I A1AR 130 140 150 160 170 190 2001 210 2M 30 24 a3:9-1.pep KFR. EERIEILSQ30 240E-AANSA:- NAG7ASP m51 -1 (RA IAE EGR :EQ N SGQ EAE QQS -E Q-A NASNAK-eARzINPAK G ASL R 190 200 210 220 2310 240 250 26) 270 280 9 I 300 :59-1 LVAEANAEAIRQIAk.ALQTQGGADAVNLKIAEQYAEINAE,,TIPNAIS 250 216C 270 280 2190 300 310 a 5 1 9 -1.pep
ISAGMKIIDSSKTAK<
M519-1
!SAGMKIIDSSKTAKX
310
576 and 576-1 gnm22.seq
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3057>:
m576.seq (partial)
ATGCAGCAGG
GCAAATGAAG
CCATGCAGGC
GCTCAGGAAG
AGAAAAACAC
TTCTGAAAGA
CTGCAATACA
CGACATCGTT
TCGACAGCAG
G-TGATT CC GG
AGCCACGTTC
GCGACAAAAT
AAAATCGGCG
CATCAAAAAA
CAAGCTATGC
GAACAGGGCG
AGTGTATGAC
TCATGATGA
AAGGCGGACG
AAATGCCGCC
AAATCACCAA
ACCGTGGAAT
CAAAGCCAAC
GTTGGACCGA
TACATOOC:-7
CGGTCCGAAC
CACCCGAAAA
GTAAATTAA
GATGGGCGTG
CGGAAATCGA
GGCAAAGAXz~
ATTCCTTCAG
CGAAGGCCA
AAAGACGGCG
ACAGGGCGxnA
ACGAAGGCCG
GGCGGCC CG AGgCGTACAG
CCAACCTTGC
GCCACTTTGG
C 0CCCCCGC C GACATCGGAC
GCTCCCTGAA
TTTGAAA.GTC
TTTACCGAAG
TCAAAATGAC
CGAAGAGCAG
GAACAACAGG
CTAAAGCCGT
TAAAGAAAA
GGCGAAGCCT
TGAAGACCAC
TGCTTCCGGC
GGC-AACAGC
CGACCAAAGA
CCTGATTGAC
GGTACGGTAT
TCACCTTCCC
TTTGAGCCA.
CTTCTGAAAG
AAGGCGGCGA
CTACCGCGAA
CAC-GGTGCGG
TATTTGATGT
GAAACTZGOC
AAGCAGCCGG CTCAAGTC-GA
This corresponds to the amino acid sequence <SEQ ID 3058; ORF 576>:
m576.pep (partial)
1 M.QQASYA 40V 1 AQEVMDZFL-Q 1 QYKITKQGE 1 VIGWTEGVQ 1 KIGAPENAPA
DCGRSLKQMK
EQQAKAVEKH
C-KQ ?TKD D7V 7LKEGGEATF
KQPAQVDIKK
EQGA I DLKV
KADAKANKEK
TVEYEGRLI 0 Y7PSNLAYRE VN EF7EAMQAVYD GKEtIKMTEEQ GEAEUZENAA KDGVKTTASG CTVFDSSKAN CC-PVTF2LSQ: CGAGDKIGPN A-7V7DVK7V
The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3059>:
g576.seq (partial)
atgggcqtgg ggaaatcgat Sc a a aga aa:ttcctgcagg gaaggcaaa: aagacggcgt cagggtgaag cgaaggCCgC gcggcccggc gg~gtacgg: caaccttgc ccacttggr: gcgcCgcca acazcqg,_ 2 g ttgaaagtct caaaatgacc agcagcaggc aaagaaaaag gaaaaccact qcaaacagcc Ctgattgacg tcztgaaaga taccacgaac a::tGacgtg aacagccgga Ct--czgaaa ttaccgatgc gaagagcagg taaagccgta 9gaagcctt g~ttCCggtc cgacaaaagac gtaccgtat ttgagCCaag aggcggcgaa agggtacggg aaactggtca Zocaagtcaac caaataaagg catgcaggca cccaggaagt gaaaaacaca Cctgaaggaa tgCagtacaa gacatcgt-a cgacagcagc t~a::CCggg cga a aaa a:: aaatcggcqc aacaggag:g 9tg~atgacg gatqataaaa aggoggat:: aargccgc:g aatcaCcaaa ctgtggaata aaa7 0 03 a toaac :gaa ac:cgaaaac ::gaaaa
This corresponds to the amino acid sequence <SEQ ID 3060; ORF 576.ng>:
I .MGVDIGRSLKI 51 FLQEQQAKAVl 101 QGEGKQPTKD 151 GVRLLKEGGE
APAKQPDQVD
QMKEQGAK: C C IVT'IEYEGR ATFY: ?SNLA
IKKVN*
LKVETDAMQP.
KEKGEAFLKE
ZDGTVFDSS
YREQ GAGER:T NAAC OGYK:IT KANGO FATE? SP NAT LVED V
EEQAQEVM.K
ASGLQYKTTK
LSQVI PCWTE
KLVKIGAPEN
Computer analysis of this amino acid sequence gave the following results:
Homology with a predicted ORF from N. gonorrhoeae
m576/g576 ORFs 576 and 576.ng showed a 97.2% identity in 215 aa overlap
10 20 30 40 50 M57 6. pep MQQASYAMGVDIGRSLQMKCQAIDLKCJE? QAV'DG,7KTEQQVMKL g57 6 MG D G SL0 MK 03 7080 90 100 110 120 76. pep
QAAEHAAAKEGALEAKGKTGLYIKGGQTDV
g57 6
EQQAKAVKDAAK-KGALEAEGKTSLYIKGGQTD
70 s0 90 100 110 TV3Y 0 140 150 160 170 180 m576.pep TVYER L IOGTVFDS SKANGGPVT FPLSQV I WTEC\!QTLLKGETFI2S
A
g576 TVEYEGRL1DGTVFDSSKAGPAT.PLSQVI PGWTE0VRLLKEGGEATFYIPSNLAYRE 120 130 140 130 160 170 156pp QGGK 90 200 210 200 m576.pe QAGKIPNATLVF'DVKLVKIGAPENAPAKQPAQVDIKN g576 QGAGEKIGPNATLVFDVKLVKIGAPENAPAKQPDQVDIrN 180 190 200 210 The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3061>: a576. seq 1 ATGAACACCA tv--AT
ACTTTCCGCC
CT GCCGC C GC
ATGCAGCACG
GCAAATGAAG
CCATGCAGGC
GCTCAGGAAG
AGAAAAACAC
TTCTGA.A.s~f-A
CTGCAATACA
CGACATCGTT
TCGACAGCAG
GTGATTCTGG
AG CCACGTT C
GCGACAAAAT
AAAATCGGCG
CAT CAAALkkA
TGCGGCAAA-Z
TT CT: C CO C
CAAGCTATGC
G,%ACAGGGCG
AGTGTATGAC
TCATGATGAA
AAGGCGGACG
AAATGCCGCC
AAATCACCAA
ACCGTGGAAT
CAAAGCCAAC
GTTGGACCGA
TACATCCCGT
CGC CCC GAAC
CACCCGAAAA
GTAAATTA
C.AGCGCACTG
AAGAAGCCGC
CAC-GOOGACA
GATGGGCC-TG
CGGAAATCGA
GGCAAAGAAA
ATTCCTTCAG
CGAAGGCCA
AAAGACGGCG
ACAGGGCGAp.
ACC-AAGGCCG
GGCGGCCCGG
AGGCGTACAG
CCAACCTTGC
GCCACTTTGG
CGCGCCCGCC
ACCCTT-CCC
CCGCTTTGGC
C-CCGCA:-TCT
GCATCCGAAC
CCTCTTCGA-
CGGC-AGCACG
O:ACATCGG.,C
GCTCCCTG.
TTTGAAACTC
-TTACCGAAG
TCAAUAATGAC
CGAAGAGCAG
GAACAACAGG
CTAAAGCCGT
T-AAAGAAAA
GGCGAAGCCT
TGAAGACCAC
TGCTTCCGGC
GGCAAACA~CC
CGACCAAAGA
CCTG-ATTGAC
GGTACGGTA:
TCACCTTCC'C 7TTGAGCC- CTTCTGASA-;G
AAGGCGGCGA
C-ACCGCG;1A
CAGGGTGCGG
TAT::CATG:
GAAACTGGTC
AAC-CAGCCZCS CTCAAGTC, This corresponds to the amino acid sequence <SEQ ID 3062; ORF 5 7 6.a>: a 5 7 6.peo MNTIFKISA
TLSALALSA
>IQQASYAMGV DIGRS LKQMK AQEVMN'KFLQ EQQAKAVEzKH LQYKITKQGE-
GKQPTKDV
VILGWTEGVQ
LLK'GGEATI-
K:GAPENAPA. KQPAQVDIKT CG~KEAAPAs EQG.3E 7DLKV
KADAKANKEK
VEY-GRLID
Y:PSNLAYRE
ASEPAAASSA QGC-SSIGSm -rEAM~.QAVY:)
GKEIK.MTEEQ
GEAFLKENI;A
KDGVKTTASG
GTVB-DSSKiN
GGPVTFPLSQ
QGAG0?cISPN !ATTVFDVKLv/
m576/a576 ORFs 576 and 576.a showed a 99.5% identity in 222 aa overlap
m56pp10 20 rn576 pepMQQASYAI.IG' bI GR5LK QMKEQGAE: CLKV a56CGKAAPA3ASE? ASSAQGD:SSIGSTMQQ UAGVIF FF1 FF111 FJIDLK 40, 50 60 70 80 mS 76. pep FTEASIQAVIYOGKE
IKMTEEQAQEVMMKFLQEQAKAVEHADKNEKEFLEA
a 576 FTEANCQAVYDGwv
IKMTEEQAQEVMMKFLQEQQAKAVEKMKADAANEEFLSA
100 110 120 130 140 81 100KTAS 110 120 130 140 150 m57 6. pep
DGKTSLQYKITKQGEGE(QPTKDDIVTVEYEGRLIDGTVFDSSAGPTLQ
a576 DVTAGQKTQEQPKDVVYRLDTFSK GPTPQ 150 160 170 180 190 200 m576.pep a576 160 170 180 190 200 210 VI PGWTEGVQLLKEGGEATFYI
PSNLAYRE:QGAGDKIGPNATLVFDVKLVKIGAPENAP
VILGWTEGVQLLKEGGEATFYI PSNLAYR EQGAGDKIGNATLVDVLV(I.APN 210 220 230 240 250 260 22C m576 pep
KQPAQVDIKKVNX
a576 1QPAQVDIK(\rNX 270 Further work revealed the DNA sequence identified in N. meningitidis <SEQ ID 3063>: m576-1. seq 1 ATGAACACCA TTT:CAAAAT
ACTTTCCGCC
CTGCCGCCGC
ATGCAGCAGG
GCAAATGAAG
CCATGCAGGC
GCTCAGGAAG
AGAAAAACAC
TTCTGAAAGA
CTGCAATACA
CGACATCGTT
TCGACAGCAG
GT GATTC CG
AGCCACGTTC
GCGACAAAAT
AAAATCGGCG
CAT CAAAA TGCGGCAAjA
TTCTTCC-GCG
CAACCTATGC
GAACAGGGC3 AGTO TAT GAO T CATGATGAA
AAGGCGGACG
AAATGCCGCC
AAATCACCAA
ACCGTGGAAT
CAAAGCCAAC
GTTGGACCGA
TACATCCCGT
CGGTCCGAAC
CACC-CGAAAA
GTAAAT-AA.
CAGCGCACTG
ACCCTTTCCG
AAGAAGCCGC
CCCCGCATCT
CAGGGCGACA
CCTCTTCGAT
GATGGGCGTG
GACATCGGAC
CGGAAATCGA
TTTGAAAGTC
GGCAAAGAA TCAaAAA:GAC
ATTCCTTCAG
CGAAGGCCAA
AAAGACGGCG
ACAGGGCGAA
ACGAAGGCCG
GGCGGCCCGG
AGGCGTACAG
CCAACCTTGC
GAACAACAGG
TAk AGAAA-
TGAAGACCAC
GGCAAACAGC
CCTGATTGAC
TCACCTT CCC CT? CTGAAAG
CTACCGCGAA
CCC CT TTGG C
GCATCCGAAC
CGGCACCACG
GCTCCCTcA
TTTACCGAAG
CGAGAGCAG
CTAAAGC CGT
GGCGAAGCC:
TGCTTC'CGGC
CGACC AAAGA
GGTACGGTAT
TTTGAGCCAA
AACGCGGCGA
CACGGTGCGG
GAAACTGCT C
CTCAAC.TCGA
CCCGCCCGCC AAGCAGCCGC
This corresponds to the amino acid sequence <SEQ ID 3064; ORF 576-1>:
m576-1.pep
1 MNTIFKISAL TLSAALALSA
CGKKEAAPAS
51 MQQASYAMGV DIGRSLKQLiK
EQGAEIDLKV
101 AQEVMMKFLQ EQQ.AKAVEFH
KADAKANKEK
151 LQYK:TKQGE GKQPTKOD0IV
TVEYEGRLID
201 VIPGWTEGVQ LLKEGGEA'rr Y12SNLAYRE 251 KIGAPENAPA KQPAQVD:KK
VN-
ASEPAA.A sSA FTEAMQ.A'yD
GEAFLKEN;..
GTV3xDS S KAN
QGAGDKIGPN
QGDTSSIGSrn OKE ?KYTEEC, K CGV KTTASG
GGPV:KLSQ-
The following DNA sequence was identified in N. gonorrhoeae <SEQ ID 3065>:
g576-1.seq
ATGAACACCA
ACTTPC CGCC CPG CCGC CC
ATGCACCAGG
ACAAATGAAG
CCATGCAGGC
CCCCAGGAAC
AGAAAAACAC
TCCTGAAGGA
CTGCAGTACA
CGACATCCTT
TCGACAGCAG
GTGATTCCGG
TT ?TCAAAA- TGCGGCAxA
TTCTGCCGCS
CAAGCTATGC
GAACAGGGCG
AG TGCTAT GAC
TGATGATGAA
AAGGCGGATG
AAATGCCGCC
AAATCACCAA
ACCCTGGAAT
CAAAGCCAAC
CAGCGCACTG
AAGAACCCC
CAGGGCGACA
ALATGGGCGTG
CGGAAATCGA
GO CAA GAA-A
ATTCCTGCAG
CGAAGGCCAA
AAAGACGGCG
ACAGGGTGAA
ACGAAGGCCG
GGC cGGCCCGC
AGGCGTACGC
CCAACCTTC
ACCCTTTCG
CC CC GCATC P
CCTCTTCAAT
CACATCGGAC
TTTGAAAGTC
TCAAAATCAC
GAG CAG CACO
CAAAGAAAAA
TCAAGACCAC
GGCAAACAGC
CCTGATTGAC
C CAC CT TCC C
CTTCTGAAAG
CTACCGCGAA.
C CCCT T TGC
GCATCCGAAC
CCC CACCACO G CT C CCTGAA TTTACCrGA:C
CGAAGAGCAG
CTAAAGCCGT
GGCGAAGCCT
TGCTTCCGGT
CCACAAAAGA
GGTACCGTAT
TTT GAG CCAA
AAGGCGGCCA
CAGGCTGCGG
651 ACCCACGTTC TACATCCCGT
701 GCGAAAAAAT CGGTCCGAAC GCCACTTTGG TATTTGACCT
GAAACTGGTC
751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG
ATCAAGTCGA
801 CATCAIApAA
GTAAATTAA
This corresponds to the amino acid sequence <SEQ ID 3066; ORF 576-1.ng>:
g576-1.pep
1 MNTIE'KISAL TLSAALALSA CGKKEMAPAS ASEPAAASAA
QGDTSSIGST
51 MQQASYAMGV DIGRSLKQMK EQGAEIDLKV FTDAM.QAVYD
GKE:E(MTEEQ
101 AQEVMMKFLQ EQQAKAVEKH~ B1A0AKANKEK GEAFLKENA
KDGVRTTASG
151 LQYKITKQGE GKQPTKDDIV TVEYEGRLID GTVFDSSKAN
GGPATFPLSQ
201 VIPGWTEGVR LLKEGGEATF YIPSNLAYRE
QCGKGNATLVFDVKLV
251 KIGAPENAPA KQPDQVD:K(
VN'
g576-1/m576-1 ORFs 576-1 and 576-1.ng showed a 97.8% identity in 272 aa overlap
10 20 30 40 50
g576-1.pep MNIKSALTLSAALALSACGKKEAAPASASEPAIASAAQGDTSITMQYAG
m576-1
MNTIFKISALTSAASCKEAAAEAASQDSISMQSAG
20 30 4050 80 90 101i i1 120 g57 6-1 pep D1GRSLKQMKE<-QGAEDrFEAQVIG- T-lAEIIM,--Q-QAAE~ m-;76-1 0IGRSLKQMEGEDKFEMAYGEK!T-- QV'KL7QFVK 80 90 130 10 120 140 150 160 170 180 g576- pep KAAAKKEF-EAKGKTSLYIKGGQTDITEZaI m576-1 -AAAKKEFKNADVTAGQ
:KGGQTDITEERI
130 140 150 160 170 180 190 200 210 2220 230 240 g576-l.pep GTVFODSSKANGGAT-FP1-SQVIGWTEGVRLLVrGGEATFYPNAPCGGK-P mn576-1 GTVFDSSKANGGPVTFLSQVI PGWTEGVQLLKEGGE'A:7y:
?SNLAYR-QG-AGOKIGPN
190 200 210 2120 231 240 250 260 270 g576-1 .pep AT'LV?-DVKLVK:IGA-PAQPDQVDIK
KNX
m57 6-1 ATLVFDVKLVt(:GAPE-NA2..L:-QOAQVDIK-KJ,< 250 260 270 The following DNA sequence was identified in N. mening-itidis <SEQ ID 3 067>: a57 6-1. seq 1 AIGAACACCA TTTTCAAkT 51 ACTTTCCGCC TGCG3CpAoA
CTGCCGCCGC
ATGCAGCAGG
GCAAATGAAG
CCATGCAGGC
GCTCAGGAAG
AGAAAAACAC
TTCTGAAAGA
CTGCAATACA
CGACATCGTT
TCGACAGCAG
GTGATTCTGG
AGCCACGTTC
TTCTTCCGCG
CAAGCTATGC
GAACAGGGCG
ACTGTATGAC
TCATGATGAA
AAGGCGGACG
AAATGCCGCC
AAATCACCAA
ACCGTGGAAT
CAAAGCCAAC
GTTGGACCGA
TACATCCCGT
CAGCGCACTG
AAGAAGCCGC
CAGGGCGACA
GAT GGGCGTG
CGGAAATCGA
GGCAAAGAAA
ATTCC-TCAG
CGAAGGCCA
AAAGACGGCG
ACAC-7-GAA
ACGAAGGCCG
GGCGGCCCGG
AGGCGTACAG
CCAACCTTGC
ACCCTT-CC-,z
CCCCATC:
C CT 01T GA T
GACATCGGAC
TTTGAAAGTC
TCAAAATGoC
GAACAACACG
TAAAGAA~Ak,
TGAAGACCAC
GGCAAACAGC
CCTGATTGAC
TCACCTTCCC
CTTCTGAAA
CTACCGCGAA
CCGCTTTGGC
GCATCCGAAC
CCC CACCACG
GCTCCCT.AA
TTTACCGAAG
CGAACACCAG
CTAAAGCCG'
GCCAAGCCT
TTCCGCc
CGACCAAAGA
GGTACGGTA.
TTTGAGCCA
AAGCGCCGA
CAGGGTGCGG
701 GCGACAAAAT CGGCCCGMAC GCCACTTTGG TATTTGATGT
GAAACTGGTC
751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAL3CAGCCGG
CTCAAGTCGA
801 CATCAAAAAA GTAAxTA
This corresponds to the amino acid sequence <SEQ ID 3068; ORF 576-1.a>:
a576-1.pep
1 MNTIFKISAL TLSAALALSA CGKP(EMPAS ASEPAAASSA
QGOT.S:IGST
51 MQQASYAM4GV DIGRSLKQMK EQGAEIDLKV FTEA.MQAVYD
GKEIKMTEEQ
101 AQEVMMKFLQ EQQAKAVEKH KADAKANKEK GEAFLKENAA
KDGVKTTASG
151 LQYKITKQGE GKQPTKDDIV TVSEGRLID GTVFDSSKAN
GGPVTFPLSQ
201 VILGWTEGVQ LLKEGGEATF YIPSNLAYRE QGAGDKIGPN
ATLVFDVKLV
251l KIGAPENAPA KQPAQVOIKK
VN*
a576-1/m576-1 ORFs 576-1 and 576-1.a showed a 99.6% identity in 272 aa overlap
10 20 30 40 50 60
a576-1.pep MNTIFKISALTLSALALSACGKKEAPASASEP~zASSAQGDT
ISMQSAG
m576-1
MTFIATSAASCKEAAAEAASQDSISMQSAG
80 90 100 11-2 aS! 6-1 .pep DIGRSLKQM4KEQGAEIDLK 'EAQVY Q-,m-E'AQEVM1-IFLQEQ
QAAVEKH
m57!6-1 0IGRSLKQMKEQGAEDKT'EMAYD 7PME-
QEMKTEQKVK
80 90 100) 110 120 130 140 150 160 170 180 aS 76-1. pep 'KDKNEGALEAIDVTAG,)i-KGGQTDm576-1 KAAAKKEFK __DVTAGQYIKGGQTDT-,-E~- 7 130 15O 160 170 180 190 200 210 220 230 240 aS! 6-1. pep GTVFSSNGGPVTFP'LSQVITGWTEGVQLLKGGEAT'
YTPNARQADIP
rn576-1 GTVFDSSKANGGVTF'PLSQVI PGWTEGVQ LLK7-GGEATFYI PSNLAY' REQ GD0K3GPN 190 200 210 220 230 240 250 260 270 a57 6-1. pep ATVDKV GPNPKPQD~vN m57!6-1
ATLVJFDVRLVKIGAPENAPAKQAQVDKKVN
250 260 270
919 gnm43.seq
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3069>:
m919.seq
1 ATGAAAAAAT ACCTATTCCG CGCCGCCCTG TACGGCATCG
CCGCCGCCAT
51 CCTCCCCGCC TGCCAAACCA AGAGCATCCA AACCTTTCCG
CAACCCGACA
101 CATCCOTCAT CAACGGCCCG GACCGGCCGG TCGGCATCCC
CGACCCCGCC
151 GGAACGACGG TCGGCGGCGG CGGGGCCGTC TATACCGTTG
TACCGCACCT
201 GTCCCTGCCC CACTGGGCGG CGCAGGATTT CGCCAAAAGC
CTOCAATCC"'
251 TCCGCCTCGG CTGCGCCAAT TTGAAAAACC GCCAAGGCTG
GCAGGATGTG
301 TGCGCCCAAG CCTTTCAAAC CCCCGTCCAT TCC"-TTCAGG
CAAAACAGTT
351 TTTTGAACGC TATTTCACGC CGTGGCAGT TGCAGGCAAC
GGAAGCCTTG
401 CGTACnn'r i-iIA(ACCGTCCAG3 QCGC TACGAACCGG TGCTGAAGGG CGACCACAGG 451 501 551 601 651 701 751 801 851 901 951 1001 1051 1101 1151 1201 1251 CGGACGGCAC AAGCCCGCTT CCCGATT1Tr
CTCCGTCCCC
TCAGGCAGAC
CATACCG;CCC
CAAAGGCAGG
AAATCAACGG
GAAGACCCTG
GAAAACCCcG
AACATCCYTA
AAACTCGGAC
TCCGCAACGC
TCCGCGAGCT
ACGCCGCTGA
CTTGGGTGCG
CCCTCAACCG
GCGGTGCGCG
-CTGCCTCCC
*GGGAAAAAA(
*ACCTCTCCcC TTTGAAGGA-2
CGGCGCGCT!
TCGAACTTT7
TCCGGCAT
CGTTTCCATC
AAACCTCCAT
CTCGCCGAAG
TGCCGGAAGC
TGGGGGAATA
CCCTTATTTG
CCTGATTIATG
TGGATT'. 2TI 3GTTTGCGGAC AGCGGCACp.A
ATTCCCCATC
GCCGCTTCC--T
GACGGCAA
TTTTATGCAC
ACATCCGCAT
GGACG-CTATA
GCAGGGCATT
TTTTGGCGTCA
AGCAATGACG
TGCCGGCGCA
TCGCCACCGC
GCGCAGGATA
~TTGGGATAC
CGGGATATGT
CCGTAA
7 GTATTCCCG
ACCATTTTAT
CGGAAAAGC-
CTTGTCCGCA
TCGACAATAC
CGGCGGCACA
ACCGCGCCCA
CAACAGCAT
CCCCTACCAC ACGCGCACC CCCCGATACT
CGGTTACGCC
ATCCAAGGCT
CGGGCCGTCT
CGGCTATGCC
GACAAAAACG
TGGCGG3AA
GGGCTACCTC
AAC-TCTTATA
TGCGGCAAAA
kl)ACCCCAGC
TATATCTTT
GCCCTGTCGG
CGCACTGGGC
GTCGACCGGC
ACTACATTAC
CCATCCGGTT
ACCCGCAAAG
CCGGCAGCGC
GATTAAAGGC
GGCGACGAAG
CCGGCGAACT
CTGGCAGCTC CTACCcAACG
1301 GTATGAAGCC CGAATACCGc
This corresponds to the amino acid sequence <SEQ ID 3070; ORF 919>:
m919.pep
MKKYLFRA
YGIA.AAI
GTTVGGGGAV
YIVVPHLSLP
CAQAFQTPVN
SFQAKQF'FER
RTAQARFPIY GIPDD71SVP HTADLSRFP:
TARTTAIKGR
EDPVELFFMH
IQGSGRLKTP
KLGQTSMQGI
KSYMRQNPQR
TPLMGEYAGA
VDR.HYITLGA
AVRV YFWGY GDEAGETLAGK
CQSKSIQTFP
h-'WAAQDFAIKS
YFTPWQVAGN
LPAGLRSGKA
FEGSRFLPYH.
SGKYIRIGYA
I-AEVLGONPS
PLFVATJPV
QPDTSVINGP
DRPVGIPDPA
LQSFRLGCA-N
LKNRQGWQDV
GSLAGT'GY
YEPVLKGDCR
!VRIRQTGKN SC2'IDNTGGT TRNQ:NGGpj DG-KAP ILGYA D1CJEH~vVS:
GRYMADKGYL
YIFFRELAGS SNDGPvGALG TRKALNRLIM
AQDTGSAIKG
The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3071>:
g919.seq
1 ATGAAAAAAC ACCTGCTCCm Crrr 51 CctcgCCGCC TGCCAAAqca OGAG2CATr-rA 101 201 251 301 351 401 451 501 551 601 651 701 7 51 801 851 901 951 1001 1051 1101 1151 1201 1251 1301
CATCCOTCA]
GGAACGACGC
CTCCATGCCC
TCCGCCTCC
TGCGCCCAAG
TTOTGAACGC
Caggtacg
CGGACGGAAC
CTCCGTCCCG
TCAGGCAGac
CATACCGCCG
caaaGGCAG AAAtcaacGG GAagaccccG GAAAACCCcg AACAt CcgTa AAGctcgggc
TCCGCAACGC
TCCGCGAGCT
ACGCCAC:GA
CTTGGGCC
CCCTCAACCG
GCGGTGCGCG
TGCCGGCAA
GCATGAAGCC
CAACGGCCCG
TTGCCGGCGG
CACTGGGCG
CTGCCCAAT
CCTTTCAAA
TATTTCACGC
TACCGGCTAT
GGGCCCGCTT
CTGCCTGCCG
GACCGGCCGG
CGGGGCCGTC
CGCaggATTT TTGAAAAACc
CCCCGTGCAT
CgtGGCaggt
TACGAACCGG
CCCGATTTAC
OTTTGCGGGG
ACCTCTCCCG
AT-CCCCATC
TTTGAaggAA GCCGCTTCC"' TACG3(catCG CCGCCgccAT AACCTTTCCG
CAACCCGACA
CCGGCA=CC CGACCCCGCc TA'rACCOTTG T3CCGcACCT TOCCAAAAGC
CTGCAATCCT
GCCAAGOC"'G
GCAGGATGTG
TCCTTTCAGG CAAAGc~gTT tgCagqcaAC GGAAGcCTrG TGCTGAAGGG
CGACC-CCAGG
GG7AT7C'C
ACGATTTTAT
CGGAAA C CTTGTCCGCA TCGACAATGC
CGGCGGCACG
ACCGCOCGCA CAACCGcaa- CCC7TACCAC
ACGCGCAACC
CCCCCATCCT CqgttaCgcC Atccaaggcr
CGGCCCCCT
CO-aaTacgcc gacAAAAACG TGcCGGACAA
AGGCTACCTC
aaaocCTATA
TGCGGCAAAA
AAACCCCAGC
TATATCTTTT
GCCCCGTCGG
CGCACTGGGC
ATCGACCGGC
ACTACATTAC
CCATrCCGGTT
ACCCGCAAAG
CAGGCAGCGC
GATCAAAGGC
GGCGACGAAG
CCGGCGAACT
:TGGCAGC-C CTGCCCAAcG CGGCgcgcTT tcgaacttTT tccggcaaat tgtttccatc agACCTCGAT
CTCGCCOAAG
TGCCGGAAGc
TGGGGGAATA
CCCTTATTG
CCTGATTATG
TGGATTATTT
CAGAAA-'
CCA
CGAATACCGc GACGGCAAag
TTTCATGCAC
acatCCOCAt ggAC3ctaTA GCAGC~gcatc TTTrGGGTCA
GGCAATGAGG
CGCCGGCGCA
TCGCCACCGC
GCGCAGGATA
TTGGGGT-"AC
CGGGATACGT
CCGTGA
This corresponds to the amino acid sequence <SEQ ID 3072; ORF 919.ng>:
g919.pep
1 MKIKLLRSAL YGIAAAILALA CQSRS!QTvP QPDTSVINGP
DRPAGIPDPA
51 GTTVAGGGAV Y"VVPMLSMP HWAACDFAI(S LQSFRLGCMJ
LICNRQGWQDV
101 CAQAFQTPVHq SFQAKRFFER YFTPWQVAGN GSLAGT1VTGY
YEPVLKGDGR
151 RTERARFPIY GIPDDFISVP LPAGLRGK LVRIRQTGM
SGTIDJAGGT
201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGA.:
IGKAPILGYA
251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI
GRYMADKGYL
301 KLGQTSMQGI KAYMRQNPQR LAEVLGONPS YIFFRELAGS
GNEGPVGALG
351 TPLMGEYAGA IDRHYITLGA PLF'VATAH.PV TRKALNRLIM
AQDTGSAIKG
401 AVRVDYFWGY GDEAGELAGK QKT'rGYVWQL LPNGMKPEYR
P*
ORF 919 shows 95.9% identity over a 441 aa overlap with a predicted ORF (ORF 919.ng) from N. gonorrhoeae:
m919/g919
10 20 30 40 50
m919.pep MKYFRAALYGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGc:PDPA;TTVGGA
g919 MKKHLLRSkLYGI AAI LAACQSRS IQTFPQ PDTSVINGPDRPAGI
PDPAG-TVAGGA
20 30 40 50 80 90 100 110 120 m919.pep
YVPHLSLPHWAAQDFAKSLQSFRLGCANT-KJRQGWQDVCAQFTVSAKFE
9919 YVPHLSMPHWAQDFAKSLQSFRLGCNL RQGW
CQFT~[FQAKRFFER
80 90 100 110 120 YT 130 140 150 160 170 180 m919 .pep Y7PWQVAGNGSLGTTGYYEPVI1KGDRRTAQRPIYGPDIVLALSK F' I 11F F1111 F1 111FF.!H 111 1 IIFF F 11111i F:H g919 YFTPWQVAGGSLAGVTGYYEPVLKGGRRTEAFP-Y: PDO.F-SV0LPAGLRGGKN 130 140 150 160 170 180 LV -90 200 210 20 230 240 m919 .pep
LVIQGNGING-H.DSFIATATGFGRLYTNIGA
F~ !1111111FFl!W 'il F :I F 1i F F i F r g919 LVRIRQTGKNSGTIDNAGGTHTADLSRFPI TAR'TAI G~G~rHRNIGA 190 200 210 220 230 240 DGA 2S0 260 270 280 290 300 m9 19. pep DKPILGYAEDPVELFF{IQGSGRLKTPSGY':.GYA-:;H-SGxAKY I I I 111 11 1 111 1111i II I! Ii I FrIi I F 1F FF111I II i Ii11 i I F 250 260 270 280 290 300 310 320 330 340 350 360 m9 19. pep KLGQTSMQGI KSYMRQNPQRLAEVLGQNPSY:
FFRELAGSSNDGPVGALGTPLMGEYAGA
g9 19 KLGQTSMQGI KAYMRQNPQRLAEVLGQNPSYI
FFRELAGSGNEGPVGCLGTPLMGEYAGA
310 320 330 340 350 360 370 380 390 400 4:0 420 m919 .pep
VD-YTGPFAAPTKLRIAQTSIGVIDFGGEGLG
1 1 1 1 1 1 1111 1 1l Fr F 11 111 11 iF IFF i i I i l 11F I FFIII 11 9919 IDRYTGPFAAPTR RIADGAvAVVYrGGEGLG 370 380 390 400 410 420 86 430 440 mO 19.pep
QKTTGYVWQLLPNGMKPEYRPX
g919
QKTTGYVWQLLPNGMKPEYRPX
430 440
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3073>:
a919.seq
1 51 101 151 201 251 301 351 401 451 501 551 1 651 "~51 801 ?51 901 951 1051 121 1301
ATGAAAAAAT
CCTCGCCCC
CATCCGTCAT
GGAACGACGG
GTCCCTGCCC
TCCGCCTCGG
TGCGCCCAAG
TTTTGAACGC
CCGGTACGGT
CGGACGGCAC
CTCCGT CCCC
TCAGGCAGAC
CATACCGCCG
CAAAGGCAGG
AALATCAACGG
GAAGACCCCG
C-AAAACCCCG
AACATCCCTA
AAGCTCGGGC
CCCGCAACGC
TCCGAGAGCT
ACGCCGCTGA
CTTGGCCG
CCC-TCAACCG
GCGGTGCGCG
TGCCGGCAAA
GTATGAAGCC
*ACCTAT'rCCG
CGCCGCCCTG
-TGC :AAAGCA AGAGCATCCA *CAACGGCCCG
GACCGGCCGG
TCGGCGGCCG
CGGGGCCGTT
CACTGGGCGG
CGCAGGATTT
CTGCGCCAT TTGAAAAACC CCTTTCAC
CCCCGTCCAT
TATTTCACGC
CGTGGCAGGT
TACCGCCTAT
TACGAGCCGG
AAGCCCGCTT
CCCGATTTAC
CTGCCTGCCG
GTTTGCGGAG
GGGAAAAAAC
AGCGGCACA
ACCTCTCCCA
ATTCCCCATC
TTTGAAGGAA
GCCGCTTCCT
CGGCCCGCTT
GACGGCAA
TCGAAC'TTTT
TTTTATGCAC
TCCGGCAAAT
ACATCCGCAT
CGTTTCCATC
GGACGCTATA
AGACCTCGAT
GCAGGGCATC
CTCGCCGA.AG
TTTTGGGGCA
TACCGGAAGC
AGCAATGACG,
TGGGCGAGTA
CGCCGGCGCA
CCCTTAT'T
TCGCCACCGC
CCTGA-TATG
GCGCAGGATA
TGGATTATTT
TTGGGGATAC
CAGAAAACCA
CGGGATATGT
CGA.ATACCGC CCGTAA 73CGGCATCG
AACCTTTCCS
T CGG CAT CC C
TATACCGTTG
CGCCAAAAGC
GCCAAGCCTG
TCCGTTCAGG
TGCAGGCAAC
TO CT GAAGGG
GGTATTCCCG
CGGAAA.A.CC
TCGACAATAC
ACTO COCG'C A
CCCC:ACCAC
-CCC;ATAO-
ATCCAAGC-C
CGGCTAC-
TGCGCGAC
.kAAGCCTATA
AAACCCCAC
GCCCTGTCCGG
GTCGACC0GGC 0 CAT C-C 0T
CCGGCAGCC
GGCGACC.AAG
CTGGCAGCTT
C C C CGC CAT
CAACCCGACA
C GAO C COGC C
TGCCGCACCT
CTGCAATCCT
GCAGGATGTG
CAAAACAGTT
GGAAGCCT-G
CGACOACAGG
ACGATTT TAT
CTTGTCCGCA
CGCOCACA
CAACGGCAAT
ACGCGCAACC
COOT TAC C C CGG C CGT CT
GACZAAACG
A.GGC'7ACCTC
TGCAGCAA.AA.
rATATCTTTT
CGCACTGOOC
ACTACATTAC
A.CCCGC.kAAG
GATTAAAGGC
'TGCCCAACG
This corresponds to the amino acid sequence <SEQ ID 3074; ORF 919.a>:
a919.pep
MKF YLFRAAL CGIAAAILAA &T TVOGGOAV YTVV.LS CAQAFQTPVH
SVQAFQFFER
RTAQARFPIY
CIPODFISVP
HTAOLSQFPI
TARTTA:KGR
EDPVELFFMH 1QOSGRLKT2 KLGQTSMQGI
KAY>IQQNPQR
T?LIGEYAGA
VDRHYITLGA
AVRVDYFrWGY GDEAGELAGK
CQSKSIQTFP
EWAAQDFAKS
YF-PWQVAGN
LPAGLRSGKA
FEGSRFLPYH
SGKYIRIGYA
LAEVLGQNPS
PLrFVATAHPV QKTTGYVWQ1, 7 QSZFR'LGCAN GS LACTV-TY LV?: RQTGFKNl TRNQINcoGA
YIFFI'FELTGS
DRPVGIPDA
LKN:QGWQD~
yC.PVLKGDDR
SGTIDN-GGT
:GK;PILOGYA
*xYAOKGYL
SNDGPVGALS
ACOTGSAIKG
m919/a919 ORFs 919 and 919.a showed a 98.6% identity in 441 aa overlap
10 20 30 40 50
m919.pep
MKKYLRPALYIAAALACSSQTPQPTSVNGPO??VG:PPGTGGA
a919 KKYLFRALCGIAAAJ-LACQSK. IQTFPQPDTSVTNGPDRPVGI
PDPAO'-TTVGGOOAV
20 30 40 5o 70 8C 90 10C 110 120 m9 19. pep, YTVV?HLSLPHWQ0A!SL ?RLGCAN ,LKNRG GWQD)iCAQAE TPViSO AKQF'ERP a919 YTVVPHLS LPHWAAQDFAKC LQS FRLGCANLKNRQGWQ0DJCAQA7QTPVHSVQAKQFFER 80 90 10C 110 120 87 130 140 150 160 170 180 rn919.Pep YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRT~AQARFOIYGI
PDDFISVLARSK
a919 YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQ
FPYGPDDFISVPLPAGLRSGKA
130 140 150 160 170 180 190 200 210 220 230 240 m919 .pep LVRIRQTGKNSGTIDNTGGTHTADLSRFPITARTTAIi
GRFEGSRFLPYTNIGA
a919
LRIRQTGKNSGTIDNTGGTHTADLSQFPITARTTAIKGR.EGSFLYTNNGA
190 200 210 220 230 240 250 260 270 280 290 300 m91 9. pep DGKAPILGYAEDPVELF1AHIQGSCRLKTPSGKYIRIGYADKNEHPYVSIGYAKY a919
DGKAPILGYAEDPVELF--E:QGSGRLKTPSGKYIRIGYADNHYSGYAKY
250 270 280 290 300 310 320 330 340 350 360 m91 9. pep KLGQTSMQGIK<SYMRQNPQRLAEVLGQNPSYIFFRELAGSSNDGPVGLTLGYG a919
KLGQTSMGIKAYM.QQNPQRLAEVLGQNPSYIFFRELTGSSNDPGLTLGYG
310 320 330 340 350 360 370 380 390 400 410 420 m9 19. pep VDH-TGPFAAHVRANLMADCAXARD GG7A7!G a919 VRYTGPtVT~ ~KLRIAOGAKARD
YDA-LG
370 380 390 400 410 420 430 440 n9 19. pep QKTTGYVWQLLPNGMW.PEYRPX a919
QKTTGYVWQLLPNGMKPEYRPX
430 440
121 and 121-1
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3075>:
m121.seq
21 251 201 351 301 51 401 451 751 601 851 901 951 1001
ATGGAAACAC
GGCGGATG~C
AAGGGCACC
GATTTGCAGG
GCAAGAACTC
GTCAAAACCT
ACCTCCGAC
GCCGCTGCTG
X XXXXXXXX X XX xxXX XXXX XXX XXXXX Xx
XXXXXXXXXX
xxxxxxCAGC
CATATTGCCG
AACGCCACCC
GAAACCTACC
TTCCCGT'TT
CAGATGCCCG
TTAATGGCGG
CACCGCCGAC
CGTGGTTGGC
AGCTTTACA-
GTACTGATAC
CTTTACCCC
ACACAGCCC
AGCCGCC-AT
CGCACCGT("r ACGCGC COGA O CGxx xxx xx X XXXX XXX Xx XX XXXXXXXX XXX XXXXX XX XX XXXXXX X
TTCCTTACGA
CAACTGCTCG
TAkAAAGCACG
TTGACGGCGG
ACCGCGCAAA
TCAAATGTAC
ATTTGGCAGA
CTGAACCTCG
GGCGTGTTGG
CGGCATCATG
GGATGGACGG
TACCCCGGCA
AGAC3AACTG
ATGCGCAAAC
GACATTACCG
ACACGGTTAC
XXXXXXXxxxx XX XXXXXXXX
XXXXXXXXXX
XXXXXXXXXX
xxxxxxxx XX
CAAAAACGGT
ACAGGCTGCT
GGGCGCGAAC
'-GAAAkACCGA
CCGTTTGCGA
ATTTGCGACG
ATGTTTCGGC
ATCCGCAATG
ATTAATCGCA
TCGGGAA.CCA
CGGCAXAATGG.
GGTTACG CCI
CACC-GCAGCA
CGCCGCC-xA.
CCC-TCGGCTG
AC CATACAG C XXX XXXXXXX XXX XXXXXXX
GCAAAGTCGG
CGCCCACCCG
TGTTTGC CAT TACGACO TAT
CGCCGTCTC.
GCGGCATCCG
ACACGCGT:T
GGTGGAAGCC
TTCCCSGTAG
GCATOGAGG
::TGGGCGC=;
CCAA::GCTC
GGA-TTTGTC
C CA(-CGCGCA,;.
TTGC-CGATTT
X XXX XX XXXX XXXX XXxxxx
XXXXXXXXXX
XXXX XXXXXX x:'xx XXXXXx
CACAAGGC,
T AT T TCGCCAC AAATTGGC TC TO CGGAG G C
CACOCAGCGG
CAATCCTGTT
CC CT 0 ACAG GCCGriATT'ro T CCOCACAAA 88 1051 GCAACCGGCG CATCCAAA.oo GTGTATTCTG AnCGCGGGAT
ATTATTATTC
1101 A
This corresponds to the amino acid sequence <SEQ ID 3076; ORF 121>:
m121.pep
METQLYIGIM
DLCDTGADEL
TVRHAPEHGY
XXXXX XX XXX xxQLPYDKNG
ETYLDGGENR
LMADLAECFG
ATGASKPCIL
SGTSMDGAOA
HRSR:ZLSQEL
SIQLADLPLL
xxxxxxxxxx
AKSAQGNILP
YDVLRTLSRF
TR VS L S TA
XAGYYY*
VLIRr4OGGKW
SRLYAQTAAE
Axxxxxxxxx XX XX XXXXX X
QLLDPRLLAHP
TAQTVCDAVs
LNLDPQWVEA
LO-AEGHAFT?
L LOS QNLA PS XXX XXXXX XX
XXXXXXXXXX
YFAQRHPKST
i-iAADARQMY
XFAWLAACW
Y PGRLRRQLL D ITALGCHGQ
GRELFAINWL
IC DGG IRN PV
INRIPGSPH-K
The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3077>:
g121.seq
1 51 101 151 201 251 301 3 51 401 501 551 601 651 701 801 851 901 951 1001 105 1 1101
ATGGAAACAC
GGCGGATGCC
AAGGGCACGC
GATTTGCAGG
GCAAGAACTC
GTCAAAACCT
ACCT C CAC
GCCGCTGCTG
GCCGCACCT
CACGAAGCOC
CCGCGCGATT
GCTTCCACAC
cacTGGca gc catatTGCog ?ACOcaccc ttcccgattc
CACATGCCCG
TTAATGGCGG
CACOGCOGAA
cgtggttggo GO GAOCCGG C
A
AGCTTTACAT
GTGCTGGTAC
OTT TACOCCCC
ACAOACGCAC
ACCCGCCTGT
OGOT CTCTC
ACCCGGA
GGCAAOTGa TGCTCCC C
TCTTCCGCCA
GCCAAOATCA
AGCGCCOGGC
TCCTTACGA
CAACTGCTCG
aaAAAGCACC ttaacggcgg acC9CgCaaA
TOAAATGTAC
ATTTGCAGA
CT CAAOCT CC GOCT CTTCC CATCOA. CC
CCCCATTATG
GCATGGACGG
TACCCTGAoC
AGACGAACTG
ACGCGCAAAC
GACATTACCG
ACACGGTtac cgcggatttT G~acaAGGTC
TGACAGCGAA
GCTACTOC
AATATGCTGA
CAAAAacggt g c aggct GOT G~gcGCGaa c cgaaaaccga ccgTttggga
AITTCCGGOG
ATGTTTCGCo
ATCOTCAATG
ATTAACCCCA
GTCTATTCTG
TCGCGAACOA
CTATGGACGG
CGGOAAATGC CTGGCCCGG GCTTCCCG
CAAATTGCTG
CACCGOACCA CCATCTTGTC OGCCCCCGAA
CTCTGGA
CCO:OGGOTG
COACGCGCAA
AGOATACAC
TTCOCATTT
TACT~ggc ga:ttcCOA OSCCGCTOCT CCCCCCTTT ACACGCTGG
TAOGAA
COOOGCCCA
CCCTTOG
TGGAcgcgtg gacgcaggca gcAAAGgcgg cacAAGGCAA O"CCOcaccOC TATTTOTCAC TgtttgcccT AAattggctc tacqacgtat tqcggacgct CgtCtCa OACOAGCCC
GCCCATCCGCOAATCCTT
ACAOGCGTTT CCOTCCAO:AG GC-GCAGC qccqCATTtg TTCOCCTAG TOCOGACP-kA GCCCGGLT ATTATTATTG
This corresponds to the amino acid sequence <SEQ ID 3078; ORF 121.ng>:
g121.pep
METQLYICIM
DLQDTGTDEL
TVRHAPEHGY
HEAL FRDDRE HWQL PYOKNG
ETYLDGGENR
LMADLAECFG
ATGASKPCIL-
SCT SM DGADA IiRSRMLSQE L SIQLAD7PLL
TRVVLNIGGI
AKAAQGN ILP Y OVLRTLSRPF TRVS LHSTAE SAG YYY*
VLVRMGKW
SRLYAQTAAE
AELTRI FTVG
ANISVLPPCA
QLLGRLLAJ{?
TAQTVWDAVS
LNLDPQWVEA
LLOSQNLAPC
DrP-SRDLAAC
?AFGFDTGPG
YFSQPHPKS:
?.AJAAARQMy
AALAACW
YPDRLRRKLL
DITALGCHCQ
GCGAPLVFAF
.4,mLMDAWTQA C RE LFALNw l
-OGGIRNPV
:NR PGS P iK
ORF 121 shows 73.5% identity over a 366 aa overlap with a predicted ORF (ORF 121.ng) from N. gonorrhoeae:
m121/g121
10C 20 3C 40 50
m121.pep METQLY I GI M3TSMOCADAVL1 RM0GGKWLGAEGHAFT PYPCPLaRQLLDLQDTGADEL
g121 METQLY IG IMSGT SMOGADAVLVRMDGGKWLGAEGHAFT PYP DRLRRKLLDLQDTGTDEL
20 30 40 513 80 90 100 110 120
m12 pep HRSRI LSQELS R1YAQTELLSQV LPS CITAT G CHGQTVRHA PEHGYS IQLAD L
g121
HRSRMLSQELSRLYAQTALLCSQNLAPCDITALGCHCQVHPHYSQALL
80 90 100 110 120 130 140 150 160 170 180 rnl2l pep AXXXXXXXXXXXXxXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"XXXXX, g121 AELTRIFTVG0FRSR2LAAGGQGAPLVPAH-A:FRDDRTVLIGAIVP 190 200 210 220 230 240 m121.pep XXXXXXXXXXXXXXXXXXXXXXQLPYDKNGAKSAQGNILP QLLDRL PYA HKS g121 PAFGFDTGPGNMLMDAWTQAHWQLPYDKNGAKAQGILPLG LA 'FSHKT 190 200 210 220 230 240 250 260 270 280 290 300 m121 .pep GRELFAINWLETY:DGGENRYDVLRTLSRi-AQTVCOAVSHAADRMIDGRP g121 GRLANLTLGERDLTSE-'QVDVHADRMIGGaP 250 260 270 280 290 300 310 320 330 340 350 360 m121 .pep LADLAZ-CFGTRVSLHSTADLLDPQWVE~iAWLACWINR PSHAG
FCI
g121
LMDACGRSHTENDQVAAALAWNTPSHAGSPI
310 320 330 340 350 360 m121.pep
XAGYYYX
I111FF q121
GAGYYYX
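By way of illustration only, a percent identity figure of the kind quoted for these alignments can be read as the number of aligned positions carrying identical residues divided by the length of the overlap. The following is a minimal sketch under that assumption. The function name `percent_identity` and the short toy strings are hypothetical and form no part of the sequence listing, and the alignment program actually used may treat gaps or overlap length differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumption: both inputs are already aligned and padded with '-' for gaps,
    so they have equal length; a column containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 30-residue toy alignment differing at a single position.
toy_a = "METQLYIGIMSGTSMDGADAVLIRMDGGKW"
toy_b = "METQLYIGIMSGTSMDGADAVLVRMDGGKW"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity over {len(toy_a)} aa")
```

Dividing by the full overlap length, gaps included, is one common convention; dividing by the shorter ungapped sequence would give a slightly different figure.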
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3079>:
a121.seq 1 51 101 151 201 251 301 351 401 451 501 551 601 651 701 751 801 851 901 951 1001 1051 1101
ATGGAAACAC
GGCGGATGCC
AAGGGCACGC
GATTTGCAGG
GCAAGAACTC
GTCAAAAcCT
ACCGTCAGAC
GCCGCTGCTG
GCCGCGACCT
CACGAAGCCC
CGGCGGGA-T
(CTTCGACAC
CACTGGCAGC
CATATTGCCG
AACCCCACCC
GAAACCTACC
TTCCCGATTC
CAGATGCCCG
TAATGGCGG
CACCGCCGAA
CATGGATGGC
GCAACCGG
AGCTTTACAT
GTACTGATAC
CTTTACCCCC
ACACAGGCGC
AGCCGCCTGT
CGCGCCGTC
ACGCGCCGGA
GCGGAACGGA
TGCGGCCGGC
TGTTCCGCGA
GCCAACATCA
AGGACCGGGC
TTCCTTACGA
CAACTGCTCG
TAAAAGCACG
TTGACGGCGG
ACCGCGCAAA
TCAAIATGTAC
ATTTGGCAGA
CTGAACCTCG
GGCGTGTTGG
CATCCAAACC
CGGCATCATG
GGATGGACGG
TACCCCGGCA
GGACGAACTG
ACGCGCAAA.C
GACATTACCG
ACACAGTTAC
CTCAGATT
GGACAA.GGCG
CGACAGGGAAp GCG-ACr CC C
AATATGCTGA
CAAAAACGGT
ACAGG CT OCT
GGGCGCGA.AC
CGAAAACCGA
CCGTTTTCGA
ATTTGCGGCG
ATGTTTCGGC
ATCCGCAArc G- CAACCGCA
GTGTATTCTG
TCGAACCA
CGGCAAATGG
GGT TACGCCC CAnCCGCAGCA CCCS CCGAA
CCC-TCCCGCTG
AGCGTACAC
TACCSTCGC
CGCCGCTCCT
ACACGCCCGG
CCC OCACGCA TGOCACGC C T
CCAAAGCGC
CGCCCACCCC
TCT TTGCCC CT
TACGACGTA:
CGCCTCTCA
C CCAT CCC
ACACCCGTT
CGTAGAAGCc
TTCCCC-CTAG
GCGCGGGAT
GCATCCACGG
CT 000CCG CAAATTG CT C C CATGTT CT C
CTGCTGTCCA
CCACGCGC.k
TTGCCGATTT
GACTTCCGCA.
CCCCGCCTTT
TACTGAACA'
CCCCC-CTTC
GATGCACGCA
CACAAGGCAA
TATTTCGCAC
AA.'TTCGGCT C
TGCCCACC
CACGCAC CC CAA7CCTG'T
COCTGCACAG
GCC CG-T C
TCCGCACAAA
ATT ATTCATT
C
This corresponds to the amino acid sequence <SEQ ID 3080; ORF 121.a>:
a121.pep 1 METQLYIG:I. SCTSMOCADA VLIRMDCCKW LCAEi-iAFTP YPGR7 RRKLL 51 DLQDTCADEL HRSRMLSQEL SRLYAQTAAE LLCSQNLAPS
DITALCCHCQ
101 TVRH-APEHSY SVQLADLPLL AERTQIFTVG DFRSRDLAAG
CQCAPLVPAF
151 HEALFRDDRE TRAVLNICCL At4:S'LPPDA PAFCFCTCC NMLMDAWrMQA 201 HWQLPYDKNG AKAAQGNILP QLLCRLLAH? YFAQPHPKST
GRELFALNWL
251 ETYLDGGENR YDVLRTLSRE' TAQTVFDAVS HAAADARQMY
ICCCCIRNPV
301 LMADLAECFG TRVSLHSTAE LNLDPQWVEA AAFAWMAkACW VNRIPGSPHK 351 ATGASKPCIL GAGYYY*
m121/a121 ORFs 121 and 121.a showed a 74.0% identity in 366 aa overlap
20 30 40 50 m121.pep MEQYGMGSDAALRDGWG7GATYGLRLDQTAE a121 MEQYGMGSDA-VIMGKLA 'HFPPRRK.
)QTAE
20 30 40 50 80 9 C 100 110 120 m121 .pep HRSRI LSQEL .LYAQTALLCSQNLAPS D TALGCGQTVRHAEHGYS
QLAOLPLL
a121 HRSRMLSQELS RLYAQTALLCSQNLAS
ITALGCHGQTVRHAPEHSYSVQLADLL
80 90 100 110 120 130 140 150 160 170 180 ml21 pep AXXXXXXXXXXXXXXXXXXXXXXX~~Xxx~X a121 AERTQI TVG DFRSRDLAAGGQGAPLVPAFHEAL FRDDRETRAVLN 7 G IANISVLPPDA 130 140 150 160 170 180 190 200 210 220 230 240 m121.pep XXXXXXXXXXXXLYKGKA N PLDLA YFaRPS a 121 PAFG FZCTG PGNMLMDAWMQA2WQLPY0GAAQG I LPQLLDRLLA- Y FA0-TFST 190 200 210 2120 230 240 250 260 270 280 290 300 m121 .pep GRELFA:NWLETYLDGGENRYDVLRTLSRFTAQTVCDAVSADARQMYT CGG7RN PV 1111111111111 IIlii ii I I [1111MIII1IIr
MIMI
a'21 GRLANLTLGERDTRLRTATFASAkDRMIGGRP 250 260 270 280 290 300 310 320 330 340 350 360 m121 .pep 7_ALECGRSISA)-LPW~k
KALAWNIGPK'GSPI
a121 LMADLAECFGTRVSLHSTALLDPQW VEAAFAWCW VNRPfSPZ,ASKCI 312 320 330 340 350 360 m121.pep
XAGYYYX
Mlil a121
:OAGYYYX
Further work revealed the DNA sequence identified in N. meningitidis <SEQ ID 3081>:
m121-1.seq
1 ATGGAAACAC AGCTTTACAT CGCATCATG TOGSGAACCA GCATGGACGG
51 GGCGGATGCC GTACTGATAC GGATGGACGG =CGAAJTG CT'3GCGoCGG
101 AAGGGCACGC CT2'TACCCCC TACCCCCA GGTTACCCO CCAA-TGCTG
151 GATTTGCAGG ACACAGGCGC AGACGAACTG CACCCCASCA c3GATTTTGTC
201 GCAAGAACTC AGCCGCCTAT ATGCGCAAAC CGC--CCGA.. CTGCTGTGCA
251 GTCAAACT CG-ACCGTCC GACATTACOG CCCTCGGCTG CCACGGGCAA
301 ACCGTCCGAC ACGCGCCGGA ACACGGTTAC AGCATACAC TTGCCGATTT
351 GCCGCTGCTG GCGGAACGGA CGCGGATTTT TACCGTCGGC GACTTCCGCA
401 GCCGCGACC- TGCGGCCGGC GGACAAGGCG CGCCACTCGT CCCCGCCTTT
451 CACGAAGCCC TGTTCCGCGA CAACAGGGAA ACACGCGCGG TACTGAACAT
501 CGGCGGGATT GCCAACATCA GCGTACTCCC C-00GACCCA CCCGCCTTCG
551 GC'rTCGACAC AGGGCCGC AATATGCTGA TGGACGCGTG GACGCAGGCA
601 CACTGGCAGC TTCCTTACGA >kACGGT GCAAAGGCGG CACAAGGCAA~
651 CATATTGCCG CAACTGCTCG ACAcGCTGCT CGCOCACCCG TATTTCGCAC
701 AACCCOACCC TAAAAGCACG GGGCGCGAAC TCTTTGCCCT AAATTGGCTC
751 GAAACCTACC TTGACGGCCG CGAAAACCGA TACGACGTAT TGCGGACGCT
801 TTCCCGTTTT ACCGCGCA CCGTTTGCGA CGCCGTCTCA CACGCAGCGG
851 CAGATGCCCG TCAAATGTAC ATTTGCGGCG GCGGCATCCG CAATCCTGTT
901 TTAATGGCGG ATTTGGCAGA ATGTTTCGGC ACACGCGTT' CCCTGCACAG
951 CACCGCCGAC CTGAACCTCG ATCC'GCAATG GGTGGAAGCC GCCGNATTTG
1001 CGTGGTTGGC GGCGTGTTGG ATTAATCGCA T-CCCGGTAG TCCGCACAAA
1051 GCAACCGGCG CATCCAAA.CC GTGTATTCTG ANCGCGGGAT ATTATTATTG
1101 A
This corresponds to the amino acid sequence <SEQ ID 3082; ORF 121-1>:
m121-1.pep
1 METQLYIGIM SGTSMDGADA VLIRDGGK4 LGAECG{AFTP YPGRLP.RQLL
51 DLQDTGADEL HRSRILSQEL SRLYAQTAAE LL.CSQNLAPS DITALGCHGQ
101 TVRHAPEHGY SIQLADLPLL ?.ERTRIFTVG OFRSRDLAAG GQGAPLVPAF
151 HEALFRDNRE TRAVLNIGGI ANISVLPPDA PA8---FTGPG NMLMDAWTQA
201 HWQLPYDKNG AKAAQGNILP QLLDRLLAHP YrAQPHPKST GRELFALNWL
251 ETYLDGGENR YDVLPJLSRF TAQTVCDAVS HAAADARQMY TCGGGIRNPV
301 LMADLAECFG TRVSLHSTAD LNLDPQWVEA AXFAWLAACW4 INRPGSPH(
351 ATGASKPCIL XAGYYY*
m121-1/g121 ORFs 121-1 and 121.ng showed a 95.6% identity in 366 aa overlap:
20 30 40 50 m121-1.pep METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAVTFYPGR.RRQLLDTQDTGADEL 20 30 40 50 80 90 100 110 120 m121-1.pep HRSRLSQESRLYAQAAELLCSQNLAPSDTAGCHGQVFAj-tEHGYSIQL-zDLLL g121 HRSRMLSQELSRLYAQTAAELLCSQNLAPC:ITALGo HGQTVRHAHG Y:IQLADTPLL 80 90 100 110 120 130 i3 150 160 1.
m-.21-1.peo AERTRIFTVGDFRSRDLAAGGQGALVAF?{EAFDNRE-R-I/LNTGG0TAN-S7LPPDA g121 AETI7VDRRLAGGPVAHA7DRTVLIGAIVPG 130 140 150 160 1710 180 190 200 210 220 230 240 rn21 pep PAGCGGMMA4Q.-WLYKGKA G PLDLA Y-,QHDS g.21 PAG0GGMMDWQ WLYDNA-:AGI2PQLLG-RZLAH7YFSQ?2 190 200 210 22 30 2.
250 26C 270 28C 292 300 rl2l- 1. pep GRELFALNWLTYL3GGENRYDVLRTLSRFTAQTVCAVSA.I D RQMYICCGGG:RNPV g 121 GRLANTEYDCE\RDT-TS-TQVDV1
DRM:GG-'P
250 260 270 280 290 300 310 320 330 340 350 360 m121-1.pep LMADLAECFG TRVSLHSTALNLDPQWVEAAFAWLAAWNRPGSEK-VGASKPCIL g121 LMADLAE0 TRVSLHSTAELNLDPQWVEAAWL.CWINIPSPH-,T-ASKPCIL 310 320 330 340 350 360 m121-1.pep XAGYYYZ I I g121 GAGYYYX
The following DNA sequence was identified in N. meningitidis <SEQ ID 3083>:
a121-1.seq 1 ATGGAAACAC AGCTTTACAT CGGCATCATG TCGGGAACCA GCATGGACOG 51 101 151 201 251 301 351 401 451 501 551 601 651 701 751 801 851 901 951 1001 1051 1101
GGCGGATGCC
AAGGGCACGC
GATTTGCAC
GCAAGAACTC
GTCAAAACCT
AC COTCAGAC
CPCCGCTGCTG
GCCGACCT
CACGAAGCCC
CGGCGGGATT
C CT TC GACAC
CACTGGCAGC
CATAT T C CO
AACCCCACCC
GAAACCTACC
TTCCCGATTC
CAOATGCCCG
TTAATGGCGG
CACCGCCGAA
CATGGATGGC
*GTACTGATAC
*CTTTACCCCC
ACACAOGCGC
AGCCGCCTGT
CGCOCCGTCC
ACGCGCCGOA
GCGGAACGGA
TO COOCCGGC
TGTTCCGCGA
GCCAACATCA
AGGACCGGGC
TTCCT-ACGA
CAACTGCCG
TAAAAGCACG
TTGACGGCGG
ACCGCGCAAA
TCAAAT::-AC
ATTTGGCAOA
CTGAACCTCG
GGATGGACGG CGGCAAATGG
CTGGGCGCGG
TACCCCGGCA GGTTACGCCG
CAAATTOCTG
GGACGAACTG CACCGCAGCA
GGATGTTGTC
ACGCGCAAAC CGCCCCGAA CTOCToTGCp.
GACATTACCG CCCTCGGCTG
CCACGGGCAA
ACACAGTTAC AGCGTACACC
TTGCCGATTT
CTCAGATTTT TACCGTCGGC
GACTTCCGCA
GGACAAGGCG CGCCOCTCGT
CCCCGCCTTT
CGACAOGGGA ACACGCGCGG
TACTGAACAT
GCGTACTCCC CCCCOACGCA
CCCOTTO:G
AATATGCTGA TGOOACSCGTG
GATGCAGGCA
CAAAAACGGT 3 7.-.AAGGCGG
CACAAGGCAA
ACAGGCTOCT CGCCCACCCO
TATTTCGCAC
GGGCGCGAAC TGTT7OCC-CT
AAATTOOCTC
CGA-AAACCGA TACGACOTAT
TOCGGACGCT
CCOTTTTCGA CGCCGTCTCA
CACGCAGCOG
ATTTOCGGCG GCGGCATCCG
CAATCCTOTT
.TGTTTCGGC ACACGCOTTT
COCTGCACAG
A.TCCGCAATG GGTAGAAGCC
GCCGCOTTCG
GTCAACCGCA TTCCCOGTAO
TCCGCACAA
GTGTATTCTG OGCGCGGGAT ATTATTATTG GCAACCGGCG
CATCCAA.ACC
A
This corresponds to the amino acid sequence <SEQ ID 3084; ORF 121-1.a>:
a121-1.pep
METQLYIOIM
DLQDTGADEL
TVRHAPEHSY
HEAL FRD ORE
HWQLPYDKNG
ETYLDGGENR
LMADLAECFG
ATG:.3,::2:lL
SGTSZOGADA
HRSRSILSQEL
SVQLADLPLL
TRAVLTNIGGI
AKAAQGNILP
YDVLRTTSRF
TRVSLHSTAE
GAO YYY* VL IRH 00KW
SRLYAQTAAE
AERTQIFTVO
ANISVLPPDA
QLLDRLLAHP
TAQTVFDAVS
LNLDPQWVEA
1'GAEGRA T LCSCNLAPS
DFRSRDLAAG
FAFGFD73PG Y7AQPHDKST
HAAADAPRQMY
AAFAWMAACW
YPC-RLRRFLL
DITA7GCHGQ 0 OGA P 3VPAF
NMLMDAWMQA
ORE L ALNWT
ICGGGIRNPV
m121-1/a121-1 ORFs 121-1 and 121-1.a showed a 96.4% identity in 366 aa overlap
20 30 40 50 60 m121-1.pep
METQLYIGIGTSMDGADAVLIRMDGGKWLAATP
20 30 40 30 70 83 010 110 120 mn12 pep HRSR:LTSQELSRLYrQT,'A-LLCSQNLAPS .S:r.GQTVR!.j, P-1 OYSIQLAOLL 80 90 100 110 120 140 150 160 170 180 a121-1 pep FVDFSDAGQGPIPF ALRD7TAL,7GA a3 142 -1SLPD 130 14 10 60 170 180 190 200 210 220 230 240 m121-1.pep
PAE'ODTGPGNMLMDAWTQHWQLPYDNGAQACGIILPQLLRLHYAPPS
a12-?1- PAFGFDTGPG r4LMDAWMQAHWQLPYDKNAKAQGIL LR HYAPP 190 200 210 220 230 240 250 260 270 280 290 300 m121-1 .pep GRELFALNWLETYLDGENRYDVLRTSR'TAQTJr 3 AVHAAQYIG
IRP
a121-1 OPELALWEYGERDLT RTTVD-VHAzDRMIGINP 250 260 270 280 29030 LMD 310 320 330 340 350 360 M121-l pep LALAECFGTRVSLHSTADLLPQWvEAXAWLACWTNIGPKAGSPI 319 3-0 330 340 35C 360 m121-1.oep
XAGYYYX
11111 a12l
GAGYYYX
128 and 128-1
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3085>:
m128.seq (partial)
1 ATGACTGACA ACGCACTGCT CCATTTGGGC GAAGAACCCC GTTTTGATCA
51 AATCAAAACC GA-kc;ACATCA AACCCGCCCT GCAAACCGCC ATCGCCGAAG
101 CGCGCGAACA AATCGCCGCC ATCAAAGCCC AAACGCACAC CGGCTGGGCA
151 AACACTCTCG AACCCCTCAC CGGCATCACC GAACGCGTCG GCAGGATTTG
201 GGGCGTGGTG TCGCACCTCA ACTGCGTCGC CGACACGCCC GAACTGCGCG
251 CCOTCTATAA CGAACTGATG CCCGAAATCA CCGTCTTCTT CACCGAAATC
301 GGACAAGACA TCGAGCTGTA CAACCGCTTC AAAACCAT'CA AAzAATTCCC.C
351 CGAATTCGAC ACCC'rCTCCC CCGCACAAA AACCAAACTC AACCAC
1 TACGCCAGCG AAAAAC-GCG CGAAGCCAA TACGCSTTCA GCGAAACCGA
51 wGTCAAAAA TAyTTCCCyG TCGGCAAwGT ATrAACGGA -TGTTCGCCC
101 AAmnTCAAAAA ACTmfTACGGC ATCGGATTTA CCGAAAAAAC yGTCCCCOTC
151 TGGCACAAAG ACGTGCGCTA TTkTGAATTG CAACAACG GCGA~rnCCAT
201 AGGCGGCG-T TATATGGATT TGTACGCACG CGAGGCA CGCGGCGGCG
251 CGTGGATGAA CGACTACAAA GGCCGCCGCC GTTTTTCAGA CGGCACGCTG
301 CAAyTGCCCA CCGCCTACCT CGTCTGCAAC T--GCCCCAC CCGTCGGCGC
351 CAGGGAAGCC CGCyTGAGCC ACGACGAAAT CCTCATCCTC TTCCACGA
401 CCGGACACGG GCTGCACCAC CTGCT1TACCC AAGTGGACGA ACTGGGCGTA
451 TCCGGCATCA ACGGCGTAkA ATGGGACGCG GTCGAACTGC CCAGCCAGTT
501 TATGGAAAAT TTCGTTTGGG ATACATGT CTTGGCACA mTGTCAGCCC
551 ACGAAGAAC CGGcglrCCC yTGCCIGAAG ACTCTTsGA CAAwTGCTC
601 GCCGCCAAA ACTTCCA~sG CGGCATOTTC Y-sG7CCGGC AAwTGGAGT-
651 CGCCCZ-CTTT GATATGATGA TTTACAGCGA AGACGACGAA GCCGTCTGA
701 AAAACTGGCA ACAGGTTTTA GACAGCGTGC GCAAAA.AGT CGCCGTCATC
751 CAGCCGCCCG AATACAACCG CT3CGCCTTG AG!CTTCGGCC ACATCTTCGC
801 AGGCGGCTAT TCCGCAGCTn ATTACAGCrA CGCGTGGGCG GAAGTATTGA
851 GCGCGGACGC ATACGCCGCC TTTGAAGAAA GCGACGATG: CGCCGCCACA
901 GGCAAACOCT TTTGGCAGGA ATCCTCOCC G-CGGGGri-T CGCGCAGCOG
951 nGCAGATCC TTCAAGCCT TCCGCGGCCG CCGACCAGC ATAGACGCAC
1001 'rCTTGCGCICA CAGCfCGTTT GA UCGG TCTGA
This corresponds to the amino acid sequence <SEQ ID 3086; ORF 128>:
m128.pep (partial)
1 MTDNALHLG EEPRFDQIKT EDIKPALTQTA IA-EAREQIAA
IKAQTHTGWA
S1 NTVEPLTGIT ERVGRIWGVV SHLNCVADTP ELRAVYNELM
PEITVFFTEI
101 GQDIEYNqRF KTIKNSPEFD TLSPAQKTKj
NH
YASEKLREAK YAFSETXVKK YFPVGXVLNgG WHiKVRYXEL QQNGEXIGGV
YMDLYAREGK
QLPTAYLVCN FAPPVGGREA
RLSHDEIZIL
SGINGVXWDA VELPSQFMEN FVWEYNvLAQ AAKNFQXGMF XVRQXEFALF DMI~9!YSEDDE QPPEYNRFAL SFGHIFAGGY
SAAXYSYAWA
GKRFWQEILA VGXSRSGkES
FKAFRGREPS
:FAQXKKLYG
RGGAW[IXDYK
XSA34EETGVP
GRLKN'WQQVL
EVL'SADAYAA
I DALLPHSGFp
IGFTEKTVPV
GRRRFSDGTL
LLTQVDELGV
LPKELXDKcCL
DSVRKKVAVI
FEESIDDVA-T
DNAV*
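For illustration, the correspondence between a DNA sequence and the amino acid sequence stated for it follows from reading the DNA in frame from its first base. The sketch below assumes the standard genetic code. The function name `translate` is hypothetical. As an assumption that appears to match the peptide listings, any codon containing a nucleotide ambiguity code (such as y, w, m or n) is reported as X and a stop codon as *; trailing bases that do not fill a codon are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace between 10-base blocks is ignored."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        # Codons with non-ACGT symbols are not in the table and become 'X'.
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


# Opening bases of the m128 listing, written in the 10-base blocks used above.
print(translate("ATGACTGACA ACGCACTG"))
```

Reporting ambiguous codons as X is a simplification: some ambiguous codons still encode a single amino acid, and a fuller treatment would resolve those cases.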
The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3087>:
g128.seq 1 atgattgaca 51 aatccaaacc
CGCGCGGACA
151 AACACCGTCG 201 GGGCGTCGTG 251 CCGTCTATAP.
301 GGACAAGAcA 351 CGAATTTGCA 401 TGCGCGATTT 451 GAACTGGCAA 501 CCAAAACn;TC 551 CCGCACCGCT 601 GCCGCGCAAA 651 GCACTACCTT 701 AAATCTACCG 751 AAATTCGACA 801 AACCGccaaa 851 CCAAAATGGC 901 GCCCGCCGCG 951. CTTCGCCCGC 1001 GCTACCCCCGG 1051 GAAGTCAAAA 1101 CCAAATCAAA 1151. TCTGGCACAA 1201 ATCGGCGGCG 1251 CGCGTGGATC 1301 TGCAACTGCC 1351 GGCAAAGAAG 1401 AacCGGCCAC 1451 TGTCCGGCAT 1501 TTTATGGAM.
1552. CCACGAAGAA 1601 TcgcCGCCAA 2.651 TTCGCCCTCT 1701 GAAAAACTGG 1751 TCCAACCGCC 1801 GCcggcGGCT 1851 CAGCACCGAT 1901 CAGGCAAACG 1951 gcgGCGGAAT 2001 ACTGCTGCGC LacgCA~cgtg gaagACAzca
AATCGCCGCC
AGCGTCTGAC
TCCCATCTCA
CGAACTGATG
TCGAACTGTA
ACGCTTTCCC
CG .ATTGAGC
AACTGCAAAC
CTAGACGCGA
TGCCGGCATT
GCGAAGGCAA
GCCOTTATCC
CGCCTACGTT
ACACCGCCAA
cTr7CTCGGCT
GGACACGCCC
CCAAACCCTA
GAACACCTCG
CGA.AAAACTG
AATACTTCCC
AAACTCTACG
AGACGTGCGC
TTTATATGGA
AACGACtaca
CACCGCCTAC
CGCGTTTAAG
GGACTGCACC
CAAcggcgtA
AC-TCSTTTG
AccaGCGAGC
AAACTTCCAG
TCGATATGAT
CAGCAGGTTT
CGAA:-ACA.AC
ATTCCGCAGG
GCCTACGCCG
CT-TCTGGCAA
CCTTCAAAGC
ccacttggg
AACCCGCCG
GTCAAAGCG(
CGGCATCACC
ACTCCGTCGI
CCTGAAATc;
CAACCGCTTC
CCGCACAAAA
GGCGCGGAAC
CGAAGGCGCG
CCGACGCGTT1
CCCGAAGACG
AACAGGTTAC
AATACGCCGG
ACCCGTGCCA
CATCGACCGC
TTAAAAATTA
GAACAGGTTT
CGCCGAAAA
GTCTCGCCGA
CGCGAAGCCA
COTCGGCAAA
GCATCGGATT
TATTTTGAAT
TTTGTACGCA
AAGGCCGCCG
CTCGTCTGCA
CCACGACGAA
ACCTGCTTAC
GAATGGGACG
GGAATACAAT
CCCTGCCGAA
CGCGGTATGT
GATTTACAGT
TAGACAGCGT
C=-CGCCA
CTAT.TACAGc
CCTTTGAAGA
GAA.At ccttcg
T-TCCGCGGA
*gaagaaccCC GTTTTaatca CCAAACCGCC ATCGCCGAAG AAACGCACAC CGGCTGGGCC 7GAACGCGTCG GCAGGATTTG SCGACACGCCC GAACTGCGCG ~CCGTCTTCT'r CACCGAAATC AAAACCATCA AAAATTCCCC AA-CCAAGCTC GATCACGACC TjCCGCCCGA ACGGCAGGCA CAACTTTCCG CCAAATTCTC CGGCATTTAC
TTTGACGATG
CGCTCGCCAT GTTTGCCCCC AAAATCGGCT TGCAGATTCC CAACCGCGAA CTGCGCGAAC GCGAACTTTC AAACGACGGC ACGCTCGAAA ACGCATTGAA CGCCGAATTG TCGCTGGCA TAAACTTCC'r GCACOACCTC GACCTCGCCG AAGTCAAAGC CCCGCAGC-G TGGGACTTGA ~AAACATT CAGCGAAACC GTTCTGGCAG GCCTCTTCGC CGCCGAAAAA ACCGTTCCCG TGCAACAAAA CGGCAAAACC CGCGAAGGCA AACGCGGCGG CCGCTTTGCC GACGgcacGC ACT:CGCCCC GCCCGTCGGC ATCCTCACCC
TCTTCCACGA
CCAAGTGGAC GAACTGGGCG CGGTCGAACT
GCCCAGCCAG
GTATTGGCAC AAATGTCCGC AGAACTCTTC GACAAAATGC TCCTCGTCCG GCAAATGGAG GAAAGCGACG AATGCCGTCT GCGCAAAGAA GTcGCCGTCA ACAGCTTCGG CCacatCtTC TACGCATGCG CCGAAGTCCt AAGCGACGac a:cGCCGCCA ccg-Ccg9gg ctCCCGCAGC CGCGAACCGA GCATAGACGC gGCctgA CAaagcggtT TCGACAACGC
This corresponds to the amino acid sequence <SEQ ID 3088; ORF 128.ng>:
g128.pep
MIDNALLHLG
NTVERLTGIT
GQDI ELYNRF
ELAKLQTEGA
AAQSEGKTGY
KFDNTANIDR
ARRAKPYAEK
EVKKYFPVGK
I GGVYMDLYA
GKEARLSEE
FMENFVWEY'N
FALFDMMIYS
AGGYSAGYYS
EEPRTINQIQ'
ERVGRIWGVV
KTI KNSPEFA
QLSAKFSQNV
KIGLQI PHYL
TLENALKTAI(
DLAEVKAFAR
VLAGLFAQIK
REGKRGGAWM
I LTLFHETGH
VLAQMSAHEE
ESDECRLKNW
EZI KPAVQTA
SHLNSWVDTP
TLSPAQKTKL
LDATDAFGI Y
AVIQYAGNRE
LLGFKNYAEL
=HLGt.ADPQP K'LYGI GFAEK
N'DYKGRRRFA
GLr.-!LLTQVID
TGEPLPKCELF
QOVLDSVRKE
:AEARGQTAA VKAQTHTGWA ELRAVYNELM PE:TVFFTEI DHD)LRDFILS GAELPPEROA FDDAAPLAG: PEDALA)4FAA LREQ:YRATV TPASELSNIDG SLATKMADTP EQVLNJFLHDL WDLSYAGEKL REAKYAFSET TVPVWHKZVR YFELQQNGKZ' CGTLQLPTAY LVCNFAPPVC ELGVSGINGV EWDAVELPSQ DKCMLAAKNFQ RGNIFLVRQME VAVIQPPEYN RFANSFGHIF VAATGKRPVWQ E:ILAvGGSRS YAWAEVLSTD AYAAFEESD 651 AAESFKAFRG REPSIDALLR
QSGPDNAA-
ORF 128 shows 91.7% identity over a 475 aa overlap with a predicted ORF (ORF 128.ng) from N. gonorrhoeae:
m128/g128 20 30 40 50 g128.pep
MIDNALLHLGEEPRFNQIQTEDIKPAVQTAIAERQAVATTWNVRTI
I M:1 1 II i I iI :11II I 1 1 11 1 m128
MTDLLEPFNKEIPLTIEAEIAKQ~TW,,VPTI
20 30 40 50 80 90 100 110 120 9128. pep ERGIGVVSHLNSVVDTPELAYELMPI-,VFFTEIGQI
YRKINPF
m128 ERVGRIWGVVSHLNV~rTPELAVNELME7VF GQDI ELYNRFKTIIKNSPEFD 80 90 100 110 120 LPA 130 140 150 160 17010 9128 .pep TLPQTLHLDVSALPRAEA
TGQSKSNL)TAGY
m128
TLSPAQKTKNH
13/ 340 350 360 9128 .pep YAGEKLREAKYAFS
ETEVKKYFPVGKVLAG
m128 YASEKLREAKYAFS ETXVKKYF
PVGXVJLNG
30 30 390 400 410 420 9128. pep
LFAQIKLYIGTAEKTVPVWHKDVRYFELQQGKTIG.VMLARGRGWNY
1111 1Iii I II H IIIIIIIII m128 LFAQXKK- YG:GFTEK--PVWMK:VRYXELQQNGEXl GGVYDLYAREGKRGGA'9MNYK so 60 7C 80 430 440- 450 460 470 460 m129 GRRRFSZGTZOQLPTAYLVCN7APPVGGREA SD i IFEGGHHL-VEG 100 110 -2 130 140 150 490NG 500 510 520 530 540 g128.pep SGNVWDAVELPSQFMF YENVLQMSA~EE7GEPKEFMLACQR m128 SG7NGVXWDAVEPSQFMENTEVNVAXSJ.EG.P
KELDXLANFX
160 170 180 190 200 210 550ME 560 570 580 590 600 9128 .pep LVQEALFDMMIYSESIDECROKNQQVLDSVRKEVAVIPE
RASGFGY
m12 8
XVRQXEFALDMMIYSED)DEGRLJKNWQQVOVKVVQP-RASGIA
220 230 240 250 260 270 6AG-0S 620 630 640 650 660 9128 .pep
SAYSAWAEVLSTDAYIAFEESDDVATGK..FWQEILVGRAEFARRP
m1 28 SAAXYSY'AWAEVLSAAYFESDDVATGF .WELAVXR ZFKAFtZGRPS 280 290 300 310 320 330 670 679 9128 .pep IDALLRQSGFDNAAX m128
IDAUILRISGFD)NAVX
340
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3089>:
a128.seq 1 ATGACTGACA ACGCACTGCT 51 101 151 201 251 301 351 401 451 501 551 601 651 701 751 801 851 901 951 1001 1051 1151
1201 1251 1301 1351 1401 1451 1601 1651 1701 1801 1851 1901 1951 2001
AATCAAAACC
CGCGCGAACA
AACACTGTCG
GGGCGTGGTG
CCGCCTACAA
GGACAAGACA
CGAGTTCGAC
TGCGCGATTT
GAATTGGCAA
CCAAAACGTC
CCGCACCGCT
GCCGCGCAAA
GCACTACCTC
AAATCTACCG
AAATTCGACA
AACCGCCAAA
C CAAAAT GOC
GCCCGCCGCG
CTTCGCCCGC
GCTACGC'CGG
GAAGTCAAAA
CCAA
T
CAAA
TCTGGCACAA
ATAGGCGGCG
CGCGTGGATG
TGCAACTGCC
GGCAAAGAAG
AACCGGACAC
TATCCGGCAT
T-TATGGAAA
C CACGAAGAA
TCGCCGCCA-A
TTCGCCCTCT
GAAAAACTGG
TCCGACC-GCC
GCAGGCGGC:
GAG CGCGGAC
CAGGCAAACG
GCGGCAGAAT
ACT CTTG CG C
GAAGACATCA
AAPCGCCGCC
AACCCCTGAC
TCGCAC CT CA
TGAATTAATG
TCGA0C(TGTA
ACCCTCTCCC
CGTCCTCAGC
AACTGCAAAC
CTAGACGCGA
TGCCGGCATT
GCGAAGGCAA
GCCGTCATCC
CGCCTACGTT
ACACCGCCAA
CTG CT CGCT
GGACACCCCC
CCAAACCC:A
GAAAGCCTCG
CSAAAAACTG
AATACTTCCC
AAACTCTACG
AGACOTGCGC
TTTATATGGA
AACGACTACA
CACCCCTAC
CC COC T TGAG GCCCTG CACC
CAACGGCGTA
ATTTCGTTTG
ACCGGCGrTO
AAACTTCCIAA
TO GATAT OAT
CAALPCAGGTTT
COAAPACAAC
ATT CCC CAGG
GCATACCOCCG
CTTTTGGCAG
CCTTCAAAGC
CACAGCGGCT
C CAT TTGOO C
AACCCGCCCT
ATCAAAGCCC
COGCATCACC
ACTCCOTCAC
CCCGAAATTA
CAACCGCTTC
ACGCGCAAAA
GCCGGAAC
CGAAGGCGCG
CCGACGCOTT
CCCGAAGACG
AACAGGCTAC
AATACGCCGA
ACCCG COC CA CAT CGAC CCC
TCAAAAACTA
GAACAAGTTT
CSCCGAAAAA
GC CT CGCCGA
CGCGAAGCCA
CGTCGGCAAA
GCATCCOATT
TATTTTGAAT
TTTGTACGCA
AACGCCGCCG
CTCOTCTGCA
CCATGACGAA
ACCTCCTTAC
GAATGGGACG
GGAAT7ACAA-
CCCTGCCGAA
COCGGAATGT
CATOTACAGC
:ACACACT
CGCTTCGCCA
CTATTACAC
CCTTTGAAGA
GAAAT OCT00 7OO CCGCGGA
TCOACAACGC
GAAGAACCCC GTTTTGATCA OC:.AACCGCC ATTGCCCAAG .AAACGCACAC CCGCTGGCCA GAACGCGTCG
GCACGATTTG
CCACACGCCC
GAA-TGCGCG
CCGTCTTCTT
CACCGAAATC
AAAkACCATCA AAAACTCCCC AACCAAACTC AACCACGATC TGCCOCCCGA
ACAGCAGGCA
CAACTTTCCG
CCAAATTCTC
COOCATTTAC
TTTGACGATG
COCTCOCCAT
OTTTCCCT
AAAATCGGTT
TOCAOATTCC
CAACCGCAAA CTGCGCGAAC GCGAGCTTOC
AGACGACOGO
A-3CCAAA ACOCCCTOCA ClOCCGPAOTG TCGCTOGCAA :AAAC=T-C GCACGA CCTC G;ACC-COCCO AAGTCAAAGC CCAACCG TGGCACTTGC AATA-z'CGCATT CAGCGAAACC G:ATT.AACG GAC-GTTCGC TACCGAAAAA
ACCGTCCCCG
TGCAACAAAA. CGGCGAAACC CGCGAAGGCA
AACGCGGCGG
CCOTTOTCA
GACCOCACGC
ACTTCACCCC
GCCCGTCGGC
ATCCTCACCC
TCTTCCACGA
CCAAGTCGAC
GAACTGCGCG
CAGOCGAACT
GCCCAGTCAG
GTCTTGGCGC AAAi-TGTCCGC AGAAC.-TC-TC
GACAAAATGC
-CCCC-G CCAAA-CGAS GAAG;A:GACC
AAGGCCGTCT
GCGCA--AOAA G-CGCCG:CG ACAGCTTCGG
CCACATCTTC
TACGCGTGGG CGGAAGTATr AAGCGACGAO
GTCGCCGCCA
-CGTCGGCGG
ATCGCGCAC
CCCGAACCGA OCATAGACOC
GGOTOGA
This corresponds to the amino acid sequence <SEQ ID 3090; ORF 128.a>:
a128.pep
MTDNALLHLG
NTVEPLTGIT
GQDIELYNRF
ELAKLQTEGA
AAQSECRTGY
KFCNTAN IDR
ARRAKPYAEK
EVKKYFPVCK
IGOVYMDLYA
O REAR LS HDE FMEN FVWE YN
EEPRFDQIKT
ERVGRIWGVV
KTTKNSPEFD
QLSAKFSQNV
KIGOQI PHYL T LENALQTA.'
OLAEVKAF.AR
VLNCLFAQIK
REGKRGGAwM
ILTLFHETOH
VLAQMSAHEE
EDT K PALQTA, SN LN SVTOT P
TLSH'AQKTKL
LDATDAFGI Y
AVIQYADNRR
LLGFKNYAEL
ESLOLADLQP
KO-YG IG FTEK N DYKG RRRP GLHH4LLTQVD TOV P P REL F I AEAREQ IAA NfK 0 LRD KyVLS
FDDAAPLAGI
LREQIYRAYV
.SLATKMACTP
WDLGYAGEKL
TVPVWHKDVR
EDCTLQL-PTAY
ELGVSCINGV
DKMLAAKN E
IKAQTHTGWA
FEITVFFT7EI
GAELPPEQQA
PEDALAIIFxA TRASELS DOG
EQVLNFLHDL
REAKYAFSET
YFELQQNGET
LVCN7TPVo EW DAVE LPSQ RON OLVRQME 551 FALFDMMIYS EDDEGRLKNW QQVLDSVRKE VAVVRPPEYN
RFANSFGHIF
601 AGGYSAGYYS YAWAEVLSAD AYAAFEESDD VAATGKR-WQ
EILAVGGSRS
651 AAESFKAFRG REPSIDALLR
HSGFDNAA'
m128/a128 ORFs 128 and 128.a showed a 66.0% identity in 677 aa overlap
20 30 40 50 m128.pep MTNLHGER0ITDKAQA7ERQAIATTWNVPTI a128
MENLLHLGEER:DQIKTEDIKPALQTAIAEAREQIAIKATTWNTELT
20 30 40 50 80 90 100 110, 120 m128 .pep ERVGRIWGVVSHN-'CVADTPELPAVYNELMPEITVFFTEIGDEYRK:-NPa128 ERGIGVHNVDPLAYEL
IVFEGDEYRKINPEFD
80 90 100 110 120 130 m128.pep TLSPAQKTKLNH a128
TLSHAQKTKLHDLRDVLSGLPPEQQAELKLQTEGAQLSFQNLADGI
130 140 150 160 170 180 m 128 .pep a128 FIDDAAPLAG1 PEDA L Al-.1AAPQSEGKTGY:7.LQ: PHYLAV1CYA0tR:L
QIVRAYV
190 200 210 22c 3 4 m128 .pep a129 TRASELSDGFDNTANI DRTLENALQ:-AKZLL kNYAELSLATKMADT
EQVLNFLHDL
250 260 270 280 290, 300 140 m128 YASEKLREAKYAFSTYXTF.FY7PVGX 310 320 330 340 350O 360 160 170 180 190 "03 210 ml 28 .pep VLGFQKLGG-7<VIWKVY-
NEXGVMLAEK-GW
a128 VLGFQKLG7G-KV HERY-LQC7-GVM-Y- -7GKRGGAWM 370 380 390 400 410 420 220 23C 240 250 250 270 m1-23.pep NDKRRSGLL-ATCNAPGRERSD-LL r;L4LTV a128 N0DYKGRRRFSGLPTAYVCpCNF hPVGK.SHD F-LHE'TGHG iCL:THLZTQVD 430 440 450 460 470 480 280 290 300 310 320 330 m1'28 .pep E LGVSG1 NGVXW DAVE L SQ FMEN FWvEYNV LAQX SAH EETGV P LPKEL KLAN a128 ELGVSGINGVEWDAVELPSQFMENFWEYNVLAQMSAEiEETGVPLK
FDMANQ
490 500 sic) 520 530 540 340 350 360 370 380 390 m128 .pep
XGMFXVRQXEFALFDMMIYSEDDEGRLKNWQQVI.DSVRKKVAVIQPPYRASGI
550 560 570 580 590 600 400 410 420 430 440LAG 450ASKAR m128 .pep AGYAzxsAAVSDYAESDATKFQ a122 AGGYSAGY'AAVSDYAESCATKF4EIrAGSSASFAR 620 6301 640 650 660 460 470 ml.28.pep
REPSIDALLRHSGFDNAVX
IM ill IM I: a128 REPS1DALLRHSGFDNAAX 670
Further work revealed the DNA sequence identified in N. meningitidis <SEQ ID 3091>:
m128-1.seq 9 0 1001 1051 1101 1151 1201 1251 1301 1351 1401 1451 1501 1551 1601 1651 1701 1751 1801 1851 1901 1951 2001 1 ATGACTGACA
ACGCACTGCT
1 AATCAAAACC
GAAGACATCA
1 CGCGCGAACA
AATCGCCGCC
1 AACACTGTCG
AA:CCCTGAC
1 GGGCGTGGTG
TCGCACCTCA
1 CCGTCTA-AA
CGAACTGATG
1 GGACALAGACA
TCGAGCTGTA
i CGAATTCGAC
ACCCTCTCCC
1 TGCGCGA-TT
CGTCCTCAGC
IGAACTGGCAA
AACTGCAAAC
ICCAAAACGTC
CTAGACGCGA
1 CCGCACCGCT
TGCCGGCATT
1 GCCGCGCAAA
GCGAAAGCA
1 ACACTACCTO
GCCGTCATCC
i AAATCTACCG
CGCCTACGTT
1 AAATTCGACA
ACACCGCCA
7 AACCGCCAA.A
CTGCTCGGC-
CC'AAAATGGC
GGACACGCCC
1 GCCCGCCGCG
CCAAACCCTA
1CTTCGCCCGC
GAAAGCCTGA
*GCTACGCCAG
CGAAA.AACTG
*GAAGTCAAAA
AATACTTCCC
CCAAATCAA
AAACTCTAC'
TCTGGCACAA
AGACGTGCGC
*ATAC-GCGGCG
TTTATA-GGA
*CGCGTGGATG
AACGACTACA
TGCAACTGCC
CA-CGCCTA'-
GGCAGGGAAG
CCCGCCTGAG
AACCGGACAC
GGGCTGCACC
TATCCGGCA"'
CAZACGGCCTA
TTTATGGA ATTTCGTTTG
C
CCACGALAGAA ACCGGCGTTC
C
TCGCCGCCAA AAACTTCCAA
C
TTCGCCCTCT TTGATATGAT
G
GAAAAACTGG CAACAGGTTT
T
TCCAGCCGCC CGAATACAAC
C
GCAGGCGGCT ATTCCGCAGG
C
GAGCGCGGAC GCATACGCCG
C
CAGGCAAACG CTTTTGGCAG
G
GCGGCAGAAT CCTTCAAAGC C
C
CCATTTGGGC GAAGAACCCC
GTTTTG.ATCA
AACCCGCCCT C-CAAACCGCC
ATCGCCGAAG
ATCAAAGCCC AAACGCACAC
CGGCTGGGCA
CGGCATCACC GAACGCGTCG
GCAGGATTTG
ACTCCGTCGC CGACACGCCC GAAC'rGCGCG CCCGAAATCA CCG?2TTCT-' CACCG0.AATC CAACCGCTTC AAAA CCATCA ;ADLATTCCCC CCGCACAAAA Aki'CCAA.ACT'
AACCACGATC
GGCGCGGAAC TGCCGCCCGA
ACAGCAGGCA
CGAAGGCGCG CAACTT-CCG
CCAAATTCTC
CCGACGCCGTT CGGCATTTAC
ITTGACGATG
CCCGAAGACC CO;CTCCCAT
G:Z'TGCCGCC
AACAGGCTAC AAAATCCGGCT
TOCAGATTOC
AATACGCCGA CAACCGCGA
CTGCGCGAAC
%CCCGCGCCA GCGAACTTTC
AGACGACGGC
CATCGACCGC AC0CTCGCAA
ACG=CTGCA
]CAAACACGCCGAATTG
TCGC:GGCAA
AACAAGTTT TAAACTTCCT
GCACGACCTC
GCCGApAAJA, GACCTCGCCG
AAGTCAAAGC
CCTCGCCGA TTTGCAACCG
TGGGACT-GG
:GCGAAGCCA LL.TACGCGTT
CAGCGA.AACC
:GT CGGCAAA G-ATTAA.CG
GACTGTTCGC
;CATCGGATT TACCGAAJA
ACCC-TCCCCG
ATTTTGAAT TGCAACxAk~
CGGCGIAACC
TTGTACGCA CGCGAAGGCA
A-ACGCGGCGG
LAGGCCGCCG CCGTTTTTCA
GACGCCACC
TCGTCTGCA ACTTCGCCCC
AC:CCOTCGGC
CACGACGA ATC-'TCATrC'
~TTCACGA
OCTGCTTAC CCAAGTGGAC
GCTGGCG-
AATGGGACG CGGTCGACT
GCCCAGCCAG
GAATACAAT GTCTTGGCAC A AATGTCAGC CC-GCCGA AGAACTCT' GACAAaATGC GCGGCATG- -c--cTCCC
GCAAA-GGAG
ATTTACAGC GAAGACGACG
AAGGCCGTCT
AGACAGCGT GCGCAAAA~z
GTCGOCOTCA
GCTTCGCCT TSAGCTTCGG
CCACATCTTC
IATTACAGC TACGCGTGGG
CGGAAGTATT
CTTGAAGA AAGCGACGAT
GTCGCCGCCA
~AATCCTCG CCGTCGGCSG
ATCGCGCAGC
TTCCGCGGC CGCGAACCGA
GCATAGACGC
ACTCTTGCGC
CACAGCGGTT
T
This corresponds to the amino acid sequence <SEQ ID 3092; ORF 128-1>:
m128-1.pep
1 I4TDNALLHLG EEPRFODQIRT EOIKPALQTA !AEAREQ:AA
IKAQTHTGWA
51 NTVEPLTGIT ERVGR7 WGvv SHLNSVADTP ELIPAVYNELM PEITVFFTE7 101 GQDIELYNRF KTIKNSPEFD 151 ELAKLQTEGA QLSAK<FSQNV 201 AAQSESKTGY KIGLQIPHYL 251 KFDNTANIDR TLANALQTAK 301 ARPAKPYAEK DLAEVKAFER 351 EVKKYFPVGK
VLNGLF-AQTK
401 IGGVYMDLYA REOKROGAWM 451 GREARLSHDE
ILILFHETGH
501 FMENFVWEYN VLAQMSAHEE 551 E'ALFDMI4IYS EDDEGRLKNW 601 AGGYSAGYYS YAWAEVLSAD 651 AAESE'KAFRG REPSIOAILR
TLSPAQKTKL
LLDATDAFGIY
AVIQYADNRE
LLGFKNYAEL
ESLNLADLQP
KLYGIGFTEc
NDYKGRRRFS
GLHHLLTQVD
TGVPLPKELF
QQVLDSVRKK
AYAAFEESDD
HS F DNA V
NHDLRDFVLS
FDDAAPLAGI
LREQIYFRAYV
3LATKMADTP
WDLGYASEKL
TVPVWH-KDVR
DOT LQL ?TAY
ELGVSGINGV
DKIILAAKNFQ
VAV IQ P2KYN
VAATGKRFWQ
OAELPPEQQA
PEDALAM FAA
TRASELSDDG
EQVLNFLHDL
RKAKYAFSET
Y77-LQQNGET 7 'ON FAPPVG
EWDAVELPSQ
RGMFLVRQME
RFALSFGHIF
E ILAVOGSRS
The following DNA sequence was identified in N. gonorrhoeae <SEQ ID 3093>:
g128-1.seq (partial) 1 ATGATTGACA ACGCACTGCT 51 AATCAAAACC GAAGACA-r7A 101 CGCGCGGACA AATCGCCGCC: CCACTTOGGC GAAGAACCCC GTTTTAATCA AACCCGCCOT CCAAACCGCC ATCGCCGAAG 151 201 251 301 351 401 451 501 551 601 701 751 801 851 901 951 1001 1051 1101 1201 1251 1301 1351 1401 1451
AACACCGTCG
GGGCGTCGTG
CCGTCTATAA
GGACAAGACA
CGAATTTGCA
TGCGCGATTT
GAACTGGCAA
CCAAAACGTC
CCGCACCGCT
GCCGCGCPAA
GCACTACCTT
AAATCTACCO
AAATTCGACA
AACCGCCAAA
CCAAAATGGC
GCCCGCCGCG
CTTCGCCCGC
GCTACGCCGG
GAAG:CAAAA
CCAAATCAAA
TCTGGCACAA
ATCGGCGGCG
CCGTGGATG
TO CAACT C C
GGCAAAGAAG
AACCGGCCAC
TGTCCGGCAT
AGCGTCTGAC
TO CCAT CT CA
CGAACTGATG
TCGAACTGTA
ACGCTTTCCC
CGTATT GAO C
AACTGCAAAC
CTAGACGCGA
TGCCGGCAT'0
OCGAAGGCAA
OCCGTTATOC
CO CC TACGTT
ACACCGCCAA
CT CT C GOCT GGAC'-'':o
CCAAACCCTA
GOkACACCTCG
CGAAAAACTG
AATACTTCCC
AAACTCTACG
AGACGTGCGC
TTTATATGGA
AACGACTACA
CACCOCCTAC
CGCGTTTAAG
GGACTGCACC
CAACGGCGTA
OTCAAAGCGC
CGGCATCACC
ACTCCGTCGT
CCTGAAATCA
CAACCGCTTC
CCGCACAAAA
GGCGCGGAAC
CGAAGGCGCG
CACGCOTT
CCCGAAGACG
AACAGGTTAC
AATACGCCGG
ACCCGTGCCA
CAT OGACCGC
TTAAAAATTA
G;.:G'.-TTT
CGCCGAAAAA
GT CT CO C GA
C.GCGAAGCCA
COTOGGCAAA
GCATCGGATT
ATTTTGAAT
T'TGTACGCA
AAOOCCGCCG
C.TCGTCTSCA
CCACGACGAA.
ACCTC-CTTAC
AAA
AAACGCACAC
GAACGCGTCG
CGACACOCCC
CCGTCTTCTT
AAAACCATCA
AACCAAGCTC
TOCCGCCCC-A
CAACTTTCCG
COO OATTTAC CO CTC C CAT
AAAATCGGCT
CAACCOCO-A
SCSAACTTTC
ACGC-CGAAA
COCCGAATTO
TAaACTT OCT
GACCTCGCCG
CCCCAOCCG
AATACGCATT
OTTCTGOCAG
CGCCGOLkAAA
TO'CAACAAAA
CO CGAAOOCA
CCCCTTTOCC
ACTT CO CC CC
ACCATOAC
CGGCTGCG
OCAOOATTTO
OAACTOCOCO
CACCOAAATC
AAAzATTCCCC OAT CACOACC
ACGGCAGGCA
C CAZAAT7TCT C
TTTGACOATG
O TTOCr0 0CC TOO AOATT CC
CTGCGCGPAAC
AAACO-ACGGOC
ACGOATTAA
TOOCTOO CAA
OCACOAC~C
AAGTO,-AGC
TGOOACTTGA
CAOCOAAACC
OCCTOTTCOC
AC CO T CC C- COO CAAAACC
AACOCGGCGG
GACOOCACGC
O CC COT( GC00C ,C C CCCACOA
GAACTGGGCG
This corresponds to the amino acid sequence <SEQ ID 3094; ORF 128-1.ng>:
g128-1.pep (partial) 1 MIDNALLHLG 51 NTV7ERLTOIT 101 OQOIEL.YNRF- 151 ELAKLQTEOA 201 AAQSEOKTOY 251 KFDNTAN:DR 301 ARRA-KPYAEK 351 EVKKYFPVE 401 IOOVYMCLYA 451 OKEAP.LSHDE
EEPRFNQIKT
ERVOR TWovy7
KTIKNSPEFA
QLSAKB'SQNV
KIOLQIPHYL
TLENALKTAK
DLAEVKAFAR
VLAGL FAQ TN
REONROGAWM
ILTLFHETGH
=:KPAVQTA
SHULNSVDC P T LSPAQK:KL :,DATDA7G:Y
AVIQYAGNRE
LLOFKNYAEL
E HLOLAD PQ P
KLYOOGFAK
N DYKORRR FA GLHHiLLT'-VD
IAEARGQTAA
ELRAVYNE LM
CHDLROFVLS
FDDAAPLAO:
L.REQ IYRAYV
SLATKMADTP
WDLSYAGEK(L
CVPVWHKDVR
DGL( PAY K LOIS 01.NOV PEIC';FFT77 OAT L-?PERQA PS DAL'A.SIFAA
TRASELSNDG
EQVLNFLHDL
REAKYAFSET
YFELQQNOKT
LVON RAP PVG
K
m128-1/g128-1 ORFs 128-1 and 128-1.ng showed a 94.5% identity in 491 aa overlap
g128-1.pep m128-1 g128-1.pep m128-1 g128-1.pep m128-1 Ml DNALLHLGEEPR
FNQIKTEDIKAQAAAGQAVQHGWNVRTI
I I I i i III lii il 11 I 1 1 1 11111 MTDNALLHLGEE PRFDQIKTEDIKPALQTAIAEAREQIAAI-KQTiTGWATELGT 20 30 40 50 7C 80 90 i(00' 10) 120 E-RVGR:-WGOVVSHLNSVVDTPELR.AVeNE,MPE lTvFFTEIGQI--LYNR-KT
KSPEFA
ERVGRIWGVVSHLNSVADTPELRAVYNELMPE ITVFFTEIGQDIELYNR.KTIK
NSPEFD
80 9C loo 110 120 130 140 1> 160 170 180 TLS PAQKTKLDHDLRDFVLSGAELPPERQ. ELAKLQ" EGAQLSAKFSQNVLOATOAFGIY TL5 AQKKNHDLRFLSGAE'PEQQUAKQTEGALAKFQVDADr.
1.30 140 150 160 170 180 190 200 210 220 230 240 g-'-28-1 .pep FDOAAPLAGLEDALA21FAAQSEGKTGYKIGLQIPHYLAVIQYGRLEIRY m128-1
FDDALGPDLMAAQEKGKGQPYAOYDRLEIRY
190 200 210 220 230 240 250 260 270 280 290 200 g128-1.pep TAS--LSNCGrKFDNTANIDRTLENALKTAKLLGO:KNYAELSLAT AT7VNLO ml12 8 -1 TASE:SDDGKFDNTAIDRT'LANALQTAK..G-KNY.-
LSAKIDTEVNLL
250 260 270 280 290 300 g'28-. pep 31 30 33 30 350 360 310 320 330 340 350 360 370 380 39C 400 410 420 9g12B-1 pep VLG7QKLGGAKVVHDRF7QNKIGYL~ 7GRGW 370 380 390 400 410 420 g2-1 .pep mr128-1 430 440 45- 460 4-70 480 NDYKGRRRFADGTQ=iTAY:,VCNFAPPVG0K EAR- SiiEl LL 8
METGH-GLHHRLLTQVD
H IM 1 ii il I 111: Jijil l I I; I I ij~j i I l im lim NDYKGRRRS:GT QLTAYLVCFAPV ;GEAR--SHOE 7EEHLiLT 430 440 45C 460 47C 480 490 91228- pep ELGVSGINGVF: ml 28-1 ELVGNVWAEPQ.El-WY L;QSi-TI'-
FLDL'JKF
490 500 510 520 530 540
The following DNA sequence was identified in N. meningitidis <SEQ ID 3095>:
a128-1.seq
1 ATGACTGACA ACGCACTGCT CCATTTGGGC GAAGAACOOC GTTTTGATCA
51 AATCXAAACC GAAGACATCA AACCCGCCCT GCAAACC0c ATTGCCGAAG
101 CGCGCGAACA AATCGCCGCC ATCAAGCCC AAC-CACAC CGGCTGGGCA
151 PA.CACTGTCG ;LCCCCTSAC CGGCATCACC S0AC0CCTCG GCAGGATTTG
201 GGGCGrGGTG TCGCACCTCA ACTCCGTCAC CGACACGCCC GAACTGCGCG
251 CCGCCTACAA TGAATTAATG CCCGAAATTA CCGTCTTCTT CACCGAAATC
301 GGACAAGACA TCGAGCTGTA CAACCGCTTC PAACCAT0A AAAACTCCIC
351 CGAGTTCGAC ACCCTCTCCC ACGCGCAn ?ACCAA.CTc ACCACGA--
401 TGCGCGATTT CGTCCTCAGC GGCGCGGAAC TSOCGCCCGA ACAGCAGGCA
451 GAATTGGCAA AACTCAAAC CGAAGGCGCG CA.CTTTCCS CCAAATTCTC
501 551 601 651 701 751 801 851 901 1001 1051 1101 1151 1201 1251 1301 1351 1401 1451 1501 1551 1601 1651 1701 1751 1801 1851 1901 1951 2001
CCAAAACGTC
CCGCACCGCT
GCCGCGCAAA
GCACTACCT C
AAATCTACCG
AAATTCGACA
AACCOGCCAAA
CCAAAATGGC
GCCCGCCGCG
CTTCGCCCGC
GCTACGCCGG
GP.AGTCAAAA
CCAAATCAAA
TCTGGCACAA
ATAGGCGGCG
CGCGTGGATG
TGCAACTGCC
GGCAAAGAAG
AACCGGACAC
TAT C CGGCAT
TTTATGGAAA
CCACGAAGAA
TCGCCGCCAA
TTCGCCCTCT
GAAAAACTGG
TCCGACCGCC
GCAGGCGGCT
GAGCGCGGAC
CAGGCAAACG
GCCGCAGAAT
ACT'CTTGCG C
CTAGACGCGA
TGCCGGCATT
GCGAAGGCAA
GCCGTCATCC
CGCCTACGTT
ACACCGCCAA
CTGCTCGGCr
GGACACCCCC
CCAAACCCTA
GAAAGCCTCG
CGAAAAACTG
AATACTTCCC
AAACTCTACG
AGACGTGCGC
TTTATATGGA
AACGACTACA
CACCGCCTAC-
CCCG CTT GAG CCGACGCGTT CGGCATTTAC CCCGAAGACG CGCTCGCCAT AACAGGCTAC AAAATCGGTT AATACGCCGA CAACCGCAAA
TTTGACGATG
GTTTGCCGCT
TGCAGATTCC
ACCCGCGCCA
CATCGACCGC
TCAAAAACTA
GAACAAGTTT
CC CC GAAAAA
GCCTCGCCGA
CGCGAAGCCA
CGTCGGCAAA
GCATCGGATT
TATTTTGAAT
TTTGTACGCA
AAGGCCGCCG
CTCGTCTGCA
GCGAGCTTTC AGACGACGGC ACGC:CGA ACGCICTG CA CGCCGAATTG TCGCTGGCAA TAAACTTCCT GCACGACCTC GACCTCGCCG AAGTCAAAGC TTTGCAACCG TGGGACTTGG AATACGCATT CAGCGAAACC GTATTAAACG GACTGTTCGC TACCGAAAAA ACCGTCCCCG TGCAACAAAA CGGCGAAACC CGCGAAGGCA AACGCGGCGG CCGTTTTTCA GACGGCACGC ACTTCACCCC GCCCGTCGGC ATCCTCACCC TOTTCCACGA CCAAGTCGAC GA-ACTGGGCG CAGTCGAACT GCCCAGTCAG GTCTTGGCGC AAATGTCCGC AGAACTCTTC GACAAAATGC TCCTCGTCCG
CCAAA-ZTGGAG
GAAGACGACS AAGGCCGTCT GCGC.kkAGAA G-CGCCOTCG ACAGC-?CGG CCACATCT TC TAC3CGTGGG CGGAAGATT A.AG-GACGAT GTC.ZC-CCC CCG CGGCGG ATOCCCAGC CGCGAAzCCGA GCATAGACGC
GGCTTGA
GGCCTGCACC ACCTGCTTAC CAACGGCGTA GAATGGGACG ATI ICGTTTG
ACCGGCGTTC
AAACTTCCA.
TTGATATGAT
CAACAGGTTT
CGAATACAAC
ATTCCGCAGG
GCATACGCCG
CTTTTGGCAG
CCTTCAAAGC
CACAGCGGCT
GGAATACAAT
CCCTGCCGAA
CGCGGAATGT
GATTTACAGC
TAGACAGCGT
CC CTTC CCA
CTATTACAGC
CCTTTGAAGA
GAADATCC-CG
CTTCCGCGGA
TCGACAACGC
This corresponds to the amino acid sequence <SEQ ID 3096; ORF 128-1.a>:
a128-1.pep
MTDNALLHLG
NTVEPLTGIT
GQDIETYNRF
ELAYLQTEGA
AAQSEGKTGY
KFDNTANIDR
AR RANPYAE K
EVKKYFPVCK
I CGVYMDLYA GKT-AP.S H E FMEN FVWEYN FAL 50MM I YS
ACGYSACYYS
AAESF'KAFRG
EEPRFDQIKT
ERVCRIWCVV
KTIKnSPEFD
QLSAKFSQNV
KIGLQIPHYL
TLENALQTAK
OLAEVVKAFAR
VLNGLFAQIK
REC'KRGGAWM
:-LTLFHETGH
VLAQMSAHEE
EDDEGRLKNW
YAWAEVLSAD
REPS:D)ALZR?
EDTKPALQTA
SHLNSVT.DTP
TLSINAQKTKL
LDATDAFCIY
AVIQYADNRK
LLCFKNYAEL
ESLCLADLQP
KLYG IG STEK
NDYKCRRRFS
.LHHLLTQVD
TGVPLPKELF
QQVLOSVRKE
AYAAFEESDD
HSGFDNAA-
!AFEARE Q IA
ELPAAYNELM
NHD0LRDFVLS
FDDA.APLAGI
LREQIYRAYV
S LAT KMA CT
WDLGYAGEKL
TV PVWI{KDVR
DGTLQLPTAY
E SC NC- C KM LAAKN EQ
VA"VVRPPEYN
VAX7GKRP~qQ
IKAQTHTCWA
PEITVFFTEI
CAE CP PCQQA PC CAL.MFAA TRASELS CCC
EQVLNFLHDT,
RE AKYAE T
YELQQNGET
L VON FT P PVG SW CA VSLPSQ
RGMFLVRQN
P.EANSEGHIF
EILAVGGCSRS
m128-1/a128-I ORFs 128-1 and l 2 8-1.a showed a 97.80/ identity in 6771 aa overlap 20, 30 4 0 50 a i28-' .pep MTDNALLHLGEEPRFECQ:K:E0iKPALQTA-AA-SQ:A--AIKA PQTHTGWA NTV!EPL TCIT m128-1 MTNLHGERDITDKAQAA7AEIAKQHGATIPTI 20 30 40 50 80 90 100 110 120 a128-1 .pep ERGIrVHNVD ERA EME:~FTIQILNFT-
SPF
M128-1 ERVGRIWGVVSHL-NSVACTPE-LRAVYNELMPEITVFFEICQDIELYNRFKTIKNSPEF 80 90 100 110 120 130 140 150 160 170 Al 102 a128-1 .pep
TLHQTLRLDVSALPQAEALTGQSKSNLADFI
m128-1 LPQTLHLDVSALPQALKQEALAFtVDADGI 130 140 150 160 170 180 190 200 210 220 230 240 a!286-1.pep
FDAPA:EAAFAASGTYKGQPYAIYDRLEIRY
mn128-1 FZ0AAPLAGIPEAAFAASSTYILIHLVQANERQ-PY 190 200 210 220 230 240 T 250 260 270 280 290 300 a128- pep
TPSLDIKDTNDTEAQA:GKYESAKAATEVNLD
m.28-1
TAESDKDTNDTAAQALGKYESAKATEVNLD
250 260 270 280 290 300 310AK 320 330 340 350 360 a 128-1. pep ARKYEDAVAAELLDQWLYGKRAYFEEKYPG m12 8-1
ARRAKPYAEDAVAAELLDQWLYSKRAYFEEKYPG
310 320 330 340 350 360 380 39C 400 410 420 a128-1. .pep VLGFkIKY7GIKVVHDRY7QNEICYDYRGRGW m128-1 VLNGLFAQIKKLYGIGFTEKTVPVWHKDVRYFE
!HHFIIIH.!ILDYAREKIIAW
370 380 390 400 410 420 430CR 440) 450 460 470 480 a128-1 .pep
NDKRRSGLLTYVNTPGKA'SDITFEGGHLTV
m128-1 NDYKGRRRSGLLTYVNAPGRALHELLHTHLH TV 430 440 450 460 470 480 490 500 510 520 530 540 a128-1 .pep ELGVSGINGVEWDAVELPSQ--,EN-VWEYNVLAQMSAHEGPPEFK7ANm128-1
LVGNVWAEPQMNVENLQSHEGPPEFMT-KQ
490 500 510 520 530 540 550L'- 560 570 580 590 600 m128-1
GFVMEALMMYEDGLNQVDVK'V-QPYRASHI
550 56C 570 580 590 600 610 620 630 640 650 660 a128-1 .pep AGYA3YYWELAAAFEDVAGR7QIA
GSSA-FAR
m128-1 AGYAY~AAVSDYAzE-DV.PTKFQ AGSSAS7-;R 610 620 630 6410 650 660 670 679 a128-1.pep
RESIDALRHSGFDNAAX
HIM1 [1[I.1i: m128-1
REPS:DALLRHSGFDNAVX
670
206
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3097>:
m206.seq
1 ATGTTTCCCC
CCGACAAAAC
51 CCCCTCATGC
GGCACGACCT
101 AGACAGTCCG
GCAAATCCAA
CAAGGCTCGC
CTACAAATGG
TGATTCAATT
GCCCGCGACA
GGCCGGCGAC
ACGTCGGACT
GGCAAAACCA
AGGAACTCAT
GGCGGCAGCA
CGTTTACA~r
TGGCGGCGGC
CTCGTATTCT
CTACATCGGC
TCAAAACCGA
*CCTTTTCCTC
CCGGCAAACA
GCCGTCCGCA
GCTCCACAGC
GCACCGC
AACGCCCTcA AAGCCGsAAA.
TCAACACCGG
AACGGCGAAT
AAAACTCTCC
*TGTCTCAGCG
CACTGCTCCT
CCGCCAA-CCG
AAACCCAAC
TCAGCCACA'
CGACCGCACA
CTCGGACTCA
TCGGCACGCC
CGGCTTCGAT
TOCAGCGGCA
ACGTCAAGCT
GCCGCGCACC
ATCCCCGACA GCCGCyTCAA CGGCGCACAC
CGCTACTC
TCATCCATGC
CCCCAGCAGC
ACACCGTTTT
ACGCCAAA
C'ACC GCACTC-
This corresponds to the amino acid sequence <SEQ ID 3098; ORF 206>:
m206.pep
1 MFPPDKTLFL CLSALLLASC GTTSGKHiRQP 51 QGSQELMu-jS LGLIGTPYKW
GGSSTATGFO
101 ARDMAAASRK IPDSRXT{CD LVFrNTGGA{ 151 GKTIKTEKLS TPFYAKNTYLG
AHTFFTE*
KPKQTVRQIQ
AVRISHIDRT
CSGMIQFVYK
NALNVKLPRT
RYSHVGLYIG NGEFI${APSS The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 3099>: g206 .seq atgttttccc Cgcctcatgc agacagtccg caaggctcgc ctacaaatag tgattcaaot CcCcgcgaca ggCCggcgac acgocggact ggcaaaacca ccgacaaaac ggcacgacct gcaaatccaa aggaactcat ggCggcagca agtttacaaa tggc ggcggc atcgtattct Ccacatcggc tcaaaaccga CcttttcctC CCggcaaaca gCCgtccgca agctccacagc gcaccgcaac aacgccctca aagccgcaaa tcaacaccag aaCggcgaat aaaactctcc tgtctcggcg cca-ccaaccg tcagccaca:- CtCg3gaC:ca cgg.::cgac acaccaaccatcccgaca cQ9cgcacac tcazccatgc caCtgCtCCt aaacccaaac cggccgcaca tcggcacgcc :gcaccgca QgCccaccacc cccgCCtcaa cC:actcac' CCCCggcagc 9CCd~d zy-:,acaa atga This corresponds to the amino acid sequence <SEQ ID 3 100; ORF 2
O
6 g206 .pep 1 1 MFSPDKTLFL CLGALLLASC ;TTSGGIiRQP 51 QGSQELMLHS LGLIGTPYKW GGSST.ATGF0 101 ARDMAAASRK IPDSRLKAGD :rVFF'N-,GGA 151 GI(TIKTEKLS TPFYAKNYLG
AHTFFTE*
KPKQTVRQ:7Q AVRISHjIGRT CSGMIQLVYK
NALNVKT-PRT
RYSHVGLYIG NGEFIl{APGS ORF 206 shows 96.0% identity over a 177 aa overlap with a predicted ORF (ORF 2 O6.ng) from N. gonorrhoeae:m206/g2O6 ~10 20 30 40 m206 .pep M-PDTFCS-'LSGTGIR~ K V -;V.ZSIRQSEMH g2 06 MFSPDKTLFLCLGAwzLASCGTTSGCqiRQPKPKQTVR-QVIHGTSvLMS 20 30 40 s0 m0.ep70 80 90 100 110 120 9206
LGLIGTPYKWGGSSTATGFDCSGMIQLVYKNALUVLRADAARIORKG
80 90 110li 120 Lr'J130 140 150 10 170 mn206 .pep
LV.TGGAHRYSHVGLYIGNGEFIAPSSGTIKTELTFANLATFE
g206
:VFFNTGGHYVGYGGFHPSKITKSPYKYATFE
130 14015 150 170
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3101>:
a206.seq
1 ArGTTTCCCC CCCACAAAAC CCTTTTCCTC- TC-TCTCAC
CACTGCTCCT
51 CGCCTCATGC GGCACGACCT CCGGCAAACA CCGCCAACCG
AAACCCAAAC
101 AGACAGTCCG GCAAATCCxA. GCCGTCCGCA TCAGCCACAT
CGACCGCACA
151 CAAGGCTCGC AGGAACTCAT GCTCCACAGC CTCGGACTCA
TCGGCACGCC
201 CTACAAATGG GGCGGCAGCA GCACCGCAAC CGGCTTCGAT TO3CAGCGGCA 251 TGATTCAATT CGTTTACA AACGCCCTCA ACGTCAAGCT
GCCGCGCACC
301 GCCCGCGACA TGGCGGCGGC AAGCCGCAJZ. ATCCCCGACA
GCCGCCTTAA
331 GGCCGGCGAG CTCGTATTCT TCAACACCGG CSGCGCACAC
CGCTACTCAC
401 ACGTCGGACT CTATATCGGC AACGGCGAAT TCATOCATGC
CCCCAGCAGC
451 GGCAAAACCA TCAAAACCGA AAAACTCTCC AC-ACCGTTTT
ACGCCAAAAA
501 CTACCTCGGC GCACATACTT TCTTTACAGA
ATGA
This corresponds to the amino acid sequence <SEQ ID 3102; ORF 206.a>:
a206.pep
1 MFPPDKTLFL CLSALLLASC GTTSGKHRQP KPKQTVRQIQ
AVRISHIDRT
51 QGSQELMLHS LGLIGTPYKW GGSSTATGFD CSCMIQFVYK
NALNVKLPRT
101 ARDMAAASRK IPDSRLKAGD LVFFNTGGAH RYSHVGLYIG NGEFIHtAPSS 151 GKTIKTEKLS TPFYAK(NYLG
AHTFFTE*
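The relationship between each printed DNA sequence and the peptide stated to "correspond" to it can be checked with a short sketch like the one below. It assumes the standard genetic code and reading frame 1; the function name `translate` and the choice to mark unreadable codons as `X` are illustrative and are not taken from the specification. The two test fragments are the first twenty bases of m206.seq and the last twenty-four bases of a206.seq as printed above.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 1.

    Whitespace is ignored, a trailing partial codon is dropped, and any
    codon containing a non-ACGT character (for example OCR damage) is
    reported as 'X' rather than guessed.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate("ATGTTTCCCC CCGACAAAAC"))       # start of m206.seq
    print(translate("GCACATACTT TCTTTACAGA ATGA"))  # end of a206.seq
```

The two fragments translate to the opening residues (MFPPDK) and closing residues (AHTFFTE followed by a stop) of the printed peptides, which is a quick way to spot where a scanned sequence has been damaged.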
m206/a.206 OR~s 206 and 206.a showed a 99.4% identity in 177 aa overlap O 20 30 40 50 mn206 .pep MFPK~LLALACTSK-PPQVQ:ARSIRQSEMH a206 MFPPDKTLF'LCLSALLLASCTTSGKHRQPKKQTVRQTQAVRI
SHIDRTQSQELMLHS
20 30 40 50 80 90 100 110 120 m206 .pep LGLIGTPYKWGGSSTATFDCSGMIQVYKNALNVKLPRTDMAAASRKTPDSRKG a206 LGITYWGEAGDSMQFYNLVL
AOA-SKPOSRLKAGD
80 90 100 110 :30 140 150 160 170 m2 06. pep LV7T'A RSVLYGGFHPS £-~r-LTFA YGHF7'TEX a2 06 LVFTGt;YHGYINE ,iPSKIKTE:KLST 2FYAKiNYLGAHTFFTEX 140 1509 -60 170 287 The following partial DNA sequence was identified inN Ineningitidis <SEQ ID 3103>: m287 .seq
ATCTTTAZAC
CTGCGGGGGC
TOT CAAAACC
GAAOATGCGC
AGGCAGTCAA
GTGCGGTAAC
GATATGCCGC
CCCGGATCCG
CCGGGGAATC
GACGGAATGC
TACGGCTGCC
CTTCAGATCC
GCAGCGTAA'r
GGOCGGTGGCG
TGCCG CC CCT
CACAGGCAGG
GATATGGCGG
AGCGGATAAT
AAAATCCC
AATATGCTTG
CT CT CAG C C
CGCAATGCC
GAT CGCCCGA
GTTGTTTCTG
TTCTCAAGGA
C OT TT CG CA
CCCAAAAATC
CGGTACAGAT
CCGGAAATAT
TATT:TTTC CCCTTTCAGC TGO7CAAGTCG GCGGACACGC AAAAAOAGAC
AGAGGCAAAG
CAGOOCGCGC
CATCCGCACA
AGAAAATACA
GCCAATGC
AAGACGAGGT GGCACAAAAT AGTTCGACAC
CGAATCACAC
GGAAAATCAA
GCAACGGATC
CGOATATOGC
A.AATGCGGCG
GGCGGOCAAA ATOCCGCA AAACAATCAA
GCCOCCGGTT
CACCTCCGAA TGCGTAGC ACCOOGACCA
TCCGTCGGCA
CAAGGTGCAA
ATCAAOCCGO
CATC:CCCGCG
TCAAACCCTG
601 651 701 751 801 851 901 951 1001 '1051 1101 1151 1201 1251 1301 1351 1401 1451 AATTTTGGAA GGGTTGATTT GCAAAATATA ACGTTGACCC ATTTCTTGGA TGAAGAAGTA GATGCAGACA AAATAAGTAp.
TGTCGGTTTG GTTGCCGATA TTATCTTTTA TA AACCTAAA GCACGGTCGA GGCGGTCGCT TCAGGCGGAT ACGCTGATTG ATTCCGGCAA :ATCTTCGCG
GGCTAATGC
ACTGTAAAGG
CAGCTAAAAT
TTACAAGAAA.
GTGTGCAGAT
C CCACTT CAT
TCCGGCCGAG
TCGATGGGGA
CCCGAAGGGA
CGGATCGTAT
TTGCGGCC
AACGGCCGTC
CGGCAGCAAA
TGGGTACCA
ACTTGGACGG
GGCCGGCGAG
AAAAGGGCGG
GTTTTGATTG
CGATTCTTGT
CAGAATTTGA
GATGGGAAGA
G;LAGGGAATC
TTCCGATT
ATGCCGCTGA
AGOCGGTCAGC
APTACCGGTA
GC,-CTTCGTG
GGCCGTGTAC
CGTACCCGAC
TCTGTGGACG
AAAATTCAAA
AAkAATGGCAG GAAGTGG CC
A'TCGCG
ACGGGCCGTC
AGTGGCAATA
AAAATTAAGT
ATGATAAATT
AATCAATATA
TAGCCTTCT
TTCCCGTCAA
CTGACGGGGC
TCTGACTTAC
TTCAAGGCGA
AACGCGAAG
CAGGGGCAGG
CCATTATCGA
GO CGC CAT CG
CGGGCATGTT
GAAAATACAC
TTTGCCGGCA
GGGGCGGAAA
ACCGGCAAAA
TACTGCATTT
TTTGCCGCAAz
CAGCGGCGAT
ATGGAAACGG
TCCGGAAAGT
C TAT CCCCG
AAAAAGAGCA
AATTGC-CCC
GGCGAAATGC
CCATACGGAA
AAGTCGATTT
GATTTGCATA
CTTTAAGGGG
TTTACGGCCC
ACACATGCCG
GGATTGA
This corresponds to the amino acid sequence <SEQ ID 3104; ORF 287>: m287 .pep 1 MFKRSVIAM4A CIFALSACGG GGGGSPIDVKS ADTLSKPAAP VVSEKE' 51 EDAPQAGSQG QGAPSAQGSQ DMAAVSEENT GNGGAVTADN ?KNEDE 101 OMPQNAAGTD SSTPN1HTPDP NMLAGNME14Q ATO-AGESSQP ANQPDM 151 DGMQGDOPSA GGQNAGNTAA QGANQAGNNQ AAGScSD?1?A SNPAPA) 201 NFGRVCLANG VLIDGPSQN7 TLTHCXGDSC SCNNFLZ)EEV QLKSCC: 251 DADKISNY(K DGKNDKFVG:z VADSVQMKG: NQYIIFYKPK PTSE'ARI 301 ARSRRSLPAE MPUIPVNQAD T-IVOGEAVS TCI;SGN7-A ?ECNYR 351 GAEK.LPGGSY ALRV QGEPAK GEMLAGAAVY NGEVLHrMHTE NCRPYP 4C1 FAAFVDFGSK SVDGIIOSGD DLHMGTQKFK A:D N F-K2C TWTENCG 451 SGKFYGPAGE EVAGKYSYRTAKGP- n"
TEA-K
VAQN
VIAA
NGGS
E LS
FRRS
YLTY
FRGR
SGD
V
-V
The following partial DNA sequence was identified in N gonorrhoeae <SEQ ID 3105>: g2S7. seq 1 51 101 151 201 251 301 351 401 4 51 -51 651 701 801 851 951 1001 1051 12.01 1151 1201 1251 atg--ttaaac ctqtgggggc cgtcaaaacc ctgccgaaag cqatacgcag tttcggcaga aaaaatgaag atccgcaaat CcccogCgtC acgaacgtgg gttgacccac aagaagcacc attaaqcqat tgctqacagq cggacaaacc gagaztccgc qgaagCggtc qgaattaccg tat qccct cc cacggccgtg gtccgtaccc aaatCtgrcgg gcaaaaatc cggaaaatgg gaggaagtgg cggattcggc gcagtgcgat ggccgccccc aaaagaaaa gaCgcaaccg a aat aca ggc acgcgggggc caaacaggaa aaaccctgcc gcaatctgt tgtaaaggcg Qtcaaaatca ataaaaaaga gtaaaaaaag acctac tcgt.
tgat:tc-cgt aqcctgaCgg gtatctgact gtgtgcaagq tacaacggcg gtccggaggc acggcat tat aaagccgcca cggcgggqat cgggaaaata gtgtttqccg tgcaatggct gatcgcccga gttgttgCtg tgaggaggca ccggagaagg aa--ggcgqtg gcaaaatgat acaaccaacc cctgcgaatg tgtgaztgac attottqtaa gaatttgaaa cgagcaacqg atggaactaa tctgcaccgt caatcaqCC qgcattccgg tacgggCgg cgaaccggca aagcgctgca agtttccg cgacagcggc tcgatggaaa qtttccggaa cagc tatci gcaaaaaaga tgta:tt:c ta7zCaagtcg aaaatgccgg Cag-ccaagat cgacaacaac a t C cgcaa a CqC--ggtt:7t gcg, tag oga aaaccgtcoc tgzqataat .atrcaagtcqa gaqaact:;caaa:a~a~c caggagqtc ga:acactqa caata.ttC aaaaa tzcc aaaggcgaaa tttcCata-g caaaagtcca gazgattztgC cgq :tttaag qgttacag ccgacaga!tg tcgggattga C CCCCt cage gcggacacgc ggaaggggtg cgccgCaagc atggcggcag gacaacccc atCcCcCqa tcagattoog rCC:tgga agg aaaa tat aac tta:-:ggatg tgaagaaaaa tcqqa:ccgg: atcrct-tata CIC t-CCggcc ZCtg--ggatgg g--gcccgaag cggcggatca tgCttgttgg qaaaacggcc tttcqgcagc atatgggtac gqgacztgga CcCggCcggC c-zqaaaagqg This corresponds to the amino acid sequence <SEQ ID 3106; ORF 287.ng>: g287 .pep 1 MFKRSVIAIIA C-PLSACCG GGGGSPDVKS AD-?SKAAP VVAENAGEGV 106 51 LPKEKKDEEA AGGAPQADTQ DATAGEGSQD MAAVSAENTG
NGGAATTDNP
101 KNEDAGAQND MPQNAAESAN QTGI4NQPAGS SDSAPASNPA
PANGGSDFGR
151 TNVGNSVVID GPSQNITLTH CKGDSCNGDN LLDEEAPSKS
EFEKLSDEZX
201 IKRYKKDEQR ENFVGLVADR VRI(OGTNKYI IFYTDKPPTR
SARSRRSLPA
251 EIPLIPVNQA DTLIVDGEAV SLTGHSGNIF APEGNYRYLT
YGAEKLPGGS
301 YALRVQGEPA KGEMLVGTAV YNGEVLHFH.I ENGRPYPSGG
RFAAKVDFGS
351 KSVDGIIDSG DDLHMGTQKF KAAIDGNGFK GTWTENGGGD
VSGRFYGPAG
401 EEVAGKYSYR PTDAEKGGFG
VFAGKKDRD
m287/g287 ORFs 287 and 287.ngy showed a 70.1% idniyi 99a vra 20 30 40 49 m287 .pep MFKRSVIAACIFALSACGGGGGGSPDVKSADTLSKPAPVSE------------
FETEA
g287 MFKRSVIAMACIFPLSACGGGGGGSPDVKSA:TPSKPAAPVVAENAGEGVLPKEKR
EE
20 30 40 50 60 ?0 80 90 100 109 m287 .pep KEDAPQAGSQGAPSAQGSQDMAVSEENTGNGGAVTADNPKNEDEVAQNDMPNAG g287
AGAQDQ-AAESDAVANGGAA-DPNDGQDPNA-
80 9 C 10c 110 110 120 130 140 150 160 169 m287 ep DSSTPNH-TPDPN
MLAGNMFNQATDAGESSPNQPDMANAAGMQGOOPSAGGQNGT
g 2 8 7 170 180 190 200 210 220 229 m287. pep AQGANQAGNNQAGSSDP:PASNPAPANGGSN3.GRVDLANGVLIDGPSQNILHXD g287 -EAQGNPGSS SPP'GS7RNGSrIGSNT
HKD
120 130 140 150 160 170 230 240 250 260 270 280 289 m2S7 .pep CSNFD-VLSFKSDDINKD _EVLVDVMIlyTFK 180 190 20C 210 220 230 290 300 313 320 330 340 349 m287 .pep KTSFRRSARSP STV P-LIPVNQADTLTVDGEAVSLGCHSN.FAPENYYL g287 KPPT-RSARSRRSLPA7-:P:PljQATTV-GEA'iS'TGHSGN7FAPEGNYRYLT 240 250 260 270 230 290 350 360 370 380 390 4003 409 m2 87. pep YG-KPGYLVGPkGMAAVN LFTrGPPr'i
-ADS
g287 YGAEKLPGGSYALRVQGEPAKGEMLVGTAVYNGEVLHFHMENGPYSGO7
AAEVFG
300 31C 320 330 340 350 410 420 430 440 45-0 460 469 m287 .pep KSVDGhbDSGDDLHNGTQKFKIDGNGeKGTWTENGSGDVSGKFYGPAGEvVAGYY g287
KSDIDGDHGOFAIGGKGWEGGVGFGAEVGYY
360 3-1 380 390 400 410 470 480 489 m287 .pep PTDAEKGGFGV.FAGEKEQOX g287 1 1 1 1 1 m 1 1 1 1 11I 1 PTDAEKGG
£GVFAGKKDRDX
420 430 The following partial DNA sequence was identified In N. meningitidis <SEQ ID 3107>: a-287. seq 1 51 131 151 201 251 301 351 401 451 501 551 601 651 701 151 801 851 901 951 1001 1051 1101 115-1 12 01 1251 1301 1351 1401 1451
ATGTTTAAAC
CTGTGGGGG(
TGTCAAACC
CTGCCGAAAC:
CGATACGCAG
TTTCGGCAGA
GAAAATAAAG
TACAGATAGT
GAGATATGGG
AACCAACCGG
GTCGGCAGGG
CTGAAAACAA
CCTAACGCCA
TGGCATCAAG
AAGACAAAGT
TCAGAATTTG
AGACGAGCAA
AGAATGGAAC
TCTTCATCTG
GCC CAGAT C
ATGGGGAAGC
GAAGGGAATT
ATCGTATGCC
CGGGCACGC
GGC CGT CGT
CAGCAAATCT
GTACGCAAAA
TGGACGGAAA
CGOCGAAAA
AOOOCGGATT
-GCAGTGTGAT
-GGCGGTGGCG
TGCCGCCCCT
AAAAGAAAGA
GACGCAACCG
AAATACAGGC
ACGAGACC
TCGACACCGA
AAACCAAGCA
ATATGGCA
GAAAAZT(-r1CG
TCAAGTCGGC
CGAATGGCGG
CTTGACAGCG
ATOCGATAGA
AAAAATTAAG
CGAGAGAATT
TAACAA-ATAT
CGCGATTCAG
CCGCTGATTC
GGTCAGCCTG
ACCGGTATCT
CTOAGTGTGC
CGTGTACAAC
COCCGTCCOGG.
GTGGACGGCA
TI
ATT- 'O G ATOGGCOG
G
GTGGCGGo CGOCGTGTTT
G
TGCAAToGGC
GATCGCCCG~
GTTGTTACT(
TGAGGAGGCG
CCGGAAAo
AATGGCGGTC
GCAAAATGAT
AT CACACOC C
CCGGATGCCG
TGCOGGAC
GCAATACOGC
GGCTCTCAAA
CAGCGATTTT
GTTCGGAAAA
GATTTCTTAG
TGATGAAGAA
TTGTCGGTTT
GCATCATT
GCOTTCTGCA
OOGTCAATCA
A.CGGCATT
GACTTACGGG
AGGOGAACC
OGCGAAGTGC
.GGCAGGTT
TATCGACAG
7
;TCGATG
GATOTTTCC
LATGCGCTA
TGI Ar'TTTT
CCCTTTOAGC
k. TGTTAAGTCG
GCGGACACGC
A-AGATGTCGO
GGAAGAGGTG
GTGAGTGGTG
CGCCGCAAGC
CGGTCAAGAT
ATGGCGGCAG
CGOCACAAC
GOATAATCCC
A-GCCGCAAA
ATGCCGCCGA
TGCACCGAAT
ATGCCAACCA
GGGAATCGGC
ACAACCGGCA
GGAATGCAGG
OGGACGATC
AGATCAAGCT
GCAAATCAAC
ATCOTGCCTC
TTCAACCAAT
GGAAGGATAA
ATGTAGCTAA
TGTAACGTTG
ACACATTGTA
ATGAAGAAGC
ACCA~CAA
AAAATTAATA
AATATAAAAA
GGTTGCTCAC
AGGGTAAA
ATALAAGACAA
GTCCGCTTCA
CGGTCGAG-C
GGTOOCTTOC
GOGGATACG
CTGA'TTOTCG
CCGGOAA:AT
OTTCGCOCCO
GCGGAAAX-AT
TGTCCGGCOG
GGOAAAAGG(c
OAAOCT'G
TOCATTTCCA
TATGGAAAAC
GCCGOA-A
TCGATTOGG
CGGCGATGAT
TTGCATATGG
OAAACGGCTT
TADAOG,GG"-T
GGAAGGTTTT
ACGGCCCOGC
TCGCOOOACA GATGooOGA This corresponds to the amino acid sequence <SEQ ID 3108; ORF 2 87.a>: a287 .pep 1 MFKRSVIAMA CIVALSACGG GGGGSPDVKS ADTLSKPAAP
VVTEDVGE
51 LPKE-KKDEEA VSGAQATQ DATAGKGGQD MAAVSAENTS NGGA -TD 101 7ENKDEc-PQND MPQNAAD-DS STPNHTPAPN MPTRDMONQA
PDAGESAQ
151 NQPDMANAAD GMQGDDPSAG ENAONTADQA ANQAENNQVO
GSQNPASS
20v1 PNATNGGSDF GR:NVAMG-K !DSGSENVTL -NCKDKVCO:R
DFLDEZEAP
251 SEFEKLSDEE KINKYKKDEQ RENFVGLVA; RVEKNGTNKY VI7YKDKS 301 SSSARFRRSA RSRRSLP;.EM PL:PVM-2ADT LIVDGEAVSL TGiISGENI 351 EGNYRYLTYG AEXLSOGSYA LSVQGEPA(G EMLAGTAVY>N
GEVLHNFHM
401 GRPSPSGORF AAKVDFGSKS VDG1::DSGDC LHMGTQK?-z
VIDGNGFK
451 W7NGGVS F I
EV
NFB
PA
TN
PlC
AS
'A?
EN
E- IAGKYSYRPT DAEKGGFGVF AGKK7-QDmn2 7 /a2 87 ORFs 287 and 2 5 7 .a showedi a '7.2'1 iden:; v in 301 aa cveriap 20 30 40 4 9 m297 .pep MFKRSVIAI-IACIFAL3ACO,-CCOGSPDVKSADTLSPAVS
KEE
a28 7 MFKRSVIAMAIVALSACGGGGCSGVK3ADTLSKAPVEGELKEiDA 20I 30 40 50 3060 70 80 90 100 109 m287 .oep EAPQAGS)QGQGAPSAQGSQ0MAAVSEENTGNGGAVTAtINEVANMQAG a287 VSGAPQADT-DTGG
CMASETNGATIEKEPDPQAT
80 90 130 110 108 110 120 130 140 150 160 169 m287.pep
D)SSTPNHTPDPNMLAGNMENQATDAGESSQPANQPDMANAAGMQGDDPSAGGQNAGNTA
a287
D'SSTPNHTPAPNMPTRDMGNQAPAGESAQPNQPDANADGMQGDOPSAG.ENAGNT
120 130 140 150 160 170 160 190 200 210 220 229 rn287 .pep AQGNQAGNNQAGSSDPIASNPAPAGGSNFGRVDLNGVLIDGPSQNITLTHCKGDS a287
DQAANQAENNQVGGSQNPASSTNPNATNGGSDFGRINVANGIKLDSGS.NVLHKV
180 190 200 210 220 230 230 240 250 260 270 280 289 m287 .pep CSGNNFLDEEVQLKSEFEKLSDADKISNYKK0GKNDKFVGLVADSVQMKGINQY:IFYKP 1: :11111: i M 1 1 1 I I I 1 1 a287 CD-RDFLDEEAPPKSE -FEKLSDEEKINKYKKDEQRENFVGLVADRVEKNGTNKYVI
TYKO
240 2 0 260 270 280 290 290 300 310 320 330 340 m287 .pep KP--TSFARF9.RSARSRRSLPAEMPLIPVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRY a287
KSASSSSARFRRSARSRRSLPAEMPLIPVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRY
300 310 320 330 340 350 350 360 370 380 390 400 m287.pep)
LTGELGSAPQEAGMAAAYGVHHEGPPRRAK
I I I I I 111 1!F1 1: 1 1 1 a287
LTGZLGSASQEAGM.GAYGVHHEGPPGRAKD
360 370 380 390 400 410 410 420 430 440 450 460 m287 .pep GSKSVDGIIDSGDLHMGTQKFAIDGNGFKGTWTENGSGDVSGKFYGPAGEEVAGKYS a287 GSKSVDGIDSGDLHMGTQFKVIGGFKTWTENGGDVSGRFYGPAGEW-AG~v 420 430 440 450 460 470 470 480 489 m28'.pep YRPTDAE2KGC-FGVFAGKKEQDX a2gT YRPTDAEKGGFCVFAGKK-'QDX 480 490 406 The following partial DNA sequence was identified in N. meningiuidis <SEQ ID 3109>: m406.seq 1 ATGCAAGCAC GGCTGCTGAT ACCTAT'rCTT T-TTCAGT:-T
TATTTATC
51 CGCCTGCGGG ACACTGACAG GTATTCCATrC GCATGGCGGA GGTAAACGCr 101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT, C'GCCAGAGC
TGCCGTTAAA
151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC CACTATGGGC GACCAAGGTT CAGGCAGT-T GACAGGGGGT CGCTACTCCA 251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC 301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG 351 TTTGACAGGT TT.AACCACTT CTT-ATCTAC ACTTAATGCC CCTGCACTCT 401 CTCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT 451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACC-TGACGA CTAACCCGCG S01 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG 551 GCATAGACGT TG-TTCTCCT GCCAATGCCG A7ACAGATGT GTTTATTAAC 602 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA 651 TGCCGAAACA CTGAAAGC:C AAACAAAACT GGAATATTTC GCAGTAGACA GAACCAATAA AAAATTGCTC ATCAAACCA GCCTATAAAG AAAATTACGC
ATTGTGGATG
AGGAATTAAA CCGACGGAAG GATrAATGGT CATACGGCAA TCATACGGGT
AACTCCGCCC
AGTCATGAGG GGTATGGATA
CAGCGATGAA
AGGACAACCT TGA AAACCAATGC
GTTTGAAGCT
GGGCCGTATA AAGTAAGCAA CGATTTCTCC
GATATCCGAC
CATCCGTAGA
GGCTTAAC
GTAGTGCOAC AACATAGACA This corresponds to the amino acid sequence <SEQ ID 3110; ORF 406>: m406 .pep MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL
VAASARAAVK
51 DM~rQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG
EYINSPAVRT
101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG
SGSKSSLGLN
151 IGGMGDYRNE TLTTNPRDTA FLSHLvQrVF FLRGIDVVSP
ANADTDVF:N
201 IDVFGTIRNR TEMHLYNAIT LKAQTKLEYF AVDRTNKKiL
IKPKTNAFEA
251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIRPYGNITG
NSAPSVEADIJ
301 SHEGYGYSDE VVRQHRQGQP The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 311 1>: g406.seq I ATGCGGGCAC GGCTGCTGAT
CGCCTGCGGG
TCGCGGTCGA
GACATGGATr
AACTATGGGC
TTGATGCACT
GATTACACCT
TTTGACGGGT
CGCGCACCCA
ATTGGCGGGA
CGACACTGCC
GCATAGACGT
XZ'CGACGTAT
TGCCGAAACA
GAACCAATAA
GCCTATAAAG
AGGAATCAAA
CATACGGCAA
AGT'CATGAGG
AGGGCAACCT
ACACTGACAG
ACAAGAACTT
TACAGGCATT
GACCAAGG'-T
GATTCGCGCc
ATCCGCGTTA
TTAACCACTT
ATCAGACGGT
TGGGGGATTA
TTTCTTTCCC
TGTTTCTCCT
TCGGAACGAT
CTGAAAGCCC
AAAATTGCTC
AAAATTACGC
CCGACGGAAG
TCATACGGGT
GGTATGGATA
TGA
ACCTATTCTT
GTAr.TCCATC
GTGGCCGCTT
ACACGGACGA
CAGGCAGT-T
GAATACATAA
CGAAACCACC
C-TTATCTAC
AGCGGAAGTA
TCGAAATGAA
ACTTGGTGCA
GCCAATGCCG
ACGCAACAGA
AAACAAAACT
ATCAAACCCA
ATTGTGGATG
GATTGATGGT
AACTCCGCCC
CAGCGATGAA
TTTTCAGTTT
GCATrGGCGGA
CTGCCAGAGC
AAAGTTGCAT
GACAGGGGGT
ACAOCCCTGC
GCTGAAACAA
ACTTAATGCC
GGAGCAGTCT
ACCTTOACGA
GACCGTATTT
ATACAGATGT
ACCGAAATGC
GGAATATTTC
AAACCAATGC
GGGCCGTATA
CGATTTC-CC
CATCCGTAGA
GCAGTGC.AC
TTATTTTATC
GGCAAACGCT
TGCCGTTAAA
=GACATTGC
CGCTACTCCA
CGTCCGCAZC
CATCAGGCGG
CCTGCACTCT
GGGCTTAAAT
CCAACCCGCG
TTCCTGCGCG
GTTTATrTAAC ACCTA'rACAA
GCAGTAGACA
GTTTGAAGCT
AAGTAAGCAA
GA'FATCCAAC
GGC'rGATAAC
AACATAGACA
This corresponds to the amino acid sequence <SEQ ID 3112; ORF 406>:
g406.pep
MRARLLIPIL FSVFILSACG
TLTGIPSHGG
DMDLQALHGR KVALYIATMG
DQGSGSLTGG
DYTYPRYETT AETTSGGLTG
LTTSLSTLNA
IGGMGDYRNE TLTTNqPRDTA
FLSHLVQTVF
IDVFGTIPNR TEMHL-YNAET
LKAQTKLEYF
AYKENYA-LWM GPYKVSKGIK PTEzGLM~VDFS SHEGYGYSDE AVRQHRQGQP
GKRFAVEQEL
RYSIDALIRG
PALSPTQSDG
FLRGIDVVSP
AVLDRTNKKLL
DIQPYGNHTG
VAASARAAVI
EYINSPAVR'r
SGSRSSLGLN
ANADTDVF IN I KPKTNAFrA
NSAPSVEALN
ORF 406 shows 98.8% identity over a 320 aa overlap with a predicted ORF (ORF406.a) from N. gonorrhoeae:
g406/m406 20 30 40 50 9406 .pep MRRLPLSFLAGLGPHGGRAEEVAAAVDDQLG m406 MQARLLI P I FSVFILSACGTLTGI PSHGGG FAVEQELVAA.
R~KMDOLG
110 20 30 40 50 80 90 100 110 120 g406 .pep KVALYIATMGDQGSGSLTGGRYSIAIGEYIMSPAVTDYTYPRYETTA
ETTSGGLT
m4 06 KVLITGQSSTGYIDLREYNPVTYYRE-A7SGT 80 90 100 110 120 130 140 150 160 170 180 9406 .pep LTSSLAASTSGGiSLL-GMDRETNR:TFSLQV m4 06 LTTFFFINPIFSTFSDFFFF SLGHIFH:MHDYFF
ETTTNFIHAFWHWLQHW
130 140 150 160 170 180 190 200 210 220 230 240 9406 .pep FLGDVPNDDFNDFTRRTMLNELATLYADTKL m4 06 FLGDVPNDDFNDFTRRTMLNELATLYADTKL 190 200 210 220 230 240 250 260 270 280 290 300 g406.pep IKKNFAYEY-WGYVKIPTGr]FDQYNTNASED mn406 I KPKTNAFEAAYKENYALWM1GPYKSKGI KPTEGLMVDFSDI RPYGNI{TGNSAPSVEADN 250 260 270 280 2-30 300 310 320 9406 .pep SHEGYGYSDEAVRQHRQGQPX 11 F F1111: F1I1111 m4 06 SHEGYGYSDEVVRQHRQGQPx 310 320 The following partial DNA sequence was identified in. Pv~ eningitidis <SEQ ID 3113>: a406. seq 1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT T7TT-CAGTTT PTATTTTATC 51 CGCCZ'GCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT 101 TCGCGGTCGA ACAAGACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 151 GACATGGATT TACAGGCATT ACACGGACGA A.GTTGCAT TGTACATTGC 201 ACATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA 251 TTGATGCACT GATT'STGGC GAATACATAA ACAGCCCTGC CGTCCGTACC- 3C1 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACA CATCAGGCOG 351 TTTGACAGGT TTAACCACTT CTTTATCTAc ACTTA-ATGCC CCTCCACTCT 401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA AAGGTC: GGCTAT 451 ATTGGCGGGA TGGGGG -n TA TCGAAATGA- ACCTTGACGA CTAACCCGCG 501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATT TTCCTGCCCG 551 GCATAGACGT TGTTTC-CCT GCCAATGCCG ATACGGATG- GTTTATTAAC 601 ATICGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA ,651 TC-CCGAAACA CTGAAAGCC'c AAACAAAACT GGAATATTTC
GCAGTAGACA
GAA.CCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT 751 GCCTATAAAG AAAATTACGC ATTOTGGATG GGACCGTATA AAGTAAGCAA 801 AGGAATTAAA CCGACAGAAG GATTAATGGT CGATT:CTCC OATATCCAAC 851 CATACGGCnA TCATATGGGT ACTCTGCCC CATCCGTAGA GGCTGATAAC 901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTSCGAC GACATAGACA 951 AGGGCAACCT TGA This corresponds to the amino acid sequence <SEQ ID 3114;1 ORF 406.a>: a406.pep 1 MQARLIP:L FSVFLSACG -LTGIPSHGG GKRFAVEQEL VAASAPAAVpK 51 DMDLQALHGR KVALYIATMG L2GSGSLTGG RYS:ODAL:RG EYINSPAVRT 101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN 151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLPGIDVVSP ANADTDVF:N 201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVCRTNKKLL IKPKTNAFEA 251 AYKENYALWM GPYKVSKGIK PTEGLVDFS DIQPYGNHMG NSAPSVEADN 301 SHEGYGYSDE AVRRHRQGQP m406/a406 ORFs 406 and 406.a showed a 98.B% identity in 320 aa overlap 20 30 40 50 m406.pep MQARLLIP:7-FSVF:7ISACGTLTGIPSHGGGKRFAEQELVS
AKDLLH
i i I !lliii ii II I l ii l ii lll I i Ili///li I I a406 MQARLL PILFSVFILSACGTLTGISHGGGRFAVEQELVAASARAVKDMDLQALHGR 20 30 40 50 80 90 100 110 120 m406.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG I I il l l iiil i lii, i l l lll il 111 1 11 11 1 1 a406
KVALYIATMGDQGSGSLTGGRYSIDALIRGEVINSPAVRTDYTYPRYETTAETTSGGLTG
80 90 100 110 120 130 140 150 160 170 180 m406.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 1111111111liiiill 111111 1111 1111111i iii i111 lii~lilll~l III a406 LTTSLSTLNAPALSRTQSDGSGSKSSLGLN
IGGMGDYRNETLTTNPRDTAFLSHLVQTVF
130 140 150 160 170 180 190 200 210 220 230 240 m406.pep FLRGIVVSPANADTDVF:NI0VFGTIRNRTrM-MLYNAETL-,AQTKL
FAVDRTNKKLL
Illlll~ I li!iiIiii;II Ii; lllllll ll i a406 FLRGI DVVS PANADTDVFIN: DVEGTIRNIR-MH:YNAETLKAQTKLEYFAVRTNKKLL 190 200 210 220 230 240 250 260 270 280 290 300 m406.pep IKPKNAFE-zYKENYALWMGPYKVSG TEIV D!PNGS V N 111111 i lllllll 111111lii I I I li Illi: li 111111111 I a406 I:PKTNAFEA.YKENALWMGPYKVSKGIPTGL 1DN 250 261 270 280 290 300 310 320 m406.pep S.HEGYGYSDEVVRQHRQGQ x a406 SHEGYGYSDEAVRRHRQGQPX 310 320
EXAMPLE 2 Expression of ORF 919
The primer described in Table 1 for ORF 919 was used to locate and clone ORF 919.
The predicted gene 919 was cloned in a pET vector and expressed in E. coli. The product of protein expression and purification was analyzed by SDS-PAGE. Panel A) shows the analysis of the 919-His fusion protein purification. Mice were immunized with the purified 919-His and sera were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Symbols: M1, molecular weight marker; PP, purified protein; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that 919 is a surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 919 are provided in Figure 10. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 919 and the amino acid sequence encoded thereby are provided in Example 1.
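As a rough illustration of how a per-residue profile of the kind plotted in the hydrophilicity figures can be computed, the sketch below averages a residue scale over a sliding window. The specification does not state which scale or window was used for the figures, so the Kyte-Doolittle hydropathy scale and a window of 7 are assumptions made here purely for illustration; the function name `hydropathy_profile` is likewise hypothetical. Note that on this scale positive values indicate hydrophobic stretches, so a hydrophilicity plot would have the opposite sign.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Assumption: used here as a stand-in; the source does not name its scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Return the mean hydropathy of every full window along the sequence.

    Whitespace is ignored. A residue outside the 20 standard amino acids
    raises ValueError, so damaged input is reported rather than skipped.
    """
    seq = "".join(protein.split()).upper()
    unknown = sorted(set(seq) - set(KYTE_DOOLITTLE))
    if unknown:
        raise ValueError(f"unsupported residue(s): {unknown}")
    if window < 1 or window > len(seq):
        return []
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # First ten residues of a printed peptide, used only as sample input.
    for position, value in enumerate(hydropathy_profile("MFPPDKTLFL"), start=1):
        print(position, round(value, 2))
```

Each output value describes the window starting at that position; plotting the values against position gives a profile in which sustained negative runs suggest exposed, hydrophilic regions.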
EXAMPLE 3 Expression of ORF 279
The primer described in Table 1 for ORF 279 was used to locate and clone ORF 279.
The predicted gene 279 was cloned in a pGex vector and expressed in E. coli. The product of protein expression and purification was analyzed by SDS-PAGE. Panel A) shows the analysis of the 279-GST purification. Mice were immunized with the purified 279-GST and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that 279 is a surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 279 are provided in Figure 11. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 279 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 4 Expression of ORF 576 and 576-1
The primer described in Table 1 for ORF 576 was used to locate and clone ORF 576.
The predicted gene 576 was cloned in a pGex vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. Panel A) shows the analysis of the 576-GST fusion protein purification. Mice were immunized with the purified 576-GST and sera were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that ORF 576 is a surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 576 are provided in Figure 12. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 576 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 5 Expression of ORF 519 and 519-1
The primer described in Table 1 for ORF 519 was used to locate and clone ORF 519.
The predicted gene 519 was cloned in a pET vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. Panel A) shows the analysis of the 519-His fusion protein purification. Mice were immunized with the purified 519-His and sera were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that 519 is a surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 519 are provided in Figure 13. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 519 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 6 Expression of ORF 121 and 121-1
The primer described in Table 1 for ORF 121 was used to locate and clone ORF 121.
The predicted gene 121 was cloned in a pET vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. Panel A) shows the analysis of the 121-His fusion protein purification. Mice were immunized with the purified 121-His and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that 121 is a surface-exposed protein. Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that 121 is a surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 121 are provided in Figure 14. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 121 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 7 Expression of ORF 128 and 128-1
The primer described in Table 1 for ORF 128 was used to locate and clone ORF 128.
The predicted gene 128 was cloned in a pET vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. Panel A) shows the analysis of the 128-His purification. Mice were immunized with the purified 128-His and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that 128 is a surface-exposed protein. Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that 128 is a surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 128 are provided in Figure 15. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 128 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 8 Expression of ORF 206 The primer described in Table 1 for ORF 206 was used to locate and clone ORF 206.
The predicted gene 206 was cloned in pET vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. In panel A) is shown the analysis of 206- His purification. Mice were immunized with the purified 206-His and sera were used for Western blot analysis (panel It is worthnoting that the immunoreactive band in protein extracts from meningococcus is 38 kDa instead of 17 kDa (panel To gain information on the nature of this antibody staining we expressed ORF 206 in E. coli without the His-tag and including the predicted leader peptide. Western blot analysis on total protein extracts from E.
coli expressing this native form of the 206 protein showed a recative band at a position of 38 kDa, as observed in meningococcus. We conclude that the 38 kDa band in panel B) is specific and that anti-206 antibodies, likely ;ecogn:ze a multimeric protein complex. In panel C is shown the FACS analysis, in panel D the bactericidal assay, and in panel E) the ELISA assay.
Results show that 206 is a surface-exposed protein. Symbols: Ml, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band These experiments confirm that 206 is a surfaceexposed protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipatic regions of ORF 519 are provided in Figure 16. The AMPHI program is used to predict putative T-cell epitopes (Gao et al 1989, J Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J mmunol Suppl 11:9).
The nucleic acid sequence of ORF 206 and the amino acid sequence encoded thereby are provided in Example 1.
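The hydrophilicity plots referred to in these examples are sliding-window profiles over the protein sequence. The specification does not state which scale or window size underlies its figures, so the following is only a minimal sketch assuming the Kyte-Doolittle hydropathy scale and a seven-residue window; the function names and the threshold are hypothetical and are not part of the AMPHI program.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Assumption: the specification does not name the scale behind its figures.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=-1.0):
    """Return 0-based start positions of windows whose mean hydropathy
    falls below the (hypothetical) threshold, i.e. hydrophilic stretches."""
    profile = hydropathy_profile(sequence, window)
    return [start for start, value in enumerate(profile) if value < threshold]


if __name__ == "__main__":
    # Toy peptide for illustration only; it is not taken from the sequence listing.
    peptide = "KKKKIIII"
    print(hydropathy_profile(peptide, window=4))
    print(hydrophilic_windows(peptide, window=4))
```

Strongly negative windows mark hydrophilic stretches, which are the regions most likely to be exposed to solvent. Any residue outside the twenty standard amino acids raises a KeyError, which is intentional here so that corrupted input is not scored silently.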
EXAMPLE 9 Expression of ORF 287 The primer described in Table 1 for ORF 287 was used to locate and clone ORF 287.
The predicted gene 287 was cloned in pGEX vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. In panel A) is shown the analysis of 287-GST fusion protein purification. Mice were immunized with the purified 287-GST and sera were used for FACS analysis (panel B), bactericidal assay (panel C), and ELISA assay (panel D). Results show that 287 is a surface-exposed protein. Symbols: M1, molecular weight marker. Arrow indicates the position of the main recombinant protein product. These experiments confirm that 287 is a surface-exposed protein and that it is a useful immunogen.
The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 287 are provided in Figure 17. The AMPHI program is used to predict putative T-cell epitopes (Gao et al 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 287 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 10 Expression of ORF 406 The primer described in Table 1 for ORF 406 was used to locate and clone ORF 406.
The predicted gene 406 was cloned in pET vector and expressed in E. coli. The product of protein purification was analyzed by SDS-PAGE. In panel A) is shown the analysis of 406-His fusion protein purification. Mice were immunized with the purified 406-His and sera were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and ELISA assay (panel E). Results show that 406 is a surface-exposed protein. Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main recombinant protein product and the N. meningitidis immunoreactive band. These experiments confirm that 406 is a surface-exposed protein and that it is a useful immunogen.
The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 406 are provided in Figure 18. The AMPHI program is used to predict putative T-cell epitopes (Gao et al 1989, J Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 406 and the amino acid sequence encoded thereby are provided in Example 1.
EXAMPLE 11 Table 2 lists several Neisseria strains which were used to assess the conservation of the sequence of ORF 225 among different strains.
Table 2
225 gene variability: List of used Neisseria strains
Identification number    Strains    Source / reference
Group B
zo01_225    NG6/88    R. Moxon / Seiler et al., 1996
zo02_225    BZ198    R. Moxon / Seiler et al., 1996
zo03_225    NG3/88    R. Moxon / Seiler et al., 1996
zo04_225    297-0    R. Moxon / Seiler et al., 1996
zo05_225    1000    R. Moxon / Seiler et al., 1996
zo06_225    BZ147    R. Moxon / Seiler et al., 1996
zo07_225    BZ169    R. Moxon / Seiler et al., 1996
zo08_225    528    R. Moxon / Seiler et al., 1996
zo09_225    NGP165    R. Moxon / Seiler et al., 1996
zo10_225    BZ133    R. Moxon / Seiler et al., 1996
zo11_225    NGE31    R. Moxon / Seiler et al., 1996
zo12_225    NGF26    R. Moxon / Seiler et al., 1996
zo13_225    NGE28    R. Moxon / Seiler et al., 1996
zo14_225    NGH38    R. Moxon / Seiler et al., 1996
zo15_225    SWZ107    R. Moxon / Seiler et al., 1996
zo16_225    NGH15    R. Moxon / Seiler et al., 1996
zo17_225    NGH36    R. Moxon / Seiler et al., 1996
zo18_225    BZ232    R. Moxon / Seiler et al., 1996
zo19_225    BZ83    R. Moxon / Seiler et al., 1996
zo20_225    44/76    R. Moxon / Seiler et al., 1996
zo21_225    MC58    R. Moxon
zo96_225    2996    Our collection
Group A
zo22_225    205900    R. Moxon
zo23_225    F6124    R. Moxon
z2491    Z2491    R. Moxon / Maiden et al., 1998
Group C
zo24_225    90/18311    R. Moxon
zo25_225    93/4286    R. Moxon
Others
zo26_225    A22 (group W)    R. Moxon / Maiden et al., 1998
zo27_225    E26 (group X)    R. Moxon / Maiden et al., 1998
zo28_225    860800 (group Y)    R. Moxon / Maiden et al., 1998
zo29_225    E32 (group Z)    R. Moxon / Maiden et al., 1998
Gonococcus
zo32_225    Ng F62    R. Moxon / Maiden et al., 1998
zo33_225    Ng SN4    R. Moxon
fa1090    FA 1090    R. Moxon
References:
Seiler A. et al., Mol. Microbiol., 1996, 19(4):841-856.
Maiden et al., Proc. Natl. Acad. Sci. USA, 1998, 95:3140-3145.
The amino acid sequences for each listed strain are as follows:
FA1090 <SEQ ID 3115>
MDS FFPVALLFV AAELNL S REQI LRQFAE DEQPVL PJPARAG NADEL I GSAMGLNE QPVLPVRPARRAGNADE L GSAMGLLG 7AYRYG3T S VSPG FDCS GF-MQH7 FRAIMGTNLPRTSAEQARGAPVARSELCPGDMVFFR..L.S~SVLIN R -IHAPRTGKNIEITSLSHKYWSGKYAF7ARRVKKDPSRFLN
Z2491 <SEQ ID 3116>
MDS 7FKPAVWAVLWLMFAVRPALADELTNLLS SREQ: LRF-DQVPTFPZRA NAD)EL IGSAMGLNEQ VL PRV PRGNADE LI GNAIGLN EQ PVrNRV PARAGNA DEIGAGNQVPNAAPGNDL!NML' ARGT STG FDCSG F MQFRMG INLPRTSEQRMGTPVASE LQPGDMVFF..
RISVL'INP
IYPTKIETLHYSGY
*RKKDSFN
Z001 225 <SEQ ID 3117> MD:;FPVAVWMARAADLI S QLR.ZQEQPV-TP7NRPARRAG NADELIGSAMGLNEQPVLVNRVPARRAGNADELIGNAMGLN.Q PVNv PAFARP GNA DEL IGNAMGLLG IAYRYGGTS ISTGFOCSGFMQH I FKRAIG NLPRSAEQMFGTPVA-P SELQPGFRTLGGSRISVGLYIGNNRIARTG.NI. ITS LSHKYWSGFYAFARR
VKKNDPSRFLN*
Z0C2 225 <SEQ ID 3:18> MDS FFKPAVWAVLWLMFAVRPAT ADELTN4LLSSREQ:
LRQFAEDEQPVLPINPAPAG
NADEL IGSAMGLNEQPVL PVNRV PARRAGNADE L I GNAC.LNEQ PVT P\NRAPARRAGNA DELIGNAM4GLLG:AYRYGGTSVST.GFDCSG-MQH:KRAYGNP
SAMGPA
SELQPGDMVFFRTLGGSR:.SHVGLYIGNNRFIHAPRTGKN
ITSLSHKYWSGKYAFARR
VKKtIDPSRF'LN* Z003_225 <SEQ ID 3119> MDS7FKPAVWAVLWLMFAVRLALADEL TNLLSSREQILRQFA
EPLINPAPARRAG
NADEL IG SAMGLNEQ PVL PVTRVPAR RAGNAD E GNAM.GLNEQ PVLPVNRA PAR RAGNA DELIGNARGLLGIAYRYGGTSVSTGFDCSGFMQHZ
FKAGINLPRTSAEQARMGTPVAR
SELQPGDMVFFRTLGGSRI SHVGLYIGNNRFIRAPRTGKN
IEITSLSHKYWSGKYAFARR
VKKNDPSRFLN Z004 225 <SEQ ID 3120> MDS FFKAVWAVLWLMFAVRPALADELTNLLSSREQI LRQFAEDEQPVLP!NRA.~AR
G
NADEL IGSANGLNEQPVL PVNRVPARRAGNADELIGNAMGLNEQPVL PVNRA PARPAGNA DELIGNMGLLGIAYRYGGTSVSTGFDCSGFMQH: FERA4G INLPRTSAEQARMCTPVAR S -LQPG DMV FFRT LGG SRI SHVG LY:--GNNRF IHAPRTGKNI EITS L KWGYFR VKKNDPSRFLN 2005_225 <SEQ ID 3121>
MDSFEKPAVWAVLWLLMBAVRPALADELTNLLSSREQILRQFAEDEQPLNRARG
NADEL I GSAMGLNEQ PVLPVNRV PRRGNADELI G SAGLNEQ PVL PVNRAPARRASNA DELIGNANGLLGIAYRYGGTS I STGE'DcSGn4QUI FKPAMG INLPRTSAEQARNSTPVAR SEQGMFRLGR HVLINR-HPTKI
ITSLSHKYWSGKYAFARR
VKKNDPSRFLN*
2006_225 <SEQ ID 3122> MDS FFRFAVWAVLWLMFAVRPALADELTNLLSSREQI
LRQFAEDEQPVLPIRPRA
NAEISMLEPLVRPR
,:;DLGAGNQVPNAAPAN
DELZIGNAMGLLG IAYRYGTSVST FCSGFMKQHI
FKRAMGINLPRTSAEQARNGTPVAR
SELQ PGDMVFFRT LGSRI SHVG LY IGNNR FIYAPRTGKNIE ITS LSHKYWGYFR
VKKNDFSRFLN*
2007 225 <SEQ ID 3123> MDS FFKPAVWAVLW~LMFAVRFALADELTNLLSSREQI LRQFAEDEQVL PNAFlR< NAEL ?SMLE V VR ARGND
NMLE,,VNFAPASPAGNP
DE LIGNAMGLL IAYRYGTSVST0FDCSGFIQHI FKRAMG INLPR-SA.EQAPpJ.T
OVA?
V.KKND0P SR FLN 2008_225 <SEQ ID 3124> MDSFmFKPAVWAVLWLMFAVRPALADELTNLLS SREYILRQFAED0,QPV
INRAPARRAG
DELISNAMGLL0IAVRYGGTSISTcFDCScpFQu
HKAGNPTSS~SP
SELQPGDMVFFRTLGSRISHVSY:> ,,,RFI,PnTGNIE ITS SKWGYFR VKXNDPSRFLN Z009 225 <SEQ ID 3125> MDSFFPVAL
FVPLDLNLSRQLQADQVPNPRA
NADELICSAMGLNEQVLPmNRVPARAGNADE:NMGLNEQP;PRPRAN DE-L2GNAL'4CLLOIAYRYGGTS ISTGFDCSOFDIQH: FKRAMG:NL PTSAZEQP2S.1TP
/ZR
SELQPGDVFFRTLGSSRISHVGLYIGNNRFIAPRTGNI..ITS-LSHKWGYFR
VKKNDPSRFLN'
Z01O 225 <SEQ 2D 3126> mDSFFKAVWAVLWLMFAVRPAADELTNLLSSREQI"TRQFAEDEQPVLPN-?RA NADEL IGSAMGLNEQPVL PXPRV PARRAGNADEL ONA-MGLNEQ PL PVNkA PARRAoNA DELT0GNA1GLLIAYRYGCTSVSTGrD-C3C -EQ I FK2MI PTSEAM7PA S ELQPG DMVFFRT LGGS RI SHVO:Y:0 NNRFZ:.-P;TGAJ:1E:rTSSMHK.WSGKYA
FAR
VKKND:PSRFLN*
2011_225 <SEQ ID 3127> MDSFF=KPAVWAVLWLMFAVRPAADETNLLSSRQILRQrAEDEQVL7IRAR:G NADEL IGSAMGLNEQPVLPVNRvppRAGNA 0 5 LI GNMtG LNEQVLPVNRA>ARPACNA DEISNAMGLNEQPVLPVNPAPARRAGNADELIGNAMGLLGIAYRYrGGTVTFCG MQE-I FKPANGINL2RTSAZQAMGTPVRSELQGDMVFFRT. OO RI SHVOLY:-GNNRF 7HAPRTGKNIEITSLSHKYWSGKYAFARRVKXNDPSRFLN.
2012_225 <SEQ ID 3128> MD F AWVWMFV AADELNL SRQILQFED V RPRA NADELIGSADGLNEQPVLPVNRVPRRGNADELIN'4ALNEQPVLINJ
PA.PAN
DELIGNAMGLLGIAYRYGGTS ISTOFDCS0FMQH:FxpRkMoINLPpISAEQARMS.GTPVAR SEL-QPGDMVFFRTLGGSRI SHVGLYIGNNRFI4JPR7GKNIE
ITSLSHKYWSGEVAFARR
VKKNDPSRFLN*
Z013_225 <SEQ ID 3129> MDS FFKPAVWAVLWLFAVRPAADELTNLLS SREQI LRQFAEDEQ
PVLPVNRAPARRAG
NADEL IGSAMCLNEQPVLPVNRVPARAGNADELIGNAMGLNEQPVLPVNRAPARAN DELIGNAMGLLGIAYRYGGTSVSTGFDCSFIQHI
EKRAMGINLRTSAEQARMGTPVAR
SEL-QPGDMVFFRTLGGSRISHVGLYIGNNRFIliPRTGKNIEITSLSHKYWGYFR
VKKNDPSR'LN*
'014 _25<SEQ ID 3130>
NADELIGSAMGLNEQPVLPVNRVPARGNADELIGNAMGLNEQPVLPNAARGN
DELIGNAI4GLLGIAYRYGGTSVSTGFDCSGFMQHI FRRALGIN7
PRTSAEQARMGTPVAR
SELQPGDMVFFRTLGGSR:SHVGLYIGNNRFIAPRTGKNIEILSKWGYFR
VKKNDPSRFLN*
Z015_225 <SEQ 1D 3131> MDS FTKPAVWAVLWLMFAVRAAELTl' SSREQI LRQFAEDEQPVLPNpAARA NA7LGAMLEPLVRPRANADELIGAGL
YYGTVTFC
GFMQHI FKRNAGINLPRTSAEQARMGTPVARSELQPGDMFFRTLGGSRISVLIN
RFHPTKIISSKWGYFRVKDSFN
Z016 225 <SEQ ID 3132> MDS FEKPAVWAVLWLMFAVRAADELTNLLSSREQILRQFAEDEQPVLINRAPARRAG
NADELIGSAMGLNEQPVPVNRVPRRGNADELIGNAGLNEQPVLPVNRAPARRAGNA
DELIGNAMGLLGIAYRYGGTSVSTGFDCSGFMQHI
FKRGINLPRTSEQAPGTPVAR
SELQPGDMVFFRTLGGSRISHGYGNFHARGN7ISSHYSKA-R
VKKNDPSRFLN*
2017 225 <SEQ ID 3133> MDS FFKPAVWAVLWLMFAVRPALADELT\NLLSSREQILRQFA-., VpNPAAPA NADELIGSAMGLNEQPVLPVNRVIPARPAGNAD-.LTGNAMG NQFIVNPApPGNAZ DELIGNAMGLLGIAYRYGGTSVS-~GFDCSG-MQHI
FKRAMG:NLPR'SAEQAAGPVAR
SELQPGDMVFFRTLGGSRI SHVGL-Y:GNNRFIHAPRTGNIETTS ZSQiKYWSGKYAFAPR
VKKNDPSRFLN*
lb_225 <SLQ ID 3134> MSIDS FKPAVWAVLWLMFAVRAADETW*;SSREQILRQFAEQ
VLPINAAP
NA--ISMLEPLVRPA G EIN GNQVPVNRAPARRAGNA DEL'IGNA.MGLLGIAYRY-QCTSVST-FDCSGFMQH:
SE-LQPGDMVFFRTLGGSRISHVG:YTGNNRFHAPRTGKIE.ITSSKWGYF
VKKNDPSRFLN*
Z019 225 <SEQ ID 3135> LMDS FFPVALIFVPLD:TL-SEILQAD-VPIRPRA NAE GAGNQVPNVARGAE_7NMLEPI-PNAARGN DEL7GNAMGLLGIAYRYGGTSVSTI G DCSGENQH I F AMG:NLPR-SAEQAAMGT
PVAR
S-LPDVFTGSIHGYGN-!APTKI-T ,HYS:'
AA.
VKKNDPSRFLN*
225 <SEQ ID 3136> M:DS FfT<PAVAVLWT MFAVRPALADEL TNL SSRTQILRQFAEDEQ:PVLPIN,
PARRAX-
NADELIGSAGLNEQPVLPINAPRRANAD L IGSA.MGLNEQPVL PVNRVP;.R PAGNA
DELIGNAMGLNEQPVLPVNPARRAGNADELTCNMGLL,..AYYGTVGFCF
MQHI FKRAMGINLPRTSAEQARMG-PVARSELQPGDMV.FRTGSR
SHVGLYIGNNR'
IHARTGKNIEITSLSHKYWSGKYAFARRVDOSRFLN*
Z021. 225 <SEQ ID 3i37> MDS-FFKPAVWAVLWLMFAVRPALADELTN7.LSSREQILRQFAEDEPLIRARG
NADELIGSAMGLNEQPVLPVPARGNADELIGNAGLNEQVVNARPGA
DEL-'IGNAMAGLLGIAYRYGGTSVSTGFDCSG~FMQHI FKRAMGIM:?-
RTSAEQARMGTPVAR
SELQPGDMVFFRTLGGSRI
SHVGLYIGNNRFIHAPRTGF:EITSLSHKYWSGKYA.ARR
VE(KNDPSRFLN*
Z022_225 <SEQ ID 3138> MDS FFKPAVWAVLWLMFAVRPALADELTNL LSSREQI RQF-AEDEQPVLPINRPpARPAG NADELIGSA1GLNEQPVLPNRVPARPGNA 2 ELIG MLEPLPNAAPAN DELIGNAMGLLGIAYRYGGTS ISTCFDCSGFMQH-
FKC:NLPRTSEQARGTPVAR
SELQPGDMVFFRTLGGSRISHVGLYIGNNRFIHAPRTGKNIEITSLSHKYWSGKYAFARR
VKKNDPSRE'LN
Z023_225 <SEQ ID 3139> MDS FFKPAVWAVLWLM FAVPALADE LN LLS SREQ ILRQ FAEDQPL
NLPR
NADEL I GSASIGLNEQ PVL PVNRV PARGAELIGAGNEPLPNAARAN DEL IGNANGLLGIAYRYGGTS -1STGFDCSGFMQHI FKRAMGlN LPRTSAEQA.1MGT
PVAR
VKKND PS FL *TREISL HK W GK AF R Z024_225 <SEQ ID 3140> MDSFE'KPAVWAVLWLAVRPAADETNL
SSRQILQFAEDEQPVLIPPRA
NADELIGSAMGLNEQPVZV-JRVPARRAGNADELI GNk MGLNEQPVLPVNDRAPARRAGNA DEL IGNANG LLG 1AYRYGGTS I STGFDCSG:-Qj-H I FRAMG INL PRT SAEQARMGT PVAR
SELQPGDMVFFRTLGGSRISHGYGNFHARGNETLSKWGYFR
VKKNDPSRFLN*
Z025_225 <SEQ ID 3141>
MDSFF'KPAVWAVLWLMFAVRPALADELTNLLSSREQILRQFAEDEQPVLPNAAPA
NAEISMLEPVP VARGADLGAGNQVLVRPRA
N
DELIGNAMGLLGIAYRYGGTS TSTD-FDCSGFMQHI
FKRAMGINLPRTSAEQARMGTFVAR
SELQPGDMVFFRTLGGSRISHVGLYIGNNRFIFAPRTGKNIETTSLSHKWGYAR
VKKNDPSRFLN*
Z026_225 <SEQ ID 3142> MDSF PV4VWMARAAETLSR-IRFET-V
IRAARA
NADEL IGSAMGLNEQPVLPVNRVARANADELIGNAMGLNQPVLPJRPAARGN DETLIGNA IGLLGIAYRYGGTS ISTGFDCS~MQHI
FKPRAMGINLRTSAEQARMGTPVAR
SE PDVFTGSIHGYGN-7HPTK~IS
PYSKA
VKKNDPSRFLN*
Z027_225 <SEQ ID 3143>
MDFKAWVWMARAAETLLSEIRFEEPLIRPRA
NADEL:GSAMGLNEQPVLPVNRVPARPGNADELIGNAGLNEQPVLPVTARPGN
SEQGMFRLGR HVL -NFHPTK IE:TS LSHKYWSGKYAFARR
VKKNDPSRFLN*
Z028_225 <SEQ 1D 3144> MDS FFKPAVWAVLWLMFAVRPALADEL-- NLLS SREQ: LRFEDPLPIIAARA NADELIGSA4MGLNEQPVLPVNRVPARRGNADE
LIGNAGLNEQPVLPVNAPRRGN
DELIGNMGLLGIAYRYGGTSVSTGFDCSGFQHI 7-KRAMGIN LPRTSAEQARMGT
PVAR
SELQPGDMVFFRTLGGSR: Si 4 VGLYINNRFIHARTGL:E:TS LSHKYWSDK YAFARR
VKKNDPSRFLN*
Z029 225 <SEQ 1D 3145> MDS FEPAVWAVLWLMFAVRPALADEL LSSREILRQ
F~QPVINPRR
NAE GAGNQVPNVARGADLGAGLELVRPRAN DEL IGNAMGLLG IAYRYGGTSVSTG DCSFNQH I r-KRAMGINLPRTSAEPQARMGTPVAR SELQPGDVFFRTLGGSRISHVGLYGNNIHARTGNIE:TSLq-KWGYF
VKKNDPSRFLN*
Z032_225 <SEQ TD 3146>
MD-KAWVWMARAAETITSS--IRFEEFLVPPR,.
NAE S-GNQPLPNRPRAND SMLGIYYG VT D GF-MQHI FKRNAGINLPRTSAEQARMGAPVARSELQPGDMVFFRTLGGSTSVLIN
RFIHAPRTGKNIEITSLSHKYWSGKYAFA.RVKKNDPSRFLN*
Z033_225 <SEQ ID 3147> MDSFFKAVWAVLWLMFAVRSALADELTNLLSSREQI
LRQFAEZEQPVLPVNP.APARPAG
NADEL IGSAMGLNEQPVLPVNAP~?RPAGNADELIGSAMGLLG -YYG: SSGFC GFMQHIFKAGNPTAEAMAVRELPDVFTGGSR:
SHVGLYIGNN
RFIHAPRTGKNIE:TS LSHKYWSGKYAFARRXKgNDPSRFLN* Z096_225 <SEQ ID 3149> MDS ETKPAVWAVLWLMAVRPALADELTLLSSREQILRQFAEDEOPVLIRPRA
NAEISMLEPLVRPRANAEINMLEPLVRPRAN
DEL IGNAMGLLG IAYRYGGTS ISTGFDCSGFMQHI
FKRAMGINLPRTSAEQARMGTPVAR
SELQPGDMVFFRTLGGSRISHVGLYIGNNRFIHAPRTGKNIEITSLSKWGYFR
\VKKNDPSRFLN*
Figure 19 shows the results of aligning the sequences of each of these strains. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. As is readily discernible, there is significant conservation among the various strains of ORF 225, further confirming its utility as an antigen for both vaccines and diagnostics.
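The dark and gray shading described for the alignment figures can be approximated by classifying each alignment column. The sketch below assumes the sequences have already been aligned to equal length with "-" as the gap character; the residue groupings and function names are illustrative assumptions and are not the scheme used to draw the figures.

```python
# Residue groups with broadly similar side-chain chemistry.
# Assumption: illustrative grouping only, not the one behind the figures.
SIMILARITY_GROUPS = [
    set("AVLIMC"),  # aliphatic and small hydrophobic
    set("FWY"),     # aromatic
    set("KRH"),     # basic
    set("DE"),      # acidic
    set("STNQ"),    # polar, uncharged
    set("G"),
    set("P"),
]


def classify_column(column):
    """Classify one alignment column as 'identical', 'similar' or 'variable'."""
    residues = {residue.upper() for residue in column}
    if "-" in residues:
        return "variable"  # a gap in any strain breaks conservation
    if len(residues) == 1:
        return "identical"  # would correspond to dark shading
    if any(residues <= group for group in SIMILARITY_GROUPS):
        return "similar"  # would correspond to gray shading
    return "variable"


def classify_alignment(aligned_sequences):
    """Classify every column of a list of equal-length aligned sequences."""
    lengths = {len(sequence) for sequence in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    return [classify_column(column) for column in zip(*aligned_sequences)]


if __name__ == "__main__":
    # Toy alignment for illustration only; not taken from the sequence listing.
    toy = ["MKDL", "MRDV", "MKEG"]
    print(classify_alignment(toy))
```

Counting the share of columns labelled identical or similar gives a single conservation figure per ORF, which is one simple way to compare candidate antigens across strains.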
EXAMPLE 12 Table 3 lists several Neisseria strains which were used to assess the conservation of the sequence of ORF 235 among different strains.
Table 3
235 gene variability: List of used Neisseria strains
Identification number    Strains    Reference
Group B
gnmzq01    NG6/88    Seiler et al., 1996
gnmzq02    BZ198    Seiler et al., 1996
gnmzq03    NG3/88    Seiler et al., 1996
gnmzq04    1000    Seiler et al., 1996
gnmzq05    1000    Seiler et al., 1996
gnmzq07    BZ169    Seiler et al., 1996
gnmzq08    528    Seiler et al., 1996
gnmzq09    NGP165    Seiler et al., 1996
gnmzq10    BZ133    Seiler et al., 1996
gnmzq11    NGE31    Seiler et al., 1996
gnmzq13    NGE28    Seiler et al., 1996
gnmzq14    NGH38    Seiler et al., 1996
gnmzq15    SWZ107    Seiler et al., 1996
gnmzq16    NGH15    Seiler et al., 1996
gnmzq17    NGH36    Seiler et al., 1996
gnmzq18    BZ232    Seiler et al., 1996
gnmzq19    1LAM    Seiler et al., 1996
gnmzq21    MC58    Virji et al., 1992
Group A
gnmzq22    205900    Our collection
gnmzq23    F6124    Our collection
z2491    Z2491    Maiden et al., 1998
Group C
gnmzq24    90/18311    Our collection
gnmzq25    93/4286    Our collection
Others
gnmzq26    A22 (group W)    Maiden et al., 1998
gnmzq27    E26 (group X)    Maiden et al., 1998
gnmzq28    860800 (group Y)    Maiden et al., 1998
gnmzq29    E32 (group Z)    Maiden et al., 1998
gnmzq31    N. lactamica    Our collection
Gonococcus
gnmzq32    Ng F62    Maiden et al., 1998
gnmzq33    Ng SN4    Our collection
fa1090    FA 1090    Dempsey et al., 1991
References:
Seiler A. et al., Mol. Microbiol., 1996, 19(4):841-856.
Maiden R. et al., Proc. Natl. Acad. Sci. USA, 1998, 95:3140-3145.
Virji M. et al., Mol. Microbiol., 1992, 6:1271-1279.
Dempsey J.F. et al., J. Bacteriol., 1991, 173:5476-5486.
The amino acid sequences for each listed strain are as follows:
FA1090 <SEQ ID 3149>
MKPLILGLAAVLALSACQVRKAPDLDYTS rKVSKPAS ILVVPPLNES £DV N0:TGMLAST AAPISEAGYYVFPAAVVEET 'KENGLTNAADI.;iVRPEKLHQI rGNDAVLY:T'P-EY
STS
YQILDSVTTVSAKARLVDSRNGKELWSGSS I REGSNNSNSGTLGLC'IGAV'1QIANS.T DRGYQVSKTAAYNLLSPYSRNG7-KO
PREVEEQPK-
£-NMZQOI1 <SEQ ID 3150> MK?LILGLAAVLALSACQVQKAPDFDYTSFKSKPA- LVPL. DNGWVLS APLSEAGYYVFPAAVVEETFKENGL-TNAADIHVRPEKLHQI
FGNDAVLYTVJTEVGTS
YQ: LDSVTTVSAKARLVDSRNGKELWSGSAS IREGSNNSNSGLLGA7
SAVVIQIANNL,
DRGYQVSKTAAYNLLS
PYSHNGILKGPRFVEEQPK-
GNMZQ02 <SEQ ID 3151> MKPLI LGLAAVLALSACQVQKAPDFDYTS FKE-SKPASILVVPPLNE3PDVNGTWGVLAST
AALEGY=PAVEFKNLNAIARELQFGNDAVLYITVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSNSGLLGALVSA'VJWJQIANSLT
DRGYQVSKTAAYNLLSPYSHNGI
LKGPRFVEEQPK-
GNMZQ03 <SEQ ID 3152> MKPLILGLAAVLALSACQVQKAPDFDYTS KESKPAS I LVVPPLNIESPD)VNGTWGVLAST 124 AALEGYFAVETKNLNAIARELQ
FGNDAVLYITVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSNSGLLGALVSAVJVNQIANSLT
DRGYQVSKTAAYNLLS
FYSHNGILKGPRFVEEQPK-
GNMZQ04 <SEQ ID 3153> MKPLIGLVLALSACQVQKAC DYTS FKESKPAS ILVVPPLNES PDVNGTWJGVLAST AALEGYFAVET'QGTAD~ARELQ
F-GNDAV'YTV-EYGT'S
YQIL DSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSNSGLLGALVSAVVNQIANSLT
DRGYQVSKTAAYNLLSPYSHNG1LKGPRFVEEQPK- GNMZQOS <SEQ ID 3154> MKTIGAVASCVKPDDTFKSPSLVFN
DN-GLS
AALEGYFAVETKNLNAIARELQ
FGNDAVLYITVTEYGTS
YQLSTVAALDRGZWGASRGNSSLGLSVNINL
DRGYQVSKTAAYNLLSPYSHNGILgGPRpVEEQPK- GNMZQ07 <SEQ ID 3155> MK PLI LGLAAVLAL SACQVQKAP DFD YT SFKESK PAS I LVVPPLNE S PDVNGTWGVLAST AAPLSEAGYYfVFPAAVVEETFKQNGLTNAAD)IHVRPEKLHQI
FGNDAVLYITVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSNSGLLGALVSAVVNQIANSLT
DRGYQVSKTAAYNLLS
PYSHNGILKGPRFVEEQPK*
GNMZQOB <SEQ ID 3156> MKPLI LGLAAVLALSACQVQKAPD--DYTS FKESKPAS ILVVPPLNES DDVNG-WGVLAST AALSEAGYYVFPAA,7.EE-FKENLTNADItiVRPEKLHQT
FGNDAVL-YIT',T-YGTS
YQI LDSVTTVSAKARLVDSRNGKELWSGSAS I REGSNNSNSGLLGALVSAVVNQ:ANNLT DRGYQVSKTAAYNLLSPYSHNGI
LKGRFVEEQPK-
GNMZQ09 <SEQ 1D 3157> MKPLILGLAAALVLSACQVQKAPD FDYTS FKESI(PAS ILVVF PLNE S PDVNGTWGMLAST AELEGYFAVETKNLNAIAQELQ FGNDAVLY 1T TEYGTS YQI LDSVTTVSARARLVDSRNGKVLWSGSAS
IREGSNNSNSGLLGALVSAVVNQIANSLT
DRGYQVSKTAAYNLLSPYSHNGILKGPRFYEEQPK*
<SEQ ID 3158> MKPLI LGLAAVLALSACQVQKAPDFDYTS FKESKPAS ILVV?PLNES?DVNGTWGVLAST AAPLSEAGYYWFPAAVVEETFKQNGLTNAADIfiVRPEKLHQ:
FGNDAVLYITVTEYGTS
YQI LDSVTTVSAKARLVDSaNGKE WSGSAS IaEG SNNSNSGLZGALVSAV"PJNQTAnSLT DRGYQVSKTAAYNLLS
?YSHNGILKGPRF-VEEQPK-
GNMZQ11 <SEQ ID 3159> MKPLI LGLAAVLALSACQVQKPD FDYTS FKESKPASI1LVVP PLNE qP DVGTWGVLAST AALEGYFAVETKNLNAI'ARELQ FGN DAVIY: TVTEYGTS YQ ILDSVTTVSAKARL.VDS RNGKE LWSG SAS IREGSNNSNSG=LGAVSAVNQ-NL DRGYQVSKTAAYNLLSPYSHNGILKGPRF-,7EEQPK- GNMZQ13 <SEQ ID 3160> MKL:-LGLAAVLALSACQVQKAPO FDYTS FKESKPAS I LVVPPLNES PCVGTWGVIAST AALEGYFAVETKQGTADHVPiLQ FGN DAVLY ITV7TEYGTS YQI LDSVTTVSAKARLVDSRNJGKELWSGSAS -REGSNNSNSGLLGA-VSAVVNQ
IANSLT
DRGYQVSKTAAYNLLSPYSHN,,GILK'GPRz-VEEQPK- GNMZQ14 <SEQ ID 3161> MKPLI LGLAAVLALSACQVQK.APDFDYTS FKESKPAS I LVVPPLNE SPDV4GTWGVLAST AAPLSEAGYYVFPAAVVEET FKQNGLTNJAD IHAVRPEK'-HQI FGN DAVLY ITVTEYGTS YQI LDSVPTVSAKARLVDS RNGKE LWSGSAS 1RTEGSNNSNSGLLGALV/GAVVQI.S
LT
DRGYQVSKTAAYNL-LSPYSHNGILKGPRFVEEQPK*
<SEQ 1D 3162> MKL GAVASCVKP DT KEKA VPLZDNTGLS AAPLSEAGYYVFPAVVEET FKQNGLTND IHAVRPEKLHQI FGNDAVLY ITIVEY YQI LDSVTTVSAKARLVDSRNGXELWSGSAS IREGSNNSNSGLLGALSANQ IAS LT
DRGYQVSKTAAYNLLSPYSHNGILKGPRFVEEQPK*
GNMZQ16 <SEQ ID 3163> MKPL ILGLAAVLALSACQVQKAPDFDYTS FKESKPAS I T P PLNES PDVNGTWGLS AAPLSEAGYYVFPAVVEETKQNGLTNJADIaAVRPEKLHQI
FGNDAVLYITVTEYGTS
YQI LDSVTTVSAKARLVDSRNGKELWSGSAS IREGSNNSNSGLLGALVSAVT14QTANSLT
DRGYQVSKTAAYNLLSPYSHNGILKGPRFEEQPK-
GNMZQ17 <SEQ ID 3164> MKP I LGLAAVLALSACQVQKAPDFDYTS FKESKPAS ILVVPPLNZS PDVNTWGVLST~ AALEGYFAVETFQGLNA 7ARPK'Q
GDVYIVEG
YQI LDSVTTVSAKAR LVDS RNGKE LWSG SAS IaEGSNNSNSGLLGALVSAV7WQIANSLT DRGYQVSKTAAYNL: SPYSHNGILKGPRFI1EEQPK* GNMZQIB <SEQ ID 3165> MKPLI LGLAAVLALSACQVQKAPDF'DYTS FKESKAS 1 LVVPPLNES PDVNGTWGVLAST AAPLSFAGYYVFPAAVVEETFKQNGLTNAAD !HAVRPEKLHQI
FGNDAVLYITVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSASIREGSNNSNSGLLGALVGAVVNISL
DRGYQVSKTAAYNLLS PYSHNGILKG PRFVEEQPK- GNMZQ19 <SEQ iD 3166>
MKPLILGLAAVLALSACQVQKAPDFDYTSFKESKPASILWVPPLNESPDVNGTGLS
AAPLSEAGYYVFPAAVVEETFKQNGLTND IHAVRPEKLHQI FGNDAVLYI TVTEYGTS YQ DVTSKRVSNK WSSS ESNNGLAVAVQASL
DRGYQVSKTAAYNLLSPYSHN~GILKGPRFVEEQPK*
GNMZQ21 <SEQ ID 3166> MKPL ILGLAVLALSACQVQKAPDFD= 'FKESKPAS I LVVPPLNESDVNGTWGVLAST APLSEAGYYVFPAAVEETFKQNGLTNADT AVRPEKLHQI
FGNOAVLY-TVTEYGTS
YQI LDSVTTVSAARLVDSRNGKELWSGSAS :REGSNNSNSGLLG.ALVAVQI
LT
DRGYQVSKTAAYNLLSPYSHNGILKGPRFVEEQPK-
GNMZQ22 <SEQ ID 3167> MKPLILGLAVALSACQVQKAPDFDY'SFKESKAST IVVPPIN7S PDVNGTWGVLAST AAPLSEAGYYVFPAAV\VEETFKQNGLTNAAD H;AVRPEKL-HQI
FGLDAVLYITVTEYGTS
YQILDSVTTVSAKALVDSRNGKE
LWSGSASIREGSNNSNSGLLGALVSAVNQIANSTT
DRGYQVSKTAAYNLLS PYSHNGILKGPRt-VEEQPK- GNMZQ23 <SEQ ID 3168> MKPLILGLAAVLALSACQVQKPDFDYTSFKESKPAILVVPPLNES
?DVNGTWGVLAST'
ALSEAGYYVFAVEETFKQNGLTNAADIHAVRPEKLHQI
FGNDAVLYITVTEYGTS
YQ 1LDSVTTVSAKAR LVDS RNGKE LWSGSAS I REG SNN SNSG LLGALVSA~!NQ ANS LT DRGYQVSKTAAYNL-LS PYSHNGILKGPR-VEEQPK GNMZQ24 <SEQ :I 3169> MKLLLAVASCVQAD TFKESKPAS ILVVPPLN)ES ?DVN.GTWGVLAST AAPZ-SEAGYYVFPAAVVEE-T KQNG':TNAADIHAVRPEKLHQI NDLYI:ryGs YQILDSVTTVSAKARLVDSRNGKELWSGSASIRGSNNS\JSGL:-.A
VSAVVNINL
DRGYQVSKTAAYNL-S
PYSHNGILKGPRF-VEEQPK*
GNMZQ23 <SEQ ID 3170> MKPL ILGLAAVLALSACQVQKAP D FDYT S FKESK PAS I:VVP PLNE S PDVNGTWGVLAS
T
AAPLS-.AGYYV FP.A ,EETFKQNGLTNA-2D-iAVRPEK-HQ I 9-rNDAVLYIT'IT7YG'S YQI LDSVTTVSAKAPILVDSRNGKELWSGSAS IRE-GSNNSNSG:LZAZJ3AV--IN1QIANSLT DRGYQVSKTAAYNLLS PYSHNGI LKGPR7-VEEQPK- GNMZQ26 <SEQ ID 3171> MKPL ILGLAAVLALSACQVQKA PD FDYTS FKESK PAS I LVV P LNES P D\NGTWGMLAS T ?ALSEAGYYVFPAVVEETFKQNLT,14JkDIHAVRPEKLHQI
FG>NAVLYITVTEYGTS
YQIZDSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSNSOLLGALVGAVVNQIANSLT
DRGYQVSKTAAYNLLS
PYSHNGILKGPRFVEEQPK-
GNMZQ27 <SEQ ID 3172> MKPLILGLAAVLALSACQVQKAPDFDYTS FKESKPAS ILIVVPPL IESPDVNGTWGVLAST AALEGYFA17ETKQGTIAIAPPKfQ FGNDAVjLy7T=TYGTS YQILDSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSNSGLLGALVSAVVNQIANSLT
DRGYQVSKTAAYNLLS DYSHNGI LKGPRFVEEQPK* GNMZQ28 <SEQ ID 3173> MKPLILGLAAVLALSACQVQKAPDFDYTSFKESKPASILVVPPLNES
PDVNGTWGVLAST
AAPLSEAGYYVFPAAVWEETFKQNGLTNAADIHAVRPEKLHQI
FGNDAVLY:TVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSAS
IREGSNNSOSGLLGALVSAVNQIASLT
DRGYQVSKTAAYNLLS PYSHNGI LKGPRFVEEQPK GNMZQ29 <SEQ ID 3174>
MKPLILGLAAVLALSACQVQKAPDFDYTSFKESKPASILVVPPLNESPDVNGTWGLS
AAPLSEAGYYVFPAAVVEETFKQNGLTNAADIHAVRPEKLHQI
FGNDAVLYITVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSASIREGSNNSNSGLLGALVSAVVNQINL
DRGYQVSKTAAYNLLs
PYSHNGILKGPRFVEEQPK*
GNM4ZQ31 <SEQ ID 3175> MKLLLALLACVKP
)TF-ZKAIVPLEPDVNGTWGMLAST
AEPLSEAGYYVFPAAVEETFKQNGLTNADHVRPE.KLHQI
FG,'DAVLYITITEYGTS
YQILDSVTTVSARARLVDSRNGKVLWSGSA
IREGSNNSNSGLLGALVGAVVNQIANSLT
DRGYQVSKAAAYDLLSPYSHNGILKGPRFVEEQPK*
GNMZQ32 <SEQ ID 3176>
MKPLILGLAVLALSACQVRK(APDLDYTSFKESKPASILVVPPLNESPDVNGTWGMLS
APISEAGYYVFPAVVEETKENGLTNADIHVRPEKLQIFGNDAVLYITVTEYT
YQILDSVTTVSAKARLVDSRNGKELWSGSAS IREGSNNSNSGLLGALVGAVAmQIANSLT
DRGYQVSKTAAYNLLSPYSRNGILKGPRFVEEQPK.
GNMZQ33 <SEQ ID 3177> MKPL LGLAAVLALSACQVRKPDLDYnS FKESKPAS ILVVPPLNES PDV NGTWGMLAST AAPISEAGYYVFPAAVVEETFKENGLTNADIHAVRPEKLHQT
FGNDAVLY:TVTEYGTS
YQILDSVTTVSAKARLVDSRNGKELWSGSASIREGSNNSNSGLLGALVGAV~tQIANSLT DRGYQVSKTAAYNLLSP~YSRNGI
LKGPR-VEEQPK-
Z2491. <SEQ ID 3178> MKPL TLGLAAVLALSACQVQKAPDFDYTS
FKESKPAS:LVVPPLNESPDVNGTWGVLAST
AA-S-GYFAVE"KNLNAIARELQFNALTT-YT
YQ7STVAA-DRGEWGASR-NSSLGLSVNINL DRGYQVSKTAAYNLLS PYSHNGILKGPR-tVEEQPK- Figure 20 shows the results of aligning the sequences of each of these strains. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. As is readily discernible, there is significant conservation among the various strains of ORF 235, further confirming its utility as an antigen for both vaccines and diagnostics.
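Conservation between strains can also be summarised pairwise. The following is a minimal sketch that computes percent identity between aligned sequences, assuming equal-length input with "-" as the gap character and skipping any column where either sequence has a gap; the function names and the toy strain labels are hypothetical, and the specification does not say how its own comparisons were scored.

```python
from itertools import combinations


def percent_identity(first, second):
    """Percent identity over the columns where neither sequence has a gap."""
    if len(first) != len(second):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(first.upper(), second.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identity_table(aligned_by_strain):
    """Map every unordered pair of strain names to its percent identity."""
    return {
        (name_a, name_b): percent_identity(
            aligned_by_strain[name_a], aligned_by_strain[name_b]
        )
        for name_a, name_b in combinations(aligned_by_strain, 2)
    }


if __name__ == "__main__":
    # Toy aligned fragments with made-up labels; not taken from the listing.
    toy = {"strain_a": "MKPL", "strain_b": "MKAL", "strain_c": "MRAL"}
    for pair, value in identity_table(toy).items():
        print(pair, round(value, 1))
```

Excluding gapped columns keeps truncated or partially sequenced entries from lowering the score artificially, at the cost of overstating identity when a deletion is real.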
EXAMPLE 13 Table 4 lists several Neisseria strains which were used to assess the conservation of the sequence of ORF 287 among different strains.
Table 4
287 gene variability: List of used Neisseria strains
Identification number    Strains    Reference
Group B
287_2    BZ198    Seiler et al., 1996
287_9    NGP165    Seiler et al., 1996
287_14    NGH38    Seiler et al., 1996
287_21    MC58    Virji et al., 1992
Group A
z2491    Z2491    Maiden et al., 1998
Gonococcus
fa1090    FA 1090    Dempsey et al., 1991
References:
Seiler A. et al., Mol. Microbiol., 1996, 19(4):841-856.
Maiden R. et al., Proc. Natl. Acad. Sci. USA, 1998, 95:3140-3145.
Virji M. et al., Mol. Microbiol., 1992, 6:1271-1279.
Dempsey J.F. et al., J. Bacteriol., 1991, 173:5476-5486.
The amino acid sequences for each listed strain are as follows:
287_14 <SEQ ID 3179>
MFKRSVIAMACI
FALSACGGGGGSPDVKSADTSKPAPVSEK.EEAEDAPQAGSCG
QGAPSAQGGQDMAAVSEEITGGGAT DKPKnEDEGAQNDMPQNAACT)S LT PN!-T PAS
NMPAGNMENQAPDAGESEQPANQDATADGMQGDDPSAGGENAGS-AQGTNQAENN
TAGSQNPASSTNPSATNSGGDFGRTNVGNSVI OGPSONITLTHiCKO DSCSGNNFLDE7- QLKSEFEKLSDADKISNYKKzGKNDGKDr-GLVADSVQMKGINQYITFYvPTFA FP.RSARSRRSLPAEMPLI PVNQADTL:VDGEAVSL-GHSGNI
FAPEGNYRY--YGAEKLD
GGSYALRVQGEPSGEMLAGTAVYNGEVLHF N GPSRA.RVDS.S.,
G
DSGDGLHMGTQKFKAAI DGNG FKGTWENGGGDVSGKYGAEEVAGKYSYTAZG G FGVFAGKKEQD 287_2 <SEQ 1D 3180> M-KRSVIAMACI FASACGGGGGGS
PDVKSADTLSKPAAPVVSEKE-EA:EDAQAGSQG
QGAPSAQGGQDMAAVSEENTGNGGAAT D-KKNEDECAQNDMPQt-AAD- DS LTP!IHT PAS NMANEQPAEEPIPMdIAGMGDSGEA'NAQTQEN, TAGSQNPASSTNPSATNSGGDFGRTNVGNSVVI DGPSQNITLTHCKGCSCSGN 4'LDEE'I QLKSEFEKLSDADKISNYKKDGKNDGKNDKFVGLVADSVQMKGINQYI
IFYKPKPTSE-AR
FRRSARSRRSLPAEPMPLI
PVNQADTLIVDGEAVSLTGHSGNIFAPZ.CNYRYLTYGAE-KLP
GGYLVGPKELGAYGVHHEGPPRRArDGKVG DSGDGLHMG'QKFKAA: DGNGFKGTWTENGGGDVS(KF'VGPAGEEV1AGVYSYRPTDAEKG
GFGVFAGKKEQD*
287 _21. <SEQ ID 3181> MFKRSVIAM4ACI FALSACGGGGGGSPDVKSADTLSKPAAPVVISEKETEAKEDAPQAGSQG QGAPSAQGSQDMAAVSEENTGNGGAVTADN PKNEDEVAQN DMPQ NAAGTDS STPNHTPD P NMLAGNMENQATDAGESSQPANQPDMAANAADGMQGDD PSAGGQNAGNTAAQGpAJQAGNNQ AAGSSDPI PASNPAPANGGSNFGRVDLANGVLI
DGPSQNITLTHCKGDSCSGNNFLDEE!
QLKSEFEKLSDADKISNYKKDGKNDKFVGLVADSVQMKGINQYII FYE PKPTS FARFRRS ARSRRSLPAEMPLI PVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRYLTYGAEKLPGGSY ALRVQGEPAKGEMLAGAAVYNGEVLHFHTENGRPYPTRGRFAKDFGSKSVGI
IDSGD
DLHMGTQKFKAAI DGNGFKGTWTENGSGDVSGKFYGPAGEEVAGKYSYRP'~DAEKGGFGV
FAGKKEQD'
287 9 <SEQ ID 3182> MFKRSVIAMACIVALSACGGGGGGS PDVKSADTLSi(PAAPVVTEDVGE7VLPKEKKDEEA VSGAPQADTQDATAGKGQDMVSNTGNGGAATT
DNPENKDEGPQNDMPQNADTDS
ST PNHT PAPNM PTR DMGNQAP DAGE SAQ PAN P DMANADGMQG DD P AGENAGNTADQA
ANQAENNQVGGSQNPASSTNPNATNGGSDFGRINVANG'KLDSGSENVTLTHCKDKVD
DFLDEEAPPKSEFEKLSDEEKINKYKKDEQRENFVGLVADRVENGTNKYVI
IYKDKSAS
SSSARFRRSARSRRSLPAEMPLI PVNQADTLIVDGEAVSLTG~J SV-3 IIFAPEGNYRYLTYG AEKLSGGSYALSVQGEPAKGEMLAGTAVYNGEVLHFHMENGR?
?SGGRFAAKVDFGSKS
VDGI:DSGDDLHMGTQKFKAVI
DGNGFKGTWTENGGGDVSGRFYGPAGEEVAGKYSYRPT
DAEKGGFGVFAGKKEQD'
FAI090 <SEQ ID 3183> MFKRSVIANACI FPLSACGGGGGGS PDVKSADT PSKPAAPVVAENAGEGVLPKEKKDEEA AGGAPQADTQ DATAGE GSQEM AVSAE NTGNGGAATT DN PRNEDAGAQNEM PQNAAESAN
QTGNNQPAGSSDSAPASNPAPANGGSDFGRTNVGNSVV~IDGPSQNITLTHCKGDSCNGDN
LLDEEAPSKSEFEKLSDEEKIKRYKXDEQRENFVGLVADRVKKDGTNKYII
FYTDKPPTR
SARSRRSLPAEIPLIPVNQADTLIVDGEAVSLTGHSGNI
FAPEGNYRYLTYGAEKLPGGS
YALRVQGEPAKGEMLVGTAVYNGEVLHFHMENGRPYPSGGRAAKVFGSKSVDGI
IDSG
DDLHNMGTQKFKAAI DGNGFKGTWTENGGGDVSGRFYGPAGEEVAGYSYRPTDAEKGGFG
VFAGKKDRD*
Z2491 <SEQ ID 3184> MFKRSVIAMACI FALSACGGGGGGSPD ADLKAAVSK
EKDAPQAGSQG
QGAPSAQGSQDMAAVSEENTGNGGAVTADNPKNEDEVAQNDMPQNAG-DSSTPNHT
POP
NMLAGNMENQAT DAGES SQ PANQP DMANAADGMQGDD02SAGGQNAGNTAAQGANQAGNNQ AAGSSDPI PASNPAPANGGSNFGRVDLAI4GVLIDGPSQNITLTHCKGD~SCSGNNFLDEEV
QLKSE
3 EKLSDADKISNYKKDGKNDKE'VGLVADSVQMKG:NQYII FYKPKPTS FARFRRS ARSRRSZPAEMPLI PVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRYLTYGA-,.KLPGGSY
-L,'GKE-LAAVNELFTNRYTG-AKD-SSDIDG
DLHMGTQKFKYAA: OGNGFKGTWTENGSGDVSGK.VYGPAGEEVAGKYSYPPTDAEKGGFGV
FAGKKEQD-
Figure 21 shows the results of aligning the sequences of each of these strains. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. As is readily discernible, there is significant conservation among the various strains of ORF 287, further confirming its utility as an antigen for both vaccines and diagnostics.
EXAMPLE 14 Table 5 lists several Neisseria strains which were used to assess the conservation of the sequence of ORF 519 among different strains.
Table 5
519 gene variability: List of used Neisseria strains
Identification number    Strains    Source / reference
Group B
zv01_519    NG6/88    R. Moxon / Seiler et al., 1996
zv02_519    BZ198    R. Moxon / Seiler et al., 1996
zv03_519ass    NG3/88    R. Moxon / Seiler et al., 1996
zv04_519    297-0    R. Moxon / Seiler et al., 1996
zv05_519    1000    R. Moxon / Seiler et al., 1996
zv06_519ass    BZ147    R. Moxon / Seiler et al., 1996
zv07_519    BZ169    R. Moxon / Seiler et al., 1996
zv11_519    NGE31    R. Moxon / Seiler et al., 1996
zv12_519    NGF26    R. Moxon / Seiler et al., 1996
zv18_519    BZ232    R. Moxon / Seiler et al., 1996
zv19_519    BZ83    R. Moxon / Seiler et al., 1996
zv20_519ass    44/76    R. Moxon / Seiler et al., 1996
zv21_519ass    MC58    R. Moxon
zv96_519    2996    Our collection
Group A
zv22_519ass    205900    R. Moxon
2491_519    Z2491    R. Moxon / Maiden et al., 1998
Others
zv26_519    A22 (group W)    R. Moxon / Maiden et al., 1998
zv27_519    E26 (group X)    R. Moxon / Maiden et al., 1998
zv28_519    860800 (group Y)    R. Moxon / Maiden et al., 1998
zv29_519ass    E32 (group Z)    R. Moxon / Maiden et al., 1998
Gonococcus
zv32_519    Ng F62    R. Moxon / Maiden et al., 1998
fa1090_519    FA1090    R. Moxon
References:
Seiler A. et al., Mol. Microbiol., 1996, 19(4):841-856.
Maiden et al., Proc. Natl. Acad. Sci. USA, 1998, 95:3140-3145.
The amino acid sequences for each listed strain are as follows:
FA1090_519 <SEQ ID 3185>
ME FFI I LLAAVAVFGFKSFVVI PQQEVHVVERLGRFHRLTAGLNILI PFI DRAHS KE 1PLDVPSQVCITRDNTQLTVDG I IYFQVTDPKrASYGSSNYIMAPJTQLAQTTLRSVIG RMELDKTFEERDE INSTVVSALDEAAGAWGVKVLRYE IKDLVP PQE ILRAMQAQ 1TARE KRARIAESEGRKIEQINLASGQREAE I QQSEGEAQAAVNASNAEKIARINRAKGEAES
LP
LVAEANAEAI RQ IAALQTQGGADAVNLKIAQYVAAFNLAKE SN, 1M PANVAD IGS L
ISAGMKI:DSSKTAK*
Z2491_519 <SEQ ID 3186>
MEFFIILLAAVVVFGFKSEVVIPQQEVHVVERLGRFHRLTAGLNILIP.IDRVARS
KEIPLDVPSQVCITRDNTQLTVDGITFQVTDPKASYGSSNYIMAITQLAQTLRV
R~vELDKTFEERDEINSTVVSALDEAAGAWGVKVLRYE IKDLVPPQE ILRSMQAQITAERE
KRARIASEGRKIEQINLASCAIQSEAEQSAQNNASIN'IARIEESL
I SAGMII IDSSKTAK* ZVOl_519 <SEQ ID 3187> MEFFI ILLVAVAVFGFK(SFVVIPQEHVELRFRLAGNLPFIDRVAYRHSL KEI PLDVPSQVCITRDNTQLTVDGI
IYFQVTDPKLASYGSSNY:-MA:TQLAQTTLRSVIG
RMELDKTFEERDEINSTVVAALDEAGAWGVKJLRYEIKDLVP 7QE.I LRSMQAQITAERE
ISAGMKIIDSSKTAK*
ZV02_519 <SEQ ID 3188>
MEFFIILLVAVAVFGFKSFVVIPQQEVHVVERLGRFH.ALTAGLNILIPFIDRVAYRHS
KEIPLDVPSQVCITRDNTQLTVDGI
IYFQVTDPKLASYGSSNYIAITQLAQTTLRSVIG
RMELDKTFEERDEINTVSLEAGWVK YIKDLVPPQE ILRSrQAQITAERE KRARIAESEGRKIEQINLASGQREIQQSEGEAQVASNFKIA,
INRAGEAESL
LVENEIQAAQQGD NKAE AFNAE 'LMAVDGL
ISAGMKIIDSSKTAK*
ZVO3_519 <SEQ ID 3189> MEFILLVAVAVFGFKSFVVIPQQEVHVIERLGRFHRLTAGL PF:DVAYRHS L .MELDKTFEERDEINSTVVSALDEAAGAWqGVKVLRYEIKOLVPPQE 7 MQQTAP
KRARIAESEGRKIEQINLASGQREAEIQQSEGEAQAAVNASNEKAIR.GEAESLR
LVAEANAEAIRQAAALQTQGGADAVKIAEQYVAFNLAESN: LIMPANVAD IGS L ISAGMK: IDSSKTAK* ZV04_519 <SEQ ID 3190> MEFFI;ILLVAVAVFGFKSFVVIPQQEVHVVERLGRFH?.LTAGLN:::
PFIDRVAYRHSL
KEI!PLDVPSQVC:TRDNTQLTVDGI
IYFQVTDPKTASYGSSNYIMA:QLAQTTLRSVIG
RIEDTERENTVADAGWVKJREKLPQ:RMA
AR
KRRArERIQNAGRA:QEGAAVANE:RIR:
EEL
LVENZIQAAQQGDVLIAQVANLKS-IPNAIS
ISAGMRIIDSSKTAK*
ZVO5:59 <SEQ ID 3191> ME7ILAAFFSVTQEHVELRSATGNIPIRARS KE-IPLDVPSQVCITRDNTQLTVDG7:
VDKASGSYM-QLQTR':
RMLKF-RENTVADEAAG LYIDVPELRSMIQAQ:rAERE KRRA-ER!-ILSQEZQ~-EQANSAK- IPk<-E' LVAEANAZAIRQIAAALQTQGGADAVNLKT r -QYVAFNLAESN:LIMPANVAD:GSL
ISAGMKIIDSSKTAK*
ZV06_519ASS <SEQ ID 3192> MEFFIIZLVAVAVFGFKSFV:QQEl!V-RLGRFHRLTAGLNIL:PFIVARS K7ILVSVIRNQTDIYTVTP:SGSYMTQATLSI
R.MELDKTFEERDEINSTVFSALDEAAGAWGVKVLRYEIKDLVPPQEILPSMQAQITAER
KRARIAESEGRKIEQINLASGQREAEIQQSEGEAQAVNASNEKIARINRKGEAEL
LVEAAARIALTGA-VTKIEYAFNAENLM
NAIS
ISAGMKIIDSSKTA(*
ZVO7 519 <SEQ ID 3193> MEFFIILLVAVAVFGFKSFVVI
PQQEVHVVERLGRF.RALTAGLNILIPFIDRVAYRHSL
KEIPLDVPSQVCITRDNTQLTVDGI
IYFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIG
RMELDKTFEERDEINSTVVALDEAGAWGVKLRYEIKDLVPPQEILRSMQAQITAR
KRRIAESEGRKIEQINLASGQDEALEQQSEGEA
AVNASNAEKIARINRAKGEAESLR
LVENEIQAAQQGAANK
QVANNAENLMAVDGL
ISAGMKIIDSSKTAc.
ZV11_519 <SEQ ID 3194>
MEFFIILLAAVAVFGFKSFVVIPQQEHVERLGRFHRLTAGLNILIPIDRVAYRHSL
Kv-IPLDVPSQVCITRDNTQLTVDGI
IYFQVTDPKLASYGSSNYIMITQLA..TTLRSVIG
RELDKTFEERDEINSTVVAALDEAAGAWGVKJLRYE IKDLVPPQE LRSMQAQITAERE KRAR iAESEGRKIEQINLSGQREAEIQQSEGEAQAVASNAEKiAR:NRAKGEA-SL
ISAGMKXIDSSKTAK*
ZV12 519 <SEQ ID 3195> MEFFIILLVAVAVFGF'KSFVVIPQQEVHVVERLGRFHRLT~AGLNILI
PFIDRVAYRHSL
KEIPLDVPSQVCITRDNTQLTVDGI
IYFQVTZPKLASYGSSNYIMAITQLAQTTLRSVIG
RMELOI<TFEERDE INST AIAALDEAAGAWGVKVLRYEIKDLVPPQEI
LRSMQAQITAERE
KRRIAESEGRKIEQINLASGQREAEIQQSEGEAQAAVNASN,4AKPRINRAKJGEAESLR LVAEANAEAIRQIAAALQTQGGA0AVNLK IAEQYV~r-NLAESN- LIMPANVAD IGSL
!SAGMKIIDSSKTAK*
ZV18_519 <SEQ ID 3196> MEFFIILLVAVAVFGFKSF-vVIFQQEVHVW,--:LGRFHRLTAGLNILIPFIDRVAYRHS KEIPLDVPSQVCITRDNTQLTVDGI IYFQVT0PKLASYGSSNYIMAITQLAQTTLRSVIG RMELDKTFEERDEINSTVVAALDEAAGAWGVKVLRYE IKDLVPPQE ILRSMQAQ ITAERE
KRARIAESEGRKIEQINLASGQREEIQQSEGEAQVASNAKIARINRAGEAESLR
LVAEANAEAIRQIAAALQTQGGAAVNLKIAEQYVFNLAIESNTLIMPANVADIGS
L
ISAGMKIIDSSKTAK*
ZV19_519 <SEC ID 3197> NEF LLVAVAVFGFKSFVIVI PQQEVHVVERLGRFHM),LTAC: '4IL PE'DRVAYRHSL
IYFQVTPKASYGSSNYIMA:QLQTLRSVIG~
RMJELDKTFEERDE NSTVVAALD)EAAAWGV{VL RYEIKDLVPPQEI LRSMQAQ ITAERE KPAIZER EILSQEEIQEEQANSAKAI.
KEE'
LVAEANAIRQIAALQTQGGADAVNLKIAEQVA.NNLKSNTLIMPANIADIGSL
ISAGMKllDSSKTAK* zv20_5:9ASS <SEQ ID 3198> KE-IPLDVPSQVCITRDNTQLTVDC IIYFQVT0)PKLASYGSSNYIMP:TQLAQTTLRSVIG RMELDKTFEERDE 1NSTVVAALDEAAGAWGVKVLRYE :KDLVPPQE L RSMQAQITAERE
KPRASEGRK:EQINLASGQREAEIQQSEGEAQAAVASNA--IARINAKGEAESLR
LVE-AARIALTGAANKAQVA,.NAETIPNAIS
ISAGMKI7:DSSKTA(* ZV21_5:9ASS <SEQ ID 3199> ME7:LAAF7SVIQEHVRLRHRLALI PIR7YHs KE-IPLDVPSQVCITRDNTQLTVDG: IYFQVTDPKLASYGSSNYIMJ:TQ:A!'QTTLRSV7.
9 IMEDKFEERD)ENSTVVAALD)EAAGAWGVKVLRYEIKD.VPPQE
LSMIQAQ:T-AERE-
KRRIAESEGRn{:EQINLASGQREP.EIQQ GAAVASI--K.RN KE.2 LV.ANEIQAA QQGA'NKA:YA-NKET-MA
VDIGSL
!SAGMN::DSSKTAK*
ZV22_ 519ASS <SEQ ID 3200> NEFIILL:AAVVFGFKSVTPQQEVH RRFRA.TALIL7PIRARS FE: PLD[VPSQVC:TRDNTQLTVDGIYFVDKAGSNIA:Q
AQTRSI
EDTEREISVSLEAGWVV I7KDLVPPQE:LRSMQAQTAERE
KAR
2
A.SEGRKIE-QINLASGQREKIQQSEGEAQVASNAEKIARINAEESL
ISAGM1KT:DSSKTAK* ZV26_ 519 <SEQ ID 3201>
MEFILAVFFSVIQEHVRGFPATGNLPIRARS
KE: PLDVPSQVCITRDNTQLTVDG-
IYFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIG
RMELDKTFEERDEINSTVVALDEAAGAWGVKLRYEIKDLVPPQE
ILRSMQAQITAERE
KRARIAESEGRKEQNLASGQREIQQSEGEAQVASNEKIARINAGAEL
LVAEANAEAIRQIAALQTQGGADALKIQYV-NLAES
ILIMPNAIS
ISAGMKIIDSSKTAK*
ZV27 519 <SEQ ID 3202>
MEFFIILLVAVAVFGFKSFVVIPQQEVHVVERLGRFHRALTAGLNILIPFIDRVAYRHSL
KEILDVSQCITDNQLTDGIYFQVTDPKLASYGSSNYIMA:TQLAQTTLRSI
RMLKFEDENTVADAAAGK
YIKDLVPPQEILRSMQAQITAERE
KERARIAESEGRKIEQINLASGQREA
IQQSEGEAQAAVNASNKIAINGEESLR
ISAGMKz::DSSKTAK* ZV28_519 <SEQ ID 3203>
MEFFIILAVAVFGFKSFVIPQQEVHVV.RLGRFHRLTAGLNILPIRARS
KEIPLDVPSQVC:TRDNTQLTVDGI
IYFQVTDPKLASYGSSNYIMAITQLQTTLRSVIG
RMELDKTFEERDE INSTVV ALDEAAGAWGVKVLaYE IKDLVPPQE ILRSMQAQITAERE KRRIAESEGRKIEQINLASGQREAIQQSEGEAQAVNASNAF:laN.KEEL
ISAGMK:IDSSKTAK*
ZV29_519ASS <SEQ ID 3204> MEF'I ILLAAVAVFGFKS FVVI PQQEV.HVVERLGRFRALTAGLNILIPFIDRVAYRHS KEI PLDVPSQVCITRDr4TQLTVDGII
YFQVTDPKLASYGSSNYIMATQLAQTTLRSVI.
RSIELDKTFEERDE INS IVVSALDEAAGAWGVKVLRYE IKDLVPPQE
ILRSMQAQITA.RE
KRARIAESEGRKIEQINLASGQREPEIQQSEGAQVNASNAEIRNAKEEL
ISAGMKIIDSNKTAKQ~IENLM~
AIS
ZV32 519 <SEQ ID 3205> MEFFI ILLAAVAVFGFK SFVVIPQQEVHVVEPJ GRFHRnALTAGLN'IFIDRVAYRHS- KEIPLDVPSQVCITRDNTQLTVDGI IYFQVTDPKLASYGSSNY:M4A:TQLAQTTLRSVIG RME-LDKTFEERDE INSTVVSALDEAAGAWGVKVLRYE IKDLVPPQE
ILRANQAQITAERE
K RAIAESEGRKIEQINLSGQPEAEIQSGAAVANEKAIRKEEL
ISAGMKIIDSSKTAK*
ZV96 519 <SEQ ID 3206>
MEFFIILLAVAVFGF(SFVVIPQEHVRGFRLAGNLPIRARS
CVPSQVCI.DNQLTVDG I YFQVTDPKASYGSSNYIMAITQATLS RMELDTEERDEINSTVVALDEAGAWGVKLRYEKDLVPPD
LRSMQAQITAERE
KRRIAESEGRKIEQINLASGQREATQQSEGEAQAAVASNAEKIRNAGAET
LVAEANAEAIRQIAALQTQGGADALK:EQWAANLAESNTLIMPANVADIGSL
ISAGMK:IDSSKTAK*
Figure 22 shows the results of aligning the sequences of each of these strains. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. As is readily discernible, there is significant conservation among the various strains of ORF 519, further confirming its utility as an antigen for both vaccines and diagnostics.
EXAMPLE
Table 6 lists several Neisseria strains which were used to assess the conservation of the sequence of ORF 919 among different strains.
Table 6: 919 gene variability: List of used Neisseria strains

Identification number   Strains             Source / reference
Group B
zm01                    NG6/88              R. Moxon / Seiler et al., 1996
zm02                    BZ198               R. Moxon / Seiler et al., 1996
zm03                    NG3/88              R. Moxon / Seiler et al., 1996
zm04                    297-0               R. Moxon / Seiler et al., 1996
                        1000                R. Moxon / Seiler et al., 1996
zm06                    BZ147               R. Moxon / Seiler et al., 1996
zm07                    BZ169               R. Moxon / Seiler et al., 1996
zm08n                   528                 R. Moxon / Seiler et al., 1996
zm09                    NGP165              R. Moxon / Seiler et al., 1996
zm10                    BZ133               R. Moxon / Seiler et al., 1996
zm11asbc                NGE31               R. Moxon / Seiler et al., 1996
zm12                    NGF26               R. Moxon / Seiler et al., 1996
zm13                    NGE28               R. Moxon / Seiler et al., 1996
zm14                    NGH38               R. Moxon / Seiler et al., 1996
                        SWZ107              R. Moxon / Seiler et al., 1996
zm16                    NGH15               R. Moxon / Seiler et al., 1996
zm17                    NGH36               R. Moxon / Seiler et al., 1996
zm18                    BZ232               R. Moxon / Seiler et al., 1996
zm19                    BZ83                R. Moxon / Seiler et al., 1996
zm20                    44/76               R. Moxon / Seiler et al., 1996
zm21                    MC58                R. Moxon
zm96                    2996                Our collection
Group A
zm22                    205900              R. Moxon
zm23asbc                F6124               R. Moxon
z2491                   Z2491               R. Moxon / Maiden et al., 1998
Group C
zm24                    90/18311            R. Moxon
                        93/4286             R. Moxon
Others
zm26                    A22 (group W)       R. Moxon / Maiden et al., 1998
zm27bc                  E26 (group X)       R. Moxon / Maiden et al., 1998
zm28                    860800 (group Y)    R. Moxon / Maiden et al., 1998
zm29asbc                E32 (group Z)       R. Moxon / Maiden et al., 1998
zm31asbc                N. lactamica        R. Moxon
Gonococcus
zm32asbc                Ng F62              R. Moxon / Maiden et al., 1998
zm33asbc                Ng SN4              R. Moxon
fa1090                  FA1090              R. Moxon

References:
Seiler A. et al., Mol. Microbiol., 1996, 19(4):841-856.
Maiden et al., Proc. Natl. Acad. Sci. USA, 1998, 95:3140-3145.
The amino acid sequences for each listed strain are as follows: FAI090 <SEQ ID 3207> MKKHLLRSALYGIAAA.TLPACQSRS IQTF-PQPDTSVINGPDRPAGI
PDPAGTTVAGGGAV
YTVVPHLSMPHWAAQDFAKSLQS FRLGCANL!(NRQGWQDVCAQAFQT PVH-SFQA1(RFFER Y FT PWQ VAGNGS LAGT VTGYYE P VLKG DGRRT ERARFPIYIDFS1PGRGl
OGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYAKEPVIGRYMADKGYL
KLGQTSMQGIKAYMRQNPQRAVLGQNPSYI
FFRELAGSGNEGPVGALGTPLMGEYAGA
IDRHYITLGAPLFVATAHPVTRKALNRLIMAQDTGSAIKGAaDFGDEELK Z2491 <SEQ ID 3208> rKKYLFALCG:;AAI:ACQSKS IQTPQPDTSVINJGPDRPVG: PDPAGTTVGCGGAzV
LVRIRQTGKNSGTIDNTGGTHTADLSQFPITARTTAIGFGRLYTNIGA
VDHILALVAAPTKLRIMAQDTGSA:KGAVRVDYFWGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP-
ZMOI <SEQ ID 3209> MKKYLFAALYGIAAILACQSKS IQI FPQFDTSVINGPDRPVGIPDPAGTTVGGGGAV YTVVPHLSLHWAAQASLQSFRGCANL~aQWQDVCAQFTVSQKF
LVRIQTGNSGIDNTGGTHTADLSRFPITA.RTTA:KGRFEGSRFLPHRQIGA
DGKAILGYAEDPVEFFMHIQGLTSGKYLR7YADKNHYS~YAKY QKTTGYWgQLPNGMKPEYR
DGAKARDYG~AEA.
ZM02 <SEQ ID 3210> MKKYLFRAALYGIAAALACQSS7QTFPQPDTSVINPRVI
PDAGITVGGGGAV
YTVVHSLPHWAAQDF.KSLQS FRLGCANLK.NRQGWQDVCAQAFQT.PVHS FQA1(QFFER YFPWQVAGNGSLAG VTGY EPVLRAARPYID.'SPPALSK
D)GKAPILGYAEDPVET-FFMHIQGSGRLKTPSGKYIRIGYADNHYSGYAKY
VDRHYITLGAPLVATHPVTRKALNRLIMQDTGSAKAaDFG
EGLAGF(
QKTTGYVW4QLLPNGMKPEYRP* ZM03 <SEQ ID 3211> MKKYLFRALYGIAAILACQSKS IQTFPQPDTSVINGPDRPVG
IPDPAGTTVGGGGAV
VRRQTGKNSGTIDNTGGTHTADLSRFPITARTTARKGRFGRLYTNIGA
DGKAPILGYAEDEFELHFQSGRKTPGKYRGADNRYVIRMAKY
VDHI'GALVTHVTKLRIAQT--
KGAVRVDYFWGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP*
ZM04 <SEQ ID 3212>
MKKYLFRAALCGIAAAILAACQSKSIQTFPQPDTSVINGPDRPGAACTGGV
YTVVPHLSLPHWAAQDFAJCSLQS FRLGCANLP(NRQGWQDVCAQAFQT ?VHS FQAXKQFFE7R YFPQANSATTYEPLGDRAA IGPDDFI
SVPLPAGLRSGKA
LVRIRQTGKNSCTI
DNAGGTHTADLSRFPITPJRTTAIKGRFEGSRFLPYHTRNQIGA
DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNEPYVIGRYMADKGYL
KLGQTSMQGIKAYMQQNPQRLAEVLGQNPSYI
FFRELTGSSNDGPVGALGTPLMGS.YAGA
VDHILALVTRVRANLMADGAKARCFGGEGLG
QKTTGYV'WQLLPNGMKPEYRP*
<SEQ ID 3213> MKKYLFRALYGIAAAILAACQSKS
IQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV
YTVVPHLSLPHWAAQDFAJ(SLQS FRLSCANLKNRQGWQDVCAQAFQTPVHS
FQAKQFFER
YFPQANSAGVGYPLGD
RAAFIGPDDEISVPLPAGLRSGKA
LVRIRQTGKQJSGTI DNTGGTHTADLSRFP ITART'TAIKGRFEGSRFL 2YHTRNQINGGAL DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRGYAD
NHPVIGRYMADI(GYL
KLGQTSMQGI
KAYMRQNPQRLAEVLGQNPSYIFEFRELAGSSNDGPVGALGTPLMGEYAGA
VDRHYITLGAPLEVATAHPVTREALNRLIMAQDTGSAI
KGAVRVDYFWGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP*
ZM06 <SEQ ID 3214> MKKYLFRAALYGIAAAIIL.ACQSKS IQTFPQPDTSVINGPDRPVGI
PDPAGTTVGGGGAV
LVR:RQTGKNSGT
IDNTGGTHTADLSRFPITATTAIKGR-.EGSRFLPV'HTRNIGA
DGKAPILGYAEDPVELFEHIQGSGRLKTPSGKYIRFGYADKNHYSGYAKY
KLGQTSMQGIKSYMRQNPQRAVLGQNPSYI FTRELAGSSNDG
VGALGTPTMGEYAGA
VDRHYITLGAPLVAT~l PVTRKALNRLIMAQDTGSAIKGAVRVD)Y
FWGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP*
ZM07 <SEQ 1D 3215> MKKYLFALYGIAAILACQSKSIQTFQPDTSVTNPRV7DPGTGG YTVVPHLS LPHWAAQDFAKSLQS FRLGCANLKNRQGWQDVCAQA WQT'PVHS
FQAKQFFER
YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQAR.PI -YGI PDDF:7SVPL
PAGLRSGKA
LVRIRQTGK(NSGTIDNTGGTHTADLSaE CA RTT--iKGRFEGSRFTDVRTRNIQINGGAL DGKAPILGYAEDPVELF~FMHIQGSGRLKTPSGKYIRIGYADK1EPVIRNAKY
QKTTGYVWQLLPNGMKPEYRP-
ZM08N <SEQ ID 3216> MKYFALG:AIACS ?TPPTVIGDPG-DPA3TTVGGGGAVI YTVVPHLS LPHWAAQDFAKSLQS FRLGCANLKNRQGWQDVCAQAFQ. PVHS FQAYQFFER
QKTT'GYVWQLLPNGMKPEYRP*
ZM09 <SEQ ID 3217> MKKYLFRAALCGIAALACSSQFPPTVN
OPGPAAGTTVAGGGAV
YTVVPHLS LPHWAAQD FAKS LQS FRLGCANLK(NRQGWQDVCAQAFQT. VHS FQAKQF'FER YF-rPWQVAGNGSAGTVTGY-PVLKGDDRRTAQRFPTYGI
PDDF:SVPLPAGLRSGKA
LVRIRQTGKNSGTI
DNTGGTHTADLSQP!TARTTAIKGREEGSRFPYTNIGA
DGAIGADVLFMIGGR
PGYRGAKEPVIGKYMADKGYL
KLGQTSMQGIKSYMRQNPQRAEVLGQNPSYI
FFRELTGSGNDGPVGALGTPLMGEYAGA
VDRHYTGAwQLpVATAHPVyRA
AQDTGSAIKGAVRVDYWGYGDEAGELAGK
ZMIO <SEQ ID 3218> 52KKYLFRALCGIAAILACQSKS IQTFPQPDTSVINGPDRPVGI
?APAGTTVAGGGAV
LVRIRQTGKNSGTIDNTGGTHTADLSQFPITARTTAIKGRFGRLYTNIGA
DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNHPVIGKCYMADKGYL
KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSYI
FFRELTGSGNDGPVGALGTPLMGEYAGA
VDR1Y ITLGAPLFVATAHPVTRKALNRLIMAQDTGSAIKGAVRVDYFWGDEELK C KTTG YV WQLL PNGMK PE YR P ZM11ASBC <SEQ ID 3219> NKEYLFRALCGIAAAILACQSKS IQTFQPDTSVINGPDRPVG:
PAAGTTVGGGGAV
YF-TPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQARFPIYGI
PDDFISVPLPAGLRSGKA
LVRIRQTGKNSGTIDNAGGTHTADLSRFPITARTTAIKGRFEGSRFLYTNIGA
DGKAPILGYAEDPVELFEMHIQGSGRLKTPSGKY:RIGYADKNHPVIGKYMIADKGYL
KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSYIFFRELTGSSNDGPVGGTLEYA
VDHILALVTHVRANLMADGAKARDFGGEGLG
QKT-GY VWQLL PNGMK PEYR P ZM12 <SEQ ID 3220>
MKKYLFRAALYGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV
YTVVPHLS LPHWAAQDFAJKSLQS FRLGCANLKNRQGWQDVCAQAFQTPVHS
FQAKQE'FER
YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQARPIYGI
PDDEISVPLPAGLRSGKA
LVRIRQTGKNSGTIDNTGGTHTADLSRFPIT~ARTTAIKGRFEGSRFLYTNIGA
DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADNEPYVIGRYMADKGYL
KLGQTSMIQGIKSYMRQN PQRLAE VLGQNPSY I FRELAGSSNDGPVGALGT
PLMGEYAGA
VDRHYITLAPLVATAHPVTRKLNRLIMAQDTGSAIKGAVRVDYFGYGDEAGELAG
QKTTGYVWQLLPNGMKPEYRP-
ZM:3 <SEQ ID 3221> MKKYLFRALYGIAAILACQSKS IQ'TFPQPDTSVINGPDRPVGI
PDPAG-TVGGCGAVI
YTIPLLHAQFKLSRG~NLNQWDCQFTVSQKFE
YFTPWQ)VAGNGSLAGTVTGYYEPVLKGDDRRTAQARFPIYGTPDD
SVPL=;AGLRSGKA
LVR:7RQTGKNSGTI DNTGGTHTADLSRFP ITARTTAIK3RFiGSRFLPY:4TRNQINGGAL DGA7GADVL-MIGGRKPGYRGA
EPVIGRYMADKGYL
KLGQTSMQGIKAYMRQNPQRLEVLGQNPSYI
FRELAGSSNDGPVGALGTPLMGEYAGA
VDRHYI TLGAPLFVATAH~PVTRKALNRL1MAQDTGSA:KGAVRVDY.'4GYGDEAGELAG
QK-TGYVWQLLPNGMKPEYRP'
ZM14 <SEQ 1D 3222>
MKYFALGAAIACSSQFQPEVNPRVIAPAGTTVAGGGAV
YTVVPHLSLPHWAAQDFKSLQS FRLGCANLKNRQGWQDVCAQAFQTPV!HS FQAKQ FFER
YF-
7 PWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQARFPIYG: PDDFI SVPLPAGLRS~GK
LVR:RQTGNSGTIDNAGGTHTADLSRFPTATTAKGRFEGSRFLPYTNIGA
DGKAP:-LGYAEDPVELFFMHIQGSGRK-~PSGKYIRGADNHPVIGKYMADKOYL
KTGQTSMQGIKSYMRQNPQRAVLGQNPSYI
FFRELTGSRNDGPVGATCTPLMGEYAGA
VDRHY::LGAPLFVATAPVTRKALNRLIM -AQDGSA::GVVY'
GEGLG
QKTTGYV-WQLLPNGMKPEYRP-
ZM1li <SEQ ID 3223> MKKYLFALYGIAILACQSS:QTFPQPTSV7NGPDRPVG ?DLAGTT.V GGAV YTVHSPWAQFKLSRGCNKHGQVCQF
FVSQAKQF':ER
YFPQANSATTYEPLG RAA7:GPDZDFISVPLPAGLRSGKA LVR:RQTGKNSGTIDNGTTDS
:ATAKGFGRLYTNIGA
DZKAPILGYAECPVELFMHIQGSGPLKTPSG.VIRIGYAD :GHYS-'KYMAkDKGYL KLGQTSMQGIKSYMRQNPQRLA2VLGQNPSYI
FERELTGSGNDGPVGA-GTPLMGEYAGA
VDRHYITLGAPLATAPVTRKPLNRLIMQDTSAIKGAIRVO-GGEGL
QKTTGYVWQLL
PNGMKPEYR?*
ZM16 <SEQ ID 3224> M~KKYLFRAALCGIAAILAACQSKS
TQTFPQPDTSV:NGPGRPVGIPDPAGTTVGGGGAV
Y-VPHLSLPHWAAQDFAKSLQS FRLGCANLKNRQGWQDVCAQAFQTPVHS
FQAKQFFER
YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQARFP:
GPDDF:SVPLPAGLRSGKA
LVRIRQTGKNSGT:
DNTGGTHTADLSQFPITARTTAIKGRFEGSRFLPYTN-GA
DGKPLGYAEDPVELFFMHIQGSGRLKTPS.KY:R:GYADKNHYSGKYMADKGYL
KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSY:
FFRELTGSSNDGPVGALGTPLMGEYAGA
VDRHYITLGAPLVATHPVTRKALNRLIMQDTGSA
FVVYWGYGDEAGELAGK
QK-TGYVWQLLPNGMKPEYRP*
ZM17 <SEQ ID 3225> MKKYLFRAALYGIAAAILAACQSKSIQTF'PQPDTSVINGPDRPVGIPDPAkGTTVGGGGA YTVVPHLSLPHWAAQDFAKSLQS FRLGCANLKNRQGWQDVCAQAFQTPVHS FQAKQrFER YF-PWQVAGNGSLAGTVTGYYEPVLKGDRRTAQARFPIYG:
PDDFISVPLPAGLRSGKA
LVRIRQTGKNSGTIDNTGGTHTADLSRFPITARTTAIKGRFEGSRFL-PYHTRNQINGGA
DGAPILGYAEDPVELF1AIIQGSGRLKTPSGKYIRIGYADKNEHPYVSIGKYMADKGYL KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSYI
FFR-ELTGSSNDGPVGALG-P.MGEYAGA
VDRHY:TLGAPLFVATAHPVTRKALNRLIMAQDTGSA:IKGAVRVDY-WGYGDEAGELAGK
QK'rTGYVWgQLLPNGMKPEYRP- ZM18 <SEQ ID 3226> MKKYLFRAALYGIAAAILAACQSKS IQTFPQPDTSVINGPDRPVGI
PDPAGTTVGGGGAV
YTVVPHLS LPHWAAQDFAKSLQSF~RLGCANLKNRQGWQDVCAQA-'QT PVHS 8QAKQFFER YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRR-TAQARFPIYGI
PDDFISVPLPAGLRSGCA
LVRIRQTGKNSGTIDNGTTDSFIA
AKGFGRLYTNIGA
DGAPILGYAEDPVE-LFTMHIQGSGRLKTPSGKY7RIGYADKNEHPYVS
IGRYMADKGYL
KLGQTSMQGIKSYSMRQNPQRLAEVLGQNPSYI
FFRELAGSSNDGPVGALGTPLMGEYAGA
VDR1HYI TLGAPLF'VATAH-PVTRKALNRL IMAQDTGSAI RGAVRVDYFWGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP-
ZM19 <SEQ ID 3227> MKKYLFAALYGIAAAILACQSW2 IQ2.FPQPDTSVINGPDRPVGI
PDPAGTTVGGGGAV
YTVVPHLSLPHWAAQDFAKSLQS FRLGCANLKNRQGWQDVCAQAFQTPVHS
FQAKQFFER
YFTPWQVAGNGSLAGTVTGYYE FVLKGDDRRTAQAR FPIYGI PDDFISVPT PAGLRSGKA LVIQGNGINGTTDSFIATA-GFGRLY
NIGA
DGK:(PILGYEDPVELF1MHIQGSGRLKTPSGKVIRIGYADKNEHPYVSIGRYMADKGY KLCQTSMQGIKSYMRQNPRLAEVLGQNPSYI F7RE-AGSSNDGPVGALG-~P:VGEYAGA QKTTGYVw4QLLPNGMiKPEYRP* <SEQ ID 3228> MKKYLFRAALYGIAA:LAACQSKSQTFPQDTSVINGPDRPVGIPPAGrnVGGGGA YTVVPHLSLPHWAAQDFAj(SLQS £RLGCANLKNRQGWQDVCAQAFQT PVHS FQAKQFFER YFTPWQVAGNGSLGTVTGYYEPVKGDRRTAQRFIYG: PDDFI SVPLPAGLRSGKA LVRIRQTGKNSGT IDNTGGTHTADLSRFPI TARTTAI KGRrEGSRFLPV?!TRNQINGGAL DGKAPILGYAZDPVET
FFMHIQGSGRLKTPSGKYTRIGYADKEHPVI.GRYMADKGYI
KLQSQISMQPRAVGNSYFRLGSDPGLTLGYG
VDHITGPFAAPT.ANLLADGAKARDFGGEGLG
QKTTGYV-WQLLPNGMKPEYRP-
ZM21 <SEQ ID 3229> MKKYLFRALALYGIAAILAACQSKS IQTFPQPDTSVINGPDRPVGI
?DPAGTTVGGGGAV
YTWrPHLSLPHWAAQDFACSLQS FRLGCANLKNRQGWQDVCAQAFQPVHS BQA -QFFER YFPQANSATVGYPLG RTQRpYIDFSVLAG-PSGFA LVRIRQTGKNSGT
DTGHALRPIATAKRESRLY-NIGA
DGKAPILOYAEDPVrELFFMHIQGSGRLKTPSGFY:R:GYADKNEHPYVS IGRYL1ADKGYL KLGQTSMQGIKSYMRQNPQRLAZVLGQNPSY:
FFRELAGSSNDGPVGALTLMGEYAGA
VD-YTGPFAAPTKLR~AQTSIGVVY-SGEGLG
QKTTGYVW4QLLE'NGMKPEYRP- ZM22 <SEQ ID 3230>
M(KYLFRAALCG':AAILAACQSKSIQPQPDSVINGDRPVGIPCAT..GGA
YTvVHSPWADASQFRGANKRGQVAAQPHVDKFE YFTPWQVAGNGSLAG'VTGYYE PVLKGDDRRTAQARFP:YGI PDDF:SVPI PArLRSGKA LVR7RQTGKNSGTIDNTGGTHTADLSQFPIT~zLTTAIKGRFEGSR.LPYHTRNIGA DGKAPI ZGYAEDPVELF MIQGSGRLKTPSGKY:R:GYADKEHPYJS
IGRYMADKGYL
KLGQTSMQGIKAYM~QQNPQRLEVLGQNPSYI
FFRELTGSSNDGPVGALGTPLMGEYAGA
VDRHY ITLSALFVATAHPVTRKALNRL IMAQDTGSAI KGAVRVDYFTJGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP-
ZM23ASBC <SEQ ID 32131> MKKYLFRAALYGIAAAILAACQSKS IQT TQPDTSVINGPDP ?VG: PDPAGTTVGGGGAV
YTV-PHLSLPHWAQDFAKSLQSFRLGCANLKRQGWQDVCAQAQTPVHSQKFE
YFPWQVAGNGSLAGTVTGYYEPVLKGDDRTAQAR..PIYGIPDDFISVPLALSK
LVIQGLSTDAGHALRPTRTIGFGRLYTNIGA
DGKAPILGYAEDPVELFFMHIQSRKTSK-R YDNEP
CIKYMADKGYL
KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSYIFFRELAGSSNGVAI PLM1GEYAGA VDRHYITLGAPL
VATAHPVTSKALNRLIMAQDTGSAIKGAVRVDYFWGYGEGT
MKE PGYVWQLLPNGMKPEYRP- ZM24 <SEQ ID 3232> MKKYLFRAALCGIAAAILAAcQsKSIQTFPQDTSVINGPDPGAATVGGV YTVVHLSPHWAAQDFAIS-QS FaLGCANZKNRQGWQDVCAQAFQT PVHS EQAKQFE'ER
YFTPWQVAGNGSLAGTVTGYYEPVTQPR-IGIDFSPLALSK
DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYDNHVSGMAKL
KLGQTSMQGIKSYMRQNPQRLAVLGQNPSYI
FFRELTGSGNDGPVGALGTPLMGEYAGA
VDHILAL7AAPTKLRLMQTSIG V1YWYDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP-
<SEQ ID 3233> MKKYLFRPAALCG IAAA:LAACQSKS IQTFPQPDTS VINGPDRPVGI
PAPAGTTVAGGGAV
YTVPHLSLPHWAQDFASLGCKNQGWDCAAJTPHFAKF
LVRIRQTGKNSGTIDNTSGTHTADLSQFPITARTTA:KGRFGRLYTNIGA
DGKAPILGYAEDPVELF1MHIQGSGRLKT~PSGKYIaIGYAD
NHPVIGKYADKGYL
KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSYIFE'RELTGSGNDGPVGALGTPLMGEYAGA
VDRHYITLGAPLFVATAH~PVTRKALNRLIMQDTSAIKGAVRVDYFWGYGDEGLG
QKTTGYVWQLLPNGMKPEYRP ZM26 <SEQ ID 3234> MKKYLFRAALYGIAAILAACQSKSIQTFPQPDTSV7NGPDRGPOAC;T-VGGGGAV]
YTVVPHLSLPHWAQDFAKSLQSFRLGC;LLKNQGWQDVCQ-TVSAKFE
Y-TPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQA~aPIYGIPDDF:
SVPLPAO-LRSGKA
LVRIRQTGKNSGTIDNTGGTHTADSQFITRTIGFES7PHTNIGA DGKAPIZGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNHPVIGRYM4ADKGYL KLGQTSMQG:-KAYM1QQNPQRrAE7VLGQNPSYT
FFRELTGSSNDGPVGALGTPLMGEYAGA
VDRHY ITLGAP L VAT AHPVTRKALNRLTMAQDTGSAIKGAVRVDYP4GYG
DEAELG
QKTTGYVWQLL
PNGMKPEYRP-
ZM27BC <SEQ ID 3235> MK1YLFRALYGISAAILAACQSKS QTFPQPDTSVIN4CPDRPAG:
PDAGTTVAGGGAV
YTVVPLSLPHWAAQDFASLQSFRLKNQWQVAAQPVSQKFE
LVRIRQTGKNSGTIDNAGGTHTADLSRFPI.TARTTAIKGRFESFPP~iIIGA 2GKAPI LGYAEDPVELFFMHIQGSGR'-KPSGKYIRIGYADKNH
VIGRY.MADKGYL
KLGQTSMQGIKSYMRQNPQRAVLGQNPSYIF
'E.GSNGVAGTLGYG
MKE PGYVWQLLPNGMKPEYRP- ZM28 <SEQ ID 3236> YFPQANSAGVGYPLG RTQR7:GPDDFISVPLPAGLRSGK.A LVR7RQTGKNSGTI
DNTGGTHTADLSQFP:T'RTTAIKGRFEGSRFLYTNIGA
DGKAPILGYAEDPVELFFMHIQGSGELKTPSGKYIRIGYADKNHPVIGRYMvADKGYL
KLGQTSMQGIKYMRQNPQRLAVLGQNSYI-~FLAGSSNDGPVGALGTPLMGEVAGA
VDHILALVTHVRANL7-QTSIGVV~-GGEGLG
QKTTGYVWQLLPNGMKPEYRP-
ZM29ASBC <SEQ ID 3237> MKYFAACIAI AQKIQTFzPQPDTSVINGPDRPVGIPDPAGTTVGGCGG YTVVPHLSLPHWAAQDFAKSLQS FRLGCA.NU(NRQGWQDVCAQAFQTPVHS
FQAKQFFER
YFTPWQVAGNGSLAGTVTGPLKyRE
AAFPYIPDIVPPGLSK
DGAPILGYEDP ELMHQGGRLTPGKYRS
DNHYVIRMAKY
VDRHY:TLGAPLFVATTHITRKLNRLIMAQDTGS-KGAVRVDYE-WGYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP-
ZM31ASBC <SEQ ID 3238> MKKHLFRAALYGIAAAI LAACQS S IQTFPQPDTSI IKCPDRPAGI PDPAGTTVGGGGAV YTVVPHLSLPHWAAQDFAKS LQS FRLGCANLKNRQGWQDVCAQAFQTPVHS
FQAKQFFER
YFTPWQVAGNGSLGTVTGYYEPVLKGDDRRTAQARFPIYGIPDDFISVPLPAGLRSGKA
LVRIRQTGKNSGTI
DNAGGTHTADLSRFPITARTTAIKGRFEGSRFLPYHTRNQINGGAL
DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADNEHPYVSIGRYMADKGYL
KLGQTSMQG IKAYMRQNPQRLAEVLGQNPSYVF RLAGSGNDGPVGALGT
PLMGEYAGA
VDHILALVTHVRANLMADGAKARD-GGEGLG
QKTTGYVWQLLPNGMKPEYRP-
ZM32ASBC <SEQ ID 3239>
MKKHLLRSALYGIAAAILAACQSRSIQTFPQPDTSVINGPDRPAGIPDPAGTTVAGGGAV
YTVVPHLSMPHWAAQDFAKSLQS FRLGCANLKNRQGWQDVCAQAF.QT PVHS FQAKRFFER YFTPWQVAGNGSLAGTVTGYYEPVLKGDGRRTEARFPIYGI
PDDFISVPLPAGLRGGKA
LVRIRQTGKNSGTIDNAGGTHTADLSRFPITARTTAIKGRFEGSR..LP
'TRNQINGGAL
DGKAPILGYAEDPVELFFMH:QGSGRLKTSGKYIRGYADNEPYVSIGRYLMDKGYL
KLGQTSMQGIKAYMRQNPQRLAEVLGQNPSYI
FFRELAGSGGDGPVGALGTPLMGGYAGA
IDRHYITLGAPLFVATAHPVTRALNRLIMQDTGSAKGAVRVDYF-GYGDEAGELAGK
QKTTGYVWQLLPNGMKPEYRP-
ZM33ASBC <SEQ ID 3240> MKKHLLRSALYGIAAAILAACQSRS IQTFPQPDTSVINGPDRPAGI
PDPAGTTVAGGGAV
YTVVPHLSMPHWAAQDFACSLQS FRLGCANLKNRQGWQDVCAQAFQTPIHS
FQAK(REFER
YBFTPWQVAGNGSLAGTV'rGYYE 2VLKGDGRRTEAFPIYGI PDDF7SVLPAGLRGGK.
KL GQTSMQGIKSYMRQNPHKLAEVLGQNPSYI
FFRELAGSGNEGPVSALG-.?MGEYAGA
IDRHYITLGAPLFVATAHPVTRKLNRT
IAQDTGSAIGAVRVDYFGYGD.AG.LAGK
QBTTGYVWQLLPNGMKPEYRP-
ZM496 <SEQ ID 3241> MKKYLFRAALYGIAAAILACQS<S IQTFPQPDTSVINGPDRPVGI
PDAGTTVGGGGAV
YTVVPHLSLPHWAAQDFAKS LQS FRLGCA NLKNRQGWQDVCAQAFQT ?VHS FQAKQFFER YPIPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQRF~PIYGI PDDF: SVPLPAGLRSGKA LVRIRQTGKNSGTIDNTGGTHTADLSRr Rr-A::GRFEGSRFLPYHTRNQINGGAL DGKPILGYAEDVELFD4HIQGSGRLKT?SGKYIRIGYADNEHPYSI.RYMADKGYL KLGQTSMQGIKAYMRQN PQRLAEVLGQNPSYI FFRELAGSSNDGPVGA:,G-
PLMGEYAGA
VDHILALVTHVRANLMQTSIGVVY'GGEGLG
QKTTGYVWQLLPNGMKPEYRP*
Figure 23 shows the results of aligning the sequences of each of these strains. Dark shading indicates regions of homology, and gray shading indicates the conservation of amino acids with similar characteristics. As is readily discernible, there is significant conservation among the various strains of ORF 919, further confirming its utility as an antigen for both vaccines and diagnostics.
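The distinction drawn above between identical positions (dark shading) and positions occupied by amino acids with similar characteristics (gray shading) can be illustrated with a minimal sketch. It assumes the strain sequences have already been aligned to equal length with "-" marking gaps. The similarity classes in SIMILAR_GROUPS and the function names are hypothetical choices made for illustration only; the specification does not state which grouping or software was used to produce the figures, and the input shown is a toy fragment rather than the listed sequences.

```python
from collections import Counter

# Hypothetical similarity classes (an assumption, not taken from the specification).
SIMILAR_GROUPS = [
    set("AVLIMC"),  # small / hydrophobic
    set("FWY"),     # aromatic
    set("KRH"),     # basic
    set("DE"),      # acidic
    set("STNQ"),    # polar, uncharged
]


def classify_column(column):
    """Classify one alignment column as 'identical', 'similar' or 'variable'."""
    if "-" in column:
        return "variable"          # any gap makes the column variable
    residues = set(column)
    if len(residues) == 1:
        return "identical"         # same residue in every strain
    if any(residues <= group for group in SIMILAR_GROUPS):
        return "similar"           # all residues fall in one similarity class
    return "variable"


def conservation_summary(aligned):
    """Count identical, similar and variable columns in an alignment."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    counts = Counter(classify_column(col) for col in zip(*aligned))
    return {key: counts.get(key, 0) for key in ("identical", "similar", "variable")}


if __name__ == "__main__":
    toy_alignment = ["MKKYLF", "MKKHLF", "MRKYLF"]  # toy input, three "strains"
    print(conservation_summary(toy_alignment))
```

Under these assumptions, a high proportion of identical and similar columns across strains corresponds to the conservation that the figures are said to display.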
EXAMPLE 16
Using the above-described procedures, the following oligonucleotide primers were employed in the polymerase chain reaction (PCR) assay in order to clone the ORFs as indicated:
Table 7: Oligonucleotides used for PCR to amplify complete or partial ORFs
ORF primer Sequence Restriction sites
001 Forward CGCGGATCCCATATG-rGGATGGTGCTGGTICAT BamilIi- NdeI Reverse CCCGCTCGAG-TGCCGTCTTGTCCCAC XhoI 003 Forward CGCGGATCCCATATGGTCGTATTCGTGGC Bamrn- Reverse CCCGCTCGAG-AAAATCATGACACGC(-;C 005 Forward CGCGGATCCCATATG-GACAATATTGACATGT Reverse CCCGCTCGAG-CATCACATCCGCCCG 006 lorward CGCGGATCCCATATG-CTGCTGGTGCTGG Reverse CCCGCTCGAG-AGTTCCGGCTTTGATGT 007 Forward CGCGGATCCCATATG-GCCGACAACAGCATCAT Reverse CCGTG%-AGGTAGTTA 008 Forward CGCGGATCCCATATG-AACAACAGACATTTTG 009 Reverse CCGTGGCTGCGTAAA Forward CGCGGATCCCATATG-CCCCGCGCTGCT Reverse CCCGCTCGAG-TGGCTTTTGCCACGTTTT O1l Forward CG6~CCTT-AAAACCA Reverse CCCGCTCGAG-GGCGGTCAGTACGGT 012 Forward CGCGGATCCCATATGCTCGCCCGIIGCC Reverse CCCGCTCGAG-AGCGGGGAAGAGGCAC 013 Forward CGCGGATCCCATATG-CCTTTGACCATGCT Reverse CCGTGGCGTCGAAAAC 018 Forward CGCGGATCCCATATG-CAGCAGAGGCAGTT Reverse CCCGCTCGAG-GACGAGGCGAACGCC 019 Forward AAAGAATTC-CTGCCAGCCGGCAAGACCCCGGC Reverse AAACTGCAG-TCAGCGGGCGGGGACATGCCCAT 023 Forward AAAGAATTC-AAAGAATATTCGGCATGGCAGGC Reverse AACGA-TCCCAACCTACG 025 Forward AAAGATTC-TGCGCCACCCAACAGCCTGCTCC Reverse AAACTGCAG-TCAGAACGCGATATAGCTGTTCGG 031 Forward CGCGGATCCCATATG..GTCTCCCTTCGCTT IN ef XhoI BamijI- NdeI XhoI BamHI- NdeI XhoI BarnHI- NdeI XhoI BamnHI- NdeI XhoI BarnHI- NdeI XhoI BanmHI- NdeI XhoI Bam}{1- NdeI XhoI BamiHI- NdeI XhoI BamHI- NdeI XhoI Eco RI Pst I Eco RI Pst I Eco RI Pst I NdeI Reverse CCCGCTCGAGATGTAAGACGGGGACAAC XhoJ 032 Forward CGGACCTT-GCAAGG BanHI- NdeJ Reverse CCCGCTCGAG-CTGGTTTTTTGATATnGTG XhoI 033 Forward CGCGGATCCCATATG..GCGGCGGCAGACA BamHI- NdeI Reverse CCCGCTCGAG-ATTTGCCGCATCCCGAT XhoI 034 Forward CGGACCTT-CGAAACAG BanijI- NdeI Reverse CCCGCTCGAG-TTTGACGATTTGGTTCATT XhoI 036 Forward 
CGCGGATCCCATATG-CTGAAGCCGTGCG BamiHI- NdeI Reverse CCCGCTCGAG-CCGGACTGCGTATCGG XhoI 038 Forward CGCGGATCCCATATG..ACCGATTTCCGCCA BamijI- NdeI Reverse CCCGCTCGAG-TTCTACGCCGTACTGCC Xhiol 039 Forward CGCGGATCCCATATGCCGTCCGA&ACCGC BamHI- NdeI Reverse CCCGCTCGAG-TAGGATGACGAGGTAGG XhoI 041 Forward CGCGGATCCCATATGTTCGTGCGCGAACCGC BamHI- NdeI Reverse CCCGCTCGAGGCCACTCTCAA XhoI 042 Forward CGCGGATCCCATATG-ACGATGATTTGCTTGC BarnHi~- Ndel Reverse CCCGCTCGAG-TTTGCAGCCTGCATTTGAC XhoI 043 Forward AAAAAAGGTACCATGGTTGTTTCTCAAAATATC Kpn I Reverse AAACTGCAG-TTATTGCGCTTCACCTTCCGCCGC Pst I 043a Forward AAAAAAGGTACC-GCAAAAGTGCATGGCGGCTTGGACGGTGC Kpn I Reverse AAAAAACTGCAG- sI TTAATCCTGCAACACGAATTCGCCCGTCCGPsI 044 Forward CGCGGATCCCATATG..CCGTCCGACTAGAG Bani-HI- Reverse CCCGCTCGAG-ATGCGCTACGGTAGCCA NdeI 046 Forward AAAGAATTCATGTCGGCAATGCTCCCGACAAG Eco RI Reverse AAACTGCAG-TCACTCGGCGACCCACACCGTGA Pst I 047 Forward CGCGGATCCCATATG-GTCATCATACAGGCG BamiHI- NdeI Reverse CCCGCTCGAG-TCCGAAAAAGCCCATTTTG XhoI 048 Forward AAAGAATTCATGCTCAACAAGGCGAGAATTGCC Eco RI Reverse AAACTGCAG-TCAAGATTCGACGGGGATGATGCC Pst I 049 Forward AAAGAATTC-ATGCGGGCGCAGGCGTTTGATCAGCC Eco RI Reverse AACGA-AGGACGAkATGA Pst I 050 Forward CGCGGATCCCATATG-GGCGCGGGCTGG BamHI- NdeI Reverse CCCGCTCGAG-AATCGGGCCATCTTCGA XhoI 052 Forward AAAAAAGAATTC-ATGGCTTTGGTGGCGGAGGAAC Eco RI Reverse AAAAAAGTCGAC-TCAGGCGGCGTTnTTTCACCTTCCT Sal I 05.1a Forward AAAAAAGAATTC-GTGGCGGAGGAAACGGAAATATCCGC Eco RI Reverse AAAAAACTGCAG TGTGTTGG ACG -CGT-CCAZCcc Pst I, 073 Forward CGCGGATCCCATATGTGTATGCCATATAAGAT BamHI- Ndel Reverse CCCGCTCGAG-CACCGGATTGTCCGAC Xhol 075 Forward CGCGGATCCCATATG-CCGTCTTACTTCATC B&mHl- Reverse CCCGCTCGAGATCACGAATGCCGATTATTT Xo 077a Forward AAAAAAGAATTC-GGCGGCATTTTCATCGACACCTTCCT Eco RI Reverse AAAAAACTGCAGTCAGACGACATCTGCACAACGCAT Pst I 080 Forward AAAGAATTC-GCGTCCGGGCT'GGTTTGGTTTTACAATTC Eco RI Reverse AAACTGCAG-CTATTCTTCGGATTCTTTTTCGGG Pst I 081 Forward AAAGAATTC-ATGAAACCACTGGACCTAAATTTCATCTG Eco RI Reverse 
AAACTGCAGTCACTTATCCTCCAATGCCTC Pst I 082 Forward AAAGAATTC-ATGTGGTTGTTGAAGTTGCCTGC Eco RI Reverse AAACTGCAG-TTACGCGGATTCGGCAGTTGG Pst 1 084 Forward AAAGAATTC-TATCACCCAGAATATGAATACGGCTACCG Eco RI Reverse AAACTGCAGTTATACTTGGGCGCAACATGA Pst I 085 Forward CGCGGATCCCATATGGGTAAAGGGCAGGACT Bam-HI- NdeI Reverse CCCGCTCGAG-CAAAGCCTTA.AACGCTTCG XhoI 086 Forward AAAAAAGGTACC-TATTTGGCATCAAkGAAGGCGG Kpn I Reverse AAACTGCAGTTACTCCACCGATACCGCG s1 087 Forward AAGAT-TGCGTAACT G Pst I Reverse AAACTGCAG-TTACGCCGCACACGCAATCGC Pst I 087a Forward AAA-AAGAATTC-AAGCTATTAGGCGTGCCGATTGTGATTCA Eco RI Reverse AAAAAACTGCAG-TTACGCCTGCAAGATGCCCAGCTTGCC Pst 1 088 Forward AAAAAAGAATTC-ATGTTTTTATGGCTCGCACATTTCAG Eco RI Reverse AAAAAAC TGCAG-TCAGCGGATnTGAGGGTACTCAACC Pst 1 089 Forward CGGACCTT-CGCAATA Bam-HI- NdeI Reverse CCGTGGTCGAACAGC XhoI 090 Forward CGCGGATCCCATATG-CGCATAGTCGAGCA BanmHI- NdeI Reverse CCCGCTCGAG-AGCAAAACGGCGGTACG XhoI 091 Forward AAAGAATTC-ATGGAAATACCCGTACCGCCGAGTCC Eco RI Reverse AAACTGCAG-TCAGCGCAGGGGGTAGCCCAGCC Pst 1 092 Forward AAGAT-TTTTTTTATC Eco RI Reverse AAEGA-CATTTTGCAG Pst 1 093 Forward AAAGAATTC-ATGCAGAATTTTGGCAGTGGC Eco RI Reverse AAACTGCAG-CTATGGCTCGTCATACCGGGC Pst 1 094 Forward AAAGAATTCATGCCGTCACGGAGCGCATCAACTC Eco RI Reverse AAACTGCAG-TTATCCCGGCCATACCGCCGAACA Pst 1 095 Forward AAAGAATTC-ATGTCCTTTCA, TTGAACATGGACGG Eco RI Reverse AAACTGCAG-TCAACGCCGCAGGCACTAACGCCC Pst 1 096 Forward AAAiAATTC-ATGGCTCGTCATACCGGGCAGGG Eco RI Reverse AAACTGAG-CAAAGGAAGGC iCTCTGAAGCG Pst I 098 Forward AAAGAATTC-GATGAACGCAGCCCAGCATGGATACG Eco RI Reverse AAACTGCAG-TTACGACATTCTGATTTGGCA Pst 1 102 Forward AAAAAAGAATTCGGCCTGATGATnTGGAAGTCACAC Eco RI Reverse AAAAAACTGCAG-TTATCCTTTAkPATACGGGGACGAGTTC Pst 1 105 Forward CGCGGATCCCATATGTCCGCAACGAATACG BamHI- Reverse CCCGCTCGAG-GTGTTCTGCCAGTTTCAG 107 Forward AAAAAAGAATTC-
CTGATGATTTTGGAAGTCACACCCATTATCC
Reverse AAAAAACTGCAGTTATCCfT
,TAATACGGGGACGAGTTC
1 07b Forward AAAAAAGAATTC-
GATACCCAAGCCCCCGCCGGCACAAJACTACTG
Reverse AAAAAACTGCAG.
TTACGC
TGCCTTTAAAGTATTTGAGCAGGCTGGAGAC
108 Forward AAGAT-TTGCGGTCAC Reverse AAACTGCAGTTAGCGGTACAGGTGTTTGA 1 08a Forward AAAAAAGAATTC-GGTAACACATTCGGCAGCTTAGACGGTGG Reverse AAACTGCAGTTAGCGGTACAGGTGTTGAAGCA 109 Forward AAAGAATTC-ATGTATTATCGCCGGGTTATGGG Reverse AAACTGCAG-CTAGCCCA..AAGATTGAGTGTTC 1II Forward CGCGGATCCCATATGJG6TTCGAACAAACCGC Reverse CCCGCTCGAG-GCGGAGCAGTTTTTCAA 114 Forward CGGGGATCCCATATG-GCTTCCATCACTTCGC Reverse CCGTGGCTCCAACT 117 Forward AAAAAAGGTACC-ATGGTCGAAGAACTGGACTGCTG Reverse AkCTGCAGTTAAGCCGGGTACGCTC~kTAC 118 Forward AAAGTCGACATGTGTGAGTTCAGGATATTATAkG Reverse AAAGCATGCCTATTTTTTGTTG.~TAATCAAATC 121 Forward CGCGGATCCCATATG-GAAACACAGCTTTACAT Reverse CCCGCTCGAG-ATAATAATATCCCGCGCCC 122 Forward CGCGGATCCCATATGGTCATGATTAATCCGCA XhoI Eco RI Pst I Eco RI Pst I Eco RI Pst I Eco RI Pst I Eco RI Pst T BamHI- NdeI XhoI BamHI- NdeI Xhol Kpn I Pst I SalI Sph I BamHI- NdeI XhoI BamcTI Reverse CCCGCTCGAG-AATCTTGGTAGATTGGATTT NdcI 125 Forward AAA~AATT-C-ATGTCGGGCAATGCCTCCTCTCC EoR Reverse AAAET-GCAG-TCACGCCGTTTCAAGACG Pst I 125a Forward AAAAAAGAATTC-ACGGCAGGCAGCACCGCCGCACAGGTTTC Eco RI Reverse AAAAAAC TGCAG- s TTATTTTGCCACGTCGGTTTCTCCGGTGAACAACGCPsI 126 Forward CGCGGATCCCATATG-CCGTCTGAAACCC BamHI- NdeI Reverse CCCGCTCGAG-ATATTCCGCCGATGCC Xhol F127 Forward AAAGAA TTC-ATGGAAATATGGAATATGTTGGACACTTG Eco Rd Reverse AAACTG:CAG..TTAAAGTGTTTCGGAGCCGGC Pst I 127a Forward AAAAATCAGACGTAGGCGCG Eco RI Reverse AAACTGCAG-TTAAAGTGTTTCGGAGCCGGC Pst I 128 Forward CGCGGATCCCATATG.-ACTGACACGCACT BamHI- NdeI Reverse CCCGCTCGAG-GACCGCGTTGTCGA- XhoI 130 Forward CGCGGATCCCATATGACACTCCGCGA BamiHI- NdeI Reverse CCCGCTCGAG-GAATTTTGCACCGGATTG XhoI 132 Forward AAGAT-TGACCTAACTATT Eco RI Reverse AAAAAACTGCAGCACCATGTCGGCATTTGAC Pst I 134 Forward CGGACCTT-CCAAACT BaniHI- Reverse CCCGCTCGAG-CAGTTTGACCGATGTTC NdeI 135 Forward CGCGGATCCATATGATACAAGATCGTATT Baml- NdeI Reverse CCCGCTCGAG-AAATTCGGTCAGAAC-CAGG XhoI 137 Forward AAAAAAGGTACC-ATGATTACCCATCCCCAATTCGATCC Kpn I Reverse AAAAAAC TGCAG-TCAGTGCTGTTTTTTCATGCCG- Pst I 13 7a Forward 
AAAAAAGAATTC.GGCCGCAAACACGGCATCGGCTTCCT Eco RI Reverse AAAAAACTGCAG-TTAAGCGGGATGACGCGGCAGCATACC Pst I 138 Forward A.AA TCACCGCAGATCTTG Eco RI Reverse AAAAAATCTAGA-TCAGTTTAGGGATAGCAGGCGTAC Xba I 141 Forward AAGAT-TACTAAACGTCGkTG Eco RI Reverse AAACTGCAG-TCAGAACAAGCCGTGAATCACGCC Pst I 142 Forward CGCGGATCCCATATG-CGTGCCGATTTCATG BamHIl- Reverse CCCGCTCGAG-AAACTGCTGCACATGGG NdeI 143 Forward AAAAAAGAATTC- coR ATGCTCAGTTTCGGCTTTCTCGGCGTTCAGAC EoR Reverse AAAAAACTGCAGTCAAACCCCGCCGTGTGTTTCTTTAT Pst 1 144 Forward AAAAAAiAATTC-GGTCTGATCGACGGGCGTGCCGTAC Eco RI Reverse AAAAAATCTAGA-TCGGCATCGGCCGGCATATGTCCG Xba 1 146 Forward AAAAAAGAATTC- coR CGCCAAGTCGTCATTGACCACGACAAAGTG coR Reverse AAAAAACTGCAG-TTAGGCATCGGCI&AATAGGAACTGGG Pst I 147 Forward AAAAAAGAATTC-ACTGAGCAATCGGTGGATTTGGAAC Eco RI Reverse AAAAAATCTAGATTAGGTAA-,GCTGCGGCCCATTTGCGG Xba I 148 Forward AAAAAAGAATTC. coR ATGGCGTTAAACATCACTTGGACACGC EoR Reverse AAAAAATCTAGA-TCAGCCCTTCATACAGCCTTCGTTTTG Xba I 149 Forward CGCGGATCCCATATG-CTGCTTGACAPACAAAGT BarnHI- Reverse CCCGCTCGAG-AAACTTCACGTTCACGCC NdeI F150 Forward CGCGGATCCCATATG-CAGACACAAATCCG Baimi NdeI Reverse CCCGCTCGAG-ATAAACATCACGCTGATAGC XhoI 151 Forward AAAAAAGAATTC-EoR ATGAAACAAATCCGCAACATCGCCATCATCGC EoR Reverse AAAATCGTATCACTTAATGGC Pst 1 152 Forward AAAAAAGAATTC EoR ATGAAAAACAA4ACCkAAGTCTGGGACCTCCC EoR Reverse AAAAAACTGCAG-TCAGGACAGGAGCAGGATGGCGGC Pst 1 153 Forward AAAAATCAGCGTGTAGTTA Eco RI Reverse AAAAAACTGCAGTCAGTCATGTTTnTCCGTTTCATT Pst I 153a Forward AAAAAAGAATTC-CGGACTTCGGTATCGGTTCCCCAGCATTG Eco RI1 Reverse AAAAAACT-GCAG- Pt TTACGCCG-ACGAAATACTCAGACTTTnCGGPsI I )4 Forward CGCGGATCCCATATG..ACTGACAACAGCCC Bami-l- NdeI Reverse CCCGCTCGAG-TCGGCTTCCTTTCGGG Xhoi 155 Forward AAAAAAGAATTCATGAAJTCGGTATCCCACGCGAGTC Eco RI Reverse AAAAAACTGCAGTTACCCTTTCTTAAACATATTCAGCAT Pst 1 156 Forward AAAAAAGATTCGCACAGCAACGGTTTTGAAGC Eco RI Reverse AAAAAAC TGCA&TCAAGCAGCCGCGACAAACAGCCC Pst 1 157 Forward CGCGGATCCCATATGAGGACGAGGAAAC Banil- NdeI Reverse CCCGCTCGAGAAAACACAATATCCCCGC XhoI 
158 Forward AAAAAAGAATTC-GCGGAGCAGTTGGCGATGGCAATTCTGC Eco RI Reverse AAAAAATCTAGA-TTATCCACAGAGATTGTTTCCCAGTTC Xba 1 160 Forward CGCGGATCCCATATGGACATTCTGGACAAC Ban-HI- Ndel Reverse CCCGCTCGAG-TTTTTGCCCGCCTTCTTT Xhol 163 Forward AAAAAAGGTACC-ACCGTGCCGGATCAGGTGCAGATGTG Kpn I Reverse AAAAAATCTAGATTACTCTGCCATTCCACCTGCTCGTG Xba I 163a Forward AAAAAAGAATTCCGGCTGGTGCAGATAATGAGCCAGAC Eco RI Reverse AAAACAG-TCCGCATCCTCCT Xba I 164 Forward CGCGGATCCCATATG-.kCCGGACTTATGCC BamHI- Reverse CCCGCTCGAG-TTTGTTTCCGTCAAACTGC NdeI 165 Forward CGCGGATCCGCTAGC-GCTGAAGCGACAGACG XaHoI BamhI- Reverse CCCGCTCGAGAATATCCATACTTI'CGCG Xo 206 Forward CGCGGATCCCATATGAACACCGCCAACCGA Bml BamHI Reverse CCCGCTCGAG-TTCTGTkAAAAAGTATGTGC Xo 209 Forward CGCGGATCCCATATG-CTGCGGCATTTAGGA BamHI- NdeI Reverse~Xo 211 Forward AAAAAAGAATTC-ATGTTGCGGGTTGCTGCTGC Eco RI Reverse AAAAAACTGCAG..TTCGGATGATA s 21 or-ward CGCGGATCCCAAG AATCTCGTATGG BarnHI- Reverse CCCGCTCGAG-AGGGGTTAGATCCTTCC XhoI 215 Forward CGCGG ATCCCATATG-GCATGGTTGGGTCGT BamHfI- Reverse CCCGCTCGAG-.CATATCTTTTGTATCAT.,k-ATC Xo 216 Forward CGCGGATCCCATATCGATGGCAGA2 CG BamHl- Ndel Reverse CCCGCTCGAG-TACAATCCGTGCCGCC Xhol 217 Forward CGCGGATCCCATATG-GCGGATGACGGTGTG BamNI- Reverse CCCGCTCGAG-ACCCGTATCGAATCC Xo 218 Forward CGCGGATCCCATAT3-GTCGCGGTCGATC Bam- NdeI Reverse CCGTGGTATAAATCG Xhol 219 Forward CGCGGATCCGCTAGCACGGCAGGThAG BainHI- NbeI Reverse CCCGCTCGAGTTTAACCATCTCCTCAAC Xhol 223 Forward CGCGGATCCCATATGGAATTCAGGCACCAAGTA BainHi- NdeI Reverse CCCGCTCGAG-.GGCTTCCCGCGTGTC XhoI 225 Forward CGCGGATCCCATATG..GACGAGTTGACCAACC BainHl- NdeI Reverse CCCGCTCGAG-GTTCAGAAAGCGGGAC XhoI 226 Forward AAAGAATTC-.CTTGCGATTATCGTGCGCACGCG Eco RI Reverse AAACTGCAG-TCAAAATCCCAAAACGGGGAT Pst 1 228 Forward CGCGGATCCCATATGCGCAGAGCCAACAG BamHI-i Reverse CCCGCTCGAG-.TTTGGCGGCATCTTTCAT NdeI 229 Forward CGCG GATCCCATATGCAGAGGTTTTGCCC BamHI- Reverse CCCGCTCGAGACACAATATAGCGGATGAC XhoI 230 Forward CGCGGATCCCATATG.CATCCGGGTGCCGAC BamHl- Reverse CCCGCTCGAG-.AAGTTTGGCGGCTTCGG NdeI 232 
For-ward AAAAATCAGAGTAAAGGTTG Eco RI Reverse AAAAAACTGCAG-TCAAGGTIITTTCCTGATTGCCGCCGC Pst I 232a Forward AAAAAAGAATTC-GCCAAGGCTGCCGATACACAATTGA Eco RI Reverse AAAAAACTGCAG-TTAAACATTGTCGTTGCCGCCCAGATG Pst 1 233 Forward CGCGGACCCATATGGCGGACACCCAAG BainHi- NdeI Reverse CCCGCTCGAG-GACGGCATTGAGCAG XhoI 234 Forward CGCGGATCCCATATG-GCCGIITCACTGACCG BarnHl- NdeI Reverse GCCCAAGCTT-ACGGTTGGATTGCCATG Hind III 235 Forward CGGACCTT-GCGCATCA BamHI- NdeI Reverse CCCGCTCGAG-TTTGGGCTGCTCTTC Xhol 2 36 Forward CGCGGATCCCATATG..GCGCGTTTCGCCTT BamijI- NdeI Reverse CCCGCTCGAGATGGGTCGCGCGCCGT XhoI 238 Forward CGCGGATCCGCTAGC..AACGGTn'GGATGCCCG BarnHI- Reverse CCCGCTCGAGTTTGTCTAAGTTCCTGATATG Xo 239 Forward CCGGAATTCTACATATG-CTCCACCATAAAGGTATTG EcoRi- NdeI Reverse CCCGCTCGAG-TGGTGAAGAGCGGTTTAG Xhol 240 Forward CGCGGATCCCATATG-GACGnTGGACGATTTC BanIHI- NdeI Reverse CCCGCTCGAG-AAACGCCATTACCCGATG XJhol 241 Forward CCGGAATTCTACATATGCCAACACGTCCAACT EcoRi- NdeI Reverse CCCGCTCGAGGATGCGCCTGTATTAATC XhoI 242 Forward CGCGGATCCCATAT-ATCGGCACTTGTTG BamHI- NdeI Reverse GCCCAAGCTT-ACCGATACGGrCGCAG HindiI 243 Forward CGCGGATCCCATATG.X 73TTTTCGATGCTGC ialj NdeI Reverse CCCGCTCGAG-CGACTTGGTTACCGCG Xhol 244 Forward CGGACCTT-CTTAGC BamHI- NdeI Reverse CCCGCTCGAG-TTTTTTCGGTAGGGGATTT XhoI 246 Forward CGCGGATCCCATATG-GACATCGGCAGTGC Bami-l- Reverse CCCGCTCGAG-CCCGCGCTGCTGGAG NdeJ 247 Forward CGCGGAT-CCCATATG-GTCGGATCGAGTTAC BamHI- Reverse CCCGCTCGAG-AAGTGTTCTGTTTGCGCA Ndel 248 Forward CGCGGATCCCATATGCGCAACAGAACACT Bnil Reverse CCCGCTCGAG-CTCATCATTATTGCTAACA NdeI 249 Forward CGdA-CAAGAGAATATCT BamHI- Reverse CCCGCTCGAG-TTCCCGACCTCCGAC NdeI 251 Forward CGCGGATCCCATATGCGTGCTGCGGTAGT Bntl NdeI Reverse CCCGCTCGAG-TACGAAAGCCGGTCGTG XhoI 253 Forward AAAAAAGAATTC-ATGATTGACAGGAACCGTATGCTGCG Eco RI Reverse AAAAAACTGCAGTTATTGGTCTTTCAAACGCCCTTCCTG Pst I 253a Forward AAAAAAGAATTC-AAA-ATCCTTT"ITGiAAAACAAGCGAAAAACGG Eco RI Reverse AAAAAACTGCAG-TTATTGGTCTTTCAAACGCCCTTCCTG Pst 1 254 Forward 
AAAAAAGAATTC-ATGTATACAGGCGAACGCTTCATAC Eco RI
Reverse AAAAAATCTAGA-TCAGATTACGTAACCGTACACGCTGAC Xba I
255 Forward CGCGGATCCCATATG-GCCGCGTTGCGTTAC BamHI-NdeI
Reverse CCGTGGACGATCGCA XhoI
256 Forward CGCGGATCCGCTAGCTTTTAJCACCGCCGGAC BamHI-
Reverse CCGCTGAG-CGCCGTTTTGCG Reverse CCGCCGA-ACGCTGTTGTCGG XhoI
257 Forward CGCGGATCCCATATGGCGGTTTCTTT~CCTG BamHI-NdeI
Reverse CCCGCTCGAG-GCGCGTGAATATCGCG XhoI
258 Forward AAAAAAGAATTC-GATTATTTCTGGTGGATTGTTGCGTTCAG Eco RI
Reverse AAAAAACTGCAG-CTACGCATAAGTTTTACCGTTTTTGG Pst I
258a Forward AAAAAAGAATTC-GCGAAGGCGGTGGCGCAGGCGA Eco RI
Reverse AAAAAACTCCAG-CTACGCATAAGTTTTTACCGTTTTTGG Pst I
259 Forward CGCGGATCCCATATG-GAAGAGCTGCCTCCG BamHI-NdeI
Reverse CCCGCTCGAG-GGCTTTTCCGGCGTTT XhoI
260 Forward CGCGGATCCCATATG-GGTGCGGGTATGGT BamHI-NdeI
Reverse CCCGCTCGAG-AACAGGGCGACACcT XhoI
261 Forward AAAAAAGAATTC-CAAGATACAGCTCGGGCATTCGC Eco RI
Reverse AAAAAACTGCAG-TCAAACCAACAAGCCTTGGTCACT Pst I
263 Forward CGCGGATCCCATATGGCACGTTAACCGTA BamHI-NdeI
Reverse CCCGCTCGAG-GGCGTAAGCCTGCAATT XhoI
264 Forward AAAAAAGGTACC-GCCGACGCAGTGGTCAAGGCAGA Kpn I
Reverse AAACTGCAG-TCAGCCGGCGGTCAATACCGCCCG Pst I
265 Forward AAAAAAGAATTC-GCGGAGGTCAAGAGAAGGTGTTTG Eco RI
Reverse AAAATCGTAGAAGCTAATG Pst I
266 Forward AAAGAATTC-CTCATCTTTGCCAACGCCCCCTTC Eco RI
Reverse AAACTGCAG-CTATTCCCTGTTGCGCGTGTGCCA Pst I
267 Forward AAGAT-TTCGTCAGTAC Eco RI
Reverse AAACTGCAG-TTAGTAAAAACCTTTCTGCTTGGC Pst I
269 Forward AAAGAATTC-TGCAAACCTTGCGCCACGTGCCC Eco RI
Reverse AAACTGCAG-TTACGAAGACCGCAACGAGGCAGAG Pst I
269a Forward AAAAAAGAATTC-GACTTTATCCAAAACACGGCTTCGCC Eco RI
Reverse AAACTGCAG-TTACGAAGACCGCAACGAAAGGCAGAG Pst I
270 Forward AAGAT-CGCAGTGTTTGAT Eco RI
Reverse AAACTGCAG-TTATTCGGCGGTAAATGCCGTCTG Pst I
271 Forward CGCGGATCCCATATG-CCTGTGTGCAGCTCGAC BamHI-NdeI
Reverse CCCGCTCGAG-TCCCACiCCCCGTGGAG
272 Forward AAGAT-TACjAAGAACGTG Eco RI
Reverse AAACTGCAG-TCAGAGCAGTTCCAA.ATCGGGGCT Pst I
273 Forward AAAGAATTC-ATGAGTCTTCAGGCGGTATTTATATACCC Eco RI
Reverse AAACTGCAG-TTACGCGTAAGAAUACTGC Pst I
274 Forward CGCGGATCCCATATG-ACAGATTTGGTTACGGAC BamHI-NdeI
Reverse CCCGCTCGAGTTTGCTTTCAGTATATTGA XhoI
276 Forward AAAAAAGAATTC-ATGATTTTGCCGTCGTCCATCACGATGATGCG Eco RI
Reverse AAAAAACTGCAG-CTACACCACCATCGGCGAATTTATGGC Pst I
277 Forward AAAAAAGAATTC-ATGCCCCGCTTGAGGACAAGCTCGTAGG Eco RI
Reverse AAAAAACTGCAGTCATAAGCCATGCTTACCTTCCACA Pst I
277a Forward AAAAAAGAATTC-GGGGCGGCGGCTGGGTTGGACGTAGG Eco RI
Reverse AAAAAACTGCAGTCATAAGCCATGCTTACCTTCCAC Pst I
278 Forward AAAAGACGCAGTTTAACGCTTC Kpn I
Reverse AAAAAACTGCAG-TCATTCAACCATATCATCTGCC Pst I
278a Forward AAAAAAGAATTC-AAAACTCTCCTPAUTCGTCATAGTCG Eco RI
Reverse AAAAAACTGCAG-TCATTCAACCATATCAAATCTGCC Pst I
279 Forward CGGACCTT-TGCGATAGT BamHI-NdeI
Reverse CCCGCTCGAG-TTTAGAAGCGGGCGGCk XhoI
280 Forward AAAAAAGGTACC-GCCCCCCTGCCGGTTGTAACCAG Kpn I
Reverse AAAAAACTGCAG-TTATTGCTTCATCGCGTTGGTCAAGGC Pst I
281 Forward AAAAAAGAATTC-GCACCCGTCGGCGTATTCCTCGTCATGCG Eco RI
Reverse AAAAAATCTAGA-GGTCAGAATGCCGCCTTCTTTGCCGAG Xba I
281a Forward AAAAAAGAATTC-TCCTACCACATCGAAAkTTCCTTCCGG
Eco RI Reverse AAAAAATCTAGA-GGTCAGAATGCCGCCTTCTTTGCCGAG Xba I 282 Forward AAAAAAGAATTC-CTTTACCTTGACCTGACCAkCGGGCACAG Eco RI Reverse AAAAAACTGCAG-TCAACCTGCCAGTTGCGGGAATATCGT Pst I 283 Forward CGCGGATCCCATATGGCCGTC-nACTTGGAAG BarnHl- NdeI Reverse CCCGCTCGAG-ACGGCAGTATTTGnTTACG XhoI 284 Forward CGCGGATCCCATATGTTGCCTGCAAGAATCG Bami-I- NdeI Reverse CCCGCTCGAGCCGACTTTGAAAATG XhoI 286 Forward CGCGGATCCCATATGGCCGACCTTCCGA~ BamHI- NdeI Reverse CCCGCTCGAGGAAGCGCGTCCCAAG XhoI 287 Forward CCGGAATTCTAGCTAGC-CTTTCAGCCTGCGGG EcoRi- NheI Reverse CCCGCTCGAG-ATCCTGCTCTTTTTTGCC XhoI 288 Forward CGCGGATCCCATATG-CACAC C 3GACAGG BamHI- NdeI Reverse CCCGCTCGAGCGTATCAAGACTTGCGT XhoI 290 Forward CGCGGATCCCATATG-GCGGTTTGGGGCGGA BamHI- Reverse CCCGCTCGAG-TCGGCGCGGCGGGC Xo 29 2 Forward CGCGGAT-CCCATATG.TGCGGGCAACGCCC Bamlf Reverse CCCGCTCGAGTTGATTTTTGCGGATGATTT Xo 294 Forward AAAAAAGAATTC-GTCTGGTCGATTCGGGTTGTCAGA Eco RI Reverse AAAAAACTGCAGTTACCAGCTGATATAACATCGCTT Pst I 295 Forward CGCGGATCCCATATG-AACCGGCCGGCCTCC BamijI- Reverse CCCGCTCGAG..CGATATTTGATTCCGTTGC NdeI 297 Forward AAAAAAGAATTC-GCATACATTGCTTCGACAGAGAG EoR Reverse AAAAAACTGCAG-TCAATCCGATTGCGACACGGT Eco 1I 298 Forward AAAAAAGTTTC-CTGATTGCCGTGTGGTTCAGCCA.kAC Eco RI RvreAAAAAACTGCAG-TCATGGCTGTGTACTTGATGGTTGG Pst I 299 Forward CGCGGATCCGCTAGC.CTACCTGTCGCCTCCG BarnHI- NheI Reverse CCCGCTCGAG-TTGCCTGATTGCAGCGG XhoI 302 Forward AAAAATCAGGTAACAAGAC Eco RI Reverse AAAAAACTGCAGTTAAGGTGCGGGATAGAATGTGGGG Pst 1 305 Forward AAAAAAG;GTACC..GAATTTTTACCGATTTCCAGCACCGGA Kpn I Reverse AAAAAACTGCAG-TCATTCCCAACTTATCCAGCCTGAG Pst I 305a Forward A-AAAAAGGTACC-CCCGTTCGi{GGCAGTACGATTATGGG p Reverse AAAAAACTGCAGTTACAACCGACATCATGCAGGGTA pn 1 306 Forward CGCGGATCCCATATGTTTATGACAAATTTTCCC BamHI- Reverse CCCGCTCGAG-CCGCATCGGCAGAC Nho 308 Forward CGGACCTT-TATGGATTT BamHl.
NdeI Reverse CCCGCTCGAG-ATCCGCCATTCCCTGC XhoI 311 Forward AAAAAAGGTACCATGTTCAGTIITGGCTGGGTGIIT Kpn I Reverse AAACTGCAG-ATGTTCATATTCCCTGCCTTCGGC Pst I 312 Forward AAAAAAGGTACC..ATGAGTATCCCATCCGGCGAATT Kpn I Reverse AAACTGCAG--TCAGTTTTCATCGATTGAACCGG Pst I 313 Forward AAAAAAGAATTC-ATGGACGACCCGCGCACCTATC Eco RI Reverse AAAAAACTGCAG-CAGCGGCTGCCGCCGATTTTGC Pst I 401 Forward CGCGGATCCCATATG..AAGGCGGCAACACAGC BamnHI- NdeI Reverse CCCGCTCGAGCCTTACGTTTTTCAAGCC XhoI 402 Forward AAAAAAGAATTC-GTGCCTCAGGCATTTTCATTTACTC Eco RI Reverse AAAAAATCTAGA-TTAAATCCCTCTGCCGTATTTGTATTC Xba I 402a Forward AAAAAAGAATT-AGGCTGATGAAAACMAACACG Eco RI Reverse AAAAAATCTAGA-TTAAATCCCTCTGCCGTATTTGTATTC Xba I 406 Forward CGCGGATCCCATATG-TGCGGGACACTGACAG BamHI..
Reverse CCCGCTCGAG-AGGTTGTCCITGTCTATG Xo 501 Forward CGCGGATCCCATATG .GC A G GA T GCB n M Ndel Reverse CCCGCTCGAG-GGTGTGATGTTCACCC Xo eve Xhol 53 Forward CGCGGATCCCATATG-GTGCGGGGCGBami-Il- Reverse CCCGCTCGAG.CCGCTGCATTCTCGAho 504 Forward CGCG GATCCCATATGATGTCGTAGGCG Baml-- NdeI Reverse CCCGCTJGACCCATTCCTTGCG Xhinlf 505 Forward CGCGGATCCCATATGTTCGATACTA ATTCG BaniHI- NdeI Reverse CCCGCTAGGCGGCAflTTATAGCG Hxnoll 5105 Forward CGCGGATCCCATATG-.CTTCGCGACA 4TCA BamHI- Reverse CCCGCTCGAG-CGCGATGGCAAGCGG NdeI 512 Forward CGCGGATCCCATATGCTGGACAGTCGG BamHI- Reverse CCCGCTCGAG.GGGCCTTGCG NdeI 515 Forward CGCGGATCCCATATGGGGAATAGCTT-CGA Bamil- Reverse CCCGCTCGAGAGAATGCCCAGAC NdeI 516 Forward CGCGGATCCCATATG-GAGTTATGTTGG Bam-Hl- NdeI Reverse CCCGCTCGAG-ATTTGCGGCGAGCATC XhoI 517 Forward CGCGGATCCCATATGGTAGGTGTGGG BamHI- Reverse CCCGCTCGAGGTGCGCCCAGCCGT Xo 518 Forward AAA~AATTC-GCTTTTTTACTGCTCCGACCGGAGG Eco RI Reverse AAACTGCAG-TCAAATTTCAGACTCTGCCAC Pst I 519 Forward CGCGGATCCCATATGTTCAAATCCTTTGTCGTCA Barn-l- Reverse CCCGCTCGAG-TTTGGCGGTTTTGCTGC NdeI 520 Forward CGCGGATCCCATATG..CCTGCGCTTCTTTCA BanmHp Reverse CCCGCTCGAG-ATATTTACATTTCAGTCGGC Xo 521 Forward CGCGGATCCCATATGGCCAATCTATACCTGC BamHI- Reverse CCCGCTCGAG..CATACGCCCCAGTTCC NdeJ 522 Forward CGCGGATCCCATATGACTGAGCCGAAACAC BamnHI- Reverse GCCCAAGCTTTTCTGATTTCAAATCGGCA HndIl 523 Forward CGCGGATCCCATATG-GCTCTGCTTTCCGCG BarnHl- I NdeI Reverse CCCGCTCGAGiAGGUGIGTGTG-ATATAG Xhoi 525 Forward CGCGGATCCCATATGGCCGAAGGTTCAA.,TC Ba nI- R v rNdeI Reverse CCCGCTCGAG-GCCCGTGCATATCAT 4 kA Xhol 527 Forward AAAGAATTC-TTCCCTCAATGTTGCCGTTTTCG Eco RI Reverse AAATGCAGTTATGCTCTCGAACAAATTC Pst 1 529 Forward CGCGGATCCGCTAGCTGCTCCGGCAGCAAC BamijI- NheI Reverse GCCCAAGCTT-ACGCAGTTCGGAATGGAG HindIll 530 Forward CGCGGATCCCATATG-AGTGCGAGCGCGG BamiHI- Reverse CCCGCTCGAGACGACC6ACTGATTCCG Xo 531 Forward AAAAAAGAATTC-TATGCCGCCGCCTACCAATCTACGG Eco RI Reverse AAAAAACTGCAG-TTAAAACAGCGCCGTGCCGACGACAA Pst I 532 Forward AAAAAAGAATTC-ATGAGCGGTCAGTTGGGCAAAGGTGC Eco 
RI Reverse AAAAACTGCAG-TCAGTGTTCCAGTGGTCGGTATCA Pst I 532a Forward AAAAAAGAATTC-TTGGGTGTCGCGTTTGAGCCGGAGT Eco RI Reverse
AA
4 AAATGCAG-TCAGTGTTCC&AGTGGTCGGTATCAA Pst I
535 Forward AAAGAATTC-ATGCCCTTTCCCGTTTTCAGAC Eco RI
Reverse AAACTGCAG-TCAGACGACCCCGCCTTCCCC Pst I
537 Forward CGCGGATCCCATATGCATACCCACCAATCC BamHI-
Reverse CCC GCTCGAG-ATCCT(jCAAA fAAAGGGTT XhoI
538 Forward CGCGGATCCCATATGGTCGAGCTGGTCAAAGC BamHI-NdeI
Reverse CCCGCTCGAG-TGGCATTTCGGTTTCGTC
539 Forward CGCGGATCCGCTAGCGAGGATTTGCAGGAA BamHI-
Reverse CCCGCTCGAGTACCATGTCGGCAAATC XhoI
542 Forward AAAGAAiTT-C-ATGCCGTCTGAAACCGTGTC Eco RI
Reverse AAACTGCAGTTACCGCGACCGGTCAGGAT Eco RI
543 Forward AAAAAAGAATTC-GCCTTCGATGGC GACGTTGTAGGTA Eco RI
Reverse AAAAAATCTAGA-TTATGAAGAAGAACATATTGGAATTTTGG Xba I
543a Forward AAAAAAGAATTC-GGCAJAAACTCGTCATGAATTTGC Eco RI
Reverse AAAAAATCTAGA-TTAATGAAGAAGAACATATTGGAATTTTGG Xba I
544 Forward AAAGAATTC-GCGCCCGCCTTCTCCCTCCCCGACCTGAG Eco RI Reverse AAAC7TGCAG-CTATTGCGCCACGCGCGTATCGAT Pst I 544a Forward AAAAAAGAATTC.. Eco RI RvreGCAAATGACTATAACAAJACTTCCAGTACTTG RvreAAACTGCAG-CTATTGCGCCACGCGCGTATCGA Pst I 547 Forward AAA67ATTCATGTTCGTAGATACGGATTTAAAA Eco RI Reverse AAACTGCAGTTAACAACAJACAACCGCT Pst I 548 Forward AAAGAATTCGCCTGCAACCTCAGACAACAGTCG Eco RI Reverse AAACTGCAG- ICAGjAGCAGGGTCCUAATCGGC Pst I 550 Forward AAAAAAGTCGAC-SaI ATGATAACGGACAGGTTTCATCTCTTTCATTTTCCSaI Reverse AACGA-TCCAAGTCAACC Pst I 550a Forward AAAAAAGAATTC-GTAAATCACGCCTTTGGAGTCGCAACGG Eco RI Reverse AAACTGCAGTTACGCAACGCTGCAATCCCC Pst I 552 Forward AAAAAAGAATTC-TTGGCGCGTTGGCTGGATAC Eco RI Reverse AAACTGCAG-TTATTTCTGATGCCTTTTCCAAC Pst I 554 Forward CGCGGATCCCATATGTCGCCCGCGCCCAC BamHI- Reverse CCCGCTCGAG-CTGCCCTGTCAGACAC NdeI 556 Iforward AAAGAATTC-GCGGGCGGTTTTGmTGGACATCCCG Eco RI Reverse AAACTGCAG-TTAACGGTGCGGACGTTTCTGACC Pst I 557 Forward CGCGGATCCCATATGTGCGGTTCCACCTGA Bam-HI- NdeI Reverse CCCGCTCGAG-TTCCGCCTTCAGAAApGG XhoI 558 Forward AAd T-ACTAAGTCAAGGCG Eco RI Reverse AAACTGCAG-CTAAACAATGCCGTCTGAAGTGGAGA Pst I 558a Forward AAAAAAGAATTCATTAGATTCTATCGCCATAACAGACGGG Eco RI Reverse AAAAAACTGCAG-CTAAACAATGCCGTCTGAAAGTGGAGA Pst I 560 Forward AAAAAAGAATTC- EoR TCGCCTTTCCGGGACGGGGCGCACAAGATGGC EoR Reverse AAAAAACI'GCAG-CATGCGGTTTCAGACGGCATTTTGGC Pst I 561 Forward CCGGAATTCTACATATG..ATACTGCCAGCCCGT EcoRI- NdeI Reverse CCCGCTCGAG-TTTCAAGCTTTCTTCAGATG XhoI 562 Forward CGCGGATCCCATATG.GCAAGCCCGTCGAG Bam-HI- Revese ~NdeI Revese CCGTCGG-AGCCACTCAACCGTXhoI 565 Forward CGCGGATCCCATATGAAGTCGAGCGCGAATAC BarnHJ- Reverse CCCGCTCGAG-GGCATTGATCGGCGGC NdeI 566 Forward CGCGGATCCCATATG-GTCGGTGGCGAAGAGG Bn-l Reverse CCCGCTCGAG-CGCATGGGCGkAAGTCA NdeI 567 Forward CCGGAATTCTACATATGAGTGCGAACATCCTTG Eol Reverse CCCGCTCGAG-TTTCCCCGACACCCTCG NdeI 568 Forward CGCGGATCCCATATG-CTCAGGGTCAGACC Bnil NdeI Reverse CCCGCTCGAG-CGGCGCGGCGTTCAG XhoI 569 Forward AAAAAAGAATTC-CTGATTGCCTTGTGGGAATATGCCCG Eco RI Reverse 
AAAAAACTGCAGTTATGCATAGACGCTGATACGGCAT Pst I
570 Forward CGGACCTT-AACTCAAAC BamHI-NdeI
Reverse CCCGCTCGAG-GCGGGCGTTCATTTCnTT XhoI
571 Forward AAAAAAGAATTC-ATGGGTATTGCCGGCGCCGTAATGTTTGACCC Eco RI
Reverse AAAAAACTGCAG-TTATGGCCGACGCGCGGCTACCTGACG Pst I
572 Forward CGCGGATCCCATATGGCGCAAGGCAACC BamHI-NdeI
Reverse CCCGCTCGAG-GCGCAGTGTGCCGATA XhoI
573 Forward CGCG-GATCCCATATG.CCCTGTTTGTGCCG BamHI-NdeI
Reverse CCGTGGGCGTTATCC XhoI
574 Forward CGCGGATCCCATATG-.TGGTTTGCCGCCCGC BamHI-NdeI
Reverse CCCGCTCGAG-AACTTCGATTTTATTCGGG XhoI
575 Forward CGCG-GATCCCATATG..GTTTCGGGCGAGG BamHI-NdeI
Reverse CCCGCTCGAGCATTCCGATCTGACAG XhoI
576 Forward CGCGGATCCCATATG..GCCGCCCCCGCATCT BamHI-NdeI
Reverse CCCGCTCGAG-ATTTACTTTTTTGATGTCGAC XhoI
577 Forward CGCdGGTCCCATATGGA-GGACGGTGTATTT BamHI-NdeI
Reverse CCCGCTCGAG-AGGCTGTTTGGTAGATTCG XhoI
578 Forward CGCGGATCCCATATGAGAGGTTJCGTACAG BamHI-NdeI
Reverse CCGGCGA~GCAACGCTCACG Reverse CCGCCGA-GCAACGCTCACG XhoI
579 Forward CGCGGATCCCATATG-AGATTGGGCGTTTCCAC BamHI-NdeI
Reverse CCCGCTCGAG-AGAATTGATGATGTGTATGT XhoI
580 Forward CGCGGATCCCATATG..AGGCAGACTTCGCCGA BamHI-NdeI
Reverse CCCGCTCGAG-CACTTCCCCCGAGTG XhoI
581 Forward CGCGGATCCCATATG-CACTTCGCCCAGC BamHI-NdeI
Reverse CCCGCTCGAG-CGCCGTTTGGCTTTGG XhoI
582 Forward AAAAAAGAATTC-TTTGGAGAGACCGCGCTGCA.,TGCGC Eco RI
Reverse AAAAAATCTAGATCAGATGCCGTCCCAGTCGTTGAA Xba I
583 Forward AAAAATCATCGCACATCTAC Eco RI
Reverse AAAAAACTGCAGTTAACGGAGGTCATATGATGAAATTG Pst I
584 Forward AAAAAAGAA TTC GCGGCTGAAGCATTGAATTACAATATTGTC Eco RI
Reverse AAAAAACTGCAGTCAGAACTGACCGTCCCATTGACGCT Pst I
585 Forward AAAAAAGTGTACCTCTTTCTGGCTGGTGCAGAACACCCTTGC Eco RI
Reverse AAAAAACTGCAG-TCAGTTCGCACTTTTTTCTGTTTTGGA Pst I
586 Forward CGCGGATC-CCATATG-GCAGCCCATCTCG BamHI-NdeI
Reverse CCCGCTCGAGTTTCAGCGATCAAGTTTC
587 Forward CGCGGATCCCATATG-GACCTGCCCTTGACGA BamHI-NdeI
Reverse CCCGCTCGAG-AAATGTATGCTGTACGCC XhoI
588 Forward AAAkATCGCTCTATCTTAGACG Eco RI
Reverse AAAAAACTGCAG-TTATTTGTTTTTGGGCAGTTTCACTTC Pst I
589 Forward AAAAAAGAATTC-ATGCAACAAAAAATCCGTTTCCAAJATCGAAGG Eco RI
Reverse AAAAAACTGCAG-CTAATCGATTTTTACCCGTTTCAGGCG Pst I
590 Forward AAAAAAGAATTC-ATGAAAAAACCTTTGATTTCAGTTGCGGC Eco RI
Reverse AAAAAACTGCAG-TTACTGCTGCGGCTCTGAAACCAT Pst I
591 Forward AAAAAAGAATTC-CACTACATCGTTGCCAGATTGTGCGG Eco RI
Reverse AAAAAACTGCAG-CTAACCGAGCAGCCGGGTAACGTCGTT Pst I
592a Forward AAAAAAGAATTC-CGCGATTACACCGCCAAGCTGAAAGGG Eco RI
Reverse AAAAAACTGCAG-TTACCAAACGTCGGATTTGATACG Pst I
593 Forward CGCGGATCCGCTAGC-CTTGAACGAACGGACTC BamHI-NheI
Reverse CCCGCTCGAG-GCGGAAGCGGACGATT XhoI
594a Forward AAAAAAGAATTC-GGTAAGTTCGCCGTTCAGGCCTTTCA Eco RI
Reverse AAAAAACTGCAG-TTACGCCGCCGTTTCCTGACACTCGCG Pst I
595 Forward AAAAAAGAATTC-TGCCAGCCGCCGGAGGCGGAGAGC Eco RI
Reverse AAAAAACTGCAG-TTATTTCAAGCCGAGTATGCCGCG Pst I
596 Forward CGCGGATCCCATATG-TCCCKAACAATACGTC BamHI-NdeI
Reverse CCCGCTCGAG-ACGCGTTACCGGTTTGT XhoI
597 Forward CGCGGATCCCATATG-CTGCTJCATGTCAGC BamHI-NdeI
Reverse GCCCAAGCTT-ACGTATCCAGCTCGp.AG HindIII
601 Forward CGCGGATCCCATATG-ATATGTTCCCAACCGGCAAT BamHI-NdeI
Reverse CCCGCTCGAG-AAAACAATCCTCAGGCAC XhoI
602 Forward CGCGGATCCGCTAGCTTGCTCCATCAATGC BamHI-NheI
Reverse CCCGCTCGAG-ATGCAGCTGCTAAAGCG XhoI
603 Forward AAAAAAGAATTC-CTGTCCTCGCGTAGGCGGGGACGGGG Eco RI
Reverse AAAAAACTGCAGCTACAAGATGCCGGCAGTCGGC Pst I
604 Forward CGCGGATCCGCTAGC-CCCGAAGCGCACTT BamHI-NheI
Reverse CCCGCTCGAG-GACGGCATCTGCACGG XhoI
606a Forward AAAAAAGAATTC-CGCGAATACCGCGCCGATGCGGGCGC Eco RI
Reverse AAAAAACTGCAG-TTAAAGCGATTTGAGGCGGGCGATAG Pst I
607 Forward AAAAAAGAATTC-ATGCTGCTCGACCTCAACCGCTTTTC Eco RI
Reverse AAAAAACTGCAG-TCAGACGGCCTTATGCGATCTGAC Pst I
608 Forward AAAAAAGAATTC-ATGTCCGCCCTCCTCCCCATCATCACCG Eco RI
Reverse AAAAAACTGCAG-TTAGTCTATCCAAATGTCGCGTTC Pst I
609 Forward CGCGGATCCCATATG-GTTGTGGATAGACTCG BamHI-NdeI
Reverse CCCGCTCGAG-CTGGATTATGATGTCTGTC XhoI
610 Forward CGCGGATCCCATATG-ATTGGAGGGCTTATGCA BamHI-NdeI
Reverse CCCGCTCGAG-ACGCTTCAACATCTnTGCC XhoI
611 Forward CGCGGATCCCATATGCCGTCTCAACGGG BamHI-NdeI
Reverse CCCGCTCGAGAACGACTTTGAACGCGCA XhoI
613 Forward CGCGGATCCCATATG-TCGCGTTCGAGCCG3 BamHI-NdeI
Reverse CCCGCTCGAG-AGCCTGTAAAATAGCGGC XhoI
614 Forward CGCGGATCCCATATG..TCCGTCGTGAGCGGC BamHI-NdeI
Reverse CCCGCTCGAG-CCATACTGCGGCGTTC XhoI
616 Forward AAAAAAGAATTC-A TGTCAA CACAATCAAAATGGTTGTCGG Eco RI
Reverse AAAAAATCTAGA-TTAGTCCGGGCGGCAGGCAGCTCG Xba I
e a Forward AAAAAAGAATTC-GGGCTTCTCGCCGCCTCGCTTGC Eco RI
Reverse
AAAAAACTGCAG-TCATTTTTTGTGTTTTAAAACGAGATA Pst I 622 Forward CGCGGATCCCATATGGCCGCCCTGCCTAAAG Bamil- NdeI Reverse CCCGCTCGAGTTTGTCCA,,kTGATAATCTG XhoJ 624 Forward CGCGGATCCCATATG-TCCCCGCGCTTTTACCG BamHI- NdeI Reverse CCCGCTCGAG-AGATTCGGGCCTGCGC XhoJ 625 Forward CGGACCTT-TTCACGAAT Bam-HI- NdeI Reverse CCCGCTCGAG-CGGCAAAATTACCGCCTT XhoI 627a Forward AAAAAAGA-ATTC-AAAGCAGGCGAGGCAGGCGCGCTGGG Eco RI Reverse AAAAAACTGCAG- sI TTACGAATGAAACAGGGTACCCGTCATCAGGCPsI 628 Forward AAAAAAGGTACC-GCCTTACAAACATGGATTTTGCGTTC Kpn I Reverse AAAAAACTGCAGCTACGCACCTGAGCGCTGGCAA Pst I 629a Forward AAAAAAGAATTC-GCCACCTTTATCGCGTATGAAAACGA Eco RI Reverse AAAAAACTGCAG-TTACAACACCGCCGTCCGGTTCAACC Pst I 630a Forward AAAAATCGGCTTGTTTTTG Eco RI Reverse AAAAAACTGCAGTTAGGAGACTTCGCCAATGGAGCCGGG Pst 1 635 Forward AAAAAAGAATTC- EoR ATGACCCAGCGACGGGTCGGCAGCAACCG EoR Reverse AAAAAACTGCAGTTAATCCACTATATCCTGTTGCT Pst I 638 Forward AAAAAAGAATTC-ATGATTGGCGkAAAGTTTATCGTAGTTGG Eco RI Reverse AAAAAACTGCAG-TCACGAACCGATTATGCTGATCGG Pst I 639 Forward CGCGGATCCCATATG-ATGCTTTATTTTGTTCG BaiHi- NdeI Reverse CCCGCTCGAG-ATCGCGGCTGCCGAC XhoI 642 Forward CGCGGATCCCATATGCGGTATCCGCCGCAT BanmHI- NdeI Reverse CCCGCTCGAG-AGGATTGCGGGGCATTA XhoI 643 Forward CGCGGATCCCATATG-GCTTCGCCGTCGGCAG BamiHI- NdeI Reverse CCCGCTCGAG-AACCGAAAAACAGACCGC XhoI 644 Forward AAAAAAGAATTC- Eco RI
ATGCCGTCTGAAAGGTCGGCGGATTGTTGCCC
Reverse AAAAAATCTAGA-CTACCCGCAATATCGGCAGTCCAATAAT Pst I 645 Forward AAAAAAGAATTC-GTGGAACAGAGCAACACGTTAATCG Eco RI Reverse AAAAAACTGCAG-CTACGAGGAAACCGAAGACCAGGCCGC Pst I 647 Forward AAAAAAGAATTC-AT(CJCAAAGGCTCGCCGO-AGACGG Eco RI Reverse AAAAAACTGCAG-TTAGATTATCAGGGATATCCGGTAGA Pst I 648 Forward AAAAAAGAATTC- EoR ATGAACAGGCGCGACGCGCGGATCGACG EoR Reverse AAAAAACTGCAG -TCAAGCTGTGTGCTGATTGAATGCGAC Pst I 649 Forward AAAAAAGAATTC-GGTACGTCAGAACCCGCCCACCG Eco RI Reverse AAAAAACTGCAG-TTAACGGCGGAAACTGCCGCCGTC Pst I 650 Forward AAAAATCAGCCACCAACTG Eco RI Reverse AAAAAACTGCAG-TCAGACGGCATGGCGGTCTGTTTT Pst I 652 Forward AAAAAAGGTACC- pI GCTGCCGAAGACTCAGGCCTGCCGCTTTACCG pI Reverse AAAAAACTGCAG-TTATTTGCCCAGTTGGTAGAATGCGGC Pst I 653 Forward AAAAAAGAATTC-GCGGCTTTGCCGGTAATTTTCATCGG Eco RI Reverse AAAAAACTGCAG-CTATGCCGGTCTGGTTGCCGGCGGCGA Pst I 656a Forward AAAAATCCGCAGCTGGCTAT Eco RI Reverse AAAAAACTGCAG-CTACGATTTCGGCGATTTCCACATCGT Pst 1 657 Forward AAAAAAGAATTC-GCAGAATTTGCCGACCGCCATTTGTGCGC Eco RI Reverse AAAAAACTGCAG-TTATAGGGACTGATGCAGTTTTTTTGC Pst I 658 Forward CGCGGATCCCATATG-GTGTCCGGAATTGTG Bam}{I- NdeI Reverse CCCGCTCGAG-GGCAGAATGTTTACCGTT Xhol 661 Forward AAAAAAGAATTC- EoR ATGCACATCGGCGGCTATTTTATCGACACCC EoR Reverse AAAAAACTGCAG-TCACGAC'GTGTCTGTTCGCCGTCGGGC Pst I 663 Forward CGCGGATCCCATATGTGTATCGAGATAAATT BamHI- NdeI Reverse CCCGCTCGAG-GTAAMAATCGGGGCTGC Xhol 664 Forward CGCGGATCCCATATG-GCGGCTGGCGCGGT BamHI- NdeI Reverse CCCGCTCGAG-AAATCGAGTTTTACACCAC XhoI 665 Forward AAAAATCAGATGAGACCCTG Eco RI Reverse AAAAAACTGCAG-TCAATCCAJA4ATTTTGCCGACGATTTC Pst 1 666 Forward AAAAAAGAATTC-AACTCAGGCGAAGGAGTGCTTGTGGC Eco RI Reverse AAAAAATCTAGA-TCAGTTTACC-GATAGCAGGCGTAC Xba 1 667 Forward AAAAAAGAATTC- EoR CCGCATCCGTTTGATTTCCATTTCGTATTCGTCCG EoR Reverse AAAAAACTGCAG-TTAATGACACAATAGGCGCAAGTC Pst I 6 6 9 F o r w a r d A A A A A A G A A T C A T G C C G A A T A A C A A C C E o R Reverse AAAAAACTGCAG-TTACAGTATCCGTTTGATGTCGGC Pst I 670a Forward AAAAAAGAATTC-AAAAACGCTTCGGGCGTTTCGTCTTC Eco RI Reverse 
AAAAAACTGCAG- s TTAGGAGCTTTTGGAACGCGTCGGACTGGCPsI 671 Forward CGCGGATCCCATATG..ACCAGCAGGGTAC BainHI- Revese ~NdeI Revese CCGCCGA-AGCACTTAAAACGAAGXhoI 672 Forward CGGACCTT-GAATCCC BamHI- Reverse CCCGCTCGAG-ACGGGATAGGCGGTTG NdeI 673 Forward AAAAATCAGAATAACTCTCG Eco RI Reverse AAAAAACTGCAG..CTACAAACCCAGCTCGCGCAGGA Pst 1 674 Forward AAAAAAGAA-TTC-ATGAAAACAGCCCGCCGCCGTTCCCG Eco RI Reverse AAAAAACTGCAG-TCAACGGCGTT.TGGGCTCGTCGGG Pst I 675 Forward CGCGGATCCCATATG-AACACCATCGCCCC BamHI- NdeI Reverse CCGTGGTCTCTTCACG XhoI 677a Forward AAAAAAGAATTC-AGACGGCATTCCCGATCAGTCGATTTTGA Eco RI Reverse AAAAAACTGCAGTTACGTATGCGCGAAATCGACCGCCGC Pst I 680 Forward CGCGGATCCGCTAGCACGAAGGGCAGTTCC-G BarnHl- NheI Reverse CCCGCTCGAG-CATCAAACCTGCCGC XhoI 681 Forward AAAAAAGAATTC-ATGACGACGCCGATGGCAATCAGTGC Eco RI Reverse AAAAAACTGCAGTTACCGTCTTCCGCAAAACAGC Pst 1 683 Forward CGCGGATCCCATATGTGCAGCACACCGGACA Bami-l- NdeI Reverse CCCGCTCGAG-GAGTTTTTTTCCGCATACG Xhol 684 Forward CGCGGATCCCATATG-TGCGGTACTGTGCkAAG BaniHI- NdeJ Reverse CCCGCTCGAG-CTCGACCATCTGTTGCG XhoI 685 Forward CGCGGATCCCATATGTGTTGCTTATTAAACATT BamMI- NdeI Reverse CCCGCTCGAG-CTTTTTCCCCGCCGCA Xhol 686 Forward CGCGGATCCCATATGTGCGGCGGTTCGGAG BaniHI- Revere CCGCTCAG~CATCCGIICTATGI Revese CCGTCGA-CATCCATTCGATAAGXhoI 687 Forward CGCGGATCCCATATGTGCGACAGCAAAGTCCA BamHJ- NdeI Reverse CCCGCTCGAG-CTGCGCGGCTTTTT'HGTT XhoI 690 Forward CGCGGATCCCATATG..TGTTCTCCGAGCAAkAGAC BamHI- NdeI Reverse CCCGCTCGAG-TATTCGCCCCGTGTTTGG XhoI 691 Forward CGCGGATCCCATATG-GCCACGGCTTATATCCC BamHI- NdeI Reverse CCCGCTCGAG-TTTGAGGCAGGAPAGAG XhoI 694 Forward CGCGGATICCCATlATG-TTGGTTTCCGCATCCGG BamHl- NdeI Reverse CCCGCTCGAG-TCTGCGTCGGTGCGGT XhoI 695 Forward CGGACCTT-TGCCACCTC BarnHI- NdeI Reverse CCCGCTCGAG.TCGTTTGCGCACGGCT Xhol 696 Forward CGCGGATCCCATATG-TTGGGTTGCCGGCAGG Bam}{I- NdeI Reverse CCCGCTCGAG-TTGATTGCCGCAATGATG XhoI 700a Forward AAAAAAGAATTC-GCATCGACAGACGGTGTGTCGTGGAC Eco RI Reverse AAAAAACTGCAG-TTACGCTACCGGCACGACTTCCAACC Pst 1 701 Forward 
CGCGGATCCCATATGAAGACTTGTTTGGATACTTC BamiHl- NdeI Reverse CCCGCTCGAG-TGCCGACAACAGCCTC Nhol 702 Forward AAAAAAGAATTC-ATGCCGTGTTCCAAAGCCAGTTGGATTTC Eco RI Reverse AAAAAACTGCAG-TTAACCCCATTCCACCCGGAGACCGA Pst I 703 Forward CGCGGATCCGCTAGCCAAACGCTGGCAACCG BamilI- NheI Reverse CCCGCTCGAG-TTTTGCAGGTTTGATGTTTG XhoJ 704a Forward AAAAAAGAATTC-GCTTCTACCGGTACGCTGGCGCG Eco RI Reverse AAAAAACTGCAG- sI TTAGTTTTGCCGGATAATATGGCGGGTGCGPsI 707 Forward CGCGGATCCGCTAGCGAATTTACGATGCAGA BamHl- NheI Reverse CCCGCTCGAG-GAAACTGTAATTCAAGTTGA XhoI 708 Forward CGCGGATCCGCTAGCCCTTTTAGCCATCCAAA BarnHI- NheI Reverse CCCGCTCGAG-TTGACCGGTGAGGACG XhoI 710 Forward CGGACCTT-AACAG.6 T Bamill- Reverse CCCGCTCGAG-AACGGTTTCGGTCAG NdeI 714 Forward CGCGGATCCCATATGAGCTATCAAGACATCTT BainHl- Reverse CCCGCTCGAG-GCGGTAGGTKAATCGGAT NdeJ 716 Forward CGCGGATCCCATATGGCCAACAAACCGGCAAG aiI Reverse CCCGCTCGAG-TTTAGAACCGCATTTGCC NdeJ 718 Forward CGCGGATCCCATATGGAGCCGATATGGCAAABa~l NdeI Reverse CCCGCTCGAG-GGCGCGGGCATGGTCTTGTCC XhoJ 720 Forward CGCGGATCCCATATG-AGCGGATGGCATACC BamHI- NdeI Reverse CCCGCTCGAG-TTTTGCATAGCTGTTGACCA XhoI 723 Forward CGCGGATCCCATATGCGACCCAAGCCCC BamHJ- Revese CCGCCGAGAATGGAACCGCGCCNdeI Revese CCGTCGG~AAGCGJ~TCGCGCCXhoI 725 Forward CGCGGATCCCATT.TCCCGTA anI NdeI Reverse CCCGCTCGAG-TTGCTTATCCTTAAGGGTTA XhoI F726 Forward CGCGGATCCCATATGACCATCTATTTCAJAC BamHI- NdeI Reverse CCCGCTCGAG-GCCGATGTTTAGCGTCC XhoI 728 Forward CGCGGATCCCATATGTTTTGGCTGGGAACGGG Bam~il- NdeI Reverse CCCGCTCGAG-GTGAGAAJGGTCGCGC XhoI 729 Forward CGCGGATCCCATATG..TGCACCATGAPITCCCCA BamJ{I- NdeI Reverse GCCCAAGCTT-TTTGTCGGTTTGGGTATC Hind.ill 731 Forward CGCGGATCCGCTAGC.GCCGTGCCGGAGG Bam}I-Hl NheI Reverse CCCGCTCGAG-ACGGGCGCGGCAG XhoI 732 Forward CCGGAATTCTACATATGTCGAACCTGTTTTTAGA EcoRi- NdeI Reverse CCCGCTCGAG-CTTCTTATCTTTTTTATCTTTC Xhol 733 Forward CGCGGATCCCATATGGCCTGCGGCGGCA BamiHJ- Reverse CCCGCTCGAGTCGCTTGCCTCCTTT.AC NdeI 734 Forward CGGACCTT-CCAATAGCA BamHI- Reverse CCGTGGTTAATTGACAGGXo 735 Forward 
CGCGGATCCCATATG-AAGCAGCAGGCGGTCA Bam-iHI- Reverse CCCGCTCGAG-ATTTCCGTAGCCGAGGG NdeI 737 Forward CGCGGATCCCATATG-CACCACGACGGACACG BamHI- Reverse CCCGCTCGAG-GTCGTCGCGGCGGGA NdeI 739 Forward CGGACCTT-GAAAACAC BaoI Revese ~NdeI 740 Forward CGCGGATCCCATATGGCCATCCGCCCGAAG ail Reverse CCCGCTCGAGAAACGCGCAATAGTG NdeI 741 Forward CGGGGATCCCATATG-TGCAGCAGCGGAGGG BamiHI- Revese ~NdeI 743 Forward CGCGGATCCCATATG-GACGGTGTTGTGCCTGTT Bntl Revese ~NdeI 745 Forward CGCGGATCCCATATG-TTTTGGC&ACTGACCG aiI NdeI Reverse CCCGCTCGAG-CAAATCAGATGCCTTTAGG XhoI 746 Forward CGCGGATCCCATATGTCCGAAACACAAAC BamijI- NdeI Reverse CCCGCTCGAG-TTCATTCGTTACCTGACC XhoI 747 Forward CCGGAATTCTAGCTAGC-CTGACCCCTTGGG EcoRI- NheI Reverse GCCCAAGCTT-TTTTGATTTTAATTGACTATAGAAC HindIll 749 Forward CGCGGATCCCATATG-TGCCAGCCGCCG BamHI- NdeI Reverse CCCGCTCGAG-TTTCAAGCCGAGTATGC XhoI 750 Forward CGCGGATCCCATATG-TGTTCGCCCGAACCTG BamHI- NdeI Reverse CCCGCTCGAG-CTTTTTCCCCGCCGCAA XhoI 758 Forward CGCGGATCCCATATG-AACAATCTGACCGTGTT BamHI- NdeI Reverse CCCGCTCGAG-TGGCTCAATCCTTTCTGC XhoI 759 Forward CGCGGATCCGCTAGC-CGCTTCACACACACCAC BamlHI- NheI Reverse CCCGCTCGAG-CCAGTTGTAGCCTATTTTG XJhoI 763 Forward CGCGGATCCCATATG-CTGCCTGAAGCATGGCG Balil- NdeI Reverse CCCGCTCGAG-TTCCGCAATACCGTTTCC Xhol 764 Forward CGCGGATCCCATATG-TTrTTTCTCCGCCCTGA Ban- NdeI "'--versc: CCCGCTCCAG-TCGCTCCCTAAAGCTTJC XhoI 765 Forward CGCGGATCCCATATG-TTAAGATGCCGTCCG BamHI- NdeI Reverse CCCGCTCGAG-ACGCCGACGTTTTTTATTM, XhoI 767 Forward CGCGGATCCCATATG-CTGACGGAAGGGGAAG BamHI- NdeI Reverse CCCGCTCGAG-TTTCTGTACAGCAGGGG XhoI 768 Forward GGCGGATCCCATATG-GCCCCGCAAACCCG B amHI- NdeI Reverse CCCGCTCGAG-TTTCATCCCTTTUTTGAGC XhoI 770 Forward CGCGGATCCCATATG-TGCGGCAGCGCGAA BarnHI- NdeI Reverse CCCGCTCGAG-GCGTTTGTCGAGATTTTC XhoI 771 Forward CGCGGATCCCATATG-TCCGTATATCGCACCTTC BamHI- NdeI Reverse CCCGCTCGAG-CGGTTCTTTAGGTTTGAG XhoI 772 Forward CGCGGATCCCATATG-TTTGCGGCGTTGGTGG BarnHI- NdeI Reverse CCCGCTCGAG-CAATGCCGACATCAAAJCG XhoI 774 Forward CGCGGATCCCATATG-TCCGTTTCACCCGTTCC 
Bam-HI- NdeI Reverse CCCGCTCGAG-TCGTTTGCGCACGGCT XhoI 790 Forward CGGACCTT-CAAGTAAA BaniHI- NdeI Reverse CCCGCTCGAG-UGGCGTTGTTCGGAT CG XhoIl 900 Forward CGCGGATCCCATATGCCGTCTGATGCCG :N i- Reverse CCCGCTCGAG-ATATGGAAGTCTGTTGTC Xo 901 Forward CGCGGATCCCATATG-CCCGATTTTTCGATG Bamil- Reverse CCCGCTCGAG-AAATGGAACATACCAGG Xo 902 Forward. CCGGAATTCTACATATGTGCACTTTCAGGATAATC EcoRi- 2 NdeI Reverse CCCGCTCGAGAAJGTACATGGCGTAC XhoI 903 Forward CCGGAATTCTAGCTAGC-CAGCGTCAGCAGCACAT EcoRi- Reverse CCCGCTCGAGGAACTGTATTCAGTTGA NhoI 904 Forward AAAAAAGGTACC-ATGATGCAGCACAATCGTTTC Kpn I Reverse AAACTGCAG-TTAATATCGATAGGTTATATG Pst I 904a Forward AAAAAAGAATTC-CGGCTCGGCATTGTGCAGATGTTGCA Eco RI Reverse AAACTGCAG-TTAATATCGATAGGTTATATG Pst I 905 Forward CGCGGATCCCATATGACATATACCGCATC Banill- NdeI Reverse CCCGCTCGAG-CCACTGATAACCGACAGAT XhoI 907 Forward CGCGGATCCCATATGGGCGCGCAACGTGAG BamHI- Reverse CCCGCTCGAG-ACGCCACTGCCAGCG NhoI 908 Forward AAAGAATTCGCAGAGTTAGTAGGCGTTTAAATAC Eco RI Reverse AAk GA-TAAGTTGCTC Pst I 909 Forward CGCGGATCCCATATGTGCGCGTGGGAAACTTAT BaniHI- NdeI Reverse CCCGCTCGAG-TCGGTTUTGAAACT-ITGGTTTT XhoI 910 Forward AAA~AATTC-GCATTTGCCGGCGACTCTGCCGAGCG Eco RI Reverse A.AACTGCAG-TCAGCGATCGAGCTGCTCTTT s1 911 Forward AAAGAATTC-GCTTTCCGCGTGGCCGGCGGTGC Pst I Reverse AAAAAACTGCAG..GTCGACTTATTCGGCGGCTTTTTCCGC Eco RI 912 Forward AAAAAAG ATT(> Eco RI CAAkATCC
GTCAAAACGCCACTCAAGTATTGAG
Reverse AAAAAACTGCAG-TTACAGTCCGTCCACGCCTTTCGC Pst 1 913 Forward CGG~CCAkGGACCCCG B amHI- Reverse CCCGCTCGAG-AGGTTGTGTTCCAGGTTG NdeI 915 Forward CGCGGATCCCATATGTGCCGGCAGGCGGABmI Reverse CCCGCTCGAG-TTTGAATATAGGTATCAG G NdeI 914 Forward AAA~AATTCGACAGAATCGGCGATTGGAAGCACG Eco RI Reverse AAACTGCAG-CTATATGCGCGGCAGGACGCTCACGG Pst I 916 Forward CGCGGATCCCATATGGCAATGATGGCGGCTG Bam~l- Reverse CCCGCTCGAG-TTTGGCGGCATCTTTCAT XhoI 917 orw rd A A A A G A A T C C T G C G A A C C G C A C G G CE co R I Reverse AAAAAACTGCAG-TTATTTCCCCGCCTTCACATCCTG Pst I 7919 Forward AA A A A I BainCC~ CUCCCG 920 Forward CGCGGATCCCATATG-TCAGCCGGTC BanmHI- NdeI Reverse CCCGCTCGAG-GGGCGTACGA XhoI 921 Forward AAAAAAGAATTCTTGACGGAATCCCCGTGAATCC Eco RI Reverse AAAAAACTGCAG-TCATTTCAAGGGCTGCATCTTCAT Pst I 922 Forward. CGCGGATCCGCTAGC-TGTACGGCGATGGAGGC BarnHJ- 2 NheI Reverse CCCGCTCGAG-CAATCCCGGGCCGCC XhoJ 923 Forward CGCGGATCCCATATG..TGTTACGC TATTGTCCC BamHJ- NheI Reverse CCCGCTCGAG-GGACAAGGCGACGAG Xhol 925 Forward CGCGGATCCCATATG-AAJCAAATGCT.PJTAGCCG BamHI- NdeI Reverse CCCGCTCGAG-GCCGTTGCATTTGATTTC XhoI 926 Forward CGCGGATCCCATATG..TGCGCGCAATTACCTC BamJ'{- NdeI Reverse CCCGCTCGAG-TCTCGTGCGCGCCG XhoI 927 Forward CGC GGATCCCATATG.TGCAGCCCCGCAGC BamHI- NdeI Reverse CCCGCTCGAG-GTTTTTTGCTGACGTAGT XhoI 929a Forward AAAAAAGAATTC-CGCGGTTTGCTCA.CAGGGCTGGG Eco I Reverse AAAAAATCTAGATTAAGAAGACGGACTACTGCC Xba 1 931 Forward AAAAAAGATTCGCAACCCATGTTGATGGAAAC Eco RI Reverse AAAAAACTGCAGTTACTGCCCGACACACGCGACG Pst I 935 Forward AAAAAAGAA TTC. 
coR GCGGATGCGCCCGCGATTTTGGATGACAAGGC EoR Reverse AAAAAACTGCAGTCAAAACCGCCATCCGCCGACAC Pst I 936 Forward CGCGGATC-CCATATG.GCCGCCGTCGGCGC BamHT- Revese CCGCCGAGGCGTGGAGTAGTUGNdeI Revese CCGTCGG-GCTTGACGAGTTTGXhoI 937 Forward AAAAAAGAATTCCCGGTTTACATTCACCGGCGCAAC Eco RI Reverse AAAAAACTGCAGTTAAATGTATGCTGTACGCCAAA Pst I 939a Forward AAAAAAGAATTCGGTTCGGCAGCTGTGATGAACC Eco RI Reverse AAAAAAC TGCAG-TTAACGCA..AACCTTGGATAAGTTGGC Pst I 950 Forward CGCGGACCCATATGGCCAAACCGGCAG BamHI- Reverse CCGTGGTTGACGATGCXo 953 Forward CGCGGATCCCATATGGCCACCTACAAAGTGGAC BamHl.
Revere CCGCTCAG~TTTTTGCTGCTCGA Revese CCGTCGG-TTTTTGCTCCTGATXhoI 957 Forward CGCGGATCCCATATGTTTTGGCTGGGAACGGG BamHI- R rse CC G TC A F958 Forward CGCGGAT-CCCATATG..GCCGATGCCGTTGCG Bani-JI- NdeI Reverse GCCCAAGCTT-GGGTCGU7TGTTGCGTC Hindll 959 Forward CGCGGATCCCATATG..CACCACGACGGACACG BamHjI- Reverse CCCGCTCGAG-GTCGTCGCGGCGGGA NdeI 961 Forward CGCGGATCCCATATGGCCACAAGCGACGACG BamHI- NdeI Reverse CCCGCTCGAG-CCACTCGT&AATTGACGC Xhol 972 Forward AAAAAAGAATTC- coR TTGACTAACAGGGGGGGAGCGkATTAAAC EoR Reverse AAAAAATCTAGATTAAAATAATCATATCTACATTTTG Xba I 973 Forward AAAAAA TTCATGGACGGCGCACCCGAAC Eco RI Reverse AAAAAACTGCAG..TTACTTCACGCGGGTCGCCATCAGCGT Pst 1 982 Forward CGCGGATCCCATATGGCAGCAAGACGTAC BamiHl Reverse CCCGCTCGAGCATCATGCCGCCCATCC Xo 9 83 Forward CGCGGATCCCATATGTTAGCTGTTGCAACAACAC BamHb- NdeI Reverse CCCGCTCGAG-GAACCGGTAGCCTACG Xhol 987 Forward CGCGGATCCCATATGCCCCCACTGGAGAC Bam~i- Reverse CCCGCTCGAG-TAATAAACCTTCTATGGGC NdeI 988 Forward CGCGGATCCCATATGTCTTTAATTTACGGGAkAG BamHI- NdeI Reverse GCCCAAGCTT-TGATTTGCCTTTCCGTTTT Hindll 989 Forward CCGGAA TTCTACATATG-GTCCACGCATCCGGCTA EcoRI- Reverse CCCGCTCGAG-TTTGAATTGTAGGTGTATTGC Xo 990 Forward. CGCGGAT-CCGCTAGC-TTCAGAGCTCAGCTT BamHl- 2 NheI Reverse CCCGCTCGAG-AAACAGCCATTTGAGCGA XhoI 992 Forward CGCG-GATCCCATATG-GACGCGCCCGCCCG BamHIl- Reverse CCCGCTCGAGCCAAATGCCCAACCATTC NdeI 993 Forward CG( A-CAAGGCAGTATAAC BaniHI- NdeI Reverse CCCGCTCGAG-GAACACATCGCGCCCG XhoI 996 Forward CGCGGATCCCATATGTGCGGCAGAATCCGC Baniuij- Reverse CCCGCTCGAG-TCTkAACCCCTGTTTTCTC Xo 997 Forward CCGGA-ATTCTAGCTAGC-CGGCACGCCGACGTT EcoRi- Reverse CCCGCTCGAG-GACGGCATCGCTCAGG XhoI
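Each primer in the table above carries, in its 5' tail, the recognition sequence of the restriction enzyme named in the right-hand column. The following is a minimal illustrative sketch, not part of the specification, of how a listed primer could be checked against its stated site. The function name `sites_in_tail`, the 20-base `tail_len` cutoff and the decision to ignore separator characters are assumptions made for this example; the recognition sequences are the standard ones for these enzymes.

```python
# Standard recognition sequences for the enzymes named in the primer table.
SITES = {
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "KpnI": "GGTACC",
}


def sites_in_tail(primer, tail_len=20):
    """Return the enzymes whose sites occur in the 5' tail, in 5'->3' order.

    Hyphens, dots and spaces are treated as layout separators and removed.
    Only the first `tail_len` bases are searched (an assumed cutoff), so a
    chance match further into the gene-specific part is not reported.
    """
    seq = primer.replace("-", "").replace(".", "").replace(" ", "").upper()
    tail = seq[:tail_len]
    hits = [(tail.find(site), name) for name, site in SITES.items() if site in tail]
    return [name for _, name in sorted(hits)]


if __name__ == "__main__":
    # Primer pair listed for ORF 259 (BamHI-NdeI forward, XhoI reverse).
    print(sites_in_tail("CGCGGATCCCATATG-GAAGAGCTGCCTCCG"))
    print(sites_in_tail("CCCGCTCGAG-GGCTTTTCCGGCGTTT"))
```

A primer for which the function returns an empty list, or a site different from the one in the table, is a candidate transcription error in the printed sequence rather than evidence about the original primer.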