WO1998059034A2 - TREPONEMA PALLIDUM POLYNUCLEOTIDES AND SEQUENCES - Google Patents

TREPONEMA PALLIDUM POLYNUCLEOTIDES AND SEQUENCES

Info

Publication number
WO1998059034A2
Authority
WO
WIPO (PCT)
Prior art keywords
pallidum
sequence
fragments
seq
nos
Prior art date
Application number
PCT/US1998/013041
Other languages
English (en)
Other versions
WO1998059034A3 (fr)
Inventor
Claire M. Fraser
Original Assignee
Human Genome Sciences, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Human Genome Sciences, Inc.
Priority to EP98931511A (patent EP0990022A1)
Priority to CA002296814A (patent CA2296814A1)
Priority to AU81623/98A (patent AU8162398A)
Publication of WO1998059034A2
Publication of WO1998059034A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/20 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Spirochaetales (O), e.g. Treponema, Leptospira
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies

Definitions

  • the present invention relates to the field of molecular biology.
  • it relates to, among other things, nucleotide sequences of Treponema pallidum, contigs, ORFs, fragments, probes, primers and related polynucleotides thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof, such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.
  • Spirochetes are a family of motile, unicellular, spiral-shaped bacteria which share a number of structural characteristics.
  • Three genera of the spirochetes are pathogenic in humans: (a) Treponema, which includes the pathogens that cause syphilis (T. pallidum), yaws (T. pertenue), and pinta (T. carateum); (b) Borrelia, which includes the pathogens that cause epidemic and endemic relapsing fever and Lyme disease; and (c) Leptospira, which includes a wide variety of small spirochetes that cause mild to serious systemic human illness (Koff, A. B. and Rosen, T., J. Am. Acad. Dermatol. 29:519-535 (1993)). In 1986, more than 27,000 cases of early infectious syphilis were diagnosed in the United States alone. Such statistics indicate that infection with T. pallidum is the largest source of human disease resulting from the spirochetes.
  • T. pallidum is morphologically indistinguishable from several other pathogenic spirochetes but, in general, treponemes and other spirochetes are easily identifiable when compared to other bacteria.
  • a key morphological characteristic of T. pallidum, and other spirochetes, is the presence of a central protoplasmic cylinder composed primarily of peptidoglycan and one or more adjacent axial fibrils (also designated periplasmic flagella or endoflagella; Charon, N. W., et al., Res. Microbiol. 143:597-603 (1992)). These structures provide a source of corkscrew-like motion to the treponemes.
  • treponemes move in an apparently random fashion and, unlike the majority of motile bacteria, continue to move in a more viscous medium.
  • Treponemes are highly moldable to intercellular spaces, a characteristic which is thought to be mediated by the interactions of bacterial adhesins and cellular fibronectins.
  • Syphilis is the primary clinical manifestation of infection with T. pallidum.
  • the clinical manifestations of syphilis can resemble many diseases.
  • Syphilis is typically transmitted by sexual contact, but can also be transmitted transplacentally.
  • the infecting organism multiplies at the site of infection within 10 to 60 days postinfection and results in a primary ulcer-like lesion termed a chancre.
  • A small number of organisms move from the primary lesion to the regional lymph nodes and establish small infectious centers termed satellite buboes. Organisms from these locations enter the blood stream and result in a systemic infection (Goens, J. L., et al., Am. Fam. Physician 50:1013-1020 (1994)).
  • The secondary stage of syphilis manifests itself as a widespread skin rash and begins between two and twelve weeks following the primary infection. During this stage, the infected individual often experiences a low grade fever coupled with swollen lymph nodes. Also during this period, lesions of various degrees of severity may develop in a number of physical locations including bone, liver, kidney, central nervous system (CNS), and other organs (Neeravahu, M. Arch. Intern. Med. 145:132-134 (1985)). Such secondary infections are highly infectious, but will, in time, subside spontaneously.
  • a third stage of syphilis occurs in approximately 30% of infected, but not treated, individuals. The third stage occurs several years following the first and second stages.
  • The lesions which characterize the third stage of infection are minor in terms of the number of organisms, but may be severe in terms of tissue damage. Such lesions may result in necrosis, scar formation, general paresis, damage to aortic valves, permanent blindness, and other extensive tissue damage, all probably related to a delayed type hypersensitivity reaction by the host to the T. pallidum organisms (Scheck, D. and Hook, E. W., 3rd, Infect. Dis. Clin. North Am. 8:769-795 (1994)).
  • T. pallidum has a remarkable ability to evade both the humoral and cellular components of the immune system. It was originally thought that the ability of T. pallidum to evade the immune system of the host organism was due to the presence of an outer coat of mucopolysaccharides.
  • T. pallidum makes use of the organization and the relative immunogenicity of its complement of outer membrane proteins to evade the immune system (Radolf, J. D. Mol. Microbiol. 16:1067-1073 (1995)).
  • the T. pallidum outer membrane contains a scarcity of immunogenic transmembrane proteins (with regard to T. pallidum, these are termed "rare outer membrane proteins").
  • T. pallidum also secretes a number of small, but immunogenic proteins which may induce an immune response (Hindersson, P. et al., Res. Microbiol. 143:629-639 (1992)). It is clear that the etiology of diseases mediated or exacerbated by T.
  • T. pallidum genes and that characterizing the genes and their patterns of expression would add dramatically to our understanding of the organism and its host interactions.
  • Knowledge of T. pallidum genes and genomic organization would dramatically improve understanding of disease etiology and lead to improved and new ways of preventing, ameliorating, arresting and reversing diseases.
  • characterized genes and genomic fragments of T. pallidum would provide reagents for, among other things, detecting, characterizing and controlling T. pallidum infections. There is a need therefore to characterize the genome of T. pallidum and for polynucleotides and sequences of this organism.
  • the present invention is based on the sequencing of fragments of the T. pallidum genome.
  • the primary nucleotide sequences which were generated are provided in SEQ ID NOS: 1-744.
  • the present invention provides the nucleotide sequence of several thousand contigs of the T. pallidum genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
  • the present invention is provided as contiguous strings of primary sequence information corresponding to the nucleotide sequences depicted in SEQ ID NOS: 1-744.
  • the present invention further provides nucleotide sequences which are at least 95% identical to the nucleotide sequences of SEQ ID NOS: 1-744.
  • the nucleotide sequence of SEQ ID NOS: 1-744 may be provided in a variety of mediums to facilitate its use.
  • the sequences of the present invention are recorded on computer readable media.
  • Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
  • the present invention further provides systems, particularly computer-based systems which contain the sequence information herein described stored in a data storage means.
  • Such systems are designed to identify commercially important fragments of the T. pallidum genome.
  • Another embodiment of the present invention is directed to fragments of the T. pallidum genome having particular structural or functional attributes.
  • Such fragments of the T. pallidum genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter referred to as open reading frames or ORFs, fragments which modulate the expression of an operably linked ORF, hereinafter referred to as expression modulating fragments or EMFs, and fragments which can be used to diagnose the presence of T. pallidum in a sample, hereinafter referred to as diagnostic fragments or DFs.
  • Each of the ORFs in fragments of the T. pallidum genome disclosed in Tables 1, 2 and 3, and the EMFs found 5' to the ORFs can be used in numerous ways as polynucleotide reagents.
  • The sequences can be used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in a sample, to selectively control gene expression in a host, and in the production of polypeptides, such as polypeptides encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.
  • the present invention further includes recombinant constructs comprising one or more fragments of the T. pallidum genome of the present invention.
  • The recombinant constructs of the present invention comprise vectors, such as a plasmid or viral vector, into which a fragment of the T. pallidum genome has been inserted.
  • the present invention further provides host cells containing any of the isolated fragments of the T. pallidum genome of the present invention.
  • the host cells can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.
  • The present invention is further directed to isolated polypeptides and proteins encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and proteins of the present invention from cells which have been altered to express them.
  • the invention further provides methods of obtaining homologs of the fragments of the T. pallidum genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
  • the invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
  • Such antibodies include both monoclonal and polyclonal antibodies.
  • the invention further provides hybridomas which produce the above-described antibodies.
  • a hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.
  • the present invention further provides methods of identifying test samples derived from cells which express one of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one or more of the antibodies of the present invention, or one or more of the DFs of the present invention, under conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.
  • kits are provided which contain the necessary reagents to carry out the above-described assays.
  • the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the antibodies, or one of the DFs of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of bound antibodies or hybridized DFs.
  • the present invention further provides methods of obtaining and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present invention.
  • agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical agents and the like.
  • Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.
  • the present genomic sequences of T. pallidum will be of great value to all laboratories working with this organism and for a variety of commercial purposes. Many fragments of the T. pallidum genome will be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value to T. pallidum researchers and for immediate commercial value for the production of proteins or to control gene expression.
  • Sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative genomics and molecular phylogeny.
  • FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems of the present invention.
  • FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit and annotate the contigs of the T. pallidum genome of the present invention.
  • Both Macintosh and Unix platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society Press, Washington D.C. (1993).
  • Factura (AB) is a Macintosh program designed for automatic vector sequence removal and end-trimming of sequence files.
  • the program Loadis runs on a Macintosh platform and parses the feature data extracted from the sequence files by Factura to the Unix based T. pallidum relational database.
  • Assembly of contigs is accomplished by retrieving a specific set of sequence files and their associated features using Extrseq, a Unix utility for retrieving sequences from an SQL database.
  • The resulting sequence file is processed to trim portions of the sequences with a high rate of ambiguous nucleotides.
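  • The trimming rule itself is not given in the text. As a minimal sketch only, the following assumes that any IUPAC code other than A, C, G or T counts as ambiguous and that an end is trimmed one base at a time while its terminal window is too ambiguous. The function names and the `window` and `max_ambiguous_fraction` parameters are hypothetical.

```python
UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq):
    """Return the fraction of bases in seq that are not A, C, G or T."""
    if not seq:
        return 0.0
    return sum(base not in UNAMBIGUOUS for base in seq.upper()) / len(seq)


def trim_ambiguous_ends(seq, window=10, max_ambiguous_fraction=0.2):
    """Trim both ends while the terminal window exceeds the allowed ambiguity.

    Assumed rule (not from the source): drop one base from an end for as
    long as the `window` bases at that end contain more than
    `max_ambiguous_fraction` ambiguous codes. Stretches shorter than
    `window` are left untouched.
    """
    start, end = 0, len(seq)
    # Trim from the 5' end.
    while (end - start >= window
           and ambiguous_fraction(seq[start:start + window]) > max_ambiguous_fraction):
        start += 1
    # Trim from the 3' end.
    while (end - start >= window
           and ambiguous_fraction(seq[end - window:end]) > max_ambiguous_fraction):
        end -= 1
    return seq[start:end]
```

  • With `window=4` and `max_ambiguous_fraction=0.0`, a read such as `NNNNACGTACGTACGTACGTNNNN` is reduced to its unambiguous core, while a read with no ambiguous codes is returned unchanged.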
  • The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic Research (TIGR) for rapid and accurate assembly of thousands of sequence fragments.
  • the collection of contigs generated by the assembly step is loaded into the database with the lassie program.
  • Identification of open reading frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against T.
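  • The text names zorf but does not describe how it works. The sketch below is therefore not zorf; it is a generic six-frame scan, assuming ATG as the only start codon and TAA, TAG and TGA as stop codons. The function names and the `min_codons` cutoff are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, orf_sequence) tuples for ATG...stop stretches.

    All three frames of both strands are scanned. An ORF runs from the
    first ATG in a frame to the next in-frame stop codon (inclusive) and is
    kept only if it spans at least `min_codons` codons, stop included.
    """
    orfs = []
    forward = seq.upper()
    for strand, strand_seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # remember the first start codon in this frame
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, strand_seq[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

  • ORFs that run off the end of a contig without a stop codon are not reported by this sketch; a production tool would likely need to handle such partial ORFs.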
  • the present invention is based on the sequencing of fragments of the T. pallidum genome and analysis of the sequences.
  • the primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID NOS: 1-744.
  • The "primary sequence" refers to the nucleotide sequence represented by the IUPAC nomenclature system.
  • the present invention provides the nucleotide sequences of SEQ ID NOS: 1-744, ORF IDs and ORFs within, or representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
  • a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS: 1-744" refers to any portion of the SEQ ID NOS: 1-744 which is not presently represented within a publicly available database.
  • Preferred representative fragments of the present invention are T. pallidum open reading frames (ORFs), expression modulating fragments (EMFs) and fragments which can be used to diagnose the presence of T. pallidum in a sample (DFs).
  • The information provided in SEQ ID NOS: 1-744 and in Tables 1-3, together with routine cloning, synthesis, sequencing and assay methods, will enable those skilled in the art to clone and sequence all "representative fragments" of interest, including open reading frames encoding a large variety of T. pallidum proteins.
  • the present invention is further directed to nucleic acid molecules encoding portions or fragments of the nucleotide sequences described herein.
  • Fragments include portions of the nucleotide sequences of SEQ ID NOS: 1-744, at least 10 contiguous nucleotides in length, selected from any two integers, one of which represents a 5' nucleotide position and the second of which represents a 3' nucleotide position, where the first nucleotide for each nucleotide sequence in SEQ ID NOS: 1-744 is position 1. That is, every combination of a 5' and 3' nucleotide position that a fragment at least 10 contiguous nucleotides in length could occupy is included in the invention.
  • A fragment may be 10 contiguous nucleotide bases in length or any integer between 10 and the length of an entire nucleotide sequence of SEQ ID NOS: 1-744 minus 1. Therefore, included in the invention are contiguous fragments specified by any 5' and 3' nucleotide base positions of a nucleotide sequence of SEQ ID NOS: 1-744 wherein the length of the contiguous fragment is any integer between 10 and the length of an entire nucleotide sequence minus 1.
  • the invention includes polynucleotides comprising fragments specified by size, in nucleotides, rather than by nucleotide positions.
  • the invention includes any fragment size, in contiguous nucleotides, selected from integers between 10 and the length of an entire ORF ID, ORF, or SEQ ID NO:, minus 1.
  • Preferred sizes of contiguous nucleotide fragments include 20 nucleotides, 30 nucleotides, 40 nucleotides, and 50 nucleotides.
  • Other preferred sizes of contiguous nucleotide fragments, which may be useful as diagnostic probes and primers include fragments 50-300 nucleotides in length which include, as discussed above, fragment sizes representing each integer between 50-300.
  • the present invention also provides for the exclusion of any fragment, specified by 5' and 3' base positions or by size in nucleotide bases as described above for any ORF ID or SEQ ID NOS: 1-744. Any number of fragments of nucleotide sequences in ORF IDs or SEQ ID NOS: 1-744, specified by 5' and 3' base positions or by size in nucleotides, as described above, may be excluded from the present invention.
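  • The fragment definition above (1-based 5' and 3' positions, a minimum length of 10, a maximum of the full sequence length minus 1) can be sketched as follows. The function names are hypothetical, and the example sequence is made up rather than taken from the Sequence Listing.

```python
MIN_FRAGMENT_LENGTH = 10


def fragment(seq, five_prime, three_prime):
    """Return the fragment between 1-based, inclusive 5' and 3' positions.

    Raises ValueError unless the fragment is at least 10 nucleotides long
    and at most len(seq) - 1 nucleotides long.
    """
    if five_prime < 1 or three_prime > len(seq):
        raise ValueError("positions fall outside the sequence")
    length = three_prime - five_prime + 1
    if not MIN_FRAGMENT_LENGTH <= length <= len(seq) - 1:
        raise ValueError("fragment length must be between 10 and len(seq) - 1")
    return seq[five_prime - 1:three_prime]


def count_fragments(sequence_length):
    """Count every (5', 3') pair that yields an allowed fragment.

    For each allowed size k there are sequence_length - k + 1 placements.
    """
    return sum(sequence_length - size + 1
               for size in range(MIN_FRAGMENT_LENGTH, sequence_length))
```

  • For a 12-nucleotide sequence this gives three fragments of length 10 and two of length 11, five in total; a sequence of 10 nucleotides or fewer has none.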
  • While the presently disclosed sequences of SEQ ID NOS: 1-744 are highly accurate, sequencing techniques are not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS: 1-744. However, once the present invention is made available (i.e., once the information in SEQ ID NOS: 1-744 and Tables 1-3 has been made available), resolving a rare sequencing error in SEQ ID NOS: 1-744 will be well within the skill of the art. The present disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to be obtained readily by straightforward application of routine techniques.
  • polynucleotides of the present invention readily may be obtained by routine application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining libraries and for sequencing are provided below, for instance.
  • A wide variety of T. pallidum strains, which are known in the art, can be used to prepare T. pallidum genomic DNA for cloning and for obtaining polynucleotides of the present invention.
  • Nucleotide sequences of the genomes from different strains of T. pallidum differ somewhat. However, the nucleotide sequences of the genomes of all T. pallidum strains will be at least 95% identical, in corresponding part, to the nucleotide sequences provided in SEQ ID NOS: 1-744 and the ORF IDs and ORFs within. Nearly all will be at least 99% identical and the great majority will be 99.9% identical.
  • the present application is further directed to nucleic acid molecules at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleic acid sequence shown in SEQ ID NOS: 1-744, the ORF IDs and ORFs within.
  • The above nucleic acid sequences are included irrespective of whether they encode a polypeptide having T. pallidum activity. This is because even where a particular nucleic acid molecule does not encode a polypeptide having T. pallidum activity, one of skill in the art would still know how to use the nucleic acid molecule, for instance, as a hybridization probe. Uses of the nucleic acid molecules of the present invention that do not encode a polypeptide having T. pallidum activity include, inter alia, isolating a T. pallidum gene or allelic variants thereof from a DNA library, and detecting T. pallidum mRNA expression in samples, such as environmental samples, suspected of containing T. pallidum, by Northern blot, PCR, or similar analysis.
  • nucleic acid molecules having sequences at least 90%, 95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in SEQ ID NOS: 1-744, the ORF IDs, and the ORF within each ORF ID, which do, in fact, encode a polypeptide having T. pallidum protein activity
  • By "a polypeptide having T. pallidum activity" is intended polypeptides exhibiting activity similar, but not necessarily identical, to an activity of the T. pallidum protein of the invention, as measured in a particular biological assay suitable for measuring activity of the specified protein.
  • nucleic acid molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99% identical to the nucleic acid sequences shown in SEQ ID NOS: 1-744, the ORF IDs, and the ORF within each ORF ID will encode a polypeptide having T. pallidum protein activity.
  • Since degenerate variants of these nucleotide sequences all encode the same polypeptide, this will be clear to the skilled artisan even without performing the above described comparison assay.
  • The biological activity or function of the polypeptides of the present invention is expected to be similar or identical to that of polypeptides from other bacteria that share a high degree of structural identity/similarity.
  • Tables 1-3 list accession numbers and descriptions for the closest matching sequences of polypeptides available through GenBank. It is therefore expected that the biological activity or function of the polypeptides of the present invention will be similar or identical to those polypeptides from other bacterial genera, species, or strains listed in Tables 1-3.
  • For a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the T. pallidum polypeptide.
  • In other words, to obtain a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted, inserted, or substituted with another nucleotide.
  • the query sequence may be an entire sequence shown in SEQ ID NOS: 1-744, the ORF IDs, or the ORF within each ORF ID, or any fragment specified as described herein.
  • Whether any particular nucleic acid molecule or polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs.
  • A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, uses the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. 6:237-245 (1990)).
  • In a sequence alignment, the query and subject sequences are both DNA sequences.
  • An RNA sequence can be compared by first converting U's to T's.
  • the result of said global sequence alignment is in percent identity.
  • the percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention.
  • Only nucleotides outside the 5' and 3' nucleotides of the subject sequence are calculated for the purposes of manually adjusting the percent identity score. For example, a 90 nucleotide subject sequence is aligned to a 100 nucleotide query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore the FASTDB alignment does not show a match/alignment of the first 10 nucleotides at the 5' end.
  • the 10 unpaired nucleotides represent 10% of the sequence (number of nucleotides at the 5' and 3' ends not matched/total number of nucleotides in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 nucleotides were perfectly matched the final percent identity would be 90%.
  • In another example, a 90 nucleotide subject sequence is compared with a 100 nucleotide query sequence. This time the deletions are internal, so that there are no nucleotides 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected.
  • Only nucleotides 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.
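  • The U-to-T conversion and the manual end correction described above can be sketched as follows. The alignment itself is not reproduced here: the percent identity reported by the alignment program and the counts of unmatched query bases at each end are taken as inputs, and the function and parameter names are hypothetical.

```python
def rna_to_dna(sequence):
    """Convert an RNA sequence to DNA letters by replacing U with T."""
    return sequence.replace("U", "T").replace("u", "t")


def corrected_percent_identity(alignment_percent_identity, query_length,
                               unmatched_5prime=0, unmatched_3prime=0):
    """Apply the end correction to a percent identity score.

    The query bases lying 5' and 3' of the subject sequence that are not
    matched/aligned are expressed as a percentage of the total query length,
    and that percentage is subtracted from the reported score. Internal
    gaps are not corrected for, so they are not an input here.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    penalty = 100.0 * (unmatched_5prime + unmatched_3prime) / query_length
    return alignment_percent_identity - penalty
```

  • For the first worked example in the text (a 100 nucleotide query, 10 unmatched nucleotides at the 5' end, the remaining 90 perfectly matched), `corrected_percent_identity(100.0, 100, unmatched_5prime=10)` gives 90.0. With no unmatched end nucleotides the score is returned unchanged, as in the second example.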
  • The nucleotide sequences provided in SEQ ID NOS: 1-744 including ORF IDs and corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to said polynucleotide sequences may be "provided" in a variety of mediums to facilitate use thereof.
  • "Provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention. Such a manufacture provides a large portion of the T. pallidum genome and parts thereof.
  • a nucleotide sequence of the present invention can be recorded on computer readable media.
  • "Computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media.
  • a variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention.
  • the choice of the data storage structure will generally be based on the means chosen to access the stored information.
  • a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium.
  • The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like.
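  • As one illustration of the ASCII file option, the sketch below writes named sequences to a plain ASCII text file and reads them back. The FASTA-style layout (a `>` header line followed by wrapped sequence lines) is an assumption; the text only says "ASCII file". The function names, record names and line width are hypothetical.

```python
import os
import tempfile


def write_fasta(path, records, width=60):
    """Write (name, sequence) pairs to an ASCII file in FASTA-style layout."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records:
            handle.write(f">{name}\n")
            # Wrap the sequence so that no line exceeds `width` characters.
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read (name, sequence) pairs back from a FASTA-style ASCII file."""
    records, name, chunks = [], None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def round_trip(records):
    """Write records to a temporary file, read them back, then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".fasta")
    os.close(fd)
    try:
        write_fasta(path, records)
        return read_fasta(path)
    finally:
        os.remove(path)
```

  • A plain text layout of this kind keeps the data readable by any of the storage media and programs listed above; a database application would be the more natural choice when the sequences need to be queried alongside their annotations.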
  • nucleotide sequence information of the present invention.
  • Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium.
  • By providing the nucleotide sequences of SEQ ID NOS: 1-744 including ORF IDs and corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to said polynucleotide sequences, the present invention enables the skilled artisan routinely to access the provided sequence information for a wide variety of purposes.
  • the present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify, among other things, commercially important fragments of the T. pallidum genome.
  • a computer-based system refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention.
  • the minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means.
  • any one of the currently available computer-based systems is suitable for use in the present invention.
  • the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means.
  • data storage means refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.
  • search means refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the present genomic sequences which match a particular target sequence or target motif.
  • a variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).
  • a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids.
  • a skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database.
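  • as a rough illustration of why longer targets are less likely to occur by chance, the sketch below estimates the expected number of random matches of one fixed target in a database of a given size. It assumes independent, equally likely residues, which real genomes do not satisfy; the function name and the database size are hypothetical and do not come from the present disclosure.

```python
def expected_random_hits(target_len: int, db_len: int, alphabet_size: int = 4) -> float:
    """Expected number of chance occurrences of one fixed target sequence.

    Assumption: every residue is independent and equally likely, so a given
    target of length k matches at any one position with probability
    alphabet_size ** -k.
    """
    if target_len < 1 or db_len < target_len:
        return 0.0
    positions = db_len - target_len + 1  # number of possible start positions
    return positions / alphabet_size ** target_len


if __name__ == "__main__":
    db_len = 1_000_000  # hypothetical database size in nucleotides
    for k in (6, 12, 30):
        print(k, expected_random_hits(k, db_len))
```

  Under these assumptions the expectation falls by a factor of four for each added nucleotide (alphabet_size=4) or by a factor of twenty for each added amino acid (alphabet_size=20).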
  • the most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues.
  • searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing may be of shorter length.
  • a target structural motif refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif.
  • target motifs include, but are not limited to, enzymic active sites and signal sequences.
  • Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
  • a variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.
  • a preferred format for an output means ranks fragments of the T. pallidum genomic sequences possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.
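  • the ranked output described above can be sketched as follows. This is not BLAST or any program named herein: it is a deliberately simple ungapped sliding-window comparison, assumed here only for illustration, that scores every window of a sequence by percent identity to the target and lists the best-scoring fragments first. The function name and the toy sequence are hypothetical.

```python
def rank_fragments(genome: str, target: str, top: int = 3):
    """Rank every ungapped window of `genome` by percent identity to `target`.

    Returns a list of (percent_identity, start_position, fragment) tuples,
    best first, where start_position is 1-based.
    """
    k = len(target)
    if k == 0 or len(genome) < k:
        return []
    scored = []
    for start in range(len(genome) - k + 1):
        window = genome[start:start + k]
        matches = sum(1 for a, b in zip(window, target) if a == b)
        scored.append((100.0 * matches / k, start + 1, window))
    # Highest identity first; ties are broken by position in the sequence.
    scored.sort(key=lambda hit: (-hit[0], hit[1]))
    return scored[:top]


if __name__ == "__main__":
    toy_genome = "ATGCGTACGTTAGCATGCGAACGT"  # made-up sequence, not from SEQ ID NOS: 1-744
    for identity, start, fragment in rank_fragments(toy_genome, "ATGCGAAC"):
        print(f"{identity:5.1f}%  position {start:2d}  {fragment}")
```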
  • comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the T. pallidum genome.
  • software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410 (1990), is used to identify open reading frames within the T. pallidum genome.
  • any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known to those of skill also may be employed in this regard.
  • FIG. 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present invention.
  • the computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114.
  • the removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc.
  • a removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114.
  • the computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116, once it is inserted into the removable medium storage device 114.
  • a nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116.
  • software for accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the requirements and operating parameters of the operating system, the hardware system and the software program or programs.
  • embodiments of the present invention are directed to isolated fragments of the T. pallidum genome.
  • the fragments of the T. pallidum genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs); fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs); and fragments which can be used to diagnose the presence of T. pallidum in a sample, hereinafter diagnostic fragments (DFs).
  • an "isolated nucleic acid molecule" or an "isolated fragment of the T. pallidum genome" refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification means to reduce, from the composition, the number of compounds which are normally associated with the composition.
  • the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-744, to representative fragments thereof as described above including ORF IDs and ORFs, to polynucleotides at least 95%, preferably at least 96%, 97%, 98%, or 99% and especially preferably at least 99.9% identical in sequence thereto, also as set out above.
  • T. pallidum DNA can be enzymatically sheared to produce fragments of 15-20 kb in length. These fragments can then be used to generate a T. pallidum library by inserting them into lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated in the ORF IDs of Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS: 1-744.
  • the isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and double stranded DNA, and single stranded RNA.
  • the entire sequence of each sequence of SEQ ID NOS: 1-744 is included with the first nucleotide being position 1. Therefore, for reference purposes the numbering used in the present invention is that provided in the sequence listing for SEQ ID NOS: 1-744.
  • an open reading frame means a series of nucleotide triplets coding for amino acid residues without any termination codons and is a sequence translatable into protein. Further, unless specified, the term "ORF" for each ORF ID is defined by the termination codon at the 3' end and the 5' most methionine codon, at the 5' end, in frame with said 3' termination codon.
  • ORF also refers to a particular polypeptide sequence defined by the ORF polynucleotide sequence, wherein the N-terminus is defined by the 5' most methionine codon in frame with the termination codon at the 3' end of the ORF ID and the C-terminus is defined by the last codon before the said 3' termination codon.
  • an ORF ID represents a sequence without any internal termination codons flanked by termination codons.
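  • the two definitions above (an ORF ID as a stop-to-stop stretch, and an ORF as the part of it running from the 5' most methionine codon to the 3' termination codon) can be sketched in a few lines. The sketch assumes the standard termination codons TAA, TAG and TGA, treats ATG as the methionine codon, and scans only the three frames of the strand it is given; it is not the GeneMark procedure referred to herein, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard termination codons


def find_orfs(seq: str, min_codons: int = 1):
    """Yield (frame, orf_id_start, orf_start, stop_start), all 1-based.

    orf_id_start: first base after the upstream in-frame stop codon
    orf_start:    first base of the 5'-most ATG within that stretch
    stop_start:   first base of the downstream in-frame stop codon
    Only stretches flanked by stop codons on both sides are reported.
    """
    seq = seq.upper()
    for frame in range(3):
        prev_stop = None  # index of the most recent stop codon in this frame
        first_atg = None  # index of the 5'-most ATG seen since that stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                if prev_stop is not None and first_atg is not None:
                    if (i - first_atg) // 3 >= min_codons:
                        yield (frame + 1, prev_stop + 4, first_atg + 1, i + 1)
                prev_stop, first_atg = i, None
            elif codon == "ATG" and first_atg is None and prev_stop is not None:
                first_atg = i


if __name__ == "__main__":
    toy = "TAACCCATGAAAATGGGGTGA"  # made-up sequence
    for hit in find_orfs(toy):
        print(hit)
```

  To cover the opposite strand, the same function could be applied to the reverse complement of the sequence, with the reported positions mapped back accordingly.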
  • Tables 1, 2, and 3 list ORF IDs in the T. pallidum genomic contigs of the present invention that were identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical methods, such as those discussed herein, to generate more inclusive, more restrictive, or more selective lists.
  • Table 1 sets out ORF IDs in the T. pallidum contigs of the present invention that over a continuous region of at least 50 bases are 95% or more identical (by BLAST analysis) to a nucleotide sequence available through GenBank in June, 1997.
  • Table 2 sets out ORF IDs in the T. pallidum contigs of the present invention that are not in Table 1 and match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through GenBank in July, 1996.
  • Table 3 sets out ORF IDs in the T. pallidum contigs of the present invention that do not match significantly, by BLASTP analysis, a polypeptide sequence available through GenBank in July, 1996.
  • the first and second columns identify the ORF ID by, respectively, contig number and ORF ID number within the contig; the third column indicates the first nucleotide of the ORF ID, counting from the 5' end of the contig strand; and the fourth column indicates the last nucleotide of the ORF ID, counting from the 5' end of the contig strand.
  • in Tables 1 and 2, column six lists the Reference for the closest matching sequence available through GenBank. These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be familiar with their denominators. Descriptions of the nomenclature are available from the National Center for Biotechnology Information. Column seven in Tables 1 and 2 provides the gene name of the matching sequence; column eight provides the BLAST identity score from the comparison of the ORF and the homologous gene; and column nine indicates the length in nucleotides of the highest scoring segment pair identified by the BLAST identity analysis.
  • Tables 1 and 2 herein enumerate the percent identity of the highest scoring segment pair in each ORF and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided below and are described in the pertinent literature highlighted by the citations provided below.
  • an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates the expression of an operably linked ORF or EMF.
  • EMFs include, but are not limited to, promoters, and promoter modulating sequences (inducible elements).
  • One class of EMFs are fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.
  • EMF sequences can be identified within the contigs of the T. pallidum genome by their proximity to the ORF IDs provided in Tables 1-3 and ORFs within each ORF ID.
  • An intergenic segment, or a fragment of the intergenic segment, from about 10 to 200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably linked ORF in a fashion similar to that found with the naturally linked ORF sequence.
  • an "intergenic segment" refers to fragments of the T. pallidum genome which are between two ORFs herein described.
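  • given the first and last nucleotide positions of each ORF on a contig, candidate intergenic segments can be located as the gaps between consecutive ORFs. The sketch below is one way this might be done; the function name, the minimum length of 10 nucleotides (taken from the lower bound mentioned above), and the handling of ORFs listed end-first are assumptions made for illustration.

```python
def intergenic_segments(orfs, min_len: int = 10):
    """Return (start, end) gaps, 1-based inclusive, between consecutive ORFs.

    `orfs` is an iterable of (first_nucleotide, last_nucleotide) pairs on one
    contig. Pairs are normalised so start <= end, on the assumption that an
    ORF on the opposite strand may be listed with its larger coordinate first.
    """
    spans = sorted((min(a, b), max(a, b)) for a, b in orfs)
    gaps = []
    covered_to = None  # right-most position covered by any ORF seen so far
    for start, end in spans:
        if covered_to is not None and start - covered_to - 1 >= min_len:
            gaps.append((covered_to + 1, start - 1))
        covered_to = end if covered_to is None else max(covered_to, end)
    return gaps


if __name__ == "__main__":
    toy_orfs = [(100, 400), (900, 450), (905, 1200)]  # made-up coordinates
    print(intergenic_segments(toy_orfs))
```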
  • EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems of the present invention. Further, the two methods can be combined and used together.
  • An EMF trap vector contains a cloning site linked to a marker sequence.
  • a marker sequence encodes an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap vector is placed within an appropriate host under appropriate conditions.
  • an EMF will modulate the expression of an operably linked marker sequence.
  • a sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction sites upstream from the marker sequence in the EMF trap vector.
  • the vector is then transformed into an appropriate host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
  • an EMF will modulate the expression of an operably linked marker sequence.
  • a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize to T. pallidum sequences. DFs can be readily identified by identifying unique sequences within contigs of the T. pallidum genome, such as by using well-known computer analysis software, and by generating and testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines amplification or hybridization selectivity.
  • the sequences falling within the scope of the present invention are not limited to the specific sequences herein described, but also include allelic and species variations thereof.
  • allelic and species variations can be routinely determined by comparing the polynucleotide sequences provided in SEQ ID NOS: 1-744, ORF IDs and ORFs within, a representative fragment thereof, or a nucleotide sequence at least 99% and preferably 99.9% identical to said polynucleotide sequences, with a sequence from another isolate of the same species.
  • the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated.
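  • the codon substitution contemplated above can be checked mechanically: two coding sequences are degenerate variants of one another if they differ in nucleotide sequence yet translate to the same amino acid sequence. The sketch below assumes the standard genetic code; the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence; '*' marks a termination codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(a: str, b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return a.upper() != b.upper() and translate(a) == translate(b)


if __name__ == "__main__":
    original = "ATGCTGAAATAA"  # made-up coding sequence
    variant = "ATGTTAAAGTGA"   # synonymous codons substituted throughout
    print(translate(original), translate(variant))
    print(is_degenerate_variant(original, variant))
```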
  • any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, such as an ORF, in both directions (i.e., sequence both strands).
  • error screening can be performed by sequencing corresponding polynucleotides of T. pallidum origin isolated by using part or all of the fragments in question as a probe or primer.
  • Each of the ORFs of the T. pallidum genome within the ORF IDs of Tables 1, 2 and 3, and the EMFs found 5' to the ORFs can be used as polynucleotide reagents in numerous ways.
  • the sequences can be used as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample, particularly T. pallidum.
  • ORFs such as those of Table 3, which do not match previously characterized sequences from other organisms and thus are most likely to be highly selective for T. pallidum. Also particularly preferred are ORFs that can be used to distinguish between strains of T. pallidum, particularly those that distinguish medically important strains, such as drug-resistant strains.
  • fragments of the present invention can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA.
  • Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide.
  • Information from the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
  • Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition. Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 3:113 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al,
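  • for the antisense case, the complementarity requirement amounts to taking the reverse complement of the targeted stretch of the coding strand, since the mRNA carries the same sequence as the coding strand (with U in place of T). The sketch below is illustrative only: the function names and the example sequence are hypothetical, the 20 to 40 base limits are those stated above, and no account is taken of secondary structure, melting temperature or specificity, which a practical design would need to consider.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int = 20) -> str:
    """Antisense DNA oligonucleotide against coding-strand positions
    start .. start + length - 1 (1-based), written 5' to 3'."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    if start < 1 or start - 1 + length > len(coding_strand):
        raise ValueError("target region falls outside the sequence")
    target = coding_strand[start - 1:start - 1 + length]
    return reverse_complement(target)


if __name__ == "__main__":
    toy_gene = "ATGGCTAGCAAGGTTCCATGACCGT"  # made-up coding strand
    print(antisense_oligo(toy_gene, start=1, length=20))
```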
  • the present invention further provides recombinant constructs comprising one or more fragments of the T. pallidum genomic fragments and contigs of the present invention.
  • Certain preferred recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the T. pallidum genome has been inserted, in a forward or reverse orientation.
  • the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF.
  • the vector may further comprise a marker sequence or heterologous ORF operably linked to the EMF.
  • Useful bacterial vectors include phagescript, PsiX174, pBluescript SK, pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia).
  • Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).
  • Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers.
  • Two appropriate vectors are pKK232-8 and pCM7.
  • Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc.
  • Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
  • the present invention further provides host cells containing any one of the isolated fragments of the T. pallidum genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the host cell using known methods.
  • the host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or a prokaryotic cell, such as a bacterial cell.
  • a polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in, for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).
  • a host cell containing one of the fragments of the T. pallidum genomic fragments and contigs of the present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.
  • the present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention.
  • by "degenerate variant" is intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.
  • Preferred nucleic acid fragments of the present invention are the ORF IDs depicted in Tables 2 and 3 and the ORFs within which encode proteins.
  • the amino acid sequence can be synthesized using commercially available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides. Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies against the native polypeptide, as discussed further below.
  • the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein.
  • polypeptides and proteins of the present invention also can be purified from cells which have been altered to express the desired polypeptide or protein.
  • a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level.
  • Those skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention.
  • the polypeptides of the present invention are preferably provided in an isolated form, and preferably are substantially purified. A recombinantly produced version of the T. pallidum polypeptide can be substantially purified by the one-step method described by Smith et al. (1988) Gene 67:31-40.
  • Polypeptides of the invention also can be purified from natural or recombinant sources using antibodies directed against the polypeptides of the invention in methods which are well known in the art of protein purification.
  • the invention further provides for isolated T. pallidum polypeptides comprising an amino acid sequence selected from the group including: (a) the amino acid sequence of a full-length T. pallidum polypeptide having the complete amino acid sequence from the first methionine codon to the termination codon of each sequence listed in SEQ ID NOS: 1-744, wherein said termination codon is at the end of each SEQ ID NO: and said first methionine is the first methionine in frame with said termination codon; and (b) the amino acid sequence of a full-length T. pallidum polypeptide having the complete amino acid sequence in (a) excepting the N-terminal methionine.
  • polypeptides of the present invention also include polypeptides having an amino acid sequence at least 80% identical, more preferably at least 90% identical, and still more preferably 95%, 96%, 97%, 98% or 99% identical to those described in (a) and (b) above.
  • the present invention is further directed to polynucleotides encoding portions or fragments of the amino acid sequences described herein as well as to portions or fragments of the isolated amino acid sequences described herein. Fragments include portions of the amino acid sequences described herein at least 5 contiguous amino acids in length and selected from any two integers, one of which represents an N-terminal position and the other a C-terminal position.
  • the initiation codon of the ORFs of the present invention is position 1.
  • the initiation codon (position 1) for purposes of the present invention is the first methionine codon of each ORF ID which is in frame with the termination codon at the end of each said sequence.
  • Every combination of an N-terminal and C-terminal position that a fragment at least 5 contiguous amino acid residues in length could occupy on any given ORF is included in the invention, i.e., from initiation codon up to the termination codon. "At least" means a fragment may be 5 contiguous amino acid residues in length or any integer between 5 and the number of residues in an ORF, minus 1. Therefore, included in the invention are contiguous fragments specified by any N-terminal and C-terminal positions of amino acid sequence set forth in SEQ ID NOS: 1-744 or
  • the invention includes polypeptides comprising fragments specified by size, in amino acid residues, rather than by N-terminal and C-terminal positions.
  • the invention includes any fragment size, in contiguous amino acid residues, selected from integers between 5 and the number of residues in an ORF, minus 1.
  • Preferred sizes of contiguous polypeptide fragments include about 5 amino acid residues, about 10 amino acid residues, about 20 amino acid residues, about 30 amino acid residues, about 40 amino acid residues, about 50 amino acid residues, about 100 amino acid residues, about 200 amino acid residues, about 300 amino acid residues, and about 400 amino acid residues.
  • the preferred sizes are, of course, meant to exemplify, not limit, the present invention as all size fragments representing any integer between 5 and the number of residues in a full length sequence minus 1 are included in the invention.
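  • to make the scope of "any fragment" concrete, the sketch below enumerates every contiguous fragment of a polypeptide by N-terminal and C-terminal position, for sizes from 5 residues up to the full length minus 1, and counts them. The function names and the example peptide are hypothetical.

```python
def count_fragments(orf_len: int, min_len: int = 5) -> int:
    """Number of contiguous fragments with size min_len .. orf_len - 1."""
    return sum(orf_len - size + 1 for size in range(min_len, orf_len))


def fragments(peptide: str, min_len: int = 5):
    """Yield (n_terminal_pos, c_terminal_pos, fragment), positions 1-based."""
    n = len(peptide)
    for size in range(min_len, n):          # sizes 5 .. n - 1
        for start in range(n - size + 1):   # every possible N-terminal position
            yield (start + 1, start + size, peptide[start:start + size])


if __name__ == "__main__":
    toy_peptide = "MKTLLAG"  # made-up 7-residue sequence
    for n_pos, c_pos, frag in fragments(toy_peptide):
        print(n_pos, c_pos, frag)
    print(count_fragments(len(toy_peptide)))
```

  The count grows roughly with the square of the polypeptide length, which is why fragments are specified here by position or size rather than listed individually.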
  • the present invention also provides for the exclusion of any fragments specified by N-terminal and C-terminal positions or by size in amino acid residues as described above. Any number of fragments specified by N-terminal and C-terminal positions or by size in amino acid residues as described above may be excluded.
  • the above fragments need not be active since they would be useful, for example, in immunoassays, in epitope mapping, epitope tagging, to generate antibodies to a particular portion of the protein, as vaccines, and as molecular weight markers.
  • polypeptides of the present invention include polypeptides which have at least 90% similarity, more preferably at least 95% similarity, and still more preferably at least 96%, 97%, 98% or 99% similarity to those described above.
  • a further embodiment of the invention relates to a polypeptide which comprises the amino acid sequence of a T. pallidum polypeptide having an amino acid sequence which contains at least one conservative amino acid substitution, but not more than 50 conservative amino acid substitutions, not more than 40 conservative amino acid substitutions, not more than 30 conservative amino acid substitutions, and not more than 20 conservative amino acid substitutions. Also provided are polypeptides which comprise the amino acid sequence of a T. pallidum polypeptide, having at least one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 conservative amino acid substitutions.
  • a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
  • the amino acid sequence of the subject polypeptide may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
  • up to 5% of the amino acid residues in the subject sequence may be inserted, deleted (indels), or substituted with another amino acid.
  • These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.
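  • the "up to five alterations per each 100 amino acids" reading of 95% identity can be turned into a small calculation of the largest whole number of alterations permitted for a query of a given length. The sketch below is an assumption about how one might compute this; the function name is hypothetical, and the percentage is passed as a string so that values such as 99.9 are handled exactly.

```python
from fractions import Fraction


def max_alterations(query_len: int, percent_identity: str) -> int:
    """Largest whole number of alterations compatible with the stated identity.

    `percent_identity` is given as a string (e.g. "95" or "99.9") so that the
    arithmetic is exact rather than subject to floating-point rounding.
    """
    allowed = (1 - Fraction(percent_identity) / 100) * query_len
    return int(allowed)  # truncation equals floor for non-negative values


if __name__ == "__main__":
    for pct in ("90", "95", "99", "99.9"):
        print(pct, max_alterations(300, pct))
```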
  • whether any particular polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to the ORF amino acid sequences encoded by the sequences of SEQ ID NOS: 1-744, as described herein, can be determined conventionally using known computer programs.
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, uses the FASTDB computer program based on the algorithm of Brutlag et al., (1990) Comp. App. Biosci. 6:237-245.
  • in a sequence alignment the query and subject sequences are both amino acid sequences.
  • the result of said global sequence alignment is in percent identity.
  • Preferred parameters used in a FASTDB amino acid alignment are:
  • the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention.
  • a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity.
  • the deletion occurs at the N-terminus of the subject sequence and therefore, the FASTDB alignment does not match/align with the first 10 residues at the N-terminus.
  • the 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C- termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%.
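  • the manual correction described above is simple arithmetic and can be sketched as a one-line function. The function and parameter names are hypothetical; the FASTDB percent identity itself is taken as an input, since the program's own calculation is not reproduced here.

```python
def corrected_percent_identity(fastdb_identity: float,
                               unmatched_terminal: int,
                               query_len: int) -> float:
    """Subtract the share of unmatched N- and C-terminal query residues.

    fastdb_identity:    percent identity reported by the alignment program
    unmatched_terminal: query residues N- and C-terminal of the subject that
                        are not matched/aligned with a subject residue
    query_len:          total number of residues in the query sequence
    """
    if query_len <= 0:
        raise ValueError("query_len must be positive")
    penalty = 100.0 * unmatched_terminal / query_len
    return fastdb_identity - penalty


if __name__ == "__main__":
    # The worked example above: 90-residue subject, 100-residue query,
    # 10 unpaired N-terminal residues, remaining 90 perfectly matched.
    print(corrected_percent_identity(100.0, 10, 100))
```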
  • a 90 residue subject sequence is compared with a 100 residue query sequence.
  • uses of polypeptides of the present invention that do not have T. pallidum activity include, inter alia, use as epitope tags, in epitope mapping, and as molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration columns using methods known to those of skill in the art.
  • polypeptides of the present invention can also be used to raise polyclonal and monoclonal antibodies, which are useful in assays for detecting T. pallidum protein expression or as agonists and antagonists capable of enhancing or inhibiting T. pallidum protein function.
  • polypeptides can be used in the yeast two-hybrid system to "capture" T. pallidum protein binding proteins which are also candidate agonists and antagonists according to the present invention. See, e.g., Fields et al. (1989) Nature 340:245-246.
  • Any host/vector system can be used to express one or more of the ORFs of the present invention.
  • These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
  • the most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level.
  • Recombinant means that a polypeptide or protein is derived from recombinant (e.g., microbial or mammalian) expression systems.
  • Microbial refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems.
  • recombinant microbial defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.
  • Nucleotide sequence refers to a heteropolymer of deoxyribonucleotides.
  • DNA segments encoding the polypeptides and proteins provided by this invention are assembled from fragments of the T. pallidum genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon.
  • Recombinant expression vehicle or vector refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence.
  • the expression vehicle can comprise a transcriptional unit comprising an assembly of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the beginning of the desired coding region and terminate translation at its end.
  • Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell.
  • recombinant protein may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.
  • "Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
  • Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention.
  • Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby incorporated by reference in its entirety.
  • recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S.
  • heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium.
  • the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
  • Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter.
  • the vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, when desirable, provide amplification within the host.
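The requirement that a structural sequence be assembled "in appropriate phase" with initiation and termination signals can be illustrated with a small check. This is only a sketch under simplifying assumptions that are not from the text: the function name `check_reading_phase` is hypothetical, ATG is treated as the only initiation codon (bacteria also use alternatives such as GTG and TTG), and the standard stop codons TAA, TAG and TGA are assumed.

```python
from typing import List

# Standard termination codons (assumption: standard genetic code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_reading_phase(cds: str) -> List[str]:
    """Return a list of problems found in a candidate coding sequence.

    An empty list means the sequence starts with ATG, ends with a stop
    codon, has a length divisible by three, and has no internal stop.
    """
    seq = cds.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    if not seq.startswith("ATG"):
        problems.append("does not begin with an ATG initiation codon")
    # Split into complete codons only; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a termination codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an internal termination codon")
    return problems
```

A construct that returns an empty list is in frame in this limited sense; the check says nothing about promoter placement, leader sequences, or expression level.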
  • Suitable prokaryotic hosts for transformation include strains of E. coli, B. subtilis, Salmonella typhimurium and various species within the genera Pseudomonas and Streptomyces. Others may also be employed as a matter of choice.
  • useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017).
  • Such commercial vectors include, for example, pKK223-3 (available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
  • the selected promoter where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product. Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally by physical or chemical means, and the resulting crude extract is retained for further purification.
  • mammalian cell culture systems can also be employed to express recombinant protein.
  • mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:115 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.
  • Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences.
  • DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites, may be used to provide the required nontranscribed genetic elements.
  • Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
  • Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.
  • the present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are substantially equivalent to those herein described.
  • substantially equivalent can refer both to nucleic acid and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between reference and subject sequences.
  • sequences having equivalent biological activity, and equivalent expression characteristics are considered substantially equivalent.
  • truncation of the mature sequence should be disregarded.
  • the invention further provides methods of obtaining homologs from other strains of T. pallidum, of the fragments of the T. pallidum genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
  • a sequence or protein of T. pallidum is defined as a homolog of one of the T. pallidum fragments or contigs or of a protein encoded by one of the ORFs of the present invention, if it shares significant homology to one of the fragments of the T. pallidum genome of the present invention or a protein encoded by one of the ORFs of the present invention.
  • Using a sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
  • two nucleic acid molecules or proteins are said to "share significant homology" if the two contain regions which possess greater than 85% sequence (amino acid or nucleic acid) homology.
  • Preferred homologs in this regard are those with more than 90% homology.
  • Especially preferred are those with 93% or more homology.
  • those with 95% or more homology are particularly preferred.
  • Very particularly preferred among these are those with 97% and even more particularly preferred among those are homologs with 99% or more homology.
  • the most preferred homologs among these are those with 99.9% homology or more. It will be understood that, among measures of homology, identity is particularly preferred in this regard.
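The homology tiers above can be expressed as a simple classification over percent identity, which the text names as the preferred measure of homology. The sketch below is illustrative only: the function names are hypothetical, the input is assumed to be two sequences that have already been aligned to equal length (with `-` marking gaps), and dividing by the full alignment length is one of several possible conventions.

```python
from typing import Optional


def percent_identity(a: str, b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    # A position counts as a match only if both residues agree and are not gaps.
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-")
    return 100.0 * matches / len(a)


# (threshold, inclusive, label), following the wording of the text:
# "greater than 85%", "more than 90%", "93% or more", and so on.
TIERS = [
    (99.9, True, "most preferred"),
    (99.0, True, "even more particularly preferred"),
    (97.0, True, "very particularly preferred"),
    (95.0, True, "particularly preferred"),
    (93.0, True, "especially preferred"),
    (90.0, False, "preferred"),
    (85.0, False, "significant homology"),
]


def homology_tier(pct: float) -> Optional[str]:
    """Return the highest tier satisfied by pct, or None below the 85% floor."""
    for threshold, inclusive, label in TIERS:
        if pct > threshold or (inclusive and pct == threshold):
            return label
    return None
```

Treating the 97% tier as inclusive is an assumption, since the text states that threshold without "or more".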
  • Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS: 1-744 or from a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ ID NOS: 1-744 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have been described in great detail in many publications such as, for example, Innis et al, PCR Protocols, Academic Press, San Diego, CA (1990).
  • When using primers derived from SEQ ID NOS: 1-744 or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-744, one skilled in the art will recognize that by employing high stringency conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to the primer will be amplified.
  • When using DNA probes derived from SEQ ID NOS: 1-744, or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-744, for colony/plaque hybridization, one skilled in the art will recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45% homologous to the probe will be obtained.
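The two probe hybridization regimes described above can be captured as a small lookup, which may help when choosing conditions for a target level of homology. This is a sketch, not a protocol: the class and function names are hypothetical, the condition strings simply restate the text, and taking 35% as the floor for the lower stringency regime (the text gives a range of 35-45%) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    name: str
    hybridization: str
    wash: str
    min_homology_pct: float  # sequences above this floor are expected to be recovered


# Ordered from most to least stringent; values restate the text.
CONDITIONS = (
    Stringency("high", "50-65 C, 5X SSPC, 50% formamide", "50-65 C, 0.5X SSPC", 90.0),
    Stringency("lower", "35-37 C, 5X SSPC, 40-45% formamide", "42 C, 0.5X SSPC", 35.0),
)


def conditions_for(target_homology_pct: float) -> Stringency:
    """Return the most stringent regime expected to recover the target homology."""
    for condition in CONDITIONS:
        if target_homology_pct > condition.min_homology_pct:
            return condition
    raise ValueError("target homology is below the floor of every listed regime")
```

For example, a homolog expected to be about 60% homologous to the probe would fall under the lower stringency regime, while one expected to be 95% homologous could be sought under high stringency.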
  • Any organism can be used as the source for homologs of the present invention so long as the organism naturally expresses such a protein or contains genes encoding the same.
  • the most preferred organisms for isolating homologs are bacteria which are closely related to T. pallidum.
  • Each ORF corresponding to the ORF IDs provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
  • polypeptides of the present invention for commercial, therapeutic and industrial purposes consistent with the type of putative identification of the polypeptide.
  • identifications permit one skilled in the art to use the T. pallidum ORFs in a manner similar to the known type of sequences for which the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite.
  • Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis.
  • The various metabolic pathways present in T. pallidum can be identified based on absolute nutritional requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS: 1-744. Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.
  • Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to give unicellular fruits.
  • a detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts et al, Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural Biotechnology, Whitaker et al, Eds., American Chemical Society Symposium Series 389:93 (1989) .
  • the metabolism of sugars is an important aspect of the primary metabolism of T. pallidum.
  • Enzymes involved in the degradation of sugars can be used in industrial fermentation.
  • sugars such as, particularly, glucose, galactose, fructose and xylose
  • Some of the important sugar transforming enzymes include sugar isomerases such as glucose isomerase.
  • Other metabolic enzymes have found commercial use, such as glucose oxidases, which produce ketogulonic acid (KGA).
  • KGA is an intermediate in the commercial production of ascorbic acid using Reichstein's procedure, as described in Krueger et al, Biotechnology 6(A), Rhine et al, Eds., Verlag Press, Weinheim, Germany (1984).
  • Glucose oxidase is commercially available and has been used in purified form as well as in an immobilized form for the deoxygenation of beer. See, for instance, Hartmeir et al, Biotechnology Letters 7:21 (1979). The most important application of glucose oxidase (GOD) is the industrial scale fermentation of gluconic acid. Gluconic acids are marketed for use in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described, for example, in Bigelis et al, beginning on page 357 in GENE MANIPULATIONS AND FUNGI; Benett et al, Eds., Academic Press, New York (1985).
  • Proteinases such as alkaline serine proteinases, are used as detergent additives and thus represent one of the largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See Faultman et al, Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al, Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al, Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).
  • Another class of commercially usable proteins of the present invention is the microbial lipases, described by, for instance, Macrae et al, Philosophical Transactions of the Royal Society of London 310:221 (1985) and Poserke, Journal of the American Oil Chemist Society 67:1758 (1984).
  • a major use of lipases is in the fat and oil industry for the production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides.
  • Applications of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of washing procedures.
  • the following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides, reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming reactions such as the aldol reaction.
  • Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in nucleic acid synthesis, repair, and recombination.
  • proteins of the present invention can be used in a variety of procedures and methods known in the art which are currently applied to other proteins.
  • the proteins of the present invention can further be used to generate an antibody which selectively binds the protein.
  • T. pallidum protein-specific antibodies for use in the present invention can be raised against the intact T. pallidum protein or an antigenic polypeptide fragment thereof, which may be presented together with a carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if it is long enough (at least about 25 amino acids), without a carrier.
  • As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is meant to include intact molecules, single chain whole antibodies, and antibody fragments.
  • Antibody fragments of the present invention include Fab and F(ab')2 and other fragments including single-chain Fvs (scFv) and disulfide-linked Fvs (sdFv). Also included in the present invention are chimeric and humanized monoclonal antibodies and polyclonal antibodies specific for the polypeptides of the present invention.
  • the antibodies of the present invention may be prepared by any of a variety of methods.
  • cells expressing a polypeptide of the present invention or an antigenic fragment thereof can be administered to an animal in order to induce the production of sera containing polyclonal antibodies.
  • a preparation of T. pallidum polypeptide or fragment thereof is prepared and purified to render it substantially free of natural contaminants. Such a preparation is then introduced into an animal in order to produce polyclonal antisera of greater specific activity.
  • the antibodies of the present invention are monoclonal antibodies or binding fragments thereof.
  • Such monoclonal antibodies can be prepared using hybridoma technology. See, e.g., Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling, et al., in: MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS 563-681 (Elsevier, N.Y., 1981).
  • Fab and F(ab')2 fragments may be produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments).
  • T. pallidum polypeptide-binding fragments, chimeric, and humanized antibodies can be produced through the application of recombinant DNA technology or through synthetic chemistry using methods known in the art.
  • additional antibodies capable of binding to the polypeptide antigen of the present invention may be produced in a two-step procedure through the use of anti-idiotypic antibodies.
  • T. pallidum polypeptide-specific antibodies are used to immunize an animal, preferably a mouse.
  • the splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the T. pallidum polypeptide-specific antibody can be blocked by the T. pallidum polypeptide antigen.
  • Such antibodies comprise anti-idiotypic antibodies to the T. pallidum polypeptide-specific antibody and can be used to immunize an animal to induce formation of further T. pallidum polypeptide-specific antibodies.
  • Antibodies and fragments thereof of the present invention may be described by the portion of a polypeptide of the present invention recognized or specifically bound by the antibody.
  • Antibody binding fragments of a polypeptide of the present invention may be described or specified in the same manner as for polypeptide fragments discussed above, i.e., by N-terminal and C-terminal positions or by size in contiguous amino acid residues. Any number of antibody binding fragments of a polypeptide of the present invention, specified by N-terminal and C-terminal positions or by size in amino acid residues, as described above, may also be excluded from the present invention. Therefore, the present invention includes antibodies that specifically bind a particularly described fragment of a polypeptide of the present invention and allows for the exclusion of the same.
  • Antibodies and fragments thereof of the present invention may also be described or specified in terms of their cross-reactivity. Antibodies and fragments that do not bind polypeptides of any other species of Borrelia other than T. pallidum are included in the present invention. Likewise, antibodies and fragments that bind only species of Borrelia, i.e., antibodies and fragments that do not bind bacteria from any genus other than Borrelia, are included in the present invention.
  • Antibodies can be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well known in the art; see, for example, Sternberger et al, J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al, Meth. Enzym. 62:308 (1979); Engval, E. et al, Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976).
  • the labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the T. pallidum genome is expressed.
  • the present invention further provides the above-described antibodies immobilized on a solid support.
  • solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, and acrylic resins such as polyacrylamide and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D. M. et al, "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al, Meth. Enzym. 34 Academic Press, N. Y. (1974)).
  • the immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immunoaffinity purification of the proteins of the present invention.
  • the invention provides peptides and polypeptides comprising epitope-bearing portions of the T. pallidum polypeptides of the present invention.
  • These epitopes are immunogenic or antigenic epitopes of the polypeptides of the present invention.
  • An "immunogenic epitope” is defined as a part of a protein that elicits an antibody response when the whole protein or polypeptide is the immunogen. These immunogenic epitopes are believed to be confined to a few loci on the molecule.
  • an antigenic determinant or "antigenic epitope.”
  • the number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, e.g., Geysen, et al. (1983) Proc. Natl. Acad. Sci. USA 81:3998-4002. Amino acid residues comprising antigenic epitopes may be determined by algorithms such as the Jameson-Wolf analysis or similar algorithms or by in vivo testing for an antigenic response using the methods described herein or those known in the art.
  • peptides or polypeptides bearing an antigenic epitope i.e., that contain a region of a protein molecule to which an antibody can bind
  • relatively short synthetic peptides that mimic part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein. See, e.g., Sutcliffe, et al., (1983) Science 219:660-666.
  • Peptides capable of eliciting protein-reactive sera are frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or carboxyl terminals. Peptides that are extremely hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies that bind to the mimicked protein; longer peptides, especially those containing proline residues, usually are effective. See, Sutcliffe, et al., supra, p. 661.
  • 18 of 20 peptides designed according to these guidelines, containing 8-39 residues covering 75% of the sequence of the influenza virus hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and 18/18 from the rabies glycoprotein induced antibodies that precipitated the respective proteins.
  • Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention.
  • a high proportion of hybridomas obtained by fusion of spleen cells from donors immunized with an antigen epitope-bearing peptide generally secrete antibody reactive with the native protein. See Sutcliffe, et al., supra, p. 663.
  • the antibodies raised by antigenic epitope-bearing peptides or polypeptides are useful to detect the mimicked protein, and antibodies to different peptides may be used for tracking the fate of various regions of a protein precursor which undergoes post-translational processing.
  • the peptides and anti-peptide antibodies may be used in a variety of qualitative or quantitative assays for the mimicked protein, for instance in competition assays since it has been shown that even short peptides (e.g., about 9 amino acids) can bind and displace the larger peptides in immunoprecipitation assays. See, e.g., Wilson, et al., (1984) Cell 37:767-778.
  • the anti-peptide antibodies of the invention also are useful for purification of the mimicked protein, for instance, by adsorption chromatography using methods known in the art.
  • Antigenic epitope-bearing peptides and polypeptides of the invention designed according to the above guidelines preferably contain a sequence of at least seven, more preferably at least nine and most preferably between about 10 to about 50 amino acids (i.e. any integer between 7 and 50) contained within the amino acid sequence of a polypeptide of the invention.
  • peptides or polypeptides comprising a larger portion of an amino acid sequence of a polypeptide of the invention, containing about 50 to about 100 amino acids, or any length up to and including the entire amino acid sequence of a polypeptide of the invention also are considered epitope-bearing peptides or polypeptides of the invention and also are useful for inducing antibodies that react with the mimicked protein.
  • the amino acid sequence of the epitope-bearing peptide is selected to provide substantial solubility in aqueous solvents (i.e., the sequence includes relatively hydrophilic residues and highly hydrophobic sequences are preferably avoided); and sequences containing proline residues are particularly preferred.
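The selection guidelines above (a length of 7 to 50 residues, avoidance of highly hydrophobic sequences, preference for proline) can be sketched as a sliding-window filter. The text does not name a hydrophobicity measure; the Kyte-Doolittle hydropathy scale used here, the cutoff of 0.0, and the function names are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide (raises KeyError on unknown residues)."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def rank_candidates(
    protein: str, length: int = 10, max_hydropathy: float = 0.0
) -> List[Tuple[int, str, float, bool]]:
    """List candidate peptides as (1-based start, sequence, hydropathy, has_proline).

    Windows more hydrophobic than max_hydropathy are dropped; the rest are
    sorted with proline-containing peptides first, then most hydrophilic first.
    """
    if not 7 <= length <= 50:
        raise ValueError("length should be between 7 and 50 residues")
    protein = protein.upper()
    candidates = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        score = mean_hydropathy(peptide)
        if score <= max_hydropathy:
            candidates.append((start + 1, peptide, score, "P" in peptide))
    candidates.sort(key=lambda c: (not c[3], c[2]))
    return candidates
```

Such a ranking only narrows the list of peptides to synthesize; whether a given peptide actually elicits protein-reactive antibody still has to be determined experimentally.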
  • the epitope-bearing peptides and polypeptides of the present invention may be produced by any conventional means for making peptides or polypeptides including recombinant means using nucleic acid molecules of the invention.
  • an epitope-bearing amino acid sequence of the present invention may be fused to a larger polypeptide which acts as a carrier during recombinant production and purification, as well as during immunization to produce anti-peptide antibodies.
  • Epitope-bearing peptides also may be synthesized using known methods of chemical synthesis. For instance, Houghten has described a simple method for synthesis of large numbers of peptides, such as 10-20 mg of 248 different 13-residue peptides representing single amino acid variants of a segment of the HA1 polypeptide which were prepared and characterized (by ELISA-type binding studies) in less than four weeks (Houghten, R. A. Proc. Natl. Acad. Sci.
  • Epitope-bearing peptides and polypeptides of the invention are used to induce antibodies according to methods well known in the art. See, e.g., Sutcliffe, et al., supra; Wilson, et al., supra; and Bittle, et al. (1985) J. Gen. Virol. 66:2347-2354.
  • animals may be immunized with free peptide; however, anti-peptide antibody titer may be boosted by coupling of the peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid.
  • peptides containing cysteine may be coupled to carrier using a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be coupled to carrier using a more general linking agent such as glutaraldehyde.
  • Animals such as rabbits, rats and mice are immunized with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg peptide or carrier protein and Freund's adjuvant.
  • booster injections may be needed, for instance, at intervals of about two weeks, to provide a useful titer of anti-peptide antibody which can be detected, for example, by ELISA assay using free peptide adsorbed to a solid surface.
  • the titer of anti-peptide antibodies in serum from an immunized animal may be increased by selection of anti-peptide antibodies, for instance, by adsorption to the peptide on a solid support and elution of the selected antibodies according to methods well known in the art.
  • Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response when the whole protein is the immunogen, are identified according to methods known in the art. For instance, Geysen, et al, supra, discloses a procedure for rapid concurrent synthesis on solid supports of hundreds of peptides of sufficient purity to react in an ELISA. Interaction of synthesized peptides with antibodies is then easily detected without removing them from the support. In this manner a peptide bearing an immunogenic epitope of a desired protein may be identified routinely by one of ordinary skill in the art.
  • the immunologically important epitope in the coat protein of foot-and-mouth disease virus was located by Geysen et al. supra with a resolution of seven amino acids by synthesis of an overlapping set of all 208 possible hexapeptides covering the entire 213 amino acid sequence of the protein. Then, a complete replacement set of peptides in which all 20 amino acids were substituted in turn at every position within the epitope was synthesized, and the particular amino acids conferring specificity for the reaction with antibody were determined.
  • peptide analogs of the epitope-bearing peptides of the invention can be made routinely by this method.
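The peptide counts quoted for the scanning method follow from simple enumeration, which the sketch below reproduces. The function names are hypothetical, and the replacement set here lists only the variants that differ from the parent peptide (19 per position), whereas the text describes substituting all 20 amino acids in turn, which would also regenerate the parent at each position.

```python
from typing import List

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def overlapping_peptides(protein: str, length: int = 6) -> List[str]:
    """All overlapping windows of the given length, offset by one residue."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def replacement_set(peptide: str) -> List[str]:
    """All single amino acid variants of a peptide (parent sequence excluded)."""
    variants = []
    for pos, original in enumerate(peptide):
        for aa in AMINO_ACIDS:
            if aa != original:
                variants.append(peptide[:pos] + aa + peptide[pos + 1:])
    return variants
```

A 213-residue protein yields 213 - 6 + 1 = 208 overlapping hexapeptides, matching the figure given for the foot-and-mouth disease virus coat protein. By the same arithmetic, a 13-residue peptide has 13 x 19 = 247 single amino acid variants, which together with the parent gives 248, consistent with the number of peptides reported for the Houghten synthesis.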
  • U.S. Patent No. 4,708,781 to Geysen (1987) further describes this method of identifying a peptide bearing an immunogenic epitope of a desired protein.
  • U.S. Patent No. 5,194,392, to Geysen (1990) describes a general method of detecting or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of interest. More generally, U.S. Patent No. 4,433,092, also to Geysen (1989), describes a method of detecting or determining a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A.
  • polypeptides of the present invention and the epitope-bearing fragments thereof described above can be combined with parts of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides.
  • These fusion proteins facilitate purification and show an increased half-life in vivo. This has been shown, e.g., for chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins (EPA 0,394,827; Traunecker et al. (1988) Nature 331:84-86).
  • Fusion proteins that have a disulfide-linked dimeric structure due to the IgG part can also be more efficient in binding and neutralizing other molecules than a monomeric T. pallidum polypeptide or fragment thereof alone. See Fountoulakis et al. (1995) J. Biochem. 270:3958-3964. Nucleic acids encoding the above epitopes of T. pallidum polypeptides can also be recombined with a gene of interest as an epitope tag to aid in detection and purification of the expressed polypeptide.
  • the present invention further relates to methods for assaying Borrelia infection in an animal by detecting the expression of genes encoding Borrelia polypeptides of the present invention.
  • the methods comprise analyzing tissue or body fluid from the animal for Borrelia-specific antibodies, nucleic acids, or proteins. Nucleic acid specific to Borrelia is assayed by PCR or hybridization techniques using nucleic acid sequences of the present invention as either hybridization probes or primers. See, e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2nd ed., 1989, page 54 reference); Eremeeva et al. (1994) J. Clin. Microbiol.
  • the present invention is useful for monitoring progression or regression of the disease state whereby patients exhibiting enhanced Borrelia gene expression will experience a worse clinical outcome relative to patients expressing these gene(s) at a lower level.
  • By "biological sample" is intended any biological sample obtained from an animal, cell line, tissue culture, or other source which contains Borrelia polypeptide, mRNA, or DNA.
  • Biological samples include body fluids (such as saliva, blood, plasma, urine, mucus, synovial fluid, etc.), tissues (such as muscle, skin, and cartilage), and any other biological source suspected of containing Borrelia polypeptides or nucleic acids. Methods for obtaining biological samples such as tissue are well known in the art.
  • the present invention is useful for detecting diseases related to Borrelia infections in animals.
  • Preferred animals include monkeys, apes, cats, dogs, birds, cows, pigs, mice, horses, rabbits and humans. Particularly preferred are humans.
  • Total RNA can be isolated from a biological sample using any suitable technique such as the single-step guanidinium-thiocyanate-phenol-chloroform method described in Chomczynski et al. (1987) Anal. Biochem. 162: 156-159.
  • Levels of mRNA encoding Borrelia polypeptides having sufficient homology to the nucleic acid sequences identified in SEQ ID NOS: 1-744 to allow for hybridization between complementary sequences are then assayed using any appropriate method. These include Northern blot analysis, S1 nuclease mapping, the polymerase chain reaction (PCR), reverse transcription in combination with the polymerase chain reaction (RT-PCR), and reverse transcription in combination with the ligase chain reaction (RT-LCR).
  • RNA is prepared from a biological sample as described above.
  • an appropriate buffer such as glyoxal/dimethyl sulfoxide/sodium phosphate buffer
  • the filter is prehybridized in a solution containing formamide, SSC, Denhardt's solution, denatured salmon sperm, SDS, and sodium phosphate buffer.
  • A T. pallidum polynucleotide sequence shown in SEQ ID NOS: 1-744, or portion thereof, labeled according to any appropriate method is used as probe. After hybridization overnight, the filter is washed and exposed to x-ray film.
  • DNA for use as probe according to the present invention is described in the sections above and will preferably be at least 15 nucleotides in length.
  • S1 mapping can be performed as described in Fujita et al. (1987) Cell 49:357-367.
  • To prepare probe DNA for use in S1 mapping, the sense strand of an above-described T. pallidum DNA sequence of the present invention is used as a template to synthesize labeled antisense DNA.
  • the antisense DNA can then be digested using an appropriate restriction endonuclease to generate further DNA probes of a desired length.
  • Such antisense probes are useful for visualizing protected bands corresponding to the target mRNA (i.e., mRNA encoding Borrelia polypeptides).
  • RNA encoding Borrelia polypeptides is assayed, e.g., using the RT-PCR method described in Makino et al. (1990) Technique 2:295-301.
  • the radioactivities of the "amplicons" in the polyacrylamide gel bands are linearly related to the initial concentration of the target mRNA. Briefly, this method involves adding total RNA isolated from a biological sample to a reaction mixture containing a RT primer and appropriate buffer. After incubating for primer annealing, the mixture can be supplemented with a RT buffer, dNTPs, DTT, RNase inhibitor and reverse transcriptase.
  • the RT products are then subjected to PCR using labeled primers.
  • a labeled dNTP can be included in the PCR reaction mixture.
  • PCR amplification can be performed in a DNA thermal cycler according to conventional techniques. After a suitable number of rounds to achieve amplification, the PCR reaction mixture is electrophoresed on a polyacrylamide gel. After drying the gel, the radioactivity of the appropriate bands (corresponding to the mRNA encoding the Borrelia polypeptides of the present invention) is quantified using an imaging analyzer.
  • RT and PCR reaction ingredients and conditions, reagent and gel concentrations, and labeling methods are well known in the art.
  • PCR PRIMER: A LABORATORY MANUAL (C.W. Dieffenbach et al. eds., Cold Spring Harbor Lab Press, 1995).
  • the polynucleotides of the present invention may be used to detect polynucleotides of the present invention or Borrelia species including T. pallidum using bio chip technology.
  • the present invention includes both high density chip arrays (>1000 oligonucleotides per cm²) and low density chip arrays (<1000 oligonucleotides per cm²).
  • Bio chips comprising arrays of polynucleotides of the present invention may be used to detect Borrelia species, including T. pallidum, in biological and environmental samples and to diagnose an animal, including humans, with a T. pallidum or other Borrelia infection.
  • the bio chips of the present invention may comprise polynucleotide sequences of other pathogens including bacterial, viral, parasitic, and fungal polynucleotide sequences, in addition to the polynucleotide sequences of the present invention, for use in rapid differential pathogenic detection and diagnosis.
  • the bio chips can also be used to monitor a T. pallidum or other Borrelia infection and to monitor the genetic changes (deletions, insertions, mismatches, etc.) in response to drug therapy in the clinic and drug development in the laboratory.
  • the bio chip technology comprising arrays of polynucleotides of the present invention may also be used to simultaneously monitor the expression of a multiplicity of genes, including those of the present invention.
  • the polynucleotides used to comprise a selected array may be specified in the same manner as for the fragments, i.e., by their 5' and 3' positions or length in contiguous base pairs and include from.
  • Methods and particular uses of the polynucleotides of the present invention to detect Borrelia species, including T. pallidum, using bio chip technology include those known in the art and those of: U.S. Patent Nos. 5510270, 5545531, 5445934, 5677195, 5532128, 5556752, 5527681, 5451683, 5424186, 5607646, 5658732 and World Patent Nos.
  • Biosensors using the polynucleotides of the present invention may also be used to detect, diagnose, and monitor T. pallidum or other Borrelia species and infections thereof. Biosensors using the polynucleotides of the present invention may also be used to detect particular polynucleotides of the present invention. Biosensors using the polynucleotides of the present invention may also be used to monitor the genetic changes (deletions, insertions, mismatches, etc.) in response to drug therapy in the clinic and drug development in the laboratory.
  • Methods and particular uses of the polynucleotides of the present invention to detect Borrelia species, including T. pallidum, using biosensors include those known in the art and those of: U.S. Patent Nos. 5721102, 5658732, 5631170, and World Patent Nos. WO97/35011, WO97/20203, each incorporated herein in their entireties.
  • the present invention includes both bio chips and biosensors comprising polynucleotides of the present invention and methods of their use.
  • Assaying Borrelia polypeptide levels in a biological sample can occur using any art-known method, such as antibody-based techniques.
  • Borrelia polypeptide expression in tissues can be studied with classical immunohistological methods.
  • the specific recognition is provided by the primary antibody (polyclonal or monoclonal) but the secondary detection system can utilize fluorescent, enzyme, or other conjugated secondary antibodies.
  • an immunohistological staining of tissue section for pathological examination is obtained.
  • Tissues can also be extracted, e.g., with urea and neutral detergent, for the liberation of Borrelia polypeptides for Western-blot or dot/slot assay. See, e.g., Jalkanen, M. et al. (1985) J.
  • Borrelia polypeptide-specific monoclonal antibodies can be used both as an immunoabsorbent and as an enzyme-labeled probe to detect and quantify a Borrelia polypeptide.
  • the amount of a Borrelia polypeptide present in the sample can be calculated by reference to the amount present in a standard preparation using a linear regression computer algorithm.
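  • As a minimal sketch of the standard-curve calculation described above, the following Python fits a least-squares line to signals measured for a standard preparation and inverts it to estimate the amount in a test sample. The function names and the numeric readings are hypothetical, chosen only for illustration; a real assay would use its own standards and may need a non-linear fit outside the linear range.

```python
def fit_line(amounts, signals):
    """Ordinary least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the fitted line to estimate the amount behind a measured signal."""
    return (signal - intercept) / slope


# Hypothetical standard preparation: known amounts and their measured signals.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.05, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standard_amounts, standard_signals)

# Hypothetical reading from a test sample.
estimate = amount_from_signal(0.65, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} estimate={estimate:.2f}")
```

The estimate is only meaningful for signals that fall within the range spanned by the standards.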
  • Such an ELISA is described in Iacobelli et al. (1988) Breast Cancer Research and Treatment 11:19-30.
  • two distinct specific monoclonal antibodies can be used to detect Borrelia polypeptides in a body fluid. In this assay, one of the antibodies is used as the immunoabsorbent and the other as the enzyme-labeled probe.
  • the above techniques may be conducted essentially as a "one-step" or "two-step" assay.
  • the "one-step" assay involves contacting the Borrelia polypeptide with immobilized antibody and, without washing, contacting the mixture with the labeled antibody.
  • the "two-step" assay involves washing before contacting the mixture with the labeled antibody.
  • Other conventional methods may also be employed as suitable. It is usually desirable to immobilize one component of the assay system on a support, thereby allowing other components of the system to be brought into contact with the component and readily removed from the sample. Variations of the above and other immunological methods included in the present invention can also be found in Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988).
  • Suitable enzyme labels include, for example, those from the oxidase group, which catalyze the production of hydrogen peroxide by reacting with substrate.
  • Glucose oxidase is particularly preferred as it has good stability and its substrate (glucose) is readily available.
  • Activity of an oxidase label may be assayed by measuring the concentration of hydrogen peroxide formed by the enzyme-labeled antibody/substrate reaction.
  • Other suitable labels include radioisotopes, such as iodine (¹²⁵I, ¹²¹I), carbon (¹⁴C), sulphur (³⁵S), tritium (³H), indium (¹¹²In), and technetium (⁹⁹ᵐTc), and fluorescent labels, such as fluorescein and rhodamine, and biotin.
  • suitable labels for the Borrelia polypeptide-specific antibodies of the present invention are provided below.
  • suitable enzyme labels include malate dehydrogenase, Borrelia nuclease, delta-5-steroid isomerase, yeast-alcohol dehydrogenase, alpha-glycerol phosphate dehydrogenase, triose phosphate isomerase, peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase, and acetylcholine esterase.
  • suitable radioisotopic labels include ³H, ¹¹¹In, ¹²⁵I, ¹³¹I, ³²P, ³⁵S, ¹⁴C, ⁵¹Cr,
  • ¹¹¹In is a preferred isotope where in vivo imaging is used since it avoids the problem of dehalogenation of the ¹²⁵I- or ¹³¹I-labeled monoclonal antibody by the liver.
  • this radionuclide has a more favorable gamma emission energy for imaging. See, e.g., Perkins et al. (1985) Eur. J. Nucl. Med. 10:296-301; Carasquillo et al. (1987) J. Nucl. Med.
  • examples of suitable fluorescent labels include a ¹⁵²Eu label, a fluorescein label, an isothiocyanate label, a rhodamine label, a phycoerythrin label, a phycocyanin label, an allophycocyanin label, an o-phthaldehyde label, and a fluorescamine label.
  • suitable toxin labels include Pseudomonas toxin, diphtheria toxin, ricin, and cholera toxin.
  • chemiluminescent labels include a luminol label, an isoluminol label, an aromatic acridinium ester label, an imidazole label, an acridinium salt label, an oxalate ester label, a luciferin label, a luciferase label, and an aequorin label.
  • nuclear magnetic resonance contrasting agents include heavy metal nuclei such as Gd, Mn, and iron.
  • Typical techniques for binding the above-described labels to antibodies are provided by Kennedy et al. (1976) Clin. Chim. Acta 70:1-31, and Schurs et al. (1977) Clin. Chim. Acta 81:1-40. Coupling techniques mentioned in the latter are the glutaraldehyde method, the periodate method, the dimaleimide method, the m-maleimidobenzyl-N-hydroxy-succinimide ester method, all of which methods are incorporated by reference herein.
  • the invention includes a diagnostic kit for use in screening serum containing antibodies specific against T. pallidum infection.
  • a kit may include an isolated T. pallidum antigen comprising an epitope which is specifically immunoreactive with at least one anti-T. pallidum antibody.
  • Such a kit also includes means for detecting the binding of said antibody to the antigen.
  • the kit may include a recombinantly produced or chemically synthesized peptide or polypeptide antigen. The peptide or polypeptide antigen may be attached to a solid support.
  • the detecting means of the above-described kit includes a solid support to which said peptide or polypeptide antigen is attached.
  • a kit may also include a non-attached reporter-labeled anti-human antibody.
  • binding of the antibody to the T. pallidum antigen can be detected by binding of the reporter-labeled antibody to the anti-T. pallidum polypeptide antibody.
  • a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the DFs or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound DF or antibody.
  • a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another.
  • Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or DF.
  • the invention includes a method of detecting T. pallidum infection in a subject.
  • This detection method includes reacting a body fluid, preferably serum, from the subject with an isolated T. pallidum antigen, and examining the antigen for the presence of bound antibody.
  • the method includes a polypeptide antigen attached to a solid support, and serum is reacted with the support. Subsequently, the support is reacted with a reporter-labeled anti-human antibody. The support is then examined for the presence of reporter-labeled antibody.
  • the solid surface reagent employed in the above assays and kits is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, 96-well plates or filter material. These attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group, to a chemically reactive group on the solid support, such as an activated carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin coated plates can be used in conjunction with biotinylated antigen(s).
  • the polypeptides and antibodies of the present invention, including fragments thereof, may be used to detect Borrelia species including T. pallidum using bio chip and biosensor technology.
  • Bio chips and biosensors of the present invention may comprise the polypeptides of the present invention to detect antibodies which specifically recognize Borrelia species, including T. pallidum.
  • Bio chips and biosensors of the present invention may also comprise antibodies which specifically recognize the polypeptides of the present invention to detect Borrelia species, including T. pallidum, or specific polypeptides of the present invention.
  • Bio chips or biosensors comprising polypeptides or antibodies of the present invention may be used to detect Borrelia species, including T. pallidum, in biological and environmental samples and to diagnose an animal, including humans, with a T. pallidum or other Borrelia infection.
  • the present invention includes both bio chips and biosensors comprising polypeptides or antibodies of the present invention and methods of their use.
  • the bio chips of the present invention may further comprise polypeptide sequences of other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition to the polypeptide sequences of the present invention, for use in rapid differential pathogenic detection and diagnosis.
  • the bio chips of the present invention may further comprise antibodies or fragments thereof specific for other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition to the antibodies or fragments thereof of the present invention, for use in rapid differential pathogenic detection and diagnosis.
  • the bio chips and biosensors of the present invention may also be used to monitor a T.
  • the bio chip and biosensors comprising polypeptides or antibodies of the present invention may also be used to simultaneously monitor the expression of a multiplicity of polypeptides, including those of the present invention.
  • the polypeptides used to comprise a bio chip or biosensor of the present invention may be specified in the same manner as for the fragments, i.e., by their N-terminal and C-terminal positions or length in contiguous amino acid residues. Methods and particular uses of the polypeptides and antibodies of the present invention to detect Borrelia species, including T.
  • bio chip and biosensor technology examples include those known in the art, those of the U.S. Patent Nos. and World Patent Nos. listed above for bio chips and biosensors using polynucleotides of the present invention, and those of: U.S. Patent Nos. 5658732, 5135852, 5567301, 5677196, 5690894 and World Patent Nos. WO9729366, WO9612957, each incorporated herein in their entireties.
  • the present invention further provides methods of obtaining and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the fragments and the T. pallidum fragment and contigs herein described.
  • such methods comprise steps of:
  • agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents.
  • the agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.
  • agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention.
  • agents may be rationally selected or designed.
  • an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of the particular protein.
  • one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, for example see Hurby et al, "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al, Biochemistry 25:9230-8 (1989), or pharmaceutical agents, or the like.
  • one class of agents of the present invention can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control.
  • One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.
  • Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al, Nucl. Acids Res. 3:173 (1979); Cooney et al, Science 241:456 (1988); and Dervan et al, Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)).
  • Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
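  • The antisense design step described above can be sketched in Python as taking the reverse complement of a 20 to 40 base window of a target mRNA. This is only an illustration under simple assumptions: the function name, the choice of a DNA oligonucleotide as output, and the example sequence are hypothetical, and practical design would also weigh target accessibility, specificity, and oligonucleotide chemistry.

```python
# Watson-Crick complement for an RNA (or DNA) target, written as a DNA oligo.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_oligo(mrna, start, length):
    """Return the DNA oligo (5'->3') complementary to mrna[start:start+length].

    The 20-40 base limit follows the range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna[start:start + length].upper()
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    # Reverse the window, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Hypothetical mRNA fragment used only to exercise the function.
example_mrna = "AUG" + "GCU" * 10
oligo = antisense_oligo(example_mrna, 0, 20)
print(oligo)
```

An unrecognized base raises a KeyError, which is a deliberate choice here so that ambiguous sequence is not silently turned into an oligonucleotide.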
  • the present invention further provides pharmaceutical agents which can be used to modulate the growth or pathogenicity of T. pallidum, or another related organism, in vivo or in vitro.
  • a "pharmaceutical agent" is defined as a composition of matter which can be formulated using known techniques to provide a pharmaceutical composition.
  • the "pharmaceutical agents of the present invention" refers to the pharmaceutical agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are identified using the herein described assays.
  • a pharmaceutical agent is said to "modulate the growth or pathogenicity of T. pallidum or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the organism in question.
  • the pharmaceutical agents of the present invention can modulate the growth or pathogenicity of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth by binding to an important protein thus blocking the biological activity of the protein, while other agents may bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone to attack by the body's natural immune system.
  • the agent may comprise a protein encoded by one of the ORFs of the present invention and serve as a vaccine. The development and use of a vaccine based on outer membrane components are well known in the art.
  • a "related organism" is a broad term which refers to any organism whose growth can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As such, related organisms do not need to be bacterial but may be fungal or viral pathogens.
  • the pharmaceutical agents and compositions of the present invention may be administered in a convenient manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes.
  • the pharmaceutical compositions are administered in an amount which is effective for treating and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of administration, symptoms, etc.
  • the agents of the present invention can be used in native form or can be modified to form a chemical derivative.
  • a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.
  • such moieties may change an immunological character of the functional derivative, such as affinity for a given antibody.
  • Such changes in immunomodulation activity are measured by the appropriate assay, such as a competitive type immunoassay.
  • Modifications of such protein properties as redox or thermal stability, biological half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers also may be effected in this way and can be assayed by methods well known to the skilled artisan.
  • the therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single or multiple injections.
  • the dosage of the administered agent will vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered.
  • the therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.
  • two or more compounds or agents are said to be administered "in combination" with each other when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can be measured at the same time.
  • the composition of the present invention can be administered concurrently with, prior to, or following the administration of the other agent.
  • the agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to decrease the rate of growth (as defined above) of the target organism.
  • the administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
  • the agent(s) are provided in advance of any symptoms indicative of the organism's growth.
  • the prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of any subsequent infection.
  • the agent(s) are provided at (or shortly after) the onset of an indication of infection.
  • the therapeutic administration of the compound(s) serves to attenuate the pathological symptoms of the infection and to increase the rate of recovery.
  • the agents of the present invention are administered to a subject, such as a mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically effective concentration.
  • a composition is said to be "pharmacologically acceptable" if its administration can be tolerated by a recipient patient.
  • Such an agent is said to be administered in a "therapeutically effective amount" if the amount administered is physiologically significant.
  • An agent is physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.
  • the agents of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby these materials, or their functional derivatives, are combined in a mixture with a pharmaceutically acceptable carrier vehicle.
  • compositions suitable for effective administration will contain an effective amount of one or more of the agents of the present invention, together with a suitable amount of carrier vehicle. Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
  • the controlled delivery may be effectuated by a variety of well known techniques, including formulation with macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired time course of release.
  • Another possible method to control the duration of action by controlled release preparations is to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers.
  • microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatin microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions.
  • the invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention.
  • Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.
  • the agents of the present invention may be employed in conjunction with other therapeutic compounds.
  • the present invention further demonstrates that a large sequence can be sequenced using a random shotgun approach. This procedure, described in detail in the examples that follow, has eliminated the up-front cost of isolating and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.
  • the probability that any given base has not been sequenced is the same as the probability that any region of the whole sequence L has not been determined and, therefore, is equivalent to the fraction of the whole sequence that has yet to be determined.
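  • at 1X coverage, that is, when the total length of random sequence obtained equals the size L of the polynucleotide in nucleotides, approximately 37% of the polynucleotide has not been sequenced.
  • when coverage is 5X for a 2.8 Mb sequence, the unsequenced fraction drops to 0.0067, or 0.67%.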
  • 5X coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with an average sequence read length of 410 bp.
  • 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.
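The coverage figures above can be checked with a short calculation. The sketch below assumes the usual Poisson model of random shotgun sequencing, which the text implies but does not write out: the unsequenced fraction is e^(-m) at coverage m, the expected number of gaps is N·e^(-m) for N reads, and the mean gap size is L/N. The function name and argument names are illustrative only.

```python
import math

def shotgun_stats(genome_len, n_reads, read_len):
    """Poisson estimates for a random shotgun project (illustrative sketch)."""
    coverage = n_reads * read_len / genome_len   # m, fold coverage
    unsequenced = math.exp(-coverage)            # probability a base is not covered
    gaps = n_reads * unsequenced                 # expected number of gaps
    mean_gap = genome_len / n_reads              # expected gap size in bp
    return coverage, unsequenced, gaps, mean_gap

# 17,000 clones read from both insert ends, 410 bp per read, 2.8 Mb target
coverage, unsequenced, gaps, mean_gap = shotgun_stats(2_800_000, 2 * 17_000, 410)
print(f"coverage {coverage:.2f}X, unsequenced {unsequenced:.4f}, "
      f"gaps {gaps:.0f}, mean gap {mean_gap:.0f} bp")
```

With these inputs the coverage comes out just under 5X, which is why the gap count lands near, rather than exactly on, the rounded figures quoted in the text.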
  • T. pallidum DNA is prepared by phenol extraction. A mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The sheared DNA is ethanol precipitated and redissolved in 500 µl TE buffer.
  • a 100 µl aliquot of the resuspended DNA is digested with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200 µl BAL31 buffer.
  • the digested DNA is phenol-extracted, ethanol-precipitated, redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting temperature agarose gel.
  • the section containing DNA fragments 1.6-2.0 kb in size is excised from the gel, and the LGT agarose is melted and the resulting solution is extracted with phenol to separate the agarose from the DNA.
  • DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for ligation to vector.
  • a two-step ligation procedure is used to produce a plasmid library with 97% inserts, of which >99% were single inserts.
  • the first ligation mixture (50 µl) contains 2 µg of DNA fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and is incubated at 14°C for 4 hr.
  • the ligation mixture then is phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in 20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel.
  • Discrete bands in a ladder are visualized by ethidium bromide staining and UV illumination and identified by size as insert (I), vector (v), v+I, v+2I, v+3I, etc.
  • the portion of the gel containing v+I DNA is excised and the v+I DNA is recovered and resuspended into 20 µl TE.
  • the v+I DNA then is blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+I linears, 500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), under recommended buffer conditions.
  • the repaired v+I linears are dissolved in 20 µl TE.
  • the final ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+I linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture is stored at -20°C.
  • This two-stage procedure results in a molecularly random collection of single-insert plasmid recombinants with minimal contamination from double-insert chimeras (<1%) or free vector (<3%).
  • E. coli host cells deficient in all recombination and restriction functions are used to prevent rearrangements, deletions, and loss of clones by restriction.
  • transformed cells are plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells. Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a chilled Falcon 2059 tube on ice.
  • a 1.7 µl aliquot of 1.42 M beta-mercaptoethanol is added to the aliquot of cells to a final concentration of 25 mM.
  • Cells are incubated on ice for 10 min.
  • a 1 µl aliquot of the final ligation is added to the cells and incubated on ice for 30 min.
  • the cells are heat pulsed for 30 sec. at 42°C and placed back on ice for 2 min.
  • the outgrowth period in liquid culture is eliminated from this protocol in order to minimize the preferential growth of any given transformed cell.
  • the transformation mixture is plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (1.5% SOB agar: 20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media).
  • the 5 ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar.
  • the 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar.
  • the 15 ml top layer is poured just prior to plating.
  • High quality double stranded DNA plasmid templates are prepared using a "boiling bead" method developed in collaboration with Advanced Genetic Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991); Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a 96-well format for all stages of DNA preparation from bacterial growth through final DNA purification. Template concentration is determined using Hoechst Dye and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding templates are identified where possible and not sequenced.
  • T. pallidum DNA (> 100 kb) is partially digested in a reaction mixture (200 µl) containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested DNA is phenol-extracted and electrophoresed on a 0.5% low melting agarose gel at 2 V/cm for 7 hours.
  • Fragments from 15 to 25 kb are excised and recovered in a final volume of 6 µl.
  • One µl of fragments is used with 1 µl of DASH II vector (Stratagene) in the recommended ligation reaction.
  • One µl of the ligation mixture is used per packaging reaction following the recommended protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage are plated directly without amplification from the packaging mixture (after dilution with 500 µl of recommended SM buffer and chloroform treatment). Yield is about 2.5 x 10^3 pfu/µl.
  • the amplified library is prepared essentially as above except the lambda GEM-12 vector is used.
  • 3.5 x 10^4 pfu are plated on the restrictive NM539 host.
  • the lysate is harvested in 2 ml of SM buffer and stored frozen in 7% dimethylsulfoxide.
  • the phage titer is approximately 1 x 10^9 pfu/ml.
  • Liquid lysates (100 µl) are prepared from randomly selected plaques (from the unamplified library) and template is prepared by long-range PCR using T7 and T3 vector-specific primers.
  • Sequencing reactions are carried out on plasmid and/or PCR templates using the AB Catalyst LabStation with Applied Biosystems PRISM Ready Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers (Adams et al., Nature 368:414 (1994)).
  • Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits.
  • T7 and SP6 primers are used to sequence the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are performed by eight individuals using an average of fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read distance. The overall sequencing success rate is about 85% for M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
  • the Catalyst robot is a publicly available sophisticated pipetting and temperature control robot which has been developed specifically for DNA sequencing reactions.
  • the Catalyst combines pre-aliquoted templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and reaction buffer. Reaction mixes and templates are combined in the wells of an aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear amplification (i.e., one primer synthesis) steps are performed including denaturation, annealing of primer and template, and extension; i.e., DNA synthesis.
  • a heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for an oil overlay.
  • Two sequencing protocols are used: one for dye-labelled primers and a second for dye-labelled dideoxy chain terminators.
  • the shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye, permitting the four individual reactions to be combined into one lane of the 373 DNA Sequencer for electrophoresis, detection, and base-calling.
  • ABI currently supplies pre-mixed reaction mixes in bulk packages containing all the necessary non-template reagents for sequencing.
  • Sequencing can be done with both plasmid and PCR-generated templates with both dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable sequences. Thirty-two reactions are loaded per AB373 Sequencer each day, for a total of 960 samples. Electrophoresis is run overnight following the manufacturer's protocols, and the data is collected for twelve hours. Following electrophoresis and fluorescence detection, the ABI 373 performs automatic lane tracking and base-calling. The lane-tracking is confirmed visually. Each sequence electropherogram (or fluorescence lane trace) is inspected visually and assessed for quality.
  • Trailing sequences of low quality are removed and the sequence itself is loaded via software to a Sybase database (archived daily to 8mm tape).
  • Leading vector polylinker sequence is removed automatically by a software program.
  • Average edited lengths of sequences from the standard ABI 373 are around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
  • ABI 373 Sequencers converted to Stretch Liners provide a longer electrophoresis path prior to fluorescence detection and increase the average number of usable bases to 500-600 bp.
  • TIGR Assembler, developed for the rapid and accurate assembly of thousands of sequence fragments, was employed to generate contigs.
  • the TIGR assembler simultaneously clusters and assembles fragments of the genome.
  • the algorithm builds a hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
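The hash-table step can be pictured with a small sketch. This is not the TIGR Assembler code; it only illustrates the idea of indexing every 12 bp subsequence and treating fragments that share a 12-mer as potential overlaps. The function names and the dictionary-of-strings input are assumptions made for the example, and reverse-complement matches are ignored for brevity.

```python
from collections import defaultdict
from itertools import combinations

def kmer_index(fragments, k=12):
    """Map each k-mer to the set of fragment names that contain it."""
    index = defaultdict(set)
    for name, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index

def potential_overlaps(fragments, k=12):
    """Return, for each fragment, the fragments sharing at least one k-mer."""
    overlaps = defaultdict(set)
    for names in kmer_index(fragments, k).values():
        for a, b in combinations(sorted(names), 2):
            overlaps[a].add(b)
            overlaps[b].add(a)
    return overlaps

reads = {
    "f1": "ACGTACGGTTCAGGCTAAGCTTGCA",
    "f2": "GGCTAAGCTTGCATTTGACCGTAGA",
    "f3": "TTTTTTTTTTTTCCCCCCCCCCCC",
}
for name, partners in potential_overlaps(reads).items():
    print(name, sorted(partners))
```

A fragment whose partner set is unusually large compared with the expected coverage would be a candidate for a repetitive element, in the sense described above.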
  • TIGR Assembler extends the current contig by attempting to add the best matching fragment based on oligonucleotide content.
  • the contig and candidate fragment are aligned using a modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., Methods in Enzymology 164:165 (1988)).
  • the contig is extended by the fragment only if strict criteria for the quality of the match are met.
  • the match criteria include the minimum length of overlap, the maximum length of an unmatched end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal coverage and raised in regions with a possible repetitive element.
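The alignment and acceptance step can be sketched as follows. This is the textbook Smith-Waterman local alignment, not the modified version the text refers to, and the scoring values and thresholds (`min_overlap`, `min_identity`) are hypothetical placeholders. The maximum unmatched-end criterion and the adaptive raising and lowering of thresholds are left out.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook local alignment; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score falls to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

def accept_overlap(a, b, min_overlap=40, min_identity=0.95):
    """Apply two of the match criteria: overlap length and percent match."""
    _, al_a, al_b = smith_waterman(a, b)
    if not al_a:
        return False
    identity = sum(x == y for x, y in zip(al_a, al_b)) / len(al_a)
    return len(al_a) >= min_overlap and identity >= min_identity

print(smith_waterman("TTTACGTACGTAAA", "CCCACGTACGTGGG"))
```

In an assembler, a fragment would be merged into the contig only when `accept_overlap` (or its stricter real-world equivalent) returns true.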
  • Fragments representing the boundaries of repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of alignments and excluded from the current contig.
  • TIGR Assembler is designed to take advantage of clone size information coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from two ends of the same template point toward one another in the contig and are located within a certain range of base pairs (definable for each clone based on the known clone size range for a given library). The process resulted in 744 contigs as represented by SEQ ID NOs: 1-744.
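The clone-end constraint can be expressed as a simple check on two placed reads. The sketch below is an illustration only: the `PlacedRead` record, its field names, and the way the span is measured (outer end to outer end) are assumptions, not details taken from the text.

```python
from collections import namedtuple

# start/end are contig coordinates (end exclusive); forward is True when the
# read runs left to right along the contig. All names are illustrative.
PlacedRead = namedtuple("PlacedRead", ["start", "end", "forward"])

def mates_consistent(r1, r2, min_insert, max_insert):
    """True if two end reads from one clone face each other at a plausible distance."""
    left, right = (r1, r2) if r1.start <= r2.start else (r2, r1)
    # The left read must point right and the right read must point left.
    if not (left.forward and not right.forward):
        return False
    span = right.end - left.start
    return min_insert <= span <= max_insert

fwd = PlacedRead(100, 510, True)
rev = PlacedRead(1400, 1810, False)
print(mates_consistent(fwd, rev, 1600, 2000))
```

The 1600 to 2000 bp window mirrors the 1.6-2.0 kb insert size selected for the plasmid library; a different library would supply its own range.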
  • the predicted coding regions of the T. pallidum genome were initially defined with the program GeneMark, which finds ORFs using a probabilistic classification technique.
  • the predicted coding region sequences were used in searches against a database of all nucleotide sequences from GenBank (June, 1997), using the BLASTN search method to identify overlaps of 50 or more nucleotides with at least a 95% identity.
  • Those ORFs with nucleotide sequence matches are shown in Table 1.
  • the ORFs without such matches were translated to protein sequences and compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept databases.
  • ORFs that matched a database protein with BLASTP probability less than or equal to 0.01 are shown in Table 2.
  • the table also lists assigned functions based on the closest match in the databases. ORFs that did not match protein or nucleotide sequences in the databases at these levels are shown in Table 3.
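The sorting of ORFs into the three tables follows a simple decision rule, sketched below. The input shapes are assumptions for illustration: nucleotide hits are given as (overlap length, percent identity) pairs and protein hits as BLASTP probabilities. No BLAST library or output parser is implied.

```python
def classify_orf(nt_hits, protein_hits):
    """Assign an ORF to Table 1, 2 or 3 using the thresholds stated in the text.

    nt_hits: iterable of (overlap_length_nt, percent_identity) from BLASTN
    protein_hits: iterable of BLASTP probabilities for the translated ORF
    """
    # Table 1: nucleotide overlap of 50 or more bases at 95% identity or better.
    if any(length >= 50 and identity >= 95.0 for length, identity in nt_hits):
        return "Table 1"
    # Table 2: protein match with probability less than or equal to 0.01.
    if any(p <= 0.01 for p in protein_hits):
        return "Table 2"
    # Table 3: no match at these levels.
    return "Table 3"

print(classify_orf([(120, 98.3)], []))
print(classify_orf([(45, 99.0)], [1e-12]))
print(classify_orf([], [0.3]))
```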
  • Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the methods known in the art.
  • the protein can also be produced in a recombinant prokaryotic expression system, such as E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml.
  • Monoclonal or polyclonal antibody to the protein can then be prepared as follows.
  • Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media).
  • the successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
  • Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular Biology, Elsevier, New York. Section 21-2 (1989).
  • Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than larger ones and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).
  • Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, Weir, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap.
  • Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample.
  • antibodies are useful in various animal models of pneumococcal disease as a means of evaluating the protein used to make the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic or immunoprophylactic reagent.
  • SEQ ID NOS: 1-744 can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses.
  • the PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same.
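The primer guidelines above lend themselves to a quick check. The sketch below tests the minimum length and compares G/C fractions; the 0.10 tolerance on the G/C difference is a hypothetical choice, and the melting temperature uses the simple Wallace rule (2°C per A/T, 4°C per G/C), which is a rough estimate for short oligonucleotides and is not taken from the text.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)

def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

def primer_pair_ok(fwd, rev, min_len=15, max_gc_diff=0.10):
    """Check minimum length and similarity of G/C content for a primer pair."""
    long_enough = len(fwd) >= min_len and len(rev) >= min_len
    similar_gc = abs(gc_fraction(fwd) - gc_fraction(rev)) <= max_gc_diff
    return long_enough and similar_gc

fwd, rev = "ATGCATGCATGCATGCAT", "TACGTACGTACGTACGTA"
print(primer_pair_ok(fwd, rev), wallace_tm(fwd), wallace_tm(rev))
```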
  • the PCR primers and amplified DNA of this Example find use in the Examples that follow.
  • T. pallidum strain B3 IPU has been deposited as a convenient source for obtaining a T. pallidum strain, although a wide variety of T. pallidum strains known in the art can be used.
  • T. pallidum genomic DNA is prepared using the following method.
  • a 20 ml overnight bacterial culture grown in a rich medium (e.g., Trypticase Soy Broth, Brain Heart Infusion broth or Super broth)
  • TES: Tris pH 8.0, 25 mM EDTA, 50 mM NaCl
  • TES high salt TES
  • Lysostaphin is added to a final concentration of approximately 50 µg/ml and the mixture is rotated slowly for 1 hour at 37°C to make protoplast cells.
  • the solution is then placed in an incubator (or in a shaking water bath) and warmed to 55°C.
  • a plasmid is directly isolated by screening a plasmid T. pallidum genomic DNA library using a polynucleotide probe corresponding to a polynucleotide of the present invention.
  • a specific polynucleotide with 30-40 nucleotides is synthesized using an Applied Biosystems DNA synthesizer according to the sequence reported.
  • the oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide kinase and purified according to routine methods.
  • the library is transformed into a suitable host, as indicated above (such as XL-1 Blue (Stratagene)) using techniques known to those of skill in the art. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989).
  • the transformants are plated on 1.5% agar plates (containing the appropriate selection agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. These plates are screened using Nylon membranes according to routine methods for bacterial colony screening. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989) or other techniques known to those of skill in the art.
  • two primers of 15-25 nucleotides derived from the 5' and 3' ends of a polynucleotide of SEQ ID NOS: 1-744 are synthesized and used to amplify the desired DNA by PCR using a T. pallidum genomic DNA prep as a template.
  • PCR is carried out under routine conditions, for instance, in 25 µl of reaction mixture with 0.5 µg of the above DNA template.
  • a convenient reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq polymerase.
  • overlapping oligos of the DNA sequences of SEQ ID NOS: 1-744 can be chemically synthesized and used to generate a nucleotide sequence of desired length using PCR methods known in the art.
  • the bacterial expression vector pQE60 (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311) is used for bacterial expression of some of the polypeptide fragments of the present invention. pQE60 encodes ampicillin antibiotic resistance ("Ampr") and contains a bacterial origin of replication ("ori"), an IPTG inducible promoter, a ribosome binding site ("RBS"), six codons encoding histidine residues that allow affinity purification using nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin (QIAGEN, Inc., supra) and suitable single restriction enzyme cleavage sites.
  • the DNA sequence encoding the desired portion of a T. pallidum protein of the present invention is amplified from T. pallidum genomic DNA using PCR oligonucleotide primers which anneal to the 5' and 3' sequences coding for the portions of the T. pallidum polynucleotide shown in SEQ ID NOS: 1-744. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' sequences, respectively.
  • the 5' primer has a sequence containing an appropriate restriction site followed by nucleotides of the amino terminal coding sequence of the desired T. pallidum polynucleotide sequence in SEQ ID NOS: 1-744.
  • the 3' primer has a sequence containing an appropriate restriction site followed by nucleotides complementary to the 3' end of the polypeptide coding sequence of SEQ ID NOS: 1-744, excluding a stop codon, with the coding sequence aligned with the restriction site so as to maintain its reading frame with that of the six His codons in the pQE60 vector.
  • the amplified T. pallidum DNA fragment and the vector pQE60 are digested with restriction enzymes which recognize the sites in the primers and the digested DNAs are then ligated together.
  • the T. pallidum DNA is inserted into the restricted pQE60 vector in a manner which places the T. pallidum protein coding region downstream from the IPTG-inducible promoter and in-frame with an initiating AUG and the six histidine codons.
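The primer layout described for the pQE60 construct can be sketched in a few lines. The restriction sites used here (GGATCC and AAGCTT, the BamHI and HindIII recognition sequences) and the 18-nucleotide annealing length are illustrative assumptions; the text does not name the sites, and the real choice, together with any spacer bases needed to keep the insert in frame with the six His codons, depends on the vector map.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence given as A/C/G/T."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_primers(cds, site5, site3, anneal_len=18):
    """Build a 5' and a 3' primer for a coding sequence (illustrative sketch).

    The 5' primer is site5 followed by the start of the coding sequence.
    The 3' primer is site3 followed by the reverse complement of the end of
    the coding sequence, with any terminal stop codon removed first.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    forward = site5 + cds[:anneal_len]
    reverse = site3 + reverse_complement(cds[-anneal_len:])
    return forward, reverse

# Hypothetical coding sequence, not taken from SEQ ID NOS: 1-744.
cds = "ATGAAACCCGGGTTTACGCTAGCTAGGCATTGA"
forward, reverse = design_primers(cds, "GGATCC", "AAGCTT")
print(forward)
print(reverse)
```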
  • E. coli strain M15/rep4 containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein.
  • This strain, which is only one of many that are suitable for expressing a T. pallidum polypeptide, is available commercially (QIAGEN, Inc., supra).
  • Transformants are identified by their ability to grow on LB agar plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.
  • Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 µg/ml) and kanamycin (25 µg/ml).
  • the O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250.
  • the cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6.
  • Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours.
  • the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the T. pallidum polypeptide is eluted with 6 M guanidine-HCl, pH 5.
  • the purified protein is then renatured by dialyzing it against phosphate-buffered saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl.
  • the protein could be successfully refolded while immobilized on the Ni-NTA column.
  • the recommended conditions are as follows: renature using a linear 6M-1M urea gradient in 500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors.
  • the renaturation should be performed over a period of 1.5 hours or more. After renaturation the proteins can be eluted by the addition of 250 mM imidazole.
  • Imidazole is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer plus 200 mM NaCl.
  • the purified protein is stored at 4°C or frozen at -80°C.
  • the polypeptides of the present invention are also prepared using a non-denaturing protein purification method.
  • Absorbance at 550 nm is approximately 10-20 O.D./ml.
  • the suspension is then put through three freeze/thaw cycles from -70°C (using an ethanol-dry ice bath) up to room temperature.
  • the cells are lysed via sonication in short 10 sec bursts over 3 minutes at approximately 80W while kept on ice.
  • the sonicated sample is then centrifuged at 15,000 RPM for 30 minutes at 4°C.
  • the supernatant is passed through a column containing 1.0 ml of CL-4B resin to pre-clear the sample of any proteins that may bind to agarose non-specifically, and the flow-through fraction is collected.
  • Buffer B: 50 mM Na-Phosphate, 300 mM NaCl, 10% Glycerol, 10 mM 2-mercaptoethanol, 500 mM Imidazole; the pH of the final buffer should be 7.5.
  • the protein is eluted off of the column with a series of increasing Imidazole solutions made by adjusting the ratios of Lysis Buffer A to Buffer B. Three different concentrations are used: 3 volumes of 75 mM Imidazole, 3 volumes of 150 mM Imidazole, 5 volumes of 500 mM Imidazole.
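The ratio of Lysis Buffer A to Buffer B needed for each elution step follows from a two-component mixing calculation. The imidazole concentration of Lysis Buffer A is not stated in this passage, so the sketch takes it as a parameter; the value of 0 mM used in the example is purely an assumption, while the 500 mM figure for Buffer B comes from the text.

```python
def fraction_of_b(target_mM, a_mM, b_mM=500.0):
    """Volume fraction of Buffer B needed to reach a target imidazole level.

    a_mM is the imidazole concentration of Lysis Buffer A (an assumed input),
    b_mM that of Buffer B (500 mM in the text).
    """
    if a_mM == b_mM or not min(a_mM, b_mM) <= target_mM <= max(a_mM, b_mM):
        raise ValueError("target must lie between the two buffer concentrations")
    return (target_mM - a_mM) / (b_mM - a_mM)

# Assuming, for illustration only, that Lysis Buffer A contains no imidazole.
for target in (75, 150, 500):
    frac_b = fraction_of_b(target, a_mM=0.0)
    print(f"{target} mM: {frac_b:.0%} Buffer B, {1 - frac_b:.0%} Lysis Buffer A")
```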
  • the fractions containing the purified protein are analyzed using 8%, 10% or 14% SDS-PAGE depending on the protein size.
  • the purified protein is then dialyzed 2X against phosphate-buffered saline (PBS) in order to place it into an easily workable buffer.
  • the purified protein is stored at 4°C or frozen at -80°C.
  • the following alternative method may be used to purify a T. pallidum polypeptide expressed in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C. Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm
  • On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.
  • the cells are then lysed by passing the solution through a microfluidizer (Microfluidics Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi.
  • the homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min.
  • the resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.
  • the resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the T. pallidum polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction. Following high speed centrifugation (30,000 x g) to remove insoluble particles, the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
  • the refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps.
  • a previously prepared tangential filtration unit equipped with 0.16 µm membrane filter with appropriate surface area e.g.,
  • Fractions containing the T. pallidum polypeptide are then pooled and mixed with 4 volumes of water.
  • the diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion (Poros CM-20, Perseptive Biosystems) exchange resins.
  • the columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl.
  • the CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 monitoring of the effluent. Fractions containing the T. pallidum polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
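For planning fraction collection, the NaCl concentration at any point of the linear gradient can be computed directly. The sketch below assumes an ideal linear gradient with no mixing delay and models only the salt component; the accompanying pH shift from 6.0 to 6.5 is not modeled. The function name is illustrative.

```python
def linear_gradient(start, end, total_volumes, volume):
    """Concentration after `volume` column volumes of an ideal linear gradient."""
    if not 0 <= volume <= total_volumes:
        raise ValueError("volume must lie within the gradient")
    return start + (end - start) * volume / total_volumes

# 0.2 M to 1.0 M NaCl over 10 column volumes, as in the elution step above
for cv in (0, 2.5, 5, 7.5, 10):
    print(f"{cv:>4} CV: {linear_gradient(0.2, 1.0, 10, cv):.2f} M NaCl")
```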
  • T. pallidum polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed from Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
  • the purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.
  • the vector pQE10 is alternatively used to clone and express some of the polypeptides of the present invention for use in the soft tissue and systemic infection models discussed below. The difference is that an inserted DNA fragment encoding a polypeptide expresses that polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus of that polypeptide.
  • the bacterial expression vector pQE10 (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311) was used in this example.
  • the components of the pQE10 plasmid are arranged such that the inserted DNA sequence encoding a polypeptide of the present invention expresses the polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus.
  • the DNA sequences encoding the desired portions of a polypeptide of SEQ ID NOS: 1-744 were amplified using PCR oligonucleotide primers from genomic T. pallidum DNA.
  • the PCR primers anneal to the nucleotide sequences encoding the desired amino acid sequence of a polypeptide of the present invention.
  • Additional nucleotides containing restriction sites to facilitate cloning in the pQE10 vector were added to the 5' and 3' primer sequences, respectively.
  • the 5' and 3' primers were selected to amplify their respective nucleotide coding sequences.
  • the point in the protein coding sequence where the 5' and 3' primers begins may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention.
  • the 5' primer was designed so the coding sequence of the 6 X His tag is aligned with the restriction site so as to maintain its reading frame with that of T. pallidum polypeptide.
  • the 3' was designed to include an stop codon. The amplified DNA fragment was then cloned, and the protein expressed, as described above for the pQE60 plasmid.
  • The DNA sequences encoding the amino acid sequences of SEQ ID NOS: 1-744 may also be cloned and expressed as fusion proteins by a protocol similar to that described directly above, wherein the pET-32b(+) vector (Novagen, 601 Science Drive, Madison, WI 53711) is preferentially used in place of pQE10.
  • The above methods are not limited to the polypeptide fragments actually produced. The above method, like the methods below, can be used to produce either full length polypeptides or desired fragments thereof.
  • The bacterial expression vector pQE60 is used for bacterial expression in this example (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). However, in this example, the polypeptide coding sequence is inserted such that translation of the six His codons is prevented and, therefore, the polypeptide is produced with no 6 X His tag.
  • The DNA sequence encoding the desired portion of the T. pallidum amino acid sequence is amplified from a T. pallidum genomic DNA prep or the deposited DNA clones using PCR oligonucleotide primers which anneal to the 5' and 3' nucleotide sequences corresponding to the desired portion of the T. pallidum polypeptides. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' primer sequences.
  • The 5' and 3' primers are selected to amplify their respective nucleotide coding sequences.
  • The point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention.
  • The 5' and 3' primers contain appropriate restriction sites followed by nucleotides complementary to the 5' and 3' ends of the coding sequence, respectively.
  • The 3' primer is additionally designed to include an in-frame stop codon.
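By way of illustration only, the primer layout described above (a restriction site followed by gene-specific nucleotides, with an in-frame stop codon in the 3' primer) can be sketched in Python. The coding fragment, the BamHI-like and HindIII-like sites, the 12-nucleotide annealing length and the TAA stop codon are hypothetical placeholders chosen for the sketch; none of them is taken from the specification.

```python
# Illustrative sketch only: the sequence, restriction sites and lengths
# below are hypothetical placeholders, not values from the specification.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(coding_seq, site_5, site_3, anneal_len=18, stop_codon="TAA"):
    """Build a 5' and a 3' primer, both written 5'->3'.

    5' primer: restriction site + first `anneal_len` coding nucleotides.
    3' primer: reverse complement of (last `anneal_len` coding nucleotides
               + in-frame stop codon + restriction site).
    """
    coding_seq = coding_seq.upper()
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(coding_seq) < anneal_len:
        raise ValueError("coding sequence shorter than annealing length")
    forward = site_5.upper() + coding_seq[:anneal_len]
    reverse = reverse_complement(
        coding_seq[-anneal_len:] + stop_codon + site_3.upper()
    )
    return forward, reverse


# Hypothetical 30-nt coding fragment (10 codons) used only as a placeholder.
fragment = "ATGAAACGCATTAGCACCACCATTACCACC"
fwd, rev = design_primers(fragment, site_5="GGATCC", site_3="AAGCTT", anneal_len=12)
print(fwd)
print(rev)
```

In practice a few extra nucleotides are often added 5' of each restriction site to help the enzyme cut near the end of the PCR product; that refinement is left out of the sketch.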
  • The amplified T. pallidum DNA fragments and the vector pQE60 are digested with restriction enzymes recognizing the sites in the primers and the digested DNAs are then ligated together. Insertion of the T. pallidum DNA into the restricted pQE60 vector places the T. pallidum protein coding region, including its associated stop codon, downstream from the IPTG-inducible promoter and in-frame with an initiating AUG. The associated stop codon prevents translation of the six histidine codons downstream of the insertion point.
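The effect of the insert's own stop codon on the downstream histidine codons can be checked with a short Python sketch. It reads a construct codon by codon from the initiating ATG and reports whether the first in-frame stop lies before the His-tag codons. The insert, the 6-nucleotide linker and the tag sequence used here are made-up placeholders, assumed only for the example.

```python
# Illustrative sketch only: all sequences are hypothetical placeholders.
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(orf: str) -> Optional[int]:
    """Return the 0-based nucleotide index of the first in-frame stop codon,
    reading in frame from orf[0]; None if no stop codon is found."""
    orf = orf.upper()
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i
    return None


def his_tag_is_excluded(construct: str, his_tag_start: int) -> bool:
    """True if translation terminates before the His-tag codons begin."""
    stop = first_stop_index(construct)
    return stop is not None and stop < his_tag_start


linker = "AGATCT"              # placeholder for vector sequence after the insert
his_tag = "CATCACCATCACCATCAC"  # six histidine codons
vector_stop = "TAA"

insert_with_stop = "ATGAAACGCATTTAA"  # insert carrying its own stop codon
insert_no_stop = "ATGAAACGCATT"       # same insert without a stop codon

untagged = insert_with_stop + linker + his_tag + vector_stop
tagged = insert_no_stop + linker + his_tag + vector_stop

print(his_tag_is_excluded(untagged, len(insert_with_stop) + len(linker)))
print(his_tag_is_excluded(tagged, len(insert_no_stop) + len(linker)))
```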
  • The ligation mixture is transformed into competent E. coli cells using standard procedures such as those described by Sambrook et al.
  • E. coli strain M15/rep4 containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein.
  • This strain, which is only one of many that are suitable for expressing T. pallidum polypeptide, is available commercially (QIAGEN, Inc., supra).
  • Transformants are identified by their ability to grow on LB plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.
  • Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 µg/ml) and kanamycin (25 µg/ml).
  • The O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250.
  • The cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor.
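The inoculation and induction volumes implied by these figures can be worked out with a minimal Python sketch. The 1 liter culture volume and the 1 M IPTG stock are assumptions made for the example; the text specifies only the dilution range and the 1 mM final concentration. A dilution of 1:25 is read here as one part inoculum in 25 parts final culture, which is likewise an assumption.

```python
# Illustrative sketch only: culture volume and stock concentration are assumed.


def inoculum_volume_ml(culture_ml: float, dilution_factor: float) -> float:
    """Volume of overnight culture for a 1:dilution_factor inoculation."""
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return culture_ml / dilution_factor


def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock to add so the mixture reaches `final_mM`.

    Solves stock_mM * v / (culture_ml + v) = final_mM for v, so the small
    volume increase caused by adding the stock is taken into account.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / (stock_mM - final_mM)


culture_ml = 1000.0  # assumed culture volume
for factor in (25, 250):
    print(f"1:{factor} inoculum: {inoculum_volume_ml(culture_ml, factor):.1f} ml")

# 1 mM final IPTG from an assumed 1 M (1000 mM) stock
print(f"IPTG stock: {stock_volume_ml(1.0, culture_ml, 1000.0):.3f} ml")
```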
  • The cells are then stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8, to solubilize the T. pallidum polypeptide.
  • The cell debris is removed by centrifugation, and the supernatant containing the T. pallidum polypeptide is dialyzed against 50 mM Na-acetate buffer pH 6, supplemented with 200 mM NaCl.
  • The protein can be successfully refolded by dialyzing it against 500 mM NaCl, 20% glycerol, 25 mM Tris/HCl pH 7.4, containing protease inhibitors. After renaturation the protein can be purified by ion exchange, hydrophobic interaction and size exclusion chromatography. Alternatively, an affinity chromatography step such as an antibody column can be used to obtain pure T. pallidum polypeptide.
  • The purified protein is stored at 4°C or frozen at -80°C.
  • The following alternative method may be used to purify T. pallidum polypeptides expressed in E. coli when they are present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C.
  • Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.
  • The cells are then lysed by passing the solution through a microfluidizer (Microfluidics Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi.
  • The homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min.
  • The resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.
  • The resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the T. pallidum polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction.
  • The GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
  • The refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps.
  • A previously prepared tangential filtration unit equipped with a 0.16 µm membrane filter with appropriate surface area, e.g.,
  • The diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion (Poros CM-20, Perseptive Biosystems) exchange resins.
  • The columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl.
  • The CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 monitoring of the effluent. Fractions containing the T. pallidum polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
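The NaCl concentration delivered at any point of the 10 column volume gradient can be estimated with a short Python sketch. It assumes ideal linear mixing of the 0.2 M and 1.0 M NaCl buffers and ignores system dead volume; the function name and the sampling points are illustrative only.

```python
# Illustrative sketch only: assumes an ideal linear gradient with no dead volume.


def gradient_nacl_molar(cv_elapsed: float,
                        start_m: float = 0.2,
                        end_m: float = 1.0,
                        gradient_cv: float = 10.0) -> float:
    """NaCl concentration (M) after `cv_elapsed` column volumes of gradient."""
    if not 0.0 <= cv_elapsed <= gradient_cv:
        raise ValueError("cv_elapsed must lie within the gradient length")
    return start_m + (end_m - start_m) * cv_elapsed / gradient_cv


# Nominal NaCl concentration at a few points along the gradient
for cv in (0, 2.5, 5, 7.5, 10):
    print(f"{cv:>4} CV: {gradient_nacl_molar(cv):.2f} M NaCl")
```

A table of this kind can help relate the column volume at which a fraction elutes to the approximate salt concentration needed to displace the polypeptide.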
  • The T. pallidum polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed on a Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded.
  • The purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.
  • T. pallidum polypeptides can also be produced in T. pallidum using the methods of S.
  • A T. pallidum expression plasmid is made by cloning a portion of the DNA encoding a T. pallidum polypeptide into the expression vector pDNAI/Amp or pDNAIII (which can be obtained from Invitrogen, Inc.).
  • The expression vector pDNAI/Amp contains: (1) an E. coli origin of replication effective for propagation in E. coli and other prokaryotic cells; (2) an ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; (3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV promoter, a polylinker, an SV40 intron; (5) several codons encoding a hemagglutinin fragment (i.e., an "HA" tag to facilitate purification) followed by a termination codon and polyadenylation signal arranged so that a DNA can be conveniently placed under expression control of the CMV promoter and operably linked to the SV40 intron and the polyadenylation signal by means of restriction sites in the polylinker.
  • The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein described by Wilson et al., 1984, Cell 37:767.
  • The fusion of the HA tag to the target protein allows easy detection and recovery of the recombinant protein with an antibody that recognizes the HA epitope.
  • pDNAIII contains, in addition, the selectable neomycin marker.
  • A DNA fragment encoding a T. pallidum polypeptide is cloned into the polylinker region of the vector so that recombinant protein expression is directed by the CMV promoter.
  • The plasmid construction strategy is as follows. The DNA from a T. pallidum genomic DNA prep is amplified using primers that contain convenient restriction sites, much as described above for construction of vectors for expression of T. pallidum in E. coli.
  • The 5' primer contains a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the T. pallidum polypeptide.
  • The 3' primer contains nucleotides complementary to the 3' coding sequence of the T. pallidum DNA, a stop codon, and a convenient restriction site.
  • The PCR amplified DNA fragment and the vector, pDNAI/Amp, are digested with appropriate restriction enzymes and then ligated.
  • The ligation mixture is transformed into an appropriate E. coli strain such as SURE™ (Stratagene Cloning Systems, La Jolla, CA 92037), and the transformed culture is plated on ampicillin media plates which then are incubated to allow growth of ampicillin resistant colonies. Plasmid DNA is isolated from resistant colonies and examined by restriction analysis or other means for the presence of the fragment encoding the T. pallidum polypeptide.
  • COS cells are transfected with an expression vector, as described above, using DEAE-dextran, as described, for instance, by Sambrook et al. (supra). Cells are incubated under conditions for expression of T. pallidum by the vector.
  • Expression of the T. pallidum-HA fusion protein is detected by radiolabeling and immunoprecipitation, using methods described in, for example, Harlow et al., supra. To this end, two days after transfection, the cells are labeled by incubation in media containing 35S-cysteine for 8 hours. The cells and the media are collected, and the cells are washed and then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by Wilson et al. (supra).
  • Proteins are precipitated from the cell lysate and from the culture media using an HA-specific monoclonal antibody. The precipitated proteins then are analyzed by SDS-PAGE and autoradiography. An expression product of the expected size is seen in the cell lysate, which is not seen in negative controls.
  • Plasmid pC4 is used for the expression of T. pallidum polypeptide in this example.
  • Plasmid pC4 is a derivative of the plasmid pSV2-dhfr (ATCC Accession No. 37146).
  • The plasmid contains the mouse DHFR gene under control of the SV40 early promoter.
  • Chinese hamster ovary cells or other cells lacking dihydrofolate reductase activity that are transfected with these plasmids can be selected by growing the cells in a selective medium (alpha minus MEM, Life Technologies) supplemented with the chemotherapeutic agent methotrexate.
  • Amplification of the DHFR genes in cells resistant to methotrexate (MTX) has been well documented.
  • DHFR as a result of amplification of the DHFR gene. If a second gene is linked to the DHFR gene, it is usually co-amplified and over-expressed. It is known in the art that this approach may be used to develop cell lines carrying more than 1,000 copies of the amplified gene(s). Subsequently, when the methotrexate is withdrawn, cell lines are obtained which contain the amplified gene integrated into one or more chromosome(s) of the host cell.
  • Plasmid pC4 contains the strong promoter of the long terminal repeat (LTR) of the Rous sarcoma virus for expressing a polypeptide of interest (Cullen et al., 1985, Mol. Cell. Biol. 5:438-447), plus a fragment isolated from the enhancer of the immediate early gene of human cytomegalovirus (CMV) (Boshart et al., 1985, Cell 41:521-530). Downstream of the promoter are the following single restriction enzyme cleavage sites that allow the integration of the genes: BamHI, XbaI, and Asp718. Behind these cloning sites the plasmid contains the 3' intron and polyadenylation site of the rat preproinsulin gene.
  • β-actin promoter, e.g., the human β-actin promoter, the SV40 early or late promoters or the long terminal repeats from other retroviruses, e.g., HIV and HTLVI.
  • Clontech's Tet-Off and Tet-On gene expression systems and similar systems can be used to express the T. pallidum polypeptide in a regulated way in mammalian cells (Gossen et al., 1992, Proc. Natl. Acad. Sci. USA 89:5547-5551).
  • Other signals, e.g., from the human growth hormone or globin genes, can be used as well.
  • Stable cell lines carrying a gene of interest integrated into the chromosomes can also be selected upon co-transfection with a selectable marker such as gpt, G418 or hygromycin. It is advantageous to use more than one selectable marker in the beginning, e.g., G418 plus methotrexate.
  • The plasmid pC4 is digested with the restriction enzymes and then dephosphorylated using calf intestinal phosphatase by procedures known in the art.
  • The vector is then isolated from a 1% agarose gel.
  • The DNA sequence encoding the T. pallidum polypeptide is amplified using PCR oligonucleotide primers corresponding to the 5' and 3' sequences of the desired portion of the gene.
  • A 5' primer containing a restriction site, a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the T. pallidum polypeptide is synthesized and used.
  • A 3' primer containing a restriction site, a stop codon, and nucleotides complementary to the 3' coding sequence of the T. pallidum polypeptide is synthesized and used.
  • The amplified fragment is digested with the restriction endonucleases and then purified again on a 1% agarose gel.
  • The isolated fragment and the dephosphorylated vector are then ligated with T4 DNA ligase.
  • E. coli HB101 or XL-1 Blue cells are then transformed and bacteria are identified that contain the fragment inserted into plasmid pC4 using, for instance, restriction enzyme analysis. Chinese hamster ovary cells lacking an active DHFR gene are used for transfection.
  • lipid-mediated transfection agent such as Lipofectin™ or LipofectAMINE™ (Life Technologies).
  • The plasmid pSV2-neo contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that confers resistance to a group of antibiotics including G418.
  • The cells are seeded in alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418.
  • Single clones are trypsinized and then seeded in 6-well petri dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM, 200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of methotrexate are then transferred to new 6-well plates containing even higher concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 µM, 20 µM). The same procedure is repeated until clones are obtained which grow at a concentration of 100-200 µM. Expression of the desired gene product is analyzed, for instance, by SDS-PAGE and Western blot or by reversed phase HPLC analysis.
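The stepwise methotrexate selection can be tracked with a small Python sketch that keeps every concentration in nanomolar units. Two points are assumptions of the sketch: the last two values of the second series are read as 10 µM and 20 µM, since the ladder ends at 100-200 µM, and the growth observations are invented for the example.

```python
# Illustrative sketch only: growth data are invented; the second series is
# read as 1-20 µM, which is an assumption about the intended units.
from typing import Dict, Optional

FIRST_SERIES_NM = (50, 100, 200, 400, 800)
SECOND_SERIES_NM = (1_000, 2_000, 5_000, 10_000, 20_000)
TARGET_NM = 100_000  # lower bound of the 100-200 µM end point


def highest_tolerated(growth_by_conc_nm: Dict[int, bool]) -> Optional[int]:
    """Highest methotrexate concentration (nM) at which the clone grew."""
    grown = [conc for conc, grew in growth_by_conc_nm.items() if grew]
    return max(grown) if grown else None


def selection_complete(conc_nm: int, target_nm: int = TARGET_NM) -> bool:
    """True once a clone grows at or above the target concentration."""
    return conc_nm >= target_nm


# Invented observations for one clone in the first series
observed = dict(zip(FIRST_SERIES_NM, (True, True, True, False, False)))
best = highest_tolerated(observed)
print(best, selection_complete(best))
```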
  • GACAGCATCC GCTATCATGC CAATGGTGGT GACAACCAGG GGTTTCCCGT CCGCTGCGGC 1020
  • CTCCATATCA ATGACACTCA CGCGCGTCTT ACGCTTTCCT ATGCAGTGTT TGACCTTCCG 5280
  • ATGTCATGCA CCTTTTGGAA GAAATGCACG AGCACAATGA ACGAGAAAGG CTATCGTGAA 7020
  • TCAcTACGCC ATCCATACCG CGAAGAAACA TAGTACCGAC TGCGAAGAGG AGCACGAAAC 7860
  • AATCGTTCTA AGGAGATCTG ATGCGCCGCC GCTATAGACG AAAACGTATC GCCGTTTTTT 8160
  • ACGGTATATA AAATGCCGTC CACTGAGGGG ATTTTTAGTA GCTGTCCAAC TTGGAGCGCC 8220
  • TTTTACGCGC GTTTTGTTCG GTGGCAAAAT CGGGAATTGG CTCAACCCCA TAGCGCTTGC 12300
  • TGAATGGTGA CAATGGTGCC ACCCCTGTTT TTCGTATGCG CCCTTTTCTT TGCCGAGGGC 14040
  • CTCAGTCTCG AACAGGtGCG CACGAGACGA GTATCGCACG CCGTTACCTG GAGGCGCTCG 840

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Testing Of Short-Circuits, Discontinuities, Leakage, Or Incorrect Line Connections (AREA)
  • Peptides Or Proteins (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides polynucleotide sequences of the genome of T. pallidum, polypeptide sequences encoded by the polynucleotide sequences, corresponding polynucleotides and polypeptides, vectors and hosts comprising the polynucleotides, and assays and other uses thereof. The present invention further provides polynucleotide and polypeptide sequence information stored on computer-readable media, and computer-based systems and methods which facilitate its use.
PCT/US1998/013041 1997-06-24 1998-06-23 TREPONEMA PALLIDUM POLYNUCLEOTIDES AND SEQUENCES WO1998059034A2 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP98931511A EP0990022A1 (fr) 1997-06-24 1998-06-23 TREPONEMA PALLIDUM POLYNUCLEOTIDES AND SEQUENCES
CA002296814A CA2296814A1 (fr) 1997-06-24 1998-06-23 Treponema pallidum polynucleotides and sequences
AU81623/98A AU8162398A (en) 1997-06-24 1998-06-23 Treponema pallidum polynucleotides and sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US5066797P 1997-06-24 1997-06-24
US60/050,667 1997-06-24

Publications (2)

Publication Number Publication Date
WO1998059034A2 true WO1998059034A2 (fr) 1998-12-30
WO1998059034A3 WO1998059034A3 (fr) 2000-06-29

Family

ID=21966650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/013041 WO1998059034A2 (fr) 1997-06-24 1998-06-23 TREPONEMA PALLIDUM POLYNUCLEOTIDES AND SEQUENCES

Country Status (4)

Country Link
EP (1) EP0990022A1 (fr)
AU (1) AU8162398A (fr)
CA (1) CA2296814A1 (fr)
WO (1) WO1998059034A2 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999054470A2 (fr) * 1998-04-22 1999-10-28 Glaxo Group Limited ygjD family of bacterial polypeptides
FR2798386A1 (fr) * 1999-09-10 2001-03-16 Didier Raoult Single-stranded oligonucleotides, probes, primers and method for detecting spirochetes
CN102277418A (zh) * 2011-01-26 2011-12-14 宁波基内生物技术有限公司 Primers, kit and method for detecting Treponema pallidum
CN109021082A (zh) * 2018-07-27 2018-12-18 南华大学 Expression, purification and application of Treponema pallidum recombinant protein Tp0971
WO2024050428A3 (fr) * 2022-08-31 2024-06-27 University Of Washington Compositions, kits and methods for detecting syphilis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wisconsin Sequence Analysis Package, User's Guide, Version 8, 1995, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, Wisconsin 53711, pages i-xvii, 1-1 to 4-12, and A-1 to A-17 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999054470A2 (fr) * 1998-04-22 1999-10-28 Glaxo Group Limited ygjD family of bacterial polypeptides
WO1999054470A3 (fr) * 1998-04-22 2000-03-30 Glaxo Group Ltd ygjD family of bacterial polypeptides
FR2798386A1 (fr) * 1999-09-10 2001-03-16 Didier Raoult Single-stranded oligonucleotides, probes, primers and method for detecting spirochetes
WO2001020028A1 (fr) * 1999-09-10 2001-03-22 Universite De La Mediterranee (Aix-Marseille Ii) Single-stranded oligonucleotides, probes, primers and method for detecting spirochetes
US7141658B1 (en) 1999-09-10 2006-11-28 Biomerieux Single stranded oligonucleotides, probes, primers and method for detecting spirochetes
CN102277418A (zh) * 2011-01-26 2011-12-14 宁波基内生物技术有限公司 Primers, kit and method for detecting Treponema pallidum
CN109021082A (zh) * 2018-07-27 2018-12-18 南华大学 Expression, purification and application of Treponema pallidum recombinant protein Tp0971
WO2024050428A3 (fr) * 2022-08-31 2024-06-27 University Of Washington Compositions, kits and methods for detecting syphilis

Also Published As

Publication number Publication date
CA2296814A1 (fr) 1998-12-30
AU8162398A (en) 1999-01-04
EP0990022A1 (fr) 2000-04-05
WO1998059034A3 (fr) 2000-06-29

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 2296814

Country of ref document: CA

Ref country code: CA

Ref document number: 2296814

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1998931511

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1999504981

Format of ref document f/p: F

WWP Wipo information: published in national office

Ref document number: 1998931511

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

WWW Wipo information: withdrawn in national office

Ref document number: 1998931511

Country of ref document: EP